Release id       All changes   Date         Number of IFEs
0.22 (current)   1             2011-06-18   4189
0.21             2             2011-06-11   4154
0.20             5             2011-06-04   4146
0.19             13            2011-05-28   4142
0.18             6             2011-05-21   4130
0.17             17            2011-05-14   4120
0.16             31            2011-05-07   4116
0.15             6             2011-04-30   4106
0.14             4             2011-04-23   4086
0.13             4             2011-04-16   4064
0.12             3             2011-04-11   4058
0.11             12            2011-04-09   4058
0.10             1             2011-04-02   4040
0.9              3             2011-03-26   4036
0.8              4             2011-03-19   4036
0.7              3             2011-03-12   4035
0.6              6             2011-03-05   4031
0.5              2             2011-02-26   4025
0.4              0             2011-02-19   3993
0.3              20            2011-02-16   3993
0.2              3             2011-02-12   3972
0.1              3891          2011-02-05   3970

The Representative Sets of RNA 3D Structures organize all RNA-containing 3D structures from PDB into sequence/structure equivalence classes and select a high-quality representative structure from each class. The resulting representative sets are appropriate for tasks that require searching or training over the breadth of the entire RNA 3D structure database but should avoid the redundancy inherent in PDB due to multiple 3D structures of the same molecule from the same organism. Each equivalence class lists all structures of the same molecule, and the associated heat maps show all-against-all geometric comparisons of the structures within each class.

Releases are generated weekly, and previous releases are available starting from 2011. The default listing shows structures at 4 Angstrom resolution or better, but different resolution thresholds are available for each release. The set of representative structures can be viewed online along with information about the resolution, experimental method, molecule name, species, and number of equivalent structures. Releases can also be downloaded and parsed by computer programs. In weeks when many new structures are released, the representative set listing can be delayed because of the time it takes to compute all-against-all geometric comparisons within large equivalence classes, such as that of the Thermus thermophilus small ribosomal subunit.
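As an illustration of how a resolution threshold might be applied to a downloaded release, here is a minimal sketch. The `Entry` record, its field names, and the sample values are hypothetical and do not reflect the actual download format; treating entries without a reported resolution as excluded is also an assumption.

```python
from typing import Iterable, List, NamedTuple, Optional


class Entry(NamedTuple):
    """Hypothetical record for one representative structure."""
    ife: str                     # IFE name (made-up examples below)
    resolution: Optional[float]  # in Angstroms; None if not reported


def filter_by_resolution(entries: Iterable[Entry],
                         threshold: float = 4.0) -> List[Entry]:
    """Keep entries whose resolution is at or better than the threshold.

    Lower values mean better resolution. Entries with no reported
    resolution are dropped here, which is an assumption of this sketch.
    """
    return [
        e for e in entries
        if e.resolution is not None and e.resolution <= threshold
    ]


# Made-up sample data, for illustration only.
sample = [
    Entry("1ABC|1|A", 2.5),
    Entry("2DEF|1|B", 4.0),
    Entry("3GHI|1|C", 6.2),
    Entry("4JKL|1|D", None),
]
kept = filter_by_resolution(sample, threshold=4.0)
```

Passing a different `threshold` value gives the listing for a stricter or looser cutoff.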

With Release 3.0, we modified the procedure for choosing the representative of each equivalence class. The representative is now chosen as the IFE (Integrated Functional Element) which optimizes a combination of resolution, RSR, RSCC, Rfree, percent of nucleotides with steric clashes, and the fraction of the molecule observed. The intention is to select the structure with the best experimental evidence for the coordinates being reported. Details will be provided in an upcoming publication.

Individual chains are named in the format XXXX|M|C, where XXXX is the PDB entry, M is the model number, usually 1, and C is the chain identifier, one to four characters. IFEs are made up of individual chains linked with + signs.
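The naming format above can be parsed with a few lines of code. The following is a minimal sketch; the function and class names are illustrative, the example identifiers are made up, and it assumes that chain identifiers never contain the `|` or `+` characters.

```python
from typing import List, NamedTuple


class Chain(NamedTuple):
    pdb_id: str    # four-character PDB entry, e.g. "1ABC" (made-up)
    model: int     # model number, usually 1
    chain_id: str  # chain identifier, one to four characters


def parse_chain(name: str) -> Chain:
    """Parse a single chain name of the form XXXX|M|C."""
    parts = name.split("|")
    if len(parts) != 3:
        raise ValueError(f"expected XXXX|M|C, got {name!r}")
    pdb_id, model, chain_id = parts
    if len(pdb_id) != 4:
        raise ValueError(f"PDB entry must have 4 characters: {pdb_id!r}")
    if not 1 <= len(chain_id) <= 4:
        raise ValueError(f"chain id must have 1 to 4 characters: {chain_id!r}")
    return Chain(pdb_id, int(model), chain_id)


def parse_ife(name: str) -> List[Chain]:
    """Parse an IFE name: one or more chain names joined by + signs."""
    return [parse_chain(part) for part in name.split("+")]


# Made-up two-chain IFE, for illustration only.
chains = parse_ife("1ABC|1|A+1ABC|1|B")
```

A single-chain IFE such as `1ABC|1|A` parses to a list with one `Chain`.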

With Release 2.0, we upgraded the BGSU RNA 3D Hub Site to include new RNA 3D structures distributed in mmCIF format. Some RNA-containing mmCIF structures are very large, containing multiple full ribosomes. Most of the individual RNA molecules of interest occur as single covalently bonded chains (e.g., tRNA, small ribosomal subunit), but others occur as two or more chains that are strongly coupled by persistent RNA basepairing (e.g., eukaryotic large ribosomal subunit with 5.8S RNA). We refer to these single or coupled chains as Integrated Functional Elements (IFEs). We extract IFEs from each 3D structure file, compare them to one another by sequence and geometry, and group together those that share highly similar sequence, geometry, and species, if known. The groups are referred to as sequence/structure Equivalence Classes. Before Release 3.0, the representative of each equivalence class was chosen as the structure with the most annotated basepairs per nucleotide, as a proxy for modeling quality.
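The pre-3.0 selection rule, most annotated basepairs per nucleotide, can be sketched as follows. This is only an illustration of the rule as stated above, not the site's actual implementation; the `Member` record, its fields, and the sample counts are hypothetical, and the handling of ties (first member wins) is an assumption.

```python
from typing import List, NamedTuple


class Member(NamedTuple):
    """Hypothetical record for one IFE in an equivalence class."""
    ife: str          # IFE name (made-up examples below)
    basepairs: int    # number of annotated basepairs
    nucleotides: int  # number of nucleotides in the IFE


def basepairs_per_nucleotide(member: Member) -> float:
    """Ratio used as a proxy for modeling quality; 0.0 for empty IFEs."""
    if member.nucleotides == 0:
        return 0.0
    return member.basepairs / member.nucleotides


def pick_representative(members: List[Member]) -> Member:
    """Return the member with the highest basepairs-per-nucleotide ratio.

    On ties, max() returns the first such member in the list, which is
    an assumption of this sketch.
    """
    if not members:
        raise ValueError("equivalence class has no members")
    return max(members, key=basepairs_per_nucleotide)


# Made-up equivalence class, for illustration only.
members = [
    Member("1ABC|1|A", 30, 76),
    Member("2DEF|1|A", 34, 76),
    Member("3GHI|1|A", 20, 50),
]
representative = pick_representative(members)
```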

Note that the representative sets were formerly referred to as non-redundant lists. In fact, these lists contain one instance of homologous IFEs from each species, so they retain some redundancy at the level of the molecule.

Unique and stable ids are assigned to all equivalence classes of structure files. Representative sets are updated automatically every week, and a versioning system is implemented to provide independent access to data snapshots.

Note that PDB files with no complete nucleotides are not included in the representative sets; see, for example, PDB 1DV4.


Please use the following citation when using this resource:

Leontis, N. B., & Zirbel, C. L. (2012). Nonredundant 3D Structure Datasets for RNA Knowledge Extraction and Benchmarking. In N. Leontis & E. Westhof (Eds.), RNA 3D Structure Analysis and Prediction (Vol. 27, pp. 281–298). Springer Berlin Heidelberg. doi:10.1007/978-3-642-25740-7_13

Copyright 2024 BGSU RNA group.