
RiboGrove is a database of 16S rRNA gene sequences of bacteria and archaea.
RiboGrove is based on the RefSeq database. It contains only full-length sequences of 16S rRNA genes, and the sequences are derived from completely assembled prokaryotic genomes deposited in RefSeq. We therefore expect RiboGrove sequences to be highly reliable.
The table below summarizes the qualitative differences between RiboGrove and similar rRNA sequence databases, namely rrnDB, Silva, RDP, and Greengenes. Briefly, RiboGrove contains fewer and less diverse sequences, but its sequences are more reliable.
| | RiboGrove | rrnDB | Silva | RDP | Greengenes |
|---|---|---|---|---|---|
| Represented organisms | Bacteria Archaea | Bacteria Archaea | Bacteria Archaea Eukaryotes | Bacteria Archaea Eukaryotes | Bacteria Archaea |
| Represented ribosome subunits | Small | Small | Large Small | Large Small | Small |
| Contains sequences from assembled genomes | Yes | Yes | Yes | Yes | Yes |
| Contains amplicon sequences | No | No | Yes | Yes | Yes |
| Contains partial gene sequences | No | Yes | Yes | Yes | Yes |
| Discriminates genome categories | Yes | No | Not applicable | Not applicable | Not applicable |
All genomes used for RiboGrove construction were divided into three categories according to their expected reliability:
Signs of a low-quality assembly are the following:
The software used for the RiboGrove construction can be found in the following GitHub repository: ribogrove-tools.
The release is based on RefSeq release 233.
DOI of RiboGrove release 27.233: 10.5281/zenodo.17273649.
The metadata consists of the following files:
The fasta file is compressed with gzip, and the metadata file is a zip archive. To uncompress them, Linux and macOS users can use the gzip and unzip programs, which are usually preinstalled. For Windows users, the free and open-source (de)compression program 7-Zip is available.
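If you prefer a platform-independent route, the same decompression can be sketched with the Python standard library. This is a minimal sketch, not part of the RiboGrove tooling; the file names in the usage comment are hypothetical placeholders for whatever files you downloaded.

```python
import gzip
import shutil
import zipfile
from pathlib import Path


def gunzip(src: Path, dst: Path) -> None:
    """Decompress a gzip file (e.g. the fasta file) into dst."""
    with gzip.open(src, "rb") as fin, open(dst, "wb") as fout:
        shutil.copyfileobj(fin, fout)


def unzip(src: Path, out_dir: Path) -> list[str]:
    """Extract a zip archive (e.g. the metadata) and return its member names."""
    with zipfile.ZipFile(src) as archive:
        archive.extractall(out_dir)
        return archive.namelist()


# Usage (hypothetical file names, substitute your own):
#   gunzip(Path("ribogrove.fasta.gz"), Path("ribogrove.fasta"))
#   unzip(Path("metadata.zip"), Path("metadata"))
```

Both functions work the same way on Linux, macOS, and Windows, so no external program is needed.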
You can find all releases in the RiboGrove release archive.
No important differences from the previous release.
You can find notes to all RiboGrove releases on the release notes page.
| | Bacteria | Archaea | Total |
|---|---|---|---|
| Number of gene sequences | 295,630 | 1,137 | 296,767 |
| Number of unique gene sequences | 69,363 | 789 | 70,152 |
| Number of species | 13,503 | 507 | 14,010 |
| Number of genomes | 53,779 | 643 | 54,422 |
| Number of genomes of category 1 | 35,489 | 266 | 35,755 |
| Number of genomes of category 2 | 18,012 | 377 | 18,389 |
| Number of genomes of category 3 | 278 | 0 | 278 |
| | Bacteria | Archaea |
|---|---|---|
| Minimum (bp) | 1,401.00 | 1,439.00 |
| 25th percentile (bp) * | 1,517.00 | 1,471.00 |
| Median (bp) * | 1,529.00 | 1,473.50 |
| 75th percentile (bp) * | 1,542.00 | 1,483.00 |
| Average (bp) * | 1,527.17 | 1,490.92 |
| Mode (bp) * | 1,537.00 | 1,472.00 |
| Maximum (bp) | 2,438.00 | 3,604.00 |
| Standard deviation (bp) * | 25.02 | 119.76 |
* Metrics marked with an asterisk were calculated after normalization: each species is represented by its median within-species gene length, and the summary is computed over these per-species medians.
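The normalization described in the footnote can be illustrated with a short sketch. It assumes the input is a list of (species, gene length) pairs; the function name and the toy data are hypothetical and are not taken from the RiboGrove software.

```python
from collections import defaultdict
from statistics import mean, median


def normalized_summary(gene_lengths):
    """Summarize gene lengths after per-species normalization.

    gene_lengths: iterable of (species, length_bp) pairs.
    Each species contributes one value: its median gene length.
    """
    by_species = defaultdict(list)
    for species, length in gene_lengths:
        by_species[species].append(length)
    per_species = [median(lengths) for lengths in by_species.values()]
    return {"median": median(per_species), "mean": mean(per_species)}


# Toy data (hypothetical): species A has four genes, species B has one.
toy = [("A", 1500), ("A", 1500), ("A", 1500), ("A", 1540), ("B", 1470)]
summary = normalized_summary(toy)
```

Without normalization, species A would dominate the summary simply because it has more gene copies; with it, each species counts once.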
| Copy number * | Bacteria: number of species | Bacteria: percent of species (%) | Archaea: number of species | Archaea: percent of species (%) |
|---|---|---|---|---|
| 1 | 1,673 | 12.39 | 250 | 49.31 |
| 2 | 2,312 | 17.12 | 150 | 29.59 |
| 3 | 1,829 | 13.55 | 82 | 16.17 |
| 4 | 1,732 | 12.83 | 19 | 3.75 |
| 5 | 1,080 | 8.00 | 6 | 1.18 |
| 6 | 1,765 | 13.07 | 0 | 0.00 |
| 7 | 1,207 | 8.94 | 0 | 0.00 |
| 8 | 685 | 5.07 | 0 | 0.00 |
| 9 | 347 | 2.57 | 0 | 0.00 |
| 10 | 333 | 2.47 | 0 | 0.00 |
| 11 | 163 | 1.21 | 0 | 0.00 |
| 12 | 148 | 1.10 | 0 | 0.00 |
| 13 | 62 | 0.46 | 0 | 0.00 |
| 14 | 92 | 0.68 | 0 | 0.00 |
| 15 | 27 | 0.20 | 0 | 0.00 |
| 16 | 12 | 0.09 | 0 | 0.00 |
| 17 | 13 | 0.10 | 0 | 0.00 |
| 18 | 6 | 0.04 | 0 | 0.00 |
| 19 | 3 | 0.02 | 0 | 0.00 |
| 20 | 8 | 0.06 | 0 | 0.00 |
| 21 | 1 | 0.01 | 0 | 0.00 |
| 22 | 1 | 0.01 | 0 | 0.00 |
| 24 | 1 | 0.01 | 0 | 0.00 |
| 25 | 1 | 0.01 | 0 | 0.00 |
| 27 | 1 | 0.01 | 0 | 0.00 |
| 37 | 1 | 0.01 | 0 | 0.00 |
* These are median within-species copy numbers.
| Organism | Gene length (bp) | RiboGrove Sequence ID(s) | Assembly accession |
|---|---|---|---|
| Bacteria | |||
| Thermus thermophilus strain AA2-2 | 2,438 | GCF_019974355.1:NZ_AP024929.1:249100-251537:minus | GCF_019974355.1 |
| Ca. Annandia pinicola strain Ad13-065 | 1,887 | GCF_020541245.1:NZ_CP045876.1:290071-291957:minus | GCF_020541245.1 |
| Thermoanaerobacter ethanolicus strain JW 200 | 1,812 | GCF_003722315.1:NZ_CP033580.1:456062-457873:plus | GCF_003722315.1 |
| Nitrosophilus labii strain HRV44 | 1,806 | GCF_014466985.1:NZ_AP022826.1:1939914-1941719:minus GCF_014466985.1:NZ_AP022826.1:1532588-1534393:minus GCF_014466985.1:NZ_AP022826.1:1258017-1259822:minus | GCF_014466985.1 |
| Agarivorans sp. Z349TD_7 | 1,803 | GCF_050870845.2:NZ_CP194040.2:4273139-4274941:minus | GCF_050870845.2 |
| Agarivorans sp. QJM3NY_30 | 1,803 | GCF_050870855.2:NZ_CP194038.2:4273147-4274949:minus | GCF_050870855.2 |
| Agarivorans sp. QJM3NY_29 | 1,803 | GCF_050870835.2:NZ_CP194036.2:4273146-4274948:minus | GCF_050870835.2 |
| Sporomusa rhizae strain DSM 16652 | 1,802 | GCF_041428845.1:NZ_CP156925.1:3123180-3124981:minus | GCF_041428845.1 |
| Gelria sp. Kuro-4 | 1,788 | GCF_019668485.1:NZ_AP024619.1:2016182-2017969:minus | GCF_019668485.1 |
| Helicobacter mastomyrinus strain Hm-17 | 1,785 | GCF_039555295.1:NZ_CP145316.1:765140-766924:minus | GCF_039555295.1 |
| Archaea | |||
| Pyrobaculum ferrireducens strain 1860 | 3,604 | GCF_000234805.1:NC_016645.1:127214-130817:plus | GCF_000234805.1 |
| Pyrobaculum aerophilum strain IM2 | 2,213 | GCF_000007225.1:NC_003364.1:1089640-1091852:plus | GCF_000007225.1 |
| Pyrobaculum arsenaticum strain DSM 13514 | 2,212 | GCF_000016385.1:NC_009376.1:623323-625534:minus | GCF_000016385.1 |
| Aeropyrum pernix strain K1 | 2,202 | GCF_000011125.1:NC_000854.2:1218712-1220913:minus | GCF_000011125.1 |
| Pyrobaculum neutrophilum strain V24Sta | 2,197 | GCF_000019805.1:NC_010525.1:690419-692615:plus | GCF_000019805.1 |
| Ca. Mancarchaeum acidiphilum strain Mia14 | 2,008 | GCF_002214165.1:NZ_CP019964.1:751297-753304:minus | GCF_002214165.1 |
| Ca. Micrarchaeum sp. A_DKE | 2,003 | GCF_016806735.1:NZ_CP060530.1:203642-205644:minus | GCF_016806735.1 |
| Caldivirga maquilingensis strain IC-167 | 1,679 | GCF_000018305.1:NC_009954.1:129150-130828:minus | GCF_000018305.1 |
| Aeropyrum camini strain SY1 | 1,650 | GCF_000591035.1:NC_022521.1:1165168-1166817:minus | GCF_000591035.1 |
| Pyrolobus fumarii strain 1A | 1,576 | GCF_000223395.1:NC_015931.1:84671-86246:minus | GCF_000223395.1 |
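The sequence IDs in these tables appear to follow the pattern `assembly:sequence:start-end:strand`, with 1-based inclusive coordinates. This is an inference from the table values, not a documented specification. Under that assumption, a minimal sketch for parsing an ID and recovering the gene length looks like this (the names `SeqId`, `parse_seq_id`, and `gene_length` are hypothetical):

```python
from typing import NamedTuple


class SeqId(NamedTuple):
    assembly: str   # assembly accession, e.g. GCF_019974355.1
    sequence: str   # RefSeq sequence accession, e.g. NZ_AP024929.1
    start: int      # assumed 1-based, inclusive
    end: int        # assumed 1-based, inclusive
    strand: str     # "plus" or "minus"


def parse_seq_id(seq_id: str) -> SeqId:
    """Split an ID of the assumed form assembly:sequence:start-end:strand."""
    assembly, sequence, coords, strand = seq_id.split(":")
    start, end = (int(x) for x in coords.split("-"))
    return SeqId(assembly, sequence, start, end, strand)


def gene_length(seq_id: str) -> int:
    """Gene length in bp, assuming inclusive coordinates."""
    parsed = parse_seq_id(seq_id)
    return parsed.end - parsed.start + 1


length = gene_length("GCF_019974355.1:NZ_AP024929.1:249100-251537:minus")
# length == 2438, matching the Thermus thermophilus row above
```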
| Organism | Gene length (bp) | RiboGrove Sequence ID(s) | Assembly accession |
|---|---|---|---|
| Bacteria | |||
| Anabaena sp. YBS01 | 1,401 | GCF_009498015.1:NZ_CP034058.1:6920299-6921699:minus | GCF_009498015.1 |
| Bacillus paralicheniformis strain TB197 | 1,417 | GCF_054134005.1:NZ_CP192612.1:4277038-4278454:minus | GCF_054134005.1 |
| Clostridioides difficile strain TW11 | 1,426 | GCF_009362915.1:NZ_CP045224.1:4068440-4069865:minus | GCF_009362915.1 |
| Roseicitreum antarcticum strain ZS2-28 | 1,447 | GCF_014681765.1:NZ_CP061498.1:3436150-3437596:plus | GCF_014681765.1 |
| Hirschia baltica strain ATCC 49814 | 1,448 | GCF_000023785.1:NC_012982.1:2336679-2338126:minus | GCF_000023785.1 |
| Sagittula stellata strain E-37 | 1,449 | GCF_039724765.1:NZ_CP155729.1:664616-666064:plus GCF_039724765.1:NZ_CP155729.1:1804792-1806240:plus | GCF_039724765.1 |
| Mameliella sp. | 1,449 | GCF_965277915.1:NZ_OZ255849.1:4859504-4860952:plus GCF_965277915.1:NZ_OZ255849.1:1028793-1030241:plus GCF_965277915.1:NZ_OZ255849.1:2596915-2598363:minus | GCF_965277915.1 |
| Mameliella alba strain KU6B | 1,449 | GCF_011405015.1:NZ_AP022337.1:1420943-1422391:plus GCF_011405015.1:NZ_AP022337.1:267140-268588:plus GCF_011405015.1:NZ_AP022337.1:3191212-3192660:minus | GCF_011405015.1 |
| Mameliella sp. | 1,449 | GCF_965249415.1:NZ_OZ252233.1:3463560-3465008:minus GCF_965249415.1:NZ_OZ252233.1:1895495-1896943:plus GCF_965249415.1:NZ_OZ252233.1:702863-704311:plus | GCF_965249415.1 |
| Sagittula sp. MA-2 | 1,449 | GCF_030126985.1:NZ_CP126145.1:2907211-2908659:minus GCF_030126985.1:NZ_CP126145.1:439-1887:plus | GCF_030126985.1 |
| Mameliella sp. | 1,449 | GCF_965212485.1:NZ_OZ243118.1:3042962-3044410:plus GCF_965212485.1:NZ_OZ243118.1:4611080-4612528:minus GCF_965212485.1:NZ_OZ243118.1:780420-781868:minus | GCF_965212485.1 |
| Sagittula sp. P11 | 1,449 | GCF_002814095.1:NZ_CP021913.1:2386837-2388285:plus GCF_002814095.1:NZ_CP021913.1:3597920-3599368:plus | GCF_002814095.1 |
| Archaea | |||
| Ignicoccus hospitalis strain KIN4/I | 1,439 | GCF_000017945.1:NC_009776.1:728362-729800:plus | GCF_000017945.1 |
| Methanocaldococcus lauensis strain SG7 | 1,457 | GCF_902827225.1:NZ_LR792632.1:542755-544211:plus | GCF_902827225.1 |
| Halorubrum sp. BOL3-1 | 1,463 | GCF_004114375.1:NZ_CP034692.1:397753-399215:minus | GCF_004114375.1 |
| Methanomethylophilus alvi strain MGYG-HGUT-02456 | 1,466 | GCF_902387285.1:NZ_LR699000.1:283607-285072:plus | GCF_902387285.1 |
| Methanospirillum purgamenti strain GP1 | 1,466 | GCF_019263745.1:NZ_CP077107.1:1365502-1366967:minus GCF_019263745.1:NZ_CP077107.1:1359562-1361027:minus GCF_019263745.1:NZ_CP077107.1:1986020-1987485:minus GCF_019263745.1:NZ_CP077107.1:4649-6114:plus | GCF_019263745.1 |
| Ca. Methanomethylophilus alvi strain Mx1201 | 1,466 | GCF_000300255.2:NC_020913.1:283607-285072:plus | GCF_000300255.2 |
| Natronomonas halophila strain C90 | 1,466 | GCF_013391085.1:NZ_CP058334.1:1530622-1532087:minus | GCF_013391085.1 |
| Salinirubellus litoreus strain SYNS196 | 1,466 | GCF_037335815.1:NZ_CP147841.1:597195-598660:minus | GCF_037335815.1 |
| Methanomethylophilus alvi strain Mx-05 | 1,466 | GCF_003711245.1:NZ_CP017686.1:283608-285073:plus | GCF_003711245.1 |
| Natronomonas gomsonensis strain KCTC 4088 | 1,466 | GCF_024300825.1:NZ_CP101323.1:2500564-2502029:plus | GCF_024300825.1 |
| Natronomonas marina strain ZY43 | 1,466 | GCF_024298905.1:NZ_CP101154.1:18680-20145:plus | GCF_024298905.1 |
| Salinirubellus salinus strain ZS-35-S2 | 1,466 | GCF_025231485.1:NZ_CP104003.1:3070232-3071697:plus | GCF_025231485.1 |
| Methanospirillum purgamenti strain J.3.6.1-F.2.7.3 | 1,466 | GCF_018502485.1:NZ_CP075546.1:133354-134819:plus GCF_018502485.1:NZ_CP075546.1:872641-874106:plus GCF_018502485.1:NZ_CP075546.1:825954-827419:plus GCF_018502485.1:NZ_CP075546.1:1727419-1728884:plus | GCF_018502485.1 |
| Methanospirillum hungatei strain JF-1 | 1,466 | GCF_000013445.1:NC_007796.1:39814-41279:plus GCF_000013445.1:NC_007796.1:3501525-3502990:minus GCF_000013445.1:NC_007796.1:3507609-3509074:minus GCF_000013445.1:NC_007796.1:1301079-1302544:minus | GCF_000013445.1 |
| Methanospirillum stamsii strain Pt1 | 1,466 | GCF_046244385.1:NZ_CP176366.1:2035802-2037267:plus GCF_046244385.1:NZ_CP176366.1:3625347-3626812:minus GCF_046244385.1:NZ_CP176366.1:1311724-1313189:plus GCF_046244385.1:NZ_CP176366.1:2042927-2044392:plus | GCF_046244385.1 |
| Organism | Copy number | Assembly accession | |
|---|---|---|---|
| Bacteria | |||
| Tumebacillus avium strain AR23208 | 37 | GCF_002162355.1 | |
| Tumebacillus algifaecis strain THMBR28 | 27 | GCF_002243515.1 | |
| Photobacterium piscicola strain WVL24019 | 25 | GCF_046058925.1 | |
| Photobacterium phosphoreum strain MIP2473 | 24 | GCF_949787665.1 | |
| Mesobacillus maritimus strain ADH-29 | 22 | GCF_044803185.1 | |
| Photobacterium leiognathi strain Sr3.21 | 21 | GCF_048537525.1 | |
| Photobacterium damselae strain Pdd1411 | 21 | GCF_030168855.1 | |
| Photobacterium leiognathi strain Sr3.10 | 21 | GCF_048537505.1 | |
| Photobacterium damselae strain Phdp Wu-1 | 21 | GCF_003130755.1 | |
| Peribacillus asahii strain KF4 | 21 | GCF_023823975.1 | |
| Aneurinibacillus sp. Ricciae_BoGa-3 | 21 | GCF_028421645.1 |
| Archaea | |||
| Natronorubrum aibiense strain 7-3 | 5 | GCF_009392895.1 | |
| Methanococcoides methylutens strain Q3c | 5 | GCF_052657215.1 | |
| Natrinema sp. SYSU A 869 | 5 | GCF_019879105.1 | |
| Natronorubrum bangense strain JCM 10635 | 5 | GCF_004799645.1 | |
| Methanococcoides orientis strain LMO-1 | 5 | GCF_021184045.1 | |
| Methanoplanus endosymbiosus strain DSM 3599 | 5 | GCF_024662215.1 | |
| Methanolobus sp. ZRKC3 | 5 | GCF_045291275.1 | |
| Methanospirillum purgamenti strain J.3.6.1-F.2.7.3 | 4 | GCF_018502485.1 | |
| Methanospirillum purgamenti strain GP1 | 4 | GCF_019263745.1 | |
| Methanolobus sediminis strain FTZ6 | 4 | GCF_031312595.1 | |
| Methanosphaera stadtmanae strain MGYG-HGUT-02164 | 4 | GCF_902384015.1 |
| Methanolobus sp. WCC4 | 4 | GCF_038022665.1 |
| Halomicrobium salinisoli strain LT50 | 4 | GCF_020405185.1 |
| Halomicrobium urmianum strain IBRC-M: 10911 | 4 | GCF_020217425.1 |
| Halomicrobium salinisoli strain TH30 | 4 | GCF_020405245.1 |
| Methanogenium organophilum strain DSM 3596 | 4 | GCF_026684035.1 |
| Methanochimaera problematica strain FWC-SCC4 | 4 | GCF_032878975.1 |
| Methanosphaera stadtmanae strain DSM 3091 | 4 | GCF_000012545.1 |
| Methanogenium sp. S4BF | 4 | GCF_029633965.1 |
| Methanospirillum stamsii strain Pt1 | 4 | GCF_046244385.1 |
| Methanospirillum lacunae strain Ki8-1 | 4 | GCF_046195335.1 |
| Methanolobus mangrovi strain FTZ2 | 4 | GCF_031312535.1 |
| Natrinema thermotolerans strain A29 | 4 | GCF_031165565.1 |
| Methanococcoides sp. FTZ1 | 4 | GCF_052057775.1 |
| Methanospirillum hungatei strain JF-1 | 4 | GCF_000013445.1 |
| Haloarcula marismortui strain ATCC 33800 | 4 | GCF_018200015.1 |
| Haloterrigena salifodinae strain BOL5-1 | 4 | GCF_016906025.1 |
| Natronococcus occultus strain SP4 | 4 | GCF_000328685.1 |
| Methanococcus vannielii strain SB | 4 | GCF_000017165.1 |
| Organism | Sum of entropy * (bits) | Mean entropy * (bits) | Number of variable positions | Gene copy number | Assembly accession |
|---|---|---|---|---|---|
| Bacteria | |||||
| Clostridium perfringens strain A SNU21005 | 780.95 | 0.41 | 1,171 | 9 | GCF_047150065.1 |
| Escherichia coli strain P276M | 433.81 | 0.26 | 569 | 6 | GCF_009762385.1 |
| Listeria monocytogenes strain 10-092876-1155 LM6 | 357.10 | 0.20 | 370 | 3 | GCF_001999045.1 |
| Klebsiella pneumoniae strain GZ-1 | 304.27 | 0.18 | 464 | 8 | GCF_014854815.1 |
| Streptococcus infantis strain SO | 291.50 | 0.18 | 308 | 3 | GCF_021497965.1 |
| Cupriavidus oxalaticus strain USM2A2 | 243.94 | 0.16 | 371 | 6 | GCF_052400445.1 |
| Synechococcus sp. NB0720_010 | 243.35 | 0.16 | 265 | 3 | GCF_023078835.1 |
| Streptomyces griseorubiginosus strain NBC_00586 | 231.55 | 0.15 | 342 | 6 | GCF_036345135.1 |
| Caminibacter mediatlanticus strain TB-2 | 228.78 | 0.15 | 282 | 4 | GCF_005843985.1 |
| Xanthomonas oryzae strain YNCX | 227.74 | 0.15 | 248 | 3 | GCF_024499285.1 |
| Archaea | |||||
| Halomicrobium sp. ZPS1 ** | 137.00 | 0.09 | 137 | 2 | GCF_009217585.1 |
| Halomicrobium urmianum strain IBRC-M: 10911 | 131.55 | 0.09 | 146 | 4 | GCF_020217425.1 |
| Halapricum desulfuricans strain HSR12-2 | 128.00 | 0.09 | 128 | 2 | GCF_017094525.1 |
| Halomicrobium salinisoli strain TH30 | 127.74 | 0.09 | 145 | 4 | GCF_020405245.1 |
| Halapricum desulfuricans strain HSR-Bgl | 127.00 | 0.09 | 127 | 2 | GCF_017094445.1 |
| Halomicrobium mukohataei strain JP60 | 125.81 | 0.09 | 137 | 3 | GCF_004803735.1 |
| Halomicrobium sp. HM KBTZ05 | 124.38 | 0.08 | 134 | 3 | GCF_041530035.1 |
| Halomicrobium salinisoli strain LT50 | 123.31 | 0.08 | 140 | 4 | GCF_020405185.1 |
| Halapricum desulfuricans strain HSR-Est | 111.00 | 0.08 | 111 | 2 | GCF_017094465.1 |
| Halapricum desulfuricans strain HSR12-1 | 109.00 | 0.07 | 109 | 2 | GCF_017094505.1 |
* Entropy is Shannon entropy calculated for each column of the multiple sequence alignment (MSA) of all full-length 16S rRNA genes of a genome. Entropy is then summed up (column “Sum of entropy”) and averaged (column “Mean entropy”).
** Halomicrobium sp. ZPS1 is a remarkable case. This genome harbours two 16S rRNA genes, so the sum of entropy equals the number of mismatching positions between the two gene sequences. Accordingly, the two sequences are only 90.70% identical. This is notable because the usual (albeit arbitrary) genus demarcation threshold is 95% identity.
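The entropy metrics defined in the footnotes above can be sketched as follows. This is an illustration rather than the RiboGrove implementation: it assumes the alignment is given as equal-length strings, uses base-2 logarithms (bits), treats a gap character like any other symbol, and averages over all alignment columns. The function names are hypothetical.

```python
from collections import Counter
from math import log2


def column_entropy(column: str) -> float:
    """Shannon entropy (bits) of one alignment column."""
    counts = Counter(column)
    if len(counts) == 1:
        return 0.0  # invariant column
    total = len(column)
    return -sum((n / total) * log2(n / total) for n in counts.values())


def entropy_summary(msa: list[str]) -> tuple[float, float, int]:
    """Return (sum of entropy, mean entropy, number of variable positions)."""
    if len({len(seq) for seq in msa}) != 1:
        raise ValueError("all aligned sequences must have the same length")
    entropies = [column_entropy("".join(col)) for col in zip(*msa)]
    total = sum(entropies)
    variable = sum(1 for h in entropies if h > 0)
    return total, total / len(entropies), variable


# Two toy sequences (hypothetical) that differ at two of eight positions.
total, mean_entropy, variable = entropy_summary(["ACGTACGT", "ACGAACGA"])
```

With exactly two sequences, every mismatching column contributes 1 bit, which is why the sum of entropy equals the number of mismatches in the two-gene case described above.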
* Coverage of a primer pair is the percent of genomes having at least one 16S rRNA gene which can be amplified by PCR using this primer pair. For details, see our paper about RiboGrove.
The tables below show the coverage of primer pairs commonly used to amplify bacterial and archaeal 16S rRNA genes (“bacterial” and “archaeal” primers).
You can find a more detailed table in the file primer_pair_genomic_coverage.tsv in the metadata. That table contains coverage not just for phyla, but also for each kingdom, class, order, family, genus, and species. Moreover, that table contains coverage values for additional primer pairs, namely 1115F-1492R, 349f-519r, 1106F-Ar1378R, 1106F-SSU1492Rngs, SSU1ArF-SSU468R, SSU1ArF-SSU520R. In the tables below, they are omitted for brevity.
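The coverage definition above translates into a short calculation. The sketch below assumes you already know, for every gene of every genome, whether the primer pair can amplify it; how that is decided is described in the RiboGrove paper and is not reproduced here. The function name and toy data are hypothetical.

```python
def primer_pair_coverage(amplifiable_by_genome: dict[str, list[bool]]) -> float:
    """Percent of genomes with at least one amplifiable 16S rRNA gene.

    amplifiable_by_genome maps a genome identifier to one boolean per
    16S rRNA gene in that genome (True if the pair can amplify the gene).
    """
    if not amplifiable_by_genome:
        raise ValueError("no genomes given")
    covered = sum(1 for genes in amplifiable_by_genome.values() if any(genes))
    return 100.0 * covered / len(amplifiable_by_genome)


# Toy data (hypothetical genome names): two of four genomes are covered.
toy = {
    "genome_1": [True, False],
    "genome_2": [False, False],
    "genome_3": [True],
    "genome_4": [False],
}
coverage = primer_pair_coverage(toy)
```

Note that a genome counts as covered as soon as one of its gene copies can be amplified, so coverage is computed per genome, not per gene.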
| Phylum | Number of genomes | Full gene: 27F–1492R (%) | V1–V2: 27F–338R (%) | V1–V3: 27F–534R (%) | V3–V4: 341F–785R (%) | V3–V5: 341F–944R (%) | V4: 515F–806R (%) | V4–V5: 515F–944R (%) | V4–V6: 515F–1100R (%) | V5–V6: 784F–1100R (%) | V5–V7: 784F–1193R (%) | V6–V7: 939F–1193R (%) | V6–V8: 939F–1378R (%) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Pseudomonadota | 29,495 | 99.49 | 99.31 | 99.47 | 99.82 | 84.09 | 99.77 | 84.26 | 87.53 | 87.21 | 93.65 | 92.75 | 96.38 |
| Bacillota | 12,450 | 99.85 | 99.77 | 99.82 | 99.94 | 95.24 | 99.98 | 95.12 | 99.50 | 98.17 | 97.64 | 98.77 | 99.43 |
| Actinomycetota | 5,542 | 99.91 | 99.13 | 99.75 | 94.73 | 65.34 | 94.51 | 65.10 | 97.17 | 99.78 | 99.86 | 99.86 | 97.19 |
| Bacteroidota | 1,829 | 96.83 | 96.50 | 96.88 | 99.89 | 64.79 | 99.40 | 64.41 | 38.27 | 38.38 | 92.29 | 92.29 | 95.90 |
| Campylobacterota | 1,333 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 99.92 | 99.92 | 99.92 | 99.47 | 99.47 | 99.70 | 99.55 |
| Mycoplasmatota | 859 | 90.45 | 84.63 | 73.81 | 99.07 | 91.62 | 99.19 | 91.97 | 72.18 | 48.54 | 44.24 | 78.35 | 0.70 |
| Spirochaetota | 424 | 57.55 | 57.78 | 58.02 | 93.63 | 99.76 | 93.63 | 99.76 | 99.76 | 72.64 | 72.64 | 89.15 | 45.28 |
| Cyanobacteriota | 396 | 99.75 | 99.75 | 99.75 | 100.00 | 3.79 | 100.00 | 3.79 | 100.00 | 1.26 | 1.26 | 100.00 | 99.75 |
| Fusobacteriota | 246 | 100.00 | 98.78 | 99.59 | 99.59 | 99.59 | 99.59 | 99.59 | 99.59 | 99.59 | 99.59 | 100.00 | 0.00 |
| Chlamydiota | 241 | 0.00 | 0.00 | 0.00 | 100.00 | 100.00 | 0.00 | 0.00 | 0.00 | 100.00 | 100.00 | 100.00 | 94.61 |
| Thermodesulfobacteriota | 161 | 100.00 | 99.38 | 100.00 | 100.00 | 38.51 | 100.00 | 38.51 | 100.00 | 95.65 | 91.93 | 96.27 | 99.38 |
| Verrucomicrobiota | 143 | 99.30 | 0.00 | 99.30 | 100.00 | 13.29 | 100.00 | 13.29 | 100.00 | 1.40 | 1.40 | 98.60 | 98.60 |
| Myxococcota | 124 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 |
| Deinococcota | 98 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 0.00 | 0.00 | 52.04 | 100.00 |
| Planctomycetota | 84 | 100.00 | 26.19 | 100.00 | 100.00 | 61.90 | 100.00 | 61.90 | 0.00 | 0.00 | 0.00 | 2.38 | 0.00 |
| Chloroflexota | 52 | 100.00 | 92.31 | 100.00 | 42.31 | 0.00 | 94.23 | 0.00 | 90.38 | 11.54 | 11.54 | 94.23 | 26.92 |
| Thermotogota | 50 | 100.00 | 98.00 | 100.00 | 100.00 | 8.00 | 100.00 | 8.00 | 100.00 | 0.00 | 0.00 | 52.00 | 98.00 |
| Acidobacteriota | 46 | 97.83 | 97.83 | 97.83 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 73.91 | 54.35 | 80.43 | 100.00 |
| Bdellovibrionota | 44 | 100.00 | 100.00 | 100.00 | 100.00 | 77.27 | 100.00 | 77.27 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 |
| Nitrospirota | 23 | 100.00 | 100.00 | 100.00 | 100.00 | 82.61 | 100.00 | 82.61 | 100.00 | 100.00 | 82.61 | 82.61 | 100.00 |
| Aquificota | 18 | 100.00 | 16.67 | 100.00 | 100.00 | 16.67 | 100.00 | 16.67 | 100.00 | 0.00 | 0.00 | 0.00 | 16.67 |
| Rhodothermota | 16 | 43.75 | 43.75 | 43.75 | 100.00 | 100.00 | 100.00 | 100.00 | 81.25 | 81.25 | 100.00 | 100.00 | 100.00 |
| Chlorobiota | 15 | 100.00 | 100.00 | 100.00 | 100.00 | 0.00 | 0.00 | 0.00 | 0.00 | 100.00 | 93.33 | 86.67 | 6.67 |
| Ca. Saccharimonadota | 14 | 100.00 | 100.00 | 100.00 | 100.00 | 14.29 | 7.14 | 7.14 | 7.14 | 7.14 | 7.14 | 100.00 | 100.00 |
| Gemmatimonadota | 13 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 |
| Synergistota | 10 | 100.00 | 100.00 | 100.00 | 100.00 | 0.00 | 100.00 | 0.00 | 100.00 | 0.00 | 0.00 | 100.00 | 100.00 |
| Deferribacterota | 6 | 100.00 | 100.00 | 100.00 | 100.00 | 0.00 | 100.00 | 0.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 |
| Elusimicrobiota | 6 | 100.00 | 66.67 | 100.00 | 100.00 | 0.00 | 100.00 | 0.00 | 100.00 | 50.00 | 50.00 | 100.00 | 100.00 |
| Atribacterota | 5 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 0.00 | 0.00 | 100.00 | 100.00 |
| Balneolota | 3 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 |
| Ignavibacteriota | 3 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 |
| Thermomicrobiota | 2 | 100.00 | 100.00 | 100.00 | 100.00 | 0.00 | 100.00 | 0.00 | 100.00 | 0.00 | 0.00 | 50.00 | 50.00 |
| Armatimonadota | 2 | 100.00 | 50.00 | 100.00 | 50.00 | 50.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 |
| Chrysiogenota | 2 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 |
| Dictyoglomota | 2 | 100.00 | 100.00 | 100.00 | 100.00 | 0.00 | 100.00 | 0.00 | 100.00 | 0.00 | 0.00 | 100.00 | 0.00 |
| Fibrobacterota | 2 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 |
| Kiritimatiellota | 2 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 0.00 | 0.00 | 100.00 | 100.00 |
| Thermodesulfobiota | 2 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 0.00 | 0.00 | 100.00 |
| Ca. Omnitrophota | 1 | 100.00 | 100.00 | 100.00 | 100.00 | 0.00 | 100.00 | 0.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 |
| Ca. Fervidibacterota | 1 | 100.00 | 0.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 0.00 | 0.00 | 100.00 | 100.00 |
| Ca. Cloacimonadota | 1 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 |
| Ca. Bipolaricaulota | 1 | 0.00 | 0.00 | 0.00 | 100.00 | 100.00 | 100.00 | 100.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 |
| Ca. Absconditibacteriota | 1 | 100.00 | 0.00 | 100.00 | 100.00 | 0.00 | 100.00 | 0.00 | 0.00 | 0.00 | 100.00 | 0.00 | 0.00 |
| Calditrichota | 1 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 |
| Caldisericota | 1 | 100.00 | 100.00 | 100.00 | 100.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 100.00 | 100.00 | 100.00 |
| Coprothermobacterota | 1 | 0.00 | 0.00 | 0.00 | 100.00 | 100.00 | 100.00 | 100.00 | 0.00 | 0.00 | 0.00 | 100.00 | 0.00 |
| Zhurongbacterota | 1 | 0.00 | 0.00 | 0.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 0.00 | 0.00 | 0.00 | 0.00 |
| Vulcanimicrobiota | 1 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 0.00 | 0.00 | 100.00 | 100.00 |
| Thermosulfidibacterota | 1 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 0.00 | 0.00 | 100.00 | 100.00 |
| Nitrospinota | 1 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 0.00 | 0.00 | 100.00 | 100.00 |
| Minisyncoccota | 1 | 0.00 | 0.00 | 0.00 | 100.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 |
| Lentisphaerota | 1 | 100.00 | 0.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 0.00 | 0.00 | 100.00 | 100.00 |
| Fidelibacterota | 1 | 100.00 | 100.00 | 100.00 | 100.00 | 0.00 | 100.00 | 0.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 |
| Phylum | Number of genomes | Full gene: SSU1ArF–SSU1492Rngs (%) | V1–V2: SSU1ArF–SSU280ArR (%) | V1–V3: SSU1ArF–SSU470R (%) | V1–V3: SSU1ArF–A519R (%) | V3–V4: 349f–SSU666ArR (%) | V3–V4: 340f–SSU666ArR (%) | V3–V4: 340f–806rB (%) | V3–V5: 349f–SSU1000ArR (%) | V3–V5: 340f–SSU1000ArR (%) | V4: 515fB–806rB (%) | V4–V5: Parch519f–Arch915r (%) | V5–V7: A751F–UA1204R (%) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Methanobacteriota | 469 | 88.91 | 85.93 | 89.13 | 88.91 | 51.81 | 50.75 | 100.00 | 99.36 | 100.00 | 100.00 | 99.57 | 89.77 |
| Thermoproteota | 112 | 96.43 | 98.21 | 100.00 | 100.00 | 71.43 | 97.32 | 99.11 | 68.75 | 92.86 | 100.00 | 99.11 | 98.21 |
| Nitrososphaerota | 33 | 96.97 | 96.97 | 96.97 | 96.97 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 |
| Thermoplasmatota | 19 | 84.21 | 68.42 | 100.00 | 100.00 | 42.11 | 42.11 | 100.00 | 63.16 | 84.21 | 100.00 | 100.00 | 52.63 |
| Ca. Nanohalarchaeota | 4 | 0.00 | 25.00 | 0.00 | 100.00 | 0.00 | 0.00 | 100.00 | 50.00 | 100.00 | 100.00 | 100.00 | 0.00 |
| Promethearchaeota | 3 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 0.00 | 0.00 | 100.00 | 100.00 | 100.00 |
| Microcaldota | 2 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 100.00 | 0.00 | 0.00 | 0.00 |
| Nanobdellota | 1 | 100.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 100.00 | 0.00 | 0.00 | 0.00 |
| Phylum | Number of genomes | Full gene: SSU1ArF–SSU1492Rngs (%) | V1–V2: SSU1ArF–SSU280ArR (%) | V1–V3: SSU1ArF–SSU470R (%) | V1–V3: SSU1ArF–A519R (%) | V3–V4: 349f–SSU666ArR (%) | V3–V4: 340f–SSU666ArR (%) | V3–V4: 340f–806rB (%) | V3–V5: 349f–SSU1000ArR (%) | V3–V5: 340f–SSU1000ArR (%) | V4: 515fB–806rB (%) | V4–V5: Parch519f–Arch915r (%) | V5–V7: A751F–UA1204R (%) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Pseudomonadota | 29,495 | 1.18 | 0.02 | 0.51 | 0.57 | 0.00 | 0.00 | 0.08 | 0.00 | 0.00 | 99.77 | 27.58 | 0.00 |
| Bacillota | 12,450 | 2.38 | 0.06 | 0.12 | 1.36 | 0.02 | 0.00 | 0.06 | 0.01 | 0.00 | 99.98 | 98.47 | 0.00 |
| Actinomycetota | 5,542 | 0.96 | 0.22 | 0.78 | 1.21 | 0.00 | 0.00 | 0.04 | 0.00 | 0.00 | 94.51 | 88.02 | 0.00 |
| Bacteroidota | 1,829 | 1.91 | 0.00 | 1.86 | 1.97 | 0.00 | 0.00 | 0.16 | 0.00 | 0.00 | 99.40 | 99.29 | 0.00 |
| Campylobacterota | 1,333 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 99.92 | 0.15 | 0.00 |
| Mycoplasmatota | 859 | 1.75 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 99.19 | 80.09 | 0.00 |
| Spirochaetota | 424 | 0.47 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 93.63 | 93.40 | 0.00 |
| Cyanobacteriota | 396 | 3.03 | 0.00 | 0.25 | 0.25 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 100.00 | 100.00 | 0.00 |
| Fusobacteriota | 246 | 0.41 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 99.59 | 99.59 | 0.00 |
| Chlamydiota | 241 | 1.66 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 |
| Thermodesulfobacteriota | 161 | 5.59 | 0.62 | 1.24 | 1.24 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 100.00 | 69.57 | 0.00 |
| Verrucomicrobiota | 143 | 6.29 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 100.00 | 10.49 | 0.70 |
| Myxococcota | 124 | 30.65 | 4.03 | 3.23 | 3.23 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 100.00 | 100.00 | 0.00 |
| Deinococcota | 98 | 38.78 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 100.00 | 96.94 | 0.00 |
| Planctomycetota | 84 | 2.38 | 1.19 | 1.19 | 1.19 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 100.00 | 83.33 | 0.00 |
| Chloroflexota | 52 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 94.23 | 100.00 | 0.00 |
| Thermotogota | 50 | 38.00 | 0.00 | 28.00 | 28.00 | 0.00 | 0.00 | 6.00 | 0.00 | 0.00 | 100.00 | 100.00 | 0.00 |
| Acidobacteriota | 46 | 13.04 | 0.00 | 0.00 | 6.52 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 100.00 | 100.00 | 0.00 |
| Bdellovibrionota | 44 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 4.55 | 0.00 | 100.00 | 27.27 | 0.00 |
| Nitrospirota | 23 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 100.00 | 100.00 | 0.00 |
| Aquificota | 18 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 100.00 | 83.33 | 44.44 |
| Rhodothermota | 16 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 100.00 | 100.00 | 0.00 |
| Chlorobiota | 15 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 |
| Ca. Saccharimonadota | 14 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 7.14 | 7.14 | 0.00 |
| Gemmatimonadota | 13 | 0.00 | 7.69 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 100.00 | 100.00 | 0.00 |
| Synergistota | 10 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 100.00 | 100.00 | 0.00 |
| Deferribacterota | 6 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 100.00 | 100.00 | 0.00 |
| Elusimicrobiota | 6 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 100.00 | 100.00 | 0.00 |
| Atribacterota | 5 | 60.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 100.00 | 100.00 | 0.00 |
| Balneolota | 3 | 33.33 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 100.00 | 100.00 | 0.00 |
| Ignavibacteriota | 3 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 100.00 | 100.00 | 0.00 |
| Thermomicrobiota | 2 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 100.00 | 100.00 | 0.00 |
| Armatimonadota | 2 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 100.00 | 50.00 | 0.00 |
| Chrysiogenota | 2 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 100.00 | 100.00 | 0.00 |
| Dictyoglomota | 2 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 100.00 | 100.00 | 0.00 |
| Fibrobacterota | 2 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 100.00 | 100.00 | 0.00 |
| Kiritimatiellota | 2 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 100.00 | 100.00 | 0.00 |
| Thermodesulfobiota | 2 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 100.00 | 100.00 | 0.00 |
| Ca. Omnitrophota | 1 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 100.00 | 100.00 | 0.00 |
| Ca. Fervidibacterota | 1 | 100.00 | 0.00 | 0.00 | 100.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 100.00 | 0.00 | 0.00 |
| Ca. Cloacimonadota | 1 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 100.00 | 100.00 | 0.00 |
| Ca. Bipolaricaulota | 1 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 100.00 | 0.00 | 0.00 |
| Ca. Absconditibacteriota | 1 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 100.00 | 0.00 | 0.00 |
| Calditrichota | 1 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 100.00 | 100.00 | 0.00 |
| Caldisericota | 1 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 |
| Coprothermobacterota | 1 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 100.00 | 100.00 | 0.00 |
| Zhurongbacterota | 1 | 100.00 | 0.00 | 0.00 | 100.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 100.00 | 100.00 | 0.00 |
| Vulcanimicrobiota | 1 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 100.00 | 100.00 | 0.00 |
| Thermosulfidibacterota | 1 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 100.00 | 100.00 | 0.00 |
| Nitrospinota | 1 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 100.00 | 100.00 | 0.00 |
| Minisyncoccota | 1 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 |
| Lentisphaerota | 1 | 100.00 | 0.00 | 100.00 | 100.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 100.00 | 100.00 | 0.00 |
| Fidelibacterota | 1 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 100.00 | 100.00 | 0.00 |

| Phylum | Number of genomes | Full gene, 27F–1492R (%) | V1–V2, 27F–338R (%) | V1–V3, 27F–534R (%) | V3–V4, 341F–785R (%) | V3–V5, 341F–944R (%) | V4, 515F–806R (%) | V4–V5, 515F–944R (%) | V4–V6, 515F–1100R (%) | V5–V6, 784F–1100R (%) | V5–V7, 784F–1193R (%) | V6–V7, 939F–1193R (%) | V6–V8, 939F–1378R (%) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Methanobacteriota | 469 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 100.00 | 0.00 | 82.30 | 0.00 | 0.00 | 0.00 | 0.00 |
| Thermoproteota | 112 | 0.89 | 0.00 | 0.00 | 0.00 | 0.00 | 100.00 | 0.00 | 87.50 | 0.00 | 0.00 | 0.00 | 0.00 |
| Nitrososphaerota | 33 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 100.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 |
| Thermoplasmatota | 19 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 100.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 |
| Ca. Nanohalarchaeota | 4 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 100.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 |
| Promethearchaeota | 3 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 100.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 |
| Microcaldota | 2 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 |
| Nanobdellota | 1 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 |
| Primer name | Sequence | Reference |
|---|---|---|
| 27F | AGAGTTTGATYMTGGCTCAG | Frank et al., 2008 |
| 338R | GCTGCCTCCCGTAGGAGT | Suzuki et al., 1996 |
| 341F * | CCTACGGGNGGCWGCAG | Klindworth et al., 2013 |
| 515F | GTGCCAGCMGCCGCGGTAA | Turner et al., 1999 |
| 534R | ATTACCGCGGCTGCTGG | Walker et al., 2015 |
| 784F | AGGATTAGATACCCTGGTA | Andersson et al., 2008 |
| 785R * | GACTACHVGGGTATCTAATCC | Klindworth et al., 2013 |
| 806R | GGACTACHVGGGTWTCTAAT | Caporaso et al., 2010 |
| 939F | GAATTGACGGGGGCCCGCACAAG | Lebuhn et al., 2014 |
| 944R | GAATTAAACCACATGCTC | Fuks et al., 2018 |
| 1100R | AGGGTTGCGCTCGTTG | Turner et al., 1999 |
| 1193R | ACGTCATCCCCACCTTCC | Bodenhausen et al., 2013 |
| 1378R | CGGTGTGTACAAGGCCCGGGAACG | Lebuhn et al., 2014 |
| 1492R | TACCTTGTTACGACTT | Frank et al., 2008 |
| SSU1ArF | TCCGGTTGATCCYGCBRG | Bahram et al., 2018 |
| SSU520R | GCTACGRRYGYTTTARRC | Bahram et al., 2018 |
| 340f | CCCTAYGGGGYGCASCAG | Gantner et al., 2011 |
| 806rB | GGACTACNVGGGTWTCTAAT | Apprill et al., 2015 |
| 349f | GYGCASCAGKCGMGAAW | Takai and Horikoshi, 2000 |
| 519r | TTACCGCGGCKGCTG | Klindworth et al., 2013 |
| 515fB | GTGYCAGCMGCCGCGGTAA | Parada et al., 2015 |
| Parch519f | CAGCCGCCGCGGTAA | Øvreås et al., 1997 |
| Arch915r | GTGCTCCCCCGCCAATTCCT | Raskin et al., 1994 |
| 1106F | TTWAGTCAGGCAACGAGC | Watanabe et al., 2007 |
| Ar1378R ** | TGTGCAAGGAGCAGGGAC | Watanabe et al., 2007 |
| A751F | CCGACGGTGAGRGRYGAA | Baker et al., 2003 |
| SSU1492Rngs | CGGNTACCTTGTKACGAC | Bahram et al., 2018 |
| SSU280ArR | TCAGWNYCCNWCTCSRGG | Bahram et al., 2018 |
| SSU470R | DCNGCNGGTDTTACCGCG | Bahram et al., 2018 |
| SSU468R | GNDCNGCNGGTDTTACCG | Bahram et al., 2018 |
| A519R | GGTDTTACCGCGGCKGCTG | Wang and Qian, 2009 |
| SSU666ArR | HGCYTTCGCCACHGGTRG | Bahram et al., 2018 |
| SSU1000ArR | GGCCATGCAMYWCCTCTC | Bahram et al., 2018 |
| UA1204R | TTMGGGGCATRCIKACCT | Baker et al., 2003 |
* Primers 341F and 785R are used in the library preparation protocol for sequencing the V3–V4 region of 16S rRNA genes on Illumina MiSeq.
** Ar1378R was originally named 1378R. We use an amended name to avoid confusion with the primer 1378R listed above.
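Several primers in the table contain IUPAC degenerate bases (for example Y, M, N or W). As a minimal sketch, not part of RiboGrove or its tooling, the hypothetical shell function `iupac_to_regex` below expands those codes into regular-expression character classes with sed, so that a primer can serve as a search pattern. It assumes upper-case input and does not handle the non-IUPAC symbol I that appears in UA1204R.

```shell
# Hypothetical helper: expand IUPAC degenerate bases into regex character classes.
# The replacements contain only A, C, G, T and brackets, so later substitutions
# cannot alter classes produced by earlier ones.
iupac_to_regex() {
  printf '%s\n' "$1" | sed \
    -e 's/R/[AG]/g' -e 's/Y/[CT]/g' -e 's/S/[CG]/g' -e 's/W/[AT]/g' \
    -e 's/K/[GT]/g' -e 's/M/[AC]/g' -e 's/B/[CGT]/g' -e 's/D/[AGT]/g' \
    -e 's/H/[ACT]/g' -e 's/V/[ACG]/g' -e 's/N/[ACGT]/g'
}

# Forward primers 27F and 341F from the table above.
regex_27f=$(iupac_to_regex "AGAGTTTGATYMTGGCTCAG")
regex_341f=$(iupac_to_regex "CCTACGGGNGGCWGCAG")
printf '%s\n%s\n' "$regex_27f" "$regex_341f"
```

A reverse primer would first need to be reverse-complemented before such a pattern could match the same strand as the forward primer.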
RiboGrove is a very minimalistic database: it comprises a collection of plain fasta files with metadata. Thus, extended search instruments are not available for it. We acknowledge this limitation and provide a list of suggestions below to help you explore and select RiboGrove data.
Headers in RiboGrove fasta files have the following format:
>GCF_000978375.1:NZ_CP009686.1:8908-10459:plus ;d__Bacteria;k__Bacillati;p__Bacillota;c__Bacilli;o__Bacillales;f__Bacillaceae;g__Bacillus;s__cereus; category:1
Major blocks of a header are separated by spaces. A header consists of three such blocks:
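Judging from the example header above, the three blocks appear to be the SeqID, the taxonomy string, and the genome category. As a minimal sketch under that assumption, a header can be split with POSIX shell parameter expansion and cut. The variable names are illustrative only. The SeqID is taken as everything before the first space and the category as the last block, so taxonomy names containing spaces should not break the split.

```shell
# Example header from the text above (without the leading ">").
header='GCF_000978375.1:NZ_CP009686.1:8908-10459:plus ;d__Bacteria;k__Bacillati;p__Bacillota;c__Bacilli;o__Bacillales;f__Bacillaceae;g__Bacillus;s__cereus; category:1'

# Block 1: SeqID (everything before the first space).
seq_id=${header%% *}
# Block 2: taxonomy (between the first and the last space).
rest=${header#* }
taxonomy=${rest% *}
# Block 3: genome category (the number after "category:").
category=${header##*category:}

# The SeqID itself is colon-separated.
assembly=$(printf '%s\n' "$seq_id" | cut -d':' -f1)
accession=$(printf '%s\n' "$seq_id" | cut -d':' -f2)
coords=$(printf '%s\n' "$seq_id" | cut -d':' -f3)
strand=$(printf '%s\n' "$seq_id" | cut -d':' -f4)

# Phylum name, extracted from the taxonomy block.
phylum=$(printf '%s\n' "$taxonomy" | grep -Eo ';p__[^;]+' | sed -E 's/;p__//')

printf 'assembly=%s accession=%s coords=%s strand=%s phylum=%s category=%s\n' \
  "$assembly" "$accession" "$coords" "$strand" "$phylum" "$category"
```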
You can select specific sequences from fasta files using the Seqkit program (GitHub repo, documentation). It is free, cross-platform, multifunctional and fast, and it can process both gzipped and uncompressed fasta files. Its seqkit grep and seqkit seq subcommands are useful for sequence selection.
Given the downloaded fasta file ribogrove_27.233_sequences.fasta.gz, consider the following examples of sequence selection using seqkit grep:
Example 1. Select a single sequence by SeqID.
seqkit grep -p "GCF_000978375.1:NZ_CP009686.1:8908-10459:plus" ribogrove_27.233_sequences.fasta.gz
The -p option sets the pattern to search for. By default, the pattern is matched against sequence IDs only, not against whole fasta headers.
Example 2. Select all gene sequences of a single RefSeq genomic sequence by accession number NZ_CP009686.1.
seqkit grep -nrp ":NZ_CP009686.1:" ribogrove_27.233_sequences.fasta.gz
Here, two more options are required: -n and -r. The former tells the program to match whole headers instead of IDs only. The latter tells the program to treat the pattern as a regular expression, which allows partial matches: if the pattern matches any substring of a header, the corresponding sequence is printed to output.
To ensure search specificity, surround the Accession.Version with colons (:).
Example 3. Select all gene sequences of a single genome (Assembly accession GCF_019357495.1).
seqkit grep -nrp "GCF_019357495.1:" ribogrove_27.233_sequences.fasta.gz
To ensure search specificity, put a colon (:) after the assembly accession.
Example 4. Select all actinobacterial sequences.
seqkit grep -nrp ";p__Actinomycetota;" ribogrove_27.233_sequences.fasta.gz
To ensure search specificity, surround the taxonomy name with semicolons (;).
Example 5. Select all sequences originating from category 1 genomes.
seqkit grep -nrp "category:1" ribogrove_27.233_sequences.fasta.gz
Example 6. Select all sequences except for those belonging to Bacillota.
seqkit grep -nvrp ";p__Bacillota;" ribogrove_27.233_sequences.fasta.gz
Note the -v option within the option sequence -nvrp. This option inverts the match, i.e. the output will comprise the sequences whose headers do not contain the substring “;p__Bacillota;”.
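If Seqkit is not available, a header-substring selection similar to the examples above can be approximated with awk. This is only a rough sketch, not part of RiboGrove or Seqkit: the function name `filter_fasta_by_header` and the sample records are made up for illustration, and the pattern is treated as a fixed substring rather than a regular expression.

```shell
# Hypothetical helper: read fasta on stdin and print the records whose header
# contains the fixed substring given as $1. Multi-line sequences are kept intact.
filter_fasta_by_header() {
  awk -v pat="$1" '/^>/ { keep = (index($0, pat) > 0) } keep'
}

# Made-up records in the RiboGrove header format, for demonstration only.
sample=$(printf '%s\n' \
  '>id1 ;d__Bacteria;p__Bacillota; category:1' 'ACGT' 'ACGT' \
  '>id2 ;d__Bacteria;p__Pseudomonadota; category:1' 'GGCC' \
  '>id3 ;d__Bacteria;p__Bacillota; category:2' 'TTAA')

selected=$(printf '%s\n' "$sample" | filter_fasta_by_header ';p__Bacillota;')
printf '%s\n' "$selected"

# With real data (gzip -dc decompresses to standard output):
# gzip -dc ribogrove_27.233_sequences.fasta.gz | filter_fasta_by_header ';p__Bacillota;'
```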
You can use the seqkit seq program to select sequences by length.
Example 1. Select all sequences longer than 1600 bp.
seqkit seq -m 1601 ribogrove_27.233_sequences.fasta.gz
The -m option sets the minimum length of a sequence to be printed to output.
Example 2. Select all sequences shorter than 1500 bp.
seqkit seq -M 1499 ribogrove_27.233_sequences.fasta.gz
The -M option sets the maximum length of a sequence to be printed to output.
Example 3. Select all sequences with lengths in the range [1500, 1600] bp.
seqkit seq -m 1500 -M 1600 ribogrove_27.233_sequences.fasta.gz
It is sometimes useful to retrieve only header information from a fasta file. You can use the seqkit seq program for this.
Example 1. Select all headers.
seqkit seq -n ribogrove_27.233_sequences.fasta.gz
The -n option tells the program to output only headers.
Example 2. Select all SeqIDs (header parts before the first space).
seqkit seq -ni ribogrove_27.233_sequences.fasta.gz
The -i option tells the program to output only sequence IDs.
Example 3. Select all RefSeq “Accession.Version”s.
seqkit seq -ni ribogrove_27.233_sequences.fasta.gz | cut -f2 -d':' | sort | uniq
This works only if you have the cut, sort and uniq utilities installed (Linux and Mac OS systems should have them built in).
Example 4. Select all Assembly accessions.
seqkit seq -ni ribogrove_27.233_sequences.fasta.gz | cut -f1 -d':' | sort | uniq
This works only if you have the cut, sort and uniq utilities installed (Linux and Mac OS systems should have them built in).
Example 5. Select all phylum names.
seqkit seq -n ribogrove_27.233_sequences.fasta.gz | grep -Eo ';p__[^;]+' | sed -E 's/;|p__//g' | sort | uniq
This works only if you have the grep, sed, sort and uniq utilities installed (Linux and Mac OS systems should have them built in).
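Building on Example 5, a natural next step is to count how many sequences belong to each phylum. The sketch below wraps the same grep and sed steps in a hypothetical function, `count_by_phylum`, which reads headers from standard input and prints one tab-separated "phylum, count" line per phylum. The sample headers are made up for illustration.

```shell
# Hypothetical helper: read fasta headers on stdin, print "phylum<TAB>count".
count_by_phylum() {
  grep -Eo ';p__[^;]+' | sed -E 's/;p__//' | sort | uniq -c |
    awk '{ n = $1; sub(/^ *[0-9]+ /, ""); print $0 "\t" n }'
}

# Made-up headers in the RiboGrove format, for demonstration only.
counts=$(printf '%s\n' \
  'id1 ;d__Bacteria;k__Bacillati;p__Bacillota;c__Bacilli; category:1' \
  'id2 ;d__Bacteria;k__Bacillati;p__Bacillota;c__Clostridia; category:2' \
  'id3 ;d__Bacteria;k__Pseudomonadati;p__Pseudomonadota;c__Gammaproteobacteria; category:1' |
  count_by_phylum)
printf '%s\n' "$counts"

# With real data:
# seqkit seq -n ribogrove_27.233_sequences.fasta.gz | count_by_phylum
```

The awk step only moves the count produced by uniq -c behind the phylum name, so that names containing spaces stay in one column.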
For any questions concerning RiboGrove, please contact Maksim Sikolenko at sikolenko@mbio.bas-net.by or maximdeynonih@gmail.com.
If you find RiboGrove useful for your research, please cite:
Maxim A. Sikolenko, Leonid N. Valentovich. “RiboGrove: a database of full-length prokaryotic 16S rRNA genes derived from completely assembled genomes” // Research in Microbiology, Volume 173, Issue 4, May 2022, 103936.
(DOI: 10.1016/j.resmic.2022.103936).
You can also cite RiboGrove itself on Zenodo:
Please use the make_qiime_taxonomy_file.py script to convert the RiboGrove file metadata/taxonomy.tsv to a QIIME2-compatible file. You can find out how to use this script in the corresponding README file.
People have already provided several useful answers in the corresponding discussion: https://bioinformatics.stackexchange.com/questions/20915/how-do-i-save-selected-sequences-in-seqkit-to-a-file.
People have already provided several useful answers in the corresponding discussion: https://www.biostars.org/p/9561418.
RiboGrove, 2026-02-03