RiboGrove is a database of 16S rRNA gene sequences of bacteria and archaea.
RiboGrove is based on the RefSeq database. It contains only full-length sequences of 16S rRNA genes, and the sequences are derived from completely assembled prokaryotic genomes deposited in RefSeq. Hence we posit high reliability of RiboGrove sequences.
The table below summarises the qualitative differences between RiboGrove and similar rRNA sequence databases, namely rrnDB, Silva, RDP, and Greengenes. In brief, RiboGrove contains fewer and less diverse sequences, but its sequences are more reliable.
Feature | RiboGrove | rrnDB | Silva | RDP | Greengenes |
---|---|---|---|---|---|
Represented organisms | Bacteria, Archaea | Bacteria, Archaea | Bacteria, Archaea, Eukaryotes | Bacteria, Archaea, Eukaryotes | Bacteria, Archaea |
Represented ribosome subunits | Small | Small | Large, Small | Large, Small | Small |
Contains sequences from assembled genomes | Yes | Yes | Yes | Yes | Yes |
Contains amplicon sequences | No | No | Yes | Yes | Yes |
Contains partial gene sequences | No | Yes | Yes | Yes | Yes |
Discriminates genome categories | Yes | No | Not applicable | Not applicable | Not applicable |
All genomes used for RiboGrove construction were divided into three categories according to their expected reliability:
Signs of a low-quality assembly are the following:
The software used for the RiboGrove construction can be found in the following GitHub repository: ribogrove-tools.
The release is based on RefSeq release 230.
The metadata consists of the following files:
The FASTA file is compressed with gzip, and the metadata file is a zip archive. On Linux and macOS, you can decompress them with the gzip and unzip programs, which are usually preinstalled. For Windows users, the free and open-source (de)compression program 7-Zip is available.
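The files can also be read without decompressing them by hand. The following is a minimal sketch that uses only the Python standard library; the function names are illustrative, and any file names you pass in are your own (the release file names are not assumed here).

```python
import gzip
import zipfile


def read_fasta_gz(path):
    """Yield (header, sequence) pairs from a gzip-compressed FASTA file."""
    header, chunks = None, []
    with gzip.open(path, "rt") as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue
            if line.startswith(">"):
                # A new record starts: emit the previous one, if any.
                if header is not None:
                    yield header, "".join(chunks)
                header, chunks = line[1:], []
            else:
                chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def extract_metadata(zip_path, out_dir):
    """Extract all files from a zip archive and return their names."""
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(out_dir)
        return archive.namelist()
```

For example, `for header, seq in read_fasta_gz("my_download.fasta.gz"): ...` iterates over the records of a downloaded file, where `my_download.fasta.gz` is a placeholder name.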
You can find all releases in the RiboGrove release archive.
No important differences from the previous release.
You can find notes to all RiboGrove releases on the release notes page.
Metric | Bacteria | Archaea | Total |
---|---|---|---|
Number of gene sequences | 268,758 | 1,076 | 269,834 |
Number of unique gene sequences | 64,436 | 759 | 65,195 |
Number of species | 12,338 | 492 | 12,830 |
Number of genomes | 48,707 | 616 | 49,323 |
Number of genomes of category 1 | 32,507 | 250 | 32,757 |
Number of genomes of category 2 | 15,934 | 366 | 16,300 |
Number of genomes of category 3 | 266 | 0 | 266 |
Gene length statistic | Bacteria | Archaea |
---|---|---|
Minimum (bp) | 1,401.00 | 1,439.00 |
25th percentile (bp) * | 1,517.00 | 1,471.00 |
Median (bp) * | 1,529.00 | 1,473.50 |
75th percentile (bp) * | 1,542.00 | 1,483.00 |
Average (bp) * | 1,526.85 | 1,491.31 |
Mode (bp) * | 1,537.00 | 1,472.00 |
Maximum (bp) | 2,438.00 | 3,604.00 |
Standard deviation (bp) * | 25.10 | 121.54 |
* Metrics marked with an asterisk were calculated with preliminary normalization, i.e. median within-species gene length was used for the summary.
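The normalization described in this footnote can be sketched as follows. This is an illustration on toy data, not the code used to build RiboGrove: each species is first collapsed to its median gene length, and the summary is then computed over species rather than over individual genes. The function name and the toy lengths are assumptions.

```python
from collections import defaultdict
from statistics import mean, median


def normalized_length_summary(gene_lengths):
    """Summarise gene lengths after collapsing each species to its median.

    gene_lengths: iterable of (species, length_bp) pairs, one per gene.
    """
    by_species = defaultdict(list)
    for species, length in gene_lengths:
        by_species[species].append(length)
    # One value per species, so heavily sequenced species do not dominate.
    per_species = [median(lengths) for lengths in by_species.values()]
    return {
        "n_species": len(per_species),
        "median": median(per_species),
        "mean": mean(per_species),
    }


# Toy data: species A contributes four genes, B and C one gene each.
toy = [("A", 1500), ("A", 1500), ("A", 1500), ("A", 1500),
       ("B", 1540), ("C", 1550)]
summary = normalized_length_summary(toy)
raw_median = median(length for _, length in toy)
print(summary["median"], raw_median)  # 1540 vs 1500.0
```

Without normalization, the four genes of species A pull the median down to 1,500 bp; with it, each species counts once and the median is 1,540 bp.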
Copy number * | Bacteria: number of species | Bacteria: percent of species (%) | Archaea: number of species | Archaea: percent of species (%) |
---|---|---|---|---|
1 | 1,575 | 12.77 | 245 | 49.80 |
2 | 2,126 | 17.23 | 149 | 30.28 |
3 | 1,694 | 13.73 | 74 | 15.04 |
4 | 1,486 | 12.04 | 18 | 3.66 |
5 | 934 | 7.57 | 6 | 1.22 |
6 | 1,599 | 12.96 | 0 | 0.00 |
7 | 1,131 | 9.17 | 0 | 0.00 |
8 | 649 | 5.26 | 0 | 0.00 |
9 | 324 | 2.63 | 0 | 0.00 |
10 | 312 | 2.53 | 0 | 0.00 |
11 | 156 | 1.26 | 0 | 0.00 |
12 | 137 | 1.11 | 0 | 0.00 |
13 | 56 | 0.45 | 0 | 0.00 |
14 | 86 | 0.70 | 0 | 0.00 |
15 | 26 | 0.21 | 0 | 0.00 |
16 | 12 | 0.10 | 0 | 0.00 |
17 | 12 | 0.10 | 0 | 0.00 |
18 | 6 | 0.05 | 0 | 0.00 |
19 | 2 | 0.02 | 0 | 0.00 |
20 | 9 | 0.07 | 0 | 0.00 |
21 | 1 | 0.01 | 0 | 0.00 |
22 | 1 | 0.01 | 0 | 0.00 |
24 | 1 | 0.01 | 0 | 0.00 |
25 | 1 | 0.01 | 0 | 0.00 |
27 | 1 | 0.01 | 0 | 0.00 |
37 | 1 | 0.01 | 0 | 0.00 |
* These are median within-species copy numbers.
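A distribution of this kind can be tabulated along the following lines. This is a sketch on toy data under stated assumptions: the input is one copy number per genome, each species is reduced to its median, and the percentages are taken over the number of species. How non-integer medians are handled in RiboGrove is not stated here, so the sketch simply keeps them as they are.

```python
from collections import Counter, defaultdict
from statistics import median


def copy_number_distribution(genome_copy_numbers):
    """Map median within-species copy number -> (species count, percent).

    genome_copy_numbers: iterable of (species, copy_number), one per genome.
    """
    by_species = defaultdict(list)
    for species, copies in genome_copy_numbers:
        by_species[species].append(copies)
    counts = Counter(median(values) for values in by_species.values())
    total = len(by_species)
    return {
        copy_number: (n, round(100 * n / total, 2))
        for copy_number, n in sorted(counts.items())
    }


# Toy data: species A has three genomes, the other species one genome each.
toy = [("A", 7), ("A", 7), ("A", 6), ("B", 1), ("C", 1), ("D", 2)]
distribution = copy_number_distribution(toy)
print(distribution)  # {1: (2, 50.0), 2: (1, 25.0), 7: (1, 25.0)}
```

The same arithmetic reproduces the table: 1,575 of 12,338 bacterial species is 12.77%.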
Organism | Gene length (bp) | RiboGrove Sequence ID(s) | Assembly accession |
---|---|---|---|
Bacteria | |||
Thermus thermophilus strain AA2-2 | 2,438 | GCF_019974355.1:NZ_AP024929.1:249100-251537:minus | GCF_019974355.1 |
Ca. Annandia pinicola strain Ad13-065 | 1,887 | GCF_020541245.1:NZ_CP045876.1:290071-291957:minus | GCF_020541245.1 |
Thermoanaerobacter ethanolicus strain JW 200 | 1,812 | GCF_003722315.1:NZ_CP033580.1:456062-457873:plus | GCF_003722315.1 |
Nitrosophilus labii strain HRV44 | 1,806 | GCF_014466985.1:NZ_AP022826.1:1258017-1259822:minus GCF_014466985.1:NZ_AP022826.1:1532588-1534393:minus GCF_014466985.1:NZ_AP022826.1:1939914-1941719:minus | GCF_014466985.1 |
Sporomusa rhizae strain DSM 16652 | 1,802 | GCF_041428845.1:NZ_CP156925.1:3123180-3124981:minus | GCF_041428845.1 |
Gelria sp. Kuro-4 | 1,788 | GCF_019668485.1:NZ_AP024619.1:2016182-2017969:minus | GCF_019668485.1 |
Helicobacter mastomyrinus strain Hm-17 | 1,785 | GCF_039555295.1:NZ_CP145316.1:765140-766924:minus | GCF_039555295.1 |
Thermoanaerobacter brockii strain Ako-1 | 1,781 | GCF_000175295.2:NC_014964.1:2252888-2254668:minus | GCF_000175295.2 |
Thermoanaerobacter pseudethanolicus strain ATCC 33223 | 1,781 | GCF_000019085.1:NC_010321.1:2265744-2267524:minus | GCF_000019085.1 |
Thermoanaerobacter sp. RKWS2 | 1,754 | GCF_026240795.1:NZ_CP110888.1:94012-95765:plus | GCF_026240795.1 |
Archaea | |||
Pyrobaculum ferrireducens strain 1860 | 3,604 | GCF_000234805.1:NC_016645.1:127214-130817:plus | GCF_000234805.1 |
Pyrobaculum aerophilum strain IM2 | 2,213 | GCF_000007225.1:NC_003364.1:1089640-1091852:plus | GCF_000007225.1 |
Pyrobaculum arsenaticum strain DSM 13514 | 2,212 | GCF_000016385.1:NC_009376.1:623323-625534:minus | GCF_000016385.1 |
Aeropyrum pernix strain K1 | 2,202 | GCF_000011125.1:NC_000854.2:1218712-1220913:minus | GCF_000011125.1 |
Pyrobaculum neutrophilum strain V24Sta | 2,197 | GCF_000019805.1:NC_010525.1:690419-692615:plus | GCF_000019805.1 |
Ca. Mancarchaeum acidiphilum strain Mia14 | 2,008 | GCF_002214165.1:NZ_CP019964.1:751297-753304:minus | GCF_002214165.1 |
Ca. Micrarchaeum sp. A_DKE | 2,003 | GCF_016806735.1:NZ_CP060530.1:203642-205644:minus | GCF_016806735.1 |
Caldivirga maquilingensis strain IC-167 | 1,679 | GCF_000018305.1:NC_009954.1:129150-130828:minus | GCF_000018305.1 |
Aeropyrum camini strain SY1 | 1,650 | GCF_000591035.1:NC_022521.1:1165168-1166817:minus | GCF_000591035.1 |
Pyrolobus fumarii strain 1A | 1,576 | GCF_000223395.1:NC_015931.1:84671-86246:minus | GCF_000223395.1 |
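
Judging from the table above, a RiboGrove sequence ID appears to consist of four colon-separated fields: assembly accession, RefSeq sequence accession, 1-based inclusive coordinates, and strand. The sketch below parses an ID under that assumption, which is inferred from the table and not taken from a format specification. The class and function names are illustrative.

```python
from typing import NamedTuple


class SeqId(NamedTuple):
    assembly: str   # e.g. GCF_019974355.1
    replicon: str   # RefSeq accession of the chromosome or plasmid
    start: int      # assumed 1-based, inclusive
    end: int        # assumed 1-based, inclusive
    strand: str     # "plus" or "minus"

    @property
    def length(self):
        return self.end - self.start + 1


def parse_seq_id(seq_id):
    """Split an ID of the form assembly:replicon:start-end:strand."""
    assembly, replicon, coords, strand = seq_id.split(":")
    start, end = (int(value) for value in coords.split("-"))
    if strand not in ("plus", "minus"):
        raise ValueError(f"unexpected strand: {strand!r}")
    return SeqId(assembly, replicon, start, end, strand)


parsed = parse_seq_id("GCF_019974355.1:NZ_AP024929.1:249100-251537:minus")
print(parsed.length)  # 2438
```

The computed length, 2,438 bp, agrees with the gene length listed for Thermus thermophilus strain AA2-2, which supports the inclusive-coordinate reading.
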
Organism | Gene length (bp) | RiboGrove Sequence ID(s) | Assembly accession |
---|---|---|---|
Bacteria | |||
Anabaena sp. YBS01 | 1,401 | GCF_009498015.1:NZ_CP034058.1:6920299-6921699:minus | GCF_009498015.1 |
Clostridioides difficile strain TW11 | 1,426 | GCF_009362915.1:NZ_CP045224.1:4068440-4069865:minus | GCF_009362915.1 |
Staphylococcus warneri strain TWSL_1 | 1,440 | GCF_032147125.1:NZ_CP135051.1:2625669-2627108:plus | GCF_032147125.1 |
Roseicitreum antarcticum strain ZS2-28 | 1,447 | GCF_014681765.1:NZ_CP061498.1:3436150-3437596:plus | GCF_014681765.1 |
Hirschia baltica strain ATCC 49814 | 1,448 | GCF_000023785.1:NC_012982.1:2336679-2338126:minus | GCF_000023785.1 |
Sagittula sp. P11 | 1,449 | GCF_002814095.1:NZ_CP021913.1:3597920-3599368:plus GCF_002814095.1:NZ_CP021913.1:2386837-2388285:plus | GCF_002814095.1 |
Mameliella sp. | 1,449 | GCF_965212485.1:NZ_OZ243118.1:780420-781868:minus GCF_965212485.1:NZ_OZ243118.1:3042962-3044410:plus GCF_965212485.1:NZ_OZ243118.1:4611080-4612528:minus | GCF_965212485.1 |
Mameliella sp. | 1,449 | GCF_965249415.1:NZ_OZ252233.1:702863-704311:plus GCF_965249415.1:NZ_OZ252233.1:1895495-1896943:plus GCF_965249415.1:NZ_OZ252233.1:3463560-3465008:minus | GCF_965249415.1 |
Sagittula sp. MA-2 | 1,449 | GCF_030126985.1:NZ_CP126145.1:439-1887:plus GCF_030126985.1:NZ_CP126145.1:2907211-2908659:minus | GCF_030126985.1 |
Sagittula stellata strain E-37 | 1,449 | GCF_039724765.1:NZ_CP155729.1:664616-666064:plus GCF_039724765.1:NZ_CP155729.1:1804792-1806240:plus | GCF_039724765.1 |
Mameliella alba strain KU6B | 1,449 | GCF_011405015.1:NZ_AP022337.1:1420943-1422391:plus GCF_011405015.1:NZ_AP022337.1:3191212-3192660:minus GCF_011405015.1:NZ_AP022337.1:267140-268588:plus | GCF_011405015.1 |
Archaea | |||
Ignicoccus hospitalis strain KIN4/I | 1,439 | GCF_000017945.1:NC_009776.1:728362-729800:plus | GCF_000017945.1 |
Methanocaldococcus lauensis strain SG7 | 1,457 | GCF_902827225.1:NZ_LR792632.1:542755-544211:plus | GCF_902827225.1 |
Halorubrum sp. BOL3-1 | 1,463 | GCF_004114375.1:NZ_CP034692.1:397753-399215:minus | GCF_004114375.1 |
Salinirubellus litoreus strain SYNS196 | 1,466 | GCF_037335815.1:NZ_CP147841.1:597195-598660:minus | GCF_037335815.1 |
Natronomonas marina strain ZY43 | 1,466 | GCF_024298905.1:NZ_CP101154.1:18680-20145:plus | GCF_024298905.1 |
Natronomonas gomsonensis strain KCTC 4088 | 1,466 | GCF_024300825.1:NZ_CP101323.1:2500564-2502029:plus | GCF_024300825.1 |
Ca. Methanomethylophilus alvi strain Mx1201 | 1,466 | GCF_000300255.2:NC_020913.1:283607-285072:plus | GCF_000300255.2 |
Salinirubellus salinus strain ZS-35-S2 | 1,466 | GCF_025231485.1:NZ_CP104003.1:3070232-3071697:plus | GCF_025231485.1 |
Methanomethylophilus alvi strain MGYG-HGUT-02456 | 1,466 | GCF_902387285.1:NZ_LR699000.1:283607-285072:plus | GCF_902387285.1 |
Methanospirillum purgamenti strain J.3.6.1-F.2.7.3 | 1,466 | GCF_018502485.1:NZ_CP075546.1:133354-134819:plus GCF_018502485.1:NZ_CP075546.1:825954-827419:plus GCF_018502485.1:NZ_CP075546.1:872641-874106:plus GCF_018502485.1:NZ_CP075546.1:1727419-1728884:plus | GCF_018502485.1 |
Methanospirillum stamsii strain Pt1 | 1,466 | GCF_046244385.1:NZ_CP176366.1:1311724-1313189:plus GCF_046244385.1:NZ_CP176366.1:2035802-2037267:plus GCF_046244385.1:NZ_CP176366.1:2042927-2044392:plus GCF_046244385.1:NZ_CP176366.1:3625347-3626812:minus | GCF_046244385.1 |
Methanomethylophilus alvi strain Mx-05 | 1,466 | GCF_003711245.1:NZ_CP017686.1:283608-285073:plus | GCF_003711245.1 |
Natronomonas halophila strain C90 | 1,466 | GCF_013391085.1:NZ_CP058334.1:1530622-1532087:minus | GCF_013391085.1 |
Methanospirillum purgamenti strain GP1 | 1,466 | GCF_019263745.1:NZ_CP077107.1:4649-6114:plus GCF_019263745.1:NZ_CP077107.1:1359562-1361027:minus GCF_019263745.1:NZ_CP077107.1:1365502-1366967:minus GCF_019263745.1:NZ_CP077107.1:1986020-1987485:minus | GCF_019263745.1 |
Methanospirillum hungatei strain JF-1 | 1,466 | GCF_000013445.1:NC_007796.1:39814-41279:plus GCF_000013445.1:NC_007796.1:1301079-1302544:minus GCF_000013445.1:NC_007796.1:3501525-3502990:minus GCF_000013445.1:NC_007796.1:3507609-3509074:minus | GCF_000013445.1 |
Organism | Copy number | Assembly accession | |
---|---|---|---|
Bacteria | |||
Tumebacillus avium strain AR23208 | 37 | GCF_002162355.1 | |
Tumebacillus algifaecis strain THMBR28 | 27 | GCF_002243515.1 | |
Photobacterium piscicola strain WVL24019 | 25 | GCF_046058925.1 | |
Photobacterium phosphoreum strain MIP2473 | 24 | GCF_949787665.1 | |
Mesobacillus maritimus strain ADH-29 | 22 | GCF_044803185.1 | |
Photobacterium damselae strain Pdd1411 | 21 | GCF_030168855.1 | |
Photobacterium damselae strain Phdp Wu-1 | 21 | GCF_003130755.1 | |
Photobacterium leiognathi strain Sr3.10 | 21 | GCF_048537505.1 | |
Aneurinibacillus sp. Ricciae_BoGa-3 | 21 | GCF_028421645.1 | |
Photobacterium leiognathi strain Sr3.21 | 21 | GCF_048537525.1 | |
Peribacillus asahii strain KF4 | 21 | GCF_023823975.1 | |
Archaea | |||
Natronorubrum aibiense strain 7-3 | 5 | GCF_009392895.1 | |
Methanococcoides orientis strain LMO-1 | 5 | GCF_021184045.1 | |
Natrinema sp. SYSU A 869 | 5 | GCF_019879105.1 | |
Methanolobus sp. ZRKC3 | 5 | GCF_045291275.1 | |
Natronorubrum bangense strain JCM 10635 | 5 | GCF_004799645.1 | |
Methanoplanus endosymbiosus strain DSM 3599 | 5 | GCF_024662215.1 | |
Halomicrobium urmianum strain IBRC-M: 10911 | 4 | GCF_020217425.1 | |
Halomicrobium salinisoli strain LT50 | 4 | GCF_020405185.1 | |
Halomicrobium salinisoli strain TH30 | 4 | GCF_020405245.1 | |
Methanospirillum purgamenti strain J.3.6.1-F.2.7.3 | 4 | GCF_018502485.1 | |
Haloarcula sinaiiensis strain ATCC 33800 | 4 | GCF_018200015.1 | |
Haloterrigena salifodinae strain BOL5-1 | 4 | GCF_016906025.1 | |
Methanolobus sediminis strain FTZ6 | 4 | GCF_031312595.1 | |
Methanogenium sp. S4BF | 4 | GCF_029633965.1 | |
Methanospirillum hungatei strain JF-1 | 4 | GCF_000013445.1 | |
Natronococcus occultus strain SP4 | 4 | GCF_000328685.1 | |
Methanosphaera stadtmanae strain MGYG-HGUT-02164 | 4 | GCF_902384015.1 | |
Methanolobus sp. WCC4 | 4 | GCF_038022665.1 | |
Methanochimaera problematica strain FWC-SCC4 | 4 | GCF_032878975.1 | |
Methanolobus mangrovi strain FTZ2 | 4 | GCF_031312535.1 | |
Methanococcus vannielii strain SB | 4 | GCF_000017165.1 | |
Methanospirillum lacunae strain Ki8-1 | 4 | GCF_046195335.1 | |
Methanosphaera stadtmanae strain DSM 3091 | 4 | GCF_000012545.1 | |
Methanospirillum purgamenti strain GP1 | 4 | GCF_019263745.1 | |
Natrinema thermotolerans strain A29 | 4 | GCF_031165565.1 | |
Methanospirillum stamsii strain Pt1 | 4 | GCF_046244385.1 | |
Methanogenium organophilum strain DSM 3596 | 4 | GCF_026684035.1 |
Organism | Sum of entropy * (bits) | Mean entropy * (bits) | Number of variable positions | Gene copy number | Assembly accession |
---|---|---|---|---|---|
Bacteria | |||||
Clostridium perfringens strain A SNU21005 | 780.95 | 0.41 | 1,171 | 9 | GCF_047150065.1 |
Escherichia coli strain P276M | 433.81 | 0.26 | 569 | 6 | GCF_009762385.1 |
Listeria monocytogenes strain 10-092876-1155 LM6 | 357.10 | 0.20 | 370 | 3 | GCF_001999045.1 |
Klebsiella pneumoniae strain GZ-1 | 304.27 | 0.18 | 464 | 8 | GCF_014854815.1 |
Streptococcus infantis strain SO | 291.50 | 0.18 | 308 | 3 | GCF_021497965.1 |
Synechococcus sp. NB0720_010 | 243.35 | 0.16 | 265 | 3 | GCF_023078835.1 |
Streptomyces griseorubiginosus strain NBC_00586 | 231.55 | 0.15 | 342 | 6 | GCF_036345135.1 |
Caminibacter mediatlanticus strain TB-2 | 228.78 | 0.15 | 282 | 4 | GCF_005843985.1 |
Xanthomonas oryzae strain YNCX | 227.74 | 0.15 | 248 | 3 | GCF_024499285.1 |
Sporomusa termitida strain DSM 4440 | 226.25 | 0.13 | 247 | 12 | GCF_007641255.1 |
Archaea | |||||
Halomicrobium sp. ZPS1 ** | 137.00 | 0.09 | 137 | 2 | GCF_009217585.1 |
Halomicrobium urmianum strain IBRC-M: 10911 | 131.55 | 0.09 | 146 | 4 | GCF_020217425.1 |
Halapricum desulfuricans strain HSR12-2 | 128.00 | 0.09 | 128 | 2 | GCF_017094525.1 |
Halomicrobium salinisoli strain TH30 | 127.74 | 0.09 | 145 | 4 | GCF_020405245.1 |
Halapricum desulfuricans strain HSR-Bgl | 127.00 | 0.09 | 127 | 2 | GCF_017094445.1 |
Halomicrobium mukohataei strain JP60 | 125.81 | 0.09 | 137 | 3 | GCF_004803735.1 |
Halomicrobium sp. HM KBTZ05 | 124.38 | 0.08 | 134 | 3 | GCF_041530035.1 |
Halomicrobium salinisoli strain LT50 | 123.31 | 0.08 | 140 | 4 | GCF_020405185.1 |
Halapricum desulfuricans strain HSR-Est | 111.00 | 0.08 | 111 | 2 | GCF_017094465.1 |
Halapricum desulfuricans strain HSR12-1 | 109.00 | 0.07 | 109 | 2 | GCF_017094505.1 |
* Entropy is Shannon entropy calculated for each column of the multiple sequence alignment (MSA) of all full-length 16S rRNA genes of a genome. Entropy is then summed up (column “Sum of entropy”) and averaged (column “Mean entropy”).
** Halomicrobium sp. ZPS1 is a remarkable case. This genome harbours two 16S rRNA genes, so every mismatching alignment column contributes exactly 1 bit, and the sum of entropy equals the number of mismatching nucleotides between the two gene sequences. Accordingly, the percent identity between these two gene sequences is only 90.70%. This is notable because the usual (albeit arbitrary) genus demarcation threshold for percent identity is 95%.
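The entropy metrics defined above can be sketched in a few lines. This is an illustration on a toy alignment, not the RiboGrove implementation: it assumes the sequences are already aligned to equal length and treats a gap character as just another symbol, which is an assumption of this sketch.

```python
from collections import Counter
from math import log2


def column_entropy(column):
    """Shannon entropy (bits) of the symbols in one alignment column."""
    total = len(column)
    counts = Counter(column)
    return -sum((n / total) * log2(n / total) for n in counts.values())


def alignment_entropy(aligned_seqs):
    """Return (sum of entropy, mean entropy, number of variable positions)."""
    if len({len(seq) for seq in aligned_seqs}) != 1:
        raise ValueError("all aligned sequences must have the same length")
    per_column = [column_entropy(col) for col in zip(*aligned_seqs)]
    total = sum(per_column)
    variable = sum(1 for h in per_column if h > 0)
    return total, total / len(per_column), variable


# Toy alignment of two gene copies that differ at two positions.
toy_msa = ["ACGTACGTAC",
           "ACGTTCGTAA"]
entropy_sum, entropy_mean, n_variable = alignment_entropy(toy_msa)
print(entropy_sum, entropy_mean, n_variable)  # 2.0 0.2 2
```

With only two sequences, a mismatching column holds two symbols at frequency 0.5 each and therefore scores exactly 1 bit, which is why the sum of entropy equals the number of mismatches in the two-gene case.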
* Coverage of a primer pair is the percent of genomes having at least one 16S rRNA gene which can be amplified by PCR using this primer pair. For details, see our paper about RiboGrove.
The tables below show the coverage of primer pairs that are commonly used to amplify bacterial and archaeal 16S rRNA genes (“bacterial” and “archaeal” primers).
You can find a more detailed table in the file primer_pair_genomic_coverage.tsv in the metadata. That table contains coverage not just for phyla, but also for each class, order, family, genus, and species. Moreover, that table contains coverage values for additional primer pairs, namely 1115F-1492R, 349f-519r, 1106F-Ar1378R, 1106F-SSU1492Rngs, SSU1ArF-SSU468R, SSU1ArF-SSU520R. In the tables below, they are omitted for brevity.
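To make the coverage definition concrete, here is a deliberately simplified sketch. It counts a gene as amplifiable if the forward primer and the reverse complement of the reverse primer both match exactly (allowing IUPAC degenerate bases) with the forward site upstream of the reverse site. This exact-match rule is an assumption of the sketch and is not the PCR simulation used for RiboGrove; see the paper for the actual method. The sketch also does not handle inosine (`I`), which occurs in primer UA1204R. The toy gene sequences are invented.

```python
import re

# IUPAC nucleotide codes expanded to regular-expression character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}
COMPLEMENT = str.maketrans("ACGTRYSWKMBDHVN", "TGCAYRSWMKVHDBN")


def primer_regex(primer):
    return "".join(IUPAC[base] for base in primer.upper())


def reverse_complement(seq):
    return seq.upper().translate(COMPLEMENT)[::-1]


def is_amplified(gene, forward, reverse):
    """True if both primer sites occur, forward site first (exact match)."""
    fwd = re.search(primer_regex(forward), gene)
    if fwd is None:
        return False
    rev_site = primer_regex(reverse_complement(reverse))
    return re.search(rev_site, gene[fwd.end():]) is not None


def coverage(genomes, forward, reverse):
    """Percent of genomes with at least one amplifiable gene.

    genomes: dict mapping a genome ID to a list of gene sequences.
    """
    hits = sum(
        any(is_amplified(gene, forward, reverse) for gene in genes)
        for genes in genomes.values()
    )
    return 100 * hits / len(genomes)


F27, R338 = "AGAGTTTGATYMTGGCTCAG", "GCTGCCTCCCGTAGGAGT"
good = "AGAGTTTGATCCTGGCTCAG" + "ACGT" * 5 + "ACTCCTACGGGAGGCAGC"
bad = "AGAGTTTGATGGTGGCTCAG" + "ACGT" * 5 + "ACTCCTACGGGAGGCAGC"
toy_genomes = {"g1": [good], "g2": [bad, good], "g3": [bad]}
toy_coverage = coverage(toy_genomes, F27, R338)
```

In the toy data, genome `g2` counts as covered because one of its two genes matches, while `g3` does not, so two of three genomes are covered.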
Phylum | Number of genomes | Full gene: 27F–1492R (%) | V1–V2: 27F–338R (%) | V1–V3: 27F–534R (%) | V3–V4: 341F–785R (%) | V3–V5: 341F–944R (%) | V4: 515F–806R (%) | V4–V5: 515F–944R (%) | V4–V6: 515F–1100R (%) | V5–V6: 784F–1100R (%) | V5–V7: 784F–1193R (%) | V6–V7: 939F–1193R (%) | V6–V8: 939F–1378R (%) |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Pseudomonadota | 26,698 | 99.70 | 99.50 | 99.68 | 99.93 | 84.03 | 99.89 | 84.10 | 88.96 | 88.65 | 93.47 | 92.52 | 96.43 |
Bacillota | 11,206 | 99.83 | 99.75 | 99.79 | 99.93 | 95.23 | 99.97 | 95.10 | 99.46 | 98.08 | 97.50 | 98.63 | 99.37 |
Actinomycetota | 4,976 | 99.90 | 99.14 | 99.72 | 94.82 | 67.02 | 94.61 | 66.78 | 96.91 | 99.76 | 99.84 | 99.84 | 96.93 |
Bacteroidota | 1,681 | 96.43 | 96.07 | 96.55 | 99.94 | 64.78 | 99.41 | 64.37 | 37.89 | 38.01 | 92.44 | 91.97 | 95.48 |
Campylobacterota | 1,314 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 99.92 | 99.92 | 99.92 | 99.47 | 99.47 | 99.70 | 99.54 |
Mycoplasmatota | 751 | 90.28 | 83.89 | 72.30 | 98.93 | 91.21 | 99.07 | 91.61 | 74.43 | 48.34 | 42.74 | 76.43 | 0.67 |
Spirochaetota | 398 | 54.27 | 54.77 | 54.77 | 93.22 | 99.75 | 93.22 | 99.75 | 99.75 | 75.38 | 75.38 | 90.20 | 43.47 |
Cyanobacteriota | 370 | 99.73 | 99.73 | 99.73 | 100.00 | 3.78 | 100.00 | 3.78 | 100.00 | 1.08 | 1.08 | 100.00 | 99.73 |
Chlamydiota | 234 | 0.00 | 0.00 | 0.00 | 100.00 | 100.00 | 0.00 | 0.00 | 0.00 | 100.00 | 100.00 | 100.00 | 94.44 |
Fusobacteriota | 229 | 100.00 | 98.69 | 99.56 | 99.56 | 99.56 | 99.56 | 99.56 | 99.56 | 99.56 | 99.56 | 100.00 | 0.00 |
Thermodesulfobacteriota | 147 | 100.00 | 99.32 | 100.00 | 100.00 | 41.50 | 100.00 | 41.50 | 100.00 | 95.24 | 91.16 | 95.92 | 99.32 |
Verrucomicrobiota | 140 | 99.29 | 0.00 | 99.29 | 100.00 | 12.86 | 100.00 | 12.86 | 100.00 | 1.43 | 1.43 | 98.57 | 98.57 |
Deinococcota | 97 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 0.00 | 0.00 | 52.58 | 100.00 |
Planctomycetota | 72 | 100.00 | 25.00 | 100.00 | 100.00 | 62.50 | 100.00 | 62.50 | 0.00 | 0.00 | 0.00 | 2.78 | 0.00 |
Myxococcota | 65 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 |
Chloroflexota | 52 | 100.00 | 92.31 | 100.00 | 42.31 | 0.00 | 94.23 | 0.00 | 90.38 | 11.54 | 11.54 | 94.23 | 26.92 |
Bdellovibrionota | 44 | 100.00 | 100.00 | 100.00 | 100.00 | 77.27 | 100.00 | 77.27 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 |
Thermotogota | 44 | 100.00 | 97.73 | 100.00 | 100.00 | 9.09 | 100.00 | 9.09 | 100.00 | 0.00 | 0.00 | 59.09 | 97.73 |
Acidobacteriota | 43 | 97.67 | 97.67 | 97.67 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 72.09 | 58.14 | 86.05 | 100.00 |
Aquificota | 18 | 100.00 | 16.67 | 100.00 | 100.00 | 16.67 | 100.00 | 16.67 | 100.00 | 0.00 | 0.00 | 0.00 | 16.67 |
Rhodothermota | 16 | 43.75 | 43.75 | 43.75 | 100.00 | 100.00 | 100.00 | 100.00 | 81.25 | 81.25 | 100.00 | 100.00 | 100.00 |
Chlorobiota | 15 | 100.00 | 100.00 | 100.00 | 100.00 | 0.00 | 0.00 | 0.00 | 0.00 | 100.00 | 93.33 | 86.67 | 6.67 |
Nitrospirota | 15 | 100.00 | 100.00 | 100.00 | 100.00 | 73.33 | 100.00 | 73.33 | 100.00 | 100.00 | 73.33 | 73.33 | 100.00 |
Ca. Saccharibacteria | 12 | 100.00 | 100.00 | 100.00 | 100.00 | 8.33 | 8.33 | 8.33 | 8.33 | 0.00 | 0.00 | 100.00 | 100.00 |
Gemmatimonadota | 12 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 |
Synergistota | 10 | 100.00 | 100.00 | 100.00 | 100.00 | 0.00 | 100.00 | 0.00 | 100.00 | 0.00 | 0.00 | 100.00 | 100.00 |
Deferribacterota | 6 | 100.00 | 100.00 | 100.00 | 100.00 | 0.00 | 100.00 | 0.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 |
Elusimicrobiota | 5 | 100.00 | 60.00 | 100.00 | 100.00 | 0.00 | 100.00 | 0.00 | 100.00 | 60.00 | 60.00 | 100.00 | 100.00 |
Atribacterota | 3 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 0.00 | 0.00 | 100.00 | 100.00 |
Ignavibacteriota | 3 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 |
Armatimonadota | 2 | 100.00 | 50.00 | 100.00 | 50.00 | 50.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 |
Thermodesulfobiota | 2 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 0.00 | 0.00 | 100.00 |
Thermomicrobiota | 2 | 100.00 | 100.00 | 100.00 | 100.00 | 0.00 | 100.00 | 0.00 | 100.00 | 0.00 | 0.00 | 50.00 | 50.00 |
Balneolota | 2 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 |
Chrysiogenota | 2 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 |
Dictyoglomota | 2 | 100.00 | 100.00 | 100.00 | 100.00 | 0.00 | 100.00 | 0.00 | 100.00 | 0.00 | 0.00 | 100.00 | 0.00 |
Fibrobacterota | 2 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 |
Kiritimatiellota | 2 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 0.00 | 0.00 | 100.00 | 100.00 |
Ca. Fervidibacterota | 1 | 100.00 | 0.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 0.00 | 0.00 | 100.00 | 100.00 |
Ca. Cloacimonadota | 1 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 |
Ca. Bipolaricaulota | 1 | 0.00 | 0.00 | 0.00 | 100.00 | 100.00 | 100.00 | 100.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 |
Ca. Absconditibacteriota | 1 | 100.00 | 0.00 | 100.00 | 100.00 | 0.00 | 100.00 | 0.00 | 0.00 | 0.00 | 100.00 | 0.00 | 0.00 |
Calditrichota | 1 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 |
Caldisericota | 1 | 100.00 | 100.00 | 100.00 | 100.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 100.00 | 100.00 | 100.00 |
Ca. Omnitrophota | 1 | 100.00 | 100.00 | 100.00 | 100.00 | 0.00 | 100.00 | 0.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 |
Ca. Paceibacterota | 1 | 0.00 | 0.00 | 0.00 | 100.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 |
Vulcanimicrobiota | 1 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 0.00 | 0.00 | 100.00 | 100.00 |
Thermosulfidibacterota | 1 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 0.00 | 0.00 | 100.00 | 100.00 |
Nitrospinota | 1 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 0.00 | 0.00 | 100.00 | 100.00 |
Lentisphaerota | 1 | 100.00 | 0.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 0.00 | 0.00 | 100.00 | 100.00 |
Fidelibacterota | 1 | 100.00 | 100.00 | 100.00 | 100.00 | 0.00 | 100.00 | 0.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 |
Coprothermobacterota | 1 | 0.00 | 0.00 | 0.00 | 100.00 | 100.00 | 100.00 | 100.00 | 0.00 | 0.00 | 0.00 | 100.00 | 0.00 |
Phylum | Number of genomes | Full gene: SSU1ArF–SSU1492Rngs (%) | V1–V2: SSU1ArF–SSU280ArR (%) | V1–V3: SSU1ArF–SSU470R (%) | V1–V3: SSU1ArF–A519R (%) | V3–V4: 349f–SSU666ArR (%) | V3–V4: 340f–SSU666ArR (%) | V3–V4: 340f–806rB (%) | V3–V5: 349f–SSU1000ArR (%) | V3–V5: 340f–SSU1000ArR (%) | V4: 515fB–806rB (%) | V4–V5: Parch519f–Arch915r (%) | V5–V7: A751F–UA1204R (%) |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Methanobacteriota | 452 | 89.16 | 86.95 | 89.38 | 89.16 | 51.55 | 50.66 | 100.00 | 99.34 | 100.00 | 100.00 | 99.56 | 89.60 |
Thermoproteota | 107 | 96.26 | 98.13 | 100.00 | 100.00 | 72.90 | 98.13 | 100.00 | 69.16 | 93.46 | 100.00 | 99.07 | 98.13 |
Nitrososphaerota | 30 | 96.67 | 96.67 | 96.67 | 96.67 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 |
Thermoplasmatota | 19 | 84.21 | 68.42 | 100.00 | 100.00 | 42.11 | 42.11 | 100.00 | 63.16 | 84.21 | 100.00 | 100.00 | 52.63 |
Ca. Nanohalarchaeota | 4 | 0.00 | 25.00 | 0.00 | 100.00 | 0.00 | 0.00 | 100.00 | 50.00 | 100.00 | 100.00 | 100.00 | 0.00 |
Ca. Micrarchaeota | 2 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 100.00 | 0.00 | 0.00 | 0.00 |
Nanobdellota | 1 | 100.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 100.00 | 0.00 | 0.00 | 0.00 |
Promethearchaeota | 1 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 0.00 | 0.00 | 100.00 | 100.00 | 100.00 |
Phylum | Number of genomes | Full gene: SSU1ArF–SSU1492Rngs (%) | V1–V2: SSU1ArF–SSU280ArR (%) | V1–V3: SSU1ArF–SSU470R (%) | V1–V3: SSU1ArF–A519R (%) | V3–V4: 349f–SSU666ArR (%) | V3–V4: 340f–SSU666ArR (%) | V3–V4: 340f–806rB (%) | V3–V5: 349f–SSU1000ArR (%) | V3–V5: 340f–SSU1000ArR (%) | V4: 515fB–806rB (%) | V4–V5: Parch519f–Arch915r (%) | V5–V7: A751F–UA1204R (%) |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Pseudomonadota | 26,698 | 1.22 | 0.03 | 0.55 | 0.58 | 0.00 | 0.00 | 0.09 | 0.00 | 0.00 | 99.89 | 28.03 | 0.00 |
Bacillota | 11,206 | 2.54 | 0.05 | 0.13 | 1.45 | 0.02 | 0.00 | 0.06 | 0.01 | 0.00 | 99.97 | 98.42 | 0.00 |
Actinomycetota | 4,976 | 0.94 | 0.24 | 0.74 | 1.21 | 0.00 | 0.00 | 0.04 | 0.00 | 0.00 | 94.61 | 87.64 | 0.00 |
Bacteroidota | 1,681 | 1.90 | 0.00 | 1.84 | 1.96 | 0.00 | 0.00 | 0.18 | 0.00 | 0.00 | 99.41 | 99.29 | 0.00 |
Campylobacterota | 1,314 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 99.92 | 0.15 | 0.00 |
Mycoplasmatota | 751 | 2.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 99.07 | 78.16 | 0.00 |
Spirochaetota | 398 | 0.50 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 93.22 | 92.96 | 0.00 |
Cyanobacteriota | 370 | 2.97 | 0.00 | 0.27 | 0.27 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 100.00 | 100.00 | 0.00 |
Chlamydiota | 234 | 1.71 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 |
Fusobacteriota | 229 | 0.44 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 99.56 | 99.56 | 0.00 |
Thermodesulfobacteriota | 147 | 6.12 | 0.68 | 1.36 | 1.36 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 100.00 | 72.79 | 0.00 |
Verrucomicrobiota | 140 | 5.71 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 100.00 | 10.00 | 0.71 |
Deinococcota | 97 | 39.18 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 100.00 | 96.91 | 0.00 |
Planctomycetota | 72 | 1.39 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 100.00 | 81.94 | 0.00 |
Myxococcota | 65 | 13.85 | 7.69 | 6.15 | 6.15 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 100.00 | 100.00 | 0.00 |
Chloroflexota | 52 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 94.23 | 100.00 | 0.00 |
Bdellovibrionota | 44 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 4.55 | 0.00 | 100.00 | 27.27 | 0.00 |
Thermotogota | 44 | 43.18 | 0.00 | 31.82 | 31.82 | 0.00 | 0.00 | 2.27 | 0.00 | 0.00 | 100.00 | 100.00 | 0.00 |
Acidobacteriota | 43 | 11.63 | 0.00 | 0.00 | 6.98 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 100.00 | 100.00 | 0.00 |
Aquificota | 18 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 100.00 | 83.33 | 44.44 |
Rhodothermota | 16 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 100.00 | 100.00 | 0.00 |
Chlorobiota | 15 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 |
Nitrospirota | 15 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 100.00 | 100.00 | 0.00 |
Ca. Saccharibacteria | 12 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 8.33 | 8.33 | 0.00 |
Gemmatimonadota | 12 | 0.00 | 8.33 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 100.00 | 100.00 | 0.00 |
Synergistota | 10 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 100.00 | 100.00 | 0.00 |
Deferribacterota | 6 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 100.00 | 100.00 | 0.00 |
Elusimicrobiota | 5 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 100.00 | 100.00 | 0.00 |
Atribacterota | 3 | 33.33 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 100.00 | 100.00 | 0.00 |
Ignavibacteriota | 3 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 100.00 | 100.00 | 0.00 |
Armatimonadota | 2 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 100.00 | 50.00 | 0.00 |
Thermodesulfobiota | 2 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 100.00 | 100.00 | 0.00 |
Thermomicrobiota | 2 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 100.00 | 100.00 | 0.00 |
Balneolota | 2 | 50.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 100.00 | 100.00 | 0.00 |
Chrysiogenota | 2 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 100.00 | 100.00 | 0.00 |
Dictyoglomota | 2 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 100.00 | 100.00 | 0.00 |
Fibrobacterota | 2 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 100.00 | 100.00 | 0.00 |
Kiritimatiellota | 2 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 100.00 | 100.00 | 0.00 |
Ca. Fervidibacterota | 1 | 100.00 | 0.00 | 0.00 | 100.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 100.00 | 0.00 | 0.00 |
Ca. Cloacimonadota | 1 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 100.00 | 100.00 | 0.00 |
Ca. Bipolaricaulota | 1 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 100.00 | 0.00 | 0.00 |
Ca. Absconditibacteriota | 1 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 100.00 | 0.00 | 0.00 |
Calditrichota | 1 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 100.00 | 100.00 | 0.00 |
Caldisericota | 1 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 |
Ca. Omnitrophota | 1 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 100.00 | 100.00 | 0.00 |
Ca. Paceibacterota | 1 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 |
Vulcanimicrobiota | 1 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 100.00 | 100.00 | 0.00 |
Thermosulfidibacterota | 1 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 100.00 | 100.00 | 0.00 |
Nitrospinota | 1 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 100.00 | 100.00 | 0.00 |
Lentisphaerota | 1 | 100.00 | 0.00 | 100.00 | 100.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 100.00 | 100.00 | 0.00 |
Fidelibacterota | 1 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 100.00 | 100.00 | 0.00 |
Coprothermobacterota | 1 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 100.00 | 100.00 | 0.00 |
Phylum | Number of genomes | Full gene: 27F–1492R (%) | V1–V2: 27F–338R (%) | V1–V3: 27F–534R (%) | V3–V4: 341F–785R (%) | V3–V5: 341F–944R (%) | V4: 515F–806R (%) | V4–V5: 515F–944R (%) | V4–V6: 515F–1100R (%) | V5–V6: 784F–1100R (%) | V5–V7: 784F–1193R (%) | V6–V7: 939F–1193R (%) | V6–V8: 939F–1378R (%) |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Methanobacteriota | 452 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 100.00 | 0.00 | 82.08 | 0.00 | 0.00 | 0.00 | 0.00 |
Thermoproteota | 107 | 0.93 | 0.00 | 0.00 | 0.00 | 0.00 | 100.00 | 0.00 | 88.79 | 0.00 | 0.00 | 0.00 | 0.00 |
Nitrososphaerota | 30 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 100.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 |
Thermoplasmatota | 19 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 100.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 |
Ca. Nanohalarchaeota | 4 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 100.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 |
Ca. Micrarchaeota | 2 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 |
Nanobdellota | 1 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 |
Promethearchaeota | 1 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 100.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 |
Primer name | Sequence | Reference |
---|---|---|
27F | AGAGTTTGATYMTGGCTCAG | Frank et al., 2008 |
338R | GCTGCCTCCCGTAGGAGT | Suzuki et al., 1996 |
341F * | CCTACGGGNGGCWGCAG | Klindworth et al., 2013 |
515F | GTGCCAGCMGCCGCGGTAA | Turner et al., 1999 |
534R | ATTACCGCGGCTGCTGG | Walker et al., 2015 |
784F | AGGATTAGATACCCTGGTA | Andersson et al., 2008 |
785R * | GACTACHVGGGTATCTAATCC | Klindworth et al., 2013 |
806R | GGACTACHVGGGTWTCTAAT | Caporaso et al., 2010 |
939F | GAATTGACGGGGGCCCGCACAAG | Lebuhn et al., 2014 |
944R | GAATTAAACCACATGCTC | Fuks et al., 2018 |
1100R | AGGGTTGCGCTCGTTG | Turner et al., 1999 |
1193R | ACGTCATCCCCACCTTCC | Bodenhausen et al., 2013 |
1378R | CGGTGTGTACAAGGCCCGGGAACG | Lebuhn et al., 2014 |
1492R | TACCTTGTTACGACTT | Frank et al., 2008 |
SSU1ArF | TCCGGTTGATCCYGCBRG | Bahram et al., 2018 |
SSU520R | GCTACGRRYGYTTTARRC | Bahram et al., 2018 |
340f | CCCTAYGGGGYGCASCAG | Gantner et al., 2011 |
806rB | GGACTACNVGGGTWTCTAAT | Apprill et al., 2015 |
349f | GYGCASCAGKCGMGAAW | Takai and Horikoshi, 2000 |
519r | TTACCGCGGCKGCTG | Klindworth et al., 2013 |
515fB | GTGYCAGCMGCCGCGGTAA | Parada et al., 2015 |
Parch519f | CAGCCGCCGCGGTAA | Øvreås et al., 1997 |
Arch915r | GTGCTCCCCCGCCAATTCCT | Raskin et al., 1994 |
1106F | TTWAGTCAGGCAACGAGC | Watanabe et al., 2007 |
Ar1378R ** | TGTGCAAGGAGCAGGGAC | Watanabe et al., 2007 |
A751F | CCGACGGTGAGRGRYGAA | Baker et al., 2003 |
SSU1492Rngs | CGGNTACCTTGTKACGAC | Bahram et al., 2018 |
SSU280ArR | TCAGWNYCCNWCTCSRGG | Bahram et al., 2018 |
SSU470R | DCNGCNGGTDTTACCGCG | Bahram et al., 2018 |
SSU468R | GNDCNGCNGGTDTTACCG | Bahram et al., 2018 |
A519R | GGTDTTACCGCGGCKGCTG | Wang and Qian, 2009 |
SSU666ArR | HGCYTTCGCCACHGGTRG | Bahram et al., 2018 |
SSU1000ArR | GGCCATGCAMYWCCTCTC | Bahram et al., 2018 |
UA1204R | TTMGGGGCATRCIKACCT | Baker et al., 2003 |
* Primers 341F and 785R are used in the protocol for library preparation for sequencing of the V3–V4 region of 16S rRNA genes on Illumina MiSeq.
** Ar1378R was originally named 1378R. We use an amended name to avoid confusion with the primer 1378R listed above.
RiboGrove is a very minimalistic database: it comprises a collection of plain fasta files with metadata. Thus, extended search instruments are not available for it. We acknowledge this limitation and provide a list of suggestions below to help you explore and select RiboGrove data.
Headers in RiboGrove fasta files have the following format:
>GCF_000978375.1:NZ_CP009686.1:8908-10459:plus ;d__Bacteria;p__Firmicutes;c__Bacilli;o__Bacillales;f__Bacillaceae;g__Bacillus;s__cereus; category:1
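Major blocks of a header are separated by spaces. A header consists of three such blocks: the sequence ID (SeqID), which precedes the first space; the taxonomy string, which starts with “;d__”; and the genome category (“category:1” in the header above).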
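The header layout can be unpacked with a few lines of shell. This is a minimal sketch based only on the example header above. It assumes that the three blocks are separated by single spaces, that the taxonomy block itself contains no spaces, and that the SeqID holds four colon-separated fields (Assembly accession, RefSeq Accession.Version, coordinates, strand). All variable names are illustrative.

```shell
# Sketch: split a RiboGrove header into its parts (POSIX sh).
# Assumption: blocks are separated by single spaces and the taxonomy block has no spaces.
header='GCF_000978375.1:NZ_CP009686.1:8908-10459:plus ;d__Bacteria;p__Firmicutes;c__Bacilli;o__Bacillales;f__Bacillaceae;g__Bacillus;s__cereus; category:1'

seqid=${header%% *}              # first block: everything before the first space
rest=${header#* }                # the header without the first block
taxonomy=${rest%% *}             # second block: the taxonomy string
category=${header##*category:}   # third block: the value after "category:"

# The SeqID is read here as four colon-separated fields.
assembly=$(printf '%s\n' "$seqid" | cut -d':' -f1)
accession=$(printf '%s\n' "$seqid" | cut -d':' -f2)
coords=$(printf '%s\n' "$seqid" | cut -d':' -f3)
strand=$(printf '%s\n' "$seqid" | cut -d':' -f4)

printf 'assembly=%s\naccession=%s\ncoords=%s\nstrand=%s\ncategory=%s\n' \
    "$assembly" "$accession" "$coords" "$strand" "$category"
```

The same cut calls reappear in the header-extraction examples further down, where they are applied to the output of seqkit seq.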
You can select specific sequences from fasta files using the Seqkit program (GitHub repo, documentation). It is free, cross-platform, multifunctional and fast, and it can process both gzipped and uncompressed fasta files. The seqkit grep and seqkit seq subcommands are useful for sequence selection.
Given the downloaded fasta file ribogrove_24.230_sequences.fasta.gz, consider the following examples of sequence selection using seqkit grep:
Example 1. Select a single sequence by SeqID.
seqkit grep -p "GCF_000978375.1:NZ_CP009686.1:8908-10459:plus" ribogrove_24.230_sequences.fasta.gz
The -p option sets the pattern to search for. By default, seqkit grep matches the pattern against sequence IDs only, not against whole headers.
Example 2. Select all gene sequences of a single RefSeq genomic sequence by accession number NZ_CP009686.1.
seqkit grep -nrp ":NZ_CP009686.1:" ribogrove_24.230_sequences.fasta.gz
Here, two more options are required: -n and -r. The former tells the program to match against whole headers instead of IDs only. The latter tells the program to treat the pattern as a regular expression, so a sequence is selected if the pattern matches any substring of its header.
To ensure search specificity, surround the Accession.Version with colons (:).
Example 3. Select all gene sequences of a single genome (Assembly accession GCF_019357495.1).
seqkit grep -nrp "GCF_019357495.1:" ribogrove_24.230_sequences.fasta.gz
To ensure search specificity, put a colon (:) after the assembly accession.
Example 4. Select all actinobacterial sequences.
seqkit grep -nrp ";p__Actinobacteria;" ribogrove_24.230_sequences.fasta.gz
To ensure search specificity, surround the taxonomy name with semicolons (;).
Example 5. Select all sequences originating from category 1 genomes.
seqkit grep -nrp "category:1" ribogrove_24.230_sequences.fasta.gz
Example 6. Select all sequences except for those belonging to Firmicutes.
seqkit grep -nvrp ";p__Firmicutes;" ribogrove_24.230_sequences.fasta.gz
Note the -v option within the option group -nvrp. This option inverts the match, i.e. the output will comprise the sequences whose headers do not contain the substring “;p__Firmicutes;”.
You can use the seqkit seq program to select sequences by length.
Example 1. Select all sequences longer than 1600 bp.
seqkit seq -m 1601 ribogrove_24.230_sequences.fasta.gz
The -m option sets the minimum length of a sequence to be printed to output.
Example 2. Select all sequences shorter than 1500 bp.
seqkit seq -M 1499 ribogrove_24.230_sequences.fasta.gz
The -M option sets the maximum length of a sequence to be printed to output.
Example 3. Select all sequences having length in range [1500, 1600] bp.
seqkit seq -m 1500 -M 1600 ribogrove_24.230_sequences.fasta.gz
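Both bounds are inclusive: -m 1500 -M 1600 keeps sequences of exactly 1500 bp and exactly 1600 bp as well. For readers who want to see that logic spelled out, or who do not have seqkit at hand, the following sketch does roughly the same with plain awk. It is not seqkit and not part of RiboGrove. The function name filter_by_length is illustrative, and the output is written with each sequence on a single line.

```shell
# Sketch: keep fasta records whose sequence length lies in [min, max], inclusive.
# Reads uncompressed fasta from standard input; writes unwrapped fasta to standard output.
filter_by_length() {
    awk -v min="$1" -v max="$2" '
        # Print the record collected so far if its length is within the bounds.
        function emit() {
            if (header != "" && length(seq) >= min && length(seq) <= max)
                print header "\n" seq
        }
        /^>/ { emit(); header = $0; seq = ""; next }   # a new record starts
             { seq = seq $0 }                          # collect sequence lines
        END  { emit() }                                # do not forget the last record
    '
}

# Usage, assuming the release file is in the current directory:
# gzip -dc ribogrove_24.230_sequences.fasta.gz | filter_by_length 1500 1600
```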
It is sometimes useful to retrieve only header information from a fasta file. You can use the seqkit seq program for it.
Example 1. Select all headers.
seqkit seq -n ribogrove_24.230_sequences.fasta.gz
The -n option tells the program to output only headers.
Example 2. Select all SeqIDs (header parts before the first space).
seqkit seq -ni ribogrove_24.230_sequences.fasta.gz
The -i option tells the program to output only sequence IDs.
Example 3. Select all RefSeq “Accession.Version”s.
seqkit seq -ni ribogrove_24.230_sequences.fasta.gz | cut -f2 -d':' | sort | uniq
This works only if you have the cut, sort and uniq utilities installed (Linux and Mac OS systems should have them built in).
Example 4. Select all Assembly accessions.
seqkit seq -ni ribogrove_24.230_sequences.fasta.gz | cut -f1 -d':' | sort | uniq
This works only if you have the cut, sort and uniq utilities installed (Linux and Mac OS systems should have them built in).
Example 5. Select all phylum names.
seqkit seq -n ribogrove_24.230_sequences.fasta.gz | grep -Eo ';p__[^;]+' | sed -E 's/;|p__//g' | sort | uniq
This works only if you have the grep, sed, sort and uniq utilities installed (Linux and Mac OS systems should have them built in).
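The same building blocks can also count sequences instead of merely listing names. The sketch below wraps them into two small shell functions that read headers from standard input. The function names count_by_phylum and count_by_category are illustrative and not part of RiboGrove or seqkit, and the header layout is assumed to be the one shown in the example header above.

```shell
# Sketch: tabulate fasta headers (one per line on standard input).

# Number of sequences per phylum, most abundant phylum first.
count_by_phylum() {
    grep -Eo ';p__[^;]+' | sed -E 's/^;p__//' | sort | uniq -c | sort -k1,1nr
}

# Number of sequences per genome category.
count_by_category() {
    grep -Eo 'category:[0-9]+' | sort | uniq -c
}

# Usage, assuming seqkit is installed and the release file is in the current directory:
# seqkit seq -n ribogrove_24.230_sequences.fasta.gz | count_by_phylum
# seqkit seq -n ribogrove_24.230_sequences.fasta.gz | count_by_category
```

Each output line holds a count followed by the phylum name or the category label.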
For any questions concerning RiboGrove, please contact Maksim Sikolenko at sikolenko@mbio.bas-net.by or maximdeynonih@gmail.com.
If you find RiboGrove useful for your research, please cite:
Maxim A. Sikolenko, Leonid N. Valentovich. “RiboGrove: a database of full-length prokaryotic 16S rRNA genes derived from completely assembled genomes” // Research in Microbiology, Volume 173, Issue 4, May 2022, 103936.
(DOI: 10.1016/j.resmic.2022.103936).
Please use the make_qiime_taxonomy_file.py script to convert the RiboGrove file metadata/taxonomy.tsv to a QIIME2-compatible file. You can find out how to use this script in the corresponding README file.
People have already provided several useful answers in the corresponding discussion: https://bioinformatics.stackexchange.com/questions/20915/how-do-i-save-selected-sequences-in-seqkit-to-a-file.
People have already provided several useful answers in the corresponding discussion: https://www.biostars.org/p/9561418.
RiboGrove, 2025-05-08