Selected article for: "biological dark matter and human gut"

Author: Dutilh, Bas E
Title: Metagenomic ventures into outer sequence space
  • Document date: 2014_12_15
  • ID: ybd8hi8y_3
    Keywords: biological dark matter, crAssphage, human gut, human virome, metagenomics, metagenome assembly, unknowns
    Document: Metagenomics has traditionally addressed the 2 classical questions listed above by aligning the sequencing reads in metagenomic data sets to a reference database containing known, annotated sequences. This allows the taxonomic and functional diversity of the sampled microbes to be described in terms of existing knowledge, which makes the results straightforward to interpret. However, a persistent concern in the analysis of metagenomes has been the unknown fraction, consisting of the reads that cannot be annotated by using database searches. The level of unknowns can range up to 99% of the metagenomic reads, depending on the sampled environment, the protocols used for nucleotide isolation and sequencing, the homology search algorithm, and the reference database. 8 Unknowns exist for 4 interrelated reasons. The first reason is technical. Due to limitations of some next-generation sequencing platforms and library preparation protocols, spurious sequences may be generated that do not reflect true biological molecules. These artificial sequences include artifacts due to the sequencing technology 9 and chimeras, i.e., sequences generated from separate genetic molecules derived from different organisms. Since chimeras frequently arise during PCR amplification, they are expected to be more abundant in environmental amplicon sequencing than in shotgun metagenomics, and can be detected using bioinformatic tools. 10 The second reason that unknowns exist is biological, as they reflect the enormous natural diversity of microorganisms that we are only beginning to unveil with metagenomics. This is both overwhelming and exciting, highlighting how much remains to be discovered in biology. This genetic diversity has been referred to as biological "dark matter," 11, 12 and is especially pronounced in viral metagenomes. 8
This issue can only be resolved by expanding reference databases, as exemplified by recent studies of one of the most studied microbial ecosystems: the human gut. The first metagenomic snapshots of the microbiota in the human gut were taken from 2 healthy adults, and revealed a high interindividual diversity and many unknowns. 13 To a large extent, these unknowns were resolved when a reference catalog was created based on the sequences in the gut metagenomes themselves, decreasing the percentage of unknowns from ~85% to ~20%. 14 Moreover, subsequent large-scale sequencing efforts revealed that, in fact, many people share a similar intestinal flora, regardless of whether these similarities are viewed as discrete enterotypes 15 or as gradients. 16 These results illustrate how unknowns can be depleted by expanding the databases with appropriate reference sequences. This not only requires increased sequencing effort of phylogenetically diverse isolates 17 or single cells, 11 but also mining of draft genomes from metagenomes, 18 sampled from microbial environments around the globe. 19 Thus, by mapping the global sequence space, we can provide reassurance that at least some level of sampling saturation can be achieved. For viruses, and particularly for bacteriophages, efforts to provide a denser sampling of sequence space are still lacking.
