Selected article for: "accession number and GenBank database"

Author: Dutilh, Bas E
Title: Metagenomic ventures into outer sequence space
  • Document date: 2014-12-15
  • ID: ybd8hi8y_5
    Document: The fourth reason that unknowns exist is logistical. Most research projects that generate metagenomic sequencing datasets deposit the read files in large repositories, provide an accession number in the associated publication, and move on. It is likely that many of these datasets, consisting of files that are sometimes gigabytes in size, are never looked at again. Thus, while a certain sequence may have been "seen" in a metagenome and is strictly no longer "dark matter," it will still not be recognized when it is observed again. Reidentification of this sequence would only be possible if the publishing researcher had identified it as an interesting sequence in his or her (assembled) metagenome and submitted it to a searchable database like GenBank [21]. Because GenBank maintains very high standards for the sequences it accepts, submission can be a tedious process that is rarely worthwhile for unknown metagenomic contigs. An in-depth investigation of the unknowns is rarely within the scope of a research project, so those sequences are first ignored and later forgotten. This is a waste of valuable resources: time, money, and work. The metagenomes available in public databases should be better exploited and mined for common sequences. To facilitate this, it is critical that metadata annotations of the metagenomes include a detailed description of the samples and sequencing protocol [22]. Exploiting these datasets will allow us to create more comprehensive maps of sequence space and greatly improve our understanding and interpretation of metagenomes.
