Author: Alejandro A Schäffer; Eneida Hatcher; Linda Yankie; Lara Shonkwiler; J Rodney Brister; Ilene Karsch-Mizrachi; Eric P Nawrocki
Title: VADR: validation and annotation of virus sequence submissions to GenBank
  • Document date: 2019_11_22
  • ID: besvz92f_3
    Document: Early in the history of GenBank, Michael Waterman presciently wrote that "Entering new sequences into the databases requires the database staff to analyze and interpret the sequences and the associated scientific literature" [4]. Even earlier, Margaret Dayhoff had justified her enormous efforts at sequence database curation, reasoning that "a carefully verified collection [of sequences] was more economical in the long run than a quick and dirty collection" [5]. At NCBI, sequence submissions have been checked by indexers since NCBI began operating GenBank, but over the years, limitations of this arrangement have become apparent. First, the number and size of sequence submissions have grown much faster than the NCBI budget and the capability to hire more indexers. Second, although GenBank indexers are trained rigorously and homogeneously and use the same software, there is no formal mechanism to enforce "inter-observer agreement", meaning that all GenBank indexers would be guaranteed to give the same evaluation of the same submission. Third, researchers who wish to submit their sequences cannot reproduce all the checks that GenBank indexers perform. Consequently, problematic sequences cause delays and e-mail exchanges between submitters and indexers that might be avoided with a more deterministic and open system of checks. The importance of transparency in curatorial analysis was emphasized by Walter Goad, one of the founders of GenBank: "It is important that we be perceived by the molecular biology community as offering free and open access to the information and programs we will be collecting" [5].
