Author: Alejandro A Schäffer; Eneida Hatcher; Linda Yankie; Lara Shonkwiler; J Rodney Brister; Ilene Karsch-Mizrachi; Eric P Nawrocki
Title: VADR: validation and annotation of virus sequence submissions to GenBank
Document date: 2019_11_22
ID: besvz92f_68
Document: On the other hand, a larger reference database can contain more diversity than a smaller one, and the most common VADR failure in the NC dataset is a stop codon that occurs three nucleotides earlier than expected in the nonstructural polyprotein CDS (11 of the 16 cdsstopn alerts in the set of 35 sequences mentioned above). This failure might have been avoided with a larger reference database that included a norovirus sequence with this three-nucleotide-shorter CDS variant (an example is AB933745.1). We plan to add to VADR's reference library as we find areas of sequence space that it does not adequately cover; for example, while the manuscript was out for review, 10 norovirus RefSeqs were added for internal testing. We also plan to extend VADR to use profiles built from multiple alignments instead of single sequences, which should enhance its ability to analyze and annotate some sequences that are divergent from available RefSeqs.
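To make the "stop codon three nucleotides early" case concrete, the following is a minimal Python sketch of how an in-frame stop upstream of a reference-predicted CDS end could be detected. It is not VADR's implementation: the function names `first_inframe_stop` and `early_stop_offset`, the 0-based coordinates, the standard stop codon set, and the toy sequences are all assumptions made for illustration.

```python
# Illustrative sketch only; not taken from VADR. All names are hypothetical.
STOP_CODONS = {"TAA", "TAG", "TGA"}  # assumes the standard genetic code


def first_inframe_stop(seq, cds_start, cds_end):
    """Return the 0-based index of the first in-frame stop codon in
    seq[cds_start:cds_end], or None if no in-frame stop is found."""
    seq = seq.upper().replace("U", "T")
    # Step codon by codon, staying in the frame set by cds_start.
    for i in range(cds_start, cds_end - 2, 3):
        if seq[i:i + 3] in STOP_CODONS:
            return i
    return None


def early_stop_offset(seq, cds_start, expected_end):
    """Return how many nucleotides before expected_end the CDS terminates.

    expected_end is the exclusive end of the CDS, stop codon included,
    as predicted from the reference. Returns 0 when the stop codon is
    where the reference predicts, and None when no in-frame stop exists.
    """
    stop = first_inframe_stop(seq, cds_start, expected_end)
    if stop is None:
        return None
    return expected_end - (stop + 3)


if __name__ == "__main__":
    reference_like = "ATGGCTGCTGCTTAA"  # toy CDS: stop codon at the predicted end
    shorter_variant = "ATGGCTGCTTAAGCA"  # toy CDS: stop codon one codon earlier
    print(early_stop_offset(reference_like, 0, 15))
    print(early_stop_offset(shorter_variant, 0, 15))
```

In this toy setup a nonzero offset corresponds to the situation described above, where the query's CDS ends before the position predicted from a single reference. A reference that already carried the shorter variant would predict the earlier end, and the offset would be zero.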