Selected article for: "Government work and RNA secondary structure"

Author: Alejandro A Schäffer; Eneida Hatcher; Linda Yankie; Lara Shonkwiler; J Rodney Brister; Ilene Karsch-Mizrachi; Eric P Nawrocki
Title: VADR: validation and annotation of virus sequence submissions to GenBank
  • Document date: 2019_11_22
  • ID: besvz92f_75
    Document: As noted above, VAPiD and VIGOR do not attempt to annotate features other than coding sequences: both annotate CDS, and VIGOR annotates mat peptides. VADR can additionally annotate any sequence feature that is annotated in the RefSeq, including conserved structural RNA elements which, though present in many viral genomes [28, 29], are typically not annotated in GenBank. In preparation for using VADR on dengue sequence submissions, we added annotation of an ncRNA feature, a subgenomic flavivirus RNA (sfRNA), and its associated stem loop features to the four dengue virus RefSeqs. These structural RNA elements are relevant to pathogenicity and evasion of the host immune system in at least some flaviviruses [13, 30, 31]. Incoming dengue virus sequence submissions will now include these RNA annotations because VADR employs covariance models of both the conserved sequence and the secondary structure of the RNA elements. In the set of 4171 sequences that passed VADR in the full DC dataset of 4580 sequences, VADR annotated between 4 and 9 stem loop features and exactly 1 ncRNA feature in each sequence, for a total of 35,676 stem loop features and 4171 ncRNA features. In the set of 17,276 sequences that passed VADR in the full DP dataset of 20,973 sequences, VADR annotated between 1 and 6 stem loop features in 2335 sequences, and exactly one ncRNA feature in 623 of

    This article is a US Government work. It is not subject to copyright under 17 USC 105 and is also made available for use under a CC0 license.
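
    The counts quoted above can be cross-checked with a few lines of arithmetic. The sketch below is illustrative only and is not part of VADR: the variable names are assumptions introduced here, and the only inputs are the numbers stated in the text. It derives the pass rates for the DC and DP datasets and the mean number of stem loop features per passing DC sequence, which should fall inside the reported range of 4 to 9.

```python
# Counts copied from the text; all variable names are hypothetical.
dc_total, dc_passed = 4580, 4171          # DC dataset: submitted, passed VADR
dc_stem_loops, dc_ncrna = 35_676, 4171    # features annotated in passing DC sequences
dp_total, dp_passed = 20_973, 17_276      # DP dataset: submitted, passed VADR
dp_with_stem_loops = 2335                 # passing DP sequences with 1 to 6 stem loops

# Fraction of each dataset that passed VADR.
dc_pass_rate = dc_passed / dc_total
dp_pass_rate = dp_passed / dp_total

# Mean stem loop features per passing DC sequence.
mean_stem_loops_dc = dc_stem_loops / dc_passed

# Fraction of passing DP sequences that received any stem loop annotation.
dp_stem_loop_fraction = dp_with_stem_loops / dp_passed

# Consistency checks against the statements in the text.
assert 4 <= mean_stem_loops_dc <= 9   # each sequence had between 4 and 9
assert dc_ncrna == dc_passed          # exactly 1 ncRNA feature per sequence

print(f"DC pass rate: {dc_pass_rate:.1%}")
print(f"DP pass rate: {dp_pass_rate:.1%}")
print(f"Mean stem loops per passing DC sequence: {mean_stem_loops_dc:.2f}")
print(f"Passing DP sequences with stem loops: {dp_stem_loop_fraction:.1%}")
```

    A mean close to the upper end of the 4 to 9 range suggests that most passing DC sequences received nearly the full set of stem loop annotations, whereas only a minority of passing DP sequences received any.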
