Author: Alejandro A Schäffer; Eneida Hatcher; Linda Yankie; Lara Shonkwiler; J Rodney Brister; Ilene Karsch-Mizrachi; Eric P Nawrocki
Title: VADR: validation and annotation of virus sequence submissions to GenBank
  • Document date: 2019_11_22
  • ID: besvz92f_58
    Document: In our tests of the NC, NP, and DC datasets, VADR failed 53 sequences. Of these, 18 were also failed by VAPiD and/or VIGOR, as mentioned above (exactly one sequence, FV536857.1, in the NC dataset, failed all three methods), and 35 passed VAPiD and/or VIGOR (Table 8). These 35 sequences had issues that neither of the other two programs flagged but that indexers would like to review manually before possibly accepting the sequences into GenBank. The issues are: 16 sequences with early stop codons compared with the closest RefSeq (cdsstopn alerts); 12 sequences for which a blastx alignment in the protein validation stage did not extend close enough to the 5' or 3' boundary predicted from the nucleotide-based alignment (indf5pst or indf3pst alerts); ten sequences with low similarity to the RefSeq at the 5' or 3' end of the sequence or of an annotated feature (lowsim5f, lowsim3f, or lowsim3s alerts); seven sequences where the 5' or 3' boundary of a feature was not aligned with sufficient confidence (indf5loc, indf5gap, or indf3loc alerts); five sequences that were expected to be Norovirus but were classified as Sapovirus, another Caliciviridae genus (incgroup alerts); five sequences with an overly large insertion or deletion in a blastx alignment (insertnp or deletinp alerts); and one sequence not recognized by any of the Caliciviridae or Flaviviridae models (a Salivirus from the Picornaviridae family, noannotn alert). Twelve of the 35 sequences had more than one of the alerts listed above. For the sequences with truncated blastx alignments, we observed that VIGOR found identical or similar truncated alignments, but in the current design of VIGOR, truncated nucleotide-to-protein alignments do not trigger an error, at least in these cases.
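    The per-category counts above add up to more than 35 because some sequences fall into several categories. A minimal Python sketch of that arithmetic follows; the dictionary keys, constant names, and the `summarize` helper are illustrative assumptions, not part of VADR, and the numbers are copied from the paragraph above.

```python
# Sequences per alert category, as reported in the text.
# Grouping several alert codes under one key is our own labeling.
ALERT_COUNTS = {
    "cdsstopn": 16,
    "indf5pst/indf3pst": 12,
    "lowsim5f/lowsim3f/lowsim3s": 10,
    "indf5loc/indf5gap/indf3loc": 7,
    "incgroup": 5,
    "insertnp/deletinp": 5,
    "noannotn": 1,
}

TOTAL_FAILED_BY_VADR = 53   # all sequences VADR failed
ALSO_FAILED_BY_OTHERS = 18  # also failed by VAPiD and/or VIGOR
FAILED_BY_VADR_ONLY = 35    # passed VAPiD and/or VIGOR


def summarize(counts, n_sequences):
    """Return the summed category counts and how far they exceed n_sequences."""
    memberships = sum(counts.values())
    return {
        "category_memberships": memberships,
        "excess_over_sequences": memberships - n_sequences,
    }


if __name__ == "__main__":
    # The two subsets should partition the 53 failed sequences.
    print(ALSO_FAILED_BY_OTHERS + FAILED_BY_VADR_ONLY == TOTAL_FAILED_BY_VADR)
    print(summarize(ALERT_COUNTS, FAILED_BY_VADR_ONLY))
```

    The excess is the number of category memberships beyond one per sequence, which is what one would expect when some sequences carry alerts from more than one category. The sketch does not reconstruct which sequences those are; that information is in Table 8 of the article.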
