Author: Alejandro A Schäffer; Eneida Hatcher; Linda Yankie; Lara Shonkwiler; J Rodney Brister; Ilene Karsch-Mizrachi; Eric P Nawrocki
Title: VADR: validation and annotation of virus sequence submissions to GenBank
  • Document date: 2019_11_22
  • ID: besvz92f_35
    Document: For each input sequence S, each predicted CDS feature of 30 or more nucleotides, as well as the full sequence S, is then used as a blastx query against the BLAST database of RefSeq protein sequences created by v-build.pl for model M(S). The top blastx match for each RefSeq protein is compared to the CM-based prediction, and alerts are generated if specific differences exist. For example, if the endpoints differ by more than five nucleotides at the 5' or 3' end, an indf5pst or indf3pst alert is reported, respectively. The main purpose of this stage is to identify frameshift mutations in the input sequence that may not have triggered any upstream alerts. Also at this stage, insertnp or deletinp alerts are reported for sequences with in-frame insertions or deletions longer than an indexer-specified, taxon-specific threshold (27 nucleotides by default), so that an indexer can check whether such a large insertion or deletion is plausible.
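
The comparison described above can be illustrated with a minimal Python sketch. This is not VADR's own code: the class names, field names, and the function `protein_validation_alerts` are hypothetical, and the logic covers only the thresholds stated in the text (30-nucleotide minimum CDS length, five-nucleotide endpoint tolerance, 27-nucleotide default for insertions and deletions). It assumes the blastx hit has already been mapped back to nucleotide coordinates on S.

```python
from dataclasses import dataclass
from typing import List

# Thresholds taken from the text; the constant names are assumptions.
MIN_CDS_LEN = 30          # CDS features shorter than this are not queried
ENDPOINT_TOLERANCE = 5    # allowed endpoint difference, in nucleotides
MAX_INDEL_LEN = 27        # default in-frame insertion/deletion threshold


@dataclass
class CmPrediction:
    """Hypothetical container for a CM-based CDS prediction on sequence S."""
    start: int  # 5' endpoint (1-based nucleotide position)
    stop: int   # 3' endpoint (1-based nucleotide position)


@dataclass
class BlastxHit:
    """Hypothetical container for the top blastx match, in nucleotide coordinates."""
    start: int
    stop: int
    max_insert: int = 0  # longest in-frame insertion, in nucleotides
    max_delete: int = 0  # longest in-frame deletion, in nucleotides


def protein_validation_alerts(cm: CmPrediction, hit: BlastxHit) -> List[str]:
    """Return the alert codes raised by comparing the two predictions."""
    alerts: List[str] = []
    # abs() keeps the length correct for minus-strand features (start > stop).
    if abs(cm.stop - cm.start) + 1 < MIN_CDS_LEN:
        return alerts  # too short to be used as a blastx query
    if abs(cm.start - hit.start) > ENDPOINT_TOLERANCE:
        alerts.append("indf5pst")
    if abs(cm.stop - hit.stop) > ENDPOINT_TOLERANCE:
        alerts.append("indf3pst")
    if hit.max_insert > MAX_INDEL_LEN:
        alerts.append("insertnp")
    if hit.max_delete > MAX_INDEL_LEN:
        alerts.append("deletinp")
    return alerts
```

For instance, under these assumptions a CM prediction spanning positions 100 to 400 compared with a blastx hit spanning 110 to 400 would raise only indf5pst, because the 5' endpoints differ by ten nucleotides while the 3' endpoints agree.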
