Selected article for: "amino acid and modeled structure"

Author: Alejandro A Schäffer; Eneida Hatcher; Linda Yankie; Lara Shonkwiler; J Rodney Brister; Ilene Karsch-Mizrachi; Eric P Nawrocki
Title: VADR: validation and annotation of virus sequence submissions to GenBank
  • Document date: 2019_11_22
  • ID: besvz92f_18
    Snippet: The copyright holder for this preprint (which was not peer-reviewed) is the author/funder. bioRxiv preprint, doi: https://doi.org/10.1101/852657. these files and outputs a VADR "model information" file with coordinates of CDS, mat_peptide, and gene features and product and exception qualifiers. By defau.....
    Document: these files and outputs a VADR "model information" file with coordinates of CDS, mat_peptide, and gene features and product and exception qualifiers. By default, all other feature types and qualifiers are ignored, although command-line options can be used to specify that additional feature types, such as ncRNA and stem_loop, and additional qualifiers should be included. v-build.pl also outputs a covariance model (CM) file for the input RefSeq, created using the cmbuild program of Infernal v1.1.3. If the RefSeq nucleotide sequence being modeled has known secondary structural elements, such as stem loops, then a Stockholm format file with structure annotation can be provided, and the resulting CM file will model the specified structure. This structure will inform the sequence- and structure-based alignment of input sequences by Infernal's cmalign program in the annotation stage of v-annotate.pl. By default, if no Stockholm file is input to v-build.pl, then the RefSeq sequence is modeled without secondary structure. Additionally, v-build.pl uses the makeblastdb program from BLAST v2.9.0+ [15] to create a BLAST database from amino acid translations of the RefSeq CDS features. v-annotate.pl uses this database with blastx to validate its nucleotide-based predictions of CDS features. This design, in which VADR annotates nucleotide sequences but validates with protein alignments, allows us to mitigate the known limitation that the 4-letter nucleic acid alphabet is less sensitive in homology searching than the 20-letter amino acid alphabet [16], while retaining the capability to annotate ncRNAs and other features that are not translated.
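
    As a rough illustration of the step in which CDS features are translated to amino acid sequences before a BLAST protein database is built, the following Python sketch may help. It is not VADR's implementation (v-build.pl is a Perl script); the function names translate_cds and cds_to_fasta, the FASTA header layout, and the toy sequence are hypothetical. It assumes a forward-strand, single-segment CDS with 1-based inclusive coordinates and the standard genetic code (NCBI translation table 1).

```python
from itertools import product

# Standard genetic code (NCBI translation table 1).
# Codons are enumerated in the order TTT, TTC, TTA, TTG, TCT, ...
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate_cds(genome: str, start: int, stop: int) -> str:
    """Translate a forward-strand CDS given 1-based inclusive coordinates.

    Hypothetical helper: assumes one contiguous segment on the plus strand.
    Codons containing ambiguous bases are translated as 'X'.
    """
    cds = genome[start - 1:stop].upper().replace("U", "T")
    if len(cds) % 3 != 0:
        raise ValueError("CDS length is not a multiple of 3")
    protein = "".join(
        CODON_TABLE.get(cds[i:i + 3], "X") for i in range(0, len(cds), 3)
    )
    # Drop a single terminal stop codon, if present.
    return protein[:-1] if protein.endswith("*") else protein


def cds_to_fasta(genome: str, features: list[tuple[str, int, int]]) -> str:
    """Build a protein FASTA string from (name, start, stop) CDS features."""
    lines = []
    for name, start, stop in features:
        lines.append(f">{name}/{start}..{stop}")
        lines.append(translate_cds(genome, start, stop))
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Toy reference sequence (made up): two flanking bases on each side
    # of a 12-nt CDS, ATG GCC AAA TGA.
    toy_genome = "GGATGGCCAAATGACC"
    print(cds_to_fasta(toy_genome, [("toy_protein", 3, 14)]), end="")
```

    A FASTA file written this way could then be passed to makeblastdb, for example `makeblastdb -in proteins.fa -dbtype prot`, where proteins.fa is a placeholder file name. The CM file and the blastx validation performed by v-annotate.pl are outside the scope of this sketch.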
