Author: Alejandro A Schäffer; Eneida Hatcher; Linda Yankie; Lara Shonkwiler; J Rodney Brister; Ilene Karsch-Mizrachi; Eric P Nawrocki
Title: VADR: validation and annotation of virus sequence submissions to GenBank
Document date: 2019_11_22
ID: besvz92f_81
Snippet: An alternative strategy to building models from a single RefSeq is to create and use profile models (profile HMMs and CMs) from trusted sequence alignments of multiple representative sequences that cover the known diversity of the virus species or subspecies being modelled. Profile-based methods are more sensitive at homology detection [33, 34, 35, 36] than single-sequence-based methods, and so this strategy may improve performance. Extending VADR.....
Document: An alternative strategy to building models from a single RefSeq is to create and use profile models (profile HMMs and CMs) from trusted sequence alignments of multiple representative sequences that cover the known diversity of the virus species or subspecies being modelled. Profile-based methods are more sensitive at homology detection [33, 34, 35, 36] than single-sequence-based methods, and so this strategy may improve performance. Extending VADR to profiles was envisioned from its design inception, and the cmbuild program, which creates VADR's CM files, can take a multiple alignment as input. All sequences in the alignment from which a profile is constructed should include the same set of features, with start and end points aligned, and this requirement will limit the phylogenetic breadth of some alignments. For example, among noroviruses, eight of the nine RefSeqs encode three proteins, but murine norovirus (represented by NC_008311) encodes four proteins. A similar problem arises among different subtypes of West Nile virus, which may or may not encode WARF4 [37].
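
The requirement that every sequence in a profile's alignment carry the same features with aligned start and end points can be checked mechanically before a model is built. The following is a minimal sketch under stated assumptions, not part of VADR or Infernal: the function names, the dictionary-based alignment, and the toy sequences and feature coordinates are all hypothetical, and feature positions are assumed to be 1-based inclusive coordinates in each unaligned sequence.

```python
from typing import Dict, List, Tuple

# Hypothetical feature record: (name, start, end), 1-based inclusive,
# in the coordinates of the unaligned sequence.
Feature = Tuple[str, int, int]


def seq_to_aln_columns(aligned_seq: str, gap_chars: str = "-.") -> List[int]:
    """Entry i is the 1-based alignment column holding residue i+1."""
    return [col for col, ch in enumerate(aligned_seq, start=1)
            if ch not in gap_chars]


def features_in_aln_coords(aligned_seq: str,
                           features: List[Feature]) -> Dict[str, Tuple[int, int]]:
    """Map each feature's start and end to alignment columns."""
    cols = seq_to_aln_columns(aligned_seq)
    return {name: (cols[start - 1], cols[end - 1])
            for name, start, end in features}


def inconsistent_features(alignment: Dict[str, str],
                          annotations: Dict[str, List[Feature]]) -> List[str]:
    """Names of features that are missing from some sequence or whose
    aligned start/end columns differ between sequences."""
    mapped = {sid: features_in_aln_coords(alignment[sid], annotations[sid])
              for sid in alignment}
    all_names = set().union(*(m.keys() for m in mapped.values()))
    bad = []
    for name in sorted(all_names):
        spans = {m.get(name) for m in mapped.values()}
        if len(spans) != 1 or None in spans:
            bad.append(name)
    return bad


# Toy data (invented for illustration): seqB carries an extra feature,
# loosely analogous to a genome that encodes one more protein than the rest.
alignment = {
    "seqA": "ATGAAA-TAA",
    "seqB": "ATG-AACTAA",
}
annotations = {
    "seqA": [("ORF1", 1, 9)],
    "seqB": [("ORF1", 1, 9), ("ORF2", 4, 6)],
}

problems = inconsistent_features(alignment, annotations)
print(problems)
```

In this toy case ORF1 maps to the same alignment columns in both sequences, while ORF2 is reported because only one sequence carries it. A sequence flagged this way would have to be left out of the alignment or given its own model.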