Author: Wang, Shiliang; Sundaram, Jaideep P; Spiro, David
Title: VIGOR, an annotation program for small viral genomes
Document date: 2010_9_7
ID: 0lbxvudt_10
Document: A similarity search is performed again between the potential coding region and the custom protein database. The protein sequence with the highest identity in this search is used as the reference sequence for identifying the start and stop codons. If the first codon of the potential coding sequence is ATG and aligns with the first residue of the reference sequence, this ATG is selected as the start codon; otherwise, the nearest upstream in-frame ATG is selected. If no in-frame ATG is present upstream of the aligned region, the 60 nucleotides downstream of the first aligned residue are scanned for a start codon. The sequence downstream of the last aligned residue of the potential coding sequence is scanned for in-frame stop codons (TAA, TGA, and TAG), and the stop codon closest to the last aligned residue is selected. Mature mRNA for the influenza M2 and NS2 genes is produced by internal splicing. For these genes, the conserved splice donor and acceptor sites (GT...AG) [5] are searched for around the junctions between the gapped and aligned regions of the alignment. The splice sites that yield the best alignment between the translated protein and the reference sequence are selected. The two main selection criteria are identity to the reference sequence and length of the translated protein; when the two criteria disagree, sequence length takes priority in choosing the final splice sites.
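The start and stop codon rules described above can be illustrated with a minimal Python sketch. This is not VIGOR's actual code: the function names `find_start_codon` and `find_stop_codon`, their parameters, and the 0-based coordinate convention are hypothetical. The sketch also assumes that the downstream scan for a start codon is performed in frame and that the upstream scan does not check for intervening stop codons, neither of which the text specifies.

```python
STOP_CODONS = {"TAA", "TGA", "TAG"}


def find_start_codon(seq, first_aligned, aligns_with_first_residue,
                     downstream_window=60):
    """Return the 0-based index of the selected ATG, or None.

    seq: nucleotide sequence (uppercase string).
    first_aligned: index of the first base of the first aligned codon.
    aligns_with_first_residue: True if that codon aligns with residue 1
        of the reference protein.
    """
    # Rule 1: the first codon is ATG and aligns with the first reference residue.
    if aligns_with_first_residue and seq[first_aligned:first_aligned + 3] == "ATG":
        return first_aligned

    # Rule 2: nearest upstream in-frame ATG (walk back one codon at a time).
    pos = first_aligned - 3
    while pos >= 0:
        if seq[pos:pos + 3] == "ATG":
            return pos
        pos -= 3

    # Rule 3: scan the 60 nucleotides downstream of the first aligned residue.
    # Assumption: the scan is restricted to the alignment's reading frame.
    end = min(first_aligned + downstream_window, len(seq) - 2)
    for pos in range(first_aligned, end, 3):
        if seq[pos:pos + 3] == "ATG":
            return pos
    return None


def find_stop_codon(seq, last_aligned):
    """Return the 0-based index of the closest in-frame stop codon
    downstream of the last aligned codon, or None.

    last_aligned: index of the first base of the last aligned codon.
    """
    for pos in range(last_aligned + 3, len(seq) - 2, 3):
        if seq[pos:pos + 3] in STOP_CODONS:
            return pos
    return None


if __name__ == "__main__":
    # Toy sequence: the alignment covers CCC..TTT (indices 5 to 16).
    toy = "GGATGCCCAAAGGGTTTTAGCC"
    start = find_start_codon(toy, first_aligned=5, aligns_with_first_residue=False)
    stop = find_stop_codon(toy, last_aligned=14)
    print(start, stop, toy[start:stop + 3])
```

In the toy sequence the first aligned codon is not ATG, so the sketch falls back to the nearest upstream in-frame ATG and then takes the first in-frame stop codon after the aligned region. Returning `None` when no codon is found leaves the handling of that case to the caller, since the text does not describe it.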