
Author: Wang, Shiliang; Sundaram, Jaideep P; Spiro, David
Title: VIGOR, an annotation program for small viral genomes
  • Document date: 2010_9_7
  • ID: 0lbxvudt_10
    Document: A second similarity search is performed between the potential coding region and the custom protein database. The protein sequence with the highest identity in this search is used as the reference sequence for identifying the start and stop codons. If the first codon in the potential coding sequence is ATG and aligns with the first residue of the reference sequence, this ATG is selected as the start codon; otherwise, the nearest upstream in-frame ATG is selected. If no in-frame ATG is present upstream of the aligned sequences, the 60 nucleotides downstream of the first aligned residue are scanned for the start codon. Sequences downstream of the last aligned residue of the potential coding sequence are scanned for in-frame stop codons (TAA, TGA, and TAG), and the stop codon closest to the last aligned residue is selected. Mature mRNA for the influenza M2 and NS2 genes is produced by internal splicing. The conserved splice donor and acceptor sites (GT...AG) [5] are searched for around the alignment junctions between the gap and the aligned regions. The splice sites that give the best alignment between the translated protein and the reference sequence are selected. The two main criteria for selecting splice sites are identity to the reference sequence and sequence length of the translated protein; if the two criteria disagree, sequence length takes priority in choosing the final splice sites.
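
The start and stop codon rules described above can be illustrated with a minimal Python sketch. This is not VIGOR's actual code: the function names, the 0-based coordinate convention, and the `aligned_to_ref_start` flag (standing in for the alignment result) are assumptions made for illustration. The sketch also assumes that the downstream scan begins at the codon immediately after the first aligned one, which the text does not state explicitly.

```python
STOP_CODONS = {"TAA", "TGA", "TAG"}


def find_start_codon(seq, first_aligned, aligned_to_ref_start, window=60):
    """Return the 0-based index of the chosen ATG, or None if none is found.

    seq                  -- nucleotide sequence (uppercase DNA string)
    first_aligned        -- index of the first base of the first aligned codon
    aligned_to_ref_start -- True if that codon aligns with residue 1 of the
                            reference protein (hypothetical input, assumed to
                            come from the similarity search)
    window               -- nucleotides scanned downstream (60 in the text)
    """
    # Rule 1: the first codon is ATG and aligns with the reference start.
    if aligned_to_ref_start and seq[first_aligned:first_aligned + 3] == "ATG":
        return first_aligned

    # Rule 2: nearest upstream ATG in the same reading frame.
    pos = first_aligned - 3
    while pos >= 0:
        if seq[pos:pos + 3] == "ATG":
            return pos
        pos -= 3

    # Rule 3: scan the window downstream of the first aligned codon, in frame.
    end = min(first_aligned + 3 + window, len(seq))
    for pos in range(first_aligned + 3, end - 2, 3):
        if seq[pos:pos + 3] == "ATG":
            return pos
    return None


def find_stop_codon(seq, after_last_aligned):
    """Return the index of the first in-frame stop codon at or after
    after_last_aligned (the base following the last aligned codon), or None."""
    for pos in range(after_last_aligned, len(seq) - 2, 3):
        if seq[pos:pos + 3] in STOP_CODONS:
            return pos
    return None
```

For example, with the toy sequence `"CCCAAAATGGGGTGA"` and the first aligned codon at index 3, no in-frame ATG exists upstream, so the downstream scan picks the ATG at index 6; scanning from index 9 then finds the stop codon TGA at index 12.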
