Author: Anthony, Simon J.; Epstein, Jonathan H.; Murray, Kris A.; Navarrete-Macias, Isamara; Zambrana-Torrelio, Carlos M.; Solovyov, Alexander; Ojeda-Flores, Rafael; Arrigo, Nicole C.; Islam, Ariful; Ali Khan, Shahneaz; Hosseini, Parviez; Bogich, Tiffany L.; Olival, Kevin J.; Sanchez-Leon, Maria D.; Karesh, William B.; Goldstein, Tracey; Luby, Stephen P.; Morse, Stephen S.; Mazet, Jonna A. K.; Daszak, Peter; Lipkin, W. Ian
Title: A Strategy To Estimate Unknown Viral Diversity in Mammals
  • Document date: 2013_9_3
  • ID: 6lobyyj4_20
    Document: Virus classification. For the purposes of this study, we avoided the use of taxonomic concepts such as species or genotype because of the variable criteria used for such distinctions (9) and because the degree of sequence conservation used to establish such distinctions can vary across the genome and may be affected by the relatively short sequence fragments generated in this study. We focused instead on collections of viral sequences that form distinct monophyletic clades within a particular family, and we considered a virus novel if the sequence identity to its closest relative is less than or equal to the identity between the two closest species for a given viral family. Due to the very large number of herpesvirus sequences identified in this study (n = 650), we used hierarchical clustering to segregate sequences for this particular family. To do this, we first extracted 598 polymerase sequences from published complete genomes (downloaded from NCBI on 14 September 2012) and combined them with the sequences generated in this study (total of 1,248 sequences). Coding sequences were translated and aligned using MUSCLE (version 3.8.31) (57) with the default settings. The nucleotide alignment was constructed by replacing each amino acid with the codon that gave rise to it. Columns containing gaps in more than 1,000 of the 1,248 sequences were removed. The genetic distance between HV species was subsequently established using the published sequences in the alignment only, as described previously (58), and a >7% nucleotide difference (Hamming distance) was used to define HV clusters. PgHV sequences were then segregated using hierarchical clustering, as implemented in the SciPy package (59) using average linkage clustering.
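
    The column filtering and clustering steps described above can be sketched roughly as follows. This is an illustrative sketch, not the authors' code. The helper names (encode, drop_gappy_columns, cluster_by_hamming) and the toy alignment are hypothetical. It also assumes that gap characters count as ordinary mismatches in the Hamming distance and that the 7% threshold is applied as the cut height of the average-linkage tree; the text does not state either detail.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage
from scipy.spatial.distance import pdist


def encode(alignment):
    """Turn equal-length aligned strings into an integer matrix (one row per sequence)."""
    return np.array([[ord(ch) for ch in seq] for seq in alignment])


def drop_gappy_columns(matrix, max_gaps):
    """Keep only the columns whose gap count does not exceed max_gaps."""
    gap_counts = (matrix == ord("-")).sum(axis=0)
    return matrix[:, gap_counts <= max_gaps]


def cluster_by_hamming(matrix, threshold=0.07):
    """Average-linkage clustering on pairwise Hamming distances.

    pdist with metric="hamming" returns the fraction of differing columns.
    Assumption: the tree is cut at `threshold`, so clusters whose average
    pairwise distance exceeds it stay separate.
    """
    distances = pdist(matrix, metric="hamming")
    tree = linkage(distances, method="average")
    return fcluster(tree, t=threshold, criterion="distance")


# Hypothetical toy alignment: 4 sequences, 21 columns.
# The last column is a gap in 3 of the 4 sequences.
toy_alignment = [
    "ATGGCTAAGCTTGACCGTAA-",
    "ATGGCTAAGCTTGACCGTAT-",
    "ATGCCTTAGGTTCACCGAAA-",
    "ATGCCTTAGGTTCACCGAATG",
]

# The study removed columns with gaps in more than 1,000 of 1,248 sequences;
# the toy equivalent here is more than 2 of 4.
filtered = drop_gappy_columns(encode(toy_alignment), max_gaps=2)
labels = cluster_by_hamming(filtered, threshold=0.07)
print(filtered.shape, labels)
```

    In the toy data the first two sequences differ from each other at 1 of 20 retained columns (5%), as do the last two, while sequences from different pairs differ at 25 to 30% of columns, so a 7% cut yields two clusters.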
