Author: Nasir, Arshan; Caetano-Anollés, Gustavo
Title: A phylogenomic data-driven exploration of viral origins and evolution
Document date: 2015_9_25
ID: 49360l2a_68
Snippet: Data retrieval. Viral protein sequences were retrieved from the National Center for Biotechnology Information Viral Genomes Project (June 2014) (119). A total of 190,610 viral proteins corresponded to proteomes of 3966 viruses. For simplicity, unclassified and unassigned phages and viruses, and deltaviruses that require helper coviruses to replicate in host tissues (for example, Hepatitis delta virus) were excluded from the analysis. Viral proteo.....
Document: Data retrieval. Viral protein sequences were retrieved from the National Center for Biotechnology Information Viral Genomes Project (June 2014) (119). A total of 190,610 viral proteins corresponded to the proteomes of 3966 viruses. For simplicity, unclassified and unassigned phages and viruses, and deltaviruses that require helper coviruses to replicate in host tissues (for example, Hepatitis delta virus), were excluded from the analysis. Viral proteomes were scanned against SUPERFAMILY HMMs (20) to detect significant SCOP fold superfamily (FSF) domains (E < 0.0001). Proteomes with no hits were also excluded from the analysis. This yielded a final viral data set of 3460 viral proteomes. In turn, FSF assignments for 10,930,447 proteins in 1620 cellular organisms were retrieved directly from a local installation of the SUPERFAMILY MySQL database (release July 2014; version 1.75). A total repertoire of 1995 significant FSF domains was detected in the entire set of 5080 proteomes (3460 viral and 1620 cellular).
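The filtering step described above (keep only domain hits with E < 0.0001, then discard proteomes left with no hits) can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the input layout of `(proteome_id, fsf_id, e_value)` tuples, the function names, and the toy identifiers are all assumptions made for the example, and no SUPERFAMILY or MySQL interface is modeled.

```python
from collections import defaultdict

E_VALUE_CUTOFF = 1e-4  # threshold stated in the text: E < 0.0001


def significant_fsfs(hits, cutoff=E_VALUE_CUTOFF):
    """Map each proteome to the set of FSFs detected below the cutoff.

    `hits` is an iterable of (proteome_id, fsf_id, e_value) tuples,
    a hypothetical flattened form of HMM scan output.
    """
    fsfs_by_proteome = defaultdict(set)
    for proteome_id, fsf_id, e_value in hits:
        if e_value < cutoff:  # strict inequality, as in the text
            fsfs_by_proteome[proteome_id].add(fsf_id)
    return dict(fsfs_by_proteome)


def filter_proteomes(all_proteome_ids, hits, cutoff=E_VALUE_CUTOFF):
    """Drop proteomes with no significant hit.

    Returns (kept, repertoire): the retained proteomes with their FSF
    sets, and the union of all FSFs seen across retained proteomes.
    """
    by_proteome = significant_fsfs(hits, cutoff)
    kept = {p: by_proteome[p] for p in all_proteome_ids if p in by_proteome}
    repertoire = set().union(*kept.values()) if kept else set()
    return kept, repertoire


# Toy data with made-up identifiers (not real proteomes or FSFs).
toy_hits = [
    ("v1", "fsf_A", 1e-10),
    ("v1", "fsf_B", 0.5),    # not significant
    ("v2", "fsf_C", 0.01),   # not significant, so v2 has no hits
    ("v3", "fsf_A", 1e-5),
    ("v3", "fsf_B", 1e-6),
]
kept, repertoire = filter_proteomes(["v1", "v2", "v3", "v4"], toy_hits)
print(len(kept), sorted(repertoire))
```

In this toy run, `v2` (only non-significant hits) and `v4` (no hits at all) are dropped, which mirrors how the viral set shrinks to the proteomes that carry at least one significant FSF assignment.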