Author: Dong, Rui; Zheng, Hui; Tian, Kun; Yau, Shek-Chung; Mao, Weiguang; Yu, Wenping; Yin, Changchuan; Yu, Chenglong; He, Rong Lucy; Yang, Jie; Yau, Stephen ST
Title: Virus Database and Online Inquiry System Based on Natural Vectors
Document date: 2017_12_17
ID: 09a32vyg_11
Document: The natural vector method has had several successful applications in recent years. In 2011, we applied natural vectors to cluster H1N1 genomes and obtained a dendrogram whose significant groupings coincided with biologists' analyses, as reported in the work by Deng et al. 9 We also predicted that the A(H1N1) genomes originated from the swine flu virus genome lineage, which points toward how to resist the threat of the new influenza. In 2013, we proposed 12-dimensional natural vectors for classifying all single-segmented viruses (DNA and RNA genomes), which broadened the range of the natural vector method. 8 Among the 7 Baltimore classes, the error rates of classifying Baltimore labels were below 0.01% for Baltimore I, II, IV, V, and VII, and the error rates of classifying family labels were 0 for Baltimore II, III, V, VI, and VII. After validation against published references, we successfully predicted 21 missing virus labels. In the work by Huang et al, 10 we extended the natural vector approach to the multiple-segmented viruses available in GenBank at that time by introducing the Hausdorff metric. The error rates of the predictions for 2384 viruses were 3.5% for Baltimore labels, 4.6% for family labels, 0.3% for subfamily labels, and 4.4% for genus labels. We also analyzed the influenza A(H7N9) virus, whose genome consists of 8 segments, and concluded that analysis based on whole genomes through the Hausdorff distance is more reliable than the classical analysis based on only 2 segments, which shows that our method performs well on multiple-segmented viruses. In 2015, we applied natural vectors to Ebola viruses from the 2014 outbreak. 11 The accuracy rates for classifying family and genus labels were 100%. Our phylogenetic analysis showed that, among the 7 proteins related to Ebola viruses, VP24 is the one most consistent with the variation in virulence, which suggests that VP24 would be a pharmaceutical target for preventing and treating Ebola virus.
Because the natural vector reflects core information stored in sequences and genomes, we use it to construct the virus classification system introduced in this article.
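The two ingredients mentioned above, the 12-dimensional natural vector and the Hausdorff distance for multiple-segmented genomes, can be illustrated with a minimal sketch. It assumes the commonly cited formulation of the natural vector (for each nucleotide: its count, its mean position, and its normalized second central moment of position), which this excerpt does not spell out. The function names `natural_vector`, `euclidean`, and `hausdorff` are hypothetical, and the Euclidean distance between vectors is an assumption rather than a confirmed choice of the authors.

```python
import math

ALPHABET = "ACGT"  # assumed nucleotide order within the vector


def natural_vector(seq):
    """Return a 12-dimensional vector: 4 counts, 4 mean positions, 4 second moments.

    Assumed formulation: for nucleotide k with 1-based positions p_1..p_nk in a
    sequence of length n, mu_k = mean(p_i) and D2_k = sum((p_i - mu_k)^2) / (n_k * n).
    Characters outside ACGT are ignored in the counts but still occupy positions.
    """
    seq = seq.upper()
    n = len(seq)
    counts, means, moments = [], [], []
    for base in ALPHABET:
        positions = [i + 1 for i, ch in enumerate(seq) if ch == base]
        n_k = len(positions)
        if n_k == 0:
            # Convention assumed here: an absent nucleotide contributes zeros.
            counts.append(0.0)
            means.append(0.0)
            moments.append(0.0)
            continue
        mu_k = sum(positions) / n_k
        d2_k = sum((p - mu_k) ** 2 for p in positions) / (n_k * n)
        counts.append(float(n_k))
        means.append(mu_k)
        moments.append(d2_k)
    return counts + means + moments


def euclidean(u, v):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def hausdorff(set_a, set_b):
    """Hausdorff distance between two non-empty sets of vectors.

    Each set holds one natural vector per genome segment, so genomes with
    different numbers of segments can still be compared.
    """
    def directed(xs, ys):
        # Largest distance from a point in xs to its nearest point in ys.
        return max(min(euclidean(x, y) for y in ys) for x in xs)

    return max(directed(set_a, set_b), directed(set_b, set_a))


# Toy usage: a one-segment "genome" versus a two-segment "genome".
genome_a = [natural_vector("ACGT")]
genome_b = [natural_vector("ACGT"), natural_vector("TGCA")]
distance = hausdorff(genome_a, genome_b)
```

In this toy case the shared segment contributes a distance of zero, so the result is driven entirely by the unmatched segment of `genome_b`. A classifier along the lines described above could then assign a query genome the label of its nearest neighbor under this distance, although the excerpt does not state the exact classification rule used.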