Author: Zhengqiao Zhao; Bahrad A. Sokhansanj; Gail L. Rosen
Title: Characterizing geographical and temporal dynamics of novel coronavirus SARS-CoV-2 using informative subtype markers
Document date: 2020_4_9
ID: 9sk11214_42
Snippet: In this paper, we propose to use short sets of nucleotides, based on error-corrected, entropy-based identification of highly informative nucleotide sites in the viral genome, as markers to define subtypes of [...] United States. In addition, we demonstrate that by using ISMs for subtyping, we can also readily .....
Document: In this paper, we propose to use short sets of nucleotides, based on error-corrected, entropy-based identification of highly informative nucleotide sites in the viral genome, as markers to define subtypes of [...] United States. In addition, we demonstrate that by using informative subtype markers (ISMs) for subtyping, we can also readily visualize the geographic and temporal distribution of subtypes in an efficient and uniform manner. We have developed and are making available a pipeline to generate quantitative profiles of subtypes and the visualizations that are presented in this paper on GitHub at http://github.com/EESI/ISM. Overall, the entropy-based, error-corrected subtyping approach described in this paper represents a potentially efficient way for researchers to gain further insight into the diversity of SARS-CoV-2 sequences and their evolution over time. An important caveat of this approach, as with others based on analysis of viral genome sequences, is that it is limited by the sampling of viral sequences. Small and non-uniform samples of sequences may not accurately reflect the true diversity of viral subtypes within a given population. However, the ISM-based approach has the advantage of being scalable, and as a result it will become both more accurate and more precise as sequence information grows within different geographical and other subpopulations. (bioRxiv preprint, not peer-reviewed; doi: https://doi.org/10.1101/2020.04.07.030759)
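The entropy-based site selection mentioned above can be illustrated with a minimal sketch. This is not the authors' pipeline (that is the code at the GitHub address given in the text); it only shows the general idea of scoring each column of an alignment by Shannon entropy and concatenating the bases at the highest-scoring sites into a short marker. The function names, the `min_entropy` threshold, and the toy sequences are all assumptions made for illustration, and the error-correction step the paper refers to is omitted entirely.

```python
import math
from collections import Counter


def site_entropy(column):
    """Shannon entropy, in bits, of the bases observed at one alignment column."""
    counts = Counter(column)
    total = sum(counts.values())
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def informative_sites(aligned, min_entropy=0.5):
    """Return 0-based column indices whose entropy exceeds min_entropy.

    `aligned` is a list of equal-length strings (an assumed, already
    aligned input); `min_entropy` is a hypothetical cutoff, not a value
    taken from the paper.
    """
    length = len(aligned[0])
    if any(len(seq) != length for seq in aligned):
        raise ValueError("all sequences must have the same aligned length")
    return [
        i for i in range(length)
        if site_entropy([seq[i] for seq in aligned]) > min_entropy
    ]


def subtype_marker(seq, sites):
    """Concatenate the bases of `seq` at the selected sites into one marker."""
    return "".join(seq[i] for i in sites)


# Toy alignment, invented purely for illustration (not real viral data).
toy_alignment = ["ACGT", "ACGA", "ATGT", "ATGA"]
toy_sites = informative_sites(toy_alignment)
toy_markers = Counter(subtype_marker(s, toy_sites) for s in toy_alignment)
```

Counting markers per region or per sampling date, as `toy_markers` does for the whole toy set, is one simple way such markers could feed the geographic and temporal profiles the text describes.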