Author: Chiara, Matteo; Horner, David S.; Gissi, Carmela; Pesole, Graziano
Title: Comparative genomics provides an operational classification system and reveals early emergence and biased spatio-temporal distribution of SARS-CoV-2
Cord-id: yoe84ta7
Document date: 2020_6_30
ID: yoe84ta7
Document: Effective systems for the analysis of molecular data are of fundamental importance for real-time monitoring of the spread of infectious diseases and for the study of pathogen evolution. While the Nextstrain and GISAID portals offer widely used systems for the classification of SARS-CoV-2 genomes, both have significant limitations. Here we propose a highly reproducible method for the systematic classification of SARS-CoV-2 viral types. To demonstrate the validity of our approach, we conduct an extensive comparative genomic analysis of more than 20,000 SARS-CoV-2 genomes. Our classification system delineates 12 clusters and 4 super-clusters in SARS-CoV-2, with a highly biased spatio-temporal distribution worldwide, and provides important observations concerning the evolutionary processes associated with the emergence of novel viral types. Based on estimates of the SARS-CoV-2 evolutionary rate and on the genetic distances among genomes from the early pandemic phase, we infer that SARS-CoV-2 could have been circulating in humans since August-November 2019. The observed pattern of genomic variability is remarkably similar across all clusters and super-clusters, with the UTRs and the s2m element, a highly conserved secondary structure element, being the most variable genomic regions. Although several polymorphic sites that are specific to one or more clusters were predicted to be under positive or negative selection, our analyses overall suggest that the emergence of novel genome types is unlikely to be driven by widespread convergent evolution and independent fixation of advantageous substitutions. In the absence of rigorous experimental validation, several questions concerning the evolutionary processes and the phenotypic characteristics (increased/decreased virulence) remain open; nevertheless, we believe that the approach outlined in this study can be of relevance for the tracking and functional characterization of different types of SARS-CoV-2 genomes.