Selected article for: "codon usage and correspondence analysis"

Author: Haogao Gu; Daniel Chu; Malik Peiris; Leo L.M. Poon
Title: Multivariate Analyses of Codon Usage of SARS-CoV-2 and other betacoronaviruses
  • Document date: 2020_2_20
  • ID: 9aegg5sd_3_0
    Snippet: Correspondence analysis. We first conducted a multivariate analysis of codon usage on the dataset by using global correspondence analysis. We also conducted WCA and BCA to study these sequences at synonymous codon usage and amino acid usage levels, respectively. Given that there were different amino acid usage biases among different genes (Supplementary Figure S3), we performed correspondence analyses of these genes separately.....
    Document: Correspondence analysis

    We first conducted a multivariate analysis of codon usage on the dataset by using global correspondence analysis. We also conducted WCA and BCA to study these sequences at synonymous codon usage and amino acid usage levels, respectively. Given that there were different amino acid usage biases among different genes (Supplementary Figure S3), we performed correspondence analyses of these genes separately.

    In all four correspondence analyses, one per gene, the extracted first factor explained more than 50% of the total variance (see Supplementary Figure S4 ...

    The global codon usage of bat RaTG13 virus was found to be most similar to that of SARS-CoV-2 in the orf1ab, spike and nucleocapsid genes, but not in the membrane gene (Figure 2). In the analysis of the membrane protein, pangolin P1E virus had a codon usage more similar to SARS-CoV-2 than all the other viruses. The similarity in codon usage between pangolin P1E and SARS-CoV-2 was also high in orf1ab, where P1E was the second closest data point to SARS-CoV-2. This was not the case for the spike and nucleocapsid genes.

    We also observed that the codon usage pattern in the spike gene was more complex than in other ... (Table 1).

    Results from the BCA suggested that the amino acid usage of SARS-CoV-2 is closely related to that of bat and human SARSr-CoVs in all four genes (Figure 3B and Figure 4B). Specifically, we discovered that SARS-CoV-2 had an amino acid usage pattern most similar to bat RaTG13 ... Figure S8A).

    It is evident that the synonymous codon usage pattern of SARS-CoV-2 is distinct from that of other bat-origin coronaviruses. The difference in synonymous codon usage is largely explained by the first factor (more than 50%), and our analysis of codon usage suggests that the first factor may be highly related to the preferential usage of codons ending with cytosine (Supplementary Figure S9). We made a similar observation for the membrane gene. Our three-dimensional analysis revealed that the synonymous codon usage of SARS-CoV-2 in the membrane gene was most similar to that of P1E and CoVZXC21 (Supplementary Figure S8B). It is worth noting that, compared with RaTG13, P1E and CoVZXC21 had lower synonymous codon usage similarity to SARS-CoV-2 in the other three genes.

    (bioRxiv preprint, doi: https://doi.org/10.1101/2020.02.15.950568. The copyright holder for this preprint, which was not peer-reviewed, is the author/funder. It is made available under a CC-BY-NC-ND 4.0 International license.)

    Overall, our WCA results support a more complex synonymous codon usage background in the spike and membrane genes, though we identified unique codon usage patterns of SARS-CoV-2 in these two genes.

    In addition to global CA, the application of WCA and BCA can eliminate the effects caused by amino acid composition and synonymous codon usage, respectively. These alternative analytical tools were important to our study, because amino acid sequences are expected to be more conserved so that they can preserve the biological functions of the translated genes. By contrast, mutations at the synonymous level tend to be more frequent, as most of these codon alternatives do not affect the biological function of a protein. The S protein is responsible for receptor binding which is im
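
    The global correspondence analysis mentioned in the text, and the statement that a first factor can explain more than 50% of the total variance, can be illustrated with a small sketch. This is not the authors' pipeline: the function name `correspondence_analysis`, the toy count table and its row and column meanings are all hypothetical, and only NumPy is assumed. It shows the standard construction (standardized residuals from the independence model, followed by a singular value decomposition) on a table whose rows stand for sequences and whose columns stand for codons.

```python
import numpy as np

def correspondence_analysis(counts):
    """Plain correspondence analysis of a (sequences x codons) count table.

    Returns row principal coordinates, the share of inertia per factor,
    and the total inertia. Hypothetical helper for illustration only.
    """
    N = np.asarray(counts, dtype=float)
    P = N / N.sum()                # correspondence matrix (relative frequencies)
    r = P.sum(axis=1)              # row masses (one per sequence)
    c = P.sum(axis=0)              # column masses (one per codon)
    expected = np.outer(r, c)      # independence model r c^T
    S = (P - expected) / np.sqrt(expected)   # standardized residuals
    U, sv, Vt = np.linalg.svd(S, full_matrices=False)
    inertia = sv ** 2              # principal inertia of each factor
    explained = inertia / inertia.sum()
    row_coords = (U * sv) / np.sqrt(r)[:, None]   # row principal coordinates
    return row_coords, explained, inertia.sum()

# Made-up codon counts: 4 hypothetical sequences x 6 hypothetical codons.
toy_counts = np.array([
    [30, 10, 22,  8, 15,  5],
    [28, 12, 20, 10, 14,  6],
    [12, 30,  9, 25,  6, 18],
    [20, 20, 15, 15, 10, 10],
])

row_coords, explained, total_inertia = correspondence_analysis(toy_counts)
print("share of inertia per factor:", np.round(explained, 3))
print("first-factor coordinates:", np.round(row_coords[:, 0], 3))
```

    Sequences with similar codon usage profiles land close together on the leading factors, which is the sense in which the text speaks of one virus being the "closest data point" to another. The within- and between-group variants (WCA and BCA) add a grouping of codons by amino acid on top of this construction and are not reproduced here.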
