Author: Huang, Yi; Lau, Susanna K. P.; Woo, Patrick C. Y.; Yuen, Kwok-yung
Title: CoVDB: a comprehensive database for comparative analysis of coronavirus genes and genomes
Document date: 2007_10_2
ID: ujhgb3b0_11
Document: Polyprotein annotation. In all coronavirus genomes, orf1ab occupies two-thirds of the genome and is translated as a polyprotein. This polyprotein is post-translationally cleaved by the 3C-like protease (3CLpro) and the papain-like protease (PLpro) into 15-16 non-structural proteins. Some of these proteins, such as the RNA-dependent RNA polymerase, helicase, 3CLpro and PLpro, are essential for replication or virulence of the coronavirus, although the functions of others are still unclear. Because they are essential, these sequences are often used for evolutionary analysis, primer design, etc. However, except for the reference sequences, GenBank does not provide detailed cleavage site information for the non-structural proteins. Since 3CLpro and PLpro of coronaviruses have been shown to cleave at conserved, specific amino acids, the putative cleavage sites of the 15-16 non-structural proteins can be predicted by multiple sequence alignment. Using this information, we have annotated these non-structural proteins in all the coronavirus sequences for easy retrieval in CoVDB.

Protein/gene name unification. By convention, all non-structural proteins in the polyprotein encoded by orf1ab are named 'nsp', with each protein numbered consecutively starting from the 5′ end (nsp1-nsp16). The structural proteins encoded after the polyprotein are hemagglutinin esterase (HE, in group 2a coronaviruses), spike glycoprotein (S), envelope protein (E), membrane protein (M) and nucleocapsid protein (N). However, there is no unified naming system for the non-structural proteins encoded by ORFs downstream of orf1ab. This lack of a unified system greatly reduces the stability and accuracy of ortholog retrieval.
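The alignment-based prediction described above can be illustrated with a minimal sketch. It is not the CoVDB pipeline, whose implementation is not given here. It assumes a pairwise row pair taken from a multiple sequence alignment, with an annotated reference and an unannotated query. The function names (`transfer_cleavage_sites`, `split_polyprotein`, `name_products`) and the toy sequences are hypothetical, and the use of glutamine (Q) as the residue before each cut is an illustrative choice only.

```python
def transfer_cleavage_sites(ref_aln, query_aln, ref_sites):
    """Map annotated cleavage sites from a reference onto an aligned query.

    ref_aln, query_aln: two rows of the same alignment ('-' marks a gap).
    ref_sites: 1-based ungapped reference positions of the residue
               immediately before each cut.
    Returns {ref_site: query_position}, where query_position is the 1-based
    ungapped query position in the same column, or None if the query has a
    gap there (the site cannot be placed without manual inspection).
    """
    if len(ref_aln) != len(query_aln):
        raise ValueError("aligned rows must have equal length")
    wanted = set(ref_sites)
    mapped = {}
    ref_pos = 0
    query_pos = 0
    for ref_char, query_char in zip(ref_aln, query_aln):
        if query_char != "-":
            query_pos += 1
        if ref_char != "-":
            ref_pos += 1
            if ref_pos in wanted:
                mapped[ref_pos] = query_pos if query_char != "-" else None
    return mapped


def split_polyprotein(sequence, sites):
    """Cut an ungapped sequence after each 1-based site position."""
    products = []
    start = 0
    for site in sorted(sites):
        products.append(sequence[start:site])
        start = site
    products.append(sequence[start:])
    return products


def name_products(products):
    """Number products consecutively from the N-terminus: nsp1, nsp2, ..."""
    return {f"nsp{i}": seq for i, seq in enumerate(products, start=1)}


# Toy data (hypothetical): the query carries a one-residue insertion.
ref_aln = "MKLQ-SGAVLQAGT"
query_aln = "MRLQASGAVLQAGT"
ref_sites = [4, 10]  # cuts after reference residues 4 and 10

mapped = transfer_cleavage_sites(ref_aln, query_aln, ref_sites)
query_sites = [pos for pos in mapped.values() if pos is not None]
annotated = name_products(
    split_polyprotein(query_aln.replace("-", ""), query_sites)
)
print(mapped)
print(annotated)
```

In this toy case the second site shifts by one position in the query because of the insertion, which is the kind of offset that makes copying coordinates directly from a reference unreliable. A real annotation would also check that the residues flanking each mapped site match the conserved protease recognition pattern before accepting it.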