Author: Huang, Yi; Lau, Susanna K. P.; Woo, Patrick C. Y.; Yuen, Kwok-yung
Title: CoVDB: a comprehensive database for comparative analysis of coronavirus genes and genomes
  • Document date: 2007_10_2
  • ID: ujhgb3b0_11
    Document: Polyprotein annotation. In all coronavirus genomes, orf1ab occupies two-thirds of the genome and is translated as a polyprotein. This polyprotein is post-translationally cleaved by the 3C-like protease (3CLpro) and the papain-like protease (PLpro) into 15-16 non-structural proteins. Some of these proteins, such as RNA-dependent RNA polymerase, helicase, 3CLpro and PLpro, are essential for replication or virulence of the coronavirus, although the functions of others are still unclear. Because the non-structural proteins are essential, their sequences are often used for evolutionary analysis, primer design, etc. However, except for the reference sequences, GenBank does not provide detailed cleavage site information for the non-structural proteins. Since 3CLpro and PLpro of coronaviruses have been shown to cleave at conserved, specific amino acids, the putative cleavage sites of the 15-16 non-structural proteins can be predicted by multiple sequence alignment. Using this information, we have annotated these non-structural proteins in all the coronavirus sequences for easy retrieval in CoVDB. Protein/gene name unification. By convention, all non-structural proteins in the polyprotein encoded by orf1ab are named 'nsp', with each protein numbered consecutively starting from the 5' end (nsp1-nsp16). The structural proteins encoded downstream of the polyprotein are hemagglutinin esterase (HE, in group 2a coronaviruses), spike glycoprotein (S), envelope protein (E), membrane protein (M) and nucleocapsid protein (N). However, there is no unified naming system for the non-structural proteins encoded by ORFs downstream of orf1ab. This lack of a unified system greatly reduces the stability and accuracy of ortholog retrieval.
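
    The alignment-based prediction of cleavage sites described above can be illustrated with a minimal sketch. This is not the CoVDB implementation: the function names, the toy sequences and the site positions are all hypothetical. The sketch assumes that a reference polyprotein with known cleavage positions has already been aligned to a query, and it simply transfers each position through the alignment columns.

```python
def transfer_cleavage_sites(ref_aligned, query_aligned, ref_sites):
    """Map cleavage positions from an annotated reference onto a query.

    ref_aligned and query_aligned are gapped strings of equal length
    ('-' marks a gap). ref_sites holds 1-based residue indices in the
    ungapped reference; each marks the last residue before a cleavage.
    Returns {ref_site: query_site}, with None where the query has a gap.
    """
    if len(ref_aligned) != len(query_aligned):
        raise ValueError("aligned sequences must have equal length")
    wanted = set(ref_sites)
    mapping = {site: None for site in ref_sites}
    ref_pos = query_pos = 0
    for ref_char, query_char in zip(ref_aligned, query_aligned):
        if ref_char != "-":
            ref_pos += 1
        if query_char != "-":
            query_pos += 1
        if ref_char != "-" and ref_pos in wanted:
            # Transfer only when the query has a residue in this column.
            mapping[ref_pos] = query_pos if query_char != "-" else None
    return mapping


def split_at_sites(sequence, sites):
    """Cut an ungapped sequence after each 1-based site index."""
    pieces, start = [], 0
    for site in sorted(sites):
        pieces.append(sequence[start:site])
        start = site
    pieces.append(sequence[start:])
    return pieces


# Toy alignment (hypothetical sequences, not real coronavirus data).
ref_aligned = "MKLQSA-GLQAT"
query_aligned = "MRLQSAVGLQAT"
sites = transfer_cleavage_sites(ref_aligned, query_aligned, [4, 9])
query = query_aligned.replace("-", "")
print(sites)
print(split_at_sites(query, [s for s in sites.values() if s is not None]))
```

    In this toy case the one-residue insertion in the query shifts the second predicted site by one position. A site that falls on a gap in the query is reported as None instead of being guessed, so that ambiguous cases can be reviewed by hand.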
