Author: Huang, Yi; Lau, Susanna K. P.; Woo, Patrick C. Y.; Yuen, Kwok-yung
Title: CoVDB: a comprehensive database for comparative analysis of coronavirus genes and genomes
Document date: 2007_10_2
ID: ujhgb3b0_3
Document: By July 2007, more than 3000 coronavirus sequence records, including a total of 264 complete genomes, were available in GenBank (24). Among the 25 coronavirus species with complete genome sequences available, six were sequenced by our group, including CoV-HKU1 and bat SARS-CoV (13, 16, 18, 19). Furthermore, we defined two novel subgroups of group 2 coronavirus (18). During batch sequence retrieval for comparative analysis of the coronavirus genomes that we sequenced, we encountered several major problems with the coronavirus sequences in GenBank and in other coronavirus databases (Coronaviridae Bioinformatics Resource, http://athena.bioc.uvic.ca/database.php?db=coronaviridae; PATRIC, http://patric.vbi.vt.edu) (25). First, in GenBank, the non-structural proteins in the polyprotein encoded by orf1ab were not annotated. Second, in all databases, the annotations of the non-structural proteins encoded by ORFs downstream of orf1ab are often confusing because they do not follow a standardized system. Third, multiple accession numbers are often present for reference sequences (26). These problems often lead to confusion when sequence retrieval is performed. Fourth, coronaviruses, especially SARS-CoV, amplified from different specimens may contain the same genome or gene sequences. These sequences usually lead to redundant work when they are analyzed.
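The fourth problem, identical sequences deposited under different records, can be illustrated with a minimal sketch. This is not the procedure used by CoVDB, which the text above does not describe. It only shows one simple way to group records whose sequences match exactly. The function name `group_identical_sequences`, the accession labels, and the toy sequences are all hypothetical.

```python
from collections import defaultdict


def group_identical_sequences(records):
    """Group accession labels whose sequences are identical.

    `records` is an iterable of (accession, sequence) pairs.
    The comparison ignores letter case and surrounding whitespace.
    """
    groups = defaultdict(list)
    for accession, sequence in records:
        key = sequence.strip().upper()
        groups[key].append(accession)
    # Dicts keep insertion order, so the first accession seen for a
    # sequence comes first in its group and can act as representative.
    return list(groups.values())


# Hypothetical accession labels and toy sequences, for illustration only.
records = [
    ("ACC0001", "ATGGCTAGC"),
    ("ACC0002", "atggctagc"),
    ("ACC0003", "ATGGCTTGC"),
]

clusters = group_identical_sequences(records)
for cluster in clusters:
    representative, duplicates = cluster[0], cluster[1:]
    print(representative, "duplicates:", duplicates)
```

Exact matching is the simplest criterion. Sequences that differ only in length at the ends, or by a few bases, would need an alignment-based comparison instead.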