Author: Huang, Yi; Lau, Susanna K. P.; Woo, Patrick C. Y.; Yuen, Kwok-yung
Title: CoVDB: a comprehensive database for comparative analysis of coronavirus genes and genomes
Document date: 2007_10_2
ID: ujhgb3b0_3
Document: By July 2007, more than 3000 coronavirus sequence records, including 264 complete genomes, were available in GenBank (24). Among the 25 coronavirus species with complete genome sequences available, six were sequenced by our group, including CoV-HKU1 and bat SARS-CoV (13, 16, 18, 19). Furthermore, we defined two novel subgroups of group 2 coronaviruses (18). During batch sequence retrieval for comparative analysis of the coronavirus genomes that we sequenced, we encountered several major problems with the coronavirus sequences in GenBank and in other coronavirus databases (Coronaviridae Bioinformatics Resource, http://athena.bioc.uvic.ca/database.php?db=coronaviridae; PATRIC, http://patric.vbi.vt.edu) (25). First, in GenBank, the non-structural proteins in the polyprotein encoded by orf1ab were not annotated. Second, in all databases, the annotations of the non-structural proteins encoded by ORFs downstream of orf1ab are often confusing because they do not follow a standardized system. Third, multiple accession numbers are often present for reference sequences (26). These problems often lead to confusion during sequence retrieval. Fourth, coronaviruses, especially SARS-CoV, amplified from different specimens may contain the same genome or gene sequences, and analyzing these identical sequences usually leads to redundant work.
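The fourth problem, identical sequences deposited under different records, can be illustrated with a minimal sketch that groups records by exact sequence identity. This is only an illustration under stated assumptions, not the method used by CoVDB: the function name `group_identical_sequences`, the accession labels, and the toy sequences are all hypothetical, and real data would first need to be parsed from GenBank or FASTA files.

```python
from collections import defaultdict


def group_identical_sequences(records):
    """Group accession labels that share an identical sequence.

    `records` is an iterable of (accession, sequence) pairs. Sequences are
    compared after removing whitespace and upper-casing, so only exact
    matches are grouped (no alignment or similarity threshold is applied).
    """
    groups = defaultdict(list)
    for accession, sequence in records:
        key = "".join(sequence.split()).upper()
        groups[key].append(accession)
    # Dicts preserve insertion order, so groups appear in first-seen order.
    return list(groups.values())


# Hypothetical accession labels and toy sequences, for illustration only.
records = [
    ("ACC_0001", "ATGGCTAGC"),
    ("ACC_0002", "atggctagc"),   # same sequence as ACC_0001, different case
    ("ACC_0003", "ATGGCTTGC"),   # differs by one base
]

clusters = group_identical_sequences(records)
for cluster in clusters:
    representative, duplicates = cluster[0], cluster[1:]
    print(representative, "duplicates:", duplicates)
```

Keeping one representative per group and recording the remaining accessions alongside it avoids analyzing the same sequence repeatedly while still preserving the link to every original record.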