Author: Wang, Shiliang; Sundaram, Jaideep P.; Stockwell, Timothy B.
Title: VIGOR extended to annotate genomes for additional 12 different viruses
Document date: 2012_6_4
ID: wd3ir3wg_19
Document: Since no gold-standard dataset is available for the viruses for which VIGOR is designed to predict genes, and no species-specific gene prediction programs are available for these viruses, the performance of VIGOR gene prediction was evaluated by comparing VIGOR predictions with the GenBank annotation of the same viral sequence. For each virus type, 27 to 240 complete genomes, depending on availability in NCBI, were run through VIGOR for comparison. The comparison data are presented in Table 1. Full descriptions of correct prediction, partial agreement, missing gene and new gene were given previously (3). In brief, a prediction is considered correct if both the start and stop codons of the gene predicted by VIGOR are the same as those in the GenBank annotation and, where gene-specific features (such as an RNA editing site or a stop read-through site) exist, they also agree with the GenBank record. Two types of prediction are counted as partial agreements: (i) the stop codon and the reading frame are the same as those in the GenBank record, but the start codon is different; (ii) for some genes, an internal stop codon is detected and the truncated protein sequence is shorter than 95% of its reference sequence length (protein length is >150 aa); VIGOR defines these genes as nonfunctional and marks the predictions with 'possible sequence mutation', but the genes are annotated as functional in GenBank. A missing gene means that no gene is detected by VIGOR in a region where one or more genes are annotated in GenBank. If a gene is predicted in a genomic sequence and no gene is documented in the same region with the same reading frame in GenBank, the prediction is inspected manually. If it is highly homologous to a related viral protein (E < 1e-10), the prediction is counted as a new gene.
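
The comparison categories described above can be sketched as a small decision function. This is only an illustrative sketch, not VIGOR's actual code: the `Gene` record, its field names, the `classify` function and the "unclassified" fallback are all hypothetical names introduced here. The internal-stop case is represented by a boolean flag rather than recomputed from protein lengths, and the manual inspection of candidate new genes is reduced to the E-value threshold quoted in the text.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Gene:
    """Hypothetical minimal gene record used only for this sketch."""
    start: int                        # position of the start codon
    stop: int                         # position of the stop codon
    frame: int                        # reading frame
    features: frozenset = frozenset() # e.g. {("rna_editing", 2499)}
    flagged_mutation: bool = False    # 'possible sequence mutation' flag
    functional: bool = True           # annotated as a functional gene


def classify(pred: Optional[Gene], ref: Optional[Gene],
             evalue: Optional[float] = None) -> str:
    """Compare one predicted gene (pred) with one GenBank gene (ref)."""
    if pred is None and ref is None:
        raise ValueError("nothing to compare")
    if pred is None:
        # GenBank annotates a gene here, but nothing was predicted.
        return "missing gene"
    if ref is None:
        # The text calls for manual inspection; the E-value cutoff
        # (E < 1e-10) stands in for that step in this sketch.
        if evalue is not None and evalue < 1e-10:
            return "new gene"
        return "unclassified"
    if pred.flagged_mutation and ref.functional:
        # Truncated by an internal stop codon, yet functional in GenBank.
        return "partial agreement"
    same_stop = pred.stop == ref.stop
    if pred.start == ref.start and same_stop and pred.features == ref.features:
        return "correct"
    if same_stop and pred.frame == ref.frame and pred.start != ref.start:
        return "partial agreement"
    # Assumption: the text does not name the remaining cases.
    return "unclassified"
```

For example, a prediction that shares the stop codon and reading frame with the GenBank gene but begins at a different start codon falls under partial agreement, while a prediction with no GenBank counterpart is counted as a new gene only if its homology E-value is below 1e-10.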