Author: Tcherepanov, Vasily; Ehlers, Angelika; Upton, Chris
Title: Genome Annotation Transfer Utility (GATU): rapid annotation of viral genomes using a closely related reference genome
  • Document date: 2006-06-13
  • ID: 1e2kkhht_20
    Document: The interactive table and graphical display allow the user to review the automatically generated annotations and accept or reject them as desired. To aid the user, GATU pre-selects the Accept annotation box for all annotations that meet user-specified requirements of length, percent sequence identity and coding strand identity. In our example, GATU found and automatically accepted target sequence counterparts of 146 of the 148 genes (genes 1-147 and gene 101a) present in the reference genome. The accepted ORFs were 99.1-100% similar (predicted amino acid sequence) to the reference genes and 127 were 100% similar; this also indicates that the start/stop positions of the reference and target genes matched. Variation in the start/stop positions can also be examined by comparing the P. size column (predicted size) with the Size column. Before deciding which annotations to include in the genome file, the user may wish for more information about a particular annotation. In our example, genes 02 and 146 (these two genes happen to be identical because they are present in the terminal inverted repeats of the virus) need reviewing, as they were not automatically accepted for inclusion; the user will have to determine whether they should be accepted. To assist with this task, a global alignment of the reference protein and its putative counterpart on the target genome (generated by the NEEDLE program) can be obtained by clicking on the Needle Alignment button (Figure 4). The NEEDLE alignment shows that the ORF in the target genome is truncated at the N-terminus but contains the remaining 240 aa encoded by the reference genome. This global alignment also provides a useful indication as to the level of similarity between the two ORFs. Another useful tool is a TBLASTN search of the target genome using the reference gene as a query; the results of this search can be obtained by clicking on the Blast Alignment(s) button (Figure 5).
    Figure: GATU process flow chart.
From the data shown, it is apparent that a frame-shifting mutation is responsible for the difference between the target and reference ORFs. If desired, the user could open these two genomes in our Viral Genome Organizer (VGO) program to determine whether the promoter regions are similar and to find the position of the frame-shifting mutation (located at a run of Ts). Another application that users will find
Figure 2: GATU GUI screen shot after loading genomes and clicking Annotation button; the annotations that have been read from the reference genome GenBank file are displayed.
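
The pre-selection step described above, in which an annotation is ticked as accepted only when it meets the user-specified length, percent identity and coding strand requirements, can be sketched roughly as follows. This is an illustrative Python sketch, not GATU's actual code: the `CandidateAnnotation` class, the `preselect` function, the threshold values and the sample records are all hypothetical and chosen only to show the logic.

```python
from dataclasses import dataclass


@dataclass
class CandidateAnnotation:
    """A reference gene and its putative counterpart on the target genome.

    Hypothetical record layout; field names are not taken from GATU.
    """
    gene: str
    ref_length: int          # reference protein length (aa)
    target_length: int       # predicted target protein length (aa)
    percent_identity: float  # identity of the predicted amino acid sequences
    ref_strand: str          # "+" or "-"
    target_strand: str       # "+" or "-"


def preselect(candidates, min_length_ratio=0.9, min_identity=90.0,
              require_same_strand=True):
    """Return {gene: bool}; True means the Accept box would be pre-ticked.

    The default thresholds are placeholders, not GATU's defaults.
    """
    decisions = {}
    for c in candidates:
        length_ok = c.target_length >= min_length_ratio * c.ref_length
        identity_ok = c.percent_identity >= min_identity
        strand_ok = (not require_same_strand) or c.ref_strand == c.target_strand
        decisions[c.gene] = length_ok and identity_ok and strand_ok
    return decisions


# Made-up example records: one full-length match and one truncated ORF.
candidates = [
    CandidateAnnotation("gene_A", 200, 200, 100.0, "+", "+"),
    CandidateAnnotation("gene_B", 200, 120, 99.0, "-", "-"),
]
decisions = preselect(candidates)

# Annotations left unticked are the ones the user reviews manually.
needs_review = [gene for gene, accepted in decisions.items() if not accepted]
print(needs_review)
```

With these made-up inputs, the truncated `gene_B` fails the length requirement even though its identity is high, so it is left for manual review, much as genes 02 and 146 are in the example above.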
