Author: Schlub, Timothy E; Buchmann, Jan P; Holmes, Edward C
Title: A Simple Method to Detect Candidate Overlapping Genes in Viruses Using Single Genome Sequences
Document date: 2018_8_7
ID: yiqdsf9z_26
Document: A second important feature of our method is its relatively high sensitivity for detecting overlapping genes whilst maintaining acceptable false discovery rates. This is best achieved by using the combined test, in which newly detected ORFs must be larger than expected by both the codon permutation and synonymous mutation tests. The combined test is advantageous because true positives are readily detected by both tests, so requiring both tests to detect the ORF does not reduce sensitivity. However, the combined test does substantially reduce the false positive rate, as false positives detected by one test are frequently excluded by the other. There is also scope to further reduce false discovery by modifying our method, or by imposing post-analysis constraints, for example by calculating ORF lengths from start codon to stop codon rather than between two stop codons. This was not considered for the screening results here because of variation in alternative start codons among viruses, but it would be an important optimization in more targeted screening. One caveat to this method (and to other bioinformatics approaches) is that sensitivity depends on the size of the overlap, with smaller regions of overlap being more difficult to detect. Unlike other methods, however, we explicitly calculated the sensitivity for many lengths of overlap and found that a length of at least 50 nucleotides (17 codons) is required before the method becomes effective. As this length increases to 300 nucleotides (100 codons), the method becomes a very powerful diagnostic tool, with an area under the curve of 0.89. Estimating both the sensitivity and the false discovery rate of an overlapping gene detection method is a strength of this work: although sensitivity can be calculated for other methods, false discovery estimation is often neglected and rarely reported because of a lack of negative controls. When it is reported, it is usually based on estimates of the type 1 error rates of P values, rather than on comparison to a negative control as we have done here.
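
The logic of the combined test can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the significance threshold `alpha`, the number of permutations, and the use of the single longest stop-to-stop ORF as the test statistic are all assumptions made for illustration. Only the codon permutation side is sketched; the synonymous mutation test is represented by a P value supplied by the caller.

```python
import random

STOP_CODONS = {"TAA", "TAG", "TGA"}


def longest_orf_codons(seq, frame):
    """Longest run of non-stop codons in the given frame (stop-to-stop), in codons."""
    longest = current = 0
    for i in range(frame, len(seq) - 2, 3):
        if seq[i:i + 3] in STOP_CODONS:
            current = 0
        else:
            current += 1
            longest = max(longest, current)
    return longest


def codon_permutation_pvalue(gene, frame=1, n_perm=1000, seed=0):
    """Hypothetical permutation test: shuffle the codons of the known gene
    (reading frame 0) and ask how often the shuffled sequence yields an
    alternative-frame ORF at least as long as the observed one."""
    gene = gene[:len(gene) - len(gene) % 3]  # keep whole codons only
    codons = [gene[i:i + 3] for i in range(0, len(gene), 3)]
    observed = longest_orf_codons(gene, frame)
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(codons)
        if longest_orf_codons("".join(codons), frame) >= observed:
            hits += 1
    # Add-one correction keeps the empirical P value strictly above zero.
    return (hits + 1) / (n_perm + 1)


def combined_test(p_permutation, p_synonymous, alpha=0.05):
    """An ORF is a candidate only if BOTH tests find it longer than expected.
    alpha is an assumed threshold, not a value taken from the paper."""
    return p_permutation < alpha and p_synonymous < alpha
```

Requiring both P values to fall below the threshold is what lets false positives raised by one test be excluded by the other. A start-to-stop ORF definition, as discussed above, could be substituted by changing only `longest_orf_codons`.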