Author: Jakub M Bartoszewicz; Anja Seidel; Bernhard Y Renard
Title: Interpretable detection of novel human viruses from genome sequencing data
  • Document date: 2020-01-30
  • ID: ac00tai9_57
    Document: Compared to the previous state of the art in viral host prediction directly from next-generation sequencing reads (Zhang et al., 2019), our models drastically reduce the error rates. This also holds for novel viruses not present in the training set. In the paired-read scenario, the previously described method fails, and the standard alignment-based homology testing algorithm cannot find any matches in more than 10% of the cases, resulting in relatively low accuracy. On a real human virome sample, where a main source of negative (Moustafa et al., 2017), our method filters out non-human viruses with high specificity. In this scenario, the BLAST-derived ground-truth labels were mined using the complete database (as opposed to just a training set). In all cases, our results are only as good as the training data used; high-quality labels and sequences are needed to develop trustworthy models. Ideally, sources of error should be investigated with an in-depth analysis of a model's performance on multiple genomes covering a wide selection of taxonomic units. This is especially important because the method assumes no mechanistic link between an input sequence and the phenotype of interest, and the input sequence constitutes only a small fraction of the target genome, without a wider biological context. Still, it is possible to predict a label even from those small, local fragments. A similar effect was observed for image classification with CNNs (Brendel & Bethge, 2019).
