Results

Author: Jakub M Bartoszewicz; Anja Seidel; Bernhard Y Renard

Title: Interpretable detection of novel human viruses from genome sequencing data

Document date: 2020_1_30

ID: ac00tai9_9

Document: In this paper, we first improve the performance of read-based predictions of the viral host (human or non-human) from next-generation sequencing reads. We show that reverse-complement (RC) neural networks (Bartoszewicz et al., 2019) significantly outperform both the previous state-of-the-art (Zhang et al., 2019) and the traditional, alignment-based algorithm BLAST (Altschul et al., 1990), which constitutes a gold standard in homology-based bioinformatics analyses. We show that defining the negative (non-human) class is non-trivial and compare different ways of constructing the training set. Strikingly, a model trained to distinguish between viruses infecting humans and viruses infecting other chordates (a phylum of animals including vertebrates) generalizes well to evolutionarily distant non-human hosts, including even bacteria. This suggests that the host-related signal is strong and the learned decision boundary separates human viruses from other DNA sequences surprisingly well.
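As a rough illustration of the reverse-complement idea mentioned above, the sketch below computes the reverse complement of a read and averages a score over both strand orientations, so that a read and its reverse complement receive the same prediction. This is a minimal sketch under stated assumptions, not the authors' architecture: `toy_score` and `rc_invariant_score` are hypothetical names, and the scoring function is a stand-in for a trained model.

```python
# Complement table for the four DNA bases plus N (unknown base).
_COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(read: str) -> str:
    """Return the reverse complement of a DNA read (case-insensitive input)."""
    return read.upper().translate(_COMPLEMENT)[::-1]


def toy_score(read: str) -> float:
    """Hypothetical strand-sensitive score: the fraction of 'A' bases.

    Stand-in for a real classifier output; it is NOT the model from the paper.
    """
    return read.upper().count("A") / len(read) if read else 0.0


def rc_invariant_score(read: str) -> float:
    """Average the toy score over both strand orientations.

    Because the reverse complement of the reverse complement is the original
    read, this value is identical for a read and its reverse complement.
    """
    return 0.5 * (toy_score(read) + toy_score(reverse_complement(read)))
```

Averaging over both orientations is only one simple way to obtain strand-independent predictions; networks that build this symmetry into their layers achieve the same property without evaluating the model twice.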
