Author: Jakub M Bartoszewicz; Anja Seidel; Bernhard Y Renard
Title: Interpretable detection of novel human viruses from genome sequencing data
Document date: 2020_1_30
ID: ac00tai9_6
Document: While DNA sequences mapped to a reference genome may be represented as images (Poplin et al., 2018), most studies use a distributed orthographic representation, in which each nucleotide {A, C, G, T} in a sequence is represented by a one-hot encoded vector of length 4. An "unknown" nucleotide (N) can be represented as an all-zero vector. CNNs and LSTMs have been used successfully for a variety of DNA-based prediction tasks. Early work focused mainly on the regulation of gene expression in humans (Alipanahi et al., 2015; Zhou & Troyanskaya, 2015; Zeng et al., 2016; Quang & Xie, 2016; Kelley et al., 2016), which is still an area of active research (Greenside et al., 2018; Nair et al., 2019; Avsec et al., 2019). In the field of pathogen genomics, deep learning models trained directly on DNA sequences were developed to predict the host ranges of three multi-host viral species (Mock et al., 2019) and the pathogenic potential of novel bacteria (Bartoszewicz et al., 2019). DeepVirFinder (Ren et al., 2018) and ViraMiner (Tampuu et al., 2019) can detect viral sequences in metagenomic samples, but they cannot predict the host and focus on previously known species. For a broader view of deep learning in genomics, we refer to a recent review by Eraslan et al. (2019).
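The one-hot representation described above can be illustrated with a minimal sketch. It is not taken from the authors' code: the function name `one_hot_encode` and the channel order A, C, G, T are assumptions made for illustration, and mapping every non-ACGT symbol (not only N) to an all-zero vector is a simplification.

```python
# Hypothetical helper for illustration; not the authors' implementation.
NUCLEOTIDES = "ACGT"  # assumed channel order


def one_hot_encode(sequence):
    """Encode a DNA string as a list of length-4 one-hot vectors.

    N is encoded as an all-zero vector. As a simplifying assumption,
    any other symbol outside A, C, G, T is treated the same way.
    """
    index = {base: i for i, base in enumerate(NUCLEOTIDES)}
    encoded = []
    for base in sequence.upper():
        vector = [0, 0, 0, 0]
        if base in index:
            vector[index[base]] = 1
        encoded.append(vector)
    return encoded


if __name__ == "__main__":
    # Print each nucleotide next to its encoded vector.
    for base, vector in zip("ACGTN", one_hot_encode("ACGTN")):
        print(base, vector)
```

A read of length L encoded this way yields an L x 4 matrix, which is the kind of input a CNN or LSTM operating directly on DNA sequences would typically consume.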