Author: Gurjit S. Randhawa; Maximillian P.M. Soltysiak; Hadi El Roz; Camila P.E. de Souza; Kathleen A. Hill; Lila Kari
Title: Machine learning using intrinsic genomic signatures for rapid classification of novel pathogens: COVID-19 case study
Document date: 2020_2_4
ID: cetdqgff_18_1
Document: ed in the past for sequence comparisons and analyses [51]. The main advantage that alignment-free methodology offers is the ability to analyze large datasets rapidly. In this study we confirm the taxonomy of COVID-19 and, more generally, propose a method to efficiently analyze and classify a novel unclassified DNA sequence against the background of a large dataset. Specifically, we use a "decision tree" approach (paralleling taxonomic ranks): we start with the highest taxonomic level, train the classification models on the available complete genomes, test the novel unknown sequences to predict the label among the labels of the training dataset, move to the next taxonomic level, and repeat the whole process down to the lowest taxonomic label.
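The top-down procedure described above (train at one taxonomic level, predict a label, restrict the training set to that label, and descend) can be illustrated with a minimal sketch. This is not the classifier used in the study: as a stand-in, it assumes normalized k-mer frequency vectors as the alignment-free feature and a nearest-centroid rule as the model. The names `kmer_profile`, `train_level`, `classify_top_down`, and `TOY_RECORDS`, the choice `K = 2`, and the toy sequences and labels are all hypothetical and chosen only for illustration.

```python
from collections import Counter
from itertools import product
import math

K = 2  # k-mer length; an illustrative assumption, not a value from the study
KMERS = ["".join(p) for p in product("ACGT", repeat=K)]


def kmer_profile(seq, k=K):
    """Return the normalized k-mer frequency vector of a DNA sequence."""
    seq = seq.upper()
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    # Only k-mers over A, C, G, T are counted; anything else is ignored.
    total = sum(counts[m] for m in KMERS) or 1
    return [counts[m] / total for m in KMERS]


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def train_level(records, level):
    """Build one centroid per label found at the given taxonomic level."""
    groups = {}
    for seq, lineage in records:
        groups.setdefault(lineage[level], []).append(kmer_profile(seq))
    return {
        label: [sum(col) / len(vecs) for col in zip(*vecs)]
        for label, vecs in groups.items()
    }


def classify_top_down(query, records, n_levels):
    """Predict a label at each level, narrowing the training pool each time."""
    profile = kmer_profile(query)
    pool, path = records, []
    for level in range(n_levels):
        model = train_level(pool, level)
        label = min(model, key=lambda lab: euclidean(profile, model[lab]))
        path.append(label)
        # Descend: keep only training genomes that carry the predicted label.
        pool = [rec for rec in pool if rec[1][level] == label]
    return path


# Toy training set: (sequence, lineage from highest to lowest level).
TOY_RECORDS = [
    ("ATATATATAT", ("GroupA", "A1")),
    ("AAAATTTTAA", ("GroupA", "A2")),
    ("GCGCGCGCGC", ("GroupB", "B1")),
    ("GGGGCCCCGG", ("GroupB", "B2")),
]

print(classify_top_down("TATATATATA", TOY_RECORDS, n_levels=2))
# ['GroupA', 'A1']
```

Retraining on the narrowed pool at each step mirrors the idea that lower-level models only need to separate the labels that remain plausible after the higher-level prediction. In a realistic setting the feature representation, the classifier, and the way performance is validated at each level would all need to be chosen far more carefully than in this toy example.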