Author: Alejandro Lopez-Rincon; Alberto Tonda; Lucero Mendoza-Maldonado; Eric Claassen; Johan Garssen; Aletta D. Kraneveld
Title: Accurate Identification of SARS-CoV-2 from Viral Genome Sequences using Deep Learning
Document date: 2020_3_14
ID: c2lljdi7_33
Document: We downloaded the dataset from the NGDC repository [6] on March 15, 2020. We removed repeated sequences and applied the whole procedure to translate the data into the sequence feature space. This leaves us with a frequency table of 3,827 features with 583 samples (Table 3). Next, we ran a state-of-the-art feature selection algorithm [36] to reduce the sequences needed to identify different virus strains to the bare minimum. Remarkably, we are then able to classify all samples correctly using only 53 of the original 3,827 sequences, obtaining 100% accuracy in a 10-fold cross-validation with a simpler and more traditional classifier, such as Logistic Regression.
Table 3: Organism, assigned label, and number of samples in the unique sequences obtained from the repository [6]. We use the NCBI organism naming convention [30].
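The preprocessing described above (dropping repeated sequences, then building a samples-by-features frequency table) can be illustrated with a minimal Python sketch. This excerpt does not specify how the sequence features are defined or counted, so the sketch assumes that each feature is a short nucleotide subsequence and that each cell holds the number of (possibly overlapping) occurrences of that subsequence in a genome. The function names, toy genomes, labels, and features are all hypothetical and are not taken from the paper.

```python
def deduplicate(records):
    """Keep only the first (label, sequence) record for each distinct sequence."""
    seen = set()
    unique = []
    for label, seq in records:
        if seq not in seen:
            seen.add(seq)
            unique.append((label, seq))
    return unique


def count_occurrences(seq, feature):
    """Count occurrences of `feature` in `seq`, allowing overlaps (an assumption)."""
    count = 0
    start = seq.find(feature)
    while start != -1:
        count += 1
        start = seq.find(feature, start + 1)
    return count


def frequency_table(records, features):
    """Build one row per sample and one column per feature."""
    return [[count_occurrences(seq, f) for f in features] for _, seq in records]


# Toy data: hypothetical labels and sequences, far shorter than real genomes.
records = [
    ("virus_A", "ACGTACGTAC"),
    ("virus_A", "ACGTACGTAC"),  # exact repeat, removed by deduplicate()
    ("virus_B", "TTGACGTTGA"),
]
features = ["ACGT", "TTGA", "GGG"]  # hypothetical sequence features

unique = deduplicate(records)
table = frequency_table(unique, features)

for (label, _), row in zip(unique, table):
    print(label, row)
```

In the setting of the excerpt, the analogous table would have 583 rows and 3,827 columns before feature selection, and 53 columns afterwards.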