Results

Author: Alejandro Lopez-Rincon; Alberto Tonda; Lucero Mendoza-Maldonado; Eric Claassen; Johan Garssen; Aletta D. Kraneveld

Title: Accurate Identification of SARS-CoV-2 from Viral Genome Sequences using Deep Learning

Document date: 2020_3_14

ID: c2lljdi7_31

Complete Snippet

Document: The convolutional layers of CNNs de facto learn new features to characterize the problem, directly from the data. In this specific case, the new features are specific sequences of base pairs that can more easily separate different virus strains (Fig. 12). By analyzing the result of each filter in a convolutional layer, and how its output interacts with the corresponding max pooling layer, it is possible to detect human-readable sequences of base pairs that might provide domain experts with important information. It is important to notice that these sequences are not bound to specific locations of the genome; thanks to its structure, the CNN is able to detect them and recognize their importance even if their position is displaced in different samples. For this purpose, we use the trained CNN described in Subsection 2.2, which obtained an accuracy of 98.75% in a 10-fold cross-validation.

In a first step, we plot the inputs and outputs of the convolutional layer, to visually inspect for patterns. As an example, in Fig. 13 we report the visualization of the first [...]

bioRxiv preprint (not peer-reviewed). doi: https://doi.org/10.1101/2020.03.13.990242

[...] promising, as it seems to focus on a few relevant points in the genome, and it is thus most likely able to identify meaningful sequences.

[...] 21-bps sequence that obtained the highest value from the convolutional filter, in a specific 148-position interval of the original genome: the first max pooling feature will cover positions 1-148, the second will cover positions 149-296, and so on. We graph the whole set of max pooling features for the complete data, [...] to 148 positions). As some samples might present sequences that are displaced even more, in the next experiments we decided to just consider the relative frequency of the 21-bps sequences identified at the previous step, creating a sequence feature space, to verify whether the appearance of specific sequences could be enough to differentiate between virus strains.
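
The two ideas in the excerpt, the mapping from max pooling features to genome intervals and the position-independent sequence feature space, can be illustrated with a minimal Python sketch. This is not the authors' code: the function names are hypothetical, and normalizing counts by the number of sliding windows is an assumption, since the excerpt does not say how the relative frequency is computed. Only the interval width (148) and the sequence length (21) come from the text.

```python
from collections import Counter

POOL_SIZE = 148  # width of one max pooling interval (from the text)
SEQ_LEN = 21     # length of the sequences selected by a filter (from the text)


def pooling_interval(feature_index, pool_size=POOL_SIZE):
    """Return the 1-based, inclusive genome positions covered by a
    max pooling feature (feature_index starts at 1)."""
    start = (feature_index - 1) * pool_size + 1
    end = feature_index * pool_size
    return start, end


def sequence_frequencies(genome, candidates):
    """Relative frequency of each candidate sequence in one genome.

    Occurrences are counted wherever they appear, so the feature does
    not depend on position. Dividing by the number of sliding windows
    is an assumed normalization, not taken from the source.
    All candidates are assumed to share the same length.
    """
    k = len(candidates[0])
    wanted = set(candidates)
    n_windows = max(len(genome) - k + 1, 0)
    counts = Counter()
    for i in range(n_windows):
        window = genome[i:i + k]
        if window in wanted:
            counts[window] += 1
    if n_windows == 0:
        return [0.0 for _ in candidates]
    return [counts[seq] / n_windows for seq in candidates]


if __name__ == "__main__":
    print(pooling_interval(1))  # first pooling feature
    print(pooling_interval(2))  # second pooling feature
    # Toy example with 3-bps sequences instead of 21-bps ones.
    print(sequence_frequencies("ACGTACGT", ["ACG", "CGT", "TTT"]))
```

Each genome then becomes one row of a feature matrix with one column per candidate sequence, which a downstream classifier could use to test whether the presence of specific sequences separates virus strains.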
