Author: Markus Luczak-Roesch
Title: Networks of information token recurrences derived from genomic sequences may reveal hidden patterns in epidemic outbreaks: A case study of the 2019-nCoV coronavirus.
Document date: 2020_2_11
ID: kevrp8rg_11
Document: The information tokens of interest in this study are unique codon identifiers, each encoding (a) the codon's position when a window of size 3 slides in steps of size 3 over the nucleotide sequence (e.g. "pos1", "pos2", ...), (b) a flag indicating the reading frame ("+1", "+2" or "+3") that captures the respective triplet in the current window, and (c) the nucleotide triplet that constitutes the matched codon. This tokenisation is motivated by previous work on phylogenetic profiling and gene sequencing [6, 13], which suggested that codons are well suited to assessing similarities (and dissimilarities) at the gene, chromosome, or genome level [6, 14, 22].
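
The tokenisation described above can be illustrated with a minimal Python sketch. It is not the author's implementation: the function name `codon_tokens`, the underscore-separated token layout ("pos1_+1_ATG"), and the choice to restart position numbering at 1 in each reading frame are assumptions made for illustration only. The sketch slides a non-overlapping window of size 3 over the sequence once per reading frame and emits one token per complete triplet.

```python
from typing import Iterable, List


def codon_tokens(sequence: str, frames: Iterable[int] = (1, 2, 3)) -> List[str]:
    """Return codon tokens of the (assumed) form 'pos<k>_+<frame>_<triplet>'."""
    seq = sequence.upper()
    tokens: List[str] = []
    for frame in frames:
        offset = frame - 1  # frame +1 starts at index 0, +2 at 1, +3 at 2
        # Window of size 3 moved in steps of size 3; incomplete trailing
        # triplets are skipped because start stops before len(seq) - 2.
        starts = range(offset, len(seq) - 2, 3)
        for pos, start in enumerate(starts, start=1):
            triplet = seq[start:start + 3]
            # Assumed token layout: position, reading-frame flag, triplet.
            tokens.append(f"pos{pos}_+{frame}_{triplet}")
    return tokens


if __name__ == "__main__":
    for token in codon_tokens("ATGGCCTAA"):
        print(token)
```

For the toy sequence "ATGGCCTAA" this yields three tokens in frame +1 (ATG, GCC, TAA) and two each in frames +2 (TGG, CCT) and +3 (GGC, CTA). Whether positions should instead be numbered globally across frames is not stated in the excerpt, so that choice should be checked against the full paper.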