Author: Markus Luczak-Roesch
Title: Networks of information token recurrences derived from genomic sequences may reveal hidden patterns in epidemic outbreaks: A case study of the 2019-nCoV coronavirus. Document date: 2020_2_11
ID: kevrp8rg_6
Document: These data were preprocessed using the R programming language. First, we derived a metadata object that stores the sequence identifier, the collection location, the collection date, and the raw nucleotide sequence. The metadata object was then filtered to keep only those sequences for which the collection date was not empty and whose length fell within a margin of 1,000 of the estimated 30,000 nucleotides that are currently assumed to make up the genomic code of the 2019-nCoV coronavirus (i.e. we cover nucleotide sequences from 29,000 to 31,000 in length). This left us with a total of 82 genomic sequences. We then exported a single CSV file containing only the raw nucleotide sequences, in ascending order by the collection date of the sample. This structure is the standard input format for the genes-CODON-samplesequence tokeniser featured in the Transcendental Information Cascades R toolchain, which is available as free and open scientific software 2.
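The preprocessing steps described above can be illustrated with a minimal sketch. It is written in Python rather than the R used by the authors, and it is not their implementation. The names `SequenceRecord`, `filter_records` and `export_sequences` are hypothetical, and three details are assumptions not stated in the text: the length bounds are treated as inclusive, the CSV has no header row, and each row holds exactly one sequence.

```python
import csv
import io
from dataclasses import dataclass
from datetime import date
from typing import Iterable, List, Optional, TextIO

EXPECTED_LENGTH = 30_000  # estimated genome length given in the text
MARGIN = 1_000            # tolerated deviation given in the text


@dataclass
class SequenceRecord:
    """Hypothetical stand-in for one entry of the metadata object."""
    identifier: str
    location: str
    collection_date: Optional[date]  # None models an empty collection date
    sequence: str


def filter_records(records: Iterable[SequenceRecord]) -> List[SequenceRecord]:
    """Keep records with a collection date and a length of 29,000 to 31,000."""
    low, high = EXPECTED_LENGTH - MARGIN, EXPECTED_LENGTH + MARGIN
    return [
        r for r in records
        # Assumption: both bounds are inclusive.
        if r.collection_date is not None and low <= len(r.sequence) <= high
    ]


def export_sequences(records: Iterable[SequenceRecord], handle: TextIO) -> int:
    """Write only the raw sequences, oldest sample first; return the row count."""
    ordered = sorted(records, key=lambda r: r.collection_date)
    # Assumption: one sequence per row, no header row.
    writer = csv.writer(handle, lineterminator="\n")
    for record in ordered:
        writer.writerow([record.sequence])
    return len(ordered)


if __name__ == "__main__":
    # Synthetic placeholder sequences, not real genomic data.
    demo = [
        SequenceRecord("s1", "place A", date(2020, 1, 20), "A" * 29_500),
        SequenceRecord("s2", "place B", None, "C" * 30_000),
        SequenceRecord("s3", "place C", date(2020, 1, 5), "G" * 30_200),
    ]
    buffer = io.StringIO()
    written = export_sequences(filter_records(demo), buffer)
    print(written, "sequences written")
```

Sorting happens only at export time, so the filtered metadata keeps its original order for any other use. In practice the `handle` would be a file opened with `newline=""`, as the `csv` module recommends.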