Author: Karthi Balasubramanian; Nithin Nagaraj

Title: Automatic Identification of SARS Coronavirus using Compression-Complexity Measures

Document date: 2020_3_27

ID: ljli6a2z_62

Complete Snippet

Document: Compression-complexity measures such as LZ and ETC, which are based on lossless compression algorithms, are good candidates for developing fast alignment-free methods for genome sequence analysis, comparison and identification. The main reason is their ability to characterize and analyze information in biological sequences using very short contiguous segments. As demonstrated in this study, our preliminary results suggest that ETC could be very useful for identifying an unknown sequence from a large database of nucleotide sequences, since the measure can be computed quickly on the candidate sequences from a small number of nucleotide bases. LZ complexity requires slightly longer nucleotide sequences and therefore more computation. Other information-theoretic methods in the literature, which employ Shannon entropy, mutual information, etc., would also need longer nucleotide sequences for computation and are not robust to noise. Some areas for further research are:
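
To make the two measures named in the passage concrete, the following is a minimal Python sketch, not the authors' implementation. It assumes that LZ means the Lempel-Ziv (1976) exhaustive-history phrase count and that ETC is computed by repeated pair substitution until the sequence becomes constant or has length 1. The function names `lz_complexity` and `etc_complexity`, the counting of overlapping pairs, and the first-occurrence tie-break are illustrative assumptions; published implementations may differ on these details and typically normalize the result.

```python
from collections import Counter


def lz_complexity(s):
    """Count phrases in a Lempel-Ziv (1976) style exhaustive parsing of s."""
    n = len(s)
    i = 0
    phrases = 0
    while i < n:
        length = 1
        # Grow the phrase while it can still be copied from earlier text
        # (the copy source may overlap the phrase itself).
        while i + length <= n and s[i:i + length] in s[:i + length - 1]:
            length += 1
        phrases += 1
        i += length
    return phrases


def etc_complexity(s):
    """Count pair-substitution steps needed to reach a constant sequence."""
    # Map symbols to integers so fresh symbols can never collide with them.
    alphabet = {ch: k for k, ch in enumerate(sorted(set(s)))}
    seq = [alphabet[ch] for ch in s]
    next_symbol = len(alphabet)
    steps = 0
    while len(seq) > 1 and len(set(seq)) > 1:
        # Assumption: overlapping pairs are counted; ties go to the pair
        # that appears first in the sequence.
        counts = Counter(zip(seq, seq[1:]))
        target = max(counts, key=counts.get)
        out = []
        i = 0
        while i < len(seq):
            # Replace non-overlapping occurrences, scanning left to right.
            if i + 1 < len(seq) and (seq[i], seq[i + 1]) == target:
                out.append(next_symbol)
                i += 2
            else:
                out.append(seq[i])
                i += 1
        seq = out
        next_symbol += 1
        steps += 1
    return steps


if __name__ == "__main__":
    fragment = "ATGCGATACGCTTGA"  # made-up nucleotide fragment, illustration only
    print("LZ :", lz_complexity(fragment))
    print("ETC:", etc_complexity(fragment))
```

Both functions accept any string, so a short window of a nucleotide sequence can be passed directly. Comparing an unknown sequence against a database would additionally require a normalization and a distance or classification rule, neither of which is specified in this excerpt.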
