Results

Author: Kuruppu, Shanika; Puglisi, Simon J.; Zobel, Justin

Title: Relative Lempel-Ziv Compression of Genomes for Large-Scale Storage and Retrieval

Cord-id: miz02qro

Document date: 2010_1_1

Document: Self-indexes – data structures that simultaneously provide fast search of and access to compressed text – are promising for genomic data but in their usual form are not able to exploit the high level of replication present in a collection of related genomes. Our ‘RLZ’ approach is to store a self-index for a base sequence and then compress every other sequence as an LZ77 encoding relative to the base. For a collection of r sequences totaling N bases, with a total of s point mutations from a base sequence of length n, this representation requires just [Formula: see text] bits. At the cost of negligible extra space, access to ℓ consecutive symbols requires [Formula: see text] time. Our experiments show that, for example, RLZ can represent individual human genomes in around 0.1 bits per base while supporting rapid access and using relatively little memory.
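
Illustration: the idea of compressing a sequence "as an LZ77 encoding relative to the base" can be sketched as a greedy parse of each sequence into factors, where every factor is either a (position, length) reference into the base sequence or a single literal symbol that does not occur in the base. The sketch below is a minimal, unoptimized illustration under stated assumptions, not the authors' implementation: the function names `rlz_factorize` and `rlz_decode` and the factor layout are hypothetical, and the longest match is found with a naive substring search, whereas the abstract describes searching via a self-index on the base.

```python
def rlz_factorize(base: str, seq: str) -> list[tuple]:
    """Greedily parse `seq` into factors relative to `base`.

    Each factor is (position, length) for a copy from `base`, or
    (symbol, 0) for a literal that does not occur in `base`.
    Assumption: naive substring search stands in for a self-index.
    """
    factors = []
    i = 0
    while i < len(seq):
        pos, length = -1, 0
        # Extend the match one symbol at a time while it still occurs in base.
        while i + length < len(seq):
            found = base.find(seq[i:i + length + 1])
            if found == -1:
                break
            pos, length = found, length + 1
        if length == 0:
            factors.append((seq[i], 0))  # literal symbol absent from base
            i += 1
        else:
            factors.append((pos, length))  # copy base[pos:pos + length]
            i += length
    return factors


def rlz_decode(base: str, factors: list[tuple]) -> str:
    """Rebuild the original sequence from its factors and the base."""
    parts = []
    for first, length in factors:
        parts.append(first if length == 0 else base[first:first + length])
    return "".join(parts)


if __name__ == "__main__":
    base = "ACGTACGTTGCA"
    seq = "ACGTTGCAACGA"
    factors = rlz_factorize(base, seq)
    print(factors)
    print(rlz_decode(base, factors) == seq)
```

A sequence that differs from the base by only a few point mutations parses into a small number of long factors, which is the intuition behind the low bits-per-base figure quoted above. A practical encoder would also need a compact encoding of the positions and lengths, which this sketch omits.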
