Selected article for: "lz complexity and lz lempel ziv complexity"

Author: Pranay, SY; Nagaraj, Nithin
Title: Causal Discovery using Compression-Complexity Measures
  • Cord-id: yyvo4lff
  • Document date: 2020_10_19
    Document: Causal inference is one of the most fundamental problems across all domains of science. We address the problem of inferring a causal direction from two observed discrete symbolic sequences X and Y. We present a framework that relies on lossless compressors to infer context-free grammars (CFGs) from sequence pairs and quantifies the extent to which the grammar inferred from one sequence compresses the other sequence. We infer that X causes Y if the grammar inferred from X compresses Y better than the grammar inferred from Y compresses X. To put this notion into practice, we propose three models that use the Compression-Complexity Measures (CCMs), namely Lempel-Ziv (LZ) complexity and Effort-To-Compress (ETC), to infer CFGs and discover causal directions. We evaluate these models on synthetic and real-world benchmarks and empirically observe performance competitive with current state-of-the-art methods. Lastly, we present a unique application of the proposed models for causal inference directly from pairs of SARS-CoV-2 genome sequences. Using a large number of sequences, we show that our models capture directed causal information exchange between sequence pairs, presenting novel opportunities for addressing key issues such as contact tracing, motif discovery, and the evolution of virulence and pathogenicity in future applications.
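
For readers unfamiliar with the two measures named in the abstract, the following is a minimal Python sketch of how LZ complexity and ETC can be computed for a single symbolic sequence. It is an illustration only and not the authors' three causal models: the function names `lz_complexity` and `etc` are hypothetical, the LZ variant assumed here is the 1976 phrase-counting parse, and the ETC variant is a simplified pair-substitution loop that counts overlapping pairs and breaks ties by first occurrence.

```python
from collections import Counter


def lz_complexity(s):
    """Count phrases in a Lempel-Ziv (1976-style) parse of the string s.

    Each phrase is the shortest prefix of the remaining input that has not
    already appeared earlier in the sequence (overlap with the phrase itself
    is allowed, except for its final symbol).
    """
    i, phrases, n = 0, 0, len(s)
    while i < n:
        k = 1
        # Grow the candidate phrase while it can be copied from earlier text.
        while i + k <= n and s[i:i + k] in s[:i + k - 1]:
            k += 1
        phrases += 1
        i += k
    return phrases


def etc(seq):
    """Effort-To-Compress: number of pair-substitution passes needed to
    reduce seq to a constant sequence (or to a single symbol).

    Simplifying assumptions: pair counts include overlapping occurrences,
    and ties are broken in favour of the pair seen first.
    """
    seq = list(seq)
    steps = 0
    while len(seq) > 1 and len(set(seq)) > 1:
        counts = Counter(zip(seq, seq[1:]))
        pair = max(counts, key=counts.get)  # first pair with the top count
        new_symbol = ("pair", steps)        # fresh symbol, cannot clash with input
        out, i = [], 0
        while i < len(seq):
            if i + 1 < len(seq) and (seq[i], seq[i + 1]) == pair:
                out.append(new_symbol)
                i += 2
            else:
                out.append(seq[i])
                i += 1
        seq = out
        steps += 1
    return steps


if __name__ == "__main__":
    print(lz_complexity("0001101001000101"))  # parse: 0|001|10|100|1000|101
    print(etc("11010010"))
```

Both functions return small integers that grow with the irregularity of the input; a constant sequence such as "0000" needs no substitution passes at all under this ETC sketch.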
