
Author: Liao, Herui; Cai, Dehan; Sun, Yanni
Title: VirStrain: a strain identification tool for RNA viruses
  • Cord-id: 06e96v7h
  • Document date: 2021_3_23
    Document: Genome epidemiology, which uses genomic data to analyze the source and spread of infectious diseases, provides important information beyond interview-based methods. Given the fast accumulation of sequenced viral genomes, a basic need in genome epidemiology is to identify which reference genomes are identical or closest to the ones in a sequenced sample. The associated metadata, such as geographical locations, can then be used to infer the transmission network. In this work, we deliver VirStrain, a fast and accurate tool for conducting strain-level analysis from short reads. By using a greedy covering algorithm, we are able to derive unique k-mer combinations for highly similar reference genomes. VirStrain can detect the most likely strain as well as multiple strains that may simultaneously infect the same host. We tested VirStrain on three types of RNA viruses whose reference genomes have different similarity distributions. For each type of virus, we assessed VirStrain across multiple benchmark datasets of different properties and complexity. The experimental results on both simulated and real sequencing data show that VirStrain outperforms other strain identification tools.
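
    The idea of deriving unique k-mer combinations with a greedy covering algorithm can be illustrated with a minimal sketch. This is not VirStrain's actual implementation: it assumes a textbook greedy set cover in which the "universe" is every pair of reference genomes, and a k-mer "covers" a pair when it is present in one genome and absent from the other. The function names, the toy sequences, and the choice of k = 3 are all hypothetical.

```python
from itertools import combinations


def kmers(seq, k):
    """Return the set of all length-k substrings of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def greedy_distinguishing_kmers(genomes, k):
    """Greedily pick k-mers until every pair of genomes differs on one.

    genomes: dict mapping a (hypothetical) strain name to its sequence.
    Returns the chosen k-mers in selection order.
    """
    profiles = {name: kmers(seq, k) for name, seq in genomes.items()}
    candidates = set().union(*profiles.values())
    # Universe to cover: every unordered pair of genomes.
    uncovered = set(combinations(sorted(genomes), 2))
    chosen = []
    while uncovered:
        best, best_pairs = None, set()
        for kmer in sorted(candidates):  # sorted for deterministic ties
            # Pairs separated by this k-mer: present in exactly one genome.
            pairs = {(a, b) for a, b in uncovered
                     if (kmer in profiles[a]) != (kmer in profiles[b])}
            if len(pairs) > len(best_pairs):
                best, best_pairs = kmer, pairs
        if best is None:
            # Remaining pairs have identical k-mer sets; stop.
            break
        chosen.append(best)
        uncovered -= best_pairs
        candidates.discard(best)
    return chosen


# Toy reference genomes (made up for illustration only).
example_genomes = {
    "strain_A": "ACGTACGTAA",
    "strain_B": "ACGTACGTAC",
    "strain_C": "ACGTTCGTAA",
}
chosen = greedy_distinguishing_kmers(example_genomes, k=3)

# Presence/absence pattern of the chosen k-mers for each genome.
signatures = {
    name: tuple(km in kmers(seq, 3) for km in chosen)
    for name, seq in example_genomes.items()
}
print(chosen)
print(signatures)
```

    In this sketch, each genome ends up with a distinct presence/absence pattern over the selected k-mers, which is the property that makes highly similar references separable. A real tool would additionally need to handle sequencing errors, read coverage, and far larger reference sets.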
