Author: Jiao Chen; Jiayu Shang; Jianrong Wang; Yanni Sun
Title: A binning tool to reconstruct viral haplotypes from assembled contigs
  • Document date: 2019-07-16
  • ID: 2basllfv_6
    Document: Contig binning for viral quasispecies poses unique challenges. First, the goal of binning is to distinguish contigs from different viral strains rather than from different species. Composition-based features such as tetranucleotide frequencies or GC content are therefore not informative enough to separate contigs from different haplotypes, which usually share high sequence similarity (over 90%). Tools that rely heavily on sequence composition-based features will not be able to estimate the number of haplotypes correctly. Second, RNA virus sequencing tends to be complicated by gene expression and fast degradation, so the observed sequencing coverage along each haplotype, or even along a single contig, can be more heterogeneous than expected. In addition, if a contig contains a region that is common to multiple haplotypes, that region tends to have higher coverage than a haplotype-specific segment, because reads from every haplotype that shares the region map to it. All these challenges call for carefully designed methods that use coverage information for contig binning.
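    The two observations above can be illustrated with a minimal Python sketch. It is not the authors' method: the sequences are simulated, the 5% substitution rate and the AT-rich "other species" are arbitrary assumptions, and every function name (`gc_content`, `tetranucleotide_freq`, `mutate`, `shared_region_coverage`) is hypothetical. The coverage helper further assumes an idealized additive model in which reads are sampled uniformly from each haplotype, which the text notes is often violated for RNA viruses.

```python
import random
from collections import Counter
from itertools import product

# All 256 possible tetranucleotides, in a fixed order.
KMERS = ["".join(p) for p in product("ACGT", repeat=4)]


def gc_content(seq):
    """Fraction of bases that are G or C."""
    return (seq.count("G") + seq.count("C")) / len(seq)


def tetranucleotide_freq(seq):
    """Normalized tetranucleotide frequency vector (length 256)."""
    counts = Counter(seq[i:i + 4] for i in range(len(seq) - 3))
    total = sum(counts[k] for k in KMERS)
    return [counts[k] / total for k in KMERS]


def euclidean(u, v):
    """Euclidean distance between two equal-length vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v)) ** 0.5


def mutate(seq, rate, rng):
    """Substitute each base with probability `rate` (assumed model)."""
    out = []
    for base in seq:
        if rng.random() < rate:
            out.append(rng.choice([b for b in "ACGT" if b != base]))
        else:
            out.append(base)
    return "".join(out)


def shared_region_coverage(haplotype_coverages):
    """Expected coverage of a region common to several haplotypes,
    assuming coverage is additive across the haplotypes sharing it."""
    return sum(haplotype_coverages)


rng = random.Random(0)
# Two simulated haplotypes of one virus, about 95% identical.
hap_a = "".join(rng.choice("ACGT") for _ in range(5000))
hap_b = mutate(hap_a, 0.05, rng)
# A simulated AT-rich genome standing in for a different species.
other = "".join(rng.choices("ACGT", weights=[4, 1, 1, 4], k=5000))

tnf_a, tnf_b, tnf_o = map(tetranucleotide_freq, (hap_a, hap_b, other))
print("TNF distance, haplotype vs haplotype:", euclidean(tnf_a, tnf_b))
print("TNF distance, haplotype vs other species:", euclidean(tnf_a, tnf_o))
print("GC content:", gc_content(hap_a), gc_content(hap_b), gc_content(other))

# A region shared by haplotypes sequenced at 30x and 10x.
print("Shared-region coverage:", shared_region_coverage([30.0, 10.0]))
```

    In this simulation the composition features of the two haplotypes are nearly identical while the unrelated genome is clearly separated, which is why composition alone cannot resolve strains. The coverage helper shows why a shared region stands out against the haplotype-specific segments on either side of it.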
