Selected article for: "global alignment and local alignment"

Author: Jiao Chen; Jiayu Shang; Jianrong Wang; Yanni Sun
Title: A binning tool to reconstruct viral haplotypes from assembled contigs
  • Document date: 2019_7_16
  • ID: 2basllfv_17
    Document: Although the high similarity between haplotypes is a barrier to using k-mer-based features to distinguish contigs from different haplotypes, it creates an opportunity for estimating the number of haplotypes. Under a stringent alignment threshold, contigs that align with each other usually come from the same region of the virus, so the number of aligned contigs can be used, with care, to estimate the haplotype number. We progressively build multiple sequence alignments from the contigs' pairwise alignments. In this step, base-level accuracy of the alignment is not a major concern, so progressive construction of the alignment between contigs serves the purpose well. We first sort the contigs by length in descending order. The longest contig is used as the first reference. All other contigs are aligned to the reference using blast+ (Camacho et al., 2009) to generate an alignment profile similar to a multiple sequence alignment. Two types of alignments are kept from the blast+ output. One is the "glocal" alignment, which is local to the reference but global to the shorter contig. The other is the overlap alignment, which aligns a suffix of one contig to a prefix of the other. If not all of the shorter contigs can be aligned to the reference contig, the process continues with the second longest contig as the reference, until all the contigs are used. Fig. 2(c) shows the alignment between contigs using the longest contig as the reference, which is usually produced for the most abundant haplotype.
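
The progressive procedure described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the `Hit` record, the `classify` and `progressive_groups` functions, and the `slack` tolerance are hypothetical names introduced here, and the pairwise hits are assumed to have been parsed already (for example from blast+ tabular output) into 0-based, half-open coordinates. The sketch also assumes one reading of the text, namely that each new reference is the longest contig not yet placed in any alignment.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hit:
    """One pairwise alignment of a shorter contig (query) to a reference.

    Coordinates are 0-based and half-open: [start, end). Hypothetical record.
    """
    query: str
    ref: str
    q_start: int
    q_end: int
    r_start: int
    r_end: int


def classify(hit, q_len, r_len, slack=0):
    """Return 'glocal', 'overlap', or None for a hit.

    glocal:  the whole query is covered (global to the shorter contig).
    overlap: a prefix of one contig aligns to a suffix of the other.
    slack:   assumed tolerance, in bases, for reaching a contig end.
    """
    q_left = hit.q_start <= slack
    q_right = hit.q_end >= q_len - slack
    if q_left and q_right:
        return "glocal"
    r_left = hit.r_start <= slack
    r_right = hit.r_end >= r_len - slack
    if (q_left and r_right) or (q_right and r_left):
        return "overlap"
    return None


def progressive_groups(lengths, hits, slack=0):
    """Group contigs around successive references, longest first.

    lengths: dict mapping contig id to length.
    hits:    iterable of Hit records.
    Returns a dict: reference id -> list of (contig id, alignment type).
    """
    # Sort by length descending; break ties by id for determinism.
    order = sorted(lengths, key=lambda c: (-lengths[c], c))
    unused = set(order)
    groups = {}
    for ref in order:
        if ref not in unused:
            continue  # already placed under an earlier reference
        unused.discard(ref)
        members = []
        for h in hits:
            if h.ref != ref or h.query not in unused:
                continue
            kind = classify(h, lengths[h.query], lengths[ref], slack)
            if kind is not None:
                members.append((h.query, kind))
                unused.discard(h.query)
        groups[ref] = members
    return groups


# Toy input: made-up contig lengths and hits, for illustration only.
lengths = {"c1": 100, "c2": 40, "c3": 30, "c4": 60, "c5": 20}
hits = [
    Hit("c2", "c1", 0, 40, 10, 50),    # c2 fully covered by c1
    Hit("c3", "c1", 0, 15, 85, 100),   # prefix of c3 vs. suffix of c1
    Hit("c5", "c4", 0, 20, 5, 25),     # c5 fully covered by c4
    Hit("c4", "c1", 10, 30, 20, 40),   # internal partial hit, discarded
]
groups = progressive_groups(lengths, hits)
```

In this toy input, `c4` has only an internal partial hit to `c1`, so it is not kept under the first reference and becomes the second reference itself. The number of contigs stacked over a region of each reference is the quantity the text proposes to use for haplotype number estimation; the sketch stops at grouping and does not compute that estimate.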
