Author: Hawkins, John A.; Kaczmarek, Maria E.; Müller, Marcel A.; Drosten, Christian; Press, William H.; Sawyer, Sara L.
Title: A metaanalysis of bat phylogenetics and positive selection based on genomes and transcriptomes from 18 species
  • Document date: 2019_6_4
  • ID: telmxmp4_12
    Document: Multiple Sequence Alignment Cleaning. Manual inspection of many multiple sequence alignments of orthologous genes revealed a nonrandom source of error: species tended to segregate by data type (i.e., genomic vs. transcriptomic data). Some poorly aligned regions agreed within a data type but disagreed between data types. An example is shown in Fig. 2A. Furthermore, the splits occurred at sharp boundaries highly suggestive of exon boundaries. This effect can be largely explained by the fact that we chose the longest isoforms predicted from annotated genome sequences, even though the longest isoforms might not be expressed at high enough levels to appear in the transcriptomic datasets from a given tissue. Other artifacts in transcriptomic or genomic data assembly and annotation could also contribute to this effect. Quantification of this bias, based on the cleaning pipeline described below, is shown in SI Appendix, Fig. S2.
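
    Illustration: the segregation pattern described above (agreement within a data type, disagreement between data types, over a sharply bounded stretch) can be sketched in a few lines of Python. This is a toy illustration only, not the authors' cleaning pipeline. The function names, the "genome"/"transcriptome" labels, the species names, and the strict-unanimity rule are all assumptions made for the example.

```python
def segregating_columns(alignment, data_type):
    """Return indices of columns that split cleanly by data type.

    alignment: dict mapping species name -> aligned sequence (equal lengths).
    data_type: dict mapping species name -> "genome" or "transcriptome".
    A column is flagged when every genomic sequence shares one residue,
    every transcriptomic sequence shares one residue, and the two differ.
    (Strict unanimity is a simplifying assumption for this sketch.)
    """
    genomic = [s for s in alignment if data_type[s] == "genome"]
    transcriptomic = [s for s in alignment if data_type[s] == "transcriptome"]
    length = len(next(iter(alignment.values())))
    flagged = []
    for i in range(length):
        g = {alignment[s][i] for s in genomic}
        t = {alignment[s][i] for s in transcriptomic}
        if len(g) == 1 and len(t) == 1 and g != t:
            flagged.append(i)
    return flagged


def contiguous_runs(indices):
    """Group sorted column indices into inclusive (start, end) runs."""
    runs = []
    for i in indices:
        if runs and i == runs[-1][1] + 1:
            runs[-1] = (runs[-1][0], i)  # extend the current run
        else:
            runs.append((i, i))          # start a new run
    return runs


# Hypothetical toy alignment: two genome-derived and two
# transcriptome-derived sequences that disagree over columns 3-5.
alignment = {
    "bat_A": "MKTLLAGQ",
    "bat_B": "MKTLLAGQ",
    "bat_C": "MKTPPSGQ",
    "bat_D": "MKTPPSGQ",
}
data_type = {
    "bat_A": "genome",
    "bat_B": "genome",
    "bat_C": "transcriptome",
    "bat_D": "transcriptome",
}

flagged = segregating_columns(alignment, data_type)
runs = contiguous_runs(flagged)
```

    Grouping flagged columns into contiguous runs mirrors the observation that the splits occur at sharp boundaries; a real analysis would need to tolerate gaps and partial agreement rather than require unanimity.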
