Author: Sofia Morfopoulou; Vincent Plagnol
Title: Bayesian mixture analysis for metagenomic community profiling. Document date: 2014_7_25
ID: 058r9486_24
Document: We then assessed how the read support parameter r affects the output of metaMix. We ran metaMix on the benchmark simHC FAMeS dataset with r = {10, 20, 30, 50} reads (Table 3, Figure 1). As r decreases, a few more related strains that are in the reference database but not in the community are retained in the output; as r increases, two similar strains are merged into one. We compared these results with the output of Pathoscope and MEGAN. Neither method has a read support parameter serving the same purpose as in metaMix, so we tuned the most relevant parameter in each tool. Pathoscope has a thetaPrior parameter that enforces a unique read penalty. This parameter represents the read pseudocounts for the non-unique matches; its default value is zero, which corresponds to a non-informative prior. With the default setting Pathoscope identifies 47 taxa. When thetaPrior is in (1,7) it identifies 22 taxa, while with thetaPrior > 7 it identifies 165. With this latter setting, which is the one we chose for the comparison, Pathoscope behaves as a standard mixture model. MEGAN has a "Min Support" parameter, which sets the minimum number of reads that must be assigned to a taxon for it to appear in the result. Any read assigned to a taxon lacking the required support is pushed up the taxonomy until a taxon with sufficient support is found. We used Min Support = {10, 20, 30, 50} reads. The respective numbers of taxa in the summary files were 250, 243, 236 and 232.
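The "Min Support" behaviour described for MEGAN can be illustrated with a minimal Python sketch. This is not MEGAN's code: the function name apply_min_support, the toy taxonomy and the read counts are all hypothetical, and the sketch assumes a simple tree in which each taxon has one parent. It only shows the idea of moving reads from under-supported taxa up to the nearest ancestor that reaches the threshold.

```python
from collections import Counter


def apply_min_support(read_counts, parent, min_support):
    """Push reads from under-supported taxa up the taxonomy.

    read_counts: dict mapping taxon -> reads assigned directly to it
    parent:      dict mapping taxon -> parent taxon (the root maps to None)
    min_support: minimum number of reads a taxon needs to be reported
    """
    counts = Counter(read_counts)

    def depth(taxon):
        # Number of steps from this taxon up to the root.
        d = 0
        while parent.get(taxon) is not None:
            taxon = parent[taxon]
            d += 1
        return d

    # Visit the deepest taxa first, so every child has pushed its reads
    # up before its parent is compared with the threshold.
    taxa = set(parent) | set(counts)
    for taxon in sorted(taxa, key=depth, reverse=True):
        up = parent.get(taxon)
        if up is not None and 0 < counts[taxon] < min_support:
            counts[up] += counts[taxon]
            counts[taxon] = 0

    # Report only taxa that still hold reads. The root keeps whatever
    # reaches it, since there is nowhere further to push.
    return {taxon: n for taxon, n in counts.items() if n > 0}


# Hypothetical toy taxonomy and read assignments (illustration only).
parent = {
    "root": None,
    "Bacteria": "root",
    "GenusA": "Bacteria",
    "GenusB": "Bacteria",
    "strainA1": "GenusA",
    "strainA2": "GenusA",
}
read_counts = {"strainA1": 40, "strainA2": 6, "GenusA": 3, "GenusB": 4}

for threshold in (10, 50):
    print(threshold, apply_min_support(read_counts, parent, threshold))
```

In this toy case a higher threshold leaves fewer, coarser taxa in the output while the total number of reads is unchanged, which is the same direction as the decreasing taxon counts reported above for Min Support = {10, 20, 30, 50}.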