Author: Mathias Kuhring; Joerg Doellinger; Andreas Nitsche; Thilo Muth; Bernhard Y. Renard
Title: An iterative and automated computational pipeline for untargeted strain-level identification using MS/MS spectra from pathogenic samples
Document date: 2019_10_24
ID: k7hm3aow_14
Document: For benchmarking, we compare TaxIt against classic comprehensive search strategies based on direct, non-iterative taxonomic identification supported by either unique PSMs or the abundance similarity correction provided by Pipasic 44. TaxIt uses NCBI RefSeq proteins of selected kingdoms as reference databases for initial species identification, followed by automated and selective incorporation of strain proteins. The unique-PSMs-based and Pipasic-based strategies, however, apply comprehensive databases that integrate as many strain-level sequences as possible at once, including all protein sequences from the NCBI Protein database for the selected kingdoms. In general, a preselection of kingdoms may be justified by clinical findings based on, for instance, symptoms or microscopic examination 57. Both the unique-PSMs-based and the Pipasic-based strategies use the same procedures for peptide search, FDR control and taxonomic classification as described for the iterative workflow. However, PSMs are not summarized at species level; instead, counts are inferred directly at the lowest possible taxonomic level. For the unique-PSMs-based strategy, adjusted counts are based on PSMs that occur only once. In other words, only spectra assigned unambiguously to a single peptide sequence, and thus to a single organism, are taken into account. The abundance similarity correction of Pipasic uses the similarity of expressed proteomes between taxa to account for attribution biases. Originally intended for metaproteomic abundance correction, it is applied here to highlight the most likely strain within a provided sample.
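The unique-PSMs counting rule described above can be illustrated with a minimal sketch. It is not the TaxIt implementation: the function name `unique_psm_counts`, the `(spectrum_id, peptide, taxon)` tuple layout, and the example strain labels are all assumptions made for illustration. It assumes that FDR filtering and taxonomic classification have already been applied.

```python
from collections import Counter, defaultdict


def unique_psm_counts(psms):
    """Count spectra per taxon, keeping only unambiguous assignments.

    psms: iterable of (spectrum_id, peptide, taxon) tuples. This layout is
    hypothetical and chosen only for the example.
    """
    peptides = defaultdict(set)  # spectrum_id -> matched peptide sequences
    taxa = defaultdict(set)      # spectrum_id -> taxa those peptides map to
    for spectrum_id, peptide, taxon in psms:
        peptides[spectrum_id].add(peptide)
        taxa[spectrum_id].add(taxon)

    counts = Counter()
    for spectrum_id, taxon_set in taxa.items():
        # A spectrum counts only if it maps to exactly one peptide
        # and that peptide maps to exactly one taxon.
        if len(peptides[spectrum_id]) == 1 and len(taxon_set) == 1:
            counts[next(iter(taxon_set))] += 1
    return counts


# Hypothetical PSMs at the lowest available taxonomic level (strains).
example_psms = [
    ("s1", "PEPTIDEA", "strain_A"),
    ("s2", "PEPTIDEB", "strain_A"),
    ("s2", "PEPTIDEB", "strain_B"),  # peptide shared by two strains: dropped
    ("s3", "PEPTIDEC", "strain_B"),
    ("s4", "PEPTIDED", "strain_A"),
    ("s4", "PEPTIDEE", "strain_A"),  # spectrum matched two peptides: dropped
]
counts = unique_psm_counts(example_psms)
```

In this toy input, only spectra s1 and s3 satisfy both conditions, so each strain receives a single count. The discarded spectra show why this strategy can lose evidence when closely related strains share most of their peptides.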