Author: Mathias Kuhring; Joerg Doellinger; Andreas Nitsche; Thilo Muth; Bernhard Y. Renard
Title: An iterative and automated computational pipeline for untargeted strain-level identification using MS/MS spectra from pathogenic samples
Document date: 2019_10_24
ID: k7hm3aow_1_0
Document: Pathogenic strain identification using LC-MS/MS-based proteomics is a crucial yet highly challenging task. Many pathogenic strains within a species differ significantly in phenotype with respect to pathogenicity, zoonotic potential, cell attachment and entry, host-virus interaction and clinical symptoms [1] [2] [3]. In a diagnostic context, strain-level knowledge is important to infer virulence [4, 5] and drug resistance [6] for appropriate therapy. However, inferring exact strain information from proteomic samples remains difficult, in particular when the taxonomic origin of a sample is unknown and when related strains feature high sequence similarity. In recent years, MALDI-TOF mass spectrometry has gained popularity as a fast, sensitive and economical method for microbial biotyping. However, identifying strains using MALDI-TOF workflows is still very challenging and requires curated, often proprietary spectral databases [7]. Several commercial platforms for microbial biotyping down to the species or strain level are available, based on MALDI and other technologies, such as the Bruker MALDI Biotyper Systems [8], the Bruker Strain typing with IR Biotyper [9] and the Ibis T5000 Universal Biosensor [10]. Several studies report limitations of MALDI-TOF biotyping for strain-level identification and advocate advancing towards MS/MS marker peptide detection; consequently, they shifted the analysis to the MS/MS level [11, 12]. In these studies, however, MS/MS-based protein identification relied on sequence databases that were already targeted or restricted to particular species or limited sets. In contrast, untargeted MS/MS typing approaches are limited to species-level identification [6, 13].
However, in general MS/MS is preferred for the analysis of complex unpurified peptide mixtures, as it is considered to provide more distinct and unambiguous peptide and protein identifications [14] and thus increased proteome resolution [15] as well as higher statistical confidence [16]. In particular, organisms of unknown taxonomic origin benefit from peptide sequence-based analysis, as MALDI-based biotyping is comparatively prone to ambiguous identifications [17]. Furthermore, advances in instrumentation, including higher resolution, mass accuracy and dynamic range, increasingly allow identification of the majority of fragmented peptides [18], resulting in higher sensitivity, higher coverage of target proteomes and thus greater availability of distinctive features. Nevertheless, taking advantage of the vast number of available protein sequences for MS/MS strain-level identification is challenging. On the one hand, constraining the search space may result in unidentified strains or incorrectly assigned taxa, in particular for non-model organisms [19]. On the other hand, applying large databases is not recommended either, since it decreases peptide identification rates [20] and thus eventually impedes taxonomic inference [21]. Furthermore, sequence quality often decreases with increasing database size (e.g. when using the complete NCBI Protein database instead of the NCBI RefSeq database), and contaminant sequences may occur more often [22]. Therefore, extended databases should only be used when necessary. However, strain-level identification of MS/MS spectra from samples with unclear taxonomic status requires an untargeted search against comprehensive databases holding as many strains as possible. A common and popular concept to handle incr