Author: Balaji, Advait; Kille, Bryce; Kappell, Anthony D.; Godbold, Gene D.; Diep, Madeline; Elworth, R. A. Leo; Qian, Zhiqin; Albin, Dreycey; Nasko, Daniel J.; Shah, Nidhi; Pop, Mihai; Segarra, Santiago; Ternus, Krista L.; Treangen, Todd J.
Title: SeqScreen: Accurate and Sensitive Functional Screening of Pathogenic Sequences via Ensemble Learning
Cord-id: pznblc2h
Document date: 2021_8_8
ID: pznblc2h
Snippet: The COVID-19 pandemic has emphasized the importance of detecting known and emerging pathogens from clinical and environmental samples. However, robust characterization of pathogenic sequences remains an open challenge. To this end, we developed SeqScreen, which can accurately characterize short nucleotide sequences using taxonomic and functional labels, and a customized set of curated Functions of Sequences of Concern (FunSoCs) specific to microbial pathogenesis. We show our ensemble machine lea
Document: The COVID-19 pandemic has emphasized the importance of detecting known and emerging pathogens from clinical and environmental samples. However, robust characterization of pathogenic sequences remains an open challenge. To this end, we developed SeqScreen, which can accurately characterize short nucleotide sequences using taxonomic and functional labels, and a customized set of curated Functions of Sequences of Concern (FunSoCs) specific to microbial pathogenesis. We show our ensemble machine learning model can label protein-coding sequences with FunSoCs with high recall and precision. SeqScreen is a step towards a novel paradigm of functionally informed pathogen characterization and is available for download at: www.gitlab.com/treangenlab/seqscreen