
Author: Almog, Gal; Olabode, Abayomi S; Poon, Art FY
Title: Tuning intrinsic disorder predictors for virus proteins
  • Cord-id: wch72xks
  • Document date: 2020_10_27
    Document: Many virus-encoded proteins have intrinsically disordered regions that lack a stable folded three-dimensional structure. These disordered proteins often play important functional roles in virus replication, such as down-regulating host defense mechanisms. With the widespread availability of next-generation sequencing, the number of new virus genomes with predicted open reading frames is rapidly outpacing our capacity for directly characterizing protein structures through crystallography. Hence, computational methods for structural prediction play an important role. A large number of predictors focus on the problem of classifying residues into ordered and disordered regions, and these methods tend to be validated on a diverse training set of proteins from eukaryotes, prokaryotes and viruses. In this study, we investigate whether some predictors outperform others in the context of virus proteins. We evaluate the prediction accuracy of 21 methods, many of which are only available as web applications, on a curated set of 126 proteins encoded by viruses. Furthermore, we apply a random forest classifier to these predictor outputs. Based on cross-validation experiments, this ensemble approach confers a substantial improvement in accuracy, e.g., a mean 36% gain in Matthews correlation coefficient. Lastly, we apply the random forest predictor to SARS-CoV-2 ORF6, an accessory gene that encodes a short (61 AA) and moderately disordered protein that inhibits the host innate immune response.
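
    The abstract reports accuracy as a Matthews correlation coefficient (MCC). As a minimal sketch of how that score can be computed for per-residue order/disorder labels, the Python function below applies the standard MCC formula to two binary sequences. It is not the authors' code: the function name `mcc`, the 1 = disordered / 0 = ordered encoding, and the toy labels are assumptions made for illustration only.

```python
from math import sqrt


def mcc(truth, predicted):
    """Matthews correlation coefficient for two binary label sequences.

    Assumed encoding (illustrative): 1 = disordered residue, 0 = ordered.
    """
    if len(truth) != len(predicted):
        raise ValueError("label sequences must have the same length")

    # Tally the four cells of the confusion matrix.
    pairs = list(zip(truth, predicted))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)

    denominator = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denominator == 0:
        # Common convention: report 0 when any marginal total is empty.
        return 0.0
    return (tp * tn - fp * fn) / denominator


# Toy example (made-up labels, not data from the study).
truth = [1, 1, 1, 0, 0, 0, 0, 0]
predicted = [1, 1, 0, 0, 0, 0, 0, 1]
score = mcc(truth, predicted)
```

    MCC ranges from -1 (every residue misclassified) through 0 (no better than chance) to 1 (perfect agreement). Because it uses all four confusion-matrix cells, it remains informative when disordered residues are a minority of the sequence.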
