Selected article for: "antibody test and international license"

Author: Rishikesh Magar; Prakarsh Yadav; Amir Barati Farimani
Title: Potential Neutralizing Antibodies Discovered for Novel Corona Virus Using Machine Learning
  • Document date: 2020_3_20
  • ID: fn7l93wh_26
    Snippet: bioRxiv preprint, doi: https://doi.org/10.1101/2020.03.14.992156. The copyright holder for this preprint (which was not peer-reviewed) is the author/funder. It is made available under a CC-BY-NC-ND 4.0 International license. Figure 1. Designing antibodies or peptide sequences that can inhibit the COVID-19 virus requires high-throughput experimentation on a vast number of mutated sequences of potential inhibitors. The screening of thousands of available s.....
    Document: bioRxiv preprint, doi: https://doi.org/10.1101/2020.03.14.992156. The copyright holder for this preprint (which was not peer-reviewed) is the author/funder. It is made available under a CC-BY-NC-ND 4.0 International license.
    Figure 1. Designing antibodies or peptide sequences that can inhibit the COVID-19 virus requires high-throughput experimentation on a vast number of mutated sequences of potential inhibitors. Screening thousands of available antibody strains is prohibitively expensive and not feasible due to the lack of available structures. However, machine learning models can enable rapid and inexpensive exploration of a vast sequence space on the computer in a fraction of a second. We collected 1,933 virus-antibody sequences with clinical patient IC50 data. Graph featurization of antibody-antigen sequences creates a unique molecular representation. Using this graph representation, we benchmarked a variety of shallow and deep learning models and selected XGBoost because of its superior performance and interpretability. We trained our model using a dataset including 1,933 diverse virus epitopes and their antibodies. To generate the hypothetical antibody library, we mutated the SARS scaffold antibody of 2006 (PDB:2GHW) and generated thousands of possible candidates. Using the ML model, we classified these sequences and selected the top 18 sequences predicted with high confidence to neutralize COVID-19. We used MD simulations to check the stability of the 18 sequences and ranked them by stability.
    Figure 3.
a) Test accuracy with five-fold cross-validation for XGBoost, Random Forest (RF), Logistic Regression (LR), Support Vector Machine (SVM) and Deep Learning (Multilayer Perceptron). XGBoost has the highest performance (90.75%).
b) Out-of-training-class test accuracy for Influenza, Dengue, Ebola, Hepatitis, and SARS. To perform this test for Influenza, for example, all Influenza virus-antibody sequences were removed from the training set, the resulting model was tested on all Influenza samples, and that accuracy is reported here.
c) BLOSUM-validated mutations, non-neutralizing and neutralizing antibody sequences. To achieve more confidence, we set the prediction probability threshold to 0.9895 in XGBoost and found 18 neutralizing antibody sequences (the green points).
d) Interpretability of the ML model: to understand which mutations play the key roles in neutralization, XGBoost feature importance was used to rank atomic-level features. By connecting the atomic features with each of the 20 amino acids, M was found to be the most important amino acid in neutralization, followed by F, Y, and W.
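
The out-of-training-class test in panel (b) and the probability cutoff in panel (c) can be sketched in plain Python. This is an illustrative sketch only, not the authors' code: the record layout, the helper names `leave_one_virus_out` and `select_confident`, the inclusive comparison, and the toy data are all assumptions. Only the 0.9895 threshold is taken from the caption.

```python
from typing import Dict, Iterator, List, Tuple

# Prediction-probability threshold quoted in the Figure 3c caption.
THRESHOLD = 0.9895

# Hypothetical record layout: {"virus": <class name>, "sequence": <string>}.
Sample = Dict[str, str]


def leave_one_virus_out(
    samples: List[Sample],
) -> Iterator[Tuple[str, List[Sample], List[Sample]]]:
    """Yield (held_out_virus, train, test), holding out one virus class at a time.

    All samples of the held-out virus go to the test set; none remain in training.
    """
    for virus in sorted({s["virus"] for s in samples}):
        train = [s for s in samples if s["virus"] != virus]
        test = [s for s in samples if s["virus"] == virus]
        yield virus, train, test


def select_confident(
    probabilities: Dict[str, float], threshold: float = THRESHOLD
) -> List[str]:
    """Return candidate IDs with probability >= threshold, highest first.

    Whether the original cutoff was inclusive is not stated; >= is an assumption.
    """
    kept = [(p, name) for name, p in probabilities.items() if p >= threshold]
    return [name for p, name in sorted(kept, key=lambda t: (-t[0], t[1]))]


if __name__ == "__main__":
    # Toy data, invented purely for illustration.
    toy_samples = [
        {"virus": "influenza", "sequence": "AAA"},
        {"virus": "dengue", "sequence": "CCC"},
        {"virus": "sars", "sequence": "GGG"},
    ]
    for virus, train, test in leave_one_virus_out(toy_samples):
        print(virus, len(train), len(test))

    toy_probs = {"cand_a": 0.995, "cand_b": 0.98}
    print(select_confident(toy_probs))
```

In a real pipeline the probabilities would come from a trained classifier evaluated on each held-out class; here they are supplied directly so the splitting and filtering logic can be read on its own.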

    Search related documents:
    Co-phrase search for related documents
    • amino acid and antibody library: 1, 2, 3, 4, 5, 6
    • amino acid and antibody sequence: 1, 2, 3, 4, 5, 6, 7, 8, 9
    • amino acid and atomic level: 1, 2, 3
    • amino acid and available strain: 1, 2, 3, 4, 5
    • amino acid and available structure: 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13
    • amino acid and clinical patient: 1, 2, 3, 4, 5, 6, 7
    • antibody sequence and clinical patient: 1
    • atomic level and clinical patient: 1
    • available structure and clinical patient: 1