Author: Qaid, Talal S.; Mazaar, Hussein; Alqahtani, Mohammed S.; Raweh, Abeer A.; Alakwaa, Wafaa
Title: Deep sequence modelling for predicting COVID-19 mRNA vaccine degradation Cord-id: f4gn408e Document date: 2021_6_22
ID: f4gn408e
Document: The worldwide coronavirus (COVID-19) pandemic made dramatic and rapid progress in the year 2020 and requires urgent global effort to accelerate the development of a vaccine to stop the daily infections and deaths. Several types of vaccine have been designed to teach the immune system how to fight off certain kinds of pathogens. mRNA vaccines are the most important candidate vaccines because of their capacity for rapid development, high potency, safe administration and potential for low-cost manufacture. An mRNA vaccine acts by training the body to recognize and respond to the proteins produced by disease-causing organisms such as viruses or bacteria. This type of vaccine is the fastest candidate to treat COVID-19, but it currently faces several limitations. In particular, it is a challenge to design stable mRNA molecules because of the inefficient in vivo delivery of mRNA, its tendency for spontaneous degradation and low protein expression levels. This work designed and implemented a deep sequence model based on bidirectional GRU and LSTM models, applied to the Stanford COVID-19 mRNA vaccine dataset, to predict the mRNA sequences responsible for degradation by predicting five reactivity values for every position in the sequence. Four of these values determine the likelihood of degradation with/without magnesium at high pH (pH 10) and high temperature (50 degrees Celsius), and the fifth reactivity value is used to determine the likely secondary structure of the RNA sample. The model relies on two types of features, numerical and categorical. The categorical features are extracted from the mRNA sequence, structure and predicted loop type. These features are encoded as integers, and their representations are then learned by an embedding layer. There are five numerical features depending on the likelihood for each pair of nucleotides in the RNA.
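The integer encoding of the categorical features can be illustrated with a minimal Python sketch. The abstract does not give the vocabularies or the exact codon scheme, so everything below is an assumption made for illustration: the alphabets (`BASES`, `STRUCTURE`, `LOOP_TYPES`), the helper names, and the choice of non-overlapping nucleotide triplets for codon encoding, with any incomplete trailing triplet dropped. The resulting integers are the kind of indices an embedding layer would take as input.

```python
from itertools import product

# Assumed alphabets, not taken from the abstract.
BASES = "ACGU"          # nucleotides
STRUCTURE = "()."       # dot-bracket secondary-structure symbols
LOOP_TYPES = "SMIBHEX"  # predicted loop-type labels


def make_vocab(tokens):
    """Map each token to a unique integer index for an embedding lookup."""
    return {tok: idx for idx, tok in enumerate(tokens)}


def base_encode(chars, vocab):
    """Base encoding: one integer per character of the input string."""
    return [vocab[ch] for ch in chars]


# All 64 nucleotide triplets, indexed in lexicographic order of BASES.
CODON_VOCAB = make_vocab("".join(c) for c in product(BASES, repeat=3))


def codon_encode(seq):
    """Codon encoding (assumed scheme): one integer per non-overlapping
    triplet; an incomplete trailing triplet is dropped."""
    usable = len(seq) - len(seq) % 3
    return [CODON_VOCAB[seq[i:i + 3]] for i in range(0, usable, 3)]


seq = "AUGGGA"
print(base_encode(seq, make_vocab(BASES)))           # [0, 3, 2, 2, 2, 0]
print(codon_encode(seq))                             # [14, 40]
print(base_encode("((..))", make_vocab(STRUCTURE)))  # [0, 0, 2, 2, 1, 1]
```

Under this scheme base encoding keeps one token per position, while codon encoding shortens the sequence by a factor of three and enlarges the vocabulary from 4 to 64 entries.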
The model gives promising results: it predicts the five reactivity values with a validation mean columnwise root mean square error (MCRMSE) of 0.125 using the LSTM model with augmentation and the codon encoding method. Codon encoding outperforms base encoding in validation MCRMSE with the LSTM model, whereas base encoding over-fits less, with a difference of 0.008 between its training and validation loss.
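For reference, MCRMSE takes the root mean square error of each target column separately and then averages those values over the columns. A minimal sketch in plain Python follows; the function name `mcrmse` and the list-of-rows layout (one row per sequence position, one column per reactivity value) are assumptions made for illustration, not the authors' code.

```python
import math


def mcrmse(y_true, y_pred):
    """Mean columnwise RMSE.

    y_true and y_pred are equally shaped lists of rows; each row holds
    one value per target column.
    """
    n_rows = len(y_true)
    n_cols = len(y_true[0])
    column_rmse = []
    for j in range(n_cols):
        squared_error = sum(
            (y_true[i][j] - y_pred[i][j]) ** 2 for i in range(n_rows)
        )
        column_rmse.append(math.sqrt(squared_error / n_rows))
    # Average the per-column RMSE values.
    return sum(column_rmse) / n_cols


# Two positions, two target columns: column RMSEs are 3.0 and 1.0.
truth = [[0.0, 0.0], [0.0, 0.0]]
preds = [[3.0, 1.0], [3.0, 1.0]]
print(mcrmse(truth, preds))  # 2.0
```

Because each column is scored separately before averaging, a target with a large error range cannot hide poor predictions on the others inside a single pooled RMSE.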