Author: Tsukiyama, Sho; Hasan, Md Mehedi; Fujii, Satoshi; Kurata, Hiroyuki
Title: LSTM-PHV: Prediction of human-virus protein-protein interactions by LSTM with word2vec
Cord-id: ac1lhor5
Document date: 2021_2_27
ID: ac1lhor5
Document: Viral infection involves a large number of protein-protein interactions (PPIs) between human and viral proteins. These PPIs range from the initial binding of viral coat proteins to host membrane receptors to the hijacking of the host transcription machinery. However, few interspecies PPIs have been identified, because experimental methods such as mass spectrometry are time-consuming and expensive, and molecular dynamics simulation is limited to proteins whose 3D structures have been solved. Sequence-based machine learning methods are expected to overcome these problems. We developed, for the first time, an LSTM model with word2vec, named LSTM-PHV, that predicts PPIs between human and virus from amino acid sequences alone. LSTM-PHV learned effectively from training data with a highly imbalanced ratio of positive to negative samples and achieved an AUC of 0.976 and an accuracy of 98.4% in 5-fold cross-validation. Using an independent test dataset, we compared LSTM-PHV with existing state-of-the-art PPI predictors, including DeepViral. In predicting PPIs between human proteins and those of unknown or novel viruses, LSTM-PHV outperformed the existing predictors when the models were trained on datasets that include proteins from multiple hosts. LSTM-PHV learned the sequence contexts of multiple host proteins more efficiently than DeepViral. Interestingly, learning only the sequence contexts, treated as words, yielded remarkably high performance. Visualization with uniform manifold approximation and projection (UMAP) showed that LSTM-PHV clearly separated the positive PPI samples from the negative ones. The LSTM-PHV web server is freely available at http://kurata35.bio.kyutech.ac.jp/.
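The abstract describes treating sequence contexts "as words" before embedding them with word2vec. A minimal sketch of one common way to do this is to split each amino acid sequence into overlapping k-mers. The function names, the choice of k = 3, and the stride of 1 are illustrative assumptions, not values stated in the abstract; the embedding and LSTM stages are not shown.

```python
# Hypothetical helpers: the names, k=3, and stride=1 are assumptions
# made for illustration; the abstract does not specify them.

def kmer_words(sequence, k=3, stride=1):
    """Split an amino acid sequence into overlapping k-mer 'words'."""
    if k <= 0 or stride <= 0:
        raise ValueError("k and stride must be positive")
    seq = sequence.strip().upper()
    # A sequence shorter than k yields an empty list.
    return [seq[i:i + k] for i in range(0, len(seq) - k + 1, stride)]


def build_vocabulary(sequences, k=3):
    """Assign an integer index to every distinct k-mer, in first-seen order."""
    vocab = {}
    for seq in sequences:
        for word in kmer_words(seq, k):
            vocab.setdefault(word, len(vocab))
    return vocab


def encode(sequence, vocab, k=3):
    """Map a sequence to the indices of its known k-mers (unknown ones are skipped)."""
    return [vocab[w] for w in kmer_words(sequence, k) if w in vocab]


if __name__ == "__main__":
    toy_sequences = ["MKTA", "KTAY"]  # toy strings, not real proteins
    vocab = build_vocabulary(toy_sequences)
    print(kmer_words("MKTAYIA"))
    print(encode("MKTAY", vocab))
```

The lists returned by `kmer_words` play the role of sentences: each could be passed to a word2vec implementation to obtain one vector per k-mer, and the resulting vector sequence would then be the input to a recurrent model such as an LSTM.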