Author: Alguwaizani, Saud; Park, Byungkyu; Zhou, Xiang; Huang, De-Shuang; Han, Kyungsook
Title: Predicting Interactions between Virus and Host Proteins Using Repeat Patterns and Composition of Amino Acids
Document date: 2018_5_9
ID: 0dxrai3j_9
Document: While feature F1 represents the repeat patterns and global composition of amino acids in the whole protein sequence, feature F3 represents the local composition of amino acids. For feature F3, we partition a protein sequence into 5 segments of equal length except the last one and compute the composition of amino acids in each of the 5 segments. Since the three features, F1, F2, and F3, are computed for each amino acid, every pair of virus and host proteins is represented in a feature vector with 280 elements (140 for a virus protein and 140 for a host protein). Data of virus-host PPIs were collected from IntAct [13] and VirusMentha [14]. PPIs of HCV with human, however, were obtained from the Hepatitis C Virus Protein Interaction Database (HCVpro) [15] because HCVpro has more human-HCV PPIs than IntAct. The sequences of the proteins involved in the virus-host PPIs were obtained from the UniProt database [16]. The training and test datasets constructed in our study can be summarized as follows.

Figure 2: Example of computing feature 2 (F2) of amino acid repeats. F2 is the maximum value of the sum of squared lengths of single amino acid repeats in a window of size six. The maximum repeat size of amino acid S is 3, which is observed in the windows starting at 4, 5, 6, 7, 13, 14, and 15. So, F2 (repeats of S) = 3^2 = 9. The maximum repeat size of amino acid W is 4, observed in the windows starting at 1 and 2. F2 (repeats of W) = 4^2 = 16. The maximum repeat size of amino acid R is 6, observed in the window starting at 10. F2 (repeats of R) = 6^2 = 36.
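
The F2 and F3 descriptions above can be sketched in Python as follows. This is an illustrative sketch under stated assumptions, not the authors' implementation. The function names, the residue ordering, and the example sequence are hypothetical. The sequence was constructed only so that it reproduces the numbers quoted in the Figure 2 caption; the actual sequence in the figure is not given in this excerpt. The sketch further assumes that every maximal run of a residue inside a window (including runs of length 1) contributes its squared length, that each of the first four F3 segments has length floor(n/5) with the last segment taking the remainder, and that "composition" means the fraction of a segment occupied by each amino acid.

```python
from itertools import groupby

# The 20 standard residues; this ordering is an assumption.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def window_scores(sequence, amino_acid, window=6):
    """Sum of squared run lengths of `amino_acid` in each sliding window."""
    last_start = max(len(sequence) - window, 0)
    scores = []
    for start in range(last_start + 1):
        segment = sequence[start:start + window]
        # groupby yields maximal runs of identical residues inside the window.
        scores.append(sum(len(list(run)) ** 2
                          for residue, run in groupby(segment)
                          if residue == amino_acid))
    return scores


def f2(sequence, amino_acid, window=6):
    """F2 as described in the caption: the maximum window score."""
    return max(window_scores(sequence, amino_acid, window))


def f3(sequence, parts=5):
    """Local composition: per-segment fraction of each amino acid (5 x 20 values)."""
    size = max(len(sequence) // parts, 1)
    features = []
    for i in range(parts):
        start = i * size
        # The last segment absorbs any remainder (assumed reading of the text).
        end = start + size if i < parts - 1 else len(sequence)
        segment = sequence[start:end]
        for aa in AMINO_ACIDS:
            features.append(segment.count(aa) / len(segment) if segment else 0.0)
    return features


# Hypothetical 20-residue sequence consistent with the Figure 2 caption.
example = "AWWWWASSSRRRRRRSSSAA"
print(f2(example, "S"), f2(example, "W"), f2(example, "R"))  # 9 16 36
print(len(f3(example)))  # 100
```

If F1 and F2 each contribute one value per amino acid and F3 contributes five, a protein yields 7 x 20 = 140 values, which matches the 140 elements per protein stated above. That breakdown is inferred from the text rather than stated explicitly in this excerpt.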