Author: Li, Chun; Zhao, Jialing; Wang, Changzhong; Yao, Yuhua
Title: Protein Sequence Comparison and DNA-binding Protein Identification with Generalized PseAAC and Graphical Representation
Document date: 2018_2_23
ID: u1imic5l_54
Document: For ease of comparison, the results of several existing methods, including DNAbinder [1], DNA-Prot [2], iDNA-Prot [3] and enDNA-Prot [4], are also listed in Table 6. DNAbinder, developed by Kumar et al. [1], extracts evolutionary information from a protein sequence in the form of a position-specific scoring matrix (PSSM). PSSM-21 and PSSM-400 are two feature vectors derived from the PSSM, with dimensions 21 and 400, respectively. In [1], the PSSM-400-based SVM model was the main one used for predicting DNA-binding proteins (DNA-BPs). DNA-Prot [2] is a Random Forest-based method whose feature vector combines sequence and structure information, such as the composition of the 20 standard amino acids, the composition of 10 amino acid groups, and secondary structure information predicted from the protein sequence. iDNA-Prot [3] constructs its feature vector via the grey model and also uses Random Forest as its operation engine. enDNA-Prot [4] is a predictor that encodes a protein sequence into a 188-dimensional feature vector and adopts an ensemble classifier built from four types of machine learning classifiers. All of these methods were tested on the same datasets to allow an unbiased comparison with our method. Table 6 shows that the current approach outperforms the other methods by 3.29-10.44% in terms of ACC, 0.056-0.206 in terms of MCC, and 1. .76% in terms of F1M. This result indicates that our method achieves highly competitive performance.
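The three scores compared above can be computed from a binary confusion matrix. The following is a minimal sketch, assuming that ACC, MCC and F1M denote the standard accuracy, Matthews correlation coefficient and F1 measure; the function name `binary_metrics` and the counts in the usage lines are hypothetical illustrations, not values from Table 6.

```python
import math


def binary_metrics(tp, fp, tn, fn):
    """Return ACC, MCC and F1 from confusion-matrix counts.

    tp/fn: DNA-binding proteins predicted as binding / non-binding.
    tn/fp: non-binding proteins predicted as non-binding / binding.
    Assumes the standard textbook definitions of each score.
    """
    total = tp + fp + tn + fn
    acc = (tp + tn) / total

    # MCC is set to 0.0 here when any marginal sum is zero (a common convention).
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0

    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)

    return {"ACC": acc, "MCC": mcc, "F1": f1}


# Hypothetical counts, for illustration only.
scores = binary_metrics(tp=40, fp=10, tn=45, fn=5)
print({name: round(value, 3) for name, value in scores.items()})
```

Because MCC uses all four cells of the confusion matrix, it is less sensitive to class imbalance than ACC, which may be why the comparison reports both.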