Author: Zuo, Guanghong; Xu, Zhao; Yu, Hongjie; Hao, Bailin
Title: Jackknife and Bootstrap Tests of the Composition Vector Trees
Document date: 2011_3_5
ID: vm5zjr64_17
Document: It is interesting to note that Figure 2 provides another angle from which to look at the "best" K-values for different organism groups. It is appropriate to reproduce in more detail the order-of-magnitude estimate for the "best" K-values given in our recent paper (7). Suppose that all amino acids appear with the same frequency, i.e., 1/20. Then the probability of encountering a designated K-peptide is $20^{-K}$. Let $L$ be the total number of amino acids in the collection of proteins of an organism; the expected number of occurrences of such a K-peptide is then $L/20^{K}$. In order for a K-peptide to bear species-specificity, this number, which is what would be expected for a random sequence, should be much less than one, i.e., $L/20^{K} \ll 1$ or, after taking the base-10 logarithm and using $\log 20 = 1 + \log 2$, $\log L < K(1+\log 2)$. On the other hand, the subtraction procedure in the CV method requires that the (K-2)-peptides should not be too few: $L/20^{K-2} > 1$, i.e., $\log L > (K-2)(1+\log 2)$. Putting these inequalities together, we get $\log L/(1+\log 2) < K < 2 + \log L/(1+\log 2)$.
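The final inequality can be evaluated numerically for any proteome size. The following is a minimal sketch of that arithmetic, not code from the paper: the function names `best_k_bounds` and `admissible_k` and the sample value L = 10^6 are hypothetical, chosen only for illustration, and the uniform 1/20 amino-acid frequency is the same simplifying assumption made in the text.

```python
import math

# log10(20) = 1 + log10(2), the factor that appears in the inequalities.
LOG10_20 = math.log10(20)


def best_k_bounds(total_residues):
    """Return (lower, upper) of the open interval
    log L / (1 + log 2) < K < 2 + log L / (1 + log 2).

    total_residues is L, the total number of amino acids in the proteome.
    (Hypothetical helper, named here for illustration only.)
    """
    if total_residues <= 0:
        raise ValueError("total_residues must be positive")
    lower = math.log10(total_residues) / LOG10_20
    return lower, lower + 2.0


def admissible_k(total_residues):
    """List the integer K-values lying strictly inside the interval."""
    lower, upper = best_k_bounds(total_residues)
    return [k for k in range(1, math.ceil(upper) + 1) if lower < k < upper]


if __name__ == "__main__":
    # Illustrative proteome size, not a figure taken from the paper.
    L = 10**6
    lower, upper = best_k_bounds(L)
    print(f"{lower:.2f} < K < {upper:.2f}")  # 4.61 < K < 6.61
    print(admissible_k(L))                   # [5, 6]
```

Because the interval always has width 2, it contains two integer K-values, except when the lower bound is itself an integer, in which case only one lies strictly inside. Since this is an order-of-magnitude estimate, the bounds are best read as a rough guide rather than sharp cut-offs.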