Author: Shen, Shiyi; Kai, Bo; Ruan, Jishou; Torin Huzil, J.; Carpenter, Eric; Tuszynski, Jack A.
Title: Probabilistic analysis of the frequencies of amino acid pairs within characterized protein sequences Cord-id: hjo78o1q Document date: 2006_10_15
ID: hjo78o1q
Document: Here, we describe a unique probabilistic evaluation of the 20 naturally occurring amino acids and their distributions within the Swiss-Prot and Complete Human Genebank databases. We have developed a computational technique that imposes both directionality and length constraints on searches for unique combinations of amino acids within protein sequences. Using statistical approaches, we have carried out searches of all possible two- and three-residue motifs contained within these databases. The technique exploits the fact that a small number of these motifs occur unusually often when compared to the expected probability of finding a specific residue grouping within a given database. Subsequent filtering of the search results to identify such unique combinations has provided several examples that can be used as markers to identify particular proteins within or across databases. We focus on the three motifs of greatest interest to us: CC, CM and their combination, CCM. Each of these motifs occurs either more or less frequently than would be predicted from standard amino acid distributions within the entire human proteome.
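The comparison between observed and expected motif frequencies described above can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes that the expected count of an ordered two-residue motif is the product of the two single-residue frequencies times the number of adjacent pairs (an independence baseline), and it does not model the length constraints mentioned in the abstract. The function name `pair_enrichment` and the toy sequences are hypothetical.

```python
from collections import Counter


def pair_enrichment(sequences):
    """Return {pair: (observed, expected, ratio)} for adjacent residue pairs.

    Assumption (not taken from the paper): the expected count of an ordered
    pair "ab" is p(a) * p(b) * total_pairs, i.e. residues are independent.
    """
    residue_counts = Counter()
    pair_counts = Counter()
    for seq in sequences:
        residue_counts.update(seq)
        # Ordered, adjacent pairs: "CM" and "MC" are counted separately,
        # which keeps the direction of the motif.
        pair_counts.update(seq[i:i + 2] for i in range(len(seq) - 1))

    total_residues = sum(residue_counts.values())
    total_pairs = sum(pair_counts.values())

    result = {}
    for pair, observed in pair_counts.items():
        first, second = pair
        p_first = residue_counts[first] / total_residues
        p_second = residue_counts[second] / total_residues
        expected = p_first * p_second * total_pairs
        result[pair] = (observed, expected, observed / expected)
    return result


# Toy sequences invented for illustration only (not real database entries).
toy = ["MCCMA", "ACCA"]
stats = pair_enrichment(toy)
observed, expected, ratio = stats["CC"]
print(f"CC: observed={observed}, expected={expected:.2f}, ratio={ratio:.2f}")
```

A ratio well above 1 marks a pair that occurs more often than the independence baseline predicts, and a ratio well below 1 marks one that occurs less often. On real databases a significance test would be needed before calling any motif unusual.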