Author: Tristan Bitard-Feildel; Isabelle Callebaut
Title: HCAtk and pyHCA: A Toolkit and Python API for the Hydrophobic Cluster Analysis of Protein Sequences
Document date: 2018_1_18
ID: maf96bof_3
Snippet: Seg-HCA [10] was developed to automatically delineate potential "foldable" domains within protein sequences and is the core part of our package. Recently, Piovesan et al. [21] implemented an in-house version of Seg-HCA in FELLS, which allows different properties of a protein sequence to be visualized clearly. Our new version of Seg-HCA was rewritten for speed, and a score is now computed, describing the general composition in hydrophobic cl.....
Document: Seg-HCA [10] was developed to automatically delineate potential "foldable" domains within protein sequences and is the core part of our package. Recently, Piovesan et al. [21] implemented an in-house version of Seg-HCA in FELLS, which allows different properties of a protein sequence to be visualized clearly. Our new version of Seg-HCA was rewritten for speed, and a score is now computed that describes the overall hydrophobic cluster composition of the delineated foldable domains. This score is compared to an empirical distribution computed over 734 disordered protein sequences from DisProt v7 [20] to produce a p-value. Figure 1A shows the distributions of scores computed using non-redundant sequences of [...]

The second methodology included in the package is our TREMOLO-HCA software (Traveling through [...]

Figure 1: HCA score and HCA plot example. Panel A (left) shows the normalized HCA score distribution calculated for protein sequences from DisProt v7 (left, orange: disordered sequences) and from PDB (right, violet: globular domains). The HCA p-value, which assesses the globularity of a delineated foldable segment, is computed using the empirical distribution from DisProt sequences. Panel B (right) shows the HCA plots of three BRCT domains from the Pfam family PF00533. The aligned protein sequences were used as input; conserved amino acids are shown in red (highly conserved) and yellow, in the context of hydrophobic clusters (HC), so that secondary structure conservation can be evaluated relative to the HC shapes.

[...] originates from the use of a two-dimensional alpha-helical net, connecting hydrophobic amino acids separated by up to three non-hydrophobic amino acids (or a proline) [12].
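The connectivity rule just described can be illustrated with a minimal sketch. It is not the pyHCA implementation: the function name `hydrophobic_clusters` is hypothetical, the hydrophobic alphabet `VILFMYW` is assumed (it is the alphabet commonly used in HCA, but it is not stated in this excerpt), and proline is assumed to act as a cluster breaker. The two-dimensional helical net is reduced here to a one-dimensional distance rule, so the sketch only approximates how clusters are delineated.

```python
# Assumption: standard HCA hydrophobic alphabet (not given in this excerpt).
HYDROPHOBIC = set("VILFMYW")


def hydrophobic_clusters(seq, max_gap=3):
    """Return clusters as (start, end, pattern) tuples.

    start and end are 0-based inclusive positions of the first and last
    hydrophobic residue of the cluster; pattern is a binary string with
    '1' for hydrophobic and '0' for any other residue.
    Two hydrophobic residues are connected when at most `max_gap`
    non-hydrophobic residues separate them. A proline closes the
    current cluster (assumption of this sketch).
    """
    seq = seq.upper()
    clusters = []
    start = last = None

    def close(first, final):
        pattern = "".join(
            "1" if aa in HYDROPHOBIC else "0" for aa in seq[first:final + 1]
        )
        clusters.append((first, final, pattern))

    for i, aa in enumerate(seq):
        if aa == "P":
            # Proline interrupts connectivity.
            if start is not None:
                close(start, last)
                start = last = None
        elif aa in HYDROPHOBIC:
            # Too many non-hydrophobic residues since the last one: new cluster.
            if start is not None and i - last - 1 > max_gap:
                close(start, last)
                start = None
            if start is None:
                start = i
            last = i

    if start is not None:
        close(start, last)
    return clusters


if __name__ == "__main__":
    for cluster in hydrophobic_clusters("MKVLAAGIPWDDDDDFY"):
        print(cluster)
```

Each returned pattern is the binary code of one cluster, which is the kind of representation usually used when relating cluster shapes to secondary structure propensities.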
Hydrophobic clusters defined in this way (with this hydrophobic alphabet and the connectivity distance associated with the α-helix) have been shown to best match regular secondary structures (α-helices and β-strands) and to constitute hallmarks of folded domains [8, 30]. Sequence segments delineated by Seg-HCA, which [...] correspond to domains that have the ability to fold, either autonomously or following contact with partners [5, 10]; these segments are later referred to as HCA domains. The advantage of Seg-HCA for characterizing the dark proteome is that it predicts these foldable domains from a single amino acid sequence alone, without prior knowledge of homologous sequences. [...] can be considered for evaluating the main propensities of hydrophobic clusters towards RSS [8, 24].

The copyright holder for this preprint (which was not peer-reviewed) is the author/funder. bioRxiv preprint, doi: https://doi.org/10.1101/249995
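The p-value mentioned earlier, obtained by comparing the HCA score of a segment with an empirical distribution of scores from disordered sequences, could be computed along the following lines. This is a hedged sketch, not the pyHCA code: the function name `empirical_p_value` is hypothetical, the score itself is taken as given because its formula is not stated in this excerpt, higher scores are assumed to indicate more globular segments, and the add-one correction is a common convention chosen here, not something the source specifies.

```python
def empirical_p_value(score, null_scores):
    """Estimate a one-sided empirical p-value.

    `null_scores` stands for scores computed on a reference set, for
    example disordered sequences. The p-value is the proportion of
    reference scores at least as high as `score`. Adding one to the
    numerator and denominator avoids returning exactly zero
    (assumption of this sketch).
    """
    if not null_scores:
        raise ValueError("null_scores must not be empty")
    n_extreme = sum(1 for s in null_scores if s >= score)
    return (n_extreme + 1) / (len(null_scores) + 1)


if __name__ == "__main__":
    # Toy reference scores, invented for illustration only.
    reference = [0.1, 0.2, 0.3, 0.4]
    print(empirical_p_value(0.35, reference))
```

With a reference set of 734 sequences, as in the text, the smallest p-value this estimator can return is 1/735, so very small values cannot be resolved without a larger reference set.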