Selected article for: "sequence composition and solubility score"

Author: Bikash K. Bhandari; Paul P. Gardner; Chun Shen Lim
Title: Solubility-Weighted Index: fast and accurate prediction of protein solubility
  • Document date: 2020_2_16
  • ID: 2rpr7aph_10
    Document: The final weights were derived from the arithmetic means of the weights for individual amino acid residues obtained from cross-validation (Supplementary Table S4). We observed a change of over 20% in the weights for cysteine (C) and histidine (H) residues (Fig 2C and Supplementary Table S4). These results are in agreement with the contributions of cysteine and histidine residues shown in Supplementary Fig S2B. We call the solubility score of a protein sequence calculated using the final weights the Solubility-Weighted Index (SWI). (Smith et al. 2003). The solubility score of a protein sequence was calculated using a sequence composition scoring approach (Equation 1, using optimised weights, W, instead of normalised B-factors). These scores were used to compute the AUC scores for the training and test datasets. (B) Training and test performance of solubility prediction using optimised weights for 20 amino acid residues in a 10-fold cross-validation (mean AUC ± standard deviation). Related data and figures are available as Supplementary Table S3 and Supplementary Fig S4 and S5. (C) Comparison between the 20 initial and final weights for amino acid residues. The final weights are derived from the arithmetic mean of the optimised weights from cross-validation. These weights are used to calculate SWI, the solubility score of a protein sequence, in the subsequent analyses. Filled circles, which represent amino acid residues, are colored by hydrophobicity (Kyte and Doolittle 1982). Solid black circles denote the aromatic amino acid residues phenylalanine (F), tyrosine (Y), and tryptophan (W). The dotted diagonal line represents no change in weight. See also Supplementary Table S4 and Fig S4. AUC, Area Under the ROC Curve; ROC, Receiver Operating Characteristic; W, arithmetic mean of the weights of an amino acid residue optimised from 1,000 bootstrap samples in a cross-validation step.
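    The two averaging steps described above can be illustrated with a minimal sketch. Equation 1 is not reproduced in this excerpt, so the sketch assumes that the sequence composition score is the arithmetic mean of the per-residue weights along the sequence. The function names and the weight values are hypothetical placeholders chosen for illustration; they are not the published SWI weights, and only three residues are included to keep the example short.

```python
from statistics import mean

# Hypothetical per-fold weights for three residues only (placeholder values,
# NOT the published weights from Supplementary Table S4).
EXAMPLE_FOLD_WEIGHTS = [
    {"A": 1.00, "C": 0.25, "H": 0.50},
    {"A": 0.50, "C": 0.75, "H": 0.00},
]


def final_weights(fold_weights):
    """Average each residue's weight across cross-validation folds."""
    residues = fold_weights[0].keys()
    return {aa: mean(fold[aa] for fold in fold_weights) for aa in residues}


def composition_score(sequence, weights):
    """Score a sequence as the mean weight of its residues.

    Assumption: this mean-of-weights form stands in for Equation 1,
    which is not shown in the excerpt.
    """
    if not sequence:
        raise ValueError("sequence must not be empty")
    # A residue missing from `weights` raises KeyError rather than being skipped.
    return mean(weights[aa] for aa in sequence.upper())


if __name__ == "__main__":
    weights = final_weights(EXAMPLE_FOLD_WEIGHTS)
    print(weights)
    print(composition_score("AACH", weights))
```

    Under this assumption the score depends only on residue composition, not on residue order, so any permutation of a sequence receives the same score.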
