Author: Lamberth, K.; Nielsen, M.; Lundegaard, C.; Worning, P.; Laurmøller, S. L.; Lund, O.; Brunak, S.; Buus, S.
Title: ‘Query-by Committee’ – An Efficient Method to Select Information-Rich Data for the Development of Peptide–HLA-Binding Predictors Cord-id: 45x44dth Document date: 2008_6_28
ID: 45x44dth
Document: Rationale: We have previously demonstrated that bioinformatics tools such as artificial neural networks (ANNs) are capable of performing pathogen-, genome- and HLA-wide predictions of peptide–HLA interactions. These tools may therefore enable a fast and rational approach to epitope identification and thereby assist in the development of vaccines and immunotherapy. A crucial step in the generation of such bioinformatics tools is the selection of data representing the event in question (in casu peptide–HLA interaction). This is particularly important when it is difficult and expensive to obtain data. Herein, we demonstrate the importance of selecting information-rich data and we develop a computational method, query-by-committee (QBC), which can perform a global identification of such information-rich data in an unbiased and automated manner. Furthermore, we demonstrate how this method can be applied to an efficient iterative development strategy for these bioinformatics tools. Methods: A large panel of binding affinities of peptides binding to HLA A*0204 was measured by a radioimmunoassay (RIA). These data were used to develop multiple first generation ANNs, which formed a virtual committee. This committee was used to screen (or was ‘queried’) for peptides where the ANNs agreed (‘low-QBC’) or disagreed (‘high-QBC’) on their HLA-binding potential. Seventeen low-QBC peptides and 17 high-QBC peptides were synthesized and tested. The high- or low-QBC data were added to the original data, and new high- or low-QBC second generation ANNs were developed, respectively. This procedure was repeated 40 times. Results: The high-QBC-enriched ANN performed significantly better than the low-QBC-enriched ANN in 37 of the 40 tests. Conclusion: These results demonstrate that high-QBC-enriched networks perform better than low-QBC-enriched networks in selecting informative data for developing peptide–MHC-binding predictors. 
This improvement in selecting data is not due to differences in network training performance but due to the difference in information content in the high-QBC experiment and in the low-QBC experiment. Finally, it should be noted that this strategy could be used in many contexts where generation of data is difficult and costly.
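The committee query described in the Methods can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how agreement between ANNs was quantified, so the sketch assumes disagreement is the population standard deviation of the committee's predicted binding scores. The toy predictors, the example peptides, and the names `qbc_scores`, `select_by_disagreement`, and `make_toy_predictor` are all hypothetical stand-ins for trained ANNs and real candidate peptides.

```python
from statistics import pstdev
from typing import Callable, Dict, List, Sequence, Tuple

# A committee member maps a peptide sequence to a predicted binding score.
Predictor = Callable[[str], float]


def qbc_scores(committee: Sequence[Predictor],
               peptides: Sequence[str]) -> Dict[str, float]:
    """Return a disagreement score per peptide.

    Assumption: disagreement is measured as the population standard
    deviation of the committee members' predictions.
    """
    return {p: pstdev([member(p) for member in committee]) for p in peptides}


def select_by_disagreement(committee: Sequence[Predictor],
                           peptides: Sequence[str],
                           k: int) -> Tuple[List[str], List[str]]:
    """Return (low_qbc, high_qbc): the k peptides the committee agrees on
    most and the k peptides it disagrees on most.

    If 2 * k exceeds the number of peptides, the two lists overlap.
    """
    scores = qbc_scores(committee, peptides)
    ranked = sorted(peptides, key=lambda p: scores[p])  # ascending disagreement
    low_qbc = ranked[:k]
    high_qbc = ranked[-k:][::-1]  # most disagreed-upon first
    return low_qbc, high_qbc


def make_toy_predictor(position: int, residue: str, weight: float) -> Predictor:
    """Hypothetical stand-in for a trained ANN: scores `weight` when the
    peptide carries `residue` at `position`, otherwise 0.0."""
    def predict(peptide: str) -> float:
        return weight if peptide[position] == residue else 0.0
    return predict


# Toy committee and candidate 9-mers (illustrative only, not measured data).
committee = [
    make_toy_predictor(1, "L", 0.9),
    make_toy_predictor(1, "L", 0.8),
    make_toy_predictor(8, "V", 0.7),
]
candidates = ["ALAKAAAAV", "GLAAAAAAA", "AAAAAAAAA", "GAGAGAGAV"]

low, high = select_by_disagreement(committee, candidates, k=1)
```

In the iterative strategy the abstract describes, the selected peptides would then be synthesized and measured, added to the training data, and used to train the next generation of networks. Setting `k=17` on a real candidate pool would mirror the 17 low-QBC and 17 high-QBC peptides selected in the study.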