Author: Kshirsagar, Meghana; Carbonell, Jaime; Klein-Seetharaman, Judith
Title: Multitask learning for host–pathogen protein interactions Document date: 2013_7_1
ID: sdgt2ms5_69
Document: Given two paired sets of k measured values, the paired t-test determines whether they differ from each other in a significant way. We compare MTPL with Indep., the best baseline from the 10-fold CV results. Because the 10-fold CV results from the previous section give insufficient samples (i.e. only 10 samples), we instead use 50 bootstrap sampling experiments and use the results to compute the P-values. Each bootstrap sampling experiment consists of the following procedure: we first make two random splits of 80 and 20% of the data, such that the class ratio of 1:100 is maintained in both. The training set is then constructed using a bootstrap sample from the 80% split and the test data from the 20% split. A total of 50 models are thus trained and evaluated. We do not tune parameters again for each model and instead use the optimal setting of parameter values from our 10-fold CV experiments. The F1 is computed for each experiment, thereby giving us 50 values, which will be our samples for the hypothesis test. Because t-tests assume a normal distribution of the samples, we first did a normality test on each set of 50 F1 values. We performed the Shapiro-Wilk test with a significance level of α = 0.00001 and found that our samples satisfy normality.
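The procedure described above can be sketched roughly as follows. This is a minimal illustration and not the authors' code: the function names (`stratified_split`, `bootstrap_sample`, `paired_t_statistic`) are hypothetical, and the sketch assumes that the bootstrap sample is drawn uniformly with replacement from the 80% split, a detail the text does not specify. It uses only the Python standard library, so it computes the paired t statistic and its degrees of freedom but not the P-value.

```python
import math
import random
from statistics import mean, stdev


def stratified_split(labels, train_frac=0.8, rng=random):
    """Split example indices into two parts, keeping the class ratio in both."""
    by_class = {}
    for idx, label in enumerate(labels):
        by_class.setdefault(label, []).append(idx)
    train, test = [], []
    for indices in by_class.values():
        shuffled = indices[:]
        rng.shuffle(shuffled)
        cut = round(train_frac * len(shuffled))
        train.extend(shuffled[:cut])
        test.extend(shuffled[cut:])
    return train, test


def bootstrap_sample(indices, rng=random):
    """Draw len(indices) items with replacement (assumed uniform resampling)."""
    return [rng.choice(indices) for _ in indices]


def paired_t_statistic(a, b):
    """Return (t, degrees of freedom) for two paired sets of k values."""
    if len(a) != len(b):
        raise ValueError("paired samples must have the same length")
    diffs = [x - y for x, y in zip(a, b)]
    k = len(diffs)
    # t = mean difference divided by its standard error
    t = mean(diffs) / (stdev(diffs) / math.sqrt(k))
    return t, k - 1


if __name__ == "__main__":
    rng = random.Random(0)
    # Toy labels with a 1:100 positive-to-negative ratio
    labels = [1] * 10 + [0] * 1000
    train_pool, test_idx = stratified_split(labels, 0.8, rng)
    train_idx = bootstrap_sample(train_pool, rng)
    print(len(train_idx), len(test_idx))
```

In a full experiment, the split and resampling would be repeated 50 times, each method would be trained on `train_idx` and scored on `test_idx`, and the two resulting lists of 50 F1 values would be passed to `paired_t_statistic`. If SciPy is available, `scipy.stats.shapiro` and `scipy.stats.ttest_rel` provide the normality test and the paired t-test with P-values directly.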