Next: Experimental Results Up: EXPERIMENTS WITH BRUTEDL Previous: EXPERIMENTS WITH BRUTEDL

Experimental Design

Segal et al. [10] ran BruteDL 10 times on each of several data sets. In each run, they randomized the original data and used 70% of it for training and the remaining 30% for testing. They reported results averaged over the 10 trials.

For this thesis, we ran 25 trials with BruteDL on each data set to make the experimental results statistically informative. Unlike Segal et al., we did not simply use 70% of the data for training and 30% for testing. Because the default rule is used heavily when the learning system is not well trained, it is important to examine the performance of BruteDL when the training data is sparse. We did this by running BruteDL on training sets of seven different sizes in each of the 25 trials.

In each trial, we first randomly split the original data set into two subsets. One subset contains 30% of the data and is used for testing; the other contains the remaining 70% and is further processed to generate the training data. We call this 70% subset the pre-sampled training set. The pre-sampled training set is divided into seven equal-sized subsets. We run BruteDL on 1/7, then on 2/7, 3/7, 4/7, 5/7, 6/7, and 7/7 of the pre-sampled training data. This is equivalent to training BruteDL on 10%, then 20%, 30%, ..., 70% of the original data, while always testing on the same 30% of the original data. Thus, we have 25 trials with 7 training runs in each trial, for 175 runs per data set. In each run, we recorded the prediction accuracy of the system, the frequency with which the default rule was used, and the prediction accuracy of the default rule when it was used.

For the Monks problems, two specific subsets of the data have traditionally been used for training and testing. We combined the two sets and ran the experiments in the manner described above. The results of the experiments are given in the following section.
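The sampling scheme described above can be sketched in Python as follows. This is only an illustrative sketch, not the code used for the thesis. The function names nested_training_sets and run_trials are hypothetical. It also assumes that the seven training sets are nested (the 2/7 set contains the 1/7 set, and so on), which the description suggests but does not state outright. The learner itself is not called; each yielded triple marks where one BruteDL run and its three measurements would take place.

```python
import random


def nested_training_sets(data, n_subsets=7, test_fraction=0.3, rng=None):
    """Split data into a test set and n_subsets cumulative training sets.

    Assumption: the training sets are nested prefixes of the shuffled
    pre-sampled training set (1/7, 2/7, ..., 7/7 of it).
    """
    if rng is None:
        rng = random.Random()
    shuffled = list(data)
    rng.shuffle(shuffled)

    # 30% of the original data is held out for testing.
    n_test = round(len(shuffled) * test_fraction)
    test_set = shuffled[:n_test]

    # The remaining 70% is the pre-sampled training set.
    pre_sampled = shuffled[n_test:]

    training_sets = []
    for k in range(1, n_subsets + 1):
        cutoff = (k * len(pre_sampled)) // n_subsets
        training_sets.append(pre_sampled[:cutoff])
    return test_set, training_sets


def run_trials(data, n_trials=25, n_subsets=7, seed=0):
    """Yield (trial, training_set, test_set) for every run of the design."""
    rng = random.Random(seed)
    for trial in range(n_trials):
        test_set, training_sets = nested_training_sets(
            data, n_subsets=n_subsets, rng=rng
        )
        for training_set in training_sets:
            # A learner would be trained on training_set and evaluated
            # on test_set here; that step is outside this sketch.
            yield trial, training_set, test_set
```

With a data set of 100 examples, each trial yields a 30-example test set and training sets of 10, 20, ..., 70 examples, all evaluated against the same test set within that trial.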






Jing Lin
Mon Apr 1 19:35:53 CST 1996