
Discussion

Compared to the original BruteDL, BruteDL+ provides marginally better performance in general. A notable exception is the small soybean domain, where the prediction accuracy of the system decreases significantly on the 50% through 70% sized training sets. The drop is especially large on the 70% set. Looking closely at the decision list generated by BruteDL+ on this data set, we found that the default rule appearing in the output differs from the one learned by the original BruteDL.

To explain why this occurs, recall how the default class is defined. If some training examples are not covered by any learned rule, the default class is the class seen most frequently among those uncovered examples. Otherwise, it is the class seen most frequently among all the training examples. In the small soybean case, BruteDL learns only a few rules. Among the training examples not covered by these rules, the most frequent class may differ from the class that is most frequent among all the training examples. When we run BruteDL+ on this data, some of the formerly uncovered examples may be covered by the non-best rules and, as a result, the default class changes. If all the formerly uncovered examples are covered by non-best rules, the default class becomes the one that is most frequent among all the training examples. Since the default rule found by the original BruteDL performs better than the default rule found by BruteDL+ on the 70% sized soybean data set, the prediction accuracy of the system as a whole degrades, as shown in Table 23. The obvious fix is to always use the default rule generated by the original BruteDL, but a deeper analysis of how and when to use the additional rules of BruteDL+ is desirable.
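
The effect of the extra rules on the default class can be illustrated with a minimal Python sketch. It is not the BruteDL or BruteDL+ implementation: the function name default_class, the representation of rules as predicates, and the toy data are all assumptions made for illustration, and the data are not the soybean set.

```python
from collections import Counter


def default_class(examples, rules):
    """Pick a default class as described in the text (hypothetical helper).

    examples: list of (features, label) pairs
    rules:    list of predicates; a rule covers an example if it returns True
    """
    # Labels of training examples that no rule covers.
    uncovered = [label for features, label in examples
                 if not any(rule(features) for rule in rules)]
    # Fall back to all training labels when every example is covered.
    pool = uncovered if uncovered else [label for _, label in examples]
    return Counter(pool).most_common(1)[0][0]


# Toy training set, invented for illustration only.
examples = [
    ({"x": 1}, "A"), ({"x": 1}, "A"), ({"x": 1}, "A"),
    ({"x": 2}, "B"), ({"x": 2}, "B"),
    ({"x": 3}, "C"),
]

# A small rule set that covers only the class A examples.
best_rules = [lambda f: f["x"] == 1]
# Additional (non-best) rules that cover the remaining examples.
extra_rules = [lambda f: f["x"] == 2, lambda f: f["x"] == 3]

# Few rules: the uncovered examples are B, B, C, so the default is B.
default_few = default_class(examples, best_rules)
# Extra rules cover everything, so the default falls back to the
# overall majority class, which is A.
default_all = default_class(examples, best_rules + extra_rules)
```

In this toy case, adding rules changes the default class from B to A even though the training data are unchanged, which is the mechanism described above for the small soybean domain.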

The experimental results also show that performance is unchanged on the lenses, mushroom, and all the monks problem domains. For the mushroom and monks data sets, this is because the training sets are relatively large and the default rule performs well. For the lenses domain, which is the smallest database, neither BruteDL nor BruteDL+ learns any homogeneous rule other than the default rule. All test observations are therefore classified by the default rule, so no improvement in performance can be obtained.




Jing Lin
Mon Apr 1 19:35:53 CST 1996