All the best rules appear before the default rule in the decision list are learned during BruteDL's search and are more informative than the default rule. Also, they are better than those not-best rules stored in BruteDL+. If we can appropriately utilize the existing best rules in the decision list when an observation is not completely matched by any of them, the performance of the system may be improved.
The process of classifying an observation according to a rule learned by the learning system can be viewed as a process to find the similarity between the observation to be classified and the training examples given to the learning system, and then assign a class to the unlabeled observation that is most frequently seen among the training examples in the group most similar to the unlabeled observation. By looking at the rules in the decision list, we can find a similar group of training examples for an unlabeled observation if a rule completely matches that observation. In this case, the observation is classified as a member of the class that appears at the right hand side of the rule. If none of the rules completely matches the observation, there may be some rules that partially match it. The groups of the training examples covered by these rules can be considered similar, to varying degrees, to the observation being classified. Among these groups we should be able to find the one that is the most similar to the observation and assign to it the most frequently seen class value among the examples in the group. This assigned value may be a more reasonable guess than the one provided by the default rule. We call the rule in the decision list which covers the most similar group of training examples the closest-matching rule of the observation to be labeled.
Using this method, a major problem is how to find the closest matching rule.
The BruteDL rules are in the form of
The similarity between an observation and a BruteDL rule can be represented
by the number of the rule's conjuncts that match the observation. For
example, given a set of attributes, a set of classes, an unlabeled
observation X
and the BruteDL rules as following:
Attributes:Both of the rules partially match the observation X. One of rule 1's two conjuncts matches X while two of rule 2's three conjuncts match X. We can conclude that rule 2 represents a group of training examples that are more similar to X than the group represented by rule 1. Therefore, X can be classified into class![]()
![]()
![]()
Classes:
![]()
X: (
)
Rules: 1.
![]()
2.
![]()
according to rule 1.
However, reconsidering the example, both rule 1 and rule 2 have only one
conjunct that is not matched by X. In some sense, the two rules
represent the groups of training examples that bear the same similarity to
observation X. Since rule 1 has higher accuracy, we can consider rule 1
the closest matching of X and thereby classify X as a member of class
.
As indicated by the example, we can define the closest-matching rule in multiple ways. For example, any one of the following definitions can be used.
These definitions are based on the absolute number of conjuncts. There are many other measures of similarity that we might explore, but we limit ourselves to the two described above.
While using the best matching rule when none of the existing BruteDL rules
matches an unclassified observation, the frequency of the default rule being
used should decrease sharply. However, this frequency would not drop to zero.
It is still possible that no conjuncts of any of the BruteDL rules matches
an observation. For instance, in the previous example, if the system
encounters an observation
The question marks indicate that observation Y has unknown values
for attributes
and
. In this case, neither rule 1 nor rule 2
matches Y and we do not measure the similarity between Y and the rules.
In this case, each of our modifications to BruteDL classifies observation
Y according to the default rule.
Jing Lin