Unsupervised Performance Task

From the start, Cobweb's performance task, and a performance task for unsupervised learning generally, was:

Given a partial object description,
Predict unknown variable values (i.e., pattern completion).

I also called this flexible prediction. The learning that stems from this task has more recently been called inference learning or multi-task learning, rather than unsupervised learning.
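The task stated above can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions, not Cobweb itself: it uses a flat list of clusters rather than a hierarchy, picks the best-matching cluster with a naive-Bayes-style score, and fills each unknown variable with that cluster's most common value. The function name `complete`, the smoothing constants, and the toy data are all hypothetical.

```python
from collections import Counter

def complete(partial, clusters, attributes):
    """Fill in the attributes missing from `partial` (pattern completion).

    clusters: list of clusters, each a list of fully described objects (dicts).
    """
    total = sum(len(c) for c in clusters)

    def score(members):
        # Cluster prior times the smoothed probability of each known value.
        # The (+1, +2) smoothing is an assumption made for this sketch.
        s = len(members) / total
        for attr, value in partial.items():
            matches = sum(1 for m in members if m[attr] == value)
            s *= (matches + 1) / (len(members) + 2)
        return s

    best = max(clusters, key=score)
    completed = dict(partial)
    for attr in attributes:
        if attr not in completed:
            # Any unknown attribute may be predicted, not just a class label.
            counts = Counter(m[attr] for m in best)
            completed[attr] = counts.most_common(1)[0][0]
    return completed

ATTRIBUTES = ["cover", "flies", "lays_eggs"]
CLUSTERS = [
    [{"cover": "feathers", "flies": "yes", "lays_eggs": "yes"},
     {"cover": "feathers", "flies": "yes", "lays_eggs": "yes"},
     {"cover": "feathers", "flies": "no", "lays_eggs": "yes"}],
    [{"cover": "fur", "flies": "no", "lays_eggs": "no"},
     {"cover": "fur", "flies": "no", "lays_eggs": "no"},
     {"cover": "fur", "flies": "yes", "lays_eggs": "no"}],
]

print(complete({"cover": "fur"}, CLUSTERS, ATTRIBUTES))
print(complete({"lays_eggs": "yes"}, CLUSTERS, ATTRIBUTES))
```

The same clustering answers both queries, even though they supply and request different variables.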

In unsupervised learning there is no special dependent dimension (e.g., class label) to be predicted; instead, any dimension is fair game for prediction. The paper below contrasts the abilities of a single classification hierarchy built via Cobweb with those of N decision trees, one for each dimension.

Fisher, D. (1987). "Conceptual Clustering, Learning from Examples, and Inference," Proceedings of the Fourth International Workshop on Machine Learning. Irvine, CA: Morgan Kaufmann.

Other papers on Cobweb also make the extension of the supervised performance task to unsupervised learning clear.

Fisher, D. (1987). "Knowledge Acquisition Via Incremental Conceptual Clustering," Machine Learning, 2, 139--172. Reprinted in J. Shavlik & T. Dietterich (eds.), Readings in Machine Learning, 267--283, Morgan Kaufmann, 1990.

Fisher, D. (1987). "Improving Inference Through Conceptual Clustering," Proceedings of the Sixth National Conference on Artificial Intelligence, Seattle, WA: Morgan Kaufmann, 461--465.

Also see

Fisher, D. (2001). Unsupervised Learning (Editorial), Machine Learning, 45, 1, 5--7. (Special issue editor, 30 submissions, 2 installments).

Fisher, D. (2002). "Conceptual Clustering," in W. Klosgen and J. Zytkow (eds.), Handbook of Data Mining and Knowledge Discovery, Oxford University Press, 388--396, Chapter 16.5.2.

Fisher, D. (1996). "Iterative Optimization and Simplification of Hierarchical Clusterings," Journal of Artificial Intelligence Research, 4, 147--179.


Unsupervised Objective Measures

It is not surprising, then, that objective measures for clustering can be viewed as composites of objective measures for supervised induction. This can include the relationship between Autoclass' objective measure and the Naive Bayesian Classifier, for example. The following paper makes explicit the connections between unsupervised objective criteria for hierarchical clustering and the split measures for supervised decision tree induction:

Fisher, D. (1996). "Iterative Optimization and Simplification of Hierarchical Clusterings," Journal of Artificial Intelligence Research, 4, 147--179.
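One way to see the composite view concretely is with category utility, the measure Cobweb uses, which equals the sum over all attributes of the Gini gain (a supervised split measure) that the partition achieves for each attribute, divided by the number of clusters. The Python sketch below checks this identity numerically on a toy partition. It is an illustration only and not necessarily the formulation used in the paper cited above; the function names and data are hypothetical.

```python
from collections import Counter

def sum_sq(objects, attr):
    """Sum over values v of P(attr = v)^2 within `objects`."""
    n = len(objects)
    counts = Counter(o[attr] for o in objects)
    return sum((c / n) ** 2 for c in counts.values())

def gini_gain(partition, attr):
    """Reduction in the Gini impurity of `attr` achieved by the partition."""
    everything = [o for cluster in partition for o in cluster]
    n = len(everything)
    within = sum(len(c) / n * sum_sq(c, attr) for c in partition)
    return within - sum_sq(everything, attr)

def category_utility(partition, attributes):
    """Category utility computed from its usual cluster-by-cluster form."""
    everything = [o for cluster in partition for o in cluster]
    n = len(everything)
    total = 0.0
    for cluster in partition:
        gain = sum(sum_sq(cluster, a) - sum_sq(everything, a) for a in attributes)
        total += len(cluster) / n * gain
    return total / len(partition)

def composite(partition, attributes):
    """The same quantity as a composite of per-attribute supervised measures."""
    return sum(gini_gain(partition, a) for a in attributes) / len(partition)

ATTRS = ["color", "size"]
PARTITION = [
    [{"color": "red", "size": "big"}, {"color": "red", "size": "small"}],
    [{"color": "blue", "size": "small"}, {"color": "blue", "size": "small"}],
]

cu = category_utility(PARTITION, ATTRS)
comp = composite(PARTITION, ATTRS)
print(cu, comp)
```

Each term `gini_gain(partition, a)` is what a decision tree learner would compute if attribute `a` were the class label and the partition were the candidate split.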

The following paper defines an unsupervised decision tree algorithm using a composite measure:

Fisher, D., & Hapanyengwi, G. (1993). "Database Management and Analysis Tools of Machine Learning," Journal of Intelligent Information Systems, 2, 5--38.


Pruning and Simplification

Pruning and simplification methods of supervised induction, notably decision tree induction (but also methods based on Michalski's work with AQ), can serve as the basis for simplification of hierarchical classification schemes built via unsupervised learning. In particular, the following papers show how to identify frontiers for each variable that optimize prediction accuracy for that variable within a single hierarchical classification scheme.

Fisher, D. (1996). "Iterative Optimization and Simplification of Hierarchical Clusterings," Journal of Artificial Intelligence Research, 4, 147--179.

Fisher, D. (1989). "Noise-Tolerant Conceptual Clustering," Proceedings of the International Joint Conference on Artificial Intelligence, Detroit, MI: Morgan Kaufmann, 825--830.

Fisher, D. (1995). "Optimization and Simplification of Hierarchical Clusterings," First International Conference on Knowledge Discovery in Databases, Montreal, Canada: AAAI Press, 118--123.
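The idea of a per-variable frontier can be sketched as follows. This Python example is a rough illustration in the spirit of reduced-error pruning, assuming held-out objects have already been sorted down the hierarchy; the procedures in the papers above may differ in their details. For one variable, it compares the number of correct held-out predictions made at a node with the best total achievable among its descendants and stops wherever descending further does not help. The `Node` class, the `frontier` function, and the data are hypothetical.

```python
from collections import Counter

class Node:
    def __init__(self, train, valid, children=()):
        self.train = train              # training objects under this node
        self.valid = valid              # held-out objects sorted to this node
        self.children = list(children)

def frontier(node, var):
    """Return (correct, nodes): the best number of correct held-out
    predictions of `var` in this subtree, and the nodes that achieve it."""
    guess = Counter(o[var] for o in node.train).most_common(1)[0][0]
    here = sum(1 for o in node.valid if o[var] == guess)
    if not node.children:
        return here, [node]
    below, nodes = 0, []
    for child in node.children:
        correct, chosen = frontier(child, var)
        below += correct
        nodes += chosen
    # On ties, prefer the shallower node, which gives a simpler frontier.
    if here >= below:
        return here, [node]
    return below, nodes

train1 = [{"a": "x", "b": "p"}] * 3
train2 = [{"a": "y", "b": "q"}] * 3 + [{"a": "y", "b": "p"}]
valid1 = [{"a": "x", "b": "p"}]
valid2 = [{"a": "y", "b": "p"}, {"a": "y", "b": "p"}]

leaf1 = Node(train1, valid1)
leaf2 = Node(train2, valid2)
root = Node(train1 + train2, valid1 + valid2, [leaf1, leaf2])

for var in ("a", "b"):
    correct, nodes = frontier(root, var)
    depth = "root" if nodes == [root] else "leaves"
    print(var, correct, depth)
```

In this toy hierarchy the two variables end up with different frontiers: one is predicted best at the leaves, the other at the root, even though both share a single classification scheme.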

The following paper looked at the variable benefits of pruning (or simplification, in the AQ vocabulary) within both a supervised and an unsupervised system.

Fisher, D., & Schlimmer, J. (1988). "Concept Simplification and Prediction Accuracy," Proceedings of the Fifth International Machine Learning Conference. Ann Arbor, MI: Morgan Kaufmann.