ITERATE: A Conceptual Clustering Algorithm for Data Mining

Gautam Biswas, Jerry B. Weinberg, and Doug H. Fisher

Department of Computer Science
Vanderbilt University,
Nashville, TN 37235

IEEE Transactions on Systems, Man, and Cybernetics, vol. 28C, no. 2, pp. 219-230, May 1998.


The data exploration task can be divided into three interrelated subtasks: (i) feature selection, (ii) discovery, and (iii) interpretation. This paper describes an unsupervised discovery method with biases geared toward partitioning objects into clusters that improve interpretability. The algorithm, ITERATE, employs (i) a data ordering scheme and (ii) an iterative redistribution operator to produce maximally cohesive and distinct clusters. Cohesion, or intra-class similarity, is measured in terms of the match between individual objects and their assigned cluster prototype. Distinctness, or inter-class dissimilarity, is measured by an average of the variance of the distribution match between clusters. We demonstrate that interpretability, from a problem-solving viewpoint, is addressed by the intra- and inter-class measures. Empirical results demonstrate the properties of the discovery algorithm and its applications to problem solving.
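
The idea of an iterative redistribution operator can be illustrated with a minimal sketch. It is not the ITERATE implementation: the abstract does not define the match measure, so the score below (the mean probability of an object's attribute values under a cluster prototype) is an assumed stand-in, and the function names prototype, match, and redistribute are hypothetical. The data ordering scheme is omitted; the sketch starts from a given initial partition of objects with nominal attributes.

```python
from collections import Counter


def prototype(members, n_attrs):
    """Per-attribute value frequencies for the objects in one cluster."""
    size = len(members)
    return [
        {v: c / size for v, c in Counter(obj[i] for obj in members).items()}
        for i in range(n_attrs)
    ]


def match(obj, proto):
    """Assumed match score: mean probability of the object's values
    under the cluster prototype (a stand-in, not the paper's measure)."""
    return sum(p.get(v, 0.0) for v, p in zip(obj, proto)) / len(obj)


def redistribute(objects, labels, max_iters=100):
    """Reassign every object to its best-matching cluster and repeat
    until no label changes or max_iters is reached."""
    labels = list(labels)
    n_attrs = len(objects[0])
    for _ in range(max_iters):
        # Group objects by their current cluster label.
        clusters = {}
        for obj, lab in zip(objects, labels):
            clusters.setdefault(lab, []).append(obj)
        # Recompute one prototype per non-empty cluster.
        protos = {lab: prototype(m, n_attrs) for lab, m in clusters.items()}
        # Ties are broken in favor of the smallest label.
        new_labels = [
            max(sorted(protos), key=lambda lab: match(obj, protos[lab]))
            for obj in objects
        ]
        if new_labels == labels:
            break
        labels = new_labels
    return labels
```

For example, given the objects ("red", "round"), ("red", "round"), ("blue", "square"), ("blue", "square"), ("red", "round") and the initial labels [0, 0, 1, 1, 1], the sketch moves the misplaced last object into cluster 0 and returns [0, 0, 1, 1, 0], after which no further reassignment occurs.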

Keywords: knowledge discovery, data mining, conceptual clustering, concept formation, criterion function, order bias, iterative redistribution.
