Before concluding, we want to explicitly address the methodological biases that have guided the research proposed here. This is an important aspect of our research plan since it conveys the extent to which our results can be generalized and applied.
The surest way to characterize system behavior is through formal analysis (e.g., proofs of correctness), and these exist for a number of search-intensive concept learning systems (Hayes-Roth & McDermott, 1978; Vere, 1980); if a concept space is exhaustively searched, then it is not surprising that concepts of appropriate forms will be found. However, incremental processing is complicated by a cost/performance tradeoff. Recently, analytic techniques for characterizing probably approximately correct learning have been developed (Haussler, 1987; Valiant, 1985) that may be more amenable in characterizing incremental methods. Generally though, analytic methods result in statements of worst case and global performance that cover a wide range (often all) of the possible inputs. In many cases they hide fluctuations in behavior over subranges of input. For these reasons, our studies of incremental learning have been, and will continue to be, predominantly empirical.
While analytic statements are sure, empirical characterizations can differ radically in the accuracy with which they portray a system's performance. Testing a system in very few domains may leave doubt as to the generality of the system's apparent success. To understand the limitations and applicability of a method, we must determine the source of the method's effectiveness. Like Simon's (1969) metaphorical ant, it may be that most of the interesting computation occurs as a simple method interacts with a complex environment; despite a researcher's claim, there may be very little `power' in the system alone. For these reasons, it is not sufficient to test a system in a number (even many) domains; it is necessary to show that the domains are qualitatively different in important respects. Only in this manner can we fully characterize the spectrum of a system's ability.
Fisher (1987a, c) and Schlimmer (1987a) have employed two techniques for empirically characterizing incremental learning systems. One technique computes a measure of `domain complexity' (or regularity); in (Fisher, 1987a; Fisher & Schlimmer, 1988; Schlimmer, 1987a) this is a function of statistical interdependencies within a domain. A second approach is to construct artificial domains to delimit the conditions of system success and failure. While it is important that systems effectively operate in natural domains, it is a formalized representation of the domain that is presented to the learning system, and this representation can be made arbitrarily complex or trivial. Artificial domains can be used to sweep away any confusion as to just how difficult a task is.
The use of `domain complexity' measures, natural domains, and artificial domains can be combined into an effective empirical methodology. We will continue to use these techniques to characterize incremental learning cost and quality. This empirical methodology is relatively novel with respect to current practices in machine learning, and we believe it is a contribution in itself.