
An Introduction to Case-Based Reasoning

May 11, 2008


In this essay from the August 1991 issue of AI Expert, Ralph Barletta, a onetime case-based reasoning expert for Lockheed's Artificial Intelligence Center and director of CBR at Cognitive Systems, explained how choosing a doctor illustrates the role of specific case experience in guiding expert problem solving.



An Introduction to Case-Based Reasoning

By Ralph Barletta

 


When choosing a doctor, we tend to prefer practitioners who are older. Why? Certainly a doctor fresh out of school has the latest information on diagnoses and treatments. In fact, the young doctor probably has more book knowledge on what is on the cutting edge than the more experienced practitioner. We feel more comfortable with older doctors because they have seen and treated more patients who have had illnesses similar to our own. In essence, we view the experiences of these doctors more in terms of how many cases they have handled than what procedures they know.

Case-based reasoning (CBR) is a rapidly emerging AI technology that uses past experiences (cases) to solve current problems. The idea of reasoning from relevant past cases is appealing because it corresponds to the process an expert uses to solve "new" problems quickly and accurately.


Choosing a doctor illustrates the role of specific case experience in guiding expert problem solving. The ability to retrieve and manipulate past problem-solving examples accurately is important for many domains: diagnosis, classification, prediction, planning, design, law, process control and monitoring, advanced manufacturing, and configuration are good examples. Despite the value of remembering past cases for problem solving, humans can forget what has worked in the past, or why, for that matter. This can lead to the "reinventing the wheel" syndrome so prevalent in government and industry.


Computers do not forget. If we can get them to remember the right thing at the right time, we can reap the benefits of the past successes and failures of ourselves and others. This is the ultimate goal of the growing CBR research community.

THE CBR APPROACH

A rule-based expert system solves problems by taking an input specification (or developing one through a question-and-answer dialog with the user) and then "chaining" together the appropriate set of rules from the rule base to arrive at a solution (Figure 1).

Given the same exact problem situation, the system will go through exactly the same amount of work to come up with the solution. (In other words, rule-based systems don't inherently learn.) In addition, given a problem that is outside the rule-based system's original scope (for example, a new problem type within the same domain), the system often can't render any assistance. Finally, rule-based systems are very time-consuming to build and maintain because rule extraction from experts is labor-intensive and rules are inherently dependent on other rules, making the addition of new knowledge to the system a complex debugging task.


Case-based systems operate in a very different way (Figure 2).

Given an input specification, a case-based system will search its case memory for an existing case that matches the input specification. If we're lucky (and our luck increases as we add new cases to the system), we will find a case that exactly matches the input problem and maps directly to a solution, making it possible to solve potentially complex problems quickly. If, on the other hand, we are not so lucky, we will retrieve a case that is similar to our input situation but not entirely appropriate to provide as a completed solution.


The case-based system (or the human user) must then find and modify small portions of the retrieved case that do not meet the input specifications. This is called case adaptation. The result of case adaptation is a completed solution, but it also generates a new case that can be automatically added to the system's case memory for future use. (Learning is a basic part of a CBR system's architecture.) So, if the same problem is given at some future point, the system will bypass the effort required the first time it solved the problem. In general, case-based systems are easier to build and maintain because the knowledge engineering task is reduced to the simpler problem of defining terms and collecting preclassified cases from the expert, and adding new knowledge is as easy as adding a new case to the case memory.
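
To make the cycle concrete, here is a minimal sketch in Python of the retrieve-adapt-store loop just described. It is purely illustrative; the Case class and the retrieve and adapt helpers are hypothetical placeholders supplied by the caller, not part of any particular CBR shell.

# A minimal sketch of the CBR cycle: retrieve a similar case, adapt it if
# needed, and store the result as a new case. Illustrative only.

from dataclasses import dataclass

@dataclass
class Case:
    features: dict    # the input specification
    solution: object  # the outcome associated with those features

def solve(problem_features, case_memory, retrieve, adapt):
    """Solve a problem by reusing the most similar stored case.

    `retrieve` and `adapt` are supplied by the caller; any retrieval or
    adaptation strategy can be plugged in here.
    """
    best_case = retrieve(problem_features, case_memory)
    if best_case.features == problem_features:
        return best_case.solution                         # exact match: reuse directly
    solution = adapt(best_case, problem_features)         # modify the near miss
    case_memory.append(Case(problem_features, solution))  # learn the new case
    return solution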

 

ORIGINS OF CBR


Most people in the AI field consider Roger Schank's work in Dynamic Memory, published in 1982, to be the first work that described a memory-based approach to reasoning and gave an architecture for building that type of reasoning system on a computer. His early ideas were expanded by some of his students at Yale University, most notably Janet Kolodner of Georgia Tech, whose Cyrus system in 1983 was the first computer implementation of many of the themes expressed in Schank's work. Her early work in CBR was followed by systems built by graduate students at Yale, Georgia Tech, and the University of Massachusetts. These systems demonstrated CBR in such varied domains as law, cooking, dispute and labor mediation, and medicine.

These CBR pioneers have been instrumental in defining the basic research areas and problems now being explored by industry research labs and graduate students at about a dozen schools around the country. As CBR research has matured, CBR industry applications are being built and fielded at Lockheed, GTE, DEC, Boeing, and Martin Marietta. Efforts to build commercial CBR shells are also underway at Cognitive Systems and Inference Corp., both of which have announced these shells as products or product add-ons in 1991.

The CBR research community has held annual workshops since 1987 to bring researchers in the field together. Participation and interest has grown over the past four years, and three of the workshops have published proceedings that are excellent reading for people interested in the field.

CBR RESEARCH ISSUES


Most CBR research falls into a few broad categories: case representation, indexing, storage and retrieval, adaptation, and learning and generalization.


Case representation: What is a case? How can the case be described to the computer? In its simplest form, a case is a list of features that lead to a particular outcome. Some examples include the information on a personal-loan form and whether or not the loan was granted, and a patient history and the associated diagnosis. In its most complex form, a case is a connected set of subcases that form the problem-solving task's structure; for example, the design of a circuit or an airplane. The designs of an airplane and a circuit are made up of subdesigns of the components that comprise the whole, each of which could be considered a case unto itself.


Determining the appropriate case features is the main knowledge-engineering task in CBR systems, and this task mirrors a subset of what needs to be done in knowledge engineering a rule-based system. It involves defining the terminology of the domain and gathering representative examples (such as cases) of problem solving by the expert. The difference between rule-based and case-based knowledge engineering is that automatic case-indexing techniques drastically reduce the need to extract and structure specific rule-like knowledge from the expert, which is the most time-consuming part of rule-based knowledge engineering.


Case indexing: A CBR system derives its power from its ability to retrieve relevant cases quickly and accurately from its memory. Figuring out when a case should be selected for retrieval in similar future situations is the goal of the case-indexing process. Building a structure or process that will return the most appropriate case is the goal of the case memory and retrieval process. Case indexing processes usually fall into one of three kinds: nearest neighbor, inductive, and knowledge-guided, or a combination of the three.


Nearest-neighbor approaches let the user retrieve cases based on a weighted sum of features in the input case that match cases in memory. In the simplest approach to nearest-neighbor matching (in which all features have equal weight), the system would prefer a case that matched on eight features to a case that matched on six. This approach is a good one to use if the retrieval goal is not well defined or if few cases are available. The biggest problem with using this approach exclusively is that it can be impossible to converge on a set of global-feature weights that will accurately retrieve cases in all situations. Feature weights for most problem domains are context dependent; that is, a given feature may be more or less important in determining the appropriate case to retrieve depending on the values of the case's other features. In essence, each case should have its own set of feature weights for determining the relevance of that case to a new problem.
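
As a rough illustration (not taken from any particular system), the following Python sketch scores stored cases by a weighted count of matching features and returns the best match. The loan-style features and weights are hypothetical.

# A sketch of weighted nearest-neighbor case retrieval. Each case is a dict of
# feature values plus a stored solution; weights default to 1.0 per feature.

def match_score(input_features, case_features, weights):
    """Weighted count of input features that match the stored case."""
    return sum(weights.get(f, 1.0)
               for f, v in input_features.items()
               if case_features.get(f) == v)

def retrieve_nearest(input_features, case_memory, weights=None):
    """Return the stored case with the highest weighted match score."""
    weights = weights or {}
    return max(case_memory,
               key=lambda case: match_score(input_features, case["features"], weights))

# Example: with equal weights, the case sharing the most features wins.
cases = [
    {"features": {"income": "high", "debt": "low", "employed": True}, "solution": "grant"},
    {"features": {"income": "low", "debt": "high", "employed": False}, "solution": "deny"},
]
print(retrieve_nearest({"income": "high", "debt": "low", "employed": True}, cases)["solution"])  # grant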


Inductive approaches to indexing are a significant improvement over nearest-neighbor approaches in situations where the retrieval goal or case outcome is well defined and there are enough examples of each type of goal with which to perform inductive comparison. Using algorithms like ID3 and CART, the task is to determine inductively which features do the best job of discriminating between the various case "outcomes" that need to be classified (for example, good loans versus bad loans) and to organize case memory with respect to those induced features.
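
The following sketch illustrates the inductive idea in miniature, assuming a small set of preclassified loan cases: it scores each feature by information gain, the measure ID3 uses to pick discriminating features. It is not the full ID3 or CART algorithm, and the case data is hypothetical.

# A sketch of the inductive idea behind ID3-style indexing: score each feature
# by information gain over preclassified cases and index on the best one.

from collections import Counter
from math import log2

def entropy(labels):
    """Shannon entropy of a list of outcome labels."""
    counts = Counter(labels)
    total = len(labels)
    return -sum((c / total) * log2(c / total) for c in counts.values())

def information_gain(cases, feature, label_key="outcome"):
    """How much knowing `feature` reduces uncertainty about the outcome."""
    base = entropy([c[label_key] for c in cases])
    remainder = 0.0
    for value in {c[feature] for c in cases}:
        subset = [c[label_key] for c in cases if c[feature] == value]
        remainder += len(subset) / len(cases) * entropy(subset)
    return base - remainder

# Hypothetical loan cases: pick the feature that best separates good from bad loans.
loans = [
    {"income": "high", "debt": "low",  "outcome": "good"},
    {"income": "high", "debt": "high", "outcome": "good"},
    {"income": "low",  "debt": "low",  "outcome": "bad"},
    {"income": "low",  "debt": "high", "outcome": "bad"},
]
best = max(["income", "debt"], key=lambda f: information_gain(loans, f))
print(best)  # "income" -- the most discriminating feature in this toy data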


Inductive approaches have two advantages. First, they can automatically, objectively, and rigorously analyze the cases to determine the best features for distinguishing them (providing the major system-building advantage over rule-based approaches). Second, the cases can be organized for retrieval into a hierarchical structure, so that retrieval time grows only with the log of the number of cases rather than linearly. Retrieval time can be an important factor when using libraries of cases numbering in the thousands. To perform induction, however, the system needs a reasonable quantity of cases to generate accurate discriminating features, and the inductive analysis itself can take a lot of upfront time, which is inductive indexing's primary drawback.


Knowledge-based indexing approaches try to apply existing knowledge to each case in the library to determine which features are important for retrieving each case. This is the preferred approach to indexing cases if such explanatory knowledge is available and representable. The problem is that it is often difficult to codify enough explanatory information (usually in the form of a rule-based domain model) to perform complete knowledge-based indexing on a wide range of possible case inputs. Hence, many systems use knowledge-based indexing in conjunction with other indexing techniques. However, since some useful indexing knowledge is almost always available for most real-world domains, it is important for CBR systems to try to take advantage of it.
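
A small hypothetical sketch of the idea: hand-coded domain rules assign per-case feature weights, which a nearest-neighbor matcher could then use at retrieval time. The medical features and rules below are invented purely for illustration.

# A sketch of knowledge-guided indexing: domain rules decide which features
# matter most for retrieving a particular case. Rules and features are invented.

def index_case(case_features):
    """Return feature weights for this case, derived from domain knowledge."""
    weights = {f: 1.0 for f in case_features}   # default: all features equally important
    if case_features.get("fever") and case_features.get("rash"):
        weights["rash"] = 3.0                   # rule: a rash is highly diagnostic when fever is present
    if case_features.get("age", 0) > 65:
        weights["age"] = 2.0                    # rule: age matters more for elderly patients
    return weights

case = {"fever": True, "rash": True, "age": 70, "cough": False}
print(index_case(case))  # per-case weights to be used during retrieval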


Case storage and retrieval: Once cases are represented and indexed, they can be organized into an efficient structure for retrieval. Most case-memory structures fall into a range between purely associative retrieval, where any or all of the features of a case are indexed independently of the other features, and purely hierarchical retrieval, where case features are highly organized into a general-to-specific concept structure. Nearest-neighbor matching techniques are considered associative because they impose no real memory organization. Discrimination nets are a cross between associative and hierarchical: they give the memory some structure but retain greater retrieval flexibility through a larger number of links between potential indexing features. Decision trees are an example of purely hierarchical memory organization.


The type of memory organization is related to the amount of knowledge available to perform indexing and the retrieval needs of the system. If flexibility is required because one case library is being used for several retrieval tasks, a more associative approach is often used. When the retrieval task is well defined, a hierarchical approach is used because of the advantages in retrieval time the hierarchical approaches have over associative approaches.


Case adaptation: The goal of case retrieval is to return the most similar past case that is relevant to the input situation. Case adaptation takes a retrieved case that meets most of the needs of the current situation and turns it into one that meets all of the situation's needs. Case adaptation can also involve making minimal changes to the input requirements to meet a known goal embodied in a stored case. It is difficult to define a single generically applicable approach to case adaptation, because adaptation tends to be problem-specific. The adaptation needs and approaches for one problem (design) may be different from the adaptation needs for another (diagnosis).
Most existing CBR systems achieve case adaptation for the specific problem domains they address by encoding adaptation knowledge in the form of a set of adaptation rules or domain model. Adaptation rules are then applied to a retrieved case to transform it into a new case that meets all of the input problem's constraints. More recent applications have successfully used pieces of existing cases in memory to perform adaptations. In problem domains where it is difficult to codify enough rule-like knowledge to let broad adaptation be done, using pieces of cases is the best, if not the only, alternative. And, even if cases can't be adapted by the computer, at least the system has provided the human "adapter" with a significant starting point.
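
As a hypothetical illustration of rule-based adaptation, the sketch below copies a retrieved solution and applies a list of adaptation rules so that the copy satisfies the new requirements. The cooking-style rule and substitution table are invented for the example.

# A sketch of rule-based case adaptation: each rule inspects the retrieved
# case against the new requirements and patches one kind of mismatch.

def adapt(retrieved, requirements, adaptation_rules):
    """Copy the retrieved solution and apply each adaptation rule to it."""
    solution = dict(retrieved["solution"])
    for rule in adaptation_rules:
        solution = rule(solution, retrieved["features"], requirements)
    return solution

# Example rule for a cooking-style domain: swap any forbidden ingredient
# for an allowed substitute (substitution table is hypothetical).
def substitute_ingredient(solution, old_features, requirements):
    forbidden = requirements.get("avoid", set())
    substitutions = {"beef": "tofu", "cream": "soy milk"}
    solution["ingredients"] = [
        substitutions.get(i, i) if i in forbidden else i
        for i in solution["ingredients"]
    ]
    return solution

retrieved = {"features": {"dish": "stroganoff"},
             "solution": {"ingredients": ["beef", "cream", "mushrooms"]}}
print(adapt(retrieved, {"avoid": {"beef"}}, [substitute_ingredient]))
# {'ingredients': ['tofu', 'cream', 'mushrooms']}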


Learning and generalization: Learning and generalization are important to CBR systems. Taking advantage of existing techniques for extracting useful information from examples lets case-based systems avoid some of the main problems of rule-based approaches in gathering problem-solving or classification knowledge and putting it to good use. Inductive and explanation-based indexing techniques under development will let CBR systems derive useful indexing features and structure them into an efficient memory organization. As cases accumulate, case generalization can be used to define prototypical cases that embody the major features of a group of specific cases, and those prototypical cases can be stored with the specific cases, improving the accuracy of the system in the long run. In addition, inductive-case analysis research is being done to build domain theories in areas where even the experts don't understand how the underlying processes in their domain work.
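
One simple way to form a prototypical case, sketched below on hypothetical data, is to keep only the feature values shared by every case in a group; real systems use richer generalization methods, but the idea is the same.

# A sketch of case generalization: derive a prototypical case from the
# feature values that all cases in a group agree on. Illustrative only.

def generalize(cases):
    """Return the features common to all cases (a simple prototypical case)."""
    prototype = dict(cases[0]["features"])
    for case in cases[1:]:
        for feature, value in list(prototype.items()):
            if case["features"].get(feature) != value:
                del prototype[feature]   # drop features the group disagrees on
    return prototype

group = [
    {"features": {"income": "high", "debt": "low", "employed": True}},
    {"features": {"income": "high", "debt": "high", "employed": True}},
]
print(generalize(group))  # {'income': 'high', 'employed': True}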

HOW DOES CBR COMPARE?


Let's compare CBR to other well-known and emerging techniques like rule-based systems, neural networks, and pattern-recognition techniques.


Comparing CBR to rules for classification tasks: CBR has several advantages over rule-based approaches for classification. All the knowledge-engineering time spent manually extracting a complete and consistent set of classification rules from an expert is saved when using the CBR approach. Instead, time is spent gathering representative cases from each of the classes that need to be recognized and creating the appropriate set of descriptive features for doing the classification. Representing features of the domain and gathering representative cases for use in knowledge engineering is also a part of building rule-based systems, so no appreciable time is added to developing this aspect of the CBR system.


CBR saves time when structuring the acquired rules into a functioning system. Even though expert-system shells work hard to provide tools to help structure rules, this effort can still be a difficult and time-consuming process, especially in complex classification domains. In Cognitive Systems' CBR shell, ReMind, the inductive learning and indexing process automatically builds the optimal decision structure for classification.


The difficulty in structuring rules into a working system also affects the long-term costs of maintaining and updating the system. To add more knowledge to a case-based system, all the user has to do is add more cases and let the inductive-indexing mechanism restructure the classification tree. If for some reason a decision is made to change the set of categories that need to be recognized in an inductively oriented CBR system, the user merely resets the classifications of the existing cases to the new classifications and reruns the indexing mechanism. In a rule-based system, the developer could be faced with redoing the entire system from scratch.


CBR also beats rule-based approaches in explaining and justifying the classifications made by the system. Since expert-system rules are a distillation of the information found in examples, a rule-based system can only report to the user the chain of rules that led to the determination of one class or another; the original examples are not used. A case-based system can justify its classification not only by describing the general rules that indicate one class over another (generated through the indexing process) but also by showing the examples that support the index used. In cases where the classification cannot be made definitively, letting the user look at real examples can help make a final categorization. A case-based system also lets the user store useful classification information in cases that cannot be used by the computer for either case indexing or rule firing (like pictures, sound, and free-form text) but can be used by a human.


Comparing CBR to neural nets and pattern recognition: Pattern-recognition techniques have been used to do classification tasks for years. The basic approach involves taking a set of examples and counter-examples of the "target concept" (good and bad loans) and attempting to create a mathematical function for determining an unseen input's class. Many kinds of pattern-recognition algorithms exist, but the most popular are linear or quadratic discriminants (found in commercial software like SAS) and Bayes (probabilistic) classifiers.


Because the final outputs of linear and quadratic discriminant approaches are mathematical functions, they only work well when the features of the input examples are described with numbers. This restriction limits the usefulness of pattern-recognition approaches in domains that have important nonnumeric features. Bayes classifiers have a well-known problem: they require complete probability assessments for all of the dependent variables in the sample. For most real-world domains, this is hard to accomplish.


Even if you can build an accurate discriminant function, it can only be used to make a black or white decision about a new example's classification. Supporting evidence of examples is unavailable, preventing the user from making further assessments based on direct comparison with past precedents. Inadequacies in pattern-recognition approaches have been the prime motivating factor in the development of the symbolic machine learning and neural-network approaches of recent years.
Neural networks have demonstrated interesting capabilities that warrant the attention they are receiving. Neural networks can produce accurate classification results comparable to symbolic inductive approaches, sometimes better and sometimes worse. They are also very good at handling "noisy" data, that is, data containing errors. Finally, neural networks, once they are trained, can classify new examples quickly, making them well suited for real-time applications.


Despite these advantages, neural nets have many disadvantages that make them difficult or impractical to use on a broad spectrum of classification problems. Representationally, it can be impossible or awkward to define the features of an example by a vector of either Boolean or numeric values. Many problems contain complex examples that are described by an internal structure of relationships rather than a flat list of features. Computationally, neural networks require two to three orders of magnitude more data and CPU time to arrive at an accurate network configuration. For complex datasets, this can mean weeks of CPU time to train a single network.


The structure of a neural net has a huge impact on its ability to classify things. The network builder has many degrees of freedom in defining how the network is structured; he or she can alter the number of nodes, the connectivity between nodes, and the number of layers of nodes in a network. All this flexibility presents a problem: no defined methodology yet exists for matching the structure of a network to the needs of a given problem domain; it is still very much an art form rather than a science. This forces the system developer to try different network structures to find an optimal setup. Coupling this trial-and-error process with the high computation time required to train a single network only adds to the difficulty of developing classifiers with neural nets.


Even if a satisfactory network structure is found, the examples represented, and the network trained, you won't be able to understand why a particular classification is derived when a new example is presented. A good way to interpret what the network has learned doesn't exist; all you have are the nodes, links, and weights. For most domains, people want to know what the best classification is and the reasons why the example fit the classification. Neural nets have no way of providing the user with this vital information (although research is being done in this area).


The inductive-indexing capabilities in CBR systems provide several major advantages over neural nets and pattern-recognition techniques. Inductive systems can represent and learn from a wider range of feature types than either neural nets or pattern recognition. The ability to use a richer feature set for describing examples makes them at least as accurate or more accurate than other approaches. The number of examples and time required to train an inductive system is less than that required for pattern recognition and much less than neural networks. The output of the inductive system is readable by a nontechnical person, eliminating the need for specialized training to interpret the results. Finally, case-based systems can be used to perform more tasks than just classification; neural networks and pattern recognition cannot.


A recent study established an empirical benchmark of inductive learning against neural networks and pattern recognition techniques. Two inductive-learning techniques were compared with two neural-network approaches and four pattern-recognition approaches on five datasets of real-world data.


Inductive techniques were slightly more accurate than neural nets and were much more accurate than pattern-recognition approaches. Induction and pattern recognition used about the same amount of processing time, and both were much faster than neural networks. The output of inductive systems was more readable than either neural networks or pattern-recognition techniques. These research results support the ease of use and power of inductive approaches, in general, over other techniques.


Comparing CBR to rule-based systems on complex synthesis tasks: Rule-based systems have been used to solve complex, real-world problems in planning, scheduling, and design. They have been successful in these problem areas for three reasons: the experts understood the application domains, the underlying domain behavior was representable, and the problem was tractable enough to solve from scratch.


Rule-based approaches solve difficult problems like planning or scheduling by synthesizing them from scratch. This is computationally expensive for real-world problems, making it difficult to explore alternative plans or schedules because they take too long to create with a "from-scratch" approach. The problem is not that rule-firing in expert systems is inefficient, but that constructing complex plans from scratch involves a lot of search (rule firings), requiring the system to consider thousands of alternatives while generating a single plan. That's why people try to avoid building plans, designs, and schedules from scratch.


What is needed is a way to "jump start" the system with a workable plan that the rules can then modify to reach a solution; this is exactly what CBR provides. CBR systems solve complex problems like planning, scheduling, and design by finding a similar, successful past plan, schedule, or design, and modifying it to meet all the current problem's needs. This assumes that the system has a decent set of cases from which to work. If you can't derive any value from past plans or designs, CBR is not a good idea. Fortunately, not many domains fit this description. Even in design, where the emphasis is on innovation, designers build on what has been successful in the past.


Since most retrieved cases that the system could start with will not meet the exact needs of the current situation, automatic case adaptation is a key issue in using CBR. Most CBR approaches rely on adaptation rules to do adaptation. This is one area where potential synergy could be realized by combining CBR with rule-based systems, and some researchers are exploring the best ways of combining the two approaches.


Case-based approaches work well in domains that are not well understood, because the system doesn't need to know why something worked in the past. It needs only to recognize a future situation in which the plan could be used. Case-based systems can often finesse some of the difficult representational and behavioral modeling needs of a domain that make using rule-based approaches impossible because the rules are not known. In fact, the inductive techniques inherent in case-based systems can, in time, facilitate the process of "understanding" the domain by performing analysis that can show previously unknown relationships between the domain features.


Many complex problems exist that neither case-based nor rule-based systems could fully address alone, such as economic or political analysis, building the space shuttle, or handling the logistics of moving and maintaining troops and supplies. But even on these problems, CBR can at least assist human planners in taking advantage of what has happened before.


REAL-WORLD APPLICATIONS


CBR systems are being deployed in a variety of areas in business and manufacturing. An example CBR application that Cognitive Systems built and fielded using its CBR shell provides a good illustration of the system-building and performance differences between a case-based and a rule-based system performing the same task.


The Prism system. Large banks receive up to 2,000 telexes per day, describing more than 100 kinds of bank transactions that need to occur. These telexes need to be read, classified, and routed (in real time) to the appropriate bank department. This classification and routing process is normally done by the bank staff.


Cognitive Systems was contracted to automate classification and routing. Since most of Cognitive Systems' experience to that point was in developing rule-based systems, a rule-based telex classifier called Prism was developed. The original system took about one year to build. Cognitive then went to a second bank, which also contracted it to build a rule-based version of the telex classifier for its message traffic. It took four months to build the second rule-based classifier. The rule-based system could process telexes in two to seven seconds and achieved a general classifying accuracy of 72% across the entire telex range.


As an experiment, Cognitive decided to reimplement the second version of the telex classifier using its newly developed CBR shell. The system's case-based version was developed in about two weeks. It could process telexes in 0.19-0.25 seconds (an order of magnitude faster) and achieved a classifying accuracy of 77%. Upon receipt of these results, the customer decided to replace the system's rule-based version with the case-based system, because it was faster, more accurate, and easier to modify and update than the rule-based system.


The Clavier system. Lockheed and other aerospace companies use composite materials (such as graphite-epoxy) to make parts for airplanes, satellites, and missiles. These composite parts must be cured in an autoclave, a large pressure- and temperature-adjustable convection oven, for four to eight hours. Since curing time is long and schedules are short, as many parts as possible must be put into the autoclave in each run. Yet for all the parts to be cured properly at the end of the eight hours, they must all heat up at the same rate. The heat-up characteristics of a given part are affected not only by the thickness and composition of that part but also by where it is positioned in the oven (air currents inside the autoclave create pockets of cooler and warmer spots). The cost of the parts can run into the thousands of dollars, so accuracy and consistency in configuring parts in the autoclave is important. The main difficulty in automating this kind of task is that the knowledge used to configure parts in the oven is not well understood.


The only way to know if a given configuration will work is to run it. Because this task was oriented toward trial and error, the human experts used pictures of past successful configurations to help them load the autoclave. This was a tailor-made application for CBR.


Lockheed built a case-based system called Clavier to configure composite parts inside an autoclave for maximum performance and throughput. This task would have been difficult to perform from scratch using a rule-based approach because of the difficulty of representing and understanding the impact of all the spatial issues in configuring an autoclave load. By using successful past cases, the system could avoid representing the spatial information at the level of detail necessary to perform the configuration task from scratch. All it needed to know was whether a particular part had been in a particular place in the autoclave in a prior layout that was successfully cured. The spatial knowledge used by the human experts to configure loads was captured in each case. Clavier is currently in use at Lockheed's Sunnyvale plant.

A NEW ROAD


Your doctor of 20 years might not know all the tricks that the recent medical school graduate knows, but sometimes, when you don't know where that pain is coming from, experience is everything. Like choosing a doctor with experience, case-based reasoning is a very appealing technology because it corresponds to the process an expert uses to solve problems. In real-world situations, not only is CBR possible, it is often a better, faster, and more accurate way to find solutions.

SUGGESTED READING


Schank, R., and C. Riesbeck. Inside Case-Based Reasoning. Hillsdale, N.J.: Lawrence Erlbaum Associates, 1989. Provides a good overview of work in the field, covers some of the major techniques used to perform aspects of CBR, and provides LISP programming examples that can be tried by the reader.
CBR workshop proceedings. Menlo Park, Calif.: Morgan Kaufmann, 1988-1991. Covers the CBR research work of universities and industry AI labs.
Weiss, S.M., and I. Kapouleas. "An Empirical Comparison of Pattern Recognition, Neural Nets, and Machine Learning Classification Methods." Proceedings of the 11th IJCAI, 1989. One of several good papers in the proceedings that covers the comparison of neural nets, pattern recognition, and symbolic induction.
