Learning from experience, using knowledge of past problems and their solutions to find a solution to the problem at hand, is one of the central forms of both natural and machine learning. Most machine learning algorithms divide this process into two parts: first, a general rule, formula, or method is abstracted from the corpus of past cases offline, and then this rule is applied to new examples as they arise. An alternative approach, however, cuts out the middle man and goes directly from the past examples to the current case. Confronted with a new problem, the reasoning mechanism searches through the corpus looking for previous cases that were similar, and then adapts the solution that worked there as necessary. No general rule is ever derived or applied. Intuitively, this idea seems very attractive; it seems both very plausible as a cognitive mechanism, and like an idea that should lend itself to an efficient and effective implementation.
Two versions of this general idea have been developed in the computer science literature. The k-nearest-neighbors (kNN) algorithm is a precisely defined version: a “case” is a feature vector, “similarity” is the distance between vectors under a metric, and the “solution” is an additional feature to be predicted. The number k is a fixed parameter of the program. Presented with a new instance, one retrieves the k closest cases in the corpus of cases, and these “vote” on the solution. No solution adaptation is carried out.
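The kNN procedure as described above is simple enough to sketch in a few lines. The following is a minimal illustration, not taken from the book under review: cases are feature vectors with labels, similarity is Euclidean distance, and the k retrieved neighbors vote on the predicted label with no adaptation step.

```python
import math
from collections import Counter

def knn_classify(corpus, query, k):
    """Classify `query` by majority vote among its k nearest cases.

    `corpus` is a list of (feature_vector, label) pairs; similarity is
    Euclidean distance between vectors.
    """
    # Retrieve the k closest cases -- the computationally hard part at scale.
    neighbors = sorted(corpus, key=lambda case: math.dist(case[0], query))[:k]
    # The retrieved neighbors "vote" on the solution; nothing is adapted.
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0]
```

For instance, with a corpus of four labeled points, `knn_classify([((0, 0), "a"), ((0, 1), "a"), ((5, 5), "b"), ((5, 6), "b")], (1, 0), 3)` returns `"a"`, since two of the three nearest cases carry that label. The brute-force retrieval here is linear in the corpus size; techniques such as locality-sensitive hashing exist precisely to speed up this step.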
The other method is case-based reasoning (CBR). In CBR, the representations of cases and solutions have a much richer structure, and the measure of similarity and the method of solution adaptation reflect deep domain knowledge. The first studies of this technique were conducted by Janet Kolodner and Kristian Hammond in the 1980s, and Kolodner published a general survey in 1993.
Both kNN and CBR are a little out of the mainstream of artificial intelligence and machine learning. For instance, the current edition of Russell and Norvig contains only four pages on kNN and only a passing reference to CBR. Nonetheless, I followed some of the progress in kNN over the years. On the theoretical side, more efficient algorithms for retrieving the k nearest neighbors (the computationally hard part of the process), such as locality-sensitive hashing, have been developed by Piotr Indyk and others. On the application side, Torralba, Fergus, and Freeman demonstrated in their tour de force that kNN could be efficiently and effectively applied to a corpus of 80 million images, each 32 × 32 pixels, to do image classification. By contrast, I had no awareness of improvements in the state of the art in CBR since Kolodner’s book 20 years ago; when I heard there was a new textbook, I eagerly awaited the chance to bring myself up to date.
Several hours of reading and 546 pages later, I am, regrettably, not much the wiser. Richter and Weber’s book is extremely poorly written. It veers from obvious to vague to meaningless. Occasionally, it drops into highly technical detail without providing nearly enough background. For example, see the complicated information-theoretic formulas and obscure theorems in the chapter on probability.
The exercises are similarly flawed. Most of them are so vague that, as a student, I would have no idea what they are asking, and, as an instructor, I would have no idea how to grade them. For example, consider this exercise on page 52: “Define a recommender system for students who want to study in your area. Formulate typical queries the students may have.” A few require technical knowledge far beyond and unrelated to the content of the book, such as the exercise on page 186: “Prove that the average number of Voronoi edges per Voronoi polygon (in two dimensions) is at most 6.” In a class on computational geometry, this would be a reasonable problem (the proof, using Euler’s formula, is short, though a little tricky); in this course, it is completely out of place.
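For the curious reader, the standard argument behind that Voronoi exercise runs roughly as follows (this sketch is mine, not the book’s; the “tricky” step is the vertex at infinity):

```latex
Connect all unbounded Voronoi edges to an auxiliary vertex at infinity,
giving a connected planar graph with $v$ vertices, $e$ edges, and $f = n$
faces (one per Voronoi cell). Euler's formula gives
\[
  v - e + n = 2 .
\]
Every Voronoi vertex, including the auxiliary one, has degree at least $3$,
so $2e \ge 3v$, i.e.\ $v \le \tfrac{2}{3}e$. Substituting,
\[
  \tfrac{2}{3}e - e + n \ge 2
  \quad\Longrightarrow\quad
  e \le 3n - 6 .
\]
Each edge borders two cells, so the average number of edges per cell is
\[
  \frac{2e}{n} \le \frac{6n - 12}{n} < 6 .
\]
```

The point of the review stands: this is a pleasant exercise in planar graph theory, but nothing in a CBR textbook prepares a student to produce it.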
The later chapters on “complex knowledge sources” are short surveys of various artificial intelligence topics, with little clear explanation of the relation between these theories and CBR. For instance, chapter 17, “Textual CBR,” mostly provides the same kind of material one would find in an information retrieval (IR) textbook, such as the one by Manning, Raghavan, and Schütze, but presented much more superficially and much less intelligibly. The discussion of the connection to CBR is short and mostly vague, and the main specific information is a list of similarity measures from the IR literature, inadequately described or explained. Basic questions remain unanswered: What does CBR contribute to the problem of text understanding, and what is missed by more mainstream approaches? What understanding problems can be solved using CBR? What kind of corpus of texts can be used, and what adaptation techniques are useful, if any? The chapters on image recognition and speech recognition are similarly inadequate.
The discussion of specific applications is frustratingly vague. For example, Section 4.6.1 discusses spam filtering, and claims that “CBR is recommended for spam filtering classification because of how it compares to its alternatives.” The reader naturally wants to know how CBR does compare to its alternatives. Has an industrial-strength CBR spam filter been built, or even a toy CBR spam filter? How large a corpus has it been used on? What features does it have, and what techniques does it use? How does it compare in accuracy or in any other respect to the alternatives? These key questions are not discussed, nor are references given.
I still do not know whether CBR is, at this point, a viable technique, a promising line of research, or essentially moribund. So I do not know whether there is any point in teaching a course on the topic at all. But certainly, if one were to teach a course on CBR, this textbook would not be very helpful.