It seems appropriate to begin this review of books on neural networks by establishing the scope of what is to be covered. First, it does not include the classic references in the field (some of which have been reviewed separately in Computing Reviews) such as Anderson and Rosenfeld [1], Minsky and Papert [2], Kohonen [3], and Rumelhart and McClelland [4,5]. Being essentially research works, many of these classic references remain inaccessible to the novice neural netter. Generating novel research ideas is one thing; being able to communicate these ideas to nonspecialists is another.
Second, this review is not concerned with cognitive science, theories of the brain, or biological neural networks as emphasized in Rumelhart and McClelland [4,5]. The emphasis in this review is on neurally inspired computers or artificial neural networks (ANNs).
Third, I do not discuss connectionism as an alternative to rule-based (heuristic) artificial intelligence; much of the work on both sides of the fence is far too speculative for my liking.
Fourth, I do not address VLSI implementations of ANNs. (Two good starting points for this specialization are Mead [6] and Morgan [7].)
So precisely what are we concerned with in this review? I focus primarily on introductory books published during the last two to three years. My emphasis is on simple, clear explanations (with a minimum of math) of neural networks for people encountering the field for the first time. What follows will therefore be of interest to educators establishing courses in ANNs.
Prior to 1990, few introductory books on ANNs were available. Two notable exceptions were Pao [8] and Wasserman [9]. The title of Pao’s book, Adaptive pattern recognition and neural networks, reveals its pattern recognition orientation, yet it manages to cover perceptrons, associative memory, and self-organizing networks in a general manner. One of the best features of Pao’s book is the inclusion, in an appendix, of a C-code listing of the generalized delta rule. Wasserman introduces the perceptron, backpropagation, counterpropagation, Hopfield, BAM, ART, and neocognitron ANN models, and includes an appendix on the training algorithms associated with some of these models. Wasserman’s book is accessible to novice readers and has been used for some years in both undergraduate and graduate courses on ANNs.
Since 1990, a plethora of ANN books has appeared, some motivated more by an attempt to cash in on the recent upsurge of interest in the field than by a desire to present an explanation of the ANN fundamentals in a clear, accessible manner. What follows is a critical evaluation of some of the best introductory ANN books published during the last two or three years.
Some books, such as those by Hecht-Nielsen, Simpson, and Zurada, bolster their descriptions of ANNs with considerable doses of mathematics, in the mistaken belief that the math imparts instant credibility, validity, or justification to their work. Unfortunately, this heavy concentration of math often has the opposite effect, with novice readers quickly turning off and heading in search of more readily accessible texts. Non-novice readers who desire a comprehensive treatment of specific ANN models are probably better advised to read the original descriptions. (In this case, I recommend Anderson and Rosenfeld [1].) Hecht-Nielsen’s book is particularly disappointing, since its density stands in distinct contrast to the accessibility of his neurocomputer software, such as ExploreNet. A mathematically rigorous ANN book need not remain inaccessible to readers, as the book by Hertz, Krogh, and Palmer clearly demonstrates.
Table 1: ANN Models Covered

| Model | Aleksander and Morton | Beale and Jackson | Dayhoff | Freeman and Skapura | Hecht-Nielsen | Hertz, Krogh, and Palmer | Khanna | Simpson | Zurada |
| ADAL  | No  | No  | Yes | Yes | Yes | No  | No  | Yes | No  |
| AM    | No  | No  | Yes | No  | Yes | No  | No  | Yes | Yes |
| ART   | No  | Yes | No  | Yes | Yes | Yes | Yes | Yes | Yes |
| BACKP | No  | No  | Yes | Yes | Yes | No  | Yes | Yes | No  |
| BAM   | No  | No  | No  | Yes | No  | No  | No  | Yes | Yes |
| BOLTZ | Yes | No  | No  | Yes | Yes | Yes | Yes | Yes | No  |
| CL    | Yes | No  | Yes | No  | Yes | No  | Yes | Yes | No  |
| COUNT | No  | No  | Yes | Yes | Yes | No  | No  | Yes | Yes |
| GMDH  | No  | No  | No  | No  | Yes | No  | No  | No  | No  |
| HAMM  | No  | No  | No  | No  | No  | No  | No  | Yes | Yes |
| HOPF  | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes |
| HOPP  | No  | No  | No  | No  | No  | No  | Yes | Yes | No  |
| KOH   | Yes | Yes | Yes | Yes | Yes | Yes | No  | Yes | Yes |
| LAT   | No  | No  | Yes | No  | No  | No  | No  | Yes | No  |
| MLP   | Yes | Yes | Yes | No  | No  | Yes | Yes | Yes | Yes |
| NEOC  | No  | No  | No  | Yes | Yes | No  | Yes | Yes | No  |
| PERCP | Yes | Yes | Yes | No  | No  | Yes | Yes | Yes | Yes |
| RECUR | No  | No  | No  | No  | Yes | Yes | No  | Yes | No  |
| SPT   | No  | No  | No  | Yes | Yes | No  | No  | Yes | No  |
| TDNN  | No  | No  | Yes | No  | No  | No  | No  | Yes | No  |

Key: ADAL = Adaline; AM = associative memory; ART = adaptive resonance theory; BACKP = backpropagation; BAM = bidirectional associative memory; BOLTZ = Boltzmann machine; CL = competitive learning; COUNT = counterpropagation net; GMDH = group method of data handling; HAMM = Hamming; HOPF = Hopfield; HOPP = Hoppensteadt; KOH = Kohonen’s self-organizing feature map; LAT = lateral inhibition; MLP = multilayer perceptron; NEOC = neocognitron; PERCP = perceptron; RECUR = recurrent nets; SPT = spatio-temporal classification; TDNN = time-delay neural net.
At the other end of the spectrum are books dashed off quickly in an attempt to capitalize on the recent popularity of ANNs. Such books read as drafts that would have benefited considerably from proofreading, revision, and expansion prior to publication. Khanna is one of the better books in this category. (The worst examples of this kind of ANN book have naturally been eliminated from this review altogether.) Even good books such as Beale and Jackson would have benefited from such an exercise; their coverage of Kohonen’s self-organizing feature map, for example, contains errors.
A failing of most introductory ANN books is the lack of accompanying simulator software. (ANN books that do come with simulator software on disks will be the subject of a forthcoming review.) Few books have followed Wasserman’s earlier lead and included code listings; a notable exception is Freeman and Skapura.
Virtually all nine books in this review cover the most important ANN models, namely the perceptron, the multilayer perceptron with backpropagation, the Hopfield net, the Boltzmann machine, Kohonen’s self-organizing feature map, and adaptive resonance theory.
Aleksander and Morton
The emphasis here is on Wisard, an adaptive pattern recognition machine; probabilistic logic nodes; speech; and the neocognitron. The authors have written a book that is easily accessible to the nonexpert on neural nets, with minimal mathematical content. The book’s best feature is its clear description of ANN models, especially the Hopfield model. Its worst feature is the lack of an accompanying ANN software simulation package.
Beale and Jackson
The authors emphasize pattern recognition and associative memory in their text; it is easily accessible and contains minimal math. The book’s best features are the end-of-chapter summaries and the fact that descriptions of various ANN algorithms appear in separate boxes. As is true of Aleksander and Morton’s book, its worst feature is the lack of an accompanying software package.
Dayhoff
Dayhoff emphasizes both biological and artificial neural networks. The book is easily accessible, and the math is minimal, in fact almost nonexistent. Especially clear, descriptive examples are the book’s best feature. Its worst features are the lack of accompanying software and the insufficiency of the mathematics.
Freeman and Skapura
The authors use sufficient mathematics and description to explain ANN models. They focus on guidelines for writing software simulations. The book is easily accessible to the nonexpert. The best features of this book are the inclusion of ANN simulator guidelines (in C code) and a clear explanation of the most significant ANN models.
Hecht-Nielsen
This moderately accessible book emphasizes theoretical and mathematical principles, along with neurocomputer principles and applications; the mathematical concentration is accordingly heavy. Its best feature is a good historical account of ANNs. Its worst feature is its reliance on mathematical proofs and justifications; the book includes a description of Axon, but no software package accompanies it.
Hertz, Krogh, and Palmer
The authors emphasize theoretical issues. This book is part of the Sciences of Complexity book series. It is moderately accessible and provides a good description and coverage of its subject (despite the mathematics, which concentrates on statistical mechanics, mean field, and spin glass theories). The book’s other good feature is its discussion of optimization problems, including the traveling salesperson problem. Its worst feature is the lack of accompanying software.
Khanna
Like Hertz, Krogh, and Palmer, Khanna focuses on theoretical issues. The author also emphasizes associative memory. The book contains a moderate amount of mathematics and is moderately accessible. The worst features of Khanna’s book are its brief coverage of the subject and its heavy reliance on one article by R. P. Lippmann [10] that appeared in an IEEE publication.
Simpson
Simpson stresses mathematical summaries of ANN algorithms; the heavy concentration of mathematical content makes the book only moderately accessible to the nonexpert. It serves as a reference source for previous work on specific ANNs, especially applications, a feature I liked. The book’s other good features are the appendix on the history of ANNs and the bibliography. Its worst feature is its brief coverage.
Zurada
The book has a theoretical emphasis, with moderate accessibility and heavy mathematics. Its best feature is its good description of ANN applications. I did not care for its all-inclusive style; for example, a chapter on neural network implementations contains a VLSI tutorial.
Comparison
I found the best introductory books that clearly explain ANN principles in a readily accessible manner to be the ones by Beale and Jackson and by Freeman and Skapura. The former has been the textbook for the graduate neural networks course at the University of Wollongong for the past two years; this year, it will be replaced by the latter book.