Computing Reviews
Perceptrons: expanded edition
Minsky M., Papert S., MIT Press, Cambridge, MA, 1988. Type: Book (9780262631112)
Date Reviewed: Apr 1 1990

This book is a reprint of the classic 1969 treatise on perceptrons, containing the 1972 handwritten alterations of that text. This expanded edition includes two sections written in 1988: a 9-page prologue and a 34-page epilogue. With the advent of new neural net and connectionist research [1], another look at this classic work is in order.

This book contains results on the fundamental mathematical properties of linear threshold functions, or perceptrons. Let R designate an arbitrary set of points and let Φ = {φ1, φ2, …} be a family of predicates defined on R. A function ψ is a linear threshold function with respect to Φ if there exists a number θ and a set of numbers α(φ), one for each φ in Φ, such that

ψ(X) = ⌈ ∑ α(φ) φ(X) > θ ⌉

where X ⊆ R and ⌈P⌉ = 1 if and only if predicate P is true; otherwise ⌈P⌉ = 0. Let L(Φ) denote all the functions that can be defined in this way; these, then, are all the perceptron-computable functions of Φ.
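To make the definition concrete, here is a minimal sketch in Python (my own illustration, not from the book); the set R, the single-point membership predicates, and the threshold of 1.5 are all hypothetical choices:

```python
def linear_threshold(predicates, alpha, theta):
    """Return psi with psi(X) = 1 iff sum_i alpha[i] * predicates[i](X) > theta."""
    def psi(X):
        total = sum(a * phi(X) for a, phi in zip(alpha, predicates))
        return 1 if total > theta else 0
    return psi

# Hypothetical example: R = {0, 1, 2}; each predicate tests one point of R,
# so psi computes "at least two points of X are on."
R = [0, 1, 2]
predicates = [lambda X, p=p: 1 if p in X else 0 for p in R]
psi = linear_threshold(predicates, alpha=[1, 1, 1], theta=1.5)

print(psi({0, 1}))  # 1: the weighted sum is 2 > 1.5
print(psi({2}))     # 0: the weighted sum is 1 < 1.5
```

Each predicate here depends on a single point of R, which anticipates the notion of order discussed next.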

Minsky and Papert ask the important question: what characterizes the set of patterns over R (which we normally think of as a two-dimensional retina) that perceptrons can recognize?

The work in this book answers many important facets of that question. First, the question is meaningless without restrictions on the set of predicates Φ. An example restriction studied by the authors is k-order-restricted perceptrons, in which each member of Φ must depend on no more than k points of R. The order of a threshold function ψ is defined to be the smallest integer k such that we can find a set of k-order-restricted predicates Φ with ψ ∈ L(Φ).

The book is divided into three main sections. Section 1, with four chapters, treats the basic algebraic theory of perceptrons. Section 2, of five chapters, treats their geometric theory, while the final three chapters (Section 3) treat the learning aspects of perceptrons.

The algebraic portion serves as a basic background. In it, the authors prove the group-invariance theorem, which states that any ψ which is invariant under a group of transformations of R can be represented as a sum in which the coefficients of all group-equivalent predicates have the same numeric value. This theorem is used to show (among other things) that

ψPARITY(X) = ⌈ |X| is an odd number ⌉

is of order |R|, where |X| is the cardinality of the set X.
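As an illustration (mine, not the book's argument), parity does become a threshold function once predicates of order up to |R| are admitted: take one mask predicate φS(X) = ⌈S ⊆ X⌉ for each nonempty subset S of R, with coefficient (-2)^(|S|-1); the weighted sum then collapses to 1 when |X| is odd and 0 when it is even. Note that the coefficients grow exponentially, a theme that reappears in the learning section:

```python
from itertools import combinations

def parity_threshold(R):
    """Parity via mask predicates phi_S(X) = [S subset of X] over all
    nonempty S, with coefficient (-2)**(|S| - 1); the mask S = R is what
    pushes the order up to |R|."""
    R = list(R)
    masks = [(frozenset(S), (-2) ** (k - 1))
             for k in range(1, len(R) + 1)
             for S in combinations(R, k)]
    def psi(X):
        X = frozenset(X)
        total = sum(a for S, a in masks if S <= X)  # 1 if |X| is odd, else 0
        return 1 if total > 0.5 else 0
    return psi

psi = parity_threshold(range(5))
assert all(psi(range(n)) == n % 2 for n in range(6))
```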

In the second main section, the authors begin by showing that the predicate connected is not of finite order. More generally, they show that the only topologically invariant predicates of finite order are functions of the Euler number. They then prove that a number of interesting geometric predicates (circular, rectangular, etc.) are of low order. They prove a stratification theorem that allows parallel perceptrons to simulate sequential decision making. They study another limitation of the Φ class of predicates, namely, diameter-limited perceptrons. Chapter 9 looks at computing geometric predicates using serial devices such as Turing machines.

In the section on learning, the authors begin by examining measures of the complexity of perceptron computation. They show the important fact that some predicates require coefficients that grow faster than exponentially with |R|. They then prove Rosenblatt’s perceptron convergence theorem, which states that the simple perceptron reinforcement learning scheme converges to a correct solution when such a solution exists. The authors conclude by describing other learning schemes and popular pattern recognition devices of the period.
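For concreteness, here is a sketch of that simple reinforcement scheme (my own minimal version, with a made-up separable task): whenever the current coefficients misclassify an example, the example's feature vector is added to or subtracted from them, and the convergence theorem guarantees the loop halts whenever a separating set of coefficients exists.

```python
def perceptron_train(samples, n, max_epochs=1000):
    """Rosenblatt's error-correction rule on 0/1 features and labels."""
    w, bias = [0.0] * n, 0.0
    for _ in range(max_epochs):
        mistakes = 0
        for x, label in samples:
            pred = 1 if sum(wi * xi for wi, xi in zip(w, x)) + bias > 0 else 0
            if pred != label:
                sign = 1 if label == 1 else -1   # push toward the correct side
                w = [wi + sign * xi for wi, xi in zip(w, x)]
                bias += sign
                mistakes += 1
        if mistakes == 0:
            break  # every sample classified correctly: converged
    return w, bias

# Hypothetical linearly separable task: "at least two of three inputs are on."
samples = [((a, b, c), int(a + b + c >= 2))
           for a in (0, 1) for b in (0, 1) for c in (0, 1)]
w, bias = perceptron_train(samples, n=3)
```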

To understand the importance of the work in this excellent book, one must know some of the history of the research on neural nets in the late 1950s and early 1960s. The authors give a simple overview in their last chapter. The perceptron convergence theorem was an example of the dominant paradigm, which asked questions such as: if a given predicate is in some L(Φ), what learning procedure will find coefficients to represent it? This leads to a view of recognition and learning as finding decision surfaces in n-dimensional coefficient spaces (which can lead to a complete statistical theory; see Duda and Hart [1]).

It was the authors’ insight to ask what characterizes L(Φ) in general and to use geometric predicates and the k-order perceptrons to do so. This insight shifted the field from analyzing perceptrons and their learning procedures to characterizing the classes of problems computable by perceptrons, a move that leads to results similar to today’s theory of computational complexity.

Many current neural network researchers say that Minsky and Papert’s work is not relevant to their research. In the epilogue, the authors address this claim in detail. They take as an example the work in Rumelhart et al. [2] and illustrate how their own research makes important contributions to the experimental work done there. They show that their stratification theorem is relevant to the symmetry predicate found experimentally. They make the important point that the Generalized Delta Rule (and backpropagation) is nothing more or less than hill climbing and thus has all of hill climbing’s power and problems.
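Their point is visible in the update rule itself; a minimal sketch (mine, for a single linear unit rather than a full network) shows that a delta-rule step is exactly a small step down the squared-error surface:

```python
def delta_rule_step(w, x, target, lr=0.1):
    """One delta-rule update for a linear unit: gradient descent on
    E = 0.5 * (target - w.x)**2, since dE/dw_i = -(target - out) * x_i."""
    out = sum(wi * xi for wi, xi in zip(w, x))
    error = target - out
    return [wi + lr * error * xi for wi, xi in zip(w, x)]
```

Like any hill climber, iterating such steps can stall in local minima or on plateaus, which is exactly the caveat the authors attach to backpropagation.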

Minsky and Papert’s view of the current work on neural nets is that the state of knowledge has seen little theoretical advance. Much of the current work has the flavor of the old neural network research: my net can do toy problem X in Y learning trials. It was precisely to answer the “so what” questions this raises that the work described in this book was performed.

Minsky has been quoted as saying that the problem with Perceptrons was that it was too thorough; it contained all the mathematically “easy” results. A new researcher in the field has no new theorems to prove and thus no motivation to continue using these analytical techniques. It is a challenge to neural net researchers to provide as detailed and exacting an analysis of their networks as Minsky and Papert have done for perceptrons.

The writing style of the book is pleasant. Even with its highly detailed formal mathematical results, the presentation is still informal, making it much easier to read than it would be otherwise. In this style, the authors describe in simple terms what they are trying to achieve, present the mathematical details with theorems and lemmas, then follow up with a restatement of what was achieved. Reading this book is quite different from reading mathematical texts presented in a more “formal” style.

I have some minor quibbles. The 1972 handwritten alterations are both interesting and annoying. Pages 231 and 232 are printed in reverse order. The scene analysis discussion, as the authors point out in one of the handwritten notes, is dated, as is some of the other work in chapters 12 and 13.

I recommend this book highly since, as a pure mathematical text, its style and scope are interesting in and of themselves. Further, neural net researchers should be required to read this book (along with a classic pattern recognition text, such as Duda and Hart [1]), even if they limit themselves to the prologue, the introduction, and the epilogue.

Reviewer: S. P. Smith
Review #: CR113147
1) Duda, R. and Hart, P. Pattern classification and scene analysis. Wiley, New York, 1973. See CR 14, 10 (Nov. 1973), Rev. 25,928.
2) Rumelhart, D.; McClelland, J.; et al. Parallel distributed processing. MIT Press, Cambridge, MA, 1986.
Categories: General (I.5.0); Interconnection Architectures (C.1.2); Knowledge Acquisition (I.2.6); Parallel Algorithms (G.1.0); Predicate Logic (I.2.4); Unbounded-Action Devices (F.1.1)
Other reviews under "General":
Recognizing unexpected objects: a proposed approach. Rosenfeld A. (ed), International Journal of Pattern Recognition and Artificial Intelligence 1(1): 71-84, 1987. Type: Article. Reviewed: Jun 1 1988.
Pattern recognition: human and mechanical. Watanabe S., John Wiley & Sons, Inc., New York, NY, 1985. Type: Book (9780471808152). Reviewed: Mar 1 1986.
Sourcebook of automatic identification and data collection. Adams R., Van Nostrand Reinhold Co., New York, NY, 1990. Type: Book (9780442318505). Reviewed: Jul 1 1991.
