Computing Reviews, the leading online review service for computing literature.

Representation and recognition in vision
Edelman S., MIT Press, Cambridge, MA, 1999. Type: Book (9780262050579)

Date Reviewed: Jul 1 1999

This book explores the visual representations harbored by sophisticated cognitive systems, such as human observers, and the way these representations support high-level visual functions in recognition and categorization. Edelman argues that, rather than capture “first-order” qualities pertaining to shapes (their geometry, for instance), representations should encode “second-order,” or relational, quantities such as similarity. Similarity is treated as a function that assigns a real number to each pair of shapes. This can be done if one adopts the notion of shape space, a metric space in which each point corresponds to a particular shape. Similarity can then be thought of as a quantity inversely related to the distance between two objects, as measured by the metric of that space. Objects with comparable shapes are mapped into the same neighborhood of the shape space.

Edelman’s basic premise is that the world of shapes is “out there” for anyone to see, and that internal states causally related to it can be maintained by a visual system (used for all kinds of practical purposes, of which object recognition is but one). Thus, the structure of a novel object is described in terms of memory records of similar structures rather than as a combination of generic primitive shapes.

Edelman’s “chorus of prototypes” employs a small number (ten) of reference objects to approximate the dimensions of the shape representation space. A module tuned to a particular shape must be insensitive to transformations of that shape, and must respond differently to different shapes. When a set of 200 measurements on a stimulus image is presented to the system, the modules tuned to shapes similar to the stimulus respond strongly, and this set of responses forms a low-dimensional representation of the stimulus. This approach is strongly motivated by results in biological vision.
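The distance-to-prototypes idea can be illustrated with a minimal sketch. It is not Edelman's implementation. The Gaussian radial basis response, the `1 / (1 + d)` similarity, the `width` value, and the random "prototypes" are all assumptions made for illustration; only the sizes (200 measurements, ten reference objects) come from the review. The sketch also ignores the requirement that modules be insensitive to transformations of their preferred shape.

```python
import math
import random

def distance(x, y):
    """Euclidean distance between two measurement vectors (assumed metric)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

def similarity(x, y):
    """Similarity inversely related to distance; the 1 / (1 + d) form is an assumption."""
    return 1.0 / (1.0 + distance(x, y))

def chorus_response(stimulus, prototypes, width):
    """Response of each prototype-tuned module, modeled here as a Gaussian RBF."""
    return [
        math.exp(-distance(stimulus, p) ** 2 / (2.0 * width ** 2))
        for p in prototypes
    ]

random.seed(0)
N_MEASUREMENTS, N_PROTOTYPES = 200, 10  # sizes quoted in the review

# Hypothetical reference objects: random points in measurement space.
prototypes = [
    [random.gauss(0.0, 1.0) for _ in range(N_MEASUREMENTS)]
    for _ in range(N_PROTOTYPES)
]

# A stimulus that is a slightly perturbed copy of prototype 3.
stimulus = [v + random.gauss(0.0, 0.1) for v in prototypes[3]]

# The ten responses are the low-dimensional representation of the stimulus.
response = chorus_response(stimulus, prototypes, width=10.0)
best = max(range(N_PROTOTYPES), key=lambda i: response[i])
print("strongest module:", best)
```

In this toy setting the 200-dimensional stimulus is summarized by ten numbers, and the module tuned to the nearest reference object responds most strongly.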
The book begins with a nice introduction to the problem of representation in vision. This is followed by a chapter on theories of representation and object recognition that first identifies the various recognition-related tasks that require representation, and then discusses various approaches to representation. The chorus-of-prototypes approach is then discussed in three chapters: one on the theory, one on implementation, and one on experiments on recognition-, categorization-, and analogy-related tasks. The following chapter relates the chorus-of-prototypes approach to mechanisms found in neurobiological studies and to results from psychophysical studies. Finally, an interesting chapter presents dialogues on representation, in which objections to the author’s approach are raised and answered. Appendices cover topics such as measurement space, representation by distances to prototypes, quasiconformal mappings, radial basis functions, vector quantization, and multidimensional scaling.

The author clearly states that his approach requires pre-segmented objects. Other shortcomings he notes in the chorus-of-prototypes implementation are the lack of tolerance to image-plane translation and scaling of the stimulus; the lack of a principled way of dealing with occlusion and interference among neighboring objects in a scene; and the lack of an explicit representation of object structure.

This thought-provoking book is well written, and its discussion of the representation problem and the strengths and weaknesses of the various approaches is strong. Excellent diagrams illustrate the somewhat abstract concepts, and a complete set of references is provided.

Reviewer: O. Firschein	Review #: CR122395 (9907-0511)

Representations, Data Structures, And Transforms (I.2.10 ... )

Biology And Genetics (J.3 ... )

Feature Measurement (I.4.7 )

Other reviews under "Representations, Data Structures, And Transforms":	Date

Model-based strategies for high-level robot vision Shneier M., Lumia R., Kent E. Computer Vision, Graphics, and Image Processing 33(3): 293-306, 1986. Type: Article	Sep 1 1986

TID--a translation invariant data structure for storing images Scott D., Iyengar S. Communications of the ACM 29(5): 418-429, 1986. Type: Article	Nov 1 1986

Computation of geometric properties from the medial axis transform in O(n log n) time Wu A., Bhaskar S., Rosenfeld A. (ed) Computer Vision, Graphics, and Image Processing 34(1): 76-92, 1986. Type: Article	Jan 1 1988

Reproduction in whole or in part without permission is prohibited. Copyright 1999-2024 ThinkLoud®