Computing Reviews
Today's Issue Hot Topics Search Browse Recommended My Account Log In
Review Help
Search
Visual occlusion and the interpretation of ambiguous pictures
Cooper M., Ellis Horwood, Upper Saddle River, NJ, 1992. Type: Book (9780139317675)
Date Reviewed: Sep 1 1994

The basis of this research monograph is the author’s 1987 thesis of the same title. Its focus is a technical area of computer vision now 20 years old. The book strikes a pleasing balance between survey sections and original work, and between heuristic methods and sharp theoretical analyses.

The book has two parts. The first concentrates on labeling line drawings of unknown objects. The paradigmatic example here is attaching semantic labels (convex, concave, occluding, and so on) to edges of drawings of polyhedral scenes. Cooper examines labeling a single scene, as well as comparing two views of the same object and seeking correspondences in cycles of labels. The chapter on the consistent labeling problem is especially thorough. This is the problem of assigning labels to units subject to a set of k-ary label constraints. Although the problem is NP-complete and all known algorithms require exponential time, the problem allows considerable latitude for intelligence. Cooper presents an original algorithm (based on one by Freuder) that establishes i-consistency for each i ≤ k, which is useful for incrementally eliminating illegal labelings.

The second part is more substantive and twice as long as the first part. Here the setting is a single raster-image picture produced by painting known two-dimensional shapes front-to-back. The goal is to discover which shapes or objects were painted into the image in what order. The objects are represented as fixed raster images, although they only cover a subset of all pixels. In the worst case, an exponential number of different ways to generate the same scene exist. After detailing algorithms that find all such generating lists, Cooper turns to the less demanding task of segmenting the scene and assigning each region an object label consistent with some way of generating the picture. This approach circumvents the exponential explosion inherent in the problem, as can be seen from the greedy algorithm OneParse on page 121. For each object, search for an exact pixel-by-pixel match with the picture. When one is found, subtract the object image from the picture, and repeat the process on the remaining pixels of the picture.

Cooper analyzes a generalization of this greedy labeling strategy--AllParses on page 143--and finds that it has worst-case complexity O ( m n 2 ), where n is the number of pixels and m the number of object images. Because an “object” in his analysis is at a fixed location in the image array, allowing all translations of a particular shape inflates the number of objects by a factor of n. Thus one should view the complexity as O ( n 3 ) for a fixed number of shapes. With n typically on the order of 2 18 ≈ 250,000 for a real image, one can see that even polynomial algorithms may be infeasible. It would be interesting to see experimentation with these algorithms on more than toy examples.

Cooper extends his analyses at the end of the book to three-dimensional objects and to films (sequences of images). For the former, he provides a clever 3SAT reduction establishing that detecting cyclic overlap is NP-hard. The book has an extensive bibliography of 302 items, but no name index.

Reviewer:  Joseph O’Rourke Review #: CR116329
Bookmark and Share
  Featured Reviewer  
 
Depth Cues (I.4.8 ... )
 
 
Curve, Surface, Solid, And Object Representations (I.3.5 ... )
 
 
Geometrical Problems And Computations (F.2.2 ... )
 
 
Perceptual Reasoning (I.2.10 ... )
 
 
Range Data (I.4.8 ... )
 
 
Time-Varying Imagery (I.4.8 ... )
 
  more  
Would you recommend this review?
yes
no
Other reviews under "Depth Cues": Date
High-speed range estimation based on intensity gradient analysis
Skifstad K., Springer-Verlag New York, Inc., New York, NY, 1991. Type: Book (9780287974799)
Aug 1 1992
Extracting view-dependent depth maps from a collection of images
Kang S., Szeliski R. International Journal of Computer Vision 58(2): 139-163, 2004. Type: Article
Jan 13 2005

E-Mail This Printer-Friendly
Send Your Comments
Contact Us
Reproduction in whole or in part without permission is prohibited.   Copyright 1999-2024 ThinkLoud®
Terms of Use
| Privacy Policy