Computing Reviews

Learning semantic representations of objects and their parts
Mesnil G., Bordes A., Weston J., Chechik G., Bengio Y. Machine Learning 94(2): 281-301, 2014. Type: Article
Date Reviewed: 07/03/14

Digital images are everywhere. To make them searchable, we laboriously annotate each one with a list of its contents. Ideally, a program would do the annotation for us, but this remains an unsolved problem in image analysis. This paper considers a slightly different problem: extracting parts-owner relationships from an image. The authors' algorithm can annotate an image and its subregions simultaneously as an object (for example, a car) and its parts (for example, a wheel, headlights, or a windshield). Because parts-owner relationships provide the semantics of the image, the work may be a step toward an automated image understanding system.

Two databases, WordNet and ImageNet, provide a large number of parts-owner relationships in words and images, respectively, and can be used to train the system. Incorporating the parts-owner relationships may improve annotation accuracy, since the presence of an object suggests the presence of its parts, and vice versa. These are the motivations behind the work.
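
The intuition that an object and its parts reinforce each other can be illustrated with a toy sketch. It is not the method of the paper, which learns its representations from ImageNet and WordNet; here the scores, the PART_OF table, the annotate function, and the bonus weight are all hypothetical and hand-written for illustration.

```python
# Hypothetical part -> owner table; in the paper such relations come from WordNet.
PART_OF = {
    "wheel": "car",
    "headlight": "car",
    "windshield": "car",
    "petal": "flower",
}


def annotate(object_scores, part_scores, part_of=PART_OF, bonus=0.5):
    """Return the (object, part) pair with the highest combined score.

    A pair receives `bonus` when `part_of` lists the part as belonging
    to the object, so a known relationship can outweigh a slightly
    higher but inconsistent label.
    """
    best = None
    for obj, s_obj in object_scores.items():
        for part, s_part in part_scores.items():
            consistent = part_of.get(part) == obj
            score = s_obj + s_part + (bonus if consistent else 0.0)
            if best is None or score > best[0]:
                best = (score, obj, part)
    return best[1], best[2]


# Made-up classifier scores for a whole image and for one subregion.
object_scores = {"car": 0.6, "flower": 0.4}
part_scores = {"petal": 0.55, "wheel": 0.5}

print(annotate(object_scores, part_scores))             # relation used
print(annotate(object_scores, part_scores, bonus=0.0))  # relation ignored
```

With the relation ignored, the subregion takes the label with the highest raw score, even though a petal does not belong to a car. With the bonus, the consistent pair wins.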

Training such a system is tricky. Since state-of-the-art object recognition is still not good enough, the algorithm bypasses the visual association of objects and parts altogether. Instead, it learns associations between images and labels provided by ImageNet, as well as parts-owner relationships in labels provided by WordNet. This approach appears clever, but highlights a general issue in artificial intelligence: we often settle for a compromise due to the lack of a reliable artificial vision system.

This paper may not be an easy read, as it uses algorithms from other works with little explanation. There are some careless errors in the references and notation. Nevertheless, it is entertaining.

Reviewer:  T. Kubota Review #: CR142470 (1410-0887)
