Because Nature proceeds economically (in accordance with Fermat’s minimization principle), it is fruitful to examine biological solutions to a problem before creating new technologies. The authors’ robot vision is praiseworthy, even if their comprehension of the biological world is incomplete.
Following an introduction, Section 2 of the paper seeks to characterize multidimensional perception (MDP), which generalizes simple one-dimensional perception and is formally defined as perception using more than one sensor. This definition may be the first error in the analysis of MDP. According to Moles’s classification [1], sound is one-dimensional (the dimension is time), but the recognition of auditory patterns requires the identification of at least three characteristics: rhythm (or intensity), melody (or pitch variation), and harmony (chords or timbre). Is acoustic perception one- or three-dimensional? According to the definition used in this paper, the single sensor makes it one-dimensional, but hearing is clearly an MDP process.
Another consequence of the authors’ definition that leads to incorrect conclusions is considering the sensory space for color vision as intrinsically three-dimensional because a single physical property (the electromagnetic wave) is sensed by three detectors (for red, green, and blue). RGB perception can certainly identify emission colors situated in the classical triangle of colors (as defined by the International Commission on Illumination), but the identification of reflected colors (such as white and black, or two shades of gray) requires additional information. Moreover, according to classical treatises on color vision (see, e.g., Judd [2]), the color space is not properly metric or well-connected.
The authors refer to MDP’s method of pattern recognition as class connectivity analysis (CCA), where a class is defined as a set of maximally connected values. One of the authors, G. Beni, has developed connectivity analysis in previous papers written with J. Wang (see, e.g., [3]); this subject is treated superficially in Section 3 of this paper.
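The paper’s CCA procedure is not reproduced here, but the underlying definition of a class as a set of maximally connected values can be sketched in one dimension. The following is a minimal illustration under my own assumptions (integer-valued sensor readings, with two values considered connected when they differ by at most a fixed step); it is not the authors’ algorithm.

```python
def connected_classes(values, step=1):
    """Group sensor values into maximally connected classes.

    Assumption (not from the paper): two values are 'connected'
    when they differ by at most `step`; a class is a maximal
    run of pairwise-connected values.
    """
    classes = []
    for v in sorted(set(values)):
        if classes and v - classes[-1][-1] <= step:
            classes[-1].append(v)  # v extends the current class
        else:
            classes.append([v])    # gap found: start a new class
    return classes

# Example: three maximally connected classes emerge.
print(connected_classes([1, 2, 3, 7, 8, 12]))  # [[1, 2, 3], [7, 8], [12]]
```

In higher-dimensional sensory spaces the same idea would require a neighborhood relation on tuples of sensor values rather than a scalar gap test.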
The core of the paper is Section 4, “Application to Human Perception.” After citing some theories of color perception and mentioning some systems for their representation (but not the most common ones, such as those of Munsell, Ostwald, and DIN), the authors describe two interesting experiments, which they call “Flowers in the meadow” and “Snake in [the] grass.” In the first experiment, a central shell with a radius of 10 or 11 (in a color vision space where the maximum possible radius would be 32) and a single pixel are superposed: if the pixel is outside the shell (e.g., it is red), it is detected immediately, while if it is within the shell (e.g., it is gray), the eye cannot find it. The second experiment asks a subject to identify a ‘snake’ 4:5 pattern superposed on 10:11 to 14:15 ‘grass’; the snake pattern is nearly undetectable. But the conclusion, that the human eye is capable of only one-dimensional analysis, is questionable.

In the final section, “Applications to Robot Perception,” it is pretentious to suppose that a robot could see better than humans. Certainly, a microscope or a telescope can “see” more details than the human eye, but we cannot conclude that those instruments are better than the biological eye and brain, as these true marvels are not yet completely understood.