It is strange to find the terms “quantum theory” and “information retrieval” together in a sentence about processing documents. In this paper, the authors, inspired by quantum theory, use Hilbert space properties to describe how documents can be characterized and how the documents that satisfy a search criterion can be identified. It is not a report on quantum computing.
A Hilbert space generalizes Euclidean space to abstract vector spaces of any number of dimensions, finite or infinite, while retaining the notions of the length of a vector and the angle between vectors. Hilbert spaces are fundamental to quantum theory, Fourier analysis, and other areas of applied mathematics, physics, and theoretical chemistry. This paper borrows some concepts and techniques from quantum theory, including density distribution, superposition, and expectation values, and applies them to the problem of identifying documents that are suitable matches to a query.
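To make the borrowed vocabulary concrete, here is a minimal sketch in plain Python. It is not the authors' model. It assumes a hypothetical three-term vocabulary, represents queries and documents as unit vectors, reads the "density" as a weighted mixture of outer products (a density matrix, which is an assumption on my part), and scores each document by the expectation value of that mixture. All names and numbers are illustrative.

```python
import math

def normalize(v):
    """Scale a vector to unit length (a 'state' in the quantum analogy)."""
    norm = math.sqrt(sum(x * x for x in v))
    if norm == 0:
        raise ValueError("cannot normalize the zero vector")
    return [x / norm for x in v]

def inner(u, v):
    """Inner product; for unit vectors this is the cosine of the angle."""
    return sum(a * b for a, b in zip(u, v))

def superpose(u, v, a=1.0, b=1.0):
    """Weighted sum of two vectors, renormalized to unit length."""
    return normalize([a * x + b * y for x, y in zip(u, v)])

def density(states, weights):
    """Mixture sum_i w_i |s_i><s_i|, stored as a nested list."""
    n = len(states[0])
    rho = [[0.0] * n for _ in range(n)]
    for s, w in zip(states, weights):
        for i in range(n):
            for j in range(n):
                rho[i][j] += w * s[i] * s[j]
    return rho

def expectation(rho, d):
    """<d|rho|d>, which equals trace(rho |d><d|) for a unit vector d."""
    n = len(d)
    return sum(d[i] * rho[i][j] * d[j] for i in range(n) for j in range(n))

# Hypothetical vocabulary (an assumption): ["quantum", "retrieval", "document"]
q1 = normalize([1, 0, 0])     # query aspect "quantum"
q2 = normalize([0, 1, 0])     # query aspect "retrieval"
doc_a = normalize([1, 1, 0])  # mentions both query terms
doc_b = normalize([0, 0, 1])  # mentions neither

# Treat the query as an equal mixture of its two aspects.
rho = density([q1, q2], [0.5, 0.5])
scores = {"doc_a": expectation(rho, doc_a), "doc_b": expectation(rho, doc_b)}
ranking = sorted(scores, key=scores.get, reverse=True)
```

In this toy setting, `doc_a` ranks above `doc_b` because it overlaps with both query aspects, and `superpose(q1, q2)` points in the same direction as `doc_a`. Everything here is ordinary vector and matrix algebra.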
Using quantum theory as an analogy is an intriguing idea. As the authors point out, related models based on the vectorization of images and the associated matrix algebra are successfully used to identify people and fingerprints. The important feature of their approach is the vector algebra on the Hilbert space. There is no real physics here.
The paper deals with the genuine problem of finding documents that satisfy a query. Within a limited glossary of terms, many different approaches may work. However, the users who form the query and the authors of the documents may use different but equivalent words or phrases to describe the same phenomenon or concept, so that the query never hits the specific word or phrase that would trigger a match. The authors call these issues “diversity” and “novelty,” and state that they will continue to study them, since this is where the approach may have the most value. Applying the Hilbert space/vector-matrix algebra approach to the discovery of synonyms and to the building of thesauri could be the most valuable application of their model.
The ideas in this paper may be of value to ontology engineers who develop the synonyms, lexicons, metadata, and thesauri required for the robust behavior of their products.