Facilities for storing tremendous amounts of information pose challenging problems with respect to their effective utilization. Vast amounts of data need to be searched for matching patterns or for collections of keywords (traditional information retrieval). Certain ambiguities in formulating these relatively simple requests can be dependably resolved by various lexicon manipulations. In more complicated circumstances, a retrieval criterion cannot be formulated exactly. Several techniques are known for dealing with such vague searches, including query relaxation, semantic enhancements, and statistical analysis.
This paper addresses another intricacy typical of the proliferation of big data, as “the user may not know how to [represent the searching] specifications of the items of interest, but does know one of [the relevant elements] expected ... in the result set.” The paper suggests “ways to infer the result set using the known item as a seed.” So, “the user ‘query’ works as an example of what the elements of interest are.” As a result, the search is driven not by an actual query but by an exemplar query. This approach is useful when a curious person needs to study an unusual topic that this person “may not be familiar with, but has as a starting point” some related element.
Characteristically, evaluating an exemplar query involves the NP-hard problem of subgraph isomorphism. Yet, in practice, extracting knowledge from big data should not depend on computationally intensive procedures.
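To make the cost concrete, the following is a minimal sketch of an exemplar-style search by brute force. It is not the paper's algorithm. It assumes the data is a small directed graph with labeled edges, and it treats the user's known item as a sample subgraph whose node identities are ignored and whose edge labels must be preserved. The function name `find_matches` and the toy data are hypothetical, chosen only for illustration.

```python
from itertools import permutations

def find_matches(graph_edges, sample_edges):
    """Return every injective node mapping that sends each sample edge
    onto a graph edge carrying the same label (non-induced matching)."""
    graph = set(graph_edges)
    sample_nodes = sorted({n for s, _, t in sample_edges for n in (s, t)})
    graph_nodes = sorted({n for s, _, t in graph_edges for n in (s, t)})
    matches = []
    # Enumerate every injective assignment of sample nodes to graph nodes.
    # The number of candidates grows combinatorially with graph size.
    for image in permutations(graph_nodes, len(sample_nodes)):
        mapping = dict(zip(sample_nodes, image))
        if all((mapping[s], label, mapping[t]) in graph
               for s, label, t in sample_edges):
            matches.append(mapping)
    return matches

# Hypothetical toy data: (source, edge label, target) triples.
graph_edges = [
    ("Google", "acquired", "YouTube"),
    ("Yahoo", "acquired", "Tumblr"),
    ("Google", "founded_in", "California"),
]
# The exemplar: one item the user already knows belongs in the result set.
sample_edges = [("Google", "acquired", "YouTube")]

matches = find_matches(graph_edges, sample_edges)
for m in matches:
    print(m)
```

Besides the exemplar itself, the search also returns the structurally similar Yahoo and Tumblr pair, which is the kind of inferred result the exemplar is meant to elicit. The exhaustive enumeration is what makes this approach impractical on large graphs.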