Web search engines, such as Google and Yahoo, use information retrieval (IR) systems to answer users’ requests. Such systems help reduce information overload in large databases.
Evaluating an IR system consists of determining how well it answers users’ requests with respect to their information needs. One widely employed evaluation strategy is the Cranfield paradigm, in which two different retrieval strategies are run on the same collection, scored using appropriate metrics, and statistically compared with each other.
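As a minimal illustration of this paradigm (not the authors' own protocol, and with invented query and document identifiers), one can score two runs per query with a simple metric such as precision@k and then compare the paired per-query scores:

```python
from math import sqrt
from statistics import mean, stdev

def precision_at_k(ranked, relevant, k=10):
    """Fraction of the top-k retrieved documents that are relevant."""
    return sum(1 for d in ranked[:k] if d in relevant) / k

def paired_t(xs, ys):
    """Paired t statistic over per-query score differences."""
    diffs = [x - y for x, y in zip(xs, ys)]
    return mean(diffs) / (stdev(diffs) / sqrt(len(diffs)))

# Hypothetical judgments and rankings from two systems.
relevant = {"q1": {"d1", "d3"}, "q2": {"d2"}}
run_a = {"q1": ["d1", "d3", "d5"], "q2": ["d2", "d4", "d6"]}
run_b = {"q1": ["d5", "d1", "d7"], "q2": ["d4", "d6", "d2"]}

scores_a = [precision_at_k(run_a[q], relevant[q], k=3) for q in relevant]
scores_b = [precision_at_k(run_b[q], relevant[q], k=3) for q in relevant]
t = paired_t(scores_a, scores_b)  # compare the two strategies statistically
```

In practice, evaluation campaigns use richer metrics (e.g., average precision, nDCG) and established significance tests, but the three-step structure (run, score, compare) is the same.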
Standard IR evaluation methodologies, based on pooling and crowdsourcing, are significantly limited: only a subset of the documents is judged, namely the top documents retrieved by the participating IR systems.
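The pooling step can be sketched as follows (a simplified illustration with hypothetical document identifiers, not the paper's algorithm): the union of the top-k documents from each run forms the pool, and only pooled documents are sent to assessors.

```python
def build_pool(runs, k=2):
    """Depth-k pool: the union of the top-k documents from each run."""
    pool = set()
    for ranked in runs:
        pool.update(ranked[:k])
    return pool

# Two hypothetical system rankings for one query.
runs = [["d1", "d2", "d3", "d4"], ["d3", "d5", "d1", "d6"]]
pool = build_pool(runs, k=2)  # only these documents are judged

# Documents outside the pool (here d4 and d6) remain unjudged and are
# usually treated as non-relevant, which disadvantages systems that did
# not contribute to the pool when the collection was created.
```

This is the source of the bias the reviewed paper targets: a new system evaluated later may retrieve relevant documents that never entered the pool.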
In this paper, the authors present a new methodology, called pooling-based continuous evaluation, that overcomes the limitations of standard IR system evaluations. They describe these limitations as follows:
- “the difficulty in gathering comprehensive relevance judgments for long runs,” and
- “the unfair bias towards systems that are evaluated as part of the original evaluation campaign (that is, when the collection is created).”
With this work, the authors make a substantial contribution to improving IR system evaluation, particularly the accuracy with which systems are assessed. The paper is well written and well structured: the authors judiciously explain the proposed methodology, provide well-described algorithms, and lay out the limitations of current IR evaluation methodologies. I recommend this paper to students and researchers working particularly in data mining and document processing.