Computing Reviews
Utilizing inter-passage and inter-document similarities for reranking search results
Krikon E., Kurland O., Bendersky M. ACM Transactions on Information Systems 29(1): 1-28, 2010. Type: Article
Date Reviewed: May 11 2011

Search engine users usually examine only the top ten results. In this paper, the authors present a language-model-based approach that re-ranks the top 50 search results in order to improve precision at the top five and top ten rank positions. The study exploits inter-passage and inter-document similarities. It is motivated by the observation that a long or heterogeneous relevant document may contain parts (passages) that are not relevant to the query. Earlier passage-based studies addressed this issue, but did not consider similarity relationships between documents or between passages. The proposed model integrates document-query, passage-query, inter-document, and inter-passage similarities.
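To make the idea concrete, here is a minimal re-ranking sketch. It is not the authors' actual language-model formulation; it is a hypothetical cosine-similarity version that illustrates the two ingredients the paper combines: a query-match score interpolating document-level and best-passage evidence, smoothed by inter-document similarities so that a document resembling other high-scoring documents is promoted. The function names, weights `lam` and `mu`, and the use of cosine similarity are all assumptions made for illustration.

```python
import math

def cosine(u, v):
    # Cosine similarity between two sparse term-weight vectors (dicts).
    dot = sum(w * v[t] for t, w in u.items() if t in v)
    nu = math.sqrt(sum(x * x for x in u.values()))
    nv = math.sqrt(sum(x * x for x in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0

def rerank(query_vec, docs, lam=0.5, mu=0.3):
    # docs: list of (doc_id, doc_vec, passage_vecs) for an initial top-50 list.
    # Step 1: interpolate document-query and best-passage-query similarity,
    # so a heterogeneous document is credited for its most relevant passage.
    base = {}
    for doc_id, doc_vec, passages in docs:
        p_best = max((cosine(p, query_vec) for p in passages), default=0.0)
        base[doc_id] = (1 - lam) * cosine(doc_vec, query_vec) + lam * p_best
    # Step 2: smooth each score using inter-document similarities --
    # documents similar to other high-scoring documents are boosted.
    final = {}
    for doc_id, doc_vec, _ in docs:
        neighbors = sum(cosine(doc_vec, other_vec) * base[other_id]
                        for other_id, other_vec, _ in docs
                        if other_id != doc_id)
        final[doc_id] = (1 - mu) * base[doc_id] + mu * neighbors
    return sorted(final, key=final.get, reverse=True)
```

In this toy form, a document whose best passage matches the query well is ranked high even if the document as a whole is diffuse, which is exactly the situation with long, heterogeneous documents that the paper targets.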

In the experiments that this paper discusses, the authors use five different Text Retrieval Conference (TREC) test collections. The collections vary in the number of documents they contain, the nature of the documents (such as news or Web documents), and the length of the documents. Such variety is advantageous for assessing the effectiveness of a model under different conditions. In this detailed study, the authors show that, in several cases, their model outperforms many other methods to a statistically significant degree.

The TREC information retrieval test collections are an excellent research tool: they provide a lab environment where researchers can repeat experiments and compare the results of different studies. While reading this paper, and while conducting similar activities in my own research, a few questions came to mind: What would happen if we applied this experiment in real life? Would users actually notice or appreciate a difference that is statistically significant on the test collection (such as the top-five precision improvement from 33.9 to 37.1 shown in Table 1 for the TREC WT10G collection)?

Incidentally, WT10G is the most challenging test collection used in the experiments; the paper reports considerably larger improvements for some of the other collections.

While it would be beneficial to add an actual user dimension to the experiments, and eventually to the test collections, doing so is easier said than done.

Reviewer: F. Can
Review #: CR139049 (1111-1199)
Retrieval Models (H.3.3 ... )
Other reviews under "Retrieval Models":
Evaluation of an inference network-based retrieval model
Turtle H., Croft W. (ed) ACM Transactions on Information Systems 9(3): 187-222, 1991. Type: Article
May 1 1993
On a model of distributed information retrieval systems based on thesauri
Mazur Z. Information Processing and Management: an International Journal 20(4): 499-505, 1984. Type: Article
Sep 1 1985
Information processing in linear vector space
Kunz M. Information Processing and Management: an International Journal 20(4): 519-525, 1984. Type: Article
Mar 1 1985

Reproduction in whole or in part without permission is prohibited. Copyright 1999-2024 ThinkLoud®