Computing Reviews
Utilizing inter-passage and inter-document similarities for reranking search results
Krikon E., Kurland O., Bendersky M. ACM Transactions on Information Systems 29(1): 1-28, 2010. Type: Article
Date Reviewed: May 11 2011

Search engine users usually examine only the top ten results. In this paper, the authors present a language-model-based approach that re-ranks the top 50 search results in order to improve precision at the top five and top ten rank positions. The study exploits inter-passage and inter-document similarities. It is motivated by the observation that a long or heterogeneous relevant document may contain parts (passages) that are not relevant to the query. Earlier passage-based studies addressed this issue, but did not consider similarity relationships between documents or between passages. The proposed model integrates document-query, passage-query, inter-document, and inter-passage similarities.
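To make the idea concrete, here is a minimal re-ranking sketch. It is not the authors' actual language-model formulation; it is a hypothetical cosine-similarity version that illustrates the two ingredients the paper combines: a query-match score interpolating document-level and best-passage evidence, smoothed by inter-document similarities so that a document resembling other high-scoring documents is promoted. The function names, weights `lam` and `mu`, and the use of cosine similarity are all assumptions made for illustration.

```python
import math

def cosine(u, v):
    # Cosine similarity between two sparse term-weight vectors (dicts).
    dot = sum(w * v[t] for t, w in u.items() if t in v)
    nu = math.sqrt(sum(x * x for x in u.values()))
    nv = math.sqrt(sum(x * x for x in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0

def rerank(query_vec, docs, lam=0.5, mu=0.3):
    # docs: list of (doc_id, doc_vec, passage_vecs) for an initial top-50 list.
    # Step 1: interpolate document-query and best-passage-query similarity,
    # so a heterogeneous document is credited for its most relevant passage.
    base = {}
    for doc_id, doc_vec, passages in docs:
        p_best = max((cosine(p, query_vec) for p in passages), default=0.0)
        base[doc_id] = (1 - lam) * cosine(doc_vec, query_vec) + lam * p_best
    # Step 2: smooth each score using inter-document similarities --
    # documents similar to other high-scoring documents are boosted.
    final = {}
    for doc_id, doc_vec, _ in docs:
        neighbors = sum(cosine(doc_vec, other_vec) * base[other_id]
                        for other_id, other_vec, _ in docs
                        if other_id != doc_id)
        final[doc_id] = (1 - mu) * base[doc_id] + mu * neighbors
    return sorted(final, key=final.get, reverse=True)
```

In this toy form, a document whose best passage matches the query well is ranked high even if the document as a whole is diffuse, which is exactly the situation with long, heterogeneous documents that the paper targets.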

In the experiments that this paper discusses, the authors use five different Text Retrieval Conference (TREC) test collections. The collections vary in the number of documents they contain, the nature of the documents (such as news or Web documents), and the length of the documents. Such variety is advantageous for assessing the effectiveness of a model under different conditions. In this detailed study, the authors show that, in several cases, their model outperforms many other methods to a statistically significant degree.

The TREC information retrieval test collections are an excellent research tool: they provide a lab environment where researchers can repeat experiments and compare the results of different studies. While reading this paper, and while conducting similar activities in my own research, a few questions came to mind: What would happen if we applied this experiment in real life? Would users actually notice or appreciate a difference that is statistically significant on the test collection (such as the top-five precision improvement from 33.9 to 37.1 shown in Table 1 for the TREC WT10G collection)?

Incidentally, WT10G is the most challenging test collection used in the experiments; the paper reports considerably larger improvements for some of the other collections.

While it would be beneficial to add an actual user dimension to the experiments, and eventually to the test collections, doing so is easier said than done.

Reviewer: F. Can
Review #: CR139049 (1111-1199)
Retrieval Models (H.3.3 ... )
Other reviews under "Retrieval Models":
Evaluation of an inference network-based retrieval model
Turtle H., Croft W. (ed) ACM Transactions on Information Systems 9(3): 187-222, 1991. Type: Article
May 1 1993
On a model of distributed information retrieval systems based on thesauri
Mazur Z. Information Processing and Management: an International Journal 20(4): 499-505, 1984. Type: Article
Sep 1 1985
Information processing in linear vector space
Kunz M. Information Processing and Management: an International Journal 20(4): 519-525, 1984. Type: Article
Mar 1 1985

Reproduction in whole or in part without permission is prohibited. Copyright 1999-2024 ThinkLoud®