Computing Reviews
Towards effective strategies for monolingual and bilingual information retrieval: lessons learned from NTCIR-4
Qu Y., Hull D., Grefenstette G., Evans D., Ishikawa M., Nara S., Ueda T., Noda D., Arita K., Funakoshi Y., Matsuda H. ACM Transactions on Asian Language Information Processing 4(2): 78-110, 2005. Type: Article
Date Reviewed: Feb 24 2006

This substantial paper will be very useful for researchers working in automated information retrieval (IR), but not for a general audience. It describes, in great detail, techniques for both monolingual IR in English and Japanese, and Japanese-English cross-language IR (Japanese queries, English documents).

The paper reports on retrieval experiments in the context of NTCIR-4, a Japanese retrieval testing program run by the National Institute of Informatics (NII), much like the Text Retrieval Conference (TREC) run by the National Institute of Standards and Technology (NIST) in the US. It describes retrieval systems developed in a collaboration between Justsystem Corporation (JSC) and Clairvoyance Corporation (CC).

The system uses natural language processing (NLP) techniques, including noun-phrase detection, with language-specific extensions, and rich translation resources. It explores issues of noun-phrase weighting, translation weighting, pseudo-relevance feedback, and term-weight merging. The experiments are carefully set up, exploring the interactions of variables through analysis of variance (ANOVA) and reporting statistical significance. A particularly welcome feature is error analysis that uses a typology of errors to gain insight into the contribution of various system components to the end result. The results are presented in many tables.

The system, testing procedures, and results are all well explained. There are no earth-shattering results here, but that is true for most papers reporting on IR experiments. There are too many variables influencing retrieval performance; results are often specific to a given context, and grand generalizations are hard to come by. What sets this paper apart is the clear framework used for testing various configurations of system components, and the carefully worked out testing methodology, especially the typology of errors for the failure analysis.

Reviewer: D. Soergel
Review #: CR132483 (0611-1163)
 
Linguistic Processing (H.3.1 ... )
Relevance Feedback (H.3.3 ... )
Text Analysis (I.2.7 ... )
Information Search And Retrieval (H.3.3 )
 
