Computing Reviews
Needs for research in indexing
Milstead J. Journal of the American Society for Information Science 45(9): 577-582, 1994. Type: Article
Date Reviewed: Apr 1 1996

The author decries the decline, relative to 20 years ago, in research activity and interest in the construction of printed and machine-readable indexes, and in the human factors that go into building effective indexes. However, as Milstead acknowledges in her conclusion, as fully automatic text-handling methods have become more powerful, “fewer resources will be dedicated to analysis of individual documents at input, and more to the output (that is, search and retrieval) side of the equation.”

To support her judgment of the importance of indexes used by humans, the author makes statements that are not supported by current evidence. It is no longer possible to write about text retrieval, retrieval evaluation, and retrieval tools without taking into account the results and insights accumulated over the last few years by the TREC retrieval studies [1]. In light of what we have learned in the TREC environment (750,000 full-text documents in many subject areas, with queries and objective relevance assessments, and controlled retrieval evaluations for the many dozens of participating systems around the world), many of the author's statements are questionable.
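To make concrete what such controlled evaluations measure: each system's ranked output is scored against the relevance assessments, commonly by average precision. The following minimal sketch is illustrative only (it is not from the review; the function name and toy data are invented):

```python
def average_precision(ranked_ids, relevant_ids):
    """Average precision for one query: the mean of the precision values
    at each rank where a relevant document is retrieved."""
    relevant = set(relevant_ids)
    hits, precision_sum = 0, 0.0
    for rank, doc_id in enumerate(ranked_ids, start=1):
        if doc_id in relevant:
            hits += 1
            precision_sum += hits / rank
    return precision_sum / len(relevant) if relevant else 0.0

# Example: a system ranks five documents; d1 and d4 are judged relevant.
print(average_precision(["d1", "d2", "d3", "d4", "d5"], {"d1", "d4"}))
# (1/1 + 2/4) / 2 = 0.75
```

Averaging this score over all test queries yields a single figure of merit, which is what allows the dozens of participating systems to be compared objectively.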

First, she says, “It has meant that automatic or computer-aided indexing systems cannot produce indexing comparable to that produced by humans.” In fact, fully automatic methods have vastly outperformed manually controlled ones for at least 20 years. Second, she claims that “present-day decision-rule systems, as well as statistical, syntactic and semantic methods, attempt to mimic the results produced by humans.” This is not true at all: there is no attempt to mimic human beings. What human could perform an analysis, however crude, of 750,000 full-text items? Computers do that efficiently and effectively. Third, she says, “Their success [of the automated methods] is limited because of our lack of understanding of the process.” Their success is limited, but the success of human indexing is far more limited. Finally, she states, “It seems to be true that they [the automated methods] provide the best results on documents which have undergone some human analysis.” In the TREC environment, the fully automatic, statistically based methods vastly outperform all other approaches.
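For readers unsure what “fully automatic, statistically based methods” involve, the sketch below shows one representative technique, term-frequency/inverse-document-frequency weighting with a simple inner-product ranking. It is a minimal illustration, not any particular TREC system; the corpus and names are invented:

```python
import math
from collections import Counter

def tokenize(text):
    return text.lower().split()

def tfidf_index(docs):
    """Build a tf-idf weight vector per document; no human analysis involved."""
    n = len(docs)
    tokenized = [tokenize(d) for d in docs]
    # Document frequency: number of documents containing each term.
    df = Counter(term for toks in tokenized for term in set(toks))
    index = []
    for toks in tokenized:
        tf = Counter(toks)
        index.append({t: tf[t] * math.log(n / df[t]) for t in tf})
    return index

def score(query, doc_vector):
    """Inner-product similarity between the query terms and a document vector."""
    return sum(doc_vector.get(t, 0.0) for t in tokenize(query))

docs = ["automatic indexing of full text",
        "human indexing of printed indexes",
        "statistical methods for text retrieval"]
index = tfidf_index(docs)
query = "automatic text indexing"
ranking = sorted(range(len(docs)), key=lambda i: score(query, index[i]), reverse=True)
print(ranking)  # document indices in decreasing order of query similarity
```

Because every weight is computed from term counts alone, the method scales to collections of TREC size with no per-document human effort.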

I am not arguing against research in the construction of printed or other indexes. But the modern retrieval world is elsewhere now, and it will never go back to human-controlled methods in information search and retrieval.

Reviewer: Gerard Salton
Review #: CR119114 (9604-0286)
1) Harman, D. K., Ed. The first Text REtrieval Conference (TREC-1) (Rockville, MD, Nov. 4–6, 1992). NIST Special Publication 500-207, National Institute of Standards and Technology, Gaithersburg, MD, 1993.
 
Indexing Methods (H.3.1 ...)
Thesauruses (H.3.1 ...)
 
Other reviews under "Indexing Methods":
Computation of term/document discrimination values by use of the cover coefficient
Can F. (ed), Ozkarahan E. Journal of the American Society for Information Science 38(3): 171-183, 1987. Type: Article
Mar 1 1988
Automatic indexing of full texts
Jonák Z. Information Processing and Management: an International Journal 20(5-6): 619-627, 1984. Type: Article
Jul 1 1985
Evaluation of access methods to text documents in office systems
Rabitti F., Zizka J. Research and development in information retrieval (King's College, Cambridge, 1984). Type: Proceedings
Sep 1 1985
