Computing Reviews
Information retrieval
Frakes W., Baeza-Yates R., Prentice-Hall, Inc., Upper Saddle River, NJ, 1992. Type: Book (9780134638379)
Date Reviewed: Jul 1 1993

The area of text analysis, search, and retrieval has taken on increasing importance in recent years, and the field is now of interest to large communities in science and in the humanities. The need for a volume covering the major information retrieval algorithms has been apparent for many years, and the authors and editors of this book ought to be congratulated for devoting much time and effort to this important area. This book consists of separate chapters by some 20 different well-qualified authors, and it covers many of the more important information retrieval algorithms, including methods of file organization, file search and access, and query processing. The book has a practical outlook, and it should be of substantial help to people interested in information retrieval applications. Some of the chapters provide a reasonable overview of the areas they cover as well as a decent bibliography.

Unhappily, as is true of many such collections of individual chapters, the overall effect appears in many ways to be less than the sum of its parts. First, the treatment of the field is noticeably uneven. Some subjects are covered only cursorily; for example, text analysis, portions of which are included in several different chapters, is not treated with any real insight.

Some other subjects are emphasized far beyond their real importance--for example, a whole chapter is devoted to PAT trees, a data structure used in searching the Oxford English Dictionary. Digital search trees could have been covered more reasonably in two or three pages. The same is true of string searching, which is not applicable to the large information files that are now prevalent, although it is expertly covered in the chapter by Baeza-Yates.

The book also unfortunately lacks glue between various related subjects, because the central agent that could have related different parts of the book is absent. The two short initial chapters that were obviously designed to provide this connection are not effective in this respect.

An editorial decision that substantially contributes to the choppiness of the material is the use of major topic subdivisions entitled “File structure,” “Term and Query Operations,” and “Document Operations.” In practice, user queries are often available in the form of natural-language statements that are not immediately distinguishable from the stored document representations. In such circumstances, the distinct treatment of queries and documents produces unfortunate conceptual problems. Thus, relevance feedback--a query modification operation that occurs at the tail-end of the processing chain--is described in chapter 11, whereas some basic text indexing operations used at the beginning of the retrieval process are treated as document operations and appear in chapter 14. This discontinuity leads Donna Harman, the author of both of these chapters, to suggest that chapter 14 be read before chapter 11.

For all these reasons, it is difficult to recommend this book to novices who may not be in a position to provide the needed context. Something more integrated, with a more compelling structured treatment of the field, might have been more useful. The current mix of some nice survey chapters, notably on signature files, string searching, relevance feedback, ranking, and clustering, with other more narrowly conceived topic treatments is difficult to manage even for people who know the field well.

When a book is written by many different people with different outlooks and approaches, a careful editing job is essential. The treatment offered here leaves a lot to be desired. First, the book has many typos. The chapter heads are not evenly treated: sometimes a full address is given for an author as part of the chapter heading; other times only a company affiliation is given. Some authors use the third person in covering their subject; others prefer the first person plural, even when only a single author is involved. Most distressing is the lack of a decent index. The book has no author index, and no index of retrieval system acronyms. A subject index is included, but many important concepts in retrieval, such as classification, dictionary, knowledge base, hypertext, multimedia, and query formulation, do not appear in the index.

Overall, this book fulfills a real need for the practitioner. It includes many nice chapters written by capable contributors. The volume could have been more useful, however, if more attention had been paid to the overall organization and if tighter editing and more careful production had prevailed.

Reviewer: Gerard Salton
Review #: CR116486
General (H.3.0)
Information Search And Retrieval (H.3.3)
 
