Computing Reviews

Information retrieval architecture and algorithms
Kowalski G., Springer-Verlag New York, Inc., New York, NY, 2010. 308 pp. Type: Book
Date Reviewed: 06/20/11

Kowalski’s textbook is for advanced undergraduate and first-year graduate courses on information retrieval (IR) systems.

The book’s title accurately captures its theme: an introduction to the architecture and algorithms necessary to build an effective IR system. It contains nine chapters.

Chapter 1 starts with a discussion of the functions that an IR system should provide and the major components an effective IR system should contain. The chapter concludes with a comparison and contrast of database and digital library systems. The second chapter presents data structures and algorithms commonly used in a typical IR system, such as inverted file structures and n-gram data structures, as well as Shannon’s theory of information, hidden Markov models, neural networks, and support vector machines. Chapters 3 and 4 are concerned with how to prepare raw data for search, including such topics as item normalization, stemming, and indexing for both plaintext and multimedia data. Chapter 5 introduces search and discusses topics such as similarity measures, ranking, relevance feedback, and multimedia search. Chapter 6 brings readers into the world of clustering, including the concept of a cluster, the hierarchical formation of clusters, and the measurement of clusters. Information presentation (how to effectively present search results to users) is discussed in chapter 7. Chapter 8 covers general search system architecture, using the Google search engine as a specific example. Finally, chapter 9 discusses information system evaluation and presents the Text Retrieval Conference (TREC) as an example.

Kowalski balances the theoretical and practical aspects very well. The content flow is natural and easy to read. Compared to some of the other books on the market, this book doesn’t tilt toward any extreme in the IR and computer science spectrum. It doesn’t concentrate on detailed implementation issues such as programming or hardware systems, nor does it present any extra content about IR theory that does not typically appear in a search/retrieval system. Kowalski focuses on the essence of IR system architecture, data structures, and algorithms; this focus makes the book suitable for a wider audience.

It is a pleasure to read this book because it is very well written. However, it contains dense text in many places and few graphical presentations; as a result, it appears to be a more difficult read than it actually is. All readers will quickly find that the language is very accurate and informative, and not dry at all. This is an excellent book on the subject.

Reviewer: Xiannong Meng
Review #: CR139168 (1201-0033)
