Computing Reviews
Today's Issue Hot Topics Search Browse Recommended My Account Log In
Review Help
Search
Complete inverted files for efficient text retrieval and analysis
Blumer A., Blumer J., Haussler D., McConnell R., Ehrenfeucht A. Journal of the ACM34 (3):578-595,1987.Type:Article
Date Reviewed: Oct 1 1988

The authors provide an excellent treatment of a new data structure for text analysis, called the compact directed acyclic word graph (compact DAWG). This data structure is both smaller than and functionally superior to conventional keyword-based retrieval structures and to structures proposed in the mid-1970s such as suffix and position trees. One of the more powerful aspects of the approach treated in the paper is that searches for arbitrary strings are allowed. This is particularly important when standard keywords are undefined or inadequate (the authors suggest that searching a library of DNA sequences is just such a problem).

The flavor of the paper is theoretical. Linear (and hence optimal) times are shown for finding the longest prefix of a keyword that is present in a set of texts, counting the number of occurrences of this prefix, and finding the locations of all the occurrences. Bounds on the sizes of the data structures involved are also proven. In addition, the treatment is complete enough to show how to construct the needed data structures and implement text retrieval functions. In summary, the paper is interesting reading for anyone interested in text retrieval problems or data structures.

Reviewer:  J. Fried Review #: CR112185
Bookmark and Share
 
Graphs And Networks (E.1 ... )
 
 
Retrieval Models (H.3.3 ... )
 
Would you recommend this review?
yes
no
Other reviews under "Graphs And Networks": Date
Deformable spanners and applications
Gao J., Guibas L., Nguyen A.  Computational geometry (Proceedings of the 20th Annual Symposium on Computational Geometry, Brooklyn, New York, USA, Jun 8-11, 2004)190-199, 2004. Type: Proceedings
Sep 13 2004
Randomized fully dynamic graph algorithms with polylogarithmic time per operation
Henzinger M., King V. Journal of the ACM 46(4): 502-516, 1999. Type: Article
Mar 1 2000
Fighting against two adversaries: page migration in dynamic networks
Bienkowski M., Korzeniowski M., auf der Heide F.  Parallelism in algorithms and architectures (Proceedings of the Sixteenth Annual ACM Symposium on Parallelism in Algorithms and Architectures, Barcelona, Spain, Jun 27-30, 2004)64-73, 2004. Type: Proceedings
Aug 4 2004
more...

E-Mail This Printer-Friendly
Send Your Comments
Contact Us
Reproduction in whole or in part without permission is prohibited.   Copyright 1999-2024 ThinkLoud®
Terms of Use
| Privacy Policy