The authors provide an excellent treatment of a new data structure for text analysis, called the compact directed acyclic word graph (compact DAWG). This data structure is both smaller than and functionally superior to conventional keyword-based retrieval structures and to structures proposed in the mid-1970s such as suffix and position trees. One of the more powerful aspects of the approach treated in the paper is that searches for arbitrary strings are allowed. This is particularly important when standard keywords are undefined or inadequate (the authors suggest that searching a library of DNA sequences is just such a problem).
The flavor of the paper is theoretical. Linear (and hence optimal) times are shown for finding the longest prefix of a keyword that is present in a set of texts, counting the number of occurrences of this prefix, and finding the locations of all the occurrences. Bounds on the sizes of the data structures involved are also proven. In addition, the treatment is complete enough to show how to construct the needed data structures and implement text retrieval functions. In summary, the paper is interesting reading for anyone interested in text retrieval problems or data structures.