Computing Reviews
Today's Issue Hot Topics Search Browse Recommended My Account Log In
Review Help
Search
On the exact complexity of string matching
Galil Z., Giancarlo R. SIAM Journal on Computing21 (3):407-437,1992.Type:Article
Date Reviewed: Mar 1 1993

A new online O ( n ) algorithm for identifying all the occurrences of a given pattern p = p [ 1 , m ] in a given text string t = t [ 1 , n ] is presented. The new algorithm modifies an algorithm due to Colussi [1] and requires at most ( 4 n - m ) &slash; 3 comparisons between characters of p and characters of t; this total does not include the comparisons between characters of p that are required during the preprocessing of p (computation of the periodic decomposition). A companion paper[2] has already established that, in the case m = 3, the identification of all occurrences of p by an online algorithm requires at least ( 4 n - m ) &slash; 3 comparisons; thus there are grounds for conjecturing that in this sense the new algorithm may be optimal.

The new algorithm is derived from a detailed and complex analysis of Colussi’s algorithm. The main idea of  Colussi’s  algorithm is to process the “noholes” (positions of interrupted periodicity) of the pattern from left to right, and then, only if all noholes match corresponding positions in the text, to compare the remaining positions (“holes”) from right to left. Colussi’s algorithm thus combines elements of the Knuth-Morris-Pratt and Boyer-Moore algorithms. The result is an algorithm in which certain positions of the text may not be tested at all. Based on their analysis, the authors show that Colussi’s algorithm can be improved by handling certain shifts of the pattern in a more efficient way. Both Colussi’s algorithm and the new algorithm operate on patterns that are assumed not to be strongly periodic; the authors show, however, that any such algorithm necessarily gives rise to another O ( n ) online algorithm that matches strongly periodic patterns using no more character comparisons between pattern and text. The authors claim that Colussi’s algorithm (and thus presumably also the new algorithm) performs well in practice; it would be interesting to test these algorithms against other recently reported fast pattern-matching algorithms, such as Wu and Manber’s [3].

This paper is important and interesting, as much for its methodology as its result. It is not for the fainthearted, however: the main result (Theorem 3) is stated after 30 pages of complex exposition including 20 lemmas and 12 “facts.” Not surprisingly, therefore, it is difficult to read--I found that half an hour per page was the best I could do. Although it is generally well written, the paper contains a few minor inaccuracies to distract the reader; it would have been improved, and its reading time reduced, by the use of illustrative examples throughout (it contains none) and a more generous use of drawings and figures.

Reviewer:  W. F. Smyth Review #: CR116579
1) Colussi, L. Correctness and efficiency of string matching algorithms. Inf. Comput. 95, 2 (Dec. 1991), 225–251.
2) Galil, Z. and Giancarlo, R. On the exact complexity of string matching: lower bounds. SIAM J. Comput. 20, 6 (1991), 1008–1020.
3) Wu, S. and Manber, U. Fast text searching allowing errors. Commun. ACM 35, 10 (Oct. 1992), 83–91.
Bookmark and Share
 
Data Types And Structures (D.3.3 ... )
 
 
Analysis Of Algorithms (I.1.2 ... )
 
 
Pattern Matching (F.2.2 ... )
 
Would you recommend this review?
yes
no
Other reviews under "Data Types And Structures": Date
Advances in database programming languages
Bancilhon F. (ed), Buneman P., ACM Press, New York, NY, 1990. Type: Book (9780201502572)
Aug 1 1991
Pascal and beyond
Fisher S., Reges S., John Wiley & Sons, Inc., New York, NY, 1992. Type: Book (9780471502616)
Sep 1 1992
User-defined types in a polymorphic language.
Harland D. The Computer Journal 27(1): 47-56, 1984. Type: Article
Mar 1 1985
more...

E-Mail This Printer-Friendly
Send Your Comments
Contact Us
Reproduction in whole or in part without permission is prohibited.   Copyright 1999-2024 ThinkLoud®
Terms of Use
| Privacy Policy