Computing Reviews
Today's Issue Hot Topics Search Browse Recommended My Account Log In
Review Help
Search
Algorithms on strings
Crochemore M., Hancart C., Lecroq T., Cambridge University Press, New York, NY, 2007. 392 pp. Type: Book (9780521848992)
Date Reviewed: Apr 8 2008

The manipulation of strings of characters is an important activity in the sciences. This book is an updated English translation of a book originally published in French in 2001. It is meant for master’s-level lectures on string processing and pattern matching in computer science and software engineering courses. The book may also be used as a reference for computational linguistic and computational biology students. Questions related to the automatic processing of natural language, the analysis of molecular sequences, and the management of textual databases are presented. Particulars about algorithms, their proof of correctness, and complexity analysis are also detailed. All chapters include exercises and notes about the topics discussed, as well as pointers to other texts. The book presents material gathered from several sources. Crochemore et al. acknowledge the sources from which information has been obtained. The parts of the book that include original work by the authors are highlighted.

The book is divided into nine chapters. Chapter 1 presents the tools that are used in the following chapters. It describes the essential elements for an accurate study of algorithms on strings, using information gathered from several books. It presents the concepts and notions used for working with strings, languages, and automata: combinatorial properties of strings; algorithmic elements; particulars of the implementation of automata (namely the data structures and algorithms); basic pattern matching techniques, including the notion of sliding window; utilization of heuristics for reducing the computation time; the use of automata as search engines; the bit vector model; tables of borders and prefixes; and the fundamental methods for efficiently locating patterns and searching for regularities in strings.

Chapter 2 concerns the problem of searching for a pattern in a text when the pattern represents a finite set of strings. The topics in this chapter include: the trie of a dictionary; techniques for searching for several strings; the implementation of an automaton with failure functions; implementation with successor by default; locating a string; locating a string with failure functions; and locating a single string with successor by default.

Chapter 3 studies the problem of searching for all occurrences of a fixed string in a text, searching without memory, complexity analysis of memoryless suffix search, the preprocessing required for it, the implementation of a shift function using an automaton, searching with single memory, searching with several memories, and dictionary searching.

Chapter 4 looks at the problem of searching a fixed text. A data structure known as suffix array is used for this purpose. The questions explored are: searching a list of strings memorized in a table, searching with the longest common prefixes, the preprocessing associated with it, sorting suffixes in lexicographic order, sorting suffixes on bounded integer alphabets, and common prefixes of the suffixes.

Chapter 5 concerns structures for indexes--the data structures for storing the suffixes of a text. It describes the suffix trie, the suffix tree, the formal basis for the construction of the minimal automaton that accepts the suffixes of a string, the suffix automaton, and the compact suffix automaton.

Chapter 6 is about indexes. It studies techniques for implementing an index, basic operations on indexes, the notion of transducers of position, repetition of factors inside text, forbidden strings, the use of a suffix automaton as a search machine, and searching for conjugates of strings inside a text.

Chapter 7 deals with alignments--a process often used for comparing strings. It presents methods for comparing strings, optimal alignment, longest common subsequence, alignment with gaps, local alignment, and heuristics for local alignment.

Chapter 8 is about approximate search for fixed strings. It looks at various notions of approximation on strings. It introduces approximate pattern matching with jokers, differences, and mismatches. Approximate pattern matching for short patterns and heuristics for approximate matching with differences are also discussed.

Finally, chapter 9 is committed to the detection of local periodicities that can occur inside a string. It analyzes ways of partitioning factors, detection of powers, detection of squares, and sorting suffixes.

The bibliography includes a list of books, articles, Web sites, applications, and references for algorithmics and combinatorics. The index is helpful. The quality of the paper, the layout, and the typesetting are praiseworthy. The material discussed has theoretical and practical significance. Although the book has a pedagogical orientation, there is no doubt that software practitioners, computational linguists, and computational biologists will also benefit.

As drawbacks go, the presentation style is very laconic and prosaic. Furthermore, including solutions for some problems would have been helpful. Nevertheless, I recommend this book as a useful reference for string processing and pattern matching.

Reviewer:  S. V. Nagaraj Review #: CR135461 (0902-0128)
Bookmark and Share
  Reviewer Selected
Featured Reviewer
 
 
Discrete Mathematics (G.2 )
 
 
Arrays (E.1 ... )
 
 
Biology And Genetics (J.3 ... )
 
 
Nonalgebraic Algorithms (I.1.2 ... )
 
 
Information Search And Retrieval (H.3.3 )
 
 
Nonnumerical Algorithms And Problems (F.2.2 )
 
Would you recommend this review?
yes
no
Other reviews under "Discrete Mathematics": Date
 Discrete mathematics and its applications
Rosen K., McGraw-Hill Higher Education, Columbus, OH, 2002.  928, Type: Book (9780072424348)
May 7 2003
Derandomization of auctions
Aggarwal G., Fiat A., Goldberg A., Hartline J., Immorlica N., Sudan M.  Theory of computing (Proceedings of the Thirty-Seventh Annual ACM Symposium on Theory of Computing, Baltimore, MD, May 22-24, 2005)619-625, 2005. Type: Proceedings
Jul 31 2006
Algorithms in algebraic geometry (IMA Volumes in Mathematics and its Applications)
Dickenstein A., Schreyer F., Sommese A., Springer, 2007. Type: Book
Jul 30 2008
more...

E-Mail This Printer-Friendly
Send Your Comments
Contact Us
Reproduction in whole or in part without permission is prohibited.   Copyright 1999-2024 ThinkLoud®
Terms of Use
| Privacy Policy