Repetition and periodicity play an essential role in efficient pattern matching, data analysis, and data compression. This paper presents an efficient algorithm to identify maximal sequences of approximate repetitions within a string, called maximal approximate runs. The authors discuss different definitions of approximate runs, and present a new definition. The new definition demands a greater degree of similarity between the substrings that form the run than did previous definitions.
With the help of a notation to specify approximate runs, the authors present an algorithm that generates all maximal approximate runs in a string. This algorithm is then improved to yield a more efficient algorithm. The improved algorithm has time complexity O(k2 log k), where k is the greatest number of letters that would need to be changed in an approximate run in order to obtain an exact run.
The paper provides a very detailed--but nonetheless accessible--presentation of the algorithm. This could be used to reproduce the results presented, or to implement the algorithm.
Overall, the paper provides a useful contribution that is likely to be of interest to readers with an interest in pattern matching, searching, or data compression.