Computing Reviews
Today's Issue Hot Topics Search Browse Recommended My Account Log In
Review Help
Search
In-place algorithms for exact and approximate shortest unique substring problems
Hon W., Thankachan S., Xu B. Theoretical Computer Science690  12-25,2017.Type:Article
Date Reviewed: Dec 27 2017

The problem of exact shortest unique substring (SUS) is formulated as follows:

Given a string S and a position p in S, find a substring of S that (1) is unique in S, (2) covers p, and (3) there is no shorter substring of S, which satisfies (1) and (2).

For instance, for S = aabccabbcb and p = 3, there are three shortest unique substrings of S covering p: aab, abc, and bcc.

The problem originally has been studied in the work of Pei and colleagues [1]. It has been motivated by its applicability in locating the shortest unique snippets of the query string in a long string. Computation of exact SUSs found applications also in event analysis and bioinformatics.

The algorithm proposed in [1] computes all exact SUSs in quadratic time and linear space with respect to the length of string S. Later, there were several improvements, including using linear time and space to construct a data structure that allows any further exact SUS query to be answered in constant time.

In this paper, the authors consider an approximate version of the problem, where a fixed number of mismatches is allowed. A generic in-place algorithmic framework is proposed to solve both exact and approximate k-mismatch (k ≥ 1) SUS problems. For a string of length n, the framework uses 2n words, each of ⌈log2n⌉ bits, plus n bytes. Using the framework, it is possible to solve the exact SUS problem in linear time, and the approximate k-mismatch SUS problem in quadratic time for any k. The reported results of experiments show that the proposed framework is competitive with respect to both space consumption and speed.

The paper is quite technical, describing details of four different algorithms. The authors do a good job of explaining them and provide the high-level picture. However, readers would profit from more elaborate examples.

Reviewer:  Temur Kutsia Review #: CR145729 (1802-0089)
1) Pei, J.; Wu, W. C. H.; Yeh, M. Y. On shortest unique substring queries. In Proc. of the 29th IEEE International Conf. on Data Engineering. Jensen, C. S., Jermaine, C. M., Zhou, X., Eds. IEEE Computer Society, 2013, 937–948.
Bookmark and Share
 
Pattern Matching (F.2.2 ... )
 
Would you recommend this review?
yes
no
Other reviews under "Pattern Matching": Date
Deterministic sampling
Vishkin U. SIAM Journal on Computing 20(1): 22-40, 1991. Type: Article
Jun 1 1992
Two-way string-matching
Crochemore M., Perrin D. (ed) Journal of the ACM 38(3): 650-674, 1991. Type: Article
Sep 1 1992
On the exact complexity of string matching: lower bounds
Galil Z., Giancarlo R. SIAM Journal on Computing 20(6): 1008-1020, 1991. Type: Article
Jan 1 1993
more...

E-Mail This Printer-Friendly
Send Your Comments
Contact Us
Reproduction in whole or in part without permission is prohibited.   Copyright 1999-2024 ThinkLoud®
Terms of Use
| Privacy Policy