Computing Reviews, the leading online review service for computing literature.

Learning string-edit distance
Ristad E., Yianilos P. IEEE Transactions on Pattern Analysis and Machine Intelligence 20(5): 522-532, 1998. Type: Article

Date Reviewed: Nov 1 1998

A new stochastic model is presented for computing string-edit distances by learning from a corpus of examples. The model is applied to the difficult problem of learning the pronunciation of a spoken word. Compared with the untrained Levenshtein distance, the classical reference point, the new model makes only one-fifth as many errors.

After the introduction, section 2 explains an algorithm that automatically learns a string-edit distance from pairs of similar strings. That algorithm is not directly applicable to string classification problems, so section 3 presents a stochastic solution that learns string classification automatically from a corpus of labeled strings. Section 4 outlines the empirical results of five experiments: the first two use the Switchboard pronouncing lexicon, the next two use a lexicon derived from the corpus, and the fifth merges the two lexicons. The superior performance observed for the new technique makes the costly process of constructing a lexicon by hand obsolete. The technique also allows a new word to be recognized from a single example of that word’s pronunciation. Both are improvements over other methods.

Readers without prior knowledge of the subject will find sections 2 and 3 hard to read; the meaning and time complexity of some of the statistical formulas are difficult to grasp. However, the results show that the new method has great promise for improving the current state of the art. Moreover, the authors have considered alternatives, some with better attributes for the automatic acquisition of joint probabilities on string pairs, and they offer good reasons for preferring their stochastic model.

The paper is a valuable contribution to learning string-edit distance, and is therefore a must for researchers and developers in spelling correction and pronunciation modeling. The new model may even be extensible to speech recognition, computerized voice I/O, and decryption.
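
For readers unfamiliar with the baseline named above, the following is a minimal sketch of the untrained Levenshtein distance, written here for illustration and not taken from the paper. The function name edit_distance and the parameters sub_cost and indel_cost are assumptions introduced for this example. With the default unit costs the function computes the classical Levenshtein distance; passing other cost functions only illustrates where learned costs could be substituted, and it does not reproduce the authors' stochastic model or their training procedure.

```python
from typing import Callable


def unit_sub_cost(a: str, b: str) -> float:
    """Classical substitution cost: 0 for a match, 1 for a mismatch."""
    return 0.0 if a == b else 1.0


def unit_indel_cost(c: str) -> float:
    """Classical insertion/deletion cost: always 1."""
    return 1.0


def edit_distance(
    source: str,
    target: str,
    sub_cost: Callable[[str, str], float] = unit_sub_cost,      # hypothetical hook
    indel_cost: Callable[[str], float] = unit_indel_cost,       # hypothetical hook
) -> float:
    """Minimum total cost of edits turning `source` into `target`.

    Standard dynamic program over prefixes, keeping one row at a time.
    With the default costs this is the untrained Levenshtein distance.
    """
    # Row 0: build each prefix of `target` from the empty string by insertions.
    previous = [0.0]
    for t_char in target:
        previous.append(previous[-1] + indel_cost(t_char))

    for s_char in source:
        # Column 0: delete every character of the current source prefix.
        current = [previous[0] + indel_cost(s_char)]
        for j, t_char in enumerate(target, start=1):
            substitute = previous[j - 1] + sub_cost(s_char, t_char)
            delete = previous[j] + indel_cost(s_char)
            insert = current[j - 1] + indel_cost(t_char)
            current.append(min(substitute, delete, insert))
        previous = current

    return previous[-1]


def cheap_k_c(a: str, b: str) -> float:
    """Illustrative, hand-picked cost: confusing 'k' and 'c' is cheap."""
    if a == b:
        return 0.0
    return 0.2 if {a, b} == {"k", "c"} else 1.0
```

Calling edit_distance("kitten", "sitting") with the defaults counts two substitutions and one insertion. Supplying cheap_k_c as sub_cost makes "kat" and "cat" closer than unit costs would. The cost values in cheap_k_c are hand-picked for illustration, whereas the review describes a model that learns its parameters from a corpus.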

Reviewer: Herbert G. Mayer	Review #: CR121929 (9811-0898)

Data Types And Structures (D.3.3 ... )

Algorithm Design And Analysis (G.4 ... )

Classifier Design And Evaluation (I.5.2 ... )

Models (I.5.1 )

Reproduction in whole or in part without permission is prohibited. Copyright 1999-2024 ThinkLoud®