Computing Reviews

Inducing morphemes using light knowledge
Tepper M., Xia F. ACM Transactions on Asian Language Information Processing 9(1): 1-38, 2010. Type: Article
Date Reviewed: 06/25/10

The morphological analysis of texts has resulted in improvements in speech recognition error rates and in information retrieval, as well as other areas where computational linguistics is used. The creation of rule-based morphological analyzers relies on having a morphological lexicon, which can be expensive to acquire. In essence, a rule-based system must know all the rules that govern the morphology of a language--for example, all of the rules for the pluralization of nouns. This paper uses light knowledge to induce morphemes in raw text input.

The system uses a hybrid approach that combines rewrite rules from a finite state machine approach with an unsupervised machine intelligence system. The advantage of this approach is that the rules are allowed to over-generate analyses that will be subsequently winnowed by the statistical component.
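The over-generate-then-winnow idea can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the suffix list, the function names, and the frequency-based scoring stand in for the paper's rewrite rules and statistical component and are not taken from the authors' system.

```python
# Hypothetical suffix rules for illustration; not the rule set used in the paper.
SUFFIX_RULES = ["s", "es", "ed", "ing"]


def candidate_analyses(word):
    """Over-generate: the unsplit word plus every split a suffix rule allows."""
    candidates = [(word,)]
    for suffix in SUFFIX_RULES:
        stem = word[: -len(suffix)]
        if word.endswith(suffix) and len(stem) >= 2:
            candidates.append((stem, suffix))
    return candidates


def pick_analysis(word, word_counts):
    """Winnow: prefer the split whose stem is most frequent in the word list.

    A crude stand-in for a statistical component. Splits whose stem never
    occurs as a word score 0 and lose the tie to the unsplit word, which
    is listed first.
    """
    def score(analysis):
        if len(analysis) == 1:
            return 0
        return word_counts.get(analysis[0], 0)

    return max(candidate_analyses(word), key=score)


# Toy word list with counts, in the spirit of the training data described in the review.
counts = {"walk": 10, "walks": 3, "walked": 2, "bus": 5}
for w in ("walks", "walked", "bus"):
    print(w, pick_analysis(w, counts))
```

Here the rule for "s" proposes a split for "bus" as well as for "walks"; only the latter survives, because "walk" is attested in the word list and "bu" is not.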

As the paper explains, the authors evaluate the system by running it on English and Turkish data from the 2005 and 2007 Morpho Challenge contests. The training data consists of a list of words with word counts. For evaluation, the Morpho Challenge contests use the harmonic mean of precision and recall (the F-measure), where precision is the proportion of the system's proposals that are hits and recall is the proportion of the expected items that the system hits. The authors aim to show that by using even a small set of imperfect rules--in English, for example, one might use only the pluralization rule that adds the “s” sound, omitting plurals such as oxen--one can still get improvements in performance. There was significant improvement in almost all cases, though a greater improvement was found in the Turkish results. The results compare well against those of other systems.
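For readers unfamiliar with the measure, the harmonic mean of precision and recall can be computed as in the sketch below. The function name and the sample values are assumptions chosen for illustration; the sketch shows only the combining formula and does not attempt to reproduce the contests' scoring procedure.

```python
def f_measure(precision, recall):
    """Harmonic mean of precision and recall; defined as 0.0 when both are 0."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Made-up values: the harmonic mean sits closer to the lower of the two scores
# than the arithmetic mean does.
print(f_measure(0.8, 0.5))
```

Because the harmonic mean is pulled toward the smaller value, a system cannot score well by over-generating (high recall, low precision) or by being overly cautious (high precision, low recall).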

Reviewer:  J. P. E. Hodgson Review #: CR138125 (1012-1293)
