Computing Reviews
Inducing morphemes using light knowledge
Tepper M., Xia F. ACM Transactions on Asian Language Information Processing 9(1): 1-38, 2010. Type: Article
Date Reviewed: Jun 25 2010

The morphological analysis of texts has reduced speech recognition error rates and improved information retrieval, as well as other areas where computational linguistics is used. The creation of rule-based morphological analyzers relies on having a morphological lexicon, which can be expensive to acquire. In essence, a rule-based system must know all the rules that govern the morphology of a language, for example, all of the rules for the pluralization of nouns. This paper uses light knowledge to induce morphemes from raw text input.

The system uses a hybrid approach that combines rewrite rules from a finite state machine approach with an unsupervised machine intelligence system. The advantage of this approach is that the rules are allowed to over-generate analyses that will be subsequently winnowed by the statistical component.
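The over-generate-then-winnow idea can be illustrated with a toy sketch. This is not the authors' system: the suffix list, the scoring function, and the word counts below are all hypothetical, chosen only to show how loose rules can propose several analyses while a simple statistic picks among them.

```python
from collections import Counter

# Hypothetical, deliberately incomplete suffix rules (illustrative only).
SUFFIX_RULES = ["s", "es", "ed", "ing"]


def overgenerate(word):
    """Return every (stem, suffix) split the rules allow, plus the unsplit word."""
    candidates = [(word, "")]
    for suffix in SUFFIX_RULES:
        # Require a stem of at least two characters.
        if word.endswith(suffix) and len(word) > len(suffix) + 1:
            candidates.append((word[: -len(suffix)], suffix))
    return candidates


def winnow(word, counts):
    """Keep the candidate whose stem is best attested in the word-count list."""
    def score(candidate):
        stem, suffix = candidate
        # The unsplit word scores 0, so any attested stem beats it;
        # on ties, max() keeps the first candidate (the unsplit word).
        return counts[stem] if suffix else 0
    return max(overgenerate(word), key=score)


# Made-up word counts standing in for a training list of words with counts.
counts = Counter({"walk": 50, "walks": 12, "walked": 20,
                  "bus": 30, "buses": 4, "glass": 9})

for w in ["walks", "buses", "glass"]:
    print(w, winnow(w, counts))
```

Here the rules propose both "buse" + "s" and "bus" + "es" for "buses", and the spurious split "glas" + "s" for "glass"; the count-based filter discards the analyses whose stems are never seen as words. A real statistical component would be far more sophisticated than this stem-frequency heuristic.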

The authors evaluate the system on English and Turkish data from the 2005 and 2007 Morpho Challenge contests. The training data consists of a list of words with word counts. For evaluation, the Morpho Challenge contests use the harmonic mean of precision and recall, where precision is the proportion of the system's proposed analyses that are correct and recall is the proportion of the reference analyses that the system finds. The authors aim to show that even a small set of imperfect rules can improve performance; in English, for example, one might use only the pluralization rule that adds the “s” sound, omitting irregular plurals such as oxen. Performance improved significantly in almost all cases, with the greater improvement in the Turkish results. The results compare well against those of other systems.
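For readers unfamiliar with the measure, the harmonic mean of precision and recall (the F-measure) can be computed as in the sketch below. It uses the generic set-based definitions; the exact matching procedures used in the Morpho Challenge contests differ in detail, and the function name and the (word, boundary position) items are assumptions made for illustration.

```python
def precision_recall_f(proposed, gold):
    """Generic set-based precision, recall, and their harmonic mean."""
    proposed, gold = set(proposed), set(gold)
    hits = len(proposed & gold)  # items the system got right
    precision = hits / len(proposed) if proposed else 0.0
    recall = hits / len(gold) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f_measure = 2 * precision * recall / (precision + recall)
    return precision, recall, f_measure


# Hypothetical morpheme boundaries as (word, split position) pairs.
proposed = {("walks", 4), ("buses", 4), ("glass", 4)}
gold = {("walks", 4), ("buses", 3)}

p, r, f = precision_recall_f(proposed, gold)
print(f"precision={p:.2f} recall={r:.2f} F={f:.2f}")
```

In this made-up example one of three proposed boundaries is correct and one of two reference boundaries is found, so precision is 1/3, recall is 1/2, and the harmonic mean is 0.4. Because the harmonic mean is dominated by the smaller of the two values, a system cannot score well by over-segmenting or under-segmenting alone.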

Reviewer:  J. P. E. Hodgson Review #: CR138125 (1012-1293)
Learning (I.2.6)
Natural Language Processing (I.2.7)
 