Computing Reviews
Unsupervised similarity-based word sense disambiguation using context vectors and sentential word importance
Abdalgader K., Skabar A. ACM Transactions on Speech and Language Processing (TSLP) 9(1): 1-21, 2012. Type: Article
Date Reviewed: Aug 24 2012

Dealing with text formulated in a natural language remains one of the major challenges for computers. Significant advances in computational linguistics, artificial intelligence, computing power, and related areas have given us tools like search engines that are practically indispensable for many tasks. In other applications, such as translation and virtual assistants, performance still ranges from amazing to appalling. One of the core underlying issues is that humans can quickly determine the meaning of a natural language statement, usually without much effort, a capability that computers struggle to match. This interpretation relies on a combination of analyzing the structure of sentences and determining the intended meaning of words and phrases. For computers, the second aspect is far more challenging, both conceptually and computationally.

In this paper, the authors describe an approach that addresses the word sense disambiguation (WSD) problem by measuring the semantic similarity between each possible interpretation of a particular word and its context, where the context is represented by the remaining words within the text fragment under consideration. This approach is also referred to as knowledge-based WSD, in contrast to corpus-based methods, which rely on the availability of training data consisting of words labeled with the correct meaning. Knowledge-based WSD has been pursued by other researchers with various similarity measures and computational refinements, but it is seriously affected by the number of calculations needed to compare the different interpretations of all of the words under consideration. Naive approaches typically lead to combinatorial increases in time or space requirements, whereas refinements often require restrictions such as considering only shorter fragments of text. The new method’s computational complexity is quadratic in the number of words in the context, whereas other approaches are often exponential in the size of the context window (also a measure of the number of words). Further improvements are achieved by establishing the order in which the words of the text fragment are considered for disambiguation. This ordering relies on a graph-based approach to weigh the “importance” of words within a text fragment, similar to Google’s PageRank algorithm for ordering Web pages in search results.
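To make the general idea concrete, the following is a minimal sketch of similarity-based disambiguation with an importance-driven ordering. It is not the authors' implementation: the toy sense inventory `SENSES`, the Jaccard overlap used as a similarity measure, and the function names `importance` and `disambiguate` are all assumptions chosen for illustration, and the input words are assumed to be distinct. The sketch scores words with a PageRank-style iteration over pairwise word similarities, then fixes one sense per word in decreasing order of that score, so the work grows with pairs of words rather than with the product of all sense counts.

```python
# Hypothetical toy sense inventory: word -> {sense id: set of gloss words}.
SENSES = {
    "bank": {
        "bank#finance": {"money", "deposit", "loan", "institution"},
        "bank#river": {"river", "water", "slope", "land"},
    },
    "deposit": {
        "deposit#money": {"money", "account", "institution", "payment"},
        "deposit#sediment": {"sediment", "river", "layer", "mineral"},
    },
    "loan": {
        "loan#money": {"money", "lend", "interest", "institution"},
    },
}


def jaccard(a, b):
    """Set overlap used here as a stand-in similarity measure (an assumption)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def word_vector(word, chosen):
    """Gloss of the chosen sense if already fixed, else the union of all glosses."""
    if word in chosen:
        return SENSES[word][chosen[word]]
    return set().union(*SENSES[word].values())


def importance(words, damping=0.85, iterations=50):
    """PageRank-style scores on a graph weighted by word-to-word similarity."""
    n = len(words)
    vecs = [word_vector(w, {}) for w in words]
    weight = [[jaccard(vecs[i], vecs[j]) if i != j else 0.0 for j in range(n)]
              for i in range(n)]
    out = [sum(row) for row in weight]  # total outgoing weight per word
    score = [1.0 / n] * n
    for _ in range(iterations):
        score = [
            (1 - damping) / n
            + damping * sum(score[j] * weight[j][i] / out[j]
                            for j in range(n) if out[j] > 0)
            for i in range(n)
        ]
    return dict(zip(words, score))


def disambiguate(words):
    """Fix one sense per word, visiting words in decreasing importance."""
    rank = importance(words)
    chosen = {}
    for word in sorted(words, key=lambda w: -rank[w]):
        # Context: everything except the current word, using fixed senses
        # where available.
        context = set()
        for other in words:
            if other != word:
                context |= word_vector(other, chosen)
        chosen[word] = max(SENSES[word],
                           key=lambda s: jaccard(SENSES[word][s], context))
    return chosen


if __name__ == "__main__":
    print(disambiguate(["bank", "deposit", "loan"]))
```

An exhaustive search over this fragment would score every combination of senses (2 × 2 × 1 here, and exponentially many for longer fragments), whereas the greedy loop compares each candidate sense against a single context set.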

The paper is well written, nicely structured, and reasonably easy to follow. It offers a good overview of the current state of WSD, with brief descriptions of commonly used methods. The results obtained by the authors are better than those of the methods they compare against, both in standalone experiments based on commonly used datasets and in more challenging tasks involving sentence similarity measurement and sentence clustering. Especially for the sentence similarity task, the authors report significantly better results than previous approaches, also surpassing the mean performance of human participants in the experiment.

Reviewer:  Franz Kurfess Review #: CR140546 (1302-0145)
Text Analysis (I.2.7 ... )
 

Reproduction in whole or in part without permission is prohibited.   Copyright 1999-2024 ThinkLoud®