Existing machine translation (MT) programs often need help to resolve ambiguities in input texts, but how should they ask for this help? In particular, given a structurally ambiguous sentence such as “I saw the man in the park with a telescope,” the parser needs a way to express the alternative parse trees in natural language so that the average user can understand what is going on.
The author’s solution is to attach linguistic templates to the rewrite rules of a context-free parser. The parser uses these templates to form disambiguation questions using the lexical items in the input text. The templates on a rule specify the focus of the rule, a full English template, and a summarizing English template. For example, the rule “NP1 → NP2 PP” (a noun phrase can consist of a noun phrase followed by a prepositional phrase) focuses on NP2, can be expressed in full as “NP2 is PP,” and can be summarized as “NP2.” Thus, when the rule parses “the man in the park,” the full template would generate “(the man) is (in the park)” and the summarizing template would generate “(the man).”
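To make the mechanism concrete, here is a minimal sketch of how a rule might carry its templates and how a template might be filled from the parsed phrases. It is my own illustration, not the author’s implementation: the `Rule` class, its field names, and the `render` helper are all hypothetical, and the sketch assumes templates are plain strings in which category names stand for the phrases they cover.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    """A context-free rewrite rule with its templates (hypothetical layout)."""
    lhs: str        # left-hand side category, e.g. "NP1"
    rhs: tuple      # right-hand side categories, e.g. ("NP2", "PP")
    focus: str      # the constituent the rule focuses on
    full: str       # full English template
    summary: str    # summarizing English template


def render(template, bindings):
    """Replace each category name in the template with its parenthesized phrase."""
    words = []
    for token in template.split():
        if token in bindings:
            words.append(f"({bindings[token]})")
        else:
            # Anything that is not a bound category is a literal English word.
            words.append(token)
    return " ".join(words)


# The noun phrase rule discussed above.
NP_PP = Rule(lhs="NP1", rhs=("NP2", "PP"), focus="NP2",
             full="NP2 is PP", summary="NP2")

# Phrases covered by each constituent when parsing "the man in the park".
bindings = {"NP2": "the man", "PP": "in the park"}

full_text = render(NP_PP.full, bindings)
summary_text = render(NP_PP.summary, bindings)
print(full_text)
print(summary_text)
```

The literal word “is” in the full template is what ties this representation to English.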
Given “the man in the park with a telescope,” the rule above can group “with a telescope” with either “the park” or “the man.” The parser groups together conflicting rule applications that share focused elements. In this case, “in the park” is focused on by the two occurrences of the noun phrase rule. The parser then uses the full and summarizing templates to ask the user to choose between “(the park) is (with a telescope)” and “(the man) is (with a telescope).”
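The resulting query could be assembled along the following lines. Again, this is only a sketch under my own assumptions: the `render` and `disambiguation_question` helpers, the wording of the prompt, and the numbered-menu layout are hypothetical, and the step that detects and groups the conflicting rule applications is not modeled here; the two candidate attachments are simply given.

```python
FULL_TEMPLATE = "NP2 is PP"  # full template of the noun phrase rule


def render(template, bindings):
    """Replace each category name in the template with its parenthesized phrase."""
    return " ".join(
        f"({bindings[token]})" if token in bindings else token
        for token in template.split()
    )


def disambiguation_question(candidates):
    """Build a numbered menu with one rendered option per competing reading."""
    lines = ["Which reading is intended?"]  # prompt wording is an assumption
    for number, bindings in enumerate(candidates, start=1):
        lines.append(f"  {number}. {render(FULL_TEMPLATE, bindings)}")
    return "\n".join(lines)


# The two competing attachments of "with a telescope".
candidates = [
    {"NP2": "the park", "PP": "with a telescope"},
    {"NP2": "the man", "PP": "with a telescope"},
]

question = disambiguation_question(candidates)
print(question)
```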
I have several major questions and reservations about this approach. How would the system extend beyond context-free grammars? How well do the templates work for structural ambiguities more complex than prepositional phrase attachment and conjunction? How do you keep your templates from generating queries that are as ambiguous as the original input? Why can the templates not be replaced with pointers to rewrite rules, thereby making the system less language dependent? (Right now, the templates contain English lexical items.) Finally, while it is worth the experiment, I remain skeptical that a practical user interface can be based on nothing more than content-free syntactic selection of phrases from the input text.