Machine translation (MT) was one of the earliest goals of artificial intelligence (AI) research; its first rise in the late 1950s and fall in the mid-1960s is one of the classic examples of excessive hype followed by dramatic deflation in the history of AI. (An entertaining, detailed account of the politics involved is given in Gordin.) In the last two decades, MT has been revived, with very great success, considering the inherent difficulty of the task: commercial systems such as Google Translate and Bing Translator provide translations across 40-odd languages that are often good enough to be useful, though rarely flawless and sometimes misleading or unintelligible. In my personal opinion, these are the most impressive products of AI research to date.
Bhattacharyya’s Machine translation is a college textbook, based on courses that he has taught for ten years at the Indian Institute of Technology (IIT) Bombay. It presumes an elementary knowledge of probability theory, but little else.
Chapter 1 gives a general introduction to challenges and methodologies. In particular, it presents the Vauquois triangle as a general methodological framework: an MT program first climbs the left-hand side in analyzing the structure of the source text to some level of abstraction, next transfers over to the corresponding abstractions in the target language, and then generates the final translation by working down from the abstraction to the output text. In pure word-to-word translation, the transfer is done at the bottom level; whereas in a rule-based system that uses a language-independent interlingua, the system climbs all the way from the input to a meaning representation at the apex of the triangle, and then down again to the output; no separate transfer step is needed. Other systems cross over somewhere in the middle; in general, the more similar the languages, the more effective it is to cross over at a low level.
Chapters 2 to 4 present statistical MT, currently by far the dominant paradigm. Chapter 2 covers basic word-to-word translation, with a simple model of alignment; chapter 3 extends this to more sophisticated models of alignment; and chapter 4 discusses phrase-based translation. The emphasis here is on the construction of probabilistic models, the manipulation of the resulting formulas, and the use of the expectation-maximization (EM) algorithm.
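To give readers unfamiliar with this material a feel for what EM does in word-based translation, here is a minimal sketch of the standard textbook formulation usually called IBM Model 1. It is my own illustration, not code from the book, and I am assuming it is comparable in spirit to the simple alignment model of chapter 2. The function name `train_ibm_model1`, the toy German-English corpus, and the omission of the usual NULL word are all simplifications of mine.

```python
from collections import defaultdict


def train_ibm_model1(pairs, iterations=20):
    """Estimate word translation probabilities t[(f, e)] = P(f | e) by EM.

    pairs: list of (source_tokens, target_tokens) sentence pairs.
    Simplification (assumption of this sketch): no NULL target word.
    """
    src_vocab = {f for fs, _ in pairs for f in fs}
    # Start from a uniform distribution over source words.
    t = defaultdict(lambda: 1.0 / len(src_vocab))

    for _ in range(iterations):
        count = defaultdict(float)  # expected count of f aligned to e
        total = defaultdict(float)  # expected count of e being aligned at all

        # E-step: distribute each source word over the target words
        # of its sentence pair, in proportion to the current t.
        for fs, es in pairs:
            for f in fs:
                z = sum(t[(f, e)] for e in es)
                for e in es:
                    c = t[(f, e)] / z
                    count[(f, e)] += c
                    total[e] += c

        # M-step: renormalise the expected counts into probabilities.
        t = defaultdict(
            float, {(f, e): c / total[e] for (f, e), c in count.items()}
        )
    return t


# Toy corpus, purely illustrative.
pairs = [
    (["das", "Haus"], ["the", "house"]),
    (["das", "Buch"], ["the", "book"]),
    (["ein", "Buch"], ["a", "book"]),
]
t = train_ibm_model1(pairs)
```

No sentence pair says which word aligns with which, yet after a few iterations `t[("Haus", "house")]` overtakes `t[("das", "house")]`, because "das" is better explained by "the", with which it co-occurs twice. That is the whole trick: the alignment is a hidden variable, and EM alternates between guessing it and re-estimating the probabilities.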
Chapter 5 discusses rule-based MT. It includes an extended exposition of the universal networking language (UNL), a general interlingua for MT developed in the late 1990s by the United Nations University.
Chapter 6 discusses example-based MT, essentially the attempt to do MT using case-based reasoning. As is often true of case-based reasoning, the examples are appealing, but the specifics of the method are vague, and it is hard to see how it can be done at scale or how the inherent obstacles can be overcome.
The book includes a number of student exercises and useful bibliographies for each chapter.
Since the textbook was written for students at IIT, most of the examples that Bhattacharyya uses are translations between English and one of the Indian languages (chiefly Hindi, but also Bengali and Marathi) or between two of the Indian languages. These are not, obviously, the most accessible choices for most American or European students; but equally obviously, there is no possible choice that would be good for all students. Bhattacharyya provides a transliteration and a gloss (word-by-word translation) for each of his examples, so readers like myself who know nothing at all of any of the languages can follow all the examples.
There are some significant gaps. There is little discussion of the difficult issue of evaluation. In chapters 2 to 4, the focus is narrowly technical; I would have liked to see more “big picture” discussion and analysis of the strengths and weaknesses of the methods described.
All in all, though, this is a clear, well-written introduction to a key area in computer science.