Computing Reviews
Neural network methods in natural language processing
Goldberg Y., Morgan & Claypool Publishers, San Rafael, CA, 2017. 310 pp. Type: Book (978-1-627052-98-6)
Date Reviewed: Mar 26 2018

Deep learning has become the catchphrase for artificial neural networks, one of the hottest research areas within machine learning. Now in their third generation, artificial neural networks have been used to crack problems that were infeasible just a few years ago. Their widespread use began in speech recognition systems, and their current popularity is due to computer vision, where they have won every object recognition challenge since 2012. In both areas, neural networks have achieved human-level performance, an undeniable feat that researchers are currently trying to reproduce in other application domains.

One such domain is natural language processing (NLP). Even though text written in human languages might not seem the most natural use case for an approach so inherently tied to numerical computation, several breakthroughs have already revolutionized how NLP is done in commercial systems, from syntactic analysis to machine translation. Yoav Goldberg provides a nice overview of the current state of the art, based on his earlier survey paper published in JAIR in 2016 [1]. This extended version of the survey is aimed at a wider audience, one that may already be familiar with statistical NLP techniques but be unaware of recent advances in neural networks.

With this NLP audience in mind, Goldberg starts by introducing the basics of deep learning, namely, feed-forward neural networks, gradient-based optimization, and the computation graph abstraction used by deep learning software tools. These foundations are explained concisely, without too many details yet with enough clarity to be understood by novices in the field. There are only a few minor terminological issues that might be objectionable and potentially misleading, such as conflating outliers with noise in the training data, or calling multilayer networks "multilayer perceptrons" even when they do not rely on the perceptron learning algorithm, a common mishap in the literature.
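To give a flavor of the computation graph abstraction and gradient-based training covered in these introductory chapters, here is a minimal, illustrative sketch of a one-hidden-layer feed-forward network. It uses PyTorch's autograd machinery; the toolkit, toy data, and hyperparameters are illustrative choices of this review, not the book's.

    # Illustrative sketch: a one-hidden-layer feed-forward network trained by
    # gradient descent, using a computation-graph (autograd) toolkit.
    import torch
    import torch.nn.functional as F

    x = torch.randn(32, 100)            # a batch of 32 toy input vectors
    y = torch.randint(0, 2, (32,))      # toy binary labels

    W1 = (0.1 * torch.randn(100, 50)).requires_grad_()
    b1 = torch.zeros(50, requires_grad=True)
    W2 = (0.1 * torch.randn(50, 2)).requires_grad_()
    b2 = torch.zeros(2, requires_grad=True)

    for step in range(100):
        h = torch.tanh(x @ W1 + b1)      # hidden layer
        logits = h @ W2 + b2             # output layer
        loss = F.cross_entropy(logits, y)
        loss.backward()                  # gradients computed via the graph
        with torch.no_grad():            # plain gradient-descent update
            for p in (W1, b1, W2, b2):
                p -= 0.1 * p.grad
                p.grad.zero_()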

After the initial introductory chapters, the book delves into how to process natural language data using the numerical approach of neural networks trained with gradient descent. Even though deep learning is supposed to put an end to the manual effort of feature engineering, this is not quite so in NLP. This part of the book covers feature engineering, language modeling, and word embeddings (as in Word2Vec or GloVe). Apart from crystal-clear descriptions of the techniques involved, it includes an interesting discussion on the choice of context and how that choice influences the resulting word embeddings, which essentially let us represent words as numerical vectors. This section of the book ends with a case study on textual entailment (that is, given two texts, deciding whether the first entails the second) to illustrate the power of neural networks when solving complex problems (albeit in an opaque way, critics might object).
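For readers unfamiliar with word embeddings, the following toy sketch illustrates the idea of representing words as dense numerical vectors and comparing them with cosine similarity. The vectors here are made up for illustration; models such as Word2Vec or GloVe learn much higher-dimensional vectors from large corpora.

    import numpy as np

    # Made-up 4-dimensional embeddings, purely for illustration.
    embeddings = {
        "king":  np.array([0.8, 0.1, 0.7, 0.3]),
        "queen": np.array([0.7, 0.2, 0.8, 0.3]),
        "apple": np.array([0.1, 0.9, 0.0, 0.6]),
    }

    def cosine(u, v):
        return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

    print(cosine(embeddings["king"], embeddings["queen"]))  # high similarity
    print(cosine(embeddings["king"], embeddings["apple"]))  # lower similarity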

The third part of the book deals with specialized architectures, the kinds of neural networks that are employed to solve real-world problems in NLP. In particular, separate chapters are devoted to convolutional neural networks as n-gram detectors; different kinds of recurrent neural networks for modeling sequences, from Elman’s original RNNs to current gated architectures such as LSTM and GRU; and encoder-decoder architectures (under the title of “conditional generation” in this book), including, of course, attention mechanisms. The author describes the main ideas behind each technique, formalizes them using mathematical notation, and provides plenty of applications where they have been used to achieve state-of-the-art results.
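As a rough illustration of the attention mechanisms mentioned above, the sketch below implements a generic (unscaled) dot-product attention over a handful of toy encoder states; the book presents attention in the context of conditioned generation, and the specific formulations there may differ.

    import numpy as np

    def dot_product_attention(query, keys, values):
        """Generic dot-product attention over a sequence of encoder states."""
        scores = keys @ query                   # one score per encoder position
        weights = np.exp(scores - scores.max())
        weights /= weights.sum()                # softmax over positions
        return weights @ values, weights        # weighted sum of the values

    # Toy example: 5 encoder states of dimension 8, one decoder query.
    rng = np.random.default_rng(0)
    keys = values = rng.normal(size=(5, 8))     # encoder hidden states
    query = rng.normal(size=8)                  # current decoder state
    context, weights = dot_product_attention(query, keys, values)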

The final chapters provide brief and less self-contained introductions to other topics that have piqued the curiosity of researchers, such as recursive tree-structured networks and structured output prediction. The last chapter describes how to combine multiple neural models using cascading (composing larger models from smaller network components), multitask learning, and semi-supervised learning.

In summary, deep learning techniques have recently provided significant improvements over the statistical NLP techniques in use since the 1990s. However, as the author acknowledges, they are not a silver bullet. The many nuances and subtleties of natural language are not always captured by current neural models, which, although successful in practice, often work as greedy black boxes that require huge amounts of data to perform well. In this sense, NLP is no different from other domains where deep learning has already made its mark. Much work remains to be done, and deep learning is, without any doubt, one of the most promising research areas within AI, not least within NLP.


Reviewer: Fernando Berzal. Review #: CR145933 (1806-0288)
[1] Goldberg, Y. A primer on neural network models for natural language processing. Journal of Artificial Intelligence Research 57 (2016), 345–420.
Natural Language Processing (I.2.7)
Connectionism And Neural Nets (I.2.6 ...)
Self-Modifying Machines (F.1.1 ...)
Learning (I.2.6)
Models Of Computation (F.1.1)
Other reviews under "Natural Language Processing":
Current research in natural language generation. Dale R. (ed), Mellish C. (ed), Zock M., Academic Press Prof., Inc., San Diego, CA, 1990. Type: Book (9780122007354). Date: Nov 1 1992
Incremental interpretation. Pereira F., Pollack M. Artificial Intelligence 50(1): 37-82, 1991. Type: Article. Date: Aug 1 1992
Natural language and computational linguistics. Beardon C., Lumsden D., Holmes G., Ellis Horwood, Upper Saddle River, NJ, 1991. Type: Book (9780136128137). Date: Jul 1 1992
