Computing Reviews
Introduction to deep learning
Charniak E., The MIT Press, Cambridge, MA, 2019. 192 pp. Type: Book (978-0-262-03951-2)
Date Reviewed: Mar 5 2020

Deep learning has taken many application domains by storm, particularly those where artificial intelligence (AI) techniques had struggled, with little success, for decades. One of those domains is natural language processing (NLP), which includes key applications such as speech recognition and machine translation. Eugene Charniak is a renowned professor and researcher in statistical language processing who, like many others, was surprised by the sweeping irruption of deep learning into NLP. The author, however, can hardly be blamed for failing to predict its impact, even though he was once a doctoral student of Marvin Minsky.

Now in their third generation, artificial neural networks (ANNs) have lived through several hype cycles, from peaks of inflated expectations to troughs of disillusionment. The author explains his situation:

I found myself way behind the times and struggling to catch up. So I did what any self-respecting professor would do, scheduled myself to teach the stuff, started a crash course by surfing the web, and got my students to teach it to me.

The origin of Charniak’s book explains its hands-on, project-driven approach. This is a short book, with seven down-to-earth chapters that include many code snippets and discussions of implementation details you would not expect to find in a more academic textbook.

The first chapter is an introduction to ANNs. Perceptrons, the cross-entropy loss function, and stochastic gradient descent are described for neophytes, whereas a thorough explanation of the backpropagation algorithm is sidestepped. Readers will get the gist of the idea, although they will have to rely on their software tools for its full implementation.
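
To give newcomers a concrete picture of those ingredients, here is a minimal sketch, mine rather than the book's, of a single-layer softmax classifier trained with the cross-entropy loss and plain stochastic gradient descent in NumPy; the 784-by-10 dimensions assume MNIST-style digit images, and the learning rate is an arbitrary illustrative value.

    import numpy as np

    # Toy softmax classifier: 784 inputs (e.g., 28x28 pixel images), 10 classes.
    rng = np.random.default_rng(0)
    W = rng.normal(scale=0.01, size=(784, 10))  # weight matrix
    b = np.zeros(10)                            # bias vector
    lr = 0.1                                    # learning rate (arbitrary choice)

    def sgd_step(x, y):
        """One stochastic gradient descent update on example x with label y."""
        global W, b
        logits = x @ W + b
        logits -= logits.max()                     # shift for numerical stability
        p = np.exp(logits) / np.exp(logits).sum()  # softmax probabilities
        loss = -np.log(p[y])                       # cross-entropy loss
        grad = p.copy()
        grad[y] -= 1.0                             # gradient of the loss w.r.t. logits
        W -= lr * np.outer(x, grad)                # the SGD step itself
        b -= lr * grad
        return loss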

Among the myriad deep learning frameworks available, the author resorts to one of the most popular: Google's TensorFlow. The second chapter of the book is basically a simple TensorFlow tutorial. Readers should be careful when working through the provided examples: some retain the older Python 2 syntax (for instance, print statements), and the suggested learning rate may be too high for the multilayer example, causing convergence problems.
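
As an illustration of the kind of adjustments involved, consider the following sketch, written in the TensorFlow 1.x style the book uses but not taken from the book itself:

    import tensorflow as tf  # assumes TensorFlow 1.x, the API used in the book

    x = tf.placeholder(tf.float32, [None, 784])
    y = tf.placeholder(tf.float32, [None, 10])
    W = tf.Variable(tf.zeros([784, 10]))
    b = tf.Variable(tf.zeros([10]))
    logits = tf.matmul(x, W) + b
    loss = tf.reduce_mean(
        tf.nn.softmax_cross_entropy_with_logits_v2(labels=y, logits=logits))

    # Python 2 syntax, as in some of the book's snippets:  print loss
    # Python 3 equivalent:                                 print(loss)

    # A smaller learning rate (0.03 is a guess, not the book's value) may avoid
    # the convergence problems mentioned above.
    train_step = tf.train.GradientDescentOptimizer(0.03).minimize(loss)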

Following the same tutorial-like style, the third chapter is a primer on convolutional neural networks (CNNs). Students will get acquainted with convolutions, strides, padding, and the peculiarities of implementing CNNs in TensorFlow. Unfortunately, the provided example is too simplistic, so students may not fully appreciate the method's power.
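
To make that vocabulary concrete, here is a minimal sketch, again mine and not the book's, of a single convolutional layer followed by max pooling, showing where strides and padding enter the TensorFlow 1.x API; the filter sizes are illustrative.

    import tensorflow as tf  # assumes TensorFlow 1.x, as in the book

    images = tf.placeholder(tf.float32, [None, 28, 28, 1])  # NHWC layout
    filters = tf.Variable(tf.truncated_normal([5, 5, 1, 32], stddev=0.1))
    conv = tf.nn.conv2d(images, filters,
                        strides=[1, 1, 1, 1],  # slide 1 pixel at a time
                        padding='SAME')        # zero-pad so output stays 28x28
    relu = tf.nn.relu(conv + tf.Variable(tf.zeros([32])))
    pooled = tf.nn.max_pool(relu,
                            ksize=[1, 2, 2, 1],    # 2x2 pooling window
                            strides=[1, 2, 2, 1],  # non-overlapping windows
                            padding='SAME')        # result: [None, 14, 14, 32]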

The following two chapters deal with neural networks as applied to NLP problems. Given the author's background, it is no surprise that they are the most interesting and carefully thought out. The first, on word embeddings, starts by describing a bigram language model and mentioning perplexity. After dealing with pragmatic considerations in the training of neural networks (such as overfitting and regularization, where dropout and L2 regularization are preferred over early stopping), the author introduces recurrent neural networks and their training algorithm, backpropagation through time. His discussion concludes with an operational description of long short-term memory (LSTM) networks before jumping to sequence-to-sequence (seq2seq) models in the following chapter. Seq2seq models, with their encoder-decoder architecture, are the cornerstone of neural machine translation [1]. Their performance is improved with the help of attention mechanisms, which are also briefly introduced, at least at the TensorFlow operational level. Unfortunately, current neural machine translation techniques are not suitable for class projects. They are more like “a cookbook teaching how to make pancakes by saying, mix together 100,000 gallons of milk and a million eggs.”
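
The bigram model that opens the word-embedding chapter is, at least, simple enough to sketch in a few lines. The following toy example is mine, not the book's, and uses add-one smoothing purely for simplicity; it builds a bigram language model and computes the perplexity of a test sentence.

    import math
    from collections import Counter

    # A fabricated toy corpus; a real model would be trained on far more text.
    corpus = "the cat sat on the mat . the dog sat on the rug .".split()
    bigrams = Counter(zip(corpus, corpus[1:]))
    unigrams = Counter(corpus)
    V = len(unigrams)  # vocabulary size

    def prob(prev, word):
        """P(word | prev) with add-one (Laplace) smoothing."""
        return (bigrams[(prev, word)] + 1) / (unigrams[prev] + V)

    def perplexity(sentence):
        """Exponential of the average negative log-probability per bigram."""
        pairs = list(zip(sentence, sentence[1:]))
        log_p = sum(math.log(prob(p, w)) for p, w in pairs)
        return math.exp(-log_p / len(pairs))

    print(perplexity("the dog sat on the mat .".split()))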

The final two chapters deal with other deep learning techniques that have attracted some attention during the past few years, namely deep reinforcement learning and unsupervised models. In the deep reinforcement learning chapter, Charniak swiftly follows the natural progression from Markov decision processes, value iteration, Q-learning, and deep Q-learning to policy gradient (that is, REINFORCE) and actor-critic (a2c) methods. In the final chapter on unsupervised models, he introduces autoencoders (convolutional and variational), as well as generative adversarial networks (GANs).
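
To give a flavor of that progression, here is a toy tabular Q-learning sketch, mine rather than the book's (the chapter's deep Q-learning replaces the table with a neural network), on a five-state corridor:

    import random

    # A five-state corridor: the agent starts at 0 and is rewarded at state 4.
    N_STATES, ACTIONS = 5, (-1, +1)    # actions: move left or right
    alpha, gamma, eps = 0.5, 0.9, 0.1  # step size, discount, exploration rate
    Q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}

    for _ in range(500):               # episodes
        s = 0
        while s != N_STATES - 1:       # state 4 is terminal
            # epsilon-greedy action selection
            a = (random.choice(ACTIONS) if random.random() < eps
                 else max(ACTIONS, key=lambda act: Q[(s, act)]))
            s2 = min(max(s + a, 0), N_STATES - 1)
            r = 1.0 if s2 == N_STATES - 1 else 0.0
            # Q-learning update: move Q(s, a) toward r + gamma * max_a' Q(s', a')
            target = r + gamma * max(Q[(s2, act)] for act in ACTIONS)
            Q[(s, a)] += alpha * (target - Q[(s, a)])
            s = s2

    # The learned greedy policy should move right in every nonterminal state.
    print({s: max(ACTIONS, key=lambda act: Q[(s, act)]) for s in range(N_STATES - 1)})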

It is remarkable how many interesting topics are addressed in such a short book, especially given that it provides code snippets and discusses TensorFlow implementation details. However, readers will often learn the how rather than the why behind deep learning techniques. Still, the book might be useful for practically oriented minds and those in a hurry who want to learn what the deep learning craze is really about. It is a good starter to whet your appetite for ANNs before delving deeper into their algorithmic foundations [2] or their NLP applications [3].


Reviewer: Fernando Berzal. Review #: CR146920 (2008-0174)
1) Wu, Y.; et al. Google’s neural machine translation system: bridging the gap between human and machine translation. arXiv, 2016, https://arxiv.org/abs/1609.08144.
2) Goodfellow, I.; Bengio, Y.; Courville, A. Deep learning. MIT Press, Cambridge, MA, 2016.
3) Goldberg, Y. Neural network methods in natural language processing. Morgan & Claypool, San Rafael, CA, 2017.
Categories: Neural Nets (C.1.3 ...); General (I.0)
Other reviews under "Neural Nets":

Neural networks: an introduction
Müller B., Reinhardt J., Springer-Verlag New York, Inc., New York, NY, 1990. Type: Book (9780387523804). Date reviewed: May 1 1993

The computing neuron
Durbin R. (ed), Miall C., Mitchison G., Addison-Wesley Longman Publishing Co., Inc., Boston, MA, 1989. Type: Book (9780201183481). Date reviewed: May 1 1993

A practical guide to neural nets
McCord-Nelson M., Illingworth W., Addison-Wesley Longman Publishing Co., Inc., Boston, MA, 1991. Type: Book (9780201523768). Date reviewed: May 1 1993
