Deep learning is used everywhere, from environment perception in self-driving vehicles to natural language understanding to fraud and real-time threat detection, to name only a few examples. This book introduces both fundamental and state-of-the-art concepts in the field, along with most of the mathematical foundations required to understand them, including linear algebra, probability and information theory, and numerical computation.

Deep learning is a class of optimization methods for artificial neural networks (ANNs) that have numerous hidden layers between the input layer and the output layer and thus feature a large internal structure. Extending learning algorithms for network structures with very few or no intermediate layers, such as the single-layer perceptron, deep learning methods also enable stable learning for networks with many intermediate layers. The general assumption is that increasing the depth of an ANN adds further layers of abstraction, which is what the remarkable generalization power and versatility of deep neural networks (DNNs) are based on.
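The layered structure described above can be sketched in a few lines of code. The following is a minimal illustration, not taken from the book: a deep feed-forward network is simply a stack of affine transformations with nonlinear activations between them, where each hidden layer computes a new, more abstract representation of its input. The layer sizes and the ReLU activation here are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def relu(x):
    # Elementwise rectified linear activation.
    return np.maximum(0.0, x)

# Three hidden layers between a 4-unit input and a 2-unit output
# (sizes chosen arbitrarily for illustration).
layer_sizes = [4, 8, 8, 8, 2]
weights = [rng.standard_normal((m, n)) * 0.1
           for m, n in zip(layer_sizes[:-1], layer_sizes[1:])]
biases = [np.zeros(n) for n in layer_sizes[1:]]

def forward(x):
    # Each hidden layer applies an affine map followed by a nonlinearity.
    for W, b in zip(weights[:-1], biases[:-1]):
        x = relu(x @ W + b)
    # Linear output layer.
    return x @ weights[-1] + biases[-1]

x = rng.standard_normal(4)
y = forward(x)
print(y.shape)  # (2,)
```

Adding more entries to `layer_sizes` deepens the network; the point of deep learning methods is to keep training stable as that depth grows.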

In the first part of the book, a detailed introduction to applied mathematics and machine learning basics is given, covering work with scalars, vectors, matrices, and tensors. This part also covers the basics of probability and information theory, as well as machine learning fundamentals such as supervised and unsupervised learning and (stochastic) gradient descent. The part concludes with the limiting factors of shallow ANN architectures.

In the second part, the authors elaborate on modern practices for DNNs, covering deep feed-forward ANNs, regularization techniques, convolutional networks, and recurrent and recursive neural networks. The particular emphasis on regularization techniques, that is, parameter tuning, data augmentation, and sparse representations, helps readers understand how to design and train the introduced algorithms so that performance does not degrade significantly on new inputs. Readers learn how to apply regularizers that reduce variance significantly while only minimally increasing bias. The second part concludes with multiple application scenarios for deep learning, giving insights into computer vision, natural language understanding, speech recognition, recommender systems, and more.

The third part of the book covers the latest deep learning research, including linear factor models, deep autoencoders, and generative models. This section covers the most recent and most important research conducted by the authors, providing a promising outlook on how existing deep learning applications will be improved and which new ones are to come.
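The bias-variance trade-off behind regularization, emphasized in the second part, can be illustrated with L2 weight decay, one of the standard regularizers: a penalty proportional to the squared weight norm is added to the loss, shrinking the weights during stochastic gradient descent. The following sketch is not from the book; the synthetic linear-regression data and all hyperparameters are made-up assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)

# Synthetic data: noisy linear targets (illustrative assumption).
X = rng.standard_normal((100, 3))
true_w = np.array([2.0, -1.0, 0.5])
y = X @ true_w + 0.1 * rng.standard_normal(100)

def sgd(lam, lr=0.01, epochs=200):
    # Per-sample SGD on squared error plus (lam/2) * ||w||^2.
    w = np.zeros(3)
    for _ in range(epochs):
        for i in rng.permutation(len(X)):
            # Gradient of the loss term plus the L2 penalty term.
            grad = (X[i] @ w - y[i]) * X[i] + lam * w
            w -= lr * grad
    return w

w_plain = sgd(lam=0.0)   # no regularization
w_reg = sgd(lam=0.1)     # with weight decay

# The penalty biases weights toward zero: a smaller norm trades a
# little bias for reduced variance on new inputs.
print(np.linalg.norm(w_reg), np.linalg.norm(w_plain))
```

The regularized solution has the smaller weight norm; this is the sense in which such regularizers "reduce variance significantly while only minimally increasing bias."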

The book is intended for readers with intermediate knowledge of machine learning. Although the authors explain all required mathematical and statistical concepts, readers should have a basic mathematical background. A companion website provides exercises; however, these are contributed by the open-source community, and as of now only exercises for the linear algebra chapter are available. The book uniquely covers both the fundamentals and the latest concepts of deep learning, and it provides practical considerations and application scenarios in all of its chapters.

More reviews about this item: Amazon, Goodreads