Computing Reviews
Linear algebra and optimization for machine learning
Aggarwal C., Springer International Publishing, New York, NY, 2020. 516 pp. Type: Book (978-3-030-40343-0)
Date Reviewed: Mar 26 2021

This excellent introduction to linear algebra is aimed at readers who want to better understand how machine learning really works. Modern machine learning requires a background in linear algebra, but most books and courses focus on the high-level tools: more “how to use TensorFlow” and less “how does this all really work.” Meanwhile, most linear algebra books do not cover the techniques and perspectives needed for machine learning. This book aims to fill that gap, and does so beautifully. The writing style is rigorous yet crystal clear, captivating, and well motivated.

The book starts by assuming little more than high school math and some prior exposure to Taylor expansions and partial derivatives, but the pace picks up quickly. It expects a mature ability to read and understand equations. The author does not explicitly declare his intended audience, but the book appears suitable for advanced undergraduates or graduate students. It can also be useful for motivated practitioners, but it will require a significant time commitment and the desire to work through the exercises.

The focus on machine learning leads to a somewhat nonstandard order of presentation. For example, solving systems of linear equations does not appear as a section heading until page 68, and I could not find it in the index at all. But by the time this topic is addressed, it can be handled concisely yet completely, building on the material taught earlier in the book. Each chapter has frequent inline problems and ends with a brief summary, a list of further readings, and a very rich set of exercises.
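To give a concrete sense of the topic, here is a minimal NumPy sketch of solving a linear system numerically. This is my own illustration, not an example from the book:

```python
import numpy as np

# A small 2x2 system: 2x + y = 5, x - y = 1.
A = np.array([[2.0, 1.0],
              [1.0, -1.0]])
b = np.array([5.0, 1.0])

# Solves Ax = b via an LU-style factorization rather than
# forming the inverse explicitly.
x = np.linalg.solve(A, b)
print(x)  # → [2. 1.]
```

In practice, libraries solve such systems by factorization, which is both faster and more numerically stable than computing an explicit inverse.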

Chapter 1 serves as an introduction to the rest of the book. It quickly builds up to the basic matrix operations, including multiplication, inversion, norms, polynomials, and geometric operators. It then offers a series of quick teases on how this material will be useful for classic machine learning tasks such as recommender systems, clustering, classification, outlier detection, backpropagation, and so on. Most mentions of a task include a pointer to the later chapter(s) where additional details are provided.
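As a quick illustration of the kinds of basic matrix operations the first chapter covers (again my own sketch, not taken from the book):

```python
import numpy as np

A = np.array([[1.0, 2.0],
              [3.0, 4.0]])

A_inv = np.linalg.inv(A)       # inversion (A is nonsingular here)
product = A @ A_inv            # matrix multiplication
fro = np.linalg.norm(A)        # Frobenius norm: sqrt(1 + 4 + 9 + 16)

print(np.allclose(product, np.eye(2)))  # → True
```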

The next two chapters start a deeper dive into linear systems. Chapter 2 looks not just at vector spaces, decomposition, and multiplication, but also at applications such as wavelets and the discrete cosine and discrete Fourier transforms. Chapter 3 continues with classic linear algebra: eigenvectors and diagonalization.
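The diagonalization material can be summarized in one line of NumPy: a symmetric matrix S factors as V diag(w) Vᵀ, where the columns of V are orthonormal eigenvectors. A small sketch of my own, not from the book:

```python
import numpy as np

# A symmetric matrix is always diagonalizable with an orthonormal basis.
S = np.array([[2.0, 1.0],
              [1.0, 2.0]])

w, V = np.linalg.eigh(S)              # eigenvalues w, eigenvectors in columns of V
reconstructed = V @ np.diag(w) @ V.T  # rebuild S from its spectral decomposition

print(np.allclose(reconstructed, S))  # → True
```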

The next few chapters look at optimization, followed by factorization, similarity, and graphs. Throughout, there is a focus on real machine learning problems and algorithms.
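The optimization chapters center on exactly the kind of gradient-based minimization that underlies model training. As a hedged sketch of the idea (a hand-picked least-squares objective and step size of my own choosing, not an example from the book):

```python
import numpy as np

# Minimize f(x) = ||Ax - b||^2 / 2 by plain gradient descent.
A = np.array([[1.0, 0.0],
              [0.0, 2.0]])
b = np.array([1.0, 2.0])

x = np.zeros(2)
for _ in range(200):
    grad = A.T @ (A @ x - b)  # gradient of the least-squares objective
    x -= 0.1 * grad           # fixed step size, small enough to converge here

print(np.round(x, 3))  # close to the exact solution [1.0, 1.0]
```

Real training loops differ mainly in scale: stochastic gradients, adaptive step sizes, and far more parameters, but the core update is this one.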

This book is not a light read. It is a textbook in every sense of the word. Getting through it will require a good teacher and/or a willingness to solve many of the supplied exercises. But my sense is that the motivated reader will walk away with a rich understanding of modern linear algebra and how it is applied in today’s machine learning systems.

That said, this book was not perfect for me. I came to it with a casual understanding of machine learning and an undergraduate study of linear algebra in my distant past. I was not willing to give the book much more time than I normally give to a book review, and certainly did not seriously tackle any of the exercises. So, though the following criticisms are not really fair, I would have benefited more if the book had done the following: used shading or font to highlight key definitions; offered a somewhat more comprehensive index (the index is about four pages with maybe 300 entries); and offered solutions to some of the exercises.


Reviewer: David Goldfarb. Review #: CR147226 (2107-0173)
Learning (I.2.6 )
Other reviews under "Learning":
Learning in parallel networks: simulating learning in a probabilistic system
Hinton G. (ed) BYTE 10(4): 265-273, 1985. Type: Article
Nov 1 1985
Macro-operators: a weak method for learning
Korf R. Artificial Intelligence 26(1): 35-77, 1985. Type: Article
Feb 1 1986
Inferring (mal) rules from pupils’ protocols
Sleeman D. Progress in artificial intelligence (Orsay, France, 1985). Type: Proceedings
Dec 1 1985
