Computing Reviews
Introduction to statistical machine learning
Sugiyama M., Morgan Kaufmann Publishers Inc., San Francisco, CA, 2016. 534 pp. Type: Book
Date Reviewed: Apr 11 2017

The huge amount of data generated by the growing number of connected computers, mobile devices, and sensors in diverse domains has fueled a boom in machine learning in recent years. Machine learning, the topic of this book, plays a central role in analyzing such data.

The book consists of five parts. The first part gives an overview of machine learning and its different tasks. The second part is concerned with probability and statistics. It is a good review of basic and even more advanced concepts that are then applied in the following parts. However, it is not intended for absolute beginners; following all of the reasoning and mathematical proofs requires an understanding of advanced mathematical analysis and algebra.

Parts 3 and 4 focus on generative and discriminative approaches, respectively, and are probably the two main parts of the book. Other books in this field often categorize methods differently, for example, as supervised versus unsupervised, regression versus classification, or parametric versus non-parametric. All of these views are also covered here; in addition, frequentist versus Bayesian approaches are discussed. This can sometimes be overwhelming, but the concepts are generally very well explained with the help of illustrative figures. Of the many approaches presented in these two parts, I particularly recommend the chapters on regularized and robust regression (chapters 23 to 25). They show how a relatively simple linear model can be extended to handle noisy data with many predictors and outliers, a topic that is sometimes lacking in other books in this field.
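
To make that last point concrete, here is a minimal Python sketch of how a linear model can be extended with regularization and a robust loss. It is not taken from the book, whose code is in MATLAB/Octave. The function names (fit_least_squares, fit_ridge, fit_huber), the parameter values (lam, eta, n_iter), and the synthetic data are all assumptions made for illustration. The robust fit uses the Huber loss, solved by iteratively reweighted least squares, which is one common choice and not necessarily the book's exact formulation.

```python
import numpy as np

# Illustrative sketch only: function names, parameter values, and the
# synthetic data are assumptions, not code from the reviewed book.

def fit_least_squares(X, y):
    """Plain least squares: minimize ||X w - y||^2."""
    return np.linalg.lstsq(X, y, rcond=None)[0]

def fit_ridge(X, y, lam):
    """L2-regularized least squares: minimize ||X w - y||^2 + lam * ||w||^2.
    For simplicity the intercept column is penalized as well."""
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y)

def fit_huber(X, y, eta=1.0, n_iter=100):
    """Huber-loss regression via iteratively reweighted least squares."""
    w = fit_least_squares(X, y)
    for _ in range(n_iter):
        r = np.abs(y - X @ w)
        # Residuals up to eta keep weight 1; larger ones get weight eta / |r|.
        weights = np.where(r <= eta, 1.0, eta / np.maximum(r, 1e-12))
        XtW = X.T * weights          # X^T W without forming the diagonal matrix
        w = np.linalg.solve(XtW @ X, XtW @ y)
    return w

# Synthetic data: y = 1 + 2x plus small noise, with five gross outliers.
rng = np.random.default_rng(0)
x = np.linspace(-3.0, 3.0, 50)
X = np.column_stack([np.ones_like(x), x])
y = 1.0 + 2.0 * x + 0.1 * rng.standard_normal(x.size)
y[-5:] -= 20.0                       # corrupt the five largest-x points

w_ols = fit_least_squares(X, y)
w_ridge = fit_ridge(X, y, lam=10.0)
w_huber = fit_huber(X, y)

print("least squares:", w_ols)
print("ridge:        ", w_ridge)
print("Huber (IRLS): ", w_huber)
```

In this sketch the outliers pull the plain least-squares slope well away from the true value of 2, while the Huber fit stays much closer to it. The ridge penalty shrinks the coefficient vector, which matters most when there are many correlated predictors.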

I also appreciate the last part (“Further Topics”), which presents advanced concepts such as ensemble and multitask learning, outlier and change detection, semi-supervised learning, and even some unsupervised approaches to dimensionality reduction. These include the autoencoders and restricted Boltzmann machines used in deep neural networks, which have grown in popularity in the last few years thanks to their success in various machine learning tasks.

The five parts of the book can be read relatively independently; for example, one can skip the second part if one is familiar with the concepts presented. Also, the approaches presented in Parts 3 and 4 are more or less complementary.

Overall, the book covers an exceptionally wide range of machine learning topics in just over 500 pages, thus giving a very good overview of the state of the art in the field as well as pointers to other sources. It manages to present not only the basic methods, but also advanced and relatively new concepts developed in the last few years. On the other hand, this breadth means that the author cannot go into depth on all of the presented approaches, or give enough practical examples and use cases to aid understanding and show how the approaches are used in real-world applications. Also, some concepts, such as the bias-variance tradeoff and overfitting, could be discussed and explained more thoroughly because they play a crucial role in many practical machine learning problems.

The book is therefore more theoretical, helping readers understand the background mathematics. Pieces of code in MATLAB/Octave are included, but these are aimed at illustrating the mechanics of the presented algorithms rather than their practical use. Also, the author assumes that the reader is familiar with this programming language.

Thus, the expected audience of this book is researchers and graduate students who want an overview of the different machine learning approaches built on good mathematical foundations, as well as practitioners (engineers) who want a deeper understanding of the approaches they plan to use for their specific problems. I would recommend using this book in combination with others that focus more on machine learning applications in R, Python, or Scala because they can nicely complement each other.

Reviewer: M. Bielikova
Review #: CR145185 (1706-0354)
Reviewer-selected categories: Learning (I.2.6); Statistical Computing (G.3 ...)
Reproduction in whole or in part without permission is prohibited.   Copyright 1999-2024 ThinkLoud®