Computing Reviews
Machine learning: an algorithmic perspective (2nd ed.)
Marsland S., Chapman & Hall/CRC, Boca Raton, FL, 2014. 457 pp. Type: Book (978-1-4665-8328-3)
Date Reviewed: Mar 27 2015

The book’s emphasis on algorithms distinguishes it from other books on machine learning (ML), and this emphasis is reinforced by the extensive use of Python code to implement the algorithms. One consequence is that the explanations of an algorithm’s details are occasionally somewhat opaque--for example, in the discussion of back propagation in a neural network--and readers must rely on the code for a fuller understanding.
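To give a flavor of what this code-first style looks like (the following is a generic sketch of back propagation, not the book's own code), a minimal training loop for a two-layer sigmoid network on XOR might be written as follows:

    # Generic sketch: one hidden layer, sigmoid activations, batch gradient descent.
    # Not taken from the book; hyperparameters and seed are arbitrary choices.
    import numpy as np

    def sigmoid(x):
        return 1.0 / (1.0 + np.exp(-x))

    rng = np.random.default_rng(1)
    X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)  # inputs
    t = np.array([[0], [1], [1], [0]], dtype=float)              # XOR targets

    X_b = np.hstack([X, np.ones((4, 1))])    # append a bias input
    V = rng.normal(scale=0.5, size=(3, 4))   # input (+bias) -> hidden weights
    W = rng.normal(scale=0.5, size=(5, 1))   # hidden (+bias) -> output weights
    eta = 0.5                                # learning rate

    for _ in range(20000):
        # forward pass
        h = sigmoid(X_b @ V)
        h_b = np.hstack([h, np.ones((4, 1))])
        y = sigmoid(h_b @ W)
        # backward pass: propagate the output error back to the hidden layer
        delta_out = (y - t) * y * (1 - y)
        delta_hid = (delta_out @ W[:-1].T) * h * (1 - h)
        # gradient-descent weight updates
        W -= eta * h_b.T @ delta_out
        V -= eta * X_b.T @ delta_hid

    print(np.round(y, 2))   # usually close to [[0], [1], [1], [0]]

Seeing the forward and backward passes spelled out in NumPy in this way does make the weight-update equations concrete, which appears to be the author's intent.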

The outline of the book is as follows. Chapter 1 gives an introduction to the field of ML, distinguishing between supervised and unsupervised learning. Chapter 2, “Preliminaries,” describes how to validate and test ML algorithms, and introduces the important statistical notions that underlie the algorithms described in the subsequent text. This is a valuable idea in the presentation, as it puts much material that will be used throughout the book in one place. Chapter 3, “Neurons, Neural Networks, and Linear Discriminants,” describes the biological motivation for neural networks, details the perceptron, and discusses classification when a linear separation is possible. Chapter 4, “The Multi-Layer Perceptron,” discusses multi-layer neural networks and back propagation for weight discovery. Chapter 5, “Radial Basis Functions and Splines,” considers networks whose neurons respond only to inputs close to their centers. Chapter 6, “Dimensionality Reduction,” considers the ways in which a multidimensional space can be reduced to one with fewer dimensions. Chapter 7, “Probabilistic Learning,” describes the important expectation-maximization (EM) algorithm that is used for maximum likelihood estimation. Chapter 8, “Support Vector Machines,” describes the role of kernel functions in finding an optimal separating boundary for classification problems. Chapter 9, “Optimization and Search,” addresses problems such as the traveling salesman problem, where the function to be optimized is discrete rather than differentiable. Chapter 10, “Evolutionary Learning,” returns to biology for motivation and describes genetic algorithms. Chapter 11, “Reinforcement Learning,” describes Markov decision processes and the Q-learning and state-action-reward-state-action (SARSA) algorithms. Chapter 12, “Learning with Trees,” describes decision trees and the iterative dichotomiser 3 (ID3) algorithm. Chapter 13, “Decision by Committee: Ensemble Learning,” describes ways in which different learning systems can be combined to improve performance using boosting and bagging. In Chapter 14, “Unsupervised Learning,” the author introduces the k-means algorithm for clustering and the self-organizing feature map algorithm. Chapter 15, “Markov Chain Monte Carlo (MCMC) Methods,” shows how to generate samples corresponding to a given probability distribution. Chapter 16, “Graphical Models,” introduces Bayesian networks--here, graphical means graph based, not picture based. Chapter 17, “Symmetric Weights and Deep Belief Networks,” introduces the idea of chaining together a number of three-layer networks--deep encoders--for learning. The final chapter, “Gaussian Processes,” describes how Gaussian processes are used to model distributions over functions. Finally, there is an appendix on programming in Python.
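As an illustration of the kind of algorithm the book works through in NumPy (this sketch is mine, not the author's, and simply shows the standard k-means procedure named above for chapter 14):

    # Generic k-means clustering sketch; names and toy data are illustrative only.
    import numpy as np

    def kmeans(data, k, iterations=100, seed=0):
        rng = np.random.default_rng(seed)
        # pick k points at random as the initial cluster centers
        centers = data[rng.choice(len(data), k, replace=False)]
        for _ in range(iterations):
            # assign each point to its nearest center
            dists = np.linalg.norm(data[:, None, :] - centers[None, :, :], axis=2)
            labels = dists.argmin(axis=1)
            # move each center to the mean of the points assigned to it
            new_centers = np.array([
                data[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
                for j in range(k)
            ])
            if np.allclose(new_centers, centers):
                break
            centers = new_centers
        return centers, labels

    # toy usage: two well-separated blobs of points
    rng = np.random.default_rng(1)
    blobs = np.vstack([rng.normal(0, 0.5, (50, 2)), rng.normal(5, 0.5, (50, 2))])
    centers, labels = kmeans(blobs, k=2)
    print(np.round(centers, 1))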

The general approach taken in each chapter is to begin with a brief, informal introduction to the topic. Each algorithm is then described, first with a narrative explanation and pseudocode, and then with an implementation in Python. While the book tries to tread lightly in its use of mathematics, the author does include mathematical justifications when necessary; for example, the mathematical justification of back propagation is given in a concluding section of that chapter. Some readers might prefer to have these details presented earlier, but they can skip ahead if they wish.

Each chapter contains a guide to further reading and, except for the first chapter, a set of exercises based on the material. Some of the exercises require the reader to modify the code given in the book, while others use short examples and require the reader to work through the details of an algorithm. Both of these are valuable ideas. Some of the examples used in the body of the text appear in multiple places, allowing the reader to compare the use of differing techniques on the same problem; this is also done with some of the exercises. All of the code in the book, plus some additional code, is available on the book’s website (https://seat.massey.ac.nz/personal/s.r.marsland/MLBook.html).

Because this is a second edition and some material has been rearranged since the first edition, material that appears later in the book is sometimes referred to as though it came earlier. Fortunately, this does not affect readability. There are also a number of typographical errors. In most cases, the intended correction is clear; in one case, however--the definition of correlation in Equation 2.23--it is not obvious. A more detailed description of how to install the Python modules NumPy and SciPy would also have been helpful.
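For readers tripped up by that equation, the standard (Pearson) definition of correlation--which is presumably what Equation 2.23 intends--is easy to check directly in NumPy; the following is a small sketch of mine, not taken from the book:

    # Standard Pearson correlation, computed from its definition and via NumPy.
    # NumPy and SciPy can typically be installed with: pip install numpy scipy
    import numpy as np

    x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
    y = np.array([2.0, 4.1, 5.9, 8.2, 9.8])

    # rho = cov(x, y) / (std(x) * std(y))
    rho = ((x - x.mean()) * (y - y.mean())).mean() / (x.std() * y.std())
    print(rho)                      # close to 1 for this nearly linear data
    print(np.corrcoef(x, y)[0, 1])  # NumPy's built-in equivalent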

The book covers a large area of ML--although not all of it, as the logic-based approaches of Muggleton [1] and Shapiro [2] are not treated. The topics chosen do reflect the current research areas in ML, and the book can be recommended to those wishing to gain an understanding of the current state of the field.

Reviewer: J. P. E. Hodgson. Review #: CR143292 (1506-0460)
1) Muggleton, S. The MIT encyclopedia of the cognitive sciences (MITECS). MIT Press, Cambridge, MA, 1999.
2) Shapiro, E. Inductive inference of theories from facts. February 1981, http://www.cvc.yale.edu/publications/techreports/tr192.pdf.
Learning (I.2.6)
Applications And Expert Systems (I.2.1)
Reference (A.2)
Other reviews under "Learning":
Learning in parallel networks: simulating learning in a probabilistic system. Hinton G. (ed), BYTE 10(4): 265-273, 1985. Type: Article. Reviewed: Nov 1 1985
Macro-operators: a weak method for learning. Korf R., Artificial Intelligence 26(1): 35-77, 1985. Type: Article. Reviewed: Feb 1 1986
Inferring (mal) rules from pupils’ protocols. Sleeman D., Progress in artificial intelligence, Orsay, France, 1985. Type: Proceedings. Reviewed: Dec 1 1985
