A wise man (reportedly Yogi Berra, but more likely a legislator in the Danish parliament) once observed, “Prediction is difficult, especially of the future.” In spite of the challenge, humanity’s oldest written records attest to the demand for tools of prognostication. Today’s seers have traded the sheep livers of ancient Mesopotamia for computers, and data scientists have taken the place of the priests of Marduk, but to business executives faced with commissioning predictive products, the technology seems no less arcane and inaccessible. In this second edition of a successful 2013 title, Eric Siegel draws back the curtain, providing an easy-to-understand account of what current offerings in predictive analytics can do, the principles that guide their use, and a wide range of concrete case studies. While the 332-page volume assumes no technical background, two online supplements provide 131 pages of notes and references to sources for the cases presented and to technical papers with mathematical details. The size of these supplements, which were integral to the 2013 edition’s 320 pages, is an indication of the amount of new, updated material documenting this fast-moving field. Online resources also include teaching materials to support use of the book in the classroom.
After an introductory chapter, Siegel develops the subject from seven perspectives. Chapter 1 addresses deployment, the decision to entrust real-world decisions with concrete consequences to the conclusions of an electronic seer. Chapter 2 explores the ethical issues involved in the massive collection of personal data and its use in prediction. Siegel recognizes the sensitivity of the subject, but argues that the more data is available, the better the predictions, and the lower the danger from mistaken predictions. Chapter 3 focuses on the data that drives the new algorithms and on some statistical flukes that can lead to misleading conclusions. Chapter 4 provides a clear, nontechnical explanation of decision trees, one of the main algorithms used in the field, and chapter 5 emphasizes the power of ensembles of diverse models. Chapter 6 is an extended review of IBM’s Watson and the Jeopardy challenge. Chapter 7 focuses on the difference between predicting the outcome of an action and comparing that outcome with what would have happened if the action had not been taken, sometimes called uplift or persuasion modeling. Every chapter is amply supported by case studies, and at the center of the book is a collection of 182 summaries of further cases, all documented in the online notes.
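For readers who want the ideas of chapters 4 and 5 in miniature, the following sketch, which is my own illustration and not taken from the book, contrasts a single hand-built decision tree with a small voting ensemble. The churn-prediction task, the features (customer tenure and usage), and every threshold are hypothetical, chosen only to show the mechanics:

```python
# A miniature illustration of decision trees (chapter 4) and ensembles
# (chapter 5), in plain Python. The features (tenure in months, monthly
# usage hours), the thresholds, and the churn task are all hypothetical.

def tree_predict(tenure, usage):
    """A single decision tree: follow one path of yes/no splits from the
    root to a leaf, which holds the prediction (True = will churn)."""
    if tenure < 12:            # split: is this a newer customer?
        return usage < 4.0     # leaf: light-usage newcomers predicted to churn
    else:
        return usage < 1.0     # leaf: veterans churn only if nearly inactive

def ensemble_predict(tenure, usage):
    """An ensemble: several diverse simple models each cast a vote and the
    majority wins, which often outperforms any single member."""
    votes = [
        tree_predict(tenure, usage),  # model 1: the tree above
        usage < 1.0,                  # model 2: a one-rule stump on usage alone
        tenure < 6,                   # model 3: a one-rule stump on tenure alone
    ]
    return sum(votes) >= 2

# A long-tenure, heavy-usage customer: neither model predicts churn.
print(tree_predict(30, 12.0), ensemble_predict(30, 12.0))  # False False
# A brand-new, light-usage customer: both predict churn.
print(tree_predict(2, 0.5), ensemble_predict(2, 0.5))      # True True
```

In practice the splits are learned from data rather than written by hand, and the ensemble's diversity comes from training its members on different samples or features, but the voting logic is exactly this simple.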
This volume is an excellent resource both for nontechnical readers who want some understanding of what predictive analytics can do and for developers of tools and applications in this field seeking to understand the state of practice that others have achieved. It engages the reader in a lively, concrete review of the state of the field, and in a spirit of appreciation, I note three questions that it stimulated in my mind.
First, as the word is commonly used, “prediction” refers to reasoning from current facts to future events that result from those facts. Another broad area of data mining, classification, seeks to label data items based on their similarity to other data items, without any distinction between past and future. Siegel is certainly aware of this distinction, and he labors to justify the inclusion of Watson in his examples. On page 220, he recognizes that “answering questions is not prediction in the conventional sense,” but appeals to an alternative sense of “predict”: “to imperfectly infer an unknown.” This definition broadens the sense of “predict” so far as to make it nearly meaningless, though he goes on to suggest that what Watson is really doing is predicting what a human expert would say about the answer it produces. But page 220 is rather late to address this distinction. Many of the examples in the introduction (for example, grading student essays, detecting distracted drivers, recognizing fraudulent banking transactions), and many throughout the book, are not predictive in any temporal sense. It would be wonderful to have a volume on “nonpredictive analytics” as clear and as well documented as this one.
Second, the methods he considers are all based on statistics, and thus do not engage causality. The current market is focused on the accuracy of prediction, whatever the underlying reasoning, but responsible use of predictive analytics depends on engaging the decision-maker with a causal explanation for the prediction, a requirement that statistical methods cannot address. There are predictive technologies (for example, constructive agent-based simulation or system dynamics) that embody causal theories of the world, and one looks forward to a third edition of the book that touches on some of these less common but very promising approaches and their roles in more intelligent use of prediction by decision-makers.
Third, the book takes an asymmetric view of prediction: prediction is something done by one group to anticipate the actions of another group, and thus to decide how to act toward that other group. This view leads naturally to accuracy as the standard for evaluating predictive algorithms. After learning of these techniques, I might reasonably want a system to predict my own actions and their likely outcomes, so that I could modify my behavior to avoid undesirable consequences. If I change my behavior in light of the prediction, the prediction becomes false, yet it may still have given me very useful insight. How should one evaluate predictive analytics used in settings where their success intrinsically diminishes their accuracy?