Chris J. Lloyd, from the Melbourne Business School in Australia, stresses the important role of data in understanding business outcomes. In his book, you will not find a thorough treatment of machine learning and data mining techniques, nor will you find a conventional treatise on statistics. Instead, you will discover a pragmatic hands-on monograph that discusses key statistical concepts and the use of quantitative methods for solving realistic business problems using commercial off-the-shelf software (Microsoft Excel, in particular). As such, this book is intended for introductory MBA courses on quantitative analysis and undergraduate courses on business statistics. With some additions, it might also serve as the basis for the lab sessions of computer science courses on introductory data mining and business intelligence.
The book is organized into 20 chapters of about 25 pages each, corresponding to twenty 90-minute classes. Each chapter focuses on particular aspects of statistics that are relevant for the data analyst. Rather than explaining theoretical concepts as most textbooks do, each chapter emphasizes particular business problems where the discussed concepts might be useful and discusses how the results of data analysis can influence business decision making. Roughly speaking, the chapters fall somewhere between Web-like tutorials and the case-based approach business schools have popularized.
After a slow start in the first two chapters on truly basic ideas in descriptive statistics, including coefficients of variation and z-scores, the book gradually gets up to speed during five chapters devoted to fundamental probability concepts; it finally reaches cruising speed when it jumps into data analysis techniques. After an aseptic survey of the most common probability distributions (namely, binomial, negative binomial, Poisson, normal, and lognormal), the heart and soul of data-driven business decisions emerges in the chapter on decision trees, where cumulative risk profiles and the value of information are discussed. Subsequent chapters delve into statistical tests, confidence intervals, correlation, regression, and time series analysis; in other words, the typical syllabus of any undergraduate statistics course. Each topic is carefully dissected through examples that illustrate how the different techniques are used in practice, and the author warns of the mistakes you can make when you focus too much on the numbers and lose sight of the business context from which the problem comes.
The book includes plenty of practical cases extracted from real-world scenarios, ranging from the Financial Times business school rankings to the demand for Versace bags and the valuation of real estate assets. Each chapter includes solved cases, “check your understanding” questions, and end-of-chapter exercises that let students apply the concepts they have learned. Exercises and cases involve many common business situations, from analyzing customer satisfaction surveys and marketing campaigns to studying the effectiveness of product placement in supermarkets (statistical tests and confidence intervals), the risks involved in building a financial portfolio (correlations), and the seasonal adjustment of time series.
Raw data, in Excel format, is provided on the accompanying CD-ROM so that students can experiment. They are encouraged to look for insights into the many different problems posed by the business cases throughout the book. The CD-ROM also contains a nonprintable digital version of the whole book, with solutions to the exercises included in each chapter.
Even though this book might not be suitable as the only textbook for an actual computer science course, the case-based approach it borrows from business schools is still worthwhile and exploitable in CS courses. It will allow future computer scientists to approach real-world problems without becoming mired in the theoretical formalisms and algorithmic details that often cause them to miss the forest for the trees.