I continue not to be a statistician but rather a proverbial user of statistics, and therefore (at the risk of self-flattery) a dangerous person. Like many others in science, I was forced (in my case, a half century ago, and of course without preparation) into barely understood advanced statistics and Bayesian methods. The excellent but entirely theoretical (mathematical) book by Hogg and Craig [1], which I continue to recommend for that aspect, served at that time to fill at most one-third of the gap in my insight into statistics. Abramowitz and Stegun’s classic handbook [2], along with frequent use of IBM’s Fortran Subroutine Library, facilitated the creation of programs that computed the usual, and many unusual, statistics from observations, that is, from data: means, variances, correlation coefficients, and ad hoc statistics of interest in a particular area of research. One such area, hot again today, is random-matrix theory, mysteriously shared by quantum mechanics (for example, the Gaussian orthogonal ensemble) and pure mathematics (for example, the Riemann zeta function). The handbook of functions [2] was the practical complement that served more or less toward closing the remaining two-thirds of my insight gap. Although more cohesive industrial-strength tools existed at the time, they were by no means ubiquitous, as was also the case with computers themselves in those middle years of computing. (There is a point to these reminiscences.)

The pendulum was at its opposite extreme two or three decades later. Manifestly and self-consciously practical books and software tools, such as spreadsheets with statistical functions and “For Dummies” books, appeared in some quantity. One would at best catch a glimpse of the theory from the hands-on steps that one was taking when using these resources. I hasten to add that this species of tools and books continues to be useful in its own right, and has a suitable place, my sorely missed idol E. W. Dijkstra to the contrary notwithstanding, as long as that place is not misconstrued to be one of thorough pedagogy or ample treatment of a subject.

This 1,000-page book provides evidence that both statistics tools (here, R) and the excellent instruction required to promulgate them (this R book) have evolved to a highly beneficial outcome, namely the one-stop learning and subsequent reference resource that we all dream about, thus effectively resolving the above-described dichotomy of theory and practice. Though I don’t intend a glorification, hidden or otherwise, of my own lifelong problems with tools (word processors, spreadsheets, and databases, let alone compilers, in all of which show-stoppers rear their heads precisely when and where loss and damage are maximal), I must say that R and *The R Book* have turned me around with respect to (confidence in and courage for) significant tool use. And there is, of course, also the boost to state-of-the-art statistics for a nonstatistician like me.

This book is about three main disciplines: computing, statistics, and the synthesis that is the R language. The categories of intended reader include beginners, students, and persons experienced in none, some, or all three disciplines, with those experienced in all three understood as needing only a complete and well-organized reference manual.

The how-to-use-this-book section is truly helpful to the seven types of reader listed, which span “beginner in both computing and statistics” through persons “familiar with statistics and computing, but need[ing] a friendly reference manual.” Although “the book is structured principally with [the beginner in both computing and statistics] in mind,” there is characteristically useful advice for those, for example, who have “done regression and ANOVA [analysis of variance], but want to learn more advanced statistical modeling”: “the best plan is to go directly to chapters 10-12 to see how the output from linear models is handled by R.” The guidance is crisp and actionable for each of the seven categories of reader.

The topics treated in the 29 chapters include: installing, running, and getting help in R; R language essentials; data input and dataframes; graphics, including shape, size, and multivariable plots; tables, including dataframe conversion; mathematics, including probability distributions, matrix algebra, and calculus; tests, including binomial, chi-squared, Kolmogorov-Smirnov, and bootstrap; statistical modeling; regression and nonlinear regression; ANOVA and analysis of covariance (ANCOVA); generalized linear models; count data (also in tables); proportion data; binary response variables; generalized additive and mixed-effects models; Bayesian statistics; tree models; time series analysis; multivariate statistics; spatial statistics; survival analysis; temporal and spatial simulation; and the “look” of graphics as the final topic.

I dare say that every reader will have favorite chapters or sections. Mine is chapter 22, “Bayesian Statistics,” a species of statistics that is currently experiencing a veritable renaissance [3,4]. The explanations and explications in this short chapter are excellent and in line with those of the rest of the book.

I consider this a must-read book for its content, writing, and organization.
