Review of “R in a Nutshell” by Joseph Adler, O'Reilly Media

Tuesday, December 25, 2012

I review for the O'Reilly Blogger Review Program

Back in grad school, our curriculum involved your standard statistics course. For labs, we used SPSS, one of the industry-standard software packages. SPSS is full-featured, but expensive. The interface also leaves much to be desired, with way too much pointing and clicking to get anything done. At the time, I had heard from professors about a free, command-line-driven software package called R, but the course soon ended and I moved on.

Now, a few years later, I need to perform various statistical analyses, but without the budget of an academic department to pay for SPSS or other visual software. So it’s time to revisit R, which incidentally has become even more popular since I last looked into it. Fortunately, O'Reilly put its R in a Nutshell book on the Blogger Review rotation, so I quickly grabbed it for review. Because it is a reference book, I haven’t read it cover to cover, but after working through the first few parts and several of the later chapters, I can say that the book does a great job of getting you up to speed on how to read the R language, how R organizes and works with datasets, and how to perform analyses and generate visuals.

I should caution up front that this book is specifically about how to operate R, and not about statistics in general. R will dutifully carry out the calculations asked of it, but the operator still needs to know which calculations are appropriate for the data and goals at hand.

That said, R in a Nutshell is well-organized for first-time users of R. The book starts with an overview of the R syntax, data frames, and functions, then moves to the logistics of getting data into and out of R. Often another language will be used to format and prepare the data for analysis (to get an idea of how that process works, O'Reilly’s Exploring Everyday Things with R and Ruby is a good place to start).

There’s also an entire section dedicated to one of R’s specialties, data visualization! The book covers the built-in visualization tools, as well as the popular ggplot2 package. Creating compelling visuals is easily one of the most exciting things about R, so it’s good to see this thoroughly covered. Finally, the book has many chapters covering the various statistics and analyses R can perform.

In short, it’s easy to recommend R in a Nutshell. R is by far the most popular free statistics package, but its command-line format can lead to lots of web searches to track down documentation. This book organizes a lot of that basic information and presents it well, so it is sure to be a big time saver for anyone who is new to R or would just like an easily searchable reference.

Note: I received this book for free through the O'Reilly Blogger Review Program.