Computing Reviews
Machine learning using R
Ramasubramanian K., Singh A., Apress, New York, NY, 2016. 566 pp. Type: Book (978-1-484223-33-8)
Date Reviewed: Oct 26 2017

Karthik Ramasubramanian and Abhishek Singh, experts in data science and business analytics, have written a comprehensive reference book on machine learning using R. This book covers a relevant and hot topic in today’s digital world.

The book has nine chapters and is written with a broad audience in mind, from beginners to experts in machine learning. If you are interested in machine learning and already have a decent background in statistics and R, you can head directly to chapter 6, the core chapter of the book, which is a deep dive into machine learning from a theoretical and practical perspective. You can then continue with the remaining chapters, which focus on machine learning model evaluation, performance improvement, and scalability.

However, if you are a beginner, your best bet is to go sequentially from chapter 1, which covers 101-level concepts, from definitions of machine learning and artificial intelligence to the basics of probability and statistics. For those who are also new to the R language, chapter 1 covers the fundamentals of R and guides them through the basic commands, using the provided code, to gain familiarity with R. This initial chapter also briefly covers the end-to-end machine learning process flow.

Chapters 2 and 3 are particularly useful for data scientists and data analysts. Chapter 2 covers the very important topic of data preparation and exploration, probably the toughest step in a machine learning project. The authors discuss in detail the different data types and the techniques applicable to data preparation and exploration. Chapter 3 focuses on sampling, covering basic concepts as well as detailed probability and non-probability sampling techniques. A credit card fraud dataset is used as a case study to explain, in the most practical manner, the concepts in chapters 2 and 3.

Once the data has been analyzed and cleaned up, chapter 4 covers data visualization using the R package ggplot2. A variety of simple visualizations (line, column, scatter, pie, and so on) and complex ones (heat maps, bubble charts, word clouds, spatial maps, and so on) are discussed in detail, with the code and datasets needed to produce them. In addition to reading about the concepts, readers can try out the examples in R, as the required code is provided.

The next chapter, on feature engineering, aims to help data scientists optimize and select relevant features for machine learning algorithms. As in earlier chapters, the necessary code is provided along with detailed descriptions.

The remaining chapters dive deep into basic and advanced machine learning concepts. Chapter 6 provides real-world example datasets on home sale prices, breast cancer diagnosis, and so on. It also covers various statistical and machine learning techniques, ranging from linear models to correlation and regression techniques, decision trees, and others. Chapter 7 explains how to evaluate machine learning models for continuous and discrete outputs, and covers probabilistic techniques. Once a model has been evaluated, its performance should be improved; chapter 8 focuses on this topic. Scalability is often a key challenge for data scientists and analysts, who must handle huge datasets that arrive as large batch files or as real-time data streams. Chapter 9 discusses techniques that use cloud computing and big data to handle the scalability aspects of a machine learning problem.

All in all, this is a fantastic and commendable effort by the authors to write a comprehensive book on machine learning. They have taken special care to provide complete R software code while discussing machine learning concepts and use cases. While there are plenty of resources on the Internet about machine learning, this book will serve as a single-source reference for both theoretical and practical machine learning leveraging R.

R has a huge user community and changes rapidly. New R packages and features are introduced on a regular basis. While the book covers a variety of code examples, better and more efficient ways of solving machine learning problems may emerge in the future. The reader should keep this in mind and use other resources, available on the Internet, when appropriate. I expect to see more revisions of this book in the future, to keep the discussed concepts and code up to date and relevant.


Reviewer:  Ponmurugarajan Thiyagarajan Review #: CR145622 (1712-0785)
Learning (I.2.6)
Statistical Computing (G.3 ...)
Applications And Expert Systems (I.2.1)
Probability And Statistics (G.3)
