Top positive review
If R is the Ferrari of the data science world, then this book is the nuts-and-bolts manual for carrying out routine but critical operations on data
8 April 2018
First, understand that this book does not cover most models or algorithms. What it covers is all the steps you need to perform on "real life" data to get it ready for your models. It does not teach K-means, regression, bagging, boosting and so on. For that you can use James, Witten, Hastie and Tibshirani's excellent "An Introduction to Statistical Learning". But this Wickham book fills a critical void.
In my personal experience, more than 90% of the time goes into getting the data into a shape where it is ready for your algorithms. This is a far cry from the sterile, cleaned-up data sets used in data science courses. Real-life data is very messy, and this book is clearly written for practitioners who deal with real-life situations. It helped me a lot in getting my data into proper form, which includes joining two disparate data sets on keys, creating new derived columns from existing columns, removing unwanted features, taking care of all those NAs, filtering rows on criteria, and computing aggregates (mean, sum, median etc.) by date or by factor. It also covers a bit of ggplot2 basics, so you can start plotting your data from the word go. The book uses the tidyverse family of packages (especially dplyr), written by the same author. ggplot2 is also written by Wickham. Both packages contain functions that can improve your productivity 10x, and both have become the de facto packages used by most R data scientists.
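To give a flavor of the operations listed above, here is a minimal sketch in dplyr. The two tables and all their column names are hypothetical, made up purely for illustration, and are not taken from the book. It assumes dplyr 1.0 or later is installed.

```r
library(dplyr)

# Hypothetical toy data: every table and column name here is invented.
orders <- tibble(
  order_id    = 1:6,
  customer_id = c(1, 2, 2, 3, 3, 3),
  quantity    = c(2, 1, NA, 4, 3, 5),   # one missing value on purpose
  unit_price  = c(10, 20, 15, 5, 10, 8),
  notes       = c("a", "b", "c", "d", "e", "f")
)

customers <- tibble(
  customer_id = c(1, 2, 3),
  segment     = factor(c("retail", "wholesale", "retail"))
)

summary_by_segment <- orders %>%
  left_join(customers, by = "customer_id") %>%  # join two data sets on a key
  mutate(revenue = quantity * unit_price) %>%   # derive a new column
  select(-notes) %>%                            # remove an unwanted column
  filter(!is.na(revenue)) %>%                   # drop rows where revenue is NA
  group_by(segment) %>%                         # aggregate factor-wise
  summarise(
    mean_revenue   = mean(revenue),
    total_revenue  = sum(revenue),
    median_revenue = median(revenue),
    .groups = "drop"
  )

print(summary_by_segment)
```

Each verb does one job and the pipe chains them in reading order, which is a large part of why this style feels productive on messy data.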
This book helped me immensely with analysing my messy data. It taught me which operations my data needed and how to perform them, using the very useful functions in the dplyr package, which is part of the tidyverse family.
R is a very powerful environment for data analysis. I would call it the Ferrari of the data science world. But although very powerful, it has its own quirks and a learning curve, even for experienced programmers. Packages like dplyr, caret and ggplot2 make your life easier and let you fully harness the horsepower of R.