- Publisher: Shroff Publishers & Distributors Pvt Ltd (1 July 2013)
- Language: English
- ISBN-10: 935110110X
- ISBN-13: 978-9351101109
- Package Dimensions: 23 x 17.8 x 2.4 cm
- Amazon Bestsellers Rank: #2,67,551 in Books
Hadoop Beginner's Guide Paperback – 1 Jul 2013
Most helpful customer reviews on Amazon.com
The explanations are pretty good.
But when someone is attempting to learn a system, one of the most frustrating things is to have a lot of errors in the examples.
If the author had just typed in the examples to see whether they worked, they would have seen that at least 10% of the example commands are wrong and/or incomplete.
It wouldn't be so bad if I already knew Hadoop. The errors would be obvious then, because once I do Google searches to figure out what the author meant to display, the mistake is very obvious.
But I bought this book to learn the product, which means I'm left hanging every time there is an error.
I suggest you wait for the next version, so maybe they will do some editing and get it right.
In the meantime, get a different book on Hadoop.
I started working through the book. The first chapter is a great, short and sweet introduction to Hadoop, MapReduce and HDFS. It's not too lengthy, just the right length for a person like me who gets bored pretty quickly with a lot of information. From Chapter 2, the real hands-on work started. Although the instructions are mainly for Ubuntu Linux, they worked like a charm on a Mac. I didn't have any of the issues that other reviewers mentioned. I couldn't use the exact command given in the book to run the first example, the Pi calculation, simply because I was using the latest version of Hadoop, so the jar name was different. That's not something the author can fix. Once I gave the correct jar name, it worked like a charm. I'm now moving along through the other chapters very quickly. It's a really good book if you just wanna familiarize yourself with the product, which is exactly what the book is about. It's well written and easy to follow. The author doesn't try to bore you to death.
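For anyone hitting the same jar-name mismatch, here is a minimal sketch of one way to work around it: look the examples jar up by pattern instead of hard-coding the version. The two directory layouts in the code are assumptions about typical Hadoop 1.x and 2.x tarball installs, and the function names `find_examples_jar` and `pi_command` are made up for illustration. They are not from the book.

```python
import glob
import os


def find_examples_jar(hadoop_home):
    """Return the path of an examples jar under hadoop_home, or None if none is found."""
    # Assumed layouts (check your own install):
    #   Hadoop 1.x: <home>/hadoop-examples-<version>.jar
    #   Hadoop 2.x: <home>/share/hadoop/mapreduce/hadoop-mapreduce-examples-<version>.jar
    patterns = [
        os.path.join(hadoop_home, "hadoop-examples-*.jar"),
        os.path.join(hadoop_home, "share", "hadoop", "mapreduce",
                     "hadoop-mapreduce-examples-*.jar"),
    ]
    for pattern in patterns:
        matches = sorted(glob.glob(pattern))
        if matches:
            return matches[0]
    return None


def pi_command(hadoop_home, maps=4, samples=1000):
    """Build the argument list for the Pi example without running it."""
    jar = find_examples_jar(hadoop_home)
    if jar is None:
        raise FileNotFoundError("no examples jar found under " + hadoop_home)
    # Note the lowercase "hadoop": the command name is case-sensitive on Linux.
    return ["hadoop", "jar", jar, "pi", str(maps), str(samples)]
```

The list returned by `pi_command` can be passed to `subprocess.run` on a machine where the `hadoop` command is on the `PATH`. Building the command separately from running it makes it easy to print and check the jar path first.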
I recommend reading the book below if you first wanna understand what Big Data is all about and see what type of issues it's trying to solve.
But the book is filled with an irritating number of errors. For example, there are many places where it shows that you run the hadoop command with the keyword "Hadoop". No, it's "hadoop". Linux is of course case-sensitive. Data files (at least in the Kindle version) have missing tabs, so you see this:
And you have to figure out that it should be:
These are just two of many examples. There are also lots of typos in the code (e.g., a missing letter in a Java import, or an extraneous space), and in at least one case, a couple of missing lines. If you are totally brand-new to Hadoop, this may throw you. I had to spend some time googling some things, and getting other real-world examples, to figure out what was wrong. But if you are familiar with Linux, and you approach this book with a think-on-your-feet attitude, it can be a great source to get started.
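The missing-tab problem in the data files can be caught before a job runs. The following is a minimal sketch, assuming the file is meant to be tab-separated with a known number of fields per line. The helper name `lines_with_wrong_field_count` and the sample records are hypothetical and are not taken from the book.

```python
def lines_with_wrong_field_count(lines, expected_fields, delimiter="\t"):
    """Return (line_number, text) pairs for lines that do not split into expected_fields."""
    bad = []
    for number, line in enumerate(lines, start=1):
        text = line.rstrip("\n")
        # A line whose tabs were lost in typesetting splits into too few fields.
        if len(text.split(delimiter)) != expected_fields:
            bad.append((number, text))
    return bad


# Hypothetical two-field records; the second one lost its tab.
sample = ["alpha\t1\n", "beta 2\n", "gamma\t3\n"]
print(lines_with_wrong_field_count(sample, expected_fields=2))
```

Any line the check reports still has to be fixed by hand, since a space inside a field cannot be told apart from a space that replaced a tab.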
There are also too many examples of twisted syntax that can lead to different, even contradictory, interpretations. A good editor would have been really helpful here.
One disappointing omission is that the book mentions HBase, briefly describes what it is, and then says nothing more than that the book won't cover it. In my opinion, a beginner's guide to Hadoop should devote an entire chapter to HBase. But that's the only omission I can cite.
Be aware that this book covers version 1.0.4 of Hadoop. Things have changed a fair amount since then. So you should get version 1.0.4 if you're going to follow the examples in the book, but of course if you work with version 2, there will be other things to learn.
Finally, and this is an observation rather than a criticism, it should be kept in mind that this is truly a beginner's guide. There is much else to learn. For example, the book presents mostly the straightforward, typical way of reading and writing data using the MapReduce APIs. There are others. The book does have one example of how to process two input files, but this example only points out that there is much in the rich Hadoop API to be explored. Another example is that the pre-0.20 version of Hadoop had a different basic API, and it is still present in the code. The book touches on this, but doesn't have any comprehensive "what's changed" coverage.
I also recommend, before reading this book, downloading Hortonworks' Hadoop sandbox, and going through the tutorials included in it. It's brain-dead simple to download their VM image and run it.
I gave the book four stars because it has been very helpful. If the errors were all fixed, I'd raise it to five stars.