- Paperback: 378 pages
- Publisher: Morgan Kaufmann (26 November 2014)
- Language: English
- ISBN-10: 012802044X
- ISBN-13: 978-0128020449
- Product Dimensions: 19 x 2.2 x 23.4 cm
- Amazon Bestsellers Rank: #3,31,302 in Books
Data Architecture: A Primer for the Data Scientist: Big Data, Data Warehouse and Data Vault Paperback – 26 Nov 2014
About the Author
Best known as the “Father of Data Warehousing,” Bill Inmon has become the most prolific and well-known author worldwide in the big data analysis, data warehousing and business intelligence arena. In addition to authoring more than 50 books and 650 articles, Bill has been a monthly columnist with the Business Intelligence Network, EIM Institute and Data Management Review. In 2007, Bill was named by Computerworld as one of the “Ten IT People Who Mattered in the Last 40 Years” of the computer profession. Having 35 years of experience in database technology and data warehouse design, he is known globally for his seminars on developing data warehouses and information architectures. Bill has been a keynote speaker in demand for numerous computing associations, industry conferences and trade shows. Bill Inmon also has an extensive entrepreneurial background: he founded Pine Cone Systems, later named Ambeo, in 1995, and founded and took public Prism Solutions in 1991. Bill consults with a large number of Fortune 1000 clients and leading IT executives on Data Warehousing, Business Intelligence, and Database Management, offering data warehouse design and database management services, as well as producing methodologies and technologies that advance the enterprise architectures of large and small organizations worldwide. He has worked for American Management Systems and Coopers & Lybrand. Bill received his Bachelor of Science degree in Mathematics from Yale University, and his Master of Science degree in Computer Science from New Mexico State University.
Most helpful customer reviews on Amazon.com
Shows data architecture from the perspective of different ways of working with data. Covers the differences between the original definitions of the terms repetitive and non-repetitive data. Best practices for Data Vault!
1. Content is universally basic and rarely insightful.
2. Diagrams take up a LOT of real-estate, and they are a JOKE!
3. Textual references to diagrams are not at all helpful: no "interpretation" is added to the text. For example: "Figure 3.3.1 shows that storing historical data has always been a challenge". I kid you not, that is how almost every diagram is referenced.
I bought this book for insight, but can't say I found it. Save time and money and don't buy this book, UNLESS you are looking for something at an extremely introductory level.
The problem with a primer is that the authors don't have to justify, exemplify or detail anything. Things are presented as they are and you are given no room to make a choice. It's not even take it or leave it, it's only take it. I mean, most of the things look correct if you apply them and happen to be in a situation where they fit. If your situation doesn't fit, you have no escape. A primer should present only clear, simple concepts that are recognized throughout the community, and ALL the concepts pertinent to the title. Imagine a data warehouse book where slowly changing dimensions are not mentioned, nor bitemporality, CWM or metamodels. OLAP is only mentioned in the glossary. Imagine a data architecture book where the words cartesian, constraints, enumeration or domain are not used. Even conceptual model is not used in its standard meaning. Those are cues that not all the territory is covered.
I would not recommend this book to a university student, a data professional or a data scientist. Just look at the glossary to convince yourself. A data model is defined as "an abstraction of data". DW 2.0 is defined as "the second-generation data warehouse architecture". MapReduce is defined as "a language for processing Big Data". A relational model is defined as "a form of data where data is normalized". Even Wikipedia can do better than that. Why put terms in a glossary if they are imprecisely defined and/or do not help to place them in the context of the book's subject? It leaves a bad taste for the rest of the book (the semantics may be loose and imprecise, with many shortcuts and confusions).
This book tries to cover a lot of technologies in very few pages. A very large part is dedicated to Data Vault and it is, as usual, somewhat self-promoting. However, it could be the best book on Data Vault as far as I know.
I recommend that you skip right over the topics you already know and those that aren't the main subject, because the book presents a limited understanding of those topics: data governance, SDLC, CMMI, TQM, methodologies, Sarbanes-Oxley Act, Agile, Analytics, etc. They seem to be there in an attempt to cover all the topics, but it's not convincing. In my opinion, those topics don't have their place in a technical primer.
There is no bibliography at all, which is not very good for a primer that is supposed to introduce you to a topic and guide you to more detailed information if you need it. It's disturbing that this topic doesn't have any scientific paper or serious monograph to refer to. Hey, those guys are geniuses, they don't need it; I'm sorry, but even Albert Einstein made mistakes that were corrected by means of reviewed scientific publications.
In summary, it is not a primer; it is closer to a Data Vault cookbook in a data warehouse environment, with an extension on unstructured data that is not bad. Really, it looks like the most mature book on Data Vault, but you'll have to clean the place, make your own experiments and check the coherence before applying it in a major project. Buy the book, discard some sections, put in your own bookmarks, strike through the parts that are unproved or wrong, rephrase and fill the book with your experimentation notes.