Computing Reviews
Data architecture : a primer for the data scientist (2nd ed.)
Inmon W., Linstedt D., Levins M., Academic Press, Inc., San Diego, CA, 2019. 431 pp. Type: Book (978-0-128169-16-2)
Date Reviewed: Jul 31 2020

This book reads like a line-edited transcript consolidated from one or more slide-based seminars: it provides many illustrations that would otherwise have been the focus of a presenter’s lecture and interactive session. However, the printed transcription loses the immediacy and opportunity for feedback of a live seminar. In a static context, the many illustrations seem at times to add very little to the written text. The exposition also appears uneven: at times the content aims at elementary plainness; at others, it expects a deeper background from the reader.

These and other complications may stem from the fusion of disjointed seminars or other source material, perhaps overlapping in scope, consolidated with limited content editing. For instance, the book sometimes expects tolerance for, or even interest in, the repetitious exposition useful for drilling new, basic, and unfamiliar concepts into data science academy students with a weak science, technology, engineering, and math (STEM) background. At other times, it expects a clear and unequivocal understanding of the differences between an assembler, a compiler, and an interpreter, or of more abstract concepts such as the meaning of normal forms in the context of entity-relationship data storage.

As an introductory textbook, it lacks end-of-chapter exercises or even a minimal bibliography, and its glossary is at best approximate. As a means of introducing data science to a lay, business-oriented audience, it meets the goal of covering, in some form, many of the field’s buzzwords; however, it does so with a level of technical detail that such an audience could find unnecessary.

Overall, this work remains confusing in its intent. The content is organized as follows. Chapters 1 through 4 (approximately 30 percent of the page count) drill the reader on the meaning of structured and unstructured data, whether repetitive or nonrepetitive. These chapters include historical views of the progress leading to the current state of data storage for conventional (relational database management system, RDBMS) and, later, big data environments.

Chapters 5 and 6 tackle the business value of data. Chapter 6, in particular, extensively promotes data vault as an exemplary conceptual means for back-end data persistence. Chapters 7 and 8 address concepts associated with persisting data, starting from response time and covering entity relationship, data warehousing, and big data, as well as high-level perspectives on data modeling and architecture, including a historical overview.

Chapters 1 through 8 lack a cohesiveness that is better realized from chapter 9 onward. Chapters 9 through 13 (approximately 25 percent of the page count) introduce a conceptual framework for data analytics. The presentation is partitioned between the analysis of repetitive data (for example, time series) and that of nonrepetitive data (for example, email). Discussion of the former (chapter 9) includes an informal drill on elementary statistics concepts such as correlation. It also covers the difference between data and metadata, the sensitivity of information, different procedural intents such as filtering (selecting certain data from a whole) or distilling (synthesizing new information from an existing corpus), the usage of metrics, and archival means. Discussion of the latter (chapter 10) drills on disambiguation and contextualization, for instance, the use (possibly in combination) of taxonomies, ontologies, homographic resolution, tagging, and proximity analysis. There are examples from call center data and medical records.

Chapters 11 through 13 focus on response time, processing mechanisms, storage, and analytical results. They emphasize the multiplicity of processing environments and discuss the match between the organization of persistence (warehousing, data marts) and the intended purpose of analytical results, for instance, corporate decision-making or individual processing.

The last five chapters collectively make up 14 percent of the page count. Chapter 14 provides an overview of the need and means to integrate application data from sundry sources into a corporate data model. Chapter 15 drills on the concept of the system of record. Chapter 16 is a summary of the business value of data, and chapter 17 is a summary of the challenge presented by textual data. Chapter 18 is a very high-level review of data visualization.

The entire presentation appears intent on providing contextual information about, rather than insight into, the mechanisms of data science. It prefers top-down explanations rooted in the business-oriented reasons for data management. There are high-level conceptual overviews of technical aspects, yet additional details sprout unevenly here and there. When that happens, the material is at best, and perhaps necessarily, approximate. This book leaves an overall uneven impression, aiming perhaps at the textbook market for a code academy.


Reviewer: A. Squassabia
Review #: CR147028 (2101-0003)
Data Warehouse And Repository (H.2.7 ... )
General (H.4.0)
 
Reproduction in whole or in part without permission is prohibited.   Copyright 1999-2024 ThinkLoud®