Computing Reviews
Joe Celko’s complete guide to NoSQL : what every SQL professional needs to know about non-relational databases
Celko J., Morgan Kaufmann Publishers Inc., San Francisco, CA, 2013. 244 pp. Type: Book (978-0-124071-92-6)
Date Reviewed: Mar 4 2014

Most contemporary database professionals have been educated on relational models and products; however, most lack experience with legacy data systems, and many have little experience with the latest non-relational approaches.

In this book, the author tries to address these gaps. With new non-tabular data, there is a clear need for complementary views on data management. The book takes an eclectic view of non-relational data organizations, commonly known as NoSQL (“not only SQL”).

The book is organized into 13 short chapters covering a number of heterogeneous NoSQL topics without concentrating on a specific NoSQL product. While the book can hardly be considered a complete guide to NoSQL, since it is neither comprehensive nor lengthy, it does offer useful information on non-relational data management issues. The content is not sufficient to provide a deep understanding of the new mindset, but the references provided at the end of each chapter supply links for further study.

Chapter 1 recounts the development of the transactional world with a review of batch transaction processing; traditional issues of concurrency control; and atomicity, consistency, isolation, and durability (ACID). The author notes that traditional SQL relational database management systems (RDBMS) are not appropriate for every situation. In chapter 2, Celko introduces columnar databases, which operate on columns rather than on rows, as the relational model does. In the third chapter, the author moves on to graph databases, with a discussion of their lack of standards. He describes a few graph languages, such as SPARQL, SPASQL, Gremlin, and Cypher. Chapter 4 discusses the MapReduce model, a widely used approach introduced by Google and adopted by Yahoo! for their big data implementations.

Chapter 5 explores other aspects of NoSQL, including streaming databases, optimistic concurrency, and complex event processing. The chapter mentions a few commercial stream-oriented products, such as StreamBase and Kx, but notes that no general standard exists in this area. Chapter 6 focuses on key-value stores with an informal discussion on handling keys and values. The chapter also discusses a number of products that emerged in 2013.

Chapter 7 features textbases, making the case that a lot of information is contained in unstructured text as opposed to structured data. The author introduces text mining and the issue of syntax versus semantics. The chapter ends with a reminder that meaning can be linked to domain-specific vocabulary, as exemplified in legal applications that use LexisNexis, or a medical domain implemented in the IBM Watson AI computing system.

Chapter 8 covers geographical data, illustrated with examples of postal code data from the US, Canada, and Great Britain. In chapter 9, the author presents the inevitable discussion of big data and cloud computing.

Chapter 10 changes direction completely with the introduction of biometrics, fingerprints, and specialized DNA databases. In chapter 11, the author transitions to analytic databases with an overview of the different online analytical processing (OLAP) approaches. Chapter 12 is a brief overview of multi-valued and non-first normal form (NFNF) databases, a somewhat obscure topic for the traditional relational database practitioner.

The book concludes with a reminder that a lot of data is still stored in legacy systems such as IBM's Information Management System (IMS), the Integrated Database Management System (IDMS), or other pre-relational navigational technologies. While the topic has been dropped from most database management textbooks, the real world demonstrates that these implementations have been very resilient and continue to hold their place in the data management landscape.

The book summarizes various NoSQL topics to acquaint readers with both old and new data management issues outside the realm of the relational framework. The content would be more accessible with better organization, but the book is very timely. Data management novices might struggle to find the relevant nuggets here, but an open-minded seasoned data professional should be able to extrapolate the new material from the heterogeneous coverage.

Aside from weak editing and the errors that remain as a result, the book evinces the author’s experience and knowledge. I found it thought-provoking and believe that it has a place on the data manager’s bookshelf.

Reviewer: Jean-Pierre Kuilboer
Review #: CR142059 (1406-0406)
Reviewer-selected categories: Distributed Databases (H.2.4 ... ), Query Processing (H.2.4 ... ), SQL (H.2.3 ... )

Reproduction in whole or in part without permission is prohibited.   Copyright 1999-2024 ThinkLoud®