Computing Reviews
Practical Cassandra: a developer’s approach
Bradberry R., Lubow E., Addison-Wesley Professional, Upper Saddle River, NJ, 2014. 208 pp. Type: Book (978-0-321933-94-2)
Date Reviewed: Feb 12 2015

The book aims to be a quick guide for developers, covering installation, management, and application development in Cassandra, a popular, massively scalable open-source NoSQL database. It describes all the steps needed to start and run a Cassandra cluster, including installation, performance monitoring, tuning, and troubleshooting, which makes it useful for beginners and intermediate-level developers. The strength of this book is its brevity. It gives a succinct overview of how to start a Cassandra database instance so that developers can build Cassandra-based applications, and how to manage and administer the database to support those applications. Although simplified, the treatment is comprehensive: it contains the essential details (commands and sample code) to aid understanding, and includes industry case studies and enterprise stacks where Cassandra databases are deployed.

It would improve the organization of the book if the chapters were in a different order, as some terms are used without a proper introduction. A better logical grouping would make the book more readable. It would also be desirable to improve the indexing of the technical terms, which would make the book more searchable.

The book presents the material in the order in which a developer would advance for development purposes, but it would be preferable to present Cassandra’s basics and the data and query model (chapters 1, 3, and 4) first, followed by installation and deployment (chapters 2, 5, and 11). Sample application drivers/connectors and use cases for development (chapters 9 and 12) should follow next, and finally database troubleshooting, performance monitoring, tuning, and maintenance (chapters 10, 6, 7, and 8) should round out the coverage.

Regarding Cassandra data and query modeling, chapter 1 describes the history, the basic terminology, and the main characteristics of Cassandra, such as the ColumnFamily schema, dynamically expandable columns, decentralized processing nodes, data replication, and tunable consistency. Chapter 3 presents the Cassandra data model and compares it with a relational database management system (RDBMS) data model. It illustrates how one row dynamically adds new columns using composite primary keys, creating a dynamic table in Cassandra. It shows that Cassandra offers a COUNTER type for aggregate metrics and collection data types such as SET, ordered LIST, and key-value MAP, which make it easy to model denormalized tables that can give the fastest retrieval for an application-specific query model. Chapter 4 introduces the Cassandra query language (CQL), which resembles the structured query language (SQL), to create tables and to manipulate static and dynamic tables. A wide row in a dynamic table can provide very fast access to all of the data that shares a partition key.
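To make the wide-row idea concrete, the following is a minimal sketch in Python rather than CQL, and it is not taken from the book. The class name `WideRowTable`, its methods, and the sensor data are hypothetical. It assumes a composite primary key whose first component is the partition key and whose second component is a clustering key that orders the cells within the partition.

```python
class WideRowTable:
    """Toy in-memory model of a table with a composite primary key.

    The first key component is the partition key; the second is a
    clustering key that orders the cells inside the partition.
    """

    def __init__(self):
        # partition key -> {clustering key: value}
        self._partitions = {}

    def insert(self, partition_key, clustering_key, value):
        # Writing an existing (partition, clustering) pair overwrites the
        # old value; a new clustering key adds one more cell to the row.
        self._partitions.setdefault(partition_key, {})[clustering_key] = value

    def read_partition(self, partition_key):
        # One lookup by partition key returns every cell of the wide row.
        # The toy sorts on read; a real store keeps the cells ordered.
        return sorted(self._partitions.get(partition_key, {}).items())


events = WideRowTable()
events.insert("sensor-1", "2015-02-12T10:00", 21.5)
events.insert("sensor-1", "2015-02-12T09:00", 20.9)
events.insert("sensor-2", "2015-02-12T09:30", 18.2)
print(events.read_partition("sensor-1"))
```

The point of the sketch is that the row for `sensor-1` grows with every insert, while a read for that partition key never touches the data of `sensor-2`.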

Installation and deployment are next. Chapter 2 goes through the installation steps, including Cassandra package download, installation and basic configuration, setting up file directories, and configuring Cassandra for a multi-node cluster. Chapter 5 addresses questions related to Cassandra deployment, such as how many nodes the data is replicated to and where the replicas are stored. It covers replication strategies; placement decisions; snitches, which map Internet Protocol (IP) addresses to racks and data centers; partitioners, which decide where each row is placed in the cluster; and platform decisions (for example, cloud platforms). Chapter 11 contains the overall architecture and related components.
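The interplay between a partitioner and a replication strategy can be illustrated with a short Python sketch. It is a deliberate simplification and not the book's code: the `Ring` class, the `token` helper, and the node names are hypothetical, each node owns a single token, MD5 stands in for the partitioner's hash function, and racks and data centers (the snitch's concern) are ignored. The first replica is the node whose token is the first at or after the key's token, and the remaining replicas are the next nodes clockwise around the ring.

```python
import hashlib
from bisect import bisect_left


def token(value: str) -> int:
    """Map a string to a position on the ring (MD5 is an assumption here)."""
    return int(hashlib.md5(value.encode("utf-8")).hexdigest(), 16)


class Ring:
    """Toy token ring with one token per node and no rack awareness."""

    def __init__(self, nodes, replication_factor):
        if not 1 <= replication_factor <= len(nodes):
            raise ValueError("replication factor must be between 1 and the node count")
        self.replication_factor = replication_factor
        # Sort the nodes by token so the ring can be searched with bisect.
        self._ring = sorted((token(node), node) for node in nodes)
        self._tokens = [t for t, _ in self._ring]

    def replicas(self, partition_key: str):
        # First node whose token is >= the key's token, wrapping to index 0.
        start = bisect_left(self._tokens, token(partition_key)) % len(self._ring)
        # Walk clockwise to collect the remaining replicas.
        return [
            self._ring[(start + i) % len(self._ring)][1]
            for i in range(self.replication_factor)
        ]


ring = Ring(["node-a", "node-b", "node-c", "node-d"], replication_factor=3)
print(ring.replicas("sensor-1"))
```

Every client that hashes the same key arrives at the same replica list, which is why no central coordinator is needed to locate a row.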

Chapter 9, on application development, illustrates how different application languages can connect to Cassandra and manipulate data (for example, create ColumnFamily, insert data, retrieve, delete). This chapter should be followed by chapter 12, which includes different use cases that show how different applications use Cassandra.

Chapters 6 through 8 cover tuning and maintenance. They deal with performance tuning, monitoring, maintenance tools, and options. The performance tuning options in chapter 6 include timeout values for a coordinator node, restreaming between nodes, and cross-node communication; CommitLog synchronization set-up; MemTable threshold adjustments; concurrency setups for reads and writes; caching options; compression types; memory/swap settings; Java Virtual Machine (JVM) tuning; and so on. Chapter 7 focuses on a command-line management tool called nodetool and its commands and options. Many nodetool commands are available to view node, ring, ColumnFamily, and thread pool information and statistics, while others are related to compaction strategies, backup, restore, and archiving of Cassandra databases. Chapter 8 discusses system health checks, using JConsole to monitor Managed Beans (MBeans) and Nagios to monitor all levels of the system, specifically port checks, log monitoring, and Java Management Extensions (JMX) checks such as read/write latency, garbage collection timing, capacity checks, heap usage, and memory. Chapter 10 discusses some common problems such as slow reads/fast writes, freezing nodes, and large cache size or MemTable size issues, and presents troubleshooting tools.

The appendices provide pointers to useful resources such as examples of sample code, user groups, and Enterprise Cassandra systems. As mentioned, the index seems a bit inadequate. Since terms are not introduced in a logical sequential manner, it is important to have a good index where one can quickly find the related pages for a term.

In summary, the book is a succinct introduction to the Cassandra NoSQL database system that does not overwhelm the reader. I would recommend this book for database administrators who need to monitor and maintain a Cassandra cluster, as well as for programmers who would like to start using Cassandra for application development. It assumes some background knowledge about databases, such as knowledge of RDBMS technology, and some familiarity with Unix or Linux environments. The book will also be a good introductory text for students to experiment with application development with Cassandra, either on a standalone machine or on a cloud-based cluster.


Reviewer:  Soon Ae Chun Review #: CR143184 (1505-0350)
Query Languages (H.2.3 ... )
 
 
Reference (A.2 )
 
