Computing Reviews
Real-time analytics: techniques to analyze and visualize streaming data
Ellis B., Wiley Publishing, Indianapolis, IN, 2014. 432 pp. Type: Book (978-1-118837-91-7)
Date Reviewed: Jan 11 2016

Analyzing and acting on streaming data is a challenging task, even compared to batch processing of big data, with its ecosystem of tools. The book starts with an overview of how data streaming tools evolved and what motivated them, giving readers a good sense of why it is important to know their data sources while staying agile and prepared for future changes in user demands.

In the opening chapter, we get acquainted with many of the challenges of handling massive, irregular, and not-always-so-structured streams of data, and with the limits that short time windows place on how much analysis is possible. The goal of this book is to provide the reader with the ability to implement a streaming data project from start to finish; readers are guided through the area from a bird’s-eye view to actual code examples. After reading this book, the reader will be well prepared for the task, understanding the big picture and having a good idea of what approach is needed.

The author gives the reader a broad view of the development of real-time applications, including the various components and frameworks. The book thoroughly covers components for real-time architecture; how to collect streaming data while balancing flexibility and performance; compression; and all of the steps from getting streaming data, to processing, storing, and delivering it, to analysis and visualization. Fundamental concepts and challenges are pedagogically presented, giving the reader a good understanding of the motivations behind the architectures and what choices to make. The overall demands of a well-performing real-time architecture, such as high availability, replication, low latency, and horizontal scalability, are well covered and explained. A real-time architecture checklist is helpfully added for the reader to walk through, covering collection, data flow (Flume if integrating with preexisting systems, Kafka if retrofitting or building new), processing (Samza for YARN veterans, or else Storm), storage (Redis, Cassandra), and delivery. Once the reasoning behind a functional streaming architecture is well covered, the author gives a good overview of analysis and visualization, including a good number of examples.
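To make the checklist concrete, here is a minimal sketch of it as a small decision helper. It is this review's own illustration, not code from the book: the names `Stack` and `recommend_stack` and the two boolean inputs are hypothetical, and the mapping only restates the rules of thumb summarized above.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Stack:
    """Hypothetical container for one pass through the checklist."""
    data_flow: str
    processing: str
    storage: Tuple[str, ...]


def recommend_stack(integrating_with_existing_systems: bool,
                    yarn_experience: bool) -> Stack:
    """Restate the checklist's rules of thumb as code (illustrative only)."""
    # Data flow: Flume when integrating with preexisting systems,
    # Kafka when retrofitting or building new.
    data_flow = "Flume" if integrating_with_existing_systems else "Kafka"
    # Processing: Samza for teams with YARN experience, otherwise Storm.
    processing = "Samza" if yarn_experience else "Storm"
    # Storage: the review names both options without a rule for choosing.
    storage = ("Redis", "Cassandra")
    return Stack(data_flow, processing, storage)


if __name__ == "__main__":
    # A new system built by a team without YARN experience.
    print(recommend_stack(integrating_with_existing_systems=False,
                          yarn_experience=False))
```

A real project would weigh many more factors (latency targets, operational experience, existing infrastructure); the sketch only shows how compact the checklist's core decisions are.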

Overall, the book covers the topic very well, giving the reader a solid base to build on. One way to approach this fairly complex area is to read the overview texts in the book first, then dive deeper into the selected frameworks, and finally follow the code examples to get a real feel for what it takes to create a full value chain.

The book fails to mention Apache Spark and Apache Flink and their streaming capabilities; hopefully, this will be addressed in future editions. The coding examples in the book will become less interesting, and may not keep working, in the near future (the book is from 2014; much has changed since then), but they will remain useful for giving the reader an idea of what is needed to put these workflows in place. A supporting website for the book helps mitigate this. The visualization area especially is evolving rapidly, but analysis packages are also adjusting quickly to the challenges of handling big data (for example, Apache Spark and other frameworks and their machine learning libraries).

To summarize, the book is well written, with a good pedagogical approach. It’s a good starting point, and will help newcomers to this area (and fill knowledge gaps for the more experienced reader).


Reviewer:  Aake Edlund Review #: CR144089 (1604-0227)
Data Communications (C.2.0 ... )
 
 
Data-Flow Architectures (C.1.3 ... )
Other reviews under "Data Communications": Date
Communications formulas & algorithms for systems analysis and design
Rorabaugh C., McGraw-Hill, Inc., New York, NY, 1990. Type: Book (9780070536449)
Feb 1 1992
Telecommunications for management
Meadow C. (ed), Tedesco A., McGraw-Hill, Inc., New York, NY, 1984. Type: Book (9780070411982)
Jan 1 1985
After the breakup
Crandall R., The Brookings Institution, Washington, DC, 1991. Type: Book (9780815716068)
Jun 1 1992
Reproduction in whole or in part without permission is prohibited.   Copyright 1999-2024 ThinkLoud®