Computing Reviews
Data management in machine learning systems
Boehm M., Kumar A., Yang J., Morgan & Claypool Publishers, San Rafael, CA, 2019. 174 pp. Type: Book (978-1-681734-96-5)
Date Reviewed: Jan 10 2020

Supervised machine learning (ML) systems need labeled datasets for training and testing models, and unsupervised ML systems need datasets for identifying hidden patterns. If the datasets for an application can be generated from an existing database, two approaches are possible: database systems can be modified to incorporate a learning environment, or an ML environment can be extended to incorporate a database system. This book offers interesting insights in this direction, surveying the major initiatives in this narrow domain.

The book is organized into nine chapters. The first chapter presents the motivation for the idea and the scope of the discussion. The second and third chapters discuss how ML features can be realized in database systems, including algorithms and learning over joins. Chapters 4 through 7 deal with database-integrated ML systems, covering logical and physical operator selection, execution strategies for such ML systems, memory management (including parallel architectures), and the cloud-based deployment of heterogeneous resources. The penultimate chapter surveys other tasks in the ML life cycle, ranging from data sourcing and data preparation to model selection and model deployment. In the last chapter, the authors conclude the discussion with an analysis of the state of the art in ML-integrated database systems (or vice versa), user-defined functions, the specifications needed for high-level languages, optimization techniques, different ML models, heterogeneous data sources, and the life cycle management needed for such complex integrated systems.

To a certain extent, the emergence of the Hadoop Distributed File System (HDFS) as an extension of a distributed database system catering to big-data-driven applications justifies the exploration of work integrating ML systems with database applications. The authors, however, do not explore in detail the emergence of highly specialized, sophisticated, and easy-to-use ML and deep learning frameworks such as Keras.

Additionally, the types of datasets used in generic ML or deep learning environments are far more diverse than those that could be used in these database-integrated learning systems. Nevertheless, the extensive surveys of existing integrated database-cum-learning environments, highlighting salient features as well as limitations, make this book extremely interesting for scholars working in this field and for product designers attempting to tap into this niche domain.

Reviewer:  CK Raju Review #: CR146835 (2005-0100)
General (H.2.0)
Data Types And Structures (D.3.3 ...)
General (I.2.0)
General (H.0)
General (I.0)

Reproduction in whole or in part without permission is prohibited.   Copyright 1999-2024 ThinkLoud®