Computing Reviews
Data science: an introduction to statistics and machine learning
Plaue M., Springer International Publishing, Cham, Switzerland, 2023. 385 pp. Type: Book (3662678810)
Date Reviewed: Mar 1 2024

This comprehensive textbook covers a broad spectrum of essential topics for understanding and working in the field of data science. It is structured in a clear, well-organized manner, divided into three main parts.

Part 1, “Basics,” starts with foundational concepts like data organization, quality, and cleaning. It introduces various data models, including relational, graph-based, and hierarchical models. The focus on data quality, including aspects like validation, standardization, and deduplication, provides thorough groundwork for understanding the importance of data integrity in data science.

Part 2, “Stochastics,” delves into probability theory, exploring concepts like probability measures, random variables, and characteristic measures. It explains key principles like Bayes’ theorem, conditional probability, and various distributions. The section on inferential statistics is particularly valuable, covering topics like statistical models, interval estimation, hypothesis testing, and regression analysis. It’s a robust guide to the statistical underpinnings of data science.

Part 3 addresses the core of modern data science: machine learning. It’s divided into sections on supervised and unsupervised learning, detailing algorithms, neural networks, dimensionality reduction, and cluster analysis. The practical applications section, with examples like text recognition and sentiment analysis, provides real-world context to the methodologies discussed.

The book covers a wide range of topics, from basic statistical concepts to advanced machine learning algorithms. It combines depth with breadth, which makes it a valuable resource for beginners and experienced practitioners alike. Each concept is well explained and often accompanied by practical examples, which enhances understanding.

The inclusion of real-world examples and applications of machine learning techniques is a major strength. However, the mathematical rigor might be challenging for beginners and those without a strong background in mathematics and statistics.

The foundational concepts covered here are timeless, but readers should be aware that they may need to supplement the book with material on current trends and software tools in the field.

An important aspect of data science is the practical implementation of algorithms and models using programming languages like Python or R. The book’s usefulness would have been enhanced by code examples and discussions of software commonly used in the field.

Data visualization and the ability to communicate findings are key skills in data science. The book lacks sections on these topics; including them would have significantly added to its practical utility.

The inclusion of exercises and challenges that encourage critical thinking and problem solving is a strong point. These help readers not only understand the concepts but also apply them to real-world problems. Readers may need supplementary materials for the latest trends and tools, and for deeper dives into topics of particular interest or complexity [1,2,3,4,5].

Reviewer: Wael Badawy
Review #: CR147719
1) James, G.; Witten, D.; Hastie, T.; Tibshirani, R. An introduction to statistical learning: with applications in R. Springer, New York, NY, 2013.
2) Clarke, B.; Fokoue, E.; Zhang, H. H. Principles and theory for data mining and machine learning. Springer, New York, NY, 2009.
3) Cady, F. The data science handbook. Wiley, Hoboken, NJ, 2017.
4) Hastie, T.; Tibshirani, R.; Friedman, J. The elements of statistical learning: data mining, inference, and prediction (2nd ed.). Springer, New York, NY, 2009.
5) Veltri, G. A. Big data is not only about data: the two cultures of modelling. Big Data & Society 4, 1 (2017), https://doi.org/10.1177/2053951717703997.
Reviewer Selected
Statistics (K.1 ... )
Statistical (I.5.1 ... )
Statistical (I.4.10 ... )
Statistical Computing (G.3 ... )
Statistical Databases (H.2.8 ... )
Statistical Methods (D.2.4 ... )

Reproduction in whole or in part without permission is prohibited.   Copyright 1999-2024 ThinkLoud®