This interesting textbook on data analysis considers summarization as a means of developing and augmenting analytical concepts; correlation as a means of enhancing and establishing relations; and visualization as a means of “presenting results in a cognitively comfortable [manner].”
The main topics covered by the book are principal component analysis (PCA), correlation and regression, decision trees, k-means cluster analysis, and hierarchical cluster analysis. Together, these give a good overview of the main statistical techniques used by data analysts and data scientists.
The chapter on PCA provides interesting historical and mathematical background and explains why “it has become one of the most popular methods for data summarization and visualization.” The chapter covering correlation and regression provides a clear discussion of the relevant techniques, including linear regression, neural networks, support vector machines, naive Bayes classifiers, and classification trees. The chapter on cluster analysis provides a detailed discussion of k-means cluster analysis and other clustering approaches.
Overall, this book provides a clear overview of the data analysis process, the different types of statistical techniques employed for data analysis, and their role and purpose. It also clearly explains how artificial intelligence and machine learning relate to data analysis. The use of visualization to present the results of statistical analysis is covered in an informative and useful manner. A variety of examples demonstrates how the different techniques are applied in practice.
The book is best suited as a textbook for undergraduate students or as a reference book for data analysts.