Computing Reviews
The data science design manual
Skiena S., Springer International Publishing, New York, NY, 2017. 445 pp. Type: Book (978-3-319554-43-3)
Date Reviewed: Feb 23 2018

The 14 chapters of this book have been carefully devised to provide a comprehensive introduction to data science as an academic discipline. The special feature of this text is that it does so by focusing on the skills and principles needed to design systems for collecting, analyzing, and interpreting data. The contents of the book, as expressed by the chapter structure, reflect contributions from computer science, statistics, and artificial intelligence (in particular, machine learning).

The first chapter discusses what data science is. The discussion includes several interesting and inspiring insights, most of them based on identifying fundamental differences between data science and computer science or software engineering. One of the theses the author formulates to justify and describe the new discipline is that real scientists are data driven, while computer scientists are method driven. The chapter does not provide a definitive answer to the question of whether data science should be generally accepted as a new scientific discipline, but it is worth reading for anyone interested.

Chapter 2 provides mathematical preliminaries, in particular probability, statistics, correlation analysis, and logarithms. Anyone contemplating how to cover such a wide scope in 30 pages would realize the immense difficulties involved. This is obviously not a book on either probability or statistics, and it cannot develop results the way a standard textbook on those disciplines would. In 30 pages, it is only possible to present and explain some related results. The author has mastered the same kind of selective presentation brilliantly in the remaining chapters: “Data Munging,” “Scores and Rankings,” “Statistical Analysis,” “Visualizing Data,” “Mathematical Models,” “Linear Algebra,” “Linear and Logistic Regression,” “Distance and Network Methods,” “Machine Learning,” and “Big Data: Achieving Scale.”

This approach might justify calling the book a manual. In my humble opinion, however, the book is more than a typical manual. The author himself designates it as a textbook for an introductory course on data science. The chapters are richly equipped with exercises. The topics are always explained starting with a proper motivation and continuing with practical examples, which is perhaps the most outstanding feature of the book. It can serve as a regular textbook for an academic course, and I would recommend it for exactly this purpose. On the other hand, it provides a wealth of material for people from industry, such as software engineers, and can serve as a manual for them to accomplish data science tasks. It should be noted that the book is not just a text, but a much more complex product, including a full set of lecture slides available online as well as a solutions wiki.

Reviewer: P. Navrat
Review #: CR145880 (1805-0207)
Content Analysis And Indexing (H.3.1)
Reference (A.2)
