Computing Reviews
Mathematical problems in data science : theoretical and practical methods
Chen L., Su Z., Jiang B., Springer International Publishing, New York, NY, 2015. 213 pp. Type: Book (978-3-319251-25-7)
Date Reviewed: Sep 8 2016

Data science includes mathematical and statistical tools required to find relations and principles behind heterogeneous and possibly unstructured data. It is an emerging field, under active research, and the authors here have attempted to explain existing methods while introducing some open problems.

The contents are organized into three parts. Basics are covered in the first part, which comprises the first three chapters. Machine learning is the focus of Part 2 (chapters 4 through 7). The third part consists of chapters 8 through 12 and contains selected topics and research papers.

The material in this book is more suitable for advanced students and researchers than for beginners. As for the writing style, the sentence construction could have been better in some of the earlier chapters. Linguistic and typographic errors occasionally make the book hard to follow.

The introduction given in the first two chapters moves too quickly and lacks key details of the methods discussed, though numerous references are provided for an interested reader.

Chapter 3 introduces lambda-connectedness, defined relative to a potential function on vertices of a graph. One interesting application of this method is in image processing, and image segmentation is discussed in some detail. Another application discussed is data reconstruction, which fills missing data points based on data samples.
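
To make the idea concrete, here is a minimal sketch of lambda-connected segmentation on a small graph. It is not taken from the book: the similarity measure (one minus the scaled difference of potentials), the function name lambda_components, and the sample data are all assumptions chosen for illustration. Two vertices land in the same segment when some path joins them along which every pair of neighbors has similarity of at least lambda.

```python
from collections import deque


def lambda_components(adjacency, potential, lam, scale=1.0):
    """Group vertices into segments joined by edges with similarity >= lam.

    Hypothetical helper for illustration; the similarity measure below is an
    assumption, not necessarily the one used in the book.
    """
    def similarity(u, v):
        # Assumed measure: 1 minus the scaled potential difference, floored at 0.
        return max(0.0, 1.0 - abs(potential[u] - potential[v]) / scale)

    seen = set()
    components = []
    for start in adjacency:
        if start in seen:
            continue
        seen.add(start)
        queue = deque([start])
        component = [start]
        # Breadth-first search that only crosses sufficiently similar edges.
        while queue:
            u = queue.popleft()
            for v in adjacency[u]:
                if v not in seen and similarity(u, v) >= lam:
                    seen.add(v)
                    queue.append(v)
                    component.append(v)
        components.append(sorted(component))
    return components


# A one-dimensional "image" of four pixels on a path graph, intensities in [0, 1].
adjacency = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
potential = {0: 0.10, 1: 0.15, 2: 0.80, 3: 0.85}
segments = lambda_components(adjacency, potential, lam=0.9)
```

With a high threshold the dark and bright pixels fall into separate segments; lowering lam far enough merges them into one.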

Chapter 4 provides more introductory material and covers a range of topics, from decision trees and neural networks to computational learning theory. The presentation is useful, but does not go into sufficient depth.

Image segmentation and video tracking are the topics in chapter 5. The challenge here is the time complexity of the computation. Current research is geared toward developing distributed algorithms that can run in linear or sublinear time. This chapter is detailed and well written, covering the basics as well as current trends.

Chapter 6 introduces topological data analysis, which is used to find structural features of datasets. Voronoi diagrams, Delaunay triangulation, and persistent homology are intuitively explained along with relevant algorithms.

Chapter 7 is well written. It covers Monte Carlo methods right from the basics and describes their application in big data analysis. Several case studies are presented.

In chapter 8, the authors propose an interesting geometric framework called vector bundle learning, along with a supervised feature extraction algorithm. They also show promising results in two applications: face recognition and classification of handwritten digits.

Chapter 9 discusses interpolation methods for interest rate curves. Existing methods based on cubic splines can generate negative forward rates, and the authors have come up with a piecewise rational function. The presented results look promising.
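
The reason interpolation can produce negative forward rates is easy to state under a standard assumption that the review does not spell out: continuously compounded zero rates r(t), with f(t) the instantaneous forward rate. The notation is illustrative and may differ from the book's.

```latex
\[
  f(t) = \frac{d}{dt}\bigl(t\,r(t)\bigr) = r(t) + t\,r'(t),
  \qquad\text{so for } t > 0:\quad
  f(t) < 0 \iff r'(t) < -\frac{r(t)}{t}.
\]
```

A spline that dips steeply enough between two quoted maturities therefore implies a negative forward rate even when every interpolated zero rate is positive.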

Chapter 10 considers the problem of automatic image segmentation when the intensity is not homogeneous. The authors present modified variational models and extend the concept to selective segmentation, which takes user input in intensity correction. Relevant background and existing methods are well described here.

In chapter 11, an interesting problem of emergency evacuation is considered. People must leave an affected area as quickly as possible; groups of evacuees share information with each other, but do not know the boundary of the dangerous zone. The authors consider a model of the problem where the affected area is treated as a convex region in the plane, with evacuees positioned at different points. After describing existing methods, the authors suggest efficient strategies to determine escape paths, along with open problems for further research.

Chapter 12 attempts to outline a theoretical model for big data called master/slave multiprocessor. The author hopes that such a model along with an adaptive variant can capture important properties of big data computations. The complexity of solving some simple problems in this model is explored. The discussion here is very interesting and relevant, as the constraints imposed on the master node look reasonable and practical.

Overall, the book offers a collection of papers that describe current trends and future directions along with appropriate references. The presented applications cover a broad spectrum of domains where big data poses challenges.

Reviewer: Paparao Kavalipati
Review #: CR144746 (1612-0871)
General (H.3.0)
Database Applications (H.2.8)
Numerical Algorithms And Problems (F.2.1)
