Computing Reviews
Data science fundamentals for Python and MongoDB
Paper D., Apress, New York, NY, 2018. 214 pp. Type: Book (978-1-484235-96-6)
Date Reviewed: Oct 17 2018

Data science is an important and growing field in computing.

This book purports to provide an introduction to data science with Python and MongoDB. It fails on pretty much every level.

Here are just a couple of examples of things that are very wrong indeed. This is not a complete list, and most of the less egregious problems are skipped.

One sample program provides the following function:

def str_int(s):
    val= "%.2f" % profit
    return float(val)

The text documents it as “convert a string to a float,” but then why is it called str_int?

First of all, this does not work at all. The parameter s is never used and profit is, but profit is not available in this scope.

Second, assuming we change the parameter s to profit, it takes not a string but a number (either an integer or a float).

Third, it does not return an int. It returns a float holding the value of profit rounded to two digits after the decimal point, which it gets by formatting the number as a string and then converting the result back to a float. Python has a round function that will do this just fine (and which is indeed used in the code given in the text near the call site for str_int).
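The first three points can be checked directly. The sketch below reproduces the function as quoted, then adds a hypothetical repaired version (the name two_decimals is mine, not the book's) to compare against the built-in round. It assumes no global variable named profit exists when the quoted function is called.

```python
def str_int(s):
    # As quoted in the review: the parameter s is ignored, and profit
    # is not defined anywhere in this scope.
    val = "%.2f" % profit
    return float(val)


def call_quoted_version():
    """Return the name of the exception raised by the quoted function."""
    try:
        str_int("12.345")
    except NameError as exc:
        return type(exc).__name__
    return None


def two_decimals(profit):
    # Hypothetical repair: the parameter is the number being formatted.
    # The result is a float, not an int and not a string.
    return float("%.2f" % profit)


if __name__ == "__main__":
    print(call_quoted_version())
    for value in (3.14159, 2.5, 10):
        # The string round trip and round(value, 2) give the same number.
        print(value, two_decimals(value), round(value, 2))
```

For the values tried here, the string round trip and round(value, 2) agree, so the helper adds nothing that the built-in does not already provide.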

Finally, this is a dubious operation in context. Why not carry around the extra bits and round only when the values are printed? If there is a reason, it is not mentioned; indeed, no reason is given for the rounding at all. To complicate matters, the result is stored in binary floating point, so the rounded value probably does not even equal the two-decimal number it is meant to represent. Why not express the prices in pennies rather than floating-point dollars?
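A small sketch of the binary representation problem, using only the standard library. The prices are made-up values for illustration; Decimal is used here only to display the exact value that a float actually stores.

```python
from decimal import Decimal

# Rounding to two places still yields a binary float, which is only
# the nearest representable value to the decimal number intended.
rounded = round(2.675, 2)
exact_stored = Decimal(rounded)  # exact value held by the float

# Illustrative prices (assumed values, not taken from the book).
float_total = 0.10 + 0.20        # floating-point dollars
cent_total = 10 + 20             # integer pennies

if __name__ == "__main__":
    print(rounded, exact_stored)
    print(float_total == 0.30)   # False: the float sum is not exactly 0.30
    print(cent_total == 30)      # True: integer arithmetic is exact
```

Keeping amounts as integer pennies (or as decimal.Decimal values) avoids the issue; which is preferable depends on the application.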

So the function would more aptly be named something like round_float_to_two_decimal_digit_float, and the description in the text is not just wrong but misleading and confusing.

Another code sample claims to perform gradient descent to find local minima of functions. The gradient descent code itself looks okay, but the examples are seriously wrong. One example function used is the sigmoid function: σ(x) = 1/(1 + e^(−x)). But this function increases monotonically from x=−∞ to x=∞, so it has no local minima (unless “local minimum” is used differently than I’ve come to expect). Still, the code manages to find a local minimum around x=0. The cause is that the derivative used is incorrect: σ’(x) = x(1−x) instead of σ(x) ∗ (1−σ(x)). The function is plotted along with the derivative, and the function clearly has a nonzero derivative at x=0 (and a very large negative derivative at both small and large values of x). A plot of the computed local minima is provided with a local minimum at x=0, which is not correct and does not even look correct next to the included plots of σ and its derivative.
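To see how the wrong derivative produces a spurious minimum at x=0, here is a minimal sketch. It is not the book's code: the gradient_descent helper, the starting point, the learning rate, and the step count are all assumptions chosen for illustration, and the sigmoid is taken to be the standard logistic function.

```python
import math


def sigmoid(x):
    """Standard logistic sigmoid (assumed to be the function intended)."""
    return 1.0 / (1.0 + math.exp(-x))


def sigmoid_prime(x):
    """Correct derivative: sigma(x) * (1 - sigma(x)), positive everywhere."""
    s = sigmoid(x)
    return s * (1.0 - s)


def mistaken_prime(x):
    """The derivative as the review reports it: x * (1 - x)."""
    return x * (1.0 - x)


def gradient_descent(derivative, x0, learning_rate=0.1, steps=500):
    """Plain fixed-step gradient descent (hypothetical helper)."""
    x = x0
    for _ in range(steps):
        x -= learning_rate * derivative(x)
    return x


# With the mistaken derivative, x = 0 is a stable fixed point of the
# update, so the iteration settles there and reports a "minimum".
x_mistaken = gradient_descent(mistaken_prime, x0=0.5)

# With the correct derivative, every step moves x further left,
# because the sigmoid has no stationary point at all.
x_correct = gradient_descent(sigmoid_prime, x0=0.5)

if __name__ == "__main__":
    print(x_mistaken, x_correct, sigmoid_prime(0.0))
```

With these settings the mistaken derivative drives x to essentially 0, while the correct one keeps drifting toward −∞ and never converges, which is what one expects for a monotonically increasing function.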

There is another sample to get the local maximum in about the same way (though the stated function again has no local maxima), with the same errors and the same result: x=0. Of course, if there’s a local minimum and local maximum at the same point, a smooth function must be locally constant, which σ is not at x=0.

There are many more errors, big and small.

I suspect that MongoDB is tossed into the mix because it makes for a better title; why it is worth using is never explained. Almost any SQL package (SQLite, MySQL, PostgreSQL) would do very nicely for the examples shown, and little justification is provided for using Mongo over SQL (or, for that matter, over some other NoSQL system). Indeed, just plain Python would work well (given enough memory) for most of the examples given. A number of other packages are used with no real justification and essentially no explanation other than “we’re using this package.”
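As a rough illustration of that point, the sketch below runs one small aggregation twice: once with the sqlite3 module from the Python standard library, and once with a plain dictionary. The table, column names, and rows are invented for this example and are not taken from the book.

```python
import sqlite3

# Hypothetical sales records: (item, quantity, unit price in cents).
rows = [("widget", 3, 250), ("gadget", 1, 1000), ("widget", 2, 250)]

# Option 1: an in-memory SQLite database.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sales (item TEXT, quantity INTEGER, unit_cents INTEGER)"
)
conn.executemany("INSERT INTO sales VALUES (?, ?, ?)", rows)
sql_totals = dict(
    conn.execute(
        "SELECT item, SUM(quantity * unit_cents) FROM sales GROUP BY item"
    )
)
conn.close()

# Option 2: plain Python, no database at all.
python_totals = {}
for item, quantity, unit_cents in rows:
    python_totals[item] = python_totals.get(item, 0) + quantity * unit_cents

if __name__ == "__main__":
    print(sql_totals)
    print(python_totals)
```

Both approaches give the same per-item totals, and neither needs a database server.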

The descriptions of the code in the text are unhelpful (at best), and the code is poorly formatted (with a type size that forces many lines of code into two or three lines of text on the page). The corresponding output is often presented in a font that is much smaller than either the text or the code.

This book is suited neither for students nor for self-study, not just because it is frequently wrong, but also because it is poorly written, badly typeset, and lacks a coherent explanation for almost everything discussed.

Reviewer: Jeffrey Putnam
Review #: CR146285 (1812-0611)
Object-Oriented Programming (D.1.5 )
 
 
Database Management (H.2 )
 