Computing Reviews
Joe Celko’s data, measurements and standards in SQL
Celko J., Morgan Kaufmann Publishers Inc., San Francisco, CA, 2009. 312 pp. Type: Book (978-0-123747-22-8)
Date Reviewed: Oct 4 2010

It is increasingly important for databases to comply with standards whenever possible, since the data stored in them is often used for many years and may be used in several different countries. In particular, almost any commercial database is likely to have to deal with customers or suppliers in different countries, and designers need to be aware of this in advance and design appropriately. Otherwise, there is always the risk of having to change things once the database is live, populated, and more difficult to safely change. This book focuses on the problems of standardization and the related problems of measurements. It is clearly intended to encourage designers to at least be aware of the problems, before their database designs go live.

The book is divided into two parts. The first part gives a general overview of the kinds of problems encountered. There are chapters on measurements, validation, data encoding, and scales. The second part consists of more than 30 short chapters on specific kinds of standards, which were chosen more for illustrative purposes than as specific standards that database designers need to know. For example, there are chapters on vehicle identification numbers, shoe sizes, and temperature scales alongside chapters on dates and times, national identification numbers, and paper sizes.

All of this is, at least in theory, excellent information. Certainly, only a few database designers may need to know the standards for shoe sizes. But, in general, designers can alleviate the pain of later conversions (or worse yet, having to completely discard unusable data) by knowing that there are many standards and seeking out the appropriate ones to use. The short chapters on specific standards are interesting and rather fun to read. That being said, the book as a whole has more than a few problems. For instance, in the section on credit card numbers, the Luhn algorithm for computing check digits is repeated in sections 2.2.2.1 and 2.2.2.3, with slightly different structured query language (SQL) code in each section. Further on in the same section, there is some SQL code that declares a credit card number to be 17 characters long, together with a constraint that requires the credit card number to be four sets of four digits, with the sets separated by a “-”, for a total of 19 characters; the declaration and the constraint cannot both hold. This is not the only repetition of information: in the standards sections, there are two chapters on dates and times that largely overlap but give slightly different information.
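For readers unfamiliar with the check in question, the following is a minimal Python sketch of a Luhn validity test. It is not Celko's SQL from either section; the function name `luhn_is_valid` and the choice to ignore hyphens and spaces are assumptions made for illustration. The last lines also show that a number formatted as four hyphen-separated groups of four digits occupies 19 characters, which is the length mismatch described above.

```python
def luhn_is_valid(number: str) -> bool:
    """Return True if `number` passes the Luhn check.

    Hypothetical helper for illustration: hyphens and spaces are
    ignored, and any other non-digit character makes the input invalid.
    """
    cleaned = number.replace("-", "").replace(" ", "")
    if not (cleaned.isascii() and cleaned.isdigit()):
        return False

    total = 0
    # Walk from the rightmost digit; double every second digit and
    # subtract 9 when the doubled value exceeds 9.
    for position, ch in enumerate(reversed(cleaned)):
        digit = int(ch)
        if position % 2 == 1:
            digit *= 2
            if digit > 9:
                digit -= 9
        total += digit
    return total % 10 == 0


# A widely used test number, not a real account.
formatted = "4111-1111-1111-1111"
print(luhn_is_valid(formatted))  # True
print(len(formatted))            # 19: too long for a 17-character column
```

Keeping the check in one place, rather than in two slightly different copies, avoids the kind of drift the review describes.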

In one section, Celko mentions that failure to allow for growth is a major problem (which it is); however, in another section, he recommends that Internet protocol (IP) addresses be stored as four TINYINTs, which might be good for now, but will not allow for growth if IP version 6 (IPv6), with its 128-bit addresses, ever becomes the default. In the section on US Social Security numbers (SSNs), Celko does not discuss the privacy issues and laws concerning them; it would have been worth at least reminding readers that SSNs should never be used as identifiers in any system.
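The growth problem with four one-byte columns can be seen in a short Python sketch using the standard library's `ipaddress` module. The helper name `storage_bytes` and the sample addresses (taken from the documentation ranges) are assumptions for illustration, not anything from the book.

```python
import ipaddress


def storage_bytes(address: str) -> int:
    """Return how many bytes the binary form of an IP address needs.

    Hypothetical helper for illustration only.
    """
    return len(ipaddress.ip_address(address).packed)


print(storage_bytes("192.0.2.1"))    # 4: fits in four one-byte columns
print(storage_bytes("2001:db8::1"))  # 16: does not fit in four one-byte columns
```

A design that stores the address in a single column wide enough for 16 bytes (or in a native address type, where the database offers one) would accommodate both versions.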

Also, there are a couple of odd omissions. As an example, there is no discussion of the wide variety of names that people may use. While this is a very large problem (especially when internationalization is considered) that probably requires far more space than this relatively thin book can spare, it is at least worth mentioning, as people tend to want their names to be spelled correctly, and last names can include punctuation marks such as hyphens and apostrophes. As another example, Celko fails to discuss addresses, except for postal codes. While generally useful standards for either of these may not exist, given the frequency with which names and addresses are used, these are odd omissions, to say the least.

Another problem is the text’s somewhat strange tone. It isn’t quite clear what to make of Celko taking a Web poster to task (a bit rudely) for complaining that the date “2007-04-01” translated to both “2007-04-01” and “2007-01-04” in his database, when this indeed appears to be a problem with the database, not with the standard or the poster.

In summary, while there is probably a need for a book that gives this kind of advice to database designers, Celko’s book, as it stands, is not the answer, despite its wealth of good information.

Reviewer:  Jeffrey Putnam Review #: CR138439 (1107-0699)
SQL (H.2.3 ... )
 
 
Metrics (D.2.8 )
 
Other reviews under "SQL": Date
SQL and its applications
Lorie R., Daudenarde J., Prentice-Hall, Inc., Upper Saddle River, NJ, 1991. Type: Book (9780138379568)
Dec 1 1991
Learning SQL
, Prentice-Hall, Inc., Upper Saddle River, NJ, 1991. Type: Book (9780135287040)
Jun 1 1992
SQL and relational databases
Vang S., Microtrend Books, San Marcos, CA, 1991. Type: Book (9780915391424)
Sep 1 1991
Reproduction in whole or in part without permission is prohibited.   Copyright 1999-2024 ThinkLoud®