Data quality is a pressing issue today, a consequence of the overwhelming volume of data being recorded and exchanged. Every second, data is read from and written to the many information systems and databases spread across an enterprise. Even so, holistic approaches to the problem are hard to find. Lee et al. provide one: they cover key aspects of data quality, such as quality assessment, data quality cost/benefit analysis, and quality processes.
A welcome feature found throughout the book is its set of practical, ready-to-use artifacts devoted to data quality. Chapter 3, for example, “Assessing Data Quality: Part I,” presents a complete questionnaire for assessing information quality across user areas, in a form that the reader can apply immediately. In chapter 4, “Assessing Data Quality: Part II,” Lee et al. present the formulas for assessing data quality. The inputs to these formulas can be gathered from queries against database catalogs or from simple routines that scan database contents. Chapter 6 gives a detailed explanation of data quality problems, examining ten problem patterns along six aspects: manifestations, warning signs, impact of problem, intervention actions, target state, and improved data usage.
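To give a flavor of the kind of routine chapter 4 has in mind, here is a minimal sketch of a completeness measure computed by scanning a table. It is an illustration only, not a formula taken from the book: it assumes a simple ratio of non-missing values to total rows, and the `patients` table, its columns, and both function names are hypothetical.

```python
import sqlite3


def completeness_ratio(conn, table, column):
    """Share of rows in `table` whose `column` is not NULL (1.0 = fully complete).

    Assumed simple-ratio form; not reproduced from the book.
    """
    # COUNT(*) counts all rows, COUNT(column) counts only non-NULL values.
    total, filled = conn.execute(
        f'SELECT COUNT(*), COUNT("{column}") FROM "{table}"'
    ).fetchone()
    return filled / total if total else 1.0


def table_completeness(conn, table):
    """Completeness of every column, with column names read from the catalog."""
    # In SQLite, PRAGMA table_info plays the role of the catalog query;
    # the second field of each returned row is the column name.
    columns = [row[1] for row in conn.execute(f'PRAGMA table_info("{table}")')]
    return {col: completeness_ratio(conn, table, col) for col in columns}


# Hypothetical sample data: two of the four birth dates are missing.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE patients (id INTEGER PRIMARY KEY, name TEXT, birth_date TEXT)"
)
conn.executemany(
    "INSERT INTO patients (name, birth_date) VALUES (?, ?)",
    [("Ana", "1980-02-01"), ("Ben", None), ("Cy", "1975-07-30"), ("Di", None)],
)

report = table_completeness(conn, "patients")
```

Here `report` maps each column name to its ratio, so a column with many missing values stands out at a glance. Other catalogs (for example, an `information_schema`) would need a different column query, but the scanning step would look much the same.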
Chapter 8 presents some interesting data quality management concepts based on the principle that information should be treated as a product of an industrialized process. Building on this principle, Lee et al. developed a method for analyzing an organization's information production processes: information product maps (IP-maps), which are detailed in chapter 9 of the book. IP-maps are simple to draw and very useful for understanding the information production cycle. They allow the data quality architect to analyze issues such as data governance and data quality metric indicators.
A minor drawback that I perceived throughout the book is its bias toward the hospital management problem domain: all of the examples, including the extensive case elaborated in chapter 10, draw on the authors' practical consulting experience with data quality in health management systems. However, this does not diminish the book’s importance. On the contrary, the examples reveal complex situations that the authors have faced in real life. In the end, the book shows that data quality is not an exact science, and that sometimes decisions should be made to achieve reasonable results (as opposed to the never-ending search for exceptional results).