Computing Reviews
The minimum description length principle
Grünwald P., The MIT Press, 2007. 504 pp. Type: Book (9780262072816)
Date Reviewed: Dec 5 2007

Minimum description length (MDL) is an information-theoretic method for inductive inference. MDL is related to, but also quite different from, the older notion of minimum message length (MML). This book is a detailed and comprehensive introduction to the principle and to the related theories of universal models and model selection. MDL was introduced in 1978 by Jorma Rissanen, who wrote the foreword to this book.

The key concept of MDL is the description of a data set in compressed form, using as few symbols as possible. Regularities in the data make compression easier, so effort should be focused on finding these regularities and thereby learning more about the data. The ultimate objective is thus to find a good model explaining the data, not to estimate the mechanism that generated it. Moreover, the notion of a universal model is fundamental to model comparison and model selection, and therefore to efficient prediction and classification in practice.
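To illustrate this compression idea concretely (a minimal sketch of mine, not code or an example from the book), consider a two-part code for binary strings under the Bernoulli model class: first encode the number of ones in the string, then the index of the string among all strings with that many ones. Regular strings receive short code lengths; maximally irregular strings do not compress.

```python
import math

def two_part_code_length(bits):
    """Two-part code length, in bits, for a binary string under the
    Bernoulli model class (an illustrative sketch, not the book's code).

    Part 1: describe the number of ones k, costing log2(n + 1) bits.
    Part 2: describe which of the C(n, k) strings with k ones occurred,
            costing log2(C(n, k)) bits.
    """
    n = len(bits)
    k = sum(bits)
    return math.log2(n + 1) + math.log2(math.comb(n, k))

# A highly regular 100-bit string (only ten ones) compresses well...
regular = [0] * 90 + [1] * 10
# ...while a balanced string has no exploitable regularity of this kind.
balanced = [0, 1] * 50
```

For the regular string the total is about 51 bits, well below the 100 bits of a literal encoding, while the balanced string costs slightly more than 100 bits: the code only pays off when the data exhibit the regularity the model class can capture.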

MDL is connected with other principles, theories, and desired properties, for example, Occam’s razor, avoidance of overfitting, and Bayesian interpretation. All of these connections are described throughout the book.

The material is wisely organized; the various topics are introduced gradually, and in a very friendly way. In this regard, the author succeeds in producing a complete reference guide, answering all of the questions about MDL, in an accessible form.

The material is divided into four parts. The first part is introductory, and contains a presentation of the basic ideas behind MDL and an overview of the statistical and information-theoretic concepts necessary for understanding the theory of MDL. The second part addresses the theory of universal coding, that is, the theory of data compression, emphasizing the aspects related to MDL. The third part provides a detailed and formal treatment of MDL as a theory of inductive inference based on universal coding. The fourth part is essentially an appendix to the theorems given in the second part, and contains the theory of exponential and maximum entropy families of probability distributions.

Part 1, “Introductory Material,” contains five chapters. Chapter 1 describes the fundamental philosophy of MDL, connecting the concepts of learning, regularity, and compression; it also introduces the notion of complexity, offers useful discussions of the history and the variants of MDL, and responds to criticisms of MDL. Chapter 2 provides an account of the mathematical, probabilistic, and statistical preliminaries needed to read the rest of the book. Chapter 3 covers the information-theoretic preliminaries, such as coding systems and their relations to probabilities and entropy. Chapter 4 considers information-theoretic properties of statistical models, such as Fisher information and its relation to likelihood, code length, and entropy. Part 1 closes with an introduction to the simplest version of MDL, called crude MDL.

Part 2, “Universal Coding,” contains eight chapters. Chapter 6 introduces simple universal codes with countable models. Chapters 7 through 10 analyze universal codes for parametric models with finite parametric complexity, while chapters 11 and 12 discuss parametric models with infinite parametric complexity. Chapter 13 covers countable unions of parametric models, and nonparametric models.
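To make the notion of parametric complexity concrete (again an illustrative sketch of mine, not material from the book), the normalized maximum likelihood (NML) complexity of the Bernoulli model for sample size n can be computed directly by summing the maximized likelihoods over all 2^n data sequences, grouped by their number of ones:

```python
import math

def nml_complexity_bernoulli(n):
    """Parametric complexity COMP(n), in bits, of the Bernoulli model:
    log2 of the sum, over all length-n binary sequences, of each
    sequence's maximized likelihood (an illustrative sketch).
    """
    total = 0.0
    for k in range(n + 1):
        p = k / n  # maximum likelihood estimate for a sequence with k ones
        # C(n, k) sequences share this maximized likelihood; note that
        # Python evaluates 0**0 as 1, so the k = 0 and k = n terms are fine.
        total += math.comb(n, k) * (p ** k) * ((1 - p) ** (n - k))
    return math.log2(total)
```

The complexity grows with n (roughly as (1/2) log2 n for this one-parameter model), which is the "finite parametric complexity" behavior that makes NML codes well defined for such model classes.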

Part 3, “Refined MDL,” describes a more formal and advanced version of MDL, and contains four chapters. Chapter 14 discusses the refined MDL method and model selection. Chapter 15 considers MDL for prediction and estimation. Chapter 16 discusses aspects of consistency and convergence in the MDL method. Chapter 17 compares MDL with other statistical inference methods, including Bayesian inference, maximum entropy, and MML.

Finally, Part 4, “Additional Background,” contains two chapters concerning theoretical tools for the proofs of Part 2. Specifically, chapter 18 provides an introduction to exponential families and their statistical properties, while chapter 19 focuses on their information-theoretic properties.

In general, the content of the book is highly advanced, but the author makes a remarkable effort to facilitate its study through a number of means. Specifically, the book avoids use of measure theory, provides comprehensive introductory material, uses summaries at the beginning and at the end of each chapter, and uses boxes or different fonts to point out the most important ideas and details that might otherwise be skipped at first reading. The result is a very interesting and highly readable book, revealing the mathematical and philosophical elegance of a fascinating topic that combines statistics and information theory. It is recommended for researchers in mathematics and computer science who deal with statistics, artificial intelligence, machine learning, and data mining, and also for researchers in many applied sciences.

Reviewer: Lefteris Angelis. Review #: CR135001 (0810-0962)
Statistical (I.5.1 ...)
Learning (I.2.6)
Coding And Information Theory (E.4)
Probability And Statistics (G.3)

Other reviews under "Statistical":
- A formulation and comparison of two linear feature selection techniques applicable to statistical classification. Young D., Odell P. Pattern Recognition 17(3): 331-337, 1984. Type: Article (Mar 1 1985)
- Remarks on some statistical properties of the minimum spanning forest. Dubes R., Hoffman R. Pattern Recognition 19(1): 49-53, 1986. Type: Article (Dec 1 1987)
- Statistical pattern recognition--early development and recent progress. Chen C. International Journal of Pattern Recognition and Artificial Intelligence 1(1): 43-51, 1987. Type: Article (May 1 1988)

Reproduction in whole or in part without permission is prohibited.   Copyright 1999-2024 ThinkLoud®