Computing Reviews
The Monte Carlo Database System: stochastic analysis close to the data
Jampani R., Xu F., Wu M., Perez L., Jermaine C., Haas P. ACM Transactions on Database Systems 36(3): 1-41, 2011. Type: Article
Date Reviewed: Apr 24 2012

The paper's main subject is the use of Monte Carlo simulation inside a database to generate future scenarios that are flexible enough to support what-if analysis. It combines database theory with Monte Carlo methods to answer questions such as, “What would be the expected outcome of customers’ orders if the price of component X were increased by 20 percent?” The paper explains the methodology for answering such questions, and presents interesting results and conclusions.
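To make the what-if question concrete, the following is a minimal sketch of the general Monte Carlo idea, not the MCDB implementation. Everything in it is an assumption made for illustration: the function name `simulate_revenue`, the per-customer Gamma demand parameters, and the constant price elasticity used to shrink demand when the price rises.

```python
import random
import statistics


def simulate_revenue(customers, price, price_increase=0.0,
                     elasticity=-1.5, n_worlds=1000, seed=0):
    """Estimate total revenue by sampling many possible worlds.

    customers: list of (shape, scale) Gamma parameters, one pair per
               customer (hypothetical demand model, assumed here).
    price_increase: fractional change, e.g. 0.2 for +20 percent.
    elasticity: assumed constant price elasticity of demand.
    Returns (mean, standard deviation) of total revenue across worlds.
    """
    rng = random.Random(seed)
    new_price = price * (1.0 + price_increase)
    # Assumed demand response: mean demand scales with the price ratio
    # raised to the elasticity.
    demand_factor = (new_price / price) ** elasticity

    totals = []
    for _ in range(n_worlds):
        total = 0.0
        for shape, scale in customers:
            # One random order quantity per customer in this world.
            quantity = rng.gammavariate(shape, scale * demand_factor)
            total += quantity * new_price
        totals.append(total)
    return statistics.mean(totals), statistics.stdev(totals)


if __name__ == "__main__":
    customers = [(2.0, 3.0), (4.0, 1.5), (1.0, 5.0)]
    print(simulate_revenue(customers, price=10.0))
    print(simulate_revenue(customers, price=10.0, price_increase=0.2))
```

Each pass of the outer loop plays the role of one simulated database instance, and the query (here, a sum of revenue) is evaluated on every instance to obtain a distribution rather than a single number.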

Yet some of the content is somewhat disappointing, first and foremost because the paper is unclear on several issues. For example, the Monte Carlo simulations are done using the Gamma function, but the paper does not explain why other functions more directly related to the simulation of such phenomena were not used. Furthermore, at some point, the paper claims that simulating with aggregated data (for example, clumping together the records of clients in a database) results in a loss of predictive power. This seems to contradict the law of large numbers and basic probability theory.

Another disappointing aspect concerns the first footnote of the paper, on page 5. In evaluating related work, the authors claim: “Indeed, MCDB [Monte Carlo Database System] is the first DBMS [database management system] for which the Monte Carlo approach is fundamental to the entire system design.” The footnote associated with this sentence states, “The recent PIP system of Kennedy and Koch (2010) combines PrDB [probabilistic databases] and Monte Carlo techniques, and can yield superior performance for certain MCDB-style queries.” It is not clear why PIP is not discussed in the main text and, furthermore, why a system that can yield superior performance is relegated to a footnote.

A further disappointing aspect concerns the second footnote: it is not true that the output of pseudorandom number generators is statistically indistinguishable from truly independent and identically distributed (i.i.d.) uniform random numbers. The results of a self-similarity analysis of pseudorandom number sequences change when the pseudorandom number generator changes.

As for the core of the paper, and notwithstanding the limitations noted above, the idea of using Monte Carlo simulation in databases is well explained, and the authors show clearly how the MCDB system works.

Reviewer:  Nuno M. Garcia Review #: CR140085 (1209-0944)
Relational Databases (H.2.4 ... )
 
 
Simulation Support Systems (I.6.7 )
 