Computing Reviews
The Cross-Entropy Method: A Unified Approach to Combinatorial Optimization, Monte-Carlo Simulation and Machine Learning (Information Science and Statistics)
Rubinstein R., Kroese D., Springer-Verlag New York, Inc., Secaucus, NJ, 2004. Type: Book (9780387212401)
Date Reviewed: Nov 23 2004

Rarely have I seen such a dense and straight-to-the-point pedagogical monograph on such a modern subject. This excellent book, on the simulation-based cross-entropy (CE) method pioneered by one of the authors (Rubinstein), is very well written, starting with an exposition of the notions to be used, continuing with examples and a detailed presentation of the variants of the method, and closing with a discussion of how the CE method effectively attacks an impressive list of modern problems. These include combinatorial optimization problems (COPs), like the traveling salesman problem (TSP), the quadratic assignment problem (QAP), the max-cut problem, and the alignment of deoxyribonucleic acid (DNA) sequences; noisy estimation problems, like the buffer allocation problem (BAP); noisy optimization; continuous multi-extremal optimization; machine learning; and clustering and vector quantization problems. The amazing thing is that such a diverse list of topics is attacked and solved by methodologies that all have, at their core, the same simple notions, briefly presented below.

Assume that we are searching for the value of an integral of the form A = I[f(x)g(x;u)dx] = Eu[f], where x and u are multidimensional vectors, and that no analytical solution is known, so we turn to Monte Carlo simulation. We first try the “brute force” estimator (1/N)Σi[f(xi)], where the N points xi follow the g(x;u) distribution. The problem arises when f(x) is significant only where the values xi are very rare according to g; in this case, we might need an extremely large N to estimate the quantity A with an acceptable relative error. A cleverer approach is to find an importance sampling (IS) density h(x), usually belonging to the family of the g(x;u) densities, and then attack an easier problem: I[f(x) (g(x;u)/h(x)) h(x)dx] = Eh[fg/h]. Now we go stochastic again, using a generation of N points {xi} that follow the h(x) distribution and the estimator (1/N)Σi[f(xi)g(xi;u)/h(xi)].
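To make the contrast concrete, here is a minimal sketch (a toy problem of my own, not one from the book): estimating A = P(X ≥ 10) for a unit-rate exponential X, where the brute-force estimator starves for hits while an exponential IS density h with mean 10 does fine.

```python
import numpy as np

rng = np.random.default_rng(0)
gamma, N = 10.0, 100_000

# Brute force: sample straight from g(x;u) = Exp(rate 1).  Points beyond
# gamma are so rare (P ~ 4.5e-5) that the estimate is noisy or exactly zero.
x = rng.exponential(scale=1.0, size=N)
brute = np.mean(x >= gamma)

# Importance sampling: draw from h(x) = Exp(rate 0.1), which puts real mass
# beyond gamma, and reweight each point by the likelihood ratio g(x;u)/h(x).
y = rng.exponential(scale=10.0, size=N)
lr = np.exp(-y) / (0.1 * np.exp(-0.1 * y))
is_est = np.mean((y >= gamma) * lr)

print(brute, is_est, np.exp(-gamma))  # IS lands close to exp(-10); brute force rarely does
```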

The best way to use this estimator is to find the best h(x), and this best h, named g*, is equal to f(x)g(x;u)/A, since, when we put it into the summation above, every term equals A and we get (1/N)NA = A, an estimator with zero variance. The problem is that this best h is unattainable, since A is unknown.
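The zero-variance claim can be checked numerically. In the same toy problem, g*(x) = f(x)g(x;u)/A is a unit-rate exponential shifted to start at γ = 10, and every summand f(x)g(x;u)/g*(x) collapses to A:

```python
import numpy as np

rng = np.random.default_rng(1)
gamma = 10.0

# Draws from g*: a unit-rate exponential shifted to start at gamma.
x = gamma + rng.exponential(scale=1.0, size=5)

# Each summand f(x) g(x;u) / g*(x); f(x) = 1 for all of these draws.
terms = np.exp(-x) / np.exp(-(x - gamma))
print(terms)  # every entry equals exp(-10) = A exactly -- but building g* required knowing A
```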

If we then start from a nonoptimal h, the question is: “Can we use a generation of sample points xi to improve h, and then repeat the process with the improved h?” This requires a way to measure how far an h distribution is from g*. Here, the Kullback-Leibler distance (or cross-entropy) is used: D(g,h) = I[ln(g(x)/h(x)) g(x)dx] = I[g ln g dx] − I[g ln h dx] ≥ 0.
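As a quick numerical sanity check of this formula (again my own toy, not from the book): for two exponential densities with rates a and b, D has the closed form ln(a/b) + b/a − 1, and a Monte Carlo average of ln(g(X)/h(X)) over X ~ g reproduces it.

```python
import numpy as np

rng = np.random.default_rng(2)
a, b, N = 1.0, 0.1, 200_000

# X ~ g, with g = Exp(rate a) and h = Exp(rate b).
x = rng.exponential(scale=1.0 / a, size=N)
mc = np.mean(np.log(a * np.exp(-a * x)) - np.log(b * np.exp(-b * x)))

closed = np.log(a / b) + b / a - 1.0
print(mc, closed)  # both ~ 1.40, and nonnegative, as the inequality promises
```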

So, what we are searching for is the h with minimal D from g*; that is, we are solving the functional minimization problem minh{D(g*,h)}, where the class of densities h is restricted to the family g(x;v), so the problem becomes minv{D(g*,g(x;v))}. Since the first term of D does not depend on v, this reduces to maximizing I[g* ln g(x;v)dx] with respect to v. Putting the expression for g* back in, we get maxv{I[(f(x)g(x;u)/A) ln g(x;v)dx]} = maxv{Eu[f(x)ln g(x;v)]}, since the constant factor 1/A does not affect the maximizer.

But now we can again use the IS approach on this last expectation integral, with an IS density g(x;w), to obtain: maxv{Ew[f(x) ln g(x;v) g(x;u)/g(x;w)]}.

Obviously, we perform this step because we want to create an iterative algorithm that approximates g* little by little, searching the parameter space of the family g(x;·). Now the last expression can be attacked stochastically, as we did for A, namely: v* = arg-maxv{(1/N)Σi[f(xi) ln g(xi;v) g(xi;u)/g(xi;w)]} (a), where the xi follow the g(x;w) distribution. In typical applications, the function inside the braces {} is concave and differentiable with respect to v, so v* can be found by solving the following system of equations: (1/N)Σi[f(xi) (g(xi;u)/g(xi;w)) ∇v ln g(xi;v)] = 0.
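For a concrete instance of solving this system, consider the one-parameter exponential family g(x;v) = (1/v)exp(−x/v) (mean parameterization, my choice for illustration): ∇v ln g(x;v) = −1/v + x/v², so (a) reduces to a weighted sample mean.

```python
import numpy as np

rng = np.random.default_rng(3)
u, w, gamma, N = 1.0, 5.0, 10.0, 100_000

x = rng.exponential(scale=w, size=N)                 # x_i ~ g(x;w)
f = (x >= gamma).astype(float)                       # f(x_i) = 1{x_i >= gamma}
lr = (np.exp(-x / u) / u) / (np.exp(-x / w) / w)     # g(x_i;u) / g(x_i;w)

# Setting sum_i f_i lr_i (-1/v + x_i/v^2) = 0 gives v* as a weighted mean:
v_star = np.sum(f * lr * x) / np.sum(f * lr)
print(v_star)  # ~ 11 = E[X | X >= gamma] under u, i.e., the mean of g* within this family
```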

So, the main CE algorithm is:

(1) Set t = 1, v0 = u.
(2) Generate a sample {xi} from the density g(x;vt−1).
(3) Calculate f(xi), and maybe do some tricks here (see below).
(4) Solve the maximization problem (a), using vt−1 in place of w; set vt = v*.
(5) Set t = t + 1 and go to step 2, until satisfied.
(6) Estimate A by (1/N)Σi[f(xi)g(xi;u)/g(xi;vt)], using the latest available sample {xi}.

Step 3 and the “until satisfied” of step 5 need some clarification: since we stay in the family of the initial probability density function (pdf), there is a strong possibility that f(x) is significant only for cases that remain rare even under g(x;w), so the initial problem persists. The trick is to go from f to a family of fs; specifically, f is slightly redefined at each iteration, so that each vt provides the optimal IS density for the redefined f and the redefined A. In the typical rare-event setting, where f(x) is an indicator of the form I{S(x) ≥ γ}, this means replacing γ by a sequence of more modest levels γt that increase toward γ. The stopping criterion is satisfied when the successive redefinitions of f return it to the original f; a sketch of the complete loop follows.
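Putting the algorithm and the trick together, here is a compact sketch for the same toy rare-event problem, with the level γt raised at each iteration to the (1−ρ) sample quantile (capped at γ); the parameters ρ and N are my own illustrative choices, and the update in step 4 uses the closed form derived above.

```python
import numpy as np

rng = np.random.default_rng(4)
u, gamma, N, rho = 1.0, 10.0, 10_000, 0.1

v = u                                                   # (1) v_0 = u
while True:
    x = rng.exponential(scale=v, size=N)                # (2) sample from g(x; v_{t-1})
    g_t = min(np.quantile(x, 1.0 - rho), gamma)         # (3) the trick: a more modest level
    f = (x >= g_t).astype(float)
    lr = (np.exp(-x / u) / u) / (np.exp(-x / v) / v)    # g(x;u) / g(x; v_{t-1})
    v = np.sum(f * lr * x) / np.sum(f * lr)             # (4) closed-form v_t for this family
    if g_t >= gamma:                                    # (5) satisfied: f is the original again
        break

x = rng.exponential(scale=v, size=N)                    # (6) final IS estimate of A
lr = (np.exp(-x / u) / u) / (np.exp(-x / v) / v)
print(np.mean((x >= gamma) * lr), np.exp(-gamma / u))   # both ~ 4.5e-5
```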

COPs are attacked by the CE method by translating the original deterministic optimization problem into a related stochastic estimation problem, and then applying the rare-event machinery of the CE method, as sketched below. In general, however, choosing an appropriate parameterization and sampling scheme is more of an art than a science.
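As a flavor of that translation, here is a hedged sketch of CE on a tiny max-cut instance (the 5-node weight matrix and the smoothing factor alpha are my own illustrative choices; smoothed updates are a standard CE refinement). Cuts are sampled from independent Bernoulli(pj) components, and p is updated toward the best ρ-fraction of the sample, which is the CE update for this parameterization.

```python
import numpy as np

rng = np.random.default_rng(5)
W = np.array([[0, 1, 3, 0, 2],     # symmetric edge weights of a made-up graph
              [1, 0, 1, 4, 0],
              [3, 1, 0, 2, 1],
              [0, 4, 2, 0, 3],
              [2, 0, 1, 3, 0]], dtype=float)
n, N, rho, alpha = len(W), 200, 0.1, 0.7

def cut_value(x):
    # Total weight of edges crossing the partition {x=0} versus {x=1}.
    return 0.5 * np.sum(W * (x[:, None] != x[None, :]))

p = np.full(n, 0.5)                                     # Bernoulli parameters play the role of v
for t in range(30):
    X = (rng.random((N, n)) < p).astype(int)            # sample N cuts from g(x;p)
    scores = np.array([cut_value(x) for x in X])
    elite = X[scores >= np.quantile(scores, 1 - rho)]   # keep the best rho-fraction
    p = alpha * elite.mean(axis=0) + (1 - alpha) * p    # CE update, smoothed

best = (p > 0.5).astype(int)
print(best, cut_value(best))    # p degenerates toward a (near-)optimal cut indicator
```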

The book contains exercises, extensive numerical examples, a thorough introduction to each subject (supported by an extensive literature survey), example programs, and an associated Web site, http://www.cemethod.org, with additional material, namely programs, tutorials, papers, and links.

Scientists involved in any of the problems mentioned above, students of the physical and engineering sciences, and researchers in stochastic optimization and simulation should read this book; they will benefit both in inspiration and in grasping the essence of a modern, extremely powerful computational tool.

Reviewer: Constantin S. Chassapis
Review #: CR130457 (0507-0768)
Categories: Stochastic Programming (G.1.6); Heuristic Methods (I.2.8); Monte Carlo (I.6.8); Problem Solving, Control Methods, And Search (I.2.8); Types Of Simulation (I.6.8)
