Computing Reviews
The Cross-Entropy Method: A Unified Approach to Combinatorial Optimization, Monte-Carlo Simulation and Machine Learning (Information Science and Statistics)
Rubinstein R., Kroese D., Springer-Verlag New York, Inc., Secaucus, NJ, 2004. Type: Book (9780387212401)
Date Reviewed: Nov 23 2004

Rarely have I seen such a dense and straight-to-the-point pedagogical monograph on such a modern subject. This excellent book, on the simulation-based cross-entropy (CE) method pioneered by one of the authors (Rubinstein), is very well written, starting with an exposition of the notions to be used, continuing with examples and a detailed presentation of the variants of the method, and closing with a discussion of how the CE method effectively attacks an impressive list of modern problems. These include combinatorial optimization problems (COPs), like the traveling salesman problem (TSP), the quadratic assignment problem (QAP), the max-cut problem, and the alignment of deoxyribonucleic acid (DNA) sequences; noisy estimation problems, like the buffer allocation problem (BAP); noisy optimization; continuous multi-extremal optimization; machine learning; and clustering and vector quantization problems. The amazing thing is that such a diverse list of topics is attacked and solved by methodologies that all have, at their core, the same simple notions, briefly presented below.

Assume that we are searching for the value of an integral of the form A = I[f(x)g(x;u)dx] = Eu[f], where x and u are multidimensional vectors, and that no analytical solution is known, so we turn to Monte Carlo simulation. We first try the “brute force” estimator (1/N)Σi[f(xi)], where the N points xi follow the g(x;u) distribution. The problem arises when f(x) is significant only where the values xi are very rare according to g; in this case, we might need an extremely large N to estimate the quantity A with an acceptable relative error. A cleverer approach is to find an importance sampling (IS) density h(x), usually belonging to the family of the g(x;u) densities, and then attack an easier problem: I[f(x) (g(x;u)/h(x)) h(x)dx] = Eh[fg/h]. Now we go stochastic again, using a generation of N points {xi} that follow the h(x) distribution and the estimator (1/N)Σi[f(xi)g(xi;u)/h(xi)].
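To make the contrast concrete, here is a minimal sketch (a toy problem of my own, not one from the book): estimating A = P(X ≥ 10) for a unit-rate exponential X, where the brute-force estimator starves for hits while an exponential IS density h with mean 10 does fine.

```python
import numpy as np

rng = np.random.default_rng(0)
gamma, N = 10.0, 100_000

# Brute force: sample straight from g(x;u) = Exp(rate 1).  Points beyond
# gamma are so rare (P ~ 4.5e-5) that the estimate is noisy or exactly zero.
x = rng.exponential(scale=1.0, size=N)
brute = np.mean(x >= gamma)

# Importance sampling: draw from h(x) = Exp(rate 0.1), which puts real mass
# beyond gamma, and reweight each point by the likelihood ratio g(x;u)/h(x).
y = rng.exponential(scale=10.0, size=N)
lr = np.exp(-y) / (0.1 * np.exp(-0.1 * y))
is_est = np.mean((y >= gamma) * lr)

print(brute, is_est, np.exp(-gamma))  # IS lands close to exp(-10); brute force rarely does
```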

The best way to use this estimator is to find the best h(x), and this best h, named g*, is equal to f(x)g(x;u)/A, since, when we put it into the summation above, every term equals A and we get (1/N)NA = A, an estimator with zero variance. The problem is that this best h is unattainable, since A is unknown.
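The zero-variance claim can be checked numerically. In the same toy problem, g*(x) = f(x)g(x;u)/A is a unit-rate exponential shifted to start at γ = 10, and every summand f(x)g(x;u)/g*(x) collapses to A:

```python
import numpy as np

rng = np.random.default_rng(1)
gamma = 10.0

# Draws from g*: a unit-rate exponential shifted to start at gamma.
x = gamma + rng.exponential(scale=1.0, size=5)

# Each summand f(x) g(x;u) / g*(x); f(x) = 1 for all of these draws.
terms = np.exp(-x) / np.exp(-(x - gamma))
print(terms)  # every entry equals exp(-10) = A exactly -- but building g* required knowing A
```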

If we then start from a nonoptimal h, the question is: “Can we use a generation of sample points xi to improve h, and then repeat the process with the improved h?” This requires a way to measure how far an h distribution is from g*. Here, the Kullback-Leibler distance (or cross-entropy) is used: D(g,h) = I[ln(g(x)/h(x)) g(x)dx] = I[g ln g dx] − I[g ln h dx] ≥ 0.
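As a quick numerical sanity check of this formula (again my own toy, not from the book): for two exponential densities with rates a and b, D has the closed form ln(a/b) + b/a − 1, and a Monte Carlo average of ln(g(X)/h(X)) over X ~ g reproduces it.

```python
import numpy as np

rng = np.random.default_rng(2)
a, b, N = 1.0, 0.1, 200_000

# X ~ g, with g = Exp(rate a) and h = Exp(rate b).
x = rng.exponential(scale=1.0 / a, size=N)
mc = np.mean(np.log(a * np.exp(-a * x)) - np.log(b * np.exp(-b * x)))

closed = np.log(a / b) + b / a - 1.0
print(mc, closed)  # both ~ 1.40, and nonnegative, as the inequality promises
```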

So, what we are searching for is the h with minimal D from g*; that is, we are solving the functional minimization problem minh{D(g*,h)}, where the class of densities h is restricted to the family g(x;v), so the problem becomes minv{D(g*,g(x;v))}. Since the first term of D does not depend on v, this reduces to maximizing I[g* ln g(x;v)dx] with respect to v. Putting the expression for g* back in, we get maxv{I[(f(x)g(x;u)/A) ln g(x;v)dx]} = maxv{Eu[f(x)ln g(x;v)]}, since the constant factor 1/A does not affect the maximizer.

But now we can again use the IS approach on this last expectation integral, with an IS density g(x;w), to obtain: maxv{Ew[f(x) ln g(x;v) g(x;u)/g(x;w)]}.

Obviously, we perform this step because we want to create an iterative algorithm that approximates g* little by little, searching the parameter space of the family g(x;·). Now the last expression can be attacked stochastically, as we did for A, namely: v* = arg-maxv{(1/N)Σi[f(xi) ln g(xi;v) g(xi;u)/g(xi;w)]} (a), where the xi follow the g(x;w) distribution. In typical applications, the function inside the braces {} is concave and differentiable with respect to v, so v* can be found by solving the following system of equations: (1/N)Σi[f(xi) (g(xi;u)/g(xi;w)) ∇v ln g(xi;v)] = 0.
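For a concrete instance of solving this system, consider the one-parameter exponential family g(x;v) = (1/v)exp(−x/v) (mean parameterization, my choice for illustration): ∇v ln g(x;v) = −1/v + x/v², so (a) reduces to a weighted sample mean.

```python
import numpy as np

rng = np.random.default_rng(3)
u, w, gamma, N = 1.0, 5.0, 10.0, 100_000

x = rng.exponential(scale=w, size=N)                 # x_i ~ g(x;w)
f = (x >= gamma).astype(float)                       # f(x_i) = 1{x_i >= gamma}
lr = (np.exp(-x / u) / u) / (np.exp(-x / w) / w)     # g(x_i;u) / g(x_i;w)

# Setting sum_i f_i lr_i (-1/v + x_i/v^2) = 0 gives v* as a weighted mean:
v_star = np.sum(f * lr * x) / np.sum(f * lr)
print(v_star)  # ~ 11 = E[X | X >= gamma] under u, i.e., the mean of g* within this family
```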

So, the main CE algorithm is:

(1) Set t = 1, v0 = u.
(2) Generate a sample {xi} from the density g(x;vt−1).
(3) Calculate f(xi), and maybe do some tricks here (see below).
(4) Solve the maximization problem (a), using vt−1 in place of w; set vt = v*.
(5) Set t = t + 1 and go to step 2, until satisfied.
(6) Estimate A by (1/N)Σi[f(xi)g(xi;u)/g(xi;vt)], using the latest available sample {xi}.

Step 3 and the “until satisfied” of step 5 need some clarification: since we stay in the family of the initial probability density function (pdf), there is a strong possibility that f(x) is significant only for cases that remain rare even under g(x;w), so the initial problem persists. The trick is to go from f to a family of fs; specifically, f is slightly redefined at each iteration, so that each vt provides the optimal IS density for the redefined f and the redefined A. In the typical rare-event setting, where f(x) is an indicator of the form I{S(x) ≥ γ}, this means replacing γ by a sequence of more modest levels γt that increase toward γ. The stopping criterion is satisfied when the successive redefinitions of f return it to the original f; a sketch of the complete loop follows.
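Putting the algorithm and the trick together, here is a compact sketch for the same toy rare-event problem, with the level γt raised at each iteration to the (1−ρ) sample quantile (capped at γ); the parameters ρ and N are my own illustrative choices, and the update in step 4 uses the closed form derived above.

```python
import numpy as np

rng = np.random.default_rng(4)
u, gamma, N, rho = 1.0, 10.0, 10_000, 0.1

v = u                                                   # (1) v_0 = u
while True:
    x = rng.exponential(scale=v, size=N)                # (2) sample from g(x; v_{t-1})
    g_t = min(np.quantile(x, 1.0 - rho), gamma)         # (3) the trick: a more modest level
    f = (x >= g_t).astype(float)
    lr = (np.exp(-x / u) / u) / (np.exp(-x / v) / v)    # g(x;u) / g(x; v_{t-1})
    v = np.sum(f * lr * x) / np.sum(f * lr)             # (4) closed-form v_t for this family
    if g_t >= gamma:                                    # (5) satisfied: f is the original again
        break

x = rng.exponential(scale=v, size=N)                    # (6) final IS estimate of A
lr = (np.exp(-x / u) / u) / (np.exp(-x / v) / v)
print(np.mean((x >= gamma) * lr), np.exp(-gamma / u))   # both ~ 4.5e-5
```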

COPs are attacked by the CE method by translating the original deterministic optimization problem into a related stochastic estimation problem, and then applying the rare-event machinery of the CE method, as sketched below. In general, however, choosing an appropriate parameterization and sampling scheme is more of an art than a science.
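As a flavor of that translation, here is a hedged sketch of CE on a tiny max-cut instance (the 5-node weight matrix and the smoothing factor alpha are my own illustrative choices; smoothed updates are a standard CE refinement). Cuts are sampled from independent Bernoulli(pj) components, and p is updated toward the best ρ-fraction of the sample, which is the CE update for this parameterization.

```python
import numpy as np

rng = np.random.default_rng(5)
W = np.array([[0, 1, 3, 0, 2],     # symmetric edge weights of a made-up graph
              [1, 0, 1, 4, 0],
              [3, 1, 0, 2, 1],
              [0, 4, 2, 0, 3],
              [2, 0, 1, 3, 0]], dtype=float)
n, N, rho, alpha = len(W), 200, 0.1, 0.7

def cut_value(x):
    # Total weight of edges crossing the partition {x=0} versus {x=1}.
    return 0.5 * np.sum(W * (x[:, None] != x[None, :]))

p = np.full(n, 0.5)                                     # Bernoulli parameters play the role of v
for t in range(30):
    X = (rng.random((N, n)) < p).astype(int)            # sample N cuts from g(x;p)
    scores = np.array([cut_value(x) for x in X])
    elite = X[scores >= np.quantile(scores, 1 - rho)]   # keep the best rho-fraction
    p = alpha * elite.mean(axis=0) + (1 - alpha) * p    # CE update, smoothed

best = (p > 0.5).astype(int)
print(best, cut_value(best))    # p degenerates toward a (near-)optimal cut indicator
```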

The book contains exercises, extensive numerical examples, a thorough introduction to each subject (supported by an extensive literature survey), example programs, and an associated Web site, http://www.cemethod.org, with additional material, namely programs, tutorials, papers, and links.

Scientists involved in any of the problems mentioned above, students of the physical and engineering sciences, and researchers in stochastic optimization and simulation should read this book; they will benefit both in inspiration and in grasping the essence of a modern, extremely powerful computational tool.

Reviewer: Constantin S. Chassapis
Review #: CR130457 (0507-0768)
Categories: Stochastic Programming (G.1.6); Heuristic Methods (I.2.8); Monte Carlo (I.6.8); Problem Solving, Control Methods, And Search (I.2.8); Types Of Simulation (I.6.8)
