Computing Reviews, the leading online review service for computing literature.

Search

Work-efficient matrix inversion in polylogarithmic time
Sanders P., Speck J., Steffen R. ACM Transactions on Parallel Computing2 (3):1-29,2015.Type:Article

Date Reviewed: Dec 9 2015

This paper is best motivated by the following observation (on p. 2). Currently, the fastest machine in the TOP500 list is rated at 16.3 peta floating-point operations per second (PFLOPS), which is achieved on matrices of dimension 12,681,215. Twenty years ago, the speed was 170.6 giga FLOPS, and the corresponding matrix dimension was 42,000. This seems like tremendous progress, a factor of ten in speed every four years. However, the 1995 machine took less than five minutes to demonstrate this speed, whereas the 2015 machine took more than 23 hours. This is because the usual linear algebra algorithms have critical paths of length Ω (N) when dealing with N × N matrices, independent of the number of processors used. Algorithms that have shorter (polynomial in log N) critical paths are known, but they perform asymptotically more work in all but the classical algorithms. This paper proposes algorithms for inverting N × N symmetric positive definite matrices where the critical path is polynomial in log N, but the work involved is at most a constant multiple of that of the classic algorithms. The key idea is to use Strassen’s inversion [1] (not to be confused with Strassen’s multiplication). However, if we use this all the way, we get a critical path of length Ω(N). Hence, the authors stop at some point n and switch to Pan’s adaptation of Newton’s iteration [2], using O(log N) iterations (which suffices if the condition number of the matrix is polynomial in N). This does give the desired result: algorithm NeSt requires work (1+o(1))N³ and critical path length O(log^2.5+εN). However, is it stable? This is a key question for nonstandard inversion algorithms. The authors in fact propose doing a single further Newton iteration after each Strassen step. This algorithms NeStS multiplies the work by five, but under appropriate conditions can be proved to be stable; in fact, it is so under much wider conditions in practice. Modeling for a hypothetical (but pretty realistic) exascale machine shows that the basic NeST algorithm is 10 times faster than LU decomposition of N<1,000,000. It is a pity that the authors didn’t also model NeStS, as this seems theoretically recommended. The authors also perform practical tests on a 4×8-core NUMA machine, measuring their algorithms against a recent (inversion has apparently improved recently) Intel MKL implementation. Their NeSt uses MKL for the addition and multiplication subroutines. On a single core, NeSt takes 1.27 times as long as MKL, better than theory would predict. On 32 cores, with N=2¹⁴, MKL seems to take 50 percent longer than NeSt. There are then some rather surprising results on accuracy, performed on N=2¹³ matrices (always symmetric positive definite). Up to condition number 2⁶, the methods are similar. Then, pure Newton and, surprisingly, NeStS are worse, while NeSt tracks MKL up to condition number 2¹⁴, and then is worse (but still much better than NeStS or Newton). Finally, the authors use the theoretical trick of reducing inversion of arbitrary matrices to symmetric positive definite ones. As expected, the accuracy behavior is dire (as this trick squares the condition number).

Reviewer: J. H. Davenport	Review #: CR144002 (1602-0129)

1)	Strassen, V. Gaussian elimination is not optimal. Numerische Mathematik 13 (1969), 354–356.

2)	Pan, V.; Reif, J. Fast and efficient parallel solution of dense linear systems. Computers and Mathematics with Applications 14 (1989), 1481–1491.

Numerical Linear Algebra (G.1.3 )

Numerical Algorithms And Problems (F.2.1 )

Would you recommend this review?

yes

Other reviews under "Numerical Linear Algebra":	Date

Exploiting fast matrix multiplication within the level 3 BLAS Higham N. ACM Transactions on Mathematical Software 16(4): 352-368, 2000. Type: Article	Aug 1 1991

Fundamentals of matrix computations Watkins D., John Wiley & Sons, Inc., New York, NY, 1991. Type: Book (9780471614142)	Jun 1 1992

Computational methods for linear control systems Petkov P., Christov N., Konstantinov M., Prentice Hall International (UK) Ltd., Hertfordshire, UK, 1991. Type: Book (9780131618039)	Jun 1 1992

more...

Reproduction in whole or in part without permission is prohibited. Copyright 1999-2024 ThinkLoud^®
Terms of Use | Privacy Policy