A sequence of papers by C.C. Paige [1,2,3] introduced the notion of a generalized Q-R decomposition to fit the general linear model min uTu, y = X&bgr; + Cu, where &OHgr; = &sgr;2CCT is the covariance matrix for the unmeasurable errors, X is an m × n matrix, and C is an m × m matrix. The idea is simple: first, perform a “regular” Q-R decomposition of X into
where Q1 = Q1 = I is an orthogonal matrix and R is an upper triangular matrix. Then, find an orthogonal Q2 such that
where S22 is upper triangular. This reduction allows for a swift, elegant solution of generalized least squares (GLS) problems (see [2]).
Since C is often a Cholesky factor, it is often handed to us in upper triangular form and that is what the authors here assume. In making that assumption, the authors turn an O(m3) operation LAPACK procedure into an O(mn2) operation procedure using chasing techniques for applying Givens rotations. The authors then develop a parallel procedure that exploits a row wrap mapping of the rows of X and C under a message-passing model of parallel computing, and show results of tests that exhibit reasonable parallel speedups and scalability.