For computers dedicated to high-performance computing (HPC) applications, computation is virtually free; data access is the dominant cost. This iconoclastic view is even more valid for graphics processing units (GPUs): thousands of cores operating on rather limited high-speed register files must still fall back on external, host-based memories that are at least a thousand-fold slower. One way to address this shortcoming is to replace the storing of intermediate values, when evaluating instruction blocks, with the recomputation of those same values, thus trading cheap repeated expression evaluations for costly memory spills.
The main idea introduced in this paper is that this rematerialization of values can sometimes be performed more favorably from output values instead of input ones, provided the operators involved are reversible. Hence, after a register assignment such as r2 := r1 + 1, one can use reversible computing to recompute, for instance, the expression 2 × r1 as 2 × (r2 - 1), which frees register r1 for reuse.
The paper illustrates how this concept can improve both instruction scheduling and register allocation in scientific applications, taking as its use case a lattice quantum chromodynamics (LQCD) physics simulation running on an Nvidia GPU. Rematerialization via reversible computing reduces register pressure beyond what traditional techniques achieve, and this proves beneficial in practice: double-precision run-time performance improves by up to 10 percent.
Though still preliminary, this work addresses a key compilation issue for HPC applications and offers new ideas that should interest compiler writers and researchers working on program transformations.