Computing Reviews, the leading online review service for computing literature.

Optimizing data placement on GPU memory: a portable approach
Chen G., Shen X., Wu B., Li D. IEEE Transactions on Computers 66(3): 473-487, 2017. Type: Article

Date Reviewed: Apr 17 2017

We say that a computational artifact is “portable” if it can be implemented on a variety of targets. In this paper, the portable artifact is a strategy for the placement of data on a graphics processing unit (GPU). The strategy is lazy in that it defers at least some placement decisions until the program has accumulated performance data from the current run.

The authors’ system, PORPLE, has three major components: MSL (a small language in which to describe GPU memory architecture), PORPLE-C (a compiler that understands data access patterns), and PLACER (an engine that makes the placement decisions). All of these components run on the central processing unit (CPU) rather than the GPU. PORPLE-C analyzes the user’s program, creating a description of the data access patterns and a target program that is able to accept data placement suggestions from PLACER and run correctly no matter what those suggestions are. PLACER makes its suggestions based on an encoding of the MSL specification for the GPU, the data access patterns produced by PORPLE-C, and information extracted as the program runs.

PORPLE was evaluated on three different machines, with different GPUs, by comparing it with a rule-based approach. It was consistently able to find GPU data placements that yielded excellent performance of the running program.

The paper is carefully written, gives good descriptions of the PORPLE components, and provides a convincing evaluation. There are extensive references. In order to fully appreciate the details, however, the reader needs to have an understanding of GPU architecture.
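
For readers less familiar with the idea, the division of labor described above (a memory specification, per-array access information, and an engine that maps arrays to memories) can be illustrated with a toy sketch. This is not PORPLE's algorithm, and it does not use MSL syntax. The names `MemorySpace`, `ArrayProfile`, and `place`, the greedy policy, and all capacities and costs are assumptions made up for illustration, and the sketch ignores real constraints such as read-only memories.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MemorySpace:
    """Hypothetical stand-in for one entry of a memory specification."""
    name: str
    capacity_bytes: int
    cost_per_access: float  # illustrative relative cost, not a measured latency


@dataclass(frozen=True)
class ArrayProfile:
    """Hypothetical stand-in for per-array access information."""
    name: str
    size_bytes: int
    accesses: int  # access count estimated or observed for the current run


def place(arrays, spaces):
    """Toy greedy placer: the most-accessed arrays get the cheapest space with room."""
    remaining = {s.name: s.capacity_bytes for s in spaces}
    by_cost = sorted(spaces, key=lambda s: s.cost_per_access)
    placement = {}
    for arr in sorted(arrays, key=lambda a: a.accesses, reverse=True):
        for space in by_cost:
            if arr.size_bytes <= remaining[space.name]:
                remaining[space.name] -= arr.size_bytes
                placement[arr.name] = space.name
                break
        else:
            raise ValueError(f"no memory space can hold {arr.name}")
    return placement


# Illustrative numbers only; a different GPU would simply supply a different list.
spaces = [
    MemorySpace("global", 1 << 30, 4.0),
    MemorySpace("shared", 48 * 1024, 1.0),
    MemorySpace("constant", 64 * 1024, 1.5),
]
arrays = [
    ArrayProfile("a", 32 * 1024, 1_000_000),
    ArrayProfile("b", 40 * 1024, 500_000),
    ArrayProfile("c", 1 << 20, 10_000),
]
print(place(arrays, spaces))
```

Because the memory description is plain data passed to `place`, the same placement code can be reused for another device by swapping the `spaces` list, which is the sense of portability the review describes.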

Reviewer: W. M. Waite	Review #: CR145201 (1707-0449)

Performance Analysis And Design Aids (B.3.3)

Cache Memories (B.3.2 ... )

Compilers (D.3.4 ... )

Graphics Processors (I.3.1 ... )

Modeling Of Computer Architecture (C.0 ... )
