Computing Reviews
Optimizing data placement on GPU memory: a portable approach
Chen G., Shen X., Wu B., Li D. IEEE Transactions on Computers 66(3): 473-487, 2017. Type: Article
Date Reviewed: Apr 17 2017

We say that a computational artifact is “portable” if it can be implemented on a variety of targets. In this paper, the portable artifact is a strategy for the placement of data on a graphics processing unit (GPU). The strategy is lazy in that it defers at least some placement decisions until the program has accumulated performance data from the current run.

The authors’ system, PORPLE, has three major components: MSL (a small language in which to describe GPU memory architecture), PORPLE-C (a compiler that understands data access patterns), and PLACER (an engine that makes the placement decisions). All of these components run on the central processing unit (CPU) rather than the GPU.

PORPLE-C analyzes the user’s program, creating a description of the data access patterns and a target program that is able to accept data placement suggestions from PLACER and run correctly no matter what those suggestions are. PLACER makes its suggestions based on an encoding of the MSL specification for the GPU, the data access patterns produced by PORPLE-C, and information extracted as the program runs.
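To make the division of labor concrete, here is a minimal sketch of the kind of decision a placement engine has to make. It is not PORPLE's algorithm, which the review does not describe: the class names, the latency and capacity figures, the read-only rule, and the exhaustive search are all assumptions chosen for illustration. `MemorySpec` stands in for an MSL-style hardware description, and `ArrayProfile` stands in for the access-pattern summary a compiler pass might emit, with access counts that could be refined at run time.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class MemorySpec:
    """Hypothetical stand-in for one entry of an MSL-style description."""
    name: str
    capacity: int     # bytes available in this memory
    latency: float    # assumed average cost of one access, in cycles


@dataclass(frozen=True)
class ArrayProfile:
    """Hypothetical stand-in for a per-array access summary."""
    name: str
    size: int         # bytes
    accesses: int     # access count (assumed; could be updated at run time)
    read_only: bool


def estimated_cost(placement, arrays, memories):
    """Toy cost model: accesses times the latency of the chosen memory."""
    return sum(a.accesses * memories[placement[a.name]].latency for a in arrays)


def is_feasible(placement, arrays, memories, read_only_memories):
    """Reject placements that overflow a memory or put written data
    into a memory assumed to be read-only."""
    used = {name: 0 for name in memories}
    for a in arrays:
        target = placement[a.name]
        if target in read_only_memories and not a.read_only:
            return False
        used[target] += a.size
    return all(used[name] <= memories[name].capacity for name in memories)


def choose_placement(arrays, memory_list, read_only_memories=frozenset({"constant"})):
    """Exhaustively try every assignment of arrays to memories and keep the
    cheapest feasible one. Exponential, so only suitable for tiny examples."""
    memories = {m.name: m for m in memory_list}
    best, best_cost = None, float("inf")
    for combo in product(memories, repeat=len(arrays)):
        placement = {a.name: target for a, target in zip(arrays, combo)}
        if not is_feasible(placement, arrays, memories, read_only_memories):
            continue
        cost = estimated_cost(placement, arrays, memories)
        if cost < best_cost:
            best, best_cost = placement, cost
    return best, best_cost


# Illustrative inputs only; the numbers do not describe any real GPU.
memories = [
    MemorySpec("global", 1 << 30, 400.0),
    MemorySpec("constant", 64 * 1024, 40.0),
    MemorySpec("shared", 48 * 1024, 30.0),
]
arrays = [
    ArrayProfile("weights", 16 * 1024, 1_000_000, read_only=True),
    ArrayProfile("output", 1 << 20, 500_000, read_only=False),
    ArrayProfile("tile", 40 * 1024, 2_000_000, read_only=False),
]
placement, cost = choose_placement(arrays, memories)
print(placement, cost)
```

Because the hardware description is plain data, retargeting this sketch to a different GPU means changing only the `MemorySpec` list, which is the sense of "portable" the review has in mind. A production engine would need a far richer cost model and a search that scales beyond a handful of arrays.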

PORPLE was evaluated on three different machines, with different GPUs, by comparing it with a rule-based approach. It was consistently able to find GPU data placements that yielded excellent performance of the running program.

The paper is carefully written, describes the PORPLE components well, and provides a convincing evaluation. The references are extensive. To fully appreciate the details, however, the reader needs an understanding of GPU architecture.

Reviewer: W. M. Waite
Review #: CR145201 (1707-0449)
Performance Analysis And Design Aids (B.3.3)
Cache Memories (B.3.2 ...)
Compilers (D.3.4 ...)
Graphics Processors (I.3.1 ...)
Modeling Of Computer Architecture (C.0 ...)