This paper addresses an important challenge of parallel programming: optimizing a program for the specific characteristics of the target computer architecture. To this end, an auto-tuning approach is implemented and evaluated using matrix multiplication as an example.
Auto-tuning is a technique to automatically perform program optimizations; it is based on two core ideas:
- (1) the target application is implemented in a parameterized form with respect to so-called tuning parameters, such that different parameter values lead to semantically equivalent but differently optimized code variants; and
- (2) the values of the tuning parameters are determined for the specific characteristics of a device by an automated search of the parameter space.
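The first idea can be illustrated with a minimal sketch: a matrix multiplication whose loop structure is parameterized by a tile size. The tile size here is a hypothetical stand-in for the paper's tuning parameters; every value produces the same result but a differently structured computation.

```python
# Sketch of idea (1): a matrix multiplication parameterized by a tuning
# parameter (here, the tile size). Different values of `tile` change the
# loop structure and memory access pattern, not the mathematical result.

def matmul_tiled(A, B, tile=2):
    """Return C = A x B with all three loops blocked by `tile`."""
    n, k, m = len(A), len(B), len(B[0])
    C = [[0.0] * m for _ in range(n)]
    for ii in range(0, n, tile):
        for jj in range(0, m, tile):
            for kk in range(0, k, tile):
                # Accumulate one tile of C.
                for i in range(ii, min(ii + tile, n)):
                    for j in range(jj, min(jj + tile, m)):
                        s = C[i][j]
                        for p in range(kk, min(kk + tile, k)):
                            s += A[i][p] * B[p][j]
                        C[i][j] = s
    return C
```

Since all tile sizes compute the same product, an auto-tuner is free to pick whichever value runs fastest on the target device.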
The authors develop the parameterized implementation of matrix multiplication using the Heterogeneous Programming Library (HPL), a high-level parallel programming model on top of OpenCL. The implementation comprises 14 tuning parameters that capture optimizations such as work granularity, loop unrolling factors, local memory usage, and vectorization. A search engine based on a genetic algorithm determines device-optimized parameter values.
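The genetic-algorithm search can be sketched as follows. The parameter space, the parameter names, and the cost function below are hypothetical toy stand-ins (the paper's engine tunes 14 parameters by measuring kernel runtimes on the device); the sketch only shows the selection, crossover, and mutation loop such a search engine performs.

```python
import random

# Sketch of idea (2): a genetic-algorithm search over a toy
# tuning-parameter space. In a real tuner, cost() would compile the
# parameterized kernel and measure its runtime on the target device.

SPACE = {"tile": [1, 2, 4, 8, 16], "unroll": [1, 2, 4], "vector": [1, 2, 4, 8]}

def cost(cfg):
    # Hypothetical stand-in for a runtime measurement; lower is better.
    return abs(cfg["tile"] - 8) + abs(cfg["unroll"] - 4) + abs(cfg["vector"] - 4)

def random_cfg():
    return {k: random.choice(v) for k, v in SPACE.items()}

def crossover(a, b):
    # Each parameter is inherited from one of the two parents.
    return {k: random.choice((a[k], b[k])) for k in SPACE}

def mutate(cfg, rate=0.2):
    # Occasionally re-randomize a parameter to keep exploring the space.
    return {k: (random.choice(SPACE[k]) if random.random() < rate else v)
            for k, v in cfg.items()}

def tune(generations=30, pop_size=12, seed=0):
    random.seed(seed)
    pop = [random_cfg() for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=cost)
        parents = pop[: pop_size // 2]  # truncation selection keeps the best half
        children = [mutate(crossover(random.choice(parents),
                                     random.choice(parents)))
                    for _ in range(pop_size - len(parents))]
        pop = parents + children
    return min(pop, key=cost)
```

Because the best configurations survive each generation unchanged, the search cost never increases, and on small spaces it typically converges in a few dozen generations.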
The performance evaluation covers four device architectures: Intel central processing units (CPUs), NVIDIA and AMD graphics processing units (GPUs), and Intel Xeon Phi co-processors. Two reference implementations, clBLAS and ViennaCL, serve for comparison. The authors show that their auto-tuning system achieves better performance than both reference implementations.