Computing Reviews
Performance characteristics of the Cray X1 and their implications for application performance tuning
Shan H., Strohmaier E. Supercomputing (Proceedings of the 18th Annual International Conference on Supercomputing, Saint-Malo, France, Jun 26-Jul 1, 2004), 175-183. 2004. Type: Proceedings
Date Reviewed: Jul 1 2005

Over the last decade, superscalar processors have been widely used in high-performance computing, and have even replaced the vector architectures that were popular during the 1980s. However, the development of the Japanese Earth Simulator and the Cray X1 supercomputer has revived interest in vector architectures. In this context, the paper considers how the performance characteristics of superscalar and vector architectures differ, and addresses the question of how programs tuned for superscalar platforms can be transformed into programs that are efficient on vector architectures. The execution platforms considered are an IBM Power4 and a Cray X1. The performance evaluation is based on a synthetic access probe program (Apex-Map) and several application programs: a one-dimensional fast Fourier transform (1D-FFT), radix sort, an N-body simulation, and matrix multiplication.

Based on a detailed evaluation of the performance characteristics of the applications, the paper shows that the average vector length used and memory bank conflicts have the largest impact on the performance of the Cray X1. The memory size accessed and data reuse have a much smaller effect.
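
To illustrate why memory bank conflicts can matter for strided vector accesses, the following is a minimal sketch of a toy model. It is not the paper's methodology and does not describe the Cray X1's actual memory system. It assumes simple word interleaving (address a maps to bank a mod B) and a hypothetical bank count of 16; the function names are illustrative only.

```python
from math import gcd


def banks_touched(stride: int, num_banks: int) -> int:
    """Distinct banks hit by a long strided access stream.

    Assumes word address a lives in bank a % num_banks (simple
    interleaving). The addresses 0, s, 2s, ... then cycle through
    num_banks / gcd(s, num_banks) different banks.
    """
    return num_banks // gcd(stride, num_banks)


def simulate_banks(stride: int, num_banks: int, accesses: int) -> int:
    """Brute-force check: count the banks actually visited."""
    return len({(i * stride) % num_banks for i in range(accesses)})


if __name__ == "__main__":
    NUM_BANKS = 16  # hypothetical value, chosen only for illustration
    for stride in (1, 2, 8, 16, 17):
        print(f"stride {stride:2d}: {banks_touched(stride, NUM_BANKS)} banks in use")
```

In this model, a stride that shares a large common factor with the bank count concentrates all accesses on a few banks, while a stride that is coprime to the bank count spreads them over every bank. This is one common reason why padding an array dimension can help on interleaved memory systems.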

Using these findings as guidelines for performance tuning, the authors were able to increase the performance of 1D-FFT and radix sort significantly. For the N-body simulation, a performance gain of only 20 percent was obtained, mainly because irregular memory accesses limit the potential for vectorization.

The paper is written for expert readers who are interested in the characteristics of modern vector architectures and their impact on program performance, as well as in the tuning of programs for vector architectures.

Reviewer:  T. Rauber Review #: CR131446 (0605-0486)
Performance Attributes (C.4 ... )
Array And Vector Processors (C.1.2 ... )
Parallel Programming (D.1.3 ... )

Reproduction in whole or in part without permission is prohibited.   Copyright 1999-2024 ThinkLoud®