Computing Reviews
Massively parallel lattice-Boltzmann codes on large GPU clusters
Calore E., Gabbana A., Kraus J., Pellegrini E., Schifano S., Tripiccione R. Parallel Computing 58: 1-24, 2016. Type: Article
Date Reviewed: Jan 31 2017

Calore et al. present a detailed overview of developing and optimizing a lattice-Boltzmann code for modern graphics processing units (GPUs). The paper begins with a brief introduction to the lattice-Boltzmann method (LBM) and modern Nvidia GPU architectures. The authors then explain how to optimize the LBM code for a single GPU by using different data structures and different ways of organizing the data-parallel kernels. Every optimization they apply is guided either by simple analytical performance models or by the results of small benchmarks. Afterwards, the authors present a structured way of porting the single-node LBM code to a large cluster of GPUs. Again, performance models and benchmarks are used to explain the performance of different domain decomposition strategies, such as 1D or multidimensional tiling.
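
To make the comparison of decomposition strategies concrete, the following is a minimal sketch of the kind of simple analytical model the paper relies on; it is not taken from the paper. It counts the halo (boundary) sites each GPU would have to exchange per time step under 1D and 2D tiling. The function name, the 2D periodic lattice, the halo width of one site, the neglect of corner exchanges, and the example sizes are all assumptions made for illustration.

```python
def halo_sites_per_gpu(lx, ly, px, py, halo=1):
    """Estimate the halo sites one GPU exchanges per step (hypothetical model).

    Assumes an lx-by-ly periodic lattice split into px-by-py equal tiles,
    one tile per GPU, with a halo `halo` sites wide and corners ignored.
    """
    if lx % px or ly % py:
        raise ValueError("lattice extents must be divisible by the tile grid")
    tile_x, tile_y = lx // px, ly // py
    sites = 0
    if px > 1:
        # Cutting along x creates two faces per tile, each tile_y sites long.
        sites += 2 * halo * tile_y
    if py > 1:
        # Cutting along y creates two faces per tile, each tile_x sites long.
        sites += 2 * halo * tile_x
    return sites


def surface_to_volume(lx, ly, px, py, halo=1):
    """Ratio of exchanged halo sites to sites updated locally on one GPU."""
    local_sites = (lx // px) * (ly // py)
    return halo_sites_per_gpu(lx, ly, px, py, halo) / local_sites


if __name__ == "__main__":
    # Illustrative numbers only: a 4096 x 4096 lattice on 16 GPUs.
    for label, (px, py) in {"1D slabs": (16, 1), "2D tiles": (4, 4)}.items():
        print(label, halo_sites_per_gpu(4096, 4096, px, py),
              surface_to_volume(4096, 4096, px, py))
```

Under these assumptions, the 2D tiling exchanges half as many sites per GPU as the 1D slabs for the same GPU count. A count like this says nothing about message latency or the cost of non-contiguous transfers, which is why a model of this kind needs to be checked against benchmarks, as the authors do.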

The paper is well written, and the optimization approach is presented in a structured and comprehensible manner. Performance engineers targeting parallel and distributed systems can therefore easily adapt the described strategies to other scientific applications. Furthermore, the paper offers results that are valuable for developing high-performance libraries or programming frameworks that enable performance-portable code, that is, code that achieves high performance across a wide range of architectures.

Reviewer:  Sergei Gorlatch Review #: CR145036 (1705-0310)
Graphics Processors (I.3.1 ... )
Heterogeneous (Hybrid) Systems (C.1.3 ... )
Parallel Processing (I.3.1 ... )
 

Reproduction in whole or in part without permission is prohibited.   Copyright 1999-2024 ThinkLoud®