Computing Reviews, the leading online review service for computing literature.


Massively parallel lattice-Boltzmann codes on large GPU clusters
Calore E., Gabbana A., Kraus J., Pellegrini E., Schifano S., Tripiccione R. Parallel Computing 58: 1-24, 2016. Type: Article

Date Reviewed: Jan 31 2017

Calore et al. present a detailed overview of the development and optimization of a lattice-Boltzmann code for modern graphics processing units (GPUs). The paper begins with a brief introduction to the lattice-Boltzmann method (LBM) and to modern Nvidia GPU architectures. The authors then explain how to optimize the LBM code for a single GPU, comparing different data structures and different ways of organizing the data-parallel kernels. Every optimization they apply is guided either by simple analytical performance models or by the results of small benchmarks. Afterwards, the authors present a structured way of porting the single-node LBM code to a large cluster of GPUs. Again, performance models and benchmarks are used to explain the performance of different domain decomposition strategies, such as 1D or multidimensional tiling. The paper is well written, and the optimization approach is presented in a structured and comprehensible manner. As a result, performance engineers targeting parallel and distributed systems can easily adapt the described strategies to other scientific applications. Furthermore, the paper contains results that are valuable for developing high-performance libraries or programming frameworks that enable performance-portable code, that is, code that achieves high performance across a wide range of architectures.
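To make the remark about domain decomposition concrete, the following is a minimal sketch of the kind of simple analytical model the review alludes to. It is not taken from the paper: the function names, the 2D lattice, the one-site halo width, and the example sizes are all assumptions chosen for illustration. It only counts how many lattice sites an interior GPU would have to exchange with its neighbors per time step under 1D slabs versus 2D tiles.

```python
def halo_sites_1d(ny: int, halo: int = 1) -> int:
    """Sites exchanged per step by an interior GPU when a 2D lattice is cut
    into slabs along x (hypothetical model, not the paper's).

    Each slab has two cut faces, each ny sites long and `halo` sites deep,
    so the count does not shrink as more GPUs are added.
    """
    return 2 * ny * halo


def halo_sites_2d(nx: int, ny: int, px: int, py: int, halo: int = 1) -> int:
    """Sites exchanged per step by an interior GPU when the lattice is cut
    into a px-by-py grid of tiles (corner sites ignored for simplicity).

    Each tile is (nx // px) by (ny // py) and has four cut edges.
    """
    tile_x = nx // px
    tile_y = ny // py
    return 2 * halo * (tile_x + tile_y)


if __name__ == "__main__":
    # Assumed example: a 4096 x 4096 lattice on 16 GPUs.
    nx = ny = 4096
    print("1D slabs :", halo_sites_1d(ny))              # 16 slabs along x
    print("2D tiles :", halo_sites_2d(nx, ny, 4, 4))    # 4 x 4 tiles
```

Under these assumptions the 2D tiling exchanges fewer sites per GPU, but it splits them across more, smaller messages, some of which are not contiguous in memory. A model of this kind therefore has to be paired with benchmarks, which is the combination the review describes.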

Reviewer: Sergei Gorlatch	Review #: CR145036 (1705-0310)

Graphics Processors (I.3.1 ... )

Heterogeneous (Hybrid) Systems (C.1.3 ... )

Parallel Processing (I.3.1 ... )


Other reviews under "Graphics Processors":	Date

Introduction to volume rendering Lichtenbelt B., Crane R., Naqvi S., Prentice-Hall, Inc., Upper Saddle River, NJ, 1998. Type: Book (9780138616830)	May 1 1999

Time/space tradeoffs for polygon mesh rendering Bar-Yehuda R., Gotsman C. ACM Transactions on Graphics (TOG) 15(2): 141-152, 1996. Type: Article	Jul 1 1997

A programmable vertex shader with fixed-point SIMD datapath for low power wireless applications Sohn J., Woo R., Yoo H. Graphics hardware (Proceedings of the ACM SIGGRAPH/EUROGRAPHICS Conference on Graphics Hardware, Grenoble, France, Aug 29-30, 2004) 107-114, 2004. Type: Proceedings	Jul 8 2005


Reproduction in whole or in part without permission is prohibited. Copyright 1999-2024 ThinkLoud®