Computing Reviews, the leading online review service for computing literature.

Triangulating molecular surfaces over a LAN of GPU-enabled computers
Dias S., Gomes A. Parallel Computing 42(C): 35-47, 2015. Type: Article

Date Reviewed: Nov 11 2015

As the availability of modern massively parallel graphics processing units (GPUs) increases, researchers are exploring ways to utilize loosely coupled GPU-equipped workstations to increase application performance without the cost and complexity of traditional supercomputers. Dias and Gomes present a heterogeneous compute solution for one application: triangulation and rendering of molecular surfaces based on a marching cubes algorithm. Their solution uses a combination of OpenMPI, OpenMP, and CUDA across a network of multicore systems with CUDA-capable devices. They use a standard master-worker distributed architecture, where each worker node has multiple GPUs. Each GPU in the worker node is controlled by a unique central processing unit (CPU) core or thread. In their experiments, they use up to five worker nodes with two GPUs each, connected by gigabit and ten-gigabit networks.

The algorithm's performance is measured on 40 molecules of varying sizes. The results show that performance scales with the number of GPUs in the network, especially for very large molecules. Network communication overhead is discussed, and it appears that scaling beyond ten GPUs (five workers) may not yield additional performance improvement. While the performance results are generally very impressive, the data presented show some anomalies that are not addressed in the paper. The runtimes of the 40-core solution often show super-linear speedup compared to the single-CPU runtimes. Also, for the largest molecule, there is a 70 percent performance improvement on the InfiniBand network when using ten GPUs versus eight GPUs. These results are nonintuitive, and readers of this paper may be left wondering why this is the case.

This work is encouraging in that it shows that significant performance improvement can be achieved on some applications using a loosely coupled network of commodity hardware without a major refactoring of an existing algorithm.
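For readers unfamiliar with the master-worker layout described in this review, the following is a minimal sketch in Python of how work might be divided among worker nodes and then among the GPUs of each node, with one host thread per GPU. It is not the authors' code: the slab decomposition along a single grid axis, the function names (`split_range`, `gpu_task`, `worker_node`, `master`), and the placeholder "kernel" are all assumptions made for illustration. In the paper's setting, the nodes would be separate MPI processes and the per-GPU work would be CUDA kernels; here everything runs in one process so the partitioning logic can be checked.

```python
from concurrent.futures import ThreadPoolExecutor


def split_range(n_items, n_parts):
    """Split range(n_items) into n_parts contiguous, near-equal (start, stop) chunks."""
    base, extra = divmod(n_items, n_parts)
    chunks, start = [], 0
    for i in range(n_parts):
        stop = start + base + (1 if i < extra else 0)
        chunks.append((start, stop))
        start = stop
    return chunks


def gpu_task(node_id, gpu_id, z_start, z_stop):
    """Hypothetical stand-in for a per-GPU kernel launch.

    A real implementation would triangulate grid slices z_start..z_stop-1
    on the device; this placeholder only reports how many slices it was given.
    """
    return (node_id, gpu_id, z_stop - z_start)


def worker_node(node_id, z_start, z_stop, gpus_per_node):
    """Split one node's slab among its GPUs, driving each GPU from its own host thread."""
    sub_slabs = split_range(z_stop - z_start, gpus_per_node)
    with ThreadPoolExecutor(max_workers=gpus_per_node) as pool:
        futures = [
            pool.submit(gpu_task, node_id, gpu_id, z_start + a, z_start + b)
            for gpu_id, (a, b) in enumerate(sub_slabs)
        ]
        return [f.result() for f in futures]


def master(n_slices, n_nodes, gpus_per_node):
    """Assign one slab of grid slices to each worker node and gather the results.

    Assumption: nodes are visited in a plain loop here; in a cluster they
    would run concurrently as separate processes.
    """
    results = []
    for node_id, (a, b) in enumerate(split_range(n_slices, n_nodes)):
        results.extend(worker_node(node_id, a, b, gpus_per_node))
    return results


if __name__ == "__main__":
    # Five worker nodes with two GPUs each, as in the largest configuration reviewed.
    for node_id, gpu_id, n in master(n_slices=103, n_nodes=5, gpus_per_node=2):
        print(f"node {node_id} gpu {gpu_id}: {n} slices")
```

A static, even split like this is the simplest choice; it ignores the network transfer cost and any load imbalance between slabs, both of which bear on the scaling limits noted above.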

Reviewer: Chris Lupo	Review #: CR143928 (1601-0056)

Local and Wide-Area Networks (C.2.5)

Graphics Processors (I.3.1 ... )

Reproduction in whole or in part without permission is prohibited. Copyright 1999-2024 ThinkLoud®