Computing Reviews

Performance modeling for hierarchical graph partitioning in heterogeneous multi-core environment
Chan S., Ling T., Aubanel E. Parallel Computing46(C):78-97,2015.Type:Article
Date Reviewed: 11/12/15

Graph partitioning is an interesting topic in parallel scientific computing that has been studied for decades. For many numerical methods, such as finite element methods (FEMs), finite difference methods, and finite volume methods, a mesh has to be distributed among message passing interface (MPI) processes. A partition must be determined such that each process has an equal workload if the system is homogeneous and communications should be minimized. Graph partitioning is ideal to mesh partitions, since each mesh can be converted to a dual graph and graph partitioning tools can be applied right away, such as ParMETIS. Usually, communications are quasi-optimal and good scalability can be expected.

This paper studies performance modeling of hierarchical graph partitioning in heterogeneous clusters, where a new model is proposed under several assumptions: communications are asynchronous and non-blocking; computations and communications overlap; and network and memory contentions are non-negligible. The model also assumes that computations are linearly proportional to mesh size. The authors provide a step-by-step procedure, including platform model and performance model calibration. The performance model is validated using four real-world meshes.

The research topic is interesting and practical. However, the value of this paper is uncertain. First, the targeted applications are FEM codes whose communications can be modeled precisely. It is well known that many open-source parallel graph partitioning tools exist, such as ParMETIS and PT-SCOTCH, which can provide excellent partitioning. Second, real FEM applications require preconditioners to work with linear solvers. This paper uses an iterative conjugate gradient method only, which is not efficient to my knowledge. In the end, the paper assumes that computations are proportional to mesh size, which is not true, especially when the number of MPI processes is large.

Reviewer:  Hui Liu Review #: CR143938 (1601-0059)

Reproduction in whole or in part without permission is prohibited.   Copyright 2024 ComputingReviews.com™
Terms of Use
| Privacy Policy