Tomov et al.’s work lies at the interface of hardware (hybrid systems that combine graphics processing units (GPUs) with conventional central processing unit (CPU)-based systems) and software (for solving dense linear algebra (DLA) problems). From a hardware viewpoint, this is an important paper, since it addresses both the cost and the performance of GPUs, which make them highly desirable target systems, as illustrated by their popularity in the video game arena and their potential in computational science. On the software side, DLA is a natural first target, both for its utility and as a demonstration of the issues that need to be addressed when developing software for such hybrid hardware systems.
The paper begins with an introduction and a background section on the history of GPUs, followed by an overview of prior work on GPUs and DLA. Section 3 deals with the details of a hybrid system consisting of a multicore CPU and GPUs. It focuses on approaches to the hybridization of LAPACK, and details task splitting and scheduling using directed acyclic graphs (DAGs). Section 4 gives an illustrative example of DLA algorithms, which uses a random butterfly transformation (RBT) as a preconditioner, along with performance results. The paper also offers conclusions on extending this approach. This interesting paper can serve as a foundation for dealing with more advanced hybrid systems and more complex mathematical software problems.
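For readers unfamiliar with the RBT idea mentioned above, the following is a minimal CPU-only sketch in Python with NumPy. It is not the authors' hybrid implementation. It assumes a depth-1 butterfly of the form (1/sqrt(2)) [[R0, R1], [R0, -R1]] with random diagonal blocks, an even matrix order, and diagonal entries drawn as exp(r) with r uniform in [-0.05, 0.05]; the function names (butterfly, lu_nopiv_solve, rbt_solve) are hypothetical and chosen only for illustration. The point is that the system Ax = b is replaced by (U^T A V) y = U^T b with x = V y, so that Gaussian elimination can proceed without pivoting.

```python
import numpy as np


def butterfly(n, rng):
    """Depth-1 random butterfly matrix of even order n (illustrative form)."""
    h = n // 2
    # Random diagonal blocks with entries close to 1 (assumed distribution).
    r0 = np.diag(np.exp(rng.uniform(-0.05, 0.05, h)))
    r1 = np.diag(np.exp(rng.uniform(-0.05, 0.05, h)))
    return np.block([[r0, r1], [r0, -r1]]) / np.sqrt(2.0)


def lu_nopiv_solve(a, b):
    """Solve a x = b by Gaussian elimination with no pivoting at all."""
    a = a.astype(float).copy()
    x = b.astype(float).copy()
    n = a.shape[0]
    # Forward elimination, applied to the matrix and right-hand side together.
    for k in range(n - 1):
        factors = a[k + 1:, k] / a[k, k]
        a[k + 1:, k:] -= np.outer(factors, a[k, k:])
        x[k + 1:] -= factors * x[k]
    # Back substitution on the resulting upper triangular system.
    for k in range(n - 1, -1, -1):
        x[k] = (x[k] - a[k, k + 1:] @ x[k + 1:]) / a[k, k]
    return x


def rbt_solve(a, b, seed=0):
    """Precondition with two butterflies, then solve without pivoting."""
    rng = np.random.default_rng(seed)
    n = a.shape[0]
    u = butterfly(n, rng)
    v = butterfly(n, rng)
    a_r = u.T @ a @ v                  # transformed matrix U^T A V
    y = lu_nopiv_solve(a_r, u.T @ b)   # solve (U^T A V) y = U^T b
    return v @ y                       # recover x = V y


# The leading entry of this matrix is zero, so elimination without
# pivoting breaks down on A itself but goes through on U^T A V.
A = np.array([[0.0, 1.0, 2.0, 3.0],
              [1.0, 0.0, 1.0, 2.0],
              [2.0, 1.0, 0.0, 1.0],
              [3.0, 2.0, 1.0, 0.0]])
b = np.array([1.0, 2.0, 3.0, 4.0])
x = rbt_solve(A, b)
residual = np.linalg.norm(A @ x - b)
```

In this sketch the butterflies are formed as dense matrices for clarity; a practical code would exploit their sparse structure so that applying them costs far less than the factorization itself.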