Computing Reviews, the leading online review service for computing literature.

Gorlatch, Sergei
University of Muenster
Muenster, Germany

1 - 10 of 46 reviews

Convolutional neural networks in APL
Šinkarovs A., Bernecky R., Scholz S. ARRAY 2019 (Proceedings of the 6th ACM SIGPLAN International Workshop on Libraries, Languages and Compilers for Array Programming, Phoenix, AZ, Jun 22, 2019) 69-79, 2019. Type: Proceedings
After a short introduction, the authors show how to implement a convolutional neural network (CNN) in APL, a programming language based on multidimensional arrays, illustrated by an example CNN for handwriting recognition on the Modif...

Oct 31 2019

System-wide time versus density tradeoff in real-time multicore fluid scheduling
Kim K., Cho Y., Eo J., Lee C., Han J. IEEE Transactions on Computers 67(7): 1007-1022, 2018. Type: Article
The paper addresses an important question of parallel programming: the degree to which an application should be parallelized to keep parallelization overhead low. This is also known as the time improvement versus density tradeoff problem....

Aug 30 2018

A technique to automatically determine ad-hoc communication patterns at runtime
Moreton-Fernandez A., Gonzalez-Escribano A., Llanos D. Parallel Computing 69: 45-62, 2017. Type: Article
Moreton-Fernandez et al. present an approach to automatically determine communication patterns at runtime for programs that run on distributed-memory parallel systems. Optimizing and tuning code for such systems when using traditional,...

Feb 14 2018

Compiler-based code generation and autotuning for geometric multigrid on GPU-accelerated supercomputers
Basu P., Williams S., Van Straalen B., Oliker L., Colella P., Hall M. Parallel Computing 64: 50-64, 2017. Type: Article
Basu et al. present a code generation and autotuning technique for geometric multigrid codes targeted for graphics processing unit (GPU)-accelerated supercomputers using the CUDA-CHiLL compilation framework....

Aug 30 2017

A GPU-based branch-and-bound algorithm using IntegerVectorMatrix data structure
Gmys J., Mezmaz M., Melab N., Tuyttens D. Parallel Computing 59(C): 119-139, 2016. Type: Article
Gmys et al. address the parallelization of the popular branch-and-bound (B&B) algorithm for solving combinatorial optimization problems using graphics processing units (GPUs). The irregular structure of B&B makes it challenging...

May 17 2017

GPU-accelerated Hungarian algorithms for the linear assignment problem
Date K., Nagi R. Parallel Computing 57(C): 52-72, 2016. Type: Article
The paper is devoted to a fundamental problem in the area of combinatorial optimization--the assignment problem. The goal is to find a maximum (or minimum) weight matching in a weighted bipartite graph. The authors focus on th...

May 11 2017

Massively parallel lattice-Boltzmann codes on large GPU clusters
Calore E., Gabbana A., Kraus J., Pellegrini E., Schifano S., Tripiccione R. Parallel Computing 58: 1-24, 2016. Type: Article
Calore et al. present a detailed overview of the development and optimization process of lattice-Boltzmann code for modern graphics processing units (GPUs). The paper begins with a brief introduction of the lattice-Boltzmann method (LB...

Jan 31 2017

Writing a performance-portable matrix multiplication
Fabeiro J., Andrade D., Fraguela B. Parallel Computing 52(C): 65-77, 2016. Type: Article
An important challenge of parallel programming--programming optimization with regard to the specific characteristics of the target computer architecture--is addressed in this paper. For this, an auto-tuning approach i...

May 26 2016

Thread-level synthetic benchmarks for multicore systems
Sen A., Deniz E. Microprocessors & Microsystems 39(7): 471-479, 2015. Type: Article
A framework that automatically generates synthetic benchmarks for multicore systems is presented in this paper....

Apr 12 2016

Compiler-driven software speculation for thread-level parallelism
Yiapanis P., Brown G., Luján M. ACM Transactions on Programming Languages and Systems 38(2): 1-45, 2016. Type: Article
Compilers quickly reach their limits in parallelizing sequential code (for example, C++): static analysis methods often fail because of insufficient information about values that become known at runtime; therefore, compilers work usual...

Feb 24 2016

Reproduction in whole or in part without permission is prohibited. Copyright 1999-2024 ThinkLoud®