Reviews about "Multiple Data Stream Architectures (Multiprocessors) (C.1.2)":
Algorithm 980: sparse QR factorization on the GPU
Yeralan S., Davis T., Sid-Lakhdar W., Ranka S. ACM Transactions on Mathematical Software 44(2): 1-29, 2017. Type: Article
Many large-scale scientific and engineering computational problems lead, after some kind of discretization, to the solution of huge systems of linear algebraic equations and/or linear least squares problems containing many hundreds of ...
Date reviewed: Mar 14 2018

Coarray-based load balancing on heterogeneous and many-core architectures
Cardellini V., Fanfarillo A., Filippone S. Parallel Computing 68: 45-58, 2017. Type: Article
Load balancing to gain performance and energy efficiency within a heterogeneous processing environment, which contains processing units with different specialized cores and memory specifications that are supposed to collaborate in a pa...
Date reviewed: Dec 12 2017

Improving loop dependence analysis
Jensen N., Karlsson S. ACM Transactions on Architecture and Code Optimization 14(3): 1-24, 2017. Type: Article
Research on multicore utilization embraces three major categories of topics: investigations regarding the issues of multicore interaction functionality, operating system affairs and compilers, and programming language concerns. The pap...
Date reviewed: Dec 7 2017

SoPHy+
Kim T., Kang J., Kim S., Ha S. Microprocessors & Microsystems 43(C): 47-58, 2016. Type: Article
This paper deals with the problem of efficiently mapping programs to an accelerator that presents runtime choices about how to map programs to computation elements either because the accelerator configuration is unknown at compile time...
Date reviewed: Aug 1 2016

A MapReduce scratchpad memory for multi-core cloud computing applications
Kachris C., Sirakoulis G., Soudris D. Microprocessors & Microsystems 39(8): 599-608, 2015. Type: Article
MapReduce is a programming framework that is widely used for data center cloud computing or multicore system-on-chip (SoC) applications. The goal is to process and generate large datasets. MapReduce is a two-step function: the first st...
Date reviewed: May 19 2016

FPGA-GPU communicating through PCIe
Thoma Y., Dassatti A., Molla D., Petraglio E. Microprocessors & Microsystems 39(7): 565-575, 2015. Type: Article
Hardware accelerators have seen increasing use in the last decade and can provide significant performance benefits compared to traditional central processing unit (CPU) architectures, especially for massively parallel applications. Spe...
Date reviewed: May 18 2016

Speculative segmented sum for sparse matrix-vector multiplication on heterogeneous processors
Liu W., Vinter B. Parallel Computing 49(C): 179-193, 2015. Type: Article
Sparse matrix-vector multiplication (SpMV) is a key operation in many scientific and graph applications. It is also challenging because of data compression and load balancing issues. Data compression is needed to efficiently store the ...
Date reviewed: Apr 21 2016

Thread-level synthetic benchmarks for multicore systems
Sen A., Deniz E. Microprocessors & Microsystems 39(7): 471-479, 2015. Type: Article
A framework that automatically generates synthetic benchmarks for multicore systems is presented in this paper....
Date reviewed: Apr 12 2016

Modular vector processor architecture targeting at data-level parallelism
Rooholamin S., Ziavras S. Microprocessors & Microsystems 39(4): 237-249, 2015. Type: Article
This research presents a VHSIC hardware description language (VHDL) vector processor architecture specifically designed to address data-level parallelism by separating the vector lanes to use its own private memory, avoiding any stalls...
Date reviewed: Apr 8 2016

Design of 4-disjoint gamma interconnection network layouts and reliability analysis of gamma interconnection networks
Rajkumar S., Goyal N. The Journal of Supercomputing 69(1): 468-491, 2014. Type: Article
Two designs of 4-disjoint gamma interconnection networks for reliable data communication in a tightly coupled, large-scale, multiprocessor system are described in this paper. These two designs provide four disjoint paths for each sourc...
Date reviewed: Jun 10 2015