Research on multicore utilization falls into three major categories: multicore interaction and interconnect issues, operating system concerns, and compiler and programming language concerns. This paper addresses the third category, with emphasis on improving data dependence analysis for loops and thereby enhancing the performance of automatic vectorization.
The discussion opens with data dependence, where the central difficulty is writes to shared memory locations in a shared-memory multicore environment. Before turning to dependence analysis and automatic vectorization proper, the paper reviews the automatic vectorization problem and the role of OpenMP constructs in compiler optimization.
The novelty of the proposal lies in improving the precision of dependence analysis, and hence automatic vectorization, by sharing OpenMP information with existing compiler machinery. The idea has been implemented as a GNU Compiler Collection (GCC) optimization pass: the loop execution ordering expressed by OpenMP annotations is extracted and fed to GCC's dependence analysis phase, which improves the effectiveness of automatic vectorization. The approach supports the "for" and "parallel for" OpenMP pragmas and these clauses: chunk scheduling, ordering, attribute scopes, reduction, collapsing, and nowait. Dependence analysis, a prototype overview, and control flow are the other topics considered. A lengthy experimental evaluation forms the last part of the paper.
While the topic is a worthy one, the sections are not well balanced for an illustrative paper. The reader expects a more thorough theoretical discussion of the proposal's details rather than a lengthy introduction and experimental results.