Computing Reviews, the leading online review service for computing literature.

Mapping deep nested do-loop DSP algorithms to large scale FPGA array structures
Kittitornkun S., Hu Y. IEEE Transactions on Very Large Scale Integration (VLSI) Systems 11(2): 208-217, 2003. Type: Article

Date Reviewed: Jan 13 2004

In its first section, this paper introduces the new MODG design methodology for mapping an inner loop body to a configurable, linear array of processing elements (PEs). The loop body is represented by a loop dependence graph, while the PEs are configured by building a superscalar processor from the many low-cost, high-speed transistors of a field-programmable gate array (FPGA). The method is particularly timely given recent significant advances in FPGA technology. The unique combination of new hardware technology and new design methodology allows one to build special-purpose, superscalar processors whose PEs are interconnected much like a systolic array (SA). Section 2 outlines how this SA architecture can be designed from FPGAs, while section 3 explains the design methodology with considerable formalism. Section 4 presents an example, and section 5 offers concluding remarks, which are followed by a bibliography.

The amount of theoretical formalism and design methodology presented seems disproportionate for an idea that is, in essence, a skillful code generation algorithm for mapping certain loops to configurable target architectures. The great flexibility afforded by configuring the target stems from the adaptability of FPGAs, which can be tailored to a large class of superscalar processors. These processors are arranged linearly, like one-dimensional SAs. Unlike those of a typical SA, however, the PEs may be heterogeneous. Any architect working on image processing or digital signal processing (DSP) applications who is not constrained by a fixed, predefined target architecture should understand the ideas presented in this paper. More importantly, the new methods are actually simpler and better than the formalisms of section 3 make them appear.

Perhaps the authors can rearticulate their good ideas by clarifying and simplifying them, and then publish a revised methodology that is much more widely applicable than the one shown in the current paper. This good work has far greater potential than its current presentation reveals.

Reviewer: Herbert G. Mayer	Review #: CR128903 (0406-0694)

Parallelism And Concurrency (F.1.2 ... )

Arrays (E.1 ... )

Parallel Algorithms (G.1.0 ... )

General (G.1.0 )

Data Structures (E.1 )
