Computing Reviews
The WaveScalar architecture
Swanson S., Schwerin A., Mercaldi M., Petersen A., Putnam A., Michelson K., Oskin M., Eggers S. ACM Transactions on Computer Systems 25(2): 4-es, 2007. Type: Article
Date Reviewed: Feb 4 2008

The WaveScalar architecture aims to exploit the advantages of dataflow execution to achieve more parallelism with less silicon area, along with better performance and scalability than superscalar architectures. The program counter, a bottleneck in the von Neumann architecture, is eliminated, and the natural parallelism of dataflow execution is combined with fine-grained multithreading to achieve a linear speedup over many processing elements. Instructions are ordered along their execution paths and stored in the WaveCache, a tiled structure designed specifically for the WaveScalar architecture. In this architecture, deep speculation occurs naturally. Cache coherence is maintained with a directory-based MESI protocol.
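
To make the dataflow execution model concrete, the following minimal sketch in Python illustrates the general dataflow firing rule (an illustration only, not the authors' design or code): an instruction fires as soon as all of its operands have arrived, so no program counter is needed to sequence execution.

    # Minimal dataflow-firing sketch: instructions fire when all operands arrive.
    class Instr:
        def __init__(self, name, op, num_inputs, consumers):
            self.name = name
            self.op = op                  # function applied to the operands
            self.operands = {}            # slot -> value, filled as tokens arrive
            self.num_inputs = num_inputs
            self.consumers = consumers    # (instr_name, slot) pairs to send results to

    def run(instrs, tokens):
        ready = list(tokens)              # tokens: (instr_name, slot, value)
        while ready:
            name, slot, value = ready.pop()
            instr = instrs[name]
            instr.operands[slot] = value
            if len(instr.operands) == instr.num_inputs:        # firing rule
                result = instr.op(*(instr.operands[i] for i in range(instr.num_inputs)))
                instr.operands.clear()
                for dest, dest_slot in instr.consumers:        # forward the result directly
                    ready.append((dest, dest_slot, result))

    # Example: evaluates (1 + 2) * 4 and prints 12.
    instrs = {
        "add": Instr("add", lambda x, y: x + y, 2, [("mul", 0)]),
        "mul": Instr("mul", lambda x, y: x * y, 2, [("out", 0)]),
        "out": Instr("out", print, 1, []),
    }
    run(instrs, [("add", 0, 1), ("add", 1, 2), ("mul", 1, 4)])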

One novel idea in this architecture is that the WaveCache loads instructions from memory and assigns them to processing elements, rather than processing elements requesting particular instructions to execute. In addition, the WaveScalar architecture does not use a register file for intervening accesses; values are communicated directly between the instructions that produce and consume them.
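
As a rough picture of that idea, the sketch below places instructions onto a grid of processing elements on demand and routes each result directly from its producer to its consumer, with no central register file in between. The 4x4 grid, the lazy placement policy, and the hop count are assumptions made for exposition, not details from the paper.

    GRID_SIZE = 4        # hypothetical 4x4 grid of processing elements
    placement = {}       # instruction name -> processing element (x, y)

    def place(instr_name):
        """Lazily assign an instruction to a processing element when first needed."""
        if instr_name not in placement:
            idx = len(placement)
            placement[instr_name] = (idx % GRID_SIZE, idx // GRID_SIZE)
        return placement[instr_name]

    def send(producer, consumer, value):
        """Route a result directly from the producer's PE to the consumer's PE."""
        src, dst = place(producer), place(consumer)
        hops = abs(src[0] - dst[0]) + abs(src[1] - dst[1])   # grid distance
        print(f"{producer}@{src} -> {consumer}@{dst}: value {value}, {hops} hop(s)")

    send("add", "mul", 3)      # add's result goes straight to mul
    send("mul", "store", 12)   # mul's result goes straight to the store instruction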

In previous dataflow architectures, even dynamic ones, operand matching was a bottleneck that reduced performance. In this architecture, the authors claim that the match-and-dispatch logic is the most complex part. What impact will it have on the WaveScalar architecture? Will it become a bottleneck, as it was in the past? These questions need to be addressed clearly.
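
For readers unfamiliar with the issue, the sketch below shows what operand matching in a dynamic dataflow machine amounts to. It is a simplified software analogy, not the paper's hardware, and the tags here simply stand in for wave or thread identifiers: a token must wait in a matching table until its partner with the same instruction and tag arrives, and this associative lookup is what historically limited performance.

    matching_table = {}   # (instruction, tag) -> operand that arrived first

    def match(instr, tag, slot, value):
        """Fire the instruction once both operands with the same tag are present."""
        key = (instr, tag)
        if key in matching_table:
            other = matching_table.pop(key)              # partner found: fire
            left, right = (other, value) if slot == 1 else (value, other)
            print(f"fire {instr} tag={tag} with operands ({left}, {right})")
        else:
            matching_table[key] = value                  # wait for the partner operand

    match("add", tag=7, slot=0, value=1)   # waits
    match("add", tag=8, slot=0, value=5)   # different tag (e.g., another wave): waits
    match("add", tag=7, slot=1, value=2)   # partner arrives: fires add for tag 7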

The WaveCache for this architecture is well designed. The compiler, however, is not yet finished; building it is admittedly tedious work. Will the linear speedup still hold with compiler-generated code for this particular architecture? That remains to be evaluated, since a hand-coded version is naturally well optimized. Could the performance of the WaveScalar architecture be compared with that of TRIPS, which takes a similar approach with slight variations? There are also stream processors built from clusters of processing elements that use local and global registers to exploit the von Neumann model without adopting dataflow concepts. Could WaveScalar's performance be compared with that of such a stream processor as well? Finally, since WaveScalar does not include a register file for intervening accesses, will that choice have a positive or negative impact on the architecture's overall performance?

The authors try to bring dataflow concepts closer to the von Neumann model, with less wire delay and less silicon, so that both single-threaded and multithreaded code can outperform superscalar architectures. The paper is rather long; the introductory material on the von Neumann model and dataflow could have been omitted, since anyone reading this paper should already know the basics of both.

Reviewer: J. Arul. Review #: CR135214 (0812-1184)
 
Data-Flow Architectures (C.1.3 ...)
Concurrency (D.4.1 ...)
General (C.5.0)

Other reviews under "Data-Flow Architectures":

Implementation of a general-purpose dataflow multiprocessor
Papadopoulos G., MIT Press, Cambridge, MA, 1991. Type: Book (9780262660693)
Jul 1 1992

Data flow computer architecture
Chudík J., Springer-Verlag, London, UK, 1984. Type: Book (9780387136578)
Oct 1 1985

A fault-tolerant dataflow system
Srini V. Computer 18(3): 54-68, 1985. Type: Article
Mar 1 1986
