Computing Reviews
Scalable parallel computing on clouds using Twister4Azure iterative MapReduce
Gunarathne T., Zhang B., Wu T., Qiu J. Future Generation Computer Systems 29(4): 1035-1048, 2013. Type: Article
Date Reviewed: Nov 20 2013

Gunarathne et al. describe Twister4Azure, an iterative MapReduce application programming interface (API) for the Windows Azure cloud. Twister4Azure is optimized for data-intensive scientific applications that alternate between computation and communication steps. The proposed API extends the now-familiar MapReduce programming model with iterative extensions, aiming to provide an efficient, fault-tolerant, and easy-to-use environment on the Microsoft Azure platform.

The paper starts by describing Twister4Azure’s main building blocks: MapReduce, Hadoop, Twister, and the Microsoft Azure platform. A discussion of scientific application patterns follows: a significant number of data-intensive computations exhibit a fairly regular pattern of multiple iterations, each consisting of a computation step followed by a communication step. There are basically two types of data in these computations: loop-invariant input data and loop-variant delta values. Twister4Azure adds one step to MapReduce, called merge, which supports the easy parallelization of such applications. Merge executes after the reduce step, receives the output from reduce, and broadcasts it as the input to the next iteration. That broadcast usually consists of the loop-variant values.
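The map, reduce, merge cycle described above can be sketched in a few lines of plain Python. This is a single-process illustration only, not Twister4Azure's API: the function `iterative_mapreduce`, its parameters, and the one-dimensional k-means example are all assumptions made for this sketch. The data partitions play the role of the loop-invariant input, and the centroid list plays the role of the loop-variant values that merge hands to the next iteration.

```python
from collections import defaultdict


def iterative_mapreduce(partitions, initial, map_fn, reduce_fn, merge_fn, max_iters=20):
    """Repeat map -> reduce -> merge until merge_fn reports convergence.

    partitions: loop-invariant input, reused unchanged in every iteration.
    initial:    starting loop-variant value (hypothetical name).
    """
    loop_variant = initial
    iterations = 0
    for _ in range(max_iters):
        iterations += 1
        # Map: every partition sees the current loop-variant value.
        grouped = defaultdict(list)
        for partition in partitions:
            for key, value in map_fn(partition, loop_variant):
                grouped[key].append(value)
        # Reduce: combine all values that share a key.
        reduced = {key: reduce_fn(key, values) for key, values in grouped.items()}
        # Merge: turn the reduce output into the next iteration's input.
        loop_variant, converged = merge_fn(reduced, loop_variant)
        if converged:
            break
    return loop_variant, iterations


def kmeans_map(points, centroids):
    # Emit (index of nearest centroid, point) for each point in the partition.
    for x in points:
        nearest = min(range(len(centroids)), key=lambda i: abs(x - centroids[i]))
        yield nearest, x


def kmeans_reduce(key, values):
    # New centroid position is the mean of the points assigned to it.
    return sum(values) / len(values)


def kmeans_merge(reduced, old_centroids):
    # Keep the old position for any centroid that received no points.
    new_centroids = [reduced.get(i, old) for i, old in enumerate(old_centroids)]
    shift = max(abs(n - o) for n, o in zip(new_centroids, old_centroids))
    return new_centroids, shift < 1e-9


partitions = [[1.0, 2.0], [9.0, 10.0], [1.5, 9.5]]  # made-up sample data
centroids, iterations = iterative_mapreduce(
    partitions, [0.0, 5.0], kmeans_map, kmeans_reduce, kmeans_merge
)
print(centroids, iterations)
```

In a distributed setting the partitions would stay cached on the workers while only the small centroid list is rebroadcast each round, which is why separating loop-invariant from loop-variant data matters.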

The bulk of the paper is dedicated to several optimizations to the merge step, including data caching and task scheduling.

The authors introduce the concept of “adjusted performance” metrics as a way to compare the performance of different software running on different hardware. Although the idea is appealing, its implementation is faulty: it accounts only for the difference in a sequential application's performance between the two hardware platforms and applies that difference as a direct proportion between two parallel applications. Network performance, for example, which is paramount in a parallel application, is not considered at all.
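The objection can be made concrete with a small numeric sketch. The function `adjusted_time` below is an assumed reading of the metric as this review describes it (scale a parallel run time by the ratio of sequential run times on the two platforms); the paper's exact formula may differ, and every number is made up for illustration.

```python
def adjusted_time(parallel_time_on_b, sequential_time_on_a, sequential_time_on_b):
    """Scale a parallel run time measured on platform B to platform A,
    using only the sequential-performance ratio (assumed form of the metric)."""
    return parallel_time_on_b * (sequential_time_on_a / sequential_time_on_b)


# Hypothetical measurements, in seconds.
SEQ_A = 100.0      # sequential run on platform A
SEQ_B = 50.0       # sequential run on platform B (twice as fast)
COMPUTE_B = 10.0   # compute portion of the parallel run on B
NETWORK = 10.0     # communication portion, assumed identical on both platforms

# The whole parallel time, network included, is scaled by the CPU ratio.
adjusted = adjusted_time(COMPUTE_B + NETWORK, SEQ_A, SEQ_B)

# Scaling only the compute portion leaves network time untouched.
compute_only_scaled = COMPUTE_B * (SEQ_A / SEQ_B) + NETWORK

print(adjusted, compute_only_scaled)
```

With these numbers the adjusted figure is 40 seconds, while scaling only the compute portion gives 30 seconds. The 10-second gap is communication time that the sequential ratio says nothing about.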

Reviewer:  Veronica Lagrange Review #: CR141749 (1401-0073)
Cloud Computing (C.2.4 ... )
 
 
Parallel Programming (D.1.3 ... )
 
 
Parallel Architectures (C.1.4 )
 

Reproduction in whole or in part without permission is prohibited.   Copyright 1999-2024 ThinkLoud®