Computing Reviews
Today's Issue Hot Topics Search Browse Recommended My Account Log In
Review Help
Search
MillWheel: fault-tolerant stream processing at Internet scale
Akidau T., Balikov A., Bekiroğlu K., Chernyak S., Haberman J., Lax R., McVeety S., Mills D., Nordstrom P., Whittle S. Proceedings of the VLDB Endowment6 (11):1033-1044,2013.Type:Article
Date Reviewed: Apr 11 2014

In this paper, the authors introduce MillWheel, a framework for processing real-time streamed data that is used at Google. It is based on the paradigm of streaming computing, in which users specify a computation graph and provide code for individual nodes. After the computation starts, the application receives continuous data and processes it in a real-time fashion to reduce latency. MillWheel targets latency-sensitive applications, such as the detection of query hikes or query dips, and the delivery of ads.

The MillWheel model features persistent storage, low watermarks, and duplicate prevention. The persistent storage stores the state of each node so that if a node fails, the system can resume the computation without losing data by reading the stored state. The low watermark subsystem tracks the timestamps of processed data at each node, providing the basis for fault tolerance and duplication prevention. With those features, MillWheel is capable of processing continuous data streams that consist of data tokens with keys and timestamps. If the user wants to trigger a specific event at a future time, he or she can use the provided timer mechanism (which is optional) to register the event.

Fault tolerance is a key feature of MillWheel. Since the framework might run on thousands of nodes continuously, the chance of node failure is high. When a node fails, it is required that computed results are saved; if the node keeps state, the state can be resumed so that future computations are still correct. After the node restarts from the failure, duplication of computed results should also be prevented.

To achieve these goals, MillWheel ensures exactly once delivery with the help of persistent storage and data acknowledgment. For stateful computations (in which repeated computations with the same input may have different results), MillWheel provides a mechanism called “strong productions” to checkpoint produced data before changing the node state. This form of fault tolerance, however, might incur unnecessary cost for stateless computations (in which computations with the same input always yield the same output). MillWheel provides an option to turn off strong productions and use weak productions to improve the performance of stateless computations.

The state of computations is stored in both disks and memories. While disks provide space for huge amounts of state data, memories have a speed advantage. To ensure consistency, all state write operations are wrapped in per-key atomic operations. However, in case of work migration or node restart, there might be zombie writers and network remnants that issue stale writes. To prevent this, all write operations are assigned with a unique sequence. The user can customize the granularity of state modification operations for performance benefit according to the failure probabilities.

MillWheel has been implemented on Google clusters. Streams are delivered via remote procedure call. Load distribution is handled by a replicated master. Persistent state is maintained by a database like BigTable or Spanner. Low watermarks are computed conservatively by a central authority, but interested consumers should compute the low watermark according to their own records and those of subscribed senders.

Experiments show that with weak productions, the median record delay of a node is several milliseconds. With strong productions and exactly-once delivery enabled, the median delay increases to tens of milliseconds, which is still within human reaction time. Framework-level cache is also used to reduce traffic between storage layers.

MillWheel has been used in various Google internal systems, such as ad delivery and image processing for Google Street View. However, the authors point out that there are applications that MillWheel does not suit well. For example, if an application cannot be parallelized well among different keys, there could be bottleneck stages that slow down the whole computation.

Reviewer:  Peng Li Review #: CR142160 (1407-0550)
Bookmark and Share
  Reviewer Selected
 
 
Fault Tolerance (C.4 ... )
 
 
Real-Time And Embedded Systems (C.3 ... )
 
Would you recommend this review?
yes
no
Other reviews under "Fault Tolerance": Date
Performance of fault-tolerant data and compute intensive programs over a network of workstations
Smith J., Shrivastava S. Theoretical Computer Science 196(1-2): 319-345, 1998. Type: Article
Jan 1 1999
System diagnosis with smallest risk of error
Diks K., Pelc A. Theoretical Computer Science 203(1): 163-173, 1998. Type: Article
Mar 1 1999
Coding approaches to fault tolerance in combinational and dynamic systems
Hadjicostis C., Kluwer Academic Publishers, Norwell, MA, 2001.  216, Type: Book (9780792376248)
Jul 2 2002
more...

E-Mail This Printer-Friendly
Send Your Comments
Contact Us
Reproduction in whole or in part without permission is prohibited.   Copyright 1999-2024 ThinkLoud®
Terms of Use
| Privacy Policy