Computing Reviews
epiC: an extensible and scalable system for processing big data
Jiang D., Wu S., Chen G., Ooi B., Tan K., Xu J. The VLDB Journal: The International Journal on Very Large Data Bases 25(1): 3-26, 2016. Type: Article
Date Reviewed: Sep 27 2016

Addressing the problem of performance in big data processing, the authors propose a new programming model, named epiC, that deals with data variety, for example, when data arrives at the evaluation engine as a mix of structured, semistructured, and unstructured data.

Existing approaches, such as Apache Hadoop and Pregel, focus on coping with specific big data issues: high volumes of text data and graphs, respectively. When dealing with data variety, the usual approach is to set up a shared-nothing cluster in which one portion of the cluster is dedicated and tailored to Hadoop, another to Pregel, and so on. Of course, this approach poses several problems, such as how to manage intermediate results and how to handle faults in the cluster's worker nodes.

The authors propose epiC, an actor-based framework that leverages existing implementations (for example, MapReduce) to facilitate the processing of big data with a high degree of variety. The actor model, first introduced in a 1973 paper [1], is a mathematical model of computation composed of distributed components, each capable of a limited set of actions and communicating by message passing, which avoids the need for locks. The authors benchmarked epiC against other frameworks and showed improved performance.
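The actor idea described above can be sketched minimally in Python: each actor owns private state and a mailbox, and reacts to messages one at a time, so no locks are needed on its state. This is an illustrative toy, not epiC's actual API; the `Actor` class and its message protocol are assumptions made here for exposition.

```python
import queue
import threading

class Actor:
    """A minimal actor: private state, a mailbox, message-driven behavior.

    Hypothetical illustration of the actor model; not epiC's real interface.
    """
    def __init__(self):
        self._mailbox = queue.Queue()   # the only way in: asynchronous messages
        self.count = 0                  # private state; only this actor's thread touches it
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def send(self, msg):
        """Deliver a message asynchronously; the sender never blocks on processing."""
        self._mailbox.put(msg)

    def _run(self):
        # Messages are handled strictly one at a time, so state updates
        # need no locking.
        while True:
            msg = self._mailbox.get()
            if msg == "stop":
                break
            self.count += 1

    def stop(self):
        """Ask the actor to finish, then wait for its thread to exit."""
        self.send("stop")
        self._thread.join()
```

For example, sending three messages and then stopping leaves `count` at 3; because each actor serializes its own updates, many actors can run concurrently without shared-memory synchronization.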

Through careful implementation choices, epiC fills the gaps left open by off-the-shelf implementations, while also providing an elegant fault tolerance mechanism.

The actor model is based on messages. Slave nodes exchange only metadata with master nodes; thus, the overhead of the communication infrastructure is very low. Computations and intermediate results are stored in a distributed file system (DFS) through an optimized storage mechanism that avoids input/output (I/O) delays.
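The metadata-only communication pattern described above can be sketched as follows. Workers write bulk intermediate results to a shared store (a dictionary standing in for the DFS here), and the master receives only small metadata records such as paths and counts. All names (`dfs`, `slave_task`, `master`) are hypothetical, chosen for illustration rather than taken from epiC.

```python
# A dictionary standing in for the distributed file system (DFS).
dfs = {}

def slave_task(task_id, records):
    """Run one task: write bulk output to the DFS, return only metadata."""
    path = f"/dfs/intermediate/{task_id}"
    dfs[path] = records  # the heavy data never travels to the master
    return {"task": task_id, "path": path, "count": len(records)}

def master(task_inputs):
    """Coordinate tasks by collecting lightweight metadata records.

    The master learns where results live and how big they are, keeping
    master-slave communication overhead low.
    """
    meta = [slave_task(tid, recs) for tid, recs in task_inputs.items()]
    total = sum(m["count"] for m in meta)
    paths = [m["path"] for m in meta]
    return total, paths
```

In a real deployment the DFS write would be the expensive, optimized step, while the metadata messages remain a few bytes each, which is the source of the low communication overhead the authors describe.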

In conclusion, the paper presents a solution for those who need to tackle the issue of data variety, and more besides: the methodology used in the development is a detailed example for those who want to start developing under the actor-model paradigm.

Reviewer: Massimiliano Masi | Review #: CR144788 (1612-0904)
1) Hewitt, C.; Bishop, P.; Steiger, R. A universal modular ACTOR formalism for artificial intelligence. In IJCAI 73 (Proc. of the 3rd International Joint Conference on Artificial Intelligence). Morgan Kaufmann, San Francisco, CA, 1973, 235–245.
General (H.2.0)
