Computing Reviews

GraphChi:large-scale graph computation on just a PC
Kyrola A., Blelloch G., Guestrin C.  OSDI 2012 (Proceedings of the 10th USENIX Conference on Operating Systems Design and Implementation, Hollywood, CA, Oct 8-10, 2012)31-46,2012.Type:Proceedings
Date Reviewed: 05/24/13

The analytical processing of large graphs has increasingly been in demand for academic and industrial applications. Due to their magnitude, large graphs are typically processed in computing clusters--systems with many computers that work together--at the cost of greater complexity to build, configure, and manage them. Data analysts prefer to avoid problems like these. To address this, the authors of this paper propose the GraphChi system to allow for the processing of large graphs on a single computer.

GraphChi introduces the parallel sliding windows technique, inspired by the asynchronous model of computation [1], which processes the graph data according to P disjoint subsets of vertices defined so that the induced subgraph (plus out-edges) of each subset will fit into memory. This scheme ensures that the entire graph will be processed in P steps. Furthermore, vertex-ordered disk organization (preprocessed) guarantees that only P sequential disk accesses are necessary in each step. The result is an efficient and scalable framework.

GraphChi cannot be used for common operations like graph traversals and vertex queries, but it is quite suitable for problems in which the neighborhood of the vertices is sufficient. Examples of these problems include sparse-matrix dense-vector multiplication (SpMV); connected components, community detection, and triangle counting algorithms; collaborative filtering; and probabilistic graphical models. GraphChi rivals the principal frameworks presented in recent literature; for specific problems, it outperforms clusters with dozens of machines.

I recommend this paper to graph researchers and practitioners who can benefit from this novel and useful paradigm.


1)

Bertsekas, D. P.; Tsitsiklis, J. N. Parallel and distributed computation: numerical methods. Prentice Hall, Englewood Cliffs, NJ, 1989.

Reviewer:  Jose Rodrigues Review #: CR141246 (1308-0727)

Reproduction in whole or in part without permission is prohibited.   Copyright 2024 ComputingReviews.com™
Terms of Use
| Privacy Policy