Computing Reviews
Today's Issue Hot Topics Search Browse Recommended My Account Log In
Review Help
Search
Scaling distributed transaction processing and recovery based on dependency logging
Yao C., Zhang M., Lin Q., Ooi B., Xu J. The VLDB Journal: The International Journal on Very Large Data Bases27 (3):347-368,2018.Type:Article
Date Reviewed: Aug 31 2018

Since the beginning of the Computer Age, databases have stored data and applications have used that data. The use of the same data at the same time by different applications is called concurrency, and it may cause problems. Different methods have been devised over time to control concurrency and avoid these problems; however, nowadays, these methods are showing some limits, particularly when data is distributed among several machines. Today, most data access is organized in transactions, or data access sequences, that must logically be treated as a single access. Distributed transactions raise dependency problems: it is not always clear which application accessed some given data first; in these cases, transactions need to be rolled back, or data brought to its initial state. Most current solutions solve this problem with transaction partitioning, or executing transactions in batches, but these solutions in turn may raise problems in case of rollbacks. This problem is usually addressed by logging transactions at runtime, which causes overhead; when failures happen and recovery is needed, some downtime is certain. Moreover, most of these transactions live in memory, not on a disk; thus, bottlenecks are no longer input/output (I/O) writes, but the logging activities themselves.

The proposed dependency graph-based concurrency control (DGCC) protocol uses graphs to control and log distributed concurrent transactions at the same time. It executes both transactions and logging in batches, lowering both transaction times by avoiding deadlocks and logging time by avoiding aborts. The heart of the protocol is transaction dependency graphs. The paper defines them with logic definitions, figures, and pseudocode for synchronous execution; it shows how these graphs are suited for both multithread, single-node environments and multithread, multi-node distributed environments via a coordination mechanism. This is the most remarkable part of the paper, that is, despite technical language appropriate for an academic work, the authors present solutions, discover the problems these solutions raise, and then overcome the problems with different solutions in an almost thrilling way.

The final part of the paper contains tests and simulations that show improvements in overhead reduction, because logging is performed alongside transaction execution; scalability, because graphs are easily extendable; and overall runtime performance, due to concurrent execution.

Reviewer:  Andrea Paramithiotti Review #: CR146228 (1811-0572)
Bookmark and Share
  Featured Reviewer  
 
Super (Very Large) Computers (C.5.1 ... )
 
 
General (H.2.0 )
 
Would you recommend this review?
yes
no
Other reviews under "Super (Very Large) Computers": Date
Parallel supercomputing: methods, algorithms and applications
Carey G. (ed) John Wiley & Sons, Inc., New York, NY,1989. Type: Divisible Book
Dec 1 1991
Supercomputers: algorithms, architectures, and scientific computation
Matsen F. (ed), Tajima T., University of Texas Press, Austin, TX, 1986. Type: Book (9780292703889)
Dec 1 1991
Evaluating supercomputers
van der Steen A., Chapman & Hall, Ltd., London, UK, 1990. Type: Book (9780442311988)
Dec 1 1991
more...

E-Mail This Printer-Friendly
Send Your Comments
Contact Us
Reproduction in whole or in part without permission is prohibited.   Copyright 1999-2024 ThinkLoud®
Terms of Use
| Privacy Policy