Concurrency-related errors, like data races, are very difficult to track down and eliminate in large object-oriented programs. Existing approaches to prevent data races rely on protecting instruction sequences (like critical sections) with synchronization operations. Such control-centric approaches put a heavy burden on the programmer to ensure that all concurrently accessed memory locations are consistently protected. They are often error-prone and brittle.
Dolby et al. present a novel data-centric approach to synchronization in which atomic sets are defined by grouping fields of various objects. Such sets represent fields that must always be updated atomically. Each atomic set is associated with units of work, that is, code fragments that preserve the consistency of that set. Synchronization operations are then added automatically by the compiler.
The authors propose Atomic Java (AJ), an extension of Java with atomic sets. AJ provides annotations for data-centric concurrency control and relies on a novel type system that lets atomic sets span multiple objects. AJ enables separate compilation and supports full encapsulation while still allowing efficient code generation. The authors evaluate data-centric synchronization by refactoring classes from standard libraries and multithreaded benchmarks such as SPECjbb to use atomic sets. The approach has low annotation overhead while successfully preventing data races.
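To make the idea concrete, here is a minimal sketch in plain Java, not code from the paper. The class names (Account, AtomicSetSketch), the field names, and the lock are all hypothetical, and the AJ-style annotations shown in comments are only approximate in spelling. The hand-written locking stands in for the synchronization that a data-centric compiler could insert once the two fields are declared as members of one atomic set.

```java
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical class. AJ-style annotations appear only in comments and their
// spelling is approximate; the locking below is hand-written plain Java
// standing in for what a compiler could insert automatically.
class Account {
    // AJ (approx.): atomicset(acct);
    // Plain-Java stand-in: one lock per instance of the atomic set.
    private final ReentrantLock acctLock = new ReentrantLock();

    // AJ (approx.): both fields are marked atomic(acct), so they must
    // always be updated together.
    private int balance;
    private int txCount;

    Account(int initial) { balance = initial; }

    // In the data-centric style the programmer would write only the two
    // updates; the lock/unlock pair is the part assumed to be generated.
    void deposit(int amount) {
        acctLock.lock();
        try {
            balance += amount;
            txCount++;
        } finally {
            acctLock.unlock();
        }
    }

    // Reads both fields of the set under the same lock.
    int[] snapshot() {
        acctLock.lock();
        try {
            return new int[] { balance, txCount };
        } finally {
            acctLock.unlock();
        }
    }
}

class AtomicSetSketch {
    // Runs concurrent deposits of 1 and returns {balance, txCount}.
    static int[] run(int threads, int depositsPerThread) {
        Account account = new Account(0);
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < depositsPerThread; j++) {
                    account.deposit(1);
                }
            });
            workers[i].start();
        }
        for (Thread t : workers) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return account.snapshot();
    }
}
```

Calling AtomicSetSketch.run(4, 1000) should leave both fields in step with each other, since every update to the set happens under the same lock. The point of the data-centric design is that the programmer states which fields belong together once, rather than remembering to guard every method that touches them.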
The approach requires about 40 annotations per thousand lines of code (KLOC) for the collection class, and between 0.6 and 11.5 per KLOC for the other applications. For all but one of the applications, it requires fewer annotations than the number of synchronized blocks in the original Java code. With 98 threads, the tuned AJ version of the SPECjbb benchmark achieves 90.8 percent of the throughput of the original Java implementation.
However, the prototype implementation has drawbacks, too. It does not support multiple atomic sets in a single class, and the type system does not deal with generics. Importantly, AJ does not guarantee complete elimination of programmer errors. When a section of code manipulates data outside of the atomic set, data races or deadlocks can still occur.
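The last limitation can be illustrated with a small plain-Java analogy, again not AJ code and not taken from the paper. Each hypothetical Cell protects only its own field, roughly like one atomic set per object. A transfer that touches two cells is covered by neither set, so a concurrent reader could see the state between the two steps. The sketch simulates that interleaving deterministically in a single thread instead of relying on timing.

```java
// Hypothetical class: each Cell guards only its own value, loosely
// mimicking an atomic set that contains a single field.
class Cell {
    private int value;

    Cell(int v) { value = v; }

    synchronized void add(int delta) { value += delta; }

    synchronized int get() { return value; }
}

class OutsideSetSketch {
    // Performs an unprotected two-step transfer and returns the combined
    // total that an observer would see between the two steps.
    static int observedMidTransfer(Cell from, Cell to, int amount) {
        from.add(-amount);                 // step 1: atomic for 'from' only
        int seen = from.get() + to.get();  // what a concurrent reader could observe
        to.add(amount);                    // step 2: atomic for 'to' only
        return seen;
    }
}
```

Each individual step is atomic, yet the total observed mid-transfer is short by the transferred amount. Avoiding this requires the programmer to state explicitly that the whole transfer must be atomic with respect to both objects, which is exactly the kind of obligation that remains with the programmer.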
This paper presents an interesting system that guarantees serializability for atomic sets spanning multiple objects while still enabling separate compilation. The low annotation overhead of the approach is effectively demonstrated using the AJ-to-Java compiler. This makes AJ a good candidate for language-based synchronization design.