Transactional memory refines the granularity of synchronization from the relatively coarse level of mutual exclusion locks to the finer level of individual memory accesses. This comprehensive paper contains valuable insights and working code with which to assess the potential of transactional memory experimentally.
Synchronizing at the level of memory access differs from using a traditional mutual exclusion lock. The advantage is reduced contention; the disadvantage is that concurrent programming requires a new model and new mechanisms. The mutual exclusion model is implemented with mechanisms that serialize access to protected code at the procedural level. The new model is elegantly different: it borrows the concept of a transaction and applies it to a single memory access or to a small group of accesses. When contention arises, it is resolved at the finest possible level of granularity, which offers the potential for greater concurrency with an intrinsic minimum of contention.
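The contrast can be made concrete with a small sketch. The following Python fragment is an illustration only and is not the API described in the paper: the names TVar, Transaction, Conflict, atomically, transfer, and demo are all hypothetical. Each transaction reads and writes optimistically, then validates at commit time that nothing it read has changed, and retries if it has. For brevity the commit step here is guarded by a single short lock, so this sketch is not lock-free, unlike the implementations the paper presents.

```python
import threading


class Conflict(Exception):
    """Raised when a transaction observes that a value it read has changed."""


class TVar:
    """Hypothetical transactional variable: a value plus a version counter."""

    def __init__(self, value):
        self.value = value
        self.version = 0


# Assumption for brevity: one global lock guards only the short
# snapshot and validate-and-write steps. The paper's code avoids locks.
_commit_lock = threading.Lock()


class Transaction:
    def __init__(self):
        self.reads = {}   # TVar -> version seen at first read
        self.writes = {}  # TVar -> value to publish at commit

    def read(self, tvar):
        if tvar in self.writes:          # read our own pending write
            return self.writes[tvar]
        with _commit_lock:               # consistent (value, version) pair
            value, version = tvar.value, tvar.version
        if self.reads.setdefault(tvar, version) != version:
            raise Conflict()             # changed since our first read
        return value

    def write(self, tvar, value):
        self.writes[tvar] = value        # buffered until commit

    def commit(self):
        with _commit_lock:
            for tvar, version in self.reads.items():
                if tvar.version != version:
                    raise Conflict()     # someone committed in between
            for tvar, value in self.writes.items():
                tvar.value = value
                tvar.version += 1


def atomically(body):
    """Run body(tx) until it commits without a conflict."""
    while True:
        tx = Transaction()
        try:
            result = body(tx)
            tx.commit()
            return result
        except Conflict:
            continue                     # retry with a fresh transaction


def transfer(src, dst, amount):
    def body(tx):
        tx.write(src, tx.read(src) - amount)
        tx.write(dst, tx.read(dst) + amount)
    atomically(body)


def demo(workers=4, transfers_each=50):
    a, b = TVar(1000), TVar(0)

    def work():
        for _ in range(transfers_each):
            transfer(a, b, 1)

    threads = [threading.Thread(target=work) for _ in range(workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return a.value, b.value
```

Calling demo() moves 200 units in total and returns (800, 200). Two transfers conflict only when they touch the same variables at the same time; transactions over unrelated variables validate independently, which is the reduction in contention that the finer granularity promises.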
This unusually long paper (61 pages) presents three lock-free application programming interfaces (APIs) and their implementations: multi-word compare-and-swap (MCAS), word-based software transactional memory (WSTM), and object-based software transactional memory (OSTM). The code makes it possible to write concurrent applications that are synchronized using transactional memory. Fraser and Harris describe these implementations in great detail, and the source code is available for download. It is configurable for several microprocessor architectures and uses whatever primitive synchronization instructions the hardware makes available. Current hardware architecture, however, is inspired by the mutual exclusion model. I wonder what performance may be possible if more suitable hardware support for transactional memory ever becomes available.
The benchmarks of these implementations are awe-inspiring, and compelling enough to invite an attempt to reproduce the presented results. The benchmarks, alas, were run on exotic SPARC hardware with more than 100 central processing units (CPUs). Unfortunately, the code as downloaded generates compilation errors on a more modest SuSE Linux 10.2 x86_64 SMP desktop with a GCC 4.2.1 pre-release, and on a 32-bit x86 SuSE Linux 10.3 system with GCC 4.2.1. Compilation errors, however disappointing, can be tolerated in a prototype.
Transactional memory may not be the universal solution for highly concurrent computing, but the implementations by the authors provide good arguments in its favor.