This interesting paper describes both expected bottlenecks (the transmission control protocol/Internet protocol (TCP/IP) stack) and unexpected ones (nonuniform memory access (NUMA) architecture and solid-state drive (SSD) storage devices) that prevent full utilization of available network bandwidth. The authors demonstrate that by addressing these bottlenecks with a combination of existing solutions and their new remote direct memory access (RDMA)-based file transfer protocol (RFTP), much faster end-to-end data transfer speeds can be achieved, even over long-distance networks. This is crucial for many supercomputing applications, where massive datasets often reside in data centers far from the supercomputers that process them.
While TCP/IP has long been known to be inefficient, the authors show that in modern multicore systems with a NUMA memory hierarchy, the multiple data copies made by the TCP/IP stack hinder performance even more. An even more interesting bottleneck, although mentioned only in passing as the reason for implementing a random access memory (RAM)-based file system in the testbed, is the performance of SSD storage devices. Although SSDs are generally seen as a much faster alternative to spinning-media storage, it turns out that sustained writes of large amounts of data trigger thermal throttling to prevent overheating, causing their input/output (I/O) performance to drop to very low levels.
For practitioners, the main contribution of this paper is a practical way to improve file write performance by tuning a non-NUMA-aware small computer system interface (SCSI) target daemon (using the Linux numactl utility and an appropriately configured tmpfs file system). When combined with the authors’ RFTP, the end-to-end testbed demonstrates significantly faster data transfer speeds than GridFTP, in both local area network (LAN) and wide area network (WAN) settings.
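The tuning approach described above can be sketched in shell commands. This is only an illustration of the general technique, not the authors' exact configuration: the NUMA node number, the tmpfs size and mount point, and the choice of daemon (tgtd, from the Linux tgt project, is assumed here) are all placeholders.

```shell
# Illustrative sketch only; node 0, the 16g size, /mnt/ramdisk,
# and tgtd are assumptions, not the paper's actual settings.

# Mount a RAM-backed tmpfs whose pages are bound to NUMA node 0
mount -t tmpfs -o size=16g,mpol=bind:0 tmpfs /mnt/ramdisk

# Run the (non-NUMA-aware) SCSI target daemon pinned to the CPUs
# and memory of the same node, so its buffers and the tmpfs pages
# avoid cross-node memory traffic
numactl --cpunodebind=0 --membind=0 tgtd
```

The key idea is colocation: both the daemon's allocations (via numactl) and the file system's pages (via the tmpfs mpol mount option) are constrained to one NUMA node, avoiding the remote-memory accesses that would otherwise throttle write throughput.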