This paper focuses on a storage architecture for big data environments that require high-throughput, low-latency access to data. The authors propose a system based on all-flash storage that follows a fully distributed, scalable architecture of interconnected storage nodes. Each node is equipped with storage resources and a field-programmable gate array (FPGA), and the nodes are interconnected by a low-latency, high-bandwidth network. Both the flash controller and the network controller are implemented on the same FPGA and are tightly coupled, enabling low-latency data transfers from flash across the network.
The node FPGA may also implement application-specific accelerators, allowing the system to move computation to where the data resides. The accelerator exposes an interface to the file system, which applications can use to parameterize the computation that the controller should perform on the data. For instance, by applying a predicate to data tuples within the storage controller, the system can filter out data that is irrelevant to a query. As a result, less data needs to be transferred to the host, which reduces latency and bandwidth consumption. The network of storage nodes exposes a single address space to its users. By means of a two-level tagging mechanism, the process of completing a request on a remote node is transparent to the issuer.
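To make the idea of near-storage filtering concrete, the following is a minimal sketch of predicate pushdown in Python. The paper does not specify the accelerator's actual interface; the `Predicate` descriptor and `filter_at_node` function are hypothetical names chosen for illustration, and the FPGA accelerator is simulated by an ordinary function running "at the node."

```python
# Hypothetical sketch of near-storage predicate pushdown.
# All names here are illustrative, not the paper's actual API.
from dataclasses import dataclass
from operator import lt, le, eq, ge, gt

# Comparison operators the (simulated) accelerator supports.
OPS = {"<": lt, "<=": le, "==": eq, ">=": ge, ">": gt}

@dataclass(frozen=True)
class Predicate:
    field: int      # index of the tuple field to test
    op: str         # comparison operator, e.g. "<"
    value: object   # constant to compare against

def filter_at_node(tuples, pred):
    """Apply the predicate inside the (simulated) storage node,
    so only matching tuples cross the network to the host."""
    test = OPS[pred.op]
    return [t for t in tuples if test(t[pred.field], pred.value)]

# Example: the host asks the node to return only tuples whose
# second field is below 100; non-matching tuples never leave storage.
rows = [(1, 42), (2, 150), (3, 7)]
matches = filter_at_node(rows, Predicate(field=1, op="<", value=100))
```

In the proposed architecture this filtering step would run in FPGA logic next to the flash controller, so the host sends only the small predicate descriptor downstream and receives only matching tuples upstream.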
The authors present interesting experimental results. According to the paper, the end-to-end latency when accessing "remote storage is much less than the sum of storage and network latencies accounted for separately." In addition, latency scales linearly with the number of network hops: one could build a network with "dozens of nodes before the network latency becomes a significant portion of the storage [access] latency." The system is therefore expected to maintain good performance at larger scales. Overall, the proposed architecture is a promising approach to building flash-based distributed storage for high-performance data processing.