This is a friendly discussion of the status of the Google File System (GFS). Sean Quinlan works at Google and was at one time the GFS technology leader. In this article, he is interviewed by Kirk McKusick, who worked on Berkeley Software Distribution (BSD) Unix and its Fast File System (FFS).
GFS is a distributed file system designed for internal use at Google. It supports petabytes of storage and is optimized for batch operations. This article identifies the two major issues with GFS: a single master that manages file placement and lookup, and an early design decision to set the block (chunk) size at 64 MB. The article discusses the background behind these decisions and their impact on how GFS is used today and on current development efforts.
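To see how the two design points interact, consider a rough back-of-the-envelope sketch. It assumes the single master keeps a fixed amount of metadata in memory for every block; the 64-byte figure and the function names below are illustrative assumptions, not numbers or APIs taken from this article. Under those assumptions, a large block size keeps the master's bookkeeping small, while a much smaller block size multiplies it.

```python
# Rough estimate of single-master bookkeeping for a given block size.
# METADATA_BYTES_PER_BLOCK is a hypothetical figure chosen for illustration.

BLOCK_SIZE_BYTES = 64 * 2**20      # 64 MB, the block size named in the text
METADATA_BYTES_PER_BLOCK = 64      # assumption: per-block metadata on the master


def block_count(total_bytes, block_size=BLOCK_SIZE_BYTES):
    """Number of blocks needed to hold total_bytes (rounded up)."""
    return -(-total_bytes // block_size)  # integer ceiling division


def master_metadata_bytes(total_bytes,
                          block_size=BLOCK_SIZE_BYTES,
                          per_block=METADATA_BYTES_PER_BLOCK):
    """Metadata the single master must track for total_bytes of data."""
    return block_count(total_bytes, block_size) * per_block


if __name__ == "__main__":
    one_pb = 2**50  # one petabyte (binary)
    for size in (64 * 2**20, 2**20):
        blocks = block_count(one_pb, size)
        meta_gib = master_metadata_bytes(one_pb, size) / 2**30
        print(f"block size {size // 2**20} MB: "
              f"{blocks} blocks, {meta_gib:.0f} GiB of master metadata")
```

With these assumed numbers, one petabyte in 64 MB blocks needs about 16.8 million entries on the master, whereas 1 MB blocks would need 64 times as many. The same arithmetic hints at the cost: a workload with many small files gains little from large blocks but still adds an entry per file to the single master.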
GFS is proprietary to Google, but the Hadoop Distributed File System (HDFS), an open-source implementation, shares the same design. This article should be read by distributed file system developers and by those interested in Hadoop, as the lessons learned at Google are likely to influence systems design for many years to come.