Gecko is a research tool that allows accessed Web pages to be incorporated into a file system. This paper describes the authors’ second attempt to create Gecko, called Gecko II. Gecko is implemented on the UNIX platform and builds on the Network File System (NFS). Other attempts to accomplish similar goals include Microsoft’s WebFS. What makes Gecko unique, according to the authors, is that “Gecko provides access to the Web via NFS, allowing Web pages to be named, accessed, and cached like standard Unix files. Unmodified Unix applications such as cat and grep can be used to manipulate pages... .”
The paper discusses several important design issues the authors faced in designing Gecko: system architecture, state retention, hard state versus soft state, compression, caching, and incompatibilities between NFS and the URL namespace. This long, well-written work goes into enough detail to help readers understand the authors’ choices, both why they chose a particular solution and why they rejected the alternatives.
This is a solid contribution for system programmers. One example illustrates the kind of problem and the level of detail discussed: NFS allows a client to use the more command to retrieve a screen of data at a time, whereas HTTP is intended to retrieve entire documents. The designers of the Web, realizing that only part of a page may be needed, did include support for partial retrieval. Because of dynamic page content, however, many Web sites require that their pages not be cached, since content can and does change at each retrieval (for example, because advertisements are rotated).
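The mismatch in this example can be made concrete with a small sketch. It is not taken from the paper, and nothing here claims to be how Gecko performs the mapping. It assumes a block-style read described by an offset and a byte count, and shows how such a read could be expressed as an HTTP Range header, along with a conservative check for whether a response may be cached. The function names nfs_read_to_range and is_cacheable are hypothetical.

```python
def nfs_read_to_range(offset: int, count: int) -> dict[str, str]:
    """Express a block-style read (offset, count) as an HTTP Range header.

    Hypothetical helper for illustration; not Gecko's actual code.
    """
    if offset < 0 or count <= 0:
        raise ValueError("offset must be >= 0 and count must be > 0")
    # HTTP byte ranges are inclusive at both ends, hence the "- 1".
    return {"Range": f"bytes={offset}-{offset + count - 1}"}


def is_cacheable(response_headers: dict[str, str]) -> bool:
    """Return False if Cache-Control contains no-store or no-cache.

    Assumption: no-cache is treated conservatively as "do not reuse",
    although strictly it means "revalidate before reuse".
    """
    cache_control = ""
    for name, value in response_headers.items():
        # Header names are case-insensitive in HTTP.
        if name.lower() == "cache-control":
            cache_control = value.lower()
    # Keep only the directive name, dropping any "=value" argument.
    directives = {d.strip().split("=")[0] for d in cache_control.split(",")}
    return not ({"no-store", "no-cache"} & directives)
```

A server that honors the range answers with status 206 (Partial Content); one that ignores it answers with status 200 and the whole document, so a client built this way still has to cope with receiving the entire page.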