The paper describes a method that uses critical path analysis (CPA) to identify the delays experienced by applications. A critical path is built by examining the packet dependence graph (PDG) of a transmission control protocol (TCP) flow. Constructing and profiling the critical path helps determine what fraction of the total transfer latency is caused by packet propagation, network variation (for example, queueing at routers or route fluctuation), packet losses, and delays at the server and at the client.
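The idea of profiling a critical path can be illustrated with a minimal sketch. It is not the authors' tcpeval tool: it assumes the PDG has already been reduced to a directed acyclic graph whose edges carry a delay and a cause label, and it simply finds the longest path and sums the delay per cause. The function name `critical_path`, the edge format, the event names, and the millisecond values are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def critical_path(edges, source, sink):
    """Return (path, total_delay, per_cause_profile) for the longest path.

    edges: iterable of (from_event, to_event, delay, cause) tuples that
    form a directed acyclic graph (hypothetical format, not tcpeval's).
    Raises KeyError if sink is not reachable from source.
    """
    successors = defaultdict(list)
    indegree = defaultdict(int)
    nodes = set()
    for u, v, delay, cause in edges:
        successors[u].append((v, delay, cause))
        indegree[v] += 1
        nodes.update((u, v))

    # Topological order (Kahn's algorithm), so every event is visited
    # only after all events it depends on.
    order = []
    ready = [n for n in nodes if indegree[n] == 0]
    while ready:
        n = ready.pop()
        order.append(n)
        for v, _, _ in successors[n]:
            indegree[v] -= 1
            if indegree[v] == 0:
                ready.append(v)

    # Longest-path relaxation: the latest-finishing dependency chain
    # determines when each event can happen.
    best = {source: 0}
    prev = {}
    for u in order:
        if u not in best:
            continue  # not reachable from source
        for v, delay, cause in successors[u]:
            if best[u] + delay > best.get(v, float("-inf")):
                best[v] = best[u] + delay
                prev[v] = (u, delay, cause)

    # Walk back from the sink, attributing each edge's delay to its cause.
    profile = defaultdict(int)
    path = [sink]
    node = sink
    while node != source:
        node, delay, cause = prev[node]
        profile[cause] += delay
        path.append(node)
    path.reverse()
    return path, best[sink], dict(profile)


# Made-up transfer (delays in ms): data2 is lost and retransmitted after
# a timeout, which dominates the ACK-clocked path through ack1.
EXAMPLE_EDGES = [
    ("request", "data1", 30, "server"),
    ("data1", "ack1", 40, "propagation"),
    ("ack1", "data3", 40, "propagation"),
    ("data1", "data2", 5, "server"),
    ("data2", "data3", 200, "loss"),
    ("data3", "done", 10, "client"),
]

path, total, profile = critical_path(EXAMPLE_EDGES, "request", "done")
```

In this made-up example the critical path runs through the retransmitted packet, so most of the latency is attributed to loss rather than propagation, which is the kind of breakdown the profiling step is meant to expose.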
The analysis is demonstrated by running the authors' analysis tool (tcpeval) on a set of Web transactions over several days. The measurements are taken passively: tcpdump collects network traffic traces at both end points during each transaction, under varying network and server conditions.
The authors attempt to make their method application-independent by analyzing TCP control messages and HTTP headers in order to distinguish connection establishment and teardown from application protocol messaging and data transfer. To account for extra delays due to contacting other Web entities (such as DNS servers), the packets that belong to these entities would also need to be analyzed and incorporated in the construction of the critical path. A further limitation of the methodology in the context of TCP transactions is the inability to perform “what if” analysis once the critical path for a transaction has been established: changing delays or drop events can change the dependencies within the entire transaction, and hence its PDG.
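The separation of connection establishment and teardown from data transfer mentioned above can be sketched roughly as follows. This is a simplified illustration based on TCP flags and payload length only, not the authors' procedure (which also inspects HTTP headers). The function name `label_phases` and the packet representation are assumptions made for the example.

```python
def label_phases(packets):
    """Label each packet as 'establishment', 'transfer', or 'teardown'.

    packets: list of (flags, payload_len) in capture order, where flags
    is a set of TCP flag names such as {"SYN", "ACK"} (hypothetical
    format). Simplification: anything before the first payload byte is
    establishment, and anything from the first FIN or RST onward is
    teardown, even if that packet also carries data.
    """
    phases = []
    seen_payload = False
    seen_close = False
    for flags, payload_len in packets:
        if "FIN" in flags or "RST" in flags:
            seen_close = True
        if payload_len > 0:
            seen_payload = True

        if seen_close:
            phases.append("teardown")
        elif not seen_payload:
            phases.append("establishment")
        else:
            phases.append("transfer")
    return phases


# A short made-up exchange: three-way handshake, request, response, close.
EXAMPLE_PACKETS = [
    ({"SYN"}, 0),
    ({"SYN", "ACK"}, 0),
    ({"ACK"}, 0),
    ({"PSH", "ACK"}, 120),   # HTTP request
    ({"ACK"}, 1460),         # response data
    ({"ACK"}, 0),            # acknowledgment of the data
    ({"FIN", "ACK"}, 0),
    ({"ACK"}, 0),
]

phases = label_phases(EXAMPLE_PACKETS)
```

A real classifier would need to handle cases this sketch ignores, such as persistent connections carrying several requests or data piggybacked on the closing segment.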
The main conclusions of the paper are obtained experimentally. The results show that CPA can give additional insight into the causes of delays in Web transfers. For example, in the systems measured in the study, server load is the major determinant of transfer time for small files, while network load is the major determinant for large files. While the results can explain subtleties in the behavior of the entire end-to-end system, they need to be verified on other systems before general conclusions can be drawn that could influence server design, such as server enhancements to speed the transfer of small files.
Overall, the paper is very well written. The main conclusions are easily understood, even without expertise in transport protocols.