Optimizing peer-to-peer (P2P) network performance from an Internet service provider’s (ISP’s) point of view is a current topic of interest in the systems and networking domain. Essentially, the optimization consists of having peers exchange data with other local peers wherever possible. This reduces transit traffic, which saves monetary costs for the ISP and increases the quality of experience (QoE) for the user. QoE is an elastic term; for a user downloading P2P content, it translates into minimizing the delay in getting that content.
One way to optimize P2P performance is to use caches in the network. The questions then become: how big should the cache be, how can the saved bandwidth be quantified, where in the network should the cache be deployed, and should it be active or passive?
In this paper, Carlinet et al. attempt to answer these questions. They monitor live traffic in an ISP’s network for nine months and characterize the portion of traffic that used the eDonkey P2P protocol for data exchange. They find that file popularity follows a Mandelbrot-Zipf law; in their particular case, the most popular file was downloaded by 83 customers (out of the 7,012 customers who used eDonkey in their data). To determine the bandwidth savings, the authors use an analytical model, which shows theoretical savings of 48.6 percent of the total eDonkey traffic. That is, 48.6 percent of the traffic stayed local; in the absence of caches, this traffic would have incurred transit costs for the ISP and degraded QoE for the users, as the querying peer would have had to contact nonlocal peers to get chunks of the file. The empirical data indicate that a passive caching strategy saves 21 percent of the total P2P traffic, while the active caching strategy actually consumes more bandwidth than it saves. The reason for this, as their real-life trace shows, is that users did not complete every download they started.
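To make the popularity model concrete, the following is a minimal sketch of a Mandelbrot-Zipf distribution, in which the request probability of the file at rank k is proportional to 1 / (k + q)^alpha. The function names, the catalogue size, and the values of alpha and q are illustrative assumptions, not the parameters fitted by Carlinet et al. The sketch also computes the hit ratio of an idealized cache that always holds the most popular files, which is an upper bound rather than a model of any real cache.

```python
def mandelbrot_zipf_pmf(n_files, alpha, q):
    """Return request probabilities for ranks 1..n_files.

    p(k) is proportional to 1 / (k + q)**alpha; q = 0 gives a plain Zipf law.
    """
    weights = [1.0 / (k + q) ** alpha for k in range(1, n_files + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def ideal_hit_ratio(pmf, cache_slots):
    """Fraction of requests served if the cache holds the cache_slots most popular files."""
    return sum(pmf[:cache_slots])


# Hypothetical parameters chosen for illustration only.
pmf = mandelbrot_zipf_pmf(n_files=10_000, alpha=0.8, q=20.0)
for slots in (100, 1_000, 5_000):
    print(f"{slots:>5} cached files -> ideal hit ratio {ideal_hit_ratio(pmf, slots):.3f}")
```

A larger q flattens the head of the distribution, so the few most popular files account for a smaller share of requests than under a pure Zipf law with the same alpha.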
The rest of the results follow from intuition. Active caches are better than passive caches, but may consume more bandwidth; an ISP deploying active caches should therefore take the behavior of the user population into account (in particular, whether users tend to complete the downloads they start). The larger the cache, the more it will be used: in a trace-driven simulation, Carlinet et al. found that a cache size greater than 7 TB served 332 GB of data, with a hit ratio of 4.8 percent, for a population of 7,012 users. Finally, the cache should be placed so as to maximize the number of people using it (for instance, it could be tied to the edge router).
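The effect of incomplete downloads on the two strategies can be illustrated with a toy byte-accounting model. This is a sketch under strong assumptions and not the authors' analytical model: "active" is assumed here to mean that the cache fetches a whole file as soon as any local user requests it, every user is assumed to download a prefix of the file, and the cache is assumed never to evict. The function names and the example numbers are hypothetical.

```python
def transit_bytes(file_size, fractions, strategy):
    """Transit traffic for one file, given the fraction each local user completed.

    Toy assumptions: each user downloads a prefix of the file, and the
    cache is large enough that nothing is ever evicted.
    """
    if not fractions:
        return 0.0
    if strategy == "none":
        # Every user pulls all of their bytes from nonlocal peers.
        return file_size * sum(fractions)
    if strategy == "passive":
        # The cache stores only chunks that users actually requested, so
        # transit is bounded by the longest prefix any user downloaded.
        return file_size * max(fractions)
    if strategy == "active":
        # Assumed behavior: the cache fetches the whole file once.
        return float(file_size)
    raise ValueError(f"unknown strategy: {strategy}")


def savings(file_size, fractions, strategy):
    """Transit bytes saved relative to no cache (negative means a net cost)."""
    baseline = transit_bytes(file_size, fractions, "none")
    return baseline - transit_bytes(file_size, fractions, strategy)


# Hypothetical example: a 700 MB file that two users abandon early.
for strategy in ("passive", "active"):
    print(strategy, savings(700, [0.3, 0.2], strategy))
```

In this model the passive cache can never cost more than having no cache, while the active cache shows a net loss whenever the completed fractions for a file sum to less than one, which is consistent with the direction of the trace-based finding reported above.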