It would be nice to have a system that continuously monitors a wide range of information on the Internet, and then alerts you to the fact that the price of your favorite company stock has just reached a certain level. This is one potential use case for the peer-to-peer continuous query protocol (PeerCQ), as described by the authors in this paper. PeerCQ follows in the footsteps of OpenCQ and WebCQ, which were also built at the College of Computing at Georgia Tech. The novelty here is to perform continuous queries (CQs) using current peer-to-peer (P2P) techniques.
In PeerCQ, information monitoring requests are expressed in terms of CQs. Each CQ is assigned an identifier. A service partitioning scheme is then used to assign CQ identifiers to peers. The authors explain that, when this assignment is done randomly in heterogeneous P2P systems, performance is poor in terms of load balancing. Since heterogeneous peers are the norm on the Internet, a major challenge is achieving good load balancing between peers. This must be done with respect to the nature of the CQs, so as not to degrade overall system utilization.
To meet this challenge, the authors have designed and simulated a smart service partitioning scheme. Instead of just randomly matching CQ identifiers to peer identifiers, the matching algorithm is divided into two phases. In the first phase, matching is performed based on the strict numerical closeness of the CQ identifier to the peer identifier. The second phase fine-tunes this matching by taking into consideration the following factors: the grouping of similar CQs that monitor the same data source and item, the loading factor of the peer, and the distance between the peer and the data source to be monitored. A number of experimental results are formally presented. Both the experimental results and methods used will be of great interest to anyone who wants to explore the internal theories and algorithms involved in real-world P2P architectures.
This paper is easy to follow, and very well written and referenced. It exposes a number of critical issues in P2P computing, and formally demonstrates how an algorithm can be built and analyzed to address those issues.