The authors present an algorithm for estimating the frequent items that occur in a sliding window of a data stream. If the window size is N data stream symbols, &thgr; is a user-defined threshold, and &egr; is a relative error bound, then at any time we want to be able to produce a set B of data stream symbols with the following two properties:
- (1) B only contains symbols that occur at least (&thgr; -&egr;)N times in the sliding window.
- (2) Any symbol that occurs more than &thgr;N times must be in B.
The algorithm maintains a pair of data structures, one for each of the two consecutive positions of the sliding windows. The data structures have a counter for each symbol that has been determined to occur frequently. It also maintains linked lists and hash tables so that both the space and time requirements are of order O(1/&egr;).
The paper provides additional details of the algorithm. Unfortunately, however, it is described using fairly tedious nomenclature.