Abstract
RHist: Adaptive Summarization over Continuous Data Streams
by: Lin Qiao, Divyakant Agrawal, and Amr El Abbadi
Abstract:
Maintaining approximate aggregates and summaries over data streams is crucial to handle the OLAP query workload that arises in applications such as network monitoring and telecommunications. Furthermore, since the entire data is not available at all times, the maintenance task must be done incrementally. We show that R(elaxed)Hist(ogram) is an appropriate summarization under the data stream scenario. In order to reduce query estimation errors, we propose adaptive approaches which not only capture the data distribution, but also integrate independent query patterns. We introduce a workload decay model to efficiently capture global workload information and ensure that query patterns from the recent past are weighted more than queries that are further in the past. We verify experimentally that our approach successfully adapts to changes in the workload as well as continuously changing data streams.
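The idea of weighting recent queries more heavily than older ones can be illustrated with a minimal sketch. The abstract does not specify the form of the workload decay model, so the exponential decay used here, along with the class name `DecayedWorkload`, the `decay` parameter, and the bucket labels, are assumptions made purely for illustration and should not be read as the authors' method.

```python
from collections import defaultdict


class DecayedWorkload:
    """Per-bucket query weights with exponential decay (illustrative only)."""

    def __init__(self, decay=0.5):
        # Assumed decay factor in (0, 1); not taken from the report.
        if not 0.0 < decay < 1.0:
            raise ValueError("decay must be in (0, 1)")
        self.decay = decay
        self.weights = defaultdict(float)

    def tick(self):
        """Advance one time step: every previously recorded query loses weight."""
        for bucket in self.weights:
            self.weights[bucket] *= self.decay

    def record(self, bucket):
        """Register one query that touched the given histogram bucket."""
        self.weights[bucket] += 1.0


workload = DecayedWorkload(decay=0.5)
workload.record("a")   # an older query on hypothetical bucket "a"
workload.tick()
workload.tick()
workload.record("b")   # a recent query on hypothetical bucket "b"
```

After two time steps the older query on bucket "a" carries less weight than the recent query on bucket "b", so a summarization that allocates precision by weight would favor the region queried most recently.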
Keywords:
Algorithms, theory
Date:
August 2002
Document: 2002-25