A sample Online Aggregation interface.
Unfortunately, aggregation processing in today's database systems closely resembles the batch processing of the 1960s. When users submit an aggregation query to the system, they are forced to wait without feedback while the system churns through millions of records or more. Only after a significant period of time does the system respond with the (usually small) final answer. A particularly frustrating aspect of this problem is that aggregation queries are typically used to get a "rough picture" of a large body of information, and yet they are computed with painstaking precision, even in situations where an acceptably precise approximation might be available very quickly.
In the Online Aggregation project, we are changing the interface to aggregation processing and, by extension, changing aggregation processing itself. The idea is to perform aggregation online in order to allow users both to observe the progress of their queries and to control execution on the fly. This enhancement requires changes not only to the user interface, but also to the techniques used for query optimization and execution. In addition, we are using new and existing statistical estimation techniques to help users assess the proximity of the running aggregate to the final result; the proposed interface makes these techniques accessible even to users with little or no statistical background. Online aggregation interfaces can go well beyond merely providing a platform for such statistical estimation techniques, permitting an interactive approach to both formal and informal data exploration and analysis.
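To make the idea of a running aggregate concrete, the following is a minimal sketch, not the project's implementation. It assumes the records can be scanned in random order, so that every prefix of the scan is a random sample, and it uses a standard large-sample (central limit theorem) confidence interval for AVG with no finite-population correction. The function name `online_avg` and its parameters are hypothetical, chosen for illustration only.

```python
import math
import random
from statistics import NormalDist


def online_avg(values, confidence=0.95, report_every=1000, seed=0):
    """Yield (rows_seen, running_mean, half_width) during a random-order scan.

    Hypothetical helper for illustration: the interval is a large-sample
    approximation, so it is only meaningful once many rows have been seen.
    """
    rows = list(values)
    random.Random(seed).shuffle(rows)  # assumption: random-order access
    z = NormalDist().inv_cdf(0.5 + confidence / 2)  # e.g. about 1.96 for 95%

    n = 0
    mean = 0.0
    m2 = 0.0  # running sum of squared deviations (Welford's method)
    for x in rows:
        n += 1
        delta = x - mean
        mean += delta / n
        m2 += delta * (x - mean)
        if n % report_every == 0 or n == len(rows):
            # Standard error of the mean from the sample variance so far.
            std_err = math.sqrt(m2 / (n - 1) / n) if n > 1 else float("inf")
            yield n, mean, z * std_err


# Synthetic data standing in for a large table column.
rng = random.Random(42)
data = [rng.gauss(100.0, 15.0) for _ in range(10000)]

estimates = list(online_avg(data, report_every=2000))
for n, mean, half_width in estimates:
    print(f"{n:>6} rows: AVG ~ {mean:.2f} +/- {half_width:.2f}")
```

Because the sketch is written as a generator, the consumer decides when the estimate is good enough: a caller can stop iterating as soon as the interval is acceptably narrow, which mirrors the kind of on-the-fly control described above.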