15 June 2004
Adaptive Frequency Counts over Real-time Data Streams
Speaker: Bill LIN
Abstract
Computing frequency counts is an essential step in data mining. In several
emerging applications, data arrives as continuous streams in which each
record can be read and processed only once. Traditional mining algorithms
are not applicable to such streams because they require multiple passes
over the data.
In this talk, methods of frequency counting over a data stream will be
presented. In addition to the one-pass property, real-time requirements
are considered. We propose a flexible algorithm that computes frequency
counts adaptively according to the available time constraints. A practical
system based on the algorithm will be shown for mining data streams with
bursty traffic.
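To make the one-pass property concrete, the sketch below uses the well-known
Misra-Gries summary, which is not the adaptive algorithm of this talk and is
given only as an illustration. It reads each record exactly once and keeps at
most k - 1 counters. The function name misra_gries and the parameter k are
assumptions chosen for this example.

```python
from typing import Dict, Hashable, Iterable


def misra_gries(stream: Iterable[Hashable], k: int) -> Dict[Hashable, int]:
    """Approximate frequency counts in one pass with at most k - 1 counters.

    Illustrative only: this is the classic Misra-Gries summary, not the
    adaptive algorithm described in the abstract.
    """
    if k < 2:
        raise ValueError("k must be at least 2")

    counters: Dict[Hashable, int] = {}
    for item in stream:  # each record is read exactly once
        if item in counters:
            counters[item] += 1
        elif len(counters) < k - 1:
            counters[item] = 1
        else:
            # No free counter: decrement every counter and drop those
            # that reach zero. The incoming item is not stored.
            for key in list(counters):
                counters[key] -= 1
                if counters[key] == 0:
                    del counters[key]
    return counters


if __name__ == "__main__":
    # A generator can be consumed only once, like a data stream.
    records = (r for r in ["a", "b", "a", "c", "a", "d", "a"])
    print(misra_gries(records, k=3))
```

For a stream of n records, every item occurring more than n/k times is
guaranteed to remain in the result, and each reported count underestimates
the true count by at most n/k. Memory use is fixed by k, but the per-record
cost is not adapted to time constraints as the proposed algorithm is.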