25 May 2006
Top-k Dominating Queries
Speaker: Ken YIU Man Lung
Abstract
A top-k dominating query returns the k data objects that dominate the highest number of other objects in a dataset. It gives users an intuitive and convenient way to find "popular" objects, which makes it an important tool for decision support systems. The query combines the advantages of top-k and skyline queries without sharing their disadvantages: (i) the result size can be controlled, (ii) users do not need to specify a ranking function, and (iii) the result is independent of how the individual dimensions are scaled. Based on the best-first paradigm, we first develop efficient algorithms for evaluating the query on indexed multi-dimensional data. We then introduce three effective techniques to optimize their performance: pruning techniques, batch counting, and lazy counting. Our experiments on synthetic datasets demonstrate that our algorithms outperform existing alternative approaches, while our results on real datasets show that the top-k dominating query delivers meaningful results to users.
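To make the query definition concrete, the following is a minimal brute-force sketch in Python. It is not the best-first, index-based algorithm presented in the talk; it only illustrates what the query computes by comparing every pair of objects. The function names `dominates` and `top_k_dominating` are hypothetical, and the sketch assumes that smaller values are preferable in every dimension, which the abstract does not specify.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, ...]


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """Return True if a dominates b.

    Assumption: smaller is better in every dimension, so a dominates b
    when a is no worse everywhere and strictly better somewhere.
    """
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better


def top_k_dominating(points: List[Point], k: int) -> List[Tuple[int, Point]]:
    """Return the k points with the highest dominance counts.

    Naive O(n^2) evaluation: each point's score is the number of
    points in the dataset that it dominates. Ties keep input order
    because Python's sort is stable.
    """
    scored = [(sum(dominates(p, q) for q in points), p) for p in points]
    scored.sort(key=lambda item: -item[0])
    return scored[:k]
```

Because the score only counts dominance relationships, multiplying all values of one dimension by a positive constant leaves the result unchanged, which illustrates property (iii) in the abstract. The quadratic cost of this sketch is what the indexed algorithms and the counting optimizations described in the talk aim to avoid.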
Read the Presentation
Slides...