18 May 2006
Density-Based Clustering of Uncertain Data
Speaker: CHUI Chun Kit
Abstract
In many application areas, e.g. sensor databases, location-based
services, or face recognition systems, distances between objects
have to be computed based on vague and uncertain data. Commonly, the
distances between these uncertain object descriptions are expressed by
one numerical distance value. Based on such single-valued distance
functions, standard data mining algorithms can work without any
changes. The authors of this paper propose to express the similarity
between two fuzzy objects by distance probability functions. These
fuzzy distance functions assign a probability value to each possible
distance value. By integrating these fuzzy distance functions directly
into data mining algorithms, the full information provided by these
functions is exploited. In order to demonstrate the benefits of this
general approach, the authors enhance the density-based clustering
algorithm DBSCAN so that it can work directly on these fuzzy distance
functions. In a detailed experimental evaluation based on artificial
and real-world data sets, the authors show the characteristics and
benefits of their new approach.
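
As a rough illustration of what a distance probability function can provide (a minimal sketch, not the authors' implementation), assume each uncertain object is represented by a small set of equally likely sample positions. The probability that two objects lie within a radius eps of each other can then be estimated as the fraction of sample pairs whose distance is at most eps. The function name `distance_probability`, the sample representation, and the example values are all hypothetical.

```python
import math
from itertools import product


def distance_probability(samples_a, samples_b, eps):
    """Estimate P(distance(a, b) <= eps) for two uncertain objects.

    Assumption: each object is a list of equally likely sample points,
    and the two objects are independent of each other.
    """
    pairs = list(product(samples_a, samples_b))
    # Count the sample pairs whose Euclidean distance is within eps.
    within = sum(1 for p, q in pairs if math.dist(p, q) <= eps)
    return within / len(pairs)


# Two hypothetical uncertain objects, each with two possible positions.
obj_a = [(0.0, 0.0), (1.0, 0.0)]
obj_b = [(0.0, 1.0), (3.0, 0.0)]

prob = distance_probability(obj_a, obj_b, eps=1.5)
print(prob)
```

A density-based algorithm such as DBSCAN could use a probability like this in place of a hard "is within eps" decision, for example by summing it over candidate neighbours to obtain an expected neighbourhood size. How the paper actually integrates the functions into DBSCAN is not specified in this abstract.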
Read the Presentation
Slides...