10 Sep 2003
Biclustering Methods for Microarray Data Analysis
Speaker: Kevin YIP
Abstract
With the advent of microarray technology, the activity of
thousands of genes can be recorded simultaneously. Putting together the
expression profiles of the genes under different conditions, the
resulting data can be viewed as a large matrix, where each row
corresponds to a gene and each column corresponds to a condition. Many
ongoing research efforts try to extract biological knowledge from these
large matrices. While directly applying
traditional machine learning or data mining methods on the data has been
successful in some cases, it is more common to see learning results that
have no apparent biological meaning. This may be due to the special
characteristics of microarray datasets, which do not fit well into
traditional data models. In the case of clustering, some researchers
have suggested that clusters may be better represented by submatrices
that involve only subsets of rows and columns. The corresponding
algorithms cluster both genes and conditions simultaneously. The
problem, called biclustering, closely resembles the projected and
subspace clustering problems studied by the database community in recent
years. Interestingly, the approaches taken by the two communities are
fundamentally different.
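As a rough illustration of the submatrix view described above, the sketch below scores a candidate bicluster on a toy gene-by-condition matrix. The scoring function is an assumption chosen for illustration: it uses the mean squared residue, one coherence measure from the biclustering literature, which this abstract does not itself name. The matrix values and the names mean_squared_residue, expression, bicluster_rows and bicluster_cols are likewise hypothetical.

```python
def mean_squared_residue(matrix, rows, cols):
    """Mean squared residue of the submatrix selected by rows and cols.

    A score near zero means the selected genes follow the same additive
    pattern across the selected conditions.
    """
    sub = [[matrix[i][j] for j in cols] for i in rows]
    n_rows, n_cols = len(sub), len(sub[0])
    row_means = [sum(r) / n_cols for r in sub]
    col_means = [sum(sub[i][j] for i in range(n_rows)) / n_rows
                 for j in range(n_cols)]
    overall = sum(row_means) / n_rows
    total = 0.0
    for i in range(n_rows):
        for j in range(n_cols):
            residue = sub[i][j] - row_means[i] - col_means[j] + overall
            total += residue ** 2
    return total / (n_rows * n_cols)


# Toy expression matrix (made-up values): 4 genes (rows) x 5 conditions (columns).
expression = [
    [1.0, 2.0, 9.0, 3.0, 0.0],
    [2.0, 3.0, 1.0, 4.0, 7.0],
    [5.0, 6.0, 4.0, 7.0, 2.0],
    [8.0, 0.0, 3.0, 1.0, 6.0],
]

# Candidate bicluster: a subset of genes AND a subset of conditions.
bicluster_rows = [0, 1, 2]
bicluster_cols = [0, 1, 3]

whole_score = mean_squared_residue(expression, range(4), range(5))
bicluster_score = mean_squared_residue(expression, bicluster_rows, bicluster_cols)

print("whole matrix:", whole_score)
print("bicluster:   ", bicluster_score)
```

Here the three selected genes rise and fall together over the three selected conditions, so the submatrix scores close to zero even though the full matrix does not. Clustering on whole rows would miss this pattern because the remaining conditions break it.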
In this talk, I will introduce some biclustering methods proposed
specifically for microarray data analysis. Some brief descriptions of
the models and algorithms will be given, and the approaches will be
compared based on some high-level criteria. I will also suggest some
possible further research topics in biclustering analysis on microarray
data.
Read the Presentation Slides...
Referred Papers