HKU Research  The University of Hong Kong
Department of Computer Science
Feature
home
current research
people
publications
HKU CS

 

6 Apr 2004

Mining Periodic Patterns with Gap Requirement from Sequences
Line
Speaker: ZHANG Minghua

Abstract

We study a problem of mining frequently occuring periodic patterns with a gap requirement from sequences. Given a character sequence $S = s_1 s_2 ... s_L$ of length $L$ and a pattern $P = p_1 p_2 ... p_l$ of length $l$, we consider $P$ a frequently occurring pattern in $S$ if the probability of observing $P$ given a randomly picked length-$l$ subsequence of $S$ exceeds a certain threshold. In many applications, particularly those related to bioinformatics, interesting patterns are periodic with a gap requirement. That is to say, the characters in $P$ should match subsequences of $S$ of the form $s_i s_{i+g^1} ... s_{i+g^1+g^2 + ... + g^{l-1}}$, where all the $g^j$'s fall within a narrow range. The matching characters in $S$ are thus separated by gaps of more or less the same size. We show the complexity of the mining problem and discuss why traditional mining algorithms are computationally infeasible. We propose practical algorithms for solving the problem and evaluate their effectiveness.

Read the Presentation Slides...

Referred Papers

Back to the top

Comment?  Send to dbgroup@cs.hku.hk