Abstract
Sequential pattern mining is an important data mining problem with broad
applications. It is difficult since one may need to examine a
combinatorially explosive number of possible subsequence patterns. Most of
the previous sequential pattern mining methods are based on the Apriori
property. Although Apriori-like methods may substantially reduce the
number of combinations to be examined, they still run into difficulty
when the sequence database is large and/or the sequential patterns to be
mined are numerous and/or long.
In my presentation, I'll introduce two novel, efficient pattern-growth
methods for sequential pattern mining: FreeSpan and PrefixSpan.
These methods explore database projections guided by patterns already
found. Performance analysis shows that both methods outperform the
Apriori-like method GSP, and that PrefixSpan achieves the best performance
in mining large sequence databases.
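
To make the idea of pattern-guided database projection concrete, here is a
minimal sketch in the spirit of PrefixSpan. It is an illustration under
simplifying assumptions, not the implementation evaluated in the talk: each
sequence element is a single item rather than an itemset, projected
databases are kept as (sequence index, offset) pairs instead of copied
suffixes, and the names prefixspan, mine, and min_support are hypothetical.

```python
from collections import defaultdict


def prefixspan(sequences, min_support):
    """Return {pattern: support} for every frequent sequential pattern.

    Simplified sketch: each sequence is a list of single items.
    """
    results = {}

    def mine(prefix, projected):
        # projected holds (sequence index, offset) pairs; the suffix
        # sequences[sid][offset:] is what remains after matching prefix.
        counts = defaultdict(int)
        for sid, start in projected:
            # Count each item at most once per projected suffix.
            for item in set(sequences[sid][start:]):
                counts[item] += 1

        for item, support in sorted(counts.items()):
            if support < min_support:
                continue
            pattern = prefix + (item,)
            results[pattern] = support

            # Project on the grown pattern: keep only the part of each
            # suffix that follows the first occurrence of item.
            new_projected = []
            for sid, start in projected:
                seq = sequences[sid]
                try:
                    pos = seq.index(item, start)
                except ValueError:
                    continue
                new_projected.append((sid, pos + 1))
            mine(pattern, new_projected)

    mine((), [(sid, 0) for sid in range(len(sequences))])
    return results


if __name__ == "__main__":
    db = [["a", "b", "c"], ["a", "c"], ["b", "c"], ["a", "b"]]
    for pattern, support in sorted(prefixspan(db, 2).items()):
        print(pattern, support)
```

Each recursive call only scans the suffixes that follow the current prefix,
so no candidate sequences are generated and tested against the full
database, which is the contrast with Apriori-like methods drawn above.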