Abstract
The paper presents a new algorithm for the discovery of rigid patterns in
biological sequences. The method is combinatorial in nature and able to
produce all patterns that appear in at least a user-defined minimum
number of sequences, yet it manages to be very efficient by avoiding the
enumeration of the entire pattern space. Furthermore, the reported
patterns are maximal: no reported pattern can be made more specific
while still appearing at exactly the same positions within the input
sequences. The effectiveness of the proposed approach is showcased on a
number of test cases which aim to: (1) validate the approach through the
discovery of previously reported patterns; and (2) demonstrate its ability
to automatically identify highly selective patterns specific to the
sequences under consideration. Finally, experimental analysis indicates
that the algorithm is output-sensitive, i.e., its running time is
quasi-linear in the size of the generated output.
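
To make the notions of support and maximality concrete, here is a minimal Python sketch. It is not the algorithm presented in the paper: it only checks a single, given pattern by brute force, whereas the paper's method discovers all such patterns without enumerating the pattern space. The "." wildcard notation, the function names, and the toy sequences are assumptions made for illustration. The maximality test is also partial: it only looks for wildcards that could be replaced by a fixed character, and ignores extensions to the left or right of the pattern.

```python
from typing import Dict, List, Tuple

WILDCARD = "."  # assumed notation for a "don't care" position


def occurrences(pattern: str, sequences: List[str]) -> List[Tuple[int, int]]:
    """Return (sequence index, offset) pairs where the rigid pattern matches."""
    hits = []
    width = len(pattern)
    for i, seq in enumerate(sequences):
        for off in range(len(seq) - width + 1):
            window = seq[off:off + width]
            if all(p == WILDCARD or p == c for p, c in zip(pattern, window)):
                hits.append((i, off))
    return hits


def support(pattern: str, sequences: List[str]) -> int:
    """Number of distinct sequences containing at least one occurrence."""
    return len({i for i, _ in occurrences(pattern, sequences)})


def fillable_wildcards(pattern: str, sequences: List[str]) -> Dict[int, str]:
    """Wildcard positions where every occurrence carries the same character.

    A non-empty result means the pattern is not maximal: each listed
    wildcard could be replaced by the given character without losing
    any occurrence. (Partial check only; extensions are not examined.)
    """
    hits = occurrences(pattern, sequences)
    result = {}
    if not hits:
        return result
    for pos, p in enumerate(pattern):
        if p != WILDCARD:
            continue
        chars = {sequences[i][off + pos] for i, off in hits}
        if len(chars) == 1:
            result[pos] = chars.pop()
    return result


# Toy input, invented for this example.
SEQS = ["XAKGHY", "QALGHW", "AMGHPP"]

print(support("A.GH", SEQS))             # 3
print(fillable_wildcards("A..H", SEQS))  # {2: 'G'}
print(fillable_wildcards("A.GH", SEQS))  # {}
```

In this toy input, "A..H" occurs in all three sequences but is not maximal, because its second wildcard always aligns with "G"; the more specific "A.GH" appears at exactly the same positions.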