17 May 2007
Indexing DNA sequences for local similarity search
Speaker: Angela SIU Wing Yan
Abstract
Local similarity search of DNA sequences is the operation of locating similar
regions between the sequences. By comparing the content of DNA sequences among
different species, important regions can be revealed and functions and
structures of the regions can be deduced. Within the community of biologists,
genomists and medical researchers, BLAST, maintained by NCBI of the United
States, is one of the most popular tools for local similarity search between
sequences being studied and huge genomic databases. The BLAST algorithm applies
heuristics to speed up searching, dividing it into four phases: hit generation,
ungapped extension, gapped extension and traceback.
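To make the first two phases concrete, here is a minimal sketch of hit generation and ungapped extension in Python. It is not the NCBI implementation and not the indexing scheme proposed in this talk: the function names, the word size, the match/mismatch scores and the X-drop threshold are all illustrative assumptions.

```python
from collections import defaultdict


def build_word_index(db, w):
    """Map every length-w word of the database string to its start positions."""
    index = defaultdict(list)
    for i in range(len(db) - w + 1):
        index[db[i:i + w]].append(i)
    return index


def find_hits(query, index, w):
    """Hit generation: list (query_pos, db_pos) pairs that share an exact word."""
    hits = []
    for q in range(len(query) - w + 1):
        for d in index.get(query[q:q + w], []):
            hits.append((q, d))
    return hits


def extend_ungapped(query, db, q, d, w, match=1, mismatch=-3, xdrop=5):
    """Ungapped extension of one hit in both directions.

    Extension in a direction stops once the running score falls more than
    `xdrop` below the best score seen so far (scores are assumed values).
    Returns (best_score, query_start, query_end), with query_end exclusive.
    """
    best = cur = w * match  # the seed word itself is an exact match
    best_right = 0
    k = 0
    while q + w + k < len(query) and d + w + k < len(db):
        cur += match if query[q + w + k] == db[d + w + k] else mismatch
        k += 1
        if cur > best:
            best, best_right = cur, k
        elif best - cur > xdrop:
            break

    cur = best  # restart from the best right-extended score
    best_left = 0
    k = 0
    while q - k - 1 >= 0 and d - k - 1 >= 0:
        cur += match if query[q - k - 1] == db[d - k - 1] else mismatch
        k += 1
        if cur > best:
            best, best_left = cur, k
        elif best - cur > xdrop:
            break

    return best, q - best_left, q + w + best_right


db = "TTACGTACGGA"
query = "ACGTACG"
index = build_word_index(db, 4)
hits = find_hits(query, index, 4)
segment = extend_ungapped(query, db, 0, 2, 4)
```

In this toy setup the index is built once over the database and reused for every query, which is why the choice of index structure dominates the cost of the early phases.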
In our research, we focus on the indexing of DNA sequences in the database to
facilitate local similarity search. We have designed a new indexing scheme,
which we call the prefix-suffix hashing scheme. The goal of this scheme is to speed
up the first two phases of BLAST and to reduce the processing cost of the later
phases.