HKU Research  The University of Hong Kong
Department of Computer Science and Information Systems

30 Aug 2002

A Practical Solution for Indexing Human Genome Using a PC Cluster
Speaker: CHENG Lok Lam

 

Abstract: DNA sequences hold the code of life for every living organism. Biologists are currently interested in finding similar patterns in DNA sequences. The whole human genome is about 3 Gbp in size, i.e. roughly 3 billion characters. The sequences can be treated as strings over an alphabet of four characters: A, C, G and T. Our goal is to find a practical approach to indexing the human genome so that biologists can perform exact and approximate matching on the 3 Gbp of DNA sequence efficiently.

In this talk, I will present my research on using suffix array and suffix tree structures on a PC cluster to index the human genome. Our experiment shows that the suffix array performs better than the suffix tree, although the suffix tree has lower theoretical run-time complexity. I will also present techniques for partitioning the data or the index structure so that it fits the PC cluster model.
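For readers unfamiliar with suffix arrays, the following is a minimal Python sketch of exact matching with a suffix array, together with one possible way of partitioning the data across machines. It is an illustration only, not the method presented in the talk: the function names, the naive sort-based construction, and the overlapping-chunk partitioning scheme (including the max_pattern_len parameter) are all assumptions made for this example, and a 3 Gbp input would need a far more memory-efficient construction.

```python
def build_suffix_array(text):
    # Naive construction: sort suffix start positions by the suffix text.
    # Fine for a toy example, far too slow and memory-hungry for a genome.
    return sorted(range(len(text)), key=lambda i: text[i:])


def find_exact(text, sa, pattern):
    # Binary search for the block of suffixes that begin with `pattern`.
    m = len(pattern)
    lo, hi = 0, len(sa)
    while lo < hi:  # first suffix whose m-character prefix is >= pattern
        mid = (lo + hi) // 2
        if text[sa[mid]:sa[mid] + m] < pattern:
            lo = mid + 1
        else:
            hi = mid
    start = lo
    hi = len(sa)
    while lo < hi:  # first suffix whose m-character prefix is > pattern
        mid = (lo + hi) // 2
        if text[sa[mid]:sa[mid] + m] <= pattern:
            lo = mid + 1
        else:
            hi = mid
    return sorted(sa[start:lo])


def build_partitioned_index(text, num_parts, max_pattern_len):
    # Hypothetical partitioning scheme: cut the text into chunks that overlap
    # by max_pattern_len - 1 characters, so a match of at most
    # max_pattern_len characters is never split across two chunks.
    # Each chunk could then be indexed and searched on a separate node.
    size = max(1, -(-len(text) // num_parts))  # ceiling division
    parts = []
    for start in range(0, len(text), size):
        chunk = text[start:start + size + max_pattern_len - 1]
        parts.append((start, chunk, build_suffix_array(chunk)))
    return parts


def find_partitioned(parts, pattern):
    # Query every chunk and merge the hits; the set removes duplicates
    # that come from the overlapping regions.
    hits = set()
    for offset, chunk, sa in parts:
        for pos in find_exact(chunk, sa, pattern):
            hits.add(offset + pos)
    return sorted(hits)


dna = "ACGTACGTTACG"
sa = build_suffix_array(dna)
parts = build_partitioned_index(dna, num_parts=3, max_pattern_len=4)
print(find_exact(dna, sa, "ACG"))      # [0, 4, 9]
print(find_partitioned(parts, "ACG"))  # [0, 4, 9]
```

Each lookup costs O(m log n) character comparisons for a pattern of length m in a text of length n, compared with O(m) for a suffix tree. In this sketch, patterns longer than max_pattern_len may be missed at chunk boundaries.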



Comment?  Send to dbgroup@cs.hku.hk