Next: 1.1 Commodity Supercomputing Up: thesis Previous: List of Tables   Contents


1. Introduction

Motivated by the desire to handle larger and more complex problems, as well as to solve problems faster, we make use of multiple processing units for parallel computation. The development of parallel computers, especially their architecture, has been an active research subject since the early 1960s. During these forty years, numerous parallel architectures have been developed, have evolved, and have faded out, for example systolic architectures, dataflow architectures, and transputer systems. Only in recent decades, owing to rapid improvements in VLSI technology, has there been a clear convergence of parallel machines toward a generic parallel machine organization [32]. In this generic architecture, a parallel machine essentially comprises a collection of complete computers, each with one or more processors and memory, interconnected by a communication network.

Advances in networking technology have accelerated this convergence. We are now able to transform a pool of off-the-shelf computers into a powerful platform for high-performance computation. This kind of computing platform is commonly known as a Cluster of Workstations (COW), a Network of Workstations (NOW), or simply a cluster [80]. By the name ``commodity cluster'', we refer to clusters that are built from off-the-shelf components, such as high-performance microprocessors and high-speed networks. Because they are built from commodity components, clusters offer a better price/performance ratio than traditional parallel machines, and this is the incentive for building them. In addition, the performance of cluster systems keeps pace with advances in commodity hardware.

These features are the selling points for using clusters in high-performance computing; however, simply putting state-of-the-art components together does not guarantee a cost-effective, high-performance system. The real challenge is how well we can harness these computing resources to meet our performance needs. As we build clusters for high-performance computing, we have to face challenges related to performance issues in the cluster domain. In particular,

We believe that effective parallel programming on the cluster platform requires the ability to measure the performance of parallel applications, the ability to determine the performance capability of cluster systems, and the ability to explain the performance behavior of a parallel application on a cluster system. This demands that system designers and programmers have an in-depth understanding of the interactions between the various hardware and software components. In this thesis, we rely on a realistic communication model to guide our understanding and structure our reasoning, as well as to carry out performance tuning. This model is used as a versatile tool for performance evaluation and prediction, as well as for algorithm design and analysis.

The rest of this chapter is organized as follows. We first state our motivation for building commodity clusters, and describe the limitations and challenges we have to tackle in order to achieve our goal. Next, we present the thesis statement and highlight the contributions of this thesis. Lastly, we outline the organization of this thesis.


