
7.1 Contributions

Although this dissertation focuses on performance issues related to commodity clusters, the principles behind this research should apply to other parallel computers built on the same architectural foundation. To conclude, we organize and discuss our accomplishments around three areas of contribution: modeling and performance understanding, congestion studies, and algorithm design and analysis.

7.1.0.0.1 Modeling and Performance Understanding

Computer systems are evolving rapidly, and their development has become so complex that it demands a systematic approach to performance understanding. This dissertation demonstrates the importance of having both quantitative and qualitative metrics in performance understanding. With the quantitative information, we can evaluate, question, and analyze how the system performs; with the qualitative information, we can predict, analyze, and explain how the application behaves. One approach to performance understanding is the use of modeling techniques. The foundation of this thesis research is the development of a realistic communication model that captures the performance characteristics of the target machine and serves as a practical tool for the design and analysis of algorithms. The model includes a set of microbenchmarks to quantify resource information and a set of performance parameters to delineate the performance characteristics of the communication system. The model parameters give both programmers and system designers the quantitative and qualitative information they need to understand and reason about their program or system design decisions.

7.1.0.0.2 Congestion Study

This thesis demonstrates that the buffering mechanism within the routers or switches has a paramount influence on communication performance. We use the available information on network buffering to investigate the congestion problem through two different approaches. First, we explore how the software and hardware components interact when the communication network is under heavy congestion. Our analytical and experimental results show that, under asymmetric traffic loads, the output-buffered mechanism is more susceptible to congestion loss than the input-buffered mechanism. Although input buffering has a higher threshold, once overflow occurs, the resulting communication performance drops significantly. In addition, we find that the behavioral difference between the two buffering mechanisms lies in how they interact with the communication protocol.

The second part of our investigation of the congestion problem concerns the design of high-performance communication algorithms that utilize the network resources efficiently while avoiding the buildup of congestion. The unique feature of our approach is the use of resource information provided by our communication model to guide design and analysis. We have introduced a global congestion control scheme into the complete exchange algorithm and have demonstrated that it effectively avoids congestion loss while maintaining sufficient throughput to maximize performance. The principle behind the global congestion control scheme is to prevent oversubscribing the network; the scheme is derived from information about the communication pattern and volume, as well as from modeled architectural features. Improving congestion control with the ideas presented in this dissertation has made it an effective solution for high-performance communication on commodity networks.

7.1.0.0.3 Algorithm Analysis and Design

We have devised an efficient communication schedule for the complete exchange operation, the Synchronous Shuffle Exchange, and have shown that it is an optimal algorithm on any non-blocking network. Although our experimental results show that the synchronous shuffle exchange is realizable and efficient, in practice there are limiting factors that restrain its performance. This commonly happens when porting theoretical algorithms to real platforms, because the algorithms are designed and analyzed on a simplified, abstract platform. In this dissertation, we resolve these problems by augmenting the algorithms with mechanisms derived from our communication model. This shows the importance of using a practical performance model for algorithm design and analysis.
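
To make the notion of a complete exchange schedule concrete, the following minimal sketch builds a generic cyclic-shift schedule in which, in every round, each node sends to exactly one partner and receives from exactly one partner. This is a textbook pattern used purely for illustration; it is not the Synchronous Shuffle Exchange itself, whose details are given elsewhere in the dissertation. The names `shift_schedule` and `is_contention_free` are hypothetical, and the single-send, single-receive-per-round assumption is ours.

```python
def shift_schedule(num_nodes):
    """Build a cyclic-shift complete exchange schedule (illustrative only).

    Returns a list of rounds. Each round is a dict mapping a sender to its
    receiver. In round k (1 <= k < num_nodes), node i sends to (i + k) mod P.
    """
    rounds = []
    for k in range(1, num_nodes):
        rounds.append({src: (src + k) % num_nodes for src in range(num_nodes)})
    return rounds


def is_contention_free(round_map):
    """True if no two senders target the same receiver in this round."""
    receivers = list(round_map.values())
    return len(set(receivers)) == len(receivers)


if __name__ == "__main__":
    # Hypothetical 4-node example: print who sends to whom in each round.
    for k, round_map in enumerate(shift_schedule(4), start=1):
        print("round", k, round_map)
```

Because each round is a permutation of the nodes, no receiver is targeted twice in the same round, which is the property a non-blocking network can route without internal contention. On a real platform, factors such as switch buffering and loss of synchronization between rounds can still degrade such a schedule, which is where model-derived mechanisms like those described above come in.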

