
2.4 Summary

In this chapter, we introduced a simple communication model for parallel computing on cluster platforms. The model is intended both for performance analysis and as an algorithm design tool, i.e. it is feasible for complexity analysis. In particular, its main objective is to support the development of efficient high-level communication primitives on top of lightweight messaging systems. With such primitives, we are able to develop portable parallel programs that run efficiently on a range of commodity clusters.

In selecting the model parameters, we faced two slightly conflicting considerations. First, the information reflected by our model should be easy to assess, because it is useless to include features that appear simple but are difficult to quantify in practice. For example, to reflect the cost of a DMA transfer, one needs to rely on hardware support, since no simple software solution is available. Second, we must weigh how much each target architectural feature actually contributes to performance. For performance tuning we are tempted to provide more detail; however, an overwhelming parameter set becomes too tedious for practical analytical use. We believe that applying the model to algorithm analysis should be straightforward, provided that users are given some systematic means of analysis. We have therefore placed emphasis on deriving our model parameters by a software approach, which is the key to the whole analytical process. Based on these measurable parameters, higher-level primitives can be built or analyzed, and these primitives can in turn serve as high-level performance parameters when analyzing complicated applications.

In our model, communication events are abstracted as local and remote data movements; each movement has an associated cost, which may depend on the length of the data items. To be realistic, we have included a rich parameter set in the model; however, which of those parameters are used depends on the target level of abstraction. Under some circumstances, a few performance parameters prove adequate for modeling the parallel system. On other occasions, further issues need to be studied or included in order to make the correct judgment. For instance, a simple latency parameter may be good enough to capture the cost of point-to-point communication, but it is too simple to explain many-to-one or many-to-many behavior.
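The gap between the point-to-point and many-to-one cases can be made concrete with a minimal sketch. It is not the parameter set defined in this chapter: the names `latency` and `per_byte`, the linear cost form, and the rule that a single receiver drains incoming messages one at a time are all assumptions chosen purely for illustration.

```python
# Illustrative only: parameter names and cost formulas are assumptions,
# not the model parameters defined in this chapter.

def point_to_point_cost(nbytes, latency, per_byte):
    """Fixed latency plus a term proportional to the message length."""
    return latency + nbytes * per_byte


def many_to_one_latency_only(senders, nbytes, latency, per_byte):
    """Naive view: every sender is treated as an independent
    point-to-point transfer, so the sender count has no effect."""
    return point_to_point_cost(nbytes, latency, per_byte)


def many_to_one_serialized(senders, nbytes, latency, per_byte):
    """Assumed contention rule: the single receiver drains the
    incoming messages one after another."""
    return latency + senders * nbytes * per_byte


if __name__ == "__main__":
    # Hypothetical values, e.g. microseconds and microseconds per byte.
    print(many_to_one_latency_only(8, 1000, 50, 0.5))
    print(many_to_one_serialized(8, 1000, 50, 0.5))
```

Under these assumed values the latency-only estimate stays the same whether one or eight nodes send, while the serialized estimate grows with the number of senders. That difference is the kind of effect a single latency parameter cannot express.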

In contrast to other models, we opt to expose the contention issue explicitly and capture it in our parameters, e.g. the network latency and buffer capacity parameters, thereby enhancing programmers' awareness of contention. In addition, our model accommodates communication pipelining and the overlapping of communications, which is useful for accurate performance analysis as well as for designing efficient communication schedules.
