Analysis of Inter-Chip Communication Patterns on Multi-Core Distributed Shared-Memory Computers
Multi-core multi-socket distributed shared-memory computers (DSM computers, for short) have become an important node architecture in scientific computing because they provide substantial computational capacity with relatively low space and power requirements. Compared to conventional computer networks, the inter-chip networks used in DSM computers feature higher bandwidth, lower latency, and tighter integration with the CPU. The inter-chip network is a resource shared by the user application and many other services, which can lead to considerable variation in the execution times of identical communication tasks. In this work, we explore traffic patterns resulting from MPI collective communication primitives and investigate whether inter-chip link load is a reliable indicator and predictor of the execution time of collective communication primitives on a DSM computer. Our experiments on a Sun Fire X4600 M2 DSM computer with 32 cores (eight quad-core CPUs) indicate that the loads on specific single links are positively correlated with the execution time of MPI_Allreduce. Observing patterns over multiple links allows refinement of the single-link observation.
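The kind of relationship described above, a positive correlation between the load on a single link and the execution time of a collective, can be illustrated with a Pearson correlation coefficient. The following is only a minimal sketch: the function name `pearson`, the variable names, and all sample values are hypothetical and invented for illustration. They are not measurements from the paper, and the abstract does not state which correlation measure the authors used.

```python
import math


def pearson(xs, ys):
    """Return the Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 samples")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations (unnormalised covariance).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Sums of squared deviations (unnormalised variances).
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical, made-up samples: one (link load, execution time) pair per run.
# These are NOT data from the paper.
link_load = [0.10, 0.25, 0.40, 0.55, 0.70]      # assumed fraction of link capacity in use
exec_time_us = [52.0, 55.0, 61.0, 64.0, 72.0]   # assumed collective execution time in microseconds

r = pearson(link_load, exec_time_us)
print(f"Pearson r between link load and execution time: {r:.3f}")
```

A value of `r` close to 1 would correspond to the strong positive single-link correlation the abstract reports. Extending the sketch to several links would mean computing one coefficient per link, or fitting a model over the vector of link loads.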
- Mücke, Manfred
- Gansterer, Wilfried
Category | Paper in Conference Proceedings or in Workshop Proceedings (Paper)
Event Title | Symposium on HyperTransport Technology
Divisions | Computational Technologies and Applications
Subjects | Parallel Data Processing, Computer Architecture
Event Location | Mannheim
Event Type | Workshop
Event Dates | 03.02.2011
Publisher | Universitaetsbibliothek Heidelberg
Date | February 2011
Official URL | http://www.ub.uni-heidelberg.de/archiv/11585