Automatic Search for Performance Problems in Parallel and Distributed Programs by Using Multi-Experiment Analysis

Automatic Search for Performance Problems in Parallel and Distributed Programs by Using Multi-Experiment Analysis

Abstract

We introduce Aksum, a novel system for performance analysis that helps programmers to locate and to understand performance problems in message passing, shared memory and mixed parallel programs. The user must provide the set of problem and machine sizes for which performance analysis should be conducted. The search for performance problems (properties) is user-controllable by restricting the performance analysis to specific code regions, by creating new or customizing existing property specifications and property hierarchies, by indicating the maximum search time and maximum time a single experiment may take, by providing thresholds that define whether or not a property is critical, and by indicating conditions under which the search for properties stops. Aksum automatically selects and instruments code regions for collecting raw performance data based on which performance properties are computed. Heuristics are incorporated to prune the search for performance properties. We have implemented Aksum as a portable Java-based distributed system which displays all properties detected during the search process together with the code regions that cause them. A filtering mechanism allows the examination of properties at various levels of detail. We present an experiment with a financial modeling application to demonstrate the usefulness and effectiveness of our approach.

Grafik Top
Authors
  • Seragiotto, Clovis
  • Fahringer, T.
Grafik Top
Shortfacts
Category
Technical Report (Technical Report)
Divisions
Scientific Computing
Publisher
Institute for Software Science, University of Vienna
Date
August 2002
Official URL
http://www.par.univie.ac.at/publications/download/...
Export
Grafik Top