Session 3

3.1 Project Talk: Developer tools for porting and tuning parallel applications on extreme-scale parallel systems "Developer tools project update"

Christian Feld (JSC)
Developments in the partners' tools will be reported, particularly the design and initial prototyping of XMPT modelled on the OMPT tools interface for OpenMP, which is expected to facilitate measurement of applications using XMP with Extrae, Score-P and other tools. We also provide an update on recent and planned training with our tools, and ongoing work to define a common performance analysis terminology, methodology, and efficiency metrics for MPI and multithreaded applications.

3.2 Project Talk: Use of the Folding profiler to assist on data distribution for heterogeneous memory systems "Profiler-assisted data distribution for heterogeneous memory systems: getting close"

Antonio J. Peña (BSC)
In this project we aim at using the Extrae profiler along with its Folding capabilities to provide optimized data distributions for heterogeneous memory systems based on coarse-grained sampling of hardware counters. In this talk we present the latest progress on this project on both the profiling tool and the programming model sides.

3.4 Individual Talk: Multi objective optimization of HPC kernels for performance, power, and energy

Prasanna Balaprakash (ANL)
Code optimization in the high-performance computing realm has traditionally focused on reducing execution time. The problem, in mathematical terms, has been expressed as a single-objective optimization problem. The expected concerns of next-generation systems, however, demand a more detailed analysis of the interplay among execution time and other metrics. Metrics such as power, performance, energy, and resiliency may all be targeted together and traded against one another. We present a multi-objective formulation of the code optimization problem and a machine-learning-based search algorithm. Our proposed framework helps one explore potential tradeoffs among multiple objectives and provides a significantly richer analysis than can be achieved by treating additional metrics as hard constraints. We empirically examine a variety of metrics, architectures, and code optimization decisions and provide evidence that such tradeoffs exist in practice.
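
To illustrate the difference between a multi-objective view and a hard constraint, the following minimal Python sketch computes the Pareto front of a few code variants over two objectives (run time and energy). It is not the speakers' framework or search algorithm: the variant names, the numbers, and the helper functions `dominates` and `pareto_front` are hypothetical and chosen only for illustration.

```python
from typing import List, Tuple

# (variant name, run time in seconds, energy in joules); made-up values.
Variant = Tuple[str, float, float]


def dominates(a: Variant, b: Variant) -> bool:
    """True if a is no worse than b in both objectives and strictly better in one."""
    return a[1] <= b[1] and a[2] <= b[2] and (a[1] < b[1] or a[2] < b[2])


def pareto_front(variants: List[Variant]) -> List[Variant]:
    """Keep every variant that no other variant dominates."""
    return [v for v in variants if not any(dominates(o, v) for o in variants)]


variants = [
    ("unroll2", 10.0, 500.0),
    ("unroll4", 8.0, 560.0),
    ("unroll8", 9.0, 600.0),  # slower and hungrier than unroll4
    ("tile32", 12.0, 450.0),
]

# Multi-objective view: the whole set of non-dominated tradeoffs.
front = pareto_front(variants)
front_names = [v[0] for v in front]

# Hard-constraint view: fastest variant under a (hypothetical) 520 J energy cap.
best_under_cap = min((v for v in variants if v[2] <= 520.0), key=lambda v: v[1])

print(front_names)
print(best_under_cap[0])
```

The Pareto front keeps three of the four variants, whereas the constrained formulation returns a single point and hides the fact that a faster variant exists just above the cap.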

3.5 Individual Talk: Some causes about performance fluctuations of applications

Kiyoshi Kumahata (RIKEN)
During the operation of the K computer, the running time of an application occasionally becomes longer or shorter than the time previously measured under the same conditions. We call this "running time fluctuation". Running time fluctuations disturb the efficient operation of a supercomputer and waste precious computing resources. Thus, we have been investigating and resolving such issues on the K computer. These issues may occur in many applications and supercomputers. In this talk, we introduce some causes of running time fluctuations that we have encountered.

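As a rough illustration of the definition of running time fluctuation in the abstract for talk 3.5, the sketch below flags runs whose time deviates from a previously measured reference by more than a relative tolerance. The function name `flag_fluctuations`, the 5% tolerance, and the sample timings are assumptions made for this example; they do not describe how fluctuations are detected on the K computer.

```python
from typing import List, Tuple


def flag_fluctuations(
    times: List[float], reference: float, tolerance: float = 0.05
) -> List[Tuple[int, float, float]]:
    """Return (run index, time, relative deviation) for runs outside the tolerance.

    Both longer and shorter runs are flagged, matching the definition of a
    fluctuation as a run time that differs from the earlier measurement.
    """
    flagged = []
    for i, t in enumerate(times):
        deviation = (t - reference) / reference
        if abs(deviation) > tolerance:
            flagged.append((i, t, deviation))
    return flagged


# Hypothetical timings (seconds) of one job repeated under the same conditions.
run_times = [100.0, 101.0, 99.5, 112.0, 100.5, 91.0]
flagged = flag_fluctuations(run_times, reference=100.0)
flagged_indices = [i for i, _, _ in flagged]

for i, t, dev in flagged:
    print(f"run {i}: {t:.1f} s ({dev:+.1%} vs reference)")
```

Such a check only detects a fluctuation; finding its cause still requires the kind of investigation the talk describes.
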
3.6 Individual Talk: HPC-Tools JUBE, LLview and SIONlib at JSC: Recent developments

Wolfgang Frings (JSC), Sebastian Lehrs (JSC), Kay Thust (JSC)
In this talk we will present the recent developments of the benchmarking environment JUBE, the batch system monitoring tool LLview, and the parallel I/O library SIONlib. In detail, we will present how the benchmarking environment JUBE is integrated into a performance evaluation workflow, which is applied in the EU-project EoCoE. For LLview we will show the recently implemented job-based monitoring of I/O metrics. For SIONlib we will focus on the work we did in the EU-project DEEP-ER to support the efficient use of node-local storage in parallel task-local I/O. For all tools we will show our future plans and present possible topics for collaboration.