Course and lab materials

Introduction to performance evaluation

    Speakers: W. Jalby and S. Valat

Course material by W. Jalby (part 1)

Course material by W. Jalby (part 2)

Course material by S. Valat


  • Presenting the stage and the actors
  • A brief survey of computer architecture
  • Some key technologies for performance analysis: tracing/sampling, source/binary instrumentation, hardware performance counters
  • Analyzing compiler inefficiencies: MAQAO
  • Understanding and quantifying hardware behavior: systematic microbenchmarking


  • Brief overview of the key role of the OS: resource management
  • Reproducibility in performance measurements (controlling OS behavior)
  • Examples of key performance problems generated by the OS
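
As a rough illustration of why reproducibility matters, the sketch below times a kernel several times and keeps every sample instead of trusting a single run. It assumes a POSIX system providing clock_gettime with CLOCK_MONOTONIC; the names time_kernel and example_kernel are hypothetical and do not come from the course material.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdlib.h>
#include <time.h>

/* Seconds read from a monotonic clock (not affected by wall-clock adjustments). */
static double now_seconds(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (double)ts.tv_sec + (double)ts.tv_nsec * 1e-9;
}

/* qsort comparator for doubles, ascending order. */
static int compare_doubles(const void *a, const void *b)
{
    double x = *(const double *)a;
    double y = *(const double *)b;
    return (x > y) - (x < y);
}

/* Hypothetical kernel, used only to have something to measure. */
static volatile double sink;
void example_kernel(void)
{
    double sum = 0.0;
    for (int i = 0; i < 100000; i++)
        sum += i * 0.5;
    sink = sum; /* keep the compiler from removing the loop */
}

/* Hypothetical helper: runs `kernel` once as a warm-up, then `runs` timed
 * repetitions. On return, `samples` holds the timings sorted in ascending
 * order (samples[0] is the minimum) and the median is returned
 * (the upper median when `runs` is even). */
double time_kernel(void (*kernel)(void), double *samples, int runs)
{
    assert(kernel != NULL && samples != NULL && runs > 0);
    kernel(); /* warm-up: caches and page tables, not recorded */
    for (int i = 0; i < runs; i++) {
        double start = now_seconds();
        kernel();
        samples[i] = now_seconds() - start;
    }
    qsort(samples, (size_t)runs, sizeof samples[0], compare_doubles);
    return samples[runs / 2];
}
```

A large gap between the minimum and the median of the samples is a hint that something outside the kernel, possibly the OS, is perturbing the measurement.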


  • Identifying the right performance problems
  • Principles of DECAN: a decremental analysis tool
  • A few case studies using DECAN
  • DECAN applied to more general problems

Benchmarking: a first step toward code optimization

    Speaker: L. Saugé

Course material

Evaluating the peak performance of a system and understanding the limitations and constraints of a particular architecture is the first step in analyzing the performance of a code and, therefore, toward optimizing it.

In this presentation, we study how to evaluate (or "benchmark") a high-performance computing system. We review the hardware (and software) elements that make up such systems and, for each of them, how it can affect the performance of a code. From this analysis, we derive best practices for programmers concerned with the performance of their code.
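
One common way to put a number on peak performance is to multiply the number of cores by the clock frequency and by the floating-point operations each core can complete per cycle. The sketch below does this under stated assumptions: the function names are hypothetical, and the flops-per-cycle value depends on the SIMD width and instruction mix of the actual processor, so it has to be taken from the vendor documentation.

```c
#include <assert.h>

/* Theoretical peak in GFLOP/s for a node:
 * sockets x cores per socket x clock (GHz) x flops per cycle per core. */
double peak_gflops(int sockets, int cores_per_socket,
                   double clock_ghz, int flops_per_cycle)
{
    assert(sockets > 0 && cores_per_socket > 0);
    assert(clock_ghz > 0.0 && flops_per_cycle > 0);
    return (double)sockets * cores_per_socket * clock_ghz * flops_per_cycle;
}

/* Fraction of the theoretical peak reached by a measured run. */
double efficiency(double measured_gflops, double peak)
{
    assert(peak > 0.0);
    return measured_gflops / peak;
}
```

For a hypothetical node with 2 sockets, 8 cores per socket, a 2.5 GHz clock and 8 flops per cycle per core, peak_gflops returns 320; a benchmark measuring 160 GFLOP/s on that node would then be running at 50% of peak.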

Optimization in C & C++

    Speaker: S. Binet

Course material

Lab material

Understanding the cost of the various C/C++ code constructs is
paramount to eventually delivering an application that asymptotically
reaches its full performance potential.
This lecture will first briefly introduce performance concepts in the
multicore/manycore landscape and how to detect and assess performance
bottlenecks.
A non-exhaustive list of various source code optimization techniques for
C and C++ will follow, associated with their cost.
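
The lecture's own list of techniques is not reproduced here. As one illustration of what the cost of a construct can mean, the sketch below contrasts a loop that calls strlen in its condition with one that computes the length once. Because the loop writes to the string, a compiler may not be able to hoist the call on its own, which makes the first version quadratic in the string length. Both function names are hypothetical.

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

/* Costly construct: strlen(s) may be re-evaluated on every iteration,
 * giving O(n^2) work for a string of length n. */
void upper_slow(char *s)
{
    assert(s != NULL);
    for (size_t i = 0; i < strlen(s); i++)
        s[i] = (char)toupper((unsigned char)s[i]);
}

/* Same result with the invariant length computed once: O(n) work. */
void upper_fast(char *s)
{
    assert(s != NULL);
    size_t n = strlen(s);
    for (size_t i = 0; i < n; i++)
        s[i] = (char)toupper((unsigned char)s[i]);
}
```

Whether the slow version is actually slower on a given compiler and optimization level is something to measure rather than assume.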

Analysis and Optimization of the Memory Access Behavior of Applications

    Speaker: J. Weidendorfer

Course material

Lab material

The goal of this session is to understand how accessing memory on
modern CPUs in the wrong way can slow down an application, and
what can be done to make it faster. This will include a short
introduction on how caches work in modern CPUs. Standard cache
optimization strategies such as blocking and padding are
described, using small example codes such as matrix multiplication
and stencil codes.
In the second part, tools are presented which are able to detect
bad memory access behavior, using hardware performance counters
as well as cache simulation for detailed analysis. The session
ends with a hands-on part, demonstrating the use of the tools.
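
To make the blocking idea concrete, here is a minimal sketch of a blocked matrix multiplication next to the plain triple loop. It assumes square, row-major matrices of doubles; the function names and the block parameter are illustrative choices, not taken from the session material, and a suitable block size depends on the cache sizes of the target CPU.

```c
#include <assert.h>
#include <stddef.h>

/* Reference version: c += a * b for n x n row-major matrices.
 * The inner loop walks b with stride n, which uses caches poorly for large n. */
void matmul_naive(size_t n, const double *a, const double *b, double *c)
{
    for (size_t i = 0; i < n; i++)
        for (size_t j = 0; j < n; j++) {
            double sum = c[i * n + j];
            for (size_t k = 0; k < n; k++)
                sum += a[i * n + k] * b[k * n + j];
            c[i * n + j] = sum;
        }
}

static size_t min_size(size_t x, size_t y)
{
    return x < y ? x : y;
}

/* Blocked version: the iteration space is cut into block x block tiles so
 * that the tiles of a, b and c being combined can stay in cache while they
 * are reused. Computes the same c += a * b. */
void matmul_blocked(size_t n, size_t block,
                    const double *a, const double *b, double *c)
{
    assert(block > 0);
    for (size_t ii = 0; ii < n; ii += block)
        for (size_t kk = 0; kk < n; kk += block)
            for (size_t jj = 0; jj < n; jj += block) {
                size_t i_end = min_size(ii + block, n);
                size_t k_end = min_size(kk + block, n);
                size_t j_end = min_size(jj + block, n);
                for (size_t i = ii; i < i_end; i++)
                    for (size_t k = kk; k < k_end; k++) {
                        double a_ik = a[i * n + k];
                        /* unit-stride accesses to both b and c */
                        for (size_t j = jj; j < j_end; j++)
                            c[i * n + j] += a_ik * b[k * n + j];
                    }
            }
}
```

The blocked version changes the order in which products are accumulated, so with general floating-point data its result can differ from the naive one in the last bits.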

Description of analysis tools

    Speaker: A. Charif Rubial

Course material

Static performance model: code quality and vectorization opportunities
Application profiling from coarse to fine grain: MIL

  • Static analysis to reduce the runtime overhead
  • Binary level
  • General profiling strategies
  • User defined profiling

Memory behavior analysis of applications

  • Data alignment
  • Detecting inefficient access patterns
  • Multi-threaded
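
As a small illustration of the data alignment item, the sketch below allocates a buffer aligned on a chosen boundary and checks the alignment of a pointer. It assumes a POSIX system providing posix_memalign; the helper names are hypothetical, and 64 bytes is used in the usage note only as a commonly quoted cache line size, not as a value taken from the course.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Returns nonzero when `ptr` is a multiple of `alignment` bytes. */
int is_aligned(const void *ptr, size_t alignment)
{
    assert(alignment > 0);
    return ((uintptr_t)ptr % alignment) == 0;
}

/* Hypothetical helper: allocates `count` doubles aligned on `alignment`
 * bytes. posix_memalign requires `alignment` to be a power of two and a
 * multiple of sizeof(void *). Returns NULL on failure; release with free(). */
double *alloc_aligned_doubles(size_t count, size_t alignment)
{
    void *ptr = NULL;
    if (posix_memalign(&ptr, alignment, count * sizeof(double)) != 0)
        return NULL;
    return ptr;
}
```

For example, alloc_aligned_doubles(1000, 64) returns a buffer whose first element starts on a 64-byte boundary, which is_aligned can confirm.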

Using hardware performance counters

Studying the behavior of parallel applications and identifying bottlenecks by using performance analysis tools

    Speaker: G. Markomanolis

Course material

Lab material

As the number of available processors and the complexity of architectures increase, it becomes
more difficult to understand how a parallel application behaves. A parallel application often fails to
deliver the expected speedup when run on many processors. TAU and Scalasca are two performance
analysis tools that can help the user identify performance bottlenecks. Each provides different tools for
observing a problem that can be caused by many different factors. Various aspects are covered: instrumentation,
measurement (profiling or tracing) with PAPI hardware counters, analysis, and visualization. In addition, two
other tools will be introduced. The first is Score-P, the next-generation performance
measurement infrastructure developed jointly by the Scalasca, VampirTrace, TAU and other teams. The second,
PerfExpert, is a performance diagnosis tool that can provide detailed information about the causes of a bottleneck and,
in some cases, propose optimizations to alleviate it.

I/O optimization for POSIX systems

    Speakers: L. Tortay and E. Legay

Course material

Lab material

Presentation and review of the fundamentals of current persistent storage technologies. POSIX (and Linux) system functions for disk and network I/O.

Best practices for using these functions efficiently ("when, how, and why to use them").

Presentation of tools useful for determining the kind of I/O a program actually performs, in particular SystemTap (for Linux).
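
As an example of the kind of practice this session is about, the sketch below reads a file with the POSIX open/read/close calls and counts how many read() calls a given buffer size costs. The helper name count_bytes is hypothetical and the code is only a sketch of the idea that each system call has a fixed cost, so very small buffers multiply that cost; it is not taken from the course material.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdlib.h>
#include <unistd.h>

/* Hypothetical helper: reads the whole file at `path` using a buffer of
 * `buf_size` bytes. Returns the number of bytes read, or -1 on error.
 * `*calls` receives the number of completed read() calls, including the
 * final one that reports end of file. */
long long count_bytes(const char *path, size_t buf_size, long *calls)
{
    assert(path != NULL && calls != NULL && buf_size > 0);
    *calls = 0;

    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;

    char *buf = malloc(buf_size);
    if (buf == NULL) {
        close(fd);
        return -1;
    }

    long long total = 0;
    for (;;) {
        ssize_t n = read(fd, buf, buf_size);
        if (n < 0 && errno == EINTR)
            continue;          /* interrupted by a signal: retry */
        (*calls)++;
        if (n < 0) {           /* real error */
            total = -1;
            break;
        }
        if (n == 0)            /* end of file */
            break;
        total += n;            /* a short read is normal, keep going */
    }

    free(buf);
    close(fd);
    return total;
}
```

Calling it on the same file with a 16-byte buffer and with a 4096-byte buffer returns the same byte count but a very different number of read() calls, which is the kind of difference a tracing tool can make visible in a real program.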



Creative Commons License