...

Main Topics

Schedule

            Speaker

Affiliation

Type of presentation

Title (tentative)

Download

 

 

 

 

 

 

 

Dinner Before the Workshop

7:30 PM

Only people registered for the dinner

 

 

Valpré hotel

 

 

 

 

 

 

 

 

Workshop Day 1

Wednesday June 12th

 

 

 

 

 

 

 

 

 

 

TITLES ARE TEMPORARY (except if in bold font)

 

Registration

08:00

 

 

 

 

 

Welcome and Introduction

08:30

Marc Snir + Franck Cappello

INRIA&UIUC&ANL

Background

Welcome, Workshop objectives and organization

 

 

08:45

Bill Kramer

UIUC

Background

NCSA updates and vision of the collaboration

 

 

09:00

Marc Snir

ANL

Background

ANL updates and vision of the collaboration

 

 

09:15

Frederic Desprez

Inria

Background

INRIA updates and vision of the collaboration

 

Big systems
Chair: Christian Perez

09:30

Bill Kramer

UIUC

Background

Update on BlueWaters

 

 

10:00

Break

 

 

 

 

 

10:30

Mitsuhisa Sato

U. Tsukuba & AICS

Background

AICS and the K computer

 

 

11:00

Paul Gibbon

Juelich

Background

TBA

 

Resilience & fault tolerance and simulation
Chair: Franck Cappello

11:30

Marc Snir

ANL&UIUC

Report

ICIS report on Resilience

 

 

12:00

Lunch

 

 

 

 

Numerical Algorithms
Chair: Luc Giraud

13:30

Bill Gropp

UIUC

BackgroundTBA

Topics for Collaboration in Numerical Libraries

 


14:00

Paul Hovland

ANL

Background

TBA

 

 

14:30

Frederic Nataf

INRIA&P6

Background

Toward black-box adaptive domain decomposition methods

 

 

15:00

Luke Olson

UIUC 

BackgroundTBA

Opportunities in developing a more robust and scalable multigrid solver

 

15:30

Break

 

16:00

Marc Baboulin

INRIA 

Background

Using condition numbers to assess numerical quality in high-performance computing applications

 

Resilience & fault tolerance and simulation

Chair: Franck Cappello

16:30

Vincent Baudoui

Total & ANL

Joint Result

Round-off error and silent soft error propagation in exascale applications

 

17:00

Bogdan Nicolae

IBM

Joint Result

AI-Ckpt: Leveraging Memory Access Patterns for Adaptive Asynchronous Incremental Checkpointing

 

17:30

Martin Quinson

INRIA

Result

Improving Simulations of MPI Applications Using A Hybrid Network Model with Topology and Contention Support

 

18:00

Adjourn

 

 

 

 

 

19:00

Dinner

 

 

 

 

 

 

 

 

 

 

 

Workshop Day 2

Thursday June 13th

 

 

 

 

 

 

 

 

 

 

 

 

Programming Models (cont.)
Chair: Frederic Desprez

08:30

Jean-François Mehaut 

INRIA

Result

Progress in the European FP7 Mont-Blanc 1 project and objectives of its follow-up: Mont-Blanc 2

 

 

09:00

Rajeev Thakur

ANL

Background

TBA

 

 

09:30

Andra Ecaterina Hugo

INRIA

Results 

TBA

 

 

10:00

Celso Mendes

UIUC

Background

TBA

 

 

10:30

Break

 

 

 

 

Big Data, I/O, Visualization
Chair: Gabriel Antoniu

11:00

Dries Kimpe

ANL

Results

TBA

 

 

11:30

Gilles Fedak

INRIA

Result

Active Data: A Programming Model to Manage Data Life Cycle Across Heterogeneous Systems and Infrastructures

 

 

12:00

Matthieu Dorier

INRIA

Joint Result

Data Analysis of Ensemble Simulations: an In Situ Approach using Damaris

 

 

12:30

Ian Foster

ANL

Background

TBA

 

 

13:00

Lunch

 

 

 

 

 

 

 

 

 

 

 

Mini Workshop1

 

 

 

 

 

 

Resilience
Chair: Marc Snir 

14:00

Ana Gainaru

UIUC

Results

Failure prediction on Blue Waters

 

 

14:30

Xiang Ni

UIUC 

Results

TBA

 

 

15:00

Tatiana

INRIA & ANL

Result

TBA

 

 

15:30

Mohamed Slim Bouguerra

INRIA & ANL

Result

TBA

 

 

16:00

Break

 

 

 

 

 

16:30

Amina Guermouche

UVSQ

Result 

Multi-criteria Checkpointing Strategies: Response-time versus Resource Utilization

 

 

17:00

Thomas Ropars

EPFL

Result

TBA

 

 

17:30

Mehdi Diouri

INRIA 

Result

ECOFIT: A Framework to Estimate Energy Consumption of Fault Tolerance Protocols for HPC Applications

 

 

18:00

Adjourn

 

 

 

 

 

 

 

 

 

 

 

Mini Workshop2

 

 

 

 

 

 

Numerical Algorithms and Libraries
Chair:  Bill Gropp 

14:00

Laura Grigori 

INRIA

Result

TBA

 

 

14:30

Stefan Wild

ANL 

Result

TBA

 

 

15:00

Frederic Hecht

INRIA/P6

Result

TBA

 

 

15:30

Jed Brown

ANL

Result

TBA

 

 

16:00

Break

 

 

 

 

 

16:30

Yushan Wang

INRIA P11

Result

TBA

 

 

17:00

Jean Utke

ANL

Result

Designing and implementing a tool-independent, adjoinable MPI wrapper library

 

 

17:30

Laurent Hascoet

INRIA

Result

The adjoint of MPI one-sided communications

 

 

18:00

Adjourn

 

 

 

 

 

 

 

 

 

 

 

 

19:00

Banquet

 

 

Lyon

 

 

 

 

 

 

 

 

Workshop Day 3

Friday June 14th

 

 

 

 

 

 

 

 

 

 

 

 

Mini Workshop1 (cont.)

 

 

 

 

 

 

Resilience
Chair:  Franck Cappello

08:30

Di Sheng

INRIA

Result

TBA 

 

 

09:00

Guillaume Aupy

INRIA

Result

TBA

 

 

09:30

Discussion

 

 

 

 

 

10:00

Break

 

 

 

 

Mini Workshop3 

10:30

Guillaume Mercier

INRIA

Result

TBA

 

Programming and Scheduling 
Chair:  Rajeev Thakur

11:00

Vincent Lanore

INRIA

Result

TBA

 

 

11:30

Anne Benoit

INRIA

Result

Energy-efficient scheduling

 

 

12:00

François Tessier

INRIA

Result

TBA

 

 

12:30

Discussions

 

 

 

 

 

13:00

Closing and Lunch

 

 

 

 

 

 

 

 

 

 

 

Mini Workshop2 (cont.)

 

 

 

 

 

 

Numerical Algorithms and Libraries 
Chair:  Paul Hovland

08:30

François Pellegrini

INRIA

Result

Shared memory parallel algorithms in Scotch 6

 

 

09:00

Luc Giraud

INRIA 

Result

TBA

 

 

09:30

Discussions

 

 

 

 

 

10:00

Break

 

 

 

 

Mini Workshop4 

10:30

Kate Keahey

ANL 

Result

TBA

 

Clouds 
Chair:  Frederic Desprez

11:00

Gabriel Antoniu

INRIA

Result

TBA

 

 

11:30

Christian Perez

INRIA

Result

TBA

 

 

12:00

Eddy Caron

INRIA

Result

TBA

 

 

12:30

Discussions

 

 

 

 

 

13:00

Closing and Lunch

 

 

 

 

...

Future exascale computers will open up new perspectives in numerical simulation, but they will also experience more errors because of their massive scale. We will focus here on round-off errors and on silent soft errors, whose propagation needs to be studied in order to ensure the accuracy of results. Round-off errors arise from the finite precision of numerical calculations and can lead to a catastrophic loss of significant digits when they accumulate. We will discuss the limits of existing error bounds when facing large-scale problems. Soft hardware errors can also perturb computations by randomly flipping memory bits. Some of these errors are automatically corrected, but others can propagate silently through the calculations. We will present some strategies to determine the sensitive sections of an application as part of future research work.
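The two error sources named in this abstract can be illustrated with a minimal Python sketch. It is not taken from the talk: the helper names `naive_sum` and `flip_bit` and the chosen values are assumptions made for illustration only. The first part compares a plain left-to-right sum with the correctly rounded `math.fsum`; the second flips a single bit of an IEEE 754 double to mimic a silent soft error.

```python
import math
import struct

def naive_sum(values):
    """Left-to-right floating-point summation; round-off accumulates at each add."""
    total = 0.0
    for v in values:
        total += v
    return total

def flip_bit(x, bit):
    """Return the float64 obtained by flipping one bit of x (bit 0 = least significant)."""
    (as_int,) = struct.unpack("<Q", struct.pack("<d", x))
    (flipped,) = struct.unpack("<d", struct.pack("<Q", as_int ^ (1 << bit)))
    return flipped

# Round-off: one million additions of 0.1 drift away from the correctly rounded sum.
values = [0.1] * 1_000_000
roundoff_error = abs(naive_sum(values) - math.fsum(values))

# Soft error: the impact of a single flipped bit depends strongly on its position.
low_bit_flip = flip_bit(1.0, 0)    # lowest mantissa bit: 1.0 becomes 1.0 + 2**-52
high_bit_flip = flip_bit(1.0, 62)  # highest exponent bit: 1.0 becomes infinity

print(roundoff_error, low_bit_flip, high_bit_flip)
```

A flip in a low mantissa bit is smaller than typical round-off and may go unnoticed, while a flip in an exponent bit can corrupt the result entirely, which is one reason the sensitivity of individual code sections matters.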

Bogdan Nicolae
 

AI-Ckpt: Leveraging Memory Access Patterns for Adaptive Asynchronous Incremental Checkpointing

With the increasing scale and complexity of supercomputing and cloud computing architectures, faults are becoming a frequent occurrence, which makes reliability a difficult challenge. Although for some applications it is enough to restart failed tasks, there is a large class of applications where tasks run for a long time or are tightly coupled, making a restart from scratch infeasible. Checkpoint-Restart (CR), the main method to survive failures for such applications, faces additional challenges in this context: not only does it need to minimize the performance overhead that checkpointing imposes on the application, but it also needs to operate with scarce resources. To this end, this paper contributes a novel approach that leverages both current and past memory access patterns in order to optimize the order in which memory pages are flushed to stable storage during asynchronous checkpointing. Large-scale experiments show up to 60% improvement compared to state-of-the-art checkpointing approaches, achievable with an extra memory requirement of less than 5% of the total application memory.
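The idea of ordering page flushes by access history can be sketched as follows. This is a hypothetical heuristic written for illustration, not the algorithm from the paper: the function `flush_order`, the `previous_write_times` bookkeeping, and the rule "flush first the pages that were written earliest in the previous interval" are all assumptions.

```python
def flush_order(dirty_pages, previous_write_times):
    """Order dirty pages so those expected to be rewritten soonest are flushed first.

    previous_write_times maps a page id to the time of its first write during the
    previous checkpoint interval (hypothetical bookkeeping). Pages without any
    history are flushed last; ties are broken by page id for determinism.
    """
    never = float("inf")
    return sorted(dirty_pages, key=lambda p: (previous_write_times.get(p, never), p))

# Pages 7 and 1 were written early in the last interval, page 9 has no history.
order = flush_order({3, 7, 1, 9}, {7: 0.2, 1: 0.5, 3: 1.4})
print(order)
```

The intent of such an ordering is that a page is already safely on stable storage by the time the application writes to it again, so the asynchronous checkpoint interferes less with the running computation.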

 

Bill Gropp

Topics for Collaboration in Numerical Libraries

This talk will discuss some open problems in numerical libraries for extreme-scale systems, including issues facing some of the application teams that are currently using the Blue Waters sustained petascale system.


Luke Olson

Opportunities in developing a more robust and scalable multigrid solver

Multigrid methods have become more robust in recent years thanks to new algorithmic advances and theoretical developments. The result is a more robust multilevel framework with improved convergence for a wider range of non-elliptic problems. Yet many of these developments have not been adapted at scale despite their intended use, and many of the optimizations could be strengthened by considering high-performance computing architectures more directly. In this talk, we discuss a particular example of these recent optimizations in multigrid, defining optimal interpolation, that moves toward a more general framework, and we highlight some focused directions for collaboration in this respect. In addition, recent trends in high-throughput computing have motivated algorithmic changes in multigrid design. We will also highlight some directions to further advance multigrid solvers at scale, building on this work in collaboration through the Joint Lab.