Main Topics

Schedule

            Speaker

Affiliation

Type of presentation

Title (tentative)

Download

 

 

 

 

 

 

 

Dinner Before the Workshop

7:00 PM

Only people registered for the dinner

 

 

 

 

 

 

 

 

 

 

 

Workshop Day 1

Monday Nov. 25th

 

 

 

 

 

 

 

 

 

 

TITLES ARE TEMPORARY (except if in bold font)

 

Registration

08:00

 

 

 

 

 

Welcome and Introduction

Amphitheatre

Chair: Franck

08:30

Marc Snir + Franck Cappello

INRIA&UIUC&ANL

Background

Welcome, Workshop objectives and organization

 

 

08:45

Peter Schiffer

UIUC

Background

Welcome from UIUC Vice Chancellor for Research

 

 

09:00

Ed Seidel

UIUC

Background

NCSA update and vision of the collaboration

 

 

09:15

Michel Cosnard

Inria

Background

INRIA updates and vision of the collaboration

 


09:30

Marc Snir

ANL

Background

Argonne updates and vision of the collaboration

 

 

09:45

Franck Cappello

ANL

Background

Joint-Lab, New Joint-Lab, PUF articulation

 

 

10:15

Break

 

 

 

 

Extreme Scale Systems and infrastructures

Amphitheatre

Chair: Marc Snir

10:45

Pete Beckman

ANL

 

Extreme Scale Computing & Co-design Challenges

 

 

11:15

John Towns

UIUC

 

Plenary talk

 
11:45

Gabriel Antoniu

INRIA

Plenary talk

 

12:15

Lunch

 

 

Plenary talk

 


13:45

Bill Kramer

UIUC

Blue Waters

BW Observations and new challenges

 


14:15

Marc Snir

UIUC

 

G8 ECS and international collaboration toward extreme scale

 

 

14:45

Rob Ross

ANL

 

Plenary talk

 
15:15

François Pellegrini

INRIA

Plenary talk

15:45

Break

 

16:15

Yves Robert

INRIA

 

Plenary talk

 
16:45

Wen-mei Hwu

UIUC

Plenary talk

 
17:15

Adjourn

 

18:45

Bus for Dinner

 

 

 

 

 

 

 

 

 

 

 

Workshop Day 2


Tuesday Nov. 26

 

 

 

 

 

Applications, I/O, Visualization, Big data

Amphitheatre

Chair: Rob Ross

08:30

Greg Bauer

UIUC

Plenary talk

 

 

09:00

Matthieu Dorier

INRIA

Joint-result, submitted

Plenary talk

 
 

09:30

Dries Kimpe

ANL

 

Plenary talk

 

 

10:00

Venkat Vishwanath

ANL

 

Plenary talk

 

 

10:30

Break

 

 

 

 

 

11:00

Babak Behzad

UIUC 

ACM/IEEE SC13

Plenary talk

 

 

11:30

McHenry, Kenton Guadron

UIUC

 

Plenary talk

 

 

12:00

Lunch

 

 


 

 

 

 

 

 

 

 

Mini Workshop1

Resilience

Room 1030

Chair: Yves Robert

 

 

 

 

 

 

 

13:30

Leonardo

ANL

Joint-result


 

 

14:00

Tatiana

INRIA

Joint-result


 

 

14:30

Mohamed Slim Bouguerra

INRIA

Joint-result, submitted


 

 

15:00

Ana Gainaru

UIUC

Joint-result, submitted


 

 

15:30

Break

 

 

 

 

 

16:00

Sheng Di

INRIA

Joint-result, submitted


 

 

16:30

Frederic Vivien

INRIA

 


 

 

17:00

Wesley Bland

ANL

 


 

 

17:30

Adjourn

 

 

 

 

 

19:00

Bus for Dinner

 

 

 

 

       

Mini Workshop2

Numerical Algorithms

Room 1040

Chair: Bill Gropp

 

 

 

 

 

 

 

13:30

Luke Olson

UIUC

 

  
14:00

Prasanna Balaprakash

ANL

 

14:30

Hushang

INRIA

 


 

 

15:00

Jed Brown

ANL

 

 

 

 

15:30

Break

 

 

 

 

 

16:00

Pierre Jolivet

INRIA

Best Student Paper nominee, ACM/IEEE SC13


 
16:30

Vincent Baudoui

Total&ANL

17:00

Stefan Wild

ANL

 

17:30

Adjourn

 

 

 

 

       

 

19:00

Bus for dinner

 

 

 

 

 

 

 

 

 

 

 

Workshop Day 3


Wednesday Nov. 27

 

 

 

 

 

 

 

 

 

 

 

 

Mini Workshop3


 

 

 

 

 

 

 Programming models, compilation and runtime.

Room 1030

Chair: Marc Snir

08:30

Grigori Fursin

INRIA

 

 

 

 

09:00

Maria Garzaran

UIUC

 


 


09:30

Jean-François Mehaut

INRIA

 


 
10:00

Break

 

10:30

Pavan Balaji

ANL

 


 

 

11:00

Rafael Tesser

INRIA

Joint result PDP 2013


 

 

11:30

Emmanuel Jeannot

INRIA

Joint-result, IEEE Cluster2013


 

 

12:00

Closing

 

 

 

 

 

12:30

Lunch

 

 

 

 

       

 

18:00

Bus for dinner

 

 

 

 

Mini Workshop4

Large scale systems and their simulators

Room 1040

Chair: Bill Kramer

 

 

 

 

 

 


08:30

Sanjay Kale

 

 


 

 

09:00

Arnaud Legrand

 

 


 


09:30

Kate Keahey

 

 


 

 

10:00

Break

 

 

 

 


10:30

Gilles Fedak

 

 


 

 

11:00

Jeremy Enos

 

 


 

 

11:30

TBD

 

 


 

 

12:00

Closing

 

 

 

 

 

12:30

Lunch

 

 

 

 

       
18:00

Bus for dinner

...

Emmanuel Jeannot, Esteban Meneses-Rojas, Guillaume Mercier, François Tessier and Gengbin Zheng

Communication and Topology-aware Load Balancing in Charm++ with TreeMatch

Abstract: Programming multicore or manycore architectures is a hard challenge, particularly if one wants to take full advantage of their computing power. Moreover, a hierarchical topology implies that communication performance is heterogeneous, and this characteristic should also be exploited. We developed two load balancers for Charm++ that take both aspects into account, depending on whether the application is compute-bound or communication-bound. This work is based on our TreeMatch library, which computes a process placement that reduces an application's communication cost based on the hardware topology. We show that the proposed load-balancing scheme improves execution times for both classes of parallel applications.
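The idea of placing heavily communicating processes close together in the hardware hierarchy can be illustrated with a minimal sketch. This is not the TreeMatch algorithm or the Charm++ load-balancer API: it assumes a hypothetical two-level machine with two cores per node, made-up `intra` and `inter` cost factors, and a simple greedy pairing of the heaviest-communicating processes.

```python
from itertools import combinations

def comm_cost(placement, comm, cores_per_node, intra=1, inter=10):
    """Total cost of a placement (process -> core index).

    Assumed cost model: traffic between two processes is weighted by `intra`
    when they share a node and by `inter` otherwise.
    """
    cost = 0
    for i, j in combinations(range(len(comm)), 2):
        same_node = placement[i] // cores_per_node == placement[j] // cores_per_node
        cost += comm[i][j] * (intra if same_node else inter)
    return cost

def greedy_pairing(comm):
    """Co-locate the heaviest-communicating pairs first (2 cores per node).

    Assumes an even number of processes so that every process gets paired.
    """
    n = len(comm)
    pairs = sorted(combinations(range(n), 2), key=lambda p: -comm[p[0]][p[1]])
    placement, next_core = {}, 0
    for i, j in pairs:
        if i not in placement and j not in placement:
            placement[i], placement[j] = next_core, next_core + 1
            next_core += 2
    return [placement[p] for p in range(n)]

# Symmetric communication matrix: processes 0<->2 and 1<->3 exchange the most data.
COMM = [
    [0, 1, 100, 1],
    [1, 0, 1, 100],
    [100, 1, 0, 1],
    [1, 100, 1, 0],
]
```

With this matrix, the identity placement `[0, 1, 2, 3]` costs 2022 under the assumed cost model, while `greedy_pairing(COMM)` returns `[0, 2, 1, 3]`, which costs 240 because both heavy pairs now share a node.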

...

CALCioM: Mitigating I/O Interferences in HPC Systems through Cross-Application Coordination

Unmatched computation and storage performance in new HPC systems has led to a plethora of I/O optimizations, ranging from application-side collective I/O to network and disk-level request scheduling on the file system side. As we deal with ever larger machines, the interference produced by multiple applications concurrently accessing a shared parallel file system becomes a major problem. Interference often breaks single-application I/O optimizations, dramatically degrading application I/O performance and, as a result, lowering machine-wide efficiency.
This talk will focus on CALCioM, a framework that aims to mitigate I/O interference through the dynamic selection of appropriate scheduling policies. CALCioM allows several applications running on a supercomputer to communicate and coordinate their I/O strategy in order to avoid interfering with one another. In this work, we examine four I/O strategies that can be accommodated in this framework: serializing, interrupting, interfering and coordinating. Experiments on Argonne’s BG/P Surveyor machine and on several clusters of the French Grid’5000 show how CALCioM can be used to efficiently and transparently improve the scheduling strategy between two otherwise interfering applications, given specified metrics of machine-wide efficiency.
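Why the best strategy depends on the chosen efficiency metric can be seen in a toy model of two of the four strategies. The sketch below is not CALCioM code: the function names are hypothetical, and it assumes an idealized file system whose bandwidth `bw` is split evenly between concurrent writers, which understates real interference.

```python
def interfering(a, b, bw):
    """Both applications write at once, each getting bw/2 until one finishes.

    a, b: data volumes written by applications A and B; bw: total bandwidth.
    Returns the I/O completion times (t_A, t_B).
    """
    small = min(a, b)
    t_small = 2 * small / bw      # the smaller write finishes at half speed
    t_large = (a + b) / bw        # the file system is busy until all data is written
    return (t_small, t_large) if a <= b else (t_large, t_small)

def serializing(a, b, bw):
    """A writes first at full bandwidth; B waits for A, then writes."""
    return a / bw, (a + b) / bw
```

For `a=10`, `b=30`, `bw=10`, interfering gives completion times `(2.0, 4.0)` and serializing with A first gives `(1.0, 4.0)`: the last write finishes at the same time in both cases, but the smaller application finishes sooner when it goes first. Serializing in the opposite order (swap the arguments) delays the small application to 4.0, so a coordination layer needs a metric to choose between the options.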


Babak Behzad

Taming Parallel I/O Complexity with Auto-Tuning

We present an auto-tuning system for optimizing I/O performance of HDF5 applications and demonstrate its value across platforms, applications, and at scale. The system uses genetic algorithms to search a large space of tunable parameters and to identify effective settings at all layers of the parallel I/O stack. The parameter settings are applied transparently by the auto-tuning system via dynamically intercepted HDF5 calls. To validate our auto-tuning system, we applied it to three I/O benchmarks (VPIC, VORPAL, and GCRM) that replicate the I/O activity of their respective applications. We tested the system with different weak-scaling configurations (128, 2048, and 4096 CPU cores) that generate 30 GB to 1 TB of data, and executed these configurations on diverse HPC platforms (Cray XE6, IBM BG/P, and Dell Cluster). In all cases, the auto-tuning framework identified tunable parameters that substantially improved write performance over default system settings. We consistently demonstrate I/O write speedups between 2x and 100x for test configurations.
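A genetic search over tunable I/O parameters can be sketched as follows. This is an illustrative toy, not the system described in the abstract: the parameter names and ranges in `SPACE` are hypothetical, and `synthetic_bandwidth` stands in for an actual benchmark run, which in practice would be the expensive step.

```python
import random

# Hypothetical tunable parameters; real names and ranges depend on the I/O stack.
SPACE = {
    "stripe_count": [4, 8, 16, 32, 64],
    "stripe_size_mb": [1, 4, 16, 64],
    "cb_nodes": [1, 2, 4, 8, 16],
}

def synthetic_bandwidth(cfg):
    """Stand-in for a benchmark run; peaks at an arbitrarily chosen configuration."""
    return -(abs(cfg["stripe_count"] - 32)
             + abs(cfg["stripe_size_mb"] - 16)
             + abs(cfg["cb_nodes"] - 8))

def tune(fitness, space, generations=20, pop_size=8, mutation_rate=0.2, seed=0):
    """Genetic search: keep the better half, refill by crossover and mutation."""
    rng = random.Random(seed)
    keys = list(space)
    pop = [{k: rng.choice(space[k]) for k in keys} for _ in range(pop_size)]
    history = []                                  # best fitness per generation
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        history.append(fitness(pop[0]))
        parents = pop[: pop_size // 2]            # selection (elitist)
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            child = {k: rng.choice((a[k], b[k])) for k in keys}   # crossover
            for k in keys:                                        # mutation
                if rng.random() < mutation_rate:
                    child[k] = rng.choice(space[k])
            children.append(child)
        pop = parents + children
    return max(pop, key=fitness), history
```

Calling `best, history = tune(synthetic_bandwidth, SPACE)` returns the best configuration found and the best score per generation. Because the better half of each population is carried over unchanged, the scores in `history` never decrease.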


Yves Robert, ENS Lyon, INRIA & Univ. Tenn. Knoxville

Assessing the impact of ABFT & Checkpoint composite strategies
 

Algorithm-specific fault-tolerant approaches promise unparalleled scalability and performance in failure-prone environments. With the advances in the theoretical and practical understanding of algorithmic traits enabling such approaches, a growing number of frequently used algorithms (including all widely used factorization kernels) have been proven capable of such properties. These algorithms provide a temporal section of the execution when the data is protected by its own intrinsic properties, and can be algorithmically recomputed without the need of checkpoints. However, while typical scientific applications spend a significant fraction of their execution time in library calls that can be ABFT-protected, they interleave sections that are difficult or even impossible to protect with ABFT. As a consequence, the only fault-tolerance approach that is currently used for these applications is checkpoint/restart. In this talk, we propose a model and a simulator to investigate the behavior of a composite protocol that alternates between ABFT and checkpoint/restart protection for effective protection of each phase of an iterative application composed of ABFT-aware and ABFT-unaware sections. We show that this approach drastically increases the performance delivered by the system, especially at scale, by providing means to reduce the frequency of checkpoints while simultaneously decreasing the volume of data that needs to be saved in them.
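The benefit of smaller checkpoints can be illustrated with the classical first-order approximation due to Young, which is a swapped-in textbook formula and not the model proposed in the talk. The sketch assumes a checkpoint cost `C`, a platform mean time between failures `mtbf`, and ignores downtime and recovery cost; the function names are hypothetical.

```python
import math

def young_period(checkpoint_cost, mtbf):
    """Young's first-order checkpointing period: T = sqrt(2 * C * MTBF)."""
    return math.sqrt(2 * checkpoint_cost * mtbf)

def waste(checkpoint_cost, mtbf):
    """First-order fraction of time lost at Young's period.

    C / T accounts for time spent writing checkpoints, and T / (2 * MTBF) for
    work lost on failure (on average half a period is re-executed).
    """
    period = young_period(checkpoint_cost, mtbf)
    return checkpoint_cost / period + period / (2 * mtbf)
```

With `mtbf=10000` seconds, a checkpoint cost of 50 seconds gives a period of 1000 seconds and a waste of about 0.10. Cutting the checkpoint cost to 12.5 seconds, for instance because less data has to be saved, gives a period of 500 seconds and a waste of about 0.05.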