...
Main Topics | Schedule | Speakers | Types of presentation | Topic | Download |
Sunday Nov. 18th | Dinner | Giordano's | http://www.giordanos.com/ |
| |
Workshop Day 1 (Room 1416, TCS conference center) | Monday Nov. 19th |
|
|
| |
| 07:30-08:30 | Transportation: Guest House to TCS (building 240) |
| (Entrance of the conference center) |
|
| 08:00 | Continental Breakfast and Registration |
| Food available in Room 1407, Lunch seating in room 1416 (second half) |
|
Welcome and Introduction | 08:30 | Franck Cappello, INRIA & UIUC, Marc Snir ANL | Opening | Welcome, formal opening and workshop details |
|
| 08:40 | Marc Snir | Opening | ANL presentation and vision of the collaboration |
|
| 08:50 | Bill Gropp | Opening | UIUC/NCSA update and vision of the collaboration |
|
| 09:00 | Frederic Desprez | Opening | INRIA update on HPC strategy and vision of the collaboration |
|
Big Apps, Big DATA - Big I/O | 09:15 | Robert Jacob | Trends in HPC | Climate simulation at extreme scale | |
| 09:45 | Rob Ross, ANL | Trends in HPC | Trends in HPC I/O and File systems |
|
| 10:15 | Break |
|
|
|
| 10:45 | Rob Pennington, NCSA | Trends in HPC | Big Data | |
| 11:15 | Andrew Chien, ANL | Potential collaboration | Presto/Blockus: Towards a Scalable R Programming System | |
| 11:45 | Matthieu Dorier, INRIA | Joint Results | I/O and in-situ visualization: recent results with the Damaris approach | |
| 12:15 | Lunch |
|
|
|
Programming Models/Runtime chair: Sanjay Kale | 13:30 | Wen-Mei Hwu, UIUC | TBA | Scalability, Performance, and Numerical Stability of Many-core GPU Algorithms - A Case Study of Tri-diagonal Solvers | |
| 14:00 | Pavan Balaji, ANL | Potential collaboration | MPI3 and Unified Runtime | |
| 14:30 | Andra Hugo, Raymond Namyst, INRIA | Potential collaboration | Composing multiple StarPU applications over heterogeneous machines: a supervised approach | |
| 15:00 | Jean-François Mehaut, INRIA | Potential collaboration | Optimizations for modern NUMA |
|
| 15:30 | Break |
|
|
|
Numerical algorithms and Methods | 16:00 | Barry Smith, ANL | Trend | Performance Issues in DOE PDE Simulations | |
16:30 | Laura Grigori | Results | Communication avoiding | ||
| 17:00 | Bill Gropp, UIUC | Results | Hybrid Scheduling | |
| 17:30 | Laurent Hascoet, INRIA | Early Results | The Data-Dependence graph of Adjoint Codes | |
18:00 | Adjourn |
| |||
19:00 | Dinner | Jameson's |
| ||
|
|
|
|
|
|
Workshop Day 2 (Main room) | Tuesday Nov. 20th |
|
|
|
|
|
|
|
|
|
|
Big Systems | 08:30 | Pete Beckman, ANL | Trends | New Directions in Extreme-Scale Operating Systems and Runtime Software |
|
| 09:00 | Bill Kramer, UIUC/NCSA | Trends | Blue Waters update |
|
Cloud | 09:30 | Ian Foster, ANL | Potential collaboration | Big Process for Big Data | |
| 10:00 | Christine Morin, INRIA | Potential collaboration | Contrail | |
| 10:30 | Break |
|
|
|
11:00 | Frederic Desprez, INRIA | Potential collaboration | TBA | ||
Resilience: | 11:30 | Yves Robert | Early Result | Performance modeling of checkpointing under failure prediction | |
| 12:00 | Rinku Gupta, ANL | Potential collaboration | CIFTS: An infrastructure for coordinated and comprehensive system-wide fault tolerance. |
|
| 12:30 | Ana Gainaru, UIUC | Early Results | Coupling failure prediction, proactive and preventive checkpoint for current production HPC systems. |
|
| 13:00 | Lunch |
| Food buffet in Room 1407, Lunch seating in room 1416 (second half) |
|
|
|
|
| Parallel Session |
|
Mini workshop on Numerical libraries | 08:30 | Stefan Wild, ANL | Potential collaboration | Numerical optimization for "automatic" tuning of codes | |
| 09:00 | Bill Gropp, UIUC | Potential collaboration | TBA | |
| 09:30 | Laura Grigori, INRIA | Potential collaboration | TBA | |
| 10:00 | Break | TBA | ||
| 10:30 | Anshu Dubey, ANL | Potential collaboration | Optimizing Scientific Codes While Retaining Portability |
|
| 11:00 | Discussion |
|
|
|
| 12:00 | Adjourn |
|
|
|
| 13:00 | Lunch |
|
|
|
|
|
|
| Parallel Sessions |
|
Mini workshop on Performance Modeling and simulation | 14:30 | Sanjay Kale, UIUC | Early Results | BIG SIM |
|
| 15:00 | Arnaud Legrand, INRIA |
| SimGrid for HPC |
|
| 15:30 | Torsten Hoefler, ETH | Early Results | Performance Modeling for Parallel Software Development and Tuning |
|
| 16:00 | Break |
|
|
|
| 16:30 | Timo Schneider, ETH | Early Results | Optimization Principles for Collective Neighborhood Communications |
|
| 17:00 | Discussion |
|
|
|
| 18:00 | Adjourn |
|
|
|
| 19:00 | Dinner | Maggiano's | http://www.maggianos.com/EN/Oak-Brook_Oak-Brook_IL/Pages/LocationLanding.aspx?AspxAutoDetectCookieSupport=1 |
|
|
|
|
|
|
|
Mini workshop on Cloud | 14:30 | Kate Keahey, ANL | Potential collaboration | TBA |
|
| 15:00 | Narayan Desai, ANL | Potential collaboration | TBA |
|
| 15:30 | Jonathan Rouzaud, INRIA | Potential collaboration | Provisioning Virtual Machines in Federated Clouds |
|
| 16:00 | Break |
|
|
|
| 16:30 | Michael Wilde | Potential collaboration | Swift: simpler parallel programming for cloud and HPC domains http://www.ci.uchicago.edu/swift (Swift for clouds and clusters) |
|
| 17:00 | Discussion |
|
|
|
| 18:00 | Adjourn |
|
|
|
| 19:00 | Dinner | Maggiano's | http://www.maggianos.com/EN/Oak-Brook_Oak-Brook_IL/Pages/LocationLanding.aspx?AspxAutoDetectCookieSupport=1 |
|
|
|
|
|
|
|
Workshop Day 3 (Main room) | Wednesday Nov 21st |
|
|
|
|
|
|
|
| Parallel Sessions |
|
Mini workshop on Programming models/runtime | 08:30 | Emmanuel Jeannot, INRIA | Results | TBA |
|
09:00 | Sanjay Kale, UIUC | Charm++ update |
| ||
09:30 | Christian Perez, INRIA |
| TBA |
| |
10:00 | Break |
|
| ||
10:30 | Jim Dinan |
| A One-Sided View of HPC: Global-View Models and Portable Runtime Systems |
| |
11:00 | Sebastien Fourestier | Potential collaboration | Parallel repartitioning and re-mapping in Scotch |
| |
| 11:30 | Discussion |
|
|
|
| 12:30 | Closing |
|
|
|
| 13:00 | Lunch |
|
|
|
|
|
|
|
|
|
Mini workshop on Resilience | 08:30 | Mohamed Slim Bouguerra | Result | TBA |
|
| 09:00 | Amina Guermouche, INRIA | Result | Unified Model for Assessing Checkpointing Protocols at Extreme-Scale |
|
| 09:30 | Bogdan Nicolae, IBM | Result | I-Ckpt: Leveraging memory access patterns and inline collective deduplication to improve scalability of CR |
|
| 10:00 | Break |
|
|
|
| 10:30 | Tatiana Martsinkevich, INRIA | Result | Fully distributed recovery for send-determinism applications |
|
| 11:00 | Peter Brune, ANL | Trends | Multilevel Resiliency for PDE Simulations |
|
| 11:30 | Xiang Ni, Esteban Meneses | Results | Scalable in-memory checkpoint with automatic restart on failure |
|
| 12:00 | Discussion |
|
| |
| 12:30 | Closing |
|
|
|
| 13:00 | Lunch |
| Boxed Lunches |
|
...
This talk deals with the impact of fault prediction techniques on checkpointing strategies. We extend the classical analysis of Young and Daly to the presence of a fault prediction system, which is characterized by its recall and its precision, and which provides either exact or window-based time predictions. We succeed in deriving the optimal value of the checkpointing period (thereby minimizing the waste of resource usage due to checkpoint overhead) in all scenarios. These results allow us to analytically assess the key parameters that impact the performance of fault predictors at very large scale. In addition, the results of this analytical evaluation are corroborated by a comprehensive set of simulations, thereby demonstrating the validity of the model and the accuracy of the results.
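As background for the abstract above, the classical result it builds on can be sketched in a few lines. This is only Young's first-order approximation without any fault predictor; the talk's extended formulas for recall, precision, and prediction windows are not reproduced here. The checkpoint cost and MTBF values are hypothetical, chosen purely for illustration.

```python
import math


def young_period(checkpoint_cost, mtbf):
    """Young's first-order approximation of the optimal checkpoint period."""
    return math.sqrt(2.0 * checkpoint_cost * mtbf)


def first_order_waste(period, checkpoint_cost, mtbf):
    """First-order waste fraction for a given period.

    The first term is the checkpoint overhead per period; the second is the
    expected re-execution time, assuming a failure strikes on average in the
    middle of a period.
    """
    return checkpoint_cost / period + period / (2.0 * mtbf)


# Hypothetical values (assumptions, not taken from the talk).
C = 600.0     # checkpoint cost in seconds
MU = 86400.0  # platform mean time between failures in seconds

T_OPT = young_period(C, MU)
WASTE_AT_OPT = first_order_waste(T_OPT, C, MU)
```

Under this simplified model, checkpointing more or less often than `T_OPT` increases the waste, which is the trade-off the talk revisits once predictions are available.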
Torsten Hoefler
Performance Modeling for Parallel Software Development and Tuning
Scientific applications are commonly developed and used over the lifecycle of multiple parallel computing architectures. Despite all efforts to develop performance-portable parallel programming environments, several changes are often necessary to adapt the code to new architectures and systems. Performance modeling has been discussed as a viable tool to support all stages of the software development process of parallel applications and to support co-design of different layers. In this talk, we focus on the last and most expensive stage: the continuous porting and improvement phase. We show how to apply semi-analytic modeling techniques to understand the structure of large parallel applications and to pinpoint bottlenecks and viable targets for code improvements. To do this, we combine techniques for optimizing serial and parallel codes and we demonstrate several real-world application examples. We expect that our methodology can help to improve the effectiveness of the software development process for parallel computing.
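One way to read "semi-analytic" is that the shape of the model comes from understanding the code, while its coefficients are fitted to measurements. The sketch below illustrates that idea only; it is not the speaker's methodology. The model form T(p) = a + b/p, the function name, and the synthetic timings are all assumptions made for the example.

```python
def fit_serial_plus_parallel(procs, times):
    """Least-squares fit of the assumed model T(p) = a + b / p.

    a estimates the non-scaling (serial) part of the runtime and b the
    perfectly parallel part. The fit is ordinary linear regression of the
    measured times against x = 1 / p.
    """
    xs = [1.0 / p for p in procs]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_t = sum(times) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxt = sum((x - mean_x) * (t - mean_t) for x, t in zip(xs, times))
    b = sxt / sxx
    a = mean_t - b * mean_x
    return a, b


# Synthetic "measurements" generated from a = 2 s, b = 16 s (illustration only).
PROCS = [1, 2, 4, 8]
TIMES = [2.0 + 16.0 / p for p in PROCS]

A, B = fit_serial_plus_parallel(PROCS, TIMES)
```

A fitted `a` that is large relative to `b / p` at the target process count would point at the non-scaling part of the code as the bottleneck to investigate first.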
Timo Schneider
Optimization Principles for Collective Neighborhood Communications
Many scientific applications work in a bulk-synchronous mode of iterative communication and computation steps. Even though the communication steps happen at the same time, important patterns such as stencil computations cannot be expressed as collective communications in MPI. Neighborhood collective operations allow users to specify arbitrary collective communication relations at runtime and enable optimizations similar to those of traditional collective calls. We show a number of optimization opportunities and algorithms for different communication scenarios. We also show how users can assert additional constraints that provide new optimization opportunities in a portable way. Our communication and protocol optimizations result in a performance improvement of up to a factor of two for stencil communications. We found that our optimization heuristics can automatically generate communication schedules that are comparable to hand-tuned collectives. With those optimizations, we are able to accelerate arbitrary collective communication patterns, such as regular and irregular stencils, with optimization methods for collective communications.
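To make the idea of a neighborhood collective concrete, the following sketch simulates the data movement of a neighborhood all-to-all serially in plain Python. It is a model of the semantics only: it does not use MPI, and it performs none of the optimizations discussed in the talk. The function names and the ring-shaped stencil are assumptions chosen for illustration.

```python
def neighbor_alltoall(sources, destinations, send_buffers):
    """Serially simulate the data movement of a neighborhood all-to-all.

    sources[r]       ranks that rank r receives from, in receive order
    destinations[r]  ranks that rank r sends to, in send order
    send_buffers[r]  send_buffers[r][i] is sent to destinations[r][i]

    Returns recv_buffers, where recv_buffers[r][j] came from sources[r][j].
    Assumes each (sender, receiver) pair occurs at most once.
    """
    nranks = len(send_buffers)
    # Collect every message, keyed by its (sender, receiver) pair.
    in_flight = {}
    for rank in range(nranks):
        for slot, dest in enumerate(destinations[rank]):
            in_flight[(rank, dest)] = send_buffers[rank][slot]
    # Deliver each message into the slot matching the receiver's source list.
    return [
        [in_flight[(src, rank)] for src in sources[rank]]
        for rank in range(nranks)
    ]


def ring_stencil(nranks):
    """Periodic 1-D stencil: every rank talks to its left and right neighbor.

    Requires nranks >= 3 so that left and right are distinct ranks.
    """
    neighbors = [[(r - 1) % nranks, (r + 1) % nranks] for r in range(nranks)]
    return neighbors, neighbors


NRANKS = 4
SOURCES, DESTINATIONS = ring_stencil(NRANKS)
SEND = [[f"{r}->{d}" for d in DESTINATIONS[r]] for r in range(NRANKS)]
RECV = neighbor_alltoall(SOURCES, DESTINATIONS, SEND)
```

The point of the sketch is that the whole exchange is described once by the neighbor lists, so a library that sees the full pattern has room to schedule or combine messages, which a sequence of independent point-to-point calls would not expose.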