...
Main Topics | Schedule | Speakers | Types of presentation | Topic | Download |
Sunday Nov. 18th | Dinner | Giordano's | http://www.giordanos.com/ |
| |
Workshop Day 1 (Room 1416, TCS conference center) | Monday Nov. 19th |
|
|
| |
| 07:30-08:30 | Transportation: Guest House to TCS (building 240) |
| (Entrance of the conference center) |
|
| 08:00 | Continental Breakfast and Registration |
| Food available in Room 1407, Lunch seating in room 1416 (second half) |
|
Welcome and Introduction | 08:30 | Franck Cappello, INRIA & UIUC; Marc Snir, ANL | Opening | Welcome, formal opening and workshop details |
|
| 08:40 | Marc Snir | Opening | ANL presentation and vision of the collaboration |
|
| 08:50 | Bill Gropp | Opening | UIUC/NCSA update and vision of the collaboration |
|
| 09:00 | Frederic Desprez | Opening | INRIA update on HPC strategy and vision of the collaboration |
|
Big Apps, Big DATA - Big I/O | 09:15 | Robert Jacob | Trends in HPC | Climate simulation at extreme scale | |
| 09:45 | Rob Ross, ANL | Trends in HPC | Trends in HPC I/O and File systems |
|
| 10:15 | Break |
|
|
|
| 10:45 | Rob Pennington, NCSA | Trends in HPC | Big Data | |
| 11:15 | Andrew Chien, ANL | Potential collaboration | Presto/Blockus: Towards a Scalable R Programming System | |
| 11:45 | Matthieu Dorier, INRIA | Joint Results | I/O and in-situ visualization: recent results with the Damaris approach | |
| 12:15 | Lunch |
|
|
|
Programming Models/Runtime chair: Sanjay Kale | 13:30 | Wen-Mei Hwu, UIUC | TBA | Scalability, Performance, and Numerical Stability of Many-core GPU Algorithms - A Case Study of Tri-diagonal Solvers | |
| 14:00 | Pavan Balaji, ANL | Potential collaboration | MPI3 and Unified Runtime | |
| 14:30 | Andra Hugo, Raymond Namyst, INRIA | Potential collaboration | Composing multiple StarPU applications over heterogeneous machines: a supervised approach | |
| 15:00 | Jean-François Mehaut, INRIA | Potential collaboration | Optimizations for modern NUMA |
|
| 15:30 | Break |
|
|
|
Numerical algorithms and Methods | 16:00 | Stefan Wild, ANL | Potential collaboration | Numerical optimization for "automatic" tuning of codes | |
16:30 | Laura Grigori | Results | Communication avoiding | ||
| 17:00 | Bill Gropp, UIUC | Results | Hybrid Scheduling | |
| 17:30 | Laurent Hascoet, INRIA | Early Results | The Data-Dependence graph of Adjoint Codes | |
18:00 | Adjourn |
| |||
19:00 | Dinner | Jameson's |
| ||
Workshop Day 2 (Main room) | Tuesday Nov. 20th |
|
Big Systems | 08:30 | Pete Beckman, ANL | Trends | New Directions in Extreme-Scale Operating Systems and Runtime Software |
|
| 09:00 | Bill Kramer, UIUC/NCSA | Trends | Blue Waters update |
|
Cloud | 09:30 | Ian Foster, ANL | Potential collaboration | Big Process for Big Data | |
| 10:00 | Christine Morin, INRIA | Potential collaboration | Work in Progress on Cloud Computing in Myriads Team and Contrail European Project | |
| 10:30 | Break |
|
|
|
11:00 | Frederic Desprez, INRIA | Potential collaboration TBA | Workflow Allocations and Scheduling on IaaS Platforms, from Theory to Practice | ||
Resilience: | 11:30 | Yves Robert | Early Result | Performance modeling of checkpointing under failure prediction | |
| 12:00 | Rinku Gupta, ANL | Potential collaboration | CIFTS: An infrastructure for coordinated and comprehensive system-wide fault tolerance. |
|
| 12:30 | Ana Gainaru, UIUC | Early Results | Coupling failure prediction, proactive and preventive checkpoint for current production HPC systems. |
|
| 13:00 | Lunch |
| Food buffet in Room 1407, Lunch seating in room 1416 (second half) |
|
| Parallel Sessions |
|
Mini workshop on Numerical libraries | 08:30 | ||||
| 09:00 | Bill Gropp, UIUC | Potential collaboration | TBA | |
| 09:30 | Laura Grigori, INRIA | Potential collaboration | TBA | |
| 10:00 | Break | TBA | ||
| 10:30 | Anshu Dubey, ANL | Potential collaboration | Optimizing Scientific Codes While Retaining Portability |
|
| 11:00 | Discussion |
|
|
|
| 12:00 | Adjourn |
|
|
|
| 13:00 | Lunch |
|
| Parallel Sessions |
|
Mini workshop on Performance Modeling and simulation | 14:30 | Sanjay Kale, UIUC | Early Results | BigSim |
|
| 15:00 | Arnaud Legrand, INRIA |
| SimGrid for HPC |
|
| 15:30 | Torsten Hoefler, ETH | Early Results | Performance Modeling for Parallel Software Development and Tuning |
|
| 16:00 | Break |
|
|
|
| 16:30 | Timo Schneider, ETH | Early Results | Optimization Principles for Collective Neighborhood Communications |
|
| 17:00 | Discussion |
|
|
|
| 18:00 | Adjourn |
|
|
|
| 19:00 | Dinner | Maggiano's | http://www.maggianos.com/EN/Oak-Brook_Oak-Brook_IL/Pages/LocationLanding.aspx?AspxAutoDetectCookieSupport=1 |
|
Mini workshop on Cloud | 14:30 | Kate Keahey, ANL | Potential collaboration | Infrastructure Outsourcing in Multi-Cloud Environment |
|
| 15:00 | Narayan Desai, ANL | Potential collaboration | Building Clouds for Technical Computing |
|
| 15:30 | Jonathan Rouzaud, INRIA | Potential collaboration | Provisioning Virtual Machines in Federated Clouds |
|
| 16:00 | Break |
|
|
|
| 16:30 | Michael Wilde | Potential collaboration | Swift: simpler parallel programming for cloud and HPC domains http://www.ci.uchicago.edu/swift (Swift for clouds and clusters) |
|
| 17:00 | Discussion |
|
|
|
| 18:00 | Adjourn |
|
|
|
| 19:00 | Dinner | Maggiano's | http://www.maggianos.com/EN/Oak-Brook_Oak-Brook_IL/Pages/LocationLanding.aspx?AspxAutoDetectCookieSupport=1 |
|
Workshop Day 3 (Main room) | Wednesday Nov. 21st |
|
| Parallel Sessions |
|
Mini workshop on Programming models/runtime | 08:30 | Emmanuel Jeannot, INRIA | Results | TBA |
|
09:00 | Sanjay Kale, UIUC | Charm++ update |
| ||
09:30 | Christian Perez, INRIA |
| TBA |
| |
10:00 | Break |
|
| ||
10:30 | Jim Dinan |
| A One-Sided View of HPC: Global-View Models and Portable Runtime Systems |
| |
11:00 | Sebastien Fourestier | Potential collaboration | Parallel repartitioning and re-mapping in Scotch |
| |
| 11:30 | Discussion |
|
|
|
| 12:30 | Closing |
|
|
|
| 13:00 | Lunch |
|
Mini workshop on Resilience | 08:30 | Mohamed Slim Bouguerra | Result | TBA |
|
| 09:00 | Amina Guermouche, INRIA | Result | Unified Model for Assessing Checkpointing Protocols at Extreme-Scale |
|
| 09:30 | Bogdan Nicolae, IBM | Result | I-Ckpt: Leveraging memory access patterns and inline collective deduplication to improve scalability of CR |
|
| 10:00 | Break |
|
|
|
| 10:30 | Tatiana Martsinkevich, INRIA | Result | Fully distributed recovery for send-determinism applications |
|
| 11:00 | Peter Brune, ANL | Trends | Multilevel Resiliency for PDE Simulations |
|
| 11:30 | Xiang Ni, Esteban Meneses | Results | Scalable in-memory checkpoint with automatic restart on failure |
|
| 12:00 | Discussion |
|
| |
| 12:30 | Closing |
|
|
|
| 13:00 | Lunch |
| Boxed Lunches |
|
...
In this talk, I will describe a multi-cloud auto-scaling service that enables the user to leverage "computational power on tap" provided by infrastructure clouds, i.e., allows the user to easily deploy resources across multiple private, community, and commercial clouds; provides high availability in that it allows users to replace failed resources; and scales to demand. The policies governing scaling are customizable based on system and application-specific indicators. We will describe the service architecture and implementation and discuss results obtained in the sustained deployment and management of thousands of virtual machines on EC2.
Frederic Desprez
Workflow Allocations and Scheduling on IaaS Platforms, from Theory to Practice
Many scientific applications are described through workflow structures. Due to the increasing level of parallelism offered by modern computing infrastructures, workflow applications now have to be composed not only of sequential programs, but also of parallel ones. Cloud platforms bring on-demand resource provisioning and a pay-as-you-go billing model, so the execution of a workflow corresponds to a certain budget. The current work addresses the problem of resource allocation for non-deterministic workflows under budget constraints. We present a way of transforming the initial problem into sub-problems that have been studied before. We propose two new allocation algorithms that are capable of determining resource allocations under budget constraints, and we present ways of using them to address the problem at hand. Finally, we present a first implementation of a workflow management system based on DIET/Phantom/Nimbus of the FutureGrid platform for the Ramses workflow, an application for n-body simulations of dark matter interactions.