...

Main Topics	Schedule	Speaker	Affiliation	Type of presentation	Title (tentative)	Download

Dinner Before the Workshop	7:30 PM	Only people registered for the dinner			Valpré hotel

Workshop Day 1	Wednesday June 12th
					TITLES ARE TEMPORARY (except if in bold font)
Registration	08:00
Welcome and Introduction Amphitheatre	08:30	Marc Snir + Franck Cappello	INRIA&UIUC&ANL	Background	Welcome, Workshop objectives and organization	Opening-9th-Workhsop.ppt
	08:45	Bill Kramer	UIUC	Background	NCSA updates and vision of the collaboration	Kramer-Joint Lab Workshop - 20130612-v3.pdf
	09:00	Marc Snir	ANL	Background	ANL updates and vision of the collaboration	aics-130612.pptx intro for Lyon.pdf
	09:15	Frederic Desprez	Inria	Background	INRIA updates and vision of the collaboration	Desprez-HPC@Inria-JLPC-0613.pdf
Big systems Chair: Christian Perez	9:30	Bill Kramer	UIUC	Background	Update on BlueWaters	BW Overview - Inria-Illinois Joint Workshop June 2013-v1.pdf
	10:00	Break
	10:30	Mitsuhisa Sato	U. Tsukuba & AICS	Background	AICS and the K computer	aics-130612.pptx
CANCELED	11:00	Paul Gibbon	Juelich	Background	Meeting the Exascale Challenge at the Juelich Supercomputing Centre.
Resilience&fault tolerance and simulation Chair: Franck Cappello	11:00	Marc Snir	ANL&UIUC	Report	ICIS report on Resilience	UWM resilience.pdf
	11:30	Vincent Baudoui	Total & ANL	Joint-Results	Round-off error and silent soft error propagation in exascale applications	Lyon_12_juin_2013_Error_propagation_in_exascale_applications_Vincent_Baudoui.pdf
	12:00	Lunch
Numerical Algorithms Chair: Frederic Desprez	13:30	Bill Gropp	UIUC	Background	Topics for Collaboration in Numerical Libraries	libraries-gropp-final.pdf
	14:00	Paul Hovland	ANL	Background	Argonne strategic plan in applied math
	14:30	Marc Baboulin	INRIA	Background	Using condition numbers to assess numerical quality in high-performance computing applications	baboulin.pdf
	15:00	Luke Olson	UIUC	Background	Opportunities in developing a more robust and scalable multigrid solver	201306_JointLab.pdf
	15:30	Break
	16:00	Frederic Nataf	INRIA&P6	Background	Toward black-box adaptive domain decomposition methods	talkJLPC20130612.pdf
Resilience&fault tolerance and simulation Chair: Franck Cappello	16:30	Bogdan Nicolae	IBM	Joint Result	AI-Ckpt: Leveraging Memory Access Patterns for Adaptive Asynchronous Incremental Checkpointing	AICkpt-9thJLPC.pdf
	17:00	Martin Quinson	INRIA	Result	Improving Simulations of MPI Applications Using A Hybrid Network Model with Topology and Contention Support	JLPC-simgrid-smpi.pdf
	17:30	Adjourn
	18:45	Bus for dinner

Workshop Day 2	Thursday June 13th

Programming Models Chair: Frederic Desprez	08:30	Jean-François Mehaut	INRIA	Result	Progress in the European FP7 Mont-Blanc 1 project and objectives of its follow-up: Mont-Blanc 2
	09:00	Rajeev Thakur	ANL	Background	Update on MPI and OS/R Activities at Argonne	Rajeev.pdf
	09:30	Andra Ecaterina Hugo	INRIA	Results	Composing multiple StarPU applications over heterogeneous machines: a supervised approach	ahugo_Composability_StarPU.pdf
	10:00	Celso Mendes	UIUC	Background	Dynamic Load Balancing for Weather Models via AMPI	AMPI-BRAMS-JointLab2013.pdf
	10:30	Break
Big Data, I/O, Visualization Chair: Kate Keahey	11:00	Dries Kimpe	ANL	Results	Triton: Exascale Storage
	11:30	Gilles Fedak	INRIA	Result	Active Data: A Programming Model to Manage Data Life Cycle Across Heterogeneous Systems and Infrastructures	active-data-fedak.pdf
	12:00	Matthieu Dorier	INRIA	Joint Result	Data Analysis of Ensemble Simulations: an In Situ Approach using Damaris	DORIER-JLPC-06-2013.pdf
	12:30	Ian Foster	ANL	Background	Compiler optimization for distributed dynamic data flow programs
	13:00	Lunch

Mini Workshop1 Amphitheatre
Resilience Chair: Marc Snir	14:00	Ana Gainaru	UIUC	Results	Challenges in predicting failures on the Blue Waters system.	againaru (1).pdf
	14:30	Xiang Ni	UIUC	Results	ACR: Automatic Checkpoint/Restart for Soft and Hard Error Protection.	JLPC_workshop_Xiang.pdf
	15:00	Tatiana Martsinkevich	INRIA & ANL	Result	On the feasibility of message logging in hybrid hierarchical FT protocols	Martsinkevich message logging.pdf
	15:30	Mohamed Slim Bouguerra	INRIA & ANL	Result	Investigating the probability distribution of false negative failure alerts in HPC systems	Slim_jointlab_icpp_presentation_v0.pdf
	16:00	Break
	16:30	Amina Guermouche	UVSQ	Result	Multi-criteria Checkpointing Strategies: Response-time versus Resource Utilization	AminaGuermouche.pdf
	17:00	Thomas Ropars	EPFL	Result	Towards efficient replication of HPC applications to deal with crash failures	Limited access
	17:30	Mehdi Diouri	INRIA	Result	ECOFIT: A Framework to Estimate Energy Consumption of Fault Tolerance Protocols for HPC Applications	MehdiDiouri.pdf
	18:00	Adjourn

Mini Workshop2 Room: Saint Maur
Numerical Algorithms and Libraries Chair: Bill Gropp	14:00	Jean Utke	ANL	Result	Designing and implementing a tool-independent, adjoinable MPI wrapper library	JointLabLyon.pdf
	14:30	Laurent Hascoet	INRIA	Result	The adjoint of MPI one-sided communications
	15:00	Stefan Wild	ANL	Result	Loud computations? Noise in iterative solvers	wild (1).pdf
	15:30	Jed Brown	ANL	Result	Vectorization, communication aggregation, and reuse in stochastic and temporal dimensions	20130613-JointLab.pdf
	16:00	Break
	16:30	Yushan Wang	INRIA P11	Result	Accelerating incompressible fluid flows simulations using SIMD or GPU computing	Jointlab_Lyon.pdf
	17:00	Frederic Hecht	INRIA/P6	Result	FreeFem++, a user language to solve PDE.	ff-lyon-2013.pdf
	18:00	Adjourn

	18:45	Bus for dinner			Lyon

Workshop Day 3	Friday June 14th

Mini Workshop1 (cont.) Room: Les essarts
Resilience Chair: Franck Cappello	08:30	Di Sheng	INRIA	Result	Optimization of Google Cloud Task Processing with Checkpoint-Restart Mechanism	Lyon-workshop-sdi.ppt
	09:00	Guillaume Aupy	INRIA	Result	On the Combination of Silent Error Detection and Checkpointing	G-aupy-silent-errors.pdf
Mini Workshop3	09:30	Guillaume Mercier	INRIA	Result	Topology Management and MPI Implementations Improvements	JointLab9.pdf
	10:00	Break
Programming and Scheduling Chair: Rajeev Thakur	10:30	Vincent Lanore	INRIA	Result	Static 2D FFT adaptation through a component model based on Charm++	vlanore_jointlab.pdf
	11:00	Anne Benoit	INRIA	Result	Energy-efficient scheduling	BenoitAnne.pdf
	11:30	François Tessier	INRIA	Result	Communication-aware load balancing with TreeMatch in Charm++	Tessier_JLPC13.pdf
	12:00	Closing
	12:30	Lunch

Mini Workshop2 (cont.) Room: Saint Maur
Numerical Algorithms and Libraries Chair: Paul Hovland	08:30	François Pellegrini	INRIA	Result	Shared memory parallel algorithms in Scotch 6	inria-uiuc_20130614.pdf
	09:00	Abdou Guermouche	INRIA	Result	Towards resilient parallel linear Krylov solvers
Mini Workshop4	09:30	Kate Keahey	ANL	Result	Research Topics and Collaboration Opportunities in the Nimbus Team
Clouds Chair: Frederic Desprez	10:00	Break
	10:30	Jonathan Rouzaud-Cornabas	CNRS&INRIA	Result	SimGrid Cloud Broker: Simulation of Public and Private Clouds	sgcb_pres.pdf
	11:00	Christian Perez	INRIA	Result	On Component Models to Deploy Application on Clouds	130614_Cloud_Component (1).pdf
	11:30	Eddy Caron	INRIA	Result	Seed4C: Secured Embedded Element and Data privacy for Cloud Federation
	12:00	Closing
	12:30	Lunch

...

Adaptation algorithms for HPC applications can improve performance, but their implementation is often costly in terms of development and maintenance. Component models such as Gluon++, which is built on top of Charm++, propose to separate the business code, encapsulated in components, from the application structure, expressed through a component assembly. Adaptation of component-based HPC applications can be achieved through the optimization of the assembly. We have studied such an approach with the adaptation to network topology and data size of a Gluon++ 2D FFT application. In this talk, we present our work thus far and comment on preliminary experimental results on the Grid'5000 platform.

Stefan Wild

Loud computations? Noise in iterative solvers

Roundoff errors, discretizations, numerical solutions to systems of equations, and adaptive techniques can destroy the smoothness of processes underlying computations at scale. Such computational noise complicates optimization, sensitivity analysis, and other applications that depend on the simulation output. We present a method for analyzing computational noise and illustrate the insights it enables on a collection of problems based on Krylov solvers.

Guillaume Aupy

On the Combination of Silent Error Detection and Checkpointing

...

In this talk we introduce the design of a secure federated cloud from end to end. We discuss the core of this platform, based on a High Performance Computing middleware that uses federated clouds and other virtual resources as well as classic HPC resources. We propose an architecture to ensure a high level of security from personal devices to the targeted virtual machine. The Seed4C platform improves security in each layer. With DIET Cloud, we are able to deploy a large-scale, distributed and secure HPC platform that spans a large pool of resources aggregated from different providers in a secure way.

Jonathan Rouzaud-Cornabas

SimGrid Cloud Broker: Simulation of Public and Private Clouds

...

Distributed, dynamic data flow is an execution model well-suited for many large-scale parallel applications, particularly scientific simulations and analysis pipelines running on large, distributed-memory clusters. In this paper we describe compiler optimization techniques and an intermediate representation for distributed dynamic data flow programs. These techniques are applied to Swift/T, a high-level declarative language that allows flexible data flow composition of functions written in other programming languages such as C or Fortran. We show that compiler optimization can reduce communication overhead by 70-93% on distributed memory systems, making the high-level language competitive with hand-coded coordination logic for certain common application styles.

Christian Perez

On Component Models to Deploy Application on Clouds

Clouds have become a complex ecosystem, providing many kinds of virtual machines (with different capabilities), of usage (on demand, spot instances, reservation), of data storage, etc. Moreover, some clouds provide worldwide "regions", enabling large-scale distributed applications. Users also have very different requirements, potentially varying from one execution to another, such as minimizing execution time, respecting budget constraints, etc. Therefore, automatically and efficiently deciding how to map an application to a set of VMs is a difficult challenge. This talk will discuss how the European PaaSage project as well as the French ANR MapReduce are using component models to describe and map an application structure, independently of any cloud, to an actual cloud.

Kate Keahey

Research Topics and Collaboration Opportunities in the Nimbus Team

The advent of IaaS cloud computing promises acquisition and management of customized on-demand resources. What is the best way to leverage those resources? What new applications are emerging in this context? How will they change our work patterns? What new technical approaches need to be developed to support them? What new opportunities will they lead to? In this talk, I will describe tools the Nimbus team is developing, among others, in the context of the Ocean Observatory Initiative project, that focus on answering these questions. I will describe our approach and tools, the problems we are trying to address, as well as the interaction patterns associated with scientific applications currently driving our approach.

Abdou Guermouche

Towards resilient parallel linear Krylov solvers

The advent of exascale machines will require the use of parallel resources at an unprecedented scale, probably leading to a high rate of hardware faults. High Performance Computing (HPC) applications that aim at exploiting all these resources will thus need to be resilient, i.e., able to compute a correct solution even in the presence of faults. In this work, we investigate possible remedies in the framework of the solution of large sparse linear systems, which is often the innermost numerical kernel in many scientific and engineering applications and also one of the most time-consuming parts. More precisely, we present recovery followed by restarting strategies in the framework of Krylov subspace solvers, where lost entries of the iterate are interpolated to define a new initial guess before restarting. In particular, we consider two interpolation policies that preserve key numerical properties of well-known solvers. We assess the impact of the recovery method, the fault rate and the number of processors on the robustness of the resulting linear solvers. We consider experiments with CG, GMRES and Bi-CGStab.
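
The recover-then-restart idea in this abstract can be illustrated with a small sketch. The abstract does not spell out its two interpolation policies, so the rule used here is an assumption: the lost entries x_I are rebuilt by solving the local system A[I,I] x_I = b_I - A[I,J] x_J from the surviving entries x_J, and CG is then restarted from the repaired iterate. The helper names (matvec, cg, interpolate_lost, laplacian_1d) and the 1D Laplacian test problem are hypothetical and chosen only for illustration.

```python
# Illustrative sketch only: pure-Python CG with a simulated fault and an
# interpolation-based restart. The recovery rule is an assumption, not
# necessarily one of the policies studied in the talk.

def matvec(A, x):
    return [sum(a * xj for a, xj in zip(row, x)) for row in A]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def cg(A, b, x0, tol=1e-10, max_iter=1000):
    """Conjugate gradient for a symmetric positive definite matrix A."""
    x = list(x0)
    r = [bi - yi for bi, yi in zip(b, matvec(A, x))]
    p = list(r)
    rs = dot(r, r)
    for _ in range(max_iter):
        if rs ** 0.5 <= tol:          # stop on small residual norm
            break
        Ap = matvec(A, p)
        alpha = rs / dot(p, Ap)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, Ap)]
        rs_new = dot(r, r)
        p = [ri + (rs_new / rs) * pi for ri, pi in zip(r, p)]
        rs = rs_new
    return x

def interpolate_lost(A, b, x, lost):
    """Rebuild lost entries by solving A[I,I] x_I = b_I - A[I,J] x_J (assumed policy)."""
    lost = sorted(lost)
    lost_set = set(lost)
    kept = [j for j in range(len(x)) if j not in lost_set]
    A_II = [[A[i][j] for j in lost] for i in lost]
    rhs = [b[i] - sum(A[i][j] * x[j] for j in kept) for i in lost]
    # A principal submatrix of an SPD matrix is SPD, so CG applies locally too.
    x_I = cg(A_II, rhs, [0.0] * len(lost))
    repaired = list(x)
    for i, value in zip(lost, x_I):
        repaired[i] = value
    return repaired

def laplacian_1d(n):
    """Dense 1D Laplacian (tridiagonal 2, -1), a standard SPD test matrix."""
    return [[2.0 if i == j else (-1.0 if abs(i - j) == 1 else 0.0)
             for j in range(n)] for i in range(n)]

n = 20
A = laplacian_1d(n)
x_true = [float(i % 5) for i in range(n)]
b = matvec(A, x_true)

x = cg(A, b, [0.0] * n, max_iter=5)   # a few iterations before the fault
lost = list(range(5, 10))             # entries held by the "failed" process
for i in lost:
    x[i] = 0.0                        # simulate the data loss
x = interpolate_lost(A, b, x, lost)   # recovery step
x = cg(A, b, x)                       # restart from the repaired iterate
error = max(abs(xi - ti) for xi, ti in zip(x, x_true))
```

In this sketch the restart discards the Krylov basis built before the fault and keeps only the repaired iterate as the new initial guess, which is the restarting behaviour the abstract describes.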
