Overview
Summary
- Cloud computing in environmental science has so far focused mainly on modeling, such as ocean climate modeling (Evangelinos et al. 2008) and groundwater modeling (Hunt et al. 2010).
- Cloud computing is also used for data analysis, for example to parallelize sequential data analysis tasks (Hasenkamp et al. 2010).
Workflow
Data
- Example: analyzing a 500GB repository took ~9 hrs on Magellan with 8 VM instances, versus 90-95 hrs on a single-processor workstation (Hasenkamp et al. 2010)
Cloud platform
- Amazon EC2 (Evangelinos et al. 2008, He et al. 2010)
- GoGrid (Hunt et al. 2010, He et al. 2010)
- FutureGrid with Nimbus and Eucalyptus (Fox et al. 2011)
- Azure (Eye on Earth)
- Magellan with Eucalyptus (Hasenkamp et al. 2010)
Issues/Gaps
- Cloud performance is limited by messaging performance (Evangelinos et al. 2008)
- Lack of robust script/batch programs (Hunt et al. 2010)
- Most public clouds are optimized for running business applications instead of HPC (He et al. 2010)
Running Coupled Atmosphere-Ocean Climate Models on Amazon's EC2 1
Summary
- A parallel scientific HPC application for coupled atmosphere-ocean climate modeling is tested on the Amazon EC2 platform.
- An on-demand EC2 cluster system is combined with a custom AMI software image.
Workflow
Data
- MPMD climate application
Cloud platform
Cloud performance
- The performance of Amazon EC2 is comparable to that of low-cost cluster systems
- The performance of Amazon EC2 is below that of supercomputer systems
Issues/Gaps
- Cloud performance is limited by messaging performance: latencies and bandwidths are one to two orders of magnitude worse than those of large computing center facilities.
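To make the messaging limitation concrete, the sketch below uses a simple latency-plus-bandwidth model of point-to-point message time. It is only an illustration: the model is a common textbook approximation rather than the method of Evangelinos et al., and every numeric value (the reference latency and bandwidth, and the degradation factor of 50) is a hypothetical placeholder, not a measurement from the paper.

```python
# Simple latency-bandwidth model of point-to-point message time.
# All numeric values below are hypothetical placeholders, not measurements.

def message_time(size_bytes: float, latency_s: float, bandwidth_bytes_per_s: float) -> float:
    """Estimated time to send one message: fixed latency plus transfer time."""
    return latency_s + size_bytes / bandwidth_bytes_per_s

# Hypothetical reference interconnect at a large computing center.
hpc_latency = 2e-6       # seconds (assumed)
hpc_bandwidth = 1e9      # bytes per second (assumed)

# Hypothetical cloud network, worse by the same factor in latency and bandwidth.
factor = 50.0            # assumed; the source only says "one to two orders of magnitude"
cloud_latency = hpc_latency * factor
cloud_bandwidth = hpc_bandwidth / factor

for size in (8, 1024, 1024 * 1024):
    t_hpc = message_time(size, hpc_latency, hpc_bandwidth)
    t_cloud = message_time(size, cloud_latency, cloud_bandwidth)
    print(f"{size:>8} B: hpc={t_hpc:.2e} s, cloud={t_cloud:.2e} s, ratio={t_cloud / t_hpc:.1f}")
```

Under this model, when latency and bandwidth degrade by the same factor, every message slows down by that factor regardless of its size, which is why tightly coupled applications that exchange many messages are affected most.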
Using a cloud to replenish parched groundwater modeling efforts 2
Summary
- Parameter estimation in a highly parameterized context has high computational costs. Cloud computing can supply the capacity needed for such parameter estimation, which can improve groundwater models.
Workflow
Data
- The files and model executables are uploaded to the cloud server through the Windows remote desktop protocol (RDP)
Cloud platform
Cloud performance
Comparison of Runtimes of One Model Run:
Computer | Time (s) |
---|---|
Q6700 Core 2 Quad (2.66 GHz) | 85 |
GoGrid Cloud virtual machine | 81 |
Xeon (3.0 GHz) | 73 |
Q9650 Core 2 Quad (3.0 GHz) | 71 |
i7 (3.33 GHz) | 58 |
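The runtimes above are easier to compare as ratios. The following sketch, which uses only the values from the table, expresses each runtime relative to the fastest machine; the variable names are illustrative.

```python
# Runtimes of one model run, in seconds, copied from the table above.
runtimes_s = {
    "Q6700 Core 2 Quad (2.66 GHz)": 85,
    "GoGrid Cloud virtual machine": 81,
    "Xeon (3.0 GHz)": 73,
    "Q9650 Core 2 Quad (3.0 GHz)": 71,
    "i7 (3.33 GHz)": 58,
}

# Express each runtime as a multiple of the fastest one.
fastest = min(runtimes_s.values())
relative = {name: t / fastest for name, t in runtimes_s.items()}

for name, ratio in relative.items():
    print(f"{name:<32} {runtimes_s[name]:>3} s  {ratio:.2f}x")
```

On these numbers the GoGrid virtual machine takes about 1.4 times as long as the fastest desktop and is slightly faster than the slowest one, so a single cloud instance sits within the range of the local workstations.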
Issues/Gaps
- One current limitation of cloud computing is the lack of robust scripts/batch programs that automatically clone, load, and launch slaves while accounting for the dynamic nature of computing resources on the cloud
Measured Characteristics of FutureGrid Clouds for Scalable Collaborative Sensor-Centric Grid Applications 3
Summary
As a heterogeneous distributed cloud infrastructure, FutureGrid can be used to support the study of large-scale, collaborative sensor-centric applications that have stringent real-time and quality of service requirements.
Workflow
Three experiments are performed:
- Network-level measurement
- Message-level measurement
- Application-level measurement
Data
Cloud platform
- FutureGrid: IaaS
- FutureGrid is oriented towards developing tools and technologies rather than providing production computational capacity
- FutureGrid is an infrastructure comprising approximately 4,000 cores at six sites:
- Indiana University
- University of Chicago
- San Diego Supercomputing Center
- University of Florida
- Purdue University
- Texas Advanced Computing Center
- Nimbus
- Eucalyptus
Cloud performance
Issues/Gaps
Case study for running HPC applications in public clouds 4
Summary
Classical HPC benchmarks are run on different cloud platforms to assess how well these platforms support HPC applications.
Workflow
- NPB: NAS Parallel Benchmark for evaluating the performance of parallel supercomputers
- HPL: High Performance LINPACK
- CSFV: Cubed-Sphere Finite-Volume dynamic core
Data
Cloud platform
Three public clouds are tested:
- Amazon EC2
- GoGrid
- IBM Cloud: Intel Nehalem X5570 @ 2.93GHz, only 32-bit Linux OS available
Cloud performance
Issues/Gaps
- Most public clouds are optimized for running business applications instead of HPC
- Networking of cloud must be upgraded for HPC applications
- DoE Magellan will use Infiniband
- Penguin Computing will use GigE or Infiniband
- Current clouds require special approval to launch 20+ servers
- Typical cluster size is ~10 nodes, whereas HPC applications can deploy on 100+ nodes.
- Dynamic IP
- Will test the benchmark on NASA's Nebula
Finding Tropical Cyclones on a cloud computing cluster 5
Summary
- TSTORMS is run on virtual machines against a remote 500GB repository of climate simulation data.
- Parallel virtualization
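One common way to parallelize a sequential analysis is to split the input files into independent chunks and run the same sequential tool on each chunk. The sketch below illustrates that pattern only and is not the implementation from Hasenkamp et al.: `analyze_file` is a hypothetical stand-in for a single TSTORMS run, the file names are invented, and local threads stand in for virtual machine instances.

```python
from concurrent.futures import ThreadPoolExecutor

def partition(items, n_workers):
    """Round-robin split of items into n_workers lists."""
    return [items[i::n_workers] for i in range(n_workers)]

def analyze_file(path):
    """Hypothetical stand-in for one sequential analysis run on one file."""
    return f"analyzed:{path}"

def run_chunk(paths):
    """Process one chunk sequentially, as a single worker would."""
    return [analyze_file(p) for p in paths]

def analyze_in_parallel(paths, n_workers):
    """Run the sequential analysis on each chunk concurrently and merge results."""
    chunks = partition(paths, n_workers)
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        per_chunk = list(pool.map(run_chunk, chunks))
    return [result for chunk in per_chunk for result in chunk]

# Hypothetical file names for illustration.
files = [f"sim_{year}.nc" for year in range(2000, 2010)]
results = analyze_in_parallel(files, n_workers=4)
print(len(results))
```

Because the chunks share no state, the analysis code itself needs no changes; only the distribution of inputs and the collection of outputs are added around it.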
Workflow
- TSTORMS: find tropical storms in climate simulation output data
Data
Cloud platform
- Eucalyptus on Magellan Scientific Cloud: each node has dual quad-core Intel Nehalem 2.66GHz processors and 24GB RAM
- Comparison: Grid Laboratory of Wisconsin (GLOW) at University of Wisconsin
Cloud performance
- Analyzing 500GB repository
- Single-processor workstation with similar computational power: 90-95 hrs
- On Carver: 8 processes, 3 hrs
- On Magellan: 8 virtual machine instances, ~9 hrs; 30 virtual machine instances, ~4.5 hrs (reducing total analysis time by a factor of ~21)
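The speedup figures can be checked with simple arithmetic. The sketch below takes the runtimes reported above and divides the single-workstation baseline (90-95 hrs) by each parallel runtime; the function and variable names are illustrative.

```python
def speedup(baseline_hours: float, parallel_hours: float) -> float:
    """Ratio of baseline runtime to parallel runtime."""
    return baseline_hours / parallel_hours

# Reported single-processor workstation range, in hours.
baseline_low, baseline_high = 90.0, 95.0

# Reported parallel runtimes, in hours.
parallel_runs = {
    "Carver, 8 processes": 3.0,
    "Magellan, 8 VM instances": 9.0,
    "Magellan, 30 VM instances": 4.5,
}

for label, hours in parallel_runs.items():
    low = speedup(baseline_low, hours)
    high = speedup(baseline_high, hours)
    print(f"{label:<28} {low:.1f}x to {high:.1f}x")
```

With 30 virtual machine instances, the ratio comes out at 20 to about 21, which matches the reported factor of ~21 when the 95-hour end of the baseline range is used.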
Issues/Gaps
- Job coordination and data movements can be built into a management system as a part of a cloud computing ecosystem
Scientific Gateway and Visualization Tool 6
Summary
Workflow
Data
Cloud platform
Cloud performance
Issues/Gaps
Eye on Earth with Azure 7
Summary
- The Eye on Earth application is based on Windows Azure and Bing Maps.
- Citizens can view or contribute to water and air quality data anywhere in Europe.
Workflow
Data
Cloud platform
Cloud performance
Issues/Gaps
HiClouds: Taming Clouds on the Clouds 8
Summary
- Proposes the HiCloud framework to enable large-scale, high-resolution weather forecasting on clouds spanning multiple supercomputer sites with over 2500 CPUs.
Workflow
Data
Cloud platform
Cloud performance
Issues/Gaps
References
- Evangelinos, C. et al. Cloud Computing and Its Applications, Chicago (2008)
- Hunt, R.J. et al. Ground Water 48(3), 360-365 (2010)
- Fox, G. et al. CTS (2011)
- He, Q. et al. HPDC (2010)
- Hasenkamp, D. et al. CloudCom (2010)
- Pajorova, E. et al. Lecture Notes in Computer Science (2011)
- Eye on Earth built on Azure
- Zhao, H. et al. IEEE CCGrid (2009)
- Zinn, D. et al. WORKS (2010)
- Ramakrishnan, L. et al. (2010)
1 Comment
Yong Liu
Eye on Earth on Windows Azure:
http://www.eyeonearth.eu/