Overview

Summary

  • Cloud computing in environmental science has focused mainly on modeling, such as ocean climate modeling (Evangelinos et al. 2008) and groundwater modeling (Hunt et al. 2010).
  • Cloud computing is also used for data analysis, for example to parallelize sequential data analysis tasks (Hasenkamp et al. 2010).

Workflow

Data

  • Example: analyzing a 500 GB repository took about 9 hrs on Magellan with 8 VM instances, versus 90-95 hrs on a single-processor workstation (Hasenkamp et al. 2010)

Cloud platform

Issues/Gaps

  • Cloud performance is limited by messaging performance (Evangelinos et al. 2008)
  • Lack of robust script/batch programs (Hunt et al. 2010)
  • Most public clouds are optimized for running business applications instead of HPC (He et al. 2010)

Running Coupled Atmosphere-Ocean Climate Models on Amazon's EC2 1

Summary

  • A parallel scientific HPC application for coupled atmosphere-ocean climate modeling is tested on the Amazon EC2 platform.
  • An on-demand EC2 cluster is combined with a custom AMI software image.

Workflow

Data

  • MPMD climate application

Cloud platform

Cloud performance

  • The performance of Amazon EC2 is comparable to that of low-cost cluster systems
  • The performance of Amazon EC2 is below that of supercomputer systems

Issues/Gaps

  • Cloud performance is limited by messaging performance: latencies and bandwidths are one to two orders of magnitude worse than at large computer center facilities.
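
To make the messaging limitation concrete, the sketch below uses a simple latency-plus-bandwidth cost model for one message and shows how the communication share of a model time step grows when latency and bandwidth degrade by one or two orders of magnitude. All numeric values (baseline latency, bandwidth, compute time, message count and size) are hypothetical and chosen only for illustration; they are not measurements from Evangelinos et al.

```python
def message_time(size_bytes, latency_s, bandwidth_bytes_per_s):
    """Time to send one message: fixed latency plus transfer time."""
    return latency_s + size_bytes / bandwidth_bytes_per_s


def step_time(compute_s, n_messages, size_bytes, latency_s, bandwidth_bytes_per_s):
    """Time for one model step: local compute plus all message exchanges."""
    per_message = message_time(size_bytes, latency_s, bandwidth_bytes_per_s)
    return compute_s + n_messages * per_message


# Hypothetical baseline values (assumptions, not from the paper).
BASE_LATENCY_S = 2e-6        # assumed interconnect latency
BASE_BANDWIDTH = 1e9         # assumed bandwidth in bytes per second
COMPUTE_S = 0.05             # assumed compute time per step
N_MESSAGES = 100             # assumed messages per step
MESSAGE_BYTES = 64 * 1024    # assumed message size

# Degrade latency and bandwidth together by 1x, 10x and 100x.
for factor in (1, 10, 100):
    total = step_time(
        COMPUTE_S,
        N_MESSAGES,
        MESSAGE_BYTES,
        BASE_LATENCY_S * factor,
        BASE_BANDWIDTH / factor,
    )
    comm_share = 1.0 - COMPUTE_S / total
    print(f"{factor:>3}x worse network: step {total:.4f} s, "
          f"communication share {comm_share:.0%}")
```

Under these assumed values the compute time stays fixed while the communication term scales with the degradation factor, which is why a tightly coupled application is more sensitive to the network than to raw CPU speed.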

Using a cloud to replenish parched groundwater modeling efforts 2

Summary

  • Parameter estimation in a highly parameterized context has high computational costs. Cloud computing can be used to run parameter estimation, which can improve groundwater models.

Workflow

Data

  • The files and model executables are uploaded to the cloud server through the Windows remote desktop protocol (RDP)

Cloud platform

Cloud performance

Comparison of runtimes for one model run:

Computer                        Time (s)
Q6700 Core 2 Quad (2.66 GHz)    85
GoGrid Cloud virtual machine    81
Xeon (3.0 GHz)                  73
Q9650 Core 2 Quad (3.0 GHz)     71
i7 (3.33 GHz)                   58
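
To read the runtimes relative to the cloud virtual machine, the short Python sketch below divides each runtime by the GoGrid figure. The runtimes are copied from the comparison above; the function name `relative_to` is ours and purely illustrative.

```python
# Runtimes in seconds for one model run, copied from the comparison above.
runtimes_s = {
    "Q6700 Core 2 Quad (2.66 GHz)": 85,
    "GoGrid Cloud virtual machine": 81,
    "Xeon (3.0 GHz)": 73,
    "Q9650 Core 2 Quad (3.0 GHz)": 71,
    "i7 (3.33 GHz)": 58,
}


def relative_to(baseline, table):
    """Return each runtime divided by the baseline runtime (1.0 = equal)."""
    base = table[baseline]
    return {name: round(seconds / base, 2) for name, seconds in table.items()}


ratios = relative_to("GoGrid Cloud virtual machine", runtimes_s)
for name, ratio in ratios.items():
    print(f"{name:<30} {ratio:.2f}x the cloud VM runtime")
```

A ratio below 1.0 means the machine finished a single run faster than the cloud virtual machine.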

Issues/Gaps

  • One current limitation of cloud computing is the lack of robust scripts/batch programs that automatically clone, load, and launch slaves while accounting for the dynamic nature of computing resources on the cloud

Measured Characteristics of FutureGrid Clouds for Scalable Collaborative Sensor-Centric Grid Applications 3

Summary

As a heterogeneous distributed cloud infrastructure, FutureGrid can be used to support the study of large-scale, collaborative sensor-centric applications that have stringent real-time and quality of service requirements.

Workflow

Three experiments are performed:

  • Network-level measurement
  • Message-level measurement
  • Application-level measurement

Data

Cloud platform

  • FutureGrid: IaaS
  • FutureGrid is oriented towards developing tools and technologies rather than providing production computational capacity
  • FutureGrid is an infrastructure comprising approximately 4,000 cores at six sites:
    • Indiana University
    • University of Chicago
    • San Diego Supercomputing Center
    • University of Florida
    • Purdue University
    • Texas Advanced Computing Center
  • Nimbus
  • Eucalyptus

Cloud performance

Issues/Gaps

Case study for running HPC applications in public clouds 4

Summary

Classical HPC benchmarks are run on different cloud platforms to assess how well those platforms support HPC applications.

Workflow

  • NPB: NAS Parallel Benchmark for evaluating the performance of parallel supercomputers
  • HPL: High Performance LINPACK
  • CSFV: Cubed-Sphere-Finite-Volume dynamic core

Data

Cloud platform

Three public clouds are tested:

  • Amazon EC2
  • GoGrid
  • IBM Cloud: Intel Nehalem X5570 @ 2.93 GHz; only a 32-bit Linux OS is available

Cloud performance

Issues/Gaps

  • Most public clouds are optimized for running business applications instead of HPC
  • Cloud networking must be upgraded for HPC applications
    • DoE Magellan will use InfiniBand
    • Penguin Computing will use GigE or InfiniBand
  • Current clouds require special approval to launch 20 or more servers
  • Typical cloud cluster size is about 10 nodes, whereas HPC applications can deploy 100+ nodes
  • Dynamic IP addresses
  • Planned future work: test the benchmarks on NASA's Nebula

Finding Tropical Cyclones on a cloud computing cluster 5

Summary

  • TSTORMS is run in virtual machines against a remote 500 GB repository of climate simulation data.
  • Parallel virtualization

Workflow

  • TSTORMS: find tropical storms in climate simulation output data
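
One common way to parallelize a sequential analysis like this is to split the list of input files across independent workers, each running the unmodified program on its own share. The Python sketch below shows a round-robin split. It is only an illustration of the general idea: the file names are hypothetical, and the notes here do not describe how Hasenkamp et al. actually distributed the data.

```python
def partition(items, n_workers):
    """Split items into n_workers lists whose sizes differ by at most one."""
    if n_workers < 1:
        raise ValueError("n_workers must be at least 1")
    # Worker i takes items i, i + n_workers, i + 2 * n_workers, ...
    return [items[i::n_workers] for i in range(n_workers)]


# Hypothetical file names standing in for climate simulation output.
files = [f"sim_output_{i:04d}.nc" for i in range(20)]

chunks = partition(files, 8)
for worker_id, chunk in enumerate(chunks):
    print(f"worker {worker_id}: {len(chunk)} files")
```

Each chunk could then be handed to one virtual machine instance, so no change to the sequential analysis code itself is needed.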

Data

Cloud platform

  • Eucalyptus on Magellan Scientific Cloud: each node has dual quad-core Intel Nehalem 2.66 GHz processors and 24 GB RAM
  • Comparison: Grid Laboratory of Wisconsin (GLOW) at University of Wisconsin

Cloud performance

  • Analyzing 500GB repository
    • Single-processor workstation with similar computational power: 90-95 hrs
    • On Carver: 8 processes, 3 hrs
    • On Magellan: 8 virtual machine instances, ~9 hrs; 30 virtual machine instances, ~4.5 hrs (total analysis time reduced by a factor of ~21)
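
The speedup factors implied by these runtimes can be checked with a few lines of arithmetic. The Python sketch below uses only the figures listed above; the function name `speedup` is ours.

```python
def speedup(serial_hours, parallel_hours):
    """Ratio of the serial runtime to the parallel runtime."""
    return serial_hours / parallel_hours


# Reported single-processor workstation range, in hours.
SERIAL_HOURS = (90.0, 95.0)

# Reported parallel runtimes, in hours.
PARALLEL_RUNS = {
    "Carver, 8 processes": 3.0,
    "Magellan, 8 VM instances": 9.0,
    "Magellan, 30 VM instances": 4.5,
}

for label, hours in PARALLEL_RUNS.items():
    low = speedup(SERIAL_HOURS[0], hours)
    high = speedup(SERIAL_HOURS[1], hours)
    print(f"{label}: {low:.1f}x to {high:.1f}x faster")
```

For the 30-instance Magellan run this gives roughly 20x to 21x, consistent with the factor of ~21 quoted above.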

Issues/Gaps

  • Job coordination and data movements can be built into a management system as a part of a cloud computing ecosystem

Scientific Gateway and Visualization Tool 6

Summary

Workflow

Data

Cloud platform

Cloud performance

Issues/Gaps

Eye on Earth with Azure 7

Summary

  • The Eye on Earth application is based on Windows Azure and Bing Maps.
  • Citizens can view or contribute to water and air quality data anywhere in Europe.

Workflow

Data

Cloud platform

Cloud performance

Issues/Gaps

HiClouds: Taming Clouds on the Clouds 8

Summary

  • Proposes the HiCloud framework to enable large-scale, high-resolution weather forecasting on clouds spanning multiple supercomputer sites with over 2,500 CPUs.

Workflow

Data

Cloud platform

Cloud performance

Issues/Gaps

References

  1. Evangelinos, C. et al. Cloud Computing and Its Application Chicago (2008)
  2. Hunt, R.J. et al. Ground water 48 3, 360-365 (2010)
  3. Fox, G. et al. CTS (2011)
  4. He, Q. et al. HPDC (2010)
  5. Hasenkamp, D. et al. CloudCom (2010)
  6. Pajorova, E. et al. Lecture Notes in Computer Science (2011)
  7. Eye on Earth built on Azure. http://www.eyeonearth.eu/
  8. Zhao, H. et al. IEEE CCGrid (2009)
  9. Zinn, D. et al. WORKS (2010)
  10. Ramakrishnan, L. et al. (2010)