Overview

Summary

Workflow

Data

Cloud platform

Issues/Gaps

A study of cost and performance of the application of cloud computing to Astronomy 1

The performance of three workflow applications with different I/O, memory, and CPU requirements is investigated on Amazon EC2 and compared with that of a typical HPC system (Abe at NCSA).
The goal is to determine which types of scientific workflow applications run cheaply and efficiently on the Amazon EC2 cloud.
The application of cloud computing to the generation of an atlas of periodograms for 210,000 light curves is also described.

Part I - Performance of three workflow applications

Summary
  • For CPU-bound applications, virtualization overhead on Amazon EC2 is generally small.
  • The resources offered by EC2 are generally less powerful than those available in HPC, particularly for I/O-bound applications.
  • Amazon EC2 offers no cost benefit over locally hosted storage, but does eliminate local maintenance and energy costs, and does offer high-quality, reliable storage.
  • As a result, commercial clouds may not be best suited for large-scale computations (see note c).
Cloud platform
  • Cloud platform: Amazon EC2 (http://aws.amazon.com/ec2/)
  • Workflow applications (see note a)
    Three different workflow applications are chosen.
    • Montage (http://montage.ipac.caltech.edu) from astronomy: 10,429 tasks, read 4.2 GB of input data, and produced 7.9 GB of output data.
      I/O-bound because it spends more than 95% of its time waiting on I/O operations
    • Broadband (http://scec.usc.edu/research/cme) from seismology: 320 tasks, read 6 GB of input data, and produced 160 MB of output data.
      Memory-limited because more than 75% of its runtime is consumed by tasks requiring more than 1 GB of physical memory.
    • Epigenome (http://epigenome.usc.edu) from biochemistry: 81 tasks, read 1.8 GB of input data, and produced 300 MB of output data.
      CPU-bound because it spends 99% of its runtime in the CPU and only 1% on I/O and other activities.
  • Methods
    The experiments were all run on single nodes to provide an unbiased comparison of the performance of workflows on Amazon EC2 and Abe.
Cloud performance
  1. Montage (I/O-bound)
    Processing on abe.lustre is nearly three times faster than on the fastest EC2 machines (see note b).
  2. Broadband (Memory-bound)
    The processing advantage of the parallel file system largely disappears, and abe.local's performance is only 1% better than that of c1.xlarge.
    For memory-intensive applications, Amazon EC2 can achieve nearly the same performance as Abe.
  3. Epigenome (CPU-bound)
    The parallel file system in Abe provides no processing advantage for Epigenome. The machines with the most cores gave the best performance for this CPU-bound application.
Issues/Gaps
  • Storage Cost: Cost to store VM images in S3 and cost of storing input data in EBS.
    The table below summarizes the monthly storage cost.

    Application | Input Volume | Monthly Storage Cost
    Montage     | 4.3 GB       | $0.66
    Broadband   | 4.1 GB       | $0.66
    Epigenome   | 1.8 GB       | $0.26

  • Transfer cost: Amazon EC2 charges $0.10 per GB for transfer into the cloud and $0.17 per GB for transfer out of the cloud.
    The data size and transfer costs are summarized in the tables below.
    Data transfer size per workflow on Amazon EC2

    Application | Input    | Output   | Logs
    Montage     | 4,291 MB | 7,970 MB | 40 MB
    Broadband   | 4,109 MB | 159 MB   | 5.5 MB
    Epigenome   | 1,843 MB | 299 MB   | 3.3 MB

    Costs of transferring data into and out of the EC2 cloud

    Application | Input | Output | Logs   | Total
    Montage     | $0.42 | $1.32  | <$0.01 | $1.75
    Broadband   | $0.40 | $0.03  | <$0.01 | $0.43
    Epigenome   | $0.18 | $0.05  | <$0.01 | $0.23
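The transfer costs can be reproduced from the data sizes and the two per-GB rates quoted in the text. The following is a minimal sketch, not taken from the paper; it assumes 1 GB = 1024 MB and that logs are billed at the transfer-out rate, two assumptions that happen to reproduce the tabulated totals. The function and constant names are illustrative.

```python
MB_PER_GB = 1024  # assumption: binary gigabytes (this choice matches the table)
RATE_IN = 0.10    # USD per GB transferred into the cloud (from the text)
RATE_OUT = 0.17   # USD per GB transferred out of the cloud (from the text)

# (input MB, output MB, log MB) per workflow, copied from the size table
SIZES_MB = {
    "Montage": (4291, 7970, 40),
    "Broadband": (4109, 159, 5.5),
    "Epigenome": (1843, 299, 3.3),
}


def transfer_cost(input_mb, output_mb, logs_mb):
    """Return (input, output, logs, total) transfer cost in USD, unrounded."""
    cost_in = input_mb / MB_PER_GB * RATE_IN
    cost_out = output_mb / MB_PER_GB * RATE_OUT
    # Assumption: logs leave the cloud, so they are billed at the outbound rate.
    cost_logs = logs_mb / MB_PER_GB * RATE_OUT
    return cost_in, cost_out, cost_logs, cost_in + cost_out + cost_logs


if __name__ == "__main__":
    for name, sizes in SIZES_MB.items():
        cost_in, cost_out, _, total = transfer_cost(*sizes)
        print(f"{name}: in ${cost_in:.2f}, out ${cost_out:.2f}, total ${total:.2f}")
```

Under these assumptions the rounded totals come out to $1.75, $0.43, and $0.23, matching the table; output transfer dominates the Montage total because Montage produces more data than it reads.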

  • Cost effectiveness study
    Cost calculations are based on processing requests for 36,000 mosaics of 2MASS images (total size 10 TB), each 4 sq deg in size, over a period of three years (a typical workload for an image mosaic service).
    Results show that Amazon EC2 is much less attractive than a local service for I/O-bound applications because of the high cost of data storage in Amazon EC2.

Part II - Application to calculation of periodograms

Generation of a science product: an atlas of periodograms for the 210,000 light curves released by the NASA Kepler Mission.

 

 

Result

Category | Metric            | Value
Runtimes | Tasks             | 631,992
Runtimes | Mean Task Runtime | 6.34 sec
Runtimes | Jobs              | 25,401
Runtimes | Mean Job Runtime  | 2.62 min
Runtimes | Total CPU Time    | 1,113 hr
Runtimes | Total Wall Time   | 26.8 hr
Inputs   | Input Files       | 210,664
Inputs   | Mean Input Size   | 0.084 MB
Inputs   | Total Input Size  | 17.3 GB
Outputs  | Output Files      | 1,263,984
Outputs  | Mean Output Size  | 0.124 MB
Outputs  | Total Output Size | 76.52 GB
Cost     | Compute Cost      | $291.58
Cost     | Transfer Cost     | $11.48
Cost     | Total Cost        | $303.06
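Several of the reported figures can be cross-checked against one another. The sketch below is an illustration only, not part of the paper; it assumes 1 GB = 1024 MB, and the helper names are hypothetical. It derives total CPU time from the task count and mean task runtime, total input size from the file count and mean file size, the total cost from its two components, and the average number of concurrently busy cores from CPU time and wall time.

```python
TASKS = 631_992
MEAN_TASK_RUNTIME_S = 6.34
TOTAL_CPU_HR = 1113
TOTAL_WALL_HR = 26.8
INPUT_FILES = 210_664
MEAN_INPUT_MB = 0.084
COMPUTE_COST = 291.58
TRANSFER_COST = 11.48
MB_PER_GB = 1024  # assumption: binary gigabytes


def cpu_hours(tasks, mean_runtime_s):
    """Total CPU time in hours implied by task count and mean task runtime."""
    return tasks * mean_runtime_s / 3600


def total_size_gb(files, mean_size_mb):
    """Total data volume in GB implied by file count and mean file size."""
    return files * mean_size_mb / MB_PER_GB


def average_parallelism(cpu_hr, wall_hr):
    """Average number of cores kept busy over the whole run."""
    return cpu_hr / wall_hr


if __name__ == "__main__":
    print(f"CPU time:    {cpu_hours(TASKS, MEAN_TASK_RUNTIME_S):.0f} hr")
    print(f"Input size:  {total_size_gb(INPUT_FILES, MEAN_INPUT_MB):.1f} GB")
    print(f"Total cost:  ${COMPUTE_COST + TRANSFER_COST:.2f}")
    print(f"Parallelism: {average_parallelism(TOTAL_CPU_HR, TOTAL_WALL_HR):.1f} cores")
```

The ratio of CPU time to wall time (about 41.5) gives a rough sense of how many cores were busy on average during the 26.8-hour run.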

Seeking Supernovae in the Clouds: A Performance Study 2

Summary

The Nearby Supernova Factory (SNfactory) experiment measures the expansion history of the Universe to explore the nature of dark energy using Type Ia supernovae. The SNfactory pipeline consists of serial processes that execute various image-processing algorithms in parallel on ~10 TB of data. The pipeline was ported to the Amazon Web Services environment.

Cloud platform

  • Cloud Platform: Amazon Web Services
    • EC2 32-bit high-CPU medium instances (c1.medium: 2 virtual cores, 2.5 ECU each)
    • 80-core runs were used.
  • Design: Port the environment into EC2 first, then decide the location of data and the size of compute resource.
  • Set up a virtual cluster in EC2. Create an EBS volume for the shared file system.
  • Data size:
    • Raw data: 10TB
    • Processed data: 20TB

Cloud performance

  • EBS vs S3
    • In the 80-core experiment, a processing run took ~7 hours for the EBS variants and only 3 hours for S3.
    • Loading output data into S3 takes an order of magnitude less time than loading it into EBS.
    • Cost: data transfers between EC2 and S3 are free (see note d). S3 storage is better than EBS for SNfactory.
    • Input data and application data will be stored in EBS, and output data will be written to S3.

Issues/Gaps

  • The HPC cluster environment must be replicated in EC2, or the application must be modified.
  • The mean failure rate is higher in EC2 than in traditional cluster environments, and the application must handle this.
  • Not all of the VMI requested can always be acquired because insufficient resources are available, so the application must be modified to adapt to this.
  • Transient errors.

Application of Cloud computing to the creation of image mosaic and management of their provenance 3

Summary

Similar content to the first paper.

Workflow

Data

Cloud platform

Cloud performance

Issues/Gaps

Scientific workflow applications on Amazon EC2 4

Summary

Similar content to the first paper.

Workflow

Data

Cloud platform

Cloud performance

Issues/Gaps

Data Sharing Options for Scientific Workflows on Amazon EC2 5

Summary

  • Choice of storage system has a significant impact on workflow runtime
  • Investigated data management options in the cloud for workflow applications

Workflow

  • Montage: high I/O, low memory, low CPU
  • Broadband: medium I/O, high memory, medium CPU
  • Epigenome: low I/O, medium memory, high CPU

Data

Cloud platform

Comparison:

Cloud performance

  • S3 produces good performance for one application due to the use of caching in the implementation of the S3 client
  • S3 performs poorly on workflows with a large number of small files
  • S3 is at a cost disadvantage for workflows with many files, because Amazon charges a fee per S3 transaction
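
To illustrate why a per-transaction fee penalizes workflows with many small files, the sketch below compares the request charges for two workloads. It is only an illustration: the fee value, the file counts, and the function name are hypothetical and are not taken from the paper or from Amazon's price list. The point is that request charges scale with the number of files, not with the number of bytes moved.

```python
def s3_request_cost(num_files, requests_per_file, fee_per_1000_requests):
    """Request charges in USD for a workflow that touches num_files objects.

    All three arguments are illustrative inputs; the fee is a hypothetical
    price per 1,000 requests, not a quoted Amazon rate.
    """
    return num_files * requests_per_file * fee_per_1000_requests / 1000


if __name__ == "__main__":
    HYPOTHETICAL_FEE = 0.01  # USD per 1,000 requests (assumed for illustration)

    # Two workloads that could move the same total volume of data.
    few_large = s3_request_cost(100, 1, HYPOTHETICAL_FEE)
    many_small = s3_request_cost(1_000_000, 1, HYPOTHETICAL_FEE)

    print(f"100 large files:       ${few_large:.3f}")
    print(f"1,000,000 small files: ${many_small:.2f}")
```

With these placeholder numbers the many-file workload pays 10,000 times more in request fees than the few-file workload, regardless of how much data each one stores.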

Issues/Gaps

Using MapReduce for Image Coaddition 6

Summary

  • The paper presents the implementation and evaluation of image coaddition within the MapReduce data-processing framework using Hadoop.
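
The general shape of a map/reduce coaddition can be sketched in a few lines of plain Python. This is a simplified illustration, not the paper's Hadoop implementation: it assumes all images already share one pixel grid (so no reprojection is needed), it stacks by taking the mean, and the image dictionary layout and all function names are hypothetical.

```python
from collections import defaultdict


def map_image(image, query):
    """Map step: emit ((x, y), value) for each image pixel inside the query box."""
    x0, y0 = image["origin"]
    qx0, qy0, qx1, qy1 = query  # half-open bounds on the shared pixel grid
    for dy, row in enumerate(image["pixels"]):
        for dx, value in enumerate(row):
            x, y = x0 + dx, y0 + dy
            if qx0 <= x < qx1 and qy0 <= y < qy1:
                yield (x, y), value


def shuffle(pairs):
    """Group emitted values by output pixel, as the framework would between steps."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_pixel(values):
    """Reduce step: combine all contributions to one output pixel (mean stack)."""
    return sum(values) / len(values)


def coadd(images, query):
    """Run map, shuffle, and reduce over a list of images for one query region."""
    pairs = (pair for image in images for pair in map_image(image, query))
    return {key: reduce_pixel(values) for key, values in shuffle(pairs).items()}


if __name__ == "__main__":
    images = [
        {"origin": (0, 0), "pixels": [[1.0, 2.0], [3.0, 4.0]]},
        {"origin": (1, 0), "pixels": [[4.0, 6.0], [8.0, 10.0]]},
    ]
    print(coadd(images, query=(0, 0, 2, 2)))
```

Each image can be mapped independently, which is what lets the work spread across many nodes; only the per-pixel reduction needs values from more than one image.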

Workflow

Data

  • Processed dataset containing 100,000 individual FITS files

Cloud platform

  • Hadoop on cluster

Cloud performance

  • Processes 100,000 files (300 million pixels) in three minutes on a 400-node cluster

Issues/Gaps

CANFAR: Canadian Advanced Network for Astronomical Research 7

Summary

  • The Canadian Advanced Network For Astronomical Research (CANFAR) is a project that is delivering a network-enabled platform for accessing, processing, storing, analyzing, and distributing very large astronomical datasets.

Workflow

Data

Cloud platform

Comparison of processing models

Feature                     | Grid | Cloud | CANFAR
Ample CPU cycles            | yes  | yes   | yes
Job scheduling              | yes  | no    | yes
User-customized environment | no   | yes   | yes
Resource sharing            | yes  | no    | yes
Portability of environment  | no   | yes   | yes

Cloud performance

Issues/Gaps

A Multi-Dimensional Classification Model for Scientific Workflow Characteristics 8

Summary

  • A multi-dimensional classification model is presented with workflow examples.

Workflow

  • Astronomy workflow:
    • The Pan-STARRS (Panoramic Survey Telescope And Rapid Response System) project is a continuous survey of the entire sky
      • PSLoad workflow stages incoming data files from the telescope pipeline and loads them into individual relational databases each night
      • PSMerge workflow: Each week, the production databases that astronomers query are updated with the new data staged during the week

Data

Cloud platform

Cloud performance

Issues/Gaps

Trident Scientific Workflow Workbench for Data Management in the cloud 9

Summary

Workflow

Data

Cloud platform

Cloud performance

Issues/Gaps

On the use of cloud computing for scientific workflows 10

Summary

Workflow

Data

Cloud platform

Cloud performance

Issues/Gaps

Experiences using cloud computing for a scientific workflow application 11

Summary

Workflow

Data

Cloud platform

Cloud performance

Issues/Gaps

References

  1. Berriman, G.B. et al. Sixth IEEE International Conference on e-Science, 1-7 (2010)
  2. Jackson, K.R. et al. Proc. ACM Int. Symp. HPDC, 421-429 (2010)
  3. Berriman, G.B. et al. SPIE Conference 7740: Software and Cyberinfrastructure for Astronomy (2010)
  4. Juve, G. et al. Cloud Computing Workshop in Conjunction with e-Science Oxford, UK: IEEE (2009)
  5. Juve, G. et al. SC (2010)
  6. Wiley, K. et al. Publications of the Astronomical Society of the Pacific 123 366-380 (2011)
  7. Gaudet, S. et al. Proc SPIE (2010)
  8. Ramakrishnan, L. et al. Wands '10 (2010)
  9. Simmhan, Y. et al. ADVCOMP '09 (2009)
  10. Hoffa, C. et al. ESCIENCE '08 (2008)
  11. Vockler, J. et al. ScienceCloud '11 (2011)

Notes and other links

a. Workflow: loosely coupled parallel applications that consist of a set of computational tasks linked by data- and control-flow dependencies.
b. A parallel file system and high-speed interconnect would dramatically improve performance. Amazon recently released a new resource type that includes a 10 Gb interconnect.
c. There is a movement towards providing academic clouds, such as FutureGrid or Magellan.
d. Only true for intra-zone transfers (before July 1st, 2011). Also, data transfer requests are not free.
