...
- Methods
The experiments were all run on single nodes to provide an unbiased comparison of the performance of workflows on Amazon EC2 and Abe.
For experiments on EC2:
- Executables were pre-installed in a virtual machine image that was deployed on the node.
- Input data were stored in Amazon EBS.
- Output, intermediate files and the application executables were stored on local disks.
- All jobs were managed and executed through a job submission host at the Information Sciences Institute (ISI) using the Pegasus Workflow Management System (Pegasus WMS), which includes Pegasus and Condor.
Cloud performance
- Montage (I/O-bound)
Processing on abe.lustre is nearly three times faster than on the fastest EC2 machines (see note b).
- Broadband (Memory-bound)
The processing advantage of the parallel file system largely disappears, and abe.local performs only 1% better than c1.xlarge.
For memory-intensive applications, Amazon EC2 can achieve nearly the same performance as Abe.
- Epigenome (CPU-bound)
The parallel file system on Abe provides no processing advantage for Epigenome. The machines with the most cores gave the best performance for this CPU-bound application.
The figure below shows the processing times for the three workflows.
Cost
The cost of Amazon EC2 includes:
- Resource cost: the figure below shows the processing cost of the three workflows on EC2.
- Storage cost: the cost of storing VM images in S3 and input data in EBS.
The table summarizes the monthly storage cost.

| Application | Input Volume | Monthly Storage Cost |
|---|---|---|
| Montage | 4.3 GB | $0.66 |
| Broadband | 4.1 GB | $0.66 |
| Epigenome | 1.8 GB | $0.26 |
- Transfer cost: Amazon EC2 charges $0.10 per GB for transfer into the cloud and $0.17 per GB for transfer out of the cloud.
The data size and transfer costs are summarized in the tables below.
Data transfer size per workflow on Amazon EC2

| Application | Input | Output | Logs |
|---|---|---|---|
| Montage | 4,291 MB | 7,970 MB | 40 MB |
| Broadband | 4,109 MB | 159 MB | 5.5 MB |
| Epigenome | 1,843 MB | 299 MB | 3.3 MB |

Costs of transferring data into and out of the EC2 cloud

| Application | Input | Output | Logs | Total |
|---|---|---|---|---|
| Montage | $0.42 | $1.32 | <$0.01 | $1.75 |
| Broadband | $0.40 | $0.03 | <$0.01 | $0.43 |
| Epigenome | $0.18 | $0.05 | <$0.01 | $0.23 |
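The transfer costs follow directly from the data volumes and the two per-GB rates quoted above. The sketch below recomputes them as a sanity check. It makes two assumptions that the text does not state: that 1 GB is counted as 1024 MB, and that log files are charged at the transfer-out rate. The function and variable names are illustrative only.

```python
# Rates quoted in the text (USD per GB).
RATE_IN = 0.10    # transfer into the cloud
RATE_OUT = 0.17   # transfer out of the cloud
MB_PER_GB = 1024  # assumption: binary gigabytes

# Data volumes in MB from the transfer-size table: (input, output, logs).
VOLUMES_MB = {
    "Montage": (4291, 7970, 40),
    "Broadband": (4109, 159, 5.5),
    "Epigenome": (1843, 299, 3.3),
}


def transfer_cost(input_mb, output_mb, logs_mb):
    """Return (input, output, logs, total) transfer costs in USD, unrounded."""
    cost_in = input_mb / MB_PER_GB * RATE_IN
    cost_out = output_mb / MB_PER_GB * RATE_OUT
    # Assumption: logs leave the cloud, so they are charged at the outbound rate.
    cost_logs = logs_mb / MB_PER_GB * RATE_OUT
    return cost_in, cost_out, cost_logs, cost_in + cost_out + cost_logs


for app, volumes in VOLUMES_MB.items():
    cost_in, cost_out, cost_logs, total = transfer_cost(*volumes)
    print(f"{app}: in ${cost_in:.2f}, out ${cost_out:.2f}, total ${total:.2f}")
```

Under these assumptions the rounded totals match the table: $1.75 for Montage, $0.43 for Broadband and $0.23 for Epigenome.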
- Cost effectiveness study
Cost calculations are based on processing requests for 36,000 mosaics of 2MASS images (total size 10 TB), each 4 sq deg in size, over a period of three years (a typical workload for an image mosaic service).
Results show that Amazon EC2 is much less attractive than a local service for I/O-bound applications because of the high cost of data storage in Amazon EC2.
The tables below show the costs of both the local and the Amazon EC2 service.
Cost per mosaic of a locally hosted image mosaic service

| Item | Cost ($) |
|---|---|
| 12 TB RAID 5 disk farm and enclosure (3 yr support) | 12,000 |
| Dell 2650 Xeon quad-core processor, 1 TB staging area | 5,000 |
| Power, cooling and administration | 6,000 |
| Total 3-year Cost | 23,000 |
| Cost per mosaic | 0.64 |

Cost per mosaic of a mosaic service hosted in the Amazon EC2 cloud

| Item | Cost ($) |
|---|---|
| Network Transfer In | 1,000 |
| Data Storage on Elastic Block Storage | 36,000 |
| Processor Cost (c1.medium) | 4,500 |
| I/O operations | 7,000 |
| Network Transfer Out | 4,200 |
| Total 3-year Cost | 52,700 |
| Cost per mosaic | 1.46 |
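The per-mosaic figures are simply the three-year totals divided by the 36,000 mosaic requests. The following sketch reproduces that arithmetic from the line items in the two tables. The dictionary keys and the function name are illustrative, not taken from the study.

```python
MOSAICS = 36_000  # mosaic requests over three years, from the text

# Three-year line items in USD, copied from the two cost tables.
LOCAL_COSTS = {
    "disk farm and enclosure": 12_000,
    "processor and staging area": 5_000,
    "power, cooling and administration": 6_000,
}
EC2_COSTS = {
    "network transfer in": 1_000,
    "EBS data storage": 36_000,
    "processor": 4_500,
    "I/O operations": 7_000,
    "network transfer out": 4_200,
}


def cost_per_mosaic(items, n_mosaics=MOSAICS):
    """Return (three-year total, cost per mosaic) for a table of line items."""
    total = sum(items.values())
    return total, total / n_mosaics


for label, items in (("local", LOCAL_COSTS), ("EC2", EC2_COSTS)):
    total, per_mosaic = cost_per_mosaic(items)
    print(f"{label}: total ${total:,}, per mosaic ${per_mosaic:.2f}")

# Share of the EC2 total that goes to EBS storage.
storage_share = EC2_COSTS["EBS data storage"] / sum(EC2_COSTS.values())
print(f"EBS storage share of EC2 cost: {storage_share:.0%}")
```

The storage line alone accounts for roughly 68% of the EC2 total, which is consistent with the conclusion that data storage drives the cost difference.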
Summary
- For CPU-bound applications, virtualization overhead on Amazon EC2 is generally small.
- The resources offered by EC2 are generally less powerful than those available in HPC systems, particularly for I/O-bound applications.
- Amazon EC2 offers no cost benefit over locally hosted storage, but does eliminate local maintenance and energy costs, and does offer high-quality, reliable storage.
Part II - Application to calculation of periodograms
...
a. Workflow: loosely coupled parallel applications that consist of a set of computational tasks linked by data- and control-flow dependencies.
b. A parallel file system and a high-speed interconnect would dramatically improve performance. Amazon recently released a new resource type that includes a 10 Gb interconnect.
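The workflow definition in note a can be illustrated with a small dependency graph. This is a minimal sketch using Python's standard `graphlib` module. The four task names are hypothetical, and the code does not represent how Pegasus or Condor schedule jobs. It only shows how data dependencies determine which tasks may run in parallel.

```python
from graphlib import TopologicalSorter

# Hypothetical workflow: each task maps to the set of tasks it depends on.
dependencies = {
    "project_1": set(),
    "project_2": set(),
    "fit_overlap": {"project_1", "project_2"},
    "build_mosaic": {"fit_overlap"},
}


def execution_waves(deps):
    """Group tasks into waves; tasks within one wave have no mutual dependencies."""
    sorter = TopologicalSorter(deps)
    sorter.prepare()  # also raises CycleError if the graph is not a DAG
    waves = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # tasks whose dependencies are all done
        waves.append(ready)
        sorter.done(*ready)
    return waves


waves = execution_waves(dependencies)
print(waves)
```

Here the two projection tasks form the first wave and can run concurrently, while the remaining tasks must wait for their inputs.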