Story 1: Belle Monte-Carlo production on the Amazon EC2 Cloud (High energy physics, 2010)
Summary
A benchmark test of Belle Monte Carlo simulation on Amazon EC2. Monte Carlo data production is an essential task in particle physics experiments. The test demonstrates that EC2 can be used for full-scale Monte Carlo production within the Belle environment, at a cost of data generation comparable to that of a purpose-built cluster. Belle II is a High Energy Physics (HEP) experiment that needs 200,000 HEPSpec (compared to 40,000 HEPSpec three years ago).
Cloud platform
- Cloud platform: Amazon EC2
- Unrestricted access to the virtual machines created
- Monte Carlo is CPU-intensive with relatively low storage and transfer cost
- Tests were run on 20 HighCPU XL instances (160 cores, with 17 GB of RAM each)
- Input data:
  - "pgen" files (4-vectors describing physics processes): stored in S3
  - Random-triggered background data: stored in S3
  - Calibration constants: stored in a PostgreSQL database (physically located in Melbourne)
- Simulation data:
  - MDST files and status files: temporarily stored in S3
- Operating system: Scientific Linux
Cloud performance
- First run:
  - 20 instances; 4 hours 57 minutes
  - Data volumes: 3.1 GB addbg (background), 0.5 GB pgen, 37 GB of results
- Second run:
  - 20 instances of 8 cores each; 1.47 million events; 22 hours wall-clock time
  - 16% failure rate
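- Cost of full production runs on EC2:

| Run | Number of events | CPU cost | Storage cost | Transfer cost | Total cost | Cost per 10^4 events |
|-----|------------------|----------|--------------|---------------|------------|----------------------|
| 1   | 752,233          | $80      | $0.2         | $6.65         | $86.85     | $1.16                |
| 2   | 1,473,818        | $108.11  | $0.25        | $7.121        | $115.37    | $0.78                |
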
- Time: an 8-virtual-core EC2 HighCPU XL instance generates 10,000 events in about 30 minutes
- Comparison with an 8-core Dell PowerEdge 1950: the time is about the same
- The physical Dell PowerEdge hardware costs $4000
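
The cost comparison above can be checked with a little arithmetic. The Python sketch below recomputes the cost per 10^4 events from the total cost and event count of each run, then estimates how long an 8-core machine would have to run continuously before its purchase price equals the EC2 cost for the same number of events. The function names are illustrative. The break-even estimate rests on assumptions that are not in the source: it ignores power, cooling, administration, idle time, and failed jobs, so it is only a rough indication.

```python
# Figures copied from the cost table (US dollars).
RUNS = {
    1: {"events": 752_233, "total_cost": 86.85},
    2: {"events": 1_473_818, "total_cost": 115.37},
}

DELL_PRICE_USD = 4000.0        # hardware price quoted in the notes
MINUTES_PER_10K_EVENTS = 30.0  # per 8-core machine, EC2 or Dell


def cost_per_10k_events(total_cost, events):
    """Total cost divided by the number of 10^4-event batches."""
    return total_cost / (events / 10_000)


def breakeven_days(hardware_price, ec2_cost_per_10k, minutes_per_10k):
    """Days of continuous running after which the EC2 bill for the same
    number of events would equal the hardware price.
    Assumption: hardware price is the only local cost (no power, no staff)."""
    batches = hardware_price / ec2_cost_per_10k
    return batches * minutes_per_10k / (60 * 24)


if __name__ == "__main__":
    for run, figures in RUNS.items():
        unit_cost = cost_per_10k_events(figures["total_cost"], figures["events"])
        print(f"Run {run}: ${unit_cost:.2f} per 10^4 events")

    run2_cost = cost_per_10k_events(**RUNS[2])
    days = breakeven_days(DELL_PRICE_USD, run2_cost, MINUTES_PER_10K_EVENTS)
    print(f"Break-even after about {days:.0f} days of continuous running")
```

Under these assumptions the break-even point is on the order of a hundred days of uninterrupted production at the second run's rate, which is consistent with the summary's statement that EC2 costs are comparable to those of a purpose-built cluster.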
Issues/Gaps
- Security: an SSH tunnel is used to connect to the server
- The largest bottleneck is data transfer