Survey White Paper Outline

Part 1. Introduction of Cloud Computing Technology

Part 2. Science Stories and Requirements for the Cloud

Scientific applications have been implemented on cloud computing resources in many fields, including biology/bioinformatics (Stein 2010, Schatz et al. 2010), geospatial information systems (GIS) (Yang et al. 2011), astronomy, and environmental science.

Because requirements differ across science areas, cloud computing applications differ in focus. In biology/bioinformatics, many applications, such as DNA sequencing, require processing large volumes of data at high throughput (Schatz et al. 2010, Langmead et al. 2009). Cloud computing workflows in the geospatial sciences mainly involve data storage and processing (Cui et al. 2010, Huang et al. 2010, Park et al. 2011, Yang et al. 2010, Bunzel et al. 2010) as well as simulation and modeling. Another major IT challenge in the geospatial sciences is handling massive concurrent user access (Huang et al. 2010, Bernstein et al. 2010, Wang et al. 2010, Janakiraman et al. 2010, Blower et al. 2010). In astronomy, cloud computing is used mainly for data processing, such as processing telescope images (Berriman et al. 2010, Jackson et al. 2010, Berriman et al. 2010(2), Hoffa et al. 2008), and for data sharing (Juve et al. 2010). In the environmental sciences, cloud computing is used mainly for modeling, such as ocean climate modeling (Evangelinos et al. 2008) and groundwater modeling (Hunt et al. 2010); it is also used for data analysis, such as running sequential data analysis tasks in parallel (Hasenkamp et al. 2010).

The most commonly used cloud service model in scientific applications is IaaS. Amazon's cloud services are the most popular cloud platform in almost all of the scientific areas because existing techniques are convenient to implement on them. For example, in biology/bioinformatics, most applications use Linux-based systems and technologies that can easily be deployed on Amazon EC2 (Gunarathne et al. 2010, Qiu et al., Langmead et al. 2010, Vecchiola et al. 2009, Nguyen et al. 2011, Afgan et al. 2010). Amazon's cloud services are also popular in astronomy (Berriman et al. 2010, Jackson et al. 2010, Juve et al. 2009, Vockler et al. 2011), GIS (Huang et al. 2010, Janakiraman et al. 2010, Bunzel et al. 2010), and the environmental sciences (Evangelinos et al. 2008, He et al. 2010). Other community IaaS cloud platforms are also used because they are more cost-effective than commercial clouds. For example, FutureGrid (Qiu et al. 2010) and Magellan (Taylor et al. 2010) are used in bioinformatics applications; Nimbus (Hoffa et al. 2008), FutureGrid (Vockler et al. 2011), and Magellan (Vockler et al. 2011) are used in astronomy; OpenNebula (Park et al. 2011) is used in a GIS application; and GoGrid (Hunt et al. 2010, He et al. 2010), FutureGrid with Nimbus and Eucalyptus (Fox et al. 2011), and Magellan with Eucalyptus (Hasenkamp et al. 2010) are used in environmental science applications.

PaaS platforms are also used in scientific areas. For example, Microsoft Azure is used in biology/bioinformatics applications (Qiu et al. 2009, Qiu et al. 2010, Lu et al. 2010) and in the environmental sciences (Eye on Earth project). Google App Engine is used in GIS (Blower et al. 2010). Researchers choose a PaaS platform when a technology they need is built on that specific platform; for example, Dryad, a MapReduce-style framework, is based on the Microsoft platform (Qiu et al. 2009).

The table below lists the cloud platforms used in scientific applications, by science area.

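| Platform | Astronomy | Biology | Environmental | GIS |
|---|---|---|---|---|
| Amazon | 6 | 9 | 2 | 3 |
| Azure | | 4 | 1 | |
| Google App Engine | | | | 1 |
| FutureGrid | 1 | 1 | 1 | |
| Magellan | 1 | | 1 | |
| GoGrid | | | 2 | |
| Eucalyptus | 1 | | 2 | |
| Nimbus | | | 1 | |
| OpenNebula | | | | 1 |
| IBM Grid | | | 1 | |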

Part 3. Cloud Computing Platforms and Tools

Part 4. Gap Analysis and Known Issues

Part 5. Recommendations

What should NCSA do next? Should we invest time in setting up a private cloud, or in cloud tools?
