...

Siege was used as part of the 2006 Unidata Workshop. For more details see the report submitted to the American Meteorological Society.

Planned Extension and Modifications

The current version of Siege requires an intimate understanding of the inner workings of Siege, PWE, and the back-end HPC systems on which the workflows execute. This proposal calls for a far more user-friendly interface that hides the complexity of the underlying system without limiting access to its extensive capabilities.

...

The Digital Synthesis Framework (DSF) provides a coherent framework for dynamically publishing visual analysis environments based on underlying observational and modeled information. The concept of a synthesis framework involves core capabilities for integrating data from multiple sources, enabling on-demand execution of scientific workflows, and associating data outputs with multiple visualization and analysis widgets in a dynamically generated web application. In the DSF, NCSA's Cyberintegrator workflow environment is used to integrate data sources and invoke modeling modules. When the workflow is complete, it can be saved and run repeatedly as a service. A publication service allows the workflow outputs (which may be observational data or model outputs) to be associated with visualization widgets and embedded into a dynamically generated scenario viewer web application. The application can display data outputs from completed workflows or trigger new workflows on demand. Along with maps, graphs, tables, and other displays, the application can display provenance information and links to associated reference material. As a concrete example, we present one of the TRECC pilot projects, which incorporates a model predicting hypoxic (low-oxygen) conditions in Corpus Christi Bay, Texas, based on information from sensors deployed in and around the bay, into a hypoxia forecast web application [4].

Service Layer

Parameterized Workflow Engine (PWE)

...

  • To assemble a loosely coupled set of components with well-defined functions such that we could extend them or compose into them other components as the need arose;
  • To build an infrastructure above the lower middleware layer that is general enough to suit a wide range of applications and is not tightly bound to the requirements of any single domain (this goal carries with it the requirement that application-level code remain insulated from the infrastructure; that is, it need not be recompiled to work in our environment);
  • To place high priority on productive and efficient interaction with the resource disposition typical of HPC centers, which more specifically means traditional batch queue managers; hence we have imposed a fundamental asynchronicity on the manner in which the highest level of the execution process must be handled (for instance, in this light it would not make sense for the component responsible for the execution logic to hold the entire graph in memory for the duration of the workflow);
  • To interact with existing resources such as mass storage systems and HPCs while requiring no modifications, additions, or installations on the remote resources. 
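The asynchronicity goal above can be illustrated with a minimal sketch. It is not the PWE implementation (which is Java-based); it only assumes that workflow state is kept in a serialized form and that the engine wakes up once per job-completion event. The names `ready_nodes`, `on_job_finished`, and the node names are hypothetical, and marking a node "queued" stands in for a real batch queue submission.

```python
import json

def ready_nodes(state):
    """Return pending nodes whose dependencies have all finished."""
    done = {n for n, s in state["status"].items() if s == "done"}
    return sorted(
        n for n, deps in state["deps"].items()
        if state["status"][n] == "pending" and set(deps) <= done
    )

def on_job_finished(serialized, node):
    """Handle one completion event without a long-lived in-memory graph.

    The state is loaded, updated, and written back, so nothing has to
    stay resident while jobs wait in the batch queue.
    """
    state = json.loads(serialized)
    state["status"][node] = "done"
    submitted = ready_nodes(state)
    for n in submitted:
        state["status"][n] = "queued"  # stand-in for a batch queue submission
    return json.dumps(state), submitted

# Hypothetical three-step workflow: prep -> model -> plot.
initial = json.dumps({
    "deps": {"prep": [], "model": ["prep"], "plot": ["model"]},
    "status": {"prep": "queued", "model": "pending", "plot": "pending"},
})
```

Each call to `on_job_finished` takes the stored state and the name of the finished node, and returns the updated state together with the nodes that became eligible for submission.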

Data Management

Metadata Management

...

Previous and Existing Uses

The PWE is currently used along with the Siege user interface.

Planned Extension and Modifications

The PWE is a robust, general system for managing workflows, and little effort should be required to modify it to support the use cases in this proposal. Logical extensions may nevertheless be required to suit the particular needs of the back-end HPC systems and of any Grid middleware required to access those systems.

Data Management

Scientific workflows can require an immense amount of archival storage. For this proposal, these storage resources must be identified, and the appropriate access and security privileges must be granted to both the developers and the end users. The resources must be accessible both to the back-end resources, so that jobs can store their data, and to the client applications, whether web-based or desktop-based.

Previous and Existing Uses

Identify the resources that will be used on this project.

Planned Extensions and Modifications

It is assumed that the configuration and administration of these resources lie outside the purview of this proposal.

Metadata Management

Managing information about and generated by scientific applications is critical to the long-term success of any project launching a large number of jobs onto HPC resources. Merely storing the data is typically a straightforward process, but finding a particular data set in the black hole that a mass storage system can become is often frustrating and time-consuming. To ensure the success of this project, care must be taken to develop a system that tracks the inputs, outputs, and provenance information generated by the scientific workflows so that users can easily find and retrieve their data when needed.

Previous and Existing Uses

The use of metadata for the management of scientific processes exists in many incarnations. At NCSA we are concentrating on the use of the Resource Description Framework (RDF) as a model for tracking this metadata, such as provenance and user-submitted metadata (e.g., tagging and annotations).
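As a rough illustration of the RDF model, metadata can be thought of as a set of subject-predicate-object triples that are queried by pattern. The sketch below uses plain Python tuples rather than an RDF library, and the namespace, the predicate names (`usedInput`, `generated`, `taggedWith`), and the file names are all hypothetical, not the vocabulary used at NCSA.

```python
EX = "http://example.org/"  # hypothetical namespace for illustration only

def match(triples, s=None, p=None, o=None):
    """Return triples matching the pattern; None acts as a wildcard."""
    return [
        t for t in triples
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    ]

def outputs_of(triples, run):
    """List the data sets a given workflow run generated."""
    return sorted(o for _, _, o in match(triples, s=run, p=EX + "generated"))

# Provenance and a user-submitted tag for one hypothetical run.
triples = [
    (EX + "run42", EX + "usedInput", EX + "obs.nc"),
    (EX + "run42", EX + "generated", EX + "forecast.nc"),
    (EX + "forecast.nc", EX + "taggedWith", "hypoxia"),
]
```

With this layout, `outputs_of(triples, EX + "run42")` finds the outputs of a run, and `match(triples, o="hypoxia")` finds every data set carrying a given tag, which is the kind of lookup that is hard to do against a mass storage system alone.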

Planned Extensions and Modifications

This component is not in the scope of this project.

Event Management

The existing service stack uses both synchronous Java RMI calls and asynchronous Java Message Service (JMS) events to handle interprocess communication. A custom Event Service was developed to manage archival event storage and to ensure that desktop clients, which may go offline for extended periods, do not miss any events.
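The idea of archiving events so that offline clients can catch up can be sketched as an append-only log with one read cursor per client. This is a simplified illustration in Python, not the actual Event Service (which is built on Java RMI and JMS); the class name `EventArchive`, its methods, and the in-memory storage are assumptions made for the example.

```python
class EventArchive:
    """Append-only event log with a per-client read position."""

    def __init__(self):
        self._events = []   # a real service would persist this archive
        self._cursors = {}  # client id -> index of the next unread event

    def publish(self, event):
        """Archive an event regardless of which clients are online."""
        self._events.append(event)

    def fetch(self, client_id):
        """Return every event the client has not yet seen, oldest first."""
        start = self._cursors.get(client_id, 0)
        missed = self._events[start:]
        self._cursors[client_id] = len(self._events)
        return missed
```

A desktop client that reconnects after a long absence would call `fetch` with its identifier and receive everything published in the meantime; a second call with no new events returns an empty list.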

Previous and Existing Uses

Both Java RMI and JMS are widely used across the globe. The custom Event Service is used along with PWE and Siege.

Planned Extensions and Modifications

The event service will not be extended or modified for this project.

Grid Middleware Layer

Globus/gLite/etc

...