
NCSA Workflow Management Infrastructure (as of October 21, 2009)

*The NCSA Workflow Management Infrastructure* is an end-to-end system for generating, editing, submitting, and monitoring high-performance computing [HPC] workflows which have the following characteristics:

  • each node in the workflow graph comprises a payload, or script, which is to be executed, usually by being submitted to a Distributed Resource Manager [DRM], or batch system; the script can be given input values and can return (small) output values as well;
  • the graph itself is directed and acyclic, meaning there are no conditionals or loops at this level (however, conditional or repeated submissions of a workflow with varying values can be achieved through the use of the Trigger service, described below);
  • any node or subgraph in the workflow could be subject to "parameterization", meaning multiple submissions with varying input values.
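As an illustration of these characteristics, here is a minimal sketch in Java — with invented names, not the actual PWE data model — of a DAG node carrying a payload script, small input/output value maps, and the cloning that parameterization implies:

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/**
 * Hypothetical sketch of a workflow node: a payload script plus small
 * input/output value maps, with children forming a directed acyclic graph.
 */
class WorkflowNode {
    final String name;
    final String payloadScript;                         // script handed to the DRM
    final Map<String, String> inputs = new LinkedHashMap<>();
    final Map<String, String> outputs = new LinkedHashMap<>();
    final List<WorkflowNode> children = new ArrayList<>();

    WorkflowNode(String name, String payloadScript) {
        this.name = name;
        this.payloadScript = payloadScript;
    }

    void addChild(WorkflowNode child) { children.add(child); }

    /** Parameterization: clone this node once per value of a varied input. */
    List<WorkflowNode> parameterize(String key, List<String> values) {
        List<WorkflowNode> clones = new ArrayList<>();
        for (String v : values) {
            WorkflowNode clone = new WorkflowNode(name + "[" + key + "=" + v + "]", payloadScript);
            clone.inputs.putAll(inputs);
            clone.inputs.put(key, v);                   // the varied value overrides
            clones.add(clone);
        }
        return clones;
    }

    public static void main(String[] args) {
        WorkflowNode sim = new WorkflowNode("simulate", "run-model.sh");
        for (WorkflowNode n : sim.parameterize("resolution", List.of("low", "high")))
            System.out.println(n.name);
    }
}
```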

...

  1. a front-end desktop client, Siege;
  2. a *Parameterized Workflow Engine [PWE]*, responsible for workflow management;
  3. information and data transfer services (*Host Information*, *Event Repository*, *Tuple Space [VIZIER]*);
  4. a transient, compute-resource-resident container for running the payload scripts (ELF);
  5. a service for triggering actions, most typically submissions of workflows to PWE, on the basis of events or as cron jobs;
  6. a message bus.

...

  1. Submission Protocol Module
    1. AIX (Interactive) - for launching (via [GSI]SSH) directly onto the head node of an AIX machine or cluster (a very specialized use, usually for preprocessing actions which are short-lived);
    2. IBM LoadLeveler - for submitting (again via [GSI]SSH) to the LL scheduler.
  2. Job Status Handler
    1. AIX Polling Handler - for monitoring the status of interactively submitted jobs (polls the head node via [GSI]SSH using 'ps');
    2. IBM LoadLeveler Job Status Handler - this is just a façade for receiving the callback events sent by the User Exit Trigger (see below).
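The polling approach can be sketched as follows. This is an illustrative stand-in only: it runs `ps` locally in place of the [GSI]SSH invocation the real handler makes to the head node, and the method name is invented.

```java
/**
 * Hypothetical sketch of an AIX-style polling status check: ask 'ps' whether
 * the interactively launched job's process still exists. The real handler
 * would run this command on the remote head node via [GSI]SSH.
 */
class AixPollingSketch {
    static boolean isAlive(String headNode, long pid) {
        try {
            // Stand-in for: ssh <headNode> ps -p <pid>
            Process p = new ProcessBuilder("ps", "-p", Long.toString(pid)).start();
            return p.waitFor() == 0;    // ps exits 0 iff the process exists
        } catch (Exception e) {
            return false;               // treat any failure as "not running"
        }
    }
}
```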

...

There are actually two kinds of DRM Agent, as indicated above; the Interactive Manager is akin to the pre-WS Globus job-manager, in that it is meant to run on a head node of a system where back-end Java is unavailable, and monitors arbitrary jobs (not necessarily ELF) put through the resident batch system.  Our concern here, however, is with the second type of manager, which is itself a job (step) submitted to the scheduler or resource manager, and which, when it becomes active, distributes the work among the given number of ELF singleton members.  Similar to PWE internals, this agent has a system-specific component, as can be seen from the following diagram of the agent's architecture.
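The fan-out behavior of this second kind of agent can be sketched as follows. This is illustrative only — the real agent distributes work among ELF containers on the allocated compute resources, not threads in one JVM:

```java
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

/**
 * Sketch (invented names) of the glide-in agent's core loop: once the agent
 * job becomes active, it fans the payload work out over a fixed number of
 * "singleton members", analogous to the ELF containers described above.
 */
class GlideInAgentSketch {
    static void distribute(List<Runnable> payloads, int members) {
        ExecutorService pool = Executors.newFixedThreadPool(members);
        for (Runnable payload : payloads)
            pool.submit(payload);               // each payload goes to a free member
        pool.shutdown();
        try {
            pool.awaitTermination(1, TimeUnit.MINUTES);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }
}
```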

We thus needed to provide an implementation of _Member Access_ for LoadLeveler (= _IBM LL Access_). Once again, this child class relies heavily on its parent for much of the basic functionality; in the case of the LoadLeveler extension, however, there were some peculiarities in the job set-up for which it is responsible. These derive from the fact that our mode of interaction with LoadLeveler is to resubmit through the scheduler using a reservation id, rather than to control the distribution of members among the resources made available to the glide-in job, as we do in the case of LSF or PBS (using either _exec_ or _dplace_ on [pseudo-]SMP machines, or SSH to the compute nodes in the case of distributed memory). The access module is responsible for:

  • creating a "nodefile" from the queue of available cores/cpus;
  • generating the necessary bootstrap.properties and container.xml files for ELF to run the member script;
  • generating a command file (bash) to be used in conjunction with member launch;
  • releasing the resources when the member has completed;
  • issuing an appropriate Cancel command when necessary.
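A rough sketch of the first two responsibilities, with hypothetical property keys and file contents (the real module also generates the container.xml and the bash command file, and of course the property names below are invented for illustration):

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;

/**
 * Hypothetical sketch of the LL Member Access set-up steps listed above:
 * write a "nodefile" from the queue of available cores/cpus, then the
 * bootstrap.properties the ELF member needs to run its script.
 */
class LLAccessSketch {
    static void prepareMember(Path workDir, List<String> availableCores, String memberScript)
            throws IOException {
        // One available core/cpu per line, in queue order.
        Path nodefile = workDir.resolve("nodefile");
        Files.write(nodefile, availableCores);

        // Invented keys: enough for ELF to locate its script and resources.
        Files.write(workDir.resolve("bootstrap.properties"),
                List.of("elf.member.script=" + memberScript,
                        "elf.member.nodefile=" + nodefile));
        // The real module would also emit container.xml and the bash
        // command file here, then launch and later release the member.
    }
}
```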

...

This native C wrapper around LoadLeveler's API serves two purposes:

  1. To translate the command issued by PWE via [GSI]SSH into a job script;
  2. To provide to LoadLeveler the arguments necessary for reporting job state back to PWE (i.e., the "User Exit" parameters in the llsubmit signature).

...

  • the job command file is created with this name;
  • the command-line arguments are translated into LoadLeveler's job properties pragmas (using the '# @ property = value' syntax; the current environment is also passed on to the job step using '# @ environment=COPY_ALL');
  • the stdin redirected to this wrapper is written to the file as its contents;
  • a properties file is created using the job command name as prefix; this file contains arguments necessary for the callback trigger;
  • llsubmit is called using the job command file, the path to the trigger script (see below), and the properties file as arguments;
  • if the call is successful, the job step id is returned to PWE on stdout.
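The file-generation steps above can be sketched as follows — in Java rather than the native C of the real wrapper, and with the llsubmit call itself omitted; apart from the '# @' syntax quoted in the text, the property names are illustrative:

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

/**
 * Hypothetical sketch of the wrapper's job-command-file generation:
 * translate arguments into '# @ property = value' pragmas, pass the
 * environment through, and append the script body read from stdin.
 */
class JobCommandFileSketch {
    static Path writeJobCommandFile(Path dir, String jobName,
                                    Map<String, String> properties,
                                    String scriptBody) throws IOException {
        List<String> lines = new ArrayList<>();
        lines.add("# @ job_name = " + jobName);
        for (Map.Entry<String, String> e : properties.entrySet())
            lines.add("# @ " + e.getKey() + " = " + e.getValue());
        lines.add("# @ environment = COPY_ALL");   // pass the current environment on
        lines.add("# @ queue");
        lines.add(scriptBody);                     // the redirected stdin becomes the body
        Path cmdFile = dir.resolve(jobName + ".cmd");
        Files.write(cmdFile, lines);
        // The real wrapper would now call llsubmit with this file, the
        // trigger script path, and the callback properties file, then echo
        // the returned job step id to stdout for PWE.
        return cmdFile;
    }
}
```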

...

4.2 PWE Notification Agent [= LLUserExitTrigger]

This is the Java trigger called by LoadLeveler to report edge events ("User Exit"). It is implemented as another RCP headless application. Though designed specifically for this purpose, we have nonetheless made the LoadLeveler implementation a concrete instance of an abstract class, PweNotificationAgent.
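The relationship between the abstract class and the LoadLeveler trigger might look like the following sketch. Method names and the logging stand-in are invented; the real classes are RCP headless applications with an actual transport back to PWE.

```java
/**
 * Hypothetical sketch of the pattern described above: an abstract
 * notification agent whose concrete subclass handles LoadLeveler's
 * "User Exit" edge events.
 */
abstract class PweNotificationAgentSketch {
    /** Deliver a job-state ("edge") event back to PWE; transport-specific. */
    abstract void notifyPwe(String jobStepId, String state);

    /** Shared entry point: a DRM callback is forwarded to PWE. */
    final void handleCallback(String jobStepId, String state) {
        notifyPwe(jobStepId, state);
    }
}

class LLUserExitTriggerSketch extends PweNotificationAgentSketch {
    final StringBuilder sent = new StringBuilder();  // stand-in for the real transport

    @Override
    void notifyPwe(String jobStepId, String state) {
        sent.append(jobStepId).append(' ').append(state);
    }
}
```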

...