Comment: Migrated to Confluence 5.3

Siege-PWE-ELF 3.0

See also Workflow Descriptor.

What's in 3.0?

  • Refactorings & additional features to accommodate:
    1. Running on IBM systems (AIX, LoadLeveler)
    2. Basic "scheduling" across systems, including:
      1. Rudimentary "match-making"
      2. Programmatic reservation requesting
        • hard: explicit start times are given to the scheduler; the entire workflow is scheduled "up-front" and must complete by the designated time (a.k.a. "on demand")
        • soft: largely an IBM LoadLeveler (LL) feature (a.k.a. "flexible"); the request acts like a normal job in the queue, but the reserved resources are bound to an id, not to a specific job
        • NB: in all cases other than hard reservations, workflow nodes are scheduled lazily as they become ready to run ("just-in-time" scheduling); on non-IBM platforms, we usually bypass programmatic reservation requests
  • Bug fixes and improvements
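
The difference between lazy ("just-in-time") scheduling and hard, up-front scheduling can be illustrated with a minimal Python sketch. This is not Siege code: the function names, the dependency-map representation of the workflow graph, and the per-node wall times are all hypothetical, and the graph is assumed to be acyclic.

```python
from typing import Dict, List, Set


def schedule_lazily(deps: Dict[str, Set[str]]) -> List[List[str]]:
    """Group nodes into waves; a node is submitted only once all of its
    dependencies have finished (the "just-in-time" case)."""
    done: Set[str] = set()
    pending = set(deps)
    waves: List[List[str]] = []
    while pending:
        ready = sorted(n for n in pending if deps[n] <= done)
        if not ready:
            raise ValueError("cycle or unknown dependency in workflow graph")
        waves.append(ready)               # a real system would submit these now
        done.update(ready)                # sketch: assume the wave completed
        pending.difference_update(ready)
    return waves


def plan_up_front(deps: Dict[str, Set[str]],
                  wall_time: Dict[str, int],
                  t0: int = 0) -> Dict[str, int]:
    """Fix a start time for every node before anything runs (the "hard"
    case): a node starts when its slowest dependency is due to finish."""
    start: Dict[str, int] = {}

    def start_of(node: str) -> int:
        if node not in start:
            start[node] = max(
                (start_of(d) + wall_time[d] for d in deps[node]), default=t0
            )
        return start[node]

    for node in deps:
        start_of(node)
    return start


# Hypothetical four-node workflow: prep -> (simA, simB) -> merge
deps = {"prep": set(), "simA": {"prep"}, "simB": {"prep"},
        "merge": {"simA", "simB"}}
wall_time = {"prep": 10, "simA": 30, "simB": 20, "merge": 5}

print(schedule_lazily(deps))
print(plan_up_front(deps, wall_time))
```

In the lazy case nothing is requested for a node until it is ready to run. In the hard case every start time is known before the first node runs, which is what allows the whole workflow to be reserved against a deadline.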

...

SIEGE (1): <scheduling> properties

These properties can be set in a profile used for <scheduling> in the workflow description. See also Job XML Schema.

There is essentially only one required property for all workflows; default is interactive ...

...

    <property name="maxWallTimePerMember" type="long" />
    <property name="minWallTimePerMember" type="long" />

These default to 'std[].log' in the initial directory:

    <property name="stdout" type="string" />
    <property name="stderr" type="string" />

...

<execution>
    <profile name="paths">
      <property xmlns:ncsa.updateable.id="paths-${HOST_KEY}" name="paths-${HOST_KEY}" category="platform.configuration"/>
    </profile>
</execution>

...

  1. The algorithm attribute defines the method used to order potential matching target resources, that is, the sequence in which they will be tried. There are currently two algorithms: one orders the target names randomly; the other ("static-load", the default) contacts the machine to obtain an approximate "load" number for the system.
  2. Including this element indicates that the workflow should be treated as "on-demand" (hard, time-based reservations determined up front for the entire graph).
  3. This element establishes a set of rules to apply, in order, to the resource request issued to each potentially matching target machine when the original request fails. Currently, there are three available modifiers: starttime, cpus, and walltime. Rules are separated by semicolons, and the clauses of a rule by commas; the predicate stands for a percentage alteration of the original value or, in the case of starttime, an increment. Thus the rules tell the scheduler to try four times: first with the original request; then by pushing the start time forward; then by halving the number of cpus and doubling the wall time; and finally by taking one-fourth of the cpus and increasing the wall time fourfold.
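
The retry sequence described in item 3 can be sketched in Python. This is not the Siege implementation and rests on several assumptions: the `name=value` clause syntax, the example rule string, the `Request` fields, and their units are hypothetical. Only the semicolon and comma separators, the three modifier names, and the percentage/increment semantics come from the description above. The sketch also assumes each rule is applied to the original request rather than to the previous attempt.

```python
from dataclasses import dataclass, replace
from typing import Dict, List


@dataclass(frozen=True)
class Request:
    start_time: int  # hypothetical unit: minutes from now
    cpus: int
    wall_time: int   # hypothetical unit: minutes


def parse_rules(spec: str) -> List[Dict[str, int]]:
    """Split on ';' into rules and on ',' into clauses.
    The 'modifier=predicate' clause form is an assumption."""
    rules = []
    for rule in spec.split(";"):
        if not rule.strip():
            continue
        clauses = {}
        for clause in rule.split(","):
            name, value = clause.split("=")
            clauses[name.strip()] = int(value)
        rules.append(clauses)
    return rules


def apply_rule(original: Request, clauses: Dict[str, int]) -> Request:
    """Apply one rule to the ORIGINAL request: starttime is an increment,
    cpus and walltime are percentages of the original value."""
    changes = {}
    for name, value in clauses.items():
        if name == "starttime":
            changes["start_time"] = original.start_time + value
        elif name == "cpus":
            changes["cpus"] = max(1, original.cpus * value // 100)
        elif name == "walltime":
            changes["wall_time"] = original.wall_time * value // 100
        else:
            raise ValueError(f"unknown modifier: {name}")
    return replace(original, **changes)


def attempts(original: Request, spec: str) -> List[Request]:
    """The original request first, then one modified request per rule."""
    return [original] + [apply_rule(original, r) for r in parse_rules(spec)]


# Hypothetical rule string matching the four tries described above.
spec = "starttime=30;cpus=50,walltime=200;cpus=25,walltime=400"
for req in attempts(Request(start_time=0, cpus=64, wall_time=60), spec):
    print(req)
```

With three rules the scheduler makes four tries per target machine. Whether a later rule inherits the pushed-back start time from an earlier one is not stated in the description; the sketch assumes it does not.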

...

SIEGE (4): <global-resource>

...

NOTE the same semantics apply to individual execute nodes:

       <execute name="setR" type="remote">
           <resource>cobalt.ncsa.uiuc.edu,tg-login.ncsa.teragrid.org</resource>
       </execute>

...