Ensemble Broker Documentation

Overview

The EnsembleBroker is a metaworkflow service designed to manage thousands of user-submitted "nodes".

Documentation

View schematics images.

Examples

View example files.


Towards a revision of the Broker service.

The following are some thoughts concerning what will need to be done for the next version.

(1) Modify the structure of the descriptor.

This should involve the following:

  • Eliminate the top-level "ensemble", so that spawning an ensemble becomes a particular kind of workflow node.
  • Add pre- and post-conditions to each node. These could be file-existence or property-definition checks, and would be verified through the metadata system. Pre-condition property values can be retrieved and set on the node; post-condition property values and file locations would be registered with the metadata system.
  • The node-to-node "dependency" (edge) would carry these conditions as attributes.
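
As a rough illustration of the last two points, the sketch below models conditions as attributes of a dependency edge and checks them against a stand-in for the metadata system. Every name here (MetadataStore, Condition, FileExists, PropertyDefined, Dependency) is hypothetical and chosen only for this example; none of them is taken from the existing Broker code.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for the metadata system: it only records
// registered file locations and property values.
class MetadataStore {
    final Map<String, String> properties = new HashMap<>();
    final List<String> files = new ArrayList<>();

    boolean hasFile(String path) { return files.contains(path); }
    String property(String name) { return properties.get(name); }
}

// A condition is either a file-existence or a property-definition check.
interface Condition {
    boolean holds(MetadataStore store);
}

class FileExists implements Condition {
    final String path;
    FileExists(String path) { this.path = path; }
    public boolean holds(MetadataStore store) { return store.hasFile(path); }
}

class PropertyDefined implements Condition {
    final String name;
    PropertyDefined(String name) { this.name = name; }
    public boolean holds(MetadataStore store) { return store.property(name) != null; }
}

// A node-to-node dependency (edge) that carries the conditions as attributes.
class Dependency {
    final String from;
    final String to;
    final List<Condition> conditions = new ArrayList<>();

    Dependency(String from, String to) { this.from = from; this.to = to; }

    // The downstream node may run only when every condition holds.
    boolean satisfied(MetadataStore store) {
        for (Condition c : conditions) {
            if (!c.holds(store)) return false;
        }
        return true;
    }
}
```

With this shape, the post-conditions of the upstream node and the pre-conditions of the downstream node can be expressed as the same kind of object, which is one possible reason to attach them to the edge rather than to either node.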

(2) Modify the synchronization system.

Rather than synchronizing en bloc on the user, follow the model adopted for the execution service.
Synchronization on state needs to be handled differently. All updates on objects should (1) reload the object from the database and (2) return the new object with the newer state. The Hibernate class needs to maintain three semaphores, one each for ensembles, workflows and nodes, which block modification of the given object by concurrent users. These semaphores need to replace the class-level locks on the update methods.
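
A minimal sketch of this update pattern follows, assuming one binary semaphore per object kind and using an in-memory map in place of the database. The names Kind, StoredObject and SynchronizedStore are hypothetical, and no Hibernate API is used; the point is only the acquire, reload, modify, return sequence.

```java
import java.util.EnumMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.Semaphore;
import java.util.function.UnaryOperator;

enum Kind { ENSEMBLE, WORKFLOW, NODE }

// Hypothetical immutable snapshot of a persisted object and its state.
class StoredObject {
    final Kind kind;
    final long id;
    final String state;

    StoredObject(Kind kind, long id, String state) {
        this.kind = kind;
        this.id = id;
        this.state = state;
    }
}

class SynchronizedStore {
    // One binary semaphore per kind, replacing a single class-level lock.
    private final Map<Kind, Semaphore> guards = new EnumMap<>(Kind.class);
    // In-memory stand-in for the database (assumption for this sketch).
    private final Map<String, StoredObject> database = new ConcurrentHashMap<>();

    SynchronizedStore() {
        for (Kind k : Kind.values()) guards.put(k, new Semaphore(1));
    }

    private static String key(Kind kind, long id) { return kind + ":" + id; }

    void save(StoredObject o) { database.put(key(o.kind, o.id), o); }

    // Applies a change while holding the semaphore for the object's kind.
    StoredObject update(Kind kind, long id, UnaryOperator<StoredObject> change) {
        Semaphore guard = guards.get(kind);
        guard.acquireUninterruptibly();
        try {
            // (1) Reload the object so the change is applied to current state.
            StoredObject fresh = database.get(key(kind, id));
            if (fresh == null) {
                throw new IllegalArgumentException("unknown object " + key(kind, id));
            }
            StoredObject updated = change.apply(fresh);
            database.put(key(kind, id), updated);
            // (2) Return the new object carrying the newer state.
            return updated;
        } finally {
            guard.release();
        }
    }
}
```

Because an update to a node no longer waits on updates to workflows or ensembles, contention should be lower than with a single lock on the update methods. A finer-grained variant with one semaphore per object id would be a further possible step, but it is not what the note above describes.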

(3) Implement a full clean-up system which also evicts users who have no active workflows from the in-memory authentication repository. Since we will be using Tupelo (?) to store events, etc., there is no longer a need to clean those up.
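
The eviction step could look roughly like the sketch below. AuthRepository and its methods are hypothetical names for the in-memory authentication repository, and the set of users with active workflows is assumed to be supplied by whatever component tracks workflow state.

```java
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical in-memory authentication repository keyed by user name.
class AuthRepository {
    private final Map<String, String> credentialsByUser = new ConcurrentHashMap<>();

    void login(String user, String credential) {
        credentialsByUser.put(user, credential);
    }

    boolean contains(String user) {
        return credentialsByUser.containsKey(user);
    }

    // Drops every cached user who does not appear in the active set.
    void evictIdle(Set<String> usersWithActiveWorkflows) {
        credentialsByUser.keySet().retainAll(usersWithActiveWorkflows);
    }
}
```

A periodic clean-up task would call evictIdle with the current set of users who still own active workflows.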

(4) Ready Cache logic: we need to refactor the Ready Cache code that handles the NodeAttempted logic. Two things should be kept in mind. First, should the number of previous attempts to schedule a node influence when we next try? Second, how do we account for priority: should we retry higher-priority ensembles more often? Currently there is only one global retry timeout setting for the entire service.
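
One possible answer to both questions is sketched below: the delay doubles with each previous attempt up to a cap, and is then divided by the priority so that higher-priority ensembles are retried sooner. This is only an illustration of the trade-off, not a decided design. RetryPolicy, its parameters, and the convention that priority 1 is the lowest are all assumptions; the base delay plays the role of the current global retry timeout.

```java
// Hypothetical retry policy combining attempt count and priority.
class RetryPolicy {
    private final long baseDelayMillis;
    private final long maxDelayMillis;

    RetryPolicy(long baseDelayMillis, long maxDelayMillis) {
        if (baseDelayMillis < 1 || maxDelayMillis < 1) {
            throw new IllegalArgumentException("delays must be positive");
        }
        this.baseDelayMillis = baseDelayMillis;
        this.maxDelayMillis = maxDelayMillis;
    }

    // attempts: number of previous scheduling attempts for the node (>= 0).
    // priority: 1 is the lowest; larger values shorten the delay.
    long nextDelayMillis(int attempts, int priority) {
        if (attempts < 0 || priority < 1) {
            throw new IllegalArgumentException("attempts >= 0 and priority >= 1 required");
        }
        long backoff = Math.min(baseDelayMillis, maxDelayMillis);
        // Double once per previous attempt, stopping at the cap.
        // The comparison against max/2 avoids arithmetic overflow.
        for (int i = 0; i < attempts && backoff < maxDelayMillis; i++) {
            backoff = (backoff <= maxDelayMillis / 2) ? backoff * 2 : maxDelayMillis;
        }
        // Higher priority divides the delay; never return less than 1 ms.
        return Math.max(1L, backoff / priority);
    }
}
```

For example, with a base delay of 1000 ms and a cap of 60000 ms, a node with three previous attempts waits 8000 ms at priority 1 and 2000 ms at priority 4.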

