funcX provides functions as a service on heterogeneous compute environments. Starting a funcX endpoint on Delta allows researchers to execute their analysis code by writing a Python function and submitting it through the Python executor pattern. The funcX endpoint currently has to run as a user on Delta. The endpoint manages Slurm jobs that accept Python function task requests, execute them in a correctly configured environment, and return results to the caller. The endpoint configuration hides the job settings from researchers so they can focus on their task.
Step-by-step guide
The endpoint is a lightweight process that runs as a user, typically on a login node.
The steps we will follow are:
- Log into a login node
- Set up a Python environment to run the endpoint
- Install funcx_endpoint package
- Obtain Globus Auth credentials for the endpoint
- Configure the endpoint
- Start the endpoint
- Compute!
module load modtree/cpu
module load gcc anaconda3_cpu
conda create --name funcx python=3.9
conda activate funcx
pip install funcx-endpoint
funcx-endpoint configure delta
from parsl.addresses import address_by_query
from parsl.launchers import SrunLauncher
from parsl.providers import SlurmProvider

from funcx_endpoint.endpoint.utils.config import Config
from funcx_endpoint.executors import HighThroughputExecutor

user_opts = {
    'delta': {
        'worker_init': 'bash; conda activate funcx',
        'scheduler_options': '#SBATCH --account=bbmi-delta-cpu',
    }
}

config = Config(
    executors=[
        HighThroughputExecutor(
            max_workers_per_node=2,
            worker_debug=False,
            address=address_by_query(),
            provider=SlurmProvider(
                partition='cpu',
                launcher=SrunLauncher(),

                # string to prepend to #SBATCH blocks in the submit
                # script to the scheduler eg: '#SBATCH --constraint=knl,quad,cache'
                scheduler_options=user_opts['delta']['scheduler_options'],

                # Command to be run before starting a worker, such as:
                # 'module load Anaconda; source activate parsl_env'.
                worker_init=user_opts['delta']['worker_init'],

                # Scale between 0-1 blocks with 2 nodes per block
                nodes_per_block=2,
                init_blocks=0,
                min_blocks=0,
                max_blocks=1,

                # Hold blocks for 30 minutes
                walltime='00:30:00'
            ),
        )
    ],
)
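The scaling settings in the configuration above determine how much work the endpoint can run at once. The short sketch below works through that arithmetic using the values from the configuration. It assumes that each block corresponds to one Slurm job and that every node in a block runs max_workers_per_node workers. The helper names max_concurrent_workers and walltime_minutes are illustrative and are not part of funcX or Parsl.

```python
# Values copied from the endpoint configuration above.
nodes_per_block = 2
max_workers_per_node = 2
max_blocks = 1
walltime = "00:30:00"


def max_concurrent_workers(blocks: int, nodes: int, workers_per_node: int) -> int:
    """Upper bound on workers running at the same time (illustrative helper)."""
    return blocks * nodes * workers_per_node


def walltime_minutes(hhmmss: str) -> int:
    """Convert an HH:MM:SS walltime string to whole minutes (illustrative helper)."""
    hours, minutes, seconds = (int(part) for part in hhmmss.split(":"))
    return hours * 60 + minutes + seconds // 60


# 1 block x 2 nodes x 2 workers per node
print(max_concurrent_workers(max_blocks, nodes_per_block, max_workers_per_node))  # 4
# Each block is held for at most this many minutes
print(walltime_minutes(walltime))  # 30
```

Under these assumptions, at most four function invocations run concurrently, and with min_blocks=0 the endpoint holds no nodes while it is idle.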
funcx-endpoint start delta
The first time you start this endpoint, it will need to collect your Globus credentials. A link will be printed. Paste this link into a browser, log in with one of the identities supported by Globus, and paste the resulting token into the terminal. This is only needed the first time you start the endpoint.
Make a note of the UUID for your new endpoint. You must provide this in your funcX code to direct task invocations to your endpoint.
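One way to keep the endpoint ID out of your analysis scripts is to store it in an environment variable and check it before use. The sketch below uses only the Python standard library. The variable name DELTA_FUNCX_ENDPOINT and the helper load_endpoint_id are hypothetical choices for illustration, not something funcX defines.

```python
import os
import uuid


def load_endpoint_id(env_var: str = "DELTA_FUNCX_ENDPOINT") -> str:
    """Read an endpoint ID from the environment and check it is a valid UUID.

    The environment variable name is a hypothetical convention.
    """
    raw = os.environ.get(env_var, "").strip()
    if not raw:
        raise RuntimeError(f"Set {env_var} to the UUID of your endpoint")
    try:
        # uuid.UUID raises ValueError for malformed input
        return str(uuid.UUID(raw))
    except ValueError as exc:
        raise RuntimeError(f"{env_var} is not a valid UUID: {raw!r}") from exc
```

A script could then call load_endpoint_id() and pass the result as the endpoint_id argument, which catches a mistyped or missing ID before any task is submitted.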
Example Using FuncX
Here's a simple example that shows how to invoke a function on your endpoint using funcX. The script will appear to hang while it waits for the Slurm job to be submitted and to start running. You can watch the progress of your workers in your Delta terminal with the squeue command.
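from funcx import FuncXExecutor


def double(x):
    return x * 2


# Replace with the UUID printed when you started the endpoint.
# Double quotes are used because the placeholder contains an apostrophe.
delta_endpoint = "<< your endpoint's ID >>"

with FuncXExecutor(endpoint_id=delta_endpoint) as fxe:
    fut = fxe.submit(double, 7)
    print("Result of doubling:", fut.result())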
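Because the example above follows the Python executor pattern, it can be useful to try the same submit-and-collect logic locally before waiting on a Slurm job. The sketch below uses the standard library's ThreadPoolExecutor as a local stand-in. It assumes that futures returned by FuncXExecutor behave like standard concurrent.futures futures. The helper run_batch is illustrative only.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def double(x):
    return x * 2


def run_batch(executor, inputs):
    """Submit double() for each input and collect results as they finish."""
    futures = {executor.submit(double, value): value for value in inputs}
    return {futures[fut]: fut.result() for fut in as_completed(futures)}


# Local stand-in for the remote executor; no endpoint is contacted here.
with ThreadPoolExecutor(max_workers=2) as local:
    results = run_batch(local, [1, 2, 3])

print(sorted(results.items()))  # [(1, 2), (2, 4), (3, 6)]
```

Once the function works locally, the same run_batch call could be tried with a FuncXExecutor in place of the thread pool, subject to the assumption above.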
1 Comment
Dan Katz
Benjamin Galewsky - can we update this since funcX is now Globus Compute