...

The HAL Slurm Wrapper Suite is designed to help users run jobs on the HAL system easily and efficiently. The current version is "swsuite-v0.34", which includes:

srun (slurm command) → swrun : request resources to run interactive jobs.

...

squeue (slurm command) → swqueue : check current running jobs and computational resource status.

Rule of Thumb

Minimize the required input options.
Stay consistent with the original "slurm" run-script format.
Submit each job to a suitable partition based on the number of GPUs needed (or the number of nodes, for CPU partitions).
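
The partition rule above can be illustrated with a small sketch. The helper names pick_gpu_partition and pick_cpu_partition are hypothetical and are not part of swsuite; they only assume the partition names listed on this page (gpux1 through gpux16 and cpun1 through cpun8) and do not reflect how the wrapper is actually implemented.

```bash
#!/bin/bash
# Hypothetical helpers (not part of swsuite): map a resource count to one of
# the partition names listed on this page.

# Map a GPU count to a GPU partition name.
pick_gpu_partition() {
    local gpus="$1"
    case "$gpus" in
        1|2|3|4|8|12|16) echo "gpux${gpus}" ;;
        *) echo "unsupported GPU count: ${gpus}" >&2; return 1 ;;
    esac
}

# Map a node count to a CPU partition name.
pick_cpu_partition() {
    local nodes="$1"
    case "$nodes" in
        1|2|4|8) echo "cpun${nodes}" ;;
        *) echo "unsupported node count: ${nodes}" >&2; return 1 ;;
    esac
}

pick_gpu_partition 4   # prints gpux4
pick_cpu_partition 2   # prints cpun2
```

Counts that do not match a listed partition (for example 5 GPUs) are rejected rather than rounded up, since this page does not say how such requests should be handled.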

...

swrun -p <partition_name> -c <cpu_per_gpu> -t <walltime> -r <reservation_name>
- <partition_name> (required) : cpun1, cpun2, cpun4, cpun8, gpux1, gpux2, gpux3, gpux4, gpux8, gpux12, gpux16.
- <cpu_per_gpu> (optional) : 12 cpus (default), range from 12 cpus to 36 cpus.
- <walltime> (optional) : 24 hours (default), range from 1 hour to 72 hours.
- <reservation_name> (optional) : reservation name granted to user.
- example: swrun -p gpux4 -c 36 -t 72 (request a full node: 1x node, 4x gpus, 144x cpus, 72x hours)
swbatch <run_script>
- <run_script> (required) : same as original slurm batch.
- <job_name> (optional) : job name.
- <output_file> (optional) : output file name.
- <error_file> (optional) : error file name.
- <partition_name> (required) : cpun1, cpun2, cpun4, cpun8, gpux1, gpux2, gpux3, gpux4, gpux8, gpux12, gpux16.
- <cpu_per_gpu> (optional) : 12 cpus (default), range from 12 cpus to 36 cpus.
- <walltime> (optional) : 24 hours (default), range from 1 hour to 72 hours.
- <reservation_name> (optional) : reservation name granted to user.
- example: swbatch demo.swb
  demo.swb:
    #!/bin/bash
    #SBATCH --job-name="demo"
    #SBATCH --output="demo.%j.%N.out"
    #SBATCH --error="demo.%j.%N.err"
    #SBATCH --partition=gpux1
    srun hostname
swqueue
- example: swqueue
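
As a rough illustration of how the swrun options combine, the sketch below checks a request against the ranges listed above (12 to 36 CPUs per GPU, 1 to 72 hours) and prints the total resources it implies. The function swrun_summary is hypothetical and is not part of swsuite. It also assumes 4 GPUs per node, which is inferred from the multi-node partitions (gpux8 on 2 nodes, gpux16 on 4 nodes) rather than stated as a rule.

```bash
#!/bin/bash
# Hypothetical helper (not part of swsuite): validate a GPU request and
# summarize the resources it would cover.
swrun_summary() {
    local partition="$1"
    local cpu_per_gpu="${2:-12}"   # default from this page: 12 cpus
    local walltime="${3:-24}"      # default from this page: 24 hours
    local gpus

    case "$partition" in
        gpux1|gpux2|gpux3|gpux4|gpux8|gpux12|gpux16) gpus="${partition#gpux}" ;;
        *) echo "not a listed GPU partition: ${partition}" >&2; return 1 ;;
    esac

    if (( cpu_per_gpu < 12 || cpu_per_gpu > 36 )); then
        echo "cpu_per_gpu must be between 12 and 36" >&2; return 1
    fi
    if (( walltime < 1 || walltime > 72 )); then
        echo "walltime must be between 1 and 72 hours" >&2; return 1
    fi

    # Assumption: 4 GPUs per node, so round the GPU count up to whole nodes.
    local nodes=$(( (gpus + 3) / 4 ))
    echo "${nodes}x node, ${gpus}x gpus, $(( gpus * cpu_per_gpu ))x cpus, ${walltime}x hours"
}

swrun_summary gpux4 36 72   # prints: 1x node, 4x gpus, 144x cpus, 72x hours
```

The call at the end mirrors the full-node swrun example (gpux4, 36 CPUs per GPU, 72 hours): 4 GPUs at 36 CPUs each gives 144 CPUs on one node.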

...

Partition Name	Priority	Max Walltime	Nodes Allowed	Min-Max CPUs Per Node Allowed	Min-Max Mem Per Node Allowed	GPU Allowed	Local Scratch	Description
gpu-debug	high	4 hrs	1	12-144	18-144 GB	4	none	designed to access 1 node to run debug jobs.
gpux1	normal	72 hrs	1	12-36	18-54 GB	1	none	designed to access 1 GPU on 1 node to run sequential and/or parallel jobs.
gpux2	normal	72 hrs	1	24-72	36-108 GB	2	none	designed to access 2 GPUs on 1 node to run sequential and/or parallel jobs.
gpux3	normal	72 hrs	1	36-108	54-162 GB	3	none	designed to access 3 GPUs on 1 node to run sequential and/or parallel jobs.
gpux4	normal	72 hrs	1	48-144	72-216 GB	4	none	designed to access 4 GPUs on 1 node to run sequential and/or parallel jobs.
gpux8	normal	72 hrs	2	48-144	72-216 GB	8	none	designed to access 8 GPUs on 2 nodes to run sequential and/or parallel jobs.
gpux12	normal	72 hrs	3	48-144	72-216 GB	12	none	designed to access 12 GPUs on 3 nodes to run sequential and/or parallel jobs.
gpux16	normal	72 hrs	4	48-144	72-216 GB	16	none	designed to access 16 GPUs on 4 nodes to run sequential and/or parallel jobs.
cpu_mini	normal	72 hrs	1	8-8
cpun1	normal	72 hrs	1-16	96-96	144-144 GB	0	none	designed to access 96 CPUs on 1-16 nodes to run sequential and/or parallel jobs.
cpun2	normal
cpun4	normal
cpun8	normal
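
In the gpux2 through gpux4 and cpun1 rows, the memory limits track the CPU limits at 1.5 GB per CPU (for example 24 CPUs with 36 GB, 144 CPUs with 216 GB, 96 CPUs with 144 GB). The sketch below reproduces that arithmetic. The ratio is an observation from the table, not a documented policy, and the gpu-debug row does not follow it; the function name mem_gb_for_cpus is hypothetical.

```bash
#!/bin/bash
# Hypothetical helper: memory (in GB) implied by a CPU count, assuming the
# 1.5 GB-per-CPU ratio observed in the partition table.
mem_gb_for_cpus() {
    local cpus="$1"
    local thirds=$(( cpus * 3 ))   # equals 2 x (1.5 x cpus), keeps math in integers
    if (( thirds % 2 == 0 )); then
        echo "$(( thirds / 2 ))"
    else
        echo "$(( thirds / 2 )).5"
    fi
}

# Print the implied memory for a few CPU counts that appear in the table.
for cpus in 24 72 48 144 96; do
    printf '%s CPUs -> %s GB\n' "$cpus" "$(mem_gb_for_cpus "$cpus")"
done
```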

HAL Wrapper Suite Example Job Scripts

...
