...
The HAL Slurm Wrapper Suite helps users run jobs on the HAL system easily and efficiently. The current version, "swsuite-v0.34", includes:
srun (slurm command) → swrun : request resources to run interactive jobs.
...
squeue (slurm command) → swqueue : check currently running jobs and computational resource status across all compute nodes.
Rules of Thumb
- Minimize the required input options.
- Stay consistent with the original Slurm run-script format.
- Submit the job to a suitable partition based on the number of GPUs requested (or the number of nodes, for CPU partitions).
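The GPU-count rule above can be sketched as a small shell function. This is illustrative only; `select_partition` is not part of the wrapper suite. It picks the smallest `gpux` partition that covers the requested GPU count:

```shell
#!/bin/bash
# Hypothetical sketch of the partition-selection rule: choose the
# smallest gpux partition large enough for the requested GPU count.
# Partition names are from this document; the function itself is illustrative.
select_partition() {
  local gpus=$1
  for p in 1 2 3 4 8 12 16; do
    if [ "$gpus" -le "$p" ]; then
      echo "gpux${p}"
      return 0
    fi
  done
  echo "error: at most 16 GPUs supported" >&2
  return 1
}

select_partition 4   # prints gpux4
select_partition 6   # prints gpux8 (the next partition that fits)
```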
...
- swrun -p <partition_name> -c <cpu_per_gpu> -t <walltime> -r <reservation_name>
- <partition_name> (required) : cpun1, cpun2, cpun4, cpun8, gpux1, gpux2, gpux3, gpux4, gpux8, gpux12, gpux16.
- <cpu_per_gpu> (optional) : 12 cpus (default), range from 12 cpus to 36 cpus.
- <walltime> (optional) : 24 hours (default), range from 1 hour to 72 hours.
- <reservation_name> (optional) : reservation name granted to user.
- example: swrun -p gpux4 -c 36 -t 72 (request a full GPU node: 1 node, 4 GPUs, 144 CPUs, 72 hours)
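A rough sketch of how a wrapper like `swrun` could expand its options into an underlying `srun` command line, assuming the defaults listed above (12 CPUs per GPU, 24-hour walltime). The function `build_srun_cmd` and the exact flag mapping are assumptions for illustration, not the actual swrun source:

```shell
#!/bin/bash
# Illustrative sketch (not the actual swrun implementation): expand
# wrapper options into an srun command line. Defaults mirror the
# option list above: 12 CPUs per GPU, 24-hour walltime.
build_srun_cmd() {
  local partition=$1
  local cpus_per_gpu=${2:-12}   # default 12, range 12-36
  local hours=${3:-24}          # default 24 h, range 1-72 h
  echo "srun --partition=${partition}" \
       "--cpus-per-gpu=${cpus_per_gpu}" \
       "--time=${hours}:00:00 --pty bash"
}

build_srun_cmd gpux4 36 72
# → srun --partition=gpux4 --cpus-per-gpu=36 --time=72:00:00 --pty bash
```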
- swbatch <run_script>
- <run_script> (required) : same as original slurm batch.
- <job_name> (optional) : job name.
- <output_file> (optional) : output file name.
- <error_file> (optional) : error file name.
- <partition_name> (required) : cpun1, cpun2, cpun4, cpun8, gpux1, gpux2, gpux3, gpux4, gpux8, gpux12, gpux16.
- <cpu_per_gpu> (optional) : 12 cpus (default), range from 12 cpus to 36 cpus.
- <walltime> (optional) : 24 hours (default), range from 1 hour to 72 hours.
- <reservation_name> (optional) : reservation name granted to user.
example: swbatch demo.swb
demo.swb:

```bash
#!/bin/bash
#SBATCH --job-name="demo"
#SBATCH --output="demo.%j.%N.out"
#SBATCH --error="demo.%j.%N.err"
#SBATCH --partition=gpux1
srun hostname
```
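For comparison, a minimal sketch of a 4-GPU variant of the same script. Only the partition name (and the job/file names) change, since the wrapper derives the resource request from the partition:

```shell
#!/bin/bash
#SBATCH --job-name="demo4"
#SBATCH --output="demo4.%j.%N.out"
#SBATCH --error="demo4.%j.%N.err"
#SBATCH --partition=gpux4
srun hostname
```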
- swqueue
- example: swqueue
...
Partition Name | Priority | Max Walltime | Nodes Allowed | Min-Max CPUs Per Node Allowed | Min-Max Mem Per Node Allowed | GPU Allowed | Local Scratch | Description
---|---|---|---|---|---|---|---|---
gpu-debug | high | 4 hrs | 1 | 12-144 | 18-144 GB | 4 | none | designed to access 1 node to run a debug job.
gpux1 | normal | 72 hrs | 1 | 12-36 | 18-54 GB | 1 | none | designed to access 1 GPU on 1 node to run a sequential and/or parallel job.
gpux2 | normal | 72 hrs | 1 | 24-72 | 36-108 GB | 2 | none | designed to access 2 GPUs on 1 node to run a sequential and/or parallel job.
gpux3 | normal | 72 hrs | 1 | 36-108 | 54-162 GB | 3 | none | designed to access 3 GPUs on 1 node to run a sequential and/or parallel job.
gpux4 | normal | 72 hrs | 1 | 48-144 | 72-216 GB | 4 | none | designed to access 4 GPUs on 1 node to run a sequential and/or parallel job.
gpux8 | low | 72 hrs | 2 | 48-144 | 72-216 GB | 8 | none | designed to access 8 GPUs on 2 nodes to run a sequential and/or parallel job.
gpux12 | low | 72 hrs | 3 | 48-144 | 72-216 GB | 12 | none | designed to access 12 GPUs on 3 nodes to run a sequential and/or parallel job.
gpux16 | low | 72 hrs | 4 | 48-144 | 72-216 GB | 16 | none | designed to access 16 GPUs on 4 nodes to run a sequential and/or parallel job.
cpu_mini | normal | 72 hrs | 1 | 8-8 | | | |
cpun1 | normal | 72 hrs | 1-16 | 96-96 | 144-144 GB | 0 | none | designed to access 96 CPUs on 1-16 nodes to run a sequential and/or parallel job.
cpun2 | normal | | | | | | |
cpun4 | normal | | | | | | |
cpun8 | normal | | | | | | |
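The single-node GPU partitions in the table scale linearly with GPU count: each GPU brings 12-36 CPUs and 18-54 GB of memory. A small shell sketch of this pattern (the `resource_range` function is illustrative, not part of the wrapper suite):

```shell
#!/bin/bash
# Sketch of the per-GPU scaling visible in the partition table above
# (single-node gpux partitions): 12-36 CPUs and 18-54 GB per GPU.
# Function name is illustrative, not part of the suite.
resource_range() {
  local gpus=$1
  echo "cpus=$((12 * gpus))-$((36 * gpus)) mem=$((18 * gpus))-$((54 * gpus))GB"
}

resource_range 2   # gpux2 row: cpus=24-72 mem=36-108GB
resource_range 4   # gpux4 row: cpus=48-144 mem=72-216GB
```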
HAL Wrapper Suite Example Job Scripts
...