...

  • swrun -q <queue_name> -c <cpu_per_gpu> -t <walltime>
    • <queue_name> (required) : cpu, gpux1, gpux2, gpux3, gpux4, gpux8, gpux12, gpux16.
    • <cpu_per_gpu> (optional) : 12 cpus (default), range from 12 cpus to 36 cpus.
    • <walltime> (optional) : 24 hours (default), range from 1 hour to 72 hours.
    • example: swrun -q gpux4 -c 36 -t 72 (request a full node: 1x node, 4x GPUs, 144x CPUs, 72 hours)
  • swbatch <run_script>
    • <run_script> (required) : same as an original SLURM batch script.
    • <job_name> (required) : job name.
    • <output_file> (required) : output file name.
    • <error_file> (required) : error file name.
    • <queue_name> (required) : cpu, gpux1, gpux2, gpux3, gpux4, gpux8, gpux12, gpux16.
    • <cpu_per_gpu> (optional) : 12 cpus (default), range from 12 cpus to 36 cpus.
    • <walltime> (optional) : 24 hours (default), range from 1 hour to 72 hours.
    • example: swbatch demo.sb (see the run-script sketch below)
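
A minimal sketch of what a run script such as demo.sb might contain, assuming standard SLURM #SBATCH directives (the job name, output/error file names, and workload below are illustrative, not the contents of an actual HAL sample script):

Code Block
#!/bin/bash
#SBATCH --job-name="demo"          # <job_name>
#SBATCH --output="demo.%j.out"     # <output_file> (%j expands to the job id)
#SBATCH --error="demo.%j.err"      # <error_file>
#SBATCH --partition=gpux1          # <queue_name>
#SBATCH --time=24:00:00            # <walltime>, up to 72 hours

srun hostname                      # replace with the actual workload

Submitting it with "swbatch demo.sb" queues the job on the gpux1 partition with the defaults described above.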

New Job Queues

Partition Name | Priority | Max Walltime | Nodes Allowed | Min-Max CPUs Per Node Allowed | Min-Max Memory Per Node Allowed | GPUs Allowed | Local Scratch | Description
gpu-debug      | high     | 4 hrs        | 1             | 12-144                        | 18-144 GB                       | 4            | none          |
gpux1          | normal   | 72 hrs       | 1             | 12-36                         | 18-54 GB                        | 1            | none          |
gpux2          | normal   | 72 hrs       | 1             | 24-72                         | 36-108 GB                       | 2            | none          |
gpux3          | normal   | 72 hrs       | 1             | 36-108                        | 54-162 GB                       | 3            | none          |
gpux4          | normal   | 72 hrs       | 1             | 48-144                        | 72-216 GB                       | 4            | none          |
cpu            | normal   | 72 hrs       | 1             | 96-96                         | 144-144 GB                      | 0            | none          |
gpux8          | low      | 72 hrs       | 2             | 48-144                        | 72-216 GB                       | 8            | none          |
gpux12         | low      | 72 hrs       | 3             | 48-144                        | 72-216 GB                       | 12           | none          |
gpux16         | low      | 72 hrs       | 4             | 48-144                        | 72-216 GB                       | 16           | none          |
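
For example, using the swrun syntax above, a 2-GPU job on the gpux2 queue with 24 CPUs per GPU and a 48-hour walltime could be requested as follows (the specific values here are illustrative):

Code Block
swrun -q gpux2 -c 24 -t 48   # 1x node, 2x GPUs, 48x CPUs, 48 hours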

Native SLURM style

Submit Interactive Job with "srun"

...

Code Block
scancel [job_id] # cancel job with [job_id]

Job Queues

Partition Name | Priority | Max Walltime | Min-Max Nodes Allowed | Max CPUs Per Node | Max Memory Per CPU (GB) | Local Scratch (GB) | Description
debug          | high     | 4 hrs        | 1-1                   | 144               | 1.5                     | None               | designed to access 1 node to run a debug job
solo           | normal   | 72 hrs       | 1-1                   | 144               | 1.5                     | None               | designed to access 1 node to run sequential and/or parallel jobs
ssd            | normal   | 72 hrs       | 1-1                   | 144               | 1.5                     | 220                | similar to the solo partition with extra local scratch; limited to hal[01-04]
batch          | low      | 72 hrs       | 2-16                  | 144               | 1.5                     | None               | designed to access 2-16 nodes (up to 64 GPUs) to run parallel jobs
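
As an illustrative sketch of the native SLURM style (using standard srun options; the exact GRES specification and defaults on HAL may differ), an interactive session on the debug partition could be requested with:

Code Block
srun --partition=debug --nodes=1 --ntasks=12 --gres=gpu:1 --time=4:00:00 --pty bash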

HAL Example Job Scripts

New users should review the example job scripts in "/opt/apps/samples-runscript" and request resources appropriate to their jobs.

Script Name                             | Job Type    | Partition | Max Walltime | Nodes | CPUs | GPUs | Memory (GB) | Description
run_debug_00gpu_96cpu_216GB.sh          | interactive | debug     | 4:00:00      | 1     | 96   | 0    | 144         | submit interactive job, 1 full node for 4 hours, CPU-only task in "debug" partition
run_debug_01gpu_12cpu_18GB.sh           | interactive | debug     | 4:00:00      | 1     | 12   | 1    | 18          | submit interactive job, 25% of 1 full node for 4 hours, GPU task in "debug" partition
run_debug_02gpu_24cpu_36GB.sh           | interactive | debug     | 4:00:00      | 1     | 24   | 2    | 36          | submit interactive job, 50% of 1 full node for 4 hours, GPU task in "debug" partition
run_debug_04gpu_48cpu_72GB.sh           | interactive | debug     | 4:00:00      | 1     | 48   | 4    | 72          | submit interactive job, 1 full node for 4 hours, GPU task in "debug" partition
sub_solo_01node_01gpu_12cpu_18GB.sb     | sbatch      | solo      | 72:00:00     | 1     | 12   | 1    | 18          | submit batch job, 25% of 1 full node for 72 hours, GPU task in "solo" partition
sub_solo_01node_02gpu_24cpu_36GB.sb     | sbatch      | solo      | 72:00:00     | 1     | 24   | 2    | 36          | submit batch job, 50% of 1 full node for 72 hours, GPU task in "solo" partition
sub_solo_01node_04gpu_48cpu_72GB.sb     | sbatch      | solo      | 72:00:00     | 1     | 48   | 4    | 72          | submit batch job, 1 full node for 72 hours, GPU task in "solo" partition
sub_ssd_01node_01gpu_12cpu_18GB.sb      | sbatch      | ssd       | 72:00:00     | 1     | 12   | 1    | 18          | submit batch job, 25% of 1 full node for 72 hours, GPU task in "ssd" partition
sub_batch_02node_08gpu_96cpu_144GB.sb   | sbatch      | batch     | 72:00:00     | 2     | 96   | 8    | 144         | submit batch job, 2 full nodes for 72 hours, GPU task in "batch" partition
sub_batch_16node_64gpu_768cpu_1152GB.sb | sbatch      | batch     | 72:00:00     | 16    | 768  | 64   | 1152        | submit batch job, 16 full nodes for 72 hours, GPU task in "batch" partition
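
As a hedged sketch of what one of these sample scripts requests (reconstructed from the table above with standard SLURM directives; this is not the verbatim contents of the file under /opt/apps/samples-runscript), sub_solo_01node_01gpu_12cpu_18GB.sb corresponds roughly to:

Code Block
#!/bin/bash
#SBATCH --job-name="solo_1gpu"     # illustrative job name
#SBATCH --partition=solo
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=12       # 12 CPUs (25% of a node)
#SBATCH --gres=gpu:1               # 1 GPU
#SBATCH --mem=18G                  # 18 GB of memory
#SBATCH --time=72:00:00
#SBATCH --output="%x.%j.out"
#SBATCH --error="%x.%j.err"

srun python train.py               # illustrative workload; replace with the actual command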

PBS style

Some PBS commands are supported by SLURM.
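
Assuming SLURM's Torque/PBS compatibility wrappers are installed on the system, the common mappings look roughly like this (availability of each wrapper on HAL should be checked on the cluster itself):

Code Block
qsub run_script.sb   # roughly equivalent to: sbatch run_script.sb
qstat                # roughly equivalent to: squeue
qdel [job_id]        # roughly equivalent to: scancel [job_id]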

...