...

| Partition Name | Priority | Max Walltime | Min-Max Nodes Allowed per Job | Max CPUs Per Node | Max Memory Per CPU (GB) | Description |
|---|---|---|---|---|---|---|
| debug | high | 12 hrs | 1-1 | 36 | 6 | designed to access 1 GPU to run debug or short-term jobs |
| solo | normal | 72 hrs | 1-1 | 36 | 6 | designed to access 1 GPU to run long-term jobs |
| batch | normal | 72 hrs | 2-16 | 36 | 6 | designed to access 2-16 nodes (up to 64 GPUs) to run parallel jobs |
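The node-count and walltime limits above can be checked before submitting. The following is a minimal sketch, not part of the HAL tooling: the function names `partition_limits` and `fits_partition` are hypothetical, and the limits are copied by hand from the table, so they need updating if the partitions change.

```shell
#!/bin/sh
# Sketch only: limits are hard-coded from the partition table above.

# partition_limits PARTITION
# Prints "min_nodes max_nodes max_hours"; fails for an unknown partition.
partition_limits() {
    case "$1" in
        debug) echo "1 1 12" ;;
        solo)  echo "1 1 72" ;;
        batch) echo "2 16 72" ;;
        *)     return 1 ;;
    esac
}

# fits_partition PARTITION NODES HOURS
# Exit status 0 if the request is within limits, 1 if it is not,
# 2 if the partition name is unknown.
fits_partition() {
    limits=$(partition_limits "$1") || return 2
    # Re-pack arguments: $1=nodes $2=hours $3=min_nodes $4=max_nodes $5=max_hours
    set -- "$2" "$3" $limits
    [ "$1" -ge "$3" ] && [ "$1" -le "$4" ] && [ "$2" -le "$5" ]
}
```

For example, `fits_partition batch 4 48 && echo "request fits"` prints the message, while `fits_partition solo 2 10` fails because "solo" allows only one node.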

HAL Example Job Scripts (Recommended)

New users should check the example job scripts at "/opt/apps/runscripts" and request adequate resources.

| Script Name | Job Type | Partition | Max Walltime | Nodes | CPU | GPU | Memory (GB) | Description |
|---|---|---|---|---|---|---|---|---|
| run_debug_00gpu_036cpu_0216mem.sh | interactive | debug | 12:00:00 | 1 | 36 | 0 | 216 | submit interactive job, 1 full node for 12 hours, CPU-only task in "debug" partition |
| run_debug_01gpu_008cpu_0048mem.sh | interactive | debug | 12:00:00 | 1 | 8 | 1 | 48 | submit interactive job, 25% of 1 full node for 12 hours, GPU task in "debug" partition |
| run_debug_02gpu_016cpu_0096mem.sh | interactive | debug | 12:00:00 | 1 | 16 | 2 | 96 | submit interactive job, 50% of 1 full node for 12 hours, GPU task in "debug" partition |
| run_debug_04gpu_032cpu_0192mem.sh | interactive | debug | 12:00:00 | 1 | 32 | 4 | 192 | submit interactive job, 1 full node for 12 hours, GPU task in "debug" partition |
| sub_solo_01node_01gpu_08cpu_0048mem.sb | sbatch | solo | 72:00:00 | 1 | 8 | 1 | 48 | submit batch job, 25% of 1 full node for 72 hours, GPU task in "solo" partition |
| sub_solo_01node_02gpu_16cpu_0096mem.sb | sbatch | solo | 72:00:00 | 1 | 16 | 2 | 96 | submit batch job, 50% of 1 full node for 72 hours, GPU task in "solo" partition |
| sub_solo_01node_04gpu_32cpu_0192mem.sb | sbatch | solo | 72:00:00 | 1 | 32 | 4 | 192 | submit batch job, 1 full node for 72 hours, GPU task in "solo" partition |
| sub_batch_02node_08gpu_064cpu_0384mem.sb | sbatch | batch | 72:00:00 | 2 | 64 | 8 | 384 | submit batch job, 2 full nodes for 72 hours, GPU task in "batch" partition |
| sub_batch_16node_64gpu_512cpu_3072mem.sb | sbatch | batch | 72:00:00 | 16 | 512 | 64 | 3072 | submit batch job, 16 full nodes for 72 hours, GPU task in "batch" partition |
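The GPU example scripts follow a fixed ratio of 8 CPUs and 48 GB of memory per GPU (6 GB per CPU). As a rough sketch of how a single-node request could be derived from a GPU count, the hypothetical helper below prints matching `#SBATCH` lines. The options `--gres`, `--cpus-per-task`, and `--mem` are standard sbatch options, but whether the scripts in "/opt/apps/runscripts" use exactly these options is an assumption.

```shell
#!/bin/sh
# Sketch only: gpu_directives is a hypothetical helper, not a HAL tool.
# The 8 CPUs and 48 GB per GPU ratio is inferred from the example
# scripts listed above.

# gpu_directives GPUS_PER_NODE
# Prints #SBATCH lines for a single-node job using 1 to 4 GPUs.
gpu_directives() {
    gpus=$1
    case "$gpus" in
        1|2|3|4) ;;
        *) echo "gpus per node must be between 1 and 4" >&2; return 1 ;;
    esac
    printf '#SBATCH --gres=gpu:%d\n' "$gpus"
    printf '#SBATCH --cpus-per-task=%d\n' "$((gpus * 8))"
    printf '#SBATCH --mem=%dG\n' "$((gpus * 48))"
}
```

Calling `gpu_directives 2` prints a request for 2 GPUs, 16 CPUs, and 96 GB, which matches the 50% rows in the table.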


Native SLURM style

Submit Interactive Job with "srun"

...

Check Job Status

Code Block
squeue                # check all jobs from all users
squeue -u [user_name] # check all jobs belonging to [user_name]
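When only a job count is needed, the `squeue` output can be piped to `wc`. The sketch below assumes the standard `squeue` options `-h` (no header), `-u` (user), and `-t` (job state); the function name `count_jobs` is hypothetical.

```shell
#!/bin/sh
# Sketch only: count_jobs is a hypothetical wrapper around squeue.

# count_jobs USER [STATE]
# Prints the number of USER's jobs, optionally limited to one state
# such as RUNNING or PENDING.
count_jobs() {
    if [ -n "${2:-}" ]; then
        squeue -h -u "$1" -t "$2" | wc -l
    else
        squeue -h -u "$1" | wc -l
    fi
}
```

For example, `count_jobs "$USER" PENDING` reports how many of your own jobs are still waiting in the queue.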

Cancel Running Job

Code Block
scancel [job_id] # cancel job with [job_id]
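Note that `scancel -u` selects jobs by user name rather than by job ID, so passing a user name with `-u` cancels every job that user owns. As a small safeguard against mixing the two up, the hypothetical wrapper below accepts only a plain numeric job ID before calling `scancel`. It is a sketch, and it deliberately rejects array-task IDs such as `123_4`.

```shell
#!/bin/sh
# Sketch only: cancel_job is a hypothetical wrapper around scancel.

# cancel_job JOB_ID
# Cancels one job; refuses anything that is not a plain number.
cancel_job() {
    case "${1:-}" in
        ''|*[!0-9]*)
            echo "usage: cancel_job <numeric job id>" >&2
            return 1
            ;;
    esac
    scancel "$1"
}
```

For example, `cancel_job 12345` runs `scancel 12345`, while `cancel_job alice` prints the usage message and cancels nothing.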

PBS style

Some PBS commands are supported by SLURM.

...