To request access: fill out this form. Be sure to follow the link in the application confirmation to request the actual system account.
Frequently Asked Questions
To report problems: email us.
For our new users: New User Guide for HAL System
User group Slack space: https://join.slack.com/t/halillinoisncsa
Real-time Dashboards: Here
HAL OnDemand portal: system status: https://hal-monitorondemand.ncsa.illinois.edu:3000//
Globus Endpoint: ncsa#hal
Quick start guide: (for complete details see Documentation section on the left)
To connect to the cluster:
```
ssh <username>@hal.ncsa.illinois.edu
```
To submit an interactive job:
```
srun --partition=gpux1 --pty --nodes=1 \
  --ntasks-per-node=12 --cores-per-socket=3 \
  --threads-per-core=4 --sockets-per-node=1 \
  --gres=gpu:v100:1 --mem-per-cpu=1500 \
  --time=2:00:00 --wait=0 --export=ALL /bin/bash
```
To submit a batch job:
```
swbatch run_script.swb
```
or
```
sbatch run_script.sb
```
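For readers who have not written a SLURM batch script before, the following is a minimal sketch of what a file passed to `sbatch` might contain. It is not the site's own `run_script.sb`; it simply carries the resource flags from the interactive `srun` example over into `#SBATCH` directives. The job name `demo`, the output file name, and the placeholder workload are assumptions made for illustration.

```shell
# Write a minimal batch script to run_script.sb.
# The contents are an illustrative assumption, not the official example file.
cat > run_script.sb <<'EOF'
#!/bin/bash
# Hypothetical job name and output file (%j expands to the job ID).
#SBATCH --job-name=demo
#SBATCH --output=demo_%j.out
# Resource request mirroring the interactive srun example.
#SBATCH --partition=gpux1
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=12
#SBATCH --cores-per-socket=3
#SBATCH --threads-per-core=4
#SBATCH --sockets-per-node=1
#SBATCH --gres=gpu:v100:1
#SBATCH --mem-per-cpu=1500
#SBATCH --time=2:00:00

# Placeholder workload: report which node the job landed on.
echo "Running on $(hostname)"
EOF
```

Once the file exists, it is submitted with the `sbatch` command shown above. Replace the placeholder `echo` line with the actual workload.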
The following information is out of date - see Job management with SLURM instead.
See run_script.swb and run_script.sb for a basic example.
Job queue time limits:
- "debug" queue: 4 hours
- "gpux<n>" and "cpun<n>" queues: 24 hours
Resource limits:
- 5 concurrently running jobs
- concurrently allocated resources
- For larger/more numerous jobs, please contact admins for a special arrangement and/or a reservation
To load the OpenCE module (provides PyTorch, TensorFlow and other ML tools):
```
module load opence
```
To see CLI scheduler status: