Hardware-Accelerated Learning (HAL) Cluster

Quick Start


Contact Us

Request for HAL Access

Please fill out the following form. Make sure to follow the link on the application confirmation page to request an actual system account.

Submit Tech-Support Ticket

Please submit a tech-support ticket to the admin team.

Join HAL Slack User Group

Please join the HAL Slack user group.

Check System Status

Please visit the following website to monitor real-time system status.

User Guide

Connect to HAL Cluster

There are two ways to log on to the HAL system. The first is to SSH from a terminal:

SSH
ssh <username>@hal.ncsa.illinois.edu

The second is to visit the HAL OnDemand web page:

HAL OnDemand
https://hal.ncsa.illinois.edu:8888
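For the SSH method above, a host alias in the SSH client configuration can shorten the login command. The following is a minimal sketch, assuming an OpenSSH client; the alias name `hal` and the helper function `hal_ssh_stanza` are hypothetical and not part of the HAL documentation.

```shell
# Print an SSH client config stanza for HAL.
# "hal" is a hypothetical alias; pass your own HAL username as $1.
hal_ssh_stanza() {
  printf 'Host hal\n    HostName hal.ncsa.illinois.edu\n    User %s\n' "$1"
}

# Preview the stanza. After reviewing it, it could be appended with:
#   hal_ssh_stanza myuser >> ~/.ssh/config
hal_ssh_stanza "<username>"
```

With such a stanza in `~/.ssh/config`, `ssh hal` would be equivalent to the full command shown above.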

Submit Jobs Using Slurm Wrapper Suite (Recommended)

Submit an interactive job using the Slurm Wrapper Suite:

swrun -p gpux1

Submit a batch job using the Slurm Wrapper Suite:

swbatch run_script.swb

The run_script.swb example

run_script.swb
#!/bin/bash

#SBATCH --job-name="hostname"
#SBATCH --output="hostname.%j.%N.out"
#SBATCH --error="hostname.%j.%N.err" 
#SBATCH --partition=gpux1

srun /bin/hostname # this is our "application"

Submit Jobs Using Native Slurm 

Submit an interactive job using Slurm directly.

srun --partition=gpux1 --pty --nodes=1 --ntasks-per-node=12 \
  --cores-per-socket=3 --threads-per-core=4 --sockets-per-node=1 \
  --gres=gpu:v100:1 --mem-per-cpu=1500 --time=4:00:00 \
  --wait=0 --export=ALL /bin/bash
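The resource flags in the command above are related by simple arithmetic. The sketch below works it out, assuming Slurm counts each hardware thread as one CPU for `--mem-per-cpu` (this depends on the cluster configuration); the variable names are illustrative only.

```shell
# Values copied from the srun command above.
sockets_per_node=1
cores_per_socket=3
threads_per_core=4
mem_per_cpu_mb=1500

# Hardware threads requested on the node; matches --ntasks-per-node=12.
cpus=$((sockets_per_node * cores_per_socket * threads_per_core))

# Total memory implied by --mem-per-cpu, in MB (assumes one CPU per thread).
total_mem_mb=$((cpus * mem_per_cpu_mb))

echo "cpus=${cpus} total_mem_mb=${total_mem_mb}"
```

Under that assumption, the request amounts to 12 CPUs and 18000 MB of memory alongside one V100 GPU.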

Submit a batch job using Slurm directly.

sbatch run_script.sb

The run_script.sb example

#!/bin/bash

#SBATCH --job-name="hostname"
#SBATCH --output="hostname.%j.%N.out"
#SBATCH --error="hostname.%j.%N.err" 
#SBATCH --partition=gpux1 
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=1 
#SBATCH --export=ALL 
#SBATCH -t 04:00:00 

srun /bin/hostname # this is our "application"
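In the `--output` and `--error` lines of the script above, Slurm replaces `%j` with the job ID and `%N` with the short node name. The helper below is a pure-shell illustration of that expansion, not something Slurm requires; `expand_pattern`, the job ID `12345`, and the node name `hal01` are made up for the example.

```shell
# Mimic how Slurm expands %j (job ID) and %N (short node name)
# in --output/--error file names. Illustration only.
expand_pattern() {
  pattern=$1
  job_id=$2
  node=$3
  printf '%s\n' "$pattern" | sed -e "s/%j/${job_id}/g" -e "s/%N/${node}/g"
}

# 12345 and hal01 are hypothetical values.
expand_pattern "hostname.%j.%N.out" 12345 hal01
```

Each job therefore writes to its own pair of files, so repeated submissions do not overwrite earlier output.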

Submit Jobs Using HAL OnDemand (New)

Log in with your own user name and password.

HAL OnDemand
https://hal.ncsa.illinois.edu:8888

Files App

This Open OnDemand application provides a web-based file explorer that allows the user to remotely interact with the files on the HPC center’s local file system. This application uses Node.js as the code base and is based on the CloudCommander file explorer app.

The Files app provides access to create files and folders, view files, manipulate file locations, upload files, and download files. It also provides integrated support for launching the Shell App in the currently browsed directory as well as launching the File Editor App for the currently selected file.

Active Jobs App

This Open OnDemand application provides a web-based view of the current status of all the available jobs on the batch servers hosted by the HPC center. This application is built with the Ruby on Rails web application framework.

The Active Jobs App displays all the available jobs in a dynamically searchable and sortable table. The user can search by job ID, job name, job owner, charged account, job status, and the cluster the job was submitted to. Progressive disclosure is used to show further details on an individual job when the user clicks the “right arrow” to the left of its table row.

Job Composer App

This Open OnDemand application provides a web-based utility for creating and managing batch jobs from template directories. This application is built with the Ruby on Rails web application framework.

The Job Composer App models a simple but common workflow among users of an HPC center. To create a new batch job, users typically follow these steps:

  1. Copy a directory of a previous job, either one of their previous jobs or a job from a group member
  2. Make minor modifications to the input files
  3. Submit this new job
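The three steps above can also be carried out by hand in a shell, which may help clarify what the Job Composer App automates. This is a minimal sketch, not how the app itself is implemented; the directory names `prev_job` and `new_job`, the job name `hostname2`, and the helper functions are hypothetical.

```shell
# Step 1: copy a previous job directory.
clone_job() {
  cp -r "$1" "$2"
}

# Step 2: make a minor modification, here renaming the job in the script.
# Note: new_name must not contain "/" because it is used inside a sed expression.
rename_job() {
  script=$1
  new_name=$2
  sed "s/^#SBATCH --job-name=.*/#SBATCH --job-name=\"${new_name}\"/" "$script" > "${script}.tmp" &&
    cat "${script}.tmp" > "$script" &&
    rm "${script}.tmp"
}

# Step 3: submit the new job (left as comments; run on a HAL login node).
#   clone_job prev_job new_job
#   rename_job new_job/run_script.sb hostname2
#   (cd new_job && sbatch run_script.sb)
```

The original job directory is left untouched, so it remains available as a template for later jobs.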

Shell App

This Open OnDemand application provides a web-based terminal that connects the user through an SSH session to either the local machine or any other machine allowed within the internal network. Typically this will connect the user to a login node. This application uses Node.js for its strong WebSocket support, which provides a responsive user experience, and for its event-driven framework, which allows multiple simultaneous sessions.

The terminal client is an xterm-compatible terminal emulator written entirely in JavaScript; the Shell App uses Google's hterm client for this. It performs reasonably well across most modern browsers on various operating systems, and the Open OnDemand developers use it regularly.

Documentation


System Overview

Hardware Information

Login Node

Login Node     IBM        9006-12P                  1x
CPU            IBM        POWER9 16 Cores           2x
Network        Mellanox   2 Ports EDR InfiniBand    1x

Compute Node

Compute Node   IBM        AC922 8335-GTH            16x
CPU            IBM        POWER9 20 Cores           2x
GPU            NVidia     V100 16GB Memory          4x
Network        Mellanox   2 Ports EDR InfiniBand    1x
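The aggregate compute capacity follows from the compute node table above. The sketch below assumes the quantities mean 16 compute nodes, each with 2 CPUs and 4 GPUs; that reading of the table is an assumption, and the variable names are illustrative.

```shell
# Quantities read from the compute node table (interpretation assumed).
nodes=16
cpus_per_node=2
cores_per_cpu=20
gpus_per_node=4
gpu_mem_gb=16

total_cores=$((nodes * cpus_per_node * cores_per_cpu))
total_gpus=$((nodes * gpus_per_node))
total_gpu_mem_gb=$((total_gpus * gpu_mem_gb))

echo "cores=${total_cores} gpus=${total_gpus} gpu_mem_gb=${total_gpu_mem_gb}"
```

Under that reading, the cluster offers 640 POWER9 cores and 64 V100 GPUs with 1024 GB of GPU memory in total.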

Storage Node

Storage Node   IBM        9006-22P                  1x
CPU            IBM        POWER9 20 Cores           2x
Storage        WD         NFS                       1x
Network        Mellanox   2 Ports EDR InfiniBand    1x

Software Information

Manufacturer   Software Package    Version
IBM            RedHat Linux        7.6
NVidia         CUDA                10.1.105
NVidia         PGI Compiler        19.4
IBM            Advance Toolchain   12.0
IBM            XLC/XLF             16.1.1
IBM            PowerAI             1.6.1
SchedMD        Slurm               19.05.2
OSC            Open OnDemand       1.6.20

Job Management with Slurm

Modules Management

Getting started with WMLCE (formerly PowerAI)

Using Jupyter Notebook on HAL

Working with Containers

Installing Python packages

Frequently Asked Questions

