...

Code Block
sbatch tf_sample.sb
squeue

Visualization with TensorBoard

Interactive mode

Get a node for interactive use:

Code Block
srun --partition=debug --pty --nodes=1 --ntasks-per-node=8 --gres=gpu:v100:1 -t 01:30:00 --wait=0 --export=ALL /bin/bash

Once on the compute node, load the PowerAI module using one of the following:

Code Block
module load ibm/powerai/1.6.0.py2 # for python2 environment
module load ibm/powerai/1.6.0.py3 # for python3 environment
module load ibm/powerai           # python3 environment by default

Download mnist-with-summaries.py to your local machine, then copy the file to your $HOME folder on hal-login:

Code Block
scp ./mnist-with-summaries.py [user_name]@hal.ncsa.illinois.edu:~

Train on MNIST with TensorFlow summary:

Code Block
cd ~
python ./mnist-with-summaries.py

After the job completes, the TensorFlow log files can be found in "~/tensorflow/mnist/logs". Go back to the login node and start TensorBoard:

Code Block
exit
tensorboard --logdir ~/tensorflow/mnist/logs/
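
Before starting TensorBoard, it can help to confirm that the training run actually wrote event files. The following is a minimal sketch, not part of the original instructions: the helper name has_event_files and the LOGDIR variable are hypothetical, and it assumes the usual TensorFlow event file naming (events.out.tfevents.*).

```shell
# Hypothetical helper: succeeds if the given directory (searched recursively)
# contains at least one TensorFlow event file.
has_event_files() {
  [ -n "$(find "$1" -type f -name 'events.out.tfevents.*' 2>/dev/null)" ]
}

# Assumed default: the log directory named in this guide.
LOGDIR="${LOGDIR:-$HOME/tensorflow/mnist/logs}"

if has_event_files "$LOGDIR"; then
  echo "Event files found in $LOGDIR; run: tensorboard --logdir $LOGDIR"
else
  echo "No event files found in $LOGDIR; check that training finished."
fi
```

If the check reports no event files, the training step most likely did not finish or wrote its logs elsewhere.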

Forward port 6006 on the remote machine to port 16006 on the local machine (run this command on the local machine):

Code Block
ssh -N -f -L localhost:16006:localhost:6006 [user_name]@hal.ncsa.illinois.edu
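
If port 16006 is already in use locally, or TensorBoard listens on a different remote port, the forwarding command needs different port numbers. The sketch below only prints the command for a given user and host so it can be reviewed before running. The function name build_tunnel_cmd and the value your_user_name are hypothetical placeholders.

```shell
# Hypothetical helper: prints (does not run) the ssh port-forwarding command.
# Arguments: user, host, local port (default 16006), remote port (default 6006).
build_tunnel_cmd() {
  user="$1"
  host="$2"
  local_port="${3:-16006}"
  remote_port="${4:-6006}"
  printf 'ssh -N -f -L localhost:%s:localhost:%s %s@%s\n' \
    "$local_port" "$remote_port" "$user" "$host"
}

# Example with placeholder values; replace your_user_name with your login name.
build_tunnel_cmd your_user_name hal.ncsa.illinois.edu
```

In the command this prints, -N skips running a remote command, -f puts ssh in the background, and -L sets up the local port forward.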

Paste the following address into a web browser to start the TensorBoard session:

Code Block
localhost:16006

Simple Example for PyTorch

Interactive mode

Get a node for interactive use:

...