IBM PowerAI 1.6.0

PowerAI is an enterprise software distribution that combines popular open source deep learning frameworks, efficient AI development tools, and accelerated IBM Power Systems servers. It includes the following frameworks:

Framework | Version | Description
Caffe | 1.0 | Caffe is a deep learning framework made with expression, speed, and modularity in mind. It is developed by Berkeley AI Research and by community contributors.
Caffe2 | n/a | Caffe2 is a companion to PyTorch. PyTorch is great for experimentation and rapid development, while Caffe2 is aimed at production environments.
PyTorch | 1.0.1 | PyTorch is an open source deep learning platform that provides a seamless path from research prototyping to production deployment. It is developed by Facebook and by community contributors.
TensorFlow | 1.13.1 | TensorFlow is an end-to-end open source platform for machine learning. It is developed by Google and by community contributors.

For complete PowerAI documentation, see https://www.ibm.com/support/knowledgecenter/SS5SF7_1.6.0/navigation/pai_getstarted.htm. Here we only show simple examples with system-specific instructions.

Simple Example for Caffe

Interactive mode

Get a node for interactive use:

srun --partition=debug --pty --nodes=1 --ntasks-per-node=8 --gres=gpu:v100:1 -t 01:30:00 --wait=0 --export=ALL /bin/bash

Once on the compute node, load the PowerAI module using one of these:

module load ibm/powerai/1.6.0.py2 # for python2 environment
module load ibm/powerai/1.6.0.py3 # for python3 environment
module load ibm/powerai           # python3 environment by default
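
After loading a module, it can be useful to confirm which interpreter is active and which frameworks it can import. The following is a minimal sketch for the python3 environment, not part of PowerAI. The helper name check_frameworks is hypothetical, and the import names (caffe, caffe2, torch, tensorflow) are the frameworks' usual Python package names, assumed here to match this installation.

```python
import importlib.util
import sys

# Usual Python import names of the frameworks listed above (assumed to
# match what the PowerAI module provides on this system).
FRAMEWORKS = ("caffe", "caffe2", "torch", "tensorflow")


def check_frameworks(names=FRAMEWORKS):
    """Return {name: bool} telling whether each top-level package can be located."""
    return {name: importlib.util.find_spec(name) is not None for name in names}


if __name__ == "__main__":
    print("Python", sys.version.split()[0], "at", sys.executable)
    for name, found in check_frameworks().items():
        print("{:12s} {}".format(name, "found" if found else "missing"))
```

The check only locates the packages without importing them, so it returns quickly even on a node without a GPU.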

Install samples for Caffe:

caffe-install-samples ~/caffe-samples
cd ~/caffe-samples

Download data for MNIST model:

./data/mnist/get_mnist.sh

Convert the downloaded data into LMDB databases for training:

./examples/mnist/create_mnist.sh

Train LeNet on MNIST:

./examples/mnist/train_lenet.sh

Batch mode

The same can be accomplished in batch mode using the following caffe_sample.sb script:

wget https://wiki.ncsa.illinois.edu/download/attachments/82510352/caffe_sample.sb
sbatch caffe_sample.sb
squeue
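
The contents of caffe_sample.sb are not reproduced on this page. As a rough sketch of what such a script could contain, the snippet below generates a batch script that mirrors the interactive steps above: the #SBATCH options are copied from the srun command, and the body repeats the module load and training commands. The function name make_batch_script and the job name are hypothetical, and the actual attachment may differ.

```python
def make_batch_script(commands, job_name="caffe_sample", time_limit="01:30:00"):
    """Build the text of a Slurm batch script from a list of shell commands."""
    header = [
        "#!/bin/bash",
        "#SBATCH --job-name={}".format(job_name),  # hypothetical job name
        "#SBATCH --partition=debug",               # same options as the srun call
        "#SBATCH --nodes=1",
        "#SBATCH --ntasks-per-node=8",
        "#SBATCH --gres=gpu:v100:1",
        "#SBATCH --time={}".format(time_limit),
        "",
    ]
    return "\n".join(header + list(commands)) + "\n"


# Assumes the samples were already installed in ~/caffe-samples.
script = make_batch_script([
    "module load ibm/powerai",
    "cd ~/caffe-samples",
    "./data/mnist/get_mnist.sh",
    "./examples/mnist/create_mnist.sh",
    "./examples/mnist/train_lenet.sh",
])

if __name__ == "__main__":
    print(script)
```

Writing the printed text to a file and passing that file to sbatch would submit the same steps as the interactive session, assuming the module command is available in batch jobs on this system.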

Simple Example for Caffe2

Interactive mode

Get a node for interactive use:

srun --partition=debug --pty --nodes=1 --ntasks-per-node=8 --gres=gpu:v100:1 -t 01:30:00 --wait=0 --export=ALL /bin/bash

Once on the compute node, load the PowerAI module using one of these:

module load ibm/powerai/1.6.0.py2 # for python2 environment
module load ibm/powerai/1.6.0.py3 # for python3 environment
module load ibm/powerai           # python3 environment by default

Install samples for Caffe2:

caffe2-install-samples ~/caffe2-samples
cd ~/caffe2-samples

Create an example LMDB database:

python ./examples/lmdb_create_example.py --output_file lmdb

Train ResNet50 with Caffe2:

python ./examples/resnet50_trainer.py --train_data ./lmdb

Batch mode

The same can be accomplished in batch mode using the following caffe2_sample.sb script:

wget https://wiki.ncsa.illinois.edu/download/attachments/82510352/caffe2_sample.sb
sbatch caffe2_sample.sb
squeue

Simple Example for TensorFlow

Interactive mode

Get a node for interactive use:

srun --partition=debug --pty --nodes=1 --ntasks-per-node=8 --gres=gpu:v100:1 -t 01:30:00 --wait=0 --export=ALL /bin/bash

Once on the compute node, load the PowerAI module using one of these:

module load ibm/powerai/1.6.0.py2 # for python2 environment
module load ibm/powerai/1.6.0.py3 # for python3 environment
module load ibm/powerai           # python3 environment by default

Copy the following code into a file named "mnist-demo.py":

import tensorflow as tf
mnist = tf.keras.datasets.mnist

(x_train, y_train),(x_test, y_test) = mnist.load_data()
x_train, x_test = x_train / 255.0, x_test / 255.0

model = tf.keras.models.Sequential([
  tf.keras.layers.Flatten(input_shape=(28, 28)),
  tf.keras.layers.Dense(512, activation=tf.nn.relu),
  tf.keras.layers.Dropout(0.2),
  tf.keras.layers.Dense(10, activation=tf.nn.softmax)
])
model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])

model.fit(x_train, y_train, epochs=5)
model.evaluate(x_test, y_test)

Train on MNIST with the Keras API:

python ./mnist-demo.py
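
The model in mnist-demo.py ends in a 10-way softmax layer and is trained with the sparse_categorical_crossentropy loss, which takes integer class labels rather than one-hot vectors. As an illustration only, the standard-library sketch below computes the same quantity for a single example. It is not how TensorFlow implements the loss, and the function names are hypothetical.

```python
import math


def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    shift = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(v - shift) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def sparse_categorical_crossentropy(label, probs):
    """Negative log of the probability assigned to the true integer label."""
    return -math.log(probs[label])


# Three-class toy example: class 0 has the largest score and is the true label.
probs = softmax([2.0, 1.0, 0.1])
loss = sparse_categorical_crossentropy(0, probs)
```

The loss approaches zero as the probability of the true class approaches 1, which is the quantity the adam optimizer drives down during model.fit.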

Batch mode

The same can be accomplished in batch mode using the following tf_sample.sb script:

sbatch tf_sample.sb
squeue

Visualization with TensorBoard

Interactive mode

Get a node for interactive use:

srun --partition=debug --pty --nodes=1 --ntasks-per-node=8 --gres=gpu:v100:1 -t 01:30:00 --wait=0 --export=ALL /bin/bash

Once on the compute node, load the PowerAI module using one of these:

module load ibm/powerai/1.6.0.py2 # for python2 environment
module load ibm/powerai/1.6.0.py3 # for python3 environment
module load ibm/powerai           # python3 environment by default

Download mnist-with-summaries.py to your home directory:

cd ~
wget https://wiki.ncsa.illinois.edu/download/attachments/82510352/mnist-with-summaries.py

Train on MNIST with TensorFlow summaries enabled, then return to the login node:

python ./mnist-with-summaries.py
exit

Batch mode

The same can be accomplished in batch mode using the following tfbd_sample.sb script:

sbatch tfbd_sample.sb
squeue

Start the TensorBoard session

After the job completes, the TensorFlow log files can be found in "~/tensorflow/mnist/logs". Start the TensorBoard server on the login node:

tensorboard --logdir ~/tensorflow/mnist/logs/
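
If TensorBoard shows no data, a first check is whether the training run wrote any event files under the log directory. The sketch below is a hypothetical helper, not part of TensorBoard. It relies on the usual events.out.tfevents.* file naming and assumes the log directory given above.

```python
from pathlib import Path


def find_event_files(logdir):
    """Return all TensorBoard event files below logdir, sorted by path."""
    root = Path(logdir).expanduser()
    if not root.is_dir():
        return []
    return sorted(p for p in root.rglob("events.out.tfevents.*") if p.is_file())


if __name__ == "__main__":
    files = find_event_files("~/tensorflow/mnist/logs")
    if not files:
        print("No event files found; TensorBoard would have nothing to show.")
    for path in files:
        print(path)
```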

Forward port 6006 on the remote machine to port 16006 on your local machine. Run this on your local machine, replacing dmu with your own username:

ssh -N -f -L localhost:16006:localhost:6006 dmu@hal

Paste the following address into a web browser to start the TensorBoard session:

localhost:16006
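
If the page does not load, it helps to check whether anything is listening on the forwarded port before debugging TensorBoard itself. The following is a minimal sketch to run on the local machine. The function name port_is_open is hypothetical, and the default port 16006 is the local port chosen in the ssh command above.

```python
import socket


def port_is_open(host="localhost", port=16006, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:  # refused, unreachable, or timed out
        return False


if __name__ == "__main__":
    state = "reachable" if port_is_open() else "not reachable"
    print("localhost:16006 is", state)
```

A successful connection only shows that the local end of the tunnel is listening. It does not confirm that the TensorBoard server is still running on the login node.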

Simple Example for PyTorch

Interactive mode

Get a node for interactive use:

srun --partition=debug --pty --nodes=1 --ntasks-per-node=8 --gres=gpu:v100:1 -t 01:30:00 --wait=0 --export=ALL /bin/bash

Once on the compute node, load the PowerAI module using one of these:

module load ibm/powerai/1.6.0.py2 # for python2 environment
module load ibm/powerai/1.6.0.py3 # for python3 environment
module load ibm/powerai           # python3 environment by default

Install samples for PyTorch:

pytorch-install-samples ~/pytorch-samples
cd ~/pytorch-samples

Train on MNIST with PyTorch:

python ./examples/mnist/main.py

Batch mode

The same can be accomplished in batch mode using the following pytorch_sample.sb script:

sbatch pytorch_sample.sb
squeue






