TensorFlow Official Benchmarks (May 2017, GitHub source): https://www.tensorflow.org/performance/benchmarks
IBM Power9 benchmark results (Nov 2017, 1.4.0): https://developer.ibm.com/linuxonpower/perfcol/perfcol-mldl/
Accurate, Large Minibatch SGD: Training ImageNet in 1 Hour, Facebook (Jun 2017): https://research.fb.com/wp-content/uploads/2017/06/imagenet1kin1h5.pdf
https://github-dev.cs.illinois.edu/kindrtnk/DL
* Instance type: IBM Power9 Hal000, 8335-GTG AC922 server
* CPU: 2x 20-core IBM POWER9 CPU @ 2.00GHz
* SDRAM: 512 GB DDR4
* GPU: 4x NVIDIA® Tesla® V100, 5120 CUDA cores, 16 GB HBM2
* OS: Red Hat Enterprise Linux Server release 7.4
* Python Distribution: Anaconda Python 3.6.2
* CUDA / cuDNN: 9.1/7.0.5
* TensorFlow Version: 1.5.0
* Disk: Local SSD
* DataSet: ImageNet (synthetic)
* Precision: 32-bit and 16-bit floating point (FP32 and FP16)
* Test Date: Mar 25 2018

POWER9 (hal000)
Green bars show our benchmark results using 16-bit floating point (FP16).
Red bars show the official TensorFlow results.
Blue bars show our benchmark results using 32-bit floating point (FP32).
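
The FP32 and FP16 runs on synthetic ImageNet listed above could be launched along the following lines. This is only a sketch: it assumes the `tf_cnn_benchmarks.py` script from the tensorflow/benchmarks repository (the script behind the official TensorFlow results), and the model name `resnet50` and batch size `64` are placeholders rather than the settings used for these measurements. The helper `build_benchmark_command` is a hypothetical name introduced for illustration.

```python
import sys
from typing import List


def build_benchmark_command(
    num_gpus: int,
    use_fp16: bool,
    model: str = "resnet50",   # placeholder model, not taken from these notes
    batch_size: int = 64,      # placeholder per-GPU batch size, not taken from these notes
    script: str = "tf_cnn_benchmarks.py",  # assumed benchmark script location
) -> List[str]:
    """Build the argument list for one benchmark run.

    No --data_dir flag is passed, so tf_cnn_benchmarks falls back to
    synthetic input data instead of reading ImageNet from disk.
    """
    if num_gpus < 1:
        raise ValueError("num_gpus must be at least 1")
    return [
        sys.executable,
        script,
        f"--model={model}",
        f"--batch_size={batch_size}",
        f"--num_gpus={num_gpus}",
        f"--use_fp16={use_fp16}",
    ]


if __name__ == "__main__":
    # Print one command per precision for the 4-GPU configuration.
    for fp16 in (False, True):
        print(" ".join(build_benchmark_command(num_gpus=4, use_fp16=fp16)))
```

Each printed command can be passed to `subprocess.run` or pasted into a shell in the directory that contains the script. Building the argument list separately from running it keeps the FP32 and FP16 runs identical apart from the precision flag.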