MLPerf is a suite of benchmarks designed to measure the performance of machine learning workloads on a target system. Each training benchmark measures the wall-clock time required to train a model to a target quality metric.
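The core idea of "time to train to a target quality metric" can be sketched as follows. This is an illustrative toy loop, not the actual MLPerf harness: `train_to_target`, the toy step function, and the simulated accuracy curve are all placeholders.

```python
import time

def train_to_target(step_fn, eval_fn, target, max_steps=10_000):
    """Run training steps until eval_fn() reaches target; return (wall time, steps)."""
    start = time.perf_counter()
    for step in range(1, max_steps + 1):
        step_fn()
        if eval_fn() >= target:
            return time.perf_counter() - start, step
    raise RuntimeError(f"target {target} not reached in {max_steps} steps")

# Toy stand-in for a model: "accuracy" rises 0.01 per step.
state = {"acc": 0.0}
def toy_step():
    state["acc"] = min(1.0, state["acc"] + 0.01)

# 0.759 mirrors the 75.90% ResNet/ImageNet target from the table below.
elapsed, steps = train_to_target(toy_step, lambda: state["acc"], target=0.759)
print(steps)  # 76: first step at which the toy "accuracy" crosses 75.9%
```

A real submission reports the median wall time over several runs, since time-to-quality is stochastic.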

MLPerf is a trademark maintained by MLCommons; if you publish results from these benchmarks in a public work, follow the MLCommons publication guidelines.

MLPerf Training (v2.1)

Benchmark | Model | Dataset | Domain Area | Benchmark Target Metric | NCSA Notes | NCSA Results (If Available)
Image Classification | ResNet | ImageNet | Vision | 75.90% classification accuracy | Download and data-prep steps are not committed to the repo |
Image Segmentation | 3D U-Net | KiTS19 | Vision | 0.908 mean DICE score | Had issues with Apptainer conversion |
Natural Language Processing | BERT | Wikipedia | Language | 0.72 Mask-LM accuracy | |
Object Detection (light-weight) | RetinaNet | OpenImages | Vision | 34.0% mAP | |
Object Detection (heavy-weight) | Mask R-CNN | COCO | Vision | 0.377 Box min AP and 0.339 Mask min AP | |
Recommendation | DLRM | 1TB Clickthrough | Commerce | 0.8025 AUC | |
Reinforcement Learning | Minigo | Go | Research | 50% win rate vs. checkpoint | |
Speech Recognition | RNN-T | LibriSpeech | Language | 0.058 Word Error Rate | |
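The Apptainer-conversion issue noted above refers to converting the Docker reference containers into Apptainer (formerly Singularity) images for use on HPC systems. A typical conversion and run looks like the following; the image tag is a placeholder, not the actual MLPerf reference image name:

```shell
# Convert a local Docker/OCI image into an Apptainer SIF file
# (the image tag below is illustrative)
apptainer build mlperf_resnet.sif docker://local/mlperf-resnet:latest

# Run the container's entrypoint with NVIDIA GPU support
apptainer run --nv mlperf_resnet.sif
```

Large multi-layer images can fail or exhaust temporary space during this conversion, which is the kind of issue the note refers to.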

MLPerf Inference (v2.1)

Benchmark | Dataset | Model | Domain Benchmark Represents | NCSA Notes | NCSA Results (If Available)
Image Classification | ImageNet 2012 | ResNet50 | Vision | |
Image Classification | OpenImages | ResNext50 | Vision | |
Image Segmentation | KiTS19 | 3D U-Net | Vision | |
Natural Language Processing | SQuAD 1.1 | BERT | Language | |
Recommendation | Criteo Terabyte | DLRM | Commerce | |
Speech Recognition | OpenSLR LibriSpeech Corpus | RNN-T | Language | |