MPI Benchmarks

The following uses nodecomm to measure the bandwidth between 2 nodes, with 1-32 processes on each node, each communicating with its partner process on the other node.


TensorFlow Benchmarks


[Figure: training rate vs. batch size. Left: old results; right: new results.]

The figure above shows the effect of increasing batch size on training rate. Running ResNet-50 v1.5 with batch size 256 using fp32 for calculations fails with an out-of-memory (OOM) error, which is why that result is omitted.
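As a rough illustration of the batch-size and precision effects described above, the following sketch computes throughput ratios from img_per_sec values copied from the results table on this page (resnet50_v1.5, 1 GPU, real data). The function names are hypothetical helpers introduced here and are not part of the benchmark scripts.

```python
# img_per_sec for resnet50_v1.5 on 1 GPU with real data, copied from the
# results table on this page. Keys are batch sizes.
# fp32 has no entry for batch size 256 because that run hit OOM.
fp32 = {32: 305.1581998762749, 64: 345.86744731677936, 128: 365.42804577993536}
fp16 = {
    32: 557.608344009375,
    64: 687.2315721842947,
    128: 780.5401907002691,
    256: 827.7779526071831,
}


def gain_over_smallest(rates):
    """Return throughput at each batch size relative to the smallest batch size."""
    base = rates[min(rates)]
    return {batch: rate / base for batch, rate in rates.items()}


def fp16_speedup(fp32_rates, fp16_rates):
    """Return the fp16/fp32 throughput ratio for batch sizes present in both runs."""
    shared = sorted(set(fp32_rates) & set(fp16_rates))
    return {batch: fp16_rates[batch] / fp32_rates[batch] for batch in shared}


for batch, ratio in fp16_speedup(fp32, fp16).items():
    print(f"batch {batch}: fp16 is {ratio:.2f}x fp32")
for batch, ratio in gain_over_smallest(fp32).items():
    print(f"fp32 batch {batch}: {ratio:.2f}x the batch-32 rate")
```

On these numbers the fp16 advantage grows with batch size, from roughly 1.8x at batch size 32 to roughly 2.1x at batch size 128.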

[Figure: training rate vs. number of GPUs. Left: old results; right: new results.]

The figure above shows the effect of increasing the number of GPUs on training rate at a fixed batch size, for both real (blue) and synthetic (orange) data.
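One way to read the GPU-scaling results is as a scaling efficiency, i.e. the measured rate on n GPUs divided by n times the single-GPU rate. The sketch below applies that definition to img_per_sec values copied from the results table on this page (resnet50_v1.5, batch size 128, fp16). The helper name `scaling_efficiency` is an assumption made for illustration.

```python
# img_per_sec for resnet50_v1.5, batch size 128, fp16, copied from the
# results table on this page. Keys are GPU counts.
real = {1: 780.5401907002691, 2: 1520.3306305011697, 4: 2557.1778582125}
synthetic = {1: 787.9725710218387, 2: 1567.667541866167, 4: 3065.898543897684}


def scaling_efficiency(rates):
    """Return rate(n) / (n * rate(1)) for each GPU count n; 1.0 is ideal scaling."""
    base = rates[1]
    return {n: rate / (n * base) for n, rate in rates.items()}


for label, rates in (("real", real), ("synthetic", synthetic)):
    for n, eff in scaling_efficiency(rates).items():
        print(f"{label:9s} {n} GPU(s): {eff:.1%} of ideal")
```

With these values, synthetic data stays near 97% of ideal on 4 GPUs while real data drops to about 82%, which suggests that input loading rather than GPU compute limits the real-data runs in that table.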


A broader set of results can be seen in the following table:

| model | num_gpus | batch_size | use_fp16 | use_synth | img_per_sec |
|---|---|---|---|---|---|
| resnet50_v1.5 | 1 | 32 | False | False | 305.1581998762749 |
| resnet50_v1.5 | 1 | 32 | False | True | 305.0756935566205 |
| resnet50_v1.5 | 1 | 32 | True | False | 557.608344009375 |
| resnet50_v1.5 | 1 | 32 | True | True | 551.7856108469189 |
| resnet50_v1.5 | 1 | 64 | False | False | 345.86744731677936 |
| resnet50_v1.5 | 1 | 64 | False | True | 343.33576731867544 |
| resnet50_v1.5 | 1 | 64 | True | False | 687.2315721842947 |
| resnet50_v1.5 | 1 | 64 | True | True | 676.3865560680504 |
| resnet50_v1.5 | 1 | 128 | False | False | 365.42804577993536 |
| resnet50_v1.5 | 1 | 128 | False | True | 361.9197194570446 |
| resnet50_v1.5 | 1 | 128 | True | False | 780.5401907002691 |
| resnet50_v1.5 | 1 | 128 | True | True | 787.9725710218387 |
| resnet50_v1.5 | 1 | 256 | True | False | 827.7779526071831 |
| resnet50_v1.5 | 1 | 256 | True | True | 838.8718495151428 |
| resnet50_v1.5 | 2 | 32 | False | False | 606.6988150458972 |
| resnet50_v1.5 | 2 | 32 | False | True | 611.4476119199953 |
| resnet50_v1.5 | 2 | 32 | True | False | 1093.3154950403268 |
| resnet50_v1.5 | 2 | 32 | True | True | 1099.2094967673627 |
| resnet50_v1.5 | 2 | 64 | False | False | 684.8824289499631 |
| resnet50_v1.5 | 2 | 64 | False | True | 682.3657230029216 |
| resnet50_v1.5 | 2 | 64 | True | False | 1337.7101641206896 |
| resnet50_v1.5 | 2 | 64 | True | True | 1356.9406586722887 |
| resnet50_v1.5 | 2 | 128 | False | False | 727.4366020878481 |
| resnet50_v1.5 | 2 | 128 | False | True | 714.9118280128031 |
| resnet50_v1.5 | 2 | 128 | True | False | 1520.3306305011697 |
| resnet50_v1.5 | 2 | 128 | True | True | 1567.667541866167 |
| resnet50_v1.5 | 2 | 256 | True | False | 1663.5347160145286 |
| resnet50_v1.5 | 2 | 256 | True | True | 1685.7265358502561 |
| resnet50_v1.5 | 4 | 32 | False | False | 1099.8515970079466 |
| resnet50_v1.5 | 4 | 32 | False | True | 1158.8182004104326 |
| resnet50_v1.5 | 4 | 32 | True | False | 1806.7010872696358 |
| resnet50_v1.5 | 4 | 32 | True | True | 2140.346448732058 |
| resnet50_v1.5 | 4 | 64 | False | False | 1263.7396753889122 |
| resnet50_v1.5 | 4 | 64 | False | True | 1323.4508475425275 |
| resnet50_v1.5 | 4 | 64 | True | False | 2132.559146072129 |
| resnet50_v1.5 | 4 | 64 | True | True | 2630.0377706917334 |
| resnet50_v1.5 | 4 | 128 | False | False | 1382.0710902623055 |
| resnet50_v1.5 | 4 | 128 | False | True | 1414.0828353841684 |
| resnet50_v1.5 | 4 | 128 | True | False | 2557.1778582125 |
| resnet50_v1.5 | 4 | 128 | True | True | 3065.898543897684 |
| resnet50_v1.5 | 4 | 256 | True | False | 2864.1530570235277 |
| resnet50_v1.5 | 4 | 256 | True | True | 3308.184721599039 |
| inception3 | 1 | 32 | False | False | 231.6304999196131 |
| inception3 | 1 | 32 | False | True | 231.62821340833554 |
| inception3 | 1 | 32 | True | False | 397.8483574516062 |
| inception3 | 1 | 32 | True | True | 397.61870298268843 |
| inception3 | 1 | 64 | False | False | 255.79774049318286 |
| inception3 | 1 | 64 | False | True | 254.96217612362278 |
| inception3 | 1 | 64 | True | False | 464.3570055911474 |
| inception3 | 1 | 64 | True | True | 466.5224032998034 |
| inception3 | 1 | 128 | False | False | 265.4893643297776 |
| inception3 | 1 | 128 | False | True | 265.9010072543052 |
| inception3 | 1 | 128 | True | False | 523.6175139095815 |
| inception3 | 1 | 128 | True | True | 529.3233120661808 |
| inception3 | 1 | 256 | True | False | 548.3831188132465 |
| inception3 | 1 | 256 | True | True | 558.6182587782852 |
| inception3 | 2 | 32 | False | False | 461.21245811536164 |
| inception3 | 2 | 32 | False | True | 461.7955449394074 |
| inception3 | 2 | 32 | True | False | 776.9118181766349 |
| inception3 | 2 | 32 | True | True | 803.1606840744608 |
| inception3 | 2 | 64 | False | False | 505.62156764514185 |
| inception3 | 2 | 64 | False | True | 510.4880571128736 |
| inception3 | 2 | 64 | True | False | 910.3834390431858 |
| inception3 | 2 | 64 | True | True | 941.0765499887631 |
| inception3 | 2 | 128 | False | False | 530.073420084333 |
| inception3 | 2 | 128 | False | True | 532.0966824977393 |
| inception3 | 2 | 128 | True | False | 1031.042745566893 |
| inception3 | 2 | 128 | True | True | 1059.298960623237 |
| inception3 | 2 | 256 | True | False | 1097.3179582016833 |
| inception3 | 2 | 256 | True | True | 1112.944630855092 |
| inception3 | 4 | 32 | False | False | 846.1137665298596 |
| inception3 | 4 | 32 | False | True | 895.5092904990373 |
| inception3 | 4 | 32 | True | False | 1271.1921617960843 |
| inception3 | 4 | 32 | True | True | 1562.0603997601588 |
| inception3 | 4 | 64 | False | False | 967.953751912867 |
| inception3 | 4 | 64 | False | True | 994.9078067162436 |
| inception3 | 4 | 64 | True | False | 1513.182913280423 |
| inception3 | 4 | 64 | True | True | 1842.119923117435 |
| inception3 | 4 | 128 | False | False | 992.0283404721883 |
| inception3 | 4 | 128 | False | True | 1027.2404515646604 |
| inception3 | 4 | 128 | True | False | 1751.5253071397437 |
| inception3 | 4 | 128 | True | True | 2089.504434176232 |
| inception3 | 4 | 256 | True | False | 1992.6604934734398 |
| inception3 | 4 | 256 | True | True | 2174.329529373497 |


An updated table follows, with results aggregated over 5 separate trials that used the full compute resources and ran without the filesystem bottleneck:

| model | num_gpus | batch_size | use_fp16 | use_synth | mean of img_per_sec | std deviation of img_per_sec |
|---|---|---|---|---|---|---|
| resnet50_v1.5 | 1 | 32 | False | False | 300.7115095987747 | 0.8334252017088293 |
| resnet50_v1.5 | 1 | 32 | False | True | 303.2748894498352 | 0.518768277442615 |
| resnet50_v1.5 | 1 | 32 | True | False | 552.8543259023911 | 3.7419280205833876 |
| resnet50_v1.5 | 1 | 32 | True | True | 551.3152982879091 | 1.9256151492887408 |
| resnet50_v1.5 | 1 | 64 | False | False | 340.3455581202521 | 0.4352822822221454 |
| resnet50_v1.5 | 1 | 64 | False | True | 341.4178425579963 | 1.399820974317093 |
| resnet50_v1.5 | 1 | 64 | True | False | 679.092793370331 | 1.811493519714515 |
| resnet50_v1.5 | 1 | 64 | True | True | 679.4220076333565 | 0.4846137767752205 |
| resnet50_v1.5 | 1 | 128 | False | False | 365.97794971142156 | 0.6660901988191761 |
| resnet50_v1.5 | 1 | 128 | False | True | 365.1002182169415 | 0.290659074503093 |
| resnet50_v1.5 | 1 | 128 | True | False | 780.7795427908641 | 2.6289190299425735 |
| resnet50_v1.5 | 1 | 128 | True | True | 782.0787833811031 | 1.4035397167521415 |
| resnet50_v1.5 | 1 | 256 | True | False | 841.0209098895596 | 0.9327257479352289 |
| resnet50_v1.5 | 1 | 256 | True | True | 841.6281134422322 | 0.6754027617726148 |
| resnet50_v1.5 | 2 | 32 | False | False | 597.7123015353376 | 2.5358048324228486 |
| resnet50_v1.5 | 2 | 32 | False | True | 602.1851277710804 | 1.4932197457537746 |
| resnet50_v1.5 | 2 | 32 | True | True | 1101.2976316254073 | 3.1791943633783832 |
| resnet50_v1.5 | 2 | 64 | False | False | 680.4331810776715 | 1.1986421991311502 |
| resnet50_v1.5 | 2 | 64 | False | True | 680.168662439279 | 1.628589727697411 |
| resnet50_v1.5 | 2 | 64 | True | False | 1350.6915781119133 | 4.771936223937959 |
| resnet50_v1.5 | 2 | 64 | True | True | 1337.9212057737357 | 4.3318244554829635 |
| resnet50_v1.5 | 2 | 128 | False | False | 722.1292515998915 | 1.7079082542095976 |
| resnet50_v1.5 | 2 | 128 | False | True | 724.5907650506392 | 1.3016121917621204 |
| resnet50_v1.5 | 2 | 128 | True | False | 1560.6063143166916 | 5.574390740438405 |
| resnet50_v1.5 | 2 | 128 | True | True | 1558.8925541664316 | 3.7645037153069234 |
| resnet50_v1.5 | 2 | 256 | True | False | 1681.9367020855304 | 2.135490739835972 |
| resnet50_v1.5 | 2 | 256 | True | True | 1679.1043258972677 | 1.9428881159794065 |
| resnet50_v1.5 | 4 | 32 | False | False | 1137.8151032757787 | 3.6070602577235196 |
| resnet50_v1.5 | 4 | 32 | False | True | 1147.7653259816507 | 10.086122200344256 |
| resnet50_v1.5 | 4 | 32 | True | False | 2090.181533002181 | 27.753203322924346 |
| resnet50_v1.5 | 4 | 32 | True | True | 2118.951749390374 | 9.671318775529995 |
| resnet50_v1.5 | 4 | 64 | False | False | 1323.4940363814712 | 2.4288696630804347 |
| resnet50_v1.5 | 4 | 64 | False | True | 1336.5457321023537 | 2.8028756633168586 |
| resnet50_v1.5 | 4 | 64 | True | False | 2627.208652367316 | 30.590239282719534 |
| resnet50_v1.5 | 4 | 64 | True | True | 2605.2372571940177 | 16.77388699379653 |
| resnet50_v1.5 | 4 | 128 | False | False | 1428.9004479146274 | 5.19872681737769 |
| resnet50_v1.5 | 4 | 128 | False | True | 1431.0649693247092 | 2.1140696182827456 |
| resnet50_v1.5 | 4 | 128 | True | False | 3038.123973486709 | 32.14983993002834 |
| resnet50_v1.5 | 4 | 128 | True | True | 3050.6240630834754 | 13.087954923831944 |
| resnet50_v1.5 | 4 | 256 | True | False | 3269.060930519893 | 18.13964231626883 |
| resnet50_v1.5 | 4 | 256 | True | True | 3288.0027951477073 | 10.833033249948164 |
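
To compare run-to-run variability across configurations with very different throughput, it can help to express each standard deviation as a percentage of its mean. The sketch below does this for the 4-GPU fp16 rows of the updated table above (values copied from that table). The `relative_std` helper and the tuple layout are assumptions made for illustration.

```python
# (batch_size, use_synth, mean img_per_sec, std deviation) for resnet50_v1.5
# on 4 GPUs with fp16, copied from the updated table on this page.
rows = [
    (32, False, 2090.181533002181, 27.753203322924346),
    (32, True, 2118.951749390374, 9.671318775529995),
    (64, False, 2627.208652367316, 30.590239282719534),
    (64, True, 2605.2372571940177, 16.77388699379653),
    (128, False, 3038.123973486709, 32.14983993002834),
    (128, True, 3050.6240630834754, 13.087954923831944),
    (256, False, 3269.060930519893, 18.13964231626883),
    (256, True, 3288.0027951477073, 10.833033249948164),
]


def relative_std(mean, std):
    """Return the standard deviation as a percentage of the mean."""
    return 100.0 * std / mean


# Find the configuration with the largest relative spread across trials.
noisiest = max(rows, key=lambda row: relative_std(row[2], row[3]))

for batch, synth, mean, std in rows:
    data = "synthetic" if synth else "real"
    print(f"batch {batch:3d} {data:9s}: {relative_std(mean, std):.2f}% of mean")
print("largest relative spread:", noisiest[:2])
```

Even the noisiest of these rows varies by less than 1.5% of its mean, and the real-data runs show a larger relative spread than the synthetic runs at every batch size.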
