...

Code Block
title: squeue %f shows features
$ squeue -u $USER
       JOBID    PARTITION         NAME           USER ST       TIME  NODES   NODELIST(REASON) FEATURES
     2734871          cpu         bash         gbauer  R       1:47      2        cn[122-123] ss11
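
The FEATURES column comes from adding %f (the node features requested by the job) to the squeue format string. A minimal sketch of an equivalent explicit invocation (the field widths here are arbitrary choices):

Code Block
language: bash
title: squeue format string with %f
# %f appends the job's requested node features (e.g. ss11) as a FEATURES column
squeue -u $USER --format="%.12i %.12P %.12j %.14u %.2t %.10M %.6D %.18R %f"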

GPU direct support 

These MPI implementations should be used only when an application needs MPI plus CUDA with GPU Direct support. For small message sizes, their pure-MPI performance will be lower than that of the MPI implementations above; for large messages, performance should be close to that of the CPU-only implementations.

openmpi

Choose one of:

Code Block
module load gcc openmpi/4.1.5+cuda     # uses the default gcc/11.4.0
module load nvhpc openmpi/4.1.5+cuda   # loads the openmpi/4.1.5+cuda build compiled with the nvhpc compilers

# in testing
module load gcc openmpi/5.0.1+cuda     # only mpirun is supported; do not use with srun
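
Once one of these modules is loaded, launching is unchanged apart from the 5.0.1 caveat above. A minimal sketch (the application name and process counts are placeholders):

Code Block
language: bash
# openmpi/4.1.5+cuda: launch as usual, e.g. under Slurm
srun -n 8 ./my_mpi_cuda_app        # my_mpi_cuda_app is a placeholder binary

# openmpi/5.0.1+cuda (testing): use mpirun, not srun
mpirun -np 8 ./my_mpi_cuda_app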

see also: gpudirect s10 vs s11 performance

CrayPE Programming Environments

...

Code Block
language: bash
title: craype modules listing
collapse: true
------------------------ /opt/cray/pe/lmod/modulefiles/craype-targets/default ------------------------
   craype-accel-amd-gfx908    craype-hugepages128M    craype-hugepages512M    craype-x86-milan
   craype-accel-amd-gfx90a    craype-hugepages16M     craype-hugepages64M     craype-x86-rome
   craype-accel-amd-gfx940    craype-hugepages1G      craype-hugepages8M      craype-x86-spr-hbm
   craype-accel-host          craype-hugepages256M    craype-network-none     craype-x86-spr
   craype-accel-intel-max     craype-hugepages2G      craype-network-ofi      craype-x86-trento
   craype-accel-nvidia70      craype-hugepages2M      craype-network-ucx
   craype-accel-nvidia80      craype-hugepages32M     craype-x86-genoa
   craype-arm-grace           craype-hugepages4M      craype-x86-milan-x

--------------------------------- /opt/cray/pe/lmod/modulefiles/core ---------------------------------
   PrgEnv-cray/8.4.0      cray-R/4.2.1.2           cray-pals/1.2.12         gdb4hpc/4.15.1
   PrgEnv-gnu/8.4.0       cray-ccdb/5.0.1          cray-pmi/6.1.12          papi/7.0.1.1
   PrgEnv-nvhpc/8.4.0     cray-cti/2.18.1          cray-python/3.10.10      perftools-base/23.09.0
   PrgEnv-nvidia/8.4.0    cray-dsmml/0.2.2         cray-stat/4.12.1         sanitizers4hpc/1.1.1
   atp/3.15.1             cray-dyninst/12.3.0      craype/2.7.23            valgrind4hpc/2.13.1
   cce/16.0.1             cray-libpals/1.2.12      craypkg-gen/1.3.30
   cpe-cuda/23.09         cray-libsci/23.09.1.1    gcc-native/10.3
   cpe/23.09              cray-mrnet/5.1.1         gcc-native/11.2     (D)

Using a Cray programming environment

Here is how you can enable the GNU CrayPE programming environment. 

Code Block
[gbauer@dt-login04 ~]$ module unload openmpi gcc 
[gbauer@dt-login04 ~]$ module load PrgEnv-gnu cuda craype-x86-milan craype-accel-ncsa
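
A quick way to confirm the swap took effect (the exact versions reported will vary with the installed defaults):

Code Block
language: bash
module list        # should now show PrgEnv-gnu, craype, cray-mpich, cuda, ...
cc --version       # under PrgEnv-gnu the cc wrapper drives the GNU C compiler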

Compiling

Again, you will need to use the cc, CC, and ftn compiler wrappers.

Code Block
languagebash
# Use the HPE/Cray compiler wrappers cc, CC and ftn to compile and link
# you might need to add libraries manually 
[gbauer@dt-login04 ~]$ cc -fopenmp -o xthi xthi.c -lcuda -lcudart

The compiler wrappers link in the libmpi_gtl_cuda library, which enables GPU-RDMA with Cray MPI.
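
To check that the GTL library was actually picked up at link time, inspecting the binary is a reasonable sanity check; a sketch using the xthi binary built above:

Code Block
language: bash
# expect a line naming libmpi_gtl_cuda if the accelerator target module was loaded
ldd ./xthi | grep -i gtl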

Running a CrayPE job

See the Running jobs section above for details on the partitions etc.
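
For GPU-direct runs, Cray MPICH additionally needs GPU support enabled in the job environment. A minimal batch sketch (the partition name is a placeholder; take the real partition and account values from the Running jobs section):

Code Block
language: bash
#!/bin/bash
#SBATCH --partition=gpu          # placeholder; see Running jobs
#SBATCH --nodes=2
#SBATCH --ntasks-per-node=4
#SBATCH --gpus-per-node=4
#SBATCH --time=00:10:00

export MPICH_GPU_SUPPORT_ENABLED=1   # required by Cray MPICH when passing GPU-resident buffers
srun ./xthi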

...