Nvidia MPI
General Description https://developer.nvidia.com/mpi-solutions-gpus
Nvidia MPI Example
The following example is a Cray MPICH application that splits MPI_COMM_WORLD into even and odd sub-communicators; it can be compiled with GPU offload support as shown in the next sections.
#include <stdio.h>
#include <unistd.h>
#include <mpi.h>
int main(int argc, char *argv[]) {
    int rank, size, new_rank, new_size;
    int color, key;
    char hostname[256];

    // Initialize MPI
    MPI_Init(&argc, &argv);

    // Get the rank and size of the MPI communicator
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    // Set color and key for the split
    color = rank % 2; // Even and odd ranks will have different colors
    key = rank;       // Use the original rank as the key for ordering within the new communicator

    // Split the communicator based on color and key
    MPI_Comm new_comm;
    MPI_Comm_split(MPI_COMM_WORLD, color, key, &new_comm);

    // Get the rank and size within the new communicator
    MPI_Comm_rank(new_comm, &new_rank);
    MPI_Comm_size(new_comm, &new_size);

    gethostname(hostname, sizeof(hostname));

    // Print information
    printf("Original rank %d, color %d, key %d --> New rank %d of %d on node %s\n",
           rank, color, key, new_rank, new_size, hostname);

    // Free the new communicator
    MPI_Comm_free(&new_comm);

    // Finalize MPI
    MPI_Finalize();
    return 0;
}
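For example, with 8 ranks the even ranks (0, 2, 4, 6) receive color 0 and form one sub-communicator in which they are renumbered 0-3, while the odd ranks (1, 3, 5, 7) receive color 1 and form a second sub-communicator of the same size.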
Nvidia MPI GPU
Compiling the application with the CCE compiler requires loading the craype-accel-nvidia80 module and setting the accelerator target. This enables optimization for the Nvidia A100 cards.
module load craype-accel-nvidia80
export CRAY_ACCEL_TARGET=nvidia80
cc nvidia-mpi.c -o nvidia-mpi-cce -lcudart
The application is dynamically linked with the Cray MPI library and the CUDA GTL library:
$ ldd nvidia-mpi-cce | grep mpi
libmpi_cray.so.12 => /opt/cray/pe/lib64/libmpi_cray.so.12 (0x00007f9ebf8b4000)
libmpi_gtl_cuda.so.0 => /opt/cray/pe/lib64/libmpi_gtl_cuda.so.0 (0x00007f9ebf66e000)
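The libmpi_gtl_cuda library is what allows MPI calls to operate directly on GPU device buffers (GPU-aware MPI). Below is a minimal sketch of that usage, separate from the example above: it assumes at least two ranks, and with Cray MPICH GPU-aware transfers typically also require MPICH_GPU_SUPPORT_ENABLED=1 to be exported at run time. The buffer length and tag are illustrative.
#include <stdio.h>
#include <mpi.h>
#include <cuda_runtime.h>

int main(int argc, char *argv[]) {
    int rank;
    const int n = 4;                        // illustrative buffer length
    double host[4] = {1.0, 2.0, 3.0, 4.0};  // data staged on the host
    double *dev;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    // Allocate the communication buffer in GPU memory
    cudaMalloc((void **)&dev, n * sizeof(double));

    if (rank == 0) {
        // Copy data to the GPU and pass the device pointer straight to MPI_Send
        cudaMemcpy(dev, host, n * sizeof(double), cudaMemcpyHostToDevice);
        MPI_Send(dev, n, MPI_DOUBLE, 1, 0, MPI_COMM_WORLD);
    } else if (rank == 1) {
        // Receive directly into GPU memory, then copy back to the host to print
        MPI_Recv(dev, n, MPI_DOUBLE, 0, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
        cudaMemcpy(host, dev, n * sizeof(double), cudaMemcpyDeviceToHost);
        printf("Rank 1 received %g from a GPU buffer\n", host[0]);
    }

    cudaFree(dev);
    MPI_Finalize();
    return 0;
}
The sketch can be compiled the same way as the example above, e.g. cc gpu-aware-mpi.c -o gpu-aware-mpi -lcudart (the file name is arbitrary).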
Compiling the application with the Nvidia compiler requires switching to the Nvidia HPC SDK programming environment:
module swap PrgEnv-cray PrgEnv-nvhpc
cc nvidia-mpi.c -o nvidia-mpi
The application is dynamically linked with the Cray Nvidia MPI library:
$ ldd nvidia-mpi | grep libmpi
libmpi_nvidia.so.12 => /opt/cray/pe/lib64/libmpi_nvidia.so.12 (0x00007f1530c9e000)
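To confirm which underlying compiler the cc wrapper now invokes, you can query its version and the loaded modules (the exact output depends on the installed release):
$ cc --version
$ module list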
Nvidia MPI Batch Job
In the next batch job example we submit the application on 4 nodes with 8 tasks per node (32 tasks in total) and one GPU per node:
#!/bin/bash
#SBATCH -A hpcteszt
#SBATCH --partition=ai
#SBATCH --job-name=nvidia-mpi
#SBATCH --output=nvidia-mpi.out
#SBATCH --time=06:00:00
#SBATCH --nodes=4
#SBATCH --ntasks-per-node=8
#SBATCH --gres=gpu:1
srun ./nvidia-mpi
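Assuming the script above is saved as nvidia-mpi.sbatch (the file name is arbitrary), it can be submitted and monitored with the standard Slurm commands; the program output is collected in nvidia-mpi.out:
$ sbatch nvidia-mpi.sbatch
$ squeue -u $USER
$ cat nvidia-mpi.out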