Nvidia OpenACC

OpenACC is a parallel programming model that facilitates the use of an accelerator device attached to a host CPU. The OpenACC API allows the programmer to supplement information available to the compilers in order to offload code from a host CPU to an attached accelerator device. HPE CCE supports full OpenACC 2.0 and partial OpenACC 2.6 for Fortran.

Nvidia OpenACC Example

The following example code showcases the very basics of OpenACC programming:

#include <stdio.h>
#include <unistd.h>

#define N 1000
int array[N];

int main(void) {
   char hostname[256];
   gethostname(hostname, sizeof(hostname));
   /* Offload the loop to the accelerator; copy the array to the device and back. */
#pragma acc parallel loop copy(array[0:N])
   for (int i = 0; i < N; i++) {
      array[i] = 3;
   }
   printf("Success on node %s!\n", hostname);
   return 0;
}
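
The example above prints its success message without inspecting the array. One way to check the result is to add a second loop with a reduction clause and compare the sum with the expected value. The following is a minimal sketch under that assumption; the helper name fill_and_sum is hypothetical and not part of the original example. When the code is compiled without OpenACC support, the pragmas are ignored and both loops run serially on the host.

```c
/* Hypothetical helper: fill a[0..n-1] with value, then sum the elements
   using an OpenACC reduction. */
long fill_and_sum(int *a, int n, int value) {
    long sum = 0;

    /* Keep the array on the device across both loops; copy it back at the end. */
#pragma acc data copy(a[0:n])
    {
#pragma acc parallel loop
        for (int i = 0; i < n; i++) {
            a[i] = value;
        }

        /* Each iteration adds into sum; the reduction clause combines the partial sums. */
#pragma acc parallel loop reduction(+:sum)
        for (int i = 0; i < n; i++) {
            sum += a[i];
        }
    }
    return sum;
}
```

Calling fill_and_sum(array, N, 3) from main should return 3000 for N = 1000, so the program could print the success message only when the sum matches.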

More information: https://www.openacc.org/resources

Nvidia OpenACC GPU

The application can be compiled using the Nvidia HPC SDK:

module swap PrgEnv-cray PrgEnv-nvhpc
cc openacc-nvidia.c -o openacc-nvidia -acc
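
To confirm that the -acc flag actually enabled OpenACC, a program can test the standard _OPENACC macro and ask the runtime how many devices it sees. The following is a minimal sketch; the helper name count_nvidia_devices is hypothetical, and acc_device_nvidia is assumed to be the device type provided by the compiler's openacc.h header.

```c
#ifdef _OPENACC
#include <openacc.h>   /* OpenACC runtime API, only used when OpenACC is enabled */
#endif

/* Hypothetical helper: number of NVIDIA devices visible to the OpenACC
   runtime, or -1 if the program was compiled without OpenACC support. */
int count_nvidia_devices(void) {
#ifdef _OPENACC
    return acc_get_num_devices(acc_device_nvidia);
#else
    return -1;
#endif
}
```

A return value of -1 would indicate that the compiler did not define _OPENACC, for example because the -acc flag was omitted.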

The OpenACC runtime libraries that the binary links against dynamically can be listed as follows:

$ ldd openacc-nvidia | grep acc
     libacchost.so => /scratch/software/packages/nvhpc/Linux_x86_64/23.11/compilers/lib/libacchost.so (0x00007fa64f5f2000)
     libaccdevaux.so => /scratch/software/packages/nvhpc/Linux_x86_64/23.11/compilers/lib/libaccdevaux.so (0x00007fa64f3d6000)
     libaccdevice.so => /scratch/software/packages/nvhpc/Linux_x86_64/23.11/compilers/lib/libaccdevice.so (0x00007fa64f0aa000)

Nvidia OpenACC Batch Job

Example batch job using 4 nodes with 8 tasks per node (32 tasks in total):

#!/bin/bash
#SBATCH -A hpcteszt
#SBATCH --partition=ai
#SBATCH --job-name=openacc-nvidia
#SBATCH --output=openacc-nvidia.out
#SBATCH --time=06:00:00
#SBATCH --nodes=4
#SBATCH --ntasks-per-node=8
#SBATCH --gres=gpu:1
srun ./openacc-nvidia
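
With several tasks per node, every task started by srun runs the same binary. If more than one GPU were visible to the tasks on a node, each task might choose a device from its node-local rank. The following is a minimal sketch of that idea, assuming Slurm's SLURM_LOCALID environment variable; the helper name pick_device is hypothetical. With a single visible device, every task maps to device 0.

```c
#include <stdlib.h>

/* Hypothetical helper: map a node-local task id (the text of SLURM_LOCALID)
   to a device number in [0, num_devices). Falls back to device 0 when the
   value is missing, does not start with a number, or no devices are known. */
int pick_device(const char *local_id, int num_devices) {
    if (local_id == NULL || num_devices <= 0) {
        return 0;
    }
    char *end = NULL;
    long id = strtol(local_id, &end, 10);
    if (end == local_id || id < 0) {
        return 0;
    }
    return (int)(id % num_devices);
}
```

In main, the result of pick_device(getenv("SLURM_LOCALID"), ndev) could be passed to acc_set_device_num from openacc.h before the first compute region, where ndev is the device count reported by the runtime.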