Nvidia OpenACC

OpenACC is a directive-based parallel programming model for using an accelerator device attached to a host CPU. Through the OpenACC API, the programmer adds directives that give the compiler the extra information it needs to offload code from the host CPU to the attached accelerator. HPE CCE provides full OpenACC 2.0 support and partial OpenACC 2.6 support for Fortran.

Nvidia OpenACC Example

The following example code showcases the very basics of OpenACC programming:

#include <stdio.h>
#include <unistd.h>

#define N 1000
int array[N];

int main(void) {
   char hostname[256];
   gethostname(hostname, sizeof(hostname));

   /* Offload the loop; copy the array to the device and back. */
#pragma acc parallel loop copy(array[0:N])
   for (int i = 0; i < N; i++) {
      array[i] = 3;   /* integer value, since array is int */
   }
   printf("Success on node %s!\n", hostname);
   return 0;
}
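
The example above only reports success without inspecting the data. As a minimal sketch of how the result could be checked, the following helper fills an array on the accelerator and then sums it with an OpenACC reduction. The function name `fill_and_sum` is hypothetical and not part of the original example. If the code is built without OpenACC support, the pragmas are ignored and both loops run on the host.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: fill a[0..n-1] with `value`, then return the sum.
   Both loops are offloaded when the code is compiled with OpenACC enabled. */
long fill_and_sum(int *a, int n, int value) {
    assert(a != NULL && n >= 0);
    long sum = 0;

    /* copyout: the array is only written here, so it is transferred
       from the device to the host after the loop. */
    #pragma acc parallel loop copyout(a[0:n])
    for (int i = 0; i < n; i++) {
        a[i] = value;
    }

    /* copyin: the array is only read here; reduction(+:sum) combines
       the partial sums computed in parallel. */
    #pragma acc parallel loop copyin(a[0:n]) reduction(+:sum)
    for (int i = 0; i < n; i++) {
        sum += a[i];
    }
    return sum;
}
```

Called from `main` as `fill_and_sum(array, N, 3)`, the function should return `3 * N`, which gives a simple check that the offloaded loop actually ran over the whole array.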

More information: https://www.openacc.org/resources

Nvidia OpenACC GPU

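The application can be compiled with the Nvidia HPC SDK by switching to the PrgEnv-nvhpc programming environment and passing the -acc flag, which enables OpenACC: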

module swap PrgEnv-cray PrgEnv-nvhpc
cc openacc-nvidia.c -o openacc-nvidia -acc

The resulting binary is dynamically linked against the OpenACC runtime libraries:

$ ldd openacc-nvidia | grep acc
     libacchost.so => /scratch/software/packages/nvhpc/Linux_x86_64/23.11/compilers/lib/libacchost.so (0x00007fa64f5f2000)
     libaccdevaux.so => /scratch/software/packages/nvhpc/Linux_x86_64/23.11/compilers/lib/libaccdevaux.so (0x00007fa64f3d6000)
     libaccdevice.so => /scratch/software/packages/nvhpc/Linux_x86_64/23.11/compilers/lib/libaccdevice.so (0x00007fa64f0aa000)
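
Beyond inspecting the linked libraries, a program can also ask the OpenACC runtime how many devices it sees. The sketch below assumes the `_OPENACC` macro (defined by compilers when OpenACC is enabled) and the runtime call `acc_get_num_devices` with the `acc_device_nvidia` device type from `openacc.h`. The function names `openacc_device_count` and `report_openacc` are hypothetical helpers introduced for illustration.

```c
#include <stdio.h>
#ifdef _OPENACC
#include <openacc.h>
#endif

/* Hypothetical helper: number of NVIDIA devices visible to the OpenACC
   runtime, or 0 when the program was built without OpenACC support. */
int openacc_device_count(void) {
#ifdef _OPENACC
    return acc_get_num_devices(acc_device_nvidia);
#else
    return 0;
#endif
}

/* Hypothetical helper: print a one-line summary and return the device count. */
int report_openacc(void) {
    int n = openacc_device_count();
#ifdef _OPENACC
    /* _OPENACC expands to the supported specification date (yyyymm). */
    printf("OpenACC %d enabled, NVIDIA devices visible: %d\n", _OPENACC, n);
#else
    printf("Built without OpenACC; loops run on the host\n");
#endif
    return n;
}
```

Because the OpenACC-specific parts are guarded by `_OPENACC`, the same source also builds without the -acc flag and then reports that no offloading takes place.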

Nvidia OpenACC Batch Job

Example batch job using 4 nodes with 8 tasks per node (32 tasks in total) and one GPU per node:

#!/bin/bash
#SBATCH -A hpcteszt
#SBATCH --partition=ai
#SBATCH --job-name=openacc-nvidia
#SBATCH --output=openacc-nvidia.out
#SBATCH --time=06:00:00
#SBATCH --nodes=4
#SBATCH --ntasks-per-node=8
#SBATCH --gres=gpu:1
srun ./openacc-nvidia
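
Since srun starts the same binary once per task, every task prints an identical message. A minimal sketch of how each task could identify itself is shown below, reading the `SLURM_PROCID` and `SLURM_NTASKS` environment variables that Slurm sets for each launched task. The helper names `env_int` and `describe_task` are hypothetical, and the fallback values (rank 0 of 1 task) are an assumption for runs outside Slurm.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: read an integer environment variable;
   return `fallback` if it is unset, empty, or not a plain number. */
int env_int(const char *name, int fallback) {
    const char *s = getenv(name);
    if (s == NULL || *s == '\0') {
        return fallback;
    }
    char *end;
    long v = strtol(s, &end, 10);
    return (*end == '\0') ? (int)v : fallback;
}

/* Hypothetical helper: write "task R of T" into buf using Slurm's
   per-task environment; returns the snprintf result. */
int describe_task(char *buf, size_t len) {
    int rank = env_int("SLURM_PROCID", 0);    /* task index, starting at 0 */
    int ntasks = env_int("SLURM_NTASKS", 1);  /* total tasks in the job step */
    return snprintf(buf, len, "task %d of %d", rank, ntasks);
}
```

Printing the string from `describe_task` next to the hostname would make the 32 output lines of the job above distinguishable from one another.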