3.3. LLVM

This section describes how to use the C/C++/Fortran compilers based on LLVM. LLVM, which generates binaries for compute nodes, is available on both the login nodes (Intel) and the compute nodes.

3.3.1. Environment setting

The compilers are located at:

On login node (cross environment)       /vol0004/apps/r/OSS_CN/llvm-21.1.0/cross_clangfx
On compute node (native environment)    /vol0004/apps/r/OSS_CN/llvm-21.1.0/own_clangfx

(If you prefer to use the previous version, ‘llvm-v19.1.4’ remains available.)

Execute the following command to set up the environment before running the compilers.

login node

[_LNIlogin]$ module load LLVM/llvmorg-21.1.0

compute node

[_CNlogin]$ module load LLVM/llvmorg-21.1.0

The available languages are as follows:

Software    Language    Version    Command
Clang       C           21.1.0     clang
Clang++     C++         21.1.0     clang++
Flang       Fortran     21.1.0     flang
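
After loading the module, it can be useful to confirm that the three commands in the table resolve on PATH. The following is a minimal sketch rather than part of the official procedure; it relies only on the standard shell builtin "command -v" and on the command names listed above.

```shell
# Check whether each LLVM driver listed above is on PATH.
# Intended to be run after `module load LLVM/llvmorg-21.1.0`.
report=""
for cmd in clang clang++ flang; do
    if path=$(command -v "$cmd" 2>/dev/null); then
        line="$cmd: $path"
    else
        line="$cmd: not found (is the LLVM module loaded?)"
    fi
    echo "$line"
    report="${report}${line}
"
done
```

If a command is reported as not found, the module has most likely not been loaded in the current shell.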

3.3.2. How to Use

Here is an example of how to compile a C program. The commands are the same on the login nodes and the compute nodes. For compilation options, please refer to the Clang and Flang documentation provided by LLVM.

Compilation can take a long time, particularly with flang. In that case, please allocate a larger cache size on the compute node, as described below.

[Sequential execution]

login node
[_LNIlogin]$ clang -O3 source_file
compute node
[_CNlogin]$ clang -O3 source_file

[OpenMP]

Add the -fopenmp option to enable OpenMP and link the OpenMP library.

login node
[_LNIlogin]$ clang -O3 -fopenmp source_file
compute node
[_CNlogin]$ clang -O3 -fopenmp source_file
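
As a quick check of the OpenMP build, the sketch below writes a small test program and compiles it with the options shown above. The file names omp_hello.c and omp_hello.out are hypothetical, and the -o option is added only to name the output. The compile step runs only when clang is found on PATH.

```shell
# Write a small OpenMP test program (hypothetical file name omp_hello.c).
cat > omp_hello.c <<'EOF'
#include <omp.h>
#include <stdio.h>

int main(void)
{
    /* Each thread in the parallel region prints its own ID. */
    #pragma omp parallel
    printf("thread %d of %d\n", omp_get_thread_num(), omp_get_num_threads());
    return 0;
}
EOF

# Same options as in the example above, plus -o to name the output file.
build_cmd="clang -O3 -fopenmp -o omp_hello.out omp_hello.c"
echo "$build_cmd"

# Compile only where clang is available (that is, after `module load`).
if command -v clang >/dev/null 2>&1; then
    $build_cmd || echo "build failed; check that the LLVM module is loaded"
fi
```

The resulting binary targets the compute nodes, so it should be run there, for example from a job script.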

[MPI]

You can use the MPI library via the mpiclang, mpiclang++, and mpiflang commands.

login node
[_LNIlogin]$ mpiclang -O3 -fopenmp source_file
compute node
[_CNlogin]$ mpiclang -O3 -fopenmp source_file

See also

The same procedure applies when using C++ (mpiclang++) and Fortran (mpiflang).
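
To make the correspondence concrete, the following sketch prints the equivalent command line for each of the three MPI wrappers. It only prints the commands and does not run them. The source file names are hypothetical placeholders.

```shell
# Hypothetical source file names; replace them with your own.
cmds=""
for pair in "mpiclang:sample_c.c" "mpiclang++:sample_cxx.cpp" "mpiflang:sample_f.f90"; do
    wrapper=${pair%%:*}   # text before the first ':'
    src=${pair#*:}        # text after the first ':'
    line="$wrapper -O3 -fopenmp -o ${src%.*}.out $src"
    echo "$line"
    cmds="${cmds}${line}
"
done
```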

Example: Compilation in a job script

When submitting a compilation job to a compute node, specify the pjsub option --llio cn-cache-size=1Gi.

If it is omitted, compilation may take longer.

#!/bin/sh -x
#PJM -L  "node=1"
#PJM -L  "rscgrp=small"
#PJM -L  "elapse=01:00:00"
#PJM -x PJM_LLIO_GFSCACHE=/vol0004
#PJM -g groupname
#PJM --llio cn-cache-size=1Gi
#PJM -s
#

module purge
module load LLVM/llvmorg-21.1.0

mpiclang -o sample1.out sample1.c

3.3.3. How to execute

The compiled binary can be executed either from an interactive job or from a job script. Here is an example script. To improve performance, transfer the executable and the LLVM libraries using llio_transfer.

#!/bin/sh -x
#PJM -L  "node=128"
#PJM --mpi "max-proc-per-node=4"
#PJM -L  "rscgrp=small"
#PJM -L  "elapse=01:00:00"
#PJM -x PJM_LLIO_GFSCACHE=/vol0004
#PJM -g groupname
#PJM -s
#

module purge
module load LLVM/llvmorg-21.1.0
llio_transfer `find ${LLVM_BASEDIR}/lib64 -type f -name \*.so\*`
llio_transfer `find ${MPI_HOME}/lib64 -type f -name \*.so\*`
llio_transfer sample1.out

mpiexec ./sample1.out
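
The find expression in the script above selects every regular file whose name contains ".so", which covers versioned shared objects such as libfoo.so.1 but not static archives. The sketch below illustrates this on a throwaway directory. All file names in it are hypothetical, and it does not call llio_transfer itself.

```shell
# Build a throwaway directory that mimics a lib64 layout (hypothetical file names).
demo=$(mktemp -d)
touch "$demo/libfoo.so" "$demo/libfoo.so.1" "$demo/libbar.a" "$demo/README"

# Same find expression as in the job script above.
matched=$(find "$demo" -type f -name \*.so\* | sort)
echo "$matched"

# Static archives and other files are not selected.
count=$(echo "$matched" | grep -c .)
echo "files selected for transfer: $count"

rm -rf "$demo"
```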

See also

For C++ and Fortran programs, the same execution procedure should be followed.

3.3.4. Additional Information

With this LLVM, the OpenBLAS and FFTW libraries are available for use.

Please note that these libraries cannot be used with other compilers.

Usage of OpenBLAS

The libraries are located at the following paths:

Library                   Path of Library
LP64, sequential          /vol0004/apps/r/OSS_CN/llvm/openblas-seq
LP64, thread-parallel     /vol0004/apps/r/OSS_CN/llvm/openblas-omp
ILP64, sequential         /vol0004/apps/r/OSS_CN/llvm/openblas-seq-ilp64
ILP64, thread-parallel    /vol0004/apps/r/OSS_CN/llvm/openblas-omp-ilp64

How to compile

When compiling a program that uses OpenBLAS, please first configure the environment for LLVM. Then, add the option -lopenblas at the linking stage.

For multi-threaded execution, please also add the option -fopenmp.

In addition, please add the following options depending on the programming language.

For C/C++:

  • Please specify the library and header files of the OpenBLAS installation directory (${OPENBLAS_DIR}) using the -L and -I options.

  • Please add -lflang_rt.runtime.

For Fortran:

  • Please specify the library files of the OpenBLAS installation directory (${OPENBLAS_DIR}) using the -L option.

How to execute

When running a program that uses OpenBLAS, please add ${OPENBLAS_DIR}/lib to the LD_LIBRARY_PATH.
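
Putting the compile and run-time settings together for a C program might look like the sketch below. It assumes the LP64 thread-parallel build, assumes that the header files are under ${OPENBLAS_DIR}/include (the text above names only the lib directory), and uses a hypothetical source file sample_blas.c. The compile command is printed rather than executed.

```shell
# Pick one of the four builds listed above (here: LP64, thread-parallel).
OPENBLAS_DIR=/vol0004/apps/r/OSS_CN/llvm/openblas-omp

# Assumption: headers are under ${OPENBLAS_DIR}/include; adjust if the layout differs.
# sample_blas.c is a hypothetical source file that calls OpenBLAS routines.
compile_cmd="clang -O3 -fopenmp -I${OPENBLAS_DIR}/include sample_blas.c -o sample_blas.out -L${OPENBLAS_DIR}/lib -lopenblas -lflang_rt.runtime"
echo "$compile_cmd"

# At run time the shared library must be found, so prepend its directory.
# The ${VAR:+...} form avoids a trailing ':' when LD_LIBRARY_PATH is empty.
LD_LIBRARY_PATH="${OPENBLAS_DIR}/lib${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}"
export LD_LIBRARY_PATH
echo "$LD_LIBRARY_PATH"
```

For a Fortran program, the -I option and -lflang_rt.runtime would be dropped, in line with the list above.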

Modifications from the original OpenBLAS

Compared with the original OpenBLAS 0.3.26, the following routines have been tuned for A64FX.

  • DGEMM

  • SGEMM

Usage of FFTW

The FFTW libraries are located at /vol0004/apps/r/OSS_CN/llvm/fftw3 (referred to below as ${FFTW_DIR}).

When compiling a program that uses FFTW, please first configure the environment for LLVM. Then, add the following options at the linking stage:

  • Single-precision sequential library : -lfftw3f -lm

  • Single-precision multi-threaded library : -lfftw3f_omp -lfftw3f -lm

  • Double-precision sequential library : -lfftw3 -lm

  • Double-precision multi-threaded library : -lfftw3_omp -lfftw3 -lm

When using the multi-threaded library, please add the option -fopenmp.

In addition, please add the following options depending on the programming language:

For C/C++:

  • Please specify the library and header files of the FFTW installation directory (${FFTW_DIR}) using the -L and -I options.

  • Please add -lflang_rt.runtime.

For Fortran:

  • Please specify the library files of the FFTW installation directory (${FFTW_DIR}) using the -L option.

How to execute

When running a program that uses FFTW, please add ${FFTW_DIR}/lib to the LD_LIBRARY_PATH.
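
The four link-option combinations listed above can be summarized in a small helper, sketched below. The function name fftw_link_flags, the source file sample_fft.c, and the header location ${FFTW_DIR}/include are assumptions made for illustration. The final compile command is printed rather than executed.

```shell
FFTW_DIR=/vol0004/apps/r/OSS_CN/llvm/fftw3

# Print the link options from the list above.
#   $1: precision, "single" or "double"
#   $2: threading, "seq" or "omp"
fftw_link_flags() {
    case "$1" in
        single) base=fftw3f ;;
        double) base=fftw3 ;;
        *) echo "unknown precision: $1" >&2; return 1 ;;
    esac
    case "$2" in
        seq) echo "-l${base} -lm" ;;
        omp) echo "-fopenmp -l${base}_omp -l${base} -lm" ;;
        *) echo "unknown threading: $2" >&2; return 1 ;;
    esac
}

flags=$(fftw_link_flags double omp)
# Assumption: headers are under ${FFTW_DIR}/include.
echo "clang -O3 -I${FFTW_DIR}/include sample_fft.c -o sample_fft.out -L${FFTW_DIR}/lib ${flags} -lflang_rt.runtime"
```

At run time, ${FFTW_DIR}/lib still has to be added to LD_LIBRARY_PATH as described above.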

Modifications from the original FFTW

  • Header files supporting various SIMD instructions have been adapted for SVE instructions.

  • In the automatic tuning mechanism called plan generation, processing routes that are never selected have been removed to improve efficiency.

Large page

Performance can be improved by using large pages. On the Fugaku system, please add the following option at link time:

-L/opt/FJSVxos/mmm/lib64 -lmpg -Wl,-T/opt/FJSVxos/mmm/util/bss-2mb.lds

This option must be specified before all other libraries. For details on the large page functionality, please refer to the manual Job Operation Software: End-user’s Guide for HPC Extensions.
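
The ordering requirement can be illustrated with a link line for a hypothetical program that also links FFTW. In this sketch the object file sample_fft.o and the output name are placeholders, and the command is printed rather than executed.

```shell
# Large-page options as given above.
LPG_OPTS="-L/opt/FJSVxos/mmm/lib64 -lmpg -Wl,-T/opt/FJSVxos/mmm/util/bss-2mb.lds"

# Other libraries for a hypothetical program that links FFTW.
FFTW_DIR=/vol0004/apps/r/OSS_CN/llvm/fftw3
OTHER_LIBS="-L${FFTW_DIR}/lib -lfftw3 -lm -lflang_rt.runtime"

# The large-page options come first, followed by every other library.
link_cmd="clang -O3 sample_fft.o -o sample_fft.out ${LPG_OPTS} ${OTHER_LIBS}"
echo "$link_cmd"
```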

Differences from the Clang mode of the Fujitsu compiler

In the Fujitsu compiler, a Clang mode based on LLVM is provided for C/C++. However, compiler features uniquely extended by Fujitsu are not available in LLVM. The main differences are shown below.

  • Environment variables and optimization pragmas specific to the Clang mode of the Fujitsu compiler are not available in the Clang provided by LLVM.

  • Compiler-specific options of the Fujitsu compiler (e.g., -K, -ffj) cannot be used.

  • Math libraries for the Fujitsu compiler (e.g., -SSL2, -lfjlapacksve) cannot be used.

  • A64FX processor-specific features (such as sector cache and hardware barrier) cannot be used.

  • Because the instruction sequences generated by the Clang provided with LLVM and by the Clang mode of the Fujitsu compiler may differ, the profiling results of the same program may not match.

Cautions

Caution for REAL(KIND=16) and COMPLEX(KIND=16) types.

With the cross-compilation flang available on the login nodes, the following errors occur when compiling Fortran programs that use the REAL(KIND=16) or COMPLEX(KIND=16) types.

Example of the failure

flang complex16.f90
error: Semantic errors in complex16.f90
 ./complex16.f90:3:3: error: COMPLEX(KIND=16) is not an enabled type for this target
 complex(kind=16) :: x
 ^^^^^^^^^^^^^^^^^^^^^
flang real16.f90
error: Semantic errors in real16.f90
 ./real16.f90:3:3: error: REAL(KIND=16) is not an enabled type for this target
 real(kind=16) :: x, y, z
 ^^^^^^^^^^^^^^^^^^^^^^^^

Workaround

When using these types, please compile in the native environment (i.e., on the compute nodes).
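
A build script can guard against this case by checking the machine architecture before invoking flang. The sketch below is a heuristic, not an official check: it assumes that "uname -m" reports aarch64 on the compute nodes and a different value on the Intel login nodes. The function name can_build_kind16 is hypothetical.

```shell
# Decide whether the current machine can compile REAL(KIND=16) code with flang.
#   $1: machine architecture string, as printed by `uname -m`
can_build_kind16() {
    case "$1" in
        aarch64) return 0 ;;   # native environment (compute node)
        *)       return 1 ;;   # e.g. x86_64 login node: the cross flang rejects KIND=16
    esac
}

arch=$(uname -m)
if can_build_kind16 "$arch"; then
    echo "native environment ($arch): flang real16.f90"
else
    echo "cross environment ($arch): submit the compilation as a job to a compute node"
fi
```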