Compile and Execute

Compilation and Execution of Parallel Programs on JUWELS

This section gives a short introduction to compiling and running programs on JUWELS.

Compilation

The following paragraphs give an overview of the compilers, MPI wrappers and commonly used compiler options on JUWELS.

To gain access to compilers and libraries, a toolchain must be loaded. Because the installed software is analogous to that on JURECA, detailed information on how to do so can be found in Software on JURECA.

As a rule of thumb, the Intel compilers provide the highest performance on the JUWELS platform for most applications. However, we recommend experimenting with different compilers and compiler options to find the optimal setting for a specific application.

The following table lists the MPI compiler wrappers for the Intel compilers together with the names of the underlying compilers. The wrappers set up the MPI environment for the compilation. We therefore recommend always using the wrappers rather than calling the compiler drivers directly.

Programming Language    Wrapper    Intel Compiler
Fortran 90              mpif90     ifort
Fortran 77              mpif77     ifort
C++                     mpicxx     icpc
C                       mpicc      icc

Useful general options for the compilers:

-qopenmp Enables the generation of multi-threaded code based on OpenMP directives (older Intel compiler versions spelled this option -openmp)
-help Prints the long list of available options
-sox Stores useful information such as the compiler version and the options used in the executable. To extract this information, use: strings -a <executable> | grep comment:
-g Generates debugging information in the object files. This is required if you want to debug your program
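
The effect of the OpenMP option listed above can be illustrated with a small sketch. The function names sum_to and max_threads are hypothetical and chosen only for this example. The code is written so that it also compiles when OpenMP support is not enabled; in that case the directive is ignored and the loop runs serially.

```cpp
#include <cstdint>
#ifdef _OPENMP
#include <omp.h>  // only available when OpenMP support is enabled
#endif

// Sum the integers 1..n. With OpenMP enabled, the loop iterations are
// distributed across threads and the partial sums are combined by the
// reduction clause. Without OpenMP the pragma is ignored.
std::int64_t sum_to(std::int64_t n) {
    std::int64_t total = 0;
    #pragma omp parallel for reduction(+ : total)
    for (std::int64_t i = 1; i <= n; ++i) {
        total += i;
    }
    return total;
}

// Upper bound on the number of threads a parallel region may use
// (always 1 when the code was compiled without OpenMP support).
int max_threads() {
#ifdef _OPENMP
    return omp_get_max_threads();
#else
    return 1;
#endif
}
```

The result of sum_to is the same in both builds; only the number of threads doing the work differs. At run time the thread count is usually controlled through the standard OMP_NUM_THREADS environment variable.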

Some useful preprocessor options:

-D Defines a macro
-U Undefines a macro
-I Adds a directory to the include file search path
-H Prints the include files in the order in which they are included. This option is very useful if you want to find out which directories are used and in which order they are searched
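
As a minimal sketch of how -D and -U interact with source code, consider the following fragment. The macro names PROBLEM_SIZE and VERBOSE and the two functions are hypothetical and exist only for illustration.

```cpp
#include <string>

// PROBLEM_SIZE and VERBOSE are hypothetical macro names for this sketch.
#ifndef PROBLEM_SIZE
#define PROBLEM_SIZE 1024  // fallback when no -DPROBLEM_SIZE=<n> is given
#endif

// Returns the compile-time problem size selected via -D (or the fallback).
int problem_size() {
    return PROBLEM_SIZE;
}

// Returns "verbose" only if the code was compiled with -DVERBOSE.
std::string build_mode() {
#ifdef VERBOSE
    return "verbose";
#else
    return "quiet";
#endif
}
```

Compiled without any preprocessor options, problem_size() returns 1024 and build_mode() returns "quiet". Adding -DPROBLEM_SIZE=4096 -DVERBOSE to the compile command would change both results without touching the source, and a later -UVERBOSE on the same command line would remove the VERBOSE definition again.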

Linker option:

-L Adds a directory to the list of paths in which the linker searches for libraries

Some options for optimization:

-O0 No optimization. Useful if you want to debug your program (this is the default setting if you use the -g option)
-O1 Optimize with respect to code size and code locality
-O2 Optimize with respect to code speed. This is the default setting. In most cases this option is a better choice than -O1
-O3 Try this option if your code contains many loops and floating-point calculations
-ipo Interprocedural optimization
-axCORE-AVX2 Generates, in addition to a baseline code path, a code path specialized for processors that support the named instruction set (here AVX2)
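
The following sketch shows the kind of loop-dominated floating-point kernel for which trying the higher optimization levels is suggested above. The function axpy is a hypothetical example and not part of any JUWELS software; whether -O3 or a processor-specific code path actually helps has to be measured for the application at hand.

```cpp
#include <cstddef>
#include <vector>

// Computes y[i] = a * x[i] + y[i] for every element.
// x and y are expected to have the same length.
void axpy(double a, const std::vector<double>& x, std::vector<double>& y) {
    const std::size_t n = x.size();
    for (std::size_t i = 0; i < n; ++i) {
        y[i] = a * x[i] + y[i];
    }
}
```

A simple loop like this, with independent iterations, is a typical candidate for compiler vectorization. Comparing the run time of builds made with -O2 and -O3 is a reasonable first experiment.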

General command line example for the compile step:

mpicxx -O2 program.cpp -o program.x

Omitting -o program.x results in an executable named a.out.

If your program runs correctly with these settings, you can check whether setting the environment variable PSP_ONDEMAND=1 improves the performance of the application.

Execution

Programs have to be executed through the workload manager on JUWELS. Please see the Quick Introduction and Batch System pages for more information.