# Known Issues on JURECA

Note

The following list of known issues provides a quick reference for users experiencing problems on JURECA. We strongly encourage all users to report problems to the user support, whether they are listed below or not.

## Incorrect RMA rendezvous cache handling in ParaStation MPI and Intel MPI

Description: In rare cases, Fortran applications that deallocate and then reallocate buffers used for MPI communication observed data corruption in transit. The root cause was identified as an error in the handling of the rendezvous cache used by the MPI implementations. The error occurs only with newer Intel compilers.

Status: Resolved in stage 2016b and newer since 2017-04-11.

Workaround/Suggested Action: A fix for ParaStation MPI and a workaround for Intel MPI were installed in the 2016b stage on 2017-04-11. Users of this or newer stages are not affected by the problem.

## Intel compiler error with std::valarray and optimized headers

Description: An error was found in the implementation of several C++ std::valarray operations in the Intel compiler suite; it occurs if the icpc option -use-intel-optimized-headers is used.

Status: Open.

Workaround/Suggested Action: Users are strongly advised not to use the -use-intel-optimized-headers option on JURECA.
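Since the option may be hidden in build scripts rather than typed on the command line, a quick grep over the project tree can locate it. The snippet below is only a sketch: it creates a throwaway Makefile purely to demonstrate the check (the demo directory and its flag line are illustrative, not part of any real project).

```shell
# Demonstration: search a build tree for the problematic icpc option.
found=no
demo=$(mktemp -d)                      # throwaway directory for the demo
printf 'CXXFLAGS = -O2 -use-intel-optimized-headers\n' > "$demo/Makefile"

# grep -rq exits 0 when the flag appears anywhere in the tree
if grep -rq -- "-use-intel-optimized-headers" "$demo"; then
    found=yes
    echo "flag found: remove -use-intel-optimized-headers from your build flags"
fi
rm -rf "$demo"
```

In a real project, point grep at your own source or build directory instead of the demo directory.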

## Errors with IntelMPI and Slurm's cyclic job/task distribution

Description: When using Intel MPI together with srun's option --distribution=cyclic, or when the variable SLURM_DISTRIBUTION=cyclic is exported, the number of MPI tasks that can be spawned is limited: job steps with more than 6 MPI tasks in total fail completely.

Be aware that cyclic distribution is Slurm's default behavior when the number of tasks is no larger than the number of allocated nodes, which is typically the case when using compute nodes interactively. The problem was reported to Intel in 2017, and a future release may resolve it.

Status: Open.

Workaround/Suggested Action: The recommended workarounds are:

1. Avoid srun's option --distribution=cyclic
2. Unset SLURM_DISTRIBUTION inside the jobscript, or export SLURM_DISTRIBUTION=block before starting the srun
3. Export I_MPI_SLURM_EXT=0 to disable the optimized startup algorithm of Intel MPI
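The exported-variable workarounds (items 2 and 3) can be sketched in a minimal jobscript fragment. Note that ./myapp and the commented srun line are placeholders, not a tested configuration:

```shell
#!/bin/bash
# Sketch of workarounds 2 and 3 in a jobscript (application name is a placeholder).

# Workaround 2: force block distribution instead of cyclic.
export SLURM_DISTRIBUTION=block

# Workaround 3: disable Intel MPI's optimized Slurm startup algorithm.
export I_MPI_SLURM_EXT=0

echo "SLURM_DISTRIBUTION=$SLURM_DISTRIBUTION I_MPI_SLURM_EXT=$I_MPI_SLURM_EXT"
# srun ./myapp    # launch as usual; the job step now uses block distribution
```

Only one of the two workarounds is needed; exporting both is harmless but redundant.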

## MPI_Gather and MPI_Gatherv hang with Intel MPI 2018.02

Description: With Intel MPI version 2018.02, MPI_Gather hangs for large message sizes; MPI_Gatherv also fails to terminate.

Status: Workaround implemented.

Workaround/Suggested Action: Mitigating environment variables have been added to the module file.

## Collectives in Intel MPI 2019 can lead to hanging processes or segmentation faults

Description: Problems with collective operations and Intel MPI 2019 have been observed. Segmentation faults in MPI_Allreduce, MPI_Alltoall, and MPI_Alltoallv have been reproduced, and hangs in MPI_Allgather and MPI_Allgatherv have been observed. Because the occurrence depends on the algorithm the MPI implementation chooses dynamically, the issue may or may not be visible depending on job and buffer sizes. Hangs in MPI_Cart_create calls have also been reported, likely caused by problems in the underlying collective operations.

Status: Open.

Workaround/Suggested Action: The default Intel MPI in Stage 2018b has been changed to Intel MPI 2018.04. Alternatively, falling back to Stage 2018a may be an option.