Parallel Debugging

A description of a number of available tools can be found at the UNITE webpage.

Debugging with TotalView on JUWELS

TotalView is a powerful debugger for C, C++, Fortran 77, Fortran 90, PGI HPF, and assembler programs. Among others, it offers the following features:

  • Support for debugging of multi-process and multi-threaded applications
  • C++ support (templates, inheritance, inline functions)
  • F90 support (user types, pointers, modules)
  • 1D + 2D Array Data visualization
  • Support for parallel debugging (MPI: automatic attach, message queues, OpenMP, pthreads)
  • Scripting and batch debugging
  • Memory Debugging
  • Reverse Debugging with ReplayEngine

Using TotalView

TotalView is a debugger with a graphical user interface (GUI). To use the GUI, connect to JUWELS with ssh -X (or ssh -Y if the former fails) to enable X11 forwarding, and make sure that your local client (laptop or workstation) has a running X server. X servers are available for both Windows and Mac OS X.

More details about accessing the JUWELS login nodes can be found here.
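
As a quick sanity check before starting a GUI tool, you can test whether X11 forwarding appears to be active in your login session. The following is a minimal sketch, not an official JUWELS tool: the function name x11_forwarding_status is hypothetical, and it relies on the assumption that ssh sets the DISPLAY variable in the remote session when forwarding is enabled.

```shell
# Hypothetical helper: report whether X11 forwarding appears to be active.
# Assumption: after "ssh -X <user>@<login-host>" (or ssh -Y), the SSH
# server sets DISPLAY in the remote shell.
x11_forwarding_status() {
    if [ -n "${DISPLAY:-}" ]; then
        echo "DISPLAY is set to ${DISPLAY}"
        return 0
    fi
    echo "DISPLAY is not set; reconnect with ssh -X or ssh -Y"
    return 1
}
```

Calling x11_forwarding_status on the login node prints the current DISPLAY value, or a hint to reconnect if the variable is empty. A set DISPLAY does not guarantee that the local X server is reachable, so treat a positive result as a necessary condition only.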

Once you are logged in to JUWELS (with X11 forwarding enabled), load the modules (mind the capital T and V in TotalView's name):

module load TotalView intel-para

Before starting TotalView, we advise allocating an interactive session using salloc, as described in the Quick Introduction. Once the allocation is active, salloc starts a shell on the login node in which TotalView can be run by executing the command:

  • A startup wizard will appear.
  • Choose A new parallel program.
  • At the wizard step PARALLEL DETAILS, the important parameters are:
    • Parallel system: choose SLURM
    • Tasks: the number of tasks required by your code
    • Nodes: the number of nodes required by your code
    • Additional starter arguments: any arguments you need to pass to srun (not to your code; those are specified in a later step)
  • Press Next.
  • Here you specify the location of your binary and arguments to be passed to it.
  • Press Next.
  • Enable reverse debugging: reverse debugging capability records the execution history of your program and makes that history available for diagnosis. This new approach - working back from a failure, error, or crash to its root cause - eliminates the need to restart your program repeatedly with different breakpoint locations. The ability to do reverse debugging, stepping freely both forwards and backwards through program execution, drastically reduces the amount of time invested in troubleshooting your code.

  • Enable memory debugging: the memory debugging functions of TotalView are packaged as a separate, but integrated, client called MemoryScape. This option enables its use. Run your program under TotalView as usual, to a stopping point. Then launch MemoryScape:

    Process Window > Debug Menu > Open MemoryScape

    The MemoryScape main window should then appear.

  • CUDA debugging: if your application uses CUDA, enable debugging of the CUDA code here.

  • Press Next.

  • If your program needs special environment variables, you can add them on this screen.
  • Press Next.
  • The last screen allows you to review the launch command. From here, the only available option forward is Start Session. TotalView's main window should appear, with the starting point of your code (main) shown.
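
To make the relation between the wizard fields and the resulting launch easier to see, the sketch below assembles an srun command line from the same inputs (nodes, tasks, additional starter arguments, binary, and program arguments). It is an illustration under stated assumptions: the helper wizard_to_srun, the binary ./my_app, and the file input.dat are hypothetical, and the command that TotalView actually shows on the review screen may differ.

```shell
# Hypothetical helper: print an srun command line built from the wizard fields.
# Usage: wizard_to_srun NODES TASKS "STARTER_ARGS" BINARY [PROGRAM_ARGS...]
wizard_to_srun() {
    nodes=$1
    tasks=$2
    starter_args=$3
    shift 3
    # The remaining arguments are the binary and the arguments passed to it.
    if [ -n "$starter_args" ]; then
        echo "srun --nodes=$nodes --ntasks=$tasks $starter_args $*"
    else
        echo "srun --nodes=$nodes --ntasks=$tasks $*"
    fi
}

# Example: 2 nodes, 8 tasks, one extra srun option, one program argument.
wizard_to_srun 2 8 "--cpus-per-task=4" ./my_app input.dat
# prints: srun --nodes=2 --ntasks=8 --cpus-per-task=4 ./my_app input.dat
```

The split between starter arguments and program arguments mirrors the wizard: options placed before the binary are interpreted by srun, while everything after the binary is passed to your code.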

TotalView’s online help cannot be directly accessed from JUWELS. Please check TotalView’s help files for usage.

Memory Debugging with TotalView

To support memory debugging with TotalView's MemoryScape tool, the code must be linked with MemoryScape's heap agent. To do so, add the following parameters to the compilation/linking step:

-L$(PATH) -ltvheap_64 -Wl,-rpath,$(PATH)

where PATH equals: