Heterogeneous and Cross-Module Jobs

Heterogeneous Jobs

Slurm 17.11 introduced support for heterogeneous jobs. A heterogeneous job consists of several job components, each of which can have its own job options. In particular, the different components can request nodes from different partitions. That way, a heterogeneous job can, for example, be spawned across multiple modules of our supercomputers.

Specifying Individual Job Options

The syntax of the interactive and non-interactive submission mechanisms, salloc and sbatch, has been extended to allow the user to specify individual options for the different job components. On the command line, the sequence of arguments is partitioned into several blocks, with the colon : acting as the separator. The resulting heterogeneous job has as many job components as there are blocks of command line arguments. The first block contains the job options of the first job component as well as common job options that apply to all other components. The second block contains the options for the second job component, and so on. The abstract syntax is as follows:

$ salloc <options 0 + common> : <options 1> [ : <options 2>... ]

The following invocation of salloc submits an interactive heterogeneous job that consists of two components, the first requesting one node from the partition_a partition, the second requesting 16 nodes from the partition_b partition.

$ salloc -A budget -p partition_a -N 1 : -p partition_b -N 16
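When the number of components varies, for instance in a submission wrapper, the colon-separated command line can be assembled from one option string per component. The following is a minimal sketch only: het_cmdline is a hypothetical helper (not a Slurm tool), and it merely prints the command line rather than executing it.

```shell
# het_cmdline is a hypothetical helper, not part of Slurm.
# Usage: het_cmdline <command> <block 0> [<block 1> ...]
# It joins the option blocks with " : " and prints the resulting command line.
het_cmdline() (
    line=$1
    shift
    sep=" "
    for block in "$@"; do
        line="$line$sep$block"
        sep=" : "   # every block after the first is preceded by a colon
    done
    printf '%s\n' "$line"
)

# Print (do not run) the salloc invocation from the example above.
het_cmdline salloc "-A budget -p partition_a -N 1" "-p partition_b -N 16"
```

Printing the command first makes it easy to inspect the blocks before actually submitting the job.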

Submitting non-interactive heterogeneous jobs through sbatch works similarly, but the syntax for separating blocks of options in a batch script is slightly different. Instead of the colon :, batch scripts use the usual directive #SBATCH followed by the word packjob as a separator:

#SBATCH <options 0 + common>
#SBATCH packjob
#SBATCH <options 1>
#SBATCH packjob
#SBATCH <options 2>...

To submit a non-interactive heterogeneous job with the same setup as the interactive job above, the job script would read:

#SBATCH -A budget -p partition_a -N 1
#SBATCH packjob
#SBATCH -p partition_b -N 16
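If job scripts are generated by another script, the packjob separators can be emitted automatically from one option string per component. This is a sketch under stated assumptions: het_directives is a hypothetical helper, not part of Slurm, and it only prints the directive lines described above.

```shell
# het_directives is a hypothetical helper, not part of Slurm.
# It prints one "#SBATCH <options>" line per argument and places the
# "#SBATCH packjob" separator between consecutive job components.
het_directives() (
    first=1
    for block in "$@"; do
        [ "$first" -eq 1 ] || printf '#SBATCH packjob\n'
        printf '#SBATCH %s\n' "$block"
        first=0
    done
)

# Print the header of a job script for the two-component example above.
# Redirect the output to a file and append the srun lines to obtain a
# script that can be handed to sbatch.
{
    printf '#!/bin/bash\n'
    het_directives "-A budget -p partition_a -N 1" "-p partition_b -N 16"
}
```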

As always, job options can also be specified on the sbatch command line, and options given on the command line can be mixed with options given in the batch script. Again, the colon : acts as the separator between blocks of command line arguments. For example, the partitions on which particular job components should always run can be fixed in the job script, while the number of nodes is left to be specified on the command line. The following batch script, submitted via sbatch -N 1 : -N 16 <batch script>, results in the same heterogeneous job as the previous two examples.

#SBATCH -A budget -p partition_a
#SBATCH packjob
#SBATCH -p partition_b

An overview of the available partitions can be found on the Quick Introduction page.

Running Job Components Side by Side

As with homogeneous jobs, applications are launched inside a heterogeneous job using srun. Like salloc and sbatch, srun can be used to specify different options and also commands to run for different components through blocks of command line arguments separated by the colon :.

$ srun <options and command 0> : <options and command 1> [ : <options and command 2> ]

For example, in a heterogeneous job with two components, srun accepts up to two blocks of arguments and commands:

$ srun --ntasks-per-node 24 ./prog1 : --ntasks-per-node 1 ./prog2

The first block applies to the first component, the second block to the second component, and so on. If there are fewer blocks than job components, the resources of the remaining job components go unused, as no application is launched there.
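The positional rule can be illustrated with a few lines of shell. This sketch assumes that blocks are assigned to components strictly in order (no explicit component selection), and idle_components is a hypothetical helper that lists the components left without an application.

```shell
# idle_components is a hypothetical helper, not part of Slurm.
# Usage: idle_components <number of job components> <number of srun blocks>
# Blocks are assumed to map to components 0, 1, ... in order, so every
# component whose index is >= the number of blocks stays idle.
idle_components() (
    n_components=$1
    i=$2
    out=""
    while [ "$i" -lt "$n_components" ]; do
        out="$out${out:+ }$i"
        i=$((i + 1))
    done
    printf '%s\n' "$out"
)

# Three job components, but only two blocks passed to srun.
idle_components 3 2
```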

The option --pack-group=<expr> can be used to explicitly assign a block of command line arguments to a job component. Its argument <expr> is either a single job component index in the range 0 ... n - 1, where n is the number of job components, a range of indices like 1-3, or a comma-separated list of indices and ranges like 1,3-5. The following invocation of srun runs the same application ./prog in components 0 and 2 of a three-component heterogeneous job, leaving component 1 idle:

$ srun --pack-group=0,2 ./prog

The same application ./prog can be run in all three job components using:

$ srun --pack-group=0-2 ./prog
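To check which components a given expression selects before launching anything, the index syntax described above can be expanded by hand. The following is a minimal sketch: expand_pack_group is a hypothetical helper that only mirrors the documented syntax (single indices, ranges, comma-separated lists). It does not call Slurm and does not validate the indices against the actual number of job components.

```shell
# expand_pack_group is a hypothetical helper, not part of Slurm.
# It expands an expression such as "1,3-5" into the list of job
# component indices it denotes, separated by spaces.
expand_pack_group() (
    set -f      # no pathname expansion while splitting the expression
    IFS=,       # split on commas (local to this subshell function)
    out=""
    for item in $1; do
        case $item in
            *-*) lo=${item%-*}; hi=${item#*-} ;;   # range "a-b"
            *)   lo=$item; hi=$item ;;             # single index
        esac
        i=$lo
        while [ "$i" -le "$hi" ]; do
            out="$out${out:+ }$i"
            i=$((i + 1))
        done
    done
    printf '%s\n' "$out"
)

# The expressions used in the examples above and in the syntax description.
expand_pack_group 0,2
expand_pack_group 0-2
expand_pack_group 1,3-5
```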

For detailed information about Slurm, please take a look at the Quick Introduction and Batch system pages. The official Slurm documentation on heterogeneous jobs provides additional information on this feature.

Loading Software in a Heterogeneous Environment

Executing applications in a modular environment can be a challenging task, especially when different modules have different architectures or the dependencies of the programs are not uniform.

Uniform Architecture and Dependencies

As long as the architectures of the given modules are uniform and the binaries to be executed have no mutually exclusive dependencies, one can rely on the module command. Take a look at the Quick Introduction if module is new to you.

#!/bin/bash -x
module load [...]
srun ./prog1 : ./prog2

Non-Uniform Architectures and Mutually Exclusive Dependencies

A tool called xenv was implemented to ease the task of loading modules for heterogeneous jobs. For details on the supported command line arguments, execute xenv -h on the given system.

srun --account=<budget account> \
  --partition=<batch, ...> xenv -L intel-para IMB-1 : \
  --partition=<knl, ...> xenv -L Architecture/KNL -L intel-para IMB-1