Advanced tutorial

This tutorial demonstrates more detailed functions and tools of JUBE. If you want a basic overview you should read the general JUBE tutorial first.

Schema validation

To validate your input files you can use DTD or schema validation. You will find jube.dtd, jube.xsd and jube.rnc inside the schema folder. You have to add this schema information to the input files you want to validate.

DTD usage:

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE jube SYSTEM "<jube.dtd path>">
<jube>
...

Schema usage:

 <?xml version="1.0" encoding="UTF-8"?>
 <jube xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
   xsi:noNamespaceSchemaLocation="<jube.xsd path>">
 ...

RELAX NG Compact Syntax (RNC for emacs nxml-mode) usage:

In order to use the provided rnc schema file schema/jube.rnc in emacs open an xml file and use C-c C-s C-f or M-x rng-set-schema-file-and-validate to choose the rnc file. You can also use M-x customize-variable rng-schema-locating-files after you loaded nxml-mode to customize the default search paths to include jube.rnc. After successful parsing emacs offers to automatically create a schema.xml file which looks like

<?xml version="1.0"?>
<locatingRules xmlns="http://thaiopensource.com/ns/locating-rules/1.0">
   <uri resource="jube-file.xml" uri="../schema/jube.rnc"/>
</locatingRules>

The next time you open the same xml file emacs will find the correct rnc for the validation based on schema.xml.

Example validation tools:

  • eclipse (using DTD or schema)

  • emacs (using RELAX NG)

  • xmllint:

    • For validation (using the DTD):

      >>> xmllint --noout --valid <xml input file>
      
    • For validation (using the DTD and Schema):

      >>> xmllint --noout --valid --schema <schema file> <xml input file>
      

Scripting parameter

In some cases a parameter has to be derived from the value of another parameter. In this case you can use a scripting parameter.

The files used for this example can be found inside examples/scripting_parameter.

The input file scripting_parameter.xml:

<?xml version="1.0" encoding="UTF-8"?>
<jube>
  <benchmark name="scripting_parameter" outpath="bench_run">
    <comment>A scripting parameter example</comment>
    
    <!-- Configuration -->
    <parameterset name="param_set">
      <!-- Normal template -->
      <parameter name="number" type="int">1,2,4</parameter>
      <!-- A template created by a scripting parameter-->
      <parameter name="additional_number" mode="python" type="int">
        ",".join(str(a*${number}) for a in [1,2])
      </parameter>
      <!-- A scripting parameter -->
      <parameter name="number_mult" mode="python" type="float">
        ${number}*${additional_number}
      </parameter>
      <!-- Normal template -->
      <parameter name="text">Number: $number</parameter>
    </parameterset>
    
    <!-- Operation -->
    <step name="operation">
      <use>param_set</use> <!-- use existing parameterset -->
      <!-- shell commands -->
      <do>echo "number: $number, additional_number: $additional_number"</do>
      <do>echo "number_mult: $number_mult, text: $text"</do>
    </step>    
  </benchmark>
</jube>

In this example we see four different parameters.

  • number is a normal template which will be expanded to three different workpackages.
  • additional_number is a scripting parameter which creates a new template based on number. The mode attribute selects the scripting language (python, perl and shell are allowed). The optional type attribute declares the result type after the expression has been evaluated; it is only used by the sort algorithm in the result step. It is not possible to create a template of different scripting parameters. Because of this second template we will get six different workpackages.
  • number_mult is a small calculation. You can use any other existing parameters (which are used inside the same step).
  • text is a normal parameter which uses the content of another parameter. For a simple concatenation parameter you do not need a scripting parameter.

For this example we will find the following output inside the run.log-file:

====== operation ======
>>> echo "number: 1, additional_number: 1"
>>> echo "number_mult: 1, text: Number: 1"
====== operation ======
>>> echo "number: 1, additional_number: 2"
>>> echo "number_mult: 2, text: Number: 1"
====== operation ======
>>> echo "number: 2, additional_number: 2"
>>> echo "number_mult: 4, text: Number: 2"
====== operation ======
>>> echo "number: 2, additional_number: 4"
>>> echo "number_mult: 8, text: Number: 2"
====== operation ======
>>> echo "number: 4, additional_number: 4"
>>> echo "number_mult: 16, text: Number: 4"
====== operation ======
>>> echo "number: 4, additional_number: 8"
>>> echo "number_mult: 32, text: Number: 4"
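The expansion into six workpackages can be retraced outside of JUBE. The following is a minimal Python sketch, not JUBE's own implementation: the function name expand_workpackages is hypothetical, and the sketch only mirrors the three parameters of the example above.

```python
def expand_workpackages(numbers=(1, 2, 4)):
    """Return one (number, additional_number, number_mult) tuple per workpackage."""
    rows = []
    for number in numbers:
        # The scripting parameter yields a comma-separated string, which is
        # then treated as a template, just like a hand-written "1,2,4".
        template = ",".join(str(a * number) for a in [1, 2])
        for additional_number in (int(v) for v in template.split(",")):
            # number_mult is a plain calculation on the two other parameters.
            rows.append((number, additional_number, number * additional_number))
    return rows


for row in expand_workpackages():
    print("number: %d, additional_number: %d, number_mult: %d" % row)
```

Three values of number multiplied by two values of additional_number give the six combinations listed in the run.log output.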

Implicit Perl or Python scripting inside the <do> or any other position is not possible. If you want to use some scripting expressions you have to create a new parameter.

Scripting pattern

Similar to the Scripting parameter, different patterns, or patterns and parameters, can also be combined. To do this, create a scripting pattern by using the mode= attribute in the same way as it is used for the Scripting parameter.

All scripting patterns are evaluated at the end of the analyse part. Each scripting pattern is evaluated once. If there are multiple matches as described in the Statistic pattern values section, only the resulting statistical pattern is available (not each individual value). Scripting patterns do not create statistic values by themselves.

In addition the default= attribute can be used to set a default pattern value, if the value can't be found during the analysis.

The files used for this example can be found inside examples/scripting_pattern.

The input file scripting_pattern.xml:

<?xml version="1.0" encoding="UTF-8"?>
<jube>
  <benchmark name="scripting_pattern" outpath="bench_run">
    <comment>A scripting_pattern example</comment>

    <!-- Configuration -->
    <parameterset name="param_set">
      <parameter name="value" type="int">0,1,2</parameter>
    </parameterset>

    <!-- Operation -->
    <step name="operation">
      <use>param_set</use>
      <do>echo "$value"</do>
    </step>

    <!-- Pattern to extract -->
    <patternset name="pattern_set">
      <!-- A normal pattern -->
      <pattern name="value_pat" type="int">$jube_pat_int</pattern>
      <!-- A combination of a pattern and a parameter -->
      <pattern name="dep_pat" type="int" mode="python">$value_pat+$value</pattern>
      <!-- This pattern is not available -->
      <pattern name="missing_pat" type="int">
        pattern_not_available: $jube_pat_int
      </pattern>
      <!-- The combination will fail (create NaN) -->
      <pattern name="missing_dep_pat" type="int" mode="python">
        $missing_pat*$value
      </pattern>
      <!-- Default value for missing pattern -->
      <pattern name="missing_pat_def" type="int" default="0">
        pattern_not_available: $jube_pat_int
      </pattern>
      <!-- Combination of default value and parameter -->
      <pattern name="missing_def_dep_pat" type="int" mode="python">
        $missing_pat_def*$value
      </pattern>
    </patternset>

    <analyser name="analyse">
      <use>pattern_set</use>
      <analyse step="operation">
        <file>stdout</file>
      </analyse>
    </analyser>

    <!-- result table creation -->
    <result>
      <use>analyse</use>
      <table name="result" style="pretty">
        <column>value</column>
        <column>value_pat</column>
        <column>dep_pat</column>
        <column>missing_pat</column>
        <column>missing_dep_pat</column>
        <column>missing_pat_def</column>
        <column>missing_def_dep_pat</column>
      </table>
    </result>
  </benchmark>
</jube>

It will create the following output:

value | value_pat | dep_pat | missing_pat | missing_dep_pat | missing_pat_def | missing_def_dep_pat
------+-----------+---------+-------------+-----------------+-----------------+--------------------
    0 |         0 |       0 |             |             nan |               0 |                   0
    1 |         1 |       2 |             |             nan |               0 |                   0
    2 |         2 |       4 |             |             nan |               0 |                   0
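The table can be retraced with a small Python sketch. It is only an illustration under assumptions, not the way JUBE evaluates patterns internally: the helper eval_scripting_pattern is hypothetical, a missing pattern is represented by None, and any expression that refers to a missing value yields NaN.

```python
import math
import re


def eval_scripting_pattern(expression, values):
    """Evaluate a $name expression; return NaN if a referenced value is missing."""
    names = re.findall(r"\$(\w+)", expression)
    if any(values.get(name) is None for name in names):
        return math.nan
    # Replace every $name by its value and evaluate the arithmetic expression.
    substituted = re.sub(r"\$(\w+)", lambda m: str(values[m.group(1)]), expression)
    return eval(substituted, {"__builtins__": {}}, {})


for value in (0, 1, 2):
    values = {
        "value": value,
        "value_pat": value,      # matched in stdout
        "missing_pat": None,     # no match and no default
        "missing_pat_def": 0,    # no match, default="0"
    }
    print(
        value,
        eval_scripting_pattern("$value_pat+$value", values),
        eval_scripting_pattern("$missing_pat*$value", values),
        eval_scripting_pattern("$missing_pat_def*$value", values),
    )
```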

Statistic pattern values

Normally a pattern should only match a single entry in your result files. But sometimes there are multiple similar entries (e.g. if the benchmark uses some iteration feature).

JUBE will create the statistical values last, min, max, avg, std, cnt and sum automatically. To use these values, you have to specify the pattern name followed by _<statistic_option>, e.g. pattern_name_last (the pattern_name itself will always be the first match).

An example for multiple matches and the statistic values can be found in examples/statistic.

The input file statistic.xml:

<?xml version="1.0" encoding="UTF-8"?>
<jube>
  <benchmark name="reduce_example" outpath="bench_run">
    <comment>A result reduce example</comment>

    <!-- Regex pattern -->
    <patternset name="pattern">
      <pattern name="number_pat" type="int">$jube_pat_int</pattern>
    </patternset>
    
    <!-- Operation -->
    <step name="write_some_numbers">
      <do>echo "1 2 3 4 5 6 7 8 9 10"</do> <!-- shell command -->
    </step>
    
    <!-- Analyse -->
    <analyser name="analyse">
      <use>pattern</use> <!-- use existing patternset -->
      <analyse step="write_some_numbers">
        <file>stdout</file> <!-- file which should be scanned -->
      </analyse>
    </analyser>
    
    <!-- Create result table -->
    <result>
      <use>analyse</use> <!-- use existing analyser -->
      <table name="result" style="pretty">
        <column>number_pat</column> <!-- first match -->
        <column>number_pat_last</column> <!-- last match -->
        <column>number_pat_min</column> <!-- min of all matches -->
        <column>number_pat_max</column> <!-- max of all matches -->
        <column>number_pat_sum</column> <!-- sum of all matches -->
        <column>number_pat_cnt</column> <!-- number of matches -->
        <column>number_pat_avg</column> <!-- avg of all matches -->
        <column format=".2f">number_pat_std</column> <!-- std of all matches -->
      </table>
    </result>
  </benchmark>
</jube>

It will create the following output:

number_pat | number_pat_last | number_pat_min | number_pat_max | number_pat_sum | number_pat_cnt | number_pat_avg | number_pat_std
-----------+-----------------+----------------+----------------+----------------+----------------+----------------+---------------
         1 |              10 |              1 |             10 |             55 |             10 |            5.5 |           3.03
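The values in this table can be checked with Python's standard library. This is a sketch for verification only; the function name pattern_statistics is hypothetical. Note that the std value of 3.03 shown above is consistent with the sample standard deviation (division by n-1), which is what statistics.stdev computes.

```python
import statistics


def pattern_statistics(matches):
    """Compute the statistic values for a list of numeric matches."""
    return {
        "first": matches[0],
        "last": matches[-1],
        "min": min(matches),
        "max": max(matches),
        "sum": sum(matches),
        "cnt": len(matches),
        "avg": statistics.mean(matches),
        "std": statistics.stdev(matches),  # sample standard deviation
    }


stats = pattern_statistics(list(range(1, 11)))
for name, value in stats.items():
    print(name, value)
```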

Jobsystem

In most cases you want to submit jobs by JUBE to your local jobsystem. You can use the normal file access and substitution system to prepare your jobfile and send it to the jobsystem. JUBE also provides some additional features.

The files used for this example can be found inside examples/jobsystem.

The input jobsystem file job.run.in for Torque/Moab (you can easily adapt your personal jobscript):

#!/bin/bash -x
#MSUB -l nodes=#NODES#:ppn=#PROCS_PER_NODE#
#MSUB -l walltime=#WALLTIME#
#MSUB -e #ERROR_FILEPATH#
#MSUB -o #OUT_FILEPATH#
#MSUB -M #MAIL_ADDRESS#
#MSUB -m #MAIL_MODE#

### start of jobscript

#EXEC#
touch #READY#

The JUBE input file jobsystem.xml:

<?xml version="1.0" encoding="UTF-8"?>
<jube>
  <benchmark name="jobsystem" outpath="bench_run">
    <comment>A jobsystem example</comment>
    
    <!-- benchmark configuration -->
    <parameterset name="param_set">
      <parameter name="number" type="int">1,2,4</parameter>
    </parameterset>
    
    <!-- Job configuration -->
    <parameterset name="executeset">
      <parameter name="submit_cmd">msub</parameter>
      <parameter name="job_file">job.run</parameter>
      <parameter name="nodes" type="int">1</parameter>
      <parameter name="walltime">00:01:00</parameter>
      <parameter name="ppn" type="int">4</parameter>      
      <parameter name="ready_file">ready</parameter>
      <parameter name="mail_mode">abe</parameter>
      <parameter name="mail_address"></parameter>
      <parameter name="err_file">stderr</parameter>
      <parameter name="out_file">stdout</parameter>
      <parameter name="exec">echo $number</parameter>
    </parameterset>
    
    <!-- Load jobfile -->
    <fileset name="files">
      <copy>${job_file}.in</copy>
    </fileset>
    
    <!-- Substitute jobfile -->
    <substituteset name="sub_job">
      <iofile in="${job_file}.in" out="$job_file" />
      <sub source="#NODES#" dest="$nodes" />
      <sub source="#PROCS_PER_NODE#" dest="$ppn" />
      <sub source="#WALLTIME#" dest="$walltime" />
      <sub source="#ERROR_FILEPATH#" dest="$err_file" />
      <sub source="#OUT_FILEPATH#" dest="$out_file" />
      <sub source="#MAIL_ADDRESS#" dest="$mail_address" />
      <sub source="#MAIL_MODE#" dest="$mail_mode" />
      <sub source="#EXEC#" dest="$exec" />
      <sub source="#READY#" dest="$ready_file" />
    </substituteset> 
         
    <!-- Operation -->
    <step name="submit" work_dir="$$WORK/jobsystem_bench_${jube_benchmark_id}_${jube_wp_id}" >
      <use>param_set</use>
      <use>executeset</use>
      <use>files,sub_job</use>
      <do done_file="$ready_file">$submit_cmd $job_file</do> <!-- shell command -->
    </step>    
  </benchmark>
</jube>

As you can see the jobfile is very general and several parameters will be used for replacement. By using a general jobfile and the substitution mechanism you can control your jobsystem directly out of your JUBE input file.
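Conceptually, the substitution step is a plain text replacement of each source marker by its dest value. The following Python sketch illustrates the idea with a shortened job template; the function substitute is hypothetical and does not reproduce JUBE's internal handling.

```python
def substitute(template_text, substitutions):
    """Replace every source marker in the text by its destination value."""
    for source, dest in substitutions.items():
        template_text = template_text.replace(source, str(dest))
    return template_text


job_template = (
    "#MSUB -l nodes=#NODES#:ppn=#PROCS_PER_NODE#\n"
    "#MSUB -l walltime=#WALLTIME#\n"
    "#EXEC#\n"
    "touch #READY#\n"
)
job_file = substitute(
    job_template,
    {
        "#NODES#": 1,
        "#PROCS_PER_NODE#": 4,
        "#WALLTIME#": "00:01:00",
        "#EXEC#": "echo 1",
        "#READY#": "ready",
    },
)
print(job_file)
```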

The submit command is a normal Shell command so there are no special JUBE tags to submit a job.

There are two new attributes:

  • done_file inside the <do> allows you to set a filename/path to a file which should be used by the jobfile to mark the end of execution. JUBE does not know when the job ends. Normally it will return when the Shell command has finished. When using a jobsystem, JUBE has to wait until the jobfile has been executed. If JUBE finds a <do> containing a done_file attribute, it will return directly and will not continue automatically until the done_file exists. If you want to check the current status of your running steps and continue the benchmark process if possible you can type:

    >>> jube continue bench_run
    

    This will continue your benchmark execution (bench_run is the benchmark directory in this example). The position of the done_file is interpreted relative to the work directory.

  • work_dir can be used to change the sandbox work directory of a step. In normal cases JUBE checks that every work directory gets a unique name. When changing the directory you must select a unique name yourself. For example you can use $jube_benchmark_id and $jube_wp_id, which are JUBE internal parameters and will be expanded to the current benchmark and workpackage ids. Files and directories out of a given <fileset> will be copied into the new work directory. Other automatic links, like the dependency links, will not be created!

You will see the following output after running the benchmark:

stepname | all | open | wait | error | done
---------+-----+------+------+-------+-----
  submit |   3 |    0 |    3 |     0 |    0

and this output after running the continue command (after the jobs were executed):

stepname | all | open | wait | error | done
---------+-----+------+------+-------+-----
  submit |   3 |    0 |    0 |     0 |    3

You have to run continue multiple times if not all done_file files had been written when continue was run for the first time.
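The wait and done columns can be understood as a simple existence check for the done_file in each work directory. The following Python sketch illustrates this idea only; step_status is a hypothetical helper, and the temporary directories stand in for the real work directories.

```python
import os
import tempfile


def step_status(work_dirs, done_file):
    """Count workpackages whose done_file exists (done) or is missing (wait)."""
    done = sum(1 for d in work_dirs if os.path.exists(os.path.join(d, done_file)))
    return {"all": len(work_dirs), "wait": len(work_dirs) - done, "done": done}


with tempfile.TemporaryDirectory() as base:
    work_dirs = [os.path.join(base, "wp_%d" % i) for i in range(3)]
    for directory in work_dirs:
        os.mkdir(directory)
    print(step_status(work_dirs, "ready"))  # no job has finished yet
    # Simulate one finished job writing its ready file.
    open(os.path.join(work_dirs[0], "ready"), "w").close()
    print(step_status(work_dirs, "ready"))
```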

Include external data

As you have seen in the example before a benchmark can become very long. To structure your benchmark you can use multiple files and reuse existing sets. There are three different include features available.

The files used for this example can be found inside examples/include.

The include file include_data.xml:

<?xml version="1.0" encoding="UTF-8"?>
<jube>
  <parameterset name="param_set">
    <parameter name="number" type="int">1,2,4</parameter>
  </parameterset>
    
  <parameterset name="param_set2">
    <parameter name="text">Hello</parameter>
  </parameterset>
    
  <dos>
    <do>echo Test</do>
    <do>echo $number</do>
  </dos>
</jube>

All files which contain data to be included must use the XML format. The include files can have a user-specific structure (they may contain tags that are not valid JUBE tags, such as <dos>), but the structure must be allowed by the searching mechanism (see below). The resulting file must have a valid JUBE structure.

The main file main.xml:

<?xml version="1.0" encoding="UTF-8"?>
<jube>
  <benchmark name="include" outpath="bench_run">
    <comment>A include example</comment>
    
    <!-- use parameterset out of an external file and add a additional parameter -->
    <parameterset name="param_set" init_with="include_data.xml">
      <parameter name="foo">bar</parameter>
    </parameterset>
    
    <!-- Operation -->
    <step name="say_hello">
      <use>param_set</use> <!-- use existing parameterset -->
      <use from="include_data.xml">param_set2</use> <!-- out of an external file -->
      <do>echo $foo</do> <!-- shell command -->
      <include from="include_data.xml" path="dos/do" /> <!-- include all available tag -->
    </step>    
  </benchmark>
</jube>

In this file there are three different include types:

The init_with can be used inside any set definition. Inside the given file the search mechanism will search for the same set (same type, same name), will parse its structure (this must be JUBE valid) and copy the content to main.xml. Inside main.xml you can add additional values or overwrite existing ones. If your include-set uses a different name inside your include file you can use init_with="filename.xml:new_name".

The second method is the <use from="...">. This is mostly the same as the init_with structure, but in this case you are not able to add or overwrite any values. The external set will be used directly. There is no set type inside the <use>; because of that, the set's name must be unique inside the include file.

The last method is the most generic include. By using <include /> you can copy any XML-nodes you want to your main-XML file. The included file can provide tags which are not JUBE-conform but it must be a valid XML-file (e.g. only one root node allowed). The resulting main configuration file must be completely JUBE valid. The path is optional and can be used to select a specific node set (otherwise the root-node itself will be included). The <include /> is the only include-method that can be used to include any tag you want. The <include /> will copy all parts without any changes. The other include types will update path names, which were relative to the include-file position.

To run the benchmark you can use the normal command:

>>> jube run main.xml

It will search for include files inside four different positions (in the following order):

  • inside a directory given over the command line interface:

    >>> jube run --include-path some_path another_path -- main.xml
    
  • inside any path given by an <include-path>-tag:

    <?xml version="1.0" encoding="UTF-8"?>
    <benchmarks>
      <include-path>
        <path>some_path</path>
        <path>another_path</path>
      </include-path>
      ...
    
  • inside any path given with the JUBE_INCLUDE_PATH environment variable (see Configuration):

    >>> export JUBE_INCLUDE_PATH=some_path:another_path
    
  • inside the same directory of your main.xml
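The search order can be pictured as a first-match lookup over an ordered list of directories. The following Python sketch assumes exactly the order given above; the function find_include_file and its parameter names are hypothetical and not part of JUBE.

```python
import os


def find_include_file(filename, cli_paths=(), xml_paths=(), env_value="", main_dir="."):
    """Return the first existing candidate, following the search order above."""
    # JUBE_INCLUDE_PATH-style value: directories separated by ":".
    env_paths = [p for p in env_value.split(":") if p]
    for directory in [*cli_paths, *xml_paths, *env_paths, main_dir]:
        candidate = os.path.join(directory, filename)
        if os.path.isfile(candidate):
            return candidate
    return None


# Example call with the environment variable as the only extra source.
print(find_include_file(
    "include_data.xml",
    env_value=os.environ.get("JUBE_INCLUDE_PATH", ""),
))
```

A file found in a directory given on the command line therefore hides a file of the same name next to main.xml.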

Tagging

Tagging is an easy way to hide selectable parts of your input file.

The files used for this example can be found inside examples/tagging.

The input file tagging.xml:

<?xml version="1.0" encoding="UTF-8"?>
<jube>
  <benchmark name="tagging" outpath="bench_run">
    <comment>Tags as logical combination</comment>

    <!-- Configuration -->
    <parameterset name="param_set">
      <parameter name="hello_str" tag="!deu+eng">Hello</parameter>
      <parameter name="hello_str" tag="deu|!eng">Hallo</parameter>
      <parameter name="world_str" tag="eng">World</parameter>
    </parameterset>
    
    <!-- Operation -->
    <step name="say_hello">
      <use>param_set</use> <!-- use existing parameterset -->
      <do>echo '$hello_str $world_str'</do> <!-- shell command -->
    </step>    
  </benchmark>
</jube>

When running this example:

>>> jube run tagging.xml

all <tags> which contain a special tag="..." attribute will be hidden if the tag expression evaluates to false. !deu stands for not deu. To combine tags, | can be used as the OR operator and + as the AND operator. Brackets are also allowed.

The result (if no tag is set on the commandline) inside the stdout file will be

Hallo $world_str

because !deu+eng and eng will be false and there is no other input available for $world_str. deu|!eng will be true.

When running the same example using a specific tag:

>>> jube run tagging.xml --tag eng

the result inside the stdout file will be

Hello World

A tag expression which evaluates to false causes the corresponding <tag> to be ignored completely! If there is no alternative this can produce a wrong execution behaviour!

Also a list of tags, separated by spaces, can be provided on the commandline.

The tag attribute can be used inside every <tag> inside the input file (except the <jube>).
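The logical combination of tags can be illustrated with a short Python sketch. It maps !, + and | to Python's not, and, or and therefore assumes Python's operator precedence (not before and before or); JUBE's own parser may behave differently in corner cases. The function tag_matches is hypothetical.

```python
import re


def tag_matches(expression, active_tags):
    """Evaluate a tag expression such as '!deu+eng' against the active tags."""
    # Replace every tag name by True/False depending on whether it is set.
    translated = re.sub(
        r"[A-Za-z_][A-Za-z0-9_]*",
        lambda m: str(m.group(0) in active_tags),
        expression,
    )
    translated = (
        translated.replace("!", " not ").replace("+", " and ").replace("|", " or ")
    )
    return eval(translated.strip(), {"__builtins__": {}}, {})


for tags in (set(), {"eng"}):
    print(sorted(tags), [tag_matches(e, tags) for e in ("!deu+eng", "deu|!eng", "eng")])
```

Without any tag only deu|!eng is true, which selects Hallo and leaves $world_str undefined; with the tag eng the first and the third expression are true, which gives Hello World.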

Platform independent benchmarking

If you want to create platform independent benchmarks you can use the include features inside of JUBE.

All platform related sets must be declared in an includable file e.g. platform.xml. There can be multiple platform.xml in different directories to allow different platforms. By changing the include-path the benchmark changes its platform specific data.

An example benchmark structure is based on three include files:

  • The main benchmark include file which contains all benchmark specific but platform independent data
  • A mostly generic platform include file which contains benchmark independent but platform specific data (this can be created once and placed somewhere central on the system; it can be easily accessed using the JUBE_INCLUDE_PATH environment variable)
  • A platform specific and benchmark specific include file which must be placed in a unique directory to allow include-path usage

Inside the platform directory you will find some example benchmark independent platform configuration files for the supercomputers at Forschungszentrum Jülich.

To avoid writing long include-paths every time you run a platform independent benchmark, you can store the include-path inside your input file. This can be combined with the tagging feature:

<?xml version="1.0" encoding="UTF-8"?>
<jube>
  <include-path>
    <path tag="plat1">some path</path>
    <path tag="plat2">another path</path>
    ...
  </include-path>
  ...
</jube>

Now you can run your benchmark using:

>>> jube run filename.xml --tag plat1

Multiple benchmarks

Often you only have one benchmark inside your input file. But it is also possible to store multiple benchmarks inside the same input file:

<?xml version="1.0" encoding="UTF-8"?>
<jube>
  <benchmark name="a" outpath="bench_runs">...</benchmark>
  <benchmark name="b" outpath="bench_runs">...</benchmark>
  ...
</jube>

All benchmarks can use the same global (as a child of <jube>) declared sets. Often it might be better to use an include feature instead. JUBE will run every benchmark in the given order. Every benchmark gets a unique benchmark id.

To select only one benchmark you can use:

>>> jube run filename.xml --only-bench a

or:

>>> jube run filename.xml --not-bench b

This information can also be stored inside the input file:

<?xml version="1.0" encoding="UTF-8"?>
<jube>
  <selection>
    <only>a</only>
    <not>b</not>
  </selection>
  ...
</jube>

Shared operations

Sometimes you want to communicate between the different workpackages of a single step or you want a single operation to run only once for all workpackages. Here you can use shared steps.

The files used for this example can be found inside examples/shared.

The input file shared.xml:

<?xml version="1.0" encoding="UTF-8"?>
<jube>
  <benchmark name="shared" outpath="bench_run">
    <comment>A shared folder example</comment>
    
    <!-- Configuration -->
    <parameterset name="param_set">
      <parameter name="number" type="int">1,2,4</parameter>
    </parameterset>
    
    <!-- Operation -->
    <step name="a_step" shared="shared">
      <use>param_set</use>
      <!-- shell command will run three times -->
      <do>echo $jube_wp_id >> shared/all_ids</do>
      <!-- shell command will run one time -->
      <do shared="true">cat all_ids</do>
    </step>    
  </benchmark>
</jube>

The step must be marked using the shared attribute. The name given inside this attribute will be the name of a symbolic link, which will be created inside every single sandbox work directory, pointing to a single shared folder. Every workpackage can access this folder by using its own link. In this example every workpackage will write its own id into a shared file ($jube_wp_id is one of JUBE's internal variables).

To mark an operation as a shared operation, shared="true" must be used inside the <do>. The shared operation will start after all workpackages have reached its execution position. The work directory for the shared operation is the shared folder itself.

You will get the following directory structure:

bench_run               # the given outpath
|
+- 000000               # the benchmark id
   |
   +- configuration.xml # the stored benchmark configuration
   +- workpackages.xml  # workpackage information
   +- 000000_a_step     # the first workpackage
      |
      +- done           # workpackage finished marker
      +- work           # user sandbox folder
         |
         +- stderr      # standard error messages of used shell commands
         +- stdout      # standard output of used shell commands
         +- shared      # symbolic link pointing to shared folder
   +- 000001_a_step     # the second workpackage
   +- 000002_a_step     # the third workpackage
   +- a_step_shared     # the shared folder
      |
      +- stdout         # standard output of used shell commands
      +- stderr         # standard error messages of used shell commands
      +- all_ids        # benchmark specific generated file

Environment handling

Shell environment handling can be very important to configure paths or parameters of your program.

The files used for this example can be found inside examples/environment.

The input file environment.xml:

<?xml version="1.0" encoding="UTF-8"?>
<jube>
  <benchmark name="environment" outpath="bench_run">
    <comment>An environment handling example</comment>
    
    <!-- Configuration -->
    <parameterset name="param_set">
      <parameter name="EXPORT_ME" export="true">VALUE</parameter>
    </parameterset>
    
    <!-- Operations -->
    <step name="first_step" export="true">
      <do>export SHELL_VAR=Hello</do> <!-- export a Shell var -->
      <do>echo "$$SHELL_VAR world"</do><!-- use exported Shell var --> 
    </step>
    
    <!-- Create a dependency between both steps -->
    <step name="second_step" depend="first_step">
      <use>param_set</use>
      <do>echo $$EXPORT_ME</do>
      <do>echo "$$SHELL_VAR again"</do> <!-- use exported Shell var out of previous step -->
    </step>    
  </benchmark>
</jube>

In normal cases all <do> within one <step> share the same environment. All exported variables of one <do> will be available inside the next <do> within the same <step>.

By using export="true" inside of a <parameter> you can export additional variables to your Shell environment. Be aware that this example uses $$ to explicitly use Shell substitution instead of JUBE substitution.

You can also export the complete environment of a step to a dependent step by using export="true" inside of <step>.
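The role of $$ can be illustrated with Python's string.Template, which happens to use the same escape convention: $name is substituted, while $$ is reduced to a single $ and left for the Shell. This is only an analogy, not JUBE's substitution code, and the function jube_like_substitute is hypothetical.

```python
from string import Template


def jube_like_substitute(command, parameters):
    """Substitute $name with parameter values; $$ becomes a literal $."""
    return Template(command).safe_substitute(parameters)


# $$SHELL_VAR survives as $SHELL_VAR, so the Shell resolves it at run time.
print(jube_like_substitute('echo "$$SHELL_VAR world"', {}))
# A single $ is resolved from the given parameters before the Shell sees it.
print(jube_like_substitute("echo $number and $$HOME", {"number": 1}))
```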

Parameter dependencies

Sometimes you need parameters which are based on other parameters, or only specific parameter combinations make sense while others are useless or wrong. JUBE offers several techniques to create such a more complex workflow.

The files used for this example can be found inside examples/parameter_dependencies.

The input file parameter_dependencies.xml:

<?xml version="1.0" encoding="UTF-8"?>
<jube>
  <benchmark name="parameter_dependencies" outpath="bench_run">
    <comment>A parameter_dependencies example</comment>

    <!-- Configuration -->
    <parameterset name="param_set">
      <parameter name="index" type="int">0,1</parameter>
      <parameter name="text" mode="python">["hello","world"][$index]</parameter>
    </parameterset>

    <parameterset name="depend_param_set0">
      <parameter name="number" type="int">3,5</parameter>
    </parameterset>

    <parameterset name="depend_param_set1">
      <parameter name="number" type="int">1,2,4</parameter>
    </parameterset>

    <!-- Operation -->
    <step name="operation">
      <use>param_set</use> <!-- use basic parameterset -->
      <use>depend_param_set$index</use> <!-- use dependent parameterset -->
      <use from="include_file.xml:depend_param_set0:depend_param_set1">
        depend_param_set$index
      </use>
      <do>echo "$text $number $number2"</do>
    </step>
  </benchmark>
</jube>

The include file include_file.xml:

<?xml version="1.0" encoding="UTF-8"?>
<jube>
  <parameterset name="depend_param_set0">
    <parameter name="number2" type="int">10</parameter>
  </parameterset>

  <parameterset name="depend_param_set1">
    <parameter name="number2" type="int">20</parameter>
  </parameterset>
</jube>

The easiest way to handle dependencies is to define an index-parameter which can be used in other scripting parameters to combine all dependent parameter combinations.

Complete sets can also be marked as dependent on a specific parameter by using this parameter in the <use>-tag. When using parametersets out of another file the correct set name must be given within the from attribute, because these sets will be loaded in a pre-processing step before the corresponding parameter is evaluated. Sets out of different files can also be combined within the same <use> by using the file1:set1,file2:set2 syntax. The set names must be unique.
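The parameter combinations produced by this example can be enumerated with a short Python sketch. It only mirrors the values of the input files shown above and assumes that the index selects both the local and the included set; the function dependent_workpackages is hypothetical.

```python
def dependent_workpackages():
    """Enumerate the 'text number number2' combinations of the example."""
    depend_param_set = {0: [3, 5], 1: [1, 2, 4]}   # sets in the main file
    included_set = {0: 10, 1: 20}                  # sets in include_file.xml
    rows = []
    for index in (0, 1):
        text = ["hello", "world"][index]
        # The index selects which dependent sets contribute number and number2.
        for number in depend_param_set[index]:
            rows.append("%s %d %d" % (text, number, included_set[index]))
    return rows


for row in dependent_workpackages():
    print(row)
```

Index 0 contributes two workpackages and index 1 three, so five combinations result instead of the full cross product.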

Parameter update

Once a parameter is specified and evaluated the first time, its value will not change. Sometimes this behaviour can produce unwanted results:

<parameter name="foo">$jube_wp_id</parameter>

In this example foo should hold the $jube_wp_id. If you have two steps, where one step depends on the other one, foo will be available in both, but it will only be evaluated in the first one.

There is a simple work-around to change the update behaviour of a parameter by using the attribute update_mode:

  • update_mode="never" No update (default behaviour)
  • update_mode="use" Re-evaluate the parameter if the parameterset is explicitly used
  • update_mode="step" Re-evaluate the parameter for each new step
  • update_mode="cycle" Re-evaluate the parameter for each new cycle loop, but not at the beginning of a new step
  • update_mode="always" Combine step and cycle

Within a cycle loop no new workpackages can be created. Templates will be reevaluated, but they can not increase the number of existing workpackages within a cycle.

Within the result generation, the parameter value, which is presented in the result table is the value of the selected analysed step. If another parameter representation is needed as well, all other steps can be reached by using <parameter_name>_<step_name>.

The files used for this example can be found inside examples/parameter_update.

The input file parameter_update.xml:

<?xml version="1.0" encoding="UTF-8"?>
<jube>
  <benchmark name="parameter_updates" outpath="bench_run">
    <comment>A parameter_update example</comment>

    <!-- Configuration -->
    <parameterset name="foo">
      <parameter name="bar_never" mode="text" update_mode="never">
        iter_never: $jube_wp_id
      </parameter>
      <parameter name="bar_use" mode="text" update_mode="use">
        iter_use: $jube_wp_id
      </parameter>
      <parameter name="bar_step" mode="text" update_mode="step">
        iter_step: $jube_wp_id
      </parameter>
    </parameterset>

    <!-- Operation -->
    <step name="step1">
      <use>foo</use>
      <do>echo $bar_never</do>
      <do>echo $bar_use</do>
      <do>echo $bar_step</do>
    </step>

    <step name="step2" depend="step1">
      <use>foo</use>
      <do>echo $bar_never</do>
      <do>echo $bar_use</do>
      <do>echo $bar_step</do>
    </step>

    <step name="step3" depend="step2">
      <do>echo $bar_never</do>
      <do>echo $bar_use</do>
      <do>echo $bar_step</do>
    </step>
  </benchmark>
</jube>

This example shows the effect of the three update modes update_mode="never", update_mode="use" and update_mode="step". Keep in mind that the steps have to depend on each other; otherwise all three parameters produce identical output within each step.

Step iteration

Especially in the context of benchmarking, an application should be executed multiple times to generate meaningful statistical values. The handling of statistical values is described in Statistic pattern values. This allows you to aggregate multiple result lines if your application itself supports running multiple times.

In addition, JUBE provides an iteration feature to run a specific step and its parametrisation multiple times.

The files used for this example can be found inside examples/iterations.

The input file iterations.xml:

<?xml version="1.0" encoding="UTF-8"?>
<jube>
  <benchmark name="iterations" outpath="bench_run">
    <comment>An iteration example</comment>

    <!-- Configuration -->
    <parameterset name="param_set">
      <parameter name="foo" type="int">1,2,4</parameter>
      <parameter name="bar" mode="text" update_mode="step">$foo iter:$jube_wp_iteration</parameter>
    </parameterset>

    <step name="first_step" iterations="2">
      <use>param_set</use> <!-- use existing parameterset -->
      <do>echo $bar</do> <!-- shell command -->
    </step>

    <step name="second_step" depend="first_step" iterations="2">
      <do>echo $bar</do> <!-- shell command -->
    </step>

    <!-- analyse without reduce -->
    <analyser name="analyse_no_reduce" reduce="false">
      <analyse step="second_step" />
    </analyser>

    <!-- Analyse with reduce -->
    <analyser name="analyse" reduce="true">
      <analyse step="second_step" />
    </analyser>

    <result>
      <use>analyse</use>
      <use>analyse_no_reduce</use>
      <table name="result" style="pretty">
        <column>jube_res_analyser</column>
        <column>jube_wp_id_first_step</column>
        <column>jube_wp_id</column>
        <column>jube_wp_iteration_first_step</column>
        <column>jube_wp_iteration</column>
        <column>foo</column>
      </table>
    </result>
  </benchmark>
</jube>

In this example both first_step and second_step are executed twice for each parameter and dependency configuration. Because of the given parameter, first_step is executed 6 times in total (3 parameter combinations x 2 iterations). second_step is executed 12 times (6 runs of the step it depends on x 2 iterations). Each run is executed in the normal way using its individual sandbox folder.

$jube_wp_iteration holds the individual iteration id. The update_mode is needed here to reevaluate the parameter bar in step 2.

The analyser attribute reduce can be set to true or false, either to aggregate all results of the same parameter combination for the given step or to see all individual results. If reduce=true is set (the default behaviour), the output of the individual runs that use the same parametrisation is treated like one big continuous file before the statistical patterns are applied.

jube_res_analyser | jube_wp_id_first_step | jube_wp_id | jube_wp_iteration_first_step | jube_wp_iteration | foo
------------------+-----------------------+------------+------------------------------+-------------------+----
analyse_no_reduce |                     0 |          6 |                            0 |                 0 |   1
analyse_no_reduce |                     0 |          7 |                            0 |                 1 |   1
analyse_no_reduce |                     1 |          8 |                            1 |                 2 |   1
analyse_no_reduce |                     1 |          9 |                            1 |                 3 |   1
analyse_no_reduce |                     2 |         10 |                            0 |                 0 |   2
analyse_no_reduce |                     2 |         11 |                            0 |                 1 |   2
analyse_no_reduce |                     3 |         12 |                            1 |                 2 |   2
analyse_no_reduce |                     3 |         13 |                            1 |                 3 |   2
analyse_no_reduce |                     4 |         14 |                            0 |                 0 |   4
analyse_no_reduce |                     4 |         15 |                            0 |                 1 |   4
analyse_no_reduce |                     5 |         16 |                            1 |                 2 |   4
analyse_no_reduce |                     5 |         17 |                            1 |                 3 |   4
          analyse |                     5 |         16 |                            1 |                 2 |   4
          analyse |                     0 |          7 |                            0 |                 1 |   1
          analyse |                     1 |          8 |                            1 |                 2 |   1
          analyse |                     2 |         10 |                            0 |                 0 |   2
          analyse |                     3 |         12 |                            1 |                 2 |   2
          analyse |                     4 |         15 |                            0 |                 1 |   4
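
The example above uses no patterns, so the effect of reduce on statistical values is not visible in its table. As a rough sketch, the fragment below could be added to iterations.xml to extract the iteration id that first_step prints and to aggregate it per parameter combination. The set name pattern and the pattern name number are hypothetical; the _avg and _cnt suffixes are the statistical values described in Statistic pattern values:

```xml
<!-- Sketch only: "pattern" and "number" are illustrative names -->
<patternset name="pattern">
  <!-- matches the integer after "iter:" in the output of echo $bar -->
  <pattern name="number" type="int">iter:$jube_pat_int</pattern>
</patternset>

<analyser name="analyse_stat" reduce="true">
  <use>pattern</use>
  <analyse step="first_step">
    <file>stdout</file>
  </analyse>
</analyser>

<result>
  <use>analyse_stat</use>
  <table name="result_stat" style="pretty">
    <column>foo</column>
    <column>number_avg</column>
    <column>number_cnt</column>
  </table>
</result>
```

With reduce="true" the two iterations of each foo value should contribute to the same number_avg and number_cnt entry. With reduce="false" each iteration would be reported in its own row.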

Step cycle

Instead of creating a new workpackage, you can also repeat the <do> commands inside a step by using the cycle feature.

The files used for this example can be found inside examples/cycle.

The input file cycle.xml:

<?xml version="1.0" encoding="UTF-8"?>
<jube>
  <benchmark name="cycle" outpath="bench_run">
    <comment>A cycle example</comment>

    <step name="a_step" cycles="5">
      <do break_file="done">echo $jube_wp_cycle</do>
      <do active="$jube_wp_cycle==2">touch done</do>
    </step>

  </benchmark>
</jube>

The cycles attribute allows you to repeat all <do> commands within a step multiple times. The break_file attribute can be used to cancel the loop: once the given file exists, all following commands in the current cycle and all remaining cycles are skipped (the command itself is still executed). In the given example the output will be:

0
1
2
3

In contrast to the iterations, all executions of the cycle feature take place inside the same folder.
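
The cycle feature can be combined with the update_mode="cycle" setting from the Parameter update section. The following input file is a minimal sketch; the benchmark name and the parameter names fixed and current are chosen only for illustration:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<jube>
  <benchmark name="cycle_update_sketch" outpath="bench_run">
    <parameterset name="cycle_set">
      <!-- default behaviour: evaluated once, keeps its first value -->
      <parameter name="fixed" update_mode="never">$jube_wp_cycle</parameter>
      <!-- re-evaluated for each new cycle -->
      <parameter name="current" update_mode="cycle">$jube_wp_cycle</parameter>
    </parameterset>

    <step name="a_step" cycles="3">
      <use>cycle_set</use>
      <do>echo "fixed=$fixed current=$current"</do>
    </step>
  </benchmark>
</jube>
```

If the update modes behave as described, current should follow the cycle counter, while fixed should keep the value from its first evaluation.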