Examples of using the CUBE C writer library

This example shows, in several short steps, how to create a cube file with the C writer library.

The example does not show the optimization that is needed to prevent unnecessary seeks while writing.

Commented source

Include the standard C headers.

#include <stdio.h>
#include <stdint.h>
#include <stdlib.h>
#include <unistd.h>
#include <errno.h>
#include <string.h>

Include CUBE headers.

Note that the names of the CUBE4 C writer headers carry the prefix cubew_ (cubew_XXX.h).

#include "cubew_cube.h"

/* Number of dimensions of the two-dimensional topologies defined below. */
#define NDIMS 2

Start main and define the name of the cube file. The extension ".cubex" will be appended automatically.

int main(int argc, char* argv[])
{
char cubefile[12] = "simple-cube";
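The effect of the automatically appended extension can be sketched in plain C. The helper report_file_name below is hypothetical and not part of the CUBE API; it only illustrates which file name one should expect for a given base name, under the assumption that the writer simply appends ".cubex".

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper (not part of the CUBE API): builds "<base>.cubex".
   Returns 0 on success, -1 if the result does not fit into out. */
int report_file_name(const char *base, char *out, size_t len)
{
    assert(base != NULL && out != NULL);
    int n = snprintf(out, len, "%s.cubex", base);
    return (n >= 0 && (size_t)n < len) ? 0 : -1;
}
```

For the base name "simple-cube" the helper produces "simple-cube.cubex", provided the output buffer holds at least 18 bytes.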

Create the structure of the cube. CUBE_MASTER specifies that, in a parallel MPI environment, this process (usually rank 0) writes all parts of the cube (anchor, indexes and data). The last argument is ignored in the current version of the CUBE C writer.

cube_t* cube = cube_create("example", CUBE_MASTER, CUBE_FALSE);
if (!cube) {
    fprintf(stderr, "cube_create failed!\n");
    exit(1);
}

Specify general properties of the cube object.

cube_def_mirror(cube, "http://icl.cs.utk.edu/software/kojak/");
cube_def_mirror(cube, "http://www.fz-juelich.de/jsc/kojak/");
cube_def_attr(cube, "description", "a simple example");
cube_def_attr(cube, "experiment time", "November 1st, 2004");
cube_set_statistic_name(cube, "mystat");

Now we start to define the dimensions of the cube.

First we define the metric dimension. Note that metrics form a tree and that parents have to be declared before their children.

Every metric is of one of two kinds: inclusive or exclusive.
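The difference between inclusive and exclusive values can be illustrated without the library. In the usual sense, an inclusive value of a tree node covers the node and all its descendants, while an exclusive value covers the node alone. The sketch below is an assumption-laden illustration, not CUBE code: the flat parent-index layout and the function name are hypothetical. It relies on the same rule as above, namely that parents are declared before their children.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical flat tree: parent[i] is the index of the parent of node i,
   or -1 for a root. Parents must appear before their children. */
void inclusive_to_exclusive(const int *parent, const double *incl,
                            double *excl, size_t n)
{
    for (size_t i = 0; i < n; ++i) {
        excl[i] = incl[i];
    }
    for (size_t i = 0; i < n; ++i) {
        if (parent[i] >= 0) {
            assert((size_t)parent[i] < i);   /* parent declared earlier */
            /* Remove the child's inclusive share from its direct parent. */
            excl[parent[i]] -= incl[i];
        }
    }
}
```

With a root holding the inclusive value 10 and two children holding 4 and 3, the root's exclusive value is 3, while the leaves keep their values.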

Every metric needs a display name, a unique name, the type of its values (INTEGER, DOUBLE, MAXDOUBLE, MINDOUBLE, and others), a unit of measurement, a value (usually an empty string), a URL where the documentation of the metric can be found, a description, and its parent in the metric tree.

The cube returns a pointer to a cube_metric structure, which has to be used for saving values to or reading values from the cube.

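cube_metric *met0, *met1, *met2;
/* Root metric: it has no parent. The empty URL, the description and the
   kind of met0 are placeholder assumptions. */
met0 = cube_def_met(cube, "Time", "Uniq_name1", "FLOAT", "sec", "",
                    "", "1st level", NULL, CUBE_METRIC_INCLUSIVE);
/* Children of met0. The documentation URL is left empty here (assumption). */
met1 = cube_def_met(cube, "User time", "Uniq_name2", "FLOAT", "sec", "",
                    "", "2nd level", met0, CUBE_METRIC_INCLUSIVE);
met2 = cube_def_met(cube, "System time", "Uniq_name3", "INTEGER", "sec", "",
                    "", "2nd level", met0, CUBE_METRIC_EXCLUSIVE);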

Then we define the call tree dimension. This dimension is defined in two steps:

  1. One defines a list of regions in the instrumented source code;
  2. One builds a call tree with the regions defined in the previous step.

First one defines the regions.

Every region has a name, a start and an end line, a URL with the documentation of the region, a description and a source file (module). Regions form a flat list, so there is no parent-child relation between them.

The cube returns a pointer to a cube_region structure, which can be used later for calculations, visualization or access to the data.

char* mod = "/ICL/CUBE/example.c";
cube_region *regn0, *regn1, *regn2;
regn0 = cube_def_region(cube, "main", 21, 100, "", "1st level", mod);
regn1 = cube_def_region(cube, "<<init>>foo", 1, 10, "", "2nd level", mod);
regn2 = cube_def_region(cube, "<<loop>>bar", 11, 20, "", "2nd level", mod);

Then one defines the actual dimension, the call tree.

The call tree consists of so-called cnodes. A cnode stands for a "call path".

Every cnode takes as parameters a region, a source file (module), its id and its parent cnode (the caller).

Parent cnodes have to be defined before their children. A region might be entered from different places in the program, so different cnodes might refer to the same region.

cube_cnode *cnode0, *cnode1, *cnode2;
cnode0 = cube_def_cnode_cs(cube, regn0, mod, 21, NULL);
cnode1 = cube_def_cnode_cs(cube, regn1, mod, 60, cnode0);
cnode2 = cube_def_cnode_cs(cube, regn2, mod, 80, cnode0);
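The relation between the flat region list and the call tree can be sketched with plain arrays. The types region_t and cnode_t and the function call_path below are hypothetical and only mirror the idea described above: regions carry no hierarchy, each cnode points to a region and to its caller, and the same region may appear in several cnodes.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef struct { const char *name; } region_t;        /* hypothetical */
typedef struct { int region; int parent; } cnode_t;   /* parent -1 = root */

/* Writes the call path of cnode id (for example "main/bar") into buf.
   Returns 0 on success, -1 if buf is too small. */
int call_path(const region_t *regions, const cnode_t *cnodes, int id,
              char *buf, size_t len)
{
    assert(id >= 0);
    if (len == 0) {
        return -1;
    }
    buf[0] = '\0';
    if (cnodes[id].parent >= 0) {
        /* First emit the path of the caller, then a separator. */
        if (call_path(regions, cnodes, cnodes[id].parent, buf, len) != 0) {
            return -1;
        }
        if (strlen(buf) + 1 >= len) {
            return -1;
        }
        strcat(buf, "/");
    }
    const char *name = regions[cnodes[id].region].name;
    if (strlen(buf) + strlen(name) >= len) {
        return -1;
    }
    strcat(buf, name);
    return 0;
}
```

If a region foo is referenced by one cnode below main and by another cnode below bar, the two cnodes yield the distinct paths "main/foo" and "main/bar/foo", although both share one region entry.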

CUBE4 supports two kinds of cnode parameters: numeric and string parameters. Every cnode can carry any number of both.

cube_cnode_add_numeric_parameter(cnode0, "Phase", 1);
cube_cnode_add_numeric_parameter(cnode0, "Phase", 2);
cube_cnode_add_string_parameter(cnode0, "Iteration", "Initialization");
cube_cnode_add_string_parameter(cnode2, "Etappe", "Finish");

The last dimension is the system tree dimension. Currently CUBE defines the system dimension with a fixed hierarchy: MACHINE → NODES → PROCESSES → THREADS

This leads to a fixed sequence of calls in the system dimension definition:

  1. First one creates the root of the system dimension: cube_machine. A machine has a name and a description.
  2. A machine consists of cube_nodes. Every cube_node has a name and a cube_machine as its parent.
  3. Several cube_processes run on every cube_node (as many as there are cores available). A cube_process has a name, an MPI rank and a cube_node as its parent.
  4. Every cube_process spawns one or more cube_threads (OMP, Pthreads, Java threads). A cube_thread has a name, its rank and a cube_process as its parent.

The cube returns a pointer to a cube_machine, cube_node, cube_process or cube_thread, which has to be used later to define the next level of the system tree or to access the data in the cube.

cube_machine* mach = cube_def_mach(cube, "MSC<<juelich>>", "");
cube_node* node = cube_def_node(cube, "Athena<<juropa>>", mach);
cube_process* proc0 = cube_def_proc(cube, "Process 0<<master>>", 0, node);
cube_process* proc1 = cube_def_proc(cube, "Process 1<<worker>>", 1, node);
cube_thread* thrd0 = cube_def_thrd(cube, "Thread 0<<iterator>>", 0, proc0);
cube_thread* thrd1 = cube_def_thrd(cube, "Thread 1<<solver>>", 1, proc1);
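Since data rows are later given per thread, it helps to see how the system tree determines the length of a row and the position of a thread in it. The sketch below is not CUBE code. It assumes that threads are ordered process by process in the order of their definition, and the function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Total number of threads of the first nprocs processes,
   which is the length of one data row when all processes are counted. */
size_t row_length(const size_t *threads_per_proc, size_t nprocs)
{
    size_t total = 0;
    for (size_t p = 0; p < nprocs; ++p) {
        total += threads_per_proc[p];
    }
    return total;
}

/* Position of thread number thread of process proc within a row,
   assuming threads are laid out process by process. */
size_t row_index(const size_t *threads_per_proc, size_t nprocs,
                 size_t proc, size_t thread)
{
    assert(proc < nprocs);
    assert(thread < threads_per_proc[proc]);
    /* Skip all threads of the preceding processes. */
    return row_length(threads_per_proc, proc) + thread;
}
```

For the system tree defined above (two processes with one thread each) a row has two entries, and the thread of the second process sits at index 1.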

CUBE can carry a set of so-called "topologies": mappings THREAD → (x, y, z, ...)

The GUI then visualizes every value (cube_metric, cube_cnode, cube_thread) for the selected metric and cnode as a 1D, 2D or 3D set of points in different colors.

First one specifies the number of dimensions (any number is supported) and two vectors, one with the size and one with the periodicity of every dimension, and creates a structure of type cube_cartesian.

long dimv0[NDIMS] = { 5, 5 };
int periodv0[NDIMS] = { 1, 0 };
cube_cartesian* cart0 = cube_def_cart(cube, NDIMS, dimv0, periodv0);
cube_cart_set_name(cart0, "Application Topology 1");

The coordinates are given as a vector; each call maps one thread to a position in the topology.

long coordv[NDIMS] = { 0, 0};
cube_def_coords(cube, cart0, thrd1, coordv);
long dimv1[NDIMS] = { 3, 3 };
int periodv1[NDIMS] = { 1, 0 };
cube_cartesian* cart1 = cube_def_cart(cube, NDIMS, dimv1, periodv1);
cube_cart_set_name(cart1, "MPI Topology 3");
long coordv0[NDIMS] = { 0, 1 };
long coordv1[NDIMS] = { 1, 0 };
cube_def_coords(cube, cart1, thrd0, coordv0);
cube_def_coords(cube, cart1, thrd1, coordv1);
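The size and periodicity vectors can be read as follows. The sketch is not CUBE code and its function names are hypothetical; it assumes the usual meaning of a periodic Cartesian dimension, in which stepping past the last position wraps around to the first one.

```c
#include <assert.h>
#include <stddef.h>

/* Returns 1 if coord lies inside the grid described by dims, else 0. */
int coords_in_range(size_t ndims, const long *dims, const long *coord)
{
    for (size_t d = 0; d < ndims; ++d) {
        if (coord[d] < 0 || coord[d] >= dims[d]) {
            return 0;
        }
    }
    return 1;
}

/* Position reached from pos after step moves along one dimension of the
   given size. Periodic dimensions wrap around; for a non-periodic
   dimension -1 is returned when the step leaves the grid. */
long neighbour(long pos, long step, long size, int periodic)
{
    assert(size > 0);
    long next = pos + step;
    if (periodic) {
        next %= size;
        if (next < 0) {
            next += size;
        }
        return next;
    }
    return (next >= 0 && next < size) ? next : -1;
}
```

In a 5 x 5 topology with periodicity { 1, 0 }, a step of +1 from position 4 leads to position 0 in the first dimension and leaves the grid in the second one.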

In the same way one can create any number of topologies. They are shown in the GUI.

long dimv2[4] = { 3, 3, 3, 3 };
int periodv2[4] = { 1, 0, 0, 0 };
cube_cartesian* cart2 = cube_def_cart(cube, 4, dimv2, periodv2);
long coordv20[4] = { 0, 1, 0, 0 };
long coordv21[4] = { 1, 0, 0 ,0 };
cube_def_coords(cube, cart2, thrd0, coordv20);
cube_def_coords(cube, cart2, thrd1, coordv21);
long dimv3[14] = { 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3 };
int periodv3[14] = { 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0 };
cube_cartesian* cart3 = cube_def_cart(cube, 14, dimv3, periodv3);
long coordv32[14] = { 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 2 };
long coordv33[14] = { 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0 };
cube_def_coords(cube, cart3, thrd0, coordv32);
cube_def_coords(cube, cart3, thrd1, coordv33);

Once the dimensions are defined, one fills the cube object with data. Topologies can also be defined after the cube has been filled.

Every data value is specified by three "coordinates": (cube_metric, cube_cnode, cube_thread)

Note that cube_machine, cube_node and cube_process are not "coordinates". They are used only to describe the physical structure of the machine.

The actual writing is done metric by metric and row by row. First all values of one metric are written, then those of the next metric, and so on. Metrics must not be interleaved in this sequence.

Cube writes data row-wise. This means that, for a given cnode, one has to provide an array of values, ordered like the threads in the system dimension.
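The rule that metrics must not be interleaved can be checked on a planned sequence of writes before any data is written. The helper below is a hypothetical sketch and not part of the CUBE API: it receives, for every row in writing order, the index of the metric the row belongs to.

```c
#include <assert.h>
#include <stddef.h>

/* Returns 1 if the rows are grouped by metric, that is, no metric
   reappears after a different metric has been started; otherwise 0. */
int rows_grouped_by_metric(const int *metric_of_row, size_t nrows)
{
    assert(metric_of_row != NULL || nrows == 0);
    for (size_t i = 1; i < nrows; ++i) {
        if (metric_of_row[i] == metric_of_row[i - 1]) {
            continue;               /* still inside the same metric */
        }
        /* A new metric starts here: it must not have occurred before. */
        for (size_t j = 0; j + 1 < i; ++j) {
            if (metric_of_row[j] == metric_of_row[i]) {
                return 0;
            }
        }
    }
    return 1;
}
```

A sequence of three rows for metric 0, followed by one row for metric 1 and one for metric 2, passes the check; the sequence 0, 1, 0 does not.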

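/* One value per thread, in the order of the threads in the system tree.
   The numbers are placeholders. */
double sev1[2] = { 1.0, 2.0 };
cube_write_sev_row(cube, met0, cnode2, sev1);
cube_write_sev_row(cube, met0, cnode1, sev1);
cube_write_sev_row(cube, met0, cnode0, sev1);
cube_write_sev_row(cube, met1, cnode0, sev1);
/* met2 is declared as INTEGER, so its row holds integer values. */
uint64_t sev2[2] = { 1, 2 };
cube_write_sev_row(cube, met2, cnode2, sev2);
/* Release the cube object; assumed here to be the step that completes the file. */
cube_free(cube);
printf("Test file %s complete.\n", cubefile);
return 0;
}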

Scalasca     Copyright © 1998–2015 Forschungszentrum Jülich, Jülich Supercomputing Centre