SIONlib: Scalable I/O library for parallel access to task-local files

Templates for parallel I/O with SIONlib
Sample parallel MPI program for I/O with SIONlib
Serial program for SIONlib file access
Other examples

Templates for parallel I/O with SIONlib

All templates can also be found in the distribution package in the directory
sionlib/examples

Parallel write:

sid = sion_paropen_mpi( ... , chunksize, comm, &fileptr, ...);  /* collective */

loop: {
        sion_ensure_free_space(sid, nbytes);                    /* non-collective */
        fwrite(data, 1, nbytes, fileptr);
      }

sion_parclose_mpi(sid);                                         /* collective */

- the call to sion_ensure_free_space can be omitted if it is guaranteed that
no more bytes than specified in chunksize are written to the file.
- chunksize can differ from task to task
- if the data does not fit in the current chunk, sion_ensure_free_space
assigns a new chunk in the file to the task and advances the file pointer to
the new position. This operation is non-collective; all information about the
locally used chunks and the number of bytes written to each chunk is buffered
in memory and stored in the SION file by the collective sion_parclose_mpi.
- one parameter of the open function is an MPI communicator, which allows
a SION file to be opened from a subset of all MPI tasks
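
The chunk bookkeeping described in the notes above can be illustrated with a small simulation. This is only a sketch and does not call SIONlib: the type chunk_log, the limit MAX_CHUNKS, and the chunk_log_* functions are hypothetical names invented for this illustration, and the real library may organize its internal data differently. It assumes that a single request never exceeds chunksize.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_CHUNKS 16   /* hypothetical limit, chosen only for this sketch */

/* Hypothetical per-task bookkeeping; not SIONlib's internal structure. */
typedef struct {
    size_t chunksize;                  /* capacity of each chunk of this task */
    size_t bytes_written[MAX_CHUNKS];  /* bytes stored in each chunk so far   */
    int    current;                    /* index of the chunk being filled     */
} chunk_log;

void chunk_log_init(chunk_log *log, size_t chunksize)
{
    log->chunksize = chunksize;
    log->current = 0;
    for (int i = 0; i < MAX_CHUNKS; i++)
        log->bytes_written[i] = 0;
}

/* Plays the role of sion_ensure_free_space in this sketch: if nbytes do not
   fit into the rest of the current chunk, switch to a new chunk.
   Returns 1 on success, 0 if the request cannot be satisfied. */
int chunk_log_ensure_free_space(chunk_log *log, size_t nbytes)
{
    if (nbytes > log->chunksize)
        return 0;                      /* can never fit into one chunk */
    size_t free_bytes = log->chunksize - log->bytes_written[log->current];
    if (nbytes > free_bytes) {
        if (log->current + 1 >= MAX_CHUNKS)
            return 0;                  /* sketch-only limit reached */
        log->current++;                /* purely local, no communication */
    }
    return 1;
}

/* Records a write of nbytes into the current chunk (stands in for fwrite). */
void chunk_log_record_write(chunk_log *log, size_t nbytes)
{
    assert(log->bytes_written[log->current] + nbytes <= log->chunksize);
    log->bytes_written[log->current] += nbytes;
}
```

In this sketch, two writes of 60 bytes with a chunksize of 100 end up in two different chunks; the per-chunk byte counts kept in memory correspond to the information that the notes above say is written to the file at close time.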


Parallel read:

sid = sion_paropen_mpi( ... , &chunksize, comm, &fileptr, ...); /* collective */

while (!sion_feof(sid)) {                                       /* non-collective */
      btoread = sion_bytes_avail_in_block(sid);                 /* non-collective */
      bread = fread(localbuffer, 1, btoread, fileptr);
}

sion_parclose_mpi(sid);                                         /* collective */

- the number of bytes read from a chunk must not be larger than the number of
bytes written to it. The function sion_bytes_avail_in_block returns this
number for the current chunk.

- if all bytes of a chunk have already been read and more chunks are available
for this task, sion_feof advances the file pointer to the start position of
the next chunk in the SION file

- sion_paropen_mpi is collective. The metadata is read by only one task and
broadcast to all other tasks. It is also possible to open and close the SION
file on each task with the serial SION functions, without collective
operations; see the function sion_open_rank.
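
The interplay between the end-of-file test and the per-chunk byte count can be illustrated with a small simulation. This is a sketch that does not call SIONlib: chunk_reader and the reader_* functions are hypothetical names, and the behaviour modelled here is only what the notes above describe (the end-of-file test steps to the next chunk once the current one is exhausted).

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical read-side state of one task; not SIONlib's internal structure. */
typedef struct {
    const size_t *bytes_in_chunk;  /* bytes that were written to each chunk */
    int    nchunks;                /* number of chunks of this task         */
    int    current;                /* chunk currently being read            */
    size_t pos;                    /* bytes already read from that chunk    */
} chunk_reader;

/* Plays the role of sion_bytes_avail_in_block in this sketch. */
size_t reader_bytes_avail(const chunk_reader *r)
{
    if (r->current >= r->nchunks)
        return 0;
    assert(r->pos <= r->bytes_in_chunk[r->current]);
    return r->bytes_in_chunk[r->current] - r->pos;
}

/* Plays the role of sion_feof: skips exhausted chunks and returns 1 only
   when no chunk with unread data is left. */
int reader_eof(chunk_reader *r)
{
    while (r->current < r->nchunks &&
           r->pos >= r->bytes_in_chunk[r->current]) {
        r->current++;
        r->pos = 0;
    }
    return r->current >= r->nchunks;
}

/* The read loop of the template; returns the total number of bytes consumed. */
size_t reader_drain(chunk_reader *r)
{
    size_t total = 0;
    while (!reader_eof(r)) {
        size_t n = reader_bytes_avail(r);
        r->pos += n;               /* stands in for fread(buf, 1, n, fileptr) */
        total += n;
    }
    return total;
}
```

For a task whose chunks hold 100, 0 and 40 bytes, reader_drain consumes 140 bytes in two loop iterations; the empty chunk is skipped by the end-of-file test and never reaches the read call.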


Parallel read without collective operation:

sid = sion_open_rank( ... , &chunksize, rank, &fileptr, ...);   /* non-collective */

while (!sion_feof(sid)) {                                       /* non-collective */
      btoread = sion_bytes_avail_in_block(sid);                 /* non-collective */
      bread = fread(localbuffer, 1, btoread, fileptr);
}

sion_close(sid);                                                /* non-collective */

- the meta-information of the SION file is read by each task. This means
parallel access to the same filesystem blocks; however, these are read-only
accesses, which should not lock the filesystem blocks.


Sample parallel MPI program for I/O with SIONlib

This simple parallel program can be found in the directory
sionlib/examples/simple
This directory also contains Makefiles to build the program.

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>
#include <mpi.h>
#include "sion.h"

#define FNAMELEN 255
#define BUFSIZE (1024*1024)

int main(int argc, char **argv)
{
  int rank, size, globalrank, sid, i, numFiles;
  char fname[FNAMELEN], *newfname=NULL;
  MPI_Comm gComm, lComm;
  sion_int64 chunksize, left;
  sion_int32 fsblksize;
  size_t btoread, bread, bwrote;
  char *localbuffer;
  FILE *fileptr;

  MPI_Init(&argc, &argv);
  MPI_Comm_size(MPI_COMM_WORLD, &size);
  MPI_Comm_rank(MPI_COMM_WORLD, &rank);

  /* allocate and initialize a buffer with random bytes */
  localbuffer = (char *) malloc(BUFSIZE);
  srand(time(NULL));
  for (i = 0; i < BUFSIZE; i++) localbuffer[i] = (char) (rand() % 256);

  /* initial parameters */
  strcpy(fname, "parfile.sion");
  numFiles   = 1;
  gComm=lComm= MPI_COMM_WORLD;
  chunksize  = 10*1024*1024;
  fsblksize  = 1*1024*1024;
  globalrank = rank;

  /* write */
  sid = sion_paropen_mpi(fname, "bw", &numFiles, gComm, &lComm,
                         &chunksize, &fsblksize, &globalrank, &fileptr, &newfname);
  left=BUFSIZE;
  while (left > 0) {
    sion_ensure_free_space(sid, left);
    /* after a short write, continue behind the bytes already written */
    bwrote = fwrite(localbuffer + (BUFSIZE - left), 1, left, fileptr);
    if (bwrote == 0) break;   /* write error: avoid an endless loop */
    left -= bwrote;
  }
  sion_parclose_mpi(sid);

  printf("Task %02d: wrote sionfile -> %s\n",rank,newfname);

  /* read */
  sid = sion_paropen_mpi(fname, "br", &numFiles, gComm, &lComm,
                         &chunksize, &fsblksize, &globalrank, &fileptr, &newfname);
  while (!sion_feof(sid)) {
    btoread = sion_bytes_avail_in_block(sid);
    bread = fread(localbuffer, 1, btoread, fileptr);
  }
  sion_parclose_mpi(sid);

  printf("Task %02d: read sionfile -> %s\n",rank,newfname);

  free(localbuffer);
  MPI_Finalize();

  return (0);
}


Serial program for SIONlib file access

Templates for accessing SION files from a serial program are shown below. The
templates can also be found in the distribution package in the directory
sionlib/examples


Serial write:

sid = sion_open( ..., chunksize, &fileptr);

rank_loop: {
    sion_seek(sid, rank, SION_CURRENT_BLK, SION_CURRENT_POS);

    sion_ensure_free_space(sid, nbytes);
    fwrite(..., fileptr);
}
sion_close(sid);


Serial read:

sid = sion_open( ..., chunksize, &fileptr);

sion_get_locations(sid, &size, &blocks, &globalskip, &start_of_varheader,
                   &sion_localsizes, &sion_globalranks,
                   &sion_chunkcount, &sion_chunksizes);

loop: {
   sion_seek(sid, rank, blknr, pos);

   fread(..., fileptr);
}
sion_close(sid);

- access to all ranks and chunks is possible (sion_seek)
- sion_get_locations returns pointers to internal fields, containing the number
of chunks written by each task (sion_chunkcount) and their sizes
(sion_chunksizes)
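
To make the idea of addressing data by rank, block and position more concrete, the following sketch computes a byte offset from such a triple. It does not call SIONlib, and the layout is an assumption made for this illustration only: each block is taken to hold one chunk per task in rank order, with consecutive blocks globalskip bytes apart. The function chunk_offset and the parameter data_start are hypothetical names; the actual file layout used by the library may differ.

```c
#include <assert.h>

/* Byte offset of position pos inside the chunk of task `rank` in block
   `blknr`, under the layout assumed for this sketch:
     data_start        offset of the first chunk of rank 0 (hypothetical)
     globalskip        distance in bytes between two consecutive blocks
     localsizes[t]     chunk capacity of task t                          */
long long chunk_offset(long long data_start, long long globalskip,
                       const long long *localsizes, int ntasks,
                       int rank, int blknr, long long pos)
{
    assert(rank >= 0 && rank < ntasks);
    assert(blknr >= 0);
    assert(pos >= 0 && pos <= localsizes[rank]);

    long long off = data_start + (long long) blknr * globalskip;
    for (int t = 0; t < rank; t++)
        off += localsizes[t];      /* skip the chunks of lower ranks */
    return off + pos;
}
```

With chunk capacities of 100, 200 and 50 bytes, a globalskip of 350 and a data start of 1000, position 10 of rank 2 in block 1 maps to offset 1000 + 350 + 300 + 10 = 1660 under these assumptions. In a real program the seek itself would be left to sion_seek.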


Other examples

The examples directory also contains, in the sub-directory
./examples/pepc
an application-dependent converter program for the parallel simulation
application PEPC (Multi-Purpose Parallel Tree-Code, PEPC). This serial
program converts data files between the native PEPC ASCII format and a binary
format using SION files, and it can serve as a starting point for building
converter programs for your own applications.

