Log-based VOL - an HDF5 VOL Plugin that stores HDF5 datasets in a log-based storage layout

This software repository contains source code implementing an HDF5 Virtual Object Layer (VOL) plugin that stores HDF5 datasets in a log-based storage layout. It allows applications to generate efficient log-based I/O requests using HDF5 APIs.

Software Requirements

  • HDF5 1.12.0
    • Parallel I/O support (--enable-parallel) is required
  • MPI C and C++ compilers
    • The plugin uses constant initializers, so a C++ compiler supporting C++11 is required
  • Autotools utility
    • autoconf 2.69
    • automake 1.16.1
    • libtoolize 2.4.6
    • m4 1.4.18

Building Steps

  • Build HDF5 with VOL and parallel I/O support
    • Download HDF5 source code
    • Run command ./autogen.sh
    • Configure HDF5 with parallel I/O enabled
    • Run make install
    • Example commands are given below. This example will install the HDF5 library under the folder ${HOME}/HDF5.
      % wget https://hdf-wordpress-1.s3.amazonaws.com/wp-content/uploads/manual/HDF5/HDF5_1_12_0/source/hdf5-1.12.0.tar.gz
      % tar -zxf hdf5-1.12.0.tar.gz 
      % cd hdf5-1.12.0
      % ./autogen.sh
      % ./configure --prefix=${HOME}/HDF5 --enable-parallel CC=mpicc
      % make -j4 install
      
  • Build this VOL plugin, the log-based VOL.
    • Clone this VOL plugin repository
    • Run command autoreconf -i
    • Configure log-based VOL
      • A shared library build (--enable-shared) is required to enable the log-based VOL through environment variables
      • Compile with the zlib library (--enable-zlib) to enable metadata compression
    • Run test programs (make check) to check whether the VOL is working properly
    • Example commands are given below.
      % git clone https://github.com/DataLib-ECP/vol-log-based.git
      % cd vol-log-based
      % autoreconf -i
      % ./configure --prefix=${HOME}/Log_IO_VOL --with-hdf5=${HOME}/HDF5 --enable-shared --enable-zlib
      % make -j 4
      % make check
      % make install
      
      The VOL plugin library is now installed under the folder ${HOME}/Log_IO_VOL.
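
Because the plugin requires an HDF5 build with parallel I/O, it can help to confirm that before configuring the plugin. The sketch below assumes that HDF5 installs its build summary as lib/libhdf5.settings under the install prefix and that the summary contains a "Parallel HDF5: yes" line; the function name hdf5_is_parallel is made up for this illustration.

```shell
# hdf5_is_parallel PREFIX: succeed if the HDF5 build summary under PREFIX
# reports parallel support. Returns 2 if the summary file is missing.
hdf5_is_parallel() {
    settings="$1/lib/libhdf5.settings"   # assumed location of the summary
    [ -f "$settings" ] || return 2
    grep -Eq '^[[:space:]]*Parallel HDF5:[[:space:]]*yes' "$settings"
}

# Check the prefix used in the build steps above.
if hdf5_is_parallel "${HOME}/HDF5"; then
    echo "HDF5 was built with parallel I/O support"
else
    echo "HDF5 is missing or lacks parallel I/O support" >&2
fi
```

If the check fails, reconfiguring HDF5 with --enable-parallel and reinstalling is the likely fix.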

Running example programs

  • Build the log-based VOL
  • Compile the example programs under the example directory
    • Run make <program name> to compile an example program
    • Running make tests will compile all example programs
  • Run the example programs
    • Run make check to run all example programs as test programs
  • Example commands are given below. This example compiles and runs the create_open program.
    % cd example
    % make create_open
    % ./create_open
    Writing file_name = test.h5 at rank 0 
    

Compile user programs that use this VOL plugin

  • Enable log-based VOL programmatically
    • Include header file.
      • Add the following line to your C/C++ source code.
        #include <H5VL_log.h>
        
      • Header file H5VL_log.h is located in folder ${HOME}/Log_IO_VOL/include
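      • Add -I${HOME}/Log_IO_VOL/include to your compile command line. The HDF5 include path is also needed when HDF5 is not installed in a default location. For example, to compile prog.c into an object file,
        % mpicc -c prog.c -o prog.o -I${HOME}/Log_IO_VOL/include -I${HOME}/HDF5/include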
        
    • Library file.
      • The library file, libH5VL_log.a, is located under folder ${HOME}/Log_IO_VOL/lib.
      • Add -L${HOME}/Log_IO_VOL/lib -lH5VL_log to your compile/link command line. For example,
        % mpicc prog.o -o prog -L${HOME}/Log_IO_VOL/lib -lH5VL_log \
                               -L${HOME}/HDF5/lib -lhdf5
        
    • Edit the source code to use log-based VOL when opening HDF5 files
      • Register VOL callback structure using H5VLregister_connector
      • Callback structure is named H5VL_log_g
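      • Set a file access property list to use the log-based VOL
      • For example,
        /* File access property list using MPI-IO */
        fapl_id = H5Pcreate(H5P_FILE_ACCESS);
        H5Pset_fapl_mpio(fapl_id, comm, info);
        /* Perform metadata reads and writes collectively */
        H5Pset_all_coll_metadata_ops(fapl_id, true);
        H5Pset_coll_metadata_write(fapl_id, true);
        /* Register the log-based VOL and select it for this property list */
        log_vol_id = H5VLregister_connector(&H5VL_log_g, H5P_DEFAULT);
        H5Pset_vol(fapl_id, log_vol_id, NULL);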
        
      • See a full example program in examples/create_open.c
  • Enable log-based VOL dynamically through environment variables
    • No changes to source code or compilation are required
    • Tell HDF5 to use the log-based VOL through environment variables
      • Append log-based VOL lib directory to shared object search path (LD_LIBRARY_PATH)
        % export LD_LIBRARY_PATH=${LD_LIBRARY_PATH:+${LD_LIBRARY_PATH}:}${HOME}/Log_IO_VOL/lib
        
      • Append log-based VOL lib directory to HDF5 VOL search path (HDF5_PLUGIN_PATH)
        % export HDF5_PLUGIN_PATH=${HDF5_PLUGIN_PATH:+${HDF5_PLUGIN_PATH}:}${HOME}/Log_IO_VOL/lib
        
      • Set log-based VOL as HDF5's default VOL (HDF5_VOL_CONNECTOR)
        % export HDF5_VOL_CONNECTOR="LOG under_vol=0;under_info={}"
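
The three variables can also be set for a single command instead of the whole shell session, which keeps other HDF5 programs on the default VOL. The following is a minimal sketch of that idea; the function name run_with_logvol and the LOGVOL_LIB variable are made up for this illustration, and the path assumes the install prefix used in the build steps.

```shell
# Assumed install location, matching the --prefix used in the build steps.
LOGVOL_LIB="${LOGVOL_LIB:-${HOME}/Log_IO_VOL/lib}"

# run_with_logvol CMD [ARGS...]: run CMD with the three variables set for
# that command only. Existing search paths are kept and the plugin
# directory is appended to them.
run_with_logvol() {
    env \
        LD_LIBRARY_PATH="${LD_LIBRARY_PATH:+${LD_LIBRARY_PATH}:}${LOGVOL_LIB}" \
        HDF5_PLUGIN_PATH="${HDF5_PLUGIN_PATH:+${HDF5_PLUGIN_PATH}:}${LOGVOL_LIB}" \
        HDF5_VOL_CONNECTOR="LOG under_vol=0;under_info={}" \
        "$@"
}

# Show the connector setting a child process would see.
run_with_logvol sh -c 'echo "$HDF5_VOL_CONNECTOR"'
```

A parallel run might then look like `run_with_logvol mpiexec -n 4 ./prog`, assuming the MPI launcher passes the environment on to every rank.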
        

Current limitations

  • Not compatible with parallel NetCDF4 applications
  • Does not support dataset reading.
    • Feature to read datasets in log-based storage layout is under development
  • Utility to repack dataset in log-based storage layout into conventional storage layout is under development

Project funding supports:

This research was supported by the Exascale Computing Project (17-SC-20-SC), a joint project of the U.S. Department of Energy’s Office of Science and National Nuclear Security Administration, responsible for delivering a capable exascale ecosystem, including software, applications, and hardware technology, to support the nation’s exascale computing imperative.

Contributors

  • khou2020
  • wkliao
