sandialabs / faodel

Flexible, Asynchronous, Object Data-Exchange Libraries: HPC libraries for moving data between applications

License: Other

faodel's Introduction

FAODEL Overview

FAODEL (Flexible, Asynchronous, Object Data-Exchange Libraries) is a collection of software libraries that are used to implement different data management services on high-performance computing (HPC) platforms. This project was funded through NNSA's ASC program at Sandia National Laboratories.

  • What Problem Does This Solve? HPC workflows often need a way to move large datasets between two or more MPI applications. Rather than route intermediate data through the filesystem, FAODEL lets you pass the data directly between the two MPI applications or indirectly through a separate distributed memory application. The filesystem can also be used if applications in the workflow do not run concurrently.

  • Who Is the Intended Audience? This software is intended for HPC developers who write parallel MPI applications in C++ and run workflows on cluster computers with hundreds to thousands of compute nodes. FAODEL requires an HPC network fabric such as InfiniBand, RoCE, OmniPath, or Gemini.

  • Pronunciation: We say "Fay-oh-Dell".

Note: FAODEL development takes place in a private repository due to Sandia's software release process. The GitHub repository is only updated when there are minor bug fixes or new release snapshots. This message will be updated if/when FAODEL is no longer being developed.

Components

FAODEL is composed of multiple libraries:

  • Kelpie: Kelpie is a distributed memory service that enables applications to migrate different data objects between compute nodes in a platform. It utilizes out-of-band RDMA communication to enable different MPI jobs to interact with each other.
  • DirMan: DirMan is a service for managing runtime information (e.g., a list of nodes that make up a pool for storing data).
  • OpBox: OpBox is a communication engine responsible for orchestrating complex communication patterns in a distributed system. Rather than use traditional remote-procedure call (RPC) techniques, communication is facilitated through state machines called Ops. Ops allow distributed protocols to run asynchronously, without explicit maintenance by user services.
  • Lunasa: Lunasa is a memory management unit for data that may be transmitted on the network using RDMAs. In order to reduce network registration overheads, Lunasa allocates sizable amounts of registered memory and then suballocates it to applications through tcmalloc. User allocations are described by Lunasa Data Objects (LDOs), which provide reference counting and object description in the stack.
  • NNTI: NNTI is a low-level RDMA portability layer for high-performance networks. It provides applications with the ability to send messages and coordinate RDMA transfers via registered memory.
  • Whookie: Whookie is a network service for FAODEL nodes that enables users and applications to query and change the state of a node via an HTTP connection.
  • Services: Basic services that make it easier to write communication applications.
  • Common: Common is a collection of data types and software functions that are used throughout FAODEL.
  • SBL: The Simplified Boost Logging (SBL) library provides a way to map log information in FAODEL components to Boost's logging library.
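
To make the OpBox description above more concrete, here is a minimal, standalone sketch of the general idea of driving a protocol with a state machine instead of a blocking RPC call. This is not OpBox's API: `PingOp`, `Event`, `State`, `Update`, and `Drive` are hypothetical names invented for illustration, and a real Op would post network operations rather than append strings to a log.

```cpp
#include <string>
#include <vector>

// Hypothetical illustration only: none of these names come from OpBox.
enum class Event { Start, ReplyArrived, Timeout };
enum class State { Idle, WaitingForReply, Done, Failed };

// A tiny "op": each call to Update() consumes one event and advances the
// state machine, then returns immediately instead of blocking the caller.
class PingOp {
public:
  State Update(Event ev) {
    switch (state_) {
      case State::Idle:
        if (ev == Event::Start) {
          log_.push_back("send request");
          state_ = State::WaitingForReply;
        }
        break;
      case State::WaitingForReply:
        if (ev == Event::ReplyArrived) {
          log_.push_back("handle reply");
          state_ = State::Done;
        } else if (ev == Event::Timeout) {
          log_.push_back("give up");
          state_ = State::Failed;
        }
        break;
      case State::Done:
      case State::Failed:
        break;  // Terminal states ignore further events.
    }
    return state_;
  }

  State state() const { return state_; }
  const std::vector<std::string>& log() const { return log_; }

private:
  State state_ = State::Idle;
  std::vector<std::string> log_;
};

// Feeds a sequence of events to an op, the way an event loop might as
// messages arrive, and returns the state the op ends up in.
State Drive(PingOp& op, const std::vector<Event>& events) {
  State s = op.state();
  for (Event ev : events) s = op.Update(ev);
  return s;
}
```

Because all progress happens inside `Update`, whoever delivers events (for example, a network progress thread) can advance many such ops without any of them holding a thread while waiting for a reply.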

There are two main command-line tools users may find useful:

  • faodel-cli: An all-in-one tool for hosting and configuring different faodel services such as dirman and kelpie. Users can obtain build and runtime info, query services, and put/get/delete objects. Commands can also be replayed through the play command.
  • faodel-stress: This standalone tool can be used to benchmark a system and estimate how well it performs different data management operations.

Additional Information

This release includes files to help guide users. The files are:

  • INSTALL: Details about how to configure, build, install and run the software provided in this release. This document is a good starting point, as the build process can be challenging on different platforms.
  • LICENSE: The FAODEL code uses the MIT license.
  • NEWS: The news file provides a history of major changes provided with each release of this software. Developers should review this document when switching to a new release.
  • Configuration File Cookbook: This page provides a summary of runtime configuration settings to plug into your $FAODEL_CONFIG file.
  • What's a Faodel? How we picked the acronym.

Contributors

The following developers contributed code to FAODEL:

  • Nathan Fabian
  • Todd Kordenbrock
  • Scott Levy
  • Shyamali Mukherjee
  • Gary Templet
  • Craig Ulmer
  • Patrick Widener

The following helped contribute ideas and provided feedback for the project:

  • Margaret Lawson
  • Jay Lofstead
  • Ron Oldfield
  • Jeremy Wilke

This release includes third-party software that contains its own licensing and copyright info:

  • cereal (in tpl/cereal)
  • gperftools (in tpl/gperftools)
  • Boost ASIO examples (in src/whookie/server)

Copyright

Copyright 2021 National Technology & Engineering Solutions of Sandia, LLC (NTESS). Under the terms of Contract DE-NA0003525 with NTESS, the U.S. Government retains certain rights in this software.

faodel's Issues

Cannot compile FAODEL with `Faodel_LOGGING_METHOD` set to `sbl`

When trying to compile FAODEL with Faodel_LOGGING_METHOD set to sbl, I get the following error message:

Scanning dependencies of target common
[ 27%] Building CXX object src/faodel-common/CMakeFiles/common.dir/LoggingInterface.cpp.o
/home/francois.budin/devel/adios/playground/faodel/src/faodel-common/LoggingInterface.cpp:14:10: fatal error: sbl/sbl_logger.hpp: No such file or directory
#include <sbl/sbl_logger.hpp>
^~~~~~~~~~~~~~~~~~~~
compilation terminated.

HIP/CUDA Support

Is there a plan to add interop with HIP/CUDA to faodel to directly send/receive data from/to GPUs?

Build failure w/ LLVM 10

Trying to build faodel 1.1906.1 with LLVM 10 on Ubuntu 18.04 for x86_64 fails with this error:

/tmp/root/spack-stage/spack-stage-faodel-1.1906.1-qtpc4h53tqs7pq5dpw4v5fncvqx5rkqn/spack-src/src/kelpie/pools/DHTPool/DHTPool.cpp:232:75: error: a lambda parameter cannot shadow an explicitly captured entity
                 [&key, &returned_ldo, &cv, &is_found] (bool success, Key key, lunasa::DataObject result_ldo,
...
src/kelpie/CMakeFiles/kelpie.dir/build.make:381: recipe for target 'src/kelpie/CMakeFiles/kelpie.dir/pools/DHTPool/DHTPool.cpp.o' failed
make[2]: *** [src/kelpie/CMakeFiles/kelpie.dir/pools/DHTPool/DHTPool.cpp.o] Error 1

My cmake args:

-DBUILD_SHARED_LIBS:BOOL=ON
-DBUILD_TESTS:BOOL=ON
-DBOOST_ROOT:PATH=/opt/spack/opt/spack/linux-ubuntu18.04-x86_64/clang-11.0.0/boost-1.72.0-3ztuhn7vlvh2f3zqeoalp2esfkr45xuj
-DGTEST_ROOT:PATH=/opt/spack/opt/spack/linux-ubuntu18.04-x86_64/clang-11.0.0/googletest-1.10.0-qenctivumu3p7y4nbhvhpdyudxf66nyp
-DBUILD_DOCS:BOOL=OFF
-DFaodel_ENABLE_IOM_HDF5:BOOL=OFF
-DFaodel_ENABLE_IOM_LEVELDB:BOOL=OFF
-DFaodel_ENABLE_MPI_SUPPORT:BOOL=ON
-DFaodel_ENABLE_TCMALLOC:BOOL=ON
-DFaodel_LOGGING_METHOD:STRING=stdout 
-DFaodel_NETWORK_LIBRARY:STRING=nnti
-DFaodel_ENABLE_CEREAL:BOOL=OFF

Full build log:
spack-build-out.txt
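
For anyone hitting the same diagnostic: the rule Clang enforces here is that a lambda parameter may not reuse the name of an explicitly captured variable. Below is a minimal, standalone sketch of that situation and one possible fix (renaming the parameter). It is not the DHTPool code or the project's actual patch; `FakeLookup`, `LookupMatches`, and `reported_key` are hypothetical names used only for illustration.

```cpp
#include <functional>
#include <string>

// Hypothetical stand-in for an asynchronous lookup that reports its result
// through a callback. It simply echoes the key back.
void FakeLookup(const std::string& key,
                const std::function<void(bool, std::string)>& cb) {
  cb(!key.empty(), key);
}

// Rejected by compilers that enforce the rule (the report above is from
// LLVM 10), because the parameter `key` shadows the captured `key`:
//
//   [&key, &found](bool success, std::string key) { ... }
//
// Accepted: the parameter is renamed so it no longer clashes with the capture.
bool LookupMatches(const std::string& key) {
  bool found = false;
  FakeLookup(key, [&key, &found](bool success, std::string reported_key) {
    found = success && (reported_key == key);
  });
  return found;
}
```

Dropping the capture is another option when the lambda body only needs the parameter. Which of the two suits the original code depends on whether the captured `key` is actually used inside the lambda.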

Boost Requirement Update

On the install page, the required Boost version is listed as 1.60 or higher, but 1.67 or higher is actually required.

Cannot compile FAODEL when disabling MPI

When trying to compile FAODEL with Faodel_ENABLE_MPI_SUPPORT set to OFF, I get the following error message:

[ 80%] Building CXX object src/kelpie/CMakeFiles/kelpie.dir/services/PoolServerDriver.cpp.o
/home/francois.budin/devel/adios/playground/faodel/src/kelpie/services/PoolServerDriver.cpp:11:10: fatal error: mpi.h: No such file or directory
#include <mpi.h>
^~~~~~~
compilation terminated.
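
A common way to keep a source file buildable with and without MPI is to guard both the include and the MPI calls behind a build-time macro. The sketch below shows that general pattern only. It is not taken from PoolServerDriver.cpp, and `SKETCH_USE_MPI`, `SketchRank`, and `SketchBackendName` are hypothetical names; the macro stands in for whatever symbol the build system defines when MPI support is switched on.

```cpp
#include <string>

// SKETCH_USE_MPI is a hypothetical macro, assumed to be defined by the build
// system only when MPI support is enabled.
#if defined(SKETCH_USE_MPI)
#include <mpi.h>
#endif

// Returns this process's rank, or 0 when built without MPI or when MPI has
// not been initialized yet.
int SketchRank() {
#if defined(SKETCH_USE_MPI)
  int initialized = 0;
  MPI_Initialized(&initialized);
  if (!initialized) return 0;
  int rank = 0;
  MPI_Comm_rank(MPI_COMM_WORLD, &rank);
  return rank;
#else
  return 0;
#endif
}

// Reports which code path was compiled in.
std::string SketchBackendName() {
#if defined(SKETCH_USE_MPI)
  return "mpi";
#else
  return "serial";
#endif
}
```

With the macro undefined, the file never references `mpi.h`, so it compiles on systems without an MPI installation.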
