mytrygithub / datasketches-cpp

This project forked from apache/datasketches-cpp

Core C++ Sketch Library

Home Page: https://datasketches.apache.org

License: Apache License 2.0

C++ 94.41% C 0.23% CMake 1.49% Python 1.71% Jupyter Notebook 2.16% Batchfile 0.01%

datasketches-cpp's Introduction

Apache DataSketches Core C++ Library Component

This is the core C++ component of the Apache DataSketches library. It contains all of the key sketching algorithms that are in the Java component and can be accessed directly from user applications.

This component is also a dependency of other components of the library that create adaptors for target systems, such as PostgreSQL.

Note that we have a parallel core component for Java implementations of the same sketch algorithms, datasketches-java.

Please visit the main Apache DataSketches website for more information.

If you are interested in contributing to this project, please see our Community page for how to contact us.


This code requires C++11.

This includes Python bindings. For the Python interface, see the README notes in the python subdirectory.

This library is header-only. The build process provided is only for building unit tests and the python library.

Building the unit tests requires cmake 3.12.0 or higher.

Installing the latest cmake on OSX: brew install cmake

Building and running unit tests using cmake for OSX and Linux:

    $ cmake -S . -B build/Release -DCMAKE_BUILD_TYPE=Release
    $ cmake --build build/Release -t all test

Building and running unit tests using cmake for Windows from the command line:

    $ cd build
    $ cmake ..
    $ cd ..
    $ cmake --build build --config Release
    $ cmake --build build --config Release --target RUN_TESTS

To install a local distribution (OSX and Linux), use the following commands. The CMAKE_INSTALL_PREFIX variable controls the destination. If it is not specified, CMake's default prefix is used, which on Unix-like systems is /usr/local (/usr/local/include, /usr/local/lib, etc.). With the commands below, the installation will be in /tmp/install/DataSketches (/tmp/install/DataSketches/include, /tmp/install/DataSketches/lib, etc.).

    $ cmake -S . -B build/Release -DCMAKE_BUILD_TYPE=Release -DCMAKE_INSTALL_PREFIX=/tmp/install/DataSketches
    $ cmake --build build/Release -t install

To generate an installable package using CMake's built-in CPack packaging tool, use the following commands. The type of packaging is controlled by the CPACK_GENERATOR variable (a semicolon-separated list). CMake usually supports packaging types such as RPM, DEB, STGZ, TGZ, TZ, ZIP, etc.

    $ cmake -S . -B build/Release -DCMAKE_BUILD_TYPE=Release -DCPACK_GENERATOR="RPM;STGZ;TGZ"
    $ cmake --build build/Release -t package

The DataSketches project can be included in other projects' CMakeLists.txt files in one of two ways. If DataSketches has been installed on the host (using an RPM, DEB, "make install" into /usr/local, or some other way), then CMake's find_package command can be used like this:

    find_package(DataSketches 3.2 REQUIRED)
    target_link_libraries(my_dependent_target PUBLIC ${DATASKETCHES_LIB})

When used with find_package, DataSketches exports several variables, including

  • DATASKETCHES_VERSION: The version number of the datasketches package that was imported.
  • DATASKETCHES_INCLUDE_DIR: The directory that should be added to access DataSketches include files. Because CMake automatically adds the interface include directories of linked targets when target_link_libraries is used, under normal circumstances there is no need to add this directory explicitly.
  • DATASKETCHES_LIB: The name of the DataSketches target to include as a dependency. Projects pulling in DataSketches should reference this with target_link_libraries in order to set up all the correct dependencies and include paths.
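
Putting these pieces together, a consuming project's CMakeLists.txt might look like the minimal sketch below. The project name, the target name and main.cpp are placeholders chosen for illustration; only find_package and the exported variables come from the description above.

```cmake
cmake_minimum_required(VERSION 3.12)
project(my_dependent_project LANGUAGES CXX)

# Locate an installed DataSketches package; configuration fails if it is missing.
find_package(DataSketches 3.2 REQUIRED)
message(STATUS "Found DataSketches ${DATASKETCHES_VERSION}")

# Hypothetical executable; main.cpp is a placeholder source file.
add_executable(my_dependent_target main.cpp)

# The library requires C++11.
target_compile_features(my_dependent_target PRIVATE cxx_std_11)

# Linking against the exported target also sets up its include directories.
target_link_libraries(my_dependent_target PUBLIC ${DATASKETCHES_LIB})
```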

If you don't have DataSketches installed locally, dependent projects can pull it directly from GitHub using CMake's ExternalProject module. The code would look something like this:

    cmake_policy(SET CMP0097 NEW)
    include(ExternalProject)
    ExternalProject_Add(datasketches
        GIT_REPOSITORY https://github.com/apache/datasketches-cpp.git
        GIT_TAG 3.2.0
        GIT_SHALLOW true
        GIT_SUBMODULES ""
        INSTALL_DIR /tmp/datasketches-prefix
        CMAKE_ARGS -DBUILD_TESTS=OFF -DCMAKE_BUILD_TYPE=${CMAKE_BUILD_TYPE} -DCMAKE_INSTALL_PREFIX=/tmp/datasketches-prefix

        # Override the install command to add DESTDIR
        # This is necessary to work around an oddity in the RPM (but not other) package
        # generation, as CMake otherwise picks up the DataSketches files when building
        # an RPM for a dependent package. (RPM scans the directory for files in addition to installing
        # those files referenced in an "install" rule in the cmake file)
        INSTALL_COMMAND env DESTDIR= ${CMAKE_COMMAND} --build . --target install
    )
    ExternalProject_Get_Property(datasketches INSTALL_DIR)
    set(datasketches_INSTALL_DIR ${INSTALL_DIR})
    message("Install dir of datasketches = ${datasketches_INSTALL_DIR}")
    target_include_directories(my_dependent_target
                                PRIVATE ${datasketches_INSTALL_DIR}/include/DataSketches)
    add_dependencies(my_dependent_target datasketches)
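
The fragment above assumes that my_dependent_target has already been defined, since target_include_directories and add_dependencies both operate on an existing target. A minimal sketch of the surrounding file could look like this; the project name and main.cpp are placeholders, and the minimum version of 3.14 is chosen because policy CMP0097 was introduced in that release.

```cmake
cmake_minimum_required(VERSION 3.14)  # CMP0097 is unknown to older CMake
project(my_dependent_project LANGUAGES CXX)

# Hypothetical consumer target; it must exist before the calls that reference it.
add_executable(my_dependent_target main.cpp)
target_compile_features(my_dependent_target PRIVATE cxx_std_11)

# The cmake_policy(), include(ExternalProject) and ExternalProject_Add() block
# shown above goes here, followed by target_include_directories() and
# add_dependencies() for my_dependent_target.
```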
