Giter Site home page Giter Site logo

omnisketch's Introduction

OmniSketch

OmniSketch is a C++ framework for designing, simulating and testing sketch, a telemetry technique that aims at collecting streaming statistics. It is designed to be efficient, consistent and easy to use. Currently it is supported on Windows, Linux and MacOS.

Feature Overview

  • A wide variety of sketches are implemented with minimal code. Some complicated sketches, such as ElasticSketch, are also supported. For a full list of the sketches available, check this table.
  • An auto-testing framework that tests sketch performance and collects interested metrics with minimal configurations.
  • A generally applicable Counter Hierarchy framework that enables a tremendous amount of space saving via hierarchical counter braiding, applicable to all value-based counters.
  • Pcap Parser parses .pcap and .pcapng files to generate streaming data that interoperates with OmniSketch. Data format of a streaming data record is customizable.
  • Flexible definitions of a flowkey are available. Currently, 1-tuple, 2-tuple and 5-tuple are supported.
  • No redundant code has to be rewritten. Designer can focus mainly on the sketch algorithm, and let the framework do the rest, e.g. data parsing and testing details.
  • Since sketching algorithms vary a lot, the framework includes only the most common methods. Do not panic! You can always write inject your own code and tell the framework to run it.

Download & Build

Download from GitHub release page:

git clone --recurse-submodules https://github.com/N2-Sys/OmniSketch.git
cd OmniSketch
mkdir build; cd build

The project is built with CMake. If an additional module of PcapParser, which is capable of parsing .pcap/.pcapng files in quite a versatile manner and is interoperable with OmniSketch, is requested, please define the BUILD_PCAP_PARSER macro by

cmake .. -DBUILD_PCAP_PARSER=True

Upon defining this macro, CMake automatically checks for dependencies on PcapPlusPlus and pcap libraries, so MAKE SURE you have installed these libraries in advance. The default included directory of the PcapPlusPlus header files is /usr/local/include/pcapplusplus (which is certainly not true on Windows). To change this default searching path, provide to CMake a new argument:

cmake .. -DBUILD_PCAP_PARSER=True -DPCPP_INCLUDE_PATH=[path on your system]
# e.g. cmake .. -DBUILD_PCAP_PARSER=True -DPCPP_INCLUDE_PATH=/usr/local/include/pcapplusplus

In the simplest case, if PcapParser is not requested, the PcapPlusPlus and pcap libraries are never needed, so you can just build with

cmake ..

Gee, it saves you a lot of work!

To verify that you indeed build a runnable copy of OmniSketch, in the same directory (build/) run

ctest

If you see the line

100% tests passed, 0 tests failed out of 7

you can proceed to poke around OmniSketch and design your new sketches.

Design New Sketches

Here is an overview of how to design your own sketch in OmniSketch. For a detailed description, please check the docs.

  1. Sketch algorithm should be in src/sketch/. Suppose you add a file XXX.h to this directory.
  2. Sketch testing procedure is defined in src/sketch_test/. You should name your testing file as XXXTest.h accordingly.
  3. Add a new sketch target in CMakeLists.txt. This is done in a single line add_user_sketch(YYY XXX).
  4. Write down the sketch config in a toml file. The default config file is src/sketch_config.toml. All the sketch configs are user-defined, but typically should contain (though not required)
  • Streaming data file
  • Format of the streaming data record
  • Metrics measured during testing
  • Sketch parameters
  • Other user-defined configs

It is the user who controls what, where and how to parse in the config file. Omnisketch imposes no restrictions on how you organize the toml file and what you put in there. Fortunately, OmniSketch provides a rich set of tools to help you do it in just a couple of lines.

  1. Goto the building directory build/, cmake and make it. (CMake is needed since a new target has just been added) From the time on, if the code is modified, only make is needed.
  2. At this point, the sketch is compiled and linked. It is callable from the terminal with ./YYY -c config. If no -c option is provided, src/sketch_config.toml is assumed to be the default config file. A possible output of Count Min Sketch runs as follows:
terminal> ./CM -c ../src/sketch_config.toml
   INFO| Loading config from ../src/sketch_config.toml... @utils.cpp:62
VERBOSE| Config loaded. @utils.cpp:76
VERBOSE| Preparing test data... @data.h:718
   INFO| Loading records from ../data/records.bin... @data.h:720
VERBOSE| Records Loaded. @data.h:746
DataSet: 1090120 records with 99999 keys (../data/records.bin)
============     Count Min      ============
  Mem Footprint: 1.52646 MB
    Update Rate: 1853.95 Mpac/s
     Query Rate: 1694.9 Mpac/s
      Query ARE: 0.161831
      Query AAE: 0.252943
============================================
  1. From now on, every time your sketch is about to run on different data and formats, or to collect some new statistics, all you have to do is simply modifying the config file. If the template header should be changed, you have to make a new driver.

API Docs

Please follow this link.

Tables of Sketches

These are the sketches already implemented and tested.

Status: have, checked, checked but with doubt, tested.

Status
cm sketch t
count sketch t
cu sketch t
bloom filter t
counting bloom filter t
LD-sketch t
MV-sketch t
hashpipe t
FM-sketch(PCSA) t
Linear Counting
Kmin(KMV)
Deltoid t
flow radar t
sketch learn
elastic sketch t
univmon
nitro sketch t
reversible sketch
Mrac t
k-ary sketch t
seqHash
TwoLevel t
multi-resolution bitmap
lossy count t
space saving t
HyperLogLog t
Misra-Gries t
Fast Sketch t
CounterBraids t
HeavyKeeper d

omnisketch's People

Contributors

dromniscience avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.