
cs265-lsm-tree's Introduction

Harvard CS265 - Big Data Systems


This repository contains the code of a workload generator for an LSM tree. It follows the DSL specified for the systems project of CS265 (Spring 2017).

More information can be found here.

Workload and Data Generator


Dependencies

The generator requires the GNU Scientific Library (GSL): https://www.gnu.org/software/gsl/

  • Ubuntu Linux: sudo apt-get install libgsl-dev
  • Fedora Linux: sudo dnf install gsl-devel
  • Mac OS X: brew install gsl
  • Cygwin: install the gsl, gsl-devel packages

Building

cd generator;
make clean; make;

or simply...

cc generator.c -o generator -lgsl -lgslcblas

Running

You can now run the following to see all available options:

./generator --help


Examples

Query 1: Insert 100000 keys, then perform 1000 gets, 10 range queries, and 20 deletes. Approximately 30% of the gets should miss (--gets-misses-ratio), and 20% of the gets should be repeated (--gets-skewness).

./generator --puts 100000 --gets 1000 --ranges 10 --deletes 20 --gets-misses-ratio 0.3 --gets-skewness 0.2 > workload.txt
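As a rough sanity check on the flags above, the expected number of missed and repeated gets can be worked out directly from the two ratios. The sketch below is plain arithmetic; the function name `expected_get_counts` is hypothetical and not part of the generator. Because the generator is randomized, the counts in an actual workload should only be close to these values, not exactly equal.

```python
def expected_get_counts(gets, misses_ratio, skewness):
    """Return the approximate number of missed and repeated gets.

    Hypothetical helper for illustration: it mirrors the README's description
    of --gets-misses-ratio and --gets-skewness, not the generator's own code.
    """
    expected_misses = round(gets * misses_ratio)
    expected_repeats = round(gets * skewness)
    return expected_misses, expected_repeats


misses, repeats = expected_get_counts(gets=1000, misses_ratio=0.3, skewness=0.2)
print(f"about {misses} misses and {repeats} repeated gets out of 1000")
```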

Query 2: Same as above, but store the data in external binary (.dat) files (--external-puts).

./generator --puts 100000 --gets 1000 --ranges 10 --deletes 20 --gets-misses-ratio 0.3 --gets-skewness 0.2 --external-puts > workload.txt

Query 3: Perform 100000 puts and issue 100 range queries (drawn from a Gaussian distribution).

./generator --puts 100000 --ranges 100 --gaussian-ranges > workload.txt

Query 4: Perform 100000 puts and issue 100 range queries (drawn from a uniform distribution).

./generator --puts 100000 --ranges 100 --uniform-ranges > workload.txt

Note: You can always set the random number generator seed using --seed XXXX
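For scripted, repeatable experiments it can help to assemble the command line programmatically and always pass a fixed seed. The following is a minimal sketch, assuming the generator binary has been built at `./generator`. The helper name `build_generator_command` is hypothetical, and the example call uses only flags that appear in this README.

```python
import shlex


def build_generator_command(seed, binary="./generator", **counts):
    """Build an argument list such as ['./generator', '--puts', '100000', ...].

    Hypothetical helper: keyword names become flags by replacing underscores
    with dashes (puts -> --puts, gets_misses_ratio -> --gets-misses-ratio).
    """
    command = [binary]
    for name, value in counts.items():
        command += ["--" + name.replace("_", "-"), str(value)]
    # Always append the seed so the same call yields the same command line.
    command += ["--seed", str(seed)]
    return command


command = build_generator_command(seed=42, puts=100000, gets=1000, gets_misses_ratio=0.3)
print(shlex.join(command))
```

The resulting list can be passed to `subprocess.run` with `stdout` pointed at an open file such as workload.txt.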

Evaluating a Workload


You can execute a workload and see some basic statistics about it using the evaluate.py Python script.

Dependencies

You need to install the blist library.

Most platforms: pip install blist

Note: On Fedora Linux, you might need to install it with dnf instead: dnf install python-blist

Running

Run as follows:

python evaluate.py workload.txt
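Independently of evaluate.py, a quick tally of a generated file can confirm that the requested operation counts came out as expected. The sketch below is not taken from evaluate.py. It assumes only that the workload has one operation per line, with the operation code as the first whitespace-separated token. The exact format is defined by the CS265 DSL, so the sample lines here are made up for illustration, and `count_operations` is a hypothetical name.

```python
from collections import Counter


def count_operations(lines):
    """Count workload lines by their first whitespace-separated token.

    Assumption: one operation per line, operation code first.
    Blank lines are skipped.
    """
    counts = Counter()
    for line in lines:
        tokens = line.split()
        if tokens:
            counts[tokens[0]] += 1
    return counts


# Made-up sample lines; the real operation codes come from the CS265 DSL.
sample = ["p 1 10", "p 2 20", "g 1", "", "d 2"]
print(count_operations(sample))
```

For a real file, an open file object can be passed directly, for example `count_operations(open("workload.txt"))`.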

Note: For additional options, see the comments and argument handling inside the script.

cs265-lsm-tree's People

Contributors

zoumpatianos
