byoo

This project should be regarded as "alpha" quality. Documentation is sparse.

This is byoo, the bring-your-own-optimizer execution engine. byoo (pronounced "bio") is a simple, entirely modular, high-performance relational database execution engine designed for optimizer research.

byoo uses a pipelined, concurrent "push" data model (as opposed to a "pull"/Volcano model), compresses intermediate results, supports row-store and columnar operations, and can directly read CSV files as first-class citizens. Written in Rust, byoo is free from many of the sources of undefined behavior and memory issues that plague C/C++ codebases, which (1) makes it easier to work with and (2) should give you more confidence in the experimental results it produces.
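To make the push/pull distinction concrete, here is a minimal single-threaded sketch in Python. It is not byoo's implementation: the class and function names (`Collect`, `Filter`, `scan`, `pull_filter`) are hypothetical, and the sketch ignores concurrency and compression entirely. It only shows who drives the data flow in each model.

```python
class Collect:
    """Sink operator: stores every row pushed into it."""

    def __init__(self):
        self.rows = []
        self.done = False

    def push(self, row):
        self.rows.append(row)

    def finish(self):
        self.done = True


class Filter:
    """Forwards rows that satisfy `predicate` to the downstream operator."""

    def __init__(self, predicate, downstream):
        self.predicate = predicate
        self.downstream = downstream

    def push(self, row):
        if self.predicate(row):
            self.downstream.push(row)

    def finish(self):
        self.downstream.finish()


def scan(rows, downstream):
    """Push model: the source drives the pipeline by pushing rows downstream."""
    for row in rows:
        downstream.push(row)
    downstream.finish()


def pull_filter(predicate, rows):
    """Pull (Volcano-style) model: the consumer asks for the next row."""
    for row in rows:
        if predicate(row):
            yield row


# Hypothetical two-column rows, used only for this illustration.
data = [(1, 10), (2, 25), (3, 7), (4, 40)]

# Push: the scan feeds rows into Filter, which feeds the sink.
sink = Collect()
scan(data, Filter(lambda r: r[1] > 9, sink))
pushed = sink.rows

# Pull: the consumer (list) repeatedly requests rows from the filter.
pulled = list(pull_filter(lambda r: r[1] > 9, iter(data)))

print(pushed == pulled)
```

Both variants produce the same rows; the difference is that in the push version the producer calls into its consumer, while in the pull version the consumer calls into its producer.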

Goals:

  • Initially, enough complexity and operators to support the Join Order Benchmark (JOB) and TPC-H.
  • Multithreaded performance that is faster than, or at least comparable to, PostgreSQL.
  • Serverless and embeddable in any process, like SQLite.
  • A "do-what-I-say" engine that obeys user-specified join orderings, join operators, aggregation operators, index selection, etc.
  • Precisely controllable memory usage and allocation.
  • Easy & convenient interface for researchers.

Non-goals:

  • Transactions, high-performance inserts. byoo is designed for OLAP workloads.
  • A full-SQL interface. Such an interface would certainly require an optimizer. byoo could be used as an execution engine in such a database, but does not seek to be one itself.
  • Multi-user support. byoo supports only a single writer at a time.

Execution plans

You can run execution plans with byoo using the binary built by Cargo (e.g. cargo build --release). You can then use byoo to execute a plan like so:

target/release/byoo my_plan.json # or cargo run --release -- my_plan.json

The referenced JSON file should be a tree of operators. Currently, the format is quite verbose, but easy to generate programmatically. For example, to compute some aggregates using a hash group by operator:

{"op": "project",
 "options": { "cols": [0, 3, 4, 5] }, 
 "input": [
     {
         "op": "hashed group by",
         "options": {
             "col": 0,
             "aggregates": [
                 {"op": "min", "col": 1},
                 {"op": "max", "col": 1},
                 {"op": "count", "col": 1}
             ]
         },
         "input": [
             { "op": "csv read",
               "options": {
                   "file": "res/inputs/agg_test.csv",
                   "types": ["INTEGER", "INTEGER", "REAL"]
               }
             }]
     }
 ]
}
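
Because a plan is plain JSON, it can be produced with a few lines of scripting. The following is a minimal sketch in Python that rebuilds the plan above, assuming only the operator names and option keys that appear in that example. The helper `op` is hypothetical and not part of byoo.

```python
import json


def op(name, options=None, inputs=None):
    """Build one plan node; `options` and `input` are omitted when empty."""
    node = {"op": name}
    if options:
        node["options"] = options
    if inputs:
        node["input"] = inputs
    return node


# Leaf: read the CSV file with the column types from the example above.
csv_read = op("csv read", {
    "file": "res/inputs/agg_test.csv",
    "types": ["INTEGER", "INTEGER", "REAL"],
})

# Group on column 0 and compute min, max, and count over column 1.
group_by = op("hashed group by", {
    "col": 0,
    "aggregates": [{"op": name, "col": 1} for name in ("min", "max", "count")],
}, [csv_read])

# Root: project the same columns as the hand-written plan.
plan = op("project", {"cols": [0, 3, 4, 5]}, [group_by])

# Serialize the operator tree to JSON text.
plan_json = json.dumps(plan, indent=1)
print(plan_json)
```

Writing `plan_json` to a file such as my_plan.json yields a plan that can be passed to the byoo binary as shown earlier.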

More examples of plans can be found in tests/. You can find a listing of all currently supported operators in src/compile/mod.rs.

Tests and benchmarks

To run the tests, clone the repository and execute make in the res folder (this extracts and builds the input files needed for testing). Then, testing is as simple as cargo test. For speed reasons, some tests only run in release mode, so run cargo test --release to get a few extras.

There are benchmarks as well. Currently, the Cargo benchmarking tools only work with nightly Rust (although byoo itself works perfectly fine with stable Rust). To run them, execute cargo bench.

Contributors

ryanmarcus
