This project forked from tensorbase/tensorbase

TensorBase is building a modern big data warehouse with performance at its core.

Home Page: https://tensorbase.io/

License: Apache License 2.0

Rust 64.20% CMake 0.36% C 4.94% Shell 0.04% TSQL 19.01% PLSQL 0.48% C++ 10.97%

tensorbase's Introduction

What is TensorBase

TensorBase is a modern engineering effort to build a high-performance, cost-effective big data warehouse in an open source culture.

Status

TensorBase is in its initial stage (milestone 0) and under heavy development.

TensorBase is designed for performance at the architectural level. In milestone 0, it has been demonstrated to query ~1.5 billion rows of the NYC taxi dataset in ~100 milliseconds of total response time. The raw speed of core data scanning in the kernel saturates the memory bandwidth (for example, ~120GB/s on a six-channel single socket). Column-oriented, vectorized, SIMD: it has them all, with more big bangs to come.
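To put these figures in context, here is a rough back-of-the-envelope sketch in Rust. It is not part of TensorBase. It assumes DDR4-2666 memory with an 8-byte bus per channel and a 4-byte integer column; none of these details are stated above, so treat the results as illustrative only. Under those assumptions the theoretical peak works out to about 128 GB/s, which is in line with the ~120GB/s quoted, and scanning one column of 1.5 billion rows in 100 ms needs about 60 GB/s.

```rust
/// Theoretical peak memory bandwidth in GB/s.
/// `mega_transfers` is the transfer rate in MT/s (assumed: 2666 for DDR4-2666).
/// `bus_bytes` is the width of one channel in bytes (assumed: 8).
fn peak_bandwidth_gbs(channels: u32, mega_transfers: u32, bus_bytes: u32) -> f64 {
    channels as f64 * mega_transfers as f64 * 1e6 * bus_bytes as f64 / 1e9
}

/// Scan rate in GB/s needed to read `rows` values of `bytes_per_value` bytes
/// within `millis` milliseconds.
fn scan_rate_gbs(rows: u64, bytes_per_value: u64, millis: u64) -> f64 {
    (rows * bytes_per_value) as f64 / (millis as f64 / 1000.0) / 1e9
}

fn main() {
    // Six channels, as in the example above; the memory type is an assumption.
    let peak = peak_bandwidth_gbs(6, 2666, 8);
    // ~1.5 billion rows in ~100 ms; the 4-byte column width is an assumption.
    let scan = scan_rate_gbs(1_500_000_000, 4, 100);
    println!("theoretical peak: {:.1} GB/s", peak);
    println!("required scan rate: {:.1} GB/s", scan);
}
```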

TensorBase is written from scratch in Rust (system) and C (runtime kernel). Here, you use the most familiar tools to tackle the most difficult problems. Comfortable languages, minimal dependencies, and a from-scratch architecture make it a highly hackable system.

Read the launch post to learn more about TensorBase's "Who? Where from? Where to?"

Please give TensorBase a star to help it grow.

Try TensorBase

TensorBase is developed for Linux, but should work on any Docker-enabled system (for example, Windows 10 with WSL2).

  • from source

TensorBase follows the idiomatic Rust development flow. Make sure your Rust nightly toolchain works. If you only want to run it, go straight to Quick Start. Thanks to the strong Rust ecosystem, you do not need to run a separate build step first.

  • docker

This mode is portable but uses more host resources, and its performance is platform dependent.

TBD.

Quick Start

TensorBase currently provides two binaries that enable the following workflow:

  • baseops: CLI/workbench for devops, including starting and stopping the various processes/roles

  • baseshell: query client (currently a monolith that includes everything). In m0 it intentionally supports only queries with a sum aggregation over a single integer column.
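For readers unfamiliar with this kind of query, the following is a minimal Rust sketch of a sum aggregation over a single integer column stored contiguously. It is not TensorBase's kernel, which is described above as written in C. The function name `sum_column`, the 32-bit element type, and the lane count are all assumptions made for illustration. Independent accumulators are used so that the compiler has a chance to auto-vectorize the inner loop.

```rust
/// Sum a contiguous column of 32-bit integers into a 64-bit result.
/// Hypothetical helper for illustration; not a TensorBase API.
fn sum_column(column: &[i32]) -> i64 {
    // Number of independent accumulators (an arbitrary choice for this sketch).
    const LANES: usize = 8;
    let mut acc = [0i64; LANES];

    // Process the column in fixed-size chunks, one accumulator per lane.
    let mut chunks = column.chunks_exact(LANES);
    for chunk in &mut chunks {
        for (a, &v) in acc.iter_mut().zip(chunk) {
            *a += v as i64;
        }
    }

    // Add the elements that did not fill a whole chunk.
    let tail: i64 = chunks.remainder().iter().map(|&v| v as i64).sum();
    acc.iter().sum::<i64>() + tail
}

fn main() {
    let column: Vec<i32> = (1..=1000).collect();
    println!("sum = {}", sum_column(&column));
}
```

Widening to `i64` before adding avoids overflow when many 32-bit values are summed.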

  1. Run baseops to create a table definition in Base
cargo run --bin baseops table create -c samples/nyc_taxi_create_table_sample.sql

Base explicitly separates write/mutation operations into the baseops CLI. The provided SQL file is just an ANSI SQL DDL script, which can be found in the samples directory of the repo.

  2. Run baseops to import the nyc_taxi CSV dataset into Base
cargo run --release --bin baseops import csv -c /jian/nyc-taxi.csv -i nyc_taxi:trip_id,pickup_datetime,passenger_count:0,2,10:51

The Base import tool uniquely supports importing only part of a CSV file into storage, as in the command above. Use help for more information.

  3. Run baseshell to issue a query against Base
cargo run --release --bin baseshell

Dev Docs explains in a little more detail why the commands above work.
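The `-i` argument of the import command in step 2 packs several pieces of information into one string. The Rust sketch below shows one plausible way to read it. The interpretation is an assumption, not documented behavior: the four colon-separated parts are taken to be the table name, the target column names, the zero-based CSV field positions, and the total number of fields per CSV row. `ImportSpec`, `parse_spec`, and `select_fields` are hypothetical names and are not part of baseops.

```rust
/// Assumed shape of the `-i` argument: table:columns:indices:field_count.
#[derive(Debug, PartialEq)]
struct ImportSpec {
    table: String,
    columns: Vec<String>,
    indices: Vec<usize>,
    field_count: usize,
}

/// Parse a spec such as "nyc_taxi:trip_id,pickup_datetime,passenger_count:0,2,10:51".
fn parse_spec(spec: &str) -> Result<ImportSpec, String> {
    let parts: Vec<&str> = spec.split(':').collect();
    if parts.len() != 4 {
        return Err(format!("expected 4 ':'-separated parts, got {}", parts.len()));
    }
    let columns: Vec<String> = parts[1].split(',').map(str::to_string).collect();
    let indices = parts[2]
        .split(',')
        .map(|s| s.parse::<usize>().map_err(|e| e.to_string()))
        .collect::<Result<Vec<_>, _>>()?;
    let field_count = parts[3].parse::<usize>().map_err(|e| e.to_string())?;
    if columns.len() != indices.len() {
        return Err("column names and indices differ in length".to_string());
    }
    if indices.iter().any(|&i| i >= field_count) {
        return Err("field index out of range".to_string());
    }
    Ok(ImportSpec { table: parts[0].to_string(), columns, indices, field_count })
}

/// Pick the selected fields from one CSV line.
/// Naive comma split: quoted fields containing commas are not handled.
fn select_fields<'a>(line: &'a str, spec: &ImportSpec) -> Option<Vec<&'a str>> {
    let fields: Vec<&str> = line.split(',').collect();
    if fields.len() != spec.field_count {
        return None;
    }
    Some(spec.indices.iter().map(|&i| fields[i]).collect())
}

fn main() {
    // A small made-up spec and row, used only to exercise the sketch.
    let spec = parse_spec("t:a,b:0,2:3").expect("valid spec");
    println!("{:?}", select_fields("1,x,3", &spec));
}
```

Reading only the listed field positions is what makes a partial import possible: the remaining fields of each row are skipped rather than stored.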

Engineering Efforts

Data nerds, you are welcome to join us!

Here are the ongoing efforts. If any of them interests you, do not hesitate to join us.

subsystem   component                               priority   committers
storage*                                                       @jinmingjian
            data layout
            data read
            data write
            metadata
runtime                                                        @jinmingjian
            base language(sql)
            parsing
            base ir (intermediate representation)
            codegen
            jit compiler*
            kernel execution
infra                                                          @jinmingjian
            common
            lib
            testing
            bench
            doc
            project
client                                                         @jinmingjian
            baseshell
            baseops
            visualization

Communication

Feel free to report any problem via issues.

Mailing list: just open an issue with label [type/discuss].

Slack Channel

Contributing

Thanks for your contributions!

Dev Docs

License

TensorBase is distributed under the terms of the Apache License (Version 2.0).

See LICENSE for details.

tensorbase's People

Contributors

jinmingjian
