pancake-db's Introduction

PancakeDB is an event ingestion solution. It is simple to set up and computationally cheaper than alternatives. A 1-node instance of PancakeDB can handle >10k writes per second. Reading from PancakeDB into Spark is even faster than reading from Parquet files. PancakeDB is causally consistent, so the data you write is available in real time.

Getting Started

To start a simple 1-node deployment, run:

git clone https://github.com/pancake-db/pancake-db && cd pancake-db
docker build . -t pancake-db:latest # will take several minutes
mkdir pancake_db_data
docker run --rm -p 3841:3841 -p 3842:3842 -v "$(pwd)/pancake_db_data:/pancake_db_data" pancake-db:latest

Now you can write data either via HTTP or with one of the client libraries. For example:

# create a table
curl -XPOST -H 'Content-Type: application/json' localhost:3841/rest/create_table -d '{
  "tableName": "my_purchase_table",
  "schema": {
    "partitioning": {
      "day": {"dtype": "timestampMinute"}
    },
    "columns": {
      "user_id": {"dtype": "string"},
      "cents_amount": {"dtype": "int64"}
    }
  }
}'

# write a row
curl -XPOST -H 'Content-Type: application/json' localhost:3841/rest/write_to_partition -d '{
  "tableName": "my_purchase_table",
  "partition": {
    "day": "2022-01-01T00:00:00Z"
  },
  "rows": [{
    "user_id": "abc",
    "cents_amount": 1234
  }]
}'
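
The same write can be issued from a script. The following is a minimal sketch using only the Python standard library, not one of the official client libraries. The route and field names are copied from the curl examples above; the helper names `build_request`, `write_rows_payload`, and `send` are hypothetical, and the response body is returned as raw text because its format is not described here.

```python
import json
import urllib.request

# Host and HTTP port as published by the docker run command above.
BASE_URL = "http://localhost:3841/rest"


def build_request(route: str, payload: dict) -> urllib.request.Request:
    """Build (but do not send) a JSON POST request for a REST route."""
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        f"{BASE_URL}/{route}",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def write_rows_payload(table_name: str, day: str, rows: list) -> dict:
    """Assemble the write_to_partition body shown in the curl example."""
    return {"tableName": table_name, "partition": {"day": day}, "rows": rows}


def send(req: urllib.request.Request) -> str:
    """Send the request and return the raw response body as text."""
    with urllib.request.urlopen(req) as resp:
        return resp.read().decode("utf-8")


# Usage (requires a running server):
# req = build_request(
#     "write_to_partition",
#     write_rows_payload(
#         "my_purchase_table",
#         "2022-01-01T00:00:00Z",
#         [{"user_id": "abc", "cents_amount": 1234}],
#     ),
# )
# print(send(req))
```

Keeping request construction separate from sending makes it possible to inspect or log the payload before anything reaches the server.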

If you have Spark installed, you can set up a project that depends on the PancakeDB Spark connector and read the tables efficiently. For instance:

spark-shell --jars $MY_SPARK_PROJECT_UBERJAR

scala> val t = spark.read.format("pancake").option("host", "localhost").option("port", 3842).option("table_name", "my_purchase_table").load()

scala> t.show()
+-------+------------+-------------------+
|user_id|cents_amount|                day|
+-------+------------+-------------------+
|    abc|        1234|2021-12-31 19:00:00|
+-------+------------+-------------------+


scala> t.createOrReplaceTempView("t")

scala> spark.sql("select count(*) from t").show()
+--------+
|count(1)|
+--------+
|       1|
+--------+
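
Note that the `day` column above reads 2021-12-31 19:00:00 although the row was written with 2022-01-01T00:00:00Z. This is consistent with Spark rendering timestamps in the session's local time zone rather than in UTC. The short Python check below reproduces the arithmetic, assuming the session that produced the output ran at UTC-5 (an inference from the output, not something the README states).

```python
from datetime import datetime, timedelta, timezone

# Partition value written through the HTTP API, in UTC.
written = datetime(2022, 1, 1, 0, 0, tzinfo=timezone.utc)

# Assumption: the Spark session ran in a UTC-5 time zone.
assumed_local = timezone(timedelta(hours=-5))

displayed = written.astimezone(assumed_local)
print(displayed.strftime("%Y-%m-%d %H:%M:%S"))  # 2021-12-31 19:00:00
```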

Documentation

See the website's page.

See the IDL repo for an explanation of the API.

Contributing

To get involved, join the Discord or submit a GitHub issue.

pancake-db's Issues

Compilation error while trying docker build

Hi. I am just getting started with pancake-db. While building the Docker image by following the instructions in the README, I hit the following compilation error:

 Compiling pancake-db-server v0.0.0 (/workdir)
error[E0308]: mismatched types
   --> src/ops/write_to_partition.rs:134:7
    |
134 |       server.add_flush_candidate(segment_key.clone())
    |       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ expected enum `Result`, found opaque type
    |
note: while checking the return type of the `async fn`
   --> src/server/mod.rs:204:60
    |
204 |   pub async fn add_flush_candidate(&self, key: SegmentKey) {
    |                                                            ^ checked the `Output` of this `async fn`, found opaque type
    = note:     expected enum `Result<(), ServerError>`
            found opaque type `impl futures::Future<Output = ()>`

For more information about this error, try `rustc --explain E0308`.
error: could not compile `pancake-db-server` due to previous error
