PancakeDB

PancakeDB is an event ingestion solution. It is simple to set up and computationally cheaper than alternatives. A 1-node instance of PancakeDB can handle >10k writes per second. Reading from PancakeDB into Spark is even faster than reading from Parquet files. PancakeDB is causally consistent, so the data you write is available in real time.

Getting Started

To start a simple 1-node deployment, run:

git clone https://github.com/pancake-db/pancake-db && cd pancake-db
docker build . -t pancake-db:latest # will take several minutes
mkdir pancake_db_data
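# bind-mount the directory created above (a bare name would create a Docker named volume instead)
docker run --rm -p 3841:3841 -p 3842:3842 -v "$(pwd)/pancake_db_data:/pancake_db_data" pancake-db:latest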

Now you can write data either via HTTP or with one of the client libraries. For example:

# create a table
curl -XPOST -H 'Content-Type: application/json' localhost:3841/rest/create_table -d '{
  "tableName": "my_purchase_table",
  "schema": {
    "partitioning": {
      "day": {"dtype": "timestampMinute"}
    },
    "columns": {
      "user_id": {"dtype": "string"},
      "cents_amount": {"dtype": "int64"}
    }
  }
}'

# write a row
curl -XPOST -H 'Content-Type: application/json' localhost:3841/rest/write_to_partition -d '{
  "tableName": "my_purchase_table",
  "partition": {
    "day": "2022-01-01T00:00:00Z"
  },
  "rows": [{
    "user_id": "abc",
    "cents_amount": 1234
  }]
}'
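
The same write can be issued from a script. The following is a minimal sketch in Python using only the standard library, not an official client. The endpoint path, port, and JSON field names are taken from the curl commands above. The helper names `build_write_request` and `write_rows` are hypothetical. Sending several rows in one request is an assumption based on `rows` being an array.

```python
import json
import urllib.request

# REST port published by the docker run command above.
BASE_URL = "http://localhost:3841/rest"


def build_write_request(table_name, partition, rows):
    """Return the JSON body for /rest/write_to_partition as UTF-8 bytes.

    Field names mirror the curl example; `rows` is a list of dicts
    keyed by column name.
    """
    payload = {"tableName": table_name, "partition": partition, "rows": rows}
    return json.dumps(payload).encode("utf-8")


def write_rows(table_name, partition, rows, base_url=BASE_URL):
    """POST one batch of rows and return the HTTP status code.

    Hypothetical helper: assumes the server accepts multiple rows per
    request. urllib raises HTTPError on 4xx/5xx responses.
    """
    request = urllib.request.Request(
        f"{base_url}/write_to_partition",
        data=build_write_request(table_name, partition, rows),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return response.status
```

With the server from the previous step running, `write_rows("my_purchase_table", {"day": "2022-01-01T00:00:00Z"}, [{"user_id": "abc", "cents_amount": 1234}])` would send the same request as the second curl command.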

If you have Spark installed, you can set up a project depending on the PancakeDB Spark connector and access the tables efficiently. For instance,

spark-shell --jars $MY_SPARK_PROJECT_UBERJAR

scala> val t = spark.read.format("pancake").option("host", "localhost").option("port", 3842).option("table_name", "my_purchase_table").load()

scala> t.show()
+-------+------------+-------------------+
|user_id|cents_amount|                day|
+-------+------------+-------------------+
|    abc|        1234|2021-12-31 19:00:00|
+-------+------------+-------------------+


scala> t.createOrReplaceTempView("t")

scala> spark.sql("select count(*) from t").show()
+--------+
|count(1)|
+--------+
|       1|
+--------+
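
The `day` column in the `t.show()` output reads `2021-12-31 19:00:00`, although the row was written to partition `2022-01-01T00:00:00Z`. Spark renders timestamps in the session time zone, so this is the same instant displayed at an offset. The sketch below reproduces the conversion in Python, assuming the session ran in a UTC-5 zone (an inference from the output, not something the README states). The function name `render_in_offset` is hypothetical.

```python
from datetime import datetime, timedelta, timezone


def render_in_offset(utc_iso, offset_hours):
    """Format a UTC ISO-8601 timestamp as wall-clock time at a fixed UTC offset."""
    # Older Python versions do not parse a trailing "Z", so normalize it first.
    instant = datetime.fromisoformat(utc_iso.replace("Z", "+00:00"))
    local = instant.astimezone(timezone(timedelta(hours=offset_hours)))
    return local.strftime("%Y-%m-%d %H:%M:%S")


print(render_in_offset("2022-01-01T00:00:00Z", -5))  # 2021-12-31 19:00:00
```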

Documentation

See the documentation page on the project website.

See the IDL repo for an explanation of the API.

Contributing

To get involved, join the Discord or submit a GitHub issue.

pancake-db's Issues

Compilation error while trying docker build

Hi. I am just getting started with pancake-db. When I build the Docker image following the instructions in the README, the build fails with the following compilation error:

 Compiling pancake-db-server v0.0.0 (/workdir)
error[E0308]: mismatched types
   --> src/ops/write_to_partition.rs:134:7
    |
134 |       server.add_flush_candidate(segment_key.clone())
    |       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ expected enum `Result`, found opaque type
    |
note: while checking the return type of the `async fn`
   --> src/server/mod.rs:204:60
    |
204 |   pub async fn add_flush_candidate(&self, key: SegmentKey) {
    |                                                            ^ checked the `Output` of this `async fn`, found opaque type
    = note:     expected enum `Result<(), ServerError>`
            found opaque type `impl futures::Future<Output = ()>`

For more information about this error, try `rustc --explain E0308`.
error: could not compile `pancake-db-server` due to previous error
