Giter Site home page Giter Site logo

duckdb-data-api's Introduction

DuckDB Data API and Micro ORM

Ever wondered if MotherDuck offers a REST API? This project steps in to bridge that gap, enabling you to interact with DuckDB databases through a RESTful interface without writing a single line of backend code. Designed for deployment at the edge, it's a cost-effective solution for hobbyists and small projects.

The DuckDB Data API facilitates direct access to your DuckDB database, allowing for CRUD operations via HTTP.

Motivation

  • Simplicity and Rapid Prototyping: Offers a dynamic, RESTful interface to cater to various project requirements with minimal or no coding.
  • Adaptation to Serverless and Edge Computing: Aims to provide a fitting database solution for applications operating at the internet's edge.
  • Leveraging DuckDB’s Speed: While DuckDB excels in performance, integrating it directly into web applications posed challenges.
  • Filling the Gap Left by MotherDuck: Although MotherDuck allows DuckDB to run serverlessly, it lacked a REST interface. This project aims to bridge that gap.
  • Inspiration from MongoDB's Atlas Data Proxy: The convenience and developer-friendly nature of MongoDB's Atlas Data Proxy influenced the development of this project, promoting quick backend support for rapid prototyping.
  • Cost-Effective Deployment on Vercel: The project prioritizes a deployment strategy that incurs minimal to no costs on platforms like Vercel.

Design Goals

  • Adopting the ActiveRecord Pattern: Mimics the ActiveRecord design pattern to offer an intuitive way of interacting with database records.
  • Automated CRUD Operations: Streamlines Create, Read, Update, and Delete operations to save development time and effort.
  • Flexibility and Dynamic Nature: Ensures the proxy can adapt to varying project needs without rigid constraints.
  • Single-File Implementation: Strives for a lightweight solution, ideally encapsulated within a single file for ease of use and integration.

Getting Started

Python Environment Setup

  1. Create a Virtual Environment:

    python3 -m venv env
    source env/bin/activate
    pip install -r requirements.txt
  2. Create a .env file at the root of your project to configure the environment settings:

# .env file
DUCKDB_DATABASE_URL=duckdb:///tickit.duckdb
DUCKDB_SCHEMA_NAME=main
QUERY_BLACKLIST=DELETE,DROP,TRUNCATE,ALTER
  • DUCKDB_DATABASE_URL: Specifies the connection URL to your local DuckDB database file or to a Motherduck connectionstring. Alternatively, you can set this to :memory: to use an in-memory database.
  • DUCKDB_SCHEMA_NAME: Sets the default schema for database operations within the data api. If left unset, it defaults to the "main" schema.
  • QUERY_BLACKLIST: Defines a comma-separated list of SQL keywords that the data api will reject to prevent potentially destructive operations. If this list is empty or not set, no commands will be blocked, and all types of queries will be permitted.
  1. Install Packages

    pip install -r requirements.txt
  2. Run the project:

    uvicorn main:app --reload

Examples of DUCKDB_DATABASE_URL Configurations

Physical DuckDB file:

DUCKDB_DATABASE_URL=duckdb:///path/to/your/database.duckdb

In-memory DuckDB instance:

DUCKDB_DATABASE_URL=duckdb:///:memory:

Motherduck:

DUCKDB_DATABASE_URL=duckdb:///md:[motherduck-token]@[db-name]

RESTful Routes and Actions

Interact with your DuckDB database through the following RESTful routes by replacing entity with your table name:

Method Route Description Query Parameter Examples
GET /entity List entities ?limit=10&skip=20?select=field1..?order=field1 asc?field1.eq=value
POST /entity Create a new entity N/A
GET /entity/:id Get a single entity by ID N/A
PUT /entity/:id Replace an entity by ID (full update) N/A
PATCH /entity/:id Update an entity by ID (partial update) N/A
DELETE /entity/:id Delete an entity by ID N/A

Query Parameter Examples

  • Filtering: ?field1.eq=value filters the list by field1 equal to value.
  • Sorting: ?order=field1 asc sorts the list by field1 in ascending order.
  • Pagination: ?limit=10&skip=20 limits the list to 10 entities, skipping the first 20.
  • Selecting Fields: ?select=field1,field2 selects only field1 and field2 to be returned in each entity in the list.

Supported Filter Operators

The DuckDB Data Proxy supports a range of filter operators for querying data, allowing for precise data retrieval based on specific criteria:

  • .eq: Equals
  • .neq: Not equals
  • .gt: Greater than
  • .gte: Greater than or equal to
  • .lt: Less than
  • .lte: Less than or equal to
  • .like: Like (for pattern matching)
  • .ilike: Case-insensitive pattern matching

These operators can be used in query parameters to filter the data retrieved from the database. For example, ?name.like=%john% would filter records where the name field contains "john".

It would be more organized to place execute/sql in the "Additional Endpoints" section if it serves a different or more specific purpose than the standard CRUD operations. It's common to separate utility or administrative endpoints from the main CRUD operations to clarify their use cases. Here's how you could mention it:

## Additional Endpoints

In addition to the core RESTful routes, the DuckDB Data Proxy provides several utility endpoints for diagnostics, metadata, and system health checks:

| Method | Route                  | Description                                                   | Query Parameter Examples |
|--------|------------------------|---------------------------------------------------------------|--------------------------|
| GET    | `/`                    | Root endpoint returning a welcome message.                   | N/A                      |
| GET    | `/health`              | Health check endpoint.                                        | N/A                      |
| GET    | `/debug/connection`    | Tests database connection.                                    | N/A                      |
| GET    | `/metadata/tables`     | Lists all tables in the database.                            | N/A                      |
| POST   | `/execute/sql`         | Execute a custom SQL query (SELECT or DDL statement).        | N/A                      |

The `POST /execute/sql` endpoint is for advanced users who need to execute custom SQL queries or DDL statements that are not covered by the standard CRUD operations. Please use this endpoint with caution, as improper use can affect database integrity and security.

Playground

Interact with the following tables from tickit db: sale, event, data, category, user, listing, venue

User Table API Endpoints

GET     https://duckdb-data-api.vercel.app/user?state.eq=NL&liketheatre=true&limit=10&offset=5&order=firstname%20asc    - List, filter, sort and paginate users
POST    https://duckdb-data-api.vercel.app/user                                                                         - Create a new user
GET     https://duckdb-data-api.vercel.app/user/{id}                                                                    - Get a single user by ID
PUT     https://duckdb-data-api.vercel.app/user/{id}                                                                    - Replace a user by ID (full update)
PATCH   https://duckdb-data-api.vercel.app/user/{id}                                                                    - Update a user by ID (partial update)
DELETE  https://duckdb-data-api.vercel.app/user/{id}                                                                    - Delete a user by ID

Heads Up on Limitations

  • Performance Considerations: Because of the way we jump from the edge to MotherDuck and back, and how we fetch data, especially when counting items for pagination, there might be a slight delay.
  • Primary Key Expectations: Right now, we expect the primary key in your tables to be named "id". We know that’s not always the case, so we’re thinking of ways to work around this in future updates.

Technology Stack

  1. FastAPI:
  2. DuckDB and MotherDuck:
  3. Python 3.9:
  4. SQLAlchemy:

Architecture

flowchart LR
    Client -->|REST API Call| FastAPI
    FastAPI -->|SQLAlchemy ORM| DuckDB
    DuckDB -->|Process Query| MotherDuck
    MotherDuck -->|Return Results| DuckDB
    DuckDB -->|ORM| FastAPI
    FastAPI -->|JSON Response| Client

    style FastAPI fill:#f9f,stroke:#333,stroke-width:2px
    style DuckDB fill:#bbf,stroke:#333,stroke-width:2px
    style MotherDuck fill:#fbf,stroke:#333,stroke-width:2px
    style Client fill:#dfd,stroke:#333,stroke-width:2px

Deployment in Vercel

Deploying your application to Vercel can significantly simplify the process, thanks to its support for serverless functions. For a detailed guide on deploying FastAPI applications to Vercel, check out this insightful blog post. It walks you through the steps to ensure your application runs smoothly in a serverless environment, making your DuckDB Data Proxy accessible from anywhere.

From Goduck to DuckDB Data Proxy

Transitioning from Golang to Python

Before diving into this Python project, I launched Goduck, a similar initiative built with Golang. It aimed to provide REST API interaction with DuckDB and MotherDuck, much like what we're doing here but in the Go ecosystem.

Shifting Gears to Python

While trying to deploy Goduck across various environments, including serverless platforms, I faced hurdles due to the C-go dependency of the Go duckDB driver, which made the build process tricky for different Linux systems. This challenge highlighted the benefits of Python's straightforwardness and the extensive support from its community. Here are the main insights:

  • Simpler Python Driver: Python's approach to DuckDB felt more straightforward and developer-friendly.
  • Larger Python Community: The vast Python community meant more potential users and contributors for this project.
  • Inspiration from MongoDB: MongoDB's Atlas Data Proxy, which simplifies database operations, inspired me to offer a similar experience for DuckDB users, facilitating quick backend setups for rapid prototyping.

duckdb-data-api's People

Contributors

senthilsweb avatar

Stargazers

Vince Fulco--Bighire.tools avatar Liam Clancy avatar

Watchers

 avatar

Forkers

sivaprasadj

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.