Giter Site home page Giter Site logo

llamadb's Introduction

LlamaDB

Fair warning: this project is not maintained.

Do not use this for anything important, like... for anything. This is a database I wrote for the fun of it. It will never be production-ready.

LlamaDB is a simple SQL database, written entirely in the Rust programming language.

Building

LlamaDB is built using the nightly builds of Rust and Cargo. For installation instructions, go to http://www.rust-lang.org.

To use the CLI, navigate to the cli/ directory and run cargo run. You'll be greeted by the friendly llamadb> prompt, with whom you can enter SQL queries. To exit the CLI, press Ctrl+D.

Examples, Features and TODO

For a quick rundown of implemented features, see the Usage guide.

See the issue tracker for unimplemented features and bugs: https://github.com/nukep/llamadb/issues

About

I started this mostly as a learning project so that I could learn SQL and its implementation details better. This project may or may not go anywhere; we'll just have to see.

The other reason I started this was to see how the Rust programming language could be used to write large projects such as SQL databases. So far, I think it's working out fairly well. :)

Special thanks

A HUGE THANKS goes out to SQLite and the SQLite documentation. Their wonderful docs helped shed some light on the SQL syntax and other crucial details such as their B-Tree implementation.

llamadb's People

Contributors

nukep avatar

Stargazers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar

Watchers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar

llamadb's Issues

JOIN syntax

  • INNER JOIN
  • LEFT JOIN
  • RIGHT JOIN
  • FULL OUTER JOIN

INNER JOIN can currently be emulated with a traditional Cartesian + WHERE join.
It would be nice to allow for the INNER JOIN syntax as an alternative for those who prefer it (such as myself!).

OUTER JOIN cannot currently be emulated. The only way to get an outer join is to implement the syntax.

Join tables using indices

Currently, all joins are executed using nested scans. This can obviously be bad for performance.

This would need to be implemented at the query compiler at the optimization stage.

Can't paste multi-line statements into CLI

This is a rather frustrating bug/limitation for those of us who like to copy and paste multi-lined text into the CLI. If a multi-line string is pasted, only the first line will be inputted.

rust-linenoise is used to drive the CLI input. Enabling multi-line mode with linenoise::set_multiline(1); doesn't seem to have any effect on my machine (64-bit Linux, using gnome-terminal).

INSERT values using expressions

Currently, INSERT only works with simple number or string expressions; it cannot fold expressions into a single variant value.

The goal is to be able to run:

INSERT INTO mytable VALUES (5 * 4, NULL), (0, (SELECT avg(x) FROM othertable));

INSERT using a SELECT query

The goal is to be able to run:

INSERT INTO table_a SELECT count(*), avg(a) FROM table_b GROUP BY category;

Aggregate/Scalar functions checklist

List of functions to implement

Aggregate functions

  • COUNT(*)
  • COUNT
  • SUM
  • AVG
  • GROUP_CONCAT
  • MIN
  • MAX

Scalar functions

  • ABS
  • LENGTH
  • ROUND
  • FLOOR
  • CEIL
  • TRUNC
  • TYPEOF

Auto-incrementing columns

The goal is to be able to run:

CREATE TABLE mytable (
  id U32 INC,
  name VARCHAR
);

INSERT INTO mytable (name) VALUES ('Alex'), ('Bob'), ('Caroline');

SELECT * FROM mytable;
/*
-----------------
| id | name     |
-----------------
| 0  | Alex     |
| 1  | Bob      |
| 2  | Caroline |
-----------------
*/

Aliases for INC: AUTOINCREMENT

SELECT without FROM

It'd be useful for simple logic tests and fiddling:

SELECT 'Hello' || 'World' AS message, 5 * '6';
/*
-------------------
| message    | _0 |
-------------------
| HelloWorld | 30 |
-------------------
*/

I'm electing not to use dummy tables like DUAL. :)

Build failed, during cargo run

screenshot from 2015-04-20 07 47 52
screenshot from 2015-04-20 07 48 43

thread 'rustc' panicked at 'Trying to convert unsized value to lval', /build/rust/src/rustc-1.0.0-alpha.2/src/librustc_trans/trans/datum.rs:495

Put on crates.io, make the API stable enough to use as a standalone library

I don't feel that LlamaDB is at a point where it can serve as a standalone library yet. The library crate is currently a bunch of hacks that expose internals such as parser modules.

This issue needs to wait on #4 so that developers have a somewhat reasonable API to work with, as well as a file-based DB. tempdb won't be around for long!

Until then, the llamadb crate has been reserved on crates.io.


Nice-to-haves for the library:

  • A sql!() macro that statically validates and compiles the SQL query (to an extent).
    • This may be hard to do without schema information.
    • At least have it validate the syntax, possibly compile into AST?
  • A C API, for use by other programming languages.

Optimize away redundant temp-group-by's

SELECT avg(milliseconds) FROM track;

currently compiles into:

(temp-group-by :source_id 1
  (scan `track` :source_id 0
    (yield
      (column-field :source_id 0 :column_offset 0)
      (column-field :source_id 0 :column_offset 1)
      (column-field :source_id 0 :column_offset 2)
      (column-field :source_id 0 :column_offset 3)
      (column-field :source_id 0 :column_offset 4)
      (column-field :source_id 0 :column_offset 5)
      (column-field :source_id 0 :column_offset 6)
      (column-field :source_id 0 :column_offset 7)
      (column-field :source_id 0 :column_offset 8)))
  (group-by-values)
  (yield
    (avg :source_id 1 
      (column-field :source_id 1 :column_offset 6))))

The temp-group-by creates a redundant temporary group, when simply using the table's group could suffice. This could be done with a new scan-group function or similar.

(scan-group `track` :source_id 0
  (yield
    (avg :source_id 0
      (column-field :source_id 0 :column_offset 6))))

Column types checklist

  • STRING
    • A variable-length UTF-8 string.
    • Alias: VARCHAR
  • Ux, where x is >= 8 and <= 64, and is a multiple 8.
    • An unsigned integer.
  • Ix, where x is >= 8 and <= 64, and is a multiple 8.
    • An signed integer.
    • Alias for I32: INT or INTEGER
  • F64
    • A double-precision (64-bit) floating point number.
    • Alias: DOUBLE
  • byte[]
    • A variable-length byte array.
  • byte[N]
    • A fixed-length byte array.
  • UUID
  • JSON
    • A JSON document, encoded using MessagePack.
  • BCRYPT
  • SCRYPT
  • PBKDF2

Expressions checklist

  • Numbers
  • Strings
  • NULL
  • Byte arrays
  • Subqueries
  • Aggregate function calls
  • Scalar function calls
  • Binary comparison: =, <>, <, <=, >, >=
  • Binary arithmetic: +, -, *, /
  • Unary -
  • AND
  • OR
  • NOT
  • ||
  • EXISTS
  • BETWEEN ... AND
  • IN
  • CASE
  • IS NULL
  • IS NOT NULL
  • LIKE

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.