fjall-rs / fjall
LSM-based embedded key-value storage engine written in safe Rust
License: Apache License 2.0
Column families/partitions are the missing piece to having a great storage engine. Each column family is its own partition inside the LSM, but all share the same journal, enabling atomic writes. So each column family is physically stored separately from the others. Compaction then looks at each partition individually, never mixing tables of different partitions. It's like one big meta-LSM tree, as opposed to using multiple trees, which have no atomic semantics unless another superstructure is put on top. That superstructure will basically be the new LSM tree, which contains the partitions. Compaction may even be configured differently for each column family.
If no column family is created, everything gets put into the "default" column family.
Creating a column family is pretty simple: you add a new memtable. Dropping a column family is simple too: delete its memtable and all the segments inside the partition. The journal needs to be handled accordingly, because flushing one column family doesn't necessarily mean the others are flushed too. And if a column family was deleted, the journal should not flush any of its parts to the partition at all.
There needs to be new semantics for writing to a column family:
https://github.com/facebook/rocksdb/wiki/Column-Families
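The idea of several partitions sharing one journal can be sketched with a small in-memory model. This is only an illustration under stated assumptions, not fjall's actual design or API: `Keyspace`, `Record`, `write_batch` and `recover` are hypothetical names, memtables are plain `BTreeMap`s, and the journal is a `Vec` instead of a file.

```rust
use std::collections::{BTreeMap, HashMap};

/// One journal record (hypothetical layout): target partition, key,
/// and value, where `None` marks a deletion.
#[derive(Clone, Debug)]
pub struct Record {
    pub partition: String,
    pub key: Vec<u8>,
    pub value: Option<Vec<u8>>,
}

/// Hypothetical keyspace: one memtable per partition, one shared journal.
#[derive(Default)]
pub struct Keyspace {
    /// Each element is one batch that was journaled as a unit.
    pub journal: Vec<Vec<Record>>,
    pub memtables: HashMap<String, BTreeMap<Vec<u8>, Option<Vec<u8>>>>,
}

impl Keyspace {
    /// Starts with the "default" partition already present.
    pub fn new() -> Self {
        let mut ks = Self::default();
        ks.create_partition("default");
        ks
    }

    /// Creating a partition only adds an empty memtable.
    pub fn create_partition(&mut self, name: &str) {
        self.memtables.entry(name.to_string()).or_default();
    }

    /// Dropping a partition removes its memtable (a real engine would
    /// also delete the partition's segments on disk).
    pub fn drop_partition(&mut self, name: &str) {
        self.memtables.remove(name);
    }

    /// Validate first, journal the whole batch once, then apply it.
    /// Either every record of the batch becomes visible or none does.
    pub fn write_batch(&mut self, batch: Vec<Record>) -> Result<(), String> {
        if let Some(r) = batch.iter().find(|r| !self.memtables.contains_key(&r.partition)) {
            return Err(format!("unknown partition: {}", r.partition));
        }
        self.journal.push(batch.clone());
        for r in batch {
            if let Some(memtable) = self.memtables.get_mut(&r.partition) {
                memtable.insert(r.key, r.value);
            }
        }
        Ok(())
    }

    pub fn get(&self, partition: &str, key: &[u8]) -> Option<&[u8]> {
        self.memtables.get(partition)?.get(key)?.as_deref()
    }

    /// Rebuild memtables from a journal, skipping records that belong
    /// to partitions which are no longer live (i.e. were dropped).
    pub fn recover(journal: Vec<Vec<Record>>, live: &[&str]) -> Self {
        let mut ks = Self::default();
        for name in live {
            ks.create_partition(name);
        }
        for batch in &journal {
            for r in batch {
                if let Some(memtable) = ks.memtables.get_mut(&r.partition) {
                    memtable.insert(r.key.clone(), r.value.clone());
                }
            }
        }
        ks.journal = journal;
        ks
    }
}
```

In this model a batch that names an unknown partition is rejected before anything is journaled, and replaying the journal after a partition was dropped simply ignores that partition's records.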
This issue lists Renovate updates and detected dependencies. Read the Dependency Dashboard docs to learn more.
This repository currently has no open or pending branches.
Cargo.toml
byteorder 1.5.0
crc32fast 1.4.0
lsm-tree 0.6.3
log 0.4.21
std-semaphore 0.1.0
tempfile 3.10.1
fs_extra 1.3.0
path-absolutize 3.1.1
criterion 0.5.1
nanoid 0.4.0
test-log 0.2.15
rand 0.8.5
.github/workflows/release.yml
actions/checkout v4
katyo/publish-crates v2
.github/workflows/test.yml
actions/checkout v4
Swatinem/rust-cache v2
actions/checkout v4
We should have all the pieces to get transactions working:
Batch
I am looking for an LSM-tree-based database to use as my in-memory database backend. First I found the lsm_tree crate, but it has no automatic flushing and no partitions; the only thing I can do is store different tables in one folder, which is fine. Then I found this crate, which is exactly the wheel I was looking for. After a few minutes of migration, a big problem came up.
I was using the reflected name of a type as the name of the lsm_tree file, so that different types are stored in different trees. Say the dictionary-like generic API for my storage is Dense<K, V, D>, where D stands for delta. When a library user wants to store a type like <u64, u64, ()>, the reflected name of this type is something like Dense<u64, u64, ()>.
As you can see, the name contains the characters < and >, as well as spaces, which cannot be used in a partition name.
Remove the restriction on partition names,
or accept an AsRef<[u8]> as the partition name in the API.
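Until one of these is possible, a caller-side workaround is to encode the type name into a restricted alphabet before using it as a partition name. The sketch below assumes that ASCII letters, digits and `_` are accepted in partition names; that assumption, and the helper name `encode_partition_name`, are mine and not taken from the crate. Any length limit on names is not handled here.

```rust
/// Hypothetical helper: encode an arbitrary string so that the result
/// contains only ASCII alphanumerics and `_`.
///
/// Alphanumeric bytes are kept as-is; every other byte (including `_`
/// itself) becomes `_` followed by two lowercase hex digits, so two
/// different inputs never produce the same output.
pub fn encode_partition_name(raw: &str) -> String {
    let mut out = String::with_capacity(raw.len());
    for b in raw.bytes() {
        if b.is_ascii_alphanumeric() {
            out.push(b as char);
        } else {
            out.push_str(&format!("_{:02x}", b));
        }
    }
    out
}
```

For example, `encode_partition_name("Dense<u64, u64, ()>")` yields `Dense_3cu64_2c_20u64_2c_20_28_29_3e`. The result is less readable than the original name, but it is reversible and collision-free.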
Need a bloom (or XOR or ribbon or cuckoo) filter, that:
This is obvious. On recovery, load all bloom filters back into memory from disk. So the bloom filter needs to be able to give us its internal bytes and needs a constructor to recreate it from raw bytes.
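A minimal sketch of such a filter, with a way to dump its internal bytes and a constructor to rebuild it from them, could look like the following. Everything here is an assumption for illustration: the `BloomFilter` type, the byte layout (a 4-byte little-endian hash count followed by the bit array), and the hand-rolled FNV-1a hashing with double hashing. A hand-written hash is used so that the persisted bits do not depend on a standard-library hasher whose algorithm may change between releases.

```rust
/// Minimal Bloom filter sketch that can be written to and read from bytes.
pub struct BloomFilter {
    bits: Vec<u8>,
    k: u32, // number of probes per key
}

/// FNV-1a over the key, with the seed mixed into the offset basis.
fn fnv1a(data: &[u8], seed: u64) -> u64 {
    let mut h = 0xcbf2_9ce4_8422_2325u64 ^ seed;
    for &b in data {
        h ^= b as u64;
        h = h.wrapping_mul(0x0000_0100_0000_01b3);
    }
    h
}

/// Double hashing: probe i lands at (h1 + i * h2) mod num_bits.
fn positions(key: &[u8], k: u32, num_bits: usize) -> Vec<usize> {
    let h1 = fnv1a(key, 0);
    let h2 = fnv1a(key, 0x9e37_79b9_7f4a_7c15) | 1;
    (0..k as u64)
        .map(|i| (h1.wrapping_add(i.wrapping_mul(h2)) % num_bits as u64) as usize)
        .collect()
}

impl BloomFilter {
    pub fn new(num_bits: usize, k: u32) -> Self {
        let num_bytes = (num_bits.max(8) + 7) / 8;
        Self { bits: vec![0; num_bytes], k: k.max(1) }
    }

    pub fn insert(&mut self, key: &[u8]) {
        for p in positions(key, self.k, self.bits.len() * 8) {
            self.bits[p / 8] |= 1 << (p % 8);
        }
    }

    /// `false` means definitely absent; `true` means possibly present.
    pub fn may_contain(&self, key: &[u8]) -> bool {
        positions(key, self.k, self.bits.len() * 8)
            .into_iter()
            .all(|p| (self.bits[p / 8] & (1 << (p % 8))) != 0)
    }

    /// Serialize as: k (u32, little endian) followed by the bit array.
    pub fn to_bytes(&self) -> Vec<u8> {
        let mut out = self.k.to_le_bytes().to_vec();
        out.extend_from_slice(&self.bits);
        out
    }

    /// Rebuild a filter from `to_bytes` output; `None` if malformed.
    pub fn from_bytes(raw: &[u8]) -> Option<Self> {
        if raw.len() < 5 {
            return None;
        }
        let k = u32::from_le_bytes([raw[0], raw[1], raw[2], raw[3]]);
        if k == 0 {
            return None;
        }
        Some(Self { bits: raw[4..].to_vec(), k })
    }
}
```

On recovery, the bytes stored next to each segment would be passed to `from_bytes`, so the filter never has to be rebuilt by rescanning keys.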
RocksDB used to store a bloom filter per data block, so filter construction is simple. However, its read path is much worse because you need to travel through the SST file. The new full filter format just stores one big bloom filter per SST file. That requires much more memory when flushing and compacting because all keys or hashes need to be buffered until the SST is written because the number of items is unknown until everything is written. Then the bloom filter needs to be built from the buffer.
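The reason the item count must be known up front is that a classic Bloom filter is sized from it. The standard formulas are m = -n ln(p) / (ln 2)^2 bits and k = (m / n) ln 2 hash functions for n items and a target false positive rate p. A small sketch, where the function name `bloom_params` is hypothetical:

```rust
/// Compute (number of bits, number of hash functions) for a classic
/// Bloom filter holding `n` items at target false positive rate `fpr`.
/// `fpr` is expected to lie strictly between 0 and 1.
pub fn bloom_params(n: usize, fpr: f64) -> (usize, u32) {
    let n = n.max(1) as f64;
    let ln2 = std::f64::consts::LN_2;
    // m = -n * ln(p) / (ln 2)^2
    let m = (-(n * fpr.ln()) / (ln2 * ln2)).ceil();
    // k = (m / n) * ln 2, at least one hash function
    let k = ((m / n) * ln2).round().max(1.0);
    (m as usize, k as u32)
}
```

For 1,000 keys at a 1% false positive rate this comes out to roughly 9.6 bits per key and 7 hash functions. With one filter per SST file, `n` is only final once the last key has been written, which is why keys or hashes have to be buffered until then.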
Using a scalable filter may solve this memory issue.
Monkey may maximize efficiency for a given amount of memory: http://daslab.seas.harvard.edu/monkey/.
cargo msrv reported 1.68.2, but it does not build on Mac.
Search for TODO: #8
May be an issue in lsm-tree
At a regular but infrequent interval, check each level and possibly compact it into itself (this needs a SingleLevelCompactor), or ideally pull it down into the last level (to remove tombstones).
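The core of such a compaction step can be sketched as a merge that keeps only the newest version of each key and drops tombstones when the output is the last level. This is a simplified in-memory model: the `Entry` layout and the `compact` function are hypothetical, and it ignores snapshots, which in a real engine may require older versions to be kept.

```rust
use std::collections::BTreeMap;

/// (key, sequence number, value); a `None` value marks a tombstone.
pub type Entry = (Vec<u8>, u64, Option<Vec<u8>>);

/// Merge the given segments into one sorted run.
///
/// Only the entry with the highest sequence number survives per key.
/// Tombstones are dropped only when writing into the last level,
/// because only there is it certain that no older version of the key
/// remains further down that the tombstone still has to shadow.
pub fn compact(segments: &[Vec<Entry>], is_last_level: bool) -> Vec<Entry> {
    let mut newest: BTreeMap<Vec<u8>, (u64, Option<Vec<u8>>)> = BTreeMap::new();
    for (key, seqno, value) in segments.iter().flatten() {
        let is_newer = newest
            .get(key)
            .map_or(true, |(existing, _)| seqno > existing);
        if is_newer {
            newest.insert(key.clone(), (*seqno, value.clone()));
        }
    }
    newest
        .into_iter()
        .filter(|(_, (_, value))| !(is_last_level && value.is_none()))
        .map(|(key, (seqno, value))| (key, seqno, value))
        .collect()
}
```

Compacting a level into itself corresponds to calling this with `is_last_level = false`: it reduces the number of segments and removes shadowed versions, but tombstones survive until they reach the last level.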
Is your feature request related to a problem? Please describe.
Currently, it is not possible to ensure that you are working with the latest version of a key.
Describe the solution you'd like
Make it possible for keys to carry a sequence number, which can be used to reject modifying operations if the latest sequence number of the stream does not match the requested one.
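The requested behaviour is essentially a compare-and-set on a per-key version. A minimal in-memory sketch of those semantics follows; `VersionedStore`, `put_if_version` and `WriteError` are hypothetical names used only for illustration and are not part of the crate.

```rust
use std::collections::HashMap;

#[derive(Debug, PartialEq)]
pub enum WriteError {
    /// The caller's expected sequence number is stale.
    VersionMismatch { expected: u64, actual: u64 },
}

/// Hypothetical store that tracks a sequence number per key.
#[derive(Default)]
pub struct VersionedStore {
    items: HashMap<Vec<u8>, (u64, Vec<u8>)>,
}

impl VersionedStore {
    /// Current sequence number of `key`; 0 means "never written".
    pub fn version(&self, key: &[u8]) -> u64 {
        self.items.get(key).map_or(0, |(seqno, _)| *seqno)
    }

    pub fn get(&self, key: &[u8]) -> Option<&[u8]> {
        self.items.get(key).map(|(_, value)| value.as_slice())
    }

    /// Write only if the key is still at `expected`; on success the
    /// new sequence number is returned.
    pub fn put_if_version(
        &mut self,
        key: &[u8],
        expected: u64,
        value: &[u8],
    ) -> Result<u64, WriteError> {
        let actual = self.version(key);
        if actual != expected {
            return Err(WriteError::VersionMismatch { expected, actual });
        }
        let next = actual + 1;
        self.items.insert(key.to_vec(), (next, value.to_vec()));
        Ok(next)
    }
}
```

A writer reads the current version, prepares its change, and passes that version back on write; if another writer got there first, the call fails and the first writer can re-read and retry.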
The tree.get function seems to match keys by prefix.
Here is my code:
let folder = "";
// A tree is a single physical keyspace/index/...
// and supports a BTreeMap-like API
// but all data is persisted to disk.
let tree = Config::new(folder).open().unwrap();
// Note compared to the BTreeMap API, operations return a Result<T>
// So you can handle I/O errors if they occur
tree.insert("hello-key-999991", "hello-value-999991")
    .unwrap();
let item = tree.get("hello-key-99999").unwrap();
match item {
    Some(value) => {
        println!("value: {}", std::str::from_utf8(&value).unwrap());
    }
    None => println!("Not Found"),
}
// Flush to definitely make sure data is persisted
tree.flush().unwrap();
Expected "Not Found", but it printed "value: hello-value-999991".