declanvk / blart Goto Github PK
View Code? Open in Web Editor NEWHome Page: https://declanvk.github.io/blart/
License: Apache License 2.0
Home Page: https://declanvk.github.io/blart/
License: Apache License 2.0
Currently the IntoIter
iterator implementation is pretty simple:
struct IntoIter<V>(TreeMap<V>);
impl<V> Iterator for IntoIter<V> {
type Item = (Box<[u8]>, V);
fn next(&mut self) -> Option<Self::Item> {
self.0.pop_first()
}
}
This is inefficient because the pop_first
function does extra work that fixes up the tree so that it is valid after removing the minimum leaf (things like compressing nodes with only a single remaining child, reducing node size below a specific boundary (InnerNode16 to InnerNode4)). it might be possible to gain some efficiency by allowing the tree to be in a "non-standard" state (inner nodes with 1 child or fewer that is advised).
Needs benchmark to accept.
insert_into_left_skewed_tree_deallocate
for an example.Option<parent_node>
, child_node
) where the new leaf should be inserted into child node. This portion can be done via an iterative search like the normal lookup.child_node
, if the node needs to grow we can update the Option<parent_node>
.See rust-lang/libs-team#141 for design of API and links to implementing CR
This feature may be a bit difficult because I believe that the BTreeMap implementation maintains a linked-list among leaf nodes so that traversing among those nodes is easy.
Options:
nnethercote/dhat-rs
to implement testing and profiling of memory usage.dhat::Alloc
, then run individual testsCurrently we have to use the nightly distribution of rust because we need some unstable features.
Nightly features we use:
After removing all unstable features (or they get promoted to stable) we should expand the github actions matrix to run all the build and tests for multiple version of rust: nightly, beta, stable, and a pegged MSRV.
Currently the contains_key
method is very simple:
pub fn contains_key<K>(&self, k: K) -> bool
where
K: AsRef<[u8]>,
{
self.get(k).is_some()
}
Possibly we could optimize this so that it doesn't return any extra information, but just fast returns if the key is present or missing. Needs specific benchmark results
When an index is created for an existing relation, the following recursive algorithm can be used to speed up index construction: Using the first byte of each key the key/value pairs are radix partitioned into 256 partitions and an inner node of the appropriate type is created. Before returning that inner node, its children are created by recursively applying the bulk loading procedure for each partition using the next byte of each key.
range
/range_mut
iteratorsdrain_filter
split_off
append
For split_off
and append
these can be implemented in terms of drain_filter
, but it may be better to go with an optimized version.
Also, completion of these these methods will block some of the progress in #24
So I made a fork of blart a few months ago, because it seemed the project was abandoned.
In my fork I made a lot of changes (like a lot):
If you wanna take a look here is my repo in the correct branch.
A few highlights:
Was also thinking about implementing a syncronized version from the paper: The ART of Practical Synchronization, but I'm not so sure about it, I don't think it's gonna work
I don't know if you want to merge it or keep it a separate crate, just let me know.
(Great project, was easy to understand and the performance was already great)
Going to hold off on endian types:
#[derive(Debug, Clone, Copy)]
#[cfg(feature = "nightly")]
struct BigEndian<T>
where
[(); std::mem::size_of::<T>()]:,
{
_repr_type: std::marker::PhantomData<T>,
bytes: [u8; std::mem::size_of::<T>()],
}
Currently crashes my nightly compiler with:
declan@declan-MS-7C02:~/repos/github/declanvk/blart$ cargo +nightly build --features nightly
Compiling blart v0.1.0 (/home/declan/repos/github/declanvk/blart)
warning: the feature `generic_const_exprs` is incomplete and may not be safe to use and/or cause compiler crashes
--> src/lib.rs:11:9
|
11 | generic_const_exprs
| ^^^^^^^^^^^^^^^^^^^
|
= note: see issue #76560 <https://github.com/rust-lang/rust/issues/76560> for more information
= note: `#[warn(incomplete_features)]` on by default
warning: field `bytes` is never read
--> src/key.rs:34:5
|
29 | struct BigEndian<T>
| --------- field in this struct
...
34 | bytes: [u8; std::mem::size_of::<T>()],
| ^^^^^
|
= note: `BigEndian` has derived impls for the traits `Clone` and `Debug`, but these are intentionally ignored during dead code analysis
= note: `#[warn(dead_code)]` on by default
error: internal compiler error: no errors encountered even though `delay_span_bug` issued
error: internal compiler error: cannot relate constants (Const { ty: fn() -> usize {std::mem::size_of::<T>}, kind: Value(Branch([])) }, Const { ty: fn() -> usize {std::mem::size_of::<_>}, kind: Value(Branch([])) }) of different types: fn() -> usize {std::mem::size_of::<T>} != fn() -> usize {std::mem::size_of::<_>}
|
= note: delayed at 0: <rustc_errors::HandlerInner>::emit_diagnostic
1: <rustc_errors::Handler>::delay_span_bug::<rustc_span::span_encoding::Span, &alloc::string::String>
2: rustc_middle::ty::relate::super_relate_consts::<rustc_infer::infer::equate::Equate>
3: <rustc_infer::infer::InferCtxt>::super_combine_consts::<rustc_infer::infer::equate::Equate>
4: rustc_middle::ty::relate::super_relate_consts::<rustc_infer::infer::equate::Equate>
5: <rustc_infer::infer::InferCtxt>::super_combine_consts::<rustc_infer::infer::equate::Equate>
6: <rustc_infer::infer::InferCtxt>::commit_if_ok::<rustc_infer::infer::InferOk<()>, rustc_middle::ty::error::TypeError, <rustc_infer::infer::at::Trace>::eq<rustc_middle::ty::consts::Const>::{closure#0}>
7: <rustc_infer::infer::at::At>::eq::<rustc_middle::ty::consts::Const>
8: <rustc_trait_selection::traits::engine::ObligationCtxt>::eq::<rustc_middle::ty::consts::Const>
9: <rustc_infer::infer::InferCtxt>::probe::<bool, <rustc_trait_selection::traits::const_evaluatable::satisfied_from_param_env::Visitor as rustc_middle::ty::visit::TypeVisitor>::visit_const::{closure#0}>
10: <rustc_trait_selection::traits::const_evaluatable::satisfied_from_param_env::Visitor as rustc_middle::ty::visit::TypeVisitor>::visit_const
11: rustc_trait_selection::traits::const_evaluatable::satisfied_from_param_env
12: rustc_trait_selection::traits::const_evaluatable::is_const_evaluatable
13: <rustc_trait_selection::traits::fulfill::FulfillProcessor as rustc_data_structures::obligation_forest::ObligationProcessor>::process_obligation
14: <rustc_data_structures::obligation_forest::ObligationForest<rustc_trait_selection::traits::fulfill::PendingPredicateObligation>>::process_obligations::<rustc_trait_selection::traits::fulfill::FulfillProcessor>
15: <rustc_trait_selection::traits::fulfill::FulfillmentContext as rustc_infer::traits::engine::TraitEngine>::select_where_possible
16: <rustc_hir_typeck::fn_ctxt::FnCtxt>::expected_inputs_for_expected_output
17: <rustc_hir_typeck::fn_ctxt::FnCtxt>::check_expr_struct_fields
18: <rustc_hir_typeck::fn_ctxt::FnCtxt>::check_expr_with_expectation_and_args
19: <rustc_hir_typeck::fn_ctxt::FnCtxt>::check_block_with_expected
20: <rustc_hir_typeck::fn_ctxt::FnCtxt>::check_expr_with_expectation_and_args
21: <rustc_hir_typeck::fn_ctxt::FnCtxt>::check_return_expr
22: rustc_hir_typeck::check::check_fn
23: rustc_hir_typeck::typeck
24: <rustc_query_system::dep_graph::graph::DepGraph<rustc_middle::dep_graph::dep_node::DepKind>>::with_task::<rustc_middle::ty::context::TyCtxt, rustc_span::def_id::LocalDefId, &rustc_middle::ty::typeck_results::TypeckResults>
25: rustc_query_system::query::plumbing::try_execute_query::<rustc_query_impl::queries::typeck, rustc_query_impl::plumbing::QueryCtxt>
26: rustc_data_structures::sync::par_for_each_in::<&[rustc_span::def_id::LocalDefId], <rustc_middle::hir::map::Map>::par_body_owners<rustc_hir_typeck::typeck_item_bodies::{closure#0}>::{closure#0}>
27: rustc_hir_typeck::typeck_item_bodies
28: <rustc_query_system::dep_graph::graph::DepGraph<rustc_middle::dep_graph::dep_node::DepKind>>::with_task::<rustc_middle::ty::context::TyCtxt, (), ()>
29: rustc_query_system::query::plumbing::try_execute_query::<rustc_query_impl::queries::typeck_item_bodies, rustc_query_impl::plumbing::QueryCtxt>
30: <rustc_query_impl::Queries as rustc_middle::ty::query::QueryEngine>::typeck_item_bodies
31: <rustc_session::session::Session>::time::<(), rustc_hir_analysis::check_crate::{closure#7}>
32: rustc_hir_analysis::check_crate
33: rustc_interface::passes::analysis
34: <rustc_query_system::dep_graph::graph::DepGraph<rustc_middle::dep_graph::dep_node::DepKind>>::with_task::<rustc_middle::ty::context::TyCtxt, (), core::result::Result<(), rustc_errors::ErrorGuaranteed>>
35: rustc_query_system::query::plumbing::try_execute_query::<rustc_query_impl::queries::analysis, rustc_query_impl::plumbing::QueryCtxt>
36: <rustc_query_impl::Queries as rustc_middle::ty::query::QueryEngine>::analysis
37: <rustc_interface::passes::QueryContext>::enter::<rustc_driver_impl::run_compiler::{closure#1}::{closure#2}::{closure#3}, core::result::Result<(), rustc_errors::ErrorGuaranteed>>
38: <rustc_interface::interface::Compiler>::enter::<rustc_driver_impl::run_compiler::{closure#1}::{closure#2}, core::result::Result<core::option::Option<rustc_interface::queries::Linker>, rustc_errors::ErrorGuaranteed>>
39: rustc_span::with_source_map::<core::result::Result<(), rustc_errors::ErrorGuaranteed>, rustc_interface::interface::run_compiler<core::result::Result<(), rustc_errors::ErrorGuaranteed>, rustc_driver_impl::run_compiler::{closure#1}>::{closure#0}::{closure#0}>
40: <scoped_tls::ScopedKey<rustc_span::SessionGlobals>>::set::<rustc_interface::interface::run_compiler<core::result::Result<(), rustc_errors::ErrorGuaranteed>, rustc_driver_impl::run_compiler::{closure#1}>::{closure#0}, core::result::Result<(), rustc_errors::ErrorGuaranteed>>
41: std::sys_common::backtrace::__rust_begin_short_backtrace::<rustc_interface::util::run_in_thread_pool_with_globals<rustc_interface::interface::run_compiler<core::result::Result<(), rustc_errors::ErrorGuaranteed>, rustc_driver_impl::run_compiler::{closure#1}>::{closure#0}, core::result::Result<(), rustc_errors::ErrorGuaranteed>>::{closure#0}::{closure#0}, core::result::Result<(), rustc_errors::ErrorGuaranteed>>
42: <<std::thread::Builder>::spawn_unchecked_<rustc_interface::util::run_in_thread_pool_with_globals<rustc_interface::interface::run_compiler<core::result::Result<(), rustc_errors::ErrorGuaranteed>, rustc_driver_impl::run_compiler::{closure#1}>::{closure#0}, core::result::Result<(), rustc_errors::ErrorGuaranteed>>::{closure#0}::{closure#0}, core::result::Result<(), rustc_errors::ErrorGuaranteed>>::{closure#1} as core::ops::function::FnOnce<()>>::call_once::{shim:vtable#0}
43: <alloc::boxed::Box<F,A> as core::ops::function::FnOnce<Args>>::call_once
at /rustc/8996ea93b6e554148c4286e62b613f12a3ee505c/library/alloc/src/boxed.rs:1988:9
44: <alloc::boxed::Box<F,A> as core::ops::function::FnOnce<Args>>::call_once
at /rustc/8996ea93b6e554148c4286e62b613f12a3ee505c/library/alloc/src/boxed.rs:1988:9
45: std::sys::unix::thread::Thread::new::thread_start
at /rustc/8996ea93b6e554148c4286e62b613f12a3ee505c/library/std/src/sys/unix/thread.rs:108:17
46: start_thread
at /build/glibc-SzIz7B/glibc-2.31/nptl/pthread_create.c:477:8
47: clone
at /build/glibc-SzIz7B/glibc-2.31/misc/../sysdeps/unix/sysv/linux/x86_64/clone.S:95
error: internal compiler error: cannot relate constants (Const { ty: fn() -> usize {std::mem::size_of::<T>}, kind: Value(Branch([])) }, Const { ty: fn() -> usize {std::mem::size_of::<_>}, kind: Value(Branch([])) }) of different types: fn() -> usize {std::mem::size_of::<T>} != fn() -> usize {std::mem::size_of::<_>}
|
= note: delayed at 0: <rustc_errors::HandlerInner>::emit_diagnostic
1: <rustc_errors::Handler>::delay_span_bug::<rustc_span::span_encoding::Span, &alloc::string::String>
2: rustc_middle::ty::relate::super_relate_consts::<rustc_infer::infer::equate::Equate>
3: <rustc_infer::infer::InferCtxt>::super_combine_consts::<rustc_infer::infer::equate::Equate>
4: rustc_middle::ty::relate::super_relate_consts::<rustc_infer::infer::equate::Equate>
5: <rustc_infer::infer::InferCtxt>::super_combine_consts::<rustc_infer::infer::equate::Equate>
6: <rustc_infer::infer::InferCtxt>::commit_if_ok::<rustc_infer::infer::InferOk<()>, rustc_middle::ty::error::TypeError, <rustc_infer::infer::at::Trace>::eq<rustc_middle::ty::consts::Const>::{closure#0}>
7: <rustc_infer::infer::at::At>::eq::<rustc_middle::ty::consts::Const>
8: <rustc_trait_selection::traits::engine::ObligationCtxt>::eq::<rustc_middle::ty::consts::Const>
9: rustc_trait_selection::traits::const_evaluatable::satisfied_from_param_env
10: rustc_trait_selection::traits::const_evaluatable::is_const_evaluatable
11: <rustc_trait_selection::traits::fulfill::FulfillProcessor as rustc_data_structures::obligation_forest::ObligationProcessor>::process_obligation
12: <rustc_data_structures::obligation_forest::ObligationForest<rustc_trait_selection::traits::fulfill::PendingPredicateObligation>>::process_obligations::<rustc_trait_selection::traits::fulfill::FulfillProcessor>
13: <rustc_trait_selection::traits::fulfill::FulfillmentContext as rustc_infer::traits::engine::TraitEngine>::select_where_possible
14: <rustc_hir_typeck::fn_ctxt::FnCtxt>::expected_inputs_for_expected_output
15: <rustc_hir_typeck::fn_ctxt::FnCtxt>::check_expr_struct_fields
16: <rustc_hir_typeck::fn_ctxt::FnCtxt>::check_expr_with_expectation_and_args
17: <rustc_hir_typeck::fn_ctxt::FnCtxt>::check_block_with_expected
18: <rustc_hir_typeck::fn_ctxt::FnCtxt>::check_expr_with_expectation_and_args
19: <rustc_hir_typeck::fn_ctxt::FnCtxt>::check_return_expr
20: rustc_hir_typeck::check::check_fn
21: rustc_hir_typeck::typeck
22: <rustc_query_system::dep_graph::graph::DepGraph<rustc_middle::dep_graph::dep_node::DepKind>>::with_task::<rustc_middle::ty::context::TyCtxt, rustc_span::def_id::LocalDefId, &rustc_middle::ty::typeck_results::TypeckResults>
23: rustc_query_system::query::plumbing::try_execute_query::<rustc_query_impl::queries::typeck, rustc_query_impl::plumbing::QueryCtxt>
24: rustc_data_structures::sync::par_for_each_in::<&[rustc_span::def_id::LocalDefId], <rustc_middle::hir::map::Map>::par_body_owners<rustc_hir_typeck::typeck_item_bodies::{closure#0}>::{closure#0}>
25: rustc_hir_typeck::typeck_item_bodies
26: <rustc_query_system::dep_graph::graph::DepGraph<rustc_middle::dep_graph::dep_node::DepKind>>::with_task::<rustc_middle::ty::context::TyCtxt, (), ()>
27: rustc_query_system::query::plumbing::try_execute_query::<rustc_query_impl::queries::typeck_item_bodies, rustc_query_impl::plumbing::QueryCtxt>
28: <rustc_query_impl::Queries as rustc_middle::ty::query::QueryEngine>::typeck_item_bodies
29: <rustc_session::session::Session>::time::<(), rustc_hir_analysis::check_crate::{closure#7}>
30: rustc_hir_analysis::check_crate
31: rustc_interface::passes::analysis
32: <rustc_query_system::dep_graph::graph::DepGraph<rustc_middle::dep_graph::dep_node::DepKind>>::with_task::<rustc_middle::ty::context::TyCtxt, (), core::result::Result<(), rustc_errors::ErrorGuaranteed>>
33: rustc_query_system::query::plumbing::try_execute_query::<rustc_query_impl::queries::analysis, rustc_query_impl::plumbing::QueryCtxt>
34: <rustc_query_impl::Queries as rustc_middle::ty::query::QueryEngine>::analysis
35: <rustc_interface::passes::QueryContext>::enter::<rustc_driver_impl::run_compiler::{closure#1}::{closure#2}::{closure#3}, core::result::Result<(), rustc_errors::ErrorGuaranteed>>
36: <rustc_interface::interface::Compiler>::enter::<rustc_driver_impl::run_compiler::{closure#1}::{closure#2}, core::result::Result<core::option::Option<rustc_interface::queries::Linker>, rustc_errors::ErrorGuaranteed>>
37: rustc_span::with_source_map::<core::result::Result<(), rustc_errors::ErrorGuaranteed>, rustc_interface::interface::run_compiler<core::result::Result<(), rustc_errors::ErrorGuaranteed>, rustc_driver_impl::run_compiler::{closure#1}>::{closure#0}::{closure#0}>
38: <scoped_tls::ScopedKey<rustc_span::SessionGlobals>>::set::<rustc_interface::interface::run_compiler<core::result::Result<(), rustc_errors::ErrorGuaranteed>, rustc_driver_impl::run_compiler::{closure#1}>::{closure#0}, core::result::Result<(), rustc_errors::ErrorGuaranteed>>
39: std::sys_common::backtrace::__rust_begin_short_backtrace::<rustc_interface::util::run_in_thread_pool_with_globals<rustc_interface::interface::run_compiler<core::result::Result<(), rustc_errors::ErrorGuaranteed>, rustc_driver_impl::run_compiler::{closure#1}>::{closure#0}, core::result::Result<(), rustc_errors::ErrorGuaranteed>>::{closure#0}::{closure#0}, core::result::Result<(), rustc_errors::ErrorGuaranteed>>
40: <<std::thread::Builder>::spawn_unchecked_<rustc_interface::util::run_in_thread_pool_with_globals<rustc_interface::interface::run_compiler<core::result::Result<(), rustc_errors::ErrorGuaranteed>, rustc_driver_impl::run_compiler::{closure#1}>::{closure#0}, core::result::Result<(), rustc_errors::ErrorGuaranteed>>::{closure#0}::{closure#0}, core::result::Result<(), rustc_errors::ErrorGuaranteed>>::{closure#1} as core::ops::function::FnOnce<()>>::call_once::{shim:vtable#0}
41: <alloc::boxed::Box<F,A> as core::ops::function::FnOnce<Args>>::call_once
at /rustc/8996ea93b6e554148c4286e62b613f12a3ee505c/library/alloc/src/boxed.rs:1988:9
42: <alloc::boxed::Box<F,A> as core::ops::function::FnOnce<Args>>::call_once
at /rustc/8996ea93b6e554148c4286e62b613f12a3ee505c/library/alloc/src/boxed.rs:1988:9
43: std::sys::unix::thread::Thread::new::thread_start
at /rustc/8996ea93b6e554148c4286e62b613f12a3ee505c/library/std/src/sys/unix/thread.rs:108:17
44: start_thread
at /build/glibc-SzIz7B/glibc-2.31/nptl/pthread_create.c:477:8
45: clone
at /build/glibc-SzIz7B/glibc-2.31/misc/../sysdeps/unix/sysv/linux/x86_64/clone.S:95
note: we would appreciate a bug report: https://github.com/rust-lang/rust/issues/new?labels=C-bug%2C+I-ICE%2C+T-compiler&template=ice.md
note: rustc 1.69.0-nightly (8996ea93b 2023-02-09) running on x86_64-unknown-linux-gnu
note: compiler flags: --crate-type lib -C embed-bitcode=no -C debuginfo=2 -C incremental=[REDACTED]
note: some of the compiler flags provided by cargo are hidden
query stack during panic:
end of query stack
warning: `blart` (lib) generated 2 warnings
error: could not compile `blart`; 2 warnings emitted
Originally posted by @declanvk in #16 (comment)
OpaqueNotePtr<V>
type.unsafe
blocks and fulfilling some of those requirements.BTreeMap
and BTreeSet
types from the standard library.BTreeMap
and BTreeSet
and update the types and parameters.BTreeMap
API Outlineimpl<K, V> BTreeMap<K, V> {
pub fn new() -> BTreeMap<K, V>;
pub fn clear(&mut self);
pub fn get<Q>(&self, key: &Q) -> Option<&V> where
K: Borrow<Q> + Ord,
Q: Ord + ?Sized;
pub fn get_key_value<Q>(&self, k: &Q) -> Option<(&K, &V)> where
K: Borrow<Q> + Ord,
Q: Ord + ?Sized;
pub fn first_key_value(&self) -> Option<(&K, &V)> where
K: Ord;
pub fn first_entry(&mut self) -> Option<OccupiedEntry<'_, K, V>> where
K: Ord;
pub fn pop_first(&mut self) -> Option<(K, V)> where
K: Ord;
pub fn last_key_value(&self) -> Option<(&K, &V)> where
K: Ord;
pub fn last_entry(&mut self) -> Option<OccupiedEntry<'_, K, V>> where
K: Ord;
pub fn pop_last(&mut self) -> Option<(K, V)> where
K: Ord;
pub fn contains_key<Q>(&self, key: &Q) -> bool where
K: Borrow<Q> + Ord,
Q: Ord + ?Sized;
pub fn get_mut<Q>(&mut self, key: &Q) -> Option<&mut V> where
K: Borrow<Q> + Ord,
Q: Ord + ?Sized;
pub fn insert(&mut self, key: K, value: V) -> Option<V> where
K: Ord;
pub fn try_insert(
&mut self,
key: K,
value: V
) -> Result<&mut V, OccupiedError<'_, K, V>> where
K: Ord;
pub fn remove<Q>(&mut self, key: &Q) -> Option<V> where
K: Borrow<Q> + Ord,
Q: Ord + ?Sized;
pub fn remove_entry<Q>(&mut self, key: &Q) -> Option<(K, V)> where
K: Borrow<Q> + Ord,
Q: Ord + ?Sized;
pub fn retain<F>(&mut self, f: F) where
K: Ord,
F: FnMut(&K, &mut V) -> bool;
pub fn append(&mut self, other: &mut BTreeMap<K, V>) where
K: Ord;
pub fn range<T, R>(&self, range: R) -> Range<'_, K, V> where
T: Ord + ?Sized,
K: Borrow<T> + Ord,
R: RangeBounds<T>;
pub fn range_mut<T, R>(&mut self, range: R) -> RangeMut<'_, K, V> where
T: Ord + ?Sized,
K: Borrow<T> + Ord,
R: RangeBounds<T>;
pub fn entry(&mut self, key: K) -> Entry<'_, K, V> where
K: Ord;
pub fn split_off<Q>(&mut self, key: &Q) -> BTreeMap<K, V> where
Q: Ord + ?Sized,
K: Borrow<Q> + Ord;
pub fn drain_filter<F>(&mut self, pred: F) -> DrainFilter<'_, K, V, F> where
K: Ord,
F: FnMut(&K, &mut V) -> bool;
pub fn into_keys(self) -> IntoKeys<K, V>;
pub fn into_values(self) -> IntoValues<K, V>;
pub fn iter(&self) -> Iter<'_, K, V>;
pub fn iter_mut(&mut self) -> IterMut<'_, K, V>;
pub fn keys(&self) -> Keys<'_, K, V>;
pub fn values(&self) -> Values<'_, K, V>;
pub fn values_mut(&mut self) -> ValuesMut<'_, K, V>;
pub fn len(&self) -> usize;
pub fn is_empty(&self) -> bool;
}
Like the TreeMap
corresponding to BTreeMap
, TreeSet
should copy BTreeSet
where possible.
https://doc.rust-lang.org/stable/std/collections/struct.BTreeSet.html
impl<T> BTreeSet<T> {
pub const fn new() -> BTreeSet<T>;
pub fn range<K: ?Sized, R>(&self, range: R) -> Range<'_, T>
where
K: Ord,
T: Borrow<K> + Ord,
R: RangeBounds<K>;
pub fn difference<'a>(&'a self, other: &'a BTreeSet<T, A>) -> Difference<'a, T, A>
where
T: Ord;
pub fn symmetric_difference<'a>(
&'a self,
other: &'a BTreeSet<T, A>,
) -> SymmetricDifference<'a, T>
where
T: Ord;
pub fn intersection<'a>(&'a self, other: &'a BTreeSet<T, A>) -> Intersection<'a, T, A>
where
T: Ord;
pub fn union<'a>(&'a self, other: &'a BTreeSet<T, A>) -> Union<'a, T>
where
T: Ord;
pub fn clear(&mut self)
where
A: Clone;
pub fn contains<Q: ?Sized>(&self, value: &Q) -> bool
where
T: Borrow<Q> + Ord,
Q: Ord;
pub fn get<Q: ?Sized>(&self, value: &Q) -> Option<&T>
where
T: Borrow<Q> + Ord,
Q: Ord;
pub fn is_disjoint(&self, other: &BTreeSet<T, A>) -> bool
where
T: Ord;
pub fn is_subset(&self, other: &BTreeSet<T, A>) -> bool
where
T: Ord;
pub fn is_superset(&self, other: &BTreeSet<T, A>) -> bool
where
T: Ord;
pub fn first(&self) -> Option<&T>
where
T: Ord;
pub fn last(&self) -> Option<&T>
where
T: Ord;
pub fn pop_first(&mut self) -> Option<T>
where
T: Ord;
pub fn pop_last(&mut self) -> Option<T>
where
T: Ord;
pub fn insert(&mut self, value: T) -> bool
where
T: Ord;
pub fn replace(&mut self, value: T) -> Option<T>
where
T: Ord;
pub fn remove<Q: ?Sized>(&mut self, value: &Q) -> bool
where
T: Borrow<Q> + Ord,
Q: Ord;
pub fn take<Q: ?Sized>(&mut self, value: &Q) -> Option<T>
where
T: Borrow<Q> + Ord,
Q: Ord;
pub fn retain<F>(&mut self, mut f: F)
where
T: Ord,
F: FnMut(&T) -> bool;
pub fn append(&mut self, other: &mut Self)
where
T: Ord,
A: Clone;
pub fn split_off<Q: ?Sized + Ord>(&mut self, value: &Q) -> Self
where
T: Borrow<Q> + Ord,
A: Clone;
pub fn drain_filter<'a, F>(&'a mut self, pred: F) -> DrainFilter<'a, T, F, A>
where
T: Ord,
F: 'a + FnMut(&T) -> bool;
pub fn iter(&self) -> Iter<'_, T>;
pub const fn len(&self) -> usize;
pub const fn is_empty(&self) -> bool;
}
Currently the tree only accepts Box<[u8]>
as keys. I should implement support for different keys types, so long as they can be temporarily transformed to [u8]
of some sort.
My initial idea has the general sketch traits:
trait Key {
fn view_bytes<O, F: FnMut(&[u8]) -> O>(&self, f: F) -> O;
}
/// When the lexicographic ordering of the [`Key`] byte-string matches the ordering of the underlying type.
unsafe trait OrderedKey: Key {}
/// When the type is guaranteed to never produce a [`Key`] byte-string that is the prefix of another [`Key`] or vice-versa.
unsafe trait NoPrefixesKey: Key {}
Example of use for u64
:
impl Key for NativeEndian<u64> {
fn view_bytes<O, F: FnMut(&[u8]) -> O>(&self, mut f: F) -> O {
let bytes = self.0.to_ne_bytes();
(f)(bytes.as_ref())
}
}
unsafe impl NoPrefixesKey for NativeEndian<u64> {}
impl Key for BigEndian<u64> {
fn view_bytes<O, F: FnMut(&[u8]) -> O>(&self, mut f: F) -> O {
let bytes = self.0.to_be_bytes();
(f)(bytes.as_ref())
}
}
unsafe impl OrderedKey for BigEndian<u64> {}
unsafe impl NoPrefixesKey for BigEndian<u64> {}
impl Key for LittleEndian<u64> {
fn view_bytes<O, F: FnMut(&[u8]) -> O>(&self, mut f: F) -> O {
let bytes = self.0.to_le_bytes();
(f)(bytes.as_ref())
}
}
unsafe impl NoPrefixesKey for LittleEndian<u64> {}
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord, Hash)]
#[repr(transparent)]
struct BigEndian<T>(T);
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord, Hash)]
#[repr(transparent)]
struct LittleEndian<T>(T);
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord, Hash)]
#[repr(transparent)]
struct NativeEndian<T>(T);
// hiding some `From` impls...
See full example at https://play.rust-lang.org/?version=stable&mode=debug&edition=2021&gist=0cfdebe7eee884c906ea0607c79e5c07
Ideally I'd like to be able to have the Key
trait just directly return &[u8]
, not go through a closure, but the issue is that some types need to derive a new value for the bytes, not just expose a reference. For integers, the provided methods (to_le_bytes
, etc) produce an owned array of bytes which doesn't quite work.
The uses for the traits:
Key
- this trait is the main one which will be implemented for all keys which can be used with the tree, it allows us to access the component bytes for the type.OrderedKey
- this trait indicates that the lexicographic ordering of the byte-string matches the natural ordering of the type, based on what is used in the PartialOrd
or Ord
implementation.
NoPrefixesKey
- this one is the most important for delivering on a non-failing insert
function. Currently there is only the try_insert
function, which will fail if the given key is a prefix or makes an existing key a prefix of the new key. This trait would indicate that the set of possible keys will have no prefixes.
CStr
)This change would also require changing the overall type variable situation from <V>
to <K,V>
, and the definition of LeafNode
would change from
pub struct LeafNode<V> {
pub value: V,
pub key: Box<[u8]>,
}
to
pub struct LeafNode<K, V> {
pub value: V,
pub key: K,
}
Option<OpaqueNodePtr<V>>
to every node typeThe current implementation of Clone for TreeMap
is:
impl<V: Clone> Clone for TreeMap<V> {
fn clone(&self) -> Self {
self.iter()
.map(|(key, value)| (Box::from(key), value.clone()))
.collect()
}
}
It's possible this could be made more efficient with a custom function that recurses down to leaves and then clones and overwrites each inner node individually. Needs a benchmark to see.
tagged_pointer
module is currently pointer crimes because it converts the input pointer into a NonZeroUsize
and back again when.uint2ptr
conversion.SmallVec<[u8; 8]>
. The SmallVec
struct implements an optimization where it will store the array ([u8; 8]
) inline, and then if the vector grows beyond that capacity it will switch to a heap allocation.Currently the {minimum,maximum}_unchecked
functions have a function signature of (OpaqueNodePtr<V>) -> Option<NodePtr<LeafNode<V>>>
, which seems to indicate that they can fail to find a leaf. However, for a well-formed tree (no loops, all node types have number of children in required range), there will always be a minimum/maximum leaf. Should the _unchecked
functions be asserting this?
Current in the insert_unchecked
function we will unconditionally update the parent node if it was found during search: https://github.com/declanvk/blart/blob/main/src/nodes/operations/insert.rs#L372-L373
Based on experience writing the delete_unchecked
function, we can choose to conditionally write to update the parent node only when the insert function creates a new inner node.
Currently the InnerNodeCompressed
looks for a matching child node by iterating over the key bytes in order:
blart/src/nodes/representation.rs
Lines 566 to 576 in b91d5e4
In the original ART paper, this InnerNodeCompressed
is actually 2 distinct types: InnerNode4
and InnerNode16
. The lookup procedure for InnerNoed4
is the simple loop, while the lookup for InnerNode16
is a SIMD optimized implementation.
I think an easier first optimization would be replacing the lookup with an invocation of memchr::memchr
which has a fancy optimized implementation already.
Later on we could pursue a custom implementation with std::simd
(available on nightly)
A declarative, efficient, and flexible JavaScript library for building user interfaces.
๐ Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.
TypeScript is a superset of JavaScript that compiles to clean JavaScript output.
An Open Source Machine Learning Framework for Everyone
The Web framework for perfectionists with deadlines.
A PHP framework for web artisans
Bring data to life with SVG, Canvas and HTML. ๐๐๐
JavaScript (JS) is a lightweight interpreted programming language with first-class functions.
Some thing interesting about web. New door for the world.
A server is a program made to process requests and deliver data to clients.
Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.
Some thing interesting about visualization, use data art
Some thing interesting about game, make everyone happy.
We are working to build community through open source technology. NB: members must have two-factor auth.
Open source projects and samples from Microsoft.
Google โค๏ธ Open Source for everyone.
Alibaba Open Source for everyone
Data-Driven Documents codes.
China tencent open source team.