Comments (13)
I spent some more time thinking about this in the context of #70, and I wonder if the real solution here is to abstract away the core concurrency primitive from evmap
, and then just have two implementations that share that primitive. The question then becomes what that looks like.
I think what you'd want is a LeftRight
type that is generic over:
T
, the type we're sharing (i.e.,HashMap<K, Values<V>>
forevmap
as it stands today)O
, the oplog type (i.e.,Operation
in today'sevmap
)
LeftRight
would require that T: Absorb<O>
, where:
trait Absorb<O> {
fn apply_first(&mut self, op: &mut O);
fn apply_second(&mut self, op: O);
}
This, I think, should let today's Evmap<K, V>
just be a wrapper around
LeftRight<HashMap<K, Values<V>>, Operation<K, V>>
with an appropriate impl Absorb
block. And then, we can get a single-value implementation without too much issue by wrapping
LeftRight<HashMap<K, V>, SingleOperation<K, V>>
LeftRight
itself would provide a lower-level version of ReadHandle
and WriteHandle
, which is used by the higher level variants that look like the ones we have today. ReadHandle
would be little more than a thing that gives access to &T
, whereas WriteHandle
would mainly have:
impl WriteHandle<O> {
fn refresh(&mut self);
fn append_op(&mut self, op: O);
}
Now, I think implementing all of this would be super neat, and I think it would enable a number of neat use cases like super fast vectors, BTreeMap
s, and whatnots. It would also probably be interesting to implement. Maybe I'll do it in a live stream one of these days! But if someone beats me to it, that'd be cool too!
from left-right.
Here's the lib I used the fork in: https://github.com/comath/grandma
The fork is currently a vestigial library embedded in the code. I haven't had a chance to try the changes you suggested. If they work I'll add them and not fork, but if not I can just fork.
Grandma is currently 3rd near complete rewrite of this library and doesn't yet have a some of the old features that the previous generations had. The first generation used traditional references to build the tree, with RwLocks where needed, and the second used a arena with a hashmap with per bucket locking. The node-cache updates in the old generations absolutely killed the performance. Just querying the tree with this evmap fork is much faster, I haven't tried cache updates yet.
from left-right.
I think some of the changes that landed recently in #48 may help with this. You may find the discussion there interesting, as I think your use-case is somewhat similar to that of @MayaWolf :)
from left-right.
Glad to hear it!
Out of curiosity, how much of a difference did the swap from SmallVec<T>
to T
make? It should be marginal if you only ever have one T
per key, since the SmallVec<T; 1>
should be able to hold one T
without allocating. You definitely cannot safely do in-place updates with either though, since the values are aliased between the maps (unless you are cloning them instead?).
As far as merging goes, I'd prefer finding a way to avoid replicating much of the unsafety. One way to do so might be to provide a thin wrapper type around evmap that has an API that assumes and enforces just one entry per key. And behind the scenes that would always benefit from the one-slot space held by the SmallVec
.
from left-right.
I clone them. I couldn't figure out a way without each update cloning the value, modifying the clone, deleting the old entry and finally inserting the update. The savings is mostly in the update code that is simpler and lets me clone only once at creation.
I was thinking of two solutions to multiple largely redundant unsafe blocks. One is to have a back end with all the unsafe that both maps use. This is probably too hard to get right, but if we did it would be nice.
The other is to have the map have 2 values, a metadata value that's cloned and a normal one that's a SmallVec
value. The base map would have:
pub struct WriteHandle<K, D, V, M = (), S = RandomState>
where
K: Eq + Hash + Clone,
S: BuildHasher + Clone,
V: Eq + Hash + ShallowCopy,
D: Clone + Default,
M: 'static + Clone,
{ ... }
The mono map would then just be a WriteHandle<K, D, ()>
and the multi-value map would be WriteHandle<K, (), V>
. This metadata-enhanced map would probably be the most helpful over all to me personally, I am implementing a hierarchical clustering system that could use both values extensively.
I hadn't thought of having a the monomap be a single-value enforced multimap. I don't think it'd work with a single value in the SmallVec
, though. A pair of values in the SmallVec
could work but we'd have to track which one we should read from.
from left-right.
A back end that holds the data and that both maps update would not be safe, since you'd then still have read-write conflicts on it. You could of course always use a lock, but that's equivalent to having your data type be Arc<Mutex<T>>
.
Having both a cloned and a non-cloned value seems like a very strange design indeed. Are there use-cases where someone would need both? I'm hesitant to have to juggle two different types of values in an already complex data-structure — it makes it much more likely that there'll be bugs.
Ah, yes, I was proposing just using the SmallVec
for a case where you never needed &mut
to a value, but wanted a simpler API (i.e., and API that doesn't expect multi-values for each key).
Ultimately, I don't think extending evmap to supporting &mut
to the values is going to be reasonable. It is far too easy to shoot yourself in the foot with it. Many of the APIs would have to become unsafe, since, for example, if a user runs an accidentally non-deterministic update function, the two maps would become out of sync, and all sorts of things would break! If you want mutable access, my suggestion would be using Arc<RwLock<T>>
as the value type (maybe using parking_lot
). That way, you do per-value locking, and you have the ability to modify the value if needed.
from left-right.
My KNN data structure is a tree where the nodes have several fields that hold cached results and other information, and a list of children. Since the list of children is small, between 1 and 20 ints, I just clone it. The clustering is similar, but with a much bigger child list and a much more complicated metadata to enable some topological data analysis.
I think this use case might be specialized for trees, so if you don't want the added complexity I could fork it and call it evtreemap
or something.
from left-right.
I wonder -- have you benchmarked just replacing the child list instead of updating it in-place? My guess is that it won't be that much slower.
from left-right.
I have not, I ran into issue updating the metadata early and then made the monomap. The child list isn't updated often, so I don't think it'll change the overall performance much.
from left-right.
Might be worth trying that so that you don't have to maintain your own fork :) We could in theory also provide a handy method for "replace by calling a closure", which sounds like it might work well in your case!
from left-right.
That could greatly simplify the code for most. I can make a PR for that.
I forked my mono-map a while ago. There's been a lot of changes since then :D.
from left-right.
Hehe, I believe it. Many of them should make the map much nicer to work with! From memory, some of them were also correctness fixes.
from left-right.
FWIW I would also be interested in this, and also a unique-key (like a normal hashmap instead of a bag) version.
from left-right.
Related Issues (20)
- Support for no_std? HOT 5
- i saw the presentation, will it take a lot of CPU like spinlock while write is waiting? HOT 1
- Why you dont use Crossbeam Epoch?? i have question about this please help HOT 1
- Recoverable errors from fallible operation HOT 5
- Memory bloat when using with async runtime HOT 2
- Possible to shallowcopy a HashMap HOT 1
- #[derive(ShallowCopy)] has prohibitive "T: ShallowCopy" bound HOT 4
- Thread Safety Error on ReadHandler HOT 3
- Why not make `map_ref` of `ReadGuard` public? HOT 1
- Recreating WriteHandle after its dropped HOT 2
- RFC: Hash and Eq requirement for values HOT 4
- Replace Clone bound with Freeze HOT 4
- Possible UB from Box<T> aliasing HOT 7
- data races due to incorrect `Send + Sync` bounds HOT 3
- Relaxation of epoch invariant HOT 2
- rollback HOT 5
- Switch oplog to VecDeque HOT 1
- A potential alternative to epochs, if it works HOT 1
- can DropBehavior have an associated constant instead of a function HOT 11
- Is there a way to get the backing data structure out? HOT 4
Recommend Projects
-
React
A declarative, efficient, and flexible JavaScript library for building user interfaces.
-
Vue.js
🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.
-
Typescript
TypeScript is a superset of JavaScript that compiles to clean JavaScript output.
-
TensorFlow
An Open Source Machine Learning Framework for Everyone
-
Django
The Web framework for perfectionists with deadlines.
-
Laravel
A PHP framework for web artisans
-
D3
Bring data to life with SVG, Canvas and HTML. 📊📈🎉
-
Recommend Topics
-
javascript
JavaScript (JS) is a lightweight interpreted programming language with first-class functions.
-
web
Some thing interesting about web. New door for the world.
-
server
A server is a program made to process requests and deliver data to clients.
-
Machine learning
Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.
-
Visualization
Some thing interesting about visualization, use data art
-
Game
Some thing interesting about game, make everyone happy.
Recommend Org
-
Facebook
We are working to build community through open source technology. NB: members must have two-factor auth.
-
Microsoft
Open source projects and samples from Microsoft.
-
Google
Google ❤️ Open Source for everyone.
-
Alibaba
Alibaba Open Source for everyone
-
D3
Data-Driven Documents codes.
-
Tencent
China tencent open source team.
from left-right.