ryzhyk / arc-interner Goto Github PK
View Code? Open in Web Editor NEWA concurrent interner that deallocates unused values
License: MIT License
A concurrent interner that deallocates unused values
License: MIT License
With the move to DashMap, lock contention is reduced, but there is still a global lock acquired via the state::Container that in my use-case causes contention on the try_get. Try_get calls the internal lock method when the Container is not frozen which spins so it uses lots of cpu when contended.
Container has a freeze() method to reduce the lock contention once its fully populated, but with current model I can't see a way to call that as there could always be a new type T coming to ArcIntern::new().
Currently we're effectively using the state::Container as a kind of shared generic static between all interned types. Looking at alternatives to state::Container, they are effectively also a shared HashMap<TypeId, _> that already wrap in a mutex internally (e.g. generic_static crate), or we'd need to wrap in a mutex (e.g. anymap crate) . Macro approaches to generate per-type statics inside a library crate also appear to be unappealing
What would you think about an approach where the crate moves to a shared-nothing approach between types, so that interning type Foo and type Bar don't need to acquire the same locks at all? This could potentially eliminate the state::Container and its lock entirely. This could be done by introducing a factory type (which would be an api change).
Current non-factory pseudo-code
// setup - could just use ArcIntern directly, but have this wrapper so can try different interning approaches.
struct Wrapped<T>(ArcIntern<T>)
impl<T> Wrapped {
pub fn new(t: T) Self {
Self(ArcIntern::new(t))
}
}
// Also implements Deref
// callsite
let a = Wrapped::new(Foo::new())
let b = Wrapped::new(Bar::new())
vs possible alternative factory code where per-type factories are separate statics containing their own DashMap<T,()> and therefore share no state.
// setup
static FOO_INTERNER= OnceCell<ArcInterner<Foo>> = OnceCell::new();
static BAR_INTERNER= OnceCell<ArcInterner<Bar>> = OnceCell::new();
struct Wrapped<T>(ArcIntern<T>)
impl Wrapped<Foo> {
pub fn new(t: Foo) Self {
Self(FOO_INTERNER::intern(t))
}
}
impl Wrapped<Bar> {
pub fn new(t: Bar) Self {
Self(BAR_INTERNER::intern(t))
}
}
...
// callsite
let a = Wrapped::new(Foo::new())
let b = Wrapped::new(Bar::new())
ArcInterner would likely need to pass a function as a type parameter to ArcIntern to handle the cleanup on drop, and ArcIntern::new would become internal only so that construction is always via ArcInterner::intern .
Even though only ArcIntern
is flawed in internment
, using its other APIs carries the risk that later on someone may start using ArcIntern
without realizing its problems.
This crate should also export Intern
and LocalIntern
so that it can be used as a complete functional replacement.
As the API currently stands, it looks like the only way to initialize an ArcIntern is via new
which takes a val: T
. Would it be possible to instead make new
take &T
, and require T: Clone
? That way, no additional copy would be necessary when the value being instantiated is already interned.
I have a gist for how this might work under the hood (looking up in the map for the appropriate value and only copying the Vec
if necessary): https://play.rust-lang.org/?version=stable&mode=debug&edition=2018&gist=b380b64e7ef57499d612a0114e94170a
What do you think?
We're thinking of replacing all Arc
instances with ArcIntern
, however, the T
we're working with here is ?Sized
.
Next to ArcIntern
opting out of that trait, we need From<Box<T>>
and From<&T>
for ArcIntern
, to properly convert some instances of handling these types.
1.45 was stabilized: https://blog.rust-lang.org/2020/07/16/Rust-1.45.0.html
Which included Arc::as_ptr
: https://doc.rust-lang.org/stable/std/sync/struct.Arc.html#method.as_ptr
I believe the use of this would remove the need for the unsafe line added in #5
Since ArcIntern<T>
provides Borrow<T>
, the Hash implementations have to be the same. However, they're not.
This caused some problems when coercing ArcIntern
to Borrow
, and HashMaps
stopped working correctly, as the given hashmap (before the coercion) used ArcIntern
's hash functions, while after coercion, the hashmaps were looked up via T
's Hash
via Borrow<T>
, which caused wrong results.
A declarative, efficient, and flexible JavaScript library for building user interfaces.
๐ Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.
TypeScript is a superset of JavaScript that compiles to clean JavaScript output.
An Open Source Machine Learning Framework for Everyone
The Web framework for perfectionists with deadlines.
A PHP framework for web artisans
Bring data to life with SVG, Canvas and HTML. ๐๐๐
JavaScript (JS) is a lightweight interpreted programming language with first-class functions.
Some thing interesting about web. New door for the world.
A server is a program made to process requests and deliver data to clients.
Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.
Some thing interesting about visualization, use data art
Some thing interesting about game, make everyone happy.
We are working to build community through open source technology. NB: members must have two-factor auth.
Open source projects and samples from Microsoft.
Google โค๏ธ Open Source for everyone.
Alibaba Open Source for everyone
Data-Driven Documents codes.
China tencent open source team.