I've been thinking quite a bit about the RFC published a few days ago. While implementing the RFC, a few issues with `'static` bounds came up that @Vtec234 correctly pointed out in this comment.
Somewhat relatedly, I also gave some thought to dynamically sized types. Note that we can't, for example, create an `Owned<[T]>` and then store it into an `Atomic<[T]>`. The problem is that atomic pointers would have to be two words wide (fat pointers), while the `AtomicPtr` powering `Atomic<_>` behind the scenes is just a single-word pointer (thin pointer).

If we want to have an atomic pointer to a DST, we have to embed the size (or length) inside the heap-allocated object to make the pointer thin. See for example thin-vec, which is a `Vec` that takes up just 1 word (instead of 3 words).
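To illustrate the idea, here's a minimal sketch of such a thin DST (the type and method names here are hypothetical, purely for illustration):

```rust
// A dynamically sized object addressable through a thin (single-word)
// pointer: the length lives inside the allocation itself.
#[repr(C)]
struct ThinSlice<T> {
    len: usize,   // Stored in the heap-allocated object...
    data: [T; 0], // ...immediately followed by `len` elements of `T`.
}

impl<T> ThinSlice<T> {
    // Computes the address of the `i`-th element from a thin pointer alone.
    unsafe fn elem(ptr: *mut Self, i: usize) -> *mut T {
        debug_assert!(i < (*ptr).len);
        (*ptr).data.as_mut_ptr().add(i)
    }
}
```

This is exactly the layout the `Array<T>` example near the end of this post uses.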
## Motivation
The `Scope::defer` method we have is quite nice:
```rust
impl Scope {
    fn defer<F: FnOnce() + Send + 'static>(&self, f: F);
}
```
The bounds on `F` are self-explanatory: this is a function that will be executed once, perhaps by another thread, after an arbitrary amount of time. This is a nice piece of documentation encoded in types, but I worry that every call to the `defer` method will actually try to cheat around the bounds.
## Example: stack
To see what I mean, consider the following scenario. We have a typical stack defined like this:
```rust
struct Node<T> {
    value: ManuallyDrop<T>,
    next: Atomic<Node<T>>,
}

struct Stack<T> {
    head: Atomic<Node<T>>,
}

unsafe impl<T: Send> Send for Stack<T> {}
unsafe impl<T: Send> Sync for Stack<T> {}
```
Now suppose that we have popped a node (`p: Ptr<'scope, Node<T>>`) from the top of the stack and need to destroy it. We'd typically do that using `scope.defer_free`, but let's forget about it today and try destructing by calling `scope.defer` instead:
```rust
scope.defer(|| drop(Box::from_raw(p.as_raw())));
```
But this doesn't compile, for two reasons:
- `p: 'static` is false because `p` holds the lifetime `'scope`.
- `Node<T>: Send` is false because `T: Send` is false.
To work around the bounds `F: Send` and `F: 'static`, let's cheat:
```rust
let raw = p.as_raw() as usize;
scope.defer(move || unsafe { drop(Box::from_raw(raw as *mut Node<T>)) });
```
## Example: skiplist
Here's another example. In a skiplist we have dynamically sized nodes:
```rust
#[repr(C)]
struct Node<K: 'static, V> {
    value: ManuallyDrop<V>,
    key: ManuallyDrop<K>,
    refcnt: AtomicUsize,
    height: usize,
    pointers: [Atomic<Node<K, V>>; 0], // This array is of length `height`.
}
```
To allocate and deallocate such nodes, we implement our custom functions:
```rust
impl<K: 'static, V> Node<K, V> {
    // Allocates `size_of::<Self>() + size_of::<Atomic<Node<K, V>>>() * height` bytes.
    unsafe fn allocate(height: usize) -> *mut Self {
        // ...
    }

    // Reads `(*raw).height` first to compute the size in bytes, then deallocates the object.
    unsafe fn deallocate(raw: *mut Self) {
        // ...
    }
}
```
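The snippets below also call a `tower` method that this post doesn't spell out; a minimal sketch of it (an assumption on my part, matching the layout above) could look like this:

```rust
impl<K: 'static, V> Node<K, V> {
    // Hypothetical sketch: the tower occupies the memory right past the
    // struct, so we index into the zero-length `pointers` array by offset.
    fn tower(&self, i: usize) -> &Atomic<Node<K, V>> {
        debug_assert!(i < self.height);
        unsafe { &*self.pointers.as_ptr().add(i) }
    }
}
```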
Now think about how we would go about inserting a node into the skiplist:
```rust
fn insert(&self, key: K, value: V) -> bool {
    epoch::pin(|scope| unsafe {
        let height = self.random_height();
        let raw = Node::allocate(height); // Custom allocation.

        // Fill in the blanks.
        ptr::write(&mut (*raw).key, ManuallyDrop::new(key));
        ptr::write(&mut (*raw).value, ManuallyDrop::new(value));
        ptr::write(&mut (*raw).height, height);

        let mut node = Owned::from_raw(raw);

        loop {
            let (found, left, right) = self.search(&(*raw).key);

            if found {
                return false; // DANGER: `node` will be incorrectly deallocated!
            }

            node.deref().tower(0).store(right[0], SeqCst);

            if let Err((_, n)) = left[0].tower(0).compare_and_set_owned(right[0], node, SeqCst, scope) {
                node = n;
            } else {
                break;
            }
        }

        // TODO: Continue building the tower...
        true
    })
}
```
A glaring problem here is that, since we're using custom allocation and deallocation, `Owned` has become unusable. We cannot create an `Owned` from our allocated raw pointer and then just drop it if insertion fails -- deallocation must happen through `Node::deallocate`!
A correct way to insert would be:
```rust
fn insert(&self, key: K, value: V) -> bool {
    epoch::pin(|scope| unsafe {
        let height = self.random_height();
        let raw = Node::allocate(height); // Custom allocation.

        // Fill in the blanks.
        ptr::write(&mut (*raw).key, ManuallyDrop::new(key));
        ptr::write(&mut (*raw).value, ManuallyDrop::new(value));
        ptr::write(&mut (*raw).height, height);

        let node = Ptr::from_raw(raw);

        loop {
            let (found, left, right) = self.search(&(*raw).key);

            if found {
                // Custom drop and deallocation.
                node.deref().key.manually_drop();
                node.deref().value.manually_drop();
                Node::deallocate(node.as_raw() as *mut _);
                return false;
            }

            node.deref().tower(0).store(right[0], SeqCst);

            if left[0].tower(0).compare_and_set(right[0], node, SeqCst, scope).is_ok() {
                break;
            }
        }

        // TODO: Continue building the tower...
        true
    })
}
```
This is a bit unfortunate because now we can't rely on `Owned` anymore. Instead, we're fiddling with an unsafe conversion to `Ptr` and then manually destructing the node where the destructor of `Owned` would otherwise be executed.

Moving on: let's turn to deferred destruction of nodes. Suppose that we've just removed a `node: Ptr<'scope, Node<K, V>>` and want to destroy it:
```rust
node.deref().value.manually_drop();

scope.defer(|| {
    node.deref().key.manually_drop();
    Node::deallocate(node.as_raw() as *mut _);
});
```
This is a curious case: why drop the value now and drop the key later? The reason is that we'll implement the skiplist so that other threads have to increment the refcount before using the value, but only if the node is not yet deleted (i.e. the refcount is positive). A node is removed only when the refcount becomes zero, and then we know no one has a chance of accessing the value anymore.
Disclaimer: We don't have to use refcounts, but for the sake of this text, please just accept that the skiplist is implemented this way. :)
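To make that protocol concrete, the acquire step could look roughly like this (a hypothetical helper, not part of the proposal): increment `refcnt` only while it's still positive.

```rust
impl<K: 'static, V> Node<K, V> {
    // Sketch: returns `true` if we incremented `refcnt` while it was still
    // positive, i.e. before the node was deleted. Only then may we touch
    // `value`.
    fn try_acquire(&self) -> bool {
        let mut refcnt = self.refcnt.load(SeqCst);
        while refcnt > 0 {
            match self.refcnt.compare_exchange_weak(refcnt, refcnt + 1, SeqCst, SeqCst) {
                Ok(_) => return true,
                Err(current) => refcnt = current,
            }
        }
        false // Already deleted: `value` is off-limits.
    }
}
```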
Anyways... As before, this doesn't compile. Let's cheat around `F: Send` and `F: 'static`:
```rust
node.deref().value.manually_drop();
let node = node.as_raw() as usize;

scope.defer(move || unsafe {
    let node = node as *mut Node<K, V>;
    (*node).key.manually_drop();
    Node::deallocate(node);
});
```
## Problem summary
The key takeaway I want to draw from this story is: even though `defer`'s bounds `F: Send` and `F: 'static` are quite nice, what is the point if we have to circumvent them every time?

I would argue that `defer` is inherently very unsafe -- we have to guarantee several things:
- That the node just became unreachable from the rest of the data structure.
- That it's okay for another thread to execute the deferred destructor (but this is naturally true for all concurrent data structures).
- That we won't touch non-`'static` data in the deferred destructor (e.g. that would be `value` in the case of skiplists).
When deferring destruction, we're performing the following steps:
- Grab a few local variables (`Ptr`s, which are not `'static`; or raw pointers, which are not `Send`).
- Unsafely move them into a closure.
- Unsafely drop and destroy some data in the closure.
We're already acutely aware of all the requirements for correct destruction, so the type bounds `F: Send` and `F: 'static` are not very helpful -- they're just getting in the way.
The other problem is that `Owned` does not support custom destructors. Sometimes it's useful to manually allocate and deallocate objects (actually, I'd say quite often: in deques, skiplists, Bw-Trees, even segmented queues).
## Proposed solution
Think about `Owned::drop` and `Scope::defer_drop` in the current API. There is a strong connection between these two methods: they should do exactly the same thing, but differ in when they do it. The former destructs immediately, and the latter destructs at a later time. They're like two sides of the same coin.
To support custom destructors, let's create the following trait with a default implementation (note that we're using the unstable specialization feature):
```rust
pub trait Destroy {
    unsafe fn destroy(ptr: *mut Self);
}

impl<T> Destroy for T {
    default unsafe fn destroy(ptr: *mut Self) {
        drop(Box::from_raw(ptr));
    }
}
```
Next, implement `Drop` for `Owned` like this:
```rust
impl<T: Destroy> Drop for Owned<T> {
    fn drop(&mut self) {
        unsafe { T::destroy((self.data & !low_bits::<T>()) as *mut T) }
    }
}
```
Let's also add a similar method to `Ptr`:
```rust
impl<'scope, T: Destroy> Ptr<'scope, T> {
    pub unsafe fn destroy(self) {
        T::destroy(self.as_raw() as *mut T);
    }
}
```
Now instead of doing this:
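```rust
// The cheat from the stack example above.
let raw = p.as_raw() as usize;
scope.defer(move || unsafe { drop(Box::from_raw(raw as *mut Node<T>)) });
```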
we can do this:
```rust
scope.defer(|| p.destroy());
```
The only thing left to do is to relax the bounds on `Scope::defer` a bit by removing `Send + 'static` and making it `unsafe`:
```rust
impl Scope {
    unsafe fn defer<F: FnOnce()>(&self, f: F);
}
```
This makes the method `Scope::defer_drop` redundant -- of the original `defer_free`/`defer_drop`/`defer` trio, only `defer` survives. :)
## Example: stack
Now let's see how we would destroy a popped node from a stack:
```rust
scope.defer(|| p.destroy());
```
This is not much less ergonomic than calling `defer_drop`!
## Example: skiplist
Let's implement `Destroy` for `Node`:
```rust
impl<K: 'static, V> Destroy for Node<K, V> {
    unsafe fn destroy(ptr: *mut Self) {
        (*ptr).key.manually_drop();
        Self::deallocate(ptr);
    }
}
```
Since we now have support for custom destructors in `Owned`, we can leverage it to insert nodes into the skiplist:
```rust
fn insert(&self, key: K, value: V) -> bool {
    epoch::pin(|scope| unsafe {
        let height = self.random_height();
        let raw = Node::allocate(height); // Custom allocation.

        // Fill in the blanks.
        ptr::write(&mut (*raw).key, ManuallyDrop::new(key));
        ptr::write(&mut (*raw).value, ManuallyDrop::new(value));
        ptr::write(&mut (*raw).height, height);

        let mut node = Owned::from_raw(raw); // Nice!

        loop {
            let (found, left, right) = self.search(&(*raw).key);

            if found {
                return false; // Not dangerous anymore - awesome!
            }

            node.deref().tower(0).store(right[0], SeqCst);

            if let Err((_, n)) = left[0].tower(0).compare_and_set_owned(right[0], node, SeqCst, scope) {
                node = n;
            } else {
                break;
            }
        }

        // TODO: Continue building the tower...
        true
    })
}
```
To clean up the initialization of `Node`, we could take a step further and implement:
```rust
impl<K: 'static, V> Node<K, V> {
    fn new(key: K, value: V, height: usize) -> Owned<Node<K, V>> {
        // ...
    }
}
```
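The body is elided above, but a plausible sketch of it (an assumption on my part, reusing `allocate` from before) would be:

```rust
impl<K: 'static, V> Node<K, V> {
    fn new(key: K, value: V, height: usize) -> Owned<Node<K, V>> {
        unsafe {
            // Allocate, fill in the fields, and hand ownership to `Owned`,
            // whose destructor now runs through `Destroy::destroy`.
            // (`refcnt` initialization is omitted, as in the snippets above.)
            let raw = Self::allocate(height);
            ptr::write(&mut (*raw).key, ManuallyDrop::new(key));
            ptr::write(&mut (*raw).value, ManuallyDrop::new(value));
            ptr::write(&mut (*raw).height, height);
            Owned::from_raw(raw)
        }
    }
}
```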
Finally, to destroy a removed node from the skiplist, we can just do:
```rust
node.deref().value.manually_drop();
scope.defer(|| node.destroy());
```
Nice and clean.
## Example: dynamically-sized arrays
Here's one more example. Let's implement a dynamically sized array that can be used as `Atomic<Array<T>>`:
```rust
#![feature(specialization)]

extern crate crossbeam_epoch as epoch;
extern crate stable_heap;

use std::mem;
use std::ops::{Index, IndexMut};
use std::ptr;

use epoch::{Destroy, Owned};

#[repr(C)]
struct Array<T> {
    len: usize,
    arr: [T; 0],
}

impl<T> Array<T> {
    fn size_and_align(len: usize) -> (usize, usize) {
        let size = mem::size_of::<Self>() + len * mem::size_of::<T>();
        let align = mem::align_of::<Self>();
        (size, align)
    }

    fn new(len: usize) -> Owned<Self>
    where
        T: Default,
    {
        let (size, align) = Self::size_and_align(len);
        unsafe {
            let mut a = Owned::from_raw(stable_heap::allocate(size, align) as *mut Self);
            a.len = len;
            for i in 0..len {
                ptr::write(a.arr.get_unchecked_mut(i), T::default());
            }
            a
        }
    }

    fn len(&self) -> usize {
        self.len
    }
}

impl<T> Destroy for Array<T> {
    unsafe fn destroy(raw: *mut Self) {
        let (size, align) = Self::size_and_align((*raw).len);
        stable_heap::deallocate(raw as *mut u8, size, align);
    }
}

impl<T> Index<usize> for Array<T> {
    type Output = T;

    fn index(&self, index: usize) -> &T {
        assert!(index < self.len);
        unsafe { self.arr.get_unchecked(index) }
    }
}

impl<T> IndexMut<usize> for Array<T> {
    fn index_mut(&mut self, index: usize) -> &mut T {
        assert!(index < self.len);
        unsafe { self.arr.get_unchecked_mut(index) }
    }
}

fn main() {
    // Let's create an `Owned` array of length 5.
    let mut a = Array::<usize>::new(5);
    for i in 0..5 {
        a[i] = i;
    }
    assert!(a.len() == 5);

    unsafe {
        epoch::unprotected(|scope| {
            // We can convert it into a `Ptr`, if we want...
            let p = a.into_ptr(scope);
            // And destroy it!
            p.destroy();
        });
    }
}
```
In the Chase-Lev deque we have `Buffer<T>`, which simply wraps a `Vec<T>`, creating an unnecessary extra level of indirection.
To access an element from the deque, we have to:
- Get a reference to the heap-allocated `Buffer<T>`.
- Get a reference to the heap-allocated array within the `Vec<T>`.
- Offset into the array.
Using `Array<T>`, we'd only have to do the following steps:

- Get a reference to the heap-allocated `Array<T>`.
- Offset into the array.
This gets us fewer memory allocations and faster access into the array.
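To make the difference concrete, here's a sketch of the two layouts (the field name is an assumption for illustration):

```rust
// Today: Atomic<Buffer<T>> -> Buffer -> Vec's own heap array -> element.
struct Buffer<T> {
    data: Vec<T>, // `Vec` points to a second, separate allocation.
}

// Proposed: Atomic<Array<T>> -> element, since the elements live inline
// right after `len` in a single allocation (see `Array<T>` above).
```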
## But specialization is still nightly-only!
The `Destroy` trait has a default impl for all types `T`, so we provide custom implementations by specializing it. Specialization is still an unstable feature. Until it gets stabilized, we can work around the problem like this:
```rust
impl<T> Destroy for T {
    #[cfg(not(feature = "nightly"))]
    unsafe fn destroy(ptr: *mut Self) {
        drop(Box::from_raw(ptr));
    }

    #[cfg(feature = "nightly")]
    default unsafe fn destroy(ptr: *mut Self) {
        drop(Box::from_raw(ptr));
    }
}
```
Of course, if you're not using the `nightly` feature, specialization is not available.
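Note that downstream custom impls would then have to be gated the same way. For example, the skiplist impl from before would become nightly-only (a sketch):

```rust
// Without specialization, this would conflict with the blanket
// `impl<T> Destroy for T`, so it must be gated on the `nightly` feature.
#[cfg(feature = "nightly")]
impl<K: 'static, V> Destroy for Node<K, V> {
    unsafe fn destroy(ptr: *mut Self) {
        (*ptr).key.manually_drop();
        Self::deallocate(ptr);
    }
}
```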
## Alternatives
Now that we have the `Destroy` trait, would anyone ever use `Scope::defer` for anything other than this?
```rust
scope.defer(|| ptr.destroy());
```
I can't imagine such a scenario off the top of my head, so perhaps we could ditch `Scope::defer` altogether and have only `defer_destroy`:
```rust
impl Scope {
    unsafe fn defer_destroy<T>(&self, ptr: Ptr<T>);
}
```
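With that signature, the stack example from before would read:

```rust
unsafe {
    scope.defer_destroy(p);
}
```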
## Final words
Whew - that was a long text! If some parts of it are confusing, let me know and I'll try to clarify.
Finally, I'd like to summarize what all this is actually attempting to achieve:
- Unify all the `defer*` methods into a single one to make the API smaller and simpler.
- Get rid of pervasive `'static` bounds and prevent scenarios like this one in sled. cc @jeehoonkang
- Allow custom destructors in `Owned` to improve support for DSTs.
- Introduce the equivalent of `Owned::drop` accessible via `Ptr` (namely, `Ptr::destroy`).
Let me know what you think! If you like at least some of the suggestions, I'll follow up with one or more narrowly scoped RFCs.