Giter Site home page Giter Site logo

Comments (31)

antrik avatar antrik commented on July 23, 2024

@nox well, yes and no. It's true that the IPC transfer (as well as the serde dance) implicitly clones the data; so taking it by value is inefficient in cases where the caller wants to keep the original data, currently needing to perform an extra copy that is not strictly necessary.

However, ipc-channel can also deal with pure move data -- notably IpcReceiver itself. Admittedly, the implementation for this relies on inner mutability to explicitly "consume" the original value, since the serde API doesn't provide a way to take the input by value AIUI; and this would still work if send() took the data by reference too -- except that currently the consumed sender is dropped after serialisation, before returning to the caller; so it's just an implementation detail the caller doesn't need to care about. If we decided to pass the data by reference to send() on the other hand, that would mean the consumed sender could live on, possibly causing confusing bugs.

On a more fundamental level, ipc-channel was conceived as a drop-in replacement for mpsc channels. There is a similar consideration here as with #115 : it's also a property that does not strictly make sense for IPC, but is important to enable switching between IPC channels and MPSC channels seamlessly.

Admittedly, I'm having second thoughts about that change as well... But if we start going down that road, there is actually a ton of things we could do differently to make IPC more efficient, if we disregard mpsc compatibility. Most notably, we would probably toss serde (which always adds a bunch of extra data copies), using something like Cap'n Proto instead.

In fact ipc-channel already added BytesChannel as a special copy-free fast path for raw data, with no semblance of mpsc compatibility. (This one indeed takes the input data by reference.) I guess we could enhance that into a more generic alternative interface if we deem that worthwhile. (Using Cap`n Proto etc.) Or maybe it would be better to create a separate crate for that -- though that might pose interoperability problems I fear...

from ipc-channel.

nox avatar nox commented on July 23, 2024

I would gladly abandon all mpsc compatibility.

I'm not sure what you are thinking about when you say that serde always adds a bunch of extra data copies though, especially when compared with capnproto, could you develop?

from ipc-channel.

antrik avatar antrik commented on July 23, 2024

@nox well, I'm all in favour of adding another API; I'm just not convinced it would be a good idea to change/abandon the existing drop-in API. That would require huge changes throughout Servo -- I suppose that's why @pcwalton conceived ipc-channel as a drop-in replacement for mpsc in the first place...

As for serde, it always creates a new buffer for the serialised data AIUI (correct me if I'm wrong); and also during de-serialisation. Cap'n Proto on the other hand is based on the idea of keeping data in its serialised representation in the first place -- so sending/receiving doesn't actually need to change the representation at all.

The worst case here is that a data structure sees a lot of access before or after transfer, in which case working on the serialised representation would be inefficient; so you still need to translate between serialised and "native" representations, thus not buying (or loosing) anything. For data structures that see only little access however (or are in fact assembled right before sending, which is a very common case I believe), this is strictly more efficient.

Note that Cap'n Proto was designed for network RPC AIUI, and isn't necessarily optimal for local IPC. (It deals with endianness for example, which is irrelevant for ipc-channel.) The fundamental approach though makes sense for all IPC I believe.

from ipc-channel.

nox avatar nox commented on July 23, 2024

How does cap'n'proto cope with opaque types like Vec<T>?

That may interest you, btw: serde-rs/serde#492

from ipc-channel.

antrik avatar antrik commented on July 23, 2024

@nox not sure in what sense Vec<T> is an opaque type?

If the actual question is how dynamically sized types are handled, I have to admit that I don't know (yet)... I haven't actually worked with Cap'n Proto so far -- I just know (and like) the fundamental concept. If we are serious about adding an alternative interface to ipc-channel though, I shall learn the details of course :-)

from ipc-channel.

nox avatar nox commented on July 23, 2024

@antrik We don't know what the memory representation of Vec<T> is, and capnproto is basically about leaking the memory representation. I don't see how the fundamental concept helps at all for us. I would rather improve on serde.

from ipc-channel.

antrik avatar antrik commented on July 23, 2024

@nox one fundamental thing to understand about the Cap'n Proto approach is that it doesn't just take the memory representation of any old native type and calls it serialised. That wouldn't work for Vec for example, because memory addresses have no meaning across process boundaries -- so pointers in Cap'n Proto use relative offsets within a compact memory region instead.

In other words, the serialised representations of data are often actually distinct from native ones -- the idea is just that in many cases, you can do all work on the serialised ones directly, rather then converting between native and serialised.

(To make this ergonomic, the Rust implementation of Cap'n Proto -- or some custom format based on the same ideas -- would have to implement many of Rust's standard library functions on the serialised types I guess...)

Regarding serde, there is only so much you can do when always having to convert. The idea for zero-copy de-serialisation in serde-rs/serde#492 is only useful for large data blocks; and it can only yield borrowed pointers anyway. (No Vec or Box etc.) For serialisation, limited zero-copy would be possible, if the result of serialisation is presented as a vector of slice references -- but that would only be useful if both the input consists largely of large consecutive blocks of trivially serialiseable data, and the I/O mechanisms used on the serialised data provide efficient scatter/gather interfaces. (Which we could have with the Unix back-end of ipc-channel; but not the upcoming Windows one AIUI, nor the MacOS one IIRC.)

Of course we could also try to implement Cap'n Proto concepts in serde: using a tailored binary format, and providing accessor interfaces for the serialised representation... But with serde being all about converting, and Cap'n Proto being all about working on the serialised representation, I'm not sure how much potential for code or idea sharing there really is.

from ipc-channel.

nox avatar nox commented on July 23, 2024

What I meant about capnproto is that I find quite unfortunate that you wouldn't be able to do IPC anymore with plain old Rust types such as Vec<T>, which is quite useful. We have a lot of IPC across Servo, and having to use codegen for all the types involved sounds quite like a huge task.

from ipc-channel.

antrik avatar antrik commented on July 23, 2024

Correction: the lack of scatter/gather costs an extra copy regardless of the serialisation approach I guess; so zero-copy serialisation would still be useful there... The other conditions stand though.

from ipc-channel.

antrik avatar antrik commented on July 23, 2024

@nox depends on where the Vec comes from, and where it goes. Worst case is that it's used in many places before and after the transfer; so you'd want to convert between the Rust Vec for use, and Cap'n Proto's serialised equivalent for transfer... But that's not any different than using serde -- nothing gained, nothing lost in this case.

However, if the Vec is only constructed for the transfer, it would usually be more efficient to construct the equivalent serialised type directly, rather than constructing a Rust Vec and then translating it.

Same for any other type.

from ipc-channel.

nox avatar nox commented on July 23, 2024

Yes, I'm just saying it's more codegen to have, this time with external files (AFAIK) in a specific language to feed it to capnproto, right?

from ipc-channel.

antrik avatar antrik commented on July 23, 2024

@nox I don't think so? I haven't checked how the existing Cap'n Proto implementation for Rust works -- but I don't see any fundamental reason why we couldn't just derive the serialised equivalents from native Rust types...

(Just like serde, it will need to handle container types such as Vec explicitly of course.)

from ipc-channel.

asajeffrey avatar asajeffrey commented on July 23, 2024

I hit this problem too, having to clone data just in order to hand it to an ipc sender and then discard it. If we had send_ref(&data) as well as send(data) I'd be done. If we want to keep the interface to be a subset of the MSPC interface, we could add send_ref(&data) there too, implemented as send(data.clone()).

from ipc-channel.

antrik avatar antrik commented on July 23, 2024

@asajeffrey I'd actually call it send_clone() I think; but other than that, I think I'm fine with this. (With the caveat of any embedded IpcSender being "consumed", in spite of having only a shared reference, as I pointed out in my first reply... Perhaps we should take &mut data instead?)

It's probably not a huge win, but should be noticeable (in the order of 10% I'd guess?) -- and it's very low hanging fruit :-)

from ipc-channel.

antrik avatar antrik commented on July 23, 2024

Actually, calling it send_clone() made me realise that the right thing to do here is probably simply not to drop embedded senders in this case. I'd have to check whether this poses any implementation problems...

from ipc-channel.

asajeffrey avatar asajeffrey commented on July 23, 2024

I realized after I'd written this that this deals with the case where you want to send an &T, but not with the case I actually have, which is sending a T which implements ToOwned, and deserializing it at the other end as T::Owned. We can get very close using Cow, but then we send a Cow<'a,T> and receive a Cow<'static,T> so the lifetimes don't quite line up.

We could work round this by defining a ToStatic trait, something like:

unsafe trait ToStatic {
  type Static: 'static;
  fn to_static(&self) -> Self::Static;
}

with the requirement that serializing x and x.to_static() is the same. Then we could have sender.send_static(&data) which serializes data: &T but deserializes it at type T::Static.

In particular, we'd have:

  unsafe impl<'a,T> ToStatic for Cow<'a,T> where T: ToStatic {
    type Static = Cow<'static,T::Static>;
    fn to_static(&self) -> Self::Static { Cow::Owned(self.deref().to_static()) }
  }

In order to make this usable, though, we'd have to support deriving ToStatic, and this is using quite a bit of the complexity budget.

from ipc-channel.

asajeffrey avatar asajeffrey commented on July 23, 2024

On thinking about it, perhaps the cleanest interface would be something like a DeepClone trait:

trait DeepClone {
  type DeepCloned;
  fn deep_clone(&self) -> Self::DeepCloned;
}

for example:

  impl DeepClone for str {
    type DeepCloned = String;
    fn deep_clone(&self) -> String { self.to_owned() }
  }

This would be different from Clone because it would clone through Rc rather than copying the reference:

  impl<T> DeepClone for Rc<T> where T: DeepClone {
    type DeepCloned = Rc<T::DeepCloned>;
    fn deep_clone(&self) -> Rc<T::DeepCloned> { Rc::new(self.deref().deep_clone()) }
  }

The expectation is that a serialization then deserialization is the same as a deep clone, in psrticular the serialization format of T should be compatible with the deserializer of T::DeepCloned.

The change to IpcSender<T> would be to add: send_deep_clone<S>(&self, data: &S) where S: DeepClone<DeepCloned=T>.

from ipc-channel.

asajeffrey avatar asajeffrey commented on July 23, 2024

Some mucking around in the rust playground: https://play.rust-lang.org/?gist=c5e14d5a9ab2a4ac7dd2aa7190011125

from ipc-channel.

asajeffrey avatar asajeffrey commented on July 23, 2024

And some more mucking around, this time making sure that implementing #[derive(DeepClone)] isn't too bad: https://github.com/asajeffrey/deep-clone.

from ipc-channel.

nox avatar nox commented on July 23, 2024

from ipc-channel.

asajeffrey avatar asajeffrey commented on July 23, 2024

We can then implement send_deep_clone as well as send_clone. The problem is that send_clone doesn't allow sending data whose borrowing involves a change of type, for example sending a &[T] and receiving it as a Vec.

If you'd rather have a separate issue for this, I can open a new one.

from ipc-channel.

antrik avatar antrik commented on July 23, 2024

@asajeffrey I'm not sure I follow. Why do you need deep_clone()? I thought the idea is that you do not actually want to convert? Aren't you really just after a way to define equivalence relationships of types that can be substituted for the declared sender type, since they serialise to the same representation?

(And wouldn't you say this should be handled within the Serde framework?...)

from ipc-channel.

nox avatar nox commented on July 23, 2024

Let's make a new issue.

from ipc-channel.

nox avatar nox commented on July 23, 2024

Cc @dtolnay for the concept of serialisation equivalence in serde.

from ipc-channel.

asajeffrey avatar asajeffrey commented on July 23, 2024

IRC conversation with @nox: http://logs.glob.uno/?c=mozilla%23servo&s=10+Mar+2017&e=10+Mar+2017#c628689

TL;DR: we agree with send_clone, the discussions are a) do we need send_deep_clone, and b) should we introduce a breaking change and rename send_clone to send?

from ipc-channel.

asajeffrey avatar asajeffrey commented on July 23, 2024

@antrik yes, the idea is to have a way to express that you can serialize a T, and deserialize a U. One way to express that is by something like T: DeepClone<DeepCloned=U>. Not sure where we'd put this in serde, is there a round-tripping API in there somewhere?

from ipc-channel.

asajeffrey avatar asajeffrey commented on July 23, 2024

Created a new issue for reducing copying in the case that this requires changing type: #156.

from ipc-channel.

antrik avatar antrik commented on July 23, 2024

I just remembered that the problem is not with embedded IpcSender, but rather with embedded IpcReceiver: since receivers cannot be cloned, send_clone() on a type with embedded receivers poses semantic problems.

(It's not a technical problem, since IpcReceiver implements inner mutability for that purpose; but it's still weird -- and potentially error-prone -- when a method that takes a shared reference, and otherwise clones everything, does actually move out and invalidate any embedded receivers...)

from ipc-channel.

asajeffrey avatar asajeffrey commented on July 23, 2024

Hmm, and even putting a T:Clone requirement doesn't fix this, because there might be an embedded Rc<IpcReceiver>.

from ipc-channel.

antrik avatar antrik commented on July 23, 2024

@asajeffrey I have been thinking about this on and off; and this is my current idea: how about adding both a send_copy() method that takes &T where T: Copy; and also a send_mut() method that takes &mut T for any T?...

send_copy() is the most straightforward one, since technically, that's what the sending actually does -- but it doesn't work with embedded IpcReceiver (or Rc<>). send_mut() on the other hand would reflect the fact that the sending process for non-copy objects modifies the originals in some way...

While each of them is limited in certain ways, my current impression is that together they should actually cover all use cases, while avoiding the semantic problems.

The plethora of variants might seem confusing at first: but together with the original send() -- which is a send_move() really -- I think they would indeed quite naturally reflect the typical &T/&mut T/T trifecta...

What do you think?

from ipc-channel.

antrik avatar antrik commented on July 23, 2024

Correction: send_ref() is supposed to be send_mut() -- sorry for the confusion.

from ipc-channel.

Related Issues (20)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.