tessi commented on May 31, 2024

Hey @sleipnir, thanks for asking so nicely 💚

Passing memory to WebAssembly and reading it back really isn't as easy as I wish it were. So, sadly, I don't have an easy solution for you, only a lengthy post. I hope it still helps (and if it does, I can point others with the same question to it). The good news is that there is a silver lining (see the footnote).


Handling strings is not easy because WebAssembly does not know "strings" (as we understand them in Elixir), but only sees "a bunch of bytes". This means that, when looking at our WebAssembly memory, the whole memory is just "a big array of bytes". So we need to know where to start reading the string from memory (memory_position aka pointer) and how many bytes to read (length).

There are two ways to pass this information:

  1. The old-school "C language" way: Every string ends with a zero-byte ("\0") and we read the memory starting from our pointer until we discover a zero byte. The upside is that we only need to pass one argument to/from WebAssembly (the pointer). The downside is that it opens the door wide for mistakes (what if we forget the zero byte, what if our string contains zero-bytes, what if an attacker manages to sneak in or remove some zero-bytes, etc.).
  2. We always pass two arguments, the pointer and the length (in bytes) of a string. This has the upside of being far more secure (against attackers and programming mistakes) but the downside that we always need to pass two values to/from WebAssembly. This is, by the way, how Rust represents string slices internally.
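
To make the trade-off concrete, here is a minimal sketch in Rust that treats a plain byte slice as a stand-in for WebAssembly memory and reads a string both ways. The function names `read_c_string` and `read_with_length` are made up for this illustration and are not part of wasmex.

```rust
/// Approach 1: read from `pointer` until the first zero byte.
/// Returns `None` if the pointer is out of bounds or no terminator exists.
fn read_c_string(memory: &[u8], pointer: usize) -> Option<&[u8]> {
    let rest = memory.get(pointer..)?;
    let end = rest.iter().position(|&byte| byte == 0)?;
    Some(&rest[..end])
}

/// Approach 2: read exactly `length` bytes starting at `pointer`.
/// Returns `None` if the requested range does not fit into memory.
fn read_with_length(memory: &[u8], pointer: usize, length: usize) -> Option<&[u8]> {
    let end = pointer.checked_add(length)?;
    memory.get(pointer..end)
}
```

With the first approach, a zero byte inside the data silently cuts the string short. With the second approach the same bytes come back intact, at the cost of carrying the length around.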

For wasmex, we decided to go with the second approach. This works well for passing Strings down to WebAssembly.
Given we have this method in WebAssembly (implemented in Rust, but could be any language compiling to wasm):

#[no_mangle]
pub extern "C" fn do_something_with_a_string(bytes: *const u8, length: usize) -> u8 {
    // do something with the given byte-array and string length
    0 // placeholder return value so the example compiles
}

We could call the method in Elixir like this:

{:ok, memory} = Wasmex.memory(instance, :uint8, 0)
string = "hello, world"
memory_position = 42 # aka "pointer"
Wasmex.Memory.write_binary(memory, memory_position, string) # copy the bytes to WASM memory, so our WASM function can see it

# pass the length in bytes: byte_size/1 counts bytes, String.length/1 counts graphemes
Wasmex.call_function(instance, :do_something_with_a_string, [memory_position, byte_size(string)])
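
On the WebAssembly side, the function body could turn the pointer and length back into a string slice. The following is only a sketch: the function name `count_vowels` and its behaviour are invented for illustration, and the `#[no_mangle]` attribute is left out so the snippet also compiles as ordinary Rust (a real wasm export would need it). Note that `length` is a number of bytes, which for non-ASCII text differs from the number of characters.

```rust
use std::slice;
use std::str;

/// Hypothetical guest function: counts ASCII vowels in the string at `bytes`.
/// Returns 0 if the bytes are not valid UTF-8.
pub extern "C" fn count_vowels(bytes: *const u8, length: usize) -> u32 {
    // SAFETY: the caller must guarantee that `bytes` points to `length` readable bytes.
    let raw = unsafe { slice::from_raw_parts(bytes, length) };
    match str::from_utf8(raw) {
        Ok(text) => text.chars().filter(|c| "aeiouAEIOU".contains(*c)).count() as u32,
        Err(_) => 0,
    }
}
```

Validating the bytes with `str::from_utf8` instead of trusting them keeps a malformed input from the host from turning into undefined behaviour inside the module.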

The other way around (handing a string from WebAssembly back to Elixir) is more complicated, because we can currently only return one value from wasm functions, but we need two values (pointer and length). (see 👣 footnotes for details)

This is often solved by having two functions in WebAssembly: One function producing a string, and another function returning the string-length.

#[no_mangle]
pub extern "C" fn demo_string() -> *const u8 {
    b"Hello, World!".as_ptr()
}

#[no_mangle]
pub extern "C" fn demo_string_len() -> u8 {
    13
}

We would use it in Elixir like this:

{:ok, [pointer]} = Wasmex.call_function(instance, :demo_string, [])
{:ok, [length]} = Wasmex.call_function(instance, :demo_string_len, [])
assert Wasmex.Memory.read_string(memory, pointer, length) == "Hello, World!"
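
A hard-coded length is easy to get out of sync with the string it describes. As a small variation on the two Rust functions, and only as a sketch, both values can be derived from one constant. The name `DEMO_STRING` is an assumption of this example, the length is returned as `usize` rather than `u8`, and `#[no_mangle]` is omitted so the snippet compiles as ordinary Rust.

```rust
// One constant backs both functions, so pointer and length cannot drift apart.
static DEMO_STRING: &[u8] = b"Hello, World!";

/// Returns the address of the first byte of the demo string.
pub extern "C" fn demo_string() -> *const u8 {
    DEMO_STRING.as_ptr()
}

/// Returns the length of the demo string in bytes.
pub extern "C" fn demo_string_len() -> usize {
    DEMO_STRING.len()
}
```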

Now that you know how to pass strings down to WebAssembly and back to Elixir again, there should be nothing stopping you from combining both approaches.

If your specific use-case requires it, you can of course go the "C strings" way of ending strings with zero-bytes or invent any other custom protocol (e.g. prefixing a string so that the first byte holds the string-length). These custom routes don't have helper methods in wasmex, though.

👣 Footnote: "wasm can only return one value from a function call"

This is only half-true. In fact, the WebAssembly standard already allows returning multiple values. Wasmer, the WebAssembly execution engine we use, already partially implements that. Unfortunately, we don't have that feature yet in singlepass compilation (see https://docs.wasmer.io/ecosystem/wasmer/wasmer-features). Once wasmer supports multi-value returns everywhere, we can build better helper methods in Elixir to make the whole process way easier.
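
Until then, one workaround that is sometimes used (it is not something wasmex provides) is to squeeze both values into a single 64-bit return value. The sketch below assumes a 32-bit wasm target, where pointers and lengths fit into 32 bits each. `pack` and `unpack` are hypothetical helper names.

```rust
/// Packs a 32-bit pointer and a 32-bit length into one 64-bit value:
/// the pointer occupies the upper 32 bits, the length the lower 32 bits.
pub fn pack(pointer: u32, length: u32) -> u64 {
    ((pointer as u64) << 32) | (length as u64)
}

/// Splits a packed value back into `(pointer, length)`.
pub fn unpack(packed: u64) -> (u32, u32) {
    ((packed >> 32) as u32, packed as u32)
}
```

The host would have to undo the packing with the same bit layout and treat the returned number as unsigned.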

from wasmex.

tessi commented on May 31, 2024

Alright, I think I understand better. You want to pass arbitrary info into WebAssembly and back up again using protobufs. You can serialize your proto objects to JSON, but theoretically also to any string or byte sequence (to save some space, or to be more time efficient).

What about a custom byte serialization where you write the following to wasm memory:

00 00 00 00 00 00 00 05 48 65 6c 6c 6f
|---------------------| |------------|
          |                  |
 size (64 bit unsigned int)  |
                             |
                          bytes, must be `size` bytes long

This way, you can pass one pointer to wasm where the first 8 bytes encode the size of the following byte array. The following byte array could contain raw bytes (probably most efficient in your case) or JSON (better for debugging, as you can read the content).

The example above should decode to a string of size 5 containing the byte values for "Hello".

Since the "header" part containing the string size is fixed-size, it's hopefully easy to de-/serialize this format. What do you think?
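
For the WebAssembly side, a minimal Rust sketch of this format could look as follows. The names `encode_prefixed` and `decode_prefixed` are assumptions made for this example. Both functions work on plain slices, so the decoder can check that the announced size actually fits into the buffer.

```rust
const HEADER_LEN: usize = 8;

/// Prefixes `payload` with its size as a 64-bit big-endian unsigned integer.
pub fn encode_prefixed(payload: &[u8]) -> Vec<u8> {
    let mut out = Vec::with_capacity(HEADER_LEN + payload.len());
    out.extend_from_slice(&(payload.len() as u64).to_be_bytes());
    out.extend_from_slice(payload);
    out
}

/// Reads one length-prefixed value from the start of `buffer`.
/// Returns `None` if the header or the payload is truncated.
pub fn decode_prefixed(buffer: &[u8]) -> Option<&[u8]> {
    let header = buffer.get(..HEADER_LEN)?;
    let mut size_bytes = [0u8; HEADER_LEN];
    size_bytes.copy_from_slice(header);
    let size = u64::from_be_bytes(size_bytes);
    // The subtraction cannot underflow: the header lookup above succeeded.
    if size > (buffer.len() - HEADER_LEN) as u64 {
        return None;
    }
    Some(&buffer[HEADER_LEN..HEADER_LEN + size as usize])
}
```

Encoding the five bytes of "Hello" this way produces exactly the 13 bytes shown in the diagram.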

tessi commented on May 31, 2024

I am not exactly sure if it fits your use case, but I sketched a wrapper module together that wraps Wasmex.Memory to implement the protocol mentioned above:

defmodule WasmBinaryTrampoline do
  @moduledoc """
  Assuming we have a :uint8-type memory, this module offers ways to write and read binaries
  to/from wasm memory.

  Binaries are written in two parts:

  1. size of the binary (8 bytes, unsigned int, big endian)
  2. binary content (having exactly the number of bytes given in `size`)

  We do *not* ensure that writing/reading memory fits the wasm memory bounds.
  """

  # Writes the 8-byte size header at `index`, followed by the binary itself.
  def write_binary(memory, index, binary) when is_binary(binary) do
    length = byte_size(binary)
    length_bytes = <<length::64-big>> # 8 bytes, big endian

    Wasmex.Memory.write_binary(memory, index, length_bytes)
    Wasmex.Memory.write_binary(memory, index + 8, binary)
  end

  # Reads the 8-byte size header at `index`, then that many bytes of content.
  def read_binary(memory, index) do
    length =
      memory
      |> Wasmex.Memory.read_binary(index, 8)
      |> :binary.decode_unsigned(:big)

    Wasmex.Memory.read_binary(memory, index + 8, length)
  end
end

Note: The name of the module might be a little weird (I'm open to suggestions here ;)) and I only tested it briefly.

If you open iex inside the wasmex repository root directory, the following should work:

bytes = File.read!("test/wasm_test/target/wasm32-unknown-unknown/debug/wasmex_test.wasm")
{:ok, instance} = Wasmex.start_link(bytes)
{:ok, memory} = Wasmex.memory(instance, :uint8, 0)

WasmBinaryTrampoline.write_binary(memory, 0, "Hello") # :ok
WasmBinaryTrampoline.read_binary(memory, 0) # "Hello"

# just a test to see what we actually wrote to memory,
# we see the first 8 bytes being the size followed by actual content
Wasmex.Memory.read_binary(memory, 0, 8 + 5) # <<0, 0, 0, 0, 0, 0, 0, 5, 72, 101, 108, 108, 111>>

Hope it helps you. Also: if whatever you are building is open source (actually also if not) and you want to tell, I'd be very interested in what you are using wasmex for. It really motivates me to hear what this library is used for.

sleipnir commented on May 31, 2024

Hi @tessi, this is awesome. I am really happy with all the attention you have given my question. I will try to use the module.

Yes, we have two projects: one is already open but still at a very, very early stage (WIP), and we plan to open the code of the other one soon.
I leave the link to the already open project below. It is a PubSub message broker based on gRPC. The idea of using Wasm is to connect it to topics and be able to execute "Serverless Functions" (at the risk of sounding cliché). The code that will use Wasm has not yet been merged into the main branch; as I said, it is still a work in progress, but we have already managed to connect producers to consumers and to send and receive messages.

https://github.com/eigr/Astreu

tessi commented on May 31, 2024

No worries asking :)

You wrote your string to memory, but (as you may see in your function signature) calling that function needs a param. It wants to have the pointer to the place in memory where you stored the string.

So if you did:

in_str_pointer = 0 # you can make this one up. if there is nothing else in memory, `0` is a good value. otherwise be careful not to overwrite existing data in memory.
in_str = "Hello World"
:ok = WasmBinaryTrampoline.write_binary(memory, in_str_pointer, in_str)

0 is the pointer (the very first byte in memory). If there is already other data in the early bytes of memory, you need to use a different pointer.

Calling the function would be

{:ok, [out_str_pointer]} = Wasmex.call_function(instance, :string_receive_and_result_bytes, [in_str_pointer])
out_str = WasmBinaryTrampoline.read_binary(memory, out_str_pointer)
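
For completeness, here is a rough sketch of what the Rust side of such a function might look like when it speaks the same size-prefixed format. Only the signature comes from this thread. The body (replying with the input bytes in reverse order), the helper `read_prefixed`, and the decision to leak the reply buffer are assumptions made for illustration, and `#[no_mangle]` is omitted so the snippet compiles as ordinary Rust.

```rust
use std::slice;

const HEADER_LEN: usize = 8;

/// Reads the length-prefixed value stored at `ptr`.
/// SAFETY: `ptr` must point to an 8-byte big-endian size followed by that many bytes.
unsafe fn read_prefixed<'a>(ptr: *const u8) -> &'a [u8] {
    let mut size_bytes = [0u8; HEADER_LEN];
    size_bytes.copy_from_slice(unsafe { slice::from_raw_parts(ptr, HEADER_LEN) });
    let size = u64::from_be_bytes(size_bytes) as usize;
    unsafe { slice::from_raw_parts(ptr.add(HEADER_LEN), size) }
}

/// Hypothetical body: replies with the input bytes in reverse order.
pub extern "C" fn string_receive_and_result_bytes(bytes: *const u8) -> *const u8 {
    let input = unsafe { read_prefixed(bytes) };

    // Build the reply in the same format: 8-byte size header, then content.
    let mut reply = Vec::with_capacity(HEADER_LEN + input.len());
    reply.extend_from_slice(&(input.len() as u64).to_be_bytes());
    reply.extend(input.iter().rev());

    // Leak the buffer so it stays valid after returning. A real module would
    // need a way to free it again.
    Box::leak(reply.into_boxed_slice()).as_ptr()
}
```

The returned pointer is what ends up in `out_str_pointer` on the Elixir side, where the size header tells the reader how many bytes follow.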

tessi commented on May 31, 2024

Very cool :) I wish you best of luck and success with Astreu! I'm very interested if things work out for you.

Anyways, considering this issue, I think we're done. Please re-open or create a new issue if you discover any bugs or weirdnesses.

sleipnir commented on May 31, 2024

Thank you again for your attention. Unfortunately I was very busy with other things today and haven't been able to test it yet, but I believe it will work, and I will keep you informed.

sleipnir commented on May 31, 2024

Thanks again for the excellent response. Now all the pieces fit together perfectly. I had not noticed that the parameter is just a pointer into memory.

sleipnir commented on May 31, 2024

Hello, thanks for the detailed information.

Reading your answer, I think I was not entirely clear about my use case, so let me try to explain it properly.
I need to call functions from wasm modules with arbitrary argument types and receive arbitrary types back. These wasm modules would be developed by third parties implementing a protocol for my application, so my implementation would load and execute them dynamically.
Unfortunately, this is only possible with WebAssembly if we use Interface Types, which are not yet fully supported by the largest wasm runtimes (as far as I know). So I thought about using strings: I could convert my arbitrary types (to be more precise, Protobuf types) to strings (protobuf can easily be used with JSON), pass them to the functions, receive the result as a string as well, and then transform it back into protobuf types.
I am aware that this approach is not the most efficient, but as long as we do not have support for interface types in Wasm, I think it is one of the few viable alternatives. Unfortunately, expecting customers to write functions that return the string length complicates things for me, and I don't know if that would be an option.

Maybe if I could read all of the memory, knowing that I am getting arbitrary bytes, I could directly transform the result into my Protobuf types (since protobufs use bytes directly) instead of working with strings.
Does that make sense?

sleipnir commented on May 31, 2024

@tessi Thank you for your kind and complete answer. I think it might be worth a try. I am just unsure how it would look using Wasmex. Sorry if I didn't fully understand.

sleipnir commented on May 31, 2024

Hello @tessi, I managed to reproduce the example perfectly. But I am not yet familiar enough with the Wasmex API to feel safe moving forward, and I still have a question.
How do I call a function written in wasm that accepts parameters, like this one:

pub extern "C" fn string_receive_and_result_bytes(bytes: *const u8) -> *const u8 {

I can read and write memory, but the call_function API (Wasmex.call_function(instance, "string_receive_and_result_bytes", [])) still asks for the parameters that I must pass to the wasm function. How do I do that? Sorry if the answer is obvious, but I still have difficulty understanding how the parameters are passed to the function.
