wasmi-labs / wasmi
WebAssembly (Wasm) interpreter.
Home Page: https://wasmi-labs.github.io/wasmi/
License: Apache License 2.0
I've decided to try out this interpreter, but just loading my module fails with the stack limit being exceeded:
let module = deserialize_file("livesplit_core.wasm").unwrap();
let module = Module::from_parity_wasm_module(module).unwrap();
thread 'main' panicked at 'called `Result::unwrap()` on an `Err` value: Validation("Function #1204 validation error: Stack: exceeded stack limit 1024")', libcore/result.rs:945:5
I'm a user of parity-wasm but not of wasmi. When I load wasm from an untrusted source, or when I manipulate a wasm AST, I'd like to ensure it is valid. There is a validator in https://github.com/paritytech/wasmi/blob/master/src/validation/mod.rs#L168 that I would like to use without adding a dependency on wasmi. Are you open to moving that code to the parity-wasm crate, along with the ModuleContext machinery required to make it work? I'm sorry if this questions a design decision you already made; it's OK if you don't feel it's appropriate to move this code.
Is there anything that blocks a new release now?
In particular I'm interested in #41
The current test suite is actually far from current. I tried to upgrade to the current spec suite and it appears that wasmi can't pass it (although I'm not sure why; maybe it's a fault in the decoding in parity-wasm).
Hi there,
wasmi seems quite interesting, and I'm looking into its implementation. I found that MemoryInstance::copy and MemoryInstance::copy_nonoverlapping are used nowhere and can be removed along with their unit tests; all tests still pass afterwards. My question is: are they essential to this project? Could I remove them? Thanks!
Ideally I'd like hosting a wasm module in Rust to be as seamless as hosting it in JavaScript. wasm_bindgen creates wrappers that let us do things like easily invoke functions that accept strings, etc. Would it be possible to do something like this for Rust so that the interop is smoother? Apologies if something like this already exists and I'm just not finding it.
Should be a pretty simple task: at a glance it looks like we just need to remove checks that prevent importing/exporting of mutable globals.
see:
Maybe add it to check.sh / build.sh ?
My idea is to use a wrapper around a buffer with shared ownership, so each Instructions value points to a section of the buffer, but the full buffer doesn't get dropped until all owners are dropped (similar to how Bytes works). We would probably have to preallocate a buffer big enough to hold all instructions and then have some kind of fn extend(iter: impl IntoIterator<Item = Instruction>) -> Option<BufferHandle> (straw-man naming, of course) that locks the shared buffer, writes the elements, and then returns a handle spanning the newly added elements. This avoids dealing with lifetimes and makes Module::drop much faster, at the cost of possibly making allocation of a new module slower (we can either generate instructions serially, or generate them in parallel but use intermediate buffers that must then be copied from).
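A rough sketch of the shared-buffer idea (all names are hypothetical; this uses single-threaded Rc/RefCell for brevity, where the real thing might want locking and parity-wasm's actual Instruction type):

```rust
use std::cell::RefCell;
use std::ops::Range;
use std::rc::Rc;

// Hypothetical instruction type, standing in for parity-wasm's `Instruction`.
#[derive(Debug, Clone, PartialEq)]
pub enum Instruction {
    Nop,
    I32Const(i32),
}

// One buffer shared by all function bodies of a module.
type SharedBuffer = Rc<RefCell<Vec<Instruction>>>;

// A handle that spans one function body inside the shared buffer.
// The buffer is freed only once every handle is dropped.
pub struct BufferHandle {
    buffer: SharedBuffer,
    span: Range<usize>,
}

impl BufferHandle {
    // Borrow the handle's slice of the shared buffer for the duration
    // of the closure, avoiding any lifetime plumbing in the public API.
    pub fn with<R>(&self, f: impl FnOnce(&[Instruction]) -> R) -> R {
        f(&self.buffer.borrow()[self.span.clone()])
    }
}

// Straw-man `extend`: lock the shared buffer, append the instructions,
// and return a handle spanning the newly added elements.
pub fn extend(
    buffer: &SharedBuffer,
    iter: impl IntoIterator<Item = Instruction>,
) -> BufferHandle {
    let mut buf = buffer.borrow_mut();
    let start = buf.len();
    buf.extend(iter);
    let end = buf.len();
    BufferHandle { buffer: Rc::clone(buffer), span: start..end }
}
```

Dropping any single handle leaves the other handles (and the buffer) intact, which is exactly the Bytes-like ownership the proposal describes.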
When the heap grows from, say, 1GB to 3GB, a naive vector implementation of MemoryInstance needs to reallocate the vector, copy 1GB worth of data into the new vector, and then zero-fill the newly allocated 2GB.
On 64-bit machines (and maybe in some cases on 32-bit ones) we could reserve virtual memory up to the memory instance's limit, and then just move the heap_end pointer upon a call to grow, thus avoiding the copy. Maybe it is even possible to do the zero-filling lazily, i.e. upon the first access to a page; however, I'm not sure how to do this in a robust and portable manner.
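A minimal safe-Rust sketch of the non-copying grow, with Vec::with_capacity standing in for a virtual-memory reservation (a real implementation would reserve address space with mmap/VirtualAlloc; all names here are hypothetical):

```rust
const PAGE_SIZE: usize = 64 * 1024; // wasm page size

// Sketch: reserve capacity for the declared maximum up front so that
// `grow` only zero-fills the new pages and never copies old contents.
// (A real implementation would reserve virtual address space with
// `mmap`/`VirtualAlloc`; `Vec::with_capacity` merely imitates that.)
pub struct PreallocMemory {
    data: Vec<u8>,
    max_pages: usize,
}

impl PreallocMemory {
    pub fn new(initial_pages: usize, max_pages: usize) -> Self {
        let mut data = Vec::with_capacity(max_pages * PAGE_SIZE);
        data.resize(initial_pages * PAGE_SIZE, 0);
        PreallocMemory { data, max_pages }
    }

    pub fn size_in_pages(&self) -> usize {
        self.data.len() / PAGE_SIZE
    }

    pub fn base_ptr(&self) -> *const u8 {
        self.data.as_ptr()
    }

    // Returns the previous size in pages, or None if the maximum would
    // be exceeded (mirroring the semantics of wasm's `memory.grow`).
    pub fn grow(&mut self, additional_pages: usize) -> Option<usize> {
        let old_pages = self.size_in_pages();
        if old_pages + additional_pages > self.max_pages {
            return None;
        }
        // Capacity was reserved in `new`, so this never reallocates;
        // only the newly added pages are zero-filled.
        self.data.resize((old_pages + additional_pages) * PAGE_SIZE, 0);
        Some(old_pages)
    }
}
```

Because the backing allocation never moves, the base pointer stays stable across grow calls, which is the property that lets a real implementation just bump heap_end.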
To properly support them we need to set up CI for these targets.
Can you show an example of having the host read/write data to a linear memory that was imported by the wasm module rather than contained within it?
As an example, this demo shows how the string round-trip through linear memory is done when the host is JavaScript. I would love to be able to do the same thing with Rust as my host, and I'm looking for an example of how to accomplish this in wasmi, but I'm not finding a clear direction in the docs.
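Whatever the embedder API, the round-trip itself is just byte copies at agreed-upon offsets. A sketch with a plain byte buffer standing in for the instance's linear memory (hypothetical helpers; with wasmi you would first obtain the memory object from the instance's exports, or, for an imported memory, keep the handle you supplied through your import resolver):

```rust
// A plain byte slice stands in for the instance's linear memory here.
// The host writes the string's bytes at a known offset...
fn write_str(memory: &mut [u8], ptr: usize, s: &str) {
    memory[ptr..ptr + s.len()].copy_from_slice(s.as_bytes());
}

// ...and later reads them back, given the (ptr, len) pair that was
// exchanged with the module (e.g. via function arguments or returns).
fn read_str(memory: &[u8], ptr: usize, len: usize) -> String {
    String::from_utf8(memory[ptr..ptr + len].to_vec()).expect("invalid utf-8")
}
```

The JavaScript demo does the same thing through a Uint8Array view over WebAssembly.Memory; only the handle type differs.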
wabt seems to have found an error in validation:
paritytech/parity-wasm#182
Many applications of the interpreter can handle some errors in a meaningful way (memory access violations, extern signature mismatches, etc.), so it makes sense to encode them as something other than a heap-allocated string.
I think we want to fuzz our interpreter with a fuzzer (wasm-opt --translate-to-fuzz?). AFAIK, it is capable of producing executable inputs.
Due to a bug in Rust's module system, MemoryInstance::get_value is bounded by T: LittleEndianConvert, but LittleEndianConvert is not exported.
For now, functions like into_little_endian do an allocation. This is very unfortunate for functions that are called upon every access to wasm memory.
At the very least, we can change the definition from
pub trait LittleEndianConvert: Sized {
    fn into_little_endian(self) -> Vec<u8>;
    fn from_little_endian(buffer: &[u8]) -> Result<Self, Error>;
}
to something like this:
pub trait LittleEndianConvert: Sized {
    fn into_little_endian(&self, out: &mut [u8]); // Panics if `out` is not of the appropriate size
    fn from_little_endian(buffer: &[u8]) -> Result<Self, Error>;
}
This avoids allocations in the into_little_endian case.
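For illustration, the allocation-free variant could be implemented for u32 roughly like this (a sketch; Error is stood in by a unit struct here):

```rust
use std::convert::TryInto;

// Unit struct standing in for wasmi's real `Error` type.
#[derive(Debug, PartialEq)]
pub struct Error;

pub trait LittleEndianConvert: Sized {
    /// Panics if `out` is not of the appropriate size.
    fn into_little_endian(&self, out: &mut [u8]);
    fn from_little_endian(buffer: &[u8]) -> Result<Self, Error>;
}

impl LittleEndianConvert for u32 {
    fn into_little_endian(&self, out: &mut [u8]) {
        // Serializes into the caller-provided slice: no allocation.
        out.copy_from_slice(&self.to_le_bytes());
    }

    fn from_little_endian(buffer: &[u8]) -> Result<Self, Error> {
        let bytes: [u8; 4] = buffer.try_into().map_err(|_| Error)?;
        Ok(u32::from_le_bytes(bytes))
    }
}
```

The caller can keep one small stack buffer and reuse it for every memory access, which is exactly what the hot path wants.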
hello,
I'm currently developing serverless-wasm and I'm wondering how I could implement two features:
From what I understand, I could proceed like this (for the async part): add a variant of Interpreter::start_execution that stores the function context when receiving that trap from Interpreter::run_interpreter_loop, instead of dropping it. It looks like this would be doable, and it could be used to implement breakpoints, too.
The other feature, limiting the run time of a function, would probably be easier to implement, with the loop regularly checking that we do not exceed some limit passed as an argument.
Do you think those features would be useful in wasmi?
The serverless-wasm project is at a very early stage, and I admit the readme is not too serious, but I really want to get it to a state where it's nice to use, and I need at least the async networking feature for that.
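The run-time-limiting part can be sketched as a fuel counter consulted by the interpreter loop (all names hypothetical; an instruction is modeled here as a plain function, where wasmi's real loop dispatches decoded wasm instructions):

```rust
pub enum Outcome {
    Finished(i64),
    OutOfFuel,
}

// Hypothetical, minimal interpreter loop with a fuel budget. Each "op"
// is a function mutating an accumulator, standing in for executing one
// wasm instruction.
pub fn run(program: &[fn(i64) -> i64], mut fuel: u64) -> Outcome {
    let mut acc = 0i64;
    for op in program {
        // Check the budget before every instruction; a real interpreter
        // might only check every N instructions to keep the loop hot.
        if fuel == 0 {
            return Outcome::OutOfFuel;
        }
        fuel -= 1;
        acc = op(acc);
    }
    Outcome::Finished(acc)
}
```

The same checkpoint is a natural place to yield for the async use case: instead of returning OutOfFuel, the loop could save its context and hand control back to the host.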
At the moment, the examples are just copied from the parity-wasm repo. However, the comments are stale and I think we can provide better examples.
AGFzbQEAAAABBAFgAAADAgEAChYBFAADQAJ9QwAAAABBAA4BAAELGgsL
wabt:
BeginModule(version: 1)
BeginTypeSection(4)
OnTypeCount(1)
OnType(index: 0, params: [], results: [])
EndTypeSection
BeginFunctionSection(2)
OnFunctionCount(1)
OnFunction(index: 0, sig_index: 0)
EndFunctionSection
BeginCodeSection(22)
OnFunctionBodyCount(1)
BeginFunctionBody(0)
OnLocalDeclCount(0)
OnLoopExpr(sig: [])
OnBlockExpr(sig: [f32])
OnF32ConstExpr(0 (0x040))
OnI32ConstExpr(0 (0x0))
OnBrTableExpr(num_targets: 1, depths: [0], default: 1)
OnEndExpr
OnDropExpr
OnEndExpr
EndFunctionBody(0)
EndCodeSection
EndModule
test.wasm:0000026: error: br_table labels have inconsistent types: expected f32, got void
spec:
test.wasm:0x24-0x25: invalid module: type mismatch: operator requires [] but stack has [f32]
The Interpreter assumes this invariant:
An interpreted function will not pop too many or too few values off the value stack.
Since pop is in a hot code path, the above invariant allows for optimizations.
If, however, we trusted the user to push a runtime value to the stack, then we popped it, the stack would be one shorter than expected. Near the end of execution, the value stack would underflow.
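For illustration, this is the kind of optimization the invariant enables: the hot-path pop can skip the emptiness check because validation already proved it cannot underflow (a hypothetical ValueStack, not wasmi's actual code):

```rust
pub struct ValueStack {
    values: Vec<u64>,
}

impl ValueStack {
    pub fn new() -> Self {
        ValueStack { values: Vec::new() }
    }

    pub fn push(&mut self, value: u64) {
        self.values.push(value);
    }

    // Because validation guarantees that no function pops more values
    // than are on the stack, the hot-path pop may skip the bounds check
    // (kept as a debug_assert for debug builds). If a host could push
    // or withhold values at will, this would be unsound.
    pub fn pop(&mut self) -> u64 {
        debug_assert!(!self.values.is_empty(), "value stack underflow");
        unsafe {
            let new_len = self.values.len() - 1;
            let value = *self.values.get_unchecked(new_len);
            self.values.set_len(new_len);
            value
        }
    }
}
```

This is exactly why letting the host disturb the stack (as described below for resume_execution) is dangerous: the unchecked pop has no safety net.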
func::resume_execution looks like this:
pub fn resume_execution<'externals, E: Externals + 'externals>(
    &mut self,
    return_val: Option<RuntimeValue>,
    externals: &'externals mut E,
) -> Result<Option<RuntimeValue>, ResumableError> { ... }
That return_val argument gets passed to Interpreter::resume_execution.
When return_val is None, Interpreter::resume_execution ignores it.
When return_val is Some, it gets pushed to the value stack.
- Passing Some(val) when the host doesn't pop makes the value stack too tall.
- Passing None when the host expects an argument 1) causes the host to pop some random value, and 2) makes the stack one shorter than expected.
Either get rid of the return_val argument, or make it mandatory.
I'm trying to write a function that returns a string, but right now I'm stuck. I wrote a function that I'm compiling to wasm:
use std::ffi::CString;
use std::os::raw::c_char;

#[no_mangle]
pub extern "C" fn test3() -> *mut c_char {
    CString::new("ohai!").unwrap().into_raw()
}
I'm calling it with wasmi like this:
let result = instance.invoke_export(
    "test3",
    &[],
    &mut exports,
).expect("failed to execute export");
println!("wasm returned: {:?}", result);
This returns:
wasm returned: Some(I32(1114120))
But it doesn't seem like there's a way to access the actual bytes. I can imagine some incredibly complex hacks to fit strings through an interface that only allows i32 by writing my own toy memory allocator, but I'd rather not do that.
Probably related to #88
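For what it's worth, the usual workaround is to treat the returned i32 as an offset into the module's linear memory and read the bytes there, scanning for the NUL terminator that CString wrote. A sketch of the host side, with a plain byte slice standing in for the memory contents you would fetch from the instance's exported memory:

```rust
// Read a NUL-terminated string located at `ptr` inside a wasm linear
// memory. `memory` stands in for the instance's memory contents.
fn read_cstring(memory: &[u8], ptr: usize) -> Option<String> {
    // Scan forward for the NUL byte that `CString::into_raw` guarantees.
    let len = memory[ptr..].iter().position(|&b| b == 0)?;
    String::from_utf8(memory[ptr..ptr + len].to_vec()).ok()
}
```

The host should eventually hand the pointer back to the module (e.g. an exported free function) so the module's allocator can reclaim it; otherwise the CString leaks inside the instance.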
To address the lack of benchmarks, we might want to use special wasm binaries which utilize a specific benching API.
It should consist of:
- extern "C" fn start_bench(name_ptr: *const u8, name_len: u32)
- extern "C" fn start_iter()
- extern "C" fn end_iter()
- extern "C" fn bb_observe(ptr: *mut u8, len: u32) (similar to test::black_box on nightly)
Each start_iter should follow an end_iter sequentially, so there should be no nested iterations or overlaps (a higher-level utility library should enforce this via closures/ownership). bb_observe is used to avoid compiler optimisations.
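The closure/ownership enforcement mentioned above could look roughly like this (a hypothetical wrapper; the extern hooks are stubbed out as plain functions here):

```rust
// Stubs for the proposed benching hooks; in a real wasm binary these
// would be `extern "C"` imports provided by the benchmarking host.
fn start_bench(_name: &str) {}
fn start_iter() {}
fn end_iter() {}

pub struct Bench;

impl Bench {
    pub fn new(name: &str) -> Self {
        start_bench(name);
        Bench
    }

    // Taking `&mut self` plus a closure makes nested or overlapping
    // iterations impossible to express: `start_iter`/`end_iter` always
    // come in properly sequenced pairs.
    pub fn iter<R>(&mut self, f: impl FnOnce() -> R) -> R {
        start_iter();
        let result = f();
        end_iter();
        result
    }
}
```

Because the iteration body runs inside the closure, the library, not the benchmark author, controls when the begin/end hooks fire.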
We need to reconsider the limits (particularly the maximal value stack and frame stack heights).
Ideally, we should provide a way for the user to change these limits.
Also, we might want to synchronize with limits.h.
For example, test what happens if:
- ImportResolver returned a signature that is incompatible with the requested one,
- Externals::invoke_index returned a value when no value is expected by the signature, and vice versa.

Whether the module exports its memory, imports it, or declares its own memory and doesn't export it, there should be a way to request the memory a module instance is using while it runs.
AGFzbQEAAAABJAdgAn9/AGACf34AYAF/AX9gAX8BfmABfgF+YAF9AX1gAXwBfAMYFwAAAQICAwIC
AgQEBAQEBQYCAgQEBAUGBQMBAAEH4QERDGkzMl9sb2FkMTZfcwAGDGkzMl9sb2FkMTZfdQAHCGkz
Ml9sb2FkAAgMaTY0X2xvYWQxNl9zAAkMaTY0X2xvYWQxNl91AAoMaTY0X2xvYWQzMl9zAAsMaTY0
X2xvYWQzMl91AAwIaTY0X2xvYWQADQhmMzJfbG9hZAAOCGY2NF9sb2FkAA8LaTMyX3N0b3JlMTYA
EAlpMzJfc3RvcmUAEQtpNjRfc3RvcmUxNgASC2k2NF9zdG9yZTMyABMJaTY0X3N0b3JlABQJZjMy
X3N0b3JlABUJZjY0X3N0VXJlABYK9gIXFgAgACABOgAAIABBAWogAUEIdjoAAAsUACAAIAEQACAA
QQJqIAFBEHYQAAsWACAAIAGnEAEgAEEEaiABQiCIpxABCxMAIAAtAAAgAEEBai0AAEEIdHILEQAg
ABADIABBAmoQA0EQdHILEwAgABAErSAAQQRqEAStQiCGhAsNAEEAIAAQAEEALgEACw0AQQAgABAA
QQAvAQALDQBBACAAEAFBACgCAAsOAAAAAAAAAAANADIBAAsOAEEAIACnEABBADMBAAsOAEEAIACn
EAFBADQCAAsOAEEAIACnEAFBADUCAAsNAEEAIAAQAkEAKQMACw4AQQAgALwQAUEAKgIACw4AQQAg
AL0QAkEAKwMACw0AQQAgADsBAEEAEAMLDQBBACAANgIAQQAQBAsOAEEAIAA9AQBBABADrQsOAEEA
IAA+AgBBABAErQsNAEEAIAA3AwBBABAFCw4AQQAgADgCAEEAEAS+Cw4AQQAgADkDAEEAEAW/Cw==
wasmi: ok
wabt: type mismatch in i64.load16_s, expected [i32] but got [i64]
Depends on #98
For now, every instruction is represented by an enum value. This means that each instruction occupies the size required for the tag plus the size of the instruction with the largest payload (probably BrTable), properly aligned. Right now each instruction occupies about 24 bytes, even though most instructions are 1 byte wide.
Fortunately, we don't need to address instructions by index and can iterate them sequentially, and branches can specify the exact byte offset of their target.
Shrinking the size of a single instruction will let us use the cache more efficiently and will also greatly reduce memory overhead.
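The overhead is easy to demonstrate with toy stand-ins (illustrative only; Target is modeled as a bare u32):

```rust
use std::mem::size_of;

// Stand-in mirroring the current situation: the boxed `BrTable`
// payload makes every instruction as large as a fat pointer plus tag.
#[derive(Clone)]
pub enum BoxedInstruction {
    Nop,
    I32Const(i32),
    BrTable(Box<[u32]>), // fat pointer: 16 bytes on 64-bit targets
}

// A flattened encoding where `BrTable` only stores its target count and
// the targets follow as separate `BrTableTarget` entries in the stream.
#[derive(Clone, Copy)]
pub enum FlatInstruction {
    Nop,
    I32Const(i32),
    BrTable { count: u32 },
    BrTableTarget(u32),
}
```

With the flattened encoding, the size of every instruction is bounded by the largest inline payload (4 bytes here), so far more instructions fit in each cache line during sequential dispatch.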
Hi,
Is there any way to debug where a trap happened? At least the function would be good; file+line and/or a traceback would be even better.
I'm using Rust's wasm32-unknown-unknown target to generate the wasm code itself (and the trap is a panic). Maybe that's something to tweak on the compiler side?
To achieve parity with alternate implementations of larger projects that happen to use wasmi, the API surface of wasmi must be as close to the browser engines' as possible.
This includes (so far) APIs that should be under a feature flag:
Such as STM32 F4
Right now it points to https://pepyakin.github.io/wasmi/ but should be https://paritytech.github.io/wasmi/
For now, there are a few problems that the load fuzz target detects, particularly in parity-wasm:
Currently, copying a Vec<Instruction> is very expensive. Although most variants are Copy-able, BrTable contains a Box<[Target]>, which means that copying an instruction always requires checking the variant, and you can only copy one at a time, serially. We should instead use an encoding like so:
// Note the addition of `Copy`
#[derive(Debug, Copy, Clone, PartialEq)]
enum Instruction {
    // ...
    BrTable { count: u32 },
    BrTableTarget(Target),
    // ...
}
This would mean that we can memcpy a vector of Instructions, which is much faster. It also reduces pointer chasing; see #136.
Finally, this is good preparation work for #100, since it means that decoding doesn't require allocating a Box<[Target]>.
It seems like it delegates the implementations to Rust's f32/f64::neg/abs, which might be wrong: according to https://github.com/sunfishcode/wasm-reference-manual/blob/master/WebAssembly.md#floating-point-negate, those instructions are supposed to be bitwise instructions that preserve the bits, so they can't be implemented as subtractions. However, f32/f64::neg generates the following LLVM IR:
define float @example::foo(float %x) unnamed_addr #0 {
  %0 = fsub float -0.000000e+00, %x
  ret float %0
}
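A bit-preserving negate would instead flip only the sign bit, for example:

```rust
// Wasm's `f32.neg`/`f64.neg` are pure bit manipulations: flip the sign
// bit and preserve everything else, including NaN payloads. A floating
// point subtraction is not guaranteed to preserve those bits.
fn wasm_f32_neg(x: f32) -> f32 {
    f32::from_bits(x.to_bits() ^ 0x8000_0000)
}

fn wasm_f64_neg(x: f64) -> f64 {
    f64::from_bits(x.to_bits() ^ 0x8000_0000_0000_0000)
}
```

Note that this also handles zero correctly (neg(+0.0) is -0.0), and a NaN goes through with its payload untouched.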
There are two problems, the way I see it:
- A Vec in an Rc in an Rc in a Vec. This can blow the cache unless we're extremely lucky with where the data gets allocated. Essentially, this means every function call requires up to 8 derefs which may be (and probably will be) on different cache lines. If we implemented TCO then tail calls would only require 4, but that's still bad.
- Since optimisers avoid function calls anyway and inline when possible (so the resultant wasm that we're executing should have minimal function calls), the impact of this is somewhat mitigated; but it's something that we can fix, so why not.
How do I run the wasmi example (tictactoe.rs) code?
I ran cargo test and everything is fine. Then I used cargo run to execute the .rs code, but it's not showing me any result.
What is the command to run the code?
I'm interested in fitting wasmi into small embedded applications, with footprints in the 32k-64k range. Currently, aggressively optimizing wasmi generates binaries in the 300-400k range. Would you be interested in taking patches that further reduce the size of the wasmi crate?
in addition to:
https://pepyakin.github.io/wasmi/wasmi/struct.ModuleRef.html#method.invoke_export
used in the following way
instance.invoke_export(
    "test",
    &[],
    &mut NopExternals,
).expect("failed to execute export")
something like
instance.export("test").with_externals(&mut NopExternals).with_args(&[]).invoke()
?
AGFzbQEAAAABJAhgAX8AYAF+AGABfQBgAXwAYAF/AGACf30AYAJ8fABgAX4BfgLZAhAIc3BlY3Rl
c3QJcHJpbnRfaTMyAAAIc3BlY3Rlc3QJcHJpbnRfaTMyAAAIc3BlY3Rlc3QJcHJpbnRfZjMyAAII
c3BlY3Rlc3QJcHJpbnRfZjY0AAMIc3BlY3Rlc3QNcHJpbnRfaTMyX2YzMgAFCHNwZWN0ZXN0DXBy
aW50X2Y2NF9mNjQABghzcGVjdGVzdAlwcmludF9pMzIAAAhzcGVjdGVzdAlwcmludF9mNjQAAwR0
ZXN0DWZ1bmMtaTY0LT5pNjQABwhzcGVjdGVzdAlwcmludF9pMzIAAAhzcGVjdGVzdAlwcmludF9p
MzIAAAhzcGVjdGVzdAlwcmludF9pMzIAAAhzcGVjdGVzdAlwcmludF9pMzIAAAhzcGVjdGVzdAlw
cmludF9pMzIAAAhzcGVjdGVzdAlwcmludF9pMzIABAhzcGVjdGVzdAlwcmludF9pMzIADAMDAgAB
BAUBcAECAgczCAJwMQAJAnAyAAoCcDMACwJwNAALAnA1AAwCcDYADQdwcmludDMyABAHcHJpbnQ2
NAARCQgBAEEACwIBAwpgAiwBAX0gALIhASAAEAAgAEEBakMAAChCEAQgABABIAAQBiABEAIgAEEA
EQAACzEBAXwgABAIuSEBIAFEAAAAAAAA8D+gRAAAAAAAgEpAEAUgARADIAEQByABQQERAwAL
wasmi: successful validation
wabt: 000018a: error: invalid import signature index
Depends on #98
Currently, a RuntimeValue is represented by a Rust enum.
As a refresher: a Rust enum requires space for the variant tag plus the size of the payload of the largest variant, which also needs to be properly aligned. Thus, a RuntimeValue takes 8 bytes for the payload (for 64-bit wide values) and another 8 bytes for the tag and alignment (on x86_64).
We can shave off 8 bytes by removing the tag. We can achieve this by using C-like unions to represent runtime values internally. This is possible because, after validation, it is statically guaranteed that each operation will be used with operands of the appropriate types (i.e., f32.* operators will always be used with f32 operands).
Shrinking RuntimeValue may potentially improve cache efficiency for the operand stack and remove the branching needed to check operand types.
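A sketch of the untagged representation (illustrative only; reading a union field is unsafe and relies entirely on validation guaranteeing the operand type):

```rust
use std::mem::size_of;

// Tagged representation: 16 bytes on x86_64 (8 payload + 8 tag/align).
#[derive(Clone, Copy)]
pub enum RuntimeValue {
    I32(i32),
    I64(i64),
    F32(f32),
    F64(f64),
}

// Untagged representation: 8 bytes. Usable internally because, after
// validation, each operation is statically guaranteed to see operands
// of the right type, so we always know which field to read.
#[derive(Clone, Copy)]
pub union UntaggedValue {
    pub i32: i32,
    pub i64: i64,
    pub f32: f32,
    pub f64: f64,
}

impl UntaggedValue {
    pub fn from_f32(v: f32) -> Self {
        UntaggedValue { f32: v }
    }

    // SAFETY: the caller (i.e. the interpreter, backed by validation)
    // must guarantee this slot actually holds an f32.
    pub unsafe fn as_f32(self) -> f32 {
        self.f32
    }
}
```

Halving the value size doubles how many operands fit per cache line on the value stack, which is where the cache-efficiency win would come from.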