cranestation / lightbeam

Lightbeam has moved and now lives in the Wasmtime repository!

Home Page: https://github.com/CraneStation/wasmtime

License: Apache License 2.0

Rust 99.97% WebAssembly 0.03%
codegen compiler jit rust wasm

lightbeam's People

Contributors

afinch7, eira-fransham, mrowqa, pepyakin, sstangl, sunfishcode, tiborvass

lightbeam's Issues

Lightbeam vs Cranelift and Faerie

I haven't used Cranelift, but I'm trying to understand how all the projects in CraneStation fit and are meant to be used together. I noticed that Cranelift has a component, faerie, that I think is intended to take Cranelift IR and create machine code. How does lightbeam compare? Is it intended to be a standalone tool, a component in a VM, or something else?

not yet implemented: We can't handle cycles in the register allocator

I tried to compile rustc_binary.wasm using Lightbeam-backed wasmtime, but it panicked with:

thread 'main' panicked at 'not yet implemented: We can't handle cycles in the register allocator: [(Reg(Rq(6)), Reg(Rq(2))), (Reg(Rq(2)), Reg(Rq(6)))]', lightbeam/src/backend.rs:4638:17
stack backtrace:
   0: std::sys::unix::backtrace::tracing::imp::unwind_backtrace
   1: std::sys_common::backtrace::_print
   2: std::panicking::default_hook::{{closure}}
   3: std::panicking::default_hook
   4: std::panicking::rust_panic_with_hook
   5: std::panicking::continue_panic_fmt
   6: std::panicking::begin_panic_fmt
   7: lightbeam::backend::Context<M>::pass_outgoing_args
   8: lightbeam::backend::Context<M>::call_direct_imported
   9: lightbeam::function_body::translate
  10: <wasmtime_environ::lightbeam::Lightbeam as wasmtime_environ::compilation::Compiler>::compile_module
  11: wasmtime_jit::compiler::Compiler::compile
  12: wasmtime_jit::instantiate::RawCompiledModule::new
  13: wasmtime_jit::instantiate::instantiate
  14: wasmtime_jit::context::Context::instantiate_module
  15: wasmtime::handle_module
  16: wasmtime::main
  17: std::rt::lang_start::{{closure}}
  18: std::panicking::try::do_call
  19: __rust_maybe_catch_panic
  20: std::rt::lang_start_internal
  21: main
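
The panic lists two register moves that form a cycle (Rq(6) to Rq(2) and Rq(2) to Rq(6)). One common way to handle this, sketched below, is to emit every move whose destination is no longer needed as a source, and to break whatever cycles remain with a swap such as xchg. This is only an illustration under assumptions: the names Op and resolve_parallel_moves are hypothetical, registers are plain numbers, each move is a (source, destination) pair, and no two moves write the same destination. It is not the fix that was applied in lightbeam.

```rust
/// One emitted instruction. Registers are plain numbers here, loosely
/// mirroring the `Rq(n)` operands in the panic message.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Op {
    Mov { src: u8, dst: u8 },
    Xchg { a: u8, b: u8 },
}

/// Turns a set of simultaneous `(src, dst)` moves into a sequence.
/// Assumption: no two moves share a destination.
fn resolve_parallel_moves(moves: &[(u8, u8)]) -> Vec<Op> {
    // Moves from a register to itself need no code.
    let mut pending: Vec<(u8, u8)> =
        moves.iter().copied().filter(|&(s, d)| s != d).collect();
    let mut out = Vec::new();

    while !pending.is_empty() {
        // A move is safe to emit once nothing pending still reads its destination.
        let ready = (0..pending.len()).find(|&i| {
            let dst = pending[i].1;
            pending.iter().all(|&(src, _)| src != dst)
        });

        match ready {
            Some(i) => {
                let (src, dst) = pending.remove(i);
                out.push(Op::Mov { src, dst });
            }
            None => {
                // Only cycles are left: swap one pair, then redirect the
                // remaining moves to where their source values now live.
                let (src, dst) = pending.remove(0);
                out.push(Op::Xchg { a: src, b: dst });
                for m in pending.iter_mut() {
                    if m.0 == dst {
                        m.0 = src;
                    } else if m.0 == src {
                        m.0 = dst;
                    }
                }
                pending.retain(|&(s, d)| s != d);
            }
        }
    }
    out
}
```

For the two moves in the panic message this yields a single exchange of the two registers.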

New stack value type: condition

This means we can avoid temporary variables in the vast majority of cases, where the test comes directly before a jump (emitting cmp ...; jnae ... instead of using setnae, for example). It introduces significant complexity, though, because we can't emit the cmp immediately: we have to delay it until the condition is either used or something else is pushed, since if the condition has to be stored to a register we have to zero that register before the cmp. This complexity will probably be confined to push, but it's still problematic.

EDIT: I just realised that we can use mov ..., 0 to zero the register, which doesn't affect flags.
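
A rough sketch of how such a condition value could sit on the compile-time value stack, assuming the mov-based zeroing from the EDIT so that the cmp can be emitted right away. Everything here is hypothetical (Cond, Value, Emitter, and the textual Intel-syntax assembly); it only models which instructions would be produced. A real backend would also have to materialize the condition before emitting anything that clobbers the flags.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Cond {
    Below,        // jb / setb (jnae is an alias of jb)
    AboveOrEqual, // jae / setae
    Equal,
    NotEqual,
}

impl Cond {
    fn suffix(self) -> &'static str {
        match self {
            Cond::Below => "b",
            Cond::AboveOrEqual => "ae",
            Cond::Equal => "e",
            Cond::NotEqual => "ne",
        }
    }
}

/// A value on the compile-time stack: in a register, or still in the flags.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Value {
    Reg(&'static str),
    Flags(Cond),
}

#[derive(Default)]
struct Emitter {
    stack: Vec<Value>,
    asm: Vec<String>,
}

impl Emitter {
    /// Emits the comparison and pushes the result as a flags-resident value.
    fn compare(&mut self, lhs: &str, rhs: &str, cond: Cond) {
        self.asm.push(format!("cmp {lhs}, {rhs}"));
        self.stack.push(Value::Flags(cond));
    }

    /// Forces a flags-resident top of stack into a register.
    /// `mov reg, 0` leaves the flags intact, unlike `xor reg, reg`.
    fn materialize(&mut self, reg32: &'static str, reg8: &'static str) {
        if let Some(Value::Flags(cond)) = self.stack.last().copied() {
            self.asm.push(format!("mov {reg32}, 0"));
            self.asm.push(format!("set{} {reg8}", cond.suffix()));
            *self.stack.last_mut().unwrap() = Value::Reg(reg32);
        }
    }

    /// Branches on the top of stack, using the flags directly when possible.
    fn br_if(&mut self, label: &str) {
        match self.stack.pop().expect("br_if needs a condition on the stack") {
            Value::Flags(cond) => self.asm.push(format!("j{} {label}", cond.suffix())),
            Value::Reg(reg) => {
                self.asm.push(format!("test {reg}, {reg}"));
                self.asm.push(format!("jnz {label}"));
            }
        }
    }
}
```

When the comparison is immediately followed by br_if, only cmp and a conditional jump are emitted; the mov/setcc pair appears only if materialize is called first.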

ctz and clz for 0 gives unexpected result on older cpus

Tests fail on older CPUs (Intel Sandy Bridge and older). For an input of 0, ctz and clz give 0 instead of the bit width.
CPUs tested: E5-2670 v1 (fail), i7-3820 (fail), i7-4790K (pass)

I would assume this has to do with the way that tzcnt and lzcnt are implemented on these CPUs.
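
For context, on CPUs that lack the BMI1 and LZCNT extensions the tzcnt and lzcnt encodings are executed as bsf and bsr, which leave the destination undefined when the input is 0 and, in the case of bsr, return a bit index rather than a count of leading zeros. The sketch below models a fallback in plain Rust to show the results that wasm expects. The helper names are hypothetical, and None stands in for the undefined hardware result.

```rust
/// Models `bsf`: index of the lowest set bit. The hardware result is
/// undefined for an input of 0, represented here as `None`.
fn bsf32(x: u32) -> Option<u32> {
    if x == 0 { None } else { Some(x.trailing_zeros()) }
}

/// Models `bsr`: index of the highest set bit, undefined for an input of 0.
fn bsr32(x: u32) -> Option<u32> {
    if x == 0 { None } else { Some(31 - x.leading_zeros()) }
}

/// wasm `i32.ctz`: the zero case must produce the bit width.
fn ctz32_fallback(x: u32) -> u32 {
    bsf32(x).unwrap_or(32)
}

/// wasm `i32.clz`: convert the bit index from `bsr` into a count of
/// leading zeros, and handle the zero case explicitly.
fn clz32_fallback(x: u32) -> u32 {
    match bsr32(x) {
        Some(index) => 31 - index,
        None => 32,
    }
}
```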

Milestone: Fibonacci function

Recursive Fibonacci is a nice next milestone because it needs some control flow operators and calls, which should really help establish the shape of the backend.

The steps are:

  • Implement if+else+end.
    • We'll need a stack for tracking nested control flow constructs. To start with there's only one kind of control-flow stack entry, for if/else/end, but eventually we'll have more. It can hold DynamicLabels to keep track of the labels for branching.
    • The result value of an if can be carried by the stack just like normal operator results. (For now; later with on-the-fly register allocation it can be more sophisticated.)
  • Implement calls
    • This is a good time to generalize the handling of function signatures. I think we can just collect all the FuncTypes from the type section into a Vec and pass that around for now.
    • Optionally, to keep things simple for this step, we could limit support to just 6 integer arguments, to avoid having to deal with stack arguments just yet.

I've left out a lot of the low-level details here; feel free to ask for more detail!
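
As a rough illustration of the control-flow stack described in the first step, here is a small model that emits textual assembly for if/else/end. It is a sketch under assumptions: Label stands in for the assembler's dynamic labels, Translator and its methods are hypothetical names, and the condition is assumed to already be in a register.

```rust
/// Stand-in for an assembler label.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct Label(u32);

/// Only one kind of control-flow entry for now, as described above.
#[derive(Debug)]
enum ControlFrame {
    If { else_label: Label, end_label: Label, seen_else: bool },
}

#[derive(Default)]
struct Translator {
    next_label: u32,
    control: Vec<ControlFrame>,
    asm: Vec<String>,
}

impl Translator {
    fn new_label(&mut self) -> Label {
        let label = Label(self.next_label);
        self.next_label += 1;
        label
    }

    /// `if`: skip to the else label when the condition is zero.
    fn op_if(&mut self, cond_reg: &str) {
        let else_label = self.new_label();
        let end_label = self.new_label();
        self.asm.push(format!("test {cond_reg}, {cond_reg}"));
        self.asm.push(format!("jz L{}", else_label.0));
        self.control.push(ControlFrame::If { else_label, end_label, seen_else: false });
    }

    /// `else`: the then-arm jumps over the else-arm, which starts here.
    fn op_else(&mut self) {
        let ControlFrame::If { else_label, end_label, seen_else } =
            self.control.last_mut().expect("else without if");
        *seen_else = true;
        let (else_label, end_label) = (*else_label, *end_label);
        self.asm.push(format!("jmp L{}", end_label.0));
        self.asm.push(format!("L{}:", else_label.0));
    }

    /// `end`: bind whichever labels have not been bound yet.
    fn op_end(&mut self) {
        let ControlFrame::If { else_label, end_label, seen_else } =
            self.control.pop().expect("end without if");
        if !seen_else {
            self.asm.push(format!("L{}:", else_label.0));
        }
        self.asm.push(format!("L{}:", end_label.0));
    }
}
```

Nested ifs work because each frame carries its own pair of labels and end always pops the innermost frame.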

Milestone: Compile a complete function

Let's start by working to compile this function into machine code. The steps are:

  • Handle local variable declarations -- for now, we'll assume the locals will be allocated in a contiguous memory region, so the task here is: using the information from the local declarations at the top of function_body::translate, to write a function that maps from local indices to offsets in that region, and to compute the needed size of that region.
  • Keep track of the stack pointer -- Move the backend's Registers struct into a backend Context, add a field recording the current depth of the stack pointer, and make push_i32 and pop_i32 subtract from and add to this field so that we always know where the stack pointer is (relative to where it started).
  • Implement get_local/set_local -- In function_body.rs, use the mapping we created above to get the offset within the locals area, then pass that to backend.rs to do the load or store. In the backend, this will look like mov offset(%rsp), Rq(op) where offset is the offset within the locals area plus the current stack depth.
  • Implement function entry -- For now, we can use this simple sequence:
	push   %rbp               # save incoming frame pointer
	mov    %rsp, %rbp         # copy stack pointer to frame pointer
	sub    $framesize, %rsp   # allocate the stack frame (the stack grows down)

where framesize is the size of the locals area.

  • Store the incoming function arguments into their slots in the locals area. If you're on Linux/Mac/etc., the first 4 arguments are in rdi, rsi, rdx, rcx. If you're on Windows, they're in rcx, rdx, r8, r9. Eventually we'll want to make the calling convention configurable, but it's ok to hardcode stuff to get started with.

  • Implement returns -- At the function exit, pop the last remaining i32, copy it into RAX, add the size of the locals area back to the stack pointer, pop the saved %rbp, then do a ret.

    • Then, at the end of function_body::translate, instead of just disassembling the output, return the compiled code, make examples/test.rs transmute the address to a function pointer and... call it!
    • Write a test case to test that "add" works :-).
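
The first three steps above (locals layout, stack-depth tracking, and the address used by get_local/set_local) can be modelled roughly as follows. This is a sketch under assumptions: every local gets a uniform 8-byte slot regardless of its type, push_i32/pop_i32 are taken to move the stack pointer by 8 bytes, and Locals, Context and local_rsp_offset are hypothetical names rather than the actual lightbeam types.

```rust
/// Assumption: one 8-byte slot per local, whatever its type.
const SLOT_SIZE: u32 = 8;

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum ValType {
    I32,
    I64,
}

/// Maps local indices to byte offsets within the locals area.
struct Locals {
    offsets: Vec<u32>,
    frame_size: u32,
}

impl Locals {
    /// `params` are the function arguments (locals 0..n); `decls` are the
    /// `(count, type)` groups from the local declarations.
    fn new(params: &[ValType], decls: &[(u32, ValType)]) -> Self {
        let declared: u32 = decls.iter().map(|&(count, _ty)| count).sum();
        let total = params.len() as u32 + declared;
        let offsets = (0..total).map(|i| i * SLOT_SIZE).collect();
        Locals { offsets, frame_size: total * SLOT_SIZE }
    }

    fn offset(&self, index: u32) -> Option<u32> {
        self.offsets.get(index as usize).copied()
    }
}

/// Backend state: the locals layout plus how far the stack pointer has
/// moved below the locals area since function entry.
struct Context {
    locals: Locals,
    stack_depth: u32,
}

impl Context {
    fn push_i32(&mut self) {
        self.stack_depth += SLOT_SIZE; // the stack grows down, depth grows
    }

    fn pop_i32(&mut self) {
        self.stack_depth -= SLOT_SIZE;
    }

    /// Offset to use in `mov offset(%rsp), ...` for a given local.
    fn local_rsp_offset(&self, index: u32) -> Option<u32> {
        Some(self.locals.offset(index)? + self.stack_depth)
    }
}
```

With this layout, framesize in the entry sequence is simply frame_size, and a local's rsp-relative offset changes whenever a value is pushed or popped.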

Once that milestone is achieved, the work can branch out a little bit. One set of tasks is implementing more integer arithmetic operations. Another is to add floating-point register support and, after that, floating-point arithmetic operations. Independently of those, the next big milestone will be to compile a Fibonacci function. Once we achieve this first milestone, I'll write up a new design issue for that :-).

For anyone interested in getting involved, welcome! Please post in this issue so that we can coordinate work.

Sandboxing loads and stores

This is the first JIT that I've worked on, so I don't know how one goes about same-process memory isolation without generating check-address-and-trap instructions. Obviously check-and-trap is viable, but I feel like it must be possible to jack into the OS's (and therefore, hardware's) memory protection mechanisms to get the same protections with better performance. I assume it works by calling into the operating system to set accessible memory regions before jumping into wasm code and then resetting the accessible regions afterwards or when calling into host functions, but how do you stop the wasm code from doing an i32.store onto the program counter (by using some method to guess the location of it) without also preventing it from writing to the stack?
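
One point that may help with the program-counter worry: wasm loads and stores take an offset into the module's linear memory rather than a native pointer, so as long as every access is confined to that region, the code has no way to name the native stack or a return address. The explicit check-and-trap baseline mentioned above can be sketched as below. The names i32_load and Trap are hypothetical, and the memory is modelled as a byte slice. Engines that rely on the OS instead typically reserve a larger virtual address range than the memory itself and leave the tail inaccessible, so that an out-of-range access faults in hardware; that variant is not shown here.

```rust
#[derive(Debug, PartialEq, Eq)]
enum Trap {
    OutOfBounds,
}

/// Models a wasm `i32.load` with an explicit bounds check.
/// `base` is the dynamic address operand, `offset` the static immediate.
fn i32_load(memory: &[u8], base: u32, offset: u32) -> Result<u32, Trap> {
    // Compute the effective address in 64 bits so that base + offset
    // cannot wrap around and slip past the check.
    let addr = u64::from(base) + u64::from(offset);
    let end = addr + 4;
    if end > memory.len() as u64 {
        return Err(Trap::OutOfBounds);
    }

    let start = addr as usize;
    let mut bytes = [0u8; 4];
    bytes.copy_from_slice(&memory[start..start + 4]);
    Ok(u32::from_le_bytes(bytes)) // wasm memory is little-endian
}
```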

Failed to build due to mismatched types

Hi ... I was attempting to walk through lightbeam's example when it failed to compile:

[image: screenshot of the build error]

I compiled with: cargo build --example test

This is a fresh checkout. Are there some requirements I am unaware of for building or is this a real issue?

Side Quest: Integer arithmetic

With #1 done, it's now straightforward to start work on the rest of the integer arithmetic opcodes as we can compile, execute, and test them. This can happen in the background, as #3 and the next few milestones won't depend on it. I'm filing it now to track it, and in case anyone else is interested in something to get started with.

I suggest doing the i32 and i64 versions of each instruction at the same time, because it's easy to do so on x86-64 :-). As always, feel free to ask questions!


The simple cases first. i32.add is already done (aside: should we rename add_i32 to i32_add?), so that's an example to work from. Use Rd for 32-bit operations and Rq for 64-bit operations.

  • add
  • sub
  • and
  • or
  • xor
  • mul (for now, use the register-register form of imul)

Next, comparisons. x86 is a little weird here because set<cc> can only write to 8-bit registers. So I suggest using xor REG, REG to zero out the result register first, and then using Rb(REG) with set<cc> to write the result on top of it.

  • eqz
  • eq
  • ne
  • lt_s
  • lt_u
  • le_s
  • le_u
  • gt_s
  • gt_u
  • ge_s
  • ge_u
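
A small sketch of the xor/set<cc> sequence suggested for the comparisons, emitting textual assembly. The mapping from wasm comparison to condition code follows the usual x86 signed/unsigned conditions. The names Cmp, setcc_suffix and emit_cmp are hypothetical, and the sketch assumes the destination register is different from both operands (otherwise the xor would destroy an input). Note that the xor has to come before the cmp, since xor itself writes the flags.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Cmp {
    Eq, Ne,
    LtS, LtU,
    LeS, LeU,
    GtS, GtU,
    GeS, GeU,
}

/// Condition-code suffix for `set<cc>` after `cmp lhs, rhs`.
fn setcc_suffix(op: Cmp) -> &'static str {
    match op {
        Cmp::Eq => "e",
        Cmp::Ne => "ne",
        Cmp::LtS => "l",  // signed less
        Cmp::LtU => "b",  // unsigned below
        Cmp::LeS => "le",
        Cmp::LeU => "be",
        Cmp::GtS => "g",
        Cmp::GtU => "a",
        Cmp::GeS => "ge",
        Cmp::GeU => "ae",
    }
}

/// Emits `dst = (lhs <op> rhs) ? 1 : 0`.
/// Assumption: `dst32`/`dst8` name the same register, distinct from the operands.
fn emit_cmp(op: Cmp, dst32: &str, dst8: &str, lhs: &str, rhs: &str) -> Vec<String> {
    vec![
        format!("xor {dst32}, {dst32}"), // zero first: xor clobbers the flags
        format!("cmp {lhs}, {rhs}"),
        format!("set{} {dst8}", setcc_suffix(op)),
    ]
}
```

eqz fits the same pattern with test reg, reg in place of the cmp and the e suffix.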

Shifts and rotates. These need their count operand in %cl, so we'll need a way to allocate that register specifically.

  • shl
  • shr_s
  • shr_u
  • rotl
  • rotr

Div/rem. These also need specific registers, and they can also trap. I suggest starting with simple conditional branches testing the trap conditions and using ud2 to do the traps for now.

  • div_s
  • div_u
  • rem_s
  • rem_u
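
For reference, the trap conditions that the generated branches would need to test can be written out in plain Rust. This sketch models the wasm semantics only, not the emitted code, and the function and Trap names are hypothetical. One detail worth keeping in mind: the idiv instruction faults on INT_MIN / -1 for both quotient and remainder, while wasm defines rem_s for that input as 0, so rem_s needs a special case rather than a trap.

```rust
#[derive(Debug, PartialEq, Eq)]
enum Trap {
    DivideByZero,
    IntegerOverflow,
}

/// wasm `i32.div_s`: traps on a zero divisor and on INT_MIN / -1.
fn i32_div_s(a: i32, b: i32) -> Result<i32, Trap> {
    if b == 0 {
        Err(Trap::DivideByZero)
    } else if a == i32::MIN && b == -1 {
        Err(Trap::IntegerOverflow)
    } else {
        Ok(a / b)
    }
}

/// wasm `i32.rem_s`: traps only on a zero divisor; INT_MIN % -1 is 0.
fn i32_rem_s(a: i32, b: i32) -> Result<i32, Trap> {
    if b == 0 {
        Err(Trap::DivideByZero)
    } else {
        Ok(a.wrapping_rem(b))
    }
}

/// wasm `i32.div_u`: traps only on a zero divisor.
fn i32_div_u(a: u32, b: u32) -> Result<u32, Trap> {
    a.checked_div(b).ok_or(Trap::DivideByZero)
}

/// wasm `i32.rem_u`: traps only on a zero divisor.
fn i32_rem_u(a: u32, b: u32) -> Result<u32, Trap> {
    a.checked_rem(b).ok_or(Trap::DivideByZero)
}
```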

These are easy if you have sufficiently new CPUs :-). At this step, we'll need to figure out how we want to handle subtarget features.

  • clz
  • ctz
  • popcnt
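
One possible shape for the subtarget-feature handling, sketched with the standard library's runtime feature detection. The Features struct, the Lowering enum and the fallback descriptions are hypothetical and only indicate where a choice would be made; on non-x86 hosts the sketch simply reports no features.

```rust
/// The CPU features relevant to these three instructions.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct Features {
    lzcnt: bool,
    bmi1: bool, // provides tzcnt
    popcnt: bool,
}

#[cfg(any(target_arch = "x86", target_arch = "x86_64"))]
fn detect_host() -> Features {
    Features {
        lzcnt: is_x86_feature_detected!("lzcnt"),
        bmi1: is_x86_feature_detected!("bmi1"),
        popcnt: is_x86_feature_detected!("popcnt"),
    }
}

#[cfg(not(any(target_arch = "x86", target_arch = "x86_64")))]
fn detect_host() -> Features {
    Features { lzcnt: false, bmi1: false, popcnt: false }
}

/// What the backend would emit for an operator.
#[derive(Debug, PartialEq, Eq)]
enum Lowering {
    Native(&'static str),
    Fallback(&'static str),
}

fn lower_clz(f: Features) -> Lowering {
    if f.lzcnt {
        Lowering::Native("lzcnt")
    } else {
        Lowering::Fallback("bsr plus an explicit zero check")
    }
}

fn lower_ctz(f: Features) -> Lowering {
    if f.bmi1 {
        Lowering::Native("tzcnt")
    } else {
        Lowering::Fallback("bsf plus an explicit zero check")
    }
}

fn lower_popcnt(f: Features) -> Lowering {
    if f.popcnt {
        Lowering::Native("popcnt")
    } else {
        Lowering::Fallback("bit-twiddling sequence")
    }
}
```

Passing a Features value around, rather than querying the host at each use, would also leave room for compiling for a CPU other than the one doing the compilation.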
