qmonnet / rbpf

Rust virtual machine and JIT compiler for eBPF programs

License: Apache License 2.0

Rust 99.61% Batchfile 0.39%
jit-compiler ebpf ebpf-programs rust interpreter packet-filtering assembler bpf

rbpf's Introduction

rbpf

Rust (user-space) virtual machine for eBPF

Description

This crate contains a virtual machine for eBPF program execution. BPF, as in Berkeley Packet Filter, is an assembly-like language initially developed for BSD systems, in order to filter packets in the kernel with tools such as tcpdump so as to avoid useless copies to user-space. It was ported to Linux, where it evolved into eBPF (extended BPF), a faster version with more features. While BPF programs were originally intended to run in the kernel, the virtual machine of this crate enables running them in user-space applications; it contains an interpreter, an x86_64 JIT-compiler for eBPF programs, as well as a disassembler.

It is based on Rich Lane's uBPF software, which does nearly the same, but is written in C.

The crate is supposed to compile and run on Linux, MacOS X, and Windows, although the JIT-compiler does not work with Windows at this time.

Link to the crate

This crate is available from crates.io, so it should work out of the box by adding it as a dependency in your Cargo.toml file:

[dependencies]
rbpf = "0.2.0"

You can also use the development version from this GitHub repository. This should be as simple as putting this inside your Cargo.toml:

[dependencies]
rbpf = { git = "https://github.com/qmonnet/rbpf" }

Of course, if you prefer, you can clone it locally, possibly hack the crate, and then indicate the path of your local version in Cargo.toml:

[dependencies]
rbpf = { path = "path/to/rbpf" }

Then indicate in your source code that you want to use the crate:

extern crate rbpf;

API

The API is pretty well documented inside the source code. You should also be able to access an online version of the documentation from here, automatically generated from the crates.io version (may not be up-to-date with the main branch). Examples and unit tests should also prove helpful. Here is a summary of how to use the crate.

Here are the steps to follow to run an eBPF program with rbpf:

  1. Create a virtual machine. There are several kinds of machines; we will come back to this later. When creating the VM, pass the eBPF program as an argument to the constructor.
  2. If you want to use some helper functions, register them into the virtual machine.
  3. If you want a JIT-compiled program, compile it.
  4. Execute your program: either run the interpreter or call the JIT-compiled function.

eBPF was initially designed to filter packets (now it has some other hooks in the Linux kernel, such as kprobes, but this is not covered by rbpf). As a consequence, most of the load and store instructions of the program are performed on a memory area representing the packet data. However, in the Linux kernel, the eBPF program does not immediately access this data area: initially, it has access to a C struct sk_buff instead, which is a buffer containing metadata about the packet, including the memory addresses of the beginning and of the end of the packet data area. So the program first loads those pointers from the sk_buff, and then can access the packet data.

This behavior can be replicated with rbpf, but it is not mandatory. For this reason, we have several structs representing different kinds of virtual machines:

  • struct EbpfVmMbuff mimics the kernel. When the program is run, the address provided to its first eBPF register will be the address of a metadata buffer provided by the user, which is expected to contain pointers to the start and the end of the packet data memory area.

  • struct EbpfVmFixedMbuff has one purpose: enabling the execution of programs created to be compatible with the kernel, while saving the effort to manually handle the metadata buffer for the user. In fact, this struct has a static internal buffer that is passed to the program. The user has to indicate the offset values at which the eBPF program expects to find the start and the end of packet data in the buffer. On calling the function that runs the program (JITted or not), the struct automatically updates the addresses in this static buffer, at the appointed offsets, for the start and the end of the packet data the program is called upon.

  • struct EbpfVmRaw is for programs that want to run directly on packet data. No metadata buffer is involved, the eBPF program directly receives the address of the packet data in its first register. This is the behavior of uBPF.

  • struct EbpfVmNoData does not take any data. The eBPF program takes no argument whatsoever and its return value is deterministic. Not so sure there is a valid use case for that, but if nothing else, this is very useful for unit tests.

All these structs implement the same public functions:

// called with EbpfVmMbuff:: prefix
pub fn new(prog: Option<&'a [u8]>) -> Result<EbpfVmMbuff<'a>, Error>

// called with EbpfVmFixedMbuff:: prefix
pub fn new(prog: Option<&'a [u8]>,
           data_offset: usize,
           data_end_offset: usize) -> Result<EbpfVmFixedMbuff<'a>, Error>

// called with EbpfVmRaw:: prefix
pub fn new(prog: Option<&'a [u8]>) -> Result<EbpfVmRaw<'a>, Error>

// called with EbpfVmNoData:: prefix
pub fn new(prog: Option<&'a [u8]>) -> Result<EbpfVmNoData<'a>, Error>

This is used to create a new instance of a VM. The return type depends on the struct from which the function is called. For instance, rbpf::EbpfVmRaw::new(Some(my_program)) would return an instance of struct rbpf::EbpfVmRaw (wrapped in a Result). When a program is loaded, it is checked with a very simple verifier (nothing close to the one in the Linux kernel). Users can also replace it with a custom verifier.

For struct EbpfVmFixedMbuff, two additional arguments must be passed to the constructor: data_offset and data_end_offset. They are the offset (byte number) at which the pointers to the beginning and to the end, respectively, of the memory area of packet data are to be stored in the internal metadata buffer each time the program is executed. Other structs do not use this mechanism and do not need those offsets.
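To make the role of these two offsets more concrete, here is a minimal standalone sketch of the idea. It does not use rbpf: store_packet_bounds and load_u64 are hypothetical names, and native byte order is assumed. The first function writes the start and end addresses of a packet at two offsets of a metadata buffer, and the second reads one of them back, as an eBPF program would do with a 64-bit load.

```rust
/// Write the start and end addresses of `mem` into `mbuff`, at the given byte
/// offsets. Returns None if an offset does not leave room for 8 bytes.
/// Hypothetical helper, not part of rbpf.
fn store_packet_bounds(mbuff: &mut [u8], data_offset: usize,
                       data_end_offset: usize, mem: &[u8]) -> Option<()> {
    let start = mem.as_ptr() as u64;
    let end = start + mem.len() as u64;
    mbuff.get_mut(data_offset..data_offset.checked_add(8)?)?
        .copy_from_slice(&start.to_ne_bytes());
    mbuff.get_mut(data_end_offset..data_end_offset.checked_add(8)?)?
        .copy_from_slice(&end.to_ne_bytes());
    Some(())
}

/// Read back the 64-bit value stored at `offset` in the metadata buffer.
fn load_u64(mbuff: &[u8], offset: usize) -> Option<u64> {
    let bytes = mbuff.get(offset..offset.checked_add(8)?)?;
    let mut buf = [0u8; 8];
    buf.copy_from_slice(bytes);
    Some(u64::from_ne_bytes(buf))
}
```

With a 32-byte buffer and offsets 8 and 24, load_u64(&mbuff, 8) returns the address of the first packet byte, and the value at offset 24 exceeds it by the packet length.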

// for struct EbpfVmMbuff, struct EbpfVmRaw and struct EbpfVmNoData
pub fn set_program(&mut self, prog: &'a [u8]) -> Result<(), Error>

// for struct EbpfVmFixedMbuff
pub fn set_program(&mut self, prog: &'a [u8],
                data_offset: usize,
                data_end_offset: usize) -> Result<(), Error>

You can use for example my_vm.set_program(my_program); to change the loaded program after the VM instance creation. This program is checked with the verifier attached to the VM. The verifying function of the VM can be changed at any moment.

pub type Verifier = fn(prog: &[u8]) -> Result<(), Error>;

pub fn set_verifier(&mut self,
                    verifier: Verifier) -> Result<(), Error>

Note that if a program has already been loaded into the VM, setting a new verifier also immediately runs it on the loaded program. However, the verifier is not run if no program has been loaded (if None was passed to the new() method when creating the VM).
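As an illustration, a custom verifier could look like the sketch below. It is standalone and makes two assumptions: that Error is std::io::Error, and that a trivial policy is enough (the program length is a non-zero multiple of 8 bytes and the last instruction is exit, opcode 0x95). The name my_verifier is hypothetical.

```rust
use std::io::{Error, ErrorKind};

/// Size of one eBPF instruction, in bytes.
const INSN_SIZE: usize = 8;
/// Opcode of the `exit` instruction.
const EXIT: u8 = 0x95;

/// Hypothetical custom verifier with the same shape as the Verifier type.
/// It only checks the program length and that the last instruction is exit.
fn my_verifier(prog: &[u8]) -> Result<(), Error> {
    if prog.is_empty() || prog.len() % INSN_SIZE != 0 {
        return Err(Error::new(ErrorKind::Other,
                              "program length must be a non-zero multiple of 8"));
    }
    if prog[prog.len() - INSN_SIZE] != EXIT {
        return Err(Error::new(ErrorKind::Other,
                              "program does not end with exit"));
    }
    Ok(())
}
```

A function of this shape is what set_verifier() expects as its argument.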

pub type Helper = fn (u64, u64, u64, u64, u64) -> u64;

pub fn register_helper(&mut self,
                       key: u32,
                       function: Helper) -> Result<(), Error>

This function is used to register a helper function. The VM stores its registered helpers in a hashmap, so the key can be any u32 value you want. It may be useful for programs that should be compatible with the Linux kernel and therefore must use specific helper numbers.
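The following standalone sketch shows how a table of helpers keyed by arbitrary u32 values can work. It is not the rbpf implementation: HelperTable, register, call and add_two are hypothetical names, and only the Helper type alias mirrors the signature shown above.

```rust
use std::collections::HashMap;

/// Same shape as the Helper type above.
type Helper = fn(u64, u64, u64, u64, u64) -> u64;

/// Hypothetical registry, standing in for the helper table of a VM.
struct HelperTable {
    helpers: HashMap<u32, Helper>,
}

impl HelperTable {
    fn new() -> Self {
        HelperTable { helpers: HashMap::new() }
    }

    /// Register `function` under `key`, replacing any previous helper.
    fn register(&mut self, key: u32, function: Helper) {
        self.helpers.insert(key, function);
    }

    /// Call the helper registered under `key`, if any.
    fn call(&self, key: u32, args: [u64; 5]) -> Option<u64> {
        let f = self.helpers.get(&key)?;
        Some(f(args[0], args[1], args[2], args[3], args[4]))
    }
}

/// Example helper: returns the sum of its first two arguments.
fn add_two(a: u64, b: u64, _c: u64, _d: u64, _e: u64) -> u64 {
    a.wrapping_add(b)
}
```

Since the key is only a hashmap index, nothing forces helper numbers to be contiguous, which is what allows reusing the numbers defined by the Linux kernel.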

// for struct EbpfVmMbuff
pub fn execute_program(&self,
                 mem: &'a mut [u8],
                 mbuff: &'a mut [u8]) -> Result<(u64), Error>

// for struct EbpfVmFixedMbuff and struct EbpfVmRaw
pub fn execute_program(&self,
                 mem: &'a mut [u8]) -> Result<(u64), Error>

// for struct EbpfVmNoData
pub fn execute_program(&self) -> Result<(u64), Error>

Interprets the loaded program. The function takes a reference to the packet data and the metadata buffer, or only to the packet data, or nothing at all, depending on the kind of the VM used. The value returned is the result of the eBPF program.

pub fn jit_compile(&mut self) -> Result<(), Error>

JIT-compile the loaded program, for x86_64 architecture. If the program is to use helper functions, they must be registered into the VM before this function is called. The generated assembly function is internally stored in the VM.

// for struct EbpfVmMbuff
pub unsafe fn execute_program_jit(&self, mem: &'a mut [u8],
                            mbuff: &'a mut [u8]) -> Result<(u64), Error>

// for struct EbpfVmFixedMbuff and struct EbpfVmRaw
pub unsafe fn execute_program_jit(&self, mem: &'a mut [u8]) -> Result<(u64), Error>

// for struct EbpfVmNoData
pub unsafe fn execute_program_jit(&self) -> Result<(u64), Error>

Calls the JIT-compiled program. The arguments to provide are the same as for execute_program(), again depending on the kind of VM that is used. The result of the JIT-compiled program should be the same as with the interpreter, but it should run faster. Note that if errors occur during the program execution, the JIT-compiled version does not handle them as well as the interpreter, and the program may crash. For this reason, the functions are marked as unsafe.

Example uses

Simple example

This comes from the unit test test_vm_add.

extern crate rbpf;

fn main() {

    // This is the eBPF program, in the form of bytecode instructions.
    let prog = &[
        0xb4, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, // mov32 r0, 0
        0xb4, 0x01, 0x00, 0x00, 0x02, 0x00, 0x00, 0x00, // mov32 r1, 2
        0x04, 0x00, 0x00, 0x00, 0x01, 0x00, 0x00, 0x00, // add32 r0, 1
        0x0c, 0x10, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, // add32 r0, r1
        0x95, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00  // exit
    ];

    // Instantiate a struct EbpfVmNoData. This is an eBPF VM for programs that
    // take no packet data as argument.
    // The eBPF program is passed to the constructor.
    let vm = rbpf::EbpfVmNoData::new(Some(prog)).unwrap();

    // Execute (interpret) the program. No argument required for this VM.
    assert_eq!(vm.execute_program().unwrap(), 0x3);
}
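To show what interpreting this bytecode amounts to, here is a minimal standalone sketch that handles only the four opcodes used in the example above (mov32 and add32 with an immediate, add32 with a register, and exit). It is not the rbpf interpreter, the function name run is hypothetical, and a little-endian instruction encoding is assumed.

```rust
/// Interpret a tiny subset of eBPF. Returns the value of r0 on exit, or None
/// on an unsupported opcode, a bad register index, or a missing exit.
fn run(prog: &[u8]) -> Option<u64> {
    let mut reg = [0u64; 11]; // r0 to r10
    // No jump is supported, so instructions simply run in order.
    for insn in prog.chunks_exact(8) {
        let opc = insn[0];
        let dst = (insn[1] & 0x0f) as usize; // low nibble of byte 1
        let src = (insn[1] >> 4) as usize;   // high nibble of byte 1
        let imm = i32::from_le_bytes([insn[4], insn[5], insn[6], insn[7]]);
        if dst > 10 || src > 10 {
            return None;
        }
        match opc {
            // mov32 dst, imm
            0xb4 => reg[dst] = imm as u32 as u64,
            // add32 dst, imm
            0x04 => reg[dst] = (reg[dst] as u32).wrapping_add(imm as u32) as u64,
            // add32 dst, src
            0x0c => reg[dst] = (reg[dst] as u32).wrapping_add(reg[src] as u32) as u64,
            // exit
            0x95 => return Some(reg[0]),
            _ => return None,
        }
    }
    None
}
```

Called on the five instructions of the example, run returns Some(3), the same value the assertion above expects from the VM.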

With JIT, on packet data

This comes from the unit test test_jit_ldxh.

extern crate rbpf;

fn main() {
    let prog = &[
        0x71, 0x10, 0x02, 0x00, 0x00, 0x00, 0x00, 0x00, // ldxb r0, [r1+2]
        0x95, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00  // exit
    ];

    // Let's use some data.
    let mem = &mut [
        0xaa, 0xbb, 0x11, 0xcc, 0xdd
    ];

    // This is an eBPF VM for programs reading from a given memory area (it
    // directly reads from packet data)
    let mut vm = rbpf::EbpfVmRaw::new(Some(prog)).unwrap();

    #[cfg(windows)] {
        assert_eq!(vm.execute_program(mem).unwrap(), 0x11);
    }
    #[cfg(not(windows))] {
        // This time we JIT-compile the program.
        vm.jit_compile().unwrap();

        // Then we execute it. For this kind of VM, a reference to the packet
        // data must be passed to the function that executes the program.
        unsafe { assert_eq!(vm.execute_program_jit(mem).unwrap(), 0x11); }
    }
}

Using a metadata buffer

This comes from the unit test test_jit_mbuff and derives from the unit test test_jit_ldxh.

extern crate rbpf;

fn main() {
    let prog = &[
        // Load mem from mbuff at offset 8 into R1
        0x79, 0x11, 0x08, 0x00, 0x00, 0x00, 0x00, 0x00,
        // ldxh r0, [r1+2]
        0x69, 0x10, 0x02, 0x00, 0x00, 0x00, 0x00, 0x00,
        // exit
        0x95, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00
    ];
    let mem = &mut [
        0xaa, 0xbb, 0x11, 0x22, 0xcc, 0xdd
    ];

    // Just for the example we create our metadata buffer from scratch, and
    // we store the pointers to packet data start and end in it.
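    let mbuff = &mut [0u8; 32];
    unsafe {
        // A byte array gives no alignment guarantee for u64 values, hence the
        // unaligned writes, through pointers obtained with as_mut_ptr().
        let data     = mbuff.as_mut_ptr().offset(8)  as *mut u64;
        let data_end = mbuff.as_mut_ptr().offset(24) as *mut u64;
        data.write_unaligned(mem.as_ptr() as u64);
        data_end.write_unaligned(mem.as_ptr() as u64 + mem.len() as u64);
    }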

    // This eBPF VM is for programs that use a metadata buffer.
    let mut vm = rbpf::EbpfVmMbuff::new(Some(prog)).unwrap();

    #[cfg(windows)] {
        assert_eq!(vm.execute_program(mem, mbuff).unwrap(), 0x2211);
    }
    #[cfg(not(windows))] {
        // Here again we JIT-compile the program.
        vm.jit_compile().unwrap();

        // Here we must provide both a reference to the packet data, and to the
        // metadata buffer we use.
        unsafe {
            assert_eq!(vm.execute_program_jit(mem, mbuff).unwrap(), 0x2211);
        }
    }
}

Loading code from an object file; and using a virtual metadata buffer

This comes from unit test test_vm_block_port.

This example requires the following additional crates; you may have to add them to your Cargo.toml file.

[dependencies]
rbpf = "0.2.0"
elf = "0.0.10"

It also uses a kind of VM with an internal buffer that simulates the sk_buff used by eBPF programs in the kernel, without having to manually create a new buffer for each packet. It may be useful for programs compiled for the kernel that assume the data they receive is a sk_buff pointing to the packet data start and end addresses. So here we just provide the offsets at which the eBPF program expects to find those pointers, and the VM handles the buffer update so that we only have to provide a reference to the packet data for each run of the program.

extern crate elf;
use std::path::PathBuf;

extern crate rbpf;
use rbpf::helpers;

fn main() {
    // Load a program from an ELF file, e.g. compiled from C to eBPF with
    // clang/LLVM. Some minor modification to the bytecode may be required.
    let filename = "examples/load_elf__block_a_port.o";

    let path = PathBuf::from(filename);
    let file = match elf::File::open_path(&path) {
        Ok(f) => f,
        Err(e) => panic!("Error: {:?}", e),
    };

    // Here we assume the eBPF program is in the ELF section called
    // ".classifier".
    let text_scn = match file.get_section(".classifier") {
        Some(s) => s,
        None => panic!("Failed to look up .classifier section"),
    };

    let prog = &text_scn.data;

    // This is our data: a real packet, starting with Ethernet header
    let packet = &mut [
        0x01, 0x23, 0x45, 0x67, 0x89, 0xab,
        0xfe, 0xdc, 0xba, 0x98, 0x76, 0x54,
        0x08, 0x00,             // ethertype
        0x45, 0x00, 0x00, 0x3b, // start ip_hdr
        0xa6, 0xab, 0x40, 0x00,
        0x40, 0x06, 0x96, 0x0f,
        0x7f, 0x00, 0x00, 0x01,
        0x7f, 0x00, 0x00, 0x01,
        0x99, 0x99, 0xc6, 0xcc, // start tcp_hdr
        0xd1, 0xe5, 0xc4, 0x9d,
        0xd4, 0x30, 0xb5, 0xd2,
        0x80, 0x18, 0x01, 0x56,
        0xfe, 0x2f, 0x00, 0x00,
        0x01, 0x01, 0x08, 0x0a, // start data
        0x00, 0x23, 0x75, 0x89,
        0x00, 0x23, 0x63, 0x2d,
        0x71, 0x64, 0x66, 0x73,
        0x64, 0x66, 0x0a
    ];

    // This is an eBPF VM for programs using a virtual metadata buffer, similar
    // to the sk_buff that eBPF programs use with tc and in Linux kernel.
    // We must provide the offsets at which the pointers to packet data start
    // and end must be stored: these are the offsets at which the program will
    // load the packet data from the metadata buffer.
    let mut vm = rbpf::EbpfVmFixedMbuff::new(Some(prog), 0x40, 0x50).unwrap();

    // We register a helper function, that can be called by the program, into
    // the VM.
    vm.register_helper(helpers::BPF_TRACE_PRINTK_IDX,
                       helpers::bpf_trace_printf).unwrap();

    // This kind of VM takes a reference to the packet data, but does not need
    // any reference to the metadata buffer: a fixed buffer is handled
    // internally by the VM.
    let res = vm.execute_program(packet).unwrap();
    println!("Program returned: {:?} ({:#x})", res, res);
}

Building eBPF programs

Besides passing the raw hexadecimal codes for building eBPF programs, two other methods are available.

Assembler

The first method consists in using the assembler provided by the crate.

extern crate rbpf;
use rbpf::assembler::assemble;

let prog = assemble("add64 r1, 0x605
                     mov64 r2, 0x32
                     mov64 r1, r0
                     be16 r0
                     neg64 r2
                     exit").unwrap();

println!("{:?}", prog);

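The above snippet produces the following bytecode. It is shown here in hexadecimal for readability; since the Result is unwrapped, println!("{:?}", prog) itself prints the same bytes as a vector of decimal values:

[0x07, 0x01, 0x00, 0x00, 0x05, 0x06, 0x00, 0x00,
 0xb7, 0x02, 0x00, 0x00, 0x32, 0x00, 0x00, 0x00,
 0xbf, 0x01, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
 0xdc, 0x00, 0x00, 0x00, 0x10, 0x00, 0x00, 0x00,
 0x87, 0x02, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
 0x95, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00]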

Conversely, a disassembler is also available to dump instruction names from bytecode in a human-friendly format.

extern crate rbpf;
use rbpf::disassembler::disassemble;

let prog = &[
    0x07, 0x01, 0x00, 0x00, 0x05, 0x06, 0x00, 0x00,
    0xb7, 0x02, 0x00, 0x00, 0x32, 0x00, 0x00, 0x00,
    0xbf, 0x01, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
    0xdc, 0x00, 0x00, 0x00, 0x10, 0x00, 0x00, 0x00,
    0x87, 0x02, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
    0x95, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00
];

disassemble(prog);

This will produce the following output:

add64 r1, 0x605
mov64 r2, 0x32
mov64 r1, r0
be16 r0
neg64 r2
exit

Please refer to source code and tests for the syntax and the list of instruction names.
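For readers who want to relate the bytes to the instruction names, the sketch below splits bytecode into the fields of each 8-byte instruction: opcode, destination and source registers, offset, and immediate. It is a standalone illustration that assumes a little-endian encoding, as in the examples of this document; the Insn struct and the decode function are hypothetical names, not the crate's API.

```rust
/// One decoded eBPF instruction. Field names are illustrative.
#[derive(Debug, PartialEq)]
struct Insn {
    opc: u8,  // opcode
    dst: u8,  // destination register (low nibble of byte 1)
    src: u8,  // source register (high nibble of byte 1)
    off: i16, // signed offset, little-endian
    imm: i32, // signed immediate, little-endian
}

/// Split bytecode into 8-byte instructions; trailing bytes are ignored.
fn decode(prog: &[u8]) -> Vec<Insn> {
    prog.chunks_exact(8).map(|b| Insn {
        opc: b[0],
        dst: b[1] & 0x0f,
        src: b[1] >> 4,
        off: i16::from_le_bytes([b[2], b[3]]),
        imm: i32::from_le_bytes([b[4], b[5], b[6], b[7]]),
    }).collect()
}
```

For instance, the first instruction above (0x07, 0x01, 0x00, 0x00, 0x05, 0x06, 0x00, 0x00) decodes to opcode 0x07 with dst 1 and imm 0x605, which matches add64 r1, 0x605.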

Building API

The other way to build programs is to chain commands from the instruction builder API. It looks less like assembly, maybe more like high-level functions. What's sure is that the result is more verbose, but if you prefer to build programs this way, it works just as well. If we take again the same sample as above, it would be constructed as follows.

extern crate rbpf;
use rbpf::insn_builder::*;

let mut program = BpfCode::new();
program.add(Source::Imm, Arch::X64).set_dst(1).set_imm(0x605).push()
       .mov(Source::Imm, Arch::X64).set_dst(2).set_imm(0x32).push()
       .mov(Source::Reg, Arch::X64).set_src(0).set_dst(1).push()
       .swap_bytes(Endian::Big).set_dst(0).set_imm(0x10).push()
       .negate(Arch::X64).set_dst(2).push()
       .exit().push();

Again, please refer to the source and related tests to get more information and examples on how to use it.

Feedback welcome!

This is the author's first try at writing Rust code. He learned a lot in the process, but there remains a feeling that this crate has a kind of C-ish style in some places instead of the Rusty look the author would like it to have. So feedback (or PRs) are welcome, including about ways you might see to take better advantage of Rust features.

Note that the project expects new commits to be covered by the Developer's Certificate of Origin. When contributing Pull Requests, please sign off your commits accordingly.

Questions / Answers

Why implement an eBPF virtual machine in Rust?

As of this writing, there is no particular use case for this crate, to the best of the author's knowledge. The author happens to work with BPF on Linux and to know how uBPF works, and he wanted to learn and experiment with Rust, no more than that.

What are the differences with uBPF?

Other than the language, obviously? Well, there are some differences:

  • Some constants, such as the maximum length for programs or the length for the stack, differ between uBPF and rbpf. The latter uses the same values as the Linux kernel, while uBPF has its own values.

  • When an error occurs while a program is run by uBPF, the function running the program silently returns the maximum value as an error code, while rbpf returns Rust type Error.

  • The registration of helper functions, that can be called from within an eBPF program, is not handled in the same way.

  • The distinct structs that allow running programs either on packet data, or with a metadata buffer (simulated or not), are specific to rbpf.

  • As for performance: theoretically the JITted programs are expected to run at the same speed, while the C interpreter of uBPF should go slightly faster than rbpf. But this has not been measured yet. Benchmarking both programs would be an interesting thing to do.

Can I use it with the “classic” BPF (a.k.a cBPF) version?

No. This crate only works with extended BPF (eBPF) programs. For cBPF programs, such as used by tcpdump (as of this writing) for example, you may be interested in the bpfjit crate written by Alexander Polakov instead.

What functionalities are implemented?

Running and JIT-compiling eBPF programs work. There is also a mechanism to register user-defined helper functions. The eBPF implementation of the Linux kernel comes with some additional features: a high number of helpers, several kinds of maps, tail calls.

  • Additional helpers should be easy to add, but very few of the existing Linux helpers have been replicated in rbpf so far.

  • Tail calls (“long jumps” from an eBPF program into another) are not implemented. This is probably not trivial to design and implement.

  • The interaction with maps is done through the use of specific helpers, so this should not be difficult to add. The maps themselves can reuse the maps in the kernel (if on Linux), to communicate with in-kernel eBPF programs for instance; or they can be handled in user space. Rust has arrays and hashmaps, so their implementation should be pretty straightforward (and may be added to rbpf in the future).

What about program validation?

The “verifier” of this crate is very short and has nothing to do with the kernel verifier, which means that it accepts programs that may not be safe. On the other hand, you probably do not run this in a kernel here, so it will not crash your system. Implementing a verifier similar to the one in the kernel is not trivial, and we cannot “copy” it since it is under GPL license.

What about safety then?

Rust has a strong emphasis on safety. Yet to have the eBPF VM work, some unsafe blocks of code are used. The VM, taken as an eBPF interpreter, can return an error but should not crash. Please file an issue otherwise.

As for the JIT-compiler, it is a different story, since runtime memory checks are more complicated to implement in assembly. It will crash if your JIT-compiled program tries to perform unauthorized memory accesses. Usually, it could be a good idea to test your program with the interpreter first.

Oh, and if your program has infinite loops, even with the interpreter, you're on your own.

Caveats

  • This crate is under development and the API may be subject to change.

  • The JIT compiler produces an unsafe program: memory accesses are not checked at runtime (yet). Use with caution.

  • A small number of eBPF instructions have not been implemented yet. This should not be a problem for the majority of eBPF programs.

  • Beware of turnips. Turnips are disgusting.

To do list

  • Implement some traits (Clone, Drop, Debug are good candidates).
  • Provide built-in support for user-space array and hash BPF maps.
  • Improve safety of JIT-compiled programs with runtime memory checks.
  • Add helpers (some of those supported in the kernel, such as checksum update, could be helpful).
  • Improve verifier. Could we find a way to directly support programs compiled with clang?
  • Maybe one day, tail calls?
  • JIT-compilers for other architectures?

License

Following the effort of the Rust language project itself in order to ease integration with other projects, the rbpf crate is distributed under the terms of both the MIT license and the Apache License (Version 2.0).

See LICENSE-APACHE and LICENSE-MIT for details.

Inspired by

  • uBPF, a C user-space implementation of an eBPF virtual machine, with a JIT-compiler and disassembler (and also including the assembler from the human-readable form of the instructions, such as in mov r0, 0x1337), by Rich Lane for Big Switch Networks (2015)

  • Building a simple JIT in Rust, by Sophia Turner (2015)

  • bpfjit (also on crates.io), a Rust crate exporting the cBPF JIT compiler from FreeBSD 10 tree to Rust, by Alexander Polakov (2016)

Other resources

Contributors

60ke, afonso360, alan-jowett, badboy, ipuustin, jackcmay, nanxiao, pcy190, qmonnet, rlane, saethlin, ttlajus, waywardmonkeys, yihuaf

rbpf's Issues

Assembler

Please, do not start working on this, someone already asked me by email to start the work on the assembler.

New feature: an assembler similar to the one of uBPF, to build eBPF programs with a syntax similar to:

mov32 r0, 0
mov32 r1, 2
add32 r0, 1
add32 r0, r1
exit

This would make rbpf natively compatible with uBPF test cases (most functional unit tests have been translated into bytecode already, but having them in human-readable form would make them easier to read, and it would help stay compatible when uBPF receives new test cases).

Improvement suggestion: use add64 instead of add for 64-bit arithmetic operations (but also accept add alone, to remain compatible).

panicked at 'attempt to calculate the remainder with a divisor of zero'

Input
in.zip

Code

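fn main() {
    // Path to the input file extracted from in.zip, given as first argument.
    let filepath = std::env::args().nth(1).expect("missing path to input file");
    let data = std::fs::read(filepath).unwrap();
    if let Ok(vm) = rbpf::EbpfVmNoData::new(Some(&data)) {
        // Panics on the attached input instead of returning an Err.
        let _ = vm.execute_program();
    }
}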

Output

thread 'main' panicked at 'attempt to calculate the remainder with a divisor of zero', /home/xsh/.cargo/registry/src/github.com-1ecc6299db9ec823/rbpf-0.1.0/src/lib.rs:477:33
stack backtrace:
   0: <std::sys_common::backtrace::_print::DisplayBacktrace as core::fmt::Display>::fmt
   1: core::fmt::write
   2: std::io::Write::write_fmt
   3: std::panicking::default_hook::{{closure}}
   4: std::panicking::default_hook
   5: std::panicking::rust_panic_with_hook
   6: rust_begin_unwind
   7: core::panicking::panic_fmt
   8: core::panicking::panic
   9: rbpf::EbpfVmMbuff::execute_program
             at /home/xsh/.cargo/registry/src/github.com-1ecc6299db9ec823/rbpf-0.1.0/src/lib.rs:477
  10: rbpf::EbpfVmRaw::execute_program
             at /home/xsh/.cargo/registry/src/github.com-1ecc6299db9ec823/rbpf-0.1.0/src/lib.rs:1257
  11: rbpf::EbpfVmNoData::execute_program
             at /home/xsh/.cargo/registry/src/github.com-1ecc6299db9ec823/rbpf-0.1.0/src/lib.rs:1540
  12: rbpffuzzvrf::main

Expected
The function should return an error instead of panicking.
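One possible way to get this behavior, sketched here outside of rbpf, is to use checked_rem so that a zero divisor becomes an Err value. The function name mod64 is hypothetical, Error is assumed to be std::io::Error, and this is not necessarily how the crate addresses the problem.

```rust
use std::io::{Error, ErrorKind};

/// 64-bit modulo that reports a zero divisor as an error instead of panicking.
/// Hypothetical helper, not taken from rbpf.
fn mod64(dst: u64, src: u64) -> Result<u64, Error> {
    dst.checked_rem(src)
        .ok_or_else(|| Error::new(ErrorKind::Other, "division by 0"))
}
```

With this shape, mod64(7, 3) yields Ok(1), while mod64(7, 0) yields an Err that the caller can propagate.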

Incorrect shift implementation in the interpreter

According to the kernel eBPF standard, shift operations use a mask of 0x3F (63) for 64-bit operations and 0x1F (31) for 32-bit operations.

https://github.com/torvalds/linux/blob/610a9b8f49fbcf1100716370d3b5f6f884a2835a/Documentation/bpf/standardization/instruction-set.rst?plain=1#L300-L301

Both the uBPF and the kernel eBPF implementations mask the shift amount for LSH/RSH:

https://github.com/iovisor/ubpf/blob/4b9a1cc747cbb9a9aad025a68aaedff7211f7fc2/vm/ubpf_vm.c#L495-L506
https://github.com/torvalds/linux/blob/610a9b8f49fbcf1100716370d3b5f6f884a2835a/kernel/bpf/core.c#L1712-L1724

The rbpf interpreter lacks this masking operation.

rbpf/src/interpreter.rs

Lines 225 to 228 in 4812c52

ebpf::LSH32_IMM => reg[_dst] = (reg[_dst] as u32).wrapping_shl(insn.imm as u32) as u64,
ebpf::LSH32_REG => reg[_dst] = (reg[_dst] as u32).wrapping_shl(reg[_src] as u32) as u64,
ebpf::RSH32_IMM => reg[_dst] = (reg[_dst] as u32).wrapping_shr(insn.imm as u32) as u64,
ebpf::RSH32_REG => reg[_dst] = (reg[_dst] as u32).wrapping_shr(reg[_src] as u32) as u64,

Moreover, LSH64/RSH64 do not check the range of the src value, which can result in the 'attempt to shift left with overflow' panic, as the following PoC program shows.

rbpf/src/interpreter.rs

Lines 272 to 275 in 4812c52

ebpf::LSH64_IMM => reg[_dst] <<= insn.imm as u64,
ebpf::LSH64_REG => reg[_dst] <<= reg[_src],
ebpf::RSH64_IMM => reg[_dst] >>= insn.imm as u64,
ebpf::RSH64_REG => reg[_dst] >>= reg[_src],

This program triggers the 'attempt to shift left with overflow' panic in the interpreter:

mov64 r8, 0x054545ff
lsh64 r8, r8
exit
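The masking described in this issue can be sketched as follows. These standalone functions are illustrations only (lsh64, rsh64 and lsh32 are hypothetical names, not the interpreter's code): the shift amount is reduced to 6 bits for 64-bit operations and to 5 bits for 32-bit operations before shifting, so the shift can no longer overflow.

```rust
/// 64-bit left shift with the shift amount masked to 0..=63.
fn lsh64(dst: u64, src: u64) -> u64 {
    dst << (src & 0x3f)
}

/// 64-bit logical right shift with the shift amount masked to 0..=63.
fn rsh64(dst: u64, src: u64) -> u64 {
    dst >> (src & 0x3f)
}

/// 32-bit left shift with the shift amount masked to 0..=31; the result is
/// zero-extended to 64 bits.
fn lsh32(dst: u64, src: u64) -> u64 {
    ((dst as u32) << (src as u32 & 0x1f)) as u64
}
```

With the PoC value, 0x054545ff & 0x3f is 63, so lsh64(0x054545ff, 0x054545ff) returns 0x8000000000000000 instead of panicking.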

Inconsistencies in arithmetic shift implementation (mask offset)

This issue follows #99 and #100. The previous PR demonstrates the non-compliance of the logical shift implementation. However, the arithmetic shift operation still has the same problem, and needs the shift amount to be masked as well.

The current implementation of ARSH64 is not compliant, and undefined behavior happens if we overflow the number of bits we have when trying to shift.

The following PoC program could trigger this inconsistency:

mov64 r8, 0x054545ff
arsh64 r8, r8
exit

Note that in the kernel interpreter implementation, it masks the SRC/IMM as well.
https://github.com/torvalds/linux/blob/610a9b8f49fbcf1100716370d3b5f6f884a2835a/kernel/bpf/core.c#L1794-L1805

Hence we need to mask the shift offset with 63 in the following code. Since we don't perform verifier check on the immediate value, we need to mask the IMM as well.

rbpf/src/interpreter.rs

Lines 289 to 290 in 7bebff5

ebpf::ARSH64_IMM => reg[_dst] = (reg[_dst] as i64 >> insn.imm) as u64,
ebpf::ARSH64_REG => reg[_dst] = (reg[_dst] as i64 >> reg[_src]) as u64,
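A masked variant of the arithmetic shift could look like the sketch below. It is written as standalone functions with hypothetical names (arsh64 and arsh64_imm) rather than as a patch to the match arms above, and it assumes the immediate is a 32-bit signed value.

```rust
/// 64-bit arithmetic right shift with the shift amount masked to 0..=63.
fn arsh64(dst: u64, shift: u64) -> u64 {
    ((dst as i64) >> (shift & 0x3f)) as u64
}

/// Same operation for an immediate operand: only the low 6 bits are used.
fn arsh64_imm(dst: u64, imm: i32) -> u64 {
    arsh64(dst, imm as u64)
}
```

The sign bit is replicated as expected: arsh64 applied to -8 (cast to u64) with a shift of 1 gives -4 (cast to u64), and the PoC value shifted by itself gives 0 without overflow.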

Appveyor tests are flaky (Cargo fails to download dependencies)

Appveyor runs often fail with:

[...]

C:\projects\rbpf>cargo test -vv  
    Updating crates.io index
 Downloading crates ...
warning: spurious network error (2 tries remaining): [35] SSL connect error (schannel: next InitializeSecurityContext failed: SEC_E_ILLEGAL_MESSAGE (0x80090326) - This error usually occurs when a fatal SSL/TLS alert is received (e.g. handshake failed). More detail may be available in the Windows System event log.)
warning: spurious network error (2 tries remaining): [35] SSL connect error (schannel: next InitializeSecurityContext failed: SEC_E_ILLEGAL_MESSAGE (0x80090326) - This error usually occurs when a fatal SSL/TLS alert is received (e.g. handshake failed). More detail may be available in the Windows System event log.)
warning: spurious network error (2 tries remaining): [35] SSL connect error (schannel: next InitializeSecurityContext failed: SEC_E_ILLEGAL_MESSAGE (0x80090326) - This error usually occurs when a fatal SSL/TLS alert is received (e.g. handshake failed). More detail may be available in the Windows System event log.)
warning: spurious network error (2 tries remaining): [35] SSL connect error (schannel: next InitializeSecurityContext failed: SEC_E_ILLEGAL_MESSAGE (0x80090326) - This error usually occurs when a fatal SSL/TLS alert is received (e.g. handshake failed). More detail may be available in the Windows System event log.)
warning: spurious network error (2 tries remaining): [35] SSL connect error (schannel: next InitializeSecurityContext failed: SEC_E_ILLEGAL_MESSAGE (0x80090326) - This error usually occurs when a fatal SSL/TLS alert is received (e.g. handshake failed). More detail may be available in the Windows System event log.)
warning: spurious network error (2 tries remaining): [35] SSL connect error (schannel: next InitializeSecurityContext failed: SEC_E_ILLEGAL_MESSAGE (0x80090326) - This error usually occurs when a fatal SSL/TLS alert is received (e.g. handshake failed). More detail may be available in the Windows System event log.)
warning: spurious network error (2 tries remaining): [35] SSL connect error (schannel: next InitializeSecurityContext failed: SEC_E_ILLEGAL_MESSAGE (0x80090326) - This error usually occurs when a fatal SSL/TLS alert is received (e.g. handshake failed). More detail may be available in the Windows System event log.)
warning: spurious network error (1 tries remaining): [35] SSL connect error (schannel: next InitializeSecurityContext failed: SEC_E_ILLEGAL_MESSAGE (0x80090326) - This error usually occurs when a fatal SSL/TLS alert is received (e.g. handshake failed). More detail may be available in the Windows System event log.)
warning: spurious network error (1 tries remaining): [35] SSL connect error (schannel: next InitializeSecurityContext failed: SEC_E_ILLEGAL_MESSAGE (0x80090326) - This error usually occurs when a fatal SSL/TLS alert is received (e.g. handshake failed). More detail may be available in the Windows System event log.)
warning: spurious network error (1 tries remaining): [35] SSL connect error (schannel: next InitializeSecurityContext failed: SEC_E_ILLEGAL_MESSAGE (0x80090326) - This error usually occurs when a fatal SSL/TLS alert is received (e.g. handshake failed). More detail may be available in the Windows System event log.)
warning: spurious network error (1 tries remaining): [35] SSL connect error (schannel: next InitializeSecurityContext failed: SEC_E_ILLEGAL_MESSAGE (0x80090326) - This error usually occurs when a fatal SSL/TLS alert is received (e.g. handshake failed). More detail may be available in the Windows System event log.)
warning: spurious network error (1 tries remaining): [35] SSL connect error (schannel: next InitializeSecurityContext failed: SEC_E_ILLEGAL_MESSAGE (0x80090326) - This error usually occurs when a fatal SSL/TLS alert is received (e.g. handshake failed). More detail may be available in the Windows System event log.)
warning: spurious network error (1 tries remaining): [35] SSL connect error (schannel: next InitializeSecurityContext failed: SEC_E_ILLEGAL_MESSAGE (0x80090326) - This error usually occurs when a fatal SSL/TLS alert is received (e.g. handshake failed). More detail may be available in the Windows System event log.)
warning: spurious network error (1 tries remaining): [35] SSL connect error (schannel: next InitializeSecurityContext failed: SEC_E_ILLEGAL_MESSAGE (0x80090326) - This error usually occurs when a fatal SSL/TLS alert is received (e.g. handshake failed). More detail may be available in the Windows System event log.)
error: failed to download from `https://crates.io/api/v1/crates/json/0.11.15/download`
Caused by:
  [35] SSL connect error (schannel: next InitializeSecurityContext failed: SEC_E_ILLEGAL_MESSAGE (0x80090326) - This error usually occurs when a fatal SSL/TLS alert is received (e.g. handshake failed). More detail may be available in the Windows System event log.)
C:\projects\rbpf>if 101 NEQ 0 exit 1 
Command exited with code 1

The best related resource I could find is rust-lang/cargo#9788, but if I read it correctly, this should be fixed in recent cargo versions (1.55+; we are using 1.64+), so I am not sure what is causing those flakes or how to fix them. Any suggestions are welcome.

Chained API for assemble / verify / run BPF programs

From #6:

PS: While working on this API, I started thinking about RawBpfCode and VerifiedBpfCode and let myself create needless functionality. Then I understood the reason: my code is OOP-like rather than procedural-like as in the other modules. 😄
I started to imagine that we could write chained code for BPF programs in general. For instance:

let bpf_vm = ... // create VM of any type

RawBpfCode::new().load(...).push().store(...).push()
// or RawBpfCode::parse(...)
// or RawBpfCode::from_elf(...)
.verify().map(|verified| verified.execute_with(bpf_vm)).err().map(...);

RawBpfCode::parse() could reuse the asm_parser functionality to stay in sync with it.

Taking into account @qmonnet's comment:

we already have the “procedural-like” approach that works well, why change it

we could improve the procedural API so that it returns a Result instead of calling panic!.
Any thoughts?

cc @badboy @rlane @waywardmonkeys

Provide Helper function type

I liked the type you declared for a verifier:

pub type Verifier = fn(prog: &[u8]) -> Result<(), Error>;

Do you think it would be worth doing the same for the helpers? So that instead of:

pub fn register_helper(&mut self,
                       key: u32,
                       function: fn (u64, u64, u64, u64, u64) -> u64)
                       -> Result<(), Error>

... we'd have something like:

pub fn register_helper(&mut self,
                       key: u32,
                       function: Helper) -> Result<(), Error>

(EbpfHelper or Helper, I don't really know what sounds best. Maybe just Helper if people call that with the rbpf:: prefix already?)
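
A minimal sketch of what such an alias could look like. `HelperTable`, its `call` method, and `add2` are hypothetical stand-ins made up for illustration, and `Error` is assumed here to be `std::io::Error`; only the `Helper` alias and the `register_helper` signature come from the proposal above.

```rust
use std::collections::HashMap;
use std::io::{Error, ErrorKind};

/// Signature shared by all helpers: five u64 arguments, one u64 result.
pub type Helper = fn(u64, u64, u64, u64, u64) -> u64;

/// Hypothetical stand-in for the VM; it only stores registered helpers.
#[derive(Default)]
pub struct HelperTable {
    helpers: HashMap<u32, Helper>,
}

impl HelperTable {
    /// Registers `function` under `key`, using the alias in the signature.
    pub fn register_helper(&mut self, key: u32, function: Helper) -> Result<(), Error> {
        self.helpers.insert(key, function);
        Ok(())
    }

    /// Calls the helper registered under `key`, or returns an error if none exists.
    pub fn call(&self, key: u32, args: [u64; 5]) -> Result<u64, Error> {
        let f = self
            .helpers
            .get(&key)
            .ok_or_else(|| Error::new(ErrorKind::Other, format!("unknown helper {key}")))?;
        Ok(f(args[0], args[1], args[2], args[3], args[4]))
    }
}

/// Example helper: adds its first two arguments and ignores the rest.
fn add2(a: u64, b: u64, _c: u64, _d: u64, _e: u64) -> u64 {
    a.wrapping_add(b)
}
```

A plain function such as `add2` coerces to `Helper` at the call site, so `table.register_helper(1, add2)` works without any cast.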

rbpf panics when running bpf_conformance with JIT

thread 'main' panicked at 'called `Result::unwrap()` on an `Err` value: 
Custom { 
kind: Other, 
error: "Error: out of bounds memory store (insn #2), addr 0x2, size 4
mbuff: 0x560d8e8a15b0/0x0, 
mem: 0x1/0x0, 
stack: 0x560d8f935e70/0x200" 
}', examples/rbpf_plugin.rs:45:50
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace

Segmentation fault when executing jitted program

Description

A segmentation fault occurs when executing the JIT-compiled eBPF program, but the same program works well in interpreter mode.

Cargo.toml

[package]
name = "rbpf-poc"
version = "0.1.0"
edition = "2021"

# See more keys and their definitions at https://doc.rust-lang.org/cargo/reference/manifest.html
[dependencies]
rbpf = { git = "https://github.com/qmonnet/rbpf" }

Rust program

fn main() {
    let prog = std::fs::read("prime.bpf.bin").unwrap();
    let mut vm = rbpf::EbpfVmRaw::new(Some(&prog)).unwrap();
    vm.jit_compile().unwrap();
    let mem = &mut [0x00];
    unsafe { vm.execute_program_jit(mem) }.unwrap();
}

ebpf program

prime.bpf.bin.zip

It was zipped, since github does not support uploading *.bin files.

The ebpf program was compiled from the following C source using clang 14.0.6, then the .text segment was extracted using llvm-objcopy

int main() {
  long cnt = 0;
  for (int i = 1; i < 1e4; i++) {
    int ok = 1;
    for (int j = 2; j * j <= i && ok; j++) {
      if (i % j == 0)
        ok = 0;
    }
    cnt += ok;
  }
  return cnt;
}

How to trigger

root@mnfe-pve:~/rbpf-poc# cargo run
    Finished dev [unoptimized + debuginfo] target(s) in 0.07s
     Running `target/debug/rbpf-poc`
Segmentation fault

Expected behavior

Program successfully exited with no exception

Environment

stable-x86_64-unknown-linux-gnu (default)
rustc 1.71.0 (8ede3aae2 2023-07-12)
root@mnfe-pve:~/bpf-benchmark# uname -a
Linux mnfe-pve 6.2.16-6-pve #1 SMP PREEMPT_DYNAMIC PMX 6.2.16-7 (2023-08-01T11:23Z) x86_64 GNU/Linux
root@mnfe-pve:~/bpf-benchmark# lsb_release -a
No LSB modules are available.
Distributor ID: Debian
Description:    Debian GNU/Linux 12 (bookworm)
Release:        12
Codename:       bookworm
root@mnfe-pve:~/bpf-benchmark# 

Gracefully fail when possible, panic only if necessary

Currently rbpf panics for any error. Instead, where possible fail gracefully and return an error to the user.

This also has implications for the verification function. Its current signature returns a bool but rbpf ignores it and expects the verification function to panic. Part of addressing this issue is to respect the return value and potentially pass a more specific failure code up to the user.
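
A minimal sketch of a verification function that reports failures through its return value instead of panicking. `check_prog` is a hypothetical toy verifier written for illustration (it is not rbpf's verifier), and the error type is assumed to be `std::io::Error`.

```rust
use std::io::{Error, ErrorKind};

/// Size of one eBPF instruction in bytes.
const INSN_SIZE: usize = 8;

/// Toy verifier: reports problems through `Err` instead of panicking.
/// It only checks the program length and that the last opcode is `exit` (0x95).
fn check_prog(prog: &[u8]) -> Result<(), Error> {
    if prog.is_empty() || prog.len() % INSN_SIZE != 0 {
        return Err(Error::new(
            ErrorKind::Other,
            "program length must be a non-zero multiple of 8",
        ));
    }
    if prog[prog.len() - INSN_SIZE] != 0x95 {
        return Err(Error::new(ErrorKind::Other, "program does not end with exit"));
    }
    Ok(())
}
```

A caller can then propagate the failure with `?` or inspect the message, and nothing in this path unwinds.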

Consider moving examples to examples/ directory

This way they are automatically compiled on test and provide a ready-to-use example file for users.
For the bigger example that loads an ELF file, it might be a nice idea to include a minimal C file and instructions on how to compile it for use with the example. That example needs extra dependencies, which can be added as dev-dependencies.

If interested I can take care of that.

panicked at 'attempt to add with overflow'

Input
in.zip

Code

fn main() {
    // let filepath = input file in the zip
    let data = std::fs::read(filepath).unwrap();
    if let Ok(vm) = rbpf::EbpfVmNoData::new(Some(&data)) {
        vm.execute_program();
    }
}

Output

thread 'main' panicked at 'attempt to add with overflow', /home/xsh/.cargo/registry/src/github.com-1ecc6299db9ec823/rbpf-0.1.0/src/lib.rs:583:45
stack backtrace:
   0: <std::sys_common::backtrace::_print::DisplayBacktrace as core::fmt::Display>::fmt
   1: core::fmt::write
   2: std::io::Write::write_fmt
   3: std::panicking::default_hook::{{closure}}
   4: std::panicking::default_hook
   5: std::panicking::rust_panic_with_hook
   6: rust_begin_unwind
   7: core::panicking::panic_fmt
   8: core::panicking::panic
   9: rbpf::EbpfVmMbuff::check_mem
             at /home/xsh/.cargo/registry/src/github.com-1ecc6299db9ec823/rbpf-0.1.0/src/lib.rs:583
  10: rbpf::EbpfVmMbuff::execute_program::{{closure}}
             at /home/xsh/.cargo/registry/src/github.com-1ecc6299db9ec823/rbpf-0.1.0/src/lib.rs:301
  11: rbpf::EbpfVmMbuff::execute_program
             at /home/xsh/.cargo/registry/src/github.com-1ecc6299db9ec823/rbpf-0.1.0/src/lib.rs:372
  12: rbpf::EbpfVmRaw::execute_program
             at /home/xsh/.cargo/registry/src/github.com-1ecc6299db9ec823/rbpf-0.1.0/src/lib.rs:1257
  13: rbpf::EbpfVmNoData::execute_program
             at /home/xsh/.cargo/registry/src/github.com-1ecc6299db9ec823/rbpf-0.1.0/src/lib.rs:1540

Expected
Return a proper error instead of panicking.

Update README.md

The README.md file is somewhat outdated. In particular:

  • It should mention the systems rbpf can be compiled on (Linux/MacOS, Windows as well but without the JIT (see #21)).
  • It does not mention the API created some time ago to build the BPF programs.
  • There are probably a lot of other items to update.

If you stumble on something else, do not hesitate to add elements to update below, so that I do not forget them the day I dive into this!

Improve program patching for context accesses

I pushed a branch called patch_prog that introduces a new module (patch) that proposes a function attempting to patch an eBPF program generated with clang, in order to make it compatible with rbpf.

The function is a dumb heuristic and is nowhere close to what happens in the kernel. See comments in the source code (module patch) for details. Implementing a more efficient algorithm does not look trivial (code in the kernel is GPL and we cannot reuse it).

I only tested it on a single example. If anyone has time to play with it and provide suggestions or feedback, that would be very welcome. Will try to update this issue if I can obtain better results.

Attempt to calculate the remainder with a divisor of zero for MOD32 in interpreter

Details

The MOD32 implementation checks whether the 64-bit source value is zero. However, during the modulus computation, the source is cast to 32 bits, and the result of that cast can still be zero, e.g., for 0x100000000. According to the kernel eBPF standard, modulo by zero should be detected by checking whether the 32-bit source value is zero.

https://github.com/torvalds/linux/blob/610a9b8f49fbcf1100716370d3b5f6f884a2835a/Documentation/bpf/standardization/instruction-set.rst?plain=1#L246-L251

If execution would result in modulo by zero, for BPF_ALU64 the value of
the destination register is unchanged whereas for BPF_ALU the upper
32 bits of the destination register are zeroed.

The current implementations of MOD32_IMM and MOD32_REG lack a check on the lower 32 bits of the source value.

rbpf/src/interpreter.rs

Lines 230 to 233 in 4812c52

ebpf::MOD32_IMM if insn.imm == 0 => (),
ebpf::MOD32_IMM => reg[_dst] = (reg[_dst] as u32 % insn.imm as u32) as u64,
ebpf::MOD32_REG if reg[_src] == 0 => (),
ebpf::MOD32_REG => reg[_dst] = (reg[_dst] as u32 % reg[_src] as u32) as u64,

The following PoC program triggers the "attempt to calculate the remainder with a divisor of zero" panic in the interpreter:

lddw r1, 0x100000000
mod32 r0, r1
exit

Result:

attempt to calculate the remainder with a divisor of zero
thread '' panicked at 'attempt to calculate the remainder with a divisor of zero', src/interpreter.rs:235:47
stack backtrace:
...
   3: rbpf::interpreter::execute_program
             at ./src/interpreter.rs:235:47
   4: rbpf::EbpfVmMbuff::execute_program
             at ./src/lib.rs:300:9
   5: rbpf::EbpfVmRaw::execute_program
             at ./src/lib.rs:1221:9
   6: rbpf::EbpfVmNoData::execute_program
             at ./src/lib.rs:1581:9

Suggestion

Check whether the source value, once cast to 32 bits, is zero before computing the modulus.
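
A minimal sketch of the suggested check, assuming a free-standing helper named `mod32` (hypothetical, not rbpf's actual code). For the zero-divisor case it follows the specification text quoted above (upper 32 bits zeroed), which is an assumption about the desired behavior rather than what the current `=> ()` arms do.

```rust
/// 32-bit modulo on register values.
/// The zero check looks at the low 32 bits of `src`, not at the full 64-bit value.
fn mod32(dst: u64, src: u64) -> u64 {
    let divisor = src as u32;
    if divisor == 0 {
        // Modulo by zero: keep the low 32 bits of dst, clear the upper 32 bits.
        (dst as u32) as u64
    } else {
        ((dst as u32) % divisor) as u64
    }
}
```

With the PoC value, `mod32(r0, 0x100000000)` takes the zero-divisor branch instead of reaching the `%` operator.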

Attempt to divide by zero in DIV32 of interpreter

Details

The DIV32 implementation checks whether the 64-bit source value is zero. However, during the division, the source is cast to 32 bits, and the result of that cast can still be zero, e.g., for 0xffff000000000000. According to the kernel eBPF standard, division by zero should be detected by checking whether the 32-bit source value is zero.

https://github.com/torvalds/linux/blob/610a9b8f49fbcf1100716370d3b5f6f884a2835a/Documentation/bpf/standardization/instruction-set.rst?plain=1#L246-L251

If BPF program execution would result in division by zero, the destination register is instead set to zero.

The current implementations of DIV32_IMM and DIV32_REG lack a check on the lower 32 bits of the source value.

rbpf/src/interpreter.rs

Lines 217 to 220 in 4812c52

ebpf::DIV32_IMM if insn.imm == 0 => reg[_dst] = 0,
ebpf::DIV32_IMM => reg[_dst] = (reg[_dst] as u32 / insn.imm as u32) as u64,
ebpf::DIV32_REG if reg[_src] == 0 => reg[_dst] = 0,
ebpf::DIV32_REG => reg[_dst] = (reg[_dst] as u32 / reg[_src] as u32) as u64,

The following PoC program triggers the "attempt to divide by zero" panic in the interpreter:

lddw r1, 0x100000000
div32 r0, r1
exit

Result:

attempt to divide by zero
thread '' panicked at 'attempt to divide by zero', src/interpreter.rs:222:45
stack backtrace:
...
   3: rbpf::interpreter::execute_program
             at ./src/interpreter.rs:222:45
   4: rbpf::EbpfVmMbuff::execute_program
             at ./src/lib.rs:300:9
   5: rbpf::EbpfVmRaw::execute_program
             at ./src/lib.rs:1221:9

The expected result is that the program executes with r0 set to zero, without a panic or an error.

Suggestion

Check whether the source value, once cast to 32 bits, is zero before dividing.
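
A minimal sketch of the suggested check, assuming a free-standing helper named `div32` (hypothetical, not rbpf's actual code). It relies on `u32::checked_div`, which returns `None` for a zero divisor, so the zero test is necessarily performed on the 32-bit value.

```rust
/// 32-bit division on register values.
/// Both operands are truncated to 32 bits first; a zero divisor yields 0.
fn div32(dst: u64, src: u64) -> u64 {
    (dst as u32).checked_div(src as u32).unwrap_or(0) as u64
}
```

With the PoC value, `div32(r0, 0x100000000)` returns 0 rather than panicking.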

Instruction stack data structure

While working on #6 and looking at #10, #9, and #8, I noticed that we lack a high-level stack of BPF instructions.
Personally, I picture it as follows:

struct InsnStack {
   head: RefToInsn, // points to the first instruction that will be executed
   tail: RefToInsn, // points to the `exit` instruction, unless it is an invalid bpf program
}

impl InsnStack {
   pub fn create() -> Self {
       //...
   }

   pub fn disassemble() -> String {
       //...
   }

   //may be a bunch of functions from #6 ... I am ok if we remove `insn_builder` module

   pub fn patch(patch: MbuffStructure) {
       //...
   }

   pub fn validate() -> Result<ValidInsnStack, ValidationError> { // <- the same as InsnStack but `tail` always points to `exit` Insn :smile:
       //...
   }
}

pub fn parse_assemble(raw_bpf_asm: String) -> InsnStack {
   //...
}

RefToInsn can be any type of reference; however, we could not use Box here.
With that structure, we can provide an API to easily manipulate its content.

@qmonnet, thoughts?

PS1: Again, as in #12, we need to build a prototype here.
PS2: I also don't like that we have a bunch of different structures that, in one way or another, represent a BPF instruction, but I don't see a solution to this problem right now.

Attempt to negate with overflow in ld_st_imm_str of disassembler

The disassembler panics in ld_st_imm_str when it tries to negate an i16 offset of 0x8000 (i16::MIN) in

rbpf/src/disassembler.rs

Lines 29 to 33 in 4812c52

fn ld_st_imm_str(name: &str, insn: &ebpf::Insn) -> String {
if insn.off >= 0 {
format!("{name} [r{}+{:#x}], {:#x}", insn.dst, insn.off, insn.imm)
} else {
format!("{name} [r{}-{:#x}], {:#x}", insn.dst, -insn.off, insn.imm)

The PoC program to reproduce:

disassembler::disassemble(&[98, 1, 0, 128, 0, 0, 31, 145])

This would panic the disassembler:

thread '<unnamed>' panicked at 'attempt to negate with overflow', /rbpf-0.2.0/src/disassembler.rs:33:56

To make the disassembler more robust, the i16 negation logic in the ld_st_imm_str function could be restructured.
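
One possible restructuring, sketched with a hypothetical helper `off_str` (not part of rbpf): `i16::unsigned_abs` returns the magnitude as a `u16`, so the most negative offset never has to be negated as an `i16`.

```rust
/// Formats a signed 16-bit offset as "+0x.." or "-0x.." without negating an i16.
fn off_str(off: i16) -> String {
    if off >= 0 {
        format!("+{:#x}", off)
    } else {
        // unsigned_abs() returns a u16, so i16::MIN maps to 0x8000 without overflow.
        format!("-{:#x}", off.unsigned_abs())
    }
}
```

The offset bytes `0, 128` in the PoC decode to `i16::MIN`, which this helper renders as `-0x8000`.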

Support for 32-bit jump instructions

eBPF in the Linux kernel was just extended to use the 0x06 instruction class for 32-bit jumps. See the merge commit and its parents. This should land in kernel 5.1.

To keep rbpf compatibility with Linux eBPF as close as possible, it would be nice to have support for those instructions for the interpreter as well as for the JIT in the future.
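
A minimal sketch of how an interpreter might tell the two jump classes apart. The constant and function names are hypothetical; the class is taken from the low three bits of the opcode, and 32-bit jumps are assumed to compare only the low 32 bits of their operands.

```rust
/// Mask selecting the instruction class (low three bits of the opcode).
const BPF_CLS_MASK: u8 = 0x07;
/// Class value used for 32-bit jumps.
const BPF_JMP32: u8 = 0x06;

/// Returns true when `opcode` belongs to the 32-bit jump class.
fn is_jmp32(opcode: u8) -> bool {
    (opcode & BPF_CLS_MASK) == BPF_JMP32
}

/// Evaluates a "jump if equal" condition for either jump class.
fn jeq_taken(opcode: u8, dst: u64, src: u64) -> bool {
    if is_jmp32(opcode) {
        // 32-bit jumps compare only the low 32 bits of the operands.
        (dst as u32) == (src as u32)
    } else {
        dst == src
    }
}
```

For example, opcode 0x1e (class 0x06) treats 0x100000001 and 1 as equal, while opcode 0x1d (class 0x05) does not.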

Is only x86_64 supported?

I ran it on a router
[Linux RT-AC68U 2.6.36.4brcmarm #1 SMP PREEMPT Fri May 10 22:16:14 CST 2019 armv7l GNU/Linux],
and execute_program_jit always ends with a segmentation fault.

main.rs:

extern crate rbpf;

fn main() {
    let prog = &[
        0x71, 0x10, 0x02, 0x00, 0x00, 0x00, 0x00, 0x00, // ldxb r0, [r1+2]
        0x95, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00  // exit
    ];

    // Let's use some data.
    let mem = &mut [
        0xaa, 0xbb, 0x11, 0xcc, 0xdd
    ];
    println!("next EbpfVmRaw::new");
    // This is an eBPF VM for programs reading from a given memory area (it
    // directly reads from packet data)
    let mut vm = rbpf::EbpfVmRaw::new(prog).unwrap();

    println!("next jit_compile");
    // This time we JIT-compile the program.
    vm.jit_compile().unwrap();

    println!("next execute_program_jit");
    // Then we execute it. For this kind of VM, a reference to the packet data
    // must be passed to the function that executes the program.
    unsafe { assert_eq!(vm.execute_program_jit(mem).unwrap(), 0x11); }
    println!("finnal");
}

Running it on the router:

next EbpfVmRaw::new
next jit_compile
next execute_program_jit
Segmentation fault

Make JIT usable on Windows

Would be great to have rbpf compiling on Windows.

Quoting @badboy:

AppVeyor is free and easy to setup as CI as well to run the build. The jitting might be incompatible, but the rest should probably work.

See issue #19.

Support uprobe and kprobe?

I'm new to BPF. Does rbpf support kprobes and uprobes? If it supports uprobes, does it have context-switch issues?

Attempt to negate with overflow in ld_reg_str of disassembler

The disassembler panics in ld_reg_str when it tries to negate an i16 offset of 0x8000 (i16::MIN) in -insn.off

rbpf/src/disassembler.rs

Lines 37 to 42 in 4812c52

#[inline]
fn ld_reg_str(name: &str, insn: &ebpf::Insn) -> String {
if insn.off >= 0 {
format!("{name} r{}, [r{}+{:#x}]", insn.dst, insn.src, insn.off)
} else {
format!("{name} r{}, [r{}-{:#x}]", insn.dst, insn.src, -insn.off)

The PoC program to reproduce:

disassembler::disassemble(&[113, 1, 0, 128, 0, 0, 31, 145])

This would encounter the following panic:

thread '<unnamed>' panicked at 'attempt to negate with overflow'

Note that this issue is also encountered in the st_reg_str, jmp_imm_str, and jmp_reg_str functions.

To make the disassembler more robust, the i16 negation logic in those functions could be restructured.
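
Since several functions share the same pattern, one option is a common formatting helper that widens the offset before negating it. The sketch below uses a hypothetical `mem_operand` function (not part of rbpf) and converts the `i16` to `i32`, where negating the smallest 16-bit value cannot overflow.

```rust
/// Renders a register-relative memory operand such as "[r1-0x8000]".
fn mem_operand(reg: u8, off: i16) -> String {
    // Widen to i32 first so that negating i16::MIN cannot overflow.
    let wide = i32::from(off);
    if wide >= 0 {
        format!("[r{}+{:#x}]", reg, wide)
    } else {
        format!("[r{}-{:#x}]", reg, -wide)
    }
}
```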

panicked at 'attempt to shift left with overflow'

Input
in.zip

Code

fn main() {
    // let filepath = input file in the zip
    let data = std::fs::read(filepath).unwrap();
    if let Ok(vm) = rbpf::EbpfVmNoData::new(Some(&data)) {
        vm.execute_program();
    }
}

Output

thread 'main' panicked at 'attempt to shift left with overflow', /home/xsh/.cargo/registry/src/github.com-1ecc6299db9ec823/rbpf-0.1.0/src/lib.rs:520:37
stack backtrace:
   0: <std::sys_common::backtrace::_print::DisplayBacktrace as core::fmt::Display>::fmt
   1: core::fmt::write
   2: std::io::Write::write_fmt
   3: std::panicking::default_hook::{{closure}}
   4: std::panicking::default_hook
   5: std::panicking::rust_panic_with_hook
   6: rust_begin_unwind
   7: core::panicking::panic_fmt
   8: core::panicking::panic
   9: rbpf::EbpfVmMbuff::execute_program
             at /home/xsh/.cargo/registry/src/github.com-1ecc6299db9ec823/rbpf-0.1.0/src/lib.rs:520
  10: rbpf::EbpfVmRaw::execute_program
             at /home/xsh/.cargo/registry/src/github.com-1ecc6299db9ec823/rbpf-0.1.0/src/lib.rs:1257
  11: rbpf::EbpfVmNoData::execute_program
             at /home/xsh/.cargo/registry/src/github.com-1ecc6299db9ec823/rbpf-0.1.0/src/lib.rs:1540

Expected
Return a proper error instead of panicking.

Update dependencies in preparation for release

Updating the combine dependency from 2.1.1 to 3.6 will require some code rework, since there were incompatible API changes.

Updating the rest of the dependencies to the following did not change any test results:

combine = "2.1.1"
-libc = "0.2.0"
+libc = "0.2"
time = "0.1"
-byteorder = "1.2.1"
+byteorder = "1.2"

[dev-dependencies]

elf = "0.0.10"
-json = "0.11.4"
+json = "0.11"

prog_exec has no way to report a vm error

Callers may not want to panic if there is a problem running the program and instead may want to exit gracefully and report an error.

Consider changing:

pub fn prog_exec(&self, mem: &[u8], mbuff: &[u8]) -> u64

To:

pub fn prog_exec(&self, mem: &[u8], mbuff: &[u8]) -> Result<u64, Error>

Out-of-bound memory write in the interpreter

There are several integer overflows in the check_mem check. The addition of the 'addr' and 'len' variables may overflow and wrap around to a lower address, circumventing the checks on pointer addresses.

rbpf/src/interpreter.rs

Lines 15 to 23 in 4812c52

if mbuff.as_ptr() as u64 <= addr && addr + len as u64 <= mbuff.as_ptr() as u64 + mbuff.len() as u64 {
return Ok(())
}
if mem.as_ptr() as u64 <= addr && addr + len as u64 <= mem.as_ptr() as u64 + mem.len() as u64 {
return Ok(())
}
if stack.as_ptr() as u64 <= addr && addr + len as u64 <= stack.as_ptr() as u64 + stack.len() as u64 {
return Ok(())
}

Take addr + len as u64 <= mbuff.as_ptr() as u64 + mbuff.len() as u64 as an example: if addr is the mbuff pointer address and len is -1, the result wraps around to mbuff.as_ptr() - 1, causing a bad memory access.

The following PoC program violates the safety check and breaks the interpreter.

stdw [r2-0x1], 0x380affff
exit

With overflow checks enabled, we could get the following panic:

thread '<unnamed>' panicked at src/interpreter.rs:15:41:
attempt to add with overflow
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
...
    #14 0x10668d704 in core::panicking::panic::h28efa1d8b603254d panicking.rs:144
    #15 0x104fe6ed4 in rbpf::interpreter::check_mem::he501d64976cf2caa interpreter.rs:15
    #16 0x104fe858c in rbpf::interpreter::execute_program::he7cae8749900d08a interpreter.rs:176
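
A minimal sketch of an overflow-safe bounds check, assuming a hypothetical helper `in_region` (not rbpf's `check_mem`). It uses `u64::checked_add`, so an end address that would wrap around is rejected instead of comparing as a small value.

```rust
/// Returns true when [addr, addr + len) lies inside [base, base + size).
/// checked_add() turns a wrapping end address into a rejected access.
fn in_region(addr: u64, len: u64, base: u64, size: u64) -> bool {
    match (addr.checked_add(len), base.checked_add(size)) {
        (Some(end), Some(region_end)) => base <= addr && end <= region_end,
        _ => false,
    }
}
```

The same test would be applied in turn to the mbuff, mem, and stack regions; an address near `u64::MAX`, as produced by the PoC, fails the check in both debug and release builds.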

panicked at 'attempt to divide by zero'

Input
in.zip

Code

fn main() {
    // let filepath = input file in the zip
    let data = std::fs::read(filepath).unwrap();
    if let Ok(vm) = rbpf::EbpfVmNoData::new(Some(&data)) {
        vm.execute_program();
    }
}

Output

thread 'main' panicked at 'attempt to divide by zero', /home/xsh/.cargo/registry/src/github.com-1ecc6299db9ec823/rbpf-0.1.0/src/lib.rs:461:33
stack backtrace:
   0: <std::sys_common::backtrace::_print::DisplayBacktrace as core::fmt::Display>::fmt
   1: core::fmt::write
   2: std::io::Write::write_fmt
   3: std::panicking::default_hook::{{closure}}
   4: std::panicking::default_hook
   5: std::panicking::rust_panic_with_hook
   6: rust_begin_unwind
   7: core::panicking::panic_fmt
   8: core::panicking::panic
   9: rbpf::EbpfVmMbuff::execute_program
             at /home/xsh/.cargo/registry/src/github.com-1ecc6299db9ec823/rbpf-0.1.0/src/lib.rs:461
  10: rbpf::EbpfVmRaw::execute_program
             at /home/xsh/.cargo/registry/src/github.com-1ecc6299db9ec823/rbpf-0.1.0/src/lib.rs:1257
  11: rbpf::EbpfVmNoData::execute_program
             at /home/xsh/.cargo/registry/src/github.com-1ecc6299db9ec823/rbpf-0.1.0/src/lib.rs:1540

Expected
Return a proper error instead of panicking.

Tests panic with `misaligned pointer dereference: address must be a multiple of ... but is ...`

Tests consistently fail with recent versions of the toolchain, with messages such as:

thread 'test_jit_mbuff' panicked at 'misaligned pointer dereference: address must be a multiple of 0x4 but is 0x7f8038001011', src/jit.rs:100:5
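
One way to avoid this class of panic is to write multi-byte values as bytes rather than through a `u32` pointer. The sketch below is an assumption about a possible fix, not the code in src/jit.rs; `emit_u32` is a hypothetical name and signature.

```rust
/// Writes `value` as four little-endian bytes at `offset` in `buf`,
/// whatever the alignment of the destination address.
fn emit_u32(buf: &mut [u8], offset: usize, value: u32) {
    // copy_from_slice works on bytes, so no aligned u32 pointer is ever dereferenced.
    buf[offset..offset + 4].copy_from_slice(&value.to_le_bytes());
}
```

Little-endian order matches x86_64, the only target of the JIT, and an out-of-range offset results in a slice bounds panic rather than an unchecked write.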
Full error message
     Running tests/misc.rs (target/debug/deps/misc-6f56f75d4e6d43c9)

running 20 tests
thread 'test_jit_mbuff' panicked at 'misaligned pointer dereference: address must be a multiple of 0x4 but is 0x7f8038001011', src/jit.rs:100:5
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
thread 'test_jit_mbuff' panicked at 'panic in a function that cannot unwind', library/core/src/panicking.rs:126:5
stack backtrace:
   0:     0x562d63bf4351 - std::backtrace_rs::backtrace::libunwind::trace::h28494931c73179b2
                               at /rustc/2f6bc5d259e7ab25ddfdd33de53b892770218918/library/std/src/../../backtrace/src/backtrace/libunwind.rs:93:5
   1:     0x562d63bf4351 - std::backtrace_rs::backtrace::trace_unsynchronized::h9032c52edccf7bd1
                               at /rustc/2f6bc5d259e7ab25ddfdd33de53b892770218918/library/std/src/../../backtrace/src/backtrace/mod.rs:66:5
   2:     0x562d63bf4351 - std::sys_common::backtrace::_print_fmt::hd90562e967f4e4e1
                               at /rustc/2f6bc5d259e7ab25ddfdd33de53b892770218918/library/std/src/sys_common/backtrace.rs:65:5
   3:     0x562d63bf4351 - <std::sys_common::backtrace::_print::DisplayBacktrace as core::fmt::Display>::fmt::h113657117676131e
                               at /rustc/2f6bc5d259e7ab25ddfdd33de53b892770218918/library/std/src/sys_common/backtrace.rs:44:22
   4:     0x562d63c1995f - core::fmt::rt::Argument::fmt::hd56cdfa11c364505
                               at /rustc/2f6bc5d259e7ab25ddfdd33de53b892770218918/library/core/src/fmt/rt.rs:138:9
   5:     0x562d63c1995f - core::fmt::write::h24c20284e5d6be9e
                               at /rustc/2f6bc5d259e7ab25ddfdd33de53b892770218918/library/core/src/fmt/mod.rs:1094:21
   6:     0x562d63bf1b41 - std::io::Write::write_fmt::hbf02c94f0e7342d1
                               at /rustc/2f6bc5d259e7ab25ddfdd33de53b892770218918/library/std/src/io/mod.rs:1712:15
   7:     0x562d63bf4165 - std::sys_common::backtrace::_print::he85212e2c716c859
                               at /rustc/2f6bc5d259e7ab25ddfdd33de53b892770218918/library/std/src/sys_common/backtrace.rs:47:5
   8:     0x562d63bf4165 - std::sys_common::backtrace::print::h888aaf3ad10f084e
                               at /rustc/2f6bc5d259e7ab25ddfdd33de53b892770218918/library/std/src/sys_common/backtrace.rs:34:9
   9:     0x562d63bf5cb7 - std::panicking::default_hook::{{closure}}::hba0edb58dc223add
  10:     0x562d63bf5aa4 - std::panicking::default_hook::h1555b8bada2010d7
                               at /rustc/2f6bc5d259e7ab25ddfdd33de53b892770218918/library/std/src/panicking.rs:288:9
  11:     0x562d63b6cdd4 - <alloc::boxed::Box<F,A> as core::ops::function::Fn<Args>>::call::h52e6cd440f597cb6
                               at /rustc/2f6bc5d259e7ab25ddfdd33de53b892770218918/library/alloc/src/boxed.rs:1999:9
  12:     0x562d63b6cdd4 - test::test_main::{{closure}}::hdcda637653172e31
                               at /rustc/2f6bc5d259e7ab25ddfdd33de53b892770218918/library/test/src/lib.rs:134:21
  13:     0x562d63bf62c7 - <alloc::boxed::Box<F,A> as core::ops::function::Fn<Args>>::call::h07438796673f3d04
                               at /rustc/2f6bc5d259e7ab25ddfdd33de53b892770218918/library/alloc/src/boxed.rs:1999:9
  14:     0x562d63bf62c7 - std::panicking::rust_panic_with_hook::h72a06453beb2cbcb
                               at /rustc/2f6bc5d259e7ab25ddfdd33de53b892770218918/library/std/src/panicking.rs:695:13
  15:     0x562d63bf6001 - std::panicking::begin_panic_handler::{{closure}}::h0281d6cc05cfd2a4
                               at /rustc/2f6bc5d259e7ab25ddfdd33de53b892770218918/library/std/src/panicking.rs:580:13
  16:     0x562d63bf4796 - std::sys_common::backtrace::__rust_end_short_backtrace::h1c79565770be27d9
                               at /rustc/2f6bc5d259e7ab25ddfdd33de53b892770218918/library/std/src/sys_common/backtrace.rs:150:18
  17:     0x562d63bf5db2 - rust_begin_unwind
                               at /rustc/2f6bc5d259e7ab25ddfdd33de53b892770218918/library/std/src/panicking.rs:578:5
  18:     0x562d63b32b33 - core::panicking::panic_nounwind_fmt::hdb51e63c7c599e80
                               at /rustc/2f6bc5d259e7ab25ddfdd33de53b892770218918/library/core/src/panicking.rs:96:14
  19:     0x562d63b32bd7 - core::panicking::panic_nounwind::hc4511f17cef1f7da
                               at /rustc/2f6bc5d259e7ab25ddfdd33de53b892770218918/library/core/src/panicking.rs:126:5
  20:     0x562d63b32d63 - core::panicking::panic_cannot_unwind::hbac7ba3f5f929c6c
                               at /rustc/2f6bc5d259e7ab25ddfdd33de53b892770218918/library/core/src/panicking.rs:188:5
  21:     0x562d63b8f4e4 - rbpf::jit::emit4::haff8aacbfffbe8b6
                               at /home/runner/work/rbpf/rbpf/src/jit.rs:99:1
  22:     0x562d63b8ff07 - rbpf::jit::emit_alu64_imm32::hd930f0d06f862872
                               at /home/runner/work/rbpf/rbpf/src/jit.rs:211:5
  23:     0x562d63b911af - rbpf::jit::JitMemory::jit_compile::h0a631e7ea59867f2
                               at /home/runner/work/rbpf/rbpf/src/jit.rs:544:9
  24:     0x562d63b93722 - rbpf::jit::compile::hb7b83a17bcef25db
                               at /home/runner/work/rbpf/rbpf/src/jit.rs:1014:5
  25:     0x562d63b842d4 - rbpf::EbpfVmMbuff::jit_compile::h1c675e69584d7361
                               at /home/runner/work/rbpf/rbpf/src/lib.rs:643:25
  26:     0x562d63b34ed0 - misc::test_jit_mbuff::h8eb4bceb4b166566
                               at /home/runner/work/rbpf/rbpf/tests/misc.rs:336:9
  27:     0x562d63b37b57 - misc::test_jit_mbuff::{{closure}}::h55d6770297b8bea5
                               at /home/runner/work/rbpf/rbpf/tests/misc.rs:314:21
  28:     0x562d63b377e5 - core::ops::function::FnOnce::call_once::hafeaeb2bdb95188a
                               at /rustc/2f6bc5d259e7ab25ddfdd33de53b892770218918/library/core/src/ops/function.rs:250:5
  29:     0x562d63b7232f - core::ops::function::FnOnce::call_once::heba63a2808b93cd2
                               at /rustc/2f6bc5d259e7ab25ddfdd33de53b892770218918/library/core/src/ops/function.rs:250:5
  30:     0x562d63b7232f - test::__rust_begin_short_backtrace::he9a5c6c59e1b0086
                               at /rustc/2f6bc5d259e7ab25ddfdd33de53b892770218918/library/test/src/lib.rs:655:18
  31:     0x562d63b3e22c - test::run_test::{{closure}}::h1b8c22c7d438b531
                               at /rustc/2f6bc5d259e7ab25ddfdd33de53b892770218918/library/test/src/lib.rs:646:30
  32:     0x562d63b3e22c - core::ops::function::FnOnce::call_once{{vtable.shim}}::h36a9083506ac2a12
                               at /rustc/2f6bc5d259e7ab25ddfdd33de53b892770218918/library/core/src/ops/function.rs:250:5
  33:     0x562d63b711f7 - <alloc::boxed::Box<F,A> as core::ops::function::FnOnce<Args>>::call_once::h11e5ff78080e1397
                               at /rustc/2f6bc5d259e7ab25ddfdd33de53b892770218918/library/alloc/src/boxed.rs:1985:9
  34:     0x562d63b711f7 - <core::panic::unwind_safe::AssertUnwindSafe<F> as core::ops::function::FnOnce<()>>::call_once::hfbfbda226f7a7c27
                               at /rustc/2f6bc5d259e7ab25ddfdd33de53b892770218918/library/core/src/panic/unwind_safe.rs:271:9
  35:     0x562d63b711f7 - std::panicking::try::do_call::h42ac7b4bd35c5b5d
                               at /rustc/2f6bc5d259e7ab25ddfdd33de53b892770218918/library/std/src/panicking.rs:485:40
  36:     0x562d63b711f7 - std::panicking::try::hc6ad089a257e422e
                               at /rustc/2f6bc5d259e7ab25ddfdd33de53b892770218918/library/std/src/panicking.rs:449:19
  37:     0x562d63b711f7 - std::panic::catch_unwind::hdf4784c0c06ac0ba
                               at /rustc/2f6bc5d259e7ab25ddfdd33de53b892770218918/library/std/src/panic.rs:140:14
  38:     0x562d63b711f7 - test::run_test_in_process::hbe86cacc7510eb8b
                               at /rustc/2f6bc5d259e7ab25ddfdd33de53b892770218918/library/test/src/lib.rs:678:27
  39:     0x562d63b711f7 - test::run_test::run_test_inner::{{closure}}::hbaa8a391d0f22860
                               at /rustc/2f6bc5d259e7ab25ddfdd33de53b892770218918/library/test/src/lib.rs:572:39
  40:     0x562d63b38c28 - test::run_test::run_test_inner::{{closure}}::hb4232323322f922d
                               at /rustc/2f6bc5d259e7ab25ddfdd33de53b892770218918/library/test/src/lib.rs:599:37
  41:     0x562d63b38c28 - std::sys_common::backtrace::__rust_begin_short_backtrace::hca81c15946d6f154
                               at /rustc/2f6bc5d259e7ab25ddfdd33de53b892770218918/library/std/src/sys_common/backtrace.rs:134:18
  42:     0x562d63b3e45b - std::thread::Builder::spawn_unchecked_::{{closure}}::{{closure}}::h97d990ba33daf685
                               at /rustc/2f6bc5d259e7ab25ddfdd33de53b892770218918/library/std/src/thread/mod.rs:529:17
  43:     0x562d63b3e45b - <core::panic::unwind_safe::AssertUnwindSafe<F> as core::ops::function::FnOnce<()>>::call_once::he0d612b643c049a4
                               at /rustc/2f6bc5d259e7ab25ddfdd33de53b892770218918/library/core/src/panic/unwind_safe.rs:271:9
  44:     0x562d63b3e45b - std::panicking::try::do_call::haf308dcb076117ac
                               at /rustc/2f6bc5d259e7ab25ddfdd33de53b892770218918/library/std/src/panicking.rs:485:40
  45:     0x562d63b3e45b - std::panicking::try::h51fd60d29cf6feee
                               at /rustc/2f6bc5d259e7ab25ddfdd33de53b892770218918/library/std/src/panicking.rs:449:19
  46:     0x562d63b3e45b - std::panic::catch_unwind::h8e7f9d8c4a32bc2f
                               at /rustc/2f6bc5d259e7ab25ddfdd33de53b892770218918/library/std/src/panic.rs:140:14
  47:     0x562d63b3e45b - std::thread::Builder::spawn_unchecked_::{{closure}}::h07454567f5413f25
                               at /rustc/2f6bc5d259e7ab25ddfdd33de53b892770218918/library/std/src/thread/mod.rs:528:30
  48:     0x562d63b3e45b - core::ops::function::FnOnce::call_once{{vtable.shim}}::hfb80b7cf501e16ff
                               at /rustc/2f6bc5d259e7ab25ddfdd33de53b892770218918/library/core/src/ops/function.rs:250:5
  49:     0x562d63bfa815 - <alloc::boxed::Box<F,A> as core::ops::function::FnOnce<Args>>::call_once::h71c821b130855373
                               at /rustc/2f6bc5d259e7ab25ddfdd33de53b892770218918/library/alloc/src/boxed.rs:1985:9
  50:     0x562d63bfa815 - <alloc::boxed::Box<F,A> as core::ops::function::FnOnce<Args>>::call_once::h1701cec9acb1061c
                               at /rustc/2f6bc5d259e7ab25ddfdd33de53b892770218918/library/alloc/src/boxed.rs:1985:9
  51:     0x562d63bfa815 - std::sys::unix::thread::Thread::new::thread_start::h6dd5ef62dde103f1
                               at /rustc/2f6bc5d259e7ab25ddfdd33de53b892770218918/library/std/src/sys/unix/thread.rs:108:17
  52:     0x7f803c6e9b43 - <unknown>
  53:     0x7f803c77ba00 - <unknown>
  54:                0x0 - <unknown>
thread caused non-unwinding panic. aborting.
error: test failed, to rerun pass `--test misc`

This seems to be a consequence of rust-lang/rust#98112.

As an easy “fix”, we can disable debug assertions for the tests, but we risk introducing more bugs. Not sure how to fix it otherwise, given that we do support unaligned memory access in rbpf at this time.
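One possible direction, sketched under the assumption that the abort comes from the alignment check that debug builds insert on raw pointer dereferences: go through `std::ptr::read_unaligned` rather than dereferencing a cast pointer. The helper name `load_u64_unaligned` is made up for illustration and is not part of rbpf.

```rust
use std::ptr;

/// Hypothetical helper (not part of rbpf): reads a native-endian u64 from
/// `mem` at byte offset `offset`, without requiring any alignment.
/// Returns None if the 8-byte read would fall outside the slice.
pub fn load_u64_unaligned(mem: &[u8], offset: usize) -> Option<u64> {
    let end = offset.checked_add(8)?;
    let bytes = mem.get(offset..end)?;
    // SAFETY: `bytes` covers exactly 8 readable bytes, and `read_unaligned`
    // places no alignment requirement on the source pointer.
    Some(unsafe { ptr::read_unaligned(bytes.as_ptr() as *const u64) })
}
```

A plain `*(ptr as *const u64)` on the same address is what would trip the check when the address is not a multiple of 8. Whether the same approach applies to the code path in `src/jit.rs` has not been verified here.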

Tag a new crate version and push it to crates.io

We want to tag a new version and update the crate available for download on crates.io.

ETA: end of this month.

There have been quite a few changes since the last version, although I'm not sure the current version is stable enough to deserve 1.0.0. Therefore I am considering bumping the version number to 0.1.0.

Things we want to fix and complete before that:

  • Merging #31 (propagate errors instead of panicking in verifier).
  • Possibly doing the same thing at other parts of the crate? That would be to complete #30.
  • Address #29 (provide a way to pass a custom verifier).
  • Check if dependencies can be updated to newer releases without breaking the lib / the tests. (#35)
  • Does anyone need anything else?

RFC: make asm_parser private

I do not see any particular reason to offer the asm_parser to the end user. As I see it, the whole assembler API lives in the assembler module. What about making asm_parser private in src/lib.rs? @rlane, any thoughts?

Note that as a consequence, we would have to move the tests from the separate file to a test module at the end of src/asm_parser.rs. (I don't think the functions/structs from the module would remain accessible otherwise.)
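A rough sketch of what that layout could look like, with both modules inlined in one file for brevity. The functions `parse_register` and `register_index` are invented for the example and do not reflect the actual contents of either module.

```rust
// Private module: not reachable from outside the crate.
// In src/lib.rs this would be `mod asm_parser;` instead of `pub mod asm_parser;`.
mod asm_parser {
    /// Hypothetical parser function: turns "r0".."r10" into a register index.
    pub fn parse_register(token: &str) -> Option<u8> {
        let index: u8 = token.strip_prefix('r')?.parse().ok()?;
        if index <= 10 { Some(index) } else { None }
    }

    // Tests live inside the module, so they can still reach its items
    // even though the module itself is no longer public.
    #[cfg(test)]
    mod tests {
        use super::*;

        #[test]
        fn parses_valid_register() {
            assert_eq!(parse_register("r10"), Some(10));
        }
    }
}

// Public module: the only assembler entry point exposed to users.
pub mod assembler {
    /// Hypothetical public wrapper around the private parser.
    pub fn register_index(token: &str) -> Option<u8> {
        super::asm_parser::parse_register(token)
    }
}
```

Integration tests under tests/ only see the public API of the crate, which is why the existing parser tests would need to move into the module.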

Define a type for the eBPF helper functions

So we declare a type for a verifier function:

pub type Verifier = fn(prog: &[u8]) -> Result<(), Error>;

It would be nice to have the same for the helpers, so that instead of:

pub fn register_helper(&mut self,
                       key: u32,
                       function: fn (u64, u64, u64, u64, u64) -> u64)
                       -> Result<(), Error>

... we'd have something like:

pub fn register_helper(&mut self,
                       key: u32,
                       function: Helper) -> Result<(), Error>

(EbpfHelper or Helper, I don't really know what sounds best. Maybe just Helper if people call that with the rbpf:: prefix already?)

From discussion on #32.
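To make the proposal concrete, here is a minimal sketch of such an alias. `Helper` is one of the two names floated above; `HelperTable`, its `call` method, and the use of `std::io::Error` as the error type are assumptions made for the example, not the crate's actual code.

```rust
use std::collections::HashMap;
use std::io::Error;

/// Proposed alias for eBPF helper functions (name not settled).
pub type Helper = fn(u64, u64, u64, u64, u64) -> u64;

/// Hypothetical stand-in for the part of the VM that stores helpers.
pub struct HelperTable {
    helpers: HashMap<u32, Helper>,
}

impl HelperTable {
    pub fn new() -> Self {
        HelperTable { helpers: HashMap::new() }
    }

    /// Same shape as the signature above, with the alias in place of the
    /// spelled-out function pointer type.
    pub fn register_helper(&mut self, key: u32, function: Helper) -> Result<(), Error> {
        self.helpers.insert(key, function);
        Ok(())
    }

    /// Calls the helper registered under `key`, if any (illustration only).
    pub fn call(&self, key: u32, args: [u64; 5]) -> Option<u64> {
        let helper = self.helpers.get(&key)?;
        Some(helper(args[0], args[1], args[2], args[3], args[4]))
    }
}

/// Example helper: adds its first two arguments and ignores the rest.
pub fn sum_first_two(a: u64, b: u64, _c: u64, _d: u64, _e: u64) -> u64 {
    a.wrapping_add(b)
}
```

Since the alias is just another name for the same function pointer type, existing callers that pass a plain `fn` item should keep compiling unchanged.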

Add ability for caller to replace verification function

The verification function is hard-coded in rbpf. Providing a way for the caller to override the default verifier with their own would offer a lot of flexibility and reduce the need for others to fork and customize this repository for their own needs.
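A minimal sketch of how this could look, reusing the `Verifier` signature quoted earlier on this page. The `Vm` struct, `set_verifier`, `set_program`, `has_program`, and both verifier functions are hypothetical and only illustrate the idea; they are not rbpf's actual API.

```rust
use std::io::{Error, ErrorKind};

/// Signature of a verifier function, as quoted in the helper-type issue.
pub type Verifier = fn(prog: &[u8]) -> Result<(), Error>;

/// Hypothetical default check: eBPF instructions are 8 bytes long, so the
/// program must be non-empty and its length a multiple of 8.
pub fn default_verifier(prog: &[u8]) -> Result<(), Error> {
    if prog.is_empty() || prog.len() % 8 != 0 {
        return Err(Error::new(ErrorKind::Other, "invalid program length"));
    }
    Ok(())
}

/// Hypothetical caller-supplied verifier that accepts everything.
pub fn accept_all(_prog: &[u8]) -> Result<(), Error> {
    Ok(())
}

/// Hypothetical, stripped-down VM that only stores a program and a verifier.
pub struct Vm<'a> {
    prog: Option<&'a [u8]>,
    verifier: Verifier,
}

impl<'a> Vm<'a> {
    pub fn new() -> Self {
        Vm { prog: None, verifier: default_verifier }
    }

    /// Replaces the verifier used by later calls to `set_program`.
    pub fn set_verifier(&mut self, verifier: Verifier) {
        self.verifier = verifier;
    }

    /// Runs the current verifier and stores the program only if it passes.
    pub fn set_program(&mut self, prog: &'a [u8]) -> Result<(), Error> {
        (self.verifier)(prog)?;
        self.prog = Some(prog);
        Ok(())
    }

    pub fn has_program(&self) -> bool {
        self.prog.is_some()
    }
}
```

One open design question with this shape is what to do when the verifier is swapped after a program has already been loaded; re-running the new verifier on the stored program would be the conservative choice.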
