utah-scs / splinter
A low-latency, extensible, multi-tenant key-value store.
Steps to reproduce:
Run sudo scripts/run-server, then use a client with use_invoke = true to run sudo scripts/run-ycsb.
Note: I'm not sure whether other configurations also trigger these errors.
Sandstorm fails with two types of panics present in the log:
ethanr@sandstorm01:~/Sandstorm$ sudo scripts/run-server
INFO:server: Starting up Sandstorm server with config ServerConfig { mac_address: "3c:fd:fe:04:9f:c2", ip_address: "192.168.0.2", udp_port: 0, nic_pci: "0000:04:00.1", client_mac: "3c:fd:fe:04:b0:e2", client_ip: "192.168.0.1", num_tenants: 8, install_addr: "127.0.0.1:7700" }
INFO:server: Populating test data table and extensions...
INFO:server: Finished populating data and extensions
EAL: Detected 32 lcore(s)
EAL: Probing VFIO support...
Devname: "0000:04:00.1"
EAL: PCI device 0000:04:00.1 on NUMA socket 0
EAL: probe driver: 8086:1572 net_i40e
INFO:server: Successfully added scheduler(TID 43978) with rx,tx,sibling queues (3, 3, 4) to core 13.
INFO:server: Successfully added scheduler(TID 43975) with rx,tx,sibling queues (0, 0, 1) to core 10.
INFO:server: Successfully added scheduler(TID 43979) with rx,tx,sibling queues (4, 4, 5) to core 14.
INFO:server: Successfully added scheduler(TID 43980) with rx,tx,sibling queues (5, 5, 6) to core 15.
INFO:server: Successfully added scheduler(TID 43977) with rx,tx,sibling queues (2, 2, 3) to core 12.
INFO:server: Successfully added scheduler(TID 43982) with rx,tx,sibling queues (7, 7, 0) to core 17.
INFO:server: Successfully added scheduler(TID 43981) with rx,tx,sibling queues (6, 6, 7) to core 16.
INFO:server: Successfully added scheduler(TID 43976) with rx,tx,sibling queues (1, 1, 2) to core 11.
thread 'sched-17' panicked at 'attempt to subtract with overflow', /users/ethanr/Sandstorm/net/framework/src/native/zcsi/mbuf.rs:91thread ':sched-109' panicked at '
attempt to subtract with overflownote: Run with `RUST_BACKTRACE=1` for a backtrace.
', /users/ethanr/Sandstorm/net/framework/src/native/zcsi/mbuf.rs:91:9
WARN:server: Detected misbehaving task 43975 on core 10.
thread 'main' panicked at 'called `Result::unwrap()` on an `Err` value: "SendError(..)"', libcore/result.rs:983:5
"attempt to subtract with overflow", from net/framework/src/native/zcsi/mbuf.rs:91
"called `Result::unwrap()` on an `Err` value: "SendError(..)"". I believe the unwrap is coming from net/framework/src/scheduler/context.rs, line 200.
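For readers unfamiliar with these two panic messages, here is a minimal sketch of how each one typically arises and how it can be avoided. It is not the Sandstorm code: trim_len and try_notify are hypothetical helpers, and the sketch assumes the SendError comes from a std::sync::mpsc channel, which the log does not confirm.

```rust
use std::sync::mpsc;

/// Hypothetical helper: shrink a buffer length by `n` bytes without
/// panicking when `n` is larger than the current length.
pub fn trim_len(len: usize, n: usize) -> Option<usize> {
    // In a debug build, `len - n` panics with "attempt to subtract with
    // overflow" when n > len; `checked_sub` returns None instead.
    len.checked_sub(n)
}

/// Hypothetical helper: report a task id over a channel.
/// Sending on a channel whose receiver has been dropped returns an
/// Err(SendError(..)); calling unwrap() on that result would panic.
pub fn try_notify(tx: &mpsc::Sender<u64>, task_id: u64) -> bool {
    tx.send(task_id).is_ok()
}
```

If this is what is happening here, the second panic may simply be a consequence of the first: once the scheduler thread dies, its end of the channel is dropped and later sends fail.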
I'm trying to debug the panic issue (#9), and it would be nice to be able to work from a binary that has been built with debug info.
Client code is scattered across multiple files, and the same code (request packet formation, request Tx, response Rx, etc.) is duplicated across different clients, so every change has to be made manually in each client file.
I will be adding an ext/panic extension soon to make it easier to reproduce this issue. I will edit this issue when I have done so.
For now, this issue can be trivially reproduced by adding a panic!("muahaha") to the main closure of the get extension. The first handful of invocations will panic and be caught, as intended, but then invariably a double panic will occur, which causes the runtime to abort.
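As a point of reference, catching a panic from an extension closure looks roughly like the sketch below. This assumes the server uses std::panic::catch_unwind (the backtrace shows __rust_maybe_catch_panic, which is consistent with that); run_extension is a hypothetical stand-in, not the actual Container::run code.

```rust
use std::panic::{self, AssertUnwindSafe};

/// Hypothetical stand-in for invoking an untrusted extension closure.
/// Returns Err with the panic message if the closure panicked.
pub fn run_extension<F: FnOnce() -> u64>(ext: F) -> Result<u64, String> {
    panic::catch_unwind(AssertUnwindSafe(ext)).map_err(|payload| {
        // Panic payloads are usually a &str or a String.
        if let Some(s) = payload.downcast_ref::<&str>() {
            s.to_string()
        } else if let Some(s) = payload.downcast_ref::<String>() {
            s.clone()
        } else {
            "unknown panic payload".to_string()
        }
    })
}
```

Note that catch_unwind only helps with a single unwinding panic. A second panic raised while the first is still being handled aborts the process, which matches the behavior described above.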
This issue has been hard to debug, but here is what I do know:
... panic!(), which calls the Rust runtime handlers present in frames 0-5 of the backtrace, as well as the panic hook.
... panic!(). This code panics during the formatting of the panic string; notice how std::panicking::rust_panic_with_hook appears twice in the backtrace. We only see one copy of this function in our backtrace.
Below is a backtrace of the double panic.
Here is a more complete log of a full session, including a backtrace of one of the single panics that was successfully caught: fullPanicIssueLog.txt.
stack backtrace:
0: 0x7f3d01b67c6b - std::sys::unix::backtrace::tracing::imp::unwind_backtrace::h8458cd77216b6cb4
at libstd/sys/unix/backtrace/tracing/gcc_s.rs:49
1: 0x7f3d01b35b10 - std::sys_common::backtrace::print::hc884ca89c7ab7468
at libstd/sys_common/backtrace.rs:71
at libstd/sys_common/backtrace.rs:59
2: 0x7f3d01b5b3bd - std::panicking::default_hook::{{closure}}::h4a3e30c6d4d0cba4
at libstd/panicking.rs:206
3: 0x7f3d01b5b11b - std::panicking::default_hook::hea868ab86a1b7a87
at libstd/panicking.rs:222
4: 0x7f3d01b5b8cf - std::panicking::rust_panic_with_hook::h2568e23a59a493fa
at libstd/panicking.rs:400
5: 0x7f3d01b21065 - std::panicking::begin_panic::hce7e5a88f7ff4fa1
6: 0x7f3d01b20ee2 - get::init::{{closure}}::h620c422872f2a80f
7: 0x555e505f5e88 - std::panickingWARN:server: Detected misbehaving task 84558 on core 10.::
try::do_call::ha34c11298de1ecc2
8: 0x555e50709cae - __rust_maybe_catch_panic
at libpanic_unwind/lib.rs:102
9: 0x555e505e2d0d - <db::container::Container as db::task::Task>::run::h29a2f1bd716d50bd
10: 0x555e505f587f - db::sched::RoundRobin::poll::hd3a9151cfc6bbe36
11: 0xWARN:server: Detected misbehaving task 84559 on core 17.555e50653927
- e2d2::schedulerINFO:server: Successfully added scheduler(TID 84560) with rx,tx,sibling queues (0, 0, 1) to core 10.::
standalone_scheduler::StandaloneScheduler::execute_internal::h4d688f42b578547c
12: 0x555e5065371a - e2d2::scheduler::thread 'standalone_scheduler<unnamed>::' panicked at 'StandaloneSchedulerexplicit panic::', handle_requestsrc/lib.rs:::h393b287b1d55187346
:13
13: 0x555e50662b31thread ' - <unnamed>std' panicked at '::explicit panicsys_common', ::src/lib.rsbacktrace:::46__rust_begin_short_backtrace:::13h7a42e3ab2bac4c68
14: 0x555e5066111b - std::panicking::try::do_call::h8e2568bebf30af60
15: 0x555e50709cae - __rust_maybe_catch_panic
INFO:server: Successfully added scheduler(TID 84561) with rx,tx,sibling queues (7, 7, 0) to core 17.
at libpanic_unwind/lib.rs:102
16: thread ' <unnamed> ' panicked at '0xexplicit panic555e50659d0c', - src/lib.rs<:F46 :as13
alloc::boxed::FnBox<A>>::thread 'call_box<unnamed>::' panicked at 'h3ddc12d9236471d3explicit panic
', src/lib.rs17:: 46 : 13
0x555e506ff117 - std::sys_common::thread::start_thread::h441a470255b0983b
at /checkout/src/liballoc/boxed.rs:645
at libstd/sys_common/thread.rs:24
18: 0x555e506f24d8 - std::sys::unix::thread::Thread::new::thread_start::h8246db0ba3b8ab5d
at libstd/sys/unix/thread.rs:90
19: 0x7f3d1c2836b9 - start_thread
20: 0x7f3d1bda341c - clone
21: 0x0 - <unknown>
Currently, extensions can't cause arbitrary memory accesses using large stack values, but they can exhaust the stack since they run on the database worker's kernel thread.
The default stack overflow handler for Rust is essentially a signal handler that checks whether the faulting access hit the guard page at the end of the stack. If it did, the handler crashes the program with a "stack overflow" message. If it did not, the handler removes itself and retries the access, which usually results in a segfault.
This is all safe, but it's a clear denial-of-service. Sandstorm needs to override the signal handler and replace it with one that doesn't terminate the database process.
How we unwind the extension call that caused the problem is a separate issue. For now, detecting the case and preventing the crash is key.
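The detection step can be sketched as a pure function over the faulting address. This is only an illustration of the guard-page check described above: FaultKind, GuardPage, and classify_fault are hypothetical names, and a real handler would obtain the address from the signal information and the guard range from the thread's stack bounds.

```rust
/// What a fault handler could decide after inspecting the faulting address.
#[derive(Debug, PartialEq)]
pub enum FaultKind {
    /// The access landed in the guard page: the stack is exhausted.
    StackOverflow,
    /// Any other invalid access.
    Other,
}

/// Hypothetical description of a worker thread's guard page.
pub struct GuardPage {
    pub start: usize,
    pub len: usize,
}

/// Classify a faulting address against the guard page range
/// [start, start + len).
pub fn classify_fault(addr: usize, guard: &GuardPage) -> FaultKind {
    if addr >= guard.start && addr - guard.start < guard.len {
        FaultKind::StackOverflow
    } else {
        FaultKind::Other
    }
}
```

A replacement handler would take the StackOverflow branch as the cue to stop the offending extension rather than terminate the database process.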
Hi, @ankit-iitb.
I have a question about the implementation of YCSB+T.
Where did you get the latest YCSB+T project?
The only place I can find to fork it is on Akon Dey's github.
https://github.com/akon-dey/YCSB
Could you help me implement the YCSB+T benchmark correctly?
In his paper he describes the following methods:
"
• doTransactionInsert() creates a new account with an
initial balance captured from doTransactionDelete() operation described below.
• doTransactionRead() reads a set of account balances
determined by the key generator.
• doTransactionScan() scans the database given the start
key and the number of records and fetches them from the
data base.
• doTransactionUpdate() reads a record and add $1 from
the balance captured from delete operations to it and write
it back.
• doTransactionDelete() reads an account record, add the
amount to the captured the balance (capture used in
doTransactionInsert()) and then deletes the record.
• doTransactionReadModifyWrite() reads two records,
subtracts $1 from the one of the two and adds $1 to
the other before writing them both back.
"
In the akon-dey repository that I forked, I didn't find implementations of doTransactionInsert(), doTransactionRead(), doTransactionScan(), doTransactionUpdate(), or doTransactionDelete().
I just noticed that the doTransactionReadModifyWrite() method is implemented, where it subtracts the value 1 from account A and assigns that value to account B.
Could you help me understand this part of the implementation?
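To make the question concrete, the read-modify-write operation from the quoted passage can be sketched as follows. This is an illustration over an in-memory map, not code from either repository; read_modify_write and total are hypothetical names, and a real implementation would run both writes inside one transaction.

```rust
use std::collections::HashMap;

/// Read two account records, subtract 1 from `from`, add 1 to `to`,
/// and write both back. Returns false if either account is missing.
pub fn read_modify_write(
    accounts: &mut HashMap<String, i64>,
    from: &str,
    to: &str,
) -> bool {
    // Read both balances first; abort if either record does not exist.
    let (a, b) = match (accounts.get(from), accounts.get(to)) {
        (Some(&a), Some(&b)) => (a, b),
        _ => return false,
    };
    // Moving 1 from an account to itself leaves it unchanged.
    if from == to {
        return true;
    }
    accounts.insert(from.to_string(), a - 1);
    accounts.insert(to.to_string(), b + 1);
    true
}

/// Sum of all balances; a transfer between two accounts leaves it unchanged.
pub fn total(accounts: &HashMap<String, i64>) -> i64 {
    accounts.values().sum()
}
```

In this sketch the sum of all balances is the same before and after each call, which is the property that makes a transfer easy to check for lost or partial updates.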
Regards,
Caio
I want to run Splinter on my local cluster, but I have run into a lot of problems.
When I run build.sh in the net directory, the build fails with the following errors:
Compiling zcsi-delay v0.1.0 (/root/splinter/net/test/delay-test)
error: the legacy LLVM-style asm! syntax is no longer supported
--> test/delay-test/src/nf.rs:7:9
|
7 | asm!("nop"
| ^---
| |
| help: replace with: llvm_asm!
| |
8 | | :
9 | | :
10 | | :
11 | | : "volatile");
| |__________________^
|
= note: consider migrating to the new asm! syntax specified in RFC 2873
= note: alternatively, switch to llvm_asm! to keep your code working as it is
warning: use of deprecated function `time::precise_time_ns`: Use `OffsetDateTime::now() - OffsetDateTime::unix_epoch()` to get a `Duration` since a known epoch.
--> test/delay-test/src/main.rs:85:29
|
85 | let mut start = time::precise_time_ns() as f64 / CONVERSION_FACTOR;
| ^^^^^^^^^^^^^^^^^^^^^
|
= note: `#[warn(deprecated)]` on by default
warning: use of deprecated function `time::precise_time_ns`: Use `OffsetDateTime::now() - OffsetDateTime::unix_epoch()` to get a `Duration` since a known epoch.
--> test/delay-test/src/main.rs:90:27
|
90 | let now = time::precise_time_ns() as f64 / CONVERSION_FACTOR;
| ^^^^^^^^^^^^^^^^^^^^^
error[E0593]: closure is expected to take 4 arguments, but it takes 2 arguments
--> test/delay-test/src/main.rs:70:45
|
70 | context.add_pipeline_to_run(Arc::new(
| ____________________________^
71 | | move |p, s: &mut StandaloneScheduler| test(p, s, delay),
| | ------------------------------------- takes 2 arguments
72 | | ));
| |^ expected closure that takes 4 arguments
error: aborting due to 2 previous errors; 2 warnings emitted
For more information about this error, try `rustc --explain E0593`.
error: could not compile zcsi-delay
I don't know how to fix this. Please help. :)
Or should I run Splinter on CloudLab instead? I don't know how to get a CloudLab account.
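For the first error, one option is to migrate the no-op to the new asm! syntax, as the compiler note suggests. The sketch below assumes a toolchain with stable inline assembly (Rust 1.59 or later) and a target where "nop" is a valid instruction, such as x86_64; delay_loop is a hypothetical function, not the code in test/delay-test/src/nf.rs. Pinning the older toolchain the project was written against would be the other way around this error.

```rust
use std::arch::asm;

/// Hypothetical delay loop: execute `iterations` no-op instructions and
/// return how many were issued.
pub fn delay_loop(iterations: u64) -> u64 {
    let mut done = 0;
    for _ in 0..iterations {
        // New-style asm! blocks are treated as having side effects unless
        // marked `pure`, so the legacy "volatile" option is not needed.
        unsafe {
            asm!("nop", options(nomem, nostack, preserves_flags));
        }
        done += 1;
    }
    done
}
```

This does not address the E0593 closure error, which depends on the signature that add_pipeline_to_run expects in the framework version being built.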
When I run the YCSB test, I get errors like this:
WARN:splinter::dispatch: Failed to send all packets!
[the same warning repeats 33 more times]
thread 'sched-0' panicked at 'Failed to allocate packet for request!', /root/splinter/db/src/rpc.rs:129:10
My hugepage configuration is 4 x 1 GB pages and 10480 x 2 MB pages.
I would like to know the environment configuration used in the paper. Any help would be much appreciated!
Right now extensions can allocate arbitrary amounts of the database process's heap. We need to inject a custom allocator into all extensions so that we can track heap allocations that happen while running untrusted code.
For now, detecting that an extension is past some allocation limit and forcing a panic is acceptable. Eventually, we'll want to track all allocations, unwind the extension's call stack, and free all heap allocations, but that will require us tackling the larger issue of safely unwinding extension call stacks.
There are some tricky edges to this: for example, if a vector is allocated in an extension call, it is possible that later operations on that vector could cause it to reallocate. Related: if an extension calls out to the database, it's likely that we shouldn't count heap allocations against the extension that happen in trusted code (in fact, ideally, we wouldn't have heap allocations that escape the trusted scope at all).
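A minimal sketch of the tracking idea, assuming the limit is enforced by wrapping the system allocator behind the standard GlobalAlloc trait. LimitedAlloc is a hypothetical type; how it would be injected into extensions, and how usage would be attributed per extension, is left open here.

```rust
use std::alloc::{GlobalAlloc, Layout, System};
use std::sync::atomic::{AtomicUsize, Ordering};

/// Hypothetical wrapper around the system allocator that counts live bytes
/// and refuses requests that would push usage past `limit`.
pub struct LimitedAlloc {
    used: AtomicUsize,
    limit: usize,
}

impl LimitedAlloc {
    pub const fn new(limit: usize) -> Self {
        LimitedAlloc { used: AtomicUsize::new(0), limit }
    }

    /// Bytes currently allocated through this wrapper.
    pub fn used(&self) -> usize {
        self.used.load(Ordering::Relaxed)
    }
}

unsafe impl GlobalAlloc for LimitedAlloc {
    unsafe fn alloc(&self, layout: Layout) -> *mut u8 {
        // Reserve the bytes first; back out if the limit would be exceeded.
        let prev = self.used.fetch_add(layout.size(), Ordering::Relaxed);
        if prev + layout.size() > self.limit {
            self.used.fetch_sub(layout.size(), Ordering::Relaxed);
            return std::ptr::null_mut(); // null signals allocation failure
        }
        let ptr = unsafe { System.alloc(layout) };
        if ptr.is_null() {
            self.used.fetch_sub(layout.size(), Ordering::Relaxed);
        }
        ptr
    }

    unsafe fn dealloc(&self, ptr: *mut u8, layout: Layout) {
        unsafe { System.dealloc(ptr, layout) };
        self.used.fetch_sub(layout.size(), Ordering::Relaxed);
    }
}
```

Because the default GlobalAlloc::realloc is built from alloc and dealloc, a vector that grows later is still counted by this wrapper. Separating allocations made by trusted database code from those made by the extension would need additional bookkeeping that this sketch does not attempt.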
I got this error when trying to run the PUSHBACK workload:
thread 'INFO:splinter::dispatch: Received many responses...sched-7
' panicked at 'index 377 out of range for slice of length 278', INFO:pushback: PUSHBACK/rustc/XXXX/src/libcore/slice/mod.rs
Configuration used:
key_size = 30
value_size = 100
num_aggr = 2
order = 1000
Adding it here in case I forget.
The DPDK web server now redirects requests for dpdk-(version).tar.gz to another location. Because curl does not follow the redirect, it downloads an invalid file, and untarring that file fails with the error "gzip: stdin: not in gzip format".
To let curl follow the redirect, add the -L argument to the curl command in ./net/3rdparty/get-dpdk.sh.
DeleteKey functionality would be especially useful as some stored items may no longer be needed. DeleteTable might also be useful in future extensions.
Hello,
I am looking for something like this for a cross-platform project that also includes Windows, Linux, and macOS.
Could this be made to work for that purpose?
Thanks
I am a Rust beginner. I tried to run the server and got this error:
thread 'main' panicked at 'Failed to load get() extension.', /root/splinter/db/src/master.rs:598:13
stack backtrace:
0: std::panicking::begin_panic
at /rustc/1c389ffeff814726dec325f0f2b0c99107df2673/library/std/src/panicking.rs:519:12
1: db::master::Master::load_test
at /root/splinter/db/src/master.rs:598:13
2: client::main
at ./src/bin/client/client.rs:632:9
3: core::ops::function::FnOnce::call_once
at /rustc/1c389ffeff814726dec325f0f2b0c99107df2673/library/core/src/ops/function.rs:227:5
I used ext/safe-compile to compile the extension:
~/splinter/ext# ./safe-compile get get
Compiling src/lib.rs for get target into target/get/deps/libget.so
ERROR: All extensions must include #![no_std]; extensions must only use modules exposed through the sandstorm crate.
Please help me. Thank you very much. :)