Comments (9)
I think this only happens with wasm contracts. @BugFreeSoftware is executing view calls in parallel supported by the wasm interpreter?
The following cluster test reproduces the issue. Replacing go func
with just func
makes it so that the view calls are sent in sequence and then the test passes.
func TestSpamCallViewWasm(t *testing.T) {
testutil.SkipHeavy(t)
clu := newCluster(t)
committee := []int{0}
quorum := uint16(1)
addr, err := clu.RunDKG(committee, quorum)
require.NoError(t, err)
chain, err := clu.DeployChain("chain", clu.Config.AllNodes(), committee, quorum, addr)
require.NoError(t, err)
e := &env{t: t, clu: clu}
chEnv := &chainEnv{
env: e,
chain: chain,
}
chEnv.requestFunds(scOwnerAddr, "client")
chEnv.deployContract("inccounter", "inccounter sc", nil)
{
// increment counter once
entryPoint := iscp.Hn("increment")
tx, err := chEnv.chainClient().Post1Request(incHname, entryPoint)
require.NoError(t, err)
err = chain.CommitteeMultiClient().WaitUntilAllRequestsProcessed(chain.ChainID, tx, 30*time.Second)
require.NoError(t, err)
}
const n = 1000
ch := make(chan error, n)
for i := 0; i < n; i++ {
go func(i int) {
cl := chain.SCClient(iscp.Hn("inccounter"), nil)
r, err := cl.CallView("getCounter", nil)
if err != nil {
ch <- err
return
}
v, ok, err := codec.DecodeInt64(r.MustGet(inccounter.VarCounter))
if err != nil {
} else if !ok {
err = errors.New("!ok")
} else if v != 1 {
err = errors.New("v != 1")
}
ch <- err
}(i)
}
for i := 0; i < n; i++ {
err := <-ch
if err != nil {
t.Error(err)
}
}
}
from wasp.
Adding a mutex in WasmProcessor::call()
fixes the panics:
type WasmProcessor struct {
...
mu sync.Mutex
}
func (host *WasmProcessor) call(ctx iscp.Sandbox, ctxView iscp.SandboxView) (dict.Dict, error) {
host.mu.Lock()
defer host.mu.Unlock()
...
}
(But of course this is not the correct solution, we need to be able to run view calls in parallel).
from wasp.
I've been thinking about this and I don't think this is possible with the current Wasm VM. The assumption with Wasm is currently: no threading, because it runs deterministic code. We load our Wasm SC code in an instance of the Wasm runtime and there it runs in isolation from other Wasm SCs. These separate instances can completely run in parallel. They have no way to interfere with each other at Wasm level, and this parallelization is supported by the Wasm runtime (WasmTime/Wasmer/etc).
But how does this work when a state update comes in while view calls are running already? That could seriously affect the results of the view calls, when the state is updated halfway the call. Or do we have a separate state copy from which the views are taking their data, which cannot change until all views are done? Also, even though the Wasm SCs run in separate VM instances, ultimately they share state because the entire chain has only a single state. Which means that calls to core contracts can be made in parallel from those SCs and potentially impact each other unless that is properly synchronized.
from wasp.
I don't have so much knowledge about the deeper logic there, but one thing that comes to mind is,
that the events inside the contract don't get emitted right at the call, but rather after the sc function has ended.
Maybe it's enough to copy the state into some other location which the callview api call then accesses?
Then these values would only change after each function has ended, which would be fine - at least for my use case.
Kind of like the same behavior with the events. But that doesn't solve the last issue Eric said then.
Maybe we could add something like an "uncertain state"?
from wasp.
@BugFreeSoftware the protection from state updates is already in place, in the form of an optimistic reader. Whenever the state is read during the view call, if it has changed the optimistic reader will panic, and then the whole view call will be retried. See packages/webapi/webapiutil/callview.go
.
To make it work with wasm, one idea is to have a pool of instances of wasm interpreters, and in WasmProcessor::Call()
take a free interpreter from the pool, use it to run the code, then return it to the pool. Would something like this work?
from wasp.
I have been thinking about that and we can probably do better. It's not the code that is the problem but the code re-entrancy. We already save the memory state after initialization is done, and restore it before every function call. Because it needs the data from the initialization but after that it should be completely deterministic. I'm going to figure out if it is possible to call functions in parallel on the Wasm VM by giving each function call their own memory upon calling. It could be we are reusing the same memory block for every call and then parallel calls become an issue. But if each call has its own memory that should not be an issue. After that it's only a matter of verifying that the interface to ISCP is re-entrant as well and does not rely on any state. I thought I removed all of those but I may have missed one. Will investigate.
from wasp.
Adding a mutex in
WasmProcessor::call()
fixes the panics:
No it doesn't. It will block any recursive call. Several tests will hang indefinitely.
from wasp.
What did work was to put the mutex here: https://github.com/iotaledger/wasp/blob/roulette_poc/packages/webapi/webapiutil/callview.go#L17
Its not great but at least for the web api, it seems to make it work at the end.
I believe NewFromChain
is not threadsafe at some point. As putting the Mutex somewhere below brings up issues again..
Maybe we are referencing some data structure instead of copying it?
from wasp.
Fixed the problem for now. Mutex per SC processor. Can handle nested calls now. We can have multiple different SCs running in parallel because they each use a different processor. This should fix our problem for now until I can get multiple instances per processor running.
@lmoe please can you try to reproduce the WasmTime crash again with this code to see if this fixes that one for now?
from wasp.
Related Issues (20)
- Investigate Pruning performance
- Performance Improvement around too much CPU usage on GetReceiptsByBlockNumber
- Grafana Metrics do not display timed out requests
- Extend Foundry minting metadata HOT 2
- Check for IRC27 compatible Metadata in ERC721NFTs HOT 1
- Set name of NFT
- Make Solo log level configurable
- Unify type definitions HOT 1
- Make `getSenderAccount` a view call
- Do URL sanitisation in wasp-cli
- Method handler crashes after using wasp node for some hours HOT 12
- Error while calling smart contract method with web3 version 4+ HOT 2
- Addressing Linter Suggestions from 'make lint' Command Execution
- Node dashboard (and API?) down while syncing as an access node
- New user not working after restart
- wasp-cli metrics nodeconn output only zeroes?
- Add missing docstrings HOT 1
- `debug_traceTransaction` returns an empty trace
- `debug_traceTransaction` returns empty gasUsed HOT 1
- Add support for `attributes` to the `ERC721NFTs` contract
Recommend Projects
-
React
A declarative, efficient, and flexible JavaScript library for building user interfaces.
-
Vue.js
🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.
-
Typescript
TypeScript is a superset of JavaScript that compiles to clean JavaScript output.
-
TensorFlow
An Open Source Machine Learning Framework for Everyone
-
Django
The Web framework for perfectionists with deadlines.
-
Laravel
A PHP framework for web artisans
-
D3
Bring data to life with SVG, Canvas and HTML. 📊📈🎉
-
Recommend Topics
-
javascript
JavaScript (JS) is a lightweight interpreted programming language with first-class functions.
-
web
Some thing interesting about web. New door for the world.
-
server
A server is a program made to process requests and deliver data to clients.
-
Machine learning
Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.
-
Visualization
Some thing interesting about visualization, use data art
-
Game
Some thing interesting about game, make everyone happy.
Recommend Org
-
Facebook
We are working to build community through open source technology. NB: members must have two-factor auth.
-
Microsoft
Open source projects and samples from Microsoft.
-
Google
Google ❤️ Open Source for everyone.
-
Alibaba
Alibaba Open Source for everyone
-
D3
Data-Driven Documents codes.
-
Tencent
China tencent open source team.
from wasp.