
Monorepo for the Agoric JavaScript smart contract platform.

License: Apache License 2.0


Agoric Platform SDK

This repository contains most of the packages that make up the upper layers of the Agoric platform, with the endo repository providing the lower layers. If you want to build on top of this platform, you don't need these repositories: instead you should follow our instructions for getting started with the Agoric SDK.

But if you are improving the platform itself, these are the repositories to use.

Prerequisites

Prerequisites are enforced in various places that should be kept synchronized with this section (e.g., repoconfig.sh defines golang_version_check and nodejs_version_check shell functions).

  • Git
  • Go ^1.20.2
  • Node.js ^18.12 or ^20.9
    • we generally support the latest LTS release: use nvm to keep your local system up-to-date
  • Yarn (npm install -g yarn)
  • gcc >=10, clang >=10, or another compiler with __has_builtin()

Any version of Yarn will do: the .yarnrc file should ensure that all commands use the specific checked-in version of Yarn (stored in .yarn/releases/), which we can update later with PRs in conjunction with any necessary compatibility fixes to our package.json files.

Building on Apple Silicon and Newer Architectures

Some dependencies may not be prebuilt for Apple Silicon and other newer architectures, so it may be necessary to build these dependencies from source and install that package’s native dependencies with your package manager (e.g. Homebrew).

Currently these dependencies are:

Additionally, if your package manager utilizes a non-standard include path, you may also need to export the following environment variable before running the commands in the Build section.

export CPLUS_INCLUDE_PATH=/opt/homebrew/include

Finally, you will need the native build toolchain installed to build these items from source.

xcode-select --install

Build

From a new checkout of this repository, run:

yarn install
yarn build

When the yarn install is done, the top-level node_modules/ will contain all the shared dependencies, and each subproject's node_modules/ should contain only the dependencies that are unique to that subproject (e.g. when the version installed at the top level does not meet the subproject's constraints). Our goal is to remove all the unique-to-a-subproject deps.

When one subproject depends upon another, node_modules/ will contain a symlink to the subproject (e.g. ERTP depends upon marshal, so node_modules/@endo/marshal is a symlink to packages/marshal).

Run yarn workspaces info to get a report on which subprojects (aka "workspaces") depend upon which others. The mismatchedWorkspaceDependencies section tells us when symlinks could not be used (generally because e.g. ERTP asks for one version of marshal while packages/marshal/package.json says it's actually 0.2.0). We want to get rid of all mismatched dependencies.

The yarn build step generates kernel bundles.

Test

To run all unit tests (in all packages):

  • yarn test (from the top-level)

To run the unit tests of just a single package (e.g. eventual-send):

  • cd packages/eventual-send
  • yarn test

Run the larger demo

Visit https://docs.agoric.com for getting started instructions.

TL;DR:

  • yarn link-cli ~/bin/agoric
  • cd ~
  • agoric init foo
  • cd foo
  • agoric install
  • agoric start

Then browse to http://localhost:8000

Edit Loop

  • modify something in e.g. zoe/
  • run yarn build (at the top level or in zoe/)
  • re-run tests or agoric start --reset
  • repeat

Doing a yarn build in zoe creates the "contract facet bundle", a single file that rolls up all the Zoe contract vat sources. This bundle file is needed by all zoe contracts before they can invoke zoe~.install(...). If you don't run yarn build, then changes to the Zoe contract facet will be ignored.
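
For orientation, here is a hedged sketch of how such a bundle gets used; the import paths, the single-argument install() call, and the in-scope zoe presence are assumptions for illustration, not the exact API.

import bundleSource from '@agoric/bundle-source';
import { E } from '@agoric/eventual-send';

// `yarn build` pre-generates the contract facet bundle for Zoe itself; a
// contract author does something similar for their own contract source.
const bundle = await bundleSource('./src/myContract.js');
// `zoe` is assumed to be a Zoe presence already obtained elsewhere; this is
// the zoe~.install(...) step mentioned above, written with E().
const installation = await E(zoe).install(bundle);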

Development Standards

  • All work should happen on branches. Single-commit branches can land on trunk without a separate merge, but multi-commit branches should have a separate merge commit. The merge commit subject should mention which packages were modified (e.g. (SwingSet,cosmic-swingset) merge 123-fix-persistence)
  • Keep the history tidy. Avoid overlapping branches. Rebase when necessary.
  • All work should have an Issue. All branch names should include the issue number as a prefix (e.g. 123-description). Use "Labels" on the Issues to mark which packages are affected.
  • Add user-visible changes to a new file in the changelogs/ directory, named after the Issue number. See the README in those directories for instructions.
  • Unless the issue spans multiple packages, each branch should only modify a single package.
  • Releases should be made according to MAINTAINERS.md.


agoric-sdk's Issues

add `resolved-promise` qclass, drop kernel promises after fulfillToData/reject

In today's meeting we worked out a performance improvement that Dean said was pretty important on Midori. Basically we want to forget the short-lived promises that are created for the result of a method send that yields plain data (or rejects to an error), instead of resolving to a callable reference.

When the liveSlots layer serializes an outgoing promise (i.e. in the argument of a method send, but the more interesting case is during the receipt of a dispatch.deliver) for the first time, it calls .then on it to subscribe to hear about its resolution. It also calls syscall.createPromise() to generate a kernel-side promise/resolver pair, and sends the generated promiseID in the marshalled slots. This will let the receiving side know what kernel promise to reference if it wants to send messages to the promise as a target. The .then is used to invoke syscall.fulfillToXYZ when the local promise resolves, and closes over the resolverID value for use in that call.

Now, if the promise is resolved to something (data or a rejection), the kernel learns that the promise state has changed (the kernel promise table has its state value changed from unresolved to data or slot or something), but the local liveSlots layer doesn't particularly care. It issued the syscall.fulfill call, so its job is done.

But, to improve things, we'd like for both the kernel and the subsequent comms vat (if the message is being sent over the wire) to be able to forget about the promise. In a lot of cases, the sending vat will forget about the promise quickly, so we'd like everyone else to forget about it too. We don't know when the vat forgets about it, though, since we don't have weakrefs. If it turns out the vat didn't forget about it (i.e. it sends it a second time), we need to make sure the receiving side still gets the right thing (a promise which, when asked with .then, yields the same resolution as before) (and BTW it doesn't have to be the exact same Promise instance: we declare Presences to have identity, but not Promises).

So, the liveSlots layer should maintain a weakmap from local promises to their resolved value. If liveSlots is asked to serialize something that turns out to be in this map, it should generate data that represents a "resolved promise": something like {"@qclass": "resolved-promise", "data": {...}}, and something similar for rejection (either using rejection: instead of data:, or using a different @qclass value). The kernel is unaware of this slot type, and does no special mapping on it. This object travels all the way through to the far side, and is demarshalled using Promise.resolve(data).

LiveSlots keeps a map from Promises that have been serialized to the kernel-allocated promiseID that it uses to serialize them. When liveSlots sees the promise get resolved, it should delete this mapping.
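
Roughly, the bookkeeping could look like this (a sketch with invented names; the real liveSlots internals and the rejection encoding are still to be decided):

const resolvedPromises = new WeakMap(); // promise -> { kind: 'data' | 'rejection', value }
const promiseToID = new Map();          // promise -> kernel-allocated promiseID

function subscribeToResolution(p, promiseID, syscall) {
  promiseToID.set(p, promiseID);
  p.then(
    value => {
      resolvedPromises.set(p, { kind: 'data', value });
      promiseToID.delete(p); // forget the kernel promiseID once the promise settles
      syscall.fulfillToData(promiseID, JSON.stringify(value), []);
    },
    err => {
      resolvedPromises.set(p, { kind: 'rejection', value: `${err}` });
      promiseToID.delete(p);
      syscall.reject(promiseID, JSON.stringify(`${err}`), []);
    },
  );
}

function serializePromise(p) {
  if (resolvedPromises.has(p)) {
    const { kind, value } = resolvedPromises.get(p);
    return kind === 'data'
      ? { '@qclass': 'resolved-promise', data: value }
      : { '@qclass': 'rejected-promise', rejection: value };
  }
  // unresolved (or never-seen) promises still go through the normal
  // createPromise()/slot path; the index mapping is elided here
  return { '@qclass': 'slot', index: promiseToID.get(p) };
}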

When the kernel sees the syscall.fulfillToData, it will queue dispatch.notifyFulfillToData calls to all subscribed vats. When that dispatch makes it to the top of the run-queue, it invokes the dispatch() call. Just before delivering that call, it should remove the promise from the import table of the receiving vat, making that promise no longer accessible. The receiving vat should remove the promiseID from its tables. The kernel should also remove the receiving vat from the subscriber list. The kernel/vat contract is that vats cannot expect to reference a resolved(-to-data-or-rejection) promise after being notified of its resolution.

When all subscribed vats have been removed from the list, the kernel should remove the promiseID from the promise table entirely.

fulfillToPresence cannot prune promises this way, only fulfillToData or reject. TODO: understand why, then explain it.

The comms vat, after it delivers the resolution to some remote machine, can delete the promise too.

Dean pointed out that sending a locally-generated promise is not where this improves anything. The real win is for the promises that are generated in response to an inbound remote message (which represented more than half of all Promises, if I remember him correctly). These promises are frequently resolved quickly and then discarded, never to be shared with anyone but the caller.

rename `makeXYZ.js` to just `XYZ.js`

As a coding style thing, let's name the file that manages an X just X.js, instead of naming it after the makeX() function which it exports. So e.g. src/kernel/commsSlots/state/makeCLists.js could be clists.js.

comms-vat: change four tables to two tables

Instead of having questions, answers, ingresses, and egresses as tables for inter-vat communication, we are switching to only having two tables: ingresses and egresses. The ingresses table should only have unresolved remote promises, resolved remote promises, and presences.

I will expand this once I've reviewed my notes more.

add "device" options to addVat()

Pretty soon we're going to need to add specialized Vats with extra-vat authorities (access to objects that can do more than just send messages to objects in other Vats):

  • the Comms Vat will need to send/receive messages on TCP connections
  • the Ledger Vat will need synchronous access to the "ledger", a table of balances available to each Meter, used by the kernel scheduler as it decides which message to deliver next
  • another Vat (maybe some flavor of the Comms Vat) will receive messages delivered as cosmos/tendermint "transactions"

I'm thinking we should enhance the Controller.addVat(vatID, sourceIndex) API to accept a third "options" argument, with some indicator of which special powers this Vat ought to receive. These powers are identified by name ("tcp", "ledger", etc), and other code inside Controller is responsible for constructing the right sort of endowments (included with the s.evaluate(source, { require: r}) call). It isn't as general as allowing addVat to accept the endowments directly, but this way the shell (the code that calls Controller.addVat) can't mess it up by revealing wrong-Realm objects to the confined kernel.

There are basically three ways to pass power into a Vat:

  • global/ambient endowments that are just magically in scope for all of the Vat's code
  • available on-demand to the global/ambient require() function
  • passed as an argument into the setup() function

The usual pros and cons apply: passing it globally or through require denies the Vat code the ability to self-partition itself and deny some portions access to the authority object. Using a global endowment is slightly untidy (to convince editors/etc that the apparently-undefined value is available, you must add a magic comment to the top of the file).

Passing an endowment is easiest when we're using SES, and doesn't require any changes to the kernel. I want to preserve our --no-ses option, though (at least until we get debugging within SES up to par), and the non-SES path uses plain require(), so it isn't easy to add in a global.

Adding something to require() would mean defining a new e.g. @agoric/ledger in a subdirectory (like we do with @agoric/evaluate in agoric-evaluate/), setting our package.json to "install" that from the local subdir, and changing addVat to build a second makeRequire which includes a properly-confined copy (perhaps with some new utility that can chain makeRequire helpers together). The Vat code would then do const ledger = require('@agoric/ledger'). This is still undeniable to self-imposed partitions of the Vat code, because the same require is available everywhere.

In the SES world, where we provide a different require to each Vat, the authority would be isolated properly. In the non-SES world, everything uses the native require, so all Vats would have access to all defined device modules (just like they have access to all of Node's builtins, they can require() anything they want). This is probably OK, as non-SES mode is only for debugging problems. We could prevent access to unapproved code by running bundleSource() in the non-SES path, and enhancing bundleSource() to accept a list of exits (i.e. require() statements that get to survive the bundling process, instead of being replaced with the source they point at), and adding only the specified device modules to the exit list. But this would rewrite all the sources, breaking the debuggability which --no-ses was made to provide. OT3H this particular breakage might be repairable with sourcemaps, as it's closer to the rollup use case. Maybe.

Passing the power as an argument to the setup() function avoids fussing about with require, and would behave the same way in SES/non-SES modes, but would require some changes to the kernel path. At present, the kernel's addVat method (as with most of its methods) tries to protect itself against the controller doing the wrong thing by stringifying the vatID and asserting that the setup function object it receives is of the same Realm as the kernel itself. We could add an endowments argument to this API, but the kernel would need to check the contents of it just like it does with the setup function itself. The endowments object is going to be outer-Realm, so it needs to be dropped and its contents added to a new in-Realm object.

Ok, after that analysis, I think passing the powers through kernel.addVat() and setup() is going to be the best path.
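
Something like the following is what that conclusion implies; the option names, the extra setup() parameter, and the helper call are all hypothetical placeholders, not the current kernel API.

// shell / controller side: request named powers for the new vat
controller.addVat('comms', commsSourceIndex, { devices: ['tcp'] });

// kernel side: addVat builds in-Realm endowments for each approved name and
// hands them to the vat's setup() function as an extra argument
function setup(syscall, state, helpers, devices) {
  const { tcp } = devices; // only present because the 'tcp' power was granted
  // ... construct and return the vat's dispatch object as usual
  return helpers.makeDispatch(syscall, state, tcp);
}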

remove resolverID, consolidate kernel promise table into one identifier

The syscall.createPromise() call currently returns two separate identifiers, a promiseID and a resolverID. Both are mapped through per-vat tables when the kernel delivers them into a vat, which we use to keep track of which vats are allowed access to which promises and resolvers. The kernel-side identifier is the same for each.

We think we can simplify things by only tracking a single value. This will be named promiseID on the kernel side, and importPromiseID on the vat side (since all promises are kernel promises, so any promise-related slots that the vat knows about will be imports from the kernel).

The kernel promise table tracks a "decider vatid" for each promise. We'll use this to authorize syscall.fulfillToXYZ, which will now accept an importPromiseID instead of a resolverID.

This will parallel a change in #162, where promises can be imported/exported between machines just like presences, and are not kept in separate tables.
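
In terms of the syscall surface, the change looks roughly like this (sketch only; the data/slots arguments are schematic):

// before: two identifiers, with fulfillment authorized by the resolverID
const { promiseID, resolverID } = syscall.createPromise();
syscall.fulfillToData(resolverID, data, slots);

// after: a single importPromiseID, with fulfillment authorized by checking the
// kernel promise table's "decider vatid" against the calling vat
const importPromiseID = syscall.createPromise();
syscall.fulfillToData(importPromiseID, data, slots);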

commsController: change return values

Several of the methods in src/kernel/commsSlots/commsController.js currently do syscall.fulfillToData with a value of "undefined" (a 9-letter string) instead of undefined (the primitive javascript value). I think we should change those to use the primitive javascript value.
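
Concretely, using the marshal encoding that appears elsewhere in these notes (the resolverID name here is just a placeholder), the change is:

// current: a 9-letter string travels as the fulfillment value
syscall.fulfillToData(resolverID, JSON.stringify('undefined'), []);

// proposed: the primitive value, encoded the way marshal encodes undefined
syscall.fulfillToData(resolverID, JSON.stringify({ '@qclass': 'undefined' }), []);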

ibid mechanism broken

I came across an issue when I was trying to return an array that contains the same presence twice. It errors with the message, 'ibid out of range'.

When notifyFulfillToData (line 409 of liveSlots.js on my branch) is called, it has the following as arguments:

  • promiseID: 20
  • data: "[{"@\qclass":"slot","index":0},{"@\qclass":"ibid","index":1}]" (slashes added to not tag qclass)
  • slots: [{type: "import", id: 11}]

MarkM says: "The ibid mechanism is broken because the replacer during serialization assigns ibid indices in preorder whereas the JSON reviver only sees objects in post order, and therefore has different ibid indices."

apply "quick fix" for ibid

#161 describes the problems with our current ibid code. This ticket is about just applying the "quick fix" which disables the creation of ibid markers, so that any shared data structures will simply be duplicated in the encoded data. The biggest flaw of this fix is when the marshaller is asked to serialize an object graph with cycles, which will cause infinite recursion and termination of the host. But that's better than the current situation, in which complex data structures will be arbitrarily corrupted as they traverse the serialization process.
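
A small illustration of what the quick fix trades away, using plain JSON.stringify as a stand-in for the marshaller:

// shared references are simply serialized twice; the data is duplicated on the
// wire but no longer corrupted by mismatched ibid indices
const shared = { hello: 'world' };
JSON.stringify([shared, shared]);
// => '[{"hello":"world"},{"hello":"world"}]'

// the known flaw: a cyclic graph has no finite expansion
const cyclic = {};
cyclic.self = cyclic;
// JSON.stringify(cyclic) throws; a hand-rolled recursive serializer without
// ibid markers would instead recurse until the host gives out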

add description to pass-by-presence objects?

For debugging (as well as general learn-by-experimentation), it would be nice to have a way to attach descriptions to the various pass-by-presence objects we fling around. The current situation is that these arrive as a Presence on the remote side, which is an empty harden()ed object with nothing to suggest what it's good for, where it came from, or what methods it might respond to.

The simplest improvement would be to just make sure that .toString() can be (remotely) invoked, and establish a convention of including a toString() when you construct the object. This requires a roundtrip to do the lookup, and doesn't reveal any information to the kernel, so it's only usable from some other vat.

Another approach that we kicked around today would be to define a special symbol (a "registered Symbol", in JS parlance) that we use to attach a descriptive string. The construction-time syntax would look like:

const DESC = Symbol.for("description");
const p = harden({
  foo(args) { doStuff(); },
  [DESC]: 'I am a Foo that can do foo(args)',
});

The description string is the magic one: it would match something defined in liveSlots.js so the marshalling code can recognize the property. We could also pass the symbol into userspace the same way we pass E.

To implement this, we'd need to change:

  • marshal.js: the presence of this Symbol shouldn't disqualify the object from pass-by-presence serialization
  • liveSlots.js: define the symbol, serialize pass-by-presence with an additional property (instead of just { type: 'export', id }, it needs to be { type: 'export', id, description }); see the sketch after this list
  • the syscall.send and dispatch.deliver protocol must change to include the description on the first delivery of the presence record
  • vatManager.js needs to track exported presence records (which it doesn't currently track), to make sure we capture the description only on the first time the presence is exported from a vat. We need these to be immutable, and if a non-liveslots userspace decides to change the description each time it exports the same object, the kernel and other vats shouldn't see the changes.
  • kernel.js and the dump() function would benefit from including these descriptions in the printable representations of the kernel c-list tables
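
A sketch of the liveSlots.js change from that list (property and helper names are illustrative only):

const DESC = Symbol.for('description');

function serializeExport(obj, id) {
  const description = typeof obj[DESC] === 'string' ? obj[DESC] : undefined;
  // previously just { type: 'export', id }; the description rides along only on
  // the first export of this object, per the vatManager tracking above
  return description === undefined
    ? { type: 'export', id }
    : { type: 'export', id, description };
}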

build `ag-chain-cosmos` tool

Like ag-solo, we want an ag-chain-cosmos tool (maybe abbreviated agcc) to create, configure, and start a chain node. I'd like it to create a new base directory and copy the relevant files into it, including the various starting vats and bootstrap code for the demo. agcc is roughly a frontend for the cosmos-sdk's "baseapp" program, but putting the state in a specific base directory (rather than some $HOME/.dotsomething), and copying the swingset-specific files in addition to creating the normal tendermint ones.

The subcommands should probably be:

  • agcc init BASEDIR: mkdir, copy files in, do the equivalent of the cosmos-sdk baseapp's init function (create a genesis.json, a validator keypair, etc)
  • agcc add-genesis-account
  • agcc tendermint
  • agcc start

This tool might be pretty short: init needs to do extra work after delegating to the baseapp's version, then the other commands need to basically assert that CWD is a real basedir and then add --home . to whatever the command was. It also needs to do all the golang+node.js magic from the current lib/ag-chain-cosmos.

comms: unify dump() behavior

For the kernel, dump() returns a JSON-serializable object graph (so each Map is iterated and converted into an object, and some of the redundant tables are merged). The idea is that the Controller can make this safe for inspection by doing a JSON.parse(JSON.stringify(kernel.dump())), sort of "laundering" it through a string, so that we wind up with something that 1: consists only of objects from the outer Realm, and 2: can't be used to mutate the kernel tables.

I see the comms layer has some dump() methods, but it looks like sometimes they return strings (like in makeAllocateID.js which returns a stringified integer), and sometimes they return Maps (like in makeCLists.js).

We should probably unify these, so we can rely upon the behavior. For debugging and for tests, I've found it pretty handy to produce a JSON-able object graph. We may want that for vat/kernel save/restore too, since we need something that can be hashed and fed into the consensus mechanism. So maybe we should arrange for the top-level dump() to return something that can be fed into JSON.stringify(), and then all the internal dump() methods can do whatever they like as long as it meets that goal (e.g. makeAllocateID could return the plain integer, makeCLists would have to return an object or an array, etc).
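
A compact sketch of both halves: the controller-side laundering described above, and hypothetical JSON-able internal dump()s.

// controller side: round-trip through a string so the result consists only of
// outer-Realm objects and cannot be used to mutate kernel tables
const snapshot = JSON.parse(JSON.stringify(kernel.dump()));

// comms side: each internal dump() returns something JSON.stringify accepts
function dumpAllocateID(nextID) {
  return nextID; // a plain integer, not a stringified one
}
function dumpCLists(clistMap) {
  return [...clistMap.entries()].map(([key, value]) => ({ key, value }));
}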

signing isn't working yet

Here's the (convoluted) sequence to get the chain and the solo node set up:

(it assumes ag-chain-cosmos is on $PATH)

# ag-chain-cosmos init --chain-id agoric
# bin/ag-solo init t1
# ag-chain-cosmos add-genesis-account `cat t1/ag-cosmos-helper-address` 1000agtoken
# ag-chain-cosmos start
# make set-local-gci-ingress
# (cd t1 && ../bin/ag-solo start)

Then, once the ag-solo node gets started (wait until it says "deliverInbound" once or twice), launch a browser at localhost:8000, and type E(home.chain).getBalance() into the text box, and hit the Eval button.

That will cause the solo node to try to send a message to the chain (home.chain is a Presence pointing at an egress of the chain). The code in lib/ag-solo/chain-cosmos-sdk.js will fetch the account/sequence number for the ag-solo's keypair (address is in t1/ag-cosmos-helper-address), then POST the transaction body to the REST server's /swingset/mailbox endpoint.

I'm not sure the body is getting signed, but I'm not sure how to tell.

build demo client setup tool

The demo-client setup tool will accept the magic-wormhole code emitted by the provisioning server (#10) and run ag-solo init, send the generated pubkey to the server, accept the blob it returns, configure the solo node with that data, then launch ag-solo start. It will need to be written in python to use the wormhole code.

Enable the 'npm audit fix' CircleCI job

In order to run the 'npm audit fix' CircleCI job, we need to:

  • Add the npm-audit-fix.sh script
  • Git add package-lock.json so that there is a package-lock.json to audit
  • Add the AgoricBot GITHUB_TOKEN env var to the CircleCI settings

add sourcemaps to bundled code

We might be able to improve the debugging experience by having our bundleSource() utility arrange for a sourcemap to be attached to the bundled code. There are two places I'm hoping to improve.

The first is the stack trace displayed when vat code throws an uncaught exception. These currently show a trace to the eval() that SES uses, then a line/column number within the anonymous string bundle:

Error: here
    at m2 (eval at run (.../SwingSet/.misc/smap.js:70:21), <anonymous>:9:9)
    at build (eval at run (.../SwingSet/.misc/smap.js:70:21), <anonymous>:13:14)
    at run (.../SwingSet/.misc/smap.js:73:5)

It would be nice if that <anonymous>:9:9 bit could instead point to a real source file.

The second is within a debugger (e.g. browser dev-tools) when stopped within the evaluated code.

I think I've figured out how to attach sourcemaps with rollup, but so far they aren't improving either situation. The devtools in Chrome, when I insert a debugger call into the evaluated code, shows me a source file that appears to be the bundled output of rollup. Since we aren't compressing/minifying/uglifying anything, this isn't too hard to read, but it's still not the original file name or line number. So I expect IDEs like VSCode may not be happy, especially if you want to set a breakpoint on code that has not yet been evaluated.
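
For reference, a hedged sketch of attaching an inline sourcemap during bundling; the rollup options below follow the public rollup API, but how (or whether) this plugs into bundleSource() is exactly the open question here.

import { rollup } from 'rollup';

async function bundleWithInlineMap(input) {
  const bundle = await rollup({ input });
  const { output } = await bundle.generate({ format: 'cjs', sourcemap: true });
  const { code, map } = output[0];
  const b64 = Buffer.from(JSON.stringify(map)).toString('base64');
  // the inline sourceMappingURL comment is what devtools would need to map the
  // eval()ed string back to the original files, if the engine honors it
  return `${code}\n//# sourceMappingURL=data:application/json;charset=utf-8;base64,${b64}\n`;
}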

The stacktrace printed when just running node smap.js in a shell didn't seem to get any better with sourcemaps.

I'll keep poking at this one.

Multiple Realms being created?

I'm in the process of reorganizing the kernel state, and moved the kernel state out of kernel.js and into another file under another sub-directory. This included state related functions, like dump and getState and loadState. To create the kernel obj in kernel.js, I just changed the properties of the kernel obj, such as dump: kernelState.dump, where kernelState.dump is an import from the other file.

I was getting an error: (node:82523) UnhandledPromiseRejectionWarning: TypeError: prototype function () { [native code] } of unknown.dump is not already in the fringeSet

I asked Mark about it and he suggested trying

dump() {
  return kernelState.dump();
},

And that worked. Mark says this is a symptom of having multiple realms. Also, the prototype of dump was not Function.prototype and had an arguments property that Function.prototype didn't have.

build cloud-hosted multiple-validator setup tool

@michaelfig and I sketched out a plan for setting up multiple validators and configuring them to talk to each other. The idea is to launch a docker image on each of several providers, and each image will run ag-chain-cosmos init, then listen on a little webserver on a predefined port for a coordinator process to ping it. It will have two endpoints. The first will return the new agcc public keys. The coordinator will fetch these from all nodes and use them to build the combined genesis.json. It will also collect the hostname/ipaddr/port of all nodes and build a shared addressbook so they can all find each other. It will also collect the initial vat code (and bootstrap.js).

The coordinator will then send the whole bundle to the second endpoint, which will modify the agcc setup files with the bundle's contents. Then the webserver will shut down, and it will switch the image into runtime mode, where it runs agcc start. If we bounce the image after setup, it should go directly into agcc start and skip the setup phase.

There's a race here, of course, which we'll fix eventually by baking some keypair or something into the image and signing the setup message.

We need to build the coordinator tool as well: it should take a list of ipaddr/port (of the newly-launched agcc images waiting to be configured) and a local directory full of vat code (part of which is a bootstrap file that will contain the pubkey of the controller solo node), and it should contact all the chain nodes and configure them and set them running.

Maybe this coordinator tool should also take instructions on how to contact the PaaS APIs and launch the hosts (giving it API keys and whatever). But maybe not.. decouple the creation of the chain hosts from their configuration.

build basic transcript-based persistence mechanism

I've started with some docs in docs/persistence.md . We don't have engine-level support for checkpointing, but we can plan for it, and in the meantime we "persist" the Vat state by recording a transcript of everything we ask the Vat to do. Since Vats are supposed to be deterministic, replaying the transcript ought to construct an identical state.
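
A minimal sketch of the idea, assuming deliveries are the only inputs a vat sees and that dispatch is deterministic (names invented for illustration):

function makeTranscriptKeeper(writeEntry /* persists one transcript entry */) {
  return {
    deliver(dispatch, delivery) {
      writeEntry(delivery);      // append-only log, saved along with the turn
      return dispatch(...delivery);
    },
    replay(dispatch, transcript) {
      for (const delivery of transcript) {
        dispatch(...delivery);   // same inputs, same order => identical state
      }
    },
  };
}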

simplify/speed-up kvstore API

Kernel state is managed in a key-value store that is limited to string keys and string values. When running in a solo machine, this store is turned into JSON and written to disk after every turn, and the Map itself lives in the outer/primal/controller Realm. There is a kernel-realm proxy that forwards the set/get/has requests up to the outer realm, being careful to map objects correctly between the two realms. When running in a chain, the set/get/has requests are first forwarded up to the primal realm, then they're encoded into messages that can cross the Node.js/Go-lang boundary. To maintain a common API, the solo-machine case does this same kind of encoding (so there is a JSON blob with a method property with values like get and set).

This process is probably slower than necessary. We'd like to minimize the number of times any given value gets serialized and deserialized. I think for the solo-machine case, we could expose a collection of methods to the kernel realm, rather than using strings to merge all those cases into a single string type. Ideally, each key and value is created as a primitive string at the point of origin (e.g. src/kernel/state/kernelKeeper.js) and remains that way until it reaches the outer-realm Map object.

For calling out to the chain-based kvstore, we could optimize things by adding a number of argument slots to the bridge: instead of passing just a single string, it could pass a list of argument strings, so sendToNode('kvstore', 'set', 'key', 'value') wouldn't require any additional marshalling. Our cross-language needs are modest and predefined, so I think we don't need a completely general mechanism, and I expect the speedup of removing that JSON parsing will be significant (since we do sets and gets so frequently).
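
In terms of the bridge call, the proposed change is roughly this (sendToNode is a stand-in for the real Node.js/Go bridge function):

// current: one JSON string crosses the boundary and must be parsed on each call
sendToNode(JSON.stringify({ method: 'set', key: 'kernel', value: bigStateBlob }));

// proposed: a fixed list of argument strings, no extra JSON layer to parse
sendToNode('kvstore', 'set', 'kernel', bigStateBlob);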

rename `syscall.fulfillToTarget` to `fulfillToPresence`

Presence is the object that a receiving vat creates to represent its access to some remote pass-by-presence object. I didn't have a good name for the other end of this (PassByPresenceObject?), so I just called it a "target". Unfortunately this collides with the name of the first argument of syscall.send(), which could indicate either an imported pass-by-presence object (and which would be invoked by E(presence).methodname(args)), or an imported promise.

In today's meeting, we didn't come up with a better name for this thing, but we did decide that the kernel APIs could be renamed. We're planning to replace syscall.fulfillToTarget with syscall.fulfillToPresence, and dispatch.notifyFulfillToTarget with dispatch.notifyFulfillToPresence. The latter makes a lot of sense, as we're asking the receiving vat to create a new Presence object, and then fulfill the named promise to it. The former is a bit more of a stretch, but it still kinda works: the sending vat is asking the kernel to fulfill this promise to a Presence, which sort of means "please tell all the subscribing vats to fulfill this to a Presence", even if the value being passed into the syscall is not actually a Presence (it's an export).

(hmm, maybe fulfillToExport would make more sense, and then notifyFulfillToImport, but it's nice to have the two sides of the method use the same suffix)

chain node crashed as txn was delivered: OutOfGas?

The steps to reproduce are the same as in #270 . With 46fb5e8, the txn is signed properly and makes it far enough into the chain to cause the SwingSet instance to be launched (with our new lazy-launch scheme). But something goes wrong and Go panics.

The log says panic: (types.ErrorOutOfGas). @michaelfig have you seen this error yet? I thought the defaults were to turn off all gas checking, and we didn't see this error when things were working last friday. We're generating the keypair differently (on friday we were following the tutorial and used add-genesis-account PUBKEY 1000agtoken,1000jackcoin, and in the STR from #270 we're just doing 1000agtoken). Also I'm not enabling --trust-node in the helper (CLI tool), but I don't think that should affect the chain node's behavior.

The chain node's log is:

$ ag-chain-cosmos start
Starting Node
Have AG_COSMOS { NodeReplier: [Function: NodeReplier],
  runAG_COSMOS: [Function: runAG_COSMOS],
  send: [Function: send],
  path:
   '/home/warner/stuff/agoric/cosmic-swingset/build/Release/agcosmosdaemon.node' }
Starting Go AG_COSMOS from Node AG_COSMOS
Starting Cosmos [/home/warner/stuff/agoric/cosmic-swingset/lib/ag-chain-cosmos start]
Done starting Cosmos
End of starting AG_COSMOS from Node AG_COSMOS
Constructing app!
I[2019-05-30|23:51:13.737] Starting ABCI with Tendermint                module=main
Starting daemon!
Sending to Node {"type":"AG_COSMOS_INIT"}
Send to node port 1 {"type":"AG_COSMOS_INIT"}
Ending Send to Node {"type":"AG_COSMOS_INIT"}
Waiting for 1
Replying to Go with true 0
Reply to Go true
Woken, got {true <nil>}
Received AG_COSMOS_INIT response true <nil>
E[2019-05-30|23:51:13.867] Couldn't connect to any seeds                module=p2p
I[2019-05-30|23:51:18.952] Executed block                               module=state height=1 validTxs=0 invalidTxs=0
I[2019-05-30|23:51:18.956] Committed state                              module=state height=1 txs=0 appHash=4BA4C99E30283C64AFB8F9D7E861518DFCA59D3A48935EBF1F6F6EB1CAD18189
E[2019-05-30|23:51:21.353] Failed to read request                       module=rpc-server protocol=websocket remote=127.0.0.1:43678 err="websocket: close 1005 (no status)"
E[2019-05-30|23:51:21.353] Error closing connection                     module=rpc-server protocol=websocket remote=127.0.0.1:43678 err="close tcp 127.0.0.1:26657->127.0.0.1:43678: use of closed network connection"
I[2019-05-30|23:51:24.010] Executed block                               module=state height=2 validTxs=0 invalidTxs=0
I[2019-05-30|23:51:24.018] Committed state                              module=state height=2 txs=0 appHash=D6517F5884299B7DFD0DDAE581E32D75FFDD174D485B5A136A19E6BCDC910558
I[2019-05-30|23:51:29.074] Executed block                               module=state height=3 validTxs=0 invalidTxs=0
I[2019-05-30|23:51:29.082] Committed state                              module=state height=3 txs=0 appHash=3F8BBA966746ADBCE15EA7AF71A0F1A9A9104E7CE7005111027F0AC319C788CF
I[2019-05-30|23:51:34.134] Executed block                               module=state height=4 validTxs=0 invalidTxs=0
I[2019-05-30|23:51:34.142] Committed state                              module=state height=4 txs=0 appHash=5E9F16945771A950FDA7FDD5FCCA8AC3F06BE8C6005A69FBF1EDD4EC10571043


About to call SwingSet
Send to node port 1 {"type":"DELIVER_INBOUND","peer":"cosmos1gu9yhnwrgjef4ujzw9dr4tp6junh6gv8fc7zht","messages":[[1,"{\"target\":{\"type\":\"your-egress\",\"id\":1},\"methodName\":\"getBalance\",\"args\":[],\"slots\":[],\"resultSlot\":{\"type\":\"your-resolver\",\"id\":6}}"]],"ack":0,"storagePort":1,"blockHeight":5}
Ending Send to Node {"type":"DELIVER_INBOUND","peer":"cosmos1gu9yhnwrgjef4ujzw9dr4tp6junh6gv8fc7zht","messages":[[1,"{\"target\":{\"type\":\"your-egress\",\"id\":1},\"methodName\":\"getBalance\",\"args\":[],\"slots\":[],\"resultSlot\":{\"type\":\"your-resolver\",\"id\":6}}"]],"ack":0,"storagePort":1,"blockHeight":5}
Waiting for 2
Sending SDK_READY
handler got { type: 'SDK_READY' }
sdkReady: checking for saved kernel state
Send to Go
Send to Go {"method":"has","key":"kernel"}
buildSwingset
kernel.addDevice(mailbox)
= adding vat 'mint' from /home/warner/stuff/agoric/cosmic-swingset/demo1/vat-mint.js
= adding vat 'comms' from /home/warner/stuff/agoric/cosmic-swingset/demo1/vat-comms.js
= adding vat 'vattp' from /home/warner/stuff/agoric/cosmic-swingset/node_modules/@agoric/swingset-vat/src/vat-tp/vattp.js
loading bootstrap.js
=> queueing bootstrap()
adding vref _bootstrap
adding vref comms
adding vref mint
adding vref vattp
adding dref mailbox
bootstrap() called
about to return {"@qclass":"undefined"} []
cs[comms].dispatch.deliver 0.init -> 30
makeMint
cs[comms].dispatch.deliver 0.addEgress -> 31
all vats initialized
Send to Go
Send to Go {"method":"set","key":"kernel","value":"{\"devices\":{\"mailbox\":{\"deviceState\":null,\"managerState\":{\"imports\":{\"inbound\":[],\"outbound\":[{\"key\":\"10\",\"value\":{\"id\":0,\"type\":\"export\",\"vatID\":\"vattp\"}}]},\"nextImportID\":11}}},\"nextPromiseIndex\":46,\"promises\":[{\"fulfillData\":\"{\\\"@qclass\\\":\\\"undefined\\\"}\",\"fulfillSlots\":[],\"id\":\"40\",\"state\":\"fulfilledToData\",\"subscribers\":[]},{\"fulfillData\":\"{\\\"@qclass\\\":\\\"undefined\\\"}\",\"fulfillSlots\":[],\"id\":\"41\",\"state\":\"fulfilledToData\",\"subscribers\":[]},{\"fulfillData\":\"{\\\"@qclass\\\":\\\"undefined\\\"}\",\"fulfillSlots\":[],\"id\":\"42\",\"state\":\"fulfilledToData\",\"subscribers\":[]},{\"fulfillSlot\":{\"id\":1,\"type\":\"export\",\"vatID\":\"mint\"},\"id\":\"43\",\"state\":\"fulfilledToPresence\",\"subscribers\":[]},{\"fulfillSlot\":{\"id\":2,\"type\":\"export\",\"vatID\":\"mint\"},\"id\":\"44\",\"state\":\"fulfilledToPresence\",\"subscribers\":[]},{\"fulfillData\":\"{\\\"@qclass\\\":\\\"undefined\\\"}\",\"fulfillSlots\":[],\"id\":\"45\",\"state\":\"fulfilledToData\",\"subscribers\":[]}],\"runQueue\":[],\"vats\":{\"_bootstrap\":{\"kernelSlotToVatSlot\":{\"devices\":[{\"key\":\"mailbox-0\",\"value\":{\"id\":40,\"type\":\"deviceImport\"}}],\"exports\":[{\"key\":\"comms-0\",\"value\":{\"id\":10,\"type\":\"import\"}},{\"key\":\"mint-0\",\"value\":{\"id\":11,\"type\":\"import\"}},{\"key\":\"mint-1\",\"value\":{\"id\":13,\"type\":\"import\"}},{\"key\":\"mint-2\",\"value\":{\"id\":14,\"type\":\"import\"}},{\"key\":\"vattp-0\",\"value\":{\"id\":12,\"type\":\"import\"}}],\"promises\":[{\"key\":\"40\",\"value\":{\"id\":20,\"type\":\"promise\"}},{\"key\":\"41\",\"value\":{\"id\":21,\"type\":\"promise\"}},{\"key\":\"43\",\"value\":{\"id\":22,\"type\":\"promise\"}},{\"key\":\"44\",\"value\":{\"id\":23,\"type\":\"promise\"}},{\"key\":\"45\",\"value\":{\"id\":24,\"type\":\"promise\"}}],\"resolvers\":[]},\"nextDeviceImportID\":41,\"nextImportID\":15,\"nextPromiseID\":25,\"nextResolverID\":30,\"state\":{\"transcript\":[{\"d\":[\"deliver\",0,\"bootstrap\",\"{\\\"args\\\":[[],{\\\"_bootstrap\\\":{\\\"@qclass\\\":\\\"slot\\\",\\\"index\\\":0},\\\"comms\\\":{\\\"@qclass\\\":\\\"slot\\\",\\\"index\\\":1},\\\"mint\\\":{\\\"@qclass\\\":\\\"slot\\\",\\\"index\\\":2},\\\"vattp\\\":{\\\"@qclass\\\":\\\"slot\\\",\\\"index\\\":3}},{\\\"_dummy\\\":\\\"dummy\\\",\\\"mailbox\\\":{\\\"@qclass\\\":\\\"slot\\\",\\\"index\\\":4}}]}\",[{\"id\":0,\"type\":\"export\"},{\"id\":10,\"type\":\"import\"},{\"id\":11,\"type\":\"import\"},{\"id\":12,\"type\":\"import\"},{\"id\":40,\"type\":\"deviceImport\"}],null],\"syscalls\":[{\"d\":[\"callNow\",{\"id\":40,\"type\":\"deviceImport\"},\"registerInboundHandler\",\"{\\\"args\\\":[{\\\"@qclass\\\":\\\"slot\\\",\\\"index\\\":0}]}\",[{\"id\":12,\"type\":\"import\"}]],\"response\":{\"data\":\"{\\\"@qclass\\\":\\\"undefined\\\"}\",\"slots\":[]}},{\"d\":[\"send\",{\"id\":12,\"type\":\"import\"},\"registerMailboxDevice\",\"{\\\"args\\\":[{\\\"@qclass\\\":\\\"slot\\\",\\\"index\\\":0}]}\",[{\"id\":40,\"type\":\"deviceImport\"}]],\"response\":20},{\"d\":[\"subscribe\",20]}]},{\"d\":[\"notifyFulfillToData\",20,\"{\\\"@qclass\\\":\\\"undefined\\\"}\",[]],\"syscalls\":[{\"d\":[\"send\",{\"id\":10,\"type\":\"import\"},\"init\",\"{\\\"args\\\":[{\\\"@qclass\\\":\\\"slot\\\",\\\"index\\\":0}]}\",[{\"id\":12,\"type\":\"import\"}]],\"response\":21},{\"d\":[\"subscribe\",21]}]},{\"d\":[\"notifyFulfillToData\",21,\"{\\\"@qclass\\\":\\\"undefined\\\"}\",[]],\"syscalls\":[{\"d\":
[\"send\",{\"id\":11,\"type\":\"import\"},\"makeMint\",\"{\\\"args\\\":[]}\",[]],\"response\":22},{\"d\":[\"subscribe\",22]}]},{\"d\":[\"notifyFulfillToPresence\",22,{\"id\":13,\"type\":\"import\"}],\"syscalls\":[{\"d\":[\"send\",{\"id\":13,\"type\":\"import\"},\"mint\",\"{\\\"args\\\":[100,\\\"purse1\\\"]}\",[]],\"response\":23},{\"d\":[\"subscribe\",23]}]},{\"d\":[\"notifyFulfillToPresence\",23,{\"id\":14,\"type\":\"import\"}],\"syscalls\":[{\"d\":[\"send\",{\"id\":10,\"type\":\"import\"},\"addEgress\",\"{\\\"args\\\":[\\\"solo\\\",1,{\\\"@qclass\\\":\\\"slot\\\",\\\"index\\\":0}]}\",[{\"id\":14,\"type\":\"import\"}]],\"response\":24},{\"d\":[\"subscribe\",24]}]},{\"d\":[\"notifyFulfillToData\",24,\"{\\\"@qclass\\\":\\\"undefined\\\"}\",[]],\"syscalls\":[]}]},\"vatSlotToKernelSlot\":{\"deviceImports\":[{\"key\":\"deviceImport-40\",\"value\":{\"deviceName\":\"mailbox\",\"id\":0,\"type\":\"device\"}}],\"imports\":[{\"key\":\"import-10\",\"value\":{\"id\":0,\"type\":\"export\",\"vatID\":\"comms\"}},{\"key\":\"import-11\",\"value\":{\"id\":0,\"type\":\"export\",\"vatID\":\"mint\"}},{\"key\":\"import-12\",\"value\":{\"id\":0,\"type\":\"export\",\"vatID\":\"vattp\"}},{\"key\":\"import-13\",\"value\":{\"id\":1,\"type\":\"export\",\"vatID\":\"mint\"}},{\"key\":\"import-14\",\"value\":{\"id\":2,\"type\":\"export\",\"vatID\":\"mint\"}}],\"promises\":[{\"key\":\"promise-20\",\"value\":{\"id\":40,\"type\":\"promise\"}},{\"key\":\"promise-21\",\"value\":{\"id\":41,\"type\":\"promise\"}},{\"key\":\"promise-22\",\"value\":{\"id\":43,\"type\":\"promise\"}},{\"key\":\"promise-23\",\"value\":{\"id\":44,\"type\":\"promise\"}},{\"key\":\"promise-24\",\"value\":{\"id\":45,\"type\":\"promise\"}}],\"resolvers\":[]}},\"comms\":{\"kernelSlotToVatSlot\":{\"devices\":[],\"exports\":[{\"key\":\"mint-2\",\"value\":{\"id\":11,\"type\":\"import\"}},{\"key\":\"vattp-0\",\"value\":{\"id\":10,\"type\":\"import\"}}],\"promises\":[{\"key\":\"42\",\"value\":{\"id\":20,\"type\":\"promise\"}}],\"resolvers\":[{\"key\":\"41\",\"value\":{\"id\":30,\"type\":\"resolver\"}},{\"key\":\"45\",\"value\":{\"id\":31,\"type\":\"resolver\"}}]},\"nextDeviceImportID\":40,\"nextImportID\":12,\"nextPromiseID\":21,\"nextResolverID\":32,\"state\":{\"transcript\":[{\"d\":[\"deliver\",0,\"init\",\"{\\\"args\\\":[{\\\"@qclass\\\":\\\"slot\\\",\\\"index\\\":0}]}\",[{\"id\":10,\"type\":\"import\"}],30],\"syscalls\":[{\"d\":[\"send\",{\"id\":10,\"type\":\"import\"},\"registerCommsHandler\",\"{\\\"args\\\":[{\\\"@qclass\\\":\\\"slot\\\",\\\"index\\\":0}]}\",[{\"id\":1,\"type\":\"export\"}]],\"response\":20},{\"d\":[\"fulfillToData\",30,\"{\\\"@qclass\\\":\\\"undefined\\\"}\",[]]}]},{\"d\":[\"deliver\",0,\"addEgress\",\"{\\\"args\\\":[\\\"solo\\\",1,{\\\"@qclass\\\":\\\"slot\\\",\\\"index\\\":0}]}\",[{\"id\":11,\"type\":\"import\"}],31],\"syscalls\":[{\"d\":[\"fulfillToData\",31,\"{\\\"@qclass\\\":\\\"undefined\\\"}\",[]]}]}]},\"vatSlotToKernelSlot\":{\"deviceImports\":[],\"imports\":[{\"key\":\"import-10\",\"value\":{\"id\":0,\"type\":\"export\",\"vatID\":\"vattp\"}},{\"key\":\"import-11\",\"value\":{\"id\":2,\"type\":\"export\",\"vatID\":\"mint\"}}],\"promises\":[{\"key\":\"promise-20\",\"value\":{\"id\":42,\"type\":\"promise\"}}],\"resolvers\":[{\"key\":\"resolver-30\",\"value\":{\"id\":41,\"type\":\"resolver\"}},{\"key\":\"resolver-31\",\"value\":{\"id\":45,\"type\":\"resolver\"}}]}},\"mint\":{\"kernelSlotToVatSlot\":{\"devices\":[],\"exports\":[],\"promises\":[],\"resolvers\":[{\"key\":\"43\",\"value\":{\"id\":30,\"type\":\"resolver\"}},{\"key\":\"4
4\",\"value\":{\"id\":31,\"type\":\"resolver\"}}]},\"nextDeviceImportID\":40,\"nextImportID\":10,\"nextPromiseID\":20,\"nextResolverID\":32,\"state\":{\"transcript\":[{\"d\":[\"deliver\",0,\"makeMint\",\"{\\\"args\\\":[]}\",[],30],\"syscalls\":[{\"d\":[\"fulfillToPresence\",30,{\"id\":1,\"type\":\"export\"}]}]},{\"d\":[\"deliver\",1,\"mint\",\"{\\\"args\\\":[100,\\\"purse1\\\"]}\",[],31],\"syscalls\":[{\"d\":[\"fulfillToPresence\",31,{\"id\":2,\"type\":\"export\"}]}]}]},\"vatSlotToKernelSlot\":{\"deviceImports\":[],\"imports\":[],\"promises\":[],\"resolvers\":[{\"key\":\"resolver-30\",\"value\":{\"id\":43,\"type\":\"resolver\"}},{\"key\":\"resolver-31\",\"value\":{\"id\":44,\"type\":\"resolver\"}}]}},\"vattp\":{\"kernelSlotToVatSlot\":{\"devices\":[{\"key\":\"mailbox-0\",\"value\":{\"id\":40,\"type\":\"deviceImport\"}}],\"exports\":[{\"key\":\"comms-1\",\"value\":{\"id\":10,\"type\":\"import\"}}],\"promises\":[],\"resolvers\":[{\"key\":\"40\",\"value\":{\"id\":30,\"type\":\"resolver\"}},{\"key\":\"42\",\"value\":{\"id\":31,\"type\":\"resolver\"}}]},\"nextDeviceImportID\":41,\"nextImportID\":11,\"nextPromiseID\":20,\"nextResolverID\":32,\"state\":{\"transcript\":[{\"d\":[\"deliver\",0,\"registerMailboxDevice\",\"{\\\"args\\\":[{\\\"@qclass\\\":\\\"slot\\\",\\\"index\\\":0}]}\",[{\"id\":40,\"type\":\"deviceImport\"}],30],\"syscalls\":[{\"d\":[\"fulfillToData\",30,\"{\\\"@qclass\\\":\\\"undefined\\\"}\",[]]}]},{\"d\":[\"deliver\",0,\"registerCommsHandler\",\"{\\\"args\\\":[{\\\"@qclass\\\":\\\"slot\\\",\\\"index\\\":0}]}\",[{\"id\":10,\"type\":\"import\"}],31],\"syscalls\":[{\"d\":[\"fulfillToData\",31,\"{\\\"@qclass\\\":\\\"undefined\\\"}\",[]]}]}]},\"vatSlotToKernelSlot\":{\"deviceImports\":[{\"key\":\"deviceImport-40\",\"value\":{\"deviceName\":\"mailbox\",\"id\":0,\"type\":\"device\"}}],\"imports\":[{\"key\":\"import-10\",\"value\":{\"id\":1,\"type\":\"export\",\"vatID\":\"comms\"}}],\"promises\":[],\"resolvers\":[{\"key\":\"resolver-30\",\"value\":{\"id\":40,\"type\":\"resolver\"}},{\"key\":\"resolver-31\",\"value\":{\"id\":42,\"type\":\"resolver\"}}]}}}}"}
panic: (types.ErrorOutOfGas) (0x7ff35778cb80,0xc0023b94a0)

goroutine 17 [running, locked to thread]:
github.com/cosmos/cosmos-sdk/store/types.(*basicGasMeter).ConsumeGas(0xc0023f57b0, 0x3755a, 0x7ff356fd27e2, 0xc)
	/home/warner/go/pkg/mod/github.com/cosmos/[email protected]/store/types/gas.go:93 +0xad
github.com/cosmos/cosmos-sdk/store/gaskv.(*Store).Set(0xc002404720, 0xc0023f5d90, 0xb, 0x10, 0xc00242c000, 0x1d83, 0x1d83)
	/home/warner/go/pkg/mod/github.com/cosmos/[email protected]/store/gaskv/store.go:51 +0xa2
github.com/Agoric/cosmic-swingset/x/swingset.Keeper.SetStorage(0x7ff3578abbc0, 0xc000bb8780, 0x7ff35789d260, 0xc0009e9b20, 0xc000157030, 0x7ff3578a4ba0, 0xc002402cc0, 0xc0023b7080, 0x14, 0xc0023f5d70, ...)
	/home/warner/stuff/agoric/cosmic-swingset/x/swingset/keeper.go:89 +0x8e1
github.com/Agoric/cosmic-swingset/x/swingset.(*storageHandler).Receive(0xc00232e870, 0xc002418000, 0x23ea, 0xc0022b6048, 0x0, 0x0, 0x23f6)
	/home/warner/stuff/agoric/cosmic-swingset/x/swingset/storage.go:41 +0xfdd
github.com/Agoric/cosmic-swingset/x/swingset.ReceiveFromNode(0x1, 0xc002418000, 0x23ea, 0x2, 0x2, 0x23f6, 0x0)
	/home/warner/stuff/agoric/cosmic-swingset/x/swingset/handler.go:71 +0x142
main.SendToGo(0xc000000001, 0x2d0a050, 0xc002415ec0)
	/home/warner/stuff/agoric/cosmic-swingset/lib/agcosmosdaemon.go:101 +0x10b
main._cgoexpwrap_f0fe34618d15_SendToGo(0x7ffc00000001, 0x2d0a050, 0x0)
	_cgo_gotypes.go:119 +0x64
Aborted (core dumped)

standardize on a body+slots structure for serialized reference-bearing object graphs

We have lots of places where we create data (a string which holds the JSON serialization of an object graph, with @qclass markers and stuff) and slots (which is an array of import/export/promise/resolver references), which always travel together. These serve as the arguments of method sends, and the value to which a promise might be resolved. The data remains unmodified as it travels from vat to kernel to other vat, but the slots are mapped at each border. The same pattern happens when making cross-machine calls over a TCP socket or other channel.

In the current code, these two values usually get passed as separate named arguments, and their names change depending upon the context. We should standardize upon some structure (maybe { data, slots } except that doesn't remind us that data is a string) and pass just one argument containing both pieces.

We might use the opportunity to combine methodName with the arguments too, so { method, args: { data, slots } }.
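
The strawman shapes, spelled out (the slot values below are illustrative; the field names are exactly what this issue is trying to settle):

// arguments / resolution value: the serialized body plus its reference slots
const capdata = {
  data: '{"args":[{"@qclass":"slot","index":0}]}', // JSON string with @qclass markers
  slots: [{ type: 'import', id: 11 }],             // remapped at each vat/kernel border
};

// optionally folding the method name in as well
const message = { method: 'getBalance', args: capdata };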

ui/demo plan

Capturing some notes about the UI/demo scheme that @michaelfig and I worked out this afternoon:

The swingset-factory repo will provide a tool (in node.js) that:

  • launches the swingset solo machine in a "worker" subprocess connected via pipes
  • listens on a TCP port for WebSocket connections in the parent "factory" process
  • forwards inbound JSON blobs to the worker, forwards outbound blobs to the websocket
  • the worker hands inbound blobs to a ui-mailbox device (a second instance of the usual mailbox device), which delivers them to a new "UI Vat" (could use a better name). This UI Vat accepts JSON blobs from the frontend UI, rather than VatTP/CapTP messages.
  • the worker notices new outbound blobs in the ui-mailbox device and forwards them through stdout to the factory process, which can then deliver them back out the websocket
  • (actually, since the number of connected websocket clients can range from zero to lots, the blobs may need to indicate which client they're intended for. We can simplify this by declaring all connected clients to be the same, with blobs sent to all of them in parallel)
  • the "factory" process has an HTTP server with two jobs: the websocket for JSON blobs, and a static directory of HTML with which the browser-based client is loaded

For the client-side machines in our demo, the static HTML should implement a basic REPL-like application in the browser: typing a string into the box causes a JSON blob to be sent into the ui-vat for evaluation, and a new numbered history item to be added to a scrollable list (with initial contents of "pending.."). In the vat, each blob will include the history number, and when the eval finishes (and the promise it returns resolves), the ui-vat will send a blob back out to the frontend to change that "pending.." to some actual value.

For the chain-side machines, we need a "controller" swingset-factory that listens on two separate ports. One port is for public interaction, and has a "please provision me" button. The other port is for our control of the testnet chain, and has a more closely-held interface. Both deliver blobs into the attached solo machine, but to different targets. The controller machine must have pre-configured access to the on-chain vats, with which it can drive provisioning messages.

If we're going to use magic-wormhole to deliver provisioning information, we need the factory process to write some files to disk and then spawn the magic-wormhole executable. Alternatively we could bake the URL of the public interaction port into the client-side application, and have the two sides communicate directly with HTTP or websocket messages.

prohibit send()-to-self

I made a mistake while working on some comms integration, and convinced a vat to send messages to itself. Specifically, by getting an "ingress" and "egress" backwards, the receiving comms vat did a syscall.send({type:'export',id},...). Normally vats only use send to target {type:'import'}, but in this case the kernel cheerfully translated the vat-local export into a sending-vat-specific export, and put it on the run queue just like any other message. Eventually the message was delivered to dispatch.deliver() back into the same vat from which it originated.

This was kinda confusing, and I don't think there's a good reason for vats to use the kernel to call back into themselves. So for now I'm going to disable these calls, and have the kernel throw an exception if a vat tries to send() to one of their own exports. It's conceivable that this will become a sensible thing to do in the future, when we get the escalator schedulers in place, but even then I'm not sure it will make sense.

The specific check is added to src/kernel/vatManager.js, in doSend().
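
A sketch of that guard, with details elided (the real doSend() signature differs):

function doSend(fromVatID, targetSlot /* , method, argsString, slots */) {
  if (targetSlot.type === 'export') {
    // a vat's own exports refer back to itself: refuse to queue the message
    throw new Error(`vat ${fromVatID} tried to send() to its own export ${targetSlot.id}`);
  }
  // ... otherwise translate the slot and push onto the run queue as before
}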

build ed25519 signing device

For the sending side of our solo-to-solo and solo-to-chain comms paths, we'll need the ability to generate signatures on the message bodies (since secrets are involved, signing can only be done from a solo machine, not from a chain). Ed25519 is my favorite algorithm, but the implementation I'd like to use (from libsodium) is written in C and accessed through some Node.js FFI binding, making it unusable from within a confined SES context. This is a great place for a device, which can safely invoke the external FFI library and then synchronously return the signature. This device isn't adding any authority (the signature algorithm is just manipulating data which the vat already has, and the vat is Turing-complete), but for performance and safety against timing sidechannels we want to use a low-level language for the implementation, and that's not available from within our confined code environment.

For the receiving side of many paths, we need the ability to verify the signatures (which can be done safely/correctly/meaningfully from within a chain-based machine). This should also be implemented as a device, perhaps the same one.

In the longer run, we're going to want to move the signing key out into another process, in the hopes that this will provide us with some protection against Spectre/Meltdown -style attacks. In this case, the device could own the UNIX-domain socket or named pipe we used to talk to the signing process.

write bang-syntax convertor

We'd like to be able to use "bang syntax" in userspace Vat code: x ! foo(y). This should be converted into E(x).foo(y). This probably wants to be a plugin to our use of rollup when bundling the vat code into a single string.

If we can accomplish that, we should jump ahead to converting it into E('send', x, 'foo', y), and change the E object to match. By providing the method name as a string, we can avoid the need for a Proxy (to glean the method name when it gets looked up), which would be faster and simpler.
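
Illustrated (the first line is the pre-conversion source, which is not valid JavaScript today):

// source (bang syntax):   x ! foo(y)
// current desugaring:     E(x).foo(y)             -- needs a Proxy to learn 'foo'
// proposed desugaring:    E('send', x, 'foo', y)  -- method name as a plain string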

The original Promises proposal (http://web.archive.org/web/20161026162206/http://wiki.ecmascript.org/doku.php?id=strawman:concurrency) describes the syntax, although we might need to update some parts of it.

Harden-related error when using --no-ses and the shell command

I get a harden-related error when I try to run something like this on master:

$ bin/vat --no-ses shell demo/left-right

The exact error is:

(node:68556) UnhandledPromiseRejectionWarning: TypeError: prototype [object Object] of unknown.p.domain is not already in the fringeSet

build provisioning server

The "provisioning server" will be a (probably python-based) webserver that offers demo users the ability to get access to a bundle of demo resources. It will offer a form that says "click here to join the demo", which presents a magic-wormhole invitation code. On the client side, they run some demo-setup tool that we need to write, which uses that code to build a connection with the provisioning server, and exchange two things: their newly-minted ag-solo node's public key is traded for a configuration blob that includes the chain's GCI, the ipaddr/port of all the validator nodes (used as tendermint RPC servers), and the ingress index they should use in their bootstrap.js to access the demo objects.

The provisioning server will then use its localhost HTTP access to a controller ag-solo node to deliver a provisioning request to the chain vat objects. These objects will create the demo objects, then register them with the comms vat in a new egress, tied to the client's public key. Then the provisioning server will send the chain info to the client, and the client will configure its own bootstrap.js file before running ag-solo start.

change HTTP interface

I'd like to build a different HTTP interface for the demo, which would also be easier to use for speaking to the "controller node" (a closely-held solo machine which we use to provision new users onto the chain machine). This might subsume https://github.com/Agoric/SwingSet/issues/46, be subsumed by it, or get built on top of it; I'm not sure yet.

Our current HTTP scheme uses WebSockets exclusively, and obligates the browser frontend to understand our VatTP conventions (numbered messages with acks).

The controller node would be easier to manage if we could do more RPC-like request+response HTTP messages.

I'd like the HTTP server to accept requests and turn each into an inbound device message that gets sendOnlyed to vat-http, along with a request id (a sequence number). vat-http delivers that to some object, and when the resulting promise is resolved, it does an invoke of the device node to send back the response (citing the request id). The HTTP server holds on to the request object in a table until that response comes back, at which point it can fire the HTTP response back.
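
A minimal sketch of that flow from vat-http's side, assuming a hypothetical D-style synchronous device invoke and a commandHandler that returns promises (none of these names are the real API):

  function makeHTTPDispatcher(D, httpDevice, commandHandler) {
    return {
      // invoked (via sendOnly) by the device for each inbound HTTP request
      inbound(requestId, requestBody) {
        Promise.resolve(commandHandler.handle(requestBody)).then(response => {
          // cite the request id so the HTTP server can find the pending request
          D(httpDevice).sendResponse(requestId, response);
        });
      },
    };
  }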

This risks a little hangover inconsistency: if the machine crashes after we've sent the HTTP response but before the checkpoint is completed, the next reboot won't know about the HTTP request that prompted that response. But I think this is OK, at least for now, since the client issuing these HTTP requests doesn't behave like a Vat. If this bothers us, we could change the HTTP device to include an outbox like the mailbox devices have, and have the host loop look in the outbox for HTTP responses to send. The main downside is that we'll want to clear those responses once they're sent, so we need an additional inbound device API to delete things from this state. And we need to think about how the device state should be persisted, if at all.

In addition to the RPC-ish request+response, we should retain a WebSocket broadcast channel, since that's the easiest way to asynchronously update the REPL browser frontend when a promise becomes resolved. The controller/provisioning node won't use websockets, only RPC-like calls. The REPL frontend might use an RPC at the beginning to fetch the history, but then (or even before that) it should start paying attention to websocket payloads to add new history items and update old ones.

comms: improve serialize-to-key function

in src/kernel/commSlots/state/makeCLists.js, there's a Map that manages a table with nominally three columns that all describe the same target: "incomingWireMessage" (which we, the comms vat, receive from some other machine), "kernelToMe" (the slot reference we exchange with the kernel), and "outgoingWireMessage" (the thing we send to the other machine). We only ever actually use two of the six possible mappings: incoming->kernel, and kernel->outgoing. The kernel column holds an object with a { type, id } pair of properties, while the incoming/outgoing entries also have an otherMachineName property.

Since Javascript doesn't have immutable tuples for use in maps (like my beloved Python), we have to turn these composite objects into a string-like key, with some sort of encoding scheme.

We need an encoding scheme that is both stable and collision-free under adversarial attack. Any encoding scheme that can be decoded unambiguously is sufficiently collision-free. JSON.stringify is collision-free, but its stability depends upon the order in which the object keys were added. "djson" (a library that provides deterministic JSON encoding) would be stable, but importing it into a SES environment sounds like an adventure.

I ran into cases where lookups were failing because the JSON encoding wasn't stable, and I couldn't figure out why (something was creating objects in a different way than the clist-management code), so I decided to rewrite the comms/state code to use a different scheme.

For now, we use a cheap concatenation that is stable but not collision-free. The kernel objects are turned into keys like kernel-${type}-${id}, and the incoming objects use incoming-${machineName}-${type}-${id}. If machineName can have hyphens, the attacker could provoke collisions. However, 'otherMachineName' will generally be a public key, which won't have hyphens, so the attacker is not likely to be able to force a collision with other machines, which is the only kind of collision that could get them unintended access.
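
Roughly, the current scheme looks like this (the function names are mine, but the key shapes are as described above):

  function kernelKey(slot) {
    return `kernel-${slot.type}-${slot.id}`;
  }

  function incomingKey(otherMachineName, slot) {
    // stable, but a machineName containing hyphens could let one machine's
    // keys collide with another's
    return `incoming-${otherMachineName}-${slot.type}-${slot.id}`;
  }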

This ticket is to remember to improve this, probably by using entirely independent Map instances (one for each incoming machine), so we're only looking up a simple integer in each one.

bin/vat doesn't work

There is an error when running the encouragementBot from the README:

michael$ bin/vat run demo/encouragementBot
/Users/michael/src/SwingSet/bin/vat:1
Error: Cannot find module '../src/devices'
Require stack:
- /Users/michael/src/SwingSet/bin/vat
    at Object.<anonymous> (/Users/michael/src/SwingSet/bin/vat:10:26)
    at Module._compile (internal/modules/cjs/loader.js:816:30)
    at Object.Module._extensions..js (internal/modules/cjs/loader.js:827:10)
    at Module.load (internal/modules/cjs/loader.js:685:32)
    at Function.Module._load (internal/modules/cjs/loader.js:620:12)
michael$

struggling to run contractHost demo: Cannot find module './bundles/kernel'

I just checked out 3ae77b4 and tried to follow the instructions in the README, but I lose:

connolly@jambox:~/projects/SwingSet$ ./bin/vat run demo/contractHost -- mint
/home/connolly/projects/SwingSet/src/controller.js:1
Error: Cannot find module './bundles/kernel'
Require stack:
- /home/connolly/projects/SwingSet/src/controller.js
- /home/connolly/projects/SwingSet/src/index.js
- /home/connolly/projects/SwingSet/bin/vat
    at Object.<anonymous> (/home/connolly/projects/SwingSet/src/controller.js:1)
connolly@jambox:~/projects/SwingSet$ node --version

cc @JoshOrndorff

implement kernel-side escalator scheduler

The kernel currently holds a queue of messages to deliver, and executes them in strict order of their submission. In the longer run, we want messages to be able to pay for faster execution, using the "meter" and "keeper" mechanisms from KeyKOS.

The first step will omit the whole financial aspect and just implement the scheduling algorithm. Each pending delivery, once it is unblocked on any dependency it might have (none for now, but Flows will add some sooner or later), gets put on an escalator with some starting offer price and delta-price/delta-time slope. Nothing executes while its offer price is below zero (this enables timed delivery, more or less). As time passes, all messages have their price increased by their slope. The highest offer gets to execute.
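
A minimal sketch of that algorithm, ignoring dependencies and Meters for now (names and structure are assumptions, not the planned kernel code):

  function makeEscalatorQueue() {
    const pending = []; // each entry: { message, price, slope }
    return {
      add(message, startingPrice, slope) {
        pending.push({ message, price: startingPrice, slope });
      },
      tick(elapsed) {
        // as time passes, every offer escalates along its slope
        for (const entry of pending) {
          entry.price += entry.slope * elapsed;
        }
      },
      next() {
        // only offers at or above zero are eligible; the highest one runs
        const eligible = pending.filter(e => e.price >= 0);
        if (eligible.length === 0) {
          return undefined;
        }
        const best = eligible.reduce((a, b) => (b.price > a.price ? b : a));
        pending.splice(pending.indexOf(best), 1);
        return best.message;
      },
    };
  }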

In our isolated environment, "time passing" does not have a lot of meaning. But we should be able to simulate it somewhat. It will become more relevant when we're limited by how much computation we can execute at once, e.g. if the Vats are living in a blockchain and the limit is how much computation can go into each block.

A later phase will be to implement Meters and have some sort of tokens being spent. For now, userspace should just be able to specify the initial price and slope to be anything they like, without concern for how it will pay for the prioritization it wants.

We need to think through the userspace interface for this. Most of the time, each outbound message should use the same Meter (and scheduling parameters?) as the triggering inbound message was using. But it should be possible to fork off a separate scheduler as an option when making outbound calls. Maybe a dynamic-scope tool (what would be a "context manager" in Python) that causes all message sends within that scope to use the alternate settings.

Dean's Flow work should cover this: any messages sent within the context of a Flow will use the same Flow (and will be dependent upon all previous messages in that Flow), but there's a .fork() operation that gives you a new Flow, maybe with different parameters. We'll need to make the kernel aware of these Flow objects.

build HTTP server device

For interaction with UIs and CLI-based tools, we need a way to send messages into a machine. Not everything is a swingset machine, so we should enable some sort of lower-tech protocol.

I'm thinking the device node should use Node's built-in HTTP server, and the root object should provide a registerHandler function. Each time an HTTP request arrives, the handler will receive a message that includes the full contents of the request as arguments, plus a "response index" or handle of some sort. Calling a second function on the root object with the response index causes the HTTP request to be answered. We can then wrap this in a Vat which does some minimal dispatch/routing, and presents a promise-based interface (so the real target of the message can trigger a response by just fulfilling their response promise to the HTTP response data).
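
A hypothetical sketch of that surface, using Node's built-in HTTP server on the outside (all names, and the port, are assumptions):

  const http = require('http');

  let nextResponseIndex = 0;
  const pendingResponses = new Map(); // responseIndex -> node ServerResponse
  let currentHandler;

  http.createServer((req, res) => {
    const responseIndex = nextResponseIndex;
    nextResponseIndex += 1;
    pendingResponses.set(responseIndex, res);
    if (currentHandler) {
      // hand the request contents plus the response index into the vat
      currentHandler({ method: req.method, url: req.url }, responseIndex);
    }
  }).listen(8000);

  const rootObject = {
    registerHandler(handler) {
      currentHandler = handler; // handler(request, responseIndex)
    },
    sendResponse(responseIndex, body) {
      // answers the HTTP request identified by responseIndex
      const res = pendingResponses.get(responseIndex);
      pendingResponses.delete(responseIndex);
      res.end(body);
    },
  };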

It might also be interesting to add an HTTP client device, but that's lower priority for now.

new device model

@dtribble and I walked through a new device model this morning, and I think it makes enough sense to implement eventually:

  • devices are represented by a new slot type, so {type: 'device', id}
  • these are Vat imports just like promises and exports of other vats
  • they can be shared with other Vats by including them as an argument in syscall.send, etc
  • they are not valid as a target of syscall.send
  • a new syscall.invoke(deviceImport, args, slots) -> (args, slots) is used instead

syscall.invoke is like syscall.send but it is synchronous: the device is invoked immediately, and it can return an arbitrary immediate value (whereas send only returns a promiseID). This lets it be used for synchronous table lookup or modification.
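
A hedged illustration of the difference, following the shapes sketched above (the exact argument encodings and the send signature are assumptions):

  // asynchronous: send only hands back a promiseID; the answer arrives later
  const promiseID = syscall.send(targetSlot, 'getBalance', argsJSON, argSlots);
  // synchronous: invoke returns an immediate value, usable for table lookups
  const { args, slots } = syscall.invoke(deviceSlot, lookupArgsJSON, argSlots);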

For the inbound-comms-vat use case, we were thinking that devices are allowed to put deliveries onto a special queue that gets handled before the regular run-queue.

The bootstrap vat would get all configured devices as arguments of the initial bootstrap() call, just like it currently gets references to the root object of all pre-configured vats. It could then share these with other vats as it sees fit. This means both the bootstrap vat and the recipient vat are sharing access, which makes Dean nervous, so he'd like to see some sort of getExclusive call. We might implement this by adding a method to each device that returns a new device handle (and revokes access from any old ones). He was also somewhat concerned about authenticity: if we're going to handle devices like we do Purses (where deposit is sort of a "get exclusive"), then it'd be nice if vats had access to the Issuer, so they could check that the device handle they received is really from the Issuer they know. But I thought that if we add a special pathway for them to get the Issuer, we could just give them the device directly, and all Vats are already relying upon the bootstrap vat anyway.

The API of a Vat is still pure data, so I don't think this will make migration any more difficult. And it removes the current raw-JS-object endowment pattern, which should make it easier. It means devices are only slightly more powerful than regular references: the one thing they add is synchronous invocation.

promise.domain interferes with harden() in non-SES code and node REPL

We learned that Promise instances acquire an extra property in code executed from the Node REPL:

$ node
> p = new Promise(r => 1)
> Object.getOwnPropertyNames(p)
[ 'domain' ]
> Object.getPrototypeOf(p.domain)
Domain {
  members: undefined,
  _errorHandler: [Function],
  enter: [Function],
  exit: [Function],
  add: [Function],
  remove: [Function],
  run: [Function],
  intercept: [Function],
  bind: [Function] }
> 

This seems to exist to support Node's deprecated "Domain" concept (https://nodejs.org/api/domain.html#domain_domains_and_promises), which provides a sort of on-exit notification when execution has left a particular region of the code (think server event handlers, and cleaning up resources allocated for a single event).

The .domain property is an object whose prototype is not a normal JS primordial.

When we run bin/vat shell --no-ses, any code which tries to harden() a Promise will fail, because it tries to harden p.domain, and then it sees that the prototype is not in the "fringe" (the set of all the primordials that SES would have frozen). In this way, our non-SES harden() behavior is successfully detecting something strange: a form of mutability that would have violated the goals of hardening.

But in practice, it's kind of a drag. Running the vat with --no-ses is awfully handy for tracking down the line numbers of crashes, so it's annoying that the code can't work without SES.

One fix might be to find this domain prototype and add it to the @agoric/harden fringe. Another might be for the vat to monkeypatch Promise (somehow?) to stop adding these domain properties. Since this is specific to the Node REPL, there may be some hook we can use to control the behavior without needing to muck with Promise too much.

remove basedir/vat-NAME/index.js handler

You can currently provide the source code for pre-defined vats in one of two ways: a single file named vat-NAME.js (in the base directory, next to bootstrap.js), or a directory named vat-NAME/ that contains (at least) an index.js file. Either one of these can be used to define the vat, and since they're fed to rollup, they can have import statements to pull in other code.

MarkM pointed out that "there's only one way to do it" is a virtue, and the ability to import means you can arrange the rest of your code any way you like, so we decided to drop the vat-NAME/index.js approach and only leave the vat-NAME.js scheme. (Doing it the other way around means all your source files are named index.js, just in different directories, and that's pretty annoying to work with).

This will be a 3-line change to loadBasedir() in src/controller.js, plus maybe deleting a test.

signature verification fails on second message

Following the instructions in README-solo-and-chain.md, I'm able to get an E(home.chain).getBalance() message from the web browser to the HTTP server to the solo machine to the chain machine and back again, updating the browser REPL history display to say "100" units as expected. However the solo machine then tries to ACK the outbound message, and that second inbound message gets an error from the helper, saying the chain node rejected it with a signature verification error:

new outbound message 1 for 975da084af0bd6e37087cc48bf0bd2ad853277e2a8eb4d10e1c01c297b813d2d: {"target":{"type":"your-egress","id":1},"methodName":"getBalance","args":[],"slots":[],"resultSlot":{"type":"your-resolver","id":6}}
 1 new messages
 invoking deliverator
delivering to chain 975da084af0bd6e37087cc48bf0bd2ad853277e2a8eb4d10e1c01c297b813d2d [ [ 1,
    '{"target":{"type":"your-egress","id":1},"methodName":"getBalance","args":[],"slots":[],"resultSlot":{"type":"your-resolver","id":6}}' ] ] 0
running helper [ 'tx',
  'swingset',
  'deliver',
  'cosmos1qe69qjhzmu37f3zgwf7wahd9vg2yvf5pgyh54h',
  '[[[1,"{\\"target\\":{\\"type\\":\\"your-egress\\",\\"id\\":1},\\"methodName\\":\\"getBalance\\",\\"args\\":[],\\"slots\\":[],\\"resultSlot\\":{\\"type\\":\\"your-resolver\\",\\"id\\":6}}"]],0]',
  '--from',
  'ag-solo',
  '--yes',
  '--chain-id',
  'agoric',
  '--home',
  '/home/warner/stuff/agoric/cosmic-swingset/t1/ag-cosmos-helper-statedir' ]
 helper said: Response:
  Height: 10
  TxHash: 9D0AE33B523271F59835E60AE613CDE01E188E640B3ED9D2C44C66949815E956
  Logs: [{"msg_index":0,"success":true,"log":""}]
  GasWanted: 200000
  GasUsed: 10019
  Tags:
    - action = deliver

new block on 975da084af0bd6e37087cc48bf0bd2ad853277e2a8eb4d10e1c01c297b813d2d, fetching mailbox
 helper said: {"value":"{\"ack\":1,\"outbox\":[[1,\"{\\\"event\\\":\\\"notifyFulfillToData\\\",\\\"promise\\\":{\\\"type\\\":\\\"your-promise\\\",\\\"id\\\":6},\\\"args\\\":\\\"100\\\",\\\"slots\\\":[]}\"]]}"}

deliverInbound [ [ 1,
    '{"event":"notifyFulfillToData","promise":{"type":"your-promise","id":6},"args":"100","slots":[]}' ] ] 1
about to return {"@qclass":"undefined"} []
about to return {"@qclass":"undefined"} []
cs[comms].dispatch.deliver 1.inbound -> 40
sendIn 975da084af0bd6e37087cc48bf0bd2ad853277e2a8eb4d10e1c01c297b813d2d => {"event":"notifyFulfillToData","promise":{"type":"your-promise","id":6},"args":"100","slots":[]}
cs[comms].dispatch.deliver 2.update -> 41
about to return {"@qclass":"undefined"} []
deliver { http:
   { outbox: [ [Array], [Array], [Array], [Array], [Array] ],
     inboundAck: 1 },
  '975da084af0bd6e37087cc48bf0bd2ad853277e2a8eb4d10e1c01c297b813d2d': { outbox: [], inboundAck: 1 } }
new outbound message 5 for http: {"target":{"type":"your-egress","id":10},"methodName":"update","args":[0,"command[0] = E(home.chain).getBalance()\nhistory[0] = 100\n"],"slots":[],"resultSlot":{"type":"your-resolver","id":9}}
 1 new messages
 invoking deliverator
 0 new messages
 invoking deliverator
delivering to chain 975da084af0bd6e37087cc48bf0bd2ad853277e2a8eb4d10e1c01c297b813d2d [] 1
running helper [ 'tx',
  'swingset',
  'deliver',
  'cosmos1qe69qjhzmu37f3zgwf7wahd9vg2yvf5pgyh54h',
  '[[],1]',
  '--from',
  'ag-solo',
  '--yes',
  '--chain-id',
  'agoric',
  '--home',
  '/home/warner/stuff/agoric/cosmic-swingset/t1/ag-cosmos-helper-statedir' ]
ERROR: {"codespace":"sdk","code":4,"message":"signature verification failed"}
helper failed
rc: 1
stdout: Response:
  TxHash: C3235AC813FDA4E41CA83BE6118994B7E80E2174073A11DF135B01AB2FA7990D

stderr: ERROR: {"codespace":"sdk","code":4,"message":"signature verification failed"}

I didn't see anything noteworthy on the chain node's log: it didn't complain about a bad txn being submitted. I'm assuming the "signature verification failed" message came from the chain node.

I think the next step is to instrument the chain node's cli-handler and see how it is building the message body that needs to be signed. Maybe we're leaving some field out of the signed bytes which is somehow ok for the first message (maybe when sequence is 0?), but not for the second. Or maybe the sequence number isn't being updated, and the chain node is throwing the wrong kind of error message.

'vat run demo/encouragementBotComms' is broken

When I replaced the old per-vat device code with the new machine-wide device nodes, I broke the code in bin/vat that knows how to provide channel devices to any vat named comms. As a result, running vat run demo/encouragementBotComms is now broken. Related to #171 but a tiny bit larger.

The fix will be to change bin/vat to:

  • import buildChannel() from src/devices/channel.js
  • lose config.vatDevices
  • instead do config.devices = [['channel', ch.srcPath, ch.endowments]], where ch comes from const ch = buildChannel() (see the sketch below)
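
Roughly, the launcher change would look like this (the exact import form is assumed):

  const { buildChannel } = require('../src/devices/channel');

  // config is the existing launch configuration object built in bin/vat
  const ch = buildChannel();
  config.devices = [['channel', ch.srcPath, ch.endowments]];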

I think the demo/encouragementBotComms/bootstrap.js is all ready to go, since we test it from the unit tests. It was just the launcher in bin/vat that wasn't updated.

harden method results, or attach an extra then() in user-level code?

Our serialization code imposes a requirement that everything you pass into or out of a method must be frozen, specifically in the way that our harden() utility does it. The unserializer hardens everything that comes out, which can avoid surprises if e.g. a method modifies an array that it receives.

This is a minor nuisance in patterns which use something like Promise.all to join a bunch of results together on the return path from a method. For example, the last section of our escrow agent (demo/contractHost/escrow.js) is:

  const aT = makeTransfer(a.moneySrcP, b.moneyDstP, b.moneyNeeded);
  const bT = makeTransfer(b.stockSrcP, a.stockDstP, a.stockNeeded);
  return Promise.race([
    Promise.all([aT.phase1(), bT.phase1()]),
    failOnly(a.cancellationP),
    failOnly(b.cancellationP),
  ]).then(
    _x => Promise.all([aT.phase2(), bT.phase2()]),
    _ex => Promise.all([aT.abort(), bT.abort()]),
  );

That final .then yields a Promise.all, which yields an array of results (in this case, whatever the phase2() methods produce). However that Array object won't be frozen, since Promise.all doesn't know that it's supposed to freeze things. The error message is not ideal: we know which method was being invoked (and whose return value was not frozen), but it's not always obvious which line of code is responsible for the final value (you might be returning a promise created by some subroutine, and you might not know whether it hardens its results or not).

One way to fix this would be to make userspace authors explicitly freeze these things:

  ]).then(
    _x => Promise.all([aT.phase2(), bT.phase2()]),
    _ex => Promise.all([aT.abort(), bT.abort()]),
  ).then(harden, harden);

The other way would be to have the method-dispatching code just automatically freeze the return value from the method invocation. User-level code wouldn't be obligated to do anything special.
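
A minimal sketch of that second option, hardening inside the dispatch path so userspace doesn't have to (the names here are assumptions, and harden is assumed to come from @agoric/harden as elsewhere):

  import harden from '@agoric/harden';

  async function dispatchMethod(target, methodName, args) {
    const result = await target[methodName](...args);
    // freeze whatever the method produced before it gets serialized
    return harden(result);
  }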

It probably comes down to what value we're getting from this requirement. It certainly encourages more liberal use of harden in our code, and makes it clear that we shouldn't be using e.g. mutable Arrays or Objects as API surfaces. But does this reminder help us w.r.t. return values?

To get the contractHost examples to work, I implemented the second option (the method-dispatching code automatically freezes the return value). But maybe we should remove that and add the extra .then(harden) in the contractHost examples instead.

create addVat() device

Some Vats should have the ability to create new Vats. We should probably treat this authority as a "device", to provide it selectively to certain initial vats. This might also make it easier to provide certain authority over the new Vat: termination, migration, upgrade, and debugging.

can't use SES.confineExpr with --no-ses

It's probably obvious, but when we run the vat with --no-ses, we no longer have the SES.confineExpr safe two-argument evaluator for guest code. This means we can't run the contractHost example with --no-ses, at least not the variant that actually uses the contract host (which needs to evaluate the contract source code it receives).

This is related to the question of how to best expose this evaluator. At present, SES provides a SES global endowment, from which .confineExpr (and I think plain .confine) are available. Since this is an endowment, any further evaluation (like the one performed by the contract host) must pass SES through to the evaluated code, since it isn't automatically there, and it isn't otherwise obtainable.
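
For reference, the contract host's use of the evaluator is roughly this shape (the endowments shown are placeholders):

  // SES here is the global endowment described above; the guest source and
  // the endowments object are the two arguments
  const contractFn = SES.confineExpr(contractSrcString, { harden });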

It might be nicer to provide SES as an importable module (possibly with some special-casing when run inside a SES realm), like we do with @agoric/harden. It's not clear how to achieve that, but eventually we need a better approach than a global.

new state-management approach

After we land Agoric/SwingSet#57, all the kernel state will be stored in a single key-value store, which is addressed through get/set messages so it can live in the outer "primal" realm (the get/set messages deal entirely with strings, which are safe to pass cross-realm).

The old state-management approach was to occasionally ask the controller for the entire kernel state, which it returned as a big JSON object, which could then be serialized to disk. At startup time, if the config object contained state, the kernel would be told to loadState(config.state) before it did anything else: this would populate the various tables and also replay the vat transcripts (to bring the javascript object-graph state back up to date). If there was no saved state, the bootstrap function would be called instead.

The new approach I'm thinking of is:

  • we add a flag to the state which says "bootstrap function has been run" or not
  • src/controller.js exports a function to build a kvstore object that wraps a file on disk, with a method that writes the full contents to disk. We also provide an empty state (in particular, the vat transcripts are empty) somehow.
  • buildVatController is constructed with a mandatory kvstore object (either the disk-wrapper, or something that puts state into the cosmos-sdk durable/provable kvstore)
  • instead of buildVatController using state/no-state to switch between loadState and callBootstrap, it always just calls kernel.startup. Then kernel.startup processes the vat transcripts (which might be empty), then looks at the flag to see whether bootstrap has been run yet or not, and synthesizes/enqueues the bootstrap message if so.
  • (eventually it might make more sense to include the bootstrap message in the "empty" state, and remove the has-bootstrap-been-run flag)
  • as the kernel runs, it reads its state directly from the kvstore with get, and modifies it with set as it goes

In a solo-machine environment, the startup code should either create an empty state object, or build one from the file on disk. Then, after it cycles the kernel each time, it should tell the kvstore to save itself to disk.
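
A hedged sketch of the disk-wrapping kvstore described above (string keys and values only; names are assumptions):

  const fs = require('fs');

  function buildFileKVStore(path) {
    const data = fs.existsSync(path)
      ? JSON.parse(fs.readFileSync(path, 'utf8'))
      : {}; // the "empty state" case
    return {
      get(key) {
        return data[key];
      },
      set(key, value) {
        data[key] = value;
      },
      has(key) {
        return key in data;
      },
      writeToDisk() {
        // the host loop calls this after each kernel cycle on a solo machine
        fs.writeFileSync(path, JSON.stringify(data));
      },
    };
  }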

In a chain environment, all state is read directly from the chain's kvstore, and changes are processed immediately. We don't need any special post-kernel-cycle save call.

Lazy State Loading for Chain Machines

Our current plan for the cosmos-sdk integration is to defer creating the swingset environment until the first time the x/swingset/handler.go handler is invoked, which will occur some time after the chain node is launched, when a txn containing a swingset message is processed. (if most of the cosmos-sdk messages are to modules other than swingset, this could be a rather long time). At that moment, the handler will deliver the deliverInbound message over to the node.js side, which will realize that it doesn't have a kernel/controller to deliver into, and it will construct them. During construction, it will run kernel.startup, which will rebuild the javascript environment by re-delivering all the vat messages that were recorded in the kvstore state. It can read this kvstore state because we're in the middle of a transaction: that state was unavailable during process startup, so we can't build the swingset module (or rather we can't inject its state) any earlier.

This replay step could take an unpredictable amount of time, since our orthogonal-persistence approach requires us to replay those transcripts, which grow with the age of the vat (rather than with the size/number of objects in those vats). This might interfere with the chain node's ability to validate/vote in a timely fashion: any node that has been restarted since the last swingset message will take a lot longer than the ones that still have that javascript state intact. In the worst case, this could result in slashing as penalty for not voting quickly enough.

It might be nice to reduce this unpredictability by pre-loading as much of the JS state as possible. Our thought is to start with adding a sequence number to the kernel state (maybe counting turns, maybe counting additions to the runqueue).

Then, we manage a separate copy of the kernel state in a file on disk, outside of the normal cosmos-sdk kvstore. We need this separate copy because the kvstore is only available to handlers during the processing of a transaction (the Keeper knows whether it is processing a CheckTx or a DeliverTx, and provides different state objects in the two cases). To load the kernel from an earlier state during node startup, that state must come from the disk. But that state might not match what the node is really using, since we don't get notified when blocks are finished.

To deal with that, we store the messages that provoke turns along with the sequence number of the turn that results, and we can replay these messages to roll forward from the disk-based state to whatever the actual kvstore contains.

Nominally, just before delivering each message to the node.js side, we pull the full kvstore state (including the sequence number) and write it to disk. We can get this state because we're inside a transaction. We don't really want to do this for every single message, as it's a lot of data, so we decimate the data in two ways. First, we use the context object to find out what the current block height is, and we only consider writing a new snapshot when that value has changed. Second, we only write snapshots once out of every N times (perhaps every 100 messages).

After delivering the message, we pull the seqnum from the kvstore (which should now be one larger than before), and append both the seqnum and the contents of the message we delivered to the on-disk file, as an array of (seqnum, message) tuples.

At startup, we read the latest snapshot state from disk, and build a swingset instance from it. This will take a while, since we're replaying every single vat message, but this all happens before the cosmos-sdk node is ready for validation, so it's the best time to do it. We need to manage a short-lived kvstore object with this saved state for a while, separate from the cosmos-sdk's real kvstore.

Then, in the swingset handler, upon entry, we pull the seqnum from the real kvstore, and compare it to the one in the short-lived populated-from-disk kvstore. In general, the real one will have a newer seqnum, because our snapshot is somewhat old. At that point, we read (seqnum,message) pairs from the disk table and apply the messages until it results in the short-lived kvstore having the same seqnum as the real one. While these messages are being applied, only the short-lived kvstore should be modified.

When the seqnum catches up, both kvstores should have the same contents, and the JS state should be the same as it was when the cosmos-sdk node last processed a transaction. At that point, we should swap out the kvstores, leaving the real (cosmos-sdk) one in place, and discarding the short-lived one.
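
A hedged sketch of that catch-up step (every name here is an assumption about shapes we haven't settled):

  function catchUp(realKVStore, speculativeKVStore, diskLog, kernel) {
    const realSeq = Number(realKVStore.get('seqnum'));
    let specSeq = Number(speculativeKVStore.get('seqnum'));
    // replay logged (seqnum, message) pairs until the speculative copy catches up
    for (const [seqnum, message] of diskLog) {
      if (specSeq >= realSeq) {
        break;
      }
      if (seqnum <= specSeq) {
        continue; // already reflected in the snapshot
      }
      kernel.deliver(message); // must only touch the speculative kvstore
      specSeq = Number(speculativeKVStore.get('seqnum'));
    }
    // equal seqnums mean both kvstores should now have the same contents and
    // the caller can swap them; anything else means the snapshot was unusable
    return specSeq === realSeq;
  }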

If, for some reason, the on-disk snapshot is too new, the handler can throw out the failed-speculation kernel, start up a new one by replaying the entire vat transcripts, and just take the latency hit.
