ava-labs / hypersdk
Opinionated Framework for Building Hyper-Scalable Blockchains on Avalanche
Home Page: https://hypersdk.xyz/
License: Other
Load testing reveals that for every accepted block, we call parse on that block ~17 times.
Each transaction should have some sort of fee based on the number of state keys it requires.
We should not make this the entirety of the fee because it would not allow us to charge for interesting compute within an action.
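A minimal sketch of what such a fee rule could look like: a per-state-key charge plus a compute component so the key count is not the entirety of the fee. The constants and function names are illustrative, not from the hypersdk codebase.

```go
package main

import "fmt"

// Hypothetical fee parameters; real values would come from chain rules.
const (
	baseFee   = 100 // flat cost per tx
	feePerKey = 10  // cost per state key the tx requires
)

// txFee charges for state keys touched plus the action's own compute units,
// so interesting compute inside an action is still priced.
func txFee(numStateKeys, computeUnits uint64) uint64 {
	return baseFee + feePerKey*numStateKeys + computeUnits
}

func main() {
	fmt.Println(txFee(4, 25)) // 100 + 10*4 + 25 = 165
}
```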
go1.17 added the ability to convert from slice to array/array pointer: https://tip.golang.org/ref/spec#Conversions_from_slice_to_array_or_array_pointer
However, the UX for this was super ugly until go1.20 (just released):
We should now migrate all our codec unmarshaling to do this instead of copying from the tx byte slice (which will dramatically reduce memory allocations in block processing). Thanks to @rafael-abuawad for calling my attention to this in #21.
I wish we had more memory allocation benchmarks so we could show the performance improvement here, but I don't think it makes sense to wait for those before adding this. We can probably just add one quick one for unmarshaling a dummy tx.
We MUST be careful to only cast exactly the bytes we need so we don't keep the entire backing array of the slice alive if we don't need to (even though we usually will because we hold onto the tx bytes): https://utcc.utoronto.ca/~cks/space/blog/programming/GoInteriorPointerGC
go1.20.1 is not stable (golang/go#58798)
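A minimal sketch of the two conversion forms. Note the GC caution above applies to the pointer form: converting to an array pointer aliases the slice's backing array, while the go1.20 array-value conversion copies the bytes.

```go
package main

import "fmt"

func main() {
	txBytes := []byte{0xde, 0xad, 0xbe, 0xef, 0x01, 0x02}

	// go1.17: convert a slice to an array *pointer*. No copy is made, so
	// idPtr aliases txBytes' backing array (and keeps it alive).
	idPtr := (*[4]byte)(txBytes[:4])

	// go1.20: convert a slice directly to an array value. This copies the
	// bytes, so the result does not retain the backing array.
	id := [4]byte(txBytes[:4])

	fmt.Println(*idPtr == id) // true
}
```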
During hypersdk load testing, I found that avalanchego spends the majority of time performing compression-related tasks (zstd should help a ton here):
Showing nodes accounting for 16600ms, 71.55% of 23200ms total
Dropped 486 nodes (cum <= 116ms)
Showing top 10 nodes out of 233
flat flat% sum% cum cum%
3040ms 13.10% 13.10% 7990ms 34.44% compress/flate.(*compressor).deflate
2500ms 10.78% 23.88% 3170ms 13.66% compress/flate.(*decompressor).huffSym
2160ms 9.31% 33.19% 2160ms 9.31% runtime.memmove
1820ms 7.84% 41.03% 1820ms 7.84% crypto/sha256.block
1730ms 7.46% 48.49% 1730ms 7.46% runtime/internal/syscall.Syscall6
1550ms 6.68% 55.17% 1670ms 7.20% github.com/golang/snappy.encodeBlock
1460ms 6.29% 61.47% 2290ms 9.87% compress/flate.(*compressor).findMatch
820ms 3.53% 65.00% 820ms 3.53% compress/flate.matchLen (inline)
760ms 3.28% 68.28% 760ms 3.28% compress/flate.(*dictDecoder).writeByte
760ms 3.28% 71.55% 760ms 3.28% runtime.memclrNoHeapPointers
We should use a bytes.Pool to reduce the number of memory allocations we do during trie operations.
When evicting a cached block, failing to parse a block, or removing a tx from the mempool, we should add the raw bytes to a pool for later reuse.
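A minimal sketch of such a pool using sync.Pool (the pool variable and sizes are illustrative, not from the hypersdk codebase). Evicted block bytes and dropped tx bytes would be returned here instead of being left for the garbage collector.

```go
package main

import (
	"fmt"
	"sync"
)

// bytesPool is a hypothetical pool of reusable byte slices.
var bytesPool = sync.Pool{
	New: func() any {
		b := make([]byte, 0, 4096)
		return &b // a pointer avoids boxing a slice header on every Put
	},
}

func main() {
	// Borrow a buffer and use it for some raw bytes.
	bp := bytesPool.Get().(*[]byte)
	buf := append((*bp)[:0], "raw tx bytes"...)
	fmt.Println(string(buf))

	// Done with the buffer: reset it and return it for reuse.
	*bp = buf[:0]
	bytesPool.Put(bp)
}
```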
If a validator does not sign an Avalanche Warp Message or does not send us certain data blobs (in the case of a storage-based Subnet), we should be able to penalize them by reducing their Uptime (which affects whether or not they will be rewarded).
It would also be cool if there was an option to increase the number of accounts with each new transaction instead of sending to an existing account (which keeps the trie the same size during the test).
Create a simple tutorial on how to use the SDK, maybe in written or video form. I believe this would be the most useful thing.
Topics that the tutorial should cover:
To test the throughput on local/live networks, we'll need some sort of spam script.
To optimize throughput, we should ask for a # of accounts and then generate 1 tx per recipient per second. The max throughput per second will then be # accounts ^ 2.
We should additionally disregard any failures/avoid tracking tx success to maximize throughput.
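A sketch of the spam loop described above, with the actual tx submission elided; with N funded accounts each sending one tx to every account per second, max throughput is N*N tx/s.

```go
package main

import "fmt"

// plannedTxsPerSecond counts how many txs the spam script would issue per
// second for a given account count (illustrative; the transfer itself and
// any failure tracking are intentionally omitted).
func plannedTxsPerSecond(numAccounts int) int {
	txs := 0
	for from := 0; from < numAccounts; from++ {
		for to := 0; to < numAccounts; to++ {
			txs++ // would submit one transfer from -> to here
		}
	}
	return txs
}

func main() {
	fmt.Println(plannedTxsPerSecond(10)) // 100
}
```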
Bonus points for adding an MPC-based auth provider like web3auth instead of storing private keys in the browser.
Allocated memory:
Showing nodes accounting for 19.24GB, 48.93% of 39.32GB total
Dropped 435 nodes (cum <= 0.20GB)
Showing top 10 nodes out of 188
flat flat% sum% cum cum%
2.88GB 7.33% 7.33% 2.88GB 7.33% github.com/ava-labs/hypersdk/tstate.(*TState).Insert
2.64GB 6.71% 14.04% 2.64GB 6.71% github.com/ava-labs/hypersdk/examples/tokenvm/storage.PrefixBalanceKey (inline)
2.52GB 6.40% 20.44% 2.52GB 6.40% github.com/ava-labs/avalanchego/utils/wrappers.(*Packer).expand
2.47GB 6.28% 26.72% 2.47GB 6.28% github.com/ava-labs/avalanchego/x/merkledb.(*trieView).invalidateChildrenExcept
2GB 5.09% 31.81% 2GB 5.09% golang.org/x/exp/maps.Clone[...] (inline)
1.42GB 3.60% 35.41% 1.42GB 3.60% github.com/ava-labs/avalanchego/x/merkledb.newPath (inline)
1.41GB 3.58% 39.00% 3.41GB 8.67% github.com/ava-labs/avalanchego/x/merkledb.(*node).clone
1.33GB 3.39% 42.39% 3.51GB 8.93% github.com/ava-labs/avalanchego/x/merkledb.(*trieView).calculateNodeIDsHelper
1.29GB 3.28% 45.67% 1.29GB 3.28% google.golang.org/grpc.(*parser).recvMsg
1.28GB 3.27% 48.93% 3.07GB 7.82% github.com/ava-labs/hypersdk/chain.(*Processor).Prefetch.func1
Blocked on: ava-labs/avalanche-network-runner#467
Related: #82
Will also need to remove all //nolint:gocritic comments about default T values for generics.
2023-03-21T06:07:52.3130418Z [node9-bls] [03-21|06:07:52.312] INFO <2EWQarceXeUfevv4ohDBBtgRbJjRYS54Lrrfr7w7jczbcZFT23 Chain> syncer/state_syncer.go:298 skipping state sync {"reason": "no acceptable summaries found"}
2023-03-21T06:07:52.3131808Z [node9-bls] [03-21|06:07:52.312] INFO <2EWQarceXeUfevv4ohDBBtgRbJjRYS54Lrrfr7w7jczbcZFT23 Chain> bootstrap/bootstrapper.go:123 starting bootstrapper
2023-03-21T06:07:52.3132818Z [node9-bls] [03-21|06:07:52.312] INFO <2EWQarceXeUfevv4ohDBBtgRbJjRYS54Lrrfr7w7jczbcZFT23 Chain> vm/vm.go:379 bootstrapping started
2023-03-21T06:07:52.3143658Z [node9-bls] [03-21|06:07:52.314] INFO <2EWQarceXeUfevv4ohDBBtgRbJjRYS54Lrrfr7w7jczbcZFT23 Chain> common/bootstrapper.go:244 bootstrapping started syncing {"numVerticesInFrontier": 1}
2023-03-21T06:07:52.3145168Z [node9-bls] [03-21|06:07:52.314] INFO <2EWQarceXeUfevv4ohDBBtgRbJjRYS54Lrrfr7w7jczbcZFT23 Chain> bootstrap/bootstrapper.go:553 executing blocks {"numPendingJobs": 0}
2023-03-21T06:07:52.3146378Z [node9-bls] [03-21|06:07:52.314] INFO <2EWQarceXeUfevv4ohDBBtgRbJjRYS54Lrrfr7w7jczbcZFT23 Chain> queue/jobs.go:224 executed operations {"numExecuted": 0}
2023-03-21T06:07:52.3148277Z [node9-bls] [03-21|06:07:52.314] INFO <2EWQarceXeUfevv4ohDBBtgRbJjRYS54Lrrfr7w7jczbcZFT23 Chain> snowman/transitive.go:409 consensus starting {"lastAcceptedBlock": "7cE3pDvTaEx3Gn75wU2ve6RqRbLzdm6jh3ooqpvVhRXnSpCWg"}
2023-03-21T06:07:52.3149778Z [node9-bls] [03-21|06:07:52.314] INFO <2EWQarceXeUfevv4ohDBBtgRbJjRYS54Lrrfr7w7jczbcZFT23 Chain> vm/vm.go:382 normal operation started
Reference: https://github.com/ava-labs/xsvm
Track total txs, total accounts, hourly txs, and hourly accounts for easy dashboard creation before full indexing can be done.
When we have previously synced but are past the "history window" kept by other nodes, we clear all previously synced data on-disk.
Instead, we should fetch edge proofs of the current state to see if we can avoid re-fetching anything we already have from the rest of the network.
For auth methods that don't require any DB access to verify, we should add shared modules (ex: ED25519 in TokenVM)
Not sure how to get past this. This is the same problem I'm having with indexvm. Any help?
go version go1.19.2 darwin/arm64
Consider running different devnets at a variety of different hardware specs so devs can have a clear understanding of what different tradeoffs of CPU/RAM/DISKIO allow.
The acceptor queue does not handle graceful shutdown properly (if we are processing a block when shutdown is triggered): https://github.com/ava-labs/indexvm/actions/runs/4226574719/jobs/7340155987.
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x90 pc=0xbb5fbe]
goroutine 73 [running]:
github.com/ava-labs/hypersdk/chain.(*StatelessBlock).Processed(...)
/home/runner/go/pkg/mod/github.com/ava-labs/[email protected]/chain/block.go:510
github.com/ava-labs/hypersdk/vm.(*VM).processAcceptedBlocks(0xc0001103c0)
/home/runner/go/pkg/mod/github.com/ava-labs/[email protected]/vm/resolutions.go:132 +0x9e
created by github.com/ava-labs/hypersdk/vm.(*VM).Initialize
/home/runner/go/pkg/mod/github.com/ava-labs/[email protected]/vm/vm.go:272 +0x218a
The simplest fix for this would just be to wait on the acceptorQueue in VM.Shutdown.
We should wrap PebbleDB in this interface to ensure we never write to a database after receiving an unexpected error (which may lead to unexpected consequences): https://github.com/ava-labs/avalanchego/tree/master/database/corruptabledb
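A sketch of the latching behavior such a wrapper provides: after the first write error, every subsequent write is refused. The interface and type names here are illustrative stand-ins; avalanchego's corruptabledb wraps its full database interface.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// putter is a minimal stand-in for the database interface.
type putter interface {
	Put(key, value []byte) error
}

// corruptableDB refuses all writes after the first unexpected error.
type corruptableDB struct {
	db putter

	lock      sync.Mutex
	corrupted error
}

func (c *corruptableDB) Put(key, value []byte) error {
	c.lock.Lock()
	defer c.lock.Unlock()
	if c.corrupted != nil {
		return fmt.Errorf("database corrupted: %w", c.corrupted)
	}
	if err := c.db.Put(key, value); err != nil {
		c.corrupted = err // latch the failure permanently
		return err
	}
	return nil
}

// flakyDB fails every Put while fail is true (for demonstration only).
type flakyDB struct{ fail bool }

func (f *flakyDB) Put(key, value []byte) error {
	if f.fail {
		return errors.New("disk error")
	}
	return nil
}

func main() {
	inner := &flakyDB{fail: true}
	c := &corruptableDB{db: inner}
	fmt.Println(c.Put([]byte("k"), []byte("v"))) // disk error
	inner.fail = false
	fmt.Println(c.Put([]byte("k"), []byte("v"))) // database corrupted: disk error
}
```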
It would be super cool if PRs emitted how much the load test results changed (assuming the disk sampling was similar on main).
We should be able to emit coverage changes on each PR + add a badge to the README
Once x/merkledb locking is overhauled, it will be possible to concurrently fetch values from disk (already supported by pebble).
This is probably not a big deal, but thought I'd mention it.
I am getting the following warning in VS Code:
Error loading workspace: You have opened a nested module. To work on multiple modules at once,
please use a go.work file. See https://github.com/golang/tools/blob/master/gopls/doc/workspace.md
for more information on using workspaces.
Under load (lots of RPC tx submissions and gossip), our naïve locking patterns don't yield time to consensus-related processes as soon as possible.
We need to overhaul this design to ensure that no matter what is going on that consensus-related processes get time to execute (and acquire requisite locks) as soon as possible.
For now, the simplest would be to have a "job processor" where we regulate access to MerkleDB based on priority instead of relying on golang to manage lock contention. This can then be reverted once MerkleDB allows concurrent read access.
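A sketch of such a job processor, giving consensus work strict priority over RPC/gossip work instead of letting golang's scheduler arbitrate lock contention. All names are illustrative, not from the hypersdk codebase.

```go
package main

import "fmt"

type job struct {
	name string
	run  func()
}

type processor struct {
	consensus chan job // high priority
	other     chan job // low priority
}

// drain runs queued jobs, always preferring consensus jobs, until both
// queues are empty (a real processor would block instead of returning).
func (p *processor) drain() {
	for {
		// First, run any pending consensus job.
		select {
		case j := <-p.consensus:
			j.run()
			continue
		default:
		}
		// Only when no consensus job is pending, take lower-priority work.
		select {
		case j := <-p.consensus:
			j.run()
		case j := <-p.other:
			j.run()
		default:
			return
		}
	}
}

func main() {
	p := &processor{
		consensus: make(chan job, 8),
		other:     make(chan job, 8),
	}
	var order []string
	// RPC work is enqueued first, but consensus work still runs first.
	p.other <- job{"rpc-submit", func() { order = append(order, "rpc-submit") }}
	p.consensus <- job{"verify-block", func() { order = append(order, "verify-block") }}
	p.drain()
	fmt.Println(order) // [verify-block rpc-submit]
}
```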
https://go.dev/play/p/NQNOAkSxkw5
Inside of avalanchego, we pad the input (https://github.com/btcsuite/btcd/blob/902f797b0c4b3af3f7196d2f5d2343931d1b2bdf/btcutil/bech32/bech32.go#L400-L406) to ensure the bytes each encode 5 bits (required by the library we use), and because of this we receive an extra byte back when decoding. This is not explained correctly in the comments, and the variable we use to check the length against is misnamed.
If we do not do this padding, we can fail to parse bech32 strings with the following error:
unable to convert address from 8-bit to 5-bit formatting
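To illustrate why the padding matters, here is a simplified re-implementation of the 8-bit to 5-bit regrouping bech32 libraries perform (illustrative only, not the btcsuite code): 8*n bits only divide evenly into 5-bit groups for certain lengths, and without padding, leftover nonzero bits produce exactly this error.

```go
package main

import (
	"errors"
	"fmt"
)

// convertBits regroups data from fromBits-sized groups into toBits-sized
// groups, optionally zero-padding the final group.
func convertBits(data []byte, fromBits, toBits uint, pad bool) ([]byte, error) {
	var acc, bits uint
	maxv := uint(1)<<toBits - 1
	out := []byte{}
	for _, b := range data {
		acc = acc<<fromBits | uint(b)
		bits += fromBits
		for bits >= toBits {
			bits -= toBits
			out = append(out, byte((acc>>bits)&maxv))
		}
	}
	if pad {
		if bits > 0 {
			out = append(out, byte((acc<<(toBits-bits))&maxv))
		}
	} else if bits >= fromBits || (acc<<(toBits-bits))&maxv != 0 {
		return nil, errors.New("unable to convert address from 8-bit to 5-bit formatting")
	}
	return out, nil
}

func main() {
	// 1 byte = 8 bits: one 5-bit group plus 3 leftover bits.
	groups, err := convertBits([]byte{0xFF}, 8, 5, true)
	fmt.Println(groups, err) // padded: [31 28] <nil>

	_, err = convertBits([]byte{0xFF}, 8, 5, false)
	fmt.Println(err) // unpadded: the conversion fails
}
```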
This will allow us to increase the number of active accounts without increasing target TPS (better stress test of merkledb).
Workaround: #56
During load testing, I found that avalanchego uses a ton of memory storing parsed blocks (basically 2048 * 2 * <expected block size>):
Showing nodes accounting for 1.94GB, 97.98% of 1.98GB total
Dropped 288 nodes (cum <= 0.01GB)
Showing top 10 nodes out of 65
flat flat% sum% cum cum%
0.95GB 47.83% 47.83% 0.95GB 47.83% google.golang.org/protobuf/internal/impl.consumeBytesNoZero
0.82GB 41.52% 89.35% 0.82GB 41.55% github.com/ava-labs/avalanchego/vms/components/chain.(*State).ParseBlock
0.16GB 8.07% 97.42% 0.16GB 8.07% github.com/ava-labs/avalanchego/utils/wrappers.(*Packer).expand (inline)
0.01GB 0.56% 97.98% 0.01GB 0.56% github.com/syndtr/goleveldb/leveldb/util.(*BufferPool).Get
0 0% 97.98% 0.16GB 8.07% github.com/ava-labs/avalanchego/codec.(*manager).Marshal
0 0% 97.98% 0.16GB 8.07% github.com/ava-labs/avalanchego/codec/reflectcodec.(*genericCodec).MarshalInto
0 0% 97.98% 0.16GB 8.07% github.com/ava-labs/avalanchego/codec/reflectcodec.(*genericCodec).marshal
0 0% 97.98% 0.79GB 40.04% github.com/ava-labs/avalanchego/message.(*inMsgBuilder).Parse
0 0% 97.98% 0.79GB 40.04% github.com/ava-labs/avalanchego/message.(*msgBuilder).parseInbound
0 0% 97.98% 0.79GB 40.04% github.com/ava-labs/avalanchego/message.(*msgBuilder).unmarshal
This is very likely related to this: https://github.com/ava-labs/avalanchego/blob/7d73b59cb4838d304387ea680b9cc4053b72620c/vms/rpcchainvm/vm_client.go#L65-L70
const (
decidedCacheSize = 2048
missingCacheSize = 2048
unverifiedCacheSize = 2048
bytesToIDCacheSize = 2048
)
Other HyperVMs (like the IndexVM) would benefit from having a shared set of testing helpers to write integration/load/e2e tests.
Devnet deployment should be fully reproducible and OSS. We should aim to use avalanche-ops if possible.
Related: #59
When building (maybe only from source) on Linux, we should allow optimized BLS assembly code to be used. We disabled this previously because this optimized code panics on some architectures.
BLS verification is client-side and does not depend on how AvalancheGo is compiled: https://github.com/ava-labs/avalanchego/blob/b3a74687d5443e4dfd6fae61802752eee4c72f9b/vms/platformvm/warp/signature.go#L115-L131
Would be VERY cool if it had some animations like:
This could be used to either build some sort of smart-contract based HyperSDK or to build a rollup-optimized subnet that can process any WASM fraud proof.
Naïve fee pricing/usage targets were added during initial development. We should revisit these and add the updated values to the README.
To send blocks larger than 2MB, allow a block to include hashes for other chunks that can be downloaded over the P2P layer before verification (and after parse).
If all chunks can't be received, then the block will simply fail verification.
We should probably do this with some form of erasure coding.
While we could simply "send more blocks", this allows us to perform fewer merkle root calculations and consensus events.
We will need to be very thoughtful about introducing protection mechanisms so a malicious producer can't force us to fetch a bunch of useless data.
To invite new contributors and to maintain some guidelines, adding a Contributors Guide section to the README could be really helpful.