ava-labs / hypersdk

Opinionated Framework for Building Hyper-Scalable Blockchains on Avalanche

Home Page: https://hypersdk.xyz/

License: Other

Go 81.11% Shell 1.48% Rust 17.12% C 0.30%
avalanche blockchain golang

hypersdk's Issues

Add cost per state key read

Each transaction should have some sort of fee based on the number of state keys it requires.

We should not make this the entirety of the fee because it would not allow us to charge for interesting compute within an action.

Remove Unnecessary Memory Copies with Slice to Array Conversions

go1.17 added the ability to convert from slice to array/array pointer: https://tip.golang.org/ref/spec#Conversions_from_slice_to_array_or_array_pointer

However, the UX for this was super ugly until go1.20 (just released).

We should now migrate all our codec unmarshaling to do this instead of copying from the tx byte slice (which will dramatically reduce memory allocations in block processing). Thanks to @rafael-abuawad for calling my attention to this in #21.

I wish we had more memory allocation benchmarks so we could show the performance improvement here, but I don't think it makes sense to block this change on adding them. We can probably just add one quick benchmark for unmarshaling a dummy tx.

We MUST be careful to only cast exactly the bytes we need so we don't keep the entire backing array of the slice alive when we don't need to (even though we usually will because we hold onto the tx bytes): https://utcc.utoronto.ca/~cks/space/blog/programming/GoInteriorPointerGC
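As a minimal sketch of the migration described above (`unmarshalID` is a hypothetical helper, not an existing codec function), the go1.20 value-conversion syntax lets us produce a fixed-size array directly from the tx byte slice without a separate `make`+`copy`:

```go
package main

import "fmt"

// unmarshalID is a hypothetical example: extract a 32-byte ID from a tx byte
// slice. The [32]byte(...) conversion (go1.20) panics if the slice is shorter
// than 32 bytes, so the length must be checked first.
func unmarshalID(txBytes []byte) ([32]byte, bool) {
	if len(txBytes) < 32 {
		return [32]byte{}, false
	}
	// go1.20 syntax: converts the first 32 bytes directly.
	return [32]byte(txBytes[:32]), true
	// go1.17 equivalent (pointer form, keeps the backing array alive):
	//   id := *(*[32]byte)(txBytes[:32])
}

func main() {
	buf := make([]byte, 64)
	buf[0] = 0xaa
	id, ok := unmarshalID(buf)
	fmt.Println(ok, id[0] == 0xaa)
}
```

Note that the value form copies 32 bytes into the array (arrays are values), while the pointer form aliases the slice's backing array — which is exactly where the interior-pointer GC caveat above applies.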

[cpu] Test `zstd` compression

ava-labs/avalanchego#1278

During hypersdk load testing, I found that avalanchego spends the majority of time performing compression-related tasks (zstd should help a ton here):

Showing nodes accounting for 16600ms, 71.55% of 23200ms total
Dropped 486 nodes (cum <= 116ms)
Showing top 10 nodes out of 233
      flat  flat%   sum%        cum   cum%
    3040ms 13.10% 13.10%     7990ms 34.44%  compress/flate.(*compressor).deflate
    2500ms 10.78% 23.88%     3170ms 13.66%  compress/flate.(*decompressor).huffSym
    2160ms  9.31% 33.19%     2160ms  9.31%  runtime.memmove
    1820ms  7.84% 41.03%     1820ms  7.84%  crypto/sha256.block
    1730ms  7.46% 48.49%     1730ms  7.46%  runtime/internal/syscall.Syscall6
    1550ms  6.68% 55.17%     1670ms  7.20%  github.com/golang/snappy.encodeBlock
    1460ms  6.29% 61.47%     2290ms  9.87%  compress/flate.(*compressor).findMatch
     820ms  3.53% 65.00%      820ms  3.53%  compress/flate.matchLen (inline)
     760ms  3.28% 68.28%      760ms  3.28%  compress/flate.(*dictDecoder).writeByte
     760ms  3.28% 71.55%      760ms  3.28%  runtime.memclrNoHeapPointers

Allow `hypersdk` to arbitrarily penalize validator uptimes

If a validator does not sign an Avalanche Warp Message or does not send us certain data blobs (in the case of a storage-based Subnet), we should be able to penalize them by reducing their Uptime (which affects whether or not they will be rewarded).

Increase # of txs and # of accounts in load test

It would also be cool if there was an option to increase the number of accounts with each new transaction instead of sending to an existing account (which keeps the trie the same size during the test).

Create a simple tutorial on How To Use the HyperSDK 📖

Create a simple tutorial on how to use the SDK. I believe this would be the most useful thing. Maybe in written or video form.

Topics that the tutorial should cover:

  • Why? Why use the HyperSDK
  • How? How to use it
  • What? What is the reader/viewer going to build in the tutorial
  • Conclusion: How to expand on it.

[tokenvm] Add `spam` command to token-cli

To test the throughput on local/live networks, we'll need some sort of spam script.

To optimize throughput, we should ask for a # of accounts and then generate 1 tx per recipient per second. The max throughput per second will then be # accounts ^ 2.

We should additionally disregard any failures/avoid tracking tx success to maximize throughput.
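A minimal sketch of the proposed spam loop (`sendTransfer` is a hypothetical stand-in for the token-cli RPC call): every account sends one tx to every account each second, so the offered load is `# accounts ^ 2` txs per second, and failures are deliberately ignored.

```go
package main

import "fmt"

// sendTransfer is a hypothetical stand-in for issuing a transfer tx via the
// token-cli RPC client; errors are deliberately dropped to maximize load.
func sendTransfer(from, to int) {}

// spam has every account send one tx to every account each second, so the
// offered load is numAccounts^2 txs per second.
func spam(numAccounts, seconds int) int {
	issued := 0
	for s := 0; s < seconds; s++ {
		for from := 0; from < numAccounts; from++ {
			for to := 0; to < numAccounts; to++ {
				sendTransfer(from, to)
				issued++
			}
		}
		// a real implementation would pace here (e.g. time.Sleep)
	}
	return issued
}

func main() {
	fmt.Println(spam(10, 1)) // 100 txs offered in one second
}
```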

[memory] Make `tstate.Insert` more efficient and use a `sync.Pool` in `tokenvm.storage`

Allocated memory:

Showing nodes accounting for 19.24GB, 48.93% of 39.32GB total
Dropped 435 nodes (cum <= 0.20GB)
Showing top 10 nodes out of 188
      flat  flat%   sum%        cum   cum%
    2.88GB  7.33%  7.33%     2.88GB  7.33%  github.com/ava-labs/hypersdk/tstate.(*TState).Insert
    2.64GB  6.71% 14.04%     2.64GB  6.71%  github.com/ava-labs/hypersdk/examples/tokenvm/storage.PrefixBalanceKey (inline)
    2.52GB  6.40% 20.44%     2.52GB  6.40%  github.com/ava-labs/avalanchego/utils/wrappers.(*Packer).expand
    2.47GB  6.28% 26.72%     2.47GB  6.28%  github.com/ava-labs/avalanchego/x/merkledb.(*trieView).invalidateChildrenExcept
       2GB  5.09% 31.81%        2GB  5.09%  golang.org/x/exp/maps.Clone[...] (inline)
    1.42GB  3.60% 35.41%     1.42GB  3.60%  github.com/ava-labs/avalanchego/x/merkledb.newPath (inline)
    1.41GB  3.58% 39.00%     3.41GB  8.67%  github.com/ava-labs/avalanchego/x/merkledb.(*node).clone
    1.33GB  3.39% 42.39%     3.51GB  8.93%  github.com/ava-labs/avalanchego/x/merkledb.(*trieView).calculateNodeIDsHelper
    1.29GB  3.28% 45.67%     1.29GB  3.28%  google.golang.org/grpc.(*parser).recvMsg
    1.28GB  3.27% 48.93%     3.07GB  7.82%  github.com/ava-labs/hypersdk/chain.(*Processor).Prefetch.func1
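A sketch of the `sync.Pool` idea for `tokenvm.storage`, assuming a hypothetical `balanceKey` helper shaped like `PrefixBalanceKey` (prefix byte + 32-byte address); the pool lets hot paths reuse key buffers instead of allocating on every call:

```go
package main

import (
	"fmt"
	"sync"
)

const balancePrefix = 0x0

// keyPool reuses fixed-size key buffers. Storing *[]byte (not []byte) avoids
// an allocation when the slice header is placed back into the pool.
var keyPool = sync.Pool{
	New: func() any { b := make([]byte, 1+32); return &b },
}

// balanceKey is a hypothetical sketch of PrefixBalanceKey using the pool;
// callers must return the buffer with putKey when done with it.
func balanceKey(addr [32]byte) *[]byte {
	k := keyPool.Get().(*[]byte)
	(*k)[0] = balancePrefix
	copy((*k)[1:], addr[:])
	return k
}

func putKey(k *[]byte) { keyPool.Put(k) }

func main() {
	var addr [32]byte
	addr[0] = 0x7f
	k := balanceKey(addr)
	fmt.Println((*k)[0] == balancePrefix, (*k)[1] == 0x7f)
	putKey(k)
}
```

The caveat with pooling here is ownership: a pooled key must not be retained (e.g. inside `tstate`) after `putKey`, or a later caller will overwrite it.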

Handle case where state syncing is skipped

2023-03-21T06:07:52.3130418Z [node9-bls] [03-21|06:07:52.312] INFO <2EWQarceXeUfevv4ohDBBtgRbJjRYS54Lrrfr7w7jczbcZFT23 Chain> syncer/state_syncer.go:298 skipping state sync {"reason": "no acceptable summaries found"}
2023-03-21T06:07:52.3131808Z [node9-bls] [03-21|06:07:52.312] INFO <2EWQarceXeUfevv4ohDBBtgRbJjRYS54Lrrfr7w7jczbcZFT23 Chain> bootstrap/bootstrapper.go:123 starting bootstrapper
2023-03-21T06:07:52.3132818Z [node9-bls] [03-21|06:07:52.312] INFO <2EWQarceXeUfevv4ohDBBtgRbJjRYS54Lrrfr7w7jczbcZFT23 Chain> vm/vm.go:379 bootstrapping started
2023-03-21T06:07:52.3143658Z [node9-bls] [03-21|06:07:52.314] INFO <2EWQarceXeUfevv4ohDBBtgRbJjRYS54Lrrfr7w7jczbcZFT23 Chain> common/bootstrapper.go:244 bootstrapping started syncing {"numVerticesInFrontier": 1}
2023-03-21T06:07:52.3145168Z [node9-bls] [03-21|06:07:52.314] INFO <2EWQarceXeUfevv4ohDBBtgRbJjRYS54Lrrfr7w7jczbcZFT23 Chain> bootstrap/bootstrapper.go:553 executing blocks {"numPendingJobs": 0}
2023-03-21T06:07:52.3146378Z [node9-bls] [03-21|06:07:52.314] INFO <2EWQarceXeUfevv4ohDBBtgRbJjRYS54Lrrfr7w7jczbcZFT23 Chain> queue/jobs.go:224 executed operations {"numExecuted": 0}
2023-03-21T06:07:52.3148277Z [node9-bls] [03-21|06:07:52.314] INFO <2EWQarceXeUfevv4ohDBBtgRbJjRYS54Lrrfr7w7jczbcZFT23 Chain> snowman/transitive.go:409 consensus starting {"lastAcceptedBlock": "7cE3pDvTaEx3Gn75wU2ve6RqRbLzdm6jh3ooqpvVhRXnSpCWg"}
2023-03-21T06:07:52.3149778Z [node9-bls] [03-21|06:07:52.314] INFO <2EWQarceXeUfevv4ohDBBtgRbJjRYS54Lrrfr7w7jczbcZFT23 Chain> vm/vm.go:382 normal operation started

TODO

  • We must hard fail / retry in a loop / wipe the disk if the disk was never fully synced
  • If it was synced and we just bootstrap, then we should mark the VM as ready

Add basic stats tracking

Track total txs, total accounts, hourly txs, and hourly accounts for easy dashboard creation before full indexing can be done.

[Devnet Planning] Hardware Spec

Consider running different devnets at a variety of different hardware specs so devs can have a clear understanding of what different tradeoffs of CPU/RAM/DISKIO allow.

Acceptor Queue can Panic on Shutdown

The acceptor queue does not handle graceful shutdown properly (if we are processing a block when shutdown is triggered): https://github.com/ava-labs/indexvm/actions/runs/4226574719/jobs/7340155987.

panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x90 pc=0xbb5fbe]

goroutine 73 [running]:
github.com/ava-labs/hypersdk/chain.(*StatelessBlock).Processed(...)
	/home/runner/go/pkg/mod/github.com/ava-labs/hypersdk@<version>/chain/block.go:510
github.com/ava-labs/hypersdk/vm.(*VM).processAcceptedBlocks(0xc0001103c0)
	/home/runner/go/pkg/mod/github.com/ava-labs/hypersdk@<version>/vm/resolutions.go:132 +0x9e
created by github.com/ava-labs/hypersdk/vm.(*VM).Initialize
	/home/runner/go/pkg/mod/github.com/ava-labs/hypersdk@<version>/vm/vm.go:272 +0x218a

The simplest fix for this would just be to wait on the acceptorQueue in VM.Shutdown.
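A sketch of that fix under stated assumptions — `VM`, `acceptedQueue`, and the worker goroutine here are stripped-down stand-ins for the real hypersdk fields, not the actual implementation: close the queue on shutdown and wait for the worker to drain it, so no block is mid-processing when the rest of the VM tears down.

```go
package main

import (
	"fmt"
	"sync"
)

// VM is a hypothetical stand-in for the hypersdk VM involved in the panic.
type VM struct {
	acceptedQueue chan int
	processing    sync.WaitGroup
}

func NewVM() *VM {
	vm := &VM{acceptedQueue: make(chan int, 16)}
	vm.processing.Add(1)
	go func() {
		defer vm.processing.Done()
		for blk := range vm.acceptedQueue {
			_ = blk // process the accepted block
		}
	}()
	return vm
}

// Shutdown closes the queue and blocks until the worker has drained it.
func (vm *VM) Shutdown() {
	close(vm.acceptedQueue)
	vm.processing.Wait()
}

func main() {
	vm := NewVM()
	for i := 0; i < 8; i++ {
		vm.acceptedQueue <- i
	}
	vm.Shutdown()
	fmt.Println("clean shutdown")
}
```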

Output Load Test Change in PR

It would be super cool if PRs emitted how much the load test changed (assuming the disk sampling was similar on main).

Error loading workspace: You have opened a nested module.

This is probably not a big deal, but thought I'd mention it.

I am getting the following warning in VS Code:

Error loading workspace: You have opened a nested module. To work on multiple modules at once, 
please use a go.work file. See https://github.com/golang/tools/blob/master/gopls/doc/workspace.md 
for more information on using workspaces.

Ensure we ALWAYS prioritize consensus processes

Under load (lots of RPC tx submissions and gossip), our naïve locking patterns don't yield time to consensus-related processes as soon as possible.

We need to overhaul this design to ensure that no matter what is going on that consensus-related processes get time to execute (and acquire requisite locks) as soon as possible.

For now, the simplest approach would be a "job processor" where we regulate access to MerkleDB based on priority instead of relying on the Go runtime to manage lock contention. This can then be reverted once MerkleDB allows concurrent read access.
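One way to sketch that job processor (this is an illustrative pattern, not hypersdk code): check the consensus queue with a non-blocking select before ever taking other work, so pending consensus jobs always run first.

```go
package main

import "fmt"

// runJobs drains n jobs from two queues, always preferring consensus work
// when any is pending. A real processor would loop until shutdown; n keeps
// this sketch deterministic.
func runJobs(consensus, other chan func(), n int) {
	for i := 0; i < n; i++ {
		// Non-blocking check: pending consensus work always wins.
		select {
		case job := <-consensus:
			job()
			continue
		default:
		}
		// Otherwise take whichever job arrives next; consensus still wins
		// the non-blocking check on the following iteration.
		select {
		case job := <-consensus:
			job()
		case job := <-other:
			job()
		}
	}
}

func main() {
	consensus := make(chan func(), 2)
	other := make(chan func(), 1)
	var order []string
	consensus <- func() { order = append(order, "consensus") }
	other <- func() { order = append(order, "rpc") }
	consensus <- func() { order = append(order, "consensus") }
	runJobs(consensus, other, 3)
	fmt.Println(order) // [consensus consensus rpc]
}
```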

[crypto] Cleanup Bech32 Naming

https://go.dev/play/p/NQNOAkSxkw5

Inside of avalanchego we pad the input (https://github.com/btcsuite/btcd/blob/902f797b0c4b3af3f7196d2f5d2343931d1b2bdf/btcutil/bech32/bech32.go#L400-L406) so that each group encodes 5 bits (as required by the library we use), and because of this we receive an extra byte back when decoding. This is not explained correctly in the comments, and the variable we check the length against is misnamed.

If we do not do this padding, we can fail to parse bech32 strings with the following error:

unable to convert address from 8-bit to 5-bit formatting
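To make the padding behavior concrete, here is a self-contained sketch of the bit-regrouping helper in the style of the btcutil `ConvertBits` function the issue links to (written from the bech32 spec, not copied from avalanchego): with `pad=true` leftover bits are zero-padded into one extra final group — the "extra byte back" — while with `pad=false` leftover non-zero bits produce exactly the error above.

```go
package main

import (
	"errors"
	"fmt"
)

// convertBits regroups data from fromBits-sized groups into toBits-sized
// groups. pad=true zero-pads leftover bits into a final group (encoding);
// pad=false rejects leftover non-zero bits (decoding).
func convertBits(data []byte, fromBits, toBits uint, pad bool) ([]byte, error) {
	var acc, bits uint
	out := []byte{}
	maxv := uint(1<<toBits) - 1
	for _, b := range data {
		acc = acc<<fromBits | uint(b)
		bits += fromBits
		for bits >= toBits {
			bits -= toBits
			out = append(out, byte(acc>>bits&maxv))
		}
	}
	if pad {
		if bits > 0 {
			out = append(out, byte(acc<<(toBits-bits)&maxv))
		}
	} else if bits >= fromBits || acc<<(toBits-bits)&maxv != 0 {
		return nil, errors.New("unable to convert address from 8-bit to 5-bit formatting")
	}
	return out, nil
}

func main() {
	// 20 bytes = 160 bits = exactly 32 five-bit groups: no padding needed.
	got, _ := convertBits(make([]byte, 20), 8, 5, true)
	fmt.Println(len(got)) // 32
	// 32 bytes = 256 bits: 51 groups + 1 leftover bit, so padding adds a 52nd.
	got, _ = convertBits(make([]byte, 32), 8, 5, true)
	fmt.Println(len(got)) // 52
}
```

The misnamed length check in avalanchego presumably compares against the padded length (52 for a 32-byte input) while being named as if it were the unpadded one.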

Add storagevm example

  • everyone is expected to store all data
  • nodes randomly query each other for certain blobs and bench nodes that consistently fail to produce blobs within a timeout
  • lock up tokens to pay for the state you are using
  • only store blob hashes in state
  • don't become a validator until you have retrieved all data...otherwise you will get benched right away

[memory] Override default block cache config settings

During load testing, I found that avalanchego uses a ton of memory storing parsed blocks (basically 2048 * 2 * <expected block size>):

Showing nodes accounting for 1.94GB, 97.98% of 1.98GB total
Dropped 288 nodes (cum <= 0.01GB)
Showing top 10 nodes out of 65
      flat  flat%   sum%        cum   cum%
    0.95GB 47.83% 47.83%     0.95GB 47.83%  google.golang.org/protobuf/internal/impl.consumeBytesNoZero
    0.82GB 41.52% 89.35%     0.82GB 41.55%  github.com/ava-labs/avalanchego/vms/components/chain.(*State).ParseBlock
    0.16GB  8.07% 97.42%     0.16GB  8.07%  github.com/ava-labs/avalanchego/utils/wrappers.(*Packer).expand (inline)
    0.01GB  0.56% 97.98%     0.01GB  0.56%  github.com/syndtr/goleveldb/leveldb/util.(*BufferPool).Get
         0     0% 97.98%     0.16GB  8.07%  github.com/ava-labs/avalanchego/codec.(*manager).Marshal
         0     0% 97.98%     0.16GB  8.07%  github.com/ava-labs/avalanchego/codec/reflectcodec.(*genericCodec).MarshalInto
         0     0% 97.98%     0.16GB  8.07%  github.com/ava-labs/avalanchego/codec/reflectcodec.(*genericCodec).marshal
         0     0% 97.98%     0.79GB 40.04%  github.com/ava-labs/avalanchego/message.(*inMsgBuilder).Parse
         0     0% 97.98%     0.79GB 40.04%  github.com/ava-labs/avalanchego/message.(*msgBuilder).parseInbound
         0     0% 97.98%     0.79GB 40.04%  github.com/ava-labs/avalanchego/message.(*msgBuilder).unmarshal

This is very likely related to this: https://github.com/ava-labs/avalanchego/blob/7d73b59cb4838d304387ea680b9cc4053b72620c/vms/rpcchainvm/vm_client.go#L65-L70

const (
	decidedCacheSize    = 2048
	missingCacheSize    = 2048
	unverifiedCacheSize = 2048
	bytesToIDCacheSize  = 2048
)

[tokenvm] Overhaul Fees

Naïve fee pricing/usage targets were added during initial development. We should revisit these and add the updated values to the README.

Changes

  • Charge for keys specified in pre-fetch
  • Charge a fee if new state created

Add large block support

To send blocks larger than 2MB, allow a block to include hashes for other chunks that can be downloaded over the P2P layer before verification (and after parse).

If all chunks can't be received, then the block will simply fail verification.

We should probably do this with some form of erasure coding.

While we could simply "send more blocks", this approach allows us to perform fewer merkle root calculations and consensus events.

We will need to be very thoughtful about introducing protection mechanisms so a malicious producer can't force us to fetch a bunch of useless data.
