Problem Definition
There are two potential issues related to GOMAXPROCS.
1. BadgerDB can be faster with higher GOMAXPROCS
BadgerDB and possibly other 3rd-party packages may benefit from increased GOMAXPROCS. BadgerDB random reads on NVMe storage on a 4-CPU system (BadgerDB TPS ≠ Flow TPS):
- 57501.74 TPS using GOMAXPROCS=4
- 104680.17 TPS using GOMAXPROCS=64 <-- nice speedup, but how does it compare to 8, 16, 24, or 32?
- 105601.16 TPS using GOMAXPROCS=128 <-- likely not worth the extra overhead on other parts of the system
We don't want to maximize BadgerDB TPS at the cost of Flow TPS, so the best GOMAXPROCS for us is probably going to be lower than what is suggested by BadgerDB in their docs (currently 128 in their Quickstart Guide).
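To answer the "8, 16, 24, or 32?" question cheaply before touching any node config, a rough sweep along these lines could be run against a Badger directory restored from real data. This is only a sketch, not the real Flow benchmark: the directory path, key scheme, key count, and Badger version are assumptions, and it is still a micro-benchmark, so it only complements (not replaces) the Flow TPS comparison proposed below.

```go
// gomaxprocs_sweep.go -- rough sketch, not the real Flow benchmark.
// Assumes a Badger directory with keys of the form "key-%08d" already
// exists (e.g. populated from a mainnet snapshot); adjust the import
// version, path, and key scheme to whatever we actually use.
package main

import (
	"fmt"
	"math/rand"
	"runtime"
	"sync"
	"sync/atomic"
	"time"

	badger "github.com/dgraph-io/badger/v3"
)

const (
	dir      = "/mnt/data/badger" // assumption: path to a real data directory
	numKeys  = 1_000_000          // assumption: number of keys in the dataset
	duration = 30 * time.Second
	workers  = 256 // plenty of goroutines; GOMAXPROCS is the variable under test
)

// measure runs random reads for a fixed duration at the given GOMAXPROCS
// and returns reads/sec.
func measure(db *badger.DB, maxprocs int) float64 {
	runtime.GOMAXPROCS(maxprocs)

	var ops int64
	var wg sync.WaitGroup
	stop := time.Now().Add(duration)

	for w := 0; w < workers; w++ {
		wg.Add(1)
		go func(seed int64) {
			defer wg.Done()
			r := rand.New(rand.NewSource(seed))
			for time.Now().Before(stop) {
				key := []byte(fmt.Sprintf("key-%08d", r.Intn(numKeys)))
				_ = db.View(func(txn *badger.Txn) error {
					item, err := txn.Get(key)
					if err != nil {
						return err
					}
					return item.Value(func([]byte) error { return nil })
				})
				atomic.AddInt64(&ops, 1)
			}
		}(int64(w))
	}
	wg.Wait()
	return float64(ops) / duration.Seconds()
}

func main() {
	db, err := badger.Open(badger.DefaultOptions(dir))
	if err != nil {
		panic(err)
	}
	defer db.Close()

	for _, n := range []int{4, 8, 16, 24, 32, 64, 128} {
		fmt.Printf("GOMAXPROCS=%-3d  %.2f reads/sec\n", n, measure(db, n))
	}
}
```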
2. 🪲 Entire Go program potentially stalling on mmap
Some articles claim an entire Go program can stall if enough goroutines get stuck accessing cold data regions of mmaped files:
What happens if GOMAXPROCS goroutines concurrently access cold data regions in mmaped file? Complete stall of the whole program until the OS resolves major page faults caused by these goroutines!
I haven't verified the article's claim about program stalls, and the Go runtime documentation says:
The GOMAXPROCS variable limits the number of operating system threads that can execute user-level Go code simultaneously. There is no limit to the number of threads that can be blocked in system calls on behalf of Go code; those do not count against the GOMAXPROCS limit. This package's GOMAXPROCS function queries and changes the limit.
Note that the quoted docs talk about system calls; a major page fault on an mmaped region is not a system call, so the faulting thread presumably still counts against GOMAXPROCS while the OS services the fault, which would be how the stall the article describes could happen. There are 26 go.sum files in onflow (across various projects) with "mmap-go". I think BadgerDB uses mmap without depending on edsrzf/mmap-go, so the mmap-go entries probably come from Prometheus or other dependencies.
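I have not tried to reproduce the claim, but a probe along these lines could tell us whether it holds on our kernels and Go version. The file path, GOMAXPROCS value, goroutine counts, and thresholds are arbitrary assumptions for illustration, and the page cache needs to be cold (e.g. `echo 3 | sudo tee /proc/sys/vm/drop_caches` beforehand) for the reads to cause major faults:

```go
// mmap_stall_probe.go -- Linux-only sketch to probe the article's claim; unverified.
// Idea: a small GOMAXPROCS, several goroutines touching cold mmaped pages,
// and one pure-Go "heartbeat" goroutine. If the whole program stalls on
// major page faults, the heartbeat gaps should grow far beyond 10ms.
package main

import (
	"fmt"
	"math/rand"
	"os"
	"runtime"
	"syscall"
	"time"
)

func main() {
	runtime.GOMAXPROCS(2) // kept small on purpose, to make starvation easy to see

	f, err := os.Open("/mnt/data/large-file") // assumption: any large file on NVMe
	if err != nil {
		panic(err)
	}
	defer f.Close()

	fi, err := f.Stat()
	if err != nil {
		panic(err)
	}
	size := int(fi.Size())

	data, err := syscall.Mmap(int(f.Fd()), 0, size, syscall.PROT_READ, syscall.MAP_SHARED)
	if err != nil {
		panic(err)
	}
	defer syscall.Munmap(data)

	// Heartbeat: no I/O at all. Prints only when a scheduling gap looks suspicious.
	go func() {
		prev := time.Now()
		for {
			time.Sleep(10 * time.Millisecond)
			now := time.Now()
			if gap := now.Sub(prev); gap > 100*time.Millisecond {
				fmt.Printf("heartbeat gap: %v\n", gap)
			}
			prev = now
		}
	}()

	// Fault generators: touch random (hopefully cold) pages of the mapping.
	for i := 0; i < 8; i++ {
		go func(seed int64) {
			r := rand.New(rand.NewSource(seed))
			var sink byte
			for {
				sink ^= data[r.Intn(size)]
			}
		}(int64(i))
	}

	time.Sleep(60 * time.Second)
}
```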
Proposed Solution
- Determine if increasing GOMAXPROCS improves Flow TPS and other high-level metrics. By using high-level metrics, we can avoid the pitfalls of relying solely on micro-benchmarks and save time: we can compare Flow TPS after bumping up a single number before we explore more time-consuming changes related to this issue.
Rather than using a hard-coded GOMAXPROCS such as the 128 proposed by BadgerDB, we can use benchmark comparisons with real data from a mainnet snapshot to measure Flow TPS and other metrics.
Maybe on a 16-CPU server with non-shared NVMe storage, we can try GOMAXPROCS=32 and then adjust up or down depending on Flow TPS and other relevant metrics (see the startup sketch after this list). Unfortunately, BadgerDB ran their benchmarks on a 4-CPU system, and since the default on a 16-CPU server is already 16, we won't see the part of their gain that came from simply raising GOMAXPROCS up to 16.
- Determine if mmap-related stalls of entire Go programs are possible and whether that could occur in Flow (possibly starting from the probe sketch above). If the article's claims are confirmed to be valid with Go 1.16, then open a separate ticket to mitigate the risks.
In Go 1.9, GOMAXPROCS was capped at 1024; Go 1.10 removed that limit. If we look back far enough, we'll find ancient posts arguing about the risks of GOMAXPROCS being higher than 1 due to overhead. Let's keep an open mind and measure Flow TPS before debating overhead or proposing design changes to avoid that overhead.
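For the "bump a single number and re-measure" step, the change itself could be as small as the sketch below. The FLOW_GOMAXPROCS name and the logging are assumptions, not an existing flag; the Go runtime already honors a plain GOMAXPROCS environment variable, so a wrapper like this would mainly buy us logging and a place to clamp the value later.

```go
package main

import (
	"log"
	"os"
	"runtime"
	"strconv"
)

// applyGOMAXPROCS overrides GOMAXPROCS from an (assumed) FLOW_GOMAXPROCS
// environment variable, falling back to the Go default (runtime.NumCPU)
// when the variable is unset or invalid.
func applyGOMAXPROCS() {
	prev := runtime.GOMAXPROCS(0) // query current value without changing it
	if s := os.Getenv("FLOW_GOMAXPROCS"); s != "" {
		if n, err := strconv.Atoi(s); err == nil && n > 0 {
			runtime.GOMAXPROCS(n)
			log.Printf("GOMAXPROCS changed from %d to %d", prev, n)
			return
		}
		log.Printf("ignoring invalid FLOW_GOMAXPROCS=%q, keeping %d", s, prev)
	}
	log.Printf("GOMAXPROCS left at default %d (NumCPU=%d)", prev, runtime.NumCPU())
}

func main() {
	applyGOMAXPROCS()
	// ... start the node ...
}
```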
Definition of Done
More info
BadgerDB and GOMAXPROCS benchmarks on a 4-CPU machine
From: https://github.com/dgraph-io/badger-bench/blob/master/randread/maxprocs.txt
With fio
$ fio --name=randread --ioengine=libaio --iodepth=32 --rw=randread --bs=4k --direct=0 --size=2G --numjobs=16 --runtime=240 --group_reporting
Average: DEV tps rd_sec/s wr_sec/s avgrq-sz avgqu-sz await svctm %util
Average: xvda 0.09 4.57 0.71 59.20 0.00 0.00 0.00 0.00
Average: nvme0n1 118063.07 944503.71 0.86 8.00 12.75 0.11 0.01 100.36
With Go (default GOMAXPROCS, should be 4 because 4 core machine)
Average: DEV tps rd_sec/s wr_sec/s avgrq-sz avgqu-sz await svctm %util
Average: xvda 1.27 12.00 4.76 13.19 0.00 0.21 0.21 0.03
Average: nvme0n1 57501.74 548921.95 0.43 9.55 6.43 0.11 0.02 99.76
With Go, GOMAXPROCS=64
Average: DEV tps rd_sec/s wr_sec/s avgrq-sz avgqu-sz await svctm %util
Average: xvda 0.11 0.17 3.30 32.00 0.00 0.80 0.80 0.01
Average: nvme0n1 104680.17 981817.39 0.00 9.38 12.82 0.12 0.01 100.04
With Go, GOMAXPROCS=128
Average: DEV tps rd_sec/s wr_sec/s avgrq-sz avgqu-sz await svctm %util
Average: xvda 0.40 0.32 5.92 15.60 0.00 0.20 0.20 0.01
Average: nvme0n1 105601.16 989440.35 0.00 9.37 12.79 0.12 0.01 100.04
With GOMAXPROCS=32
Command being timed: "./randread --dir /mnt/data/fio --mode 1 --seconds 60"
User time (seconds): 23.34
System time (seconds): 100.91
Percent of CPU this job got: 207%
Elapsed (wall clock) time (h:mm:ss or m:ss): 1:00.00
Average shared text size (kbytes): 0
Average unshared data size (kbytes): 0
Average stack size (kbytes): 0
Average total size (kbytes): 0
Maximum resident set size (kbytes): 3820
Average resident set size (kbytes): 0
Major (requiring I/O) page faults: 9
Minor (reclaiming a frame) page faults: 416
Voluntary context switches: 2958129
Involuntary context switches: 2525
Swaps: 0
File system inputs: 59343840
File system outputs: 0
Socket messages sent: 0
Socket messages received: 0
Signals delivered: 0
Page size (bytes): 4096
Exit status: 0
With GOMAXPROCS=128
Command being timed: "./randread --dir /mnt/data/fio --mode 1 --seconds 60"
User time (seconds): 21.59
System time (seconds): 104.34
Percent of CPU this job got: 209%
Elapsed (wall clock) time (h:mm:ss or m:ss): 1:00.00
Average shared text size (kbytes): 0
Average unshared data size (kbytes): 0
Average stack size (kbytes): 0
Average total size (kbytes): 0
Maximum resident set size (kbytes): 3968
Average resident set size (kbytes): 0
Major (requiring I/O) page faults: 11
Minor (reclaiming a frame) page faults: 590
Voluntary context switches: 2956871
Involuntary context switches: 2591
Swaps: 0
File system inputs: 59264616
File system outputs: 0
Socket messages sent: 0
Socket messages received: 0
Signals delivered: 0
Page size (bytes): 4096
Exit status: 0
Caveats
I stumbled across this while helping with the design of the next storage architecture and haven't confirmed the data presented by third parties (such as the claim about mmap in Go programs).
I'm not the best person (at this time) to tackle this. If you're interested in tackling this issue, please feel free to assign yourself.
My apologies if the optimal settings for GOMAXPROCS on various nodes for common CPU counts were already investigated.
Updates
June 6, 2021 -- Update content to make it clearer that there are potentially two separate issues. Make it clearer that our goal is to optimize for Flow TPS (not just BadgerDB TPS). Mention Prometheus using mmap. Update title to replace "performance" with "Flow TPS".