Giter Site home page Giter Site logo

dusk-network / dusk-blockchain Goto Github PK

View Code? Open in Web Editor NEW
102.0 102.0 48.0 147.6 MB

Reference implementation of the DUSK Network node, written in Golang

License: MIT License

Makefile 0.63% Shell 1.22% Go 98.00% JavaScript 0.15%
blockchain cryptocurrency golang

dusk-blockchain's People

Contributors

autholykos avatar dependabot[bot] avatar devtomnl avatar fed-franz avatar fulviuus avatar ggoranov avatar goshawk-3 avatar hdauven avatar herr-seppia avatar ivensfernando avatar jules avatar kevaundray avatar t00ts avatar thomasmodeneis avatar threehook avatar ureeves avatar zer0 avatar

Stargazers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar

Watchers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar

dusk-blockchain's Issues

Use a concurrent strategy to speed up BLS Public Keys and Signature aggregation

Benchmarks conducted on dusk-network/dusk-crypto#14 show a substantial performance improvement when parallelizing the aggregation operations. Given the criticality of such aggregation when creating Agreement messages, it would be good to apply the concurrent strategy to minimize event production time and avoid slacking behind when in terms of steps (or even round) just because Agreement creation is slow.

Following function should serve as an inspiration for exploiting concurrency when aggregating Signature:

func aggregateSignatures(sigs []*Signature) *Signature {
    const W = 4
    C := len(sigs)

    M := C / W
    R := C % W

    if C < M {
        return aggregateSignaturesSequential(sigs)
    }

    var wg sync.WaitGroup
    var last *Signature

    // goroutines will fill this slice concurrently
    sgs := make([]*Signature, W) 

    wg.Add(W)
    // w workers for n couples
    for w := 0; w < W; w++ {
        // each goroutine will execute a sequential Aggregation of M signatures
        lo, hi := w*M, (w+1)*M 

        go func(i int, set []*Signature) {
            sgs[i] = aggregateSignaturesSequential(set)
            wg.Done()
        }(w, sigs[lo:hi])
    }

    // aggregating the remaining chunks not fitting in the worker count
    if R > 0 { 
        last = aggregateSignaturesSequential(sigs[C-R:])
    }

    wg.Wait() // waiting for all the goroutines to be done

    if last != nil {
        sgs = append(sgs, last)
    }

    return aggregateSignaturesSequential(sgs)
}

func aggregateSignaturesSequential(sigs []*Signature) *Signature {
    s, sigmas := sigs[0], sigs[1:]
    for _, sig := range sigmas {
        s = s.Aggregate(sig)
    }
    return s
}

Can't build node from source

So no running on Windows for now / ever?

And where's the compiled program as mentioned in the install guide?

Thanks...

Error fetching block from node db: database: block not found

After testing the node near height 13903 as described in issue #3, it is not possible to sync anymore because of a "block not found" error.

Jules on Telegram:

will be pushing a fix for this shortly - we did have to roll back a couple blocks on the blockchain in order to ensure smooth syncs for the updated nodes. there was a pretty nasty bug found with certificate discrepancies between nodes, identified last night and patched preliminarily on the other nodes. once the fix is pushed on the github, i'd advise to rebuild and re-sync from scratch

Chain committee is one round behind consensus committee during launch

Since a committee.Store is initialized on the chain tip height, and a consensus round is always one ahead of the tip, it can cause a discrepancy between the chain and the consensus committees.

For example, if the chain tip is at block 2000, then the round is 2001. Upon startup, consensus components run RemoveExpiredProvisioners with round 2001, so their committee is then one step ahead of the chain, who only calls this function when blocks are accepted.

As a result, this can cause the node to get stuck, since a forged block will have an invalid certificate according to the chain, even though it is correct from a consensus point of view.

Subtract 1 DUSK when a node is extracted with deterministic sortition

The current version of deterministic sortition does not decrease the available stake of a node one or more DUSK of which has been already extracted for the committee. The deterministic sortition procedure has to be upgraded to subtract 1 DUSK from the stake available during an instance of sortition procedure when a DUSK under the control of the node is extracted for the committee.

Refactor `encoding` package to increase efficiency

Currently there is room for improvement in passing by reference vs passing by value, especially in hotter parts of the code using this package (such as encoding and decoding events, transactions and big blocks).

Automate Soundness Testing

A test harness should allow us to automate E2E tests on top of fully configured and functioning testnet to avoid regressions, especially on race conditions. An overview of E2E test setup could include:

  1. Configuring network state
  2. Bootstrapping the network (dusk-node,blindBid,voucher, monitor services)
  3. Triggering actions of different kinds
    • send RPC requests
    • simulate wire events

Perform assertions to ensure correct state and timing are achieved by testing assertions that check:

  • persistent node state (blockchain and wallet stores)
  • non-persistent node state (mempool, eventBus, etc ..)
  • environment state (cpu, mem, hdd, network stats, etc)

Ideally, these testing could be run locally on demand but also on a regular base by CI implemented in #17.

Following pseudo-code should demonstrate the concept:

func TestSendTransaction() {

   duskNodeConfig := readConfig(basicRun-dusk-node.toml)
   
   // ImagePull  ...
   // ContainerCreate ...

    logPipe,dbPath,rpcAddr := cli.ContainerStart(duskNodeConfig)
    db :=  db.Open(dbPath)

   var expectedTx transactions.Tx
   rpcConn.SendRPC(rpcAddr, "SendRawTx", expectedTx)
 
  // Assert point: Ensure the transaction was accepted and stored
  for {
        attempts++
        time.Sleep( 10*time.Second)
        err := db.FetchBlockTxByHash(expectedTx.Hash)

         if err == database.NotFound {
             continue
         }

        if attempts > 6 {
            t.Fatal("expecting the transaction is accepted within 1 minute")
        }
    }
}

Remove global channels in RPCBus source file, and find another way to implement them

The RPCBus had it's channels implemented as global variables due to simplicity (according to @ggoranov). However, this approach can be quite unsafe, as there is no guarantee that the correct component is reading a request - even if the intent is to only have the channel strictly between two components, this is not mentioned anywhere (as far as I can tell, if I missed it please let me know), and in such a case it would still be better to return it from a function call or something similar, so to not make that particular channel available to the rest of the code since it shouldn't be shared outside of those components in the first place. Anybody not properly familiar with this code could easily make a mistake that way.

Consensus is not initialized when running the network from genesis block

When starting a local network from scratch (for testing purposes) it seems no nodes are running the consensus.

According to the current implementation, msg.InitializationTopic is published in case of topics.AcceptedBlock event only which is totally fine for updating the consensus module with the chain tip.

However, in case of starting the network from genesis block, wouldn't the network get stuck as no node is able to generate block at height 1?

@julesdesmit The below patch workarounds this but need your expertise here, if I'm missing something?

+++ b/pkg/core/consensus/initiator.go
@@ -35,6 +35,13 @@ func GetStartingRound(eventBroker wire.EventBroker, db database.DB, keys user.Ke
        // Start listening for accepted blocks, regardless of if we found stakes or not
        acceptedBlockChan, listener := InitAcceptedBlockUpdate(eventBroker)
 
+       if currentHeight == 0 {
+               roundBytes := make([]byte, 8)
+               binary.LittleEndian.PutUint64(roundBytes, 1)
+               eventBroker.Publish(msg.InitializationTopic, bytes.NewBuffer(roundBytes))
+               return nil
+       }
+

Consider exposing wallet commands as an API

Currently, the entire wallet interaction goes through CLI commands. A client application is not capable of creating a transaction, getting balance and so on.
An example: To enable test-harness, rpc package is calling cli.SendBidTx directly as a temporary patch.

Proposal:

Wallet commands can be moved to a separate package so that both CLI and RPC packages can use them via RPCBus.

Other option could be to deprecated CLI entirely and build RPC API only for interaction with the wallet.

Let me know your thoughts on that.
@julesdesmit @autholykos

Repopulating committees and bidlists takes a long time if blockchain is longer than 250,000 blocks

On testnet nodes, an incredibly long delay was observed anywhere between 2-5 minutes to re-initialize a node after shutting it down, when the blockchain was longer than 250k blocks. This is due to the fact that bidlists and committees and kept seperately between components, meaning that there will be a good handful of committee stores and bidlists re-populating on startup, and also because none of this information is persisted on disk, and so the node needs to walk up the blockchain many times when it restarts.

configure Travis CI

The current repo is missing Continuous Integration functionality.

We are going to integrate Travis CI into our toolchain.

Clarify the effects of `startprovisioner`

It is not so intuitive that you can execute startprovisioner without staking. According to the documentation it is necessary to stake.

"stake": Stake a given amount of DUSK, to allow participation as a provisioner in consensus. Make sure to use the sync command first.

What is actually happening when you run startprovisioner without staking?

Is it a deliberate choice to let users sync manually instead of syncing automatically? What is the risk of staking without syncing?

1 tDusk missing from wallet balance

After testing the blockgenerator, 1 tDusk is missing from my wallet balance.

$ ./testnet 
initializing node...
initialization complete. opening console...
> createwallet [mypassword]
Wallet created successfully!
Public Address: [myaddress]
> sync
Syncing wallet... (13791/13791)
Found 0 spends and 1 receives
> balance
Balance: 10.00000000
> stake 8 100 [mypassword]
hash: [somehash]
> sync                
Syncing wallet... (13804/13804)
Found 0 spends and 1 receives
> startprovisioner
provisioner module started
to more accurately follow the progression of consensus, use the showlogs command
> sync
Syncing wallet... (13807/13807)
Found 0 spends and 0 receives
> sync
Syncing wallet... (13808/13808)
Found 0 spends and 0 receives
> bid 2 100 [mypassword]
hash: [txid]
> startblockgenerator [txid]



balance
              

stop

After waiting a few hours I exited the console with CTRL + C because the console would not return control (see #2). Coincidentally the test monitor was stuck at height 13903.

I restarted my node to check whether I could reproduce the error. Unfortunately I did not manage to sync, but I also noticed 1 tDUSK is missing.

$ ./testnet 
initializing node...
initialization complete. opening console...
> balance
please load a wallet before trying to check balance
> loadwallet [mypassword]
error attempting to load wallet: cipher: message authentication failed
> sync
please load a wallet before trying to sync
> loadwallet [mypassword]
Wallet loaded successfully!
Public Address: [myaddress]
> balance
Balance: 9.00000000
> balance
Balance: 9.00000000
> sync    
Syncing wallet... (13903/13903)
Found 0 spends and 1 receives
> balance
Balance: 9.00000000

Repopulating committees and bidlists takes a long time if blockchain is longer than 250,000 blocks

On testnet nodes, an incredibly long delay was observed anywhere between 2-5 minutes to re-initialize a node after shutting it down, when the blockchain was longer than 250k blocks. This is due to the fact that bidlists and committees and kept seperately between components, meaning that there will be a good handful of committee stores and bidlists re-populating on startup, and also because none of this information is persisted on disk, and so the node needs to walk up the blockchain many times when it restarts.

Refactor eventfilter and accumulator packages defensively

Current way the accumulator and eventfilter prevent race conditions is by making heavy used of state management in the accumulator. It is possible to refactor the packages defensively to keep the same resilience against race conditions but without excessive locking.

Automate startup of node, with regards to consensus

With both the provisioner component and the generation component not being dependent on a specific hash, and with both now able to update their values when a tx expires, we can make the node even more easy to use by automating the syncing, staking, bidding and starting of consensus components without any need for user input.

Some handy added features would be specifying a --wallet flag to start the node without consensus running, and adding some commands that can adjust default locktimes and values for automated transactions.

This enhancement should remove the need for node operators to consistently check if their transactions have expired and from sending new ones in due time by themselves, which is much less of a drag. This replaces #13, since a user can no longer accidentally start consensus components on the wrong round this way.

Refactor event bus handlers

With the addition of the new ring buffer, the Handler interface now contains behaviour which is only needed for one specific implementation. This can probably be refactored to be a bit more accomodating.

Agreement step counter does not increment on failed reduction cycle

This is a regression introduced by #63, since the collector will stop short any attempt to run sendAgreement if there is no vote set included in the message. The problem is that it should, to stay in sync with the reduction events it might receive, and so to create a proper committee for these events.

A fix will be pushed shortly.

Tx inputs are not added

Since the tx structures were unified, the StandardTx method was built to emulate the one from the old model, where it returns the Standard struct as a value. This will cause problems when creating transactions, as inputs are not saved on the correct Standard struct, and leaving the actual transaction bare.

Remove `crypto` package

The crypto package is still left in the repo, carrying only an unfinished README document. It should be deleted, as we are now importing it from an external repo.

Timer does not exceed 20 seconds

Currently, in the IncreaseTimeout function for the consensus.Timer, t.timeOut is evaluated against t.baseTimeOut*2^10. Golang reads the ^ symbol as a bitwise XOR, and not a power. This should be changed, as the timer needs to keep increasing until it has doubled 10 times.

Proof generator should have method to update values, instead of re-instantiating it when it expires

When a bid expires, we currently create a new proof generator as it is stored on the broker as an interface value, with no method to update the D value. However, this can cause quite some slowdown on the node, as it would have to reconstruct the bidlist, which can take (as observed on testnet) up to a few minutes. A much more efficient way to do this would be just to add a method to the Generator interface to update the D value directly, and make use of a boolean flag as mentioned by @ggoranov.

Pass buffers in the Event Bus by value

Currently, to avoid sharing the same pointer, we copy the buffer being sent over the event bus for each handler in any given map of handlers. We can instead pass these buffers by value. This would remove the need for the copyBuffer function, and most likely make the dissemination of event buffers much more efficient.

Fix race condition between round updates and agreement event creation

As observed on the testnet, a possibility exists for the node to unintentionally offset it's agreement component step counter due to the agreement.broker.Listen goroutine being entirely seperate from the invocation to agreement.broker.sendAgreement (which happens as a callback from the reduction component). As aggregating the reduction events into an agreement event takes some amount of time (I believe it's a few hundred ms), there's an opening for the round update to happen in the midst of this. After the aggregation is complete, and the state is updated, the deferred call to broker.state.IncrementStep is executed, and the node then starts off that round with an agreement step counter that is ahead by one.

As a result, any events that are created next, will be invalid. This happens because the committee bitset that gets generated is dependent on the broker.state.Step, meaning that for reduction events created on step 1 and 2, the agreement event contains a bitset corresponding to these events in the committees from step 3 and 4. This causes an inconsistency in the network and, if enough nodes are affected on the same round, it can halt consensus entirely since too many nodes will not be able to generate proper agreement events, regardless of how many times they try.

This bug was reproduced in a test, which will be included in the PR to fix this. I believe a simple fix would just be to invoke the sendAgreement function through a channel, instead of a callback, which would synchronize it with the occurrence of round updates, thus keeping the step counter in it's correct place at any time.

Help nodes falling-behind to catch up

Due to some unexplained issue, some nodes during deployment can fall behind and never sync up with the rest of their peers. My hunch is that it is due to some nodes not accepting every block after finishing consensus, which ends up in a gap in their blockchain, and then when someone asks to sync with them, they only supply about half the blocks and leave the requesting node hanging. Need to do more testing to confirm this.

To quote @toghrulmaharramov

we have to work on formalizing a bootstrapping model for the nodes that have fallen behind for less than n blocks. Unfortunately, due to the complexity of Reduction phase, which muddles up the bootstrapping process, the catch-up process has to be made more complex than on rival protocols. I have a few ideas on how the problem could be elegantly curtailed

Simplify `Reducer` algorithm by eliminating unnecessary asynchronous processing

This is somewhat related to #21 albeit more focused on a general simplification of the Reducer structure.

The Reducer component is supposed to perform operations that are quite sequential in nature. A (grossly oversimplified) layout of its duty is the following loop:

  • wait for a quorum of (valid) reduction events to be reached or a timeout to be triggered
  • validate the winning candidate block
  • propagate the 1st step reduction
  • increment the step
  • wait for a quorum of (valid) reduction events to be reached (or a timeout to be triggered)
  • propagate the result

Such a layout might be better modeled with 2 separate sync.WaitGroup instead of overusing channels.

Fix intermittent failure of BLS keypair generation

When calling user.NewRandKeys or user.NewKeysFromBytes, if the rand.Reader, or any other seed, runs out of bytes before being able to generate a BLS keypair, it returns an EOF error. This can be easily mitigated by adding some logic to retry with a new seed in this case. Additionally, this should take care of an intermittently failing test.

CLI wallet does not unlock after bid expiration

Example with bid of 2 and locktime of 100 blocks.

> loadwallet [password]
Wallet loaded sucessfully!
> sync
Syncing wallet.. (138088/13808)
> bid 2 100 [password]
hash: [txid]
>startblockgenerator [txid]


Immediately after the last command, the CLI does not respond to input.

After the locktime of 100 blocks has passed, the CLI is expected to return control, but remains non-interactive.

Confusing description of locktime in README.md

In README.md locktime is confusingly described as an amount of blocks that must exceed the block height. That means locktime must be at least as long as the age of the chain, which seems excessive and impractical.

[locktime] stands for the amount of blocks for which the bid is locked (block_height < locktime < 250000)

Additionally the maximum locktime is 250,000 blocks, which at the current average blocktime of 7 seconds is 20 days. After 20 days the block height will pass 250000, at which point there is no value for locktime which would satisfy block_height < locktime < 250000.

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.