paritytech / substrate
8.4K stars, 477 watchers, 2.6K forks, 265.63 MB

Substrate: The platform for blockchain innovators

License: Apache License 2.0

Languages: Rust 98.40%, WebAssembly 1.28%, Shell 0.19%, Handlebars 0.08%, Dockerfile 0.02%, Python 0.01%, Nix 0.01%, EJS 0.01%, JavaScript 0.01%
Topics: parity, polkadot, blockchain, substrate, client, node

substrate's Introduction

Dear contributors and users,

We would like to inform you that we have recently made significant changes to our repository structure. In order to streamline our development process and foster better contributions, we have merged three separate repositories, Cumulus, Substrate, and Polkadot, into a single new repository: the Polkadot SDK. Go ahead and make sure to support us by giving a star ⭐️ to the new repo.

By consolidating our codebase, we aim to enhance collaboration and provide a more efficient platform for future development.

If you currently have an open pull request in any of the merged repositories, we kindly request that you resubmit your PR in the new repository. This will ensure that your contributions are considered within the updated context and enable us to review and merge them more effectively.

We appreciate your understanding and ongoing support throughout this transition. Should you have any questions or require further assistance, please don't hesitate to reach out to us.

Best Regards,

Parity Technologies

substrate's People

Contributors

andresilva, arkpar, athei, bkchr, cecton, cheme, davxy, dependabot[bot], expenses, gavofyork, ggwpez, gilescope, gnunicorn, kianenigma, koushiro, koute, marcio-diaz, michalkucharczyk, mxinden, nikvolf, pepyakin, rphmeier, shawntabrizi, sorpaas, svyatonik, thiolliere, tomaka, tomusdrw, tripleight, xlc


substrate's Issues

Governance: Delayed enactments

Proposals should generally suffer some delay after the vote is finalised before they get enacted in order to allow for any tokens to change hands. This delay should either be at the council's discretion, according to the level of contention it generates on the council or, according to the level of contention it generates on the network. For the most contentious motions that get passed, a sufficiently long period should be left before enactment in order that stakers are able to disengage and sell funds.

Refactor consensus code into polkadot/substrate components

The BFT code just needs a couple of things:

  • Messages in
  • Messages out
  • Sign message with local key
  • Round proposer
  • Round timeouts
  • Proposal generation function
  • Proposal evaluation function

The proposal generation and evaluation functions should encapsulate the behavior of the specific substrate chain. This can be packaged up into its own trait.

We can package this up in a substrate-bft crate.
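The trait boundary described above might be sketched as follows. This is purely illustrative: the trait and method names (`Proposer`, `propose`, `evaluate`) are assumptions, not the eventual substrate-bft API, and the toy implementation exists only to show the shape.

```rust
// Hypothetical sketch: chain-specific behaviour (proposal generation and
// evaluation) packaged behind one trait that a generic substrate-bft crate
// could drive. All names are illustrative.
pub trait Proposer {
    type Proposal;
    type Error;

    /// Generate a proposal for the current round.
    fn propose(&self) -> Result<Self::Proposal, Self::Error>;

    /// Evaluate a proposal received from the round proposer.
    fn evaluate(&self, proposal: &Self::Proposal) -> Result<bool, Self::Error>;
}

// A toy implementation: proposals are u64 values, and only even ones are valid.
pub struct EvenProposer;

impl Proposer for EvenProposer {
    type Proposal = u64;
    type Error = ();

    fn propose(&self) -> Result<u64, ()> {
        Ok(42)
    }

    fn evaluate(&self, proposal: &u64) -> Result<bool, ()> {
        Ok(proposal % 2 == 0)
    }
}
```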

polkadot-statement-table crate:
The details of the proposal generation/evaluation function are what encapsulates the current statement table logic. This needs

  • Misbehavior type
  • Import of incoming statements
  • Incoming statements to trigger fetch of candidate data for evaluation of availability
  • Signing statements
  • Creation of batches of outgoing messages (to be done on a timer or by some other heuristic)

polkadot-consensus crate:

  • service built on top of the substrate-network
  • maintains connections to authorities
  • combined router for substrate-bft and polkadot-statement-table messages
  • manages local consensus identity and signing of messages
  • collates local parachain candidate
  • fetches and evaluates other candidates as necessary
  • creates full substrate block from statement table and transaction queue
  • accumulates misbehavior to be evaluated on-chain

Protocol: Light-client friendly storage tracking

Use-cases of a node from an external application's point of view fall into two categories: inspection and notification. Light clients, which sync using only the chain of headers and do not generally validate extrinsic data within the block, must have special considerations to ensure they are able to provide both use-cases efficiently.

Inspection (of storage or the chain) is pretty easily provided following the design of Ethereum and Bitcoin before it: full nodes (or even light-nodes that already have the requisite information) may provide proofs on the value of a particular key in storage using only the storage's Merkle-trie root as a priori assumption (which is provided for through a header-sync).

However, notification is more difficult to provide as a low-trust service since proof that a given block does not have an extrinsic which causes a state-change of interest is not efficiently derivable from the storage's Merkle-trie roots. In Ethereum this was addressed through collating a large (2KB) Bloom filter in each block and embedding it in the header. This bloated the header and was ultimately fruitless as the usage of Ethereum ballooned and the Bloom became saturated.

Instead, I propose three mechanisms for addressing state-change notification on Substrate, two of which (the latter two) it makes sense to combine:

  • Track last-modification-time entry in storage;
  • Provide a Merkle trie root of all modified storage entries, ordered and indexed;
  • Provide a hierarchy of Merkle trie roots of all modified storage entries in a series of blocks, ordered and indexed.

Track last-modification-time entry in storage

At present, the storage database is a set of key-value pairs. These pairs are arranged into a Merkle trie and baked into a single "root" hash. This proposal would simply prefix the value with the block number at which the value was last modified.

Synced light-clients could easily query proof-servers on when a storage item of interest last changed. Proof-servers could prove the most recent block (compared to either the head or some block before the head that the light-client knows about) at which the change happened. Light-clients could request a change-log between some begin and end block of one or more storage keys and the proof-server would return a chain of these proofs as irrefutable evidence of all blocks in which one or a number of storage entries changed.

There is one issue with this approach: deleted storage entries would still have a footprint in the database, necessary for recording the block at which it was "last modified" (i.e. deleted) - without this light-clients would lose their ability to query for its historical change log. The (slightly inelegant) workaround to this would be to have special "garbage-collection" blocks in which these zombie entries would be purged from the database (and thus the trie). Light-clients would ensure that they always made at least one change-log request within each of these periods.

This would increase every storage entry in the database by around 32 bytes (for the block number). It wouldn't have much of an effect on disk i/o and the header size would remain the same. However, for storage with a lot of changes, building and executing these garbage-collection blocks may become a serious efficiency issue.
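The last-modification-time scheme can be modelled in a few lines. This is a sketch under assumed names and types (a plain map standing in for the Merkle trie): each value carries the block at which it last changed, deletions leave a provable "zombie" entry, and a garbage-collection pass purges the zombies.

```rust
use std::collections::HashMap;

// Illustrative model: key -> (last_modified_block, value). A deleted key keeps
// a zombie entry (value = None) so its "last modified" block remains provable
// until a garbage-collection block purges it.
#[derive(Default)]
pub struct TrackedStorage {
    entries: HashMap<Vec<u8>, (u64, Option<Vec<u8>>)>,
}

impl TrackedStorage {
    pub fn set(&mut self, block: u64, key: &[u8], value: Vec<u8>) {
        self.entries.insert(key.to_vec(), (block, Some(value)));
    }

    pub fn delete(&mut self, block: u64, key: &[u8]) {
        // Record the deletion block instead of removing the entry outright.
        self.entries.insert(key.to_vec(), (block, None));
    }

    /// The block at which `key` last changed, if the entry still exists.
    pub fn last_modified(&self, key: &[u8]) -> Option<u64> {
        self.entries.get(key).map(|(b, _)| *b)
    }

    /// The garbage-collection pass: purge zombie (deleted) entries.
    pub fn collect_garbage(&mut self) {
        self.entries.retain(|_, (_, v)| v.is_some());
    }
}
```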

Ordered, indexed, Merklised per-block change-trie

This proposal creates a new structure that encodes all changes, not dissimilar in spirit to the Bloom filter. This structure takes the form of a trie root built as the mapping of indices to storage keys. The indices are sequential and ordered by storage key. Like the Bloom filter, this gives a cryptographic digest of what has changed in the block. Unlike the Bloom filter, a proof that any given key has not changed is not only possible but also compact.

To prove that a given key didn't change, the proof-server provides the two (Index, Key) entries on either side of the key being proven unchanged. (A sentinel entry of (ChangedKeyCount, null) would denote the upper end, in order to provide a proof should the queried key be greater than the largest modified key.)

In principle, this trie could also contain a second mapping of (Key, [ ExtrinsicIndex_1, ExtrinsicIndex_2, ... ]) to denote which extrinsic data in the block actually caused the key to be changed.
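The bracketing check behind the non-membership proof can be sketched directly. This is an illustration, not the real proof format: entry representation and names are assumptions, and only the upper-end sentinel from the description is modelled.

```rust
// A proof that `key` did not change in a block consists of the two adjacent
// (index, key) entries that bracket it in the ordered change list; a sentinel
// (changed_key_count, None) stands in for the upper end.
type Entry = (u32, Option<Vec<u8>>);

/// Check that (`left`, `right`) prove `key` absent from the ordered change set.
fn proves_unchanged(left: &Entry, right: &Entry, key: &[u8]) -> bool {
    // The two entries must be adjacent by index...
    if right.0 != left.0 + 1 {
        return false;
    }
    // ...and must bracket the queried key in storage-key order.
    let below = match &left.1 {
        Some(k) => k.as_slice() < key,
        None => false, // the left side may not be the sentinel
    };
    let above = match &right.1 {
        Some(k) => key < k.as_slice(),
        None => true, // sentinel: key lies beyond the last modified key
    };
    below && above
}
```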

Extending over ranges

While this allows for efficient proofs that one or more keys were not changed (or, if they were changed, can give the specific extrinsics which caused the change) in any given block, use-cases typically want to ascertain this for a range of blocks.

This structure, however, lends itself to a hierarchical approach: every N blocks, the trie would contain an additional entry ('digest', DigestChangeTrieRoot). DigestChangeTrieRoot would be the root of a similar trie structure, except that it would contain the accumulated modified keys of the previous N blocks. Rather than containing the series of ExtrinsicIndexes that caused the change of any given key, it would contain the block numbers in which the change happened, allowing for the efficient identification of the exact extrinsic through a logarithmic number of queries/proofs.

This structure can be nested and recursed arbitrarily; N might reasonably be 16, 32, or 256 and be recursed 4, 3, or 2 times accordingly, giving a maximum block range covered by the top-level trie of 32768 or 65536.

Light clients would query proof-servers in batches, hopping over the blocks one top-level range at a time. Proof-servers would either return proofs that nothing of interest changed, or return the sub-ranges where something did change (along with the key that changed there). Light clients would then re-query within that sub-range, drilling down until they determined the exact set of extrinsics. In principle, this entire query could be prepared on the server side and a compiled proof of everything built and sent back to the light client with minimal bandwidth/latency used.
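The drill-down described above can be simulated with a toy model: a map of block number to changed keys stands in for the digest hierarchy, a digest over a range "matches" iff some block in it changed the key, and the query counter shows the logarithmic cost. All names and the counting scheme are assumptions for illustration.

```rust
use std::collections::BTreeMap;

// Toy model of the hierarchical query. Light clients recurse into the N
// sub-ranges of a digest only when it matches, so unchanged spans cost one
// query each.
fn blocks_changing(
    changes: &BTreeMap<u64, Vec<Vec<u8>>>,
    key: &[u8],
    start: u64,
    len: u64,
    n: u64,
    queries: &mut u64,
) -> Vec<u64> {
    *queries += 1; // one proof request per digest consulted
    let range_has = changes
        .range(start..start + len)
        .any(|(_, keys)| keys.iter().any(|k| k.as_slice() == key));
    if !range_has {
        return Vec::new(); // proof that nothing of interest changed here
    }
    if len == 1 {
        return vec![start];
    }
    // Recurse into the N sub-ranges covered by this digest.
    let sub = len / n;
    let mut found = Vec::new();
    for i in 0..n {
        found.extend(blocks_changing(changes, key, start + i * sub, sub, n, queries));
    }
    found
}
```

With N = 4 over 16 blocks and two changes, the model needs 13 digest queries rather than 16 per-block ones; the gap widens quickly with range size.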

Client: Structured logging

Eventual aim: a highly detailed version of ethstats.net with clients (optionally!) directly contacting a central web server to provide it with real-time information that is collated and served to web pages.

This is a two-part project; one part is fitting the appropriate logging into the client in order to connect and stream JSON information on the client's operational statistics to a server. The second part is writing such a web app; the server part of the app would receive and collate this information in real time from many polkadot clients and then distribute the resulting information to web-browsers for display.

This issue only describes the first part.

Implementation

slog can provide the structured logging API. This should be combined with a lazy_static and a simple macro in order to get a global logging macro, much like trace! from the log crate except that it accepts key/value pairs rather than a formatted string.

This macro should be used throughout the client for all key events (block arrived from network/queued/validated/imported, transaction(s) submitted/arrived/mined, peer connected/disconnected, ...).

The output of the structured log should be directed to a JSON encoder and then sent via a websockets connection to a server (address/port configurable via CLI params, e.g. polkadot --stats-server=ws://stats.polkadot.io). On opening the websockets connection, an initial dump of the node's state should be made (current chain head number/hash, peers, transactions in the pool).
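The kind of event line this would produce can be sketched with a std-only macro. In practice slog plus a JSON drain would do this properly; the macro, event names, and field layout below are illustrative only.

```rust
// Hypothetical sketch: key/value pairs serialised to a JSON line, ready to be
// pushed down a websocket. Values are stringified for simplicity; a real
// encoder would preserve JSON types.
macro_rules! stat {
    ($event:expr, $( $key:ident = $val:expr ),* ) => {{
        let mut out = format!("{{\"event\":\"{}\"", $event);
        $(
            out.push_str(&format!(",\"{}\":\"{}\"", stringify!($key), $val));
        )*
        out.push('}');
        out
    }};
}
```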

Refactor Transaction through runtime & primitives

The primitives crate is a dependency of the runtime crate. Yet the Transaction type defined in primitives semantically depends on the contents of runtime, since it expresses all callable endpoints within runtime in a strongly typed manner.

This combination has a number of problematic side effects:

  • When new endpoints are added, they are added not just to their native crate, but also to one of its dependencies, which makes no sense.
  • It's not enough to publicly export a type from a runtime module for it to be exposed through a callable endpoint. You need to actually move that type up into the primitives crate. If the type has impl logic, then that must be moved too (even if it's highly specialised and not very "primitive"), or some other acrobatics used to circumvent this.

Furthermore, requiring a strongly-typed dispatch at all implies another facepalm: some endpoints themselves proxy a further dispatchable "proposal", causing a self-reference that means a bare type in the enum cannot be used and further allocations are needed. i.e.

enum Proposal {
    ...
    StartPublicReferendum(Proposal, Format),
    ...
}

must become StartPublicReferendum(Box<Proposal>, Format),.
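The boxed form does compile, since the indirection gives the enum a known size. A minimal sketch, with `Format` stubbed and a hypothetical `SetCode` variant standing in for the elided ones:

```rust
// The self-referential variant from above, made representable by boxing the
// inner `Proposal`. `Format` and `SetCode` are illustrative stand-ins.
#[derive(Debug, PartialEq)]
pub enum Format {
    Binary,
}

#[derive(Debug, PartialEq)]
pub enum Proposal {
    SetCode(Vec<u8>),
    StartPublicReferendum(Box<Proposal>, Format),
}
```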

Aside from these specific side-effects, which cause pain right now, the general ramification of leaving this unfixed is a tendency towards spaghetti references and monolithic code.

There are three ways of going about this that I can see:

1a. Move Transaction (and all that depend on it, like Block) to runtime. This would keep the current types as they are, but leave primitives to be just the super-low-level types and runtime to be the crate to be imported if high-level typing was needed.
1b. Move Transaction (and all that depend on it, like Block) to some other module (e.g. highlevel). This would keep the current types as they are, but leave primitives to be just the super-low-level types. highlevel would depend on runtime and be the crate to be imported.
2. Avoid making Transaction typed around any runtime-dependent information. Transaction would be more like in Ethereum where the dispatch element is just a byte blob to be interpreted at (or just before) the time of dispatch, not when the transaction is being initially deserialised. This fixes all problems including the Proposal-within-a-Proposal issue.

My preference is for option 2, moving away from this attempt to bake the dispatch logic into the type system, which seems to be forcing such problems on us. Aside from the great view from the ivory tower, I see no great need to represent the dispatch data under strong types prior to the time of dispatch.

CC @rphmeier

For Polkadot: WASM-based smart-contract parachain

Can link in relevant code from https://github.com/paritytech/parity

BlockData:

  • 256 recent headers
  • parachain transactions
  • state trie proof

Validation function:

  • check header validity (mostly just timestamp)
  • apply ingress prior to transactions
  • apply transactions

Collator:

  • create valid header
  • apply ingress
  • push transactions from the queue until out of gas or out of transactions.
  • best choice of gas amount?

Genesis block

Need a genesis block. Should include initial set of validators/session keys, together with the initial code, compiled from the runtime.

Runtime: Avoid panics in apply_extrinsic

NOTE: This is specific to the Polkadot/Demo implementations, NOT Substrate.

The native runtime should be able to judge the validity of including an extrinsic based only on the information in the extrinsic and the basic balance/index information of the sender account. This makes the implementation of the tx queue more straightforward and efficient, and also helps make clear arguments against DoS vectors.

Basically, the only conditions upon which apply_extrinsic may panic are:

  • the free balance of the sender account is less than cost_xt_basic + cost_xt_byte * xt.encode().len(); or
  • the index of the sender account is not equal to xt.index.

If these "panic" conditions are not met then apply_extrinsic must never panic. To panic thereafter would cause a DoS vector for the miner at best, and will cause the miner to create invalid blocks at worst.

As soon as it is determined that apply_extrinsic will not panic, the balance should be reduced by the fee and the sender index incremented. Any further "higher-level" criteria that are not met (and would thus cause a panic in the current code) should be reworked to ensure they return instead without changing any storage items (except, of course, the balance reduction and index increment).
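The rule can be sketched as a pre-dispatch check. The account model and cost constants below are assumptions for illustration, not the Polkadot/Demo implementation: only the fee and index checks may reject the extrinsic, and once they pass, the fee is taken and the index bumped before anything else runs.

```rust
// Illustrative cost constants (not real values).
const COST_XT_BASIC: u64 = 10;
const COST_XT_BYTE: u64 = 1;

pub struct Account {
    pub free_balance: u64,
    pub index: u64,
}

/// The only two conditions under which inclusion is invalid are checked here;
/// later, "higher-level" failures must return errors without touching storage.
pub fn pre_dispatch(sender: &mut Account, xt_index: u64, xt_len: u64) -> Result<(), &'static str> {
    let fee = COST_XT_BASIC + COST_XT_BYTE * xt_len;
    if sender.free_balance < fee {
        return Err("insufficient balance for fee");
    }
    if sender.index != xt_index {
        return Err("index mismatch");
    }
    // Validity established: take the fee and bump the index immediately.
    // Later failures must not undo this.
    sender.free_balance -= fee;
    sender.index += 1;
    Ok(())
}
```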

Derivable Codec

  1. Split traits into Encodable, Decodable or something similar so we can serialize borrowed/unsized data and deserialize into borrowed data.
  2. custom-derive
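A minimal sketch of the split, under assumed names: one trait that can encode borrowed/unsized data (`[u8]`) and one that decodes into an owned value. This is not the eventual codec API; the length-prefixed encoding is purely illustrative.

```rust
use std::convert::TryInto;

pub trait Encode {
    fn encode(&self) -> Vec<u8>;
}

pub trait Decode: Sized {
    fn decode(input: &[u8]) -> Option<Self>;
}

// Encoding is implemented on the unsized slice, so borrowed data works.
impl Encode for [u8] {
    fn encode(&self) -> Vec<u8> {
        let mut out = (self.len() as u32).to_le_bytes().to_vec();
        out.extend_from_slice(self);
        out
    }
}

// Decoding produces an owned Vec<u8>.
impl Decode for Vec<u8> {
    fn decode(input: &[u8]) -> Option<Self> {
        let len_bytes: [u8; 4] = input.get(..4)?.try_into().ok()?;
        let len = u32::from_le_bytes(len_bytes) as usize;
        input.get(4..4 + len).map(|s| s.to_vec())
    }
}
```

A custom-derive would then generate these impls field-by-field for user types.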

Key management for validators

Validator nodes will need to store their master keys persistently. Session keys can be derived from the master key and session index. This will be done in the native blockchain-specific (i.e. not substrate) code.
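The derivation shape is simple: hash the master key together with the session index. The sketch below uses std's DefaultHasher as a stand-in; a real implementation would use a proper KDF over the master key, and the function name is an assumption.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

// Illustration only: DefaultHasher is NOT a secure primitive; substitute a
// real KDF (e.g. HKDF) in any actual implementation.
fn derive_session_key(master: &[u8], session_index: u64) -> u64 {
    let mut h = DefaultHasher::new();
    master.hash(&mut h);
    session_index.hash(&mut h);
    h.finish()
}
```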

RPC: Storage entry change query and notification pub/sub

Websockets/pub-sub RPCs should be expanded to allow RPC clients to track specific storage items to get notifications should they change. It should also be possible to efficiently query historical changes - getting which extrinsics changed a number of storage keys over a range of blocks.

  • -> state_queryStorage(keys: [ StorageKey, ... ], from: BlockHash, to: Option<BlockHash>) -> Result<QueryIndex, Error>: Query changes of a storage entry, possibly historical and potentially tracking real-time, asynchronously reporting.
    • from: The block after which changes will be provided.
    • to: If given, the block up to which changes will be provided. If not given, then notifications will track the head of the chain as it changes.
  • <- state_notifyStorage(query: QueryIndex, until: BlockHash, changes: Changes) Notify of a change that happened in block until. This is guaranteed to be more recent than any previously reported changes. NOTE: This doesn't cover reversions - we'd probably want a separate notification type for those.

Error may be:

  • Unknown (i.e. we don't recognise from or to)

Changes takes the form of a structure:

[ {
  block: BlockHash,
  changes: [ {
    keys: [ StorageKey, ... ]
  }, ... ]
}, ... ]
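The Changes shape above, modelled as Rust types for clarity. Field names follow the JSON sketch; the hash and key types are placeholders.

```rust
pub type BlockHash = [u8; 32];
pub type StorageKey = Vec<u8>;

// One entry per block in which something of interest changed.
pub struct BlockChanges {
    pub block: BlockHash,
    pub changes: Vec<ChangeSet>,
}

pub struct ChangeSet {
    pub keys: Vec<StorageKey>,
}

pub type Changes = Vec<BlockChanges>;
```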

Macro for constructing a high-level type-safe wrapper around substrate storage

Goal: never reference or load storage items using the key string directly. It is arcane, bug-prone, and unreadable.

usage:

Using a trait Storage:

trait Storage {
    // panic if the type is wrong for the key.
    fn load<T: Decodable>(&self) -> Option<T>;
    fn store<T: Encodable>(&mut self, value: T);
}
storage_declarations! {
    Authorities: List(":auth" -> AuthorityId), // creates something like current `KeyedVec` using prefix ":auth"
    Code: ":code" -> Vec<u8>, // creates a single value. stored under that key.
    ...
}

Authorities::len_key() -> &'static [u8];
Authorities::load_len(&Storage) -> u32
Authorities::key_for(n) -> Vec<u8>;
Authorities::load_from(&Storage, n) -> Option<AuthorityId>;
// ... KeyedVec-like API

Code::load_from(&Storage) -> Option<Vec<u8>>; // assumes a `Decodable` trait where the `<[u8] as Decodable>::Decoded = Vec<u8>`
Code::store_in(&mut Storage, &[u8]); 
Code::key() -> &'static [u8]; // for low-level usage.

crate substrate_storage would define all storage values used in substrate.
crate polkadot_storage would define all storage values used in polkadot.

The "load"/"store" API is a little annoying, so under runtime-support we would provide a Storage implementation that calls out to the externalities and a trait to provide helpers that are more ergonomic: i.e. a load() and store(T) function which are usable only within the runtime.

Usage in runtime:

// assuming these declarations:
storage_declarations! {
    Authorities: List(":auth" -> AuthorityId), // creates something like current `KeyedVec` using prefix ":auth"
    Code: ":code" -> Vec<u8>, // creates a single value. stored under that key.
    ...
}

// ...

Authorities::len_key() -> &'static [u8];
Authorities::len() -> u32
Authorities::key_for(n) -> Vec<u8>;
Authorities::load(n) -> AuthorityId;
// ... KeyedVec-like API

Code::load() -> Option<Vec<u8>>; // assumes a `Decodable` trait where the `<[u8] as Decodable>::Decoded = Vec<u8>`
Code::store(&[u8]); 
Code::key() -> &'static [u8]; // for low-level usage.
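What the generated key helpers might compute can be sketched in plain functions. The encoding choices here (little-endian u32 index, a "len" suffix for the length key) are assumptions for illustration, not the macro's actual scheme.

```rust
// Key for element `index` of a list stored under `prefix` (e.g. ":auth").
fn list_key_for(prefix: &[u8], index: u32) -> Vec<u8> {
    let mut key = prefix.to_vec();
    key.extend_from_slice(&index.to_le_bytes());
    key
}

// Key under which the list's length is stored.
fn list_len_key(prefix: &[u8]) -> Vec<u8> {
    let mut key = prefix.to_vec();
    key.extend_from_slice(b"len");
    key
}
```

The point of generating these from a declaration is that no call site ever spells out the raw key string.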

Refactor into substrate and polkadot-relay

Rough gameplan:

  • Unpick native-runtime from substrate-executor: use a native impl_stubs! macro (currently a no-op) to generate code along the lines of match method { "execute_block" => safe_call(|| runtime::execute_block(&data.0)), ... }; this function, together with the wasm it corresponds to, acts as a static dispatch. (#62)
  • Refactor/pick-apart polkadot-client into a generic substrate-client and a Polkadot-specific polkadot-client. (#62)
  • Rework polkadot-rpc to depend only on the generic (perhaps trait?) substrate-client and rename to substrate-rpc (and rename polkadot-rpc-servers -> substrate-rpc-servers). (#62)
  • Rework polkadot-network to create a generic substrate module substrate-network (the relay-chain sync code plus a peer-network overlay, basically) that can be used by substrate-client and extended into a polkadot-specific module polkadot-network capable of handling polkadot network messages (parachain candidate selection &c.) and functionality (parachain peer-pre-connections).
  • Consider renaming substrate::transactions to substrate::extrinsics to reflect the fact that the data is completely generic and may not have features typical of transactions.

Full block production

Requires #7 and #8: We can then add dummy parachains, collators, and then have validators vote on proposals and seal them online.

Integrate consensus with the state

Integrate consensus with the state:

  • determination of roles taken by validators (in terms of grouping and primary selection) #55
  • collation of local parachain candidate
  • creation of a relay chain block from candidates with enough requisite votes
  • verification that the candidates in a proposed block have enough requisite votes

Slashing: on-chain evaluation of misbehavior reports from the BFT and statement table subsystems

...and automatically generate transactions with any witnessed misbehavior.

Should be simple enough:

  • Each misbehavior report must be accompanied by some security bond.
  • Each report contains a validator who misbehaved, the block hash they were building on when they misbehaved, and the proof of misbehavior
  • A report is valid iff the validator was a validator at the given block hash and the misbehavior given is true.

BFT misbehavior reports could be managed at the substrate level.
Parachain Statement Table misbehavior can only be managed in the polkadot runtime -- and will often require proofs of group membership at a block in recent history. We will need to make sure that all duty rosters within some range of history are computable.
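The validity rule above can be sketched directly: a report names a validator, the block hash they were building on, and a proof, and is valid iff the named account was in the validator set at that block and the proof checks out. The set lookup and the proof check are stubbed; all names are illustrative.

```rust
use std::collections::HashMap;

type Hash = [u8; 32];
type ValidatorId = u64;

struct Report {
    validator: ValidatorId,
    at: Hash,
    proof_ok: bool, // stand-in for an actual misbehaviour-proof check
}

/// Valid iff the reported account was a validator at `at` AND the proof holds.
fn report_valid(sets: &HashMap<Hash, Vec<ValidatorId>>, report: &Report) -> bool {
    let was_validator = sets
        .get(&report.at)
        .map_or(false, |set| set.contains(&report.validator));
    was_validator && report.proof_ok
}
```

This is also why duty rosters within some range of history must stay computable: the set lookup has to succeed for recent blocks.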

Runtime: Recombine into Polkadot

  • Ditch the runtime in favour of something equivalent to demo/runtime.
  • Remove any Log/Header/Block types and use the specialised versions of the runtime primitives generics.
  • Reintegrate staking/slashing logic.

Runtime block validation should check state root

At present, the storage trie is not calculated during the execution of the runtime, so it's rather difficult to verify the storage root in the header. We'll have to introduce an external function (calculate_storage_root or similar) which forces the storage root to be evaluated.

Runtime: "DAO"/community funding manager

Currently the staking mechanism doesn't pay out a reward. Once it does, then there will be a counterpart reward paid into the network funding bucket. This network funding bucket may be tapped by ecosystem members, with payments made to those that are approved.

Main things to consider:

  • Are payouts made in a one-off ad-hoc fashion or batched into monthly budgets?
  • Is the capitalisation of the bucket at a fixed rate (relative to the validator payout) or adaptive?
  • How does a payment get ratified?

Record proposals for live rhododendron sessions in the DB

To prevent accidental double-propose when going offline for a short period.

DB holds a mapping equivalent to a HashMap<parent_hash, Vec<(round number, proposal)>>.

When proposing at a round k on top of a given parent hash, check whether we already proposed at this round; if so, don't create a new one. Otherwise, place the new proposal in the mapping and commit to disk.

When importing a block on top of parent_hash, clear all recorded proposals based on it as they are no longer relevant.
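The mapping and its two operations can be modelled in a few lines. Persistence is elided and the types are illustrative; the point is that `propose` returns the existing proposal for (parent, round) instead of creating a second one.

```rust
use std::collections::HashMap;

type Hash = [u8; 32];

// In-memory model of the DB mapping: parent hash -> (round, proposal) pairs.
#[derive(Default)]
struct ProposalStore {
    by_parent: HashMap<Hash, Vec<(u32, String)>>,
}

impl ProposalStore {
    /// Record a proposal for (parent, round), unless one already exists.
    fn propose(&mut self, parent: Hash, round: u32, proposal: String) -> String {
        let rounds = self.by_parent.entry(parent).or_default();
        if let Some((_, existing)) = rounds.iter().find(|(r, _)| *r == round) {
            return existing.clone(); // already proposed this round: reuse it
        }
        rounds.push((round, proposal.clone()));
        // (a real implementation would commit to disk here)
        proposal
    }

    /// On importing a block on top of `parent`, its proposals are irrelevant.
    fn on_import(&mut self, parent: &Hash) {
        self.by_parent.remove(parent);
    }
}
```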

RPC: Extrinsic submission & inclusion notification pub/sub

Websockets/pub-sub RPCs should be expanded to allow RPC clients to submit extrinsics and get full lifetime notifications of them.

  • -> author_submitExtrinsic(xt: Vec<u8>) -> Result<ExtrinsicHash, Error>
  • <- author_extrinsicUpdate(xt: ExtrinsicHash, status: Status)

Error may be:

  • InvalidFormat (i.e. it's plain old invalid and will never become valid)
  • Dead (i.e. it was once valid but has now become invalid)
  • Immature (i.e. it's currently invalid and while it may become valid at some point, that's too far ahead to care about)
  • PoolFull (i.e. there's no room at the inn)
  • AlreadyKnown: if we already know about it and it's still valid for inclusion, then it's not an error - we carry on as before and track the pre-existing extrinsic instead.

Status may be:

  • Finalised(BlockHash) (it's finalised and all is well)
  • Usurped(ExtrinsicHash) (some state change (perhaps another extrinsic was included) rendered this extrinsic invalid)
  • Broadcast(Vec<PeerId>) (it has been broadcast to the given peers) version 2.0 only
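The error and status variants above map naturally onto Rust enums. `AlreadyKnown` is omitted from the error type since, as noted, it is not treated as an error; the hash and peer types are placeholders.

```rust
type BlockHash = [u8; 32];
type ExtrinsicHash = [u8; 32];
type PeerId = u64;

#[derive(Debug, PartialEq)]
enum SubmitError {
    InvalidFormat, // plain old invalid; will never become valid
    Dead,          // was once valid but has now become invalid
    Immature,      // may become valid, but too far ahead to care about
    PoolFull,      // no room at the inn
}

#[derive(Debug, PartialEq)]
enum Status {
    Finalised(BlockHash),
    Usurped(ExtrinsicHash),
    Broadcast(Vec<PeerId>), // version 2.0 only
}
```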

Parachains: Minimal Parachains Framework

  • Validators collate, evaluate, and ensure availability of parachain candidates.
  • Misbehavior which can be slashed
  • No messaging yet; parachains are completely isolated.
