
bdk_core_staging's People

Contributors

danielabrozzoni, evanlinjin, laggintimes, llfourn, notmandatory, rajarshimaitra, vladimirfomene


bdk_core_staging's Issues

Demo CoinSelector usage in BDK proper

CoinSelector is nicely decoupled from the rest of the code and could be pasted into bdk and benefit it right now. This also allows us to make a key API change to BDK that will have to be done eventually anyway. Better to do it now rather than all at the same time later on. Also gives us feedback on the API.

Implement updates that return position in block and merkle proofs

We have built the internals to allow us to include the position in block, and even a full merkle proof for the transaction, in ChainIndex. Let's do it and show it in an example.

cc @evanlinjin a decent candidate for this would be creating a ChainGraph update from blocks fetched from rpc. It could be done for electrum too (see #77) because iirc getting the position is the same as getting the merkle root. This should be a different call though because you don't always want this.

Another thing worth considering is that including the position in block makes the ChainIndex a total order, so we may need to extend that trait to tell us whether the index is a total order. When it is a total order, this will allow us to make sure no two chain indexes are the same (i.e. no two transactions share the same position in the same block).
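A minimal sketch of how the trait extension could look, using an associated const. All names here (`ChainIndex`, `TxHeight`, `TxPosition`, `IS_TOTAL_ORDER`) are illustrative stand-ins, not the real bdk_core definitions:

```rust
// Hypothetical sketch: extend the chain-index trait with a flag telling
// whether the index type is a total order over confirmed transactions.
pub trait ChainIndex: Ord {
    /// `true` if no two distinct transactions can share the same index
    /// (e.g. the index includes position-in-block).
    const IS_TOTAL_ORDER: bool;
}

/// Height only: many txs can share a height, so not a total order.
#[derive(PartialEq, Eq, PartialOrd, Ord)]
pub struct TxHeight(pub u32);

impl ChainIndex for TxHeight {
    const IS_TOTAL_ORDER: bool = false;
}

/// Height plus position in block: at most one tx per index value.
#[derive(PartialEq, Eq, PartialOrd, Ord)]
pub struct TxPosition {
    pub height: u32,
    pub pos_in_block: u32,
}

impl ChainIndex for TxPosition {
    const IS_TOTAL_ORDER: bool = true;
}
```

Code that enforces uniqueness could then branch on `P::IS_TOTAL_ORDER` at compile time.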

Add ChainGraph::invalidate_checkpoints

We added this to SparseChain but not to ChainGraph. I think we need it directly on ChainGraph as well, because we can't get a mutable ref to the SparseChain from the chain graph.

"Real world" testing for transaction building and coin selection

Firstly, we need tests to check whether the coin selector's weight calculations are correct. I would do this with different descriptors, attempting to "sign" with "max satisfaction weight".

  • Generate a Selection result.
  • Construct a signed tx from that result.
  • Compare weight of final tx with calculations of the CoinSelector::weight* methods.

It would be awesome if we could pass our raw tx to Bitcoin Core RPC (or something) to verify further.

We also need tests that ensure the "minimum drain value" is correct, so that the tx will not be rejected for having a dust output.

What other tests do we need?

Super-issue regarding transaction history consistency

As mentioned by @LLFourn in a set of messages, there are scenarios which we should handle:

  • When we evict a tx due to conflict, we should also evict the tx's children.
  • When we insert a tx without a chain position that reveals two TxHeight::Unconfirmed txs conflict, we should have a way to ensure consistency.

Also related:

  • We should implement CPFP feerate logic.

Contextual quotes

This is why I am concerned about just allowing arbitrary graph insertions.
We have to come up with an algorithm to always make txs in chain consistent with graph on every insertion.
-- @LLFourn

`SparseChain` API for invalidating checkpoints directly

Something like...

/// Invalidates checkpoints from `from_height` (inclusive); displaced
/// transactions are moved to `TxHeight::Unconfirmed`.
pub fn invalidate_checkpoints_changeset(&self, from_height: u32) -> ChangeSet {
    todo!()
}

With this API, we can simplify the complexity of our RPC Chain Example (#89). We will not need the must_include variable within Client::emit_blocks.

Put bdk_file_store into bdk_chain

It only needs std and bincode v2.0.0. We can make bincode an optional dependency as I think implementing bincode traits directly would be a nice addition anyway.

The only thing I can see that we need to do is get rid of anyhow from bdk_file_persist (just return an std::io::Error) and to actually implement the bincode traits on everything. Some things might have to be done without derive since we might want to use consensus encode/decode for various bitcoin types.

Rename "ChainIndex" to "ChainPosition"

We are calling this an "index" all over the place which is confusing because of all the indexes we have in the crate. Let's call it a ChainPosition (the position is the specific type of index it is) and use the type parameter P.

Fast indexing of UTXOs

Right now our APIs allow you to find unspent outputs by iterating over all outputs and checking if they're spent. This is fine for most applications but for users with hundreds of thousands of historical txos doing all this work just to get the balance is a bit much. How can we do better?

Things that I imagine would be useful:

  • Which utxos are unspent
  • Which utxos are spent only by unconfirmed transactions
  • Which utxos were unspent at a certain height
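One direction is an incrementally-maintained index that is updated per transaction instead of rescanning all txouts per query. A minimal sketch of the idea, using a `(u64, u32)` tuple as a stand-in for a real `OutPoint` and a hypothetical `UnspentIndex` type:

```rust
use std::collections::BTreeSet;

// Stand-in for bitcoin::OutPoint: (fake txid, vout).
type OutPoint = (u64, u32);

// Stand-in transaction: just enough shape to demonstrate indexing.
pub struct Tx {
    pub id: u64,
    pub inputs: Vec<OutPoint>,
    pub n_outputs: u32,
}

#[derive(Default)]
pub struct UnspentIndex {
    unspent: BTreeSet<OutPoint>,
    /// Outpoints spent only by unconfirmed transactions.
    spent_unconfirmed: BTreeSet<OutPoint>,
}

impl UnspentIndex {
    /// O(inputs + outputs) work per indexed tx, so computing the balance
    /// later is a cheap lookup instead of a full scan.
    pub fn index_tx(&mut self, tx: &Tx, confirmed: bool) {
        for op in &tx.inputs {
            self.unspent.remove(op);
            if !confirmed {
                self.spent_unconfirmed.insert(*op);
            }
        }
        for vout in 0..tx.n_outputs {
            self.unspent.insert((tx.id, vout));
        }
    }

    pub fn unspent(&self) -> impl Iterator<Item = &OutPoint> {
        self.unspent.iter()
    }
}
```

The "unspent at a certain height" query would additionally need the non-monotone chain data, which this sketch omits.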

Remove "iter_" prefix for things

In std many things return an iterator without putting iter_ in front of it. Let's only put iter when we are creating some kind of custom iterator (e.g. generating an iterator for all spks of a descriptor) and want to bring attention to that.

API to get changeset from update to SparseChain

Right now we have a CheckpointCandidate which we can apply to a SparseChain. Given refactoring we can rename CheckpointCandidate to Update. When we apply an update we want the option to see the ChangeSet which should include all the txids added and all the txids removed (and their heights and whatever metadata we add).

This is useful so you can fetch the full transactions for the actual addition to the sparse chain and add them to the transaction graph. This changeset might also be useful for applying to other structures and data stores.
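A minimal sketch of the shape this could take, using u64 txids and a txid-to-height map as stand-ins for the real SparseChain internals (`apply_update` and the `ChangeSet` fields here are hypothetical):

```rust
use std::collections::{BTreeMap, BTreeSet};

#[derive(Default)]
pub struct SparseChain {
    heights: BTreeMap<u64, u32>, // txid -> height (stand-in state)
}

#[derive(Debug, Default, PartialEq)]
pub struct ChangeSet {
    pub added: BTreeSet<(u64, u32)>,   // (txid, height) entries added
    pub removed: BTreeSet<(u64, u32)>, // old entries displaced by the update
}

impl SparseChain {
    /// Apply an update (txid -> new height) and report exactly what
    /// changed, so a caller can fetch full txs for `added` or mirror
    /// the diff into other structures and data stores.
    pub fn apply_update(&mut self, update: &BTreeMap<u64, u32>) -> ChangeSet {
        let mut cs = ChangeSet::default();
        for (&txid, &height) in update {
            match self.heights.insert(txid, height) {
                Some(old) if old == height => {} // already known, no change
                Some(old) => {
                    cs.removed.insert((txid, old));
                    cs.added.insert((txid, height));
                }
                None => {
                    cs.added.insert((txid, height));
                }
            }
        }
        cs
    }
}
```
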

Improve efficiency of `BnbLimit::Duration`

As @afilini mentioned in #23 (comment)

post-merge note: i don't have real numbers but i'm guessing the syscall to get the current time can easily take 10x more time than just running a bnb iteration (due to the context switch), at least on modern cpus.

i think it would make sense to check the time every n iterations, with n probably >= 100. worst case you overshoot a bit, but overall at least you'd spend most of the time actually doing work rather than waiting for a syscall to finish.

there's a catch, though: if you have a very very slow cpu and a lot of utxos you may end up overshooting by a lot. i'm not sure what's the solution here though, i don't think it's worth adding the complexity to measure how long a few iterations take to adjust the n parameter. currently the bnb does i think thousands of iterations and nobody ever complained, so n = 100 i think is more than reasonable imho.
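A minimal sketch of the amortization being suggested, with a hypothetical `run_iteration` closure standing in for one BnB step:

```rust
use std::time::{Duration, Instant};

/// Run iterations until `run_iteration` reports it is done or the time
/// budget is exhausted, polling the clock only every CHECK_EVERY
/// iterations so the syscall cost is amortized over real work.
pub fn bounded_search(limit: Duration, mut run_iteration: impl FnMut() -> bool) -> u64 {
    const CHECK_EVERY: u64 = 100;
    let start = Instant::now();
    let mut iters = 0;
    loop {
        if run_iteration() {
            break; // search finished on its own
        }
        iters += 1;
        // One Instant::now() call per CHECK_EVERY iterations.
        if iters % CHECK_EVERY == 0 && start.elapsed() >= limit {
            break; // budget exhausted; may overshoot by < CHECK_EVERY iters
        }
    }
    iters
}
```

As the comment notes, the trade-off is overshooting by up to CHECK_EVERY iterations on a slow CPU.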

`CoinSelector` could be decoupled even further

The problem

Currently waste metric calculations (and associated parameters) are tightly coupled with CoinSelector and CoinSelectorOpt. For example, CoinSelectorOpt::long_term_feerate and CoinSelectorOpt::spend_drain_weight are waste-specific. Could we somehow have this in a separate struct?

Additionally, @csralvall suggested that we could try to make excess strategies more flexible/extendable.

Potential solution

Ideally, after we decouple waste calculations, we would still want to be able to determine waste from both CoinSelector and Selection. This is because waste is considered during selection (for BnB), and is used to compare different selection results.

Personally, I think CoinSelector and Selection should be the same struct, named CoinSelection, using the "typestate pattern". This way weight and excess calculations/methods are shared between the two states, and we can perform waste calculations on both the unfinished and finished states.

CoinSelection<Unfinished> will have mutating methods.
CoinSelection<Finished> will only have non-mutating methods.
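A minimal sketch of the typestate pattern described above, with a `u64` value standing in for real candidates (the names and methods are illustrative, not the proposed API):

```rust
use std::marker::PhantomData;

// Zero-sized state markers.
pub struct Unfinished;
pub struct Finished;

pub struct CoinSelection<S> {
    selected: Vec<u64>, // stand-in for selected candidate inputs
    state: PhantomData<S>,
}

impl CoinSelection<Unfinished> {
    pub fn new() -> Self {
        CoinSelection { selected: Vec::new(), state: PhantomData }
    }

    /// Mutating methods exist only in the Unfinished state.
    pub fn select(&mut self, value: u64) {
        self.selected.push(value);
    }

    /// Consuming transition into the read-only Finished state.
    pub fn finish(self) -> CoinSelection<Finished> {
        CoinSelection { selected: self.selected, state: PhantomData }
    }
}

// Shared non-mutating calculations live in a generic impl, so both
// states can compute e.g. weight/excess/waste.
impl<S> CoinSelection<S> {
    pub fn selected_value(&self) -> u64 {
        self.selected.iter().sum()
    }
}
```

Trying to call `select` on a `CoinSelection<Finished>` is a compile error, which is the point of the pattern.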

Implement coin selection algorithms

  • Branch and bound at a minimum
  • Perhaps others inspired by bitcoind's wallet

We want to provide useful CS algorithms and make sure our CoinSelector API is composable enough to support them.

Further notes

  • We should re-engineer finish so it doesn't decide whether we should use change or not. It may output different fee, weight and waste for drain vs non-drain. We want to do this so we can have a more sophisticated change policy. (done)
  • We should provide a function that applies a particular drain policy with some usual setups. For example, there is a heuristic that uses the magnitude of the outputs to decide which output is change; we could provide a way of applying a policy that selects more inputs to counter this. Of course, we must provide a way of making the decision about whether there should be change or not.
  • Let's make a testing/benchmarking program to evaluate coin selection. We should look at computational performance per iteration and selection performance. (done)
  • We want an API that does drain minimization, because we don't want to apply the drain policy in coin selection, and drain minimization is useful for cases where you don't mind adding the excess to the receiver's value.
  • We want an API that minimizes a user-defined value through a callback. For example, a user could provide:
    • A function which outputs waste in which case it would do something similar to the existing B&B
    • A function which outputs excess (i.e. minimizes the selected amount)
    • A function which simply outputs the number of inputs selected (minimizes number of selected)
    • A function which scores different types of input SPKs badly (so you try your best to select from the same kind of SPK)
    • A function which outputs some privacy metric (minimises privacy loss)
    • A function which outputs a large number if the selection would include a change output (try to find changeless solution).
    • ... and so on and so forth
  • Since B&B and other search solutions incrementally improve their selection we could return an iterator of selections that would start off returning sub-optimal selections but then give you better ones the more you call .next. This allows the user to control how much time is spent doing CS.
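The last bullet can be sketched with a plain `Iterator` implementation; here a single `u64` "waste" score stands in for an actual selection, and the halving step stands in for a real search improvement:

```rust
/// A search that yields successively better selections via Iterator,
/// so the caller decides how long to keep improving (e.g. with .take,
/// a deadline, or stopping when the score is good enough).
pub struct ImprovingSearch {
    best: u64, // current best "waste"; stand-in for a real selection
}

impl Iterator for ImprovingSearch {
    type Item = u64;

    fn next(&mut self) -> Option<u64> {
        if self.best == 0 {
            return None; // optimum reached, nothing better to yield
        }
        self.best /= 2; // pretend each step halves the waste
        Some(self.best)
    }
}
```

For example, `ImprovingSearch { best: 16 }.take(3)` yields three progressively better results and stops, leaving the time budget under the caller's control.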

Even further todos

  • How do we filter out candidates with a negative "effective value"? (done)
  • Change error to struct, with enum to signal which constraint was the cause. needed should reflect how much more "effective value" is needed before a successful finish can be achieved.
  • finish should return different choices: no drain with excess to fees, no drain with excess to receiver, with drain. finish returns vec ordered by increasing waste. (done)

Update::merge should be error free

When merging two updates, if they disagree somewhere (or rather, you cannot prove they agree on the blockhash at that height), the update should just truncate itself to the height they agree on and remove transactions at the disputed heights.
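A minimal sketch of the truncation rule on checkpoint maps alone, with `height -> u64` standing in for `height -> BlockHash` (transaction handling omitted):

```rust
use std::collections::BTreeMap;

/// Merge two checkpoint maps without ever returning an error: find the
/// lowest height at which the sides provably disagree, and keep only
/// checkpoints below it (from both sides).
pub fn merge_checkpoints(
    a: &BTreeMap<u32, u64>,
    b: &BTreeMap<u32, u64>,
) -> BTreeMap<u32, u64> {
    // Lowest height where both have a hash and the hashes differ.
    let disagree = a
        .iter()
        .filter_map(|(h, hash)| match b.get(h) {
            Some(other) if other != hash => Some(*h),
            _ => None,
        })
        .min();

    let mut merged = BTreeMap::new();
    match disagree {
        Some(d) => {
            // Truncate: keep everything strictly below the dispute.
            merged.extend(a.range(..d).map(|(h, v)| (*h, *v)));
            merged.extend(b.range(..d).map(|(h, v)| (*h, *v)));
        }
        None => {
            // No provable disagreement: keep everything.
            merged.extend(a.iter().map(|(h, v)| (*h, *v)));
            merged.extend(b.iter().map(|(h, v)| (*h, *v)));
        }
    }
    merged
}
```
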

Remove conflicting transactions from mempool/unconfirmed

Right now we allow transactions in SparseChain to double spend if they are in the mempool. Or rather, we cannot enforce this within the sparse chain itself, since we assume that confirmed txs come from valid blocks and therefore cannot possibly double spend each other. The TxGraph, however, should be able to resolve this for us. In ChainGraph we should be able to enforce this policy on every update (the mempool evictions should go into the ChangeSet), since it also has access to a TxGraph.

I'm not sure what the API should be but you need to be able to:

  1. Get unconfirmed txids from sparse_chain.
  2. For each one, check what conflicts with it. If the conflict is confirmed then it goes into the eviction set. Otherwise you tiebreak by fee (perhaps make this configurable).
  3. Create a sparse_chain::ChangeSet of all removals

With chain_graph all of the above should happen automatically behind the scenes when you update so implementing it on ChainGraph might be the place to start to figure out what additional APIs you need on SparseChain and TxGraph (there might not be any).
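The three steps above can be sketched as a pure eviction pass; here u64 txids and precomputed conflict lists stand in for real TxGraph lookups:

```rust
use std::collections::BTreeSet;

// Stand-in for an unconfirmed tx plus the conflicts the TxGraph found.
pub struct MempoolTx {
    pub txid: u64,
    pub fee: u64,
    /// Txids of confirmed txs this one conflicts with.
    pub confirmed_conflicts: Vec<u64>,
    /// (txid, fee) of unconfirmed txs this one conflicts with.
    pub unconfirmed_conflicts: Vec<(u64, u64)>,
}

/// Step 1 is implicit (the caller collected the unconfirmed txs).
/// Steps 2-3: decide which txids to evict.
pub fn eviction_set(unconfirmed: &[MempoolTx]) -> BTreeSet<u64> {
    let mut evict = BTreeSet::new();
    for tx in unconfirmed {
        // Any confirmed conflict means this tx must go.
        if !tx.confirmed_conflicts.is_empty() {
            evict.insert(tx.txid);
            continue;
        }
        // Otherwise tiebreak by fee against unconfirmed conflicts.
        if tx.unconfirmed_conflicts.iter().any(|&(_, fee)| fee > tx.fee) {
            evict.insert(tx.txid);
        }
    }
    evict
}
```

The returned set would then be turned into a sparse_chain::ChangeSet of removals (step 3).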

Design and implement data persistence

  • It should be easy to serialize the state of each of our components
  • We should have nice APIs for common DB backends for storing and restoring TxGraph and SparseChain.
  • Perhaps you might want to also store your derivation index.
  • Try to use a commit structure to make it fast to update the DB.

Tests for mempool consistency

We merged #74 but didn't really add any tests to properly enforce the new behaviors.

The main thing I can see missing is a test that a transaction that was in a stale block, and which conflicts with an update transaction, does not make it into the mempool.

I think #74 also introduced a bug where an update can evict a transaction in the chain, without invalidating its block, by having a new transaction that conflicts with it. We shouldn't allow this; it should mean the update is invalid. Test for this too.

Esplora and electrum should allow producing an update based on outputs rather than script pubkey

Right now the only way to find out if an output is spent is to query the APIs for a script pubkey. What if I only have an outpoint that I am interested in, or I just want to check whether a transaction is confirmed?

In addition to the iterator of script pubkeys we currently pass in we should allow iterators for:

  1. OutPoint which would return the tx with the outpoint on it (if it can find it) and any tx spending from it
  2. Txid which would return the transaction with this txid in the update if it exists.

Update should be opaque

Originally I had the fields as pub because I thought it might be nice to be able to serialize/deserialize it. I think it's more important that we exclude invalid states from this thing, such as transactions being higher than the highest checkpoint etc.

Consider doing this at the same time as #54

Changesets should commit to previous changesets to avoid applying out of order

Now that we have persistent storage, we want to make sure that we apply changesets in order. One way to ensure this is for each changeset to commit to the previous one. This can be checked by the persistent storage backend before applying it.

Things to consider:

  1. When we add new transactions to the sparse chain we may want to atomically remove conflicts in the mempool as dictated by the TxGraph (see #68). How do we make sure that these two things end up in the same changeset, avoiding having to return multiple changesets?
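A minimal sketch of the chaining idea: each changeset carries a hash of its predecessor, and the store refuses anything that doesn't extend its current head. The types and the use of `DefaultHasher` are purely illustrative:

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

#[derive(Hash)]
pub struct ChangeSet {
    /// Hash of the changeset this one extends (0 = genesis).
    pub prev_hash: u64,
    pub payload: Vec<u64>, // stand-in for the real diff
}

impl ChangeSet {
    pub fn hash(&self) -> u64 {
        let mut h = DefaultHasher::new();
        Hash::hash(self, &mut h);
        h.finish()
    }
}

#[derive(Default)]
pub struct Store {
    head: u64, // hash of the last applied changeset
    applied: Vec<Vec<u64>>,
}

impl Store {
    /// Reject out-of-order (or replayed) changesets before applying.
    pub fn apply(&mut self, cs: ChangeSet) -> Result<(), &'static str> {
        if cs.prev_hash != self.head {
            return Err("changeset does not extend current head");
        }
        self.head = cs.hash();
        self.applied.push(cs.payload);
        Ok(())
    }
}
```

A real implementation would use a stable, versioned encoding rather than `DefaultHasher`, which is only deterministic within a process.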

Enforce tx consistency in ChainGraph

Now that we have done #80, we always have a full tx whenever we have a txid in the SparseChain. In #74 we started enforcing that txs in ChainGraph don't double spend each other, but only in determine_changeset. We are inconsistent about this: we don't do it in insert_tx, for example. This is an issue to discuss whether we want to do this across the whole structure. If we are not going to enforce it everywhere, then we shouldn't just do it in determine_changeset; we should instead have a separate method like determine_changeset_evict_conflicts that does what determine_changeset does plus the conflict eviction.

My preference is to keep it consistent across the whole structure.

Restructure chain data indexing

Here are our current components:

  1. SparseChain: It indexes transactions and where they are in blocks.
  2. SpkTracker: It indexes txouts by script pubkeys in whatever transactions you add to it.

Design goals

  1. We want to be able to incrementally update SpkTracker with new data from the blockchain, with complexity proportional to the changes (e.g. number of new transactions) since SpkTracker was last updated. This currently works, but only in the case that new transactions are all later in the blockchain than when it was last updated.

  2. We want to be able to update the list of TXIDs in our TXID set in SparseChain without necessarily getting the full transaction data. Suggestion: split the sparse chain into (i) a monotone structure which just stores and indexes transactions (but doesn't know which are in the chain) and (ii) a non-monotone structure which keeps track of where transactions are in the chain but doesn't have full transaction data. We want to be able to quickly find which txids exist in the non-monotone structure but not in the monotone one.

  3. SpkTracker currently has to scan over all the txouts to find which are unspent. We should be able to create an index for this, preferably outside SpkTracker. Something like an UnspentIndex. UnspentIndex would be non-monotone and would require the non-monotone blockchain data to generate.

Potential Solution

  1. Do not rely on the order of transactions in updates like the checkpointing system does. Rather, just cache the updates to each structure; then, when other structures need to sync to it, they can recall the last update they saw (i.e. like git commits and merging).

Potential Order

  1. Purge the codebase of internal checkpointing logic. CheckPointCandidate doesn't need to be changed here, just the internal optimizing algorithms in SpkTracker and SparseChain.
  2. Split SparseChain into monotone and non-monotone structures. Also change CheckPointCandidate to only have txids rather than full transactions.
  3. Add commit-like indexing from SpkTracker to the monotone SparseChain, and from the monotone SparseChain to the non-monotone SparseChain. (The monotone SparseChain could be renamed to TxGraph.)
  4. Develop UnspentIndex.

Test case tracking issue

The development approach we're taking here is to test for correctness since we're still trying to figure out how the APIs should work.
In this issue we can track test cases we think we need but don't want to write since things are changing too quickly.

  • When applying an update that invalidates a block, any conflicts in the update with transactions in that block should be ignored.
  • Getting new script pubkeys from a descriptor tracker that has wildcards should result in different scripts for each call

ChainGraph API for electrum

  1. We want to have the chaingraph enforce that for every txid in sparse_chain there is a full tx in the graph. To do this we'll have to make those fields private and we'll have to have an error case when the changeset is applied.
  2. Electrum needs to fetch incrementally. First it tells you the txids you are interested in (associated with your script pubkeys), and then you batch-fetch the transactions. We need an API to do this. It makes sense for electrum to return something like a SparseChain from the first requests for txids related to the scripts. We then need to turn the txids we're missing into full transactions in a ChainGraph that can be applied. Hint: maybe we can have a missing function on ChainGraph that somehow returns what's missing, and then a method on the electrum API to take that set of missing things and turn it into a ChainGraph with a batch request.
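The hinted `missing` function can be sketched as a set difference; here u64 txids and two sets stand in for the real sparse chain and tx graph contents:

```rust
use std::collections::BTreeSet;

#[derive(Default)]
pub struct ChainGraph {
    chain_txids: BTreeSet<u64>, // txids the sparse chain knows about
    graph_txids: BTreeSet<u64>, // txids with a full tx in the graph
}

impl ChainGraph {
    /// Txids present in the chain but with no full tx in the graph —
    /// exactly the set an electrum client should batch-fetch next.
    pub fn missing(&self) -> impl Iterator<Item = u64> + '_ {
        self.chain_txids.difference(&self.graph_txids).copied()
    }
}
```

The electrum side would then take this set, do one batch request, and return a ChainGraph update that fills the gaps.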

Add Coin Selection Algorithm

The CoinSelectOpt should have an enum with various algorithm options to use for coin selection.

The initial target is to cover what bdk already covers: largest-first, oldest-first, and branch-and-bound.

This could potentially be a "good first issue" too.

Use error deriving macro

At some point we will need a proper Error structure for the bdk_core lib.

This is a tracking issue for discussing the possible Error struct.
