stellar / stellar-core
Reference implementation for the peer-to-peer agent that manages the Stellar network.
Home Page: https://www.stellar.org
License: Other
We keep track of peers, how they fail, etc. We should have tests that simulate normal and abusive peers.
CXXFLAGS='-fsanitize=address' should now work with clang. Turn it on in Travis.
We should probably take the version in Payment (which emits proper metadata) and generalize it if necessary.
It seems likely that sha256 might be a better choice than sha512/256. The latter is about 1.5x faster in 64-bit software, but it involves different constants from the main sha512 algorithm, is not as widely supported in other programming languages' standard libraries, and is not going to be supported by the new Skylake-generation hardware instructions for sha256.
https://en.wikipedia.org/wiki/Intel_SHA_extensions
https://en.wikipedia.org/wiki/SHA-2#Comparison_of_SHA_functions
Currently configure enables or disables the postgresql backend by sniffing for libpq. This is too subtle and confuses users both ways: when they want it, sometimes they don't get it; when they don't want it, sometimes they get it, and then tests fail when they try to connect to a local postgres. Make it explicit.
Here are two different runs, one failed and one successful, filtered on the async call to retrieveQuorumSet from FBA in Herder.cpp and the response from overlay recvFBAQuorumSet in Herder for qSet 4b5e56.
In the successful run, node 918ecd does receive qSet 4b5e56 eventually as expected.
spolu@spolu-ThinkPad-T430s:~/src/stellar/hayashi$ cat worked | grep "Herder.*Quorum.*4b5e56"
29/01/15 13:09:08 [Herder] DEBUG Herder::recvFBAQuorumSet@135ab0 qSet: 4b5e56
29/01/15 13:09:08 [Herder] DEBUG Herder::retrieveQuorumSet@ad8e17 qSet: 4b5e56
29/01/15 13:09:08 [Herder] DEBUG Herder::retrieveQuorumSet@41aa30 qSet: 4b5e56
29/01/15 13:09:08 [Herder] DEBUG Herder::retrieveQuorumSet@41aa30 qSet: 4b5e56
29/01/15 13:09:08 [Herder] DEBUG Herder::retrieveQuorumSet@ad8e17 qSet: 4b5e56
29/01/15 13:09:08 [Herder] DEBUG Herder::retrieveQuorumSet@918ecd qSet: 4b5e56
29/01/15 13:09:08 [Herder] DEBUG Herder::retrieveQuorumSet@ad8e17 qSet: 4b5e56
29/01/15 13:09:08 [Herder] DEBUG Herder::retrieveQuorumSet@41aa30 qSet: 4b5e56
29/01/15 13:09:08 [Herder] DEBUG Herder::retrieveQuorumSet@918ecd qSet: 4b5e56
29/01/15 13:09:08 [Herder] DEBUG Herder::retrieveQuorumSet@ad8e17 qSet: 4b5e56
29/01/15 13:09:08 [Herder] DEBUG Herder::retrieveQuorumSet@918ecd qSet: 4b5e56
29/01/15 13:09:08 [Herder] DEBUG Herder::retrieveQuorumSet@918ecd qSet: 4b5e56
29/01/15 13:09:08 [Herder] DEBUG Herder::recvFBAQuorumSet@ad8e17 qSet: 4b5e56
29/01/15 13:09:08 [Herder] DEBUG Herder::retrieveQuorumSet@41aa30 qSet: 4b5e56
29/01/15 13:09:08 [Herder] DEBUG Herder::retrieveQuorumSet@41aa30 qSet: 4b5e56
29/01/15 13:09:08 [Herder] DEBUG Herder::retrieveQuorumSet@918ecd qSet: 4b5e56
29/01/15 13:09:08 [Herder] DEBUG Herder::recvFBAQuorumSet@41aa30 qSet: 4b5e56
29/01/15 13:09:08 [Herder] DEBUG Herder::retrieveQuorumSet@918ecd qSet: 4b5e56
29/01/15 13:09:08 [Herder] DEBUG Herder::retrieveQuorumSet@918ecd qSet: 4b5e56
29/01/15 13:09:08 [Herder] DEBUG Herder::retrieveQuorumSet@918ecd qSet: 4b5e56
29/01/15 13:09:08 [Herder] DEBUG Herder::recvFBAQuorumSet@918ecd qSet: 4b5e56
In the failed run, node 918ecd never receives qSet 4b5e56: the simulation stops once no more timers are active, and the qSet never arrives.
spolu@spolu-ThinkPad-T430s:~/src/stellar/hayashi$ cat failed | grep "Herder.*Quorum.*4b5e56"
29/01/15 13:09:12 [Herder] DEBUG Herder::recvFBAQuorumSet@135ab0 qSet: 4b5e56
29/01/15 13:09:12 [Herder] DEBUG Herder::retrieveQuorumSet@41aa30 qSet: 4b5e56
29/01/15 13:09:12 [Herder] DEBUG Herder::retrieveQuorumSet@ad8e17 qSet: 4b5e56
29/01/15 13:09:12 [Herder] DEBUG Herder::retrieveQuorumSet@41aa30 qSet: 4b5e56
29/01/15 13:09:12 [Herder] DEBUG Herder::retrieveQuorumSet@918ecd qSet: 4b5e56
29/01/15 13:09:12 [Herder] DEBUG Herder::retrieveQuorumSet@41aa30 qSet: 4b5e56
29/01/15 13:09:12 [Herder] DEBUG Herder::retrieveQuorumSet@918ecd qSet: 4b5e56
29/01/15 13:09:12 [Herder] DEBUG Herder::retrieveQuorumSet@ad8e17 qSet: 4b5e56
29/01/15 13:09:12 [Herder] DEBUG Herder::retrieveQuorumSet@ad8e17 qSet: 4b5e56
29/01/15 13:09:12 [Herder] DEBUG Herder::recvFBAQuorumSet@41aa30 qSet: 4b5e56
29/01/15 13:09:12 [Herder] DEBUG Herder::retrieveQuorumSet@918ecd qSet: 4b5e56
29/01/15 13:09:12 [Herder] DEBUG Herder::retrieveQuorumSet@ad8e17 qSet: 4b5e56
29/01/15 13:09:12 [Herder] DEBUG Herder::retrieveQuorumSet@ad8e17 qSet: 4b5e56
29/01/15 13:09:12 [Herder] DEBUG Herder::retrieveQuorumSet@918ecd qSet: 4b5e56
29/01/15 13:09:12 [Herder] DEBUG Herder::retrieveQuorumSet@ad8e17 qSet: 4b5e56
29/01/15 13:09:12 [Herder] DEBUG Herder::recvFBAQuorumSet@ad8e17 qSet: 4b5e56
29/01/15 13:09:12 [Herder] DEBUG Herder::retrieveQuorumSet@918ecd qSet: 4b5e56
29/01/15 13:09:12 [Herder] DEBUG Herder::retrieveQuorumSet@918ecd qSet: 4b5e56
29/01/15 13:09:12 [Herder] DEBUG Herder::retrieveQuorumSet@918ecd qSet: 4b5e56
29/01/15 13:09:12 [Herder] DEBUG Herder::retrieveQuorumSet@918ecd qSet: 4b5e56
29/01/15 13:09:12 [Herder] DEBUG Herder::retrieveQuorumSet@918ecd qSet: 4b5e56
29/01/15 13:09:12 [Herder] DEBUG Herder::retrieveQuorumSet@918ecd qSet: 4b5e56
29/01/15 13:09:12 [Herder] DEBUG Herder::retrieveQuorumSet@918ecd qSet: 4b5e56
29/01/15 13:09:12 [Herder] DEBUG Herder::retrieveQuorumSet@918ecd qSet: 4b5e56
29/01/15 13:09:12 [Herder] DEBUG Herder::retrieveQuorumSet@918ecd qSet: 4b5e56
Any idea @jedmccaleb ?
OverlayManagerImpl knows when its next connection attempt should run, but it does not currently use this information to limit its sleep/wake cycle. Instead it schedules a tick every 2 seconds (OverlayManagerImpl::tick()). There's no need for this; it might as well sleep until the next connection attempt should be made.
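A minimal sketch of that idea, assuming the manager tracks a time point for its next connection attempt (the function name and shape are illustrative, not the real OverlayManagerImpl API):

```cpp
#include <algorithm>
#include <cassert>
#include <chrono>

using Clock = std::chrono::steady_clock;

// Sleep exactly until the next scheduled connection attempt; if the attempt
// is already due (or overdue), do not sleep at all.
Clock::duration
timeUntilNextTick(Clock::time_point now, Clock::time_point nextConnectAttempt)
{
    return std::max(Clock::duration::zero(), nextConnectAttempt - now);
}
```

The returned duration would be handed to the event loop's timer instead of a fixed 2-second interval.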
When something goes wrong while processing a config file, just dumping out a std::runtime_error is not really good enough. We should at minimum write out which config item was being processed, and ideally more context than that, if any is available.
For the sake of local testing, we should support an HTTP request that streams hot history (XDR) from the DB table that holds it. This will permit testing one server "catching up" against another directly, without additional public API servers.
Starting from the bucket that contains the current db's ledger number, write all objects in that bucket and all younger buckets to the database, then set the db ledger to the youngest bucket's earliest ledger value.
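The step above can be sketched roughly as follows; `Bucket` and the in-memory "db" map are stand-ins for the real stellar-core types:

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>
#include <vector>

struct Bucket
{
    uint32_t earliestLedger;                    // first ledger this bucket covers
    std::map<std::string, std::string> objects; // key -> serialized object
};

// Buckets are ordered oldest-first. Starting from the bucket whose range
// begins at or before the db's current ledger, write every object in that
// bucket and all younger buckets into the db (younger entries overwrite
// older ones), then return the youngest bucket's earliest ledger value as
// the new db ledger.
uint32_t
applyBuckets(std::map<std::string, std::string>& db, uint32_t dbLedger,
             std::vector<Bucket> const& oldestFirst)
{
    size_t start = 0;
    for (size_t i = 0; i < oldestFirst.size(); ++i)
    {
        if (oldestFirst[i].earliestLedger <= dbLedger)
        {
            start = i;
        }
    }
    uint32_t newLedger = dbLedger;
    for (size_t i = start; i < oldestFirst.size(); ++i)
    {
        for (auto const& kv : oldestFirst[i].objects)
        {
            db[kv.first] = kv.second;
        }
        newLedger = oldestFirst[i].earliestLedger;
    }
    return newLedger;
}
```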
For the most part, identifiers representing a transaction sequence number are named sequence, with one notable exception: Transaction, which uses seqNum and offerSeqNum. IMO, we should standardize on sequence, since most locations use that.
It's nonstandard and, in particular, postgresql doesn't support it; replace with constraints on the table and in-application range checks. It will be an implementation limitation.
Given a target ledger number, download all the snapshots necessary to reconstruct a specific bucketlist.
For all files X.cpp, the first include should be X.h. This makes it simpler to avoid weird include-ordering issues between cpp files.
Once hayashi is instrumented, we'll need to collect metrics into a backend system. Options for this, in order of mat's preference:
Not sure if that is correct, but my assumption is that closeTime is the unix timestamp of when the ledger closed.
When running --test [simulation], the temp directory managed by src/util/TmpDir.cpp ends up getting deleted, generally after running 3-4 iterations of the test. The root tempdir should be created on startup if it doesn't exist, and cleaned on startup.
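A sketch of that startup behaviour using C++17 `std::filesystem` (the function name and path handling are hypothetical, not the actual TmpDir code):

```cpp
#include <cassert>
#include <filesystem>
#include <system_error>

namespace fs = std::filesystem;

// Remove whatever a previous (possibly crashed) run left behind, then
// recreate the root tmp dir so every run starts from a clean, existing dir.
void
initTmpRoot(fs::path const& root)
{
    std::error_code ec;
    fs::remove_all(root, ec); // clean leftovers; fine if it didn't exist
    fs::create_directories(root);
}
```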
Edited to reflect conversation below:
-- used to say --
This is cleanup but will improve compile times and cut out a lot of extra concept-names. For each FooGateway/FooMaster pair:
- declare class FooMaster::Impl and a member std::unique_ptr<Impl> mImpl;
- declare ~FooMaster() and define it in the FooMaster.cpp file
- move FooMaster::* members to class FooMaster::Impl { ... } in the FooMaster.cpp file
- convert FooMaster methods to Impl methods as necessary, or just prefix variable accesses with mImpl-> and make FooMaster a friend of FooMaster::Impl
- rename FooMaster as FooManager (slightly more dry/boring business term)
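A minimal sketch of the resulting pimpl shape; FooManager and doWork are hypothetical names, and the header/implementation halves are shown in one file for brevity:

```cpp
#include <cassert>
#include <memory>

// --- FooManager.h ---
class FooManager
{
  public:
    FooManager();
    ~FooManager(); // defined below, where Impl is a complete type
    int doWork(int x);

  private:
    class Impl;
    std::unique_ptr<Impl> mImpl;
};

// --- FooManager.cpp ---
class FooManager::Impl
{
    friend class FooManager; // FooManager reaches in directly
    int mState = 0;

    int
    doWork(int x)
    {
        return mState += x;
    }
};

FooManager::FooManager() : mImpl(std::make_unique<Impl>())
{
}

// Must live where Impl is complete, so unique_ptr's deleter compiles.
FooManager::~FooManager() = default;

int
FooManager::doWork(int x)
{
    return mImpl->doWork(x);
}
```

Only the small public class appears in the header, so callers no longer recompile when Impl's internals change.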
We should properly use use(foo) and into(bar) constructs instead of constructing SQL strings.
Even in the context of complex queries this can be achieved. See how LoadOffers works for an example.
Known offenders (there might be more):
TrustLine
Upon transaction submission, I'm seeing the following warnings emitted to stdout:
...
19/02/15 09:08:53 [FBA] INFO Slot::attemptCommit@7eb7f3 i: 5 b: (0,87af74)
19/02/15 09:08:53 [FBA] INFO Slot::processEnvelope@7eb7f3 i: 5 {ENV@7eb7f3|COMMIT|(0,87af74)|4f852d}
19/02/15 09:08:53 [FBA] INFO Slot::attemptCommitted@7eb7f3 i: 5 b: (0,87af74)
19/02/15 09:08:53 [FBA] INFO Slot::processEnvelope@7eb7f3 i: 5 {ENV@7eb7f3|COMMITTED|(0,87af74)|4f852d}
19/02/15 09:08:53 [FBA] INFO Slot::attemptExternalize@7eb7f3 i: 5 b: (0,87af74)
19/02/15 09:08:53 [Herder] INFO Herder::valueExternalized@7eb7f3 txSet: e2401e
WARNING: there is already a transaction in progress
WARNING: there is no transaction in progress
...
The transactions do apply correctly, as expected.
revision: 6e1ca86718dfa25ab8b41eba03b906689694e315
config:
PEER_PORT=39133
RUN_STANDALONE=false
LOG_FILE_PATH="hayashi.log"
HTTP_PORT=39132
PUBLIC_HTTP_PORT=false
# what generates the peerID (used for peer connections) used by this node
PEER_SEED="s3BCUXncNvghHzKafx4gwYGaEG5rEeMUDdJPDsdjve3ojoFd5tK"
# what generates the nodeID (used in FBA)
VALIDATION_SEED="s3BCUXncNvghHzKafx4gwYGaEG5rEeMUDdJPDsdjve3ojoFd5tK"
QUORUM_THRESHOLD=1
QUORUM_SET=["gxoicA8D962NezYaa4AmrhXKGHYbrELu8rhyKE2vt8osLHL3T5"]
DATABASE="postgresql://dbname=hayashi_development"
example of existing issues:
According to SQL-92, identifiers that are not quoted are case-insensitive. Postgresql supports this by lower-casing all unquoted identifiers transparently. Sqlite does not have this problem.
Unfortunately some SQL toolkits (for example, both ActiveRecord and Sequel) will always quote identifiers in the SQL they produce. To integrate with the hayashi DB, a developer has to manually convert to the lower-cased form, which can be confusing (it certainly was for me).
IMO, we should either:
Most of the uses of stringstream in the database code are superfluous and actually risk SQL injection. Use prepare against string constants with placeholders instead.
CLF currently only stores live ledger objects (added or modified); it does not support tombstones. It needs to.
(tried on Windows, but I suspect it's the same on other platforms)
(Moved from https://github.com/stellar/puppet/issues/131)
Need to set up a deployment scenario for Hayashi.
Would like to do this via an ASG so we can add/remove instances with minimal effort.
Will need some way to designate ownership of public DNS entries and Postgres databases in order to do this.
I also had a chat with graydon re: deployment on Jan 16th that should help fuel things a bit.
What wasn't yet clear to me was how to configure trust relationships between the nodes, especially if the set is dynamic.
Not the biggest deal, but if it's easy to fix, it would be better.
Currently we try to keep most of the tests quick enough that they can run casually while working / doing CI. We should also have a stress-test mode that tries to see where performance problems show up as we scale up transaction rate and database size. It should also write out performance metrics.
This allows the platform to build extractors that, for example, follow transactions on specific users and get the balance breakdown.
Tests should include cases where intermediate transfers would result in over-the-limit trust lines, offers being filled, invalid offers, etc.
Extract history records from the DB table storing them and write them to a history block once more than a history block's worth of ledgers have passed.
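A minimal sketch of the trigger for that extraction step, assuming a fixed block size and a record of the last published ledger (names are illustrative):

```cpp
#include <cassert>
#include <cstdint>

// Once at least a full block's worth of ledgers has passed since the last
// published block, the range (lastPublished, lastPublished + blockSize] is
// ready to be pulled from the DB and written out as a history block.
bool
shouldWriteHistoryBlock(uint32_t ledgerSeq, uint32_t lastPublished,
                        uint32_t blockSize)
{
    return ledgerSeq - lastPublished >= blockSize;
}
```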
Buckets should be memory-bounded and switch to disk-backed form when larger than some threshold value (say, 10MB or so?). This will require rewriting the merge algorithm to optionally use disk-based iteration as well, and rewriting the history bucket-writing code to just flip the bucket to disk-backed mode.
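A rough sketch of the spill behaviour; the threshold handling, file name, and interface are hypothetical, not the real Bucket class, and the real change would also need disk-based iteration in the merge algorithm:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <fstream>
#include <string>
#include <vector>

class SpillableBucket
{
    size_t mThreshold;
    size_t mBytes = 0;
    bool mOnDisk = false;
    std::vector<std::string> mMem;
    std::string mPath = "bucket.spill.tmp";

  public:
    explicit SpillableBucket(size_t threshold) : mThreshold(threshold)
    {
    }

    bool
    onDisk() const
    {
        return mOnDisk;
    }

    void
    add(std::string const& entry)
    {
        mBytes += entry.size();
        if (!mOnDisk && mBytes > mThreshold)
        {
            // Flip to disk-backed mode: stream everything held so far (plus
            // this entry) to the backing file and drop the in-memory copies.
            std::ofstream out(mPath);
            for (auto const& e : mMem)
                out << e << '\n';
            out << entry << '\n';
            mMem.clear();
            mOnDisk = true;
        }
        else if (mOnDisk)
        {
            std::ofstream out(mPath, std::ios::app);
            out << entry << '\n';
        }
        else
        {
            mMem.push_back(entry);
        }
    }
};
```

With this shape, the history bucket-writing code can "just flip the bucket to disk-backed mode" by forcing the spill once instead of re-serializing.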
Given a running hayashi, submitting the following command:
curl http://127.0.0.1:39132/tx\?2e3c35010749c1de3d9a5bdd6a31c12458768da5ce87cca6aad63ebbaaef7432000003e80000000100000000000003e80000000000000000000000009d7d563f1648962f08cab1b00f086bf5726cbf5413138aaa9d956b285e4b9c350000000000000000000007d0000000000000000000000064000000000000000000000001af13cb78b2acc5885b47cbb8d5b6d65dcd6c2d7e11a4ef622598eab42aeec1f9f7c5ee01effb80eac4f50560eef376d47b14d61a13302c6052542e4d825b1502
Triggers a crash:
➜ hayashi git:(master) ✗ ./bin/stellard
09/02/15 18:32:24 [default] INFO Starting stellard-hayashi 25b8c8e
09/02/15 18:32:24 [default] INFO Config from stellard.cfg
09/02/15 18:32:24 [default] INFO Application constructing (worker threads: 8)
09/02/15 18:32:24 [default] INFO Application constructed
09/02/15 18:32:24 [default] DEBUG TmpDirMaster cleaning: tmp
09/02/15 18:32:24 [default] DEBUG TmpDir deleting: tmp
09/02/15 18:32:24 [default] DEBUG TmpDir created tmp
09/02/15 18:32:24 [Overlay] DEBUG PeerDoor binding to endpoint 0.0.0.0:39133
09/02/15 18:32:24 [Overlay] DEBUG PeerDoor acceptNextPeer()
09/02/15 18:32:24 [FBA] DEBUG Node::cacheQuorumSet@41a4bc qSet: eca466
09/02/15 18:32:24 [FBA] INFO LocalNode::LocalNode@41a4bc qSet: eca466
09/02/15 18:32:24 [Herder] DEBUG Herder::recvFBAQuorumSet@41a4bc qSet: eca466
09/02/15 18:32:24 [default] INFO Listening on 127.0.0.1:39132 for HTTP requests
09/02/15 18:32:24 [default] INFO Connecting to: sqlite3://stellar.db
09/02/15 18:32:24 [default] INFO Loading last known ledger
09/02/15 18:32:24 [default] DEBUG PeerMaster tick
....
09/02/15 18:33:14 [default] DEBUG PeerMaster tick
09/02/15 18:33:16 [default] DEBUG PeerMaster tick
libc++abi.dylib: terminating with uncaught exception of type std::runtime_error: error in stellar::hexToBin(std::string)
[1] 73458 abort ./bin/stellard
All our target platforms support it and it's slightly less error-prone
We keep letting them slip through. They're serious bugs.
Ledger entries are currently identified in many places by an intrinsic tuple-based identity; for example, an offer is identified by its (owner, sequence) pair, and a trustline by its (owner, issuer, currency) triple. In other places, the code identifies ledger entries by hashing these tuples into Yet Another Hash Value called the entry's index. This is potentially confusing and unnecessary -- especially since in the case of accounts it coincides with the account key -- and in any case there's a class that handles comparing by intrinsic identity directly (LedgerKey). Remove uses of hash-based indexes.
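Comparing by intrinsic identity directly might look like the following; OfferKey here is an illustrative stand-in for LedgerKey, not the real type:

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>
#include <tuple>

// An offer is ordered by its (owner, sequence) tuple directly, so it can key
// ordered containers without ever hashing into a separate "index" value.
struct OfferKey
{
    std::string owner;
    uint32_t sequence;

    bool
    operator<(OfferKey const& other) const
    {
        return std::tie(owner, sequence) <
               std::tie(other.owner, other.sequence);
    }
};
```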
When downloading / catching up to a specific ledger, the final phase involves transaction replay of the most-recent transactions (after the state of the earliest bucket is attained). Add code to do this (reading from a history block).
Right now the peer database is updated/accessed via scattered SQL statements in the source code.
We should consolidate in one place (let's say PeerMaster that happens to define the schema).
When you run in real-time mode, if there's nothing to do aside from pending future timers, no "real work" gets done, so crank(false) -- nonblocking -- returns immediately and declines to propagate real time through to virtual time. This is explicit in the code, but (a) I'm not sure why I put the && nWorkDone != 0 criterion there in the first place, and (b) it's clearly not quite right even if its heart is in the right place, since it causes the app to stall.
Allocates inline, is exception-safe.
Given a running stellard instance, running curl http://127.0.0.1:39132/tx
will crash it.
➜ hayashi git:(master) ✗ ./bin/stellard
09/02/15 18:29:49 [default] INFO Starting stellard-hayashi 25b8c8e
09/02/15 18:29:49 [default] INFO Config from stellard.cfg
09/02/15 18:29:49 [default] INFO Application constructing (worker threads: 8)
09/02/15 18:29:49 [default] INFO Application constructed
09/02/15 18:29:49 [default] DEBUG TmpDir created tmp
09/02/15 18:29:49 [Overlay] DEBUG PeerDoor binding to endpoint 0.0.0.0:39133
09/02/15 18:29:49 [Overlay] DEBUG PeerDoor acceptNextPeer()
09/02/15 18:29:49 [FBA] DEBUG Node::cacheQuorumSet@41a4bc qSet: eca466
09/02/15 18:29:49 [FBA] INFO LocalNode::LocalNode@41a4bc qSet: eca466
09/02/15 18:29:49 [Herder] DEBUG Herder::recvFBAQuorumSet@41a4bc qSet: eca466
09/02/15 18:29:49 [default] INFO Listening on 127.0.0.1:39132 for HTTP requests
09/02/15 18:29:49 [default] INFO Connecting to: sqlite3://stellar.db
09/02/15 18:29:49 [default] INFO Loading last known ledger
09/02/15 18:29:49 [default] DEBUG PeerMaster tick
09/02/15 18:29:51 [default] DEBUG PeerMaster tick
09/02/15 18:29:53 [default] DEBUG PeerMaster tick
09/02/15 18:29:55 [default] DEBUG PeerMaster tick
09/02/15 18:29:57 [default] DEBUG PeerMaster tick
09/02/15 18:29:59 [default] DEBUG PeerMaster tick
09/02/15 18:30:01 [default] DEBUG PeerMaster tick
09/02/15 18:30:03 [default] DEBUG PeerMaster tick
09/02/15 18:30:05 [default] DEBUG PeerMaster tick
09/02/15 18:30:07 [default] DEBUG PeerMaster tick
09/02/15 18:30:09 [default] DEBUG PeerMaster tick
09/02/15 18:30:11 [default] DEBUG PeerMaster tick
libc++abi.dylib: terminating with uncaught exception of type std::out_of_range: basic_string
[1] 73413 abort ./bin/stellard
Given a hayashi node running a new ledger, then introducing a valid payment from the root account to another:
curl http://localhost:39132/tx\?blob\=2e3c35010749c1de3d9a5bdd6a31c12458768da5ce87cca6aad63ebbaaef7432000003e80000000100000000000003e80000000000000000000000009d7d563f1648962f08cab1b00f086bf5726cbf5413138aaa9d956b285e4b9c3500000000000000000bebc20000000000000000000bebc2000000000000000000000000015a406e28841e7f8d47cfb768755d70fb6ae3ff656ce9d4769b50b6a2dde52bd0a0388498a92e20b885c630846174206abc63da4ef6b30c71a585c2c267f3bc0a
This results in a new record in the Accounts table as expected, and all balances appear to be correct, but TxHistory itself is still empty.
Postgresql backend for SOCI has very limited BLOB support. Implement the missing bits.
Some tests were failing and were disabled in MonsieurNicolas@e95c6aa.
SQL calls should be centralized per class: right now writes are properly factored per entry type (i.e. "OfferFrame"), but the reads are in database.cpp.
This makes schema management more complicated than it should be.
It was broken / backed out for some reason, but it's a good check.
More often than not, we want for (auto& X : Y) or for (auto const& X : Y) instead of making a temp copy.
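The difference in a nutshell: `auto x` copies each element, so writes through x are lost, while `auto& x` aliases the element and mutates the container in place (function names here are just for illustration):

```cpp
#include <cassert>
#include <vector>

std::vector<int>
bumpByCopy(std::vector<int> ys)
{
    for (auto x : ys) // temp copy per element
        x += 10;      // modifies only the copy
    return ys;
}

std::vector<int>
bumpByRef(std::vector<int> ys)
{
    for (auto& x : ys) // reference, no copy
        x += 10;
    return ys;
}
```

For read-only traversal of non-trivial element types, `auto const&` gives the no-copy behaviour without permitting mutation.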