ethereum / py-evm

A Python implementation of the Ethereum Virtual Machine

Home Page: https://py-evm.readthedocs.io/en/latest/

License: MIT License

Python 99.22% Makefile 0.19% Shell 0.04% Solidity 0.55%
evm ethereum python ethereum-virtual-machine

py-evm's Issues

Architecture for block mining

What is wrong?

The DAO fork rules require running some custom code prior to application of any transactions on the block in which the dao fork takes place. Currently, this is implemented in a not-so-ideal way within the configure_header method. Given the intention of the configure_header method, this is not the right place for this type of logic.

How can it be fixed

Introduce the following workflow for block mining.

  • When previous block is mined, new block is seeded (fully automated within mine_block()).
  • New block is initialized (currently done in configure_header()). This would transition the block into the initialized state after which it cannot be reconfigured. Probably worth renaming to initialize_block().
  • Transactions are applied to block.
  • Block is finalized. (can be called manually to finalize a block but not mine the next one)
  • Block is mined (if not already finalized, this would be automatically done)

This would introduce the following state machine seeded > initialized > under_construction > finalized. This would also allow hooks to be put in place at the initialized and finalized steps which should provide cleaner ways to perform this type of logic. My spider sense also suggests these will come in handy later.
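
A rough sketch of how that state machine and its hooks could look. The BlockBuilder class, the hook arguments and the error messages are all hypothetical names made up for illustration; none of this is existing Py-EVM API.

```python
from enum import Enum


class BlockState(Enum):
    SEEDED = 1
    INITIALIZED = 2
    UNDER_CONSTRUCTION = 3
    FINALIZED = 4


class BlockBuilder:
    """Hypothetical helper tracking the lifecycle of a single block."""

    def __init__(self, on_initialized=None, on_finalized=None):
        self.state = BlockState.SEEDED
        self.transactions = []
        self._on_initialized = on_initialized
        self._on_finalized = on_finalized

    def initialize_block(self):
        # A block can only be initialized once.
        if self.state is not BlockState.SEEDED:
            raise ValueError("block has already been initialized")
        self.state = BlockState.INITIALIZED
        if self._on_initialized is not None:
            # Hook point, e.g. for DAO-fork style logic that must run
            # before any transaction is applied.
            self._on_initialized(self)

    def apply_transaction(self, transaction):
        allowed = (BlockState.INITIALIZED, BlockState.UNDER_CONSTRUCTION)
        if self.state not in allowed:
            raise ValueError("cannot apply transactions in state %s" % self.state.name)
        self.state = BlockState.UNDER_CONSTRUCTION
        self.transactions.append(transaction)

    def finalize(self):
        if self.state is BlockState.FINALIZED:
            return  # finalizing twice is a no-op
        if self.state is BlockState.SEEDED:
            raise ValueError("block must be initialized before it is finalized")
        self.state = BlockState.FINALIZED
        if self._on_finalized is not None:
            self._on_finalized(self)

    def mine(self):
        # Mining finalizes the block automatically if needed and seeds the next one.
        self.finalize()
        return BlockBuilder(self._on_initialized, self._on_finalized)
```

Keeping the transitions in one place means an illegal order (for example reconfiguring a block after transactions were applied) fails loudly instead of silently corrupting the header.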

Implement Spurious Dragon hard fork

Reference: https://blog.ethereum.org/2016/11/18/hard-fork-no-4-spurious-dragon/

What is wrong?

The spurious dragon hard fork contains the following rule changes:

  • EIP155: Transaction replay protection
  • EIP160: EXP cost increase
  • EIP161: State trie clearing
  • EIP170: Contract code-size limit

How can it be fixed

Pluggable ECC backends - Part 2: coincurve ECC backend.

What is wrong?

Many users will be willing to install the necessary system dependencies to gain the speed provided by the coincurve library.

How can it be fixed.

Once #24 is done, implement a coincurve based ECC backend which uses the coincurve library for faster ECC operations.

Pluggable ECC backends - Part 1

https://www.pivotaltracker.com/n/projects/2066033

What is wrong?

The library currently uses a local python implementation of the necessary ECC functions for signature creation and verification. The library coincurve provides faster implementations of these functions but at the cost of requiring a system dependency.

  • Some users will value being able to do a vanilla install without futzing with system dependencies
  • Some users will value speed and be willing to install the necessary system dependencies to get it.

How can it be fixed?

Implement a pluggable backend system for all of the ECC functionality. This should start with a default backend which only delegates the current ECC functionality.

Implementation Details

  • Since we don't have a formal configuration spec for Py-EVM yet, let's use environment variables. For this one, let's use PYEVM_ECC_BACKEND.
  • Copy the import_string function from populus: https://github.com/pipermerriam/populus/blob/master/populus/utils/module_loading.py
  • Implement a get_ecc_backend_class function which reads the PYEVM_ECC_BACKEND value from os.environ and imports the class.
  • Implement a get_ecc_backend function which returns the instantiated ECC backend class.
  • Implement a BaseECCBackend class which implements the base needed for ECC operations.
  • Implement a PurePythonECCBackend class which uses the functions found at evm.utils.ecdsa.
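
The steps above could be wired together roughly like this. It is only a sketch: import_string here is a minimal rewrite rather than the populus function, the ecdsa_sign/ecdsa_verify method names are placeholders, falling back to the pure python backend when the variable is unset is an assumption, and the delegation to evm.utils.ecdsa is left out so the example stays self-contained.

```python
import os
from importlib import import_module


def import_string(dotted_path):
    """Import 'package.module.Name' and return the Name attribute."""
    module_path, _, attr_name = dotted_path.rpartition(".")
    if not module_path:
        raise ImportError("%r is not a dotted path" % (dotted_path,))
    module = import_module(module_path)
    try:
        return getattr(module, attr_name)
    except AttributeError:
        raise ImportError(
            "module %r has no attribute %r" % (module_path, attr_name)
        )


class BaseECCBackend:
    # Placeholder method names; the real base class would list every
    # ECC operation the library needs.
    def ecdsa_sign(self, msg_hash, private_key):
        raise NotImplementedError

    def ecdsa_verify(self, msg_hash, signature, public_key):
        raise NotImplementedError


class PurePythonECCBackend(BaseECCBackend):
    # Would delegate to the functions in evm.utils.ecdsa; omitted here.
    pass


def get_ecc_backend_class():
    # Assumption: fall back to the pure python backend when the
    # environment variable is not set.
    path = os.environ.get("PYEVM_ECC_BACKEND")
    if path is None:
        return PurePythonECCBackend
    return import_string(path)


def get_ecc_backend():
    return get_ecc_backend_class()()
```

With this in place a coincurve backend would only need to subclass BaseECCBackend and be selected by pointing PYEVM_ECC_BACKEND at its dotted path.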

Add type hints for all packages

Type hints allow us to perform static type checking, among other things. We could benefit a lot from using them.

This stackoverflow answer does a great job at describing their main benefits.

pyannotate would come in handy if it had support for py3-style annotations, but meanwhile we can add them manually

monkeytype may actually be more useful than pyannotate as it generates py3-style annotations

Definition of done

  • type hints for all of evm, p2p, and trinity modules.
  • individual mypy CI runs for each of these modules using mypy --follow-imports=silent --ignore-missing-imports --check-untyped-defs --disallow-incomplete-defs -p <module>

This should not be done as a single pull request, but rather many small ones to iteratively add type hints, slowly expanding the mypy coverage as hints are added.

Write unit tests for the `Message` object

What is wrong?

There are no unit tests for the Message object

How can it be fixed

Write unit tests that verify that it:

  • all __init__ parameters are properly validated.
  • v.is_origin correctly returns True/False for whether this message is the origin message meaning that it's the outermost VM message being executed.
  • v.origin property returns correct value.
  • v.code_address returns correct value when to address is set to CREATE_CONTRACT_ADDRESS.
  • v.storage_address returns correct value when message is performing contract creation.
  • v.is_create returns True when message is for contract creation.

Modify *trace* logging format to conform to spec

What is wrong?

There is a spec of some form for the data generated by tracing a transaction execution. Py-EVM needs to conform to it.

How can it be fixed

Talk with @cdetrio and figure out where this format is documented if anywhere and then make the appropriate changes.

Documentation

We should expand the documentation to cover the following topics.

The Chain class.

  • High level description of its purpose/role/responsibilities
  • Brief explanation of how it wraps VM classes.
  • List of public methods (they don't need to be individually documented yet)

The BaseChainDB class

  • High level overview of its purpose/role/responsibilities
    • It's like an ORM for the Chain class.
  • List of public methods (they don't need to be individually documented yet)

The VM class.

  • High level overview of its purpose/role/responsibilities
  • Basic explanation of the hierarchy of apply_transaction > execute_transaction > apply_message > apply_computation.
  • List of methods and properties that can be set with simple descriptions of what each is used for

The EVM Computation API

  • Enumerate all of the APIs that are accessible from the Computation object and the nested GasMeter, Stack, Memory, Message, and CodeStream APIs.
  • List of public methods and properties for these objects

Developers Guide

  • Chains go in evm.chains.<name>
  • Fork rules go in evm.vm.forks.<name>

Implement binary search for `RoutingTable.get_bucket_for_node`

At the time of creation, this code only exists in the branch gsalgado:p2p under @gsalgado's fork.

What is wrong?

The current implementation of evm.p2p.kademlia.RoutingTable.get_bucket_for_node runs in linear time.

How can it be fixed

This can be fixed by doing two things:

  1. Implementing ordering for the Bucket class.
  2. Using the bisect module to look up the appropriate bucket using a binary search.

In order to make the Bucket class orderable, you should use the functools.total_ordering class decorator. Buckets should be sorted by Bucket.end.

Once the Bucket class can be ordered, you should use the bisect module to perform a binary search on the bucket list to find the appropriate bucket for the given node.

The criteria for a bucket/node match are: Bucket.start <= node.id <= Bucket.end

How to test it

This functionality should be written as a standalone function which takes a list of buckets and the node and returns the bucket. If no bucket is found for the given node then a ValueError should be raised.

Test cases you should ensure are covered are:

  • empty list of buckets
  • single bucket
  • two buckets
  • many buckets
  • @gsalgado other things?
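
For illustration, a minimal sketch of the standalone function described above. The Bucket and Node classes here are simplified stand-ins (the real ones live in the p2p branch), and the bucket list is assumed to be sorted by Bucket.end already. A probe Bucket is used for the search so the code does not depend on the key argument that bisect only gained in newer Python versions.

```python
import bisect
import functools
from collections import namedtuple

# Stand-in for the real node type; only the id attribute matters here.
Node = namedtuple("Node", ["id"])


@functools.total_ordering
class Bucket:
    def __init__(self, start, end):
        self.start = start
        self.end = end

    # Buckets are ordered (and compared) by their end only.
    def __eq__(self, other):
        if not isinstance(other, Bucket):
            return NotImplemented
        return self.end == other.end

    def __lt__(self, other):
        if not isinstance(other, Bucket):
            return NotImplemented
        return self.end < other.end


def get_bucket_for_node(buckets, node):
    """Binary search `buckets` (sorted by end) for the one covering node.id."""
    probe = Bucket(node.id, node.id)
    # First bucket whose end is >= node.id.
    index = bisect.bisect_left(buckets, probe)
    if index < len(buckets) and buckets[index].start <= node.id:
        return buckets[index]
    raise ValueError("No bucket found for node with id %d" % node.id)
```

The start check after the bisect is what catches both an empty list and a node id that falls into a gap between two buckets.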

Some BaseChainDB methods may leave the database in an inconsistent state.

For instance, persist_header_to_db() stores the header in the DB, calculates the total difficulty for that block and then stores that in the DB as well, but it can crash when calculating the difficulty and in that case we'd end up with the header stored but without its total difficulty.

There are a few ways to address that:

  1. Use a database that supports ACID transactions
  2. Batch DB updates and apply them atomically
  3. Deal with it manually, with a bare except that rolls back updates

Number 1) is problematic because LevelDB does not support transactions, 3) is error prone and tedious, so I think 2) is our best option, but it means we need to add support for that to MemoryDB as well.

Write unit test for the `Computation` object

What is wrong?

There are no unit tests for the evm.vm.computation.Computation object

How can it be fixed

Write unit tests to verify that it:

  • Basic smoke tests for prepare_child_message. Check depth is increased, origin is preserved, etc.
  • expand_memory extension correctly validates parameters and charges appropriate gas costs.
  • c.output property only returns a value if the computation didn't error.
  • register_account_for_deletion fails if an account is registered twice.
  • get_accounts_for_deletion() returns empty if the computation errors.
  • add_log_entry performs validation and adds the log entry.
  • get_log_entries returns empty when computation has errored.
  • get_gas_refund returns 0 when computation has errored.
  • get_gas_remaining returns 0 when computation has errored.
  • Use as a context manager appropriately catches and consumes VM errors raised inside of the context, resulting in an errored computation object.
  • get_gas_used returns max gas when computation has errored.

Unit tests for the `Memory` object

What is wrong?

There are no unit tests for the functionality of the evm.vm.memory.Memory object

How can it be fixed

Write unit tests to verify that it:

  • write(..) function parameters are validated:
    • start_position and size are valid 256 bit unsigned integers.
    • value is of type bytes
    • The length of value is the same as size (this may actually not be required by the EVM).
    • Invalid to write a value that would extend beyond the end of the total memory size.
  • extend appropriately extends the memory, only padding with zeros when the size increases.
  • read pulls the correct bytes from memory.

ERROR collecting tests/json-fixtures/test_vm.py

  • py-evm Version: 0.2.0-alpha.1
  • OS: osx
  • Python Version (python --version): 3.5.4
  • Environment (output of pip freeze):
apipkg==1.4
execnet==1.4.1
flake8==3.3.0
mccabe==0.6.1
pluggy==0.5.1
py==1.4.34
pycodestyle==2.3.1
pyflakes==1.5.0
pytest==3.1.2
pytest-xdist==1.18.1
tox==2.7.0
virtualenv==15.1.0

What is wrong?

>>> py.test
============================================================ test session starts ============================================================
platform darwin -- Python 3.5.2, pytest-3.2.0, py-1.4.34, pluggy-0.4.0 -- /Users/davidjuin0519/anaconda3/bin/python
cachedir: .cache
rootdir: /Users/davidjuin0519/projects/py-evm, inifile: pytest.ini
collected 21439 items / 1 errors / 1 skipped

================================================================== ERRORS ===================================================================
______________________________________________ ERROR collecting tests/json-fixtures/test_vm.py ______________________________________________
import file mismatch:
imported module 'test_vm' has this __file__ attribute:
  /Users/davidjuin0519/projects/py-evm/tests/core/vm/test_vm.py
which is not the same as the test file we want to collect:
  /Users/davidjuin0519/projects/py-evm/tests/json-fixtures/test_vm.py
HINT: remove __pycache__ / .pyc files and/or use a unique basename for your test file modules
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! Interrupted: 1 errors during collection !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
==================================================== 1 skipped, 1 error in 59.51 seconds ====================================================

The test could run after removing file py-evm/tests/json-fixtures/test_vm.py but many test cases from tests/json-fixtures/test_blockchain.py fail. Not sure if this is an expected result.

Add `Computation.is_success` and `Computation.is_error` API

What is wrong?

There are many places in the codebase that use the check if computation.error to check for whether a given round of computation resulted in an error. This would be better implemented as a dedicated API so that future changes to the EVM semantics on what constitutes an error are not directly tied to the presence, absence, or truthiness of what lives at computation.error.

How can it be fixed

Add these new APIs to the Computation object:

  • Computation.is_success
  • Computation.is_error

is_success should be a computed property that returns True if computation.error is None. Conversely, is_error should be a computed property that returns the opposite of whatever is_success returns.
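
Something along these lines, shown on a stripped-down stand-in for the real Computation class (only the error attribute is modelled here):

```python
class Computation:
    """Minimal stand-in; the real class carries far more state."""

    def __init__(self):
        self.error = None

    @property
    def is_success(self):
        # Success is defined purely by the absence of an error.
        return self.error is None

    @property
    def is_error(self):
        # Always the opposite of is_success, so the two cannot drift apart.
        return not self.is_success
```

Call sites would then read if computation.is_error: rather than testing computation.error directly.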

Generic P2P tests as JSON documents

What is wrong?

It would be nice to have a test suite for our p2p code that was generic in some of the ways that the JSON-fixture tests are for the vm.

How can it be done

It seems like we should be able to come up with a JSON document which defines the setup of the network, and then a sequence of operations to be performed by the nodes on the network, and then an end state that we expect the network to resolve to.

This is tangentially related to a need for a generic simulation framework which I think the beginnings of already exist. However, I think we need to start zeroing in on how we make some of these things generic so that we can work towards a generic simulation runner that can run any of our defined network conditions. This pattern seems common in some of the other codebases and if done well, it should pay off in lower testing overhead going forward.

I'm currently not familiar enough with the innards to know exactly what this should look like but hoping to dig in soon.

PyPy investigation notes

I was curious how fast py-evm would run in pypy, and it sent me down a rabbit hole. This issue is just a way to note what I learned along the way.

  • pypy won't work out of the box because of the dependency on pysha3 (via eth-bloom), which fails to compile under pypy
  • I ran eth-bloom tests as a rough speed proxy with several different configurations:
    • pysha3: 0.5s
    • pure python implementation in cpython: 25s
    • pure python implementation in pypy: 9.5s
    • pure python implementation in cython: 20.5s
    • after a couple of type annotations in cython: 16.5s
    • cython in pypy: crash at runtime with TypeError: object.__new__(builtin_function_or_method) is not safe, use builtin_function_or_method.__new__()
  • Optimizing sha3 in cython doesn't matter if it doesn't work under pypy (because we might as well keep the c extension version). Although I do expect that it would be possible to make it work under pypy with some tweaks.
  • Punting on pypy and working on a cython trie implementation (which was Piper's first instinct) might be the most fruitful next step

Add new gas and return data API to Computation object.

What is wrong?

With the introduction of the REVERT opcode in the byzantium fork rules we can no longer assume that the presence of computation.error means that gas has been fully depleted or that there is no return data.

This has resulted in the following lines of code being repeated in various places.

Some of this logic repetition would be better served by a dedicated API on the Computation object.

How can it be fixed

Add the following two new APIs to the Computation object.

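# Methods to add to the Computation class.
def should_burn_gas(self):
    # True only when the computation errored and the error consumes all gas.
    return bool(self.error and self.error.burns_gas)

def should_erase_return_data(self):
    # True only when the computation errored and the error clears the return data.
    return bool(self.error and self.error.zeros_return_data)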

If you would also like to clean up the slightly confusing name zeros_return_data and change it to erases_return_data, go for it.

Implement Journaling DB wrapper

What is wrong?

The current snapshot mechanism for rolling transactions back won't scale. We can't expect to take a snapshot at the beginning of every transaction in real situations.

How can it be fixed

We need the ability to do fast rollbacks when transactions fail. I believe that journaling is the best way to accomplish this. Looking through the go-ethereum codebase, they store a list of actions that have occurred on the state database with sufficient information to unroll all of those changes. It seems reasonable to copy this approach.

Alternatively we could do what pyethereum does and keep all changes in a local cache and manually commit them when the transaction succeeds.

I'm inclined to try the journaling approach first.
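
To make the idea concrete, here is a toy version of the journaling approach. It assumes a plain dict as the backing store, and the JournaledDB name and its snapshot/revert methods are hypothetical; this is neither the go-ethereum design nor an existing Py-EVM class.

```python
_MISSING = object()  # marks "key did not exist before this write"


class JournaledDB:
    """Hypothetical wrapper that records how to undo every write."""

    def __init__(self, db):
        self.db = db          # assumed to be a plain dict
        self.journal = []     # list of (key, previous_value) entries

    def __getitem__(self, key):
        return self.db[key]

    def __contains__(self, key):
        return key in self.db

    def __setitem__(self, key, value):
        # Remember the old value (or its absence) before overwriting.
        self.journal.append((key, self.db.get(key, _MISSING)))
        self.db[key] = value

    def __delitem__(self, key):
        # Raises KeyError for unknown keys before anything is journaled.
        self.journal.append((key, self.db[key]))
        del self.db[key]

    def snapshot(self):
        # A checkpoint is just the current journal length, so it is O(1).
        return len(self.journal)

    def revert(self, checkpoint):
        # Undo the journal entries in reverse order back to the checkpoint.
        while len(self.journal) > checkpoint:
            key, previous = self.journal.pop()
            if previous is _MISSING:
                self.db.pop(key, None)
            else:
                self.db[key] = previous
```

Taking a checkpoint costs nothing and a rollback only touches the keys that were actually written, which is the property the snapshot-per-transaction approach lacks.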

Slow byzantium tests

What is wrong?

Here are the slowest 50 byzantium tests.

The slowest 20 of those combine for 30 minutes worth of test running. Currently the whole suite takes about 45 minutes to run which is right at the threshold for travis-ci's maximum runtime for a single test run.

300.04s call     tests/json-fixtures/test_state.py::test_state_fixtures[/Users/piper/sites/py-evm/fixtures/GeneralStateTests/stStaticCall/static_Call50000_sha256.json:static_Call50000_sha256:Byzantium:0]
208.33s call     tests/json-fixtures/test_state.py::test_state_fixtures[/Users/piper/sites/py-evm/fixtures/GeneralStateTests/stStaticCall/static_Call50000_rip160.json:static_Call50000_rip160:Byzantium:0]
166.29s call     tests/json-fixtures/test_state.py::test_state_fixtures[/Users/piper/sites/py-evm/fixtures/GeneralStateTests/stStaticCall/static_Call50000_sha256.json:static_Call50000_sha256:Byzantium:1]
155.48s call     tests/json-fixtures/test_state.py::test_state_fixtures[/Users/piper/sites/py-evm/fixtures/GeneralStateTests/stStaticCall/static_Call50000.json:static_Call50000:Byzantium:1]
118.80s call     tests/json-fixtures/test_state.py::test_state_fixtures[/Users/piper/sites/py-evm/fixtures/GeneralStateTests/stStaticCall/static_Call50000_ecrec.json:static_Call50000_ecrec:Byzantium:1]
115.58s call     tests/json-fixtures/test_state.py::test_state_fixtures[/Users/piper/sites/py-evm/fixtures/GeneralStateTests/stStaticCall/static_Call50000_rip160.json:static_Call50000_rip160:Byzantium:1]
102.82s call     tests/json-fixtures/test_state.py::test_state_fixtures[/Users/piper/sites/py-evm/fixtures/GeneralStateTests/stStaticCall/static_LoopCallsThenRevert.json:static_LoopCallsThenRevert:Byzantium:0]
89.42s call     tests/json-fixtures/test_state.py::test_state_fixtures[/Users/piper/sites/py-evm/fixtures/GeneralStateTests/stStaticCall/static_Call50000_identity2.json:static_Call50000_identity2:Byzantium:1]
86.87s call     tests/json-fixtures/test_state.py::test_state_fixtures[/Users/piper/sites/py-evm/fixtures/GeneralStateTests/stStaticCall/static_Call50000_identity.json:static_Call50000_identity:Byzantium:1]
86.83s call     tests/json-fixtures/test_state.py::test_state_fixtures[/Users/piper/sites/py-evm/fixtures/GeneralStateTests/stStaticCall/static_Return50000_2.json:static_Return50000_2:Byzantium:0]
86.67s call     tests/json-fixtures/test_state.py::test_state_fixtures[/Users/piper/sites/py-evm/fixtures/GeneralStateTests/stCallCreateCallCodeTest/Call1024PreCalls.json:Call1024PreCalls:Byzantium:0]
85.83s call     tests/json-fixtures/test_state.py::test_state_fixtures[/Users/piper/sites/py-evm/fixtures/GeneralStateTests/stChangedEIP150/Call1024PreCalls.json:Call1024PreCalls:Byzantium:0]
84.56s call     tests/json-fixtures/test_state.py::test_state_fixtures[/Users/piper/sites/py-evm/fixtures/GeneralStateTests/stDelegatecallTestHomestead/Call1024PreCalls.json:Call1024PreCalls:Byzantium:0]
80.78s call     tests/json-fixtures/test_state.py::test_state_fixtures[/Users/piper/sites/py-evm/fixtures/GeneralStateTests/stStaticCall/static_Call50000.json:static_Call50000:Byzantium:0]
64.97s call     tests/json-fixtures/test_state.py::test_state_fixtures[/Users/piper/sites/py-evm/fixtures/GeneralStateTests/stStaticCall/static_Call50000_ecrec.json:static_Call50000_ecrec:Byzantium:0]
59.34s call     tests/json-fixtures/test_state.py::test_state_fixtures[/Users/piper/sites/py-evm/fixtures/GeneralStateTests/stStaticCall/static_Call1024PreCalls2.json:static_Call1024PreCalls2:Byzantium:0]
53.58s call     tests/json-fixtures/test_state.py::test_state_fixtures[/Users/piper/sites/py-evm/fixtures/GeneralStateTests/stStaticCall/static_Call50000_identity.json:static_Call50000_identity:Byzantium:0]
52.65s call     tests/json-fixtures/test_state.py::test_state_fixtures[/Users/piper/sites/py-evm/fixtures/GeneralStateTests/stStaticCall/static_Call50000_identity2.json:static_Call50000_identity2:Byzantium:0]
48.52s call     tests/json-fixtures/test_state.py::test_state_fixtures[/Users/piper/sites/py-evm/fixtures/GeneralStateTests/stStaticCall/static_LoopCallsThenRevert.json:static_LoopCallsThenRevert:Byzantium:1]
42.72s call     tests/json-fixtures/test_state.py::test_state_fixtures[/Users/piper/sites/py-evm/fixtures/GeneralStateTests/stCallCreateCallCodeTest/Call1024BalanceTooLow.json:Call1024BalanceTooLow:Byzantium:0]
42.23s call     tests/json-fixtures/test_state.py::test_state_fixtures[/Users/piper/sites/py-evm/fixtures/GeneralStateTests/stChangedEIP150/Call1024BalanceTooLow.json:Call1024BalanceTooLow:Byzantium:0]
40.17s call     tests/json-fixtures/test_state.py::test_state_fixtures[/Users/piper/sites/py-evm/fixtures/GeneralStateTests/stCallCreateCallCodeTest/Callcode1024BalanceTooLow.json:Callcode1024BalanceTooLow:Byzantium:0]
40.06s call     tests/json-fixtures/test_state.py::test_state_fixtures[/Users/piper/sites/py-evm/fixtures/GeneralStateTests/stChangedEIP150/Callcode1024BalanceTooLow.json:Callcode1024BalanceTooLow:Byzantium:0]
34.27s call     tests/json-fixtures/test_state.py::test_state_fixtures[/Users/piper/sites/py-evm/fixtures/GeneralStateTests/stSystemOperationsTest/CallRecursiveBomb0_OOG_atMaxCallDepth.json:CallRecursiveBomb0_OOG_atMaxCallDepth:Byzantium:0]
32.15s call     tests/json-fixtures/test_state.py::test_state_fixtures[/Users/piper/sites/py-evm/fixtures/GeneralStateTests/stRevertTest/LoopCallsDepthThenRevert2.json:LoopCallsDepthThenRevert2:Byzantium:0]
31.26s call     tests/json-fixtures/test_state.py::test_state_fixtures[/Users/piper/sites/py-evm/fixtures/GeneralStateTests/stRevertTest/LoopCallsDepthThenRevert3.json:LoopCallsDepthThenRevert3:Byzantium:0]
28.12s call     tests/json-fixtures/test_state.py::test_state_fixtures[/Users/piper/sites/py-evm/fixtures/GeneralStateTests/stDelegatecallTestHomestead/CallRecursiveBombPreCall.json:CallRecursiveBombPreCall:Byzantium:0]
27.40s call     tests/json-fixtures/test_state.py::test_state_fixtures[/Users/piper/sites/py-evm/fixtures/GeneralStateTests/stRevertTest/LoopCallsThenRevert.json:LoopCallsThenRevert:Byzantium:0]
26.93s call     tests/json-fixtures/test_state.py::test_state_fixtures[/Users/piper/sites/py-evm/fixtures/GeneralStateTests/stCallCreateCallCodeTest/CallRecursiveBombPreCall.json:CallRecursiveBombPreCall:Byzantium:0]
26.11s call     tests/json-fixtures/test_state.py::test_state_fixtures[/Users/piper/sites/py-evm/fixtures/GeneralStateTests/stStaticCall/static_Call50000bytesContract50_1.json:static_Call50000bytesContract50_1:Byzantium:1]
25.48s call     tests/json-fixtures/test_state.py::test_state_fixtures[/Users/piper/sites/py-evm/fixtures/GeneralStateTests/stStaticCall/static_Call1024PreCalls.json:static_Call1024PreCalls:Byzantium:1]
24.69s call     tests/json-fixtures/test_state.py::test_state_fixtures[/Users/piper/sites/py-evm/fixtures/GeneralStateTests/stDelegatecallTestHomestead/Call1024BalanceTooLow.json:Call1024BalanceTooLow:Byzantium:0]
24.64s call     tests/json-fixtures/test_state.py::test_state_fixtures[/Users/piper/sites/py-evm/fixtures/GeneralStateTests/stDelegatecallTestHomestead/Delegatecall1024.json:Delegatecall1024:Byzantium:0]
24.42s call     tests/json-fixtures/test_state.py::test_state_fixtures[/Users/piper/sites/py-evm/fixtures/GeneralStateTests/stRevertTest/LoopCallsThenRevert.json:LoopCallsThenRevert:Byzantium:1]
23.19s call     tests/json-fixtures/test_state.py::test_state_fixtures[/Users/piper/sites/py-evm/fixtures/GeneralStateTests/stStaticCall/static_Call50000bytesContract50_2.json:static_Call50000bytesContract50_2:Byzantium:1]
21.41s call     tests/json-fixtures/test_state.py::test_state_fixtures[/Users/piper/sites/py-evm/fixtures/GeneralStateTests/stStaticCall/static_Call1024PreCalls2.json:static_Call1024PreCalls2:Byzantium:1]
14.22s call     tests/json-fixtures/test_state.py::test_state_fixtures[/Users/piper/sites/py-evm/fixtures/GeneralStateTests/stRandom/randomStatetest636.json:randomStatetest636:Byzantium:0]
14.15s call     tests/json-fixtures/test_state.py::test_state_fixtures[/Users/piper/sites/py-evm/fixtures/GeneralStateTests/stStaticCall/static_Call1024PreCalls3.json:static_Call1024PreCalls3:Byzantium:1]
12.91s call     tests/json-fixtures/test_state.py::test_state_fixtures[/Users/piper/sites/py-evm/fixtures/GeneralStateTests/stRandom/randomStatetest467.json:randomStatetest467:Byzantium:0]
11.91s call     tests/json-fixtures/test_state.py::test_state_fixtures[/Users/piper/sites/py-evm/fixtures/GeneralStateTests/stRandom/randomStatetest458.json:randomStatetest458:Byzantium:0]
11.90s call     tests/json-fixtures/test_state.py::test_state_fixtures[/Users/piper/sites/py-evm/fixtures/GeneralStateTests/stRandom/randomStatetest150.json:randomStatetest150:Byzantium:0]
11.75s call     tests/json-fixtures/test_state.py::test_state_fixtures[/Users/piper/sites/py-evm/fixtures/GeneralStateTests/stRandom/randomStatetest639.json:randomStatetest639:Byzantium:0]
11.68s call     tests/json-fixtures/test_state.py::test_state_fixtures[/Users/piper/sites/py-evm/fixtures/GeneralStateTests/stStaticCall/static_LoopCallsDepthThenRevert2.json:static_LoopCallsDepthThenRevert2:Byzantium:0]
11.25s call     tests/json-fixtures/test_state.py::test_state_fixtures[/Users/piper/sites/py-evm/fixtures/GeneralStateTests/stRandom/randomStatetest154.json:randomStatetest154:Byzantium:0]
11.23s call     tests/json-fixtures/test_state.py::test_state_fixtures[/Users/piper/sites/py-evm/fixtures/GeneralStateTests/stRecursiveCreate/recursiveCreateReturnValue.json:recursiveCreateReturnValue:Byzantium:0]
11.10s call     tests/json-fixtures/test_state.py::test_state_fixtures[/Users/piper/sites/py-evm/fixtures/GeneralStateTests/stStaticCall/static_LoopCallsDepthThenRevert3.json:static_LoopCallsDepthThenRevert3:Byzantium:0]
10.98s call     tests/json-fixtures/test_state.py::test_state_fixtures[/Users/piper/sites/py-evm/fixtures/GeneralStateTests/stSystemOperationsTest/ABAcalls1.json:ABAcalls1:Byzantium:0]
10.42s call     tests/json-fixtures/test_state.py::test_state_fixtures[/Users/piper/sites/py-evm/fixtures/GeneralStateTests/stSpecialTest/failed_tx_xcf416c53.json:failed_tx_xcf416c53:Byzantium:0]
10.34s call     tests/json-fixtures/test_state.py::test_state_fixtures[/Users/piper/sites/py-evm/fixtures/GeneralStateTests/stRandom/randomStatetest159.json:randomStatetest159:Byzantium:0]
10.25s call     tests/json-fixtures/test_state.py::test_state_fixtures[/Users/piper/sites/py-evm/fixtures/GeneralStateTests/stRandom/randomStatetest554.json:randomStatetest554:Byzantium:0]

How can it be fixed

I'm thinking that we split the byzantium tests into two runs, one with the 20 slowest tests and then another with all the rest.

Need a Chain that syncs only block headers

The existing Chain implementation assumes a synced database with all blocks/headers/state is available, but for the light client we'll need one that syncs only block headers and fetches any extra data (blocks/receipts/contract-code/proofs) on demand.

Create a new BatchDB class for atomic database operations.

What is wrong?

There are a few cases where we need to make multiple database updates and have them either all succeed or all fail.

How can it be fixed

Implement a new class BatchDB in a new module evm/db/batch.py

This class should serve as a wrapper in much the same way as the current JournalDB found in the module evm/db/journal.py.

This class should be usable as a context manager similar to the following:

with BatchDB(db) as batch_db:
    batch_db[b'a'] = b'arst'  # doesn't write the key to the database but rather to a local cache.
    assert batch_db[b'a'] == b'arst'  # intelligently returns the value from the cache
    batch_db[b'b'] = b'tsra'  
# upon exiting the context, it writes all of the keys from the cache into the underlying database.

Or via an alternate API that requires manually committing the batch.

batch_db = BatchDB(db)
batch_db[b'a'] = b'arst'  # doesn't write the key to the database but rather to a local cache.
assert batch_db[b'a'] == b'arst'  # intelligently returns the value from the cache
batch_db[b'b'] = b'tsra'  
batch_db.commit()

If an exception is raised before committing (or leaving the context when used as a context manager) none of the changes should be committed to the database.
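
A minimal sketch of a class with that behaviour, assuming the wrapped database supports item access like a dict. Deletes are not handled, and the final write loop is only as atomic as the underlying database makes it, so a real backend would probably want to use its native batch write there.

```python
class BatchDB:
    def __init__(self, wrapped_db):
        self.wrapped_db = wrapped_db
        self.cache = {}  # pending writes, not yet in the wrapped database

    def __getitem__(self, key):
        # Pending writes win over whatever the wrapped database holds.
        if key in self.cache:
            return self.cache[key]
        return self.wrapped_db[key]

    def __setitem__(self, key, value):
        self.cache[key] = value

    def __contains__(self, key):
        return key in self.cache or key in self.wrapped_db

    def commit(self):
        # Flush every pending write, then start with an empty cache.
        for key, value in self.cache.items():
            self.wrapped_db[key] = value
        self.clear()

    def clear(self):
        self.cache = {}

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        if exc_type is None:
            self.commit()
        else:
            self.clear()  # drop pending writes, nothing reaches the database
        return False  # never swallow the exception
```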

Setup documentation: Part 2

Once #7 is done, the documentation can be stubbed out using the following basic structure.

  • Index
    • High level description of the library. Links to github repository, gitter chat channel, table of contents for the remainder of the documentation.
  • Getting Started
    • Installation: using pip, from source
    • QuickStart:
  • The Mainnet EVM
    • EVM Basics:
    • Testing:
  • Creating or extending an EVM
    • opcodes: how to write an opcode, the computation object and API
    • vms: how to compose opcodes into a vm, all of the method hooks that can be overridden
    • evms: how to compose vms into an evm.
  • API
    • EVM
    • VM
    • Computation class / Gas Meter class / CodeStream class / Memory class / Message class / Stack class
    • Opcode class
  • Development
    • How to setup the codebase and run the tests.
    • Style Guide

A minimally stubbed out version of these topics in roughly this hierarchy would get the documentation in a great starting position for fleshing it out.

Database Backends

What is wrong?

Currently, PyEVM is using the MemoryDB from py-trie for its database.

  1. We need to use something "local" to the project since this is not a dependency we want to have.
  2. We can't go forever using an in-memory database.

How can it be fixed

This can be done in multiple steps, but the general plan should be:

  1. Define the API for a database backend. In general I think the MemoryDB class from the py-trie library is a good starting point. This should go in evm.db.backends.base
  2. Create a local MemoryDB in PyEVM at evm.db.backends.memory and replace all uses of the py-trie version with the one local to the project.
  3. Create a new backend that is backed by something like RocksDB or LevelDB
  4. If it seems useful/helpful create an API for getting these backends in much the same way as the ECC backend system works.
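
As a rough illustration of steps 1 and 2, a backend API and an in-memory implementation might look like the following. The method names (get, set, exists, delete) are an assumption about what the interface could contain, not a confirmed API.

```python
from abc import ABC, abstractmethod


class BaseDB(ABC):
    """Hypothetical interface every database backend would implement."""

    @abstractmethod
    def get(self, key):
        """Return the value for key, raising KeyError if it is missing."""

    @abstractmethod
    def set(self, key, value):
        """Store value under key."""

    @abstractmethod
    def exists(self, key):
        """Return True if key is present."""

    @abstractmethod
    def delete(self, key):
        """Remove key, raising KeyError if it is missing."""

    # Dict-style access is defined once, on top of the four primitives.
    def __getitem__(self, key):
        return self.get(key)

    def __setitem__(self, key, value):
        self.set(key, value)

    def __contains__(self, key):
        return self.exists(key)

    def __delitem__(self, key):
        self.delete(key)


class MemoryDB(BaseDB):
    """Backend that keeps everything in a plain dict."""

    def __init__(self):
        self.kv_store = {}

    def get(self, key):
        return self.kv_store[key]

    def set(self, key, value):
        self.kv_store[key] = value

    def exists(self, key):
        return key in self.kv_store

    def delete(self, key):
        del self.kv_store[key]
```

A LevelDB- or RocksDB-backed class would then only need to implement the same four primitives.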


Passing public/private keys around in raw format is problematic

What is wrong?

We currently pass public/private keys around in raw form (i.e. bytes), and in many places we have to validate them (i.e. check their type/length).

Also, the public keys we pass around do not include the '\x04' prefix (making them 64 bytes long), since they are used to generate Ethereum addresses or P2P Node IDs, and those must not include it. However, most libraries (e.g. coincurve) expect public keys to include the '\x04' prefix (making them 65 bytes long).

How can it be fixed

We could have custom types for Public and Private keys to centralise validation and abstract away the differences between what we must use when generating addresses/node-ids and what the libraries expect
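
One possible shape for such types is sketched below. The class and method names are hypothetical. The point is only that validation happens once, in the constructor, and that each representation has an explicit accessor.

```python
class PrivateKey:
    """Hypothetical wrapper around a raw 32-byte private key."""

    def __init__(self, raw):
        if not isinstance(raw, bytes):
            raise TypeError("private key must be bytes")
        if len(raw) != 32:
            raise ValueError("private key must be 32 bytes long")
        self._raw = raw

    def to_bytes(self):
        return self._raw


class PublicKey:
    """Hypothetical wrapper that stores the 64-byte form without the prefix."""

    UNCOMPRESSED_PREFIX = b'\x04'

    def __init__(self, raw):
        if not isinstance(raw, bytes):
            raise TypeError("public key must be bytes")
        if len(raw) != 64:
            raise ValueError("public key must be 64 bytes long (no prefix)")
        self._raw = raw

    @classmethod
    def from_uncompressed_bytes(cls, data):
        # Accept the 65-byte form that most libraries produce.
        if len(data) != 65 or data[:1] != cls.UNCOMPRESSED_PREFIX:
            raise ValueError("expected 65 bytes starting with 0x04")
        return cls(data[1:])

    def to_bytes(self):
        # 64-byte form, used for addresses and node IDs.
        return self._raw

    def to_uncompressed_bytes(self):
        # 65-byte form, as expected by most ECC libraries.
        return self.UNCOMPRESSED_PREFIX + self._raw
```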

Fix `EVM.header` mutation and architecture.

This issue is seeded from this comment: https://github.com/pipermerriam/py-evm/pull/21/files#r124833739

What is wrong

Currently, EVM.header is used to anchor the chain to the current HEAD state.

Mutation of this value occurs in the following places:

  • EVM.apply_transaction()
  • EVM.import_block()
  • EVM.mine_block()
  • EVM.configure_header()

This mutation introduces the potential for us to have two separate actions modify the EVM.header property and to end up either overwriting something we should not, or just to have some sort of inconsistency. This is not ideal.

How can it be fixed.

Not clear at this time but this is something that should likely be addressed because it has the potential to be very problematic down the road.

Unit tests for the `GasMeter` object

What is wrong?

No unit tests for the evm.vm.gas_meter.GasMeter object.

How can it be fixed

Write some unit tests to validate:

  • It correctly validates the start_gas parameter during instantiation.
  • It doesn't allow negative values in consume_gas(), return_gas() and refund_gas()
  • It throws an OutOfGas exception if consume_gas() is called with an amount larger than the remaining gas.
  • It tracks consumption, returning, and refunding gas correctly (this can be a simple procedural test)
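
The tests listed above could start out roughly as follows. The GasMeter below is a simplified stand-in written for this sketch so the example runs on its own. The real evm.vm.gas_meter.GasMeter may use different signatures, attribute names, and exception types.

```python
class OutOfGas(Exception):
    pass


class GasMeter:
    # Stand-in for evm.vm.gas_meter.GasMeter; the real class may differ.
    def __init__(self, start_gas):
        if not isinstance(start_gas, int) or start_gas < 0:
            raise ValueError("start_gas must be a non-negative integer")
        self.gas_remaining = start_gas
        self.gas_refunded = 0

    @staticmethod
    def _validate(amount):
        if amount < 0:
            raise ValueError("gas amounts must not be negative")

    def consume_gas(self, amount):
        self._validate(amount)
        if amount > self.gas_remaining:
            raise OutOfGas("not enough gas remaining")
        self.gas_remaining -= amount

    def return_gas(self, amount):
        self._validate(amount)
        self.gas_remaining += amount

    def refund_gas(self, amount):
        self._validate(amount)
        self.gas_refunded += amount


def test_gas_meter_tracks_gas():
    # Simple procedural test: consume, return, then refund.
    meter = GasMeter(100)
    meter.consume_gas(60)
    meter.return_gas(10)
    meter.refund_gas(5)
    assert meter.gas_remaining == 50
    assert meter.gas_refunded == 5


def test_consuming_more_than_remaining_raises():
    meter = GasMeter(10)
    try:
        meter.consume_gas(11)
    except OutOfGas:
        return
    raise AssertionError("expected OutOfGas")
```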

Unit tests for the `CodeStream` object

What is wrong?

There are no tests for the evm.vm.code_stream.CodeStream object.

How can it be fixed

Write unit tests that test:

  • Ensure the object only accepts bytes types during instantiation.
  • v.peek() correctly returns the next opcode without changing the location in the code stream.
  • v.next() and next(v) return the correct next opcode.
  • When the end of the bytecode is reached, it returns the STOP opcode.
  • Use of the context manager functionality with v.seek(position) correctly reverts to the original stream position when the context exits.
  • v.is_valid_opcode() correctly determines that bytes after a PUSHXX operation are not valid.

Unit tests for Chain methods

We have no unit tests for most Chain methods; it'd be good to have some at least for the methods that deal with block retrieval/import/validation

`CodeStream.is_valid_opcode()` function likely exploitable

What is wrong?

In order to validate that a JUMPDEST is valid, we have to ensure that it isn't within the data bytes for a PUSHXX instruction. The current implementation does this with the following algorithm.

  1. Check if the JUMPDEST is in the data bytes for any PUSHXX instruction within the previous 32 bytes.
  2. If not, it is valid.
  3. If so, apply the same algorithm to the PUSHXX instruction to be sure it is not in the data bytes for another PUSHXX instruction.

The worst case for this is bytecode that looks like this:

PUSH1, PUSH1, PUSH1, PUSH1, ...... , PUSH1, JUMPDEST

In this case, in order to validate the last instruction, it will have to step backwards through every single byte. The current implementation does this recursively, which could, with a sufficiently large piece of bytecode, exceed the maximum recursion depth.

How can it be fixed

The simplest solution is to drop the current logic and just do what pyethereum does: process the entire bytecode in one pass. We should probably start there, and then see whether we can add some optimizations, such as taking a hybrid approach, aborting the recursive backwards stepping at a certain depth, or something else we haven't thought of yet.
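
A single forward pass can be sketched as below. This is an illustration of the one-pass idea rather than pyethereum's actual code, and the function name is made up for the example. The opcode values (PUSH1 = 0x60 through PUSH32 = 0x7f, JUMPDEST = 0x5b) are the standard EVM ones.

```python
PUSH1 = 0x60
PUSH32 = 0x7F
JUMPDEST = 0x5B


def find_valid_jumpdests(bytecode):
    """Return the set of positions holding a real JUMPDEST opcode."""
    valid = set()
    position = 0
    while position < len(bytecode):
        opcode = bytecode[position]
        if opcode == JUMPDEST:
            valid.add(position)
        elif PUSH1 <= opcode <= PUSH32:
            # PUSHn is followed by n data bytes; skip over them so a
            # 0x5b inside the data is never mistaken for a JUMPDEST.
            position += opcode - PUSH1 + 1
        position += 1
    return valid


# PUSH1 0x5b, JUMPDEST: only the byte at position 2 is a valid target.
assert find_valid_jumpdests(bytes([0x60, 0x5B, 0x5B])) == {2}
```

The loop is iterative and linear in the length of the bytecode, so the PUSH1, PUSH1, ..., JUMPDEST worst case involves no recursion at all.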

Remove `Opcode` class

What is wrong?

The evm.opcode.Opcode class is more overhead than should probably be standard for opcodes.

How can it be fixed

I've considered the following options.

  • Write a more functional approach to the same thing.
  • Embed gas consumption into the opcodes themselves.

I'm pretty sure that the functional approach will be loosely equivalent in terms of computational overhead. This suggests that going ahead and moving all of the gas consumption logic into the opcode functions themselves is the right approach.

For the few cases where the cost has changed from hard forks, I think we can parametrize the gas consumption as part of the function signature and then use functools.partial to set the appropriate default.
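
A sketch of that idea, using a hypothetical stand-in for the computation object. SLOAD is used as the example because its cost rose from 50 to 200 gas with the EIP-150 repricing. The names below are illustrative, not the library's.

```python
import functools


class Computation:
    # Minimal stand-in for the real computation object.
    def __init__(self, gas, storage):
        self.gas_remaining = gas
        self.storage = storage
        self.stack = []

    def consume_gas(self, amount):
        if amount > self.gas_remaining:
            raise Exception("out of gas")
        self.gas_remaining -= amount


def sload(computation, gas_cost):
    # The opcode charges its own gas instead of relying on an Opcode class.
    computation.consume_gas(gas_cost)
    slot = computation.stack.pop()
    computation.stack.append(computation.storage.get(slot, 0))


# Each fork binds the cost that applies to it.
frontier_sload = functools.partial(sload, gas_cost=50)
anti_dos_sload = functools.partial(sload, gas_cost=200)
```

Each VM would then register the variant that matches its fork rules, and the opcode logic itself stays in one place.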

VM.block and VM.state_db can easily get out of sync

What is wrong?

Having VM.state_db and VM.block is error prone as we need to ensure they're in sync, which is done by assigning VM.state_db.root_hash to VM.block.header.state_root whenever we update the state.

How can it be fixed

We could have the state DB as a property on the Block class, with all state-related functionality wrapped in Block methods, thus completely abstracting the State DB (and the need to keep it in sync with the block) from external code

Block and Header validation

The full validation rules from the yellow paper need to be added for blocks and headers. Convention thus far for validation is as follows.

  1. Validation of RLP objects is separated into local validation and evm aware validation.
    • local validation entails things that can be validated from within the context of the object such as making sure that all fields are of the correct type and within the established ranges.
    • evm aware validation is for everything else, such as checking that a transaction's nonce is greater than the current nonce from the evm state.
  2. Local validation is done as a method on the object itself RLPObject.validate()
  3. EVM aware validation is done as a method on the VM class. Typically the default implementation should throw a NotImplementedError and the real validation logic should be set up only for the mainnet VM classes.

The block and header validation rules should be pulled from the yellow paper: https://ethereum.github.io/yellowpaper/

In short they are:

  • All fields on the BlockHeader are of the appropriate type and within range.
  • All fields on the Block are of the appropriate type and within range.
  • BlockHeader.state_root exists within the db (evm aware)
  • BlockHeader.uncles_hash is equal to the hash of the rlp encoded list of uncles from the block (local)
  • BlockHeader.transactions_hash is equal to the root hash of the Block.transaction_db. (local)
  • BlockHeader.receipts_hash is equal to the root hash of the Block.receipts_db. (local)
  • BlockHeader.bloom is equal to the bloom filter containing all of the bloomables from all of the logs from all of the receipts in the block. (local)
  • BlockHeader.difficulty matches the expected difficulty computed from the timestamp of the previous header. (evm aware)
  • BlockHeader.gas_limit is within the allowed range. (evm aware)
  • BlockHeader.gas_used is less-than-or-equal to the BlockHeader.gas_limit (local)
  • BlockHeader.nonce and BlockHeader.mix_hash satisfy the proof of work rules (evm aware due to DAG)
  • BlockHeader.timestamp is greater-than the parent header timestamp. (local)
  • BlockHeader.block_number is equal-to the parent block number + 1
  • BlockHeader.extra_data is less-than-or-equal to 32 bytes in length.
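
To illustrate the local validation convention, here is a sketch covering two of the rules above (gas_used versus gas_limit, and the extra_data length). The class is a hypothetical stand-in, not the real BlockHeader, and the choice of ValueError is an assumption.

```python
from dataclasses import dataclass


@dataclass
class BlockHeaderSketch:
    # Only the fields needed for the two checks below.
    gas_limit: int
    gas_used: int
    extra_data: bytes = b''

    def validate(self):
        # Local validation: uses nothing beyond the object's own fields.
        if self.gas_used > self.gas_limit:
            raise ValueError("gas_used must not exceed gas_limit")
        if len(self.extra_data) > 32:
            raise ValueError("extra_data must be at most 32 bytes long")
```

Checks marked evm aware, such as difficulty or proof of work, would live on the VM class instead, since they need the parent header or chain state.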

Change EVM to use starting block number rather than a block range.

What is wrong?

Currently the evm.vm.evm.EVM class uses block ranges in the format (start_block, end_block) to define which blocks each VM applies to. For example, for the mainnet rules, the EVM composition would be as follows.

EVM.configure(
    vm_block_ranges=(
        ((0, 1149999), FrontierVM),
        ((1150000, 1919999), HomesteadVM),
        ((1920000, 2456999), DaoVM),
        ((2457000, None), AntiDosVM),
    )
)

This format is problematic because:

  1. It requires complex validation to ensure that the block ranges are valid. See https://github.com/pipermerriam/py-evm/blob/42507769f1d7b0b4c7f773a320074fb710e3580a/evm/validation.py#L116
  2. The data is redundant. Each VM must end on the block prior to when the next VM starts.

How can it be fixed.

This should be relatively easy to fix by changing this mechanism to only require the start block. The example above would look something like the following under this new scheme.

EVM.configure(
    vm_configuration=(
        (0, FrontierVM),
        (1150000, HomesteadVM),
        (1920000, DaoVM),
        (2457000, AntiDosVM),
    )
)
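
With only start blocks, finding the VM for a given block number reduces to a sorted lookup. The helper below is a sketch with a made-up name, and strings stand in for the VM classes so the example is self-contained.

```python
import bisect


def get_vm_for_block(vm_configuration, block_number):
    """Return the VM whose start block is the latest one <= block_number."""
    start_blocks = [start for start, _ in vm_configuration]
    # The only validation still needed: start blocks strictly increase.
    if any(a >= b for a, b in zip(start_blocks, start_blocks[1:])):
        raise ValueError("start blocks must be strictly increasing")
    index = bisect.bisect_right(start_blocks, block_number) - 1
    if index < 0:
        raise ValueError("no VM configured for block %d" % block_number)
    return vm_configuration[index][1]


vm_configuration = (
    (0, "FrontierVM"),
    (1150000, "HomesteadVM"),
    (1920000, "DaoVM"),
    (2457000, "AntiDosVM"),
)

assert get_vm_for_block(vm_configuration, 1149999) == "FrontierVM"
assert get_vm_for_block(vm_configuration, 1150000) == "HomesteadVM"
```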

Write unit tests for the `Stack` object

What is wrong?

The evm.vm.stack.Stack object does not have any unit tests.

How can it be fixed

Write unit tests to verify that:

  • push validates that the value being pushed on is a valid stack item.
  • push and dup do not allow the stack to exceed 1024 items.
  • pop returns the latest stack item (or throws if the stack is empty)
  • the value returned from pop is cast to the appropriate type as specified by type_hint
  • swap and dup perform the correct shuffling and duplication of the correct stack items.
  • pop, swap, and dup raise InsufficientStack when the stack does not contain enough items.

Move fork and chain specific constants into their individual modules

What is wrong?

The evm.constants module has a number of values that are fork specific such as FRONTIER_DIFFICULTY_ADJUSTMENT. Having these in the evm.constants module feels wrong since the core Py-EVM library is supposed to be largely agnostic about specific fork rules.

How can it be fixed

These constants belong under the fork or chain module that they apply to. Move each of the fork specific and chain specific constants into a new module at evm.vm.forks.<fork-name>.constants or evm.chains.mainnet.constants.
