ethereum / consensus-spec-tests Goto Github PK

Common tests for the Ethereum proof-of-stake consensus layer

License: MIT License

consensus-spec-tests's Introduction

Ethereum Proof-of-Stake Consensus Spec Tests

This repository contains test vectors for the Ethereum Proof-of-Stake Consensus Spec. Other types of testing (network, fuzzing, benchmarking, etc.) are currently a work in progress, and will be hosted in separate repositories. The intention of this repository is to provide a solid base for Ethereum proof-of-stake clients (aka "beacon nodes") to consume as part of their unit-testing efforts around spec behavior.

The tests are YAML files following the general testing format.

The generators that are responsible for generating all of the spec tests can be found in ethereum/consensus-specs/test_generators.

New tests can be added by creating a generator in the specs repository, or adding functionality to an existing generator. Generators are small and easy to write, and can use the pythonized-spec to build expected test outputs: see documentation

Note that this repository is growing over time as the spec evolves, and more test-generation code is being added. The YAML test-vectors are tracked using Git LFS, to accommodate for large test vectors (Take execution-layer tests repository size as an example).

License

See LICENSE file.

consensus-spec-tests's People

Contributors

Stargazers

Watchers

consensus-spec-tests's Issues

Test idea: surround attester slashing in the wrong order

While fuzzing my slasher I noticed it was producing surround slashings in the wrong order (i.e. with the surrounding attestation as attestation_2). Given the sensitivity of the spec to this ordering, I think it would make sense to test an attester slashing that is completely valid, but swaps attestation_1 and attestation_2.

bls/msg_hash_g2_compressed little issue

case 5:

- input: {message: '0x0000000000000000000000000000000000000000000000000000000000000000',
    domain: '0xffffffffffffffff'}
  output: ['0x83f7465e351f7bd0216fc5fcad4dcd1a4f45410bba304fbb8cab225c526fffafdd90f169add5c752d737f039242343a3',
    '0xa62b52965b0a1e44a1b2e5bed5eaa448028317603d2bacf4acd7b7e263e01e2a841e7537a827cb1502a4109dbac0ff']

output should be of two 48-bytes points, but second is trimmed to 47 bytes, should be 0x00a62b.. instead. I guess, we shouldn't trim leading zeroes that as we don't do this with message, we have fixed size ssz fields etc.

Some SSZ generic bitlist tests don't follow testing format

The bitlist tests in the ssz_generic subdir are meant to have names of the form bitlist_{length}_{extra}, but there are three that don't conform to this and complicate parsing:

bitlist_no_delimiter_empty
bitlist_no_delimiter_zero_byte
bitlist_no_delimiter_zeroes

Request: Add tar.gz for each tagged release

It would be helpful for our tooling if we could download a gzipped tarball for each release.

v1.1.8 minimal test release minimal and mainnet tarballs contains DS_Store files

$ wget -qO- https://github.com/ethereum/consensus-spec-tests/releases/download/v1.1.8/minimal.tar.gz | tar tz | grep DS_Store
tests/minimal/._.DS_Store
tests/minimal/.DS_Store
tests/minimal/bellatrix/._.DS_Store
tests/minimal/bellatrix/.DS_Store
tests/minimal/bellatrix/sanity/._.DS_Store
tests/minimal/bellatrix/sanity/.DS_Store
tests/minimal/bellatrix/sanity/blocks/._.DS_Store
tests/minimal/bellatrix/sanity/blocks/.DS_Store
tests/minimal/bellatrix/sanity/blocks/pyspec_tests/._.DS_Store
tests/minimal/bellatrix/sanity/blocks/pyspec_tests/.DS_Store
tests/minimal/bellatrix/operations/._.DS_Store
tests/minimal/bellatrix/operations/.DS_Store
tests/minimal/bellatrix/operations/attester_slashing/._.DS_Store
tests/minimal/bellatrix/operations/attester_slashing/.DS_Store

$ wget -qO- https://github.com/ethereum/consensus-spec-tests/releases/download/v1.1.8/mainnet.tar.gz | tar tz | grep DS_Store
tests/mainnet/._.DS_Store
tests/mainnet/.DS_Store
tests/mainnet/bellatrix/._.DS_Store
tests/mainnet/bellatrix/.DS_Store
tests/mainnet/bellatrix/transition/core/pyspec_tests/._.DS_Store
tests/mainnet/bellatrix/transition/core/pyspec_tests/.DS_Store
tests/mainnet/altair/._.DS_Store
tests/mainnet/altair/.DS_Store
tests/mainnet/altair/fork_choice/._.DS_Store
tests/mainnet/altair/fork_choice/.DS_Store
tests/mainnet/altair/fork_choice/ex_ante/._.DS_Store
tests/mainnet/altair/fork_choice/ex_ante/.DS_Store
tests/mainnet/altair/fork_choice/ex_ante/pyspec_tests/._.DS_Store
tests/mainnet/altair/fork_choice/ex_ante/pyspec_tests/.DS_Store
tests/mainnet/altair/fork/._.DS_Store
tests/mainnet/altair/fork/.DS_Store

Filenames are too long

For example, this filepath is 201 characters.

tests/general/phase0/bls/sign_msg/small/sign_msg_0x263dbd792f5b1be47ed85f8938c0f29586af0d3ac7b977f21c278fe1462040e3_0x0000000000000000000000000000000000000000000000000000000000000000_0x0100000000000000

Linux has a maximum filename length of 255 characters for most filesystems. When we ultilize caching in our continuous integration or even locally, the base path may be more than 54 characters.

Please reduce the filename length for these tests.

Bitcoin ETH

https://github.com/ethereum/consensus-spec-tests/blame/080c96fbbf3be58e75947debfeb9ba3b2b7c9748/LICENSE#L1

Add more optimistic sync tests

Issue

We need tests covering more sophisticated scenarios of Optimistic Sync.

Details

Follow up to ethereum/consensus-specs#2982
Test format ethereum/consensus-specs#2965

An idea by @potuz is to have multiple block tree branches that a client is syncing optimistically with, then all these branches but one should be invalidated making client to jump from one invalid branch to the other (LMD weights should be set accordingly, with an edge case where we have the same weight for valid and for invalid branch). Client should end up on a valid branch as canonical chain.

v1.2.0-rc.1 tests are missing mainnet sync committee update tests

The tests previously located at tests/mainnet/altair/epoch_processing/sync_committee_updates are missing from the v1.2.0-rc.1 tarballs

Add tests for consensus on valid G2 points

The specs describe in https://github.com/ethereum/eth2.0-specs/blob/dev/specs/bls_signature.md#g2-points exactly the conditions when a G2 point is valid.

I think a potential consensus attack on the eth2.0 network could be creating deposits which have invalid signatures based on those rules. So it is very important to not break consensus that all clients recognize the same signatures (compressed as G2 points) as valid/invalid.

Therefore I propose adding tests which test the conditions specified in https://github.com/ethereum/eth2.0-specs/blob/dev/specs/bls_signature.md#g2-points. As far as I can see such tests are not existent at the moment (whenever we test invalid signatures for deposit we use the signature with all 0s).

Block sanity transfer test is impossible because the transfer limit is 0

In the v0.6.3 block sanity tests, there's a transfer test that can't be passed using the default config because MAX_TRANSFERS is 0. We should either get rid of the test, or change the minimal config to use greater transfer limit.

[Release needed] BLS tests: fast_aggregate_verify inconsistent format v1.0.0-rc0

There is one test in fast_aggregate_verify that has an inconsistent format with the rest (fast_aggregate_verify_infinity_pubkey)

It uses messages (plural) instead of message

Git LFS usage quota

Hi,
The ethereum organization is reaching the max quota for Git LFS usage

You’ve used 100% of your data plan for Git LFS on the organization ethereum. Please purchase additional data packs to cover your bandwidth and storage usage:

The quota is not the size, but the bandwidth:

When I looked into it, it appears that only this particular repo uses LFS, due to this setting: https://github.com/ethereum/eth2.0-spec-tests/blob/master/.gitattributes

I'm not quite sure what this integration does, nor how LFS works, but is this really something that is needed?

Please advice if we need this, because if we do we'll have to purchase additional data packs.
cc @jpitts

Attestation signatures used in tests are invalid due to ignored custody bit

While running the block sanity tests for Lighthouse we noticed something funky. It seems that in verify_indexed_attestation, the two message hashes for custody bit 0 and 1 can sometimes end up being equal! This seems to be due to the custody bits being supplied as integer literals (0b0 or 0b1), as changing them to False and True causes the message hashes to be distinct again (and to match what Lighthouse thinks they should be).

There's a demo of this on the ignored-custody-field branch on my fork of the spec here: ethereum/consensus-specs@v0.6.3...michaelsproul:ignored-custody-bit

I've been running the test_attestation test from a venv in test_libs/pyspec like:

pytest --config=minimal eth2spec -k test_attestation -s

Which produces the message hash 891e0385800ff598f80d2339ff1cd2ac46c3717dca00d099a7008c104946e765 for both messages when using the 0b1 syntax, and 891e0385800ff598f80d2339ff1cd2ac46c3717dca00d099a7008c104946e765 or 5a36957157fb41a2c7ecd1e8358f6896b15b6ef6b7ac7344f851f8c64f37f9f5 when using False/True.

The signature used in the tests must also be constructed badly, because the test passes

Write binary data in yaml canonical format instead of hex encoded strings

We're finding that unmarshalling yaml is very cumbersome with hex encoded strings as binary data.

Example: ssz: '0x00000000' cannot be inherently unmarshaled to a byte slice in go. go playground link.

Looking at the spec for yaml, this data should be represented as a base64 binary string with the prefix !!binary.

Edit: I no longer support adding !!binary as it is entirely unnecessary with modern libraries understanding that Base64 encoded strings are binary data. However, the tag resolution in the yaml spec for this scenario is unclear. It's probably safer to add !!binary explicitly.

v1.0.0-rc.0 fast_aggregate_verify_infinity_pubkey uses `messages` field instead of `message`

It wasn't hard to work around, but I thought it was strange that the fast_aggregate_verify_infinity_pubkey test uses a messages field instead of message. It would be nice for it to be consistent.

Version: v1.0.0-rc.0

Operations/attester_slashing success cases issue

Solved in ethereum/consensus-specs#1126 but looks like generation was done using v0.6.2, so we have exits like 6, 5, 5, 5, 5, 6, 6, 6. Waiting for regeneration.

Fix `deneb` `include_attestation_from_previous_fork_with_new_range` test

In the latest release (v1.4.0-beta.0), there's a new test called include_attestation_from_previous_fork_with_new_range meant to ensure the new attestation inclusion rules under EIP-7045 apply to attestations from the final CAPELLA epoch. This is the only generated test which makes use of the @spec_configured_state_test override.

This override emits a config.yaml file in the generated test directory. There are several problems with this:

The sanity test definition does not say to expect a config.yaml
The generated config.yaml is invalid:

# this is the mainnet example
ALTAIR_FORK_EPOCH: 74240
...
BELLATRIX_FORK_EPOCH: 144896
...
CAPELLA_FORK_EPOCH: 194048
...
DENEB_FORK_EPOCH: 2

The generated config.yaml is missing the following fields which (without modification) prevent lighthouse from ingesting it:

DEPOSIT_CHAIN_ID
DEPOSIT_NETWORK_ID
DEPOSIT_CONTRACT_ADDRESS

If clients silently ignore the config.yaml (which is likely), they will use the default testing config:

ALTAIR_FORK_EPOCH: 0
BELLATRIX_FORK_EPOCH: 0
CAPELLA_FORK_EPOCH: 0
DENEB_FORK_EPOCH: 0

This test generates the attestation in epoch 1, which is meant to be a CAPELLA attestation. But with the default config, it ends up being within DENEB. This would mean we are only testing that the new rules apply to attestations within DENEB instead of ensuring they also apply to attestations from the final CAPELLA epoch.

Optimistic Sync tests proposal

Optimistic sync is an extension of consensus specs, but it seems valuable to feature consensus-spec-tests with a few test cases checking CL client's behaviour in complicated scenarios. Hive is the main testing tool that currently does this job. Debugging cross-layer (CL + EL) Hive test is difficult, writing Hive tests employing sophisticated re-org scenarios is even harder while both should be much easier if handled by consensus-spec-tests suite.

As optimistic sync is an opt-in backwards compatible extension, test format should follow the same principle. A subset of tests separated from the others and utilizing fork_choice test format seems like a reasonable approach. Clients that are not supporting optimistic sync (tho there is no such clients to date) may not support this subset at all.

It is proposed to extend fork choice tests format with PayloadStatusV1 responses which EL would return when newPayload or forkchoiceUpdated method is called for a corresponding block. CL clients that are implementing opti sync tests will have to mock EL clients accordingly.

Format extension for `on_block` handler

{
  block: block_{block_root},
  valid: true/false,
  payload_block_hash: block.body.execution_payload.block_hash, # payload hash is convenient for the EL mock
  payload_status: {status: "VALID" | "INVALID" | "SYNCING" | "ACCEPTED", latestValidHash: blockHash | null, validationError: null}
}

Should the PayloadStatus be ever updated in a backwards incompatible way these tests will have to be updated accordingly. Though, this is unlikely to happen outside of a hard fork on CL side, thus, tests may be aligned as well supporting the old status structure for previous hard forks.

Example scenario

B0 <- B1 <- B2 <- B3 
  \
    <- B1' <- B2' <- B3' <- B4'

1. Client imports [B0 ... B3], payload_status: {status: "VALID", ...} for each block in this branch
2. Client reorgs to and optimistically imports [B1' ... B3'], payload_status: {status: "SYNCING", ...} for each of this blocks
3. Client imports B4' with payload_status: {status: "INVALID", latestValidHash: B0.body.execution_payload.block_hash, ...}
4. head == B3

It is also proposed that valid flag for each of B1' ... B4' blocks is set to false indicating that these blocks are invalid allowing this test to result in the same outcome on CI and when run by a client without optimistic sync support. CL clients that do support optimistic sync should omit this flag and rely on payload_status instead.

cc @hwwhww @djrtwo @potuz @ajsutton @tersec @paulhauner

Make this repository self-sufficient

There is one thing, which is missing from this repository but it's required to run these tests:
configs, https://github.com/ethereum/eth2.0-specs/tree/dev/configs
I propose to add them to these repository, so client implementations could run tests using only one submodule repository. Plus there will be no issues with different versions of repositories.

Non-updated BLS test

The test bls/sign/small/sign_case_11b8c7cad5238946 is still using the hash-to-curve draft v5 output

PyECC outputs 0x89dcc02150631de23c5ba6fac74394163f1f05643c77e0bde7fea29ce64cf5fe68c440ff401908d81cc3ddcd46db41cf119e6ab5f897cafdb9b78000437354ea9796b61badc28e6d757e42c0dd7e55bd5b4fd4d9a694ddbddb5f524511090277 for the following test: https://media.githubusercontent.com/media/ethereum/eth2.0-spec-tests/master/tests/general/phase0/bls/sign/small/sign_case_11b8c7cad5238946/data.yaml

input: {privkey: '0x47b8192d77bf871b62e87859d653922725724a5c031afeabc60bcef5ff665138',
  message: '0x0000000000000000000000000000000000000000000000000000000000000000'}
output: '0xb23c46be3a001c63ca711f87a005c200cc550b9429d5f4eb38d74322144f1b63926da3388979e5321012fb1a0526bcd100b5ef5fe72628ce4cd5e904aeaa3279527843fae5ca9ca675f4f51ed8f83bbf7155da9ecc9663100a885d5dc6df96d9'

from py_ecc import bls


privkey = int('0x47b8192d77bf871b62e87859d653922725724a5c031afeabc60bcef5ff665138', 16)
message = bytes.fromhex('0000000000000000000000000000000000000000000000000000000000000000')

sig = bls.G2ProofOfPossession.Sign(privkey, message)
print(f'output: 0x{sig.hex()}')

What is in the test is the value expected in the previous hash-to-curve draft

Proposal: BLS serialization tests

BLS serialization tests

Background

The latest IETF BLS standard draft (04) uses ZCash's point representations as the serialization format.
We used to have an old https://github.com/ethereum/eth2.0-specs/blob/v0.9.0/specs/bls_signature.md#point-representations doc as the explainer of the Zcash docs. But it was deleted since we expected the IETF BLS standard would define it more formally.
Here is a good summary of the different cases of 3 MSBs of byte 0 in compressed format (see the "ZCash semantics" column): pairingwg/bls_standard#16.
We currently don't have a test suite for BLS compression/decompression tests.

Proposed new test suite

The input and output of our BLS APIs are all in minimal-pubkey-size form (compressed 48-byte pubkey and 96-byte signature). So the functions to test would be:

def compress_G1(decompressed_pubkey: bytes96) -> bytes48

input:
- decompressed_pubkey: bytes96
output:
- compressed_pubkey: (i) bytes48 or (ii) empty for the invalid cases.

def decompress_G1(compressed_pubkey: bytes48) -> bytes96

input:
- compressed_pubkey: bytes48
output:
- decompressed_pubkey: (i) bytes96 (ii) empty for the invalid cases.

def compress_G2(decompressed_signature: bytes192) -> bytes96

input:
- decompressed_signature: bytes192
output:
- compressed_signature: (i) bytes96 (ii) empty for the invalid cases.

def decompress_G2(compressed_signature: bytes96) -> bytes192

input:
- compressed_signature: bytes96
output:
- decompressed_signature: (i) bytes192 (ii) empty for the invalid cases.

Discussions

1. Are the APIs available?

1.1. I naively assumed that the BLST binding does provide these APIs for all supported languages. Could the client teams help check if it's true? /cc @kirk-baird @michaelsproul @benjaminion @mratsim @nisdas and @dot-asm
1.2. Do your compression and decompression APIs include subgroup membership checking?

2. Did the fuzzing already cover the 3-MSBs edge cases of BLS tests?

/cc @zedt3ster

3. Do you think it would help to reduce the consensus error risks?

/cc @JustinDrake @CarlBeek

Hexadecimal integers should not be in quotes

Here are the formats for integers in the yaml spec.

canonical: 12345
decimal: +12345
octal: 0o14
hexadecimal: 0xC

When a hexadecimal integer is written as '0xC' then it is interpreted as a string by yaml spec compliant libraries. This only seems to happen with domain fields in the bls yaml tests so far.

See: Example 2.19. Integers and 10.3.2. Tag Resolution in yaml spec.

Fork-choice integration test format and example (WIP)

The issue is opened to communicate the initial draft version of the proposed format for fork choice integration-style compatibility test suite.
It's an early stage work in progress and will change. However, the main idea will stay the same.
It is to be moved to a GitHub repo at some point, when it's mature enough.

There will be common files like spec constants and genesis states, but this is to be defined.
There is a file for each test case (yaml format currently, but can be SSZ as well).
Each test consists of test steps, each step being either an event (slot, block or attestation) or a condition to verify. There is also an initial state (should be spec constants plus genesis state, however the last one is lengthy, currently only amount of validators is specified in the file.
Test file format:

---
# initial state currently generated from initial amount of deposits/validators
# and spec constants (external file)
validators: <InitialValidatorsCount>

# each test is a sequence of four step type: either one of three event kinds or a condition check
steps:
# new slot event
- !<slot>
  slot: <Slot>
# new block event ("standard" block serialization to yaml is used, as in the other tests)
- !<block>
  block: <Block>
# new attestation event ("standard" attestation serialization to yaml is used, as in the other tests)
- !<attestation>
  attestation: <Attestation>
# only head test condition currently, but this is the most important
# it means, that the root of the head block at the moment should be equal to the value specified
# other test conditions to be add in future
- !<check>
  checks:
    head: <Root>

A file with several examples attached
Stale integration_tests_on_attestation.zip

New version fork_choice_integration_tests.zip

Add More Test Coverage for Longer, Dynamic SSZ Lists

TL;DR - SSZ tests are not exhaustive for dynamic lists with many elements in them.

Context

Our Go-SSZ library is currently up-to-date with v0.8 and passing all latest spec tests, both minimal and mainnet. However, we encountered something quite interesting in slot/block sanity tests when it came to SSZ. Our hash tree root for the beacon state in all the slot/block sanity tests was failing and returning a different root than what the spec wants. We were confused as we were indeed passing all types of spec tests. We noticed, however, that the SSZ spec tests do not cover cases where dynamic lists have a lot of elements (the slot sanity tests had 512 validators in the registry of the state).

We found a discrepancy between Python and Go in mixInLength when converting the length of an object into a 256-bit number. In Python, following code returns:

>>> balances = [32 for _ in range(512)}
>>> print(f"0x{len(balances).to_bytes(32, 'little').hex()}")

0x0002000000000000000000000000000000000000000000000000000000000000

The same code in Go, however, returns:

balances := make([]uint64, 512)
buf := make([]byte, 32)
binary.PutUvarint(buf, uint64(len(balances)))
fmt.Printf("%x\n", buf)

// Prints...
0x8004000000000000000000000000000000000000000000000000000000000000

It turns out we had to use a different method of putting the length into a buffer than binary.PutUvarint, however, we expected SSZ spec tests to have covered these cases. It turns out that if there were a single SSZ spec test case that had lists with lengths as big as the ones in the sanity tests, we could have caught this bug earlier.

End to end third party testing support

I wanted to inquire if there are considerations to support third party testing with the consensus-spec-tests. It would be interesting to have an endpoint or a standardized log message format that allows for external testing tools to validate the state of a consensus implementations at a certain point in time. It would be great to have a standardized way to validate a consensus node against the spec after performing chaos testing.

BLS tests: only keep the Ethereum specific tests

The current consensus-spec-tests tarballs (v1.1.2) still include the following tests:

aggregate
aggregate_verify
fast_aggregate_verify
sign
verify

Those were superseded by https://github.com/ethereum/bls12-381-tests.

Should we remove those and only keep ethereum specific tests in this repo?

eth_aggregate_pubkeys
eth_fast_aggregate_verify

cc @hwwhww @asanso