trailofbits / x509-limbo

A suite of testvectors for X.509 certificate path validation and tools for building them

Home Page: https://x509-limbo.com

License: Apache License 2.0


x509-limbo's Introduction

x509-limbo

⚠️ This is a work in progress! ⚠️

A suite of testvectors (and associated tooling) for X.509 certificate path validation.

This project is maintained by Trail of Bits.

How to use this repository

This repository contains canned testcases for developing or testing implementations of X.509 path validation.

To use it, you'll need to understand (and use) two pieces:

  1. limbo-schema.json: The testcase schema. This is provided as a JSON Schema definition.
  2. limbo.json: The combined testcase suite. The structure of this file conforms to the schema above.

The schema will tell you how to consume the combined testcase suite.
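As a minimal sketch, a consumer might load and iterate the suite like this (the `testcases`, `id`, and `expected_result` field names below are assumptions for illustration; the authoritative shape is whatever limbo-schema.json defines):

```python
import json

# Hypothetical excerpt of limbo.json; the real structure is governed by
# limbo-schema.json, so treat these field names as placeholders.
LIMBO_EXCERPT = """
{
  "testcases": [
    {"id": "rfc5280::ee-critical-aia-invalid", "expected_result": "FAILURE"},
    {"id": "pathlen::intermediate-violates-pathlen-0", "expected_result": "FAILURE"}
  ]
}
"""

def load_testcases(raw: str) -> list[dict]:
    """Parse the combined suite and return its testcases."""
    suite = json.loads(raw)
    return suite["testcases"]

testcases = load_testcases(LIMBO_EXCERPT)
for tc in testcases:
    print(tc["id"], tc["expected_result"])
```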

Developing

This repository contains a self-managing tool called limbo.

make dev && source env/bin/activate

limbo --help

This tool can be used to regenerate the schema, as well as develop and manage testcases and testcase assets:

limbo schema --help
limbo compile --help

There are also two convenience make targets for quickly regenerating the schema and test suite:

make limbo-schema.json
make limbo.json

Licensing

This repository and the Limbo testsuite are licensed under the Apache License, version 2.0.

This repository additionally contains testcases that are generated from the BetterTLS project, which is also licensed under the Apache License, version 2.0.

x509-limbo's People

Contributors

woodruffw, dependabot[bot], github-actions[bot], tetsuo-cpp, tnytown, facutuesca, reaperhulk, baloo


x509-limbo's Issues

Schema: support a `tag` field that implies optionality

This will help us support testcases for optional parts of X.509 profiles, like certificate policy extensions.

Behavior

The presence of a tag implies that the testcase's behavior is "optional", meaning:

  1. If the implementation under test supports the testcase's functionality, then the test MUST have the expected result (i.e., must succeed or fail as specified).
  2. If the implementation under test doesn't support the testcase's functionality, then the test MUST fail.

(2) is a little murky -- I'm not 100% sure this should be MUST fail, but it's probably fine to start with that (since we expect to mostly use this on things like critical extensions that aren't universally supported).
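The proposed semantics could be sketched as a small evaluation function (the name `evaluate_tagged` is illustrative, not part of the schema):

```python
# Sketch of the proposed "optional" semantics for tagged testcases.
def evaluate_tagged(supported: bool, expected: str, actual: str) -> bool:
    """Return True if a harness result conforms to a tagged testcase."""
    if not supported:
        # Rule (2): unsupported functionality MUST fail.
        return actual == "FAILURE"
    # Rule (1): supported functionality MUST match the expected result.
    return actual == expected
```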

Sketch

Something like this:

@testcase(tag="whatever")

...or maybe this:

@testcase
def whatever(builder: Builder): 
    builder.tag("whatever")

The latter is probably easier to implement, so we should start with that.

Improve testing around name constraints

There are a lot of details to get right here so I'll spend some time beefing up this testing after we get some initial support merged into the skeleton branch.

CI: Enforce that `limbo.json` is up-to-date

We shouldn't allow a PR merge unless limbo.json is fully up-to-date. The easiest way to do this is probably to run make limbo.json in the CI and make sure that the test IDs fully match the checked-in copy.
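A possible shape for that CI check (the `testcases`/`id` field names are assumptions about limbo.json's layout, used only for illustration):

```python
# Hypothetical CI freshness check: regenerate the suite, then compare
# test IDs against the checked-in copy.
def ids(suite: dict) -> set[str]:
    return {tc["id"] for tc in suite["testcases"]}

def is_up_to_date(checked_in: dict, regenerated: dict) -> bool:
    """True if the checked-in suite's IDs match a fresh regeneration."""
    return ids(checked_in) == ids(regenerated)
```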

Additional testcases

#1 is old and not super well organized, so I'm copying things that haven't yet been done into this new issue.

RFC 5280

  • implementations should reject EC keys not in namedCurve format (see #173)
  • implementations should reject v1 certificates that contain v3 extensions
  • implementations should reject DNS Name Constraints of the form .foo (leading period is valid in URI constraints and others, but not in DNS constraints) (#207)
  • implementations should reject OtherName OIDs that they don't know (#228)

CABF

  • 7.1.4.3: If present, Subject.commonName MUST contain exactly one entry that is one of the values contained in the subjectAltName extension, and MUST be encoded as follows
    • For IPv4 addresses, must be an IPv4Address per RFC 3986 S. 3.2.2
    • For IPv6 addresses, must be encoded in the text representation specified in RFC 5952 S. 4.
    • For FQDNs or wildcard domain names, must be a char-for-char copy of the dNSName entry from subjectAltName; P-labels must not be converted to their Unicode representation.
  • 7.1.2.7.6 and 7.1.2.7.10: extKeyUsage is required in subscriber certificates, and MUST contain id-kp-serverAuth (MAY contain id-kp-clientAuth), and MUST NOT contain any other id-kp-*, anyExtendedKeyUsage, or the Precertificate Signing Certificate OID (1.3.6.1.4.1.11129.2.4.4)
  • 7.1.2.10.6: CA EKUs are similar to subscriber cert EKUs
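The commonName encoding rules in 7.1.4.3 line up with Python's `ipaddress` text forms (IPv4 dotted-quad, RFC 5952 for IPv6), so a rough sketch of the expected encoding might look like this (illustrative only; a real check would operate on the decoded SAN entries):

```python
import ipaddress

def expected_common_name(san_value: str) -> str:
    """Return the CABF-mandated commonName encoding for a SAN value."""
    try:
        addr = ipaddress.ip_address(san_value)
    except ValueError:
        # dNSName: char-for-char copy; P-labels (xn--...) stay as-is.
        return san_value
    # str() yields the dotted-quad form for IPv4 and the RFC 5952
    # compressed text form for IPv6.
    return str(addr)
```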

Regressions

General

  • Implementations should (generally) not be permissive around times close to expiries (e.g. a cert that expired 5 seconds before validation should generally not be accepted)

Client verification

  • Implementations should treat the *@example.com email NC as a literal email address with an inbox of *, not as a wildcard pattern for example.com.

Other test suites

Determine a key reuse/output regeneration policy

At the moment, the test suite is entirely re-built whenever make limbo.json is run. This includes all keypairs and certificates, meaning that key IDs, serial numbers, etc. all change each time.

This has some pros and cons:

  • Pro: Regenerating each time ensures that we don't accidentally introduce staleness or non-reproducibility to the test suite, e.g. a testcase that loses its corresponding Python definition.
  • Pro: It's simple this way.
  • Con: Consumers of the test suite can't assume stable serial numbers or key IDs as fixed identifiers in downstream testing, at least not without updating them whenever they update the suite. This may be a "pro" in disguise.
  • Con: It makes the git history chunky.

CC @alex for opinions.

Testcase enumeration

Not exhaustive, obviously.

This issue tracks a few "baseline" testcases we should include as part of an MVP.

They're categorized roughly below.

Path construction cases

  • SUCCESS: Happy-path cases for various combinations of (root, intermediate..., EE)
  • SUCCESS: Happy-path cases for pathlen constraints
  • SUCCESS: Self-issued certificates are handled correctly in pathlen constraints (aren't counted against the constraint)
  • FAILURE: Violating pathlen constraints (both exterior and interior to the certs)
  • FAILURE: Unhandled critical extensions (e.g. SCT poison)
  • SUCCESS: Successful SAN matching of various forms
    • Exact match (foo.example.com)
    • Left-most wildcard (foo.example.com matches *.example.com)
    • Wildcard in left-most position (foo.example.com matches f*o.example.com)
  • FAILURE: Failed SAN matching of various forms (no exact, no wildcard)
  • FAILURE: SAN matching error states
    • 6125: Wildcard not in left-most position (e.g. foo.*.example.com is invalid)
    • 6125: Wildcard should not match across the left-most label (e.g. foo.bar.example.com should not match *.example.com)
    • 6125: Wildcard in left-most position, but embedded inside an A-label or U-label of an internationalized domain label (e.g. xn--blah*.example.com is invalid)
    • 5280: SAN dNSNames must be ASCII only
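The SAN matching rules above can be sketched as a single predicate (a simplified illustration of RFC 6125-style dNSName matching, not a complete implementation):

```python
def san_matches(pattern: str, hostname: str) -> bool:
    """Simplified RFC 6125-style dNSName match (sketch only)."""
    p_labels = pattern.lower().split(".")
    h_labels = hostname.lower().split(".")
    if len(p_labels) != len(h_labels):
        return False  # a wildcard never matches across labels
    first, *rest = p_labels
    if rest != h_labels[1:]:
        return False
    if "*" not in first:
        return first == h_labels[0]  # exact match
    if first.count("*") > 1:
        return False
    if first.startswith("xn--"):
        return False  # no wildcards embedded in A-labels
    prefix, suffix = first.split("*")
    h0 = h_labels[0]
    return (h0.startswith(prefix) and h0.endswith(suffix)
            and len(h0) >= len(prefix) + len(suffix))
```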

Certificate state cases

  • FAILURE: Ensure that various APIs are invariant preserving
    • 5280: CAs must contain an AuthorityKeyIdentifier unless self-signed
    • 5280: CAs must provide AuthorityKeyIdentifier.keyIdentifier unless self-signed
    • 5280: CAs must not assert AuthorityKeyIdentifier.critical
    • 5280: CAs must contain a SubjectKeyIdentifier
    • 5280: CAs must not assert SubjectKeyIdentifier.critical
    • 5280: CAs must contain a non-empty DN for their Issuer (#7)
    • etc.
    • CA/B: EEs must have AuthorityInformationAccess and it must be well-formed

Reference material:

Go harness is returning computed result rather than "actual" result

I noticed this while visually scanning the results render: the current crypto/x509 Go harness is putting the computed result into the results.json, rather than the "actual" result.

As a result (sigh), the render is slightly off:

(Screenshot: the rendered results table showing the incorrect row.)

This should actually be:

rfc5280::ee-critical-aia-invalid FAILURE SUCCESS validation succeeded when failure was expected

The fix here is probably to have evaluateTestcase return the err status directly, rather than comparing it to the expected status.

CC @tnytown: I'm tagging you on this one, but I can tackle it if you don't have the time.

CI: Protect against test regressions

As the suite gets bigger, it'll become easier and easier to accidentally regress on the intended behavior of individual testcases. As part of stopping that, the CI should be able to determine when a PR changes the overall results (which, for refactoring or non-behavioral changes, will strongly indicate an error).

Regression detection follow-ons

Once #201 is merged, some things:

  • Leaving a new comment each time is pretty noisy; we could update a single comment instead.
  • We should gate a PR on no-regressions || (regressions && regressions-triaged) labels. I tried this with a basic workflow in #201, but it's limited by the fact that workflows can't trigger other workflows.

Communicate "why it matters" on each testcase

Pointed out by @reaperhulk: the current metadata for each testcase communicates whether it passes or fails, but doesn't adequately communicate why or how severe a disagreeing result would be.

We should figure out some way to communicate that, e.g. "failing this testcase means that an implementation is likely to accept certificates for unintended domains."

Render harness results on the site

The harness results don't fit in the GITHUB_STEP_SUMMARY anymore, and cause folding that makes them hard to search for. We should instead collate them into the site.

Test `rfc5280::nc::permitted-dn-match` puts SAN for example.com in the root cert

Thanks for this useful test suite.

The test rfc5280::nc::permitted-dn-match does something that was quite surprising to me; it specifies that we should attempt a match against "example.com", and a SAN for "example.com" does exist - but in the root cert, not the leaf.

OpenSSL seems to agree that the SAN in a CA certificate does not apply to a leaf

$ openssl verify -verify_hostname example.com -CAfile root.crt leaf.crt
CN=foo
error 62 at 0 depth lookup: hostname mismatch
error leaf.crt: verification failed

If there is some wording somewhere that the SAN in a CA certificate does apply also to all certificates that that CA issues, I would appreciate a reference. I don't see anything that suggests this in RFC 5280 for example.

Break up the namespaces a bit

rfc5280.py is getting very large; we should break it into a Python directory-style module:

rfc5280/
    __init__.py
    name_constraints.py
    whatever.py

...which in turn means that we can probably shorten the function names a bit.

Stabilize identifiers

Once we're past the MVP, testcase identifiers really shouldn't change (so that downstream consumers can rely on them).

Consequently, any changes that we do make to identifiers past the MVP should be subject to SemVer.

Testcase model: allow setting maximum chain-building depth

Most X.509 validation APIs expose an API for setting a maximum-chain building depth, e.g. X509_VERIFY_PARAM_set_depth for OpenSSL. We should expose a similar setting in our testcase model and builder, to exercise these APIs.

NB: This depth normally refers to the maximum number of non-self-issued intermediate CA certificates that appear in the chain, i.e. is generally 2 less than the actual maximal depth (which includes the end-entity and the trust anchor).
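That off-by-two convention could be captured with a small helper (a sketch; real implementations would also exclude self-issued certificates, which isn't modelled here):

```python
def verification_depth(chain_length: int) -> int:
    """Convert a full chain length (EE + intermediates + trust anchor)
    into the 'depth' most verification APIs expect: the number of
    intermediate CA certificates, i.e. two fewer than the full path."""
    return max(chain_length - 2, 0)
```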

RFE - Expected path in limbo.json

Browsing the JSON, expected_peer_name is given to help the harness validate the peer's identity for each test case. But AFAICT, no indication of expected path exists outside of the human-readable description field.

However, it'd be great to have allowed (or denied, when simpler) chain path(s) annotated for the test suite. E.g., in the negative test case pathlen::intermediate-violates-pathlen-0, perhaps this'd be:

"denied_paths": [
  [
    "CN=x509-limbo-root",
    "OU=108858597233759013321958503749681591474467573902,CN=x509-limbo-intermediate-pathlen-0",
    "OU=101602283156362821002694322459160944963902715535,CN=x509-limbo-intermediate-pathlen-0",
    "CN=example.com"
  ]
]

Or maybe using references and indices (trusted_certs:0 -> untrusted_intermediates:0 -> untrusted_intermediates:1 -> peer_certificate) if that is simpler than Subjects.

This would help for unit testing chain building frameworks. For the subset of checks applicable to just the path building code, these could be extracted from the JSON and positively asserted against. And it helps confirm that the client accepted it for one of the right reasons. For example, in the Go harness:

chain, err := peer.Verify(opts)
_ = chain

...the contents of chain are left unvalidated.
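A harness-agnostic sketch of what the proposed check could look like (the `denied_paths` field name and the subject strings below come from this issue's proposal, not from the current schema):

```python
# Hypothetical check a harness could run if limbo.json grew a
# "denied_paths" field.
def path_is_denied(chain_subjects: list[str],
                   denied_paths: list[list[str]]) -> bool:
    """True if the exact built chain is a known-bad path."""
    return chain_subjects in denied_paths

# Illustrative denied path (subjects abbreviated).
DENIED = [[
    "CN=x509-limbo-root",
    "CN=x509-limbo-intermediate-pathlen-0",
    "CN=example.com",
]]
```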


Personally, my motivation also comes from work on the CA side: in cases where CA software contains multiple roots & ICAs, e.g., due to rotation, they may opt to support automatic chain building if multiple ICAs are loaded into a single instance. This is the case in Vault's/OpenBao's PKI Secrets Engine's chain building code. It is not necessarily meant to be strictly browser-grade RFC 5280/X.509/CA-BF compliant, most notably because it is computed prior to having a concrete leaf and because many attributes are left to the browser or leaf issuance code to enforce. And, as far as I know, nobody is running it as a web-grade PKI, especially without CT. :-D

However, it'd still be great to be able to take a subset of these tests, chop off the leaf, and validate that the internal code constructs the possible valid CA paths without any obviously invalid ones.


Professionally, feel free to ping me if you have questions on the Bouncy Castle modules. :-) (Are you leaning towards Java or C# or both?).

Thanks!

Support `key_usage` in the harnesses

In the cryptography.io tests, I had to begin populating the key usage extension since it was marked as critical. Since we're now populating this information, we should fix the harnesses to support this.

Define a testcase result model/schema

Now that we have a few harnesses, it would be nice to standardize their outputs so that we can build up a nice visualization/matrix of implementation discrepancies. To do that, each harness needs to output testcase results in a consistent format.

A rough sketch:

class TestcaseResult:
    # must match a Testcase's id
    id: str
    
    # keep these separate rather than flattening them so that we can
    # visualize each combination of states differently.
    # the actual result can also be "SKIPPED", to indicate either a testing
    # or known functionality gap.
    expected_result: Literal["SUCCESS"] | Literal["FAILURE"]
    actual_result: Literal["SUCCESS"] | Literal["FAILURE"] | Literal["SKIPPED"]
    
    # any additional error message etc., possibly multiple lines
    failure_context: str
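Rendering the sketch above as a dataclass makes the serialization each harness would emit concrete (field names follow the sketch; everything else here is an assumption):

```python
import json
from dataclasses import dataclass, asdict
from typing import Literal

@dataclass
class TestcaseResult:
    id: str
    expected_result: Literal["SUCCESS", "FAILURE"]
    actual_result: Literal["SUCCESS", "FAILURE", "SKIPPED"]
    failure_context: str = ""

# Example: a harness reporting a disagreement with the expected result.
result = TestcaseResult(
    id="rfc5280::ee-critical-aia-invalid",
    expected_result="FAILURE",
    actual_result="SUCCESS",
    failure_context="validation succeeded when failure was expected",
)
print(json.dumps(asdict(result)))
```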

MVP: Architectural sketch

This is based on @alex's original sketch: https://gist.github.com/alex/fdb067c3f424a62975df79f5e044005d

Some initial thoughts:

  1. What is the "unit of work" in the suite? Is it a top-level limbo.json (or whatever) file with testcases: list[Testcase], or are there expected to be multiple of those files with disjoint (?) testcases?
  2. A big JSON file will eventually become tedious to edit. Another testvector design that I'm familiar with is to have a bunch of separate descriptions that get "compiled" into the big limbo.json (or whatever) file (with CI checking this)

Idea: add a `limbo` subcommand that generates a reasonably formatted GitHub issue

Random thought: this will be nice to have when reporting bugs upstream:

limbo explain testcase-id

...should produce a Markdown formatted explanation of what testcase-id does, including all of the relevant states/chains required for the upstream to reproduce it.

Or maybe this would be overkill and/or should be done somewhere else. Just saving this as an issue so I don't forget it.

Additional harness ideas

Consolidating this into a single issue.

Consider modelling failure reasons into Limbo spec

One thing I've noticed during development is that it's very easy for chains to be rejected via the wrong check. Unless you debug the test yourself, it's difficult to know whether you're exercising the intended behaviour or whether it's bouncing off some other check. And once we've verified that it's being rejected for the right reason, there's no guarantee that the logic will remain that way over time.

Obviously we don't want to bind Limbo to pyca/cryptography in any meaningful way but I was wondering if there's value in having information about why a particular test ought to fail. Each harness can then optionally map their own error messages to the Limbo failure reason enumeration to check whether it hit the right check or not.
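One possible shape for that mapping (the enumeration members and the harness error strings below are hypothetical; each harness would maintain its own table):

```python
from enum import Enum

# Hypothetical failure-reason enumeration; actual names would live in
# the Limbo schema if this proposal lands.
class FailureReason(Enum):
    NAME_CONSTRAINT_VIOLATION = "name-constraint-violation"
    PATHLEN_EXCEEDED = "pathlen-exceeded"
    EXPIRED = "expired"

# Per-harness mapping from implementation error strings to the shared
# enumeration (strings here are illustrative).
ERROR_MAP = {
    "excluded subtree violation": FailureReason.NAME_CONSTRAINT_VIOLATION,
    "path length constraint exceeded": FailureReason.PATHLEN_EXCEEDED,
    "certificate has expired": FailureReason.EXPIRED,
}

def classify(error_message: str):
    """Return the Limbo failure reason for a harness error, if known."""
    return ERROR_MAP.get(error_message.strip().lower())
```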

Rustls webpki test harness?

Hi folks, great project 🔒 🔨

I saw that #80 added a test harness for briansmith/webpki. Would you be willing to do similar for rustls/webpki?

The rustls project uses this fork instead of the original repository since Rustls v0.21.0. It has some capabilities not present in the original repository (like IP address subject support, and CRL based revocation checking).

We've invested in running other testing regimens like BetterTLS in-repo but are always keen for more coverage.
