
protobuf-specs's Introduction

sigstore framework


sigstore/sigstore contains common Sigstore code: that is, code shared by infrastructure (e.g., Fulcio and Rekor) and Go language clients (e.g., Cosign and Gitsign).

This library currently provides:

  • A signing interface (with support for ECDSA, Ed25519, RSA, and DSSE (in-toto))
  • OpenID Connect fulcio client code

The following KMS systems are available:

  • AWS Key Management Service
  • Azure Key Vault
  • HashiCorp Vault
  • Google Cloud Platform Key Management Service

For example code, look at the relevant test code for each main code file.
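As a rough Go sketch of what using the signing interface looks like (hedged: the package layout and function names below follow pkg/signature as I understand it, and should be checked against the current API):

package main

import (
        "bytes"
        "crypto"
        "crypto/ecdsa"
        "crypto/elliptic"
        "crypto/rand"
        "fmt"

        "github.com/sigstore/sigstore/pkg/signature"
)

func main() {
        // An ephemeral ECDSA P-256 key, just for demonstration.
        priv, err := ecdsa.GenerateKey(elliptic.P256(), rand.Reader)
        if err != nil {
                panic(err)
        }

        // Wrap the key in the library's SignerVerifier abstraction
        // (assumed API: LoadECDSASignerVerifier).
        sv, err := signature.LoadECDSASignerVerifier(priv, crypto.SHA256)
        if err != nil {
                panic(err)
        }

        msg := []byte("hello, sigstore")
        sig, err := sv.SignMessage(bytes.NewReader(msg))
        if err != nil {
                panic(err)
        }

        // VerifySignature returns nil on success.
        if err := sv.VerifySignature(bytes.NewReader(sig), bytes.NewReader(msg)); err != nil {
                panic(err)
        }
        fmt.Println("signature verified")
}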

Fuzzing

The fuzzing tests are within https://github.com/sigstore/sigstore/tree/main/test/fuzz

Security

Should you discover any security issues, please refer to Sigstore's security process

For container signing, you want cosign

protobuf-specs's People

Contributors

adityasaky, asraa, bdehamer, bobcallaway, codysoyland, dependabot[bot], elfotografo007, haydentherapper, jalseth, jleightcap, kommendorkapten, loosebazooka, pwelch, tetsuo-cpp, wlynch, woodruffw, znewman01


protobuf-specs's Issues

jsonschema: setting `--platform` breaks targets on aarch64-darwin

In the current repository root, the jsonschema-related targets fail:

$ make jsonschema
(cut)
Unable to find image 'jsonschema-specs-build:latest' locally
docker: Error response from daemon: pull access denied for jsonschema-specs-build, repository does not exist or may require 'docker login': denied: requested access to the resource is denied.
See 'docker run --help'.
make: *** [jsonschema] Error 125

Removing the platform flag from the Makefile fixes this locally -- I haven't tested on other hosts.

(CC @kommendorkapten, @woodruffw)

Cut first release of the protobuf messages

Description

I propose we cut the first release (i.e., a tagged version) of the protobuf specs, as we have already seen some confusion about the state of this repository. My suggestion is that we leave the repository open for review until the week of January 9th, 2023, then discuss the state and see if we feel ready for the first release. If we need more time for review (with the holidays coming up), we can leave the repository open for one extra week.

cc: @haydentherapper @asraa @znewman01 @codysoyland @bdehamer @woodruffw

Define a Sigstore root of trust

Define a Sigstore root of trust protobuf

This issue is to track a protobuf definition of a Sigstore-specific "trust root". This would be used by clients as an input to verify an artifact/signature, as the only other input besides the artifact target information (which may be an offline OR online bundle of information). The information would contain:

  • Rekor: A map of Rekor public keys, keyed by the log ID and containing extra custom metadata (e.g. online date range, identifiable URI, and more to be extended)
  • Fulcio: A map of Fulcio certificates. I'm not sure what to key these by. We'd want the online date range, URI, subject/issuer. The online date range could be the most useful thing to use for identifying which Fulcio root CA was used to sign off on a particular signature (all Fulcio-issued certs are short-lived, so time is necessary in verification). TODO: How to represent intermediates? Intermediates should be alongside the roots, and there may be a collection. A "Fulcio" target could be represented as a map[string][]*x509.Certificate (i.e. a list of certs)
  • CTFE: A map of CT public keys, keyed by the log ID and containing the same extra custom metadata as Rekor.
  • And more to come (TSA)

The map/list data structures imply that clients may need to hold on to multiple keys and certificates, because they may be validating targets signed on previous shards in past years. During verification, clients must use the information in this struct to identify the relevant data for the particular target they're verifying at runtime.

In other words, this information is all pre-resolved per instance of the client, which makes it usable in a variety of different environments and with different types of trust roots (online TUF, hard-coded test variables, etc.). The verification function signature would look something like verify(bundle, trust_root).
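To make that concrete, here is a hypothetical Go sketch of the shape of such a trust root (every name here is invented for illustration, not a proposed field name):

package trustroot

import (
        "crypto"
        "crypto/x509"
        "time"
)

// ValidityWindow is the period during which a log key or CA was online.
type ValidityWindow struct {
        Start time.Time
        End   time.Time // zero value means "still active"
}

// LogInstance covers both Rekor and CTFE entries: a public key plus
// the extra custom metadata mentioned above.
type LogInstance struct {
        PublicKey crypto.PublicKey
        BaseURL   string
        Validity  ValidityWindow
}

// CertificateAuthority holds a Fulcio root alongside its intermediates.
type CertificateAuthority struct {
        CertChain []*x509.Certificate
        URI       string
        Validity  ValidityWindow
}

// TrustedRoot is the pre-resolved, client-side root of trust.
type TrustedRoot struct {
        RekorLogs map[string]LogInstance // keyed by log ID
        CTLogs    map[string]LogInstance // keyed by log ID
        Fulcio    []CertificateAuthority // selected by validity window
}

// FulcioCAFor picks the CA whose online window covers the certificate
// issuance time; time-based selection works because all Fulcio-issued
// certs are short-lived.
func (t *TrustedRoot) FulcioCAFor(issuedAt time.Time) *CertificateAuthority {
        for i := range t.Fulcio {
                ca := &t.Fulcio[i]
                if issuedAt.Before(ca.Validity.Start) {
                        continue
                }
                if ca.Validity.End.IsZero() || !issuedAt.After(ca.Validity.End) {
                        return ca
                }
        }
        return nil
}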

I propose calling it (please suggest!):

  • TrustedRoot
  • SigstoreTrustRoot
  • TrustedRootStore

I think prefacing it with Sigstore might not be necessary; it's already defined in the Sigstore package.

I'll give a quick impl start, but please comment here for any discussions/ideas that I should take into consideration before review.

@vaikas @kommendorkapten @joshuagl @patflynn @jku @haydentherapper @znewman01

jsonschema: Docker container dependency conflict

As of 1 hour ago on https://github.com/sigstore/protobuf-specs/actions/runs/5743015810/job/15566361392,

Step 1/4 : FROM alpine@sha256:82d1e9d7ed48a7523bdebc18cf6290bdb97b82302a8a9c27d4fe885949ea94d1
 ---> c1aabb73d233
Step 2/4 : RUN apk add --update protoc=3.21.12-r2 protobuf-dev=3.21.12-r2 go=1.20.5-r0 git=2.40.1-r0
 ---> Running in 7afa87ea9cd6
fetch https://dl-cdn.alpinelinux.org/alpine/v3.18/main/x86_64/APKINDEX.tar.gz
fetch https://dl-cdn.alpinelinux.org/alpine/v3.18/community/x86_64/APKINDEX.tar.gz
ERROR: unable to select packages:
  go-1.20.7-r0:
    breaks: world[go=1.20.5-r0]
The command '/bin/sh -c apk add --update protoc=3.21.12-r2 protobuf-dev=3.21.12-r2 go=1.20.5-r0 git=2.40.1-r0' returned a non-zero code: 1
make: *** [Makefile:91: docker-image-jsonschema] Error 1

(CC @woodruffw , @kommendorkapten)

How to release (and version) this repo

We're going to publish libraries across a lot of ecosystems. Don't want a manual process here. Also want the versions that clients see to make sense/generally align. Somewhat tricky because there are two distinct things we're publishing:

  • the "spec" should have a version
  • the generated code

Proposal (post-1.0):

  • We release package versions for every package at once, on releases tagged v1.0.3 or similar (we have to follow Go conventions because Go needs you to know it's the most important)
  • Spec version changes follow semver:
    • Inconsequential tweak (e.g., docs) => bump patch version
    • New feature => bump minor version
    • Breaking change => bump major version
  • If a spec change OR a change to the generation procedure (e.g., bumping protoc version) would cause a change to the generated code, follow semver as if we had written that code:
    • Inconsequential tweak (e.g., docs) => bump patch version (unless you already have bumped patch/minor/major for the spec)
    • New feature => bump minor version (unless you already have bumped minor/major for the spec)
    • Breaking change => bump major version (unless you already have major for the spec)

Annoyingly, a bump to the protoc version might cause a patch change for one language but be a noop for others...in such cases, we can either (a) do a noop release, or (b) skip that release (we will never publish that version for this language, the releases might go 1.0.3 to 1.0.5)

CC @loosebazooka

How could the current protobuf spec model multiple bundle types?

Question

Looking at the current protobuf specification, I can see how the spec reuses the current bundle format relying on Rekor entry logs, plus additional fields (verification materials, ...) that aren't included in the cosign bundle today.
However, I am wondering how the current spec could model scenarios where Rekor is not in use. For instance, sigstore/cosign#2331 proposes using a timestamp authority as an alternative guarantee for cosign signing/verification. To support this use case, the current protobuf spec would need additional fields in the TimestampVerificationData, and would ignore other fields as well as the tlog entry data.

Wouldn't it make the spec handier to have a Sigstore Bundle as an interface, with two or more types of Bundles, such as RekorBundle or TimestampedBundle, in the current spec definition? These two types could model the aforementioned use case and extend to future verification mechanisms.

Any thoughts?
I hope I am not missing any previous discussion from the creation of this spec.

@kommendorkapten @haydentherapper or others

Proposal: Include raw Rekor Bundle in TransparencyLogEntry

Description

The bundle spec includes transparency log entries. In earlier discussions and the unmerged verification documentation, I think we agreed that IF a log entry is provided in the bundle, we MUST perform a successful SET verification (inclusion promise).

The SET verification requires 6 pieces of data:

  • canonicalized Rekor entry
  • integrated time
  • log index
  • log id
  • Rekor's public key
  • signature (known as the SET)

All of the above fields are included in the bundle except for the canonicalized Rekor entry. The original proposal decided to reconstruct it from other data in the bundle, adding KindVersion to hint to the verifier which entry version to reconstruct. I believe this decision was made because the intoto type would duplicate the DSSE in the bundle, producing a larger file size due to the redundant data.

Unfortunately, we still lack enough information to deterministically reproduce all Rekor entries. This was called out in the original proposal, but it was believed that we could reconstruct hashedRekord and intoto entries, which I understand to be the only supported entry types for the bundle. In fact, the intoto type still has an ambiguous field: the hash value calculated over the DSSE envelope. Rekor does not enforce any canonicalization of the DSSE used to produce the hash in the intoto v0.0.2 type; therefore, current intoto v0.0.2 entries cannot be verified without the raw bytes of the DSSE, which are not included in the bundle.

Additionally, it is a large burden to place on non-Go clients to reproduce the behavior of all the supported Rekor entry types, and I believe this will lead to verifiers that don't follow the spec.

In my opinion, we should just capture the raw Rekor entry in the TransparencyLogEntry.
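For concreteness, here is a sketch of the SET check those six inputs feed into. Assumptions worth flagging: the SET is an ECDSA signature over the RFC 8785 canonicalization of a four-field JSON payload, and for this flat object a sorted struct-field order makes encoding/json produce the canonical bytes in the common case (a real implementation should use a proper canonicalizer):

package main

import (
        "crypto/ecdsa"
        "crypto/sha256"
        "encoding/json"
)

// rekorPayload mirrors the fields the SET covers. Struct-field order
// matches RFC 8785 key ordering for this flat object.
type rekorPayload struct {
        Body           string `json:"body"` // base64 of the canonicalized entry
        IntegratedTime int64  `json:"integratedTime"`
        LogID          string `json:"logID"` // hex SHA-256 of the log's public key
        LogIndex       int64  `json:"logIndex"`
}

// verifySET checks the inclusion promise against the log's public key.
func verifySET(pub *ecdsa.PublicKey, set []byte, p rekorPayload) bool {
        canonical, err := json.Marshal(p)
        if err != nil {
                return false
        }
        digest := sha256.Sum256(canonical)
        return ecdsa.VerifyASN1(pub, digest[:], set)
}

Note that the sketch's Body field is exactly the canonicalized entry this issue is about: without those bytes in the bundle, a verifier has to reconstruct them perfectly or the digest will not match.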

Alternatives:

  • We can just leave it as-is and require online verification for Rekor entries, but that kinda breaks the goal of offline verification of these bundles.
  • We can use TSA countersignatures to verify the signature was produced while the certificate was valid, which offers much of the same level of authenticity

[Proposal] Require inclusion proofs

As discussed in sigstore/rekor#1566, I am proposing that inclusion proofs in bundles become mandatory. There are no server-side changes, as these are already returned.

Here's a table of the current support for inclusion proofs per client, and how they support verifying proofs (I included SETs to be thorough).

Client          | Proof included in bundle? | Verifying proofs offline | Verifying proofs online | Verifying SETs
sigstore-java   | Yes                       | Yes?                     | Yes?                    | Yes
sigstore-js     | WIP                       | No                       | No                      | Yes
sigstore-python | Yes                       | Yes                      | Yes                     | Yes
cosign          | No, no bundle support     | Yes (not from bundle)    | Yes                     | Yes (not from bundle)
sigstore-rs     | No, no bundle support     | No                       | No                      | Yes?

@bdehamer @woodruffw @loosebazooka @znewman01 Please let me know if anything is inaccurate or help fill in the question marks.
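For reference, the inclusion proof side is at least fully specified; here is a Go sketch of the standard RFC 6962 audit-path algorithm each client would need to implement (or import):

package main

import (
        "bytes"
        "crypto/sha256"
)

// leafHash and nodeHash use RFC 6962 domain separation.
func leafHash(leaf []byte) []byte {
        h := sha256.New()
        h.Write([]byte{0x00})
        h.Write(leaf)
        return h.Sum(nil)
}

func nodeHash(left, right []byte) []byte {
        h := sha256.New()
        h.Write([]byte{0x01})
        h.Write(left)
        h.Write(right)
        return h.Sum(nil)
}

// verifyInclusion recomputes the root hash from a leaf, its index,
// the tree size, and the audit path, then compares it with the root
// hash from the signed checkpoint.
func verifyInclusion(leaf []byte, index, treeSize uint64, proof [][]byte, root []byte) bool {
        if index >= treeSize {
                return false
        }
        hash := leafHash(leaf)
        fn, sn := index, treeSize-1
        for _, p := range proof {
                if sn == 0 {
                        return false // proof longer than the path to the root
                }
                if fn%2 == 1 || fn == sn {
                        hash = nodeHash(p, hash)
                        for fn != 0 && fn%2 == 0 {
                                fn >>= 1
                                sn >>= 1
                        }
                } else {
                        hash = nodeHash(hash, p)
                }
                fn >>= 1
                sn >>= 1
        }
        return sn == 0 && bytes.Equal(hash, root)
}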

Define Bundle Verification Result

We already have message types defined that describe the inputs to the verification process (Bundle, TrustedRoot, ArtifactVerificationOptions) so it seems reasonable to also define a standardized verification response. This will give clients a standard way to convey more specific verification errors (and make it easier to compare the verification results across the different client implementations).

I'm not married to this particular design, but offering it as a starting point for refinement:

enum VerificationStep {
        // proto3 enums need a zero value; listed here for validity
        VERIFICATION_STEP_UNSPECIFIED = 0;
        // Verify the signature w/ the leaf certificate in the chain
        SIGNATURE = 1;
        // Verify the certificate chain back to the root of the identity service
        CERTIFICATE_CHAIN = 2;
        // Verify the SET of the transparency log entry
        TRANSPARENCY_LOG = 3;
        // Verify the signed timestamp is valid and trusted
        TIMESTAMP_AUTHORITY = 4;
        // Verify the signature timestamp (either from tlog SET or timestamp authority) is within certificate's validity period
        SIGNATURE_TIMESTAMP = 5;
}

message VerificationResult {
        bool success = 1;
        optional string message = 2;
        optional VerificationStep failed_step = 3;
}

If we go with this idea of having some sort of enum to identify the specific verification step which failed, we could go even more granular (e.g. CERTIFICATE_CHAIN_SCT, TRANSPARENCY_LOG_INCLUSION_PROOF, etc.)... it's not clear to me what the most useful level of detail is going to be.

Another thing which occurs to me is that it might be helpful for the VerificationResult to surface the Fulcio certificate extensions (in the case where the verification is successful). The extensions are probably the next things any client is gonna read after verifying the bundle and this could save them the trouble of fishing them out of the certificate.

CC @kommendorkapten @asraa

Create and publish new java release v0.0.1

Just to get a release started (I understand the spec is not finalized) -- we need something to work with for now.

note: java release process requires vN.N.N (it's checked in the action). We should change that if we want to do a *.1a1 like python, but that's up to the repo owners.

A repo admin must (maybe @znewman01)

  • push a tag release/java/v0.0.1

Someone with sonatype[dev.sigstore] publish permissions must (probably @loosebazooka)

  • release to maven central

Detailed instructions at https://github.com/sigstore/protobuf-specs/blob/main/java/README.md

Stabilize the bundle format

Per sigstore/sig-clients#8: releasing a 1.0 version of the specs here would lend weight to cross-client standardization.

As part of that, we probably need a task burndown, along a few axes:

  1. What isn't in the specs yet that needs to be?
  2. What (breaking) changes do we want to make for a 1.0?
  3. What other language bindings do we want to have ready for a 1.0?

Generate JSON Schema from Protobuf definitions and use it to enforce fields?

Breakout from #64 (comment): I think one possible solution to our "required" field woes is as follows:

  1. Keep the current Protobuf definitions, and mark fields we expect/need as required;
  2. Use a Protobuf to JSON Schema generator to generate a corresponding JSON Schema, including actual field enforcement (meaning that the generator will need to respect/handle the relevant annotations)
  3. Use the generated JSON Schema to actually enforce the validity of JSON marshaled messages

This shouldn't be too difficult to do, and it looks like there's already some tooling out there that does it (https://github.com/chrusty/protoc-gen-jsonschema).

OTOH, it adds yet another layer of schematization, and only applies to JSON-marshaled messages.

cc @znewman01

[RFC] Rehome `io.intoto` namespace under `dev.sigstore`?

This repository currently contains a copy of the in-toto envelope message definitions, tweaked slightly to influence code generation:

https://github.com/sigstore/protobuf-specs/blob/85dce20afb5e8ad9e170328abb7ff2e61b758958/protos/envelope.proto

These message definitions currently declare their package namespace as io.intoto, which is consistent with the original definition in the DSSE spec repo:

https://github.com/secure-systems-lab/dsse/blob/master/envelope.proto

Based on the conversation in #86, IMO it may make sense to change the package namespace to dev.sigstore.intoto or similar here:

  1. We've slightly modified the message definition (adding metadata to reflect different codegen namespaces)
  2. We've slightly modified the definition's documentation (clarifying it in a few places)
  3. Our copy is (nominally) independent in the sense that we're locked into it, and upstream changes won't be reflected by us without additional compatibility work.

On the other hand:

  1. I'm not sure this actually matters: aside from code generation, does anything really care about the package namespace definition?
  2. Maybe it's unidiomatic to change the package namespace like this? I'm not familiar enough with the Protobuf ecosystem to know.

CC @znewman01 @bobcallaway @haydentherapper for opinions here.

Document bundle verification

Description

There is currently no real documentation on how to verify a bundle. There are some drafts like this: https://github.com/kommendorkapten/cosign/blob/bundle_verification/specs/dotsigstore_bundle/verify.md

The bundle format alone also leaves some things underspecified, such as the hash and signature algorithms used by the transparency log; these have to be documented. This is now captured in the bundle verification interface, and also called out in issue #7, but it should be properly documented.

During SigstoreCon there was a lot of discussion on this, the general idea was a layered approach:

  • The process has a global trust root
  • During artifact verification a few steps happen
    1. Based on the artifact/policy, a filtered subset of the trust root is selected
    2. The real verification function is executed with the filtered trust root and artifact as input

The primary reason for first filtering the trust root and then running the verification logic is that code with this layout is easier to write and test, as each stage has a specific purpose. The verification code should be as simple as possible, with minimal dependencies and few possible choices.

Where should such documentation live? In this repository, or in the architecture-docs? As a separate document, or as part of the client spec?

cc @znewman01 @asraa @joshuagl @vaikas, who were present during the discussions.

Security Policy violation Binary Artifacts

This issue was automatically created by Allstar.

Security Policy Violation
Project is out of compliance with Binary Artifacts policy: binaries present in source code

Rule Description
Binary Artifacts are an increased security risk in your repository. Binary artifacts cannot be reviewed, allowing the introduction of possibly obsolete or maliciously subverted executables. For more information see the Security Scorecards Documentation for Binary Artifacts.

Remediation Steps
To remediate, remove the generated executable artifacts from the repository.

Artifacts Found

  • java/gradle/wrapper/gradle-wrapper.jar

Additional Information
This policy is drawn from Security Scorecards, which is a tool that scores a project's adherence to security best practices. You may wish to run a Scorecards scan directly on this repository for more details.


This issue will auto-resolve when the policy is in compliance.

Issue created by Allstar. See https://github.com/ossf/allstar/ for more information. For questions specific to the repository, please contact the owner or maintainer.

FR: Separate canonicalized_body from TransparencyLogEntry?

@asraa @haydentherapper and I had a discussion over at sigstore/rekor#1390 (comment) that I figured is worth opening up an issue here for discussion.

I'm currently implementing offline verification for Gitsign. Currently this is done by serializing all the fields in TransparencyLogEntry individually to their own OIDs, then reconstructing the message at verification time. An idea was suggested to encode the serialized version of TransparencyLogEntry instead, but the message as defined today is problematic because of canonicalized_body.

canonicalized_body was added to TransparencyLogEntry in #10 to let clients verify data without needing to reconstruct the body themselves due to concerns around normalization of DSSE encoding (#9).

However, for Gitsign we would like to omit this body and reconstruct it.
Gitsign has constraints on the signature format to stay in line with other Git signatures. If we store a serialized version of the TransparencyLogEntry with canonicalized_body, we end up doubling the size of the signature for data that we already store elsewhere and can reconstruct.

I'd like to propose modifying TransparencyLogEntry to allow storing only the minimal data needed. A few options I can see:

  1. We break out canonicalized_body from TransparencyLogEntry and create a new wrapper message that combines them. e.g.

    // name tbd - open to bikeshedding
    message TransparencyLogEntryBundle {
      TransparencyLogEntry entry = 1;
      bytes canonicalized_body = 2;
    }

    This way the TLE + canonicalized_body can still be included in the full bundle, but applications that only need the minimal data in TransparencyLogEntry can just use that.

  2. We add to the spec that it's valid for the canonicalized_body field to be unset, and if it is, it's the responsibility of the client to construct it during verification. This unfortunately has the side effect of effectively undoing the changes in #10, since clients will need to be able to handle cases where the canonicalized_body is not present.

  3. We let Gitsign be special and document our off-label usage of TransparencyLogEntry there. I'm a little hesitant about this since a) we wouldn't be able to treat TransparencyLogEntry like an opaque envelope and may need to track changes over time, and b) if Gitsign is running into this, my guess is other signing tools may as well.

Also open to other ideas!

cc @znewman01 @codysoyland

Common test suite strategy

sigstore/cosign#2434 proposes adding "conformance testing" to Cosign (previously implemented in sigstore-python).

I think this is a great idea, and we've been thinking about it in the abstract in this repository. I'd like to use this issue to discuss strategy details. What kinds of tests should we be running across all Sigstore clients? How should we architect those?

CC @tetsuo-cpp @di @woodruffw

See also some context from the linked Cosign issue.


Hey @znewman01, thanks for the ideas and feedback.

Appreciate the additional context; I understand much better what you're trying to achieve here. I think overall our goals are aligned and I'm optimistic that we can come up with a plan we're both happy with (and again, I'm still open to merging the CLI tests as-written and even keeping them long-term, I just want to make sure that's part of a coherent conformance testing strategy).

Or were you thinking more that the conformance test suite should just provide a set of test inputs and expected outputs, and leave the actual test runner logic to each client's unit tests?

Precisely. IMO it's way more work to add new shell script-y test suites than to just have each client import standard test data. I also think the performance benefits of avoiding creating are im

It's also not clear to me that every language client should have a CLI.

I think that could be complicated because we'll want to test interactions with Fulcio, Rekor and eventually mock out those services to test scenarios like getting a broken inclusion proof, SCT, etc.

Hm...my imagination is that every CLI should be a pretty thin wrapper around a library interface, so anything you could do in a shell script (spin up a fake Fulcio server and point your tool at it) could happen inside a language-specific test harness just as easily (and integrate better with the repo's existing tests).

I guess this is mostly my personal biases -- I worked full-time on a CLI for a while, and always found CLI-level testing to be a really blunt instrument: brittle and slow. Further, architecting a CLI for testability without going through the CLI interface itself means that your library is much more usable as a library and that the tests are clearer (assertions don't require regexes and parsing STDOUT). Not sure -- am I at all convincing here that CLI tests are something to avoid when possible?

I see value in using Protobuf messages but, the way I see it, they can complement the current approach. So at some point when these Protobuf definitions become standardised, we could simplify the CLI protocol and instead ask that clients expose some kind of --protobuf flag that takes an Input rather than the flags that we're asking for now.

Certainly the protobufs work really well for testing verification. I think it's doable to start expressing full sign+verify flows declaratively too, with faked responses -- and in fact, I think a proto-centric approach makes the most sense for faking.

Maybe we could split the difference:

  • Move as much configuration and as many test cases as possible into a standard (declarative) specification.
    • Verification-only cases are probably the easiest.
    • But you could imagine entire sign+verify flows with requests/responses.
    • This doesn't need to be in protobufs exclusively, but I'd strongly prefer a language-agnostic format to actual code.
  • Have drivers/harnesses to run these shared test cases.
    • They'd be responsible for requests/responses, feeding test cases into the library, and checking that we get the right results.
    • One such harness (the first one) could be a CLI harness, which could spin up fake servers and execute commands.

This may be overengineering for now, and it seems reasonable to start with everything in the "CLI harness" but perhaps with a longer-term vision of moving first the verification cases, then everything else, into the "declarative test specification" layer.
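To make the declarative layer concrete, here is a hypothetical sketch of a shared case file plus a thin Go harness (the case format and the verify hook are invented for illustration):

package conformance

import (
        "encoding/json"
        "os"
        "testing"
)

// testCase keeps every input and the expected outcome as plain data,
// so each client ships its own thin harness but shares the cases.
type testCase struct {
        Name        string          `json:"name"`
        Bundle      json.RawMessage `json:"bundle"`      // Sigstore bundle, proto JSON
        TrustedRoot json.RawMessage `json:"trustedRoot"` // pre-resolved trust root
        Artifact    []byte          `json:"artifact"`    // base64 in the JSON file
        WantValid   bool            `json:"wantValid"`
}

// verify is the client-specific glue; a real harness calls into the
// library under test here.
func verify(bundle, root json.RawMessage, artifact []byte) bool {
        return false
}

func TestConformance(t *testing.T) {
        data, err := os.ReadFile("testdata/cases.json")
        if err != nil {
                t.Fatal(err)
        }
        var cases []testCase
        if err := json.Unmarshal(data, &cases); err != nil {
                t.Fatal(err)
        }
        for _, tc := range cases {
                tc := tc
                t.Run(tc.Name, func(t *testing.T) {
                        if got := verify(tc.Bundle, tc.TrustedRoot, tc.Artifact); got != tc.WantValid {
                                t.Errorf("got valid=%v, want valid=%v", got, tc.WantValid)
                        }
                })
        }
}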

Regardless, I think the Cosign repo isn't quite the right place for this issue -- would you want to open one up in https://github.com/sigstore/protobuf-specs, which seems to be the home of most cross-client concerns?

This issue was specifically to add the existing GitHub Action to cosign's CI, so that's why I've opened it here. At the moment, we want to pilot it with clients other than sigstore-python (the client I usually work on).

In the medium-term, I'd expect "common test suite" to be tracked mostly over in the protobuf-specs repo, but it makes sense to have per-client repo issues as well. Maybe we can keep this issue open to track the integration for Golang, and I'll file another issue for the strategy/philosophy discussion.

(Part of the reason I want to move it out of this repo too is that at some point these tests would move over to https://github.com/sigstore/sigstore-go because that's where the core Sigstore logic for Go will live, and Cosign will mostly focus on OCI/container signing.)

Originally posted by @znewman01 in sigstore/cosign#2434 (comment)

More details on encoding

We're planning on using some of these things as interchange formats. There are a few references to "when encoded as JSON" etc. in the protos. Can we get a little more specificity (e.g., state outright that "the recommended serialization is proto JSON with media_type set")?

CC @patflynn
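If the answer ends up being "proto JSON", the round-trip is what protojson implements; a sketch (the generated-bindings import path is an assumption, and the message is otherwise left empty for brevity):

package main

import (
        "fmt"

        "google.golang.org/protobuf/encoding/protojson"

        // Assumed import path for the generated Go bindings.
        v1 "github.com/sigstore/protobuf-specs/gen/pb-go/bundle/v1"
)

func main() {
        b := &v1.Bundle{
                MediaType: "application/vnd.dev.sigstore.bundle+json;version=0.1",
        }

        // protojson implements the canonical proto3 JSON mapping, which
        // is what "proto JSON with media_type set" would pin down.
        out, err := protojson.Marshal(b)
        if err != nil {
                panic(err)
        }
        fmt.Println(string(out))

        var parsed v1.Bundle
        if err := protojson.Unmarshal(out, &parsed); err != nil {
                panic(err)
        }
}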

Drop namely/protoc-all in favour of a custom Dockerfile?

When working on #112 I realised that namely/protoc-all seems to be closed source, and it's very old (Debian Bullseye), which for example lacks a newer Go toolchain (it only provides 1.15, released in 2020).

It wasn't that hard to hack together a Docker image based on Alpine with rudimentary support for protoc. I think it would be worth exploring the work of adding Java, Ruby and other languages, to make sure we know what gets into the image and are not locked into a closed-source vendor. As an example, I had to craft a new Docker image with a newer Go version to get the JSON schema generation to work.

cc @woodruffw @haydentherapper @znewman01

Stabilization plans?

Thanks a ton for breaking this repo out! I'm extremely excited to see these formats get developed, so that we can reuse them in sigstore-python.

To wit: is there a current roadmap/schedule for officially stabilizing the protobuf definitions in this repo (in particular, the Sigstore bundle definition)? Once there's an official version, even a beta one, we can begin ingesting and emitting it on the Python client side 🙂

Rust bindings

It'd be nice for this repository to also contain Rust bindings, and publish them automatically!

CC @jleightcap

Should we specify a key fingerprint algorithm for public key hint?

Question

The public key "hint" in https://github.com/sigstore/protobuf-specs/blob/main/protos/sigstore_common.proto#L73-L83 leaves the format of the hint unspecified. What if we agree on a standard fingerprinting algorithm, such as this one, which was extracted from OpenSSH?

This could simplify the verifier, as you would just need to configure it with a set of trusted public keys, and the hints/fingerprints would be derived automatically.
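A sketch of what that would look like in Go with golang.org/x/crypto/ssh, assuming we standardized on the OpenSSH SHA256:base64 fingerprint format:

package main

import (
        "crypto/ecdsa"
        "crypto/elliptic"
        "crypto/rand"
        "fmt"

        "golang.org/x/crypto/ssh"
)

func main() {
        // An example key; in practice this comes from the trusted key-ring.
        priv, err := ecdsa.GenerateKey(elliptic.P256(), rand.Reader)
        if err != nil {
                panic(err)
        }

        sshPub, err := ssh.NewPublicKey(&priv.PublicKey)
        if err != nil {
                panic(err)
        }

        // Prints a string like "SHA256:...", which a verifier could
        // derive for every key in its key-ring and match against the hint.
        fmt.Println(ssh.FingerprintSHA256(sshPub))
}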

Specify hash algorithm and signature algorithm for a transparency log.

Description

The current bundle format does not capture the hash algorithms used by the log. This is called out in this comment sigstore/cosign#2204 (comment).

The hash and signature algorithm information is now proposed to be part of the transparency log parameters used for verification, per this PR: https://github.com/sigstore/protobuf-specs/pull/5/files#diff-b1f89b7fd3eb27b519380b092a2416f893a96fbba3f8c90cfa767e7687383ad4R31

My proposal is to omit this data from the bundle, and only capture it in the tlog parameters.

Adopt the sigstore bundle format in cosign

Description

As part of the effort in sigstore/cosign#2331, we're adding support for storing verification data from timestamp authorities (e.g. sigstore/timestamp-authority) in the current cosign bundle. That means users could use Rekor or a timestamp authority to verify their signed artifacts. However, the current cosign implementation assumes the bundle only contains Rekor data.

As a consequence, we decided to reuse the Sigstore bundle format approved in this repository. This format would help us extend the current RekorBundle into a more generic Bundle that could satisfy this new use case. At the same time, we should ensure this new Bundle type does not break the old format, to avoid backwards-compatibility issues.

We've started proposing some changes in sigstore/cosign#2422, where @haydentherapper suggested including the maintainers of this spec in the decision about this new type.

Removing the root cert from the bundle's cert chain?

I was thinking about this some more:

// A chain of X.509 certificates.
message X509CertificateChain {
        // The chain of certificates, with indices 0 to n.
        // The first certificate in the array must be the leaf
        // certificate used for signing. Any intermediate certificates
        // must be stored as offset 1 to n-1, and the root certificate at
        // position n.
        repeated X509Certificate certificates = 1;
}

...and I think, as specified, it presents a sharp edge for Sigstore clients: the naive thing to do would be to shove the entire cert chain into the verifying context (i.e. a thing that looks roughly like an OpenSSL X509_STORE). However, if a client does this, then Sigstore signatures effectively become self-signed: any malicious user can insert a cert chain that includes a root CA that isn't one of the trusted ones, and chain validation will resolve against that CA instead.

Clients still need to handle this on their own (they should verify the expected contents of the bundle's cert chain), but as a risk-reduction strategy: what does everyone think about specifying that the root certificate be kept out of the bundle's cert chain? That would turn this from a silent "gotcha" into a spec violation, one that we could write conformance tests against.
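For illustration, here is how a Go client can already avoid the sharp edge regardless of what the spec says (a sketch, not normative): only out-of-band roots go into the trust anchor pool, and the bundle contributes at most the leaf and intermediates.

package main

import (
        "crypto/x509"
        "time"
)

// verifyLeaf verifies the bundle's leaf certificate against roots
// obtained out of band (e.g. via TUF).
func verifyLeaf(leaf *x509.Certificate, bundleIntermediates []*x509.Certificate,
        trustedRoots *x509.CertPool, signedAt time.Time) error {

        intermediates := x509.NewCertPool()
        for _, c := range bundleIntermediates {
                // Even if a bundle smuggled a root in here, it would not
                // become a trust anchor: only trustedRoots terminates chains.
                intermediates.AddCert(c)
        }

        _, err := leaf.Verify(x509.VerifyOptions{
                Roots:         trustedRoots,
                Intermediates: intermediates,
                CurrentTime:   signedAt, // Fulcio certs are short-lived
                KeyUsages:     []x509.ExtKeyUsage{x509.ExtKeyUsageCodeSigning},
        })
        return err
}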

Publish generated proto clients in an idiomatic way

When writing a client library for Sigstore, you'll need access to the protos. Manually copying them in is tedious and error-prone.

Instead, we should publish sigstore-protos libraries in a language-idiomatic way (e.g., on a popular package repository).

The languages that will need this:

  • Ruby (RubyGems)
  • Java (Maven Central) #33 #43 #45
  • JavaScript (npm)
  • Go? (Go just uses Git, but I'm not sure how that works with generated files).
  • Rust (Cargo)
  • Python (PyPI) #20 #32 #34 #35

We'll probably want advice from representatives of each community, since no maintainer on this repo understands best practices for all of these languages unless somebody has weird hobbies they're not telling me about.

Support detached SCTs

Ah, something that I think may be missing for Fulcio verification is the possibility of a detached SCT header. The public-good Fulcio instance will always provide an embedded SCT (a promise of inclusion from the CT log). However, private deployments may either specify a detached SCT or exclude CT entirely (which is not ideal, but depends on the setup). I think we need a lightweight sigstore_fulcio.proto (or at least a message in common.proto) that provides a certificate chain and an optional detached SCT - like https://github.com/sigstore/fulcio/blob/main/fulcio.proto#L153

To abstract away Fulcio entirely, we could just wrap X509CertificateChain in another message, such as:

message SignedCertificateTimestamp {
  bytes sct = 1;
}

message CertificateVerifier {
  X509CertificateChain x509_certificate_chain = 1;
  // Optional, may be embedded in the leaf certificate in the chain
  SignedCertificateTimestamp sct = 2;
}

message VerificationMaterial {
        oneof content {
                PublicKeyIdentifier public_key = 1;
                CertificateVerifier certificate_verifier = 2;
        }
}

or as its own proto; but I like the idea that Fulcio isn't a requirement - it's just that you need a certificate chain (from Fulcio or any CA) and an SCT (from Fulcio or your own CT log)

See sigstore/cosign#1731 or sigstore/cosign#2382 for some comments on it.

Originally posted by @haydentherapper in #1 (comment)

Do a JavaScript release

See #3.

This would entail:

  • setting up an npm project for this repo
  • automating the procedure to generate JS bindings to the protos (Makefile), generally checking in the generated code (gen/ and .github/workflows/)
  • automating/documenting the procedure to do releases of the generated code (RELEASE.md plus any GH Actions) on a tag, preferably releases/js/v0.1.1 or whatever.

Then you could start consuming it in sigstore-js.

CC @bdehamer @feelepxyz @eddiezane

Certificate SAN Verification

Question
In the SubjectAlternativeNameType enum there is a value named SUBJECT_ALTERNATIVE_NAME_TYPE_UNSPECIFIED. How should the certificate's SAN extension be verified when this type is specified?

I understand how to retrieve/compare the extension values for EMAIL (assuming this maps to "rfc822Name [1]"), URI and OTHER_NAME, but I'm not sure what to do when the type is unspecified. Scan through the sequence of names and find anything that matches? Fail the verification regardless of value?
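For illustration, a Go sketch of the "scan and match anything" reading (one of the two options above; whether it, or failing outright, is correct is exactly the question):

package main

import (
        "crypto/x509"
)

// matchSAN scans every SAN that crypto/x509 surfaces and accepts any
// exact match, regardless of SAN type.
func matchSAN(cert *x509.Certificate, expected string) bool {
        for _, e := range cert.EmailAddresses { // rfc822Name
                if e == expected {
                        return true
                }
        }
        for _, u := range cert.URIs { // uniformResourceIdentifier
                if u.String() == expected {
                        return true
                }
        }
        // otherName entries are not parsed by crypto/x509; handling them
        // needs manual ASN.1 decoding of the SAN extension (omitted).
        return false
}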

Ambiguity around `TransparencyLogEntry.canonicalized_body`

I noticed this while trying to get sigstore-python's WIP bundle generation to verify against sigstore-js; here's what the current canonicalized_body docs say:

        // The canonicalized Rekor entry body, used for SET verification. This
        // is the same as the body returned by Rekor. It's included here for
        // cases where the client cannot deterministically reconstruct the
        // bundle from the other fields. Clients MUST verify that the signature
        // referenced in the canonicalized_body matches the signature provided
        // in the bundle content.

I interpreted this as the body of the response returned by Rekor, without the surrounding apiVersion, kind, etc. envelope. My reasoning there is that that's the canonicalized body that goes directly into SET verification, without any transformation, while including the envelope means that the user has to remove it and then potentially re-canonicalize.

sigstore-js interprets it the other way: they include and expect the entire enveloped form.

IMO either of these interpretations is reasonable, but there probably needs to be exactly one 🙂

Clarify naming/types around verification bundle members?

Similar to #29 -- I'm doing some reading through the protobufs now that we're getting close to integrating them into sigstore-python, and I wanted to check my understanding of a few things 🙂

Here's the bundle as it's currently defined:

message Bundle {
        // MUST be application/vnd.dev.sigstore.bundle+json;version=0.1
        // when encoded as JSON.
        string media_type = 1;
        VerificationData verification_data = 2;
        dev.sigstore.common.v1.VerificationMaterial verification_material = 3;
        oneof content {
                dev.sigstore.common.v1.MessageSignature message_signature = 4;
                // A DSSE envelope can contain arbitrary payloads.
                // Verifiers must verify that the payload type is a
                // supported and expected type. This is part of the DSSE
                // protocol which is defined here https://github.com/secure-systems-lab/dsse/blob/master/protocol.md
                io.intoto.Envelope dsse_envelope = 5;
        }
        // Reserved for future additions of artifact types.
        reserved 6 to 50;
}

There are a couple members here that I currently find a little confusing (although it's very possible I don't understand them, and understanding them will resolve my confusion!):

  • VerificationData is a pretty generic type name, and my first assumption was that it's a wrapper type for the input that was signed. But it's actually a wrapper for a list of transparency log entries, plus materials for timestamp verification.
    • Proposal: either rename this to something like ExtendedVerificationMaterials (emphasizing that it's additional materials that can be used to verify additional properties), or maybe decompose it directly into the body of the Bundle?
  • MessageSignature contains two parts: both a hash over the input, and a signature over the input. I found this a little confusing originally, since the two are conceptually separate (and it's not clear what role the hash plays in the bundle, since it can just be computed from the input being verified).
    • Proposal: either remove MessageSignature.message_digest, or decompose it into the surrounding Bundle? If we do the latter, we need to emphasize that checking the hash against the input's computed hash becomes important -- otherwise we might end up in another "confused metadata" circumstance where the user "verifies" a bundle without actually pinning it to the input they're verifying. (A sketch of this check follows the list.)
  • HashOutput is a type that contains basically this tuple: (hash-type, digest). However, my understanding of its use (per the CT RFC) is that the only valid hashing algorithm is SHA256 (despite SHA512 being defined). Am I missing a use case where we need this field to be generic over multiple hashing algorithms?
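On the MessageSignature point above: whichever decomposition is chosen, the check the second bullet warns about is tiny; a sketch:

package main

import (
        "bytes"
        "crypto/sha256"
        "errors"
)

// checkMessageDigest pins the bundle to the artifact being verified;
// skipping this comparison is exactly the "confused metadata" failure
// mode described above.
func checkMessageDigest(artifact, bundleDigest []byte) error {
        computed := sha256.Sum256(artifact)
        if !bytes.Equal(computed[:], bundleDigest) {
                return errors.New("bundle message_digest does not match artifact")
        }
        return nil
}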

Roundup: more documentation/design questions

I'm doing another one of these roundup issues 🙂

common.PublicKeyIdentifier

// PublicKeyIdentifier can be used to identify an (out of band) delivered
// key, to verify a signature.
message PublicKeyIdentifier {
        // Optional unauthenticated hint on which key to use.
        // The format of the hint must be agreed upon out of band by the
        // signer and the verifiers, and so is not subject to this
        // specification.
        // Example use-case is to specify the public key to use, from a
        // trusted key-ring.
        // Implementors are RECOMMENDED to derive the value from the public
        // key as described in https://www.rfc-editor.org/rfc/rfc3280#section-4.2.1.1
        string hint = 1;
}

The guidance in RFC 3280 §4.2.1.1 isn't great: it encourages identification with the SHA-1 of the ASN.1 BIT STRING for the subjectPublicKeyInfo (presumably DER-encoded), with some modifications made to it.

Maybe we could recommend https://www.rfc-editor.org/rfc/rfc6962#section-3.2 instead, since it's what we're already doing for key identification elsewhere due to CT? It's a lot simpler (just SHA2-256(DER(subjectPublicKeyInfo))), and uses a more modern digest.
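For comparison, the RFC 6962 Section 3.2 derivation is nearly a one-liner in Go (a sketch, using a throwaway key):

package main

import (
        "crypto/ecdsa"
        "crypto/elliptic"
        "crypto/rand"
        "crypto/sha256"
        "crypto/x509"
        "fmt"
)

func main() {
        priv, err := ecdsa.GenerateKey(elliptic.P256(), rand.Reader)
        if err != nil {
                panic(err)
        }

        // SHA2-256 over the DER-encoded subjectPublicKeyInfo -- the same
        // key ID that CT logs (and LogId elsewhere in these protos) use.
        spki, err := x509.MarshalPKIXPublicKey(&priv.PublicKey)
        if err != nil {
                panic(err)
        }
        keyID := sha256.Sum256(spki)
        fmt.Printf("%x\n", keyID)
}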

Verification messages

I might be missing how these are used/expected to be used, but some questions absent that knowledge:

  1. Are these intended to be building blocks for verification policies, i.e. verifying specific OIDC claims/X.509v3 extensions once signature, etc. verification has succeeded?
    1. If that's the intent, it might be worth thinking about regular expression support in ObjectIdentifierValuePair, similar to the current support in SubjectAlternativeName.
    2. Additionally, it might be nice to have building blocks for composing policies, similar to how we've designed the policy API in sigstore-python -- it would be nice to be able to compose messages that describe "all of"/"any of" and other set-language operations (like thresholds)
  2. How does Artifact interact with a bundle's MessageSignature.message_digest? Is the expectation here that the verifier would collect the Artifact (either as direct bytes or from the specified URL) and confirm that it matches the message_digest? I think this is similar to the question I had in #30, but I want to make sure I fully understand the moving pieces here 🙂

cc @znewman01 @haydentherapper @kommendorkapten

Generate Python bindings?

Similar to the codegen'd Go that's already in the repo: would it be okay to generate and check in the Python bindings? If so, I/we (ToB) can take a stab at it.

Verification materials: certificate versus certificate chain?

I noticed this while looking through the generated Python code for the protobufs. A VerificationMaterial is defined to contain an entire certificate chain:

// VerificationMaterial captures details on the materials used to verify
// signatures.
message VerificationMaterial {
        oneof content {
                PublicKeyIdentifier public_key = 1;
                X509CertificateChain x509_certificate_chain = 2;
        }
}

I might be misunderstanding the use case or context here, but this doesn't currently make sense to me: a bundle bearing its own entire certificate chain is equivalent, during verification, to a signature from any untrusted signing party, since anybody could substitute their own cert chain for the "public good" Fulcio cert chain.

So, my question: is this intended to be an intermediate (non-terminated) cert chain, or is there some other use case that I'm missing? For sigstore-python, we almost certainly want to pass just one certificate in with our verification materials, and rely on TUF or another out-of-band system (like baked-in certs) to obtain the rest of the chain.

Should more messages/fields be marked as required?

I'm a bit of a protobuf neophyte, so apologies if this is a silly or misguided question 🙂

To my understanding, every field in a protobuf definition is optional by default (with the optional keyword really just being a way to mark that explicitly).

Given that many of our fields are absolutely required (such as the signature and inclusion proof), does it make sense to mark those as required at the wire level?

I see that Fulcio's protobuf definitions use the (google.api.field_behavior) = REQUIRED syntax, so is something similar to that appropriate here?

Keep version numbers consistent across languages

This repository is eventually going to contain code to publish packages to each language ecosystem's respective package index. We need to figure out some scheme for keeping these version numbers consistent. For instance, the v0.0.5 version of this package should be generated from the same set of Protobuf specifications regardless of whether we download it for Python, Java, etc.

Context: #25 (comment)

FR: multiple signatures from multiple sources

I would like to at least explore the idea of multiple signatures as well. Different objects in a repository (like a Java one) might come from different sources and then be uploaded together, so adding a bundle to a repository might involve merging multiple signature objects?

Which is slightly different than, say, one person collecting all the artifacts and signing them after the fact.

Originally posted by @loosebazooka in #48 (comment)

Clarifying/reducing agility in a few more messages

#30 is getting a little long, so I'm filing a separate issue to track some other potential changes to the current messages 🙂


LogId

Here's how LogId is currently defined:

// LogId captures the identity of a transparency log.
message LogId {
        oneof id {
                // The unique id of the log, represented as the SHA-256 hash
                // of the log's public key, computed over the DER encoding.
                // https://www.rfc-editor.org/rfc/rfc6962#section-3.2
                bytes key_id = 1;
                // Currently not used but proposed by
                // https://datatracker.ietf.org/doc/rfc9162/
                ObjectIdentifier oid = 2;
        }
}

IMO, we probably don't want this agility: the message formats will probably have to change significantly anyways if and when Sigstore moves to CT 2.0, so we should probably limit LogId to just the SHA256(DER(pk)) format that CT 1.0 allows.

HashAlgorithm and SignatureAlgorithm

Here's how these are currently defined:

enum HashAlgorithm {
        HASH_ALGORITHM_UNSPECIFIED = 0;
        SHA2_256 = 1;
        SHA2_512 = 2;
}

// Subset of known signature algorithms.
enum SignatureAlgorithm {
        SIGNATURE_ALGORITHM_UNSPECIFIED = 0;
        ECDSA_P256_SHA_256 = 1; // See NIST FIPS 186-4
        ECDSA_P256_HMAC_SHA_256 = 2; // See RFC6979
        ED25519 = 3; // See RFC8032
        RSA_PKCS1V5 = 4; // See RFC8017
        RSA_PSS = 5; // See RFC8017
}

I might be wrong about this, but I believe we don't need SHA2-512 or RSA PSS: neither of these is mentioned in the CT RFCs. But it's possible these show up in certificates anyways; cc @haydentherapper for thoughts on these.

When can we expect stability?

Description

I noticed that #36 includes breaking changes (including field renumbering 😬), and both the bundle mediaType (0.1) and the directory version (v1) remain the same.

Note that sigstore-js is using these protobuf files, as is our internal Go code at GitHub, and we have already begun building systems that are storing detached bundles. These changes are painful, as we not only need to build client support for these changes, but also migrate data to accommodate them.

I understand this format is in some state of flux, but I would like to see us establish some rigor for dealing with changes to the message schema, such as leaving messages in a deprecated state, preserving field numbers, and incrementing versions. If now is not the time, can we at least set some expectations, since the README does not clarify that the format is unstable?

Perhaps new changes could enter an alpha directory, which would eventually be renamed v2? I'm not sure of the best way forward, but I'd really like to avoid another breaking change.

