spec's People

Contributors

dlorenc, jspeed-meyers, knqyf263, luhring, lumjjb, mjnagel, puerco, rnjudge, sudo-bmitch

spec's Issues

Clarify timestamps

The structure currently lists only one timestamp in the VEX document and one in the VEX statement. However, the community paper lists a first_issued and last_updated for each. I'm a bit confused which date goes where as theoretically 2 are sufficient (based on inheritance). Please update the description to clarify the usage...

Version ranges in product_id/subcomponent_id

Is there any way to specify the version range in product_id and subcomponent_id?

The minimum requirements for VEX state the following:

[product_id] and [subcomponent_id] MAY specify sets of products or components, for example:
● Every product or component owned by a supplier
● A product family or product line
● Version ranges
● A specific branch

It makes sense to use version ranges. Otherwise, VEX documents must be updated every time a product or subcomponent version changes.

And the OpenVEX spec recommends PURLs.

The use of Package URLs (purls) is recommended.

"Any versions" probably can be described by omitting the version since the version is optional in PURL. e.g. pkg:maven/org.apache.xmlgraphics/batik-anim

How about version ranges? I may be missing something.

Modify severity

We're experimenting with OpenVEX in Aqua Trivy and trying to validate the use case for it (for VEX actually).
One thing that came up is that a CVE analysis more commonly concludes with a modified severity than with a binary "affected/not affected" result. I think the goal of VEX is to improve communication between the different parties in the vulnerability assessment chain, so would you be open to accommodating this use case?

PURL matching with qualifiers

It is clear that pkg://<package>@<version>?arch=amd64 means just the amd64 variation of package <package>@<version>, but what does pkg://<package>@<version> mean? Instinctively, no qualifiers means all possible variations of the package, but since PURL qualifiers are optional, we don't know whether the user omitted them intentionally to widen the scope or left them out for another reason.
This is relevant to VEX because when matching the product id PURLs, we need to assume that the PURLs in both locations followed the same choices as to which qualifiers to use.

Example:
Let there be a package pkg://npm/libA that has only an amd64 architecture variation. SBOM-tool-A generated an SBOM with package pkg://npm/libA@<version>, while SBOM-tool-B generated an SBOM with pkg://npm/libA@<version>?arch=amd64. Note that both SBOM tools produced correct results.
Since libA has only one architecture variation, the maintainers of libA issued a VEX with product id pkg://npm/libA@<version>. What should the VEX consumer do?

Possible approaches:

  1. Ignore qualifiers and compare just the base package - in this case it would match incorrectly if another qualifier was used in the VEX (maintainers decide to add ?file_name=libA), or if the same qualifier is used with a different value in another VEX (maintainers later added support for arm64).
  2. Compare all qualifiers - in this case scanner would not apply VEX to SBOM generated by SBOM-tool-B.
  3. Compare common qualifiers and ignore the rest - This is likely the most reasonable compromise (although not perfect).
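
To make approach 3 concrete, here is a minimal Python sketch. It is not taken from any OpenVEX tool: the helper names `split_purl` and `purl_matches` and the version `1.0.0` are made up for illustration, and the parsing is deliberately naive (it does not normalize the purl the way a full PURL library would).

```python
from urllib.parse import parse_qsl


def split_purl(purl: str) -> tuple[str, dict[str, str]]:
    """Split a purl into its base (everything before '?') and its qualifiers."""
    purl = purl.split("#", 1)[0]  # drop any subpath
    base, _, query = purl.partition("?")
    return base, dict(parse_qsl(query))


def purl_matches(vex_purl: str, sbom_purl: str) -> bool:
    """Approach 3: bases must be equal; qualifiers present on BOTH sides must agree."""
    vex_base, vex_quals = split_purl(vex_purl)
    sbom_base, sbom_quals = split_purl(sbom_purl)
    if vex_base != sbom_base:
        return False
    common = vex_quals.keys() & sbom_quals.keys()
    return all(vex_quals[key] == sbom_quals[key] for key in common)


# Hypothetical identifiers; the version 1.0.0 is an assumption for illustration.
print(purl_matches("pkg:npm/libA@1.0.0", "pkg:npm/libA@1.0.0?arch=amd64"))
print(purl_matches("pkg:npm/libA@1.0.0?arch=arm64", "pkg:npm/libA@1.0.0?arch=amd64"))
```

With this rule a VEX without qualifiers applies to both SBOM tools' output, while a VEX scoped to arm64 does not apply to an amd64 entry. The imperfection is that a qualifier used on only one side (say, file_name) is silently ignored.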

Product identifiers with CycloneDX

It is relevant to this issue, but I'd ask about some more specific usage of products with CycloneDX.

CycloneDX also supports VEX, which uses BOM-Link (URN + BOM-Ref) or BOM-Ref for referencing components.
https://cyclonedx.org/capabilities/vex/

For example:

      "affects": [
        {
          "ref": "urn:cdx:3e671687-395b-41f5-a30f-a58921a69b79/1#jackson-databind-2.8.0"
        }
      ]

It is clear which package it refers to, as BOM-Ref must be unique in a CycloneDX SBOM. It makes more sense to me than PURL because there can be multiple instances of the same package in an SBOM. Let's say a container image has two binaries, A and B, that depend on vulnerable package X (v1.2.3). And CVE-2023-9999 affects package X in binary A and doesn't affect package X in binary B because of vulnerable_code_not_in_execute_path.

  • binary A
    • package X (affected)
  • binary B
    • package X (not affected)

Package X could have the same PURL for A and B in this case. Therefore, it is better to use BOM-Ref for uniqueness. I'm guessing that's why OpenVEX defines subcomponents, but IMHO, it is not ideal that some OpenVEX documents use PURL and others use BOM-Ref in products even though they both refer to a CycloneDX SBOM. What if OpenVEX required BOM-Ref as the product identifier in the case of a CycloneDX SBOM? It conforms to the CycloneDX VEX spec and is much simpler.
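
The ambiguity can be sketched in a few lines of Python. The component list is a heavily simplified, hypothetical stand-in for a CycloneDX SBOM, and the bom-ref values and the `pkg:generic` purl are invented for illustration. The statuses follow the prose above (affected in binary A, not affected in binary B).

```python
# Hypothetical, simplified SBOM: package X appears once per binary.
components = [
    {"bom-ref": "binary-a/package-x", "purl": "pkg:generic/package-x@1.2.3"},
    {"bom-ref": "binary-b/package-x", "purl": "pkg:generic/package-x@1.2.3"},
]

# Statuses keyed by bom-ref can tell the two instances apart.
status_by_ref = {
    "binary-a/package-x": "affected",
    "binary-b/package-x": "not_affected",
}


def refs_for_purl(purl: str) -> list[str]:
    """Return every bom-ref in the SBOM whose component carries this purl."""
    return [c["bom-ref"] for c in components if c["purl"] == purl]


def statuses_for_purl(purl: str) -> set[str]:
    """Collect the statuses a purl-only lookup would have to choose between."""
    return {status_by_ref[ref] for ref in refs_for_purl(purl)}


# A purl-only lookup hits both instances and yields conflicting statuses.
print(refs_for_purl("pkg:generic/package-x@1.2.3"))
print(statuses_for_purl("pkg:generic/package-x@1.2.3"))
```

A consumer matching on the PURL alone gets two conflicting answers, whereas a lookup by bom-ref is unambiguous.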

Actually, SPDX is the same since it has package identifiers, but I'd put it aside for now.

I'd like to hear your thoughts. Thanks!

Update Spec to point out `@id` change

The OpenVEX spec has a section Updating Statements with Inherited Data. This contains two examples and the text in between:

When adding a second statement, the document date needs to be updated, but to preserve the integrity of the original statement we need to keep the original document timestamp. The newly added statement can inherit the document's date to avoid duplication:

This text should also point out that the second example assigns a new @id ("@id": "https://openvex.dev/docs/example/vex-84822c4e5028c"). In effect, statement one was copied from the document with "@id": "https://openvex.dev/docs/example/vex-9fb3463de1b57"; otherwise you would have to add a last_updated field at the document level.

Nevertheless, I think the current implementation violates the rule from the requirements:

[doc_time_first_issued] MUST equal the oldest [statement_time_first_issued] of all included VEX statements.
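
A small Python sketch of how a checker might test this rule under timestamp inheritance. It assumes that documents and statements carry their date in a `timestamp` field and that a statement without one inherits the document's; the function names and sample dates are made up for illustration.

```python
from datetime import datetime


def effective_times(doc: dict) -> list[datetime]:
    """Return each statement's timestamp, inheriting the document's when absent."""
    doc_time = datetime.fromisoformat(doc["timestamp"])
    return [
        datetime.fromisoformat(s["timestamp"]) if "timestamp" in s else doc_time
        for s in doc["statements"]
    ]


def first_issued_rule_holds(doc: dict) -> bool:
    """Check that the document timestamp equals the oldest statement timestamp."""
    times = effective_times(doc)
    if not times:
        return False  # assumption: a document without statements fails the check
    return datetime.fromisoformat(doc["timestamp"]) == min(times)


# Updated document: the date moved forward, the first statement kept its original
# timestamp, and the new statement inherits the document's date.
updated = {
    "timestamp": "2023-01-09T09:08:42+00:00",
    "statements": [
        {"vulnerability": "CVE-2023-12345", "timestamp": "2023-01-08T18:02:03+00:00"},
        {"vulnerability": "CVE-2023-12345"},
    ],
}
print(first_issued_rule_holds(updated))
```

Under these assumptions the updated document fails the check, because its date is newer than the oldest statement it contains, which is the violation described above.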

Notifications of new VEX

Are there any docs on how it is envisioned downstreams can be notified of new VEX? Hoping we can see this eventing integrated into transparency log infra federation to enable automated evaluation on new vulns via recursive application of policy and context local transparency services (see ID security threats WG notes in linked SCITT PR).

VEX schema/spec version should be a field in the metadata

Currently there is no way to specify the current version of spec/schema used. Given that we most likely will have iterations and newer versions, we should encode it so that tooling can use it in the appropriate way.

This proposal is to add a field in Statement and Document to encode a VEX spec version.

Ability to refer back to an SBOM?

Is there currently a way for openvex to refer back to an SBOM? Right now it is common that you might refer to an openvex document from an SBOM, but does openvex support the inverse relationship?

Consider defining an OpenVEX mediaType

OCI has done a fair bit of work on defining a new referrers API that is used to associate metadata like SBOMs, signatures, and VEX to container images. The key piece of data needed to lookup that metadata is a mediaType, so that a query could be made for all associated OpenVEX reports for a specified image. Is that something OpenVEX would be interested in documenting as part of their spec?

IANA has their list of registered media types, and that would be awesome if OpenVEX wanted to go through that process. But it's also acceptable to us to just have something that looks reasonable and is documented by the project, e.g. application/vnd.openvex listed in a readme. OCI has some mediaTypes for their own content defined in opencontainers/image-spec that may be useful examples with features like versioning and a suffix to make future changes easier.
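
As a rough illustration of why versioning and a suffix help, the Python sketch below splits a media type into a base name, a version, and a structured-syntax suffix. `application/vnd.oci.image.manifest.v1+json` is an existing OCI media type; `application/vnd.openvex.v1+json` is purely hypothetical and has not been defined by the project.

```python
import re

# base name, optional ".v<N>" version, optional "+<suffix>"
MEDIA_TYPE_RE = re.compile(
    r"^(?P<base>application/vnd\.[a-z0-9.\-]+?)"
    r"(?:\.v(?P<version>\d+))?"
    r"(?:\+(?P<suffix>[a-z0-9]+))?$"
)


def parse_media_type(media_type: str):
    """Return a dict with base/version/suffix, or None if the string does not fit."""
    match = MEDIA_TYPE_RE.match(media_type)
    return match.groupdict() if match else None


print(parse_media_type("application/vnd.oci.image.manifest.v1+json"))
print(parse_media_type("application/vnd.openvex.v1+json"))  # hypothetical media type
```

A client querying the referrers API could then filter on the base name and decide per version whether it knows how to read the document.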

Allow for vulnerability to be a list

The spec doesn't require a single identifying scheme (which is good) but this raises a potential usability concern, as vulnerability is singular.

Take openSUSE-SU-2020:0051-1. Let's say I patch my way out of it and produce a

{
      "vulnerability": "openSUSE-SU-2020:0051-1",
      "products": [...],
      "status": "not_affected",
      "justification": "component_not_present",
      "impact_statement": "The vulnerable code was removed with a custom patch"
}

This vulnerability can be referenced in lots of other ways, including:

As per the current spec, you'd need multiple statements, each duplicating products, status, justification and impact_statement.

{
      "vulnerability": "openSUSE-SU-2020:0051-1",
      "products": [...],
      "status": "not_affected",
      "justification": "component_not_present",
      "impact_statement": "The vulnerable code was removed with a custom patch"
},
{
      "vulnerability": "GHSA-2qrg-x229-3v8q",
      "products": [...],
      "status": "not_affected",
      "justification": "vulnerable_code_not_present",
      "impact_statement": "The vulnerable code was removed with a custom patch"
}
...

Or rely on the consumer to reconcile. That reconciliation is relatively easy in theory, at least where a CVE already exists, but in practice it's still a pain given the breadth of the ecosystem.

Another use case might be where multiple vulnerabilities relate to a single package, and that package isn't present. eg. component_not_present. Again, in that case multiple near-identical statements could be made, but it may be useful to permit it as one statement as well.

Given the above, I'd like to discuss whether vulnerability should instead be vulnerabilities and allow for a list of vulnerability identifiers?
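
To illustrate that a list form would not lose information, here is a Python sketch that expands one statement with a hypothetical `vulnerabilities` list into the per-identifier statements the current spec requires. The `vulnerabilities` field is the proposal under discussion, not part of the spec, and the product purl is invented for the example.

```python
def expand_statement(statement: dict) -> list[dict]:
    """Turn one statement with a 'vulnerabilities' list into one statement per identifier."""
    shared = {key: value for key, value in statement.items() if key != "vulnerabilities"}
    return [{"vulnerability": vuln_id, **shared} for vuln_id in statement["vulnerabilities"]]


combined = {
    "vulnerabilities": ["openSUSE-SU-2020:0051-1", "GHSA-2qrg-x229-3v8q"],
    "products": ["pkg:generic/example@1.0.0"],  # hypothetical product
    "status": "not_affected",
    "justification": "component_not_present",
    "impact_statement": "The vulnerable code was removed with a custom patch",
}

for single in expand_statement(combined):
    print(single["vulnerability"], single["status"])
```

A consumer that only understands the singular field could run this kind of expansion before processing, so the list would be shorthand rather than a change in meaning.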

Clarification on Use of @id, PURL Identifier, and Impact Statements

Hello,

I have a few questions and requests for clarification regarding some fields in an OpenVEX document, given the spec provided:

  • What should be the @id field in the document’s metadata? Can it mention the location where the final document will reside?
  • In the package structure, is there a point of having an @id and a purl identifier, or can we opt to go with one of them? And if yes, is it valid to just have the purl identifier in the @id field?
  • Could we have impact_statement in every status? If not, how could we add notes that clarify the status to the user?

TIA!

Make OpenVEX extensible?

As OpenVEX is intended to be a minimal VEX implementation, is there a plan to make it extensible to support additional cases without having to modify the core specification?

Examples of such extensions include:

  • Modify severity following a mitigation #31.
  • Support other identifiers #23.
  • Support version ranges #26.
  • Support other status labels and status justifications.
  • Indicate a targeted release for the fix of an "affected" product.
  • Add proof/demonstrations of fixes.
  • Include a third party acknowledgement/certification of the statement.
  • Link a vulnerability reported by a specific vulnerability assessment tool.
  • Have a structured mitigation field to describe several mitigation scenarios.
  • Support multiple authors of a statement. For example, the product manufacturer confirms an affected product and a third-party researcher proposes a mitigation.
  • Support logical predicates for product and vulnerability matching.
  • etc.

Such extensions could be expressed in a Meta OpenVEX format which, when processed against an SBOM, could generate an appropriate OpenVEX document, given the extension specification, to be included in the SBOM.

More explicit expectations for package identifiers

(split out from #10)

See #10 (comment)

Currently the spec reads:

The use of Package URLs (purls) is recommended

This ultimately means products could be anything.

We should consider either a) absolutely requiring PURLs, or b) requiring that the type of identifier being used in a statement is declared explicitly.

cc: @garethr — feel free to expand or correct me on this idea!

Published schema for OpenVex

The specification is a good human readable document, but it's not machine readable.

It would be useful to have a standalone schema (for instance in JSON Schema) for the specification. This would:

  • Allow for standalone validation
  • Make strong schema versioning easier to reason about
  • Avoid some ambiguity in any human readable document like the current spec
  • Aid in building tools that implement the spec, including additional libraries beyond the Go implementation
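
As a rough idea of what standalone validation could catch, here is a hand-rolled Python sketch. It is not a JSON Schema and not the project's validator: it only checks that a statement has the vulnerability, products, and status fields, that the status is one of the four VEX status labels, and that a not_affected statement carries a justification or an impact_statement. The function name is made up.

```python
VALID_STATUSES = {"not_affected", "affected", "fixed", "under_investigation"}


def statement_errors(statement: dict) -> list[str]:
    """Return a list of human-readable problems; an empty list means no issue found."""
    errors = []
    for field in ("vulnerability", "products", "status"):
        if field not in statement:
            errors.append(f"missing required field: {field}")
    status = statement.get("status")
    if status is not None and status not in VALID_STATUSES:
        errors.append(f"unknown status: {status}")
    if status == "not_affected" and not (
        "justification" in statement or "impact_statement" in statement
    ):
        errors.append("not_affected needs a justification or an impact_statement")
    return errors


print(statement_errors({"vulnerability": "CVE-2023-12345", "status": "not_affected"}))
```

A published schema would express the same constraints declaratively, so every implementation could share them instead of re-deriving them from the prose.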

Thoughts?

Be more specific in the Sample Scenario

The current version reads:

As an example, consider the following evolution of a hypothetical impact analysis:

  1. A software author becomes aware of a new CVE related to their product. Immediately, the author starts to check if it affects them.
  2. The investigation determines the product is affected.
  3. To protect their users, the author mitigates the CVE impact via a patch or other method before the vulnerable component issues a patch.

Without VEX data, users scanning the author's software will simply get a third party alert with no details on how the status is evolving. Most critically, when the product is patched (in #3), the alert becomes a false positive.

To inform consumers downstream of the vulnerability evolution, the author can issue a VEX document (in #1) when the CVE is published to let their users know it is under investigation. In #2, when the product is known to be affected, the author can ship a new VEX document, stating the product is affected and possibly some additional advice, like temporary mitigation instructions. Finally when the product is patched, its SBOM can be complemented with a new VEX document informing it is no longer affected by the CVE. Scanners could consume this document and stop alerting about the CVE as it no longer impacts the product.

I have a few comments here:

  • The new VEX document in #2 could actually be an update of the VEX document issued in #1. (This also makes sense as you only have to discover the VEX once and then get the latest info whenever you retrieve the VEX.)
  • If you patch a product, it is actually no longer the same product, as its version changed (and there should be a new SBOM for the new version). The old version is still vulnerable, as you can't change what has already shipped. This is important because otherwise someone who didn't patch would retrieve the latest VEX for that product saying that it is not vulnerable when in fact their unpatched version is affected.

Clarifying product identification

I believe the @id field of product and component needs further clarification.

on product id:

The optional @id field takes an IRI to make the product referenceable inside the document and addressable externally.

on component id:

Optional IRI identifying the component to make it externally referenceable.

According to this, the only use case mentioned for @id is for reference-ability, no mention of using this field for product identification.
It does mention that id CAN be a purl, but it reads as it's still only for references:

As Package URLs are valid IRIs, the @id can take a purl as a value.

This interpretation is further confirmed since purl is one of many allowed identifiers.

For the use case product identification (actually specifying the affected software), the spec only mentions the identifiers fields (also vaguely):

The spec provides an expressive product struct with fields to address the product using identifiers, hashes.

The identifiers field is described as:

A map of software identifiers where the key is the type and the value the identifier.

From reading the spec alone, one gets the impression that @id is optional and used only for references, and that identifiers should be used to specify the affected product.
However, all the examples are using @id exclusively for product identification, as well as vexctl-create tool.

My understanding is that the spec authors intent was that @id would be

  1. reference-able IRI
  2. if using purl, also the product identifier

If that's correct, I suggest updating the spec to clarify:

  1. @id CAN be used to identify product (in addition to being used as reference IRI)
    1. but only if the identification is using purl?
    2. in this case, it seems to be the preferred way (over identifiers, according to the examples and the Go code)
  2. in which of the use cases is @id required
  3. in which of the use cases is identifiers required
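
The interpretation above can be written down as a short Python sketch. It reflects my reading only, not confirmed spec behavior: the function name is made up, and the precedence (identifiers first, then a purl-shaped @id) is an assumption that the spec would need to settle.

```python
from typing import Optional


def product_purl(product: dict) -> Optional[str]:
    """Pick the purl that identifies a product, if any.

    Assumed precedence: an explicit identifiers["purl"] wins; otherwise an
    @id that looks like a purl doubles as the identifier.
    """
    identifiers = product.get("identifiers", {})
    if "purl" in identifiers:
        return identifiers["purl"]
    product_id = product.get("@id", "")
    if product_id.startswith("pkg:"):
        return product_id
    return None  # @id is only a reference IRI, or nothing identifies the product


print(product_purl({"@id": "pkg:maven/org.apache.xmlgraphics/batik-anim"}))
print(product_purl({"@id": "https://example.com/products/1"}))
```

Whatever precedence the spec picks, stating it explicitly would let producers and consumers agree on which field a matcher reads.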

Modify Spec to Require Artifact Digest in PURL

Preface: Over the past few years large strides have been made in moving OSS software artifact provenance tooling forward. OpenVEX is a continuation of that effort, and we need to make sure that steps like OpenVEX stay true to the core objective of specifically deterministic artifact identification, i.e. cryptographic identity. You only need glance through the NVD to see string based (name) association of vuln to artifact is a non starter. CISA's intent with VEX is really to provide an accurate clearing house for software consumers of ongoing artifact maintenance. So in the short term this proposal is dirt simple:

Fast path: simply require that the PURL contains a proper digest from the source registry as outlined in the PURL spec here: https://github.com/package-url/purl-spec/blob/master/PURL-TYPES.rst

It's acknowledged that the formats of certain package management implementations may have to be adapted to accommodate this requirement, but it should be a requirement, not an option.
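
A minimal Python sketch of what enforcing this could look like. The proposal does not say which digest form counts, so the rule here is an assumption: accept a purl that either carries a `checksum` qualifier or has a version of the form `sha256:<hex>` (percent-encoded, as oci-type purls write it). The function name and sample purls are invented.

```python
from urllib.parse import parse_qsl, unquote


def has_digest(purl: str) -> bool:
    """Return True if the purl carries a digest under the assumed rule."""
    base, _, query = purl.split("#", 1)[0].partition("?")
    qualifiers = dict(parse_qsl(query))
    if "checksum" in qualifiers:
        return True
    _, _, version = base.partition("@")
    return unquote(version).startswith("sha256:")


# Hypothetical purls; the digests are shortened placeholders, not real values.
print(has_digest("pkg:oci/debian@sha256%3Aabc123?repository_url=docker.io/library/debian"))
print(has_digest("pkg:npm/libA@1.0.0"))
```

A validator could reject any statement whose product purl fails such a check, at the cost of excluding ecosystems that do not yet expose digests.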

Standard for VEX document discovery

As a vendor who would want to publish VEX documents for a certain artifact and enable consumers (end users, security scanners, etc) to know where to find a document for that artifact, there is currently no standard discovery mechanism. This is similar to #9, but distinct in that I'm imagining a pull model rather than a push.

There are several other similar standards we can look to for this:

  • The RPM ecosystem has a repos file on disk; the mirror contains a repodata/repomd.xml, which points to a repodata/updateinfo.xml file containing CVE/errata information for all packages for all time. It's easy to imagine a file like /etc/vex/vex-provider.json containing a URL (or base URL) to which a /.well-known/openvex path or similar is appended.
  • OpenID Connect (OIDC) embeds an "iss" issuer URL in JWT token claims, that relying parties append a /.well-known/openid-configuration path to discover metadata. Artifact types like OCI images that support key/value string metadata could include a well-known key (say, openvex.dev/discovery) that could have a value of a VEX discovery URI.

In addition to having a discovery mechanism, we'll need some format for associating a discovery document with a particular artifact. To my knowledge, RPM's repodata/updateinfo.xml is never rotated and grows over time. Rather than having to devise additional metadata formats (such as a big list of VEX URLs) for discovery under a well-known URI and forcing clients to make subsequent requests, for OpenVEX, I could see the need to also embed alongside discovery information some unique identifier for the artifact, maybe just a UUID4.

For argument's sake, imagine the following keys either in a config file or key/value annotations.

{
  "openvex.dev/discovery": "https://alas.aws.amazon.com/alas2023/",
  "openvex.dev/identifier": "931e5e35-2351-4278-9dc4-d45621d4f3c1",
  "openvex.dev/version": "v0.2.0"
}

Such that tooling could infer the vex document for an artifact is available at
https://alas.aws.amazon.com/alas2023/.well-known/openvex/931e5e35-2351-4278-9dc4-d45621d4f3c1 and is built with the v0.2.0 version of the spec.
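
The inference could be as small as the following Python sketch. The annotation keys and the /.well-known/openvex path are the ones imagined above, not an agreed standard, and the function name is made up.

```python
def vex_url(annotations: dict) -> str:
    """Build the VEX document URL from the imagined discovery annotations."""
    base = annotations["openvex.dev/discovery"].rstrip("/")
    identifier = annotations["openvex.dev/identifier"]
    return f"{base}/.well-known/openvex/{identifier}"


annotations = {
    "openvex.dev/discovery": "https://alas.aws.amazon.com/alas2023/",
    "openvex.dev/identifier": "931e5e35-2351-4278-9dc4-d45621d4f3c1",
    "openvex.dev/version": "v0.2.0",
}
print(vex_url(annotations))
```

Because the URL is derived purely from the two values, a client needs a single request per artifact and no intermediate index document.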

README.md needs to be updated to reflect v0.2.0 spec changes

The recent v0.2.0 release introduced new enhancements to the spec, but while some examples were updated in the original PR, some documentation has not been updated to reflect this.

Specific spec changes:

So far I've found out-of-date examples in:

I've started on a PR for the above, but opening this issue for discussion, in case there are any other references I've missed. Are there any other locations these should be updated?

Storing VEX files in a dedicated directory within Git repositories

Description

I would like to open a discussion regarding the file path convention for storing OpenVEX files within a Git repository. In the example of Cilium, the filename .openvex.json is used. However, considering factors such as future OpenVEX version upgrades, the need to retain older files, storing individual VEX files for the OCI artifact and the project, and accommodating multiple VEX formats like OpenVEX and CSAF, I think it would be better to store VEX files under a dedicated directory like /.vex rather than using a single file.

Example

For example, a filename format would be like NAME.FORMAT.json for storing the VEX files. With this approach, the file path would look like this:

  • .vex/cilium-oci.openvex.json
  • .vex/cilium-golang.openvex.json
  • .vex/cilium.csaf.json
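
For tooling, the convention would be straightforward to parse. The Python sketch below assumes the proposed `.vex/NAME.FORMAT.json` layout and that NAME itself contains no dots; the function name is made up.

```python
from pathlib import PurePosixPath
from typing import Optional


def parse_vex_path(path: str) -> Optional[tuple[str, str]]:
    """Return (name, format) for a .vex/NAME.FORMAT.json path, else None."""
    p = PurePosixPath(path)
    parts = p.name.split(".")
    if p.parent.name != ".vex" or len(parts) != 3 or parts[2] != "json":
        return None
    name, fmt, _ = parts
    return name, fmt


for candidate in (".vex/cilium-oci.openvex.json", ".vex/cilium.csaf.json", ".openvex.json"):
    print(candidate, parse_vex_path(candidate))
```

A scanner could list the directory, keep the files whose format it understands, and ignore the rest.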

When storing VEX files in a Git repository, there is a challenge in associating package names with repository names for most ecosystems other than Go. However, users can still utilize the VEX files by manually downloading them, and defining a standard location for these files is beneficial.

I welcome any feedback or thoughts on this proposal.
