in-toto-rs's Introduction

in-toto provides a framework to protect the integrity of the software supply chain. It does so by verifying that each task in the chain is carried out as planned, by authorized personnel only, and that the product is not tampered with in transit.

in-toto requires a project owner to create a layout. A layout lists the sequence of steps of the software supply chain and the functionaries authorized to perform these steps. When a functionary performs a step, in-toto gathers information about the command used and the related files, and stores it in a link metadata file. As a consequence, link files provide the evidence required to establish a continuous chain that can be validated against the steps defined in the layout.

The layout, signed by the project owners, and the links, signed by the designated functionaries, are released as part of the final product and can be validated manually or via automated tooling, e.g., in a package manager.

Getting Started

Installation

in-toto is available on PyPI and can be installed via pip. See in-toto.readthedocs.io to learn about system dependencies and installation alternatives and recommendations.

pip install in-toto

Create layout, run supply chain steps and verify final product

Layout

The in-toto software supply chain layout consists of the following parts:

  • expiration date
  • readme (an optional description of the supply chain)
  • functionary keys (public keys, used to verify link metadata signatures)
  • signatures (one or more layout signatures created with the project owner key(s))
  • software supply chain steps correspond to steps carried out by a functionary as part of the software supply chain. The steps defined in the layout list the functionaries who are authorized to carry out the step (by key id). Steps require a unique name to associate them (upon verification) with link metadata that is created when a functionary carries out the step using the in-toto tools. Additionally, steps must have material and product rules which define the files a step is supposed to operate on. Material and product rules are described in the section below.
  • inspections define commands to be run during the verification process and can also list material and product rules.

Take a look at the demo layout creation example for further information on how to create an in-toto layout.
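As a rough illustration, the parts listed above can be pictured as a plain Python dictionary. This is only a sketch: the field names (`expires`, `readme`, `keys`, `steps`, `inspect`, `expected_materials`, `expected_products`, `pubkeys`, `run`), the key id, and the step and inspection names are assumptions made for this example, and the layout is unsigned. The in-toto specification remains the authoritative description of the format.

```python
import json

# Hypothetical functionary key id; real key ids are hex digests.
BOB_KEYID = "b0b0" * 16

layout = {
    "signatures": [],  # filled in when the project owner signs the layout
    "signed": {
        "_type": "layout",
        "expires": "2030-01-01T00:00:00Z",
        "readme": "Example supply chain with a single build step.",
        # Public keys used to verify link metadata signatures.
        "keys": {BOB_KEYID: {"keytype": "rsa", "keyval": {"public": "<PEM>"}}},
        "steps": [
            {
                "name": "build",
                "pubkeys": [BOB_KEYID],  # functionaries authorized for this step
                "expected_materials": [["ALLOW", "src/*"], ["DISALLOW", "*"]],
                "expected_products": [["CREATE", "app.tar.gz"], ["DISALLOW", "*"]],
            }
        ],
        "inspect": [
            {
                "name": "untar",
                "run": ["tar", "xzf", "app.tar.gz"],
                "expected_materials": [
                    ["MATCH", "app.tar.gz", "WITH", "PRODUCTS", "FROM", "build"]
                ],
                "expected_products": [],
            }
        ],
    },
}


def step_names(layout_dict):
    """Return the names of all steps defined in the layout."""
    return [step["name"] for step in layout_dict["signed"]["steps"]]


print(json.dumps(layout, indent=2))
```

Every key id listed under a step's `pubkeys` should also appear in `keys`, which is what allows link signatures to be checked during verification.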

Artifact Rules

A software supply chain usually operates on a set of files, such as source code, executables, packages, or the like. in-toto calls these files artifacts. A material is an artifact that will be used when a step or inspection is carried out. Likewise, a product is an artifact that results from carrying out a step.

The in-toto layout provides a simple rule language to authorize or enforce the artifacts of a step and to chain them together. This adds the following guarantees for any given step or inspection:

  • Only artifacts authorized by the project owner are created, modified or deleted,
  • each defined creation, modification or deletion is enforced, and also
  • restricted to the scope of its definition, which chains subsequent steps and inspections together.

Note that it is up to you to properly secure your supply chain by authorizing, enforcing and chaining materials and products using any, and usually several, of the following rules:

  • CREATE <pattern>
  • DELETE <pattern>
  • MODIFY <pattern>
  • ALLOW <pattern>
  • DISALLOW <pattern>
  • REQUIRE <file>
  • MATCH <pattern> [IN <source-path-prefix>] WITH (MATERIALS|PRODUCTS) [IN <destination-path-prefix>] FROM <step>

Rule arguments specified as <pattern> allow for Unix shell-style wildcards as implemented by Python's fnmatch.

in-toto's Artifact Rules, by default, allow artifacts to exist if they are not explicitly disallowed. As such, a DISALLOW * invocation is recommended as the final rule for most step definitions. To learn more about the different rule types, their guarantees and how they are applied, take a look at the Artifact Rules section of the in-toto specification.
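To make the default-allow behaviour concrete, here is a deliberately simplified sketch that models only ALLOW and DISALLOW, matching patterns with Python's fnmatch. The function name `check_rules` and the queue-based evaluation are assumptions for illustration, not in-toto's actual implementation; the other rule types are left out.

```python
from fnmatch import fnmatch


def check_rules(rules, artifacts):
    """Return the set of artifacts rejected by a DISALLOW rule.

    Rules are processed in order. ALLOW takes matching artifacts out of
    the queue; DISALLOW rejects whatever still matches. Artifacts left
    in the queue at the end are implicitly accepted.
    """
    queue = set(artifacts)
    rejected = set()
    for rule_type, pattern in rules:
        matched = {a for a in queue if fnmatch(a, pattern)}
        if rule_type == "ALLOW":
            queue -= matched
        elif rule_type == "DISALLOW":
            rejected |= matched
            queue -= matched
        else:
            raise ValueError(f"unsupported rule type: {rule_type}")
    return rejected


artifacts = ["src/main.py", "src/util.py", "notes.txt"]

# Without a final DISALLOW *, the unexpected notes.txt slips through.
lenient = check_rules([("ALLOW", "src/*")], artifacts)
# With it, anything not explicitly allowed is rejected.
strict = check_rules([("ALLOW", "src/*"), ("DISALLOW", "*")], artifacts)

print(sorted(lenient))
print(sorted(strict))
```

In this sketch the first rule set rejects nothing, while the second rejects `notes.txt`, which is why a closing `DISALLOW *` is the safer default.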

Carrying out software supply chain steps

in-toto-run

in-toto-run is used to execute a step in the software supply chain. This can be anything relevant to the project such as tagging a release with git, running a test, or building a binary. The relevant step name and command are passed as arguments, along with materials, which are files required for that step's command to execute, and products which are files expected as a result of the execution of that command. These, and other relevant details pertaining to the step are stored in a link file, which is signed using the functionary's key.

If materials are not passed to the command, the link file generated just doesn't record them. Similarly, if the execution of a command via in-toto-run doesn't result in any products, they're not recorded in the link file. Any files that are modified or used in any way during the execution of the command are not recorded in the link file unless explicitly passed as artifacts. Conversely, any materials or products passed to the command are recorded in the link file even if they're not part of the execution of the command.
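The recording behaviour described above can be sketched in a few lines of Python. The helpers `hash_artifacts` and `run_step` are hypothetical names, and the returned dictionary is a simplified, unsigned stand-in for a link file; the point is only that artifacts are recorded when they are passed explicitly, regardless of what the command touches.

```python
import hashlib
import os
import subprocess
import sys
import tempfile


def hash_artifacts(paths):
    """Map each explicitly listed path to its SHA-256 digest."""
    recorded = {}
    for path in paths:
        with open(path, "rb") as f:
            recorded[path] = {"sha256": hashlib.sha256(f.read()).hexdigest()}
    return recorded


def run_step(name, command, materials=(), products=()):
    """Run a command and return an unsigned, link-like dictionary."""
    recorded_materials = hash_artifacts(materials)  # hashed before the command
    result = subprocess.run(command, capture_output=True, text=True)
    recorded_products = hash_artifacts(products)  # hashed after the command
    return {
        "name": name,
        "command": list(command),
        "materials": recorded_materials,
        "products": recorded_products,
        "byproducts": {
            "return-value": result.returncode,
            "stdout": result.stdout,
            "stderr": result.stderr,
        },
    }


with tempfile.TemporaryDirectory() as workdir:
    listed = os.path.join(workdir, "listed.txt")
    unlisted = os.path.join(workdir, "unlisted.txt")
    # The command writes two files, but only one is passed as a product.
    script = (
        "import sys\n"
        "for p in sys.argv[1:]:\n"
        "    with open(p, 'w') as f:\n"
        "        f.write('data')\n"
    )
    link = run_step(
        "write-files",
        [sys.executable, "-c", script, listed, unlisted],
        products=[listed],
    )
    recorded_names = [os.path.basename(p) for p in link["products"]]
    print(recorded_names)
```

Only `listed.txt` ends up in the recorded products, even though the command also created `unlisted.txt`, and the materials section stays empty because none were passed.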

See this simple usage example from the demo application for more details. For a detailed list of all the command line arguments, run in-toto-run --help or look at the online documentation.

in-toto-record

in-toto-record works similarly to in-toto-run but can be used for multi-part software supply chain steps, i.e. steps that are not carried out by a single command. Use in-toto-record start ... to create a preliminary link file that only records the materials, then run the commands of that step or edit files manually, and finally use in-toto-record stop ... to record the products and generate the actual link metadata file. For a detailed list of all command line arguments and their usage, run in-toto-record start --help or in-toto-record stop --help, or look at the online documentation.

Release final product

In order to verify the final product with in-toto, the verifier must have access to the layout, the *.link files, and the project owner's public key(s).

Verification

Use in-toto-verify on the final product to verify that

  • the layout was signed with the project owner's private key(s),
  • the layout has not expired,
  • each step was performed and signed by the authorized functionary,
  • the functionaries used the commands they were supposed to use, and
  • materials and products of each step were in place as defined by the rules.

in-toto-verify also runs the inspections defined in the layout.

For a detailed list of all command line arguments and their usage, run in-toto-verify --help or look at the online documentation.
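Two of the checks listed above, layout expiry and functionary authorization, can be sketched as follows. This is a minimal illustration under stated assumptions: the timestamp format, the function names `layout_expired` and `unauthorized_links`, and the simplified data shapes are all hypothetical, and no signature verification is performed.

```python
from datetime import datetime, timezone


def layout_expired(expires, now=None):
    """Check an expiration timestamp of the form YYYY-MM-DDTHH:MM:SSZ."""
    now = now or datetime.now(timezone.utc)
    expiry = datetime.strptime(expires, "%Y-%m-%dT%H:%M:%SZ")
    return now >= expiry.replace(tzinfo=timezone.utc)


def unauthorized_links(steps, links):
    """Return (step, keyid) pairs whose signer is not listed for the step.

    `steps` maps step names to lists of authorized key ids; `links` maps
    step names to the key id that signed the link. Both shapes are
    simplifications for this sketch. A missing link is reported with a
    key id of None.
    """
    problems = []
    for step_name, authorized in steps.items():
        keyid = links.get(step_name)
        if keyid not in authorized:
            problems.append((step_name, keyid))
    return problems


steps = {"build": ["aaa"], "package": ["bbb"]}
links = {"build": "aaa", "package": "ccc"}

print(layout_expired("2000-01-01T00:00:00Z"))
print(unauthorized_links(steps, links))
```

A real verifier would first check the signatures on the layout and on each link before trusting any key id found in them.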

Signatures

in-toto-sign is a metadata signature helper tool to add, replace, and verify signatures within in-toto Link or Layout metadata, with options to:

  • replace (default) or add signature(s), with layout metadata able to be signed by multiple keys at once while link metadata can only be signed by one key at a time
  • write signed metadata to a specified path (if no output path is specified, layout metadata is written to the path of the input file while link metadata is written to <name>.<keyid prefix>.link)
  • verify signatures

This tool serves well to re-sign test and demo data. For example, it can be used if metadata formats or signing routines change.

For a detailed list of all command line arguments and their usage, run in-toto-sign --help or look at the online documentation.

in-toto demo

You can try in-toto by running the demo application. The demo features three users, Alice (project owner), Bob (functionary) and Carl (functionary), and shows how in-toto helps to specify a project layout and verify that the layout has been followed correctly.

Specification

You can read more about how in-toto works by taking a look at the specification.

Security Issues and Bugs

See SECURITY.md.

Governance and Contributing

For information about in-toto's governance and contributing guidelines, see GOVERNANCE.md and CONTRIBUTING.md.

Acknowledgments

This project is managed by Prof. Santiago Torres-Arias at Purdue University. It is worked on by many folks in academia and industry, including members of the Secure Systems Lab at NYU and the NJIT Cybersecurity Research Center.

This research was supported by the Defense Advanced Research Projects Agency (DARPA), the Air Force Research Laboratory (AFRL), and the US National Science Foundation (NSF). Any opinions, findings, and conclusions or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of DARPA, AFRL, and NSF. The United States Government is authorized to reproduce and distribute reprints notwithstanding any copyright notice herein.

in-toto-rs's People

Contributors

adityasaky, alanssitis, crawford, cutecutecat, danbev, dependabot[bot], joyliu-q, kpcyrd, lukpueh, mo-fatah, santiagotorres, xynnn007

in-toto-rs's Issues

Add left-stripping functionality

in-toto's Python and Go implementations offer an option to specify patterns that are used to left-strip artifact paths. One example where this is important is #12, where instead of recording /tmp/rebuilderd7oil7Z/inputs/vultr-cli-2.8.3-1-x86_64.pkg.tar.zst, we want to record vultr-cli-2.8.3-1-x86_64.pkg.tar.zst.

For reference, the python implementation can be found here: https://github.com/in-toto/in-toto/blob/develop/in_toto/runlib.py#L94
And the go implementation can be found here: https://github.com/in-toto/in-toto-golang/blob/master/in_toto/runlib.go#L179
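To illustrate the requested behaviour, here is a minimal Python sketch of left-stripping. It is not the code from either linked implementation: the function names `lstrip_path` and `lstrip_artifacts`, the plain string-prefix matching, and the error raised on collisions are assumptions made for this example.

```python
def lstrip_path(path, prefixes):
    """Remove the first matching prefix from the left of `path`."""
    for prefix in prefixes:
        if path.startswith(prefix):
            return path[len(prefix):]
    return path


def lstrip_artifacts(artifacts, prefixes):
    """Return a copy of an artifact dictionary with stripped path keys."""
    stripped = {}
    for path, hashes in artifacts.items():
        new_path = lstrip_path(path, prefixes)
        if new_path in stripped:
            # Two different files would collapse into one recorded name.
            raise ValueError(f"prefix stripping makes '{new_path}' ambiguous")
        stripped[new_path] = hashes
    return stripped


materials = {
    "/tmp/rebuilderd7oil7Z/inputs/vultr-cli-2.8.3-1-x86_64.pkg.tar.zst": {
        "sha256": "768bc8b6ee2f0164036c5bdb6c2dadc644c6ee2a8549b3ba22eacd109c206649"
    }
}

print(lstrip_artifacts(materials, ["/tmp/rebuilderd7oil7Z/inputs/"]))
```

Rejecting collisions matters because stripping different prefixes from different files could otherwise silently overwrite one recorded hash with another.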

Add support for ecdsa keys

This issue suggests adding support for ecdsa keys.

The motivation for this is to be able to use keys that are generated by cosign. Also, in-toto command line tools are able to accept ecdsa keys which might also be a motivating factor.

I'd be happy to take a stab at this, just wanted to see if there are any objections/issues first.

Inconsistent logic in `calculate_key_id` with python version

In both versions, a key id is calculated as the SHA-256 digest of a canonicalized JSON object of the following form:

{
    "keyid_hash_algorithms":["sha256","sha512"],
    "keytype":"rsa",
    "keyval":{"public":"<BASE64-ENCODED-DERBYTES>"},
    "scheme":"rsassa-pss-sha256"
}

The difference lies in the value of "<BASE64-ENCODED-DERBYTES>":

  • Python version: A pem string without public key header and footer, e.g.
MIIBojANBgkqhkiG9w0BAQEFAAOCAY8AMIIBigKCAYEAxPX3kFs/z645x4UOC3KF
Y3V80YQtKrp6YS3qU+Jlvx/XzK53lb4sCDRU9jqBBx3We45TmFUibroMd8tQXCUS
e8gYCBUBqBmmz0dEHJYbW0tYF7IoapMIxhRYn76YqNdl1JoRTcmzIaOJ7QrHxQrS
GpivvTm6kQ9WLeApG1GLYJ3C3Wl4bnsI1bKSv55Zi45/JawHzTzYUAIXX9qCd3Io
HzDucz9IAj9Ookw0va/q9FjoPGrRB80IReVxLVnbo6pYJfu/O37jvEobHFa8ckHd
YxUIg8wvkIOy1O3M74lBDm6CVI0ZO25xPlDB/4nHAE1PbA3aF3lw8JGuxLDsetxm
fzgAleVt4vXLQiCrZaLf+0cM97JcT7wdHcbIvRLsij9LNP+2tWZgeZ/hIAOEdaDq
cYANPDIAxfTvbe9I0sXrCtrLer1SS7GqUmdFCdkdun8erXdNF0ls9Rp4cbYhjdf3
yMxdI/24LUOOQ71cHW3ITIDImm6I8KmrXFM2NewTARKfAgMBAAE=

In the JSON, this value contains multiple \n characters.

  • Rust version: A base64-encoded string of the DER bytes, e.g.
MIIBojANBgkqhkiG9w0BAQEFAAOCAY8AMIIBigKCAYEAzgLBsMFSgwBiWTBmVsyW5KbJwLFSodAzdUhU2Bq6SdRz_W6UOBGdojZXibxupjRtAaEQW_eXDe-1CbKg6ENZGt2D9HGFCQZgQS8ONgNDQGiNxgApMA0T21AaUhru0vEofzdN1DfEF4CAGv5AkcgKsalhTyONervFIjFEdXGelFZ7dVMV3Pp5WkZPG0jFQWjnmDZhUrtSxEtqbVghc3kKAUj9Ll_3jyi2wS92Z1j5ueN8X62hWX2xBqQ6nViOMzdujkoiYCRSwuMLRqzW2CbTL8hF1-S5KWKFzxl5sCVfpPe7V5HkgEHjwCILXTbCn2fCMKlaSbJ_MG2lW7qSY2RowVXWkp1wDrsJ6Ii9f2dErv9vJeOVZeO9DsooQ5EuzLCfQLEU5mn7ul7bU7rFsb8JxYOeudkNBatnNCgVMAkmDPiNA7E33bmL5ARRwU0iZicsqLQR32pmwdap8PjofxqQk7Gtvz_iYzaLrZv33cFWWTsEOqK1gKqigSqgW9T26wO9AgMBAAE=

This value contains no \n characters.

Because of this difference, the resulting key ids differ. This causes incompatibility when a layout and public key created by the Python implementation are processed by the Rust implementation: the Rust side computes the wrong KeyId.
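The effect can be reproduced with a short Python sketch. It assumes that `json.dumps` with sorted keys and compact separators is close enough to canonical JSON for this object, and it uses placeholder bytes instead of a real DER-encoded key; the helper name `key_id` is hypothetical and mirrors neither implementation exactly.

```python
import base64
import hashlib
import json


def key_id(public_value):
    """Hash a key metadata object of the form shown above."""
    key_meta = {
        "keyid_hash_algorithms": ["sha256", "sha512"],
        "keytype": "rsa",
        "keyval": {"public": public_value},
        "scheme": "rsassa-pss-sha256",
    }
    # Sorted keys and no extra whitespace stand in for canonical JSON here.
    canonical = json.dumps(key_meta, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


# Placeholder bytes standing in for the DER encoding of a public key.
der_bytes = bytes(range(256)) + bytes(range(166))

# PEM-style body: standard base64 alphabet, wrapped at 64 characters.
b64 = base64.b64encode(der_bytes).decode("ascii")
pem_body = "\n".join(b64[i:i + 64] for i in range(0, len(b64), 64))

# Single-line, URL-safe base64 of the same bytes.
one_line = base64.urlsafe_b64encode(der_bytes).decode("ascii")

print(key_id(pem_body))
print(key_id(one_line))
```

Both strings decode to the same bytes, yet the two digests differ, because the hash covers the textual representation rather than the key material itself.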

Allow colon in filenames

The colon symbol is currently in the set of illegal characters for file names:

static PATH_ILLEGAL_STRINGS: &[&str] = &[
":", // for *nix compatibility

This prevents in-toto from generating signatures for Arch Linux packages that have an epoch, e.g. rust-1:1.56.1-3-x86_64.pkg.tar.zst.

Migrate away from `derp`

derp has not been updated since 2020 and is stopping us from updating a dependency, #61.

We would need to switch to a modern, maintained alternative. A potential alternative is der-parser.

Metablock is not safe object

Thanks to @Xynnn007 for the discussion.

While working on verifylib, he found that the function load_links_for_layout returns a dict containing Link or Layout objects.
The return value may look like:

{
      <step name> : {
        <functionary key id> : <Metablock containing a Link or Layout object>,
        ...
      }, ...
    }

as shown at https://github.com/in-toto/in-toto/blob/f6a91bfbff29c98f58e05d9346c8245b6ef9a6a6/in_toto/verifylib.py#L94.

In the Rust version, the ideal implementation of this function may look like:

fn load_links_for_layout(
    layout: &LayoutMetadata,
    link_dir: &str,
) -> Result<HashMap<String, HashMap<KeyId, Metablock<Json, dyn Metadata>>>>

However, as Metablock and Metadata are not object safe, he can only write the function as:

fn load_links_for_layout(
    layout: &LayoutMetadata,
    link_dir: &str,
) -> Result<HashMap<String, (HashMap<KeyId, Metablock<Json, LinkMetadata>>, HashMap<KeyId, Metablock<Json, LayoutMetadata>>)>>

In my view, there should be a way to refactor Metadata and Metablock to make them object safe. The object-safety problem comes from the trait bounds PartialEq + Serialize + DeserializeOwned. We can move the serialization-related methods and these traits out of Metablock.

I would like to do this refactoring if everyone thinks it is necessary, perhaps in November once my current work is done.

Absolute paths are recorded in attestations

One of the rebuilders is currently generating attestations looking like this:

{"signatures":[{"keyid":"585a2a5c5efec5fc22d84f7fa7a4a22cc1c62507cf97b6d8e7df7aaea8e5f659","sig":"57430b2e677e8f91f474054b9618ff78339b7d65d921903b0f743af41f1812dcbdf82cc06334a02270cb759a8db175b62569f192d9cc25409bc80294cd19910a"}],"signed":{"_type":"link","byproducts":{},"env":{},"materials":{"/tmp/rebuilderd7oil7Z/inputs/vultr-cli-2.8.3-1-x86_64.pkg.tar.zst":{"sha256":"768bc8b6ee2f0164036c5bdb6c2dadc644c6ee2a8549b3ba22eacd109c206649","sha512":"81bf157c90a603b0e70c816959ca5860fa7d060ddffa56b464e73f89b0b9aff5af351620beef1e3d6f01faf301e4ced277c3ba46d56fd9e356c5521a90a9da18"}},"name":"rebuild vultr-cli-2.8.3-1-x86_64.pkg.tar.zst","products":{"/tmp/rebuilderd7oil7Z/out/vultr-cli-2.8.3-1-x86_64.pkg.tar.zst":{"sha256":"768bc8b6ee2f0164036c5bdb6c2dadc644c6ee2a8549b3ba22eacd109c206649","sha512":"81bf157c90a603b0e70c816959ca5860fa7d060ddffa56b464e73f89b0b9aff5af351620beef1e3d6f01faf301e4ced277c3ba46d56fd9e356c5521a90a9da18"}}}}

This string:

/tmp/rebuilderd7oil7Z/inputs/vultr-cli-2.8.3-1-x86_64.pkg.tar.zst

should likely be this instead (both in materials and products):

vultr-cli-2.8.3-1-x86_64.pkg.tar.zst

Missing Verifylib

Hi everyone! The code for layout and verifylib is missing from this project. Is there a development plan to complete this functionality? When will this project be able to verify?

Inconsistent format of publickey with python version

In in-toto-rs, the keyval of a serialized public key does not have a private member, e.g.

{
    "keyid": "2f89b9272acfc8f4a0a0f094d789fdb0ba798b0fe41f2f5f417c12f0085ff498",
    "keyid_hash_algorithms": [
        "sha256",
        "sha512"
    ],
    "keytype": "rsa",
    "keyval": {
        "public": "-----BEGIN PUBLIC KEY-----\nMIIBojANBgkqhkiG9w0BAQEFAAOCAY8AMIIBigKCAYEAzgLBsMFSgwBiWTBmVsyW\n5KbJwLFSodAzdUhU2Bq6SdRz/W6UOBGdojZXibxupjRtAaEQW/eXDe+1CbKg6ENZ\nGt2D9HGFCQZgQS8ONgNDQGiNxgApMA0T21AaUhru0vEofzdN1DfEF4CAGv5AkcgK\nsalhTyONervFIjFEdXGelFZ7dVMV3Pp5WkZPG0jFQWjnmDZhUrtSxEtqbVghc3kK\nAUj9Ll/3jyi2wS92Z1j5ueN8X62hWX2xBqQ6nViOMzdujkoiYCRSwuMLRqzW2CbT\nL8hF1+S5KWKFzxl5sCVfpPe7V5HkgEHjwCILXTbCn2fCMKlaSbJ/MG2lW7qSY2Ro\nwVXWkp1wDrsJ6Ii9f2dErv9vJeOVZeO9DsooQ5EuzLCfQLEU5mn7ul7bU7rFsb8J\nxYOeudkNBatnNCgVMAkmDPiNA7E33bmL5ARRwU0iZicsqLQR32pmwdap8PjofxqQ\nk7Gtvz/iYzaLrZv33cFWWTsEOqK1gKqigSqgW9T26wO9AgMBAAE=\n-----END PUBLIC KEY-----"
    },
    "scheme": "rsassa-pss-sha256"
}

But in both the Python and Go implementations, it has one, e.g.

{
    "keyid": "2f89b9272acfc8f4a0a0f094d789fdb0ba798b0fe41f2f5f417c12f0085ff498",
    "keyid_hash_algorithms": [
        "sha256",
        "sha512"
    ],
    "keytype": "rsa",
    "keyval": {
        "private": "",
        "public": "-----BEGIN PUBLIC KEY-----\nMIIBojANBgkqhkiG9w0BAQEFAAOCAY8AMIIBigKCAYEAzgLBsMFSgwBiWTBmVsyW\n5KbJwLFSodAzdUhU2Bq6SdRz/W6UOBGdojZXibxupjRtAaEQW/eXDe+1CbKg6ENZ\nGt2D9HGFCQZgQS8ONgNDQGiNxgApMA0T21AaUhru0vEofzdN1DfEF4CAGv5AkcgK\nsalhTyONervFIjFEdXGelFZ7dVMV3Pp5WkZPG0jFQWjnmDZhUrtSxEtqbVghc3kK\nAUj9Ll/3jyi2wS92Z1j5ueN8X62hWX2xBqQ6nViOMzdujkoiYCRSwuMLRqzW2CbT\nL8hF1+S5KWKFzxl5sCVfpPe7V5HkgEHjwCILXTbCn2fCMKlaSbJ/MG2lW7qSY2Ro\nwVXWkp1wDrsJ6Ii9f2dErv9vJeOVZeO9DsooQ5EuzLCfQLEU5mn7ul7bU7rFsb8J\nxYOeudkNBatnNCgVMAkmDPiNA7E33bmL5ARRwU0iZicsqLQR32pmwdap8PjofxqQ\nk7Gtvz/iYzaLrZv33cFWWTsEOqK1gKqigSqgW9T26wO9AgMBAAE=\n-----END PUBLIC KEY-----"
    },
    "scheme": "rsassa-pss-sha256"
}

This causes the generated keyid to differ between in-toto-rs and the other two implementations.

Roadmap of GSOC 2022

For the GSoC 2022 project, my work is to bring DSSE and SLSA support to in-toto-rs.

Structure of workflow

step 1 -- Generate predicate version by argument

graph TD
A[in_toto_run] --> B[LinkMetadataBuilder]
B[LinkMetadataBuilder] --> C[MeatdataFlatten]
C[MeatdataFlatten] --> D[PredicateWrapper]
E(PredicateVer) -->|select| D[PredicateWrapper]
D[PredicateWrapper] -.->|one of four| F[None]
D[PredicateWrapper] -.->|one of four| G[Link_V02]
D[PredicateWrapper] -.->|one of four| H[SLSA_Provenance_V01]
D[PredicateWrapper] -.->|one of four| I[SLSA_Provenance_V02]

step 2 -- Generate statement version by argument

graph TD
A[in_toto_run] --> B[LinkMetadataBuilder]
B[LinkMetadataBuilder] --> C[MeatdataFlatten]
C[MeatdataFlatten] --> D[StatementWrapper]
E(StatementVer) -->|select| D[StatementWrapper]
D[StatementWrapper] -.->|one of two| F[Statement_naive]
D[StatementWrapper] -.->|one of two| G[Statement_V01]

step 3 -- Validate the predicate and statement

Valid combinations of statement and predicate:

statement | predicate | description
Statement_naive | None | original version
Statement_V01 | Link_V02 | link
Statement_V01 | SLSA_Provenance_V01 | link
Statement_V01 | SLSA_Provenance_V02 | link
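The validation in step 3 amounts to a lookup against the valid combinations. A minimal Python sketch follows; the names `VALID_COMBINATIONS` and `is_valid_combination` are hypothetical, the variants are represented as plain strings spelled as in the step 1 diagram, and the actual in-toto-rs code would express this with Rust types instead.

```python
# (statement, predicate) pairs taken from the combinations listed above.
VALID_COMBINATIONS = {
    ("Statement_naive", "None"),
    ("Statement_V01", "Link_V02"),
    ("Statement_V01", "SLSA_Provenance_V01"),
    ("Statement_V01", "SLSA_Provenance_V02"),
}


def is_valid_combination(statement, predicate):
    """Return True if the statement version may wrap the predicate version."""
    return (statement, predicate) in VALID_COMBINATIONS


print(is_valid_combination("Statement_V01", "Link_V02"))
print(is_valid_combination("Statement_naive", "SLSA_Provenance_V02"))
```

Keeping the pairs in one table makes it easy to reject unsupported combinations before any serialization work is done.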

step 4 -- Serialize the statement

graph TD
A[StatementWrapper] -.->|one of two| B[Statement_V01]
A[StatementWrapper] -.->|one of two| C[Statement_naive]
B[Statement_V01] --> K[serde::Serialize]
B[Statement_V01] --> D[PredicateWrapper]
D[PredicateWrapper] -.->|one of four-same| E[Any Predicate]
E[Any Predicate] --> K[serde::Serialize]
C[Statement_naive] --> K[serde::Serialize]
K[serde::Serialize] --> H[json attestation]

Finished and to-do features

  • DSSE Envelope data model
  • DSSE introduced into attestation sealing
  • Link compatibility data model v0.2 [Serialize+Deserialize]
  • SLSA provenance data model v0.1 [Serialize+Deserialize]
  • SLSA provenance data model v0.2 [Serialize+Deserialize]
  • SLSA introduced into attestation formatting
  • New argument in in-toto-run for switching the attestation format
  • New test cases for SLSA provenance and Link provenance
  • Work with rebuilderd to migrate the output format to SLSA

API updated

API | argument | update | description | type
in_toto_run | build_id | renamed from name | corresponds to SLSA build_id | &str
in_toto_run | build_type | new argument | corresponds to SLSA build_type, necessary for SLSA format | Option<&str>
in_toto_run | statement_format | new argument | output format of attestation | Option<MetaFormat>

Corresponding PRs

PR | description
#27 | Add Pre-Authentication Encoding (PAE) for DSSE
#28 | Add data model EnvelopeFile for DSSE
#32 | Add 3 Predicate model / 2 State models for SLSA
#36 | Add callee of SLSA model

Develop in-toto-rs capabilities to support rebuilderd

Description:

in-toto-rs was first created to enable the generation of in-toto link attestations in rebuilderd. However, in-toto-rs currently does not support the generation of signed link attestations. This functionality is provided via the runlib on the in-toto implementations written in Python and Go.

Current behaviour: in-toto-rs has no mechanisms for generating signed link attestations.

Expected behaviour: in-toto-rs provides a runlib that is equivalent to those found in other in-toto implementations, and which can be used by rebuilderd to generate in-toto link metadata.

Return-value of byproducts in a link-file should be treated as an int

The in-toto specification says:

At a minimum, the byproducts dictionary should have standard output (stdout), 
standard error (stderr) and return value (return-value), even if no values are filled in. 
The return value should be stored as integer value.

A legal example of the byproducts field looks like:

"byproducts": {
    "return-value":0,
    "stderr":"a foo.py\n",
    "stdout":""
}

The present in-toto-rs implementation treats byproducts as a HashMap<String, String>, and thus produces link files like:

...
"byproducts": {
    "return-value":"0",
    "stderr":"a foo.py\n",
    "stdout":""
}
...

This should be fixed.
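The difference between the two serializations is easy to see in a small Python sketch. The helper `make_byproducts` is a hypothetical name used only for illustration; an actual fix in in-toto-rs would change the Rust type of the byproducts map rather than follow this code.

```python
import json


def make_byproducts(return_value, stdout, stderr):
    """Build a byproducts dictionary whose return value is an integer."""
    return {
        "return-value": int(return_value),  # coerce "0" to 0
        "stdout": stdout,
        "stderr": stderr,
    }


# Even if the exit status arrives as a string, it is serialized as a number.
byproducts = make_byproducts("0", "", "a foo.py\n")
link_fragment = json.dumps({"byproducts": byproducts}, sort_keys=True)

print(link_fragment)
```

With a string-only map, the same field would be emitted as "0" in quotes, which a verifier expecting an integer may treat as a different value.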
