anchore / syft

CLI tool and library for generating a Software Bill of Materials from container images and filesystems

License: Apache License 2.0

Makefile 0.72% Go 92.94% Dockerfile 0.49% Ruby 0.25% Shell 1.68% Java 0.17% HTML 0.01% Python 0.92% Roff 0.01% C 0.12% Vim Script 0.01% Elixir 0.25% Erlang 0.12% PHP 0.04% C++ 0.02% Lua 2.25%
containers docker go golang static-analysis tool oci sbom spdx cyclonedx

syft's Introduction

Syft

A CLI tool and Go library for generating a Software Bill of Materials (SBOM) from container images and filesystems. Exceptional for vulnerability detection when used with a scanner like Grype.

Introduction

Syft is a powerful and easy-to-use open-source tool for generating Software Bill of Materials (SBOMs) for container images and filesystems. It provides detailed visibility into the packages and dependencies in your software, helping you manage vulnerabilities, license compliance, and software supply chain security.

Syft development is sponsored by Anchore, and is released under the Apache-2.0 License. For commercial support options with Syft or Grype, please contact Anchore.

Features

  • Generates SBOMs for container images, filesystems, archives, and more to discover packages and libraries
  • Supports OCI, Docker and Singularity image formats
  • Linux distribution identification
  • Works seamlessly with Grype (a fast, modern vulnerability scanner)
  • Able to create signed SBOM attestations using the in-toto specification
  • Convert between SBOM formats, such as CycloneDX, SPDX, and Syft's own format.

Installation

Syft binaries are provided for Linux, macOS and Windows.

Recommended

curl -sSfL https://raw.githubusercontent.com/anchore/syft/main/install.sh | sh -s -- -b /usr/local/bin

Install script options:

  • -b: Specify a custom installation directory (defaults to ./bin)
  • -d: More verbose logging levels (-d for debug, -dd for trace)
  • -v: Verify the signature of the downloaded artifact before installation (requires cosign to be installed)

Homebrew

brew install syft

Scoop

scoop install syft

Chocolatey

The Chocolatey distribution of Syft is community-maintained and not distributed by the Anchore team.

choco install syft -y

Nix

Note: Nix packaging of Syft is community maintained. Syft is available in the stable channel since NixOS 22.05.

nix-env -i syft

... or, just try it out in an ephemeral nix shell:

nix-shell -p syft

Getting started

SBOM

To generate an SBOM for a container image:

syft <image>

The above output includes only software that is visible in the container (i.e., the squashed representation of the image). To include software from all image layers in the SBOM, regardless of its presence in the final image, provide --scope all-layers:

syft <image> --scope all-layers

Output formats

Syft's output format is configurable with the -o (or --output) option:

syft <image> -o <format>

Where the formats available are:

Note that flags using the @ can be used for earlier versions of each specification as well.

Supported Ecosystems

  • Alpine (apk)
  • C (conan)
  • C++ (conan)
  • Dart (pubs)
  • Debian (dpkg)
  • Dotnet (deps.json)
  • Objective-C (cocoapods)
  • Elixir (mix)
  • Erlang (rebar3)
  • Go (go.mod, Go binaries)
  • Haskell (cabal, stack)
  • Java (jar, ear, war, par, sar, nar, native-image)
  • JavaScript (npm, yarn)
  • Jenkins Plugins (jpi, hpi)
  • Linux kernel archives (vmlinuz)
  • Linux kernel modules (ko)
  • Nix (outputs in /nix/store)
  • PHP (composer)
  • Python (wheel, egg, poetry, requirements.txt)
  • Red Hat (rpm)
  • Ruby (gem)
  • Rust (cargo.lock)
  • Swift (cocoapods, swift-package-manager)
  • Wordpress plugins

Documentation

Our wiki contains further details on the following topics:

Syft Team Meetings

The Syft Team holds regular community meetings online. All are welcome to join and bring topics for discussion.

syft's People

Contributors

anchore-actions-token-generator[bot], brian-ebarb, coheigea, cpendery, dakaneye, deitch, dependabot[bot], developer-guy, hainenber, houdini91, jedevc, jonasagx, joshbressers, kzantow, laurentgoderre, luhring, mikcl, noqcks, samj1912, shanedell, spiffcs, testwill, tofay, toure, vargenau, wagoodman, westonsteimel, willmurphyscode, witchcraze, zhill

syft's Issues

Add centos distro identification

Should be able to identify a centos image and determine the distro version (populating a distro.Distro object from distro.Identify())
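
One common way to identify a distro is to read the os-release file shipped in the image. The sketch below is written in Python for brevity (Syft itself is Go) and is not the project's implementation: `Distro`, `parse_os_release`, and `identify` are hypothetical names standing in for distro.Distro and distro.Identify(). It assumes the image provides an os-release file; older releases that lack one would need a fallback to other release files.

```python
import shlex
from typing import Dict, NamedTuple, Optional


class Distro(NamedTuple):
    """Hypothetical stand-in for the distro.Distro object."""
    name: str
    version: str


def parse_os_release(text: str) -> Dict[str, str]:
    """Parse KEY=value lines in the os-release format into a dict."""
    fields: Dict[str, str] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, raw = line.partition("=")
        # Values may be quoted with single or double quotes.
        parts = shlex.split(raw)
        fields[key] = parts[0] if parts else ""
    return fields


def identify(text: str) -> Optional[Distro]:
    """Return the distro described by os-release content, or None."""
    fields = parse_os_release(text)
    if "ID" not in fields:
        return None
    return Distro(fields["ID"], fields.get("VERSION_ID", ""))
```

The same function would serve the other distro identification tickets, since only the `ID` value differs between distros.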

Better abstraction for dir/img decision fork

As currently implemented, upstream consumers of this API still have to pass around an *image.Image or a path to produce the catalog, and then potentially deal with distro identification.

This is problematic because it introduces a lot of boilerplate switch cases while keeping track of image, distro, and catalog.

An abstraction should be put in place that simplifies how consumers retrieve a catalog and, optionally, identify the distro from a path or an image.

Add first vulnerability matcher (dpkg)

This will require:

  • version parsing & comparison support (start with semver and dpkg format)
  • version constraint evaluation
  • adding a matcher interface
  • implementing a generic OS matcher (exact package name matching)
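
Version constraint evaluation can be illustrated with a small sketch. It is in Python for brevity and handles only dotted numeric versions; dpkg versions (epochs, tildes, revisions) need a dedicated comparison and are not covered here. The names `parse_version` and `satisfies` are hypothetical.

```python
import operator
import re
from typing import Tuple

_OPS = {
    "<=": operator.le,
    ">=": operator.ge,
    "==": operator.eq,
    "!=": operator.ne,
    "<": operator.lt,
    ">": operator.gt,
}
# Two-character operators are listed first so "<=" is not read as "<".
_CLAUSE = re.compile(r"^\s*(<=|>=|==|!=|<|>)\s*(\d+(?:\.\d+)*)\s*$")


def parse_version(text: str) -> Tuple[int, ...]:
    """Turn '1.10.0' into (1, 10, 0) so parts compare numerically."""
    return tuple(int(part) for part in text.split("."))


def _pad(a: Tuple[int, ...], b: Tuple[int, ...]):
    """Right-pad the shorter version with zeros so 1.2 equals 1.2.0."""
    width = max(len(a), len(b))
    return a + (0,) * (width - len(a)), b + (0,) * (width - len(b))


def satisfies(version: str, constraint: str) -> bool:
    """Check a version against comma-separated clauses such as '>=1.0,<2.0'."""
    for clause in constraint.split(","):
        match = _CLAUSE.match(clause)
        if match is None:
            raise ValueError(f"unsupported constraint clause: {clause!r}")
        op, bound = match.groups()
        left, right = _pad(parse_version(version), parse_version(bound))
        if not _OPS[op](left, right):
            return False
    return True
```

A matcher would then report a package as affected when `satisfies(package_version, vulnerable_range)` is true for a record whose package name matches exactly.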

Add python Analyzer

Should:

  • be able to discover and parse .dist-info/.egg-info files within the image scope.
  • populate results via the CatalogWriter
  • implement the Analyzers interface (see the design outline)

Note: consider splitting this into multiple Analyzers
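
The METADATA file inside a .dist-info directory (and PKG-INFO inside .egg-info) uses email-style headers, so the standard library can read it. A minimal Python sketch of the parsing step follows; `parse_metadata` is a hypothetical name, and discovery of the files and the CatalogWriter integration are left out.

```python
from email.parser import HeaderParser
from typing import Optional, Tuple


def parse_metadata(text: str) -> Optional[Tuple[str, str]]:
    """Return (name, version) from METADATA/PKG-INFO content, if present."""
    headers = HeaderParser().parsestr(text)
    name, version = headers.get("Name"), headers.get("Version")
    if not name or not version:
        return None
    return name, version
```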

Add scope selector / tree builder

Create an object that can build a tree or slice of trees based on the following selections:

  • Squashed Scope: visible files in container
  • Full Scope: all layers

Stretch goal:

  • Hidden Scope: all layers - squashed
  • User Scope: all layers - base layer
  • User Squashed Scope: squashed - base layer
  • Custom Scope: user selects which layers they care about (e.g. "1,5-7,9")
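
The custom scope selection string (e.g. "1,5-7,9") can be expanded into layer indexes as sketched below. This is a Python illustration, not project code: `parse_layer_selection` is a hypothetical name, and treating indexes as zero-based is an assumption the ticket does not settle.

```python
from typing import List


def parse_layer_selection(spec: str, layer_count: int) -> List[int]:
    """Expand a selection such as '1,5-7,9' into sorted layer indexes.

    Indexes are assumed to be zero-based and ranges inclusive.
    """
    selected = set()
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue
        start_text, dash, end_text = part.partition("-")
        start = int(start_text)
        end = int(end_text) if dash else start
        if start > end:
            raise ValueError(f"descending range: {part!r}")
        if start < 0 or end >= layer_count:
            raise ValueError(f"layer out of bounds: {part!r}")
        selected.update(range(start, end + 1))
    return sorted(selected)
```

The hidden, user, and user squashed scopes could be expressed in the same terms, as set differences over layer indexes or over the file trees built from them.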

Add fedora distro identification

Should be able to identify a fedora image and determine the distro version (populating a distro.Distro object from distro.Identify())

Consider submodules for separating app/lib dependencies

Currently app/lib dependencies are mixed in a single go.mod, which is not ideal. For instance, the application pulls in zap for logging, and this is abstracted away via an interface when passed down into the library component. However, anyone who uses this library as a dependency will then also depend on zap.

This can be handled in one of two ways:
a. make the root go.mod the application config and introduce a new go.mod in the library directory (preferred).
b. make the root go.mod the library config and introduce a new go.mod in the cmd directory.

Option A is preferred as this would still allow main.go to reside at the root, but this is purely an opinion.

Either approach requires that the pipeline/Makefile is adjusted to account for this, as go tooling (go test, etc) typically only works on a single module (and ignores sub modules).

Add arch distro identification

Should be able to identify an arch image and determine the distro version (populating a distro.Distro object from distro.Identify())

Add user image error handling

Currently, if the user runs the application without passing an image, there is a panic. This should be a user-friendly error.

update check

Check that a new version is available and inform the user on start up.

Note: this includes the infrastructure necessary to complete this task.

Note: this affects the release process.

Note: create a ticket that captures what automation is needed for a hard release.

Add alpine distro identification

Should be able to identify an alpine image and determine the distro version (populating a distro.Distro object from distro.Identify())

Add busybox identification

Should be able to identify a busybox image and determine the version (populating a distro.Distro object from distro.Identify())

Add debian distro identification

Should be able to identify a debian image and determine the distro version (populating a distro.Distro object from distro.Identify())

Enhance scoping selections

Add the following user scope selections:

  • Hidden Scope: all layers - squashed
  • User Scope: all layers - base layer
  • User Squashed Scope: squashed - base layer
  • Custom Scope: user selects which layers they care about (e.g. "1,5-7,9")

Add yarn Cataloger

Should be able to discover and parse yarn.lock files within the image scope.

Add gpg key verification

(From @zhill comments on related issues) Report on gpg key and/or source record of each package to help verify actual package source. The objective is to identify rpms installed from the distro vs rpms from the package maintainer directly or built by a user.

This should be applied to:

  • rpms
  • deb/dpkg

Add symlink/hardlink support

Stereoscope now supports resolving symlinks and hardlinks, imgbom should take advantage of this. This also is an opportunity to use the scope objects to encapsulate image-specific concerns away from the analyzers/cataloger objects. That is, the scope objects should know how to resolve a symlink/hardlink using stereoscope and be made available to the analyzers/catalogers directly (instead of the raw filetree reader objects). This would improve testability and decouple stereoscope concerns (image/trees) from catalogers.

Add (default) text presenter

  • Add a new Presenter which outputs the SBOM results in a human friendly way
  • Make this the default --output option

Directory cataloging only showing partial results

Currently the integration tests for the directory presenter only show bundle results, even though the directory in question has multiple package types. Additionally, if the Gemfile.lock test file is placed in a subdirectory, no packages are found.

Add DPKG Analyzer

Should:

  • be able to parse a dpkg status file from /var/lib/dpkg/status within the image scope
  • populate results via the CatalogWriter
  • implement the Analyzers interface (see the design outline)

Test against:

  • debian:8 - debian:10
  • ubuntu:14.04 - ubuntu:20.10
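
The dpkg status file is a sequence of stanzas separated by blank lines, each made of `Field: value` lines with indented continuation lines. A minimal Python sketch of the parsing step follows; `iter_stanzas` and `installed_packages` are hypothetical names, and the CatalogWriter and Analyzers interface wiring is omitted.

```python
from typing import Dict, Iterator, List, Tuple


def iter_stanzas(text: str) -> Iterator[Dict[str, str]]:
    """Yield one dict of fields per blank-line-separated stanza."""
    fields: Dict[str, str] = {}
    key = None
    # The trailing "" flushes the last stanza even without a final blank line.
    for line in text.splitlines() + [""]:
        if not line.strip():
            if fields:
                yield fields
            fields, key = {}, None
        elif line[0] in " \t":
            # Continuation of the previous field (e.g. a long Description).
            if key is not None:
                fields[key] += "\n" + line.strip()
        else:
            name, sep, value = line.partition(":")
            key = name if sep else None
            if key is not None:
                fields[key] = value.strip()


def installed_packages(text: str) -> List[Tuple[str, str]]:
    """Return (name, version) for stanzas whose state is 'installed'."""
    return [
        (f["Package"], f["Version"])
        for f in iter_stanzas(text)
        if "Package" in f
        and "Version" in f
        and f.get("Status", "").split()[-1:] == ["installed"]
    ]
```

Filtering on the last word of the Status field skips packages that were removed but left configuration files behind.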

Add Java Analyzer

Should:

  • be able to discover and parse *.jar/*.war/*.ear/*.jpi/*.hpi (Java ZIP Archive) files within the image scope.
  • use maven / pom.xml
  • implement the Cataloger interface (see the design outline)

Note: the main exploratory feature is to be able to "sub-index" files within archives to track the source (track that a vulnerability comes from a specific file within a jar archive)

Note: Engine does this recursively.

Note: split this into multiple Catalogers / or by features

Note: package will need to track relative paths within the file as sources for other packages (nested jars)

Add `setup.py` analyzer

Should be a best effort on parsing setup.py for dependencies

For example, this is an extract from the setup.py file of Requests:

requires = [
    'chardet>=3.0.2,<4',
    'idna>=2.5,<3',
    'urllib3>=1.21.1,<1.26,!=1.25.0,!=1.25.1',
    'certifi>=2017.4.17'

]
test_requirements = [
    'pytest-httpbin==0.0.7',
    'pytest-cov',
    'pytest-mock',
    'pytest-xdist',
    'PySocks>=1.5.6, !=1.5.7',
    'pytest>=3'
]

In this case, grype should be able to only pick up: 'pytest-httpbin==0.0.7'
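
A best-effort approach that avoids executing setup.py is to scan its text for quoted requirement strings pinned with `==`. The Python sketch below is one possible heuristic, not the project's parser; `pinned_requirements` is a hypothetical name, and it will miss requirements built dynamically or read from other files.

```python
import re
from typing import List, Tuple

# A quoted string of the form 'name==version'; anything with a range
# operator (>=, <, !=, ...) right after the name is intentionally not matched.
_PINNED = re.compile(
    r"""['"]([A-Za-z0-9][A-Za-z0-9._-]*)\s*==\s*([A-Za-z0-9][A-Za-z0-9.*+!_-]*)['"]"""
)


def pinned_requirements(source: str) -> List[Tuple[str, str]]:
    """Return (name, version) pairs for exactly pinned requirements."""
    return _PINNED.findall(source)
```

Applied to the extract above, this yields a single pair for pytest-httpbin at version 0.0.7, since every other entry is unpinned or uses a range.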

Add package definition(s) & Catalog

Preferably this is a single unified Package struct that can represent OS packages, language packages, and other package-like artifacts. Additionally, a collection of these packages needs to be gathered into a single Catalog struct.

More thought is needed on how this data will be accessed. This will help guide how to best store (and potentially index) the data.

Add RPMDB Analyzer

Should:

  • be able to parse an RPMDB file from /var/lib/rpm/Packages within the image scope
  • populate results via the CatalogWriter
  • implement the Analyzers interface (see the design outline)

Test against:

  • centos:5 - centos:8
  • fedora:26 - fedora:32
  • amazonlinux:1 - amazonlinux:2
  • oraclelinux:6 - oraclelinux:8

Add APK Cataloger

Should be able to parse an APK flatfile from /lib/apk/db/installed within the image scope

Test against:

  • alpine:3.8 - alpine:3.11
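
The APK installed database is a flat file of blank-line-separated records whose lines have a one-letter key, a colon, and a value (`P:` for the package name, `V:` for the version). A minimal Python sketch of the parsing step, with the hypothetical name `parse_apk_installed`:

```python
from typing import Dict, List, Tuple


def parse_apk_installed(text: str) -> List[Tuple[str, str]]:
    """Return (name, version) for each record in an APK installed database."""
    packages: List[Tuple[str, str]] = []
    fields: Dict[str, str] = {}
    # The trailing "" flushes the last record even without a final blank line.
    for line in text.splitlines() + [""]:
        if not line.strip():
            if "P" in fields and "V" in fields:
                packages.append((fields["P"], fields["V"]))
            fields = {}
            continue
        key, sep, value = line.partition(":")
        if sep:
            # Keys are case-sensitive: 'P' (package) differs from 'p' (provides).
            fields[key] = value
    return packages
```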

Add package-lock.json Analyzer

Should be able to discover and parse package-lock.json (NPM) files within the image scope.

Note: consider making a ticket for parsing just the package.json
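
As a rough illustration of the parsing step, the Python sketch below walks the nested `dependencies` object used by the original package-lock.json layout and collects name and version pairs. `parse_package_lock` is a hypothetical name; newer lockfile versions that store entries under a `packages` key are not handled.

```python
import json
from typing import Any, Dict, List, Set, Tuple


def parse_package_lock(text: str) -> List[Tuple[str, str]]:
    """Return sorted, unique (name, version) pairs from a package-lock.json."""
    found: Set[Tuple[str, str]] = set()

    def walk(deps: Dict[str, Any]) -> None:
        for name, info in deps.items():
            version = info.get("version")
            if version:
                found.add((name, version))
            # Nested entries record packages installed below this one.
            walk(info.get("dependencies", {}))

    walk(json.loads(text).get("dependencies", {}))
    return sorted(found)
```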

Add basic pipeline

Should be able to support:

  • lint checks
  • unit testing
  • integration testing

Invoked on at least:

  • open PRs
  • new commits

Add distroless image identification

Should be able to identify a distroless image and determine the version (populating a distro.Distro object from distro.Identify())

Note: anchore-engine has logic for this, which this ticket should keep in parity

Improve test coverage to >= 80%

Once coverage is at a good threshold, add a quality gate to the pipeline to prevent regression of coverage below a threshold.

Add analyzer coordinator

This object will be responsible for:

  • (potentially) constructing all analyzers
  • compiling a list of files that should be fetched from the image tar (originating from analyzer.SelectFiles() for every analyzer)
  • fetching the contents of each file for every analyzer
  • commencing the analysis for each analyzer
  • returning a populated catalog
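
The responsibilities above can be sketched as a single function. This is a Python illustration under assumptions, not the project's design: the `Analyzer` protocol, its `select_files` and `analyze` methods, and the `fetch` callback are hypothetical stand-ins for the Go interfaces, and construction of the analyzers is left to the caller.

```python
from typing import Callable, Dict, Iterable, List, Protocol, Tuple

Package = Tuple[str, str]


class Analyzer(Protocol):
    """Hypothetical analyzer interface."""

    def select_files(self) -> List[str]: ...

    def analyze(self, contents: Dict[str, str]) -> List[Package]: ...


def run_analyzers(
    analyzers: Iterable[Analyzer], fetch: Callable[[str], str]
) -> List[Package]:
    """Fetch every requested file once, then run each analyzer on its files."""
    analyzers = list(analyzers)
    # Compile the de-duplicated set of paths requested by all analyzers.
    requested = {path for a in analyzers for path in a.select_files()}
    # Fetch each file a single time, even if several analyzers want it.
    contents = {path: fetch(path) for path in sorted(requested)}
    # Run each analyzer on the files it asked for and merge the results.
    catalog: List[Package] = []
    for analyzer in analyzers:
        own = {path: contents[path] for path in analyzer.select_files()}
        catalog.extend(analyzer.analyze(own))
    return catalog
```

Fetching through one shared map means the image tar is read once per file regardless of how many analyzers request it.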

Add redhat distro identification

Should be able to identify a redhat image and determine the distro version (populating a distro.Distro object from distro.Identify())

Add Ubuntu distro identification

Should be able to identify an Ubuntu image and determine the version (populating a distro.Distro object from distro.Identify())

SBOM from all-layers scope showing duplicate packages

The set of analyzers will surface packages based on the small set of rules that each analyzer is coded to enforce. This may surface multiple packages from the same underlying source (e.g. python egg-info analyzer picks up a package that was also picked up by dpkg).

Note: this behavior should be optional via configuration and CLI options, defaulting to not deduping packages.
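
An opt-in dedupe step might look like the Python sketch below. The `Package` tuple, the `found_by` field, and the choice of (name, version) as the identity key are assumptions made for illustration; the real key may need to include package type or location.

```python
from typing import Iterable, List, NamedTuple


class Package(NamedTuple):
    """Hypothetical package record."""
    name: str
    version: str
    found_by: str


def collect(packages: Iterable[Package], dedupe: bool = False) -> List[Package]:
    """Return packages in discovery order, optionally dropping duplicates."""
    packages = list(packages)
    if not dedupe:
        # Default behavior: report everything each analyzer surfaced.
        return packages
    seen = set()
    unique: List[Package] = []
    for pkg in packages:
        key = (pkg.name, pkg.version)
        if key in seen:
            continue
        seen.add(key)
        unique.append(pkg)
    return unique
```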

Add suse distro identification

Should be able to identify a suse image and determine the distro version (populating a distro.Distro object from distro.Identify())

Add requirements.txt analyzer

Should:

  • Parse requirements.txt, keep in mind files can be named differently.
  • implement the Analyzers interface (see the design outline)

Note: we should ignore packages without versions, since they are not useful for vulnerability scanning.
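
A minimal Python sketch of that rule: keep only lines pinned with `==` and skip comments, option lines, and unpinned or ranged entries. `parse_requirements` is a hypothetical name, and environment markers, line continuations, and URL requirements are not handled.

```python
import re
from typing import List, Tuple

# name, optional [extras], then an exact '==' pin.
_PIN = re.compile(
    r"^([A-Za-z0-9][A-Za-z0-9._-]*)(?:\[[^\]]*\])?\s*==\s*([^\s;#,]+)"
)


def parse_requirements(text: str) -> List[Tuple[str, str]]:
    """Return (name, version) for exactly pinned requirement lines."""
    packages: List[Tuple[str, str]] = []
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()
        # Skip blanks, comments, and option lines such as '-r other.txt'.
        if not line or line.startswith("-"):
            continue
        match = _PIN.match(line)
        if match:
            packages.append((match.group(1), match.group(2)))
    return packages
```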

Add Pacman Analyzer

Should:

  • be able to parse a flatfile from /var/lib/pacman/local within the image scope
  • populate results via the CatalogWriter
  • implement the Analyzers interface (see the design outline)

Test against:

  • archlinux:20191105 - archlinux:20200505

Add poetry Cataloger

Should:

  • be able to discover and parse poetry.lock (Python) files within the image scope
  • implement the Cataloger interface for poetry files
  • Add a new PoetryPkg package type

Add gem Analyzer

Should:

  • be able to discover and parse a Gemfile.lock within the image scope
  • populate results via the CatalogWriter
  • implement the Analyzers interface (see the design outline)
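
In a Gemfile.lock, resolved gems appear under a `specs:` line at four spaces of indentation as `name (version)`, while their dependency constraints are indented further. The Python sketch below relies on that layout; `parse_gemfile_lock` is a hypothetical name, and the CatalogWriter integration is omitted.

```python
import re
from typing import List, Tuple

# Exactly four leading spaces: a resolved gem, not a nested dependency.
_SPEC = re.compile(r"^ {4}(\S+) \(([^)]+)\)$")


def parse_gemfile_lock(text: str) -> List[Tuple[str, str]]:
    """Return (name, version) for each resolved gem in a Gemfile.lock."""
    packages: List[Tuple[str, str]] = []
    in_specs = False
    for line in text.splitlines():
        if line.strip() == "specs:":
            in_specs = True
            continue
        # A blank line or an unindented section header ends the specs block.
        if not line.strip() or not line.startswith(" "):
            in_specs = False
            continue
        if in_specs:
            match = _SPEC.match(line)
            if match:
                packages.append((match.group(1), match.group(2)))
    return packages
```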

Remove dummy analyzer

PR #20 introduced a new dummy package/analyzer. Once an actual analyzer is created, the dummy should get removed.

Upload SBOM results to Anchore Engine

There should be an option to allow for results to be submitted to an Anchore Engine instance via CLI options and configuration.

Acceptance criteria:

  1. Syft can be invoked with configuration that specifies a means of reaching and authenticating with a deployed instance of Anchore Engine.
  2. After Syft exits, the cataloged packages can be observed directly from the instance of Anchore Engine (such as via anchore-cli image content ...).

Steps to test:

  1. Select a container image.
  2. Run Syft, passing in as arguments: a reference to the container image, the URL to the instance of Anchore Engine, and authentication credentials.
  3. Run anchore-cli image content <image> <content-type> for various content types to observe that Anchore Engine correctly received and stored Syft's cataloged findings.

Developer notes:

  1. For context on the inspiration for this functionality and some preliminary implementation thoughts, be sure to check out the Connecting Toolbox & Enterprise document.
  2. To be ironed out: what's the expected behavior for uploading analysis for an image (tag) whose analysis has already been stored in Anchore Engine?
  3. To be ironed out: schema version vs syft version for validating data shape on upload?
