
ossf / alpha-omega


Our mission is to catalyze sustainable improvements to critical open source software projects and ecosystems.

Home Page: https://alpha-omega.dev

License: Apache License 2.0

Languages: Dockerfile 1.56%, PowerShell 0.53%, Python 19.41%, Shell 4.09%, Open Policy Agent 71.01%, CSS 0.13%, JavaScript 0.26%, HTML 3.01%
Topics: open-source-security, opensource, security

alpha-omega's Introduction

Alpha-Omega

Alpha-Omega's mission is to catalyze sustainable security improvements to critical open source projects and ecosystems. We accomplish this in various ways, such as funding security staff at organizations like the Rust Foundation and the Eclipse Foundation, security improvements to projects like Homebrew, security audits of projects like OpenSSL, and security features in projects like Rustls. We also sponsor work to identify serious vulnerabilities across a large set of open source projects, such as our work through OpenRefactory.

Since 2022, Alpha-Omega has been working hard, "turning money into security". Learn more at alpha-omega.dev or read our latest Annual Report.

Alpha

We have active engagements with the following projects:

Learn more about Alpha.

Project Information

Meeting times

We usually meet on the first Wednesday of each month at 9:00am PT. You can find the meeting invite link on the OpenSSF Community Calendar.

Core Team

The Alpha-Omega core team members include:

Get Involved

You can get involved by engaging with us in various ways:

  • Slack: We watch the #alpha_omega Slack channel.
  • Monthly Meeting: Come and talk to us directly.
  • Mailing List: Join the alpha-omega-announcements mailing list to be notified of upcoming developments.
  • Contact Us: Let us know you'd like to get involved.

alpha-omega's People

Contributors

alice-sowerby, amir-montazery, aorps, apm05, cyber-jiujiteria, dependabot[bot], emaste, fdegir, glenda1015, hyandell, jacobmsft, jcasman, joelmarcey, johnwalicki, khorben, krinkle, lukeschantz, martinemde, mbarbero, micmarti85, openrefactory, rafaelgss, sarahgran, scovetta, sethmlarson, supertassu, timmywil, veenaamb, woodruffw


alpha-omega's Issues

Generic functionality for executing all assertions

    There should be a more generic method to execute all assertions and map them to their respective key/value pairs via a .yaml or .json file. This could avoid updating or modifying source code to add new assertions, as well as redeploying code every time assertions and their values need to change.

Originally posted by @Cyber-JiuJiteria in #31 (comment)

From scovetta:
"I have something in DynamicPolicy which does a similar thing of sniffing out what policies exist and creating/executing them, so the user can just "run everything in builtin" -- but it's a little different here because different assertions need different input data."

OAF / Naming

We should have a catchy name for the assertion work. Any ideas?

  • Assurance Assertions -- Pretty bad
  • Project Verde -- I was thinking GUAC, SLSA, "Salsa Verde"?
  • OAF -- "Omega Assertion Framework", not terrible, but is it pronounced "O-A-F" or "Oaf"?
  • Something else? I'm open to anything, we just need a way to consistently talk about it.

I think the default will just be OAF unless we come up with something better.

Omega: How should we handle attestations of work completed?

For Omega, we're targeting the top 10,000 projects, using tooling (Omega Analysis Toolchain, etc.) and triage.

We need to provide some evidence that work was completed, both for internal tracking ("did we already look at X?") and external assurance ("has X been reviewed and believed to be safe to use?").

Let's use this thread to discuss how we can do this.

Some preliminaries:

We'll want to assert that some activity took place. That activity could be a tool execution (with some result), a manual action ("reviewing a thing"), or some combination of the two. For either, the target could be a physical artifact (foo.tar.gz) or something else, like a GitHub repo (like Scorecard results).

In-toto attestation seems to be a reasonable vehicle for providing attestations, but delegates the actual predicate content to the user (us) -- though I haven't read the specs in a while, so I might be wrong here.

Some questions:

  • How granular should the attestations be?
    • Tool X did not find SQL injection in subject Y?
    • Tool X did not find any critical issues in subject Y?
    • Tool X did not find any critical issues in artifact Z associated with subject Y?
  • What is the appropriate subject? In-toto has opinions on this being a physical artifact (with a hash). What does that mean in the context of: source code in git repo, source archive on Debian, patches applied, compiled binaries for N different architectures? Are (transitive) dependencies included in the mix at all?
  • How much information should we provide? Give too little, and end-users need to trust the Omega team, perhaps more than they want to. Give too much, and we risk 0-daying the community.
    • Option A: Only provide attestations of "no issues found" or "issues were found and fixed in version X"
    • Option B: Only provide aggregated results, clearly specified: (4 potential issues found [untriaged]). Problem there is that it's a trivial lift to go and download and re-run the toolchain to find the details of those 4 potential issues. On the other hand, the tools are public anyway, anyone could run them at any time.
  • What's the relationship of PackageURL with this?

I came up with a slightly Frankensteined attestation format to see what this could look like. I'll push up the code that generates it shortly.

{
  "_type": "https://github.com/ossf/alpha-omega/omega-analysis-toolchain/Statement/v0.1",
  "_comment": "Generated by the Omega Analysis Toolchain",
  "subject": [
    {
      "type": "https://github.com/ossf/alpha-omega/omega-analysis-toolchain/Types/PackageURL/v0.1",
      "purl": "pkg:npm/[email protected]",
      "digest": {
        "alg": "sha256",
        "value": "hwwP4QliI6WNT4gy0Ip+ZR6i/K245od7L9wmtmLUgd0="
      },
      "filename": "[email protected]"
    }
  ],
  "predicateType": "https://github.com/ossf/alpha-omega/omega-analysis-toolchain/Predicate/v0.1",
  "predicate": {
    "review_text": "This was a lot of fun reviewing this, but I didn't find any issues."
  },
  "status": "pass",
  "signature": "MEQCIG3RfjZb/LMjpwSDvapI6TJDzLS/5moghpvWLFyHwZTkAiBJNdCrNrr+rDfaJqk3uMnRCCnKkFRegjbS2sjKyAQSYg=="
}
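
A rough sketch of how a statement like this could be assembled and signed, assuming an ECDSA private key and a SHA-256 digest of the artifact; the field names mirror the example above, but the helper and the signing scheme are illustrative rather than the actual toolchain code:

import base64
import hashlib
import json
from cryptography.hazmat.primitives import hashes, serialization
from cryptography.hazmat.primitives.asymmetric import ec

def build_statement(purl, filename, review_text, private_key_pem):
    """Build and sign an attestation in the format shown above."""
    with open(filename, "rb") as f:
        digest = hashlib.sha256(f.read()).digest()
    statement = {
        "_type": "https://github.com/ossf/alpha-omega/omega-analysis-toolchain/Statement/v0.1",
        "subject": [{
            "type": "https://github.com/ossf/alpha-omega/omega-analysis-toolchain/Types/PackageURL/v0.1",
            "purl": purl,
            "digest": {"alg": "sha256", "value": base64.b64encode(digest).decode()},
            "filename": filename,
        }],
        "predicateType": "https://github.com/ossf/alpha-omega/omega-analysis-toolchain/Predicate/v0.1",
        "predicate": {"review_text": review_text},
        "status": "pass",
    }
    key = serialization.load_pem_private_key(private_key_pem, password=None)
    signature = key.sign(json.dumps(statement, sort_keys=True).encode(),
                         ec.ECDSA(hashes.SHA256()))
    statement["signature"] = base64.b64encode(signature).decode()
    return statement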

Thoughts?

Assertion: Security Advisories: Normalize severity?

Should we try to normalize severity when we produce a security advisory assertion? What does "critical" mean? Or should we provide the raw data, or as much of it as we can?

I think we need to try to normalize, since the consumer of the advisory won't really know/care how it's generated, so the fields they look for in the policy need to be well-defined.
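
A sketch of one possible normalization, folding CVSS v3 base scores and common vendor labels onto a fixed set of values while keeping the raw data alongside; the score cut-offs follow the CVSS v3.1 qualitative rating scale, everything else is an assumption about how the assertion might be shaped:

SEVERITY_LABELS = {"low": "low", "moderate": "medium", "medium": "medium",
                   "important": "high", "high": "high", "critical": "critical"}

def normalize_severity(raw_label=None, cvss_score=None):
    """Return a normalized severity plus the raw inputs for transparency."""
    if cvss_score is not None:
        if cvss_score >= 9.0:
            normalized = "critical"
        elif cvss_score >= 7.0:
            normalized = "high"
        elif cvss_score >= 4.0:
            normalized = "medium"
        elif cvss_score > 0.0:
            normalized = "low"
        else:
            normalized = "none"
    else:
        normalized = SEVERITY_LABELS.get((raw_label or "").lower(), "unknown")
    return {"severity": normalized,
            "raw": {"label": raw_label, "cvss_score": cvss_score}}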

MacBook bug: Unable to locate package powershell

I get the following error even after installing PowerShell for Mac via the PowerShell installation instructions.

Resolve the error so that PowerShell is either:

  1. not needed, or
  2. an option for Windows users
# 9 20.38 E: Unable to locate package powershell
------
executor failed running [/bin/bash -c cd /tmp &&     wget -q https://packages.microsoft.com/config/ubuntu/22.04/packages-microsoft-prod.deb -O packages-microsoft-prod.deb &&     dpkg -i packages-microsoft-prod.deb &&     rm packages-microsoft-prod.deb &&     touch /etc/apt/preferences &&     echo "Package: *" >> /etc/apt/preferences &&     echo "Pin: origin \"packages.microsoft.com\"" >> /etc/apt/preferences &&     echo "Pin-Priority: 1001" >> /etc/apt/preferences &&     add-apt-repository universe &&     apt update &&     apt-get install -y dotnet-sdk-${DOTNET_VERSION} &&     apt-get install -y powershell]: exit code: 100 

Investigate why manalyze takes so long

We've noticed that occasionally, manalyze seems to take forever (perhaps literally). This issue will be used to track that. For now, we're going to add a timeout to simply stop it after a little while.
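
A sketch of the timeout, assuming the toolchain shells out to manalyze from Python; the 10-minute limit and the bare command line are placeholders:

import subprocess

def run_manalyze(target, timeout_seconds=600):
    """Run manalyze against a file, giving up after a fixed timeout."""
    try:
        return subprocess.run(["manalyze", target],
                              capture_output=True, text=True,
                              timeout=timeout_seconds)
    except subprocess.TimeoutExpired:
        # Treat a hang as "no result" rather than blocking the whole run.
        return None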

AA: Dependency View is a mess

We added a lot more policies (one for each CWE) so the columns in the dependency view are a mess:
[screenshot of the dependency view omitted]

We need to reconsider how to convey this information to humans without a 42,000 inch screen.

oss-find-source: "Error" detected when there isn't one

Some of the tools (oss-find-source, oss-detect-cryptography) emit banner information to stderr, which is incidentally how the postprocessor figures out whether there's been an error. We'll need to account for this somehow.
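
One possible fix, sketched here under the assumption that the postprocessor wraps the tools with subprocess: decide success from the exit code rather than from the mere presence of stderr output (the helper name is hypothetical):

import subprocess

def run_tool(cmd):
    """Run an analysis tool and judge success by exit code, not by stderr."""
    result = subprocess.run(cmd, capture_output=True, text=True)
    # Tools like oss-find-source print a banner to stderr even on success,
    # so stderr output alone is not evidence of an error.
    return result.returncode == 0, result.stdout, result.stderr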

Separate out javascript from html file

    Could be on the lower end of priorities, but future designs should separate the two, especially considering that the triage portal will be publicly accessible on the web. We'd need to evaluate the risk of that, given the sensitivity of the data in the triage portal.

Separate the JavaScript from the HTML files so that execution of JS happens behind the scenes. This issue will help evaluate the risk and complexity, and prioritize the refactoring of the existing code.

Originally posted by @Cyber-JiuJiteria in #55 (comment)

Add better error handling to Security Scorecard assertion generator

Target: pkg:github/madler/[email protected]

Output (snippet):

2022-12-31 09:54:06,411 ERROR oaf.py:272 (generate_assertion) - Error processing assertion: expected string or bytes-like object
Traceback (most recent call last):
  File "/opt/ssd/repos/alpha-omega/omega/oaf/omega/oaf.py", line 270, in generate_assertion
    assertion.process()
  File "/opt/ssd/repos/alpha-omega/omega/oaf/omega/assertion/assertion/securityscorecard.py", line 102, in process
    repository = find_repository(purl)
  File "/opt/ssd/repos/alpha-omega/omega/oaf/omega/assertion/utils.py", line 73, in find_repository
    return purl2url(package_url)
  File "/opt/ssd/repos/alpha-omega/omega/oaf/venv/lib/python3.10/site-packages/packageurl/contrib/purl2url.py", line 47, in get_repo_url
    return _get_url_from_router(repo_router, purl)
  File "/opt/ssd/repos/alpha-omega/omega/oaf/venv/lib/python3.10/site-packages/packageurl/contrib/purl2url.py", line 38, in _get_url_from_router
    return router.process(purl)
  File "/opt/ssd/repos/alpha-omega/omega/oaf/venv/lib/python3.10/site-packages/packageurl/contrib/route.py", line 177, in process
    endpoint = self.resolve(string)
  File "/opt/ssd/repos/alpha-omega/omega/oaf/venv/lib/python3.10/site-packages/packageurl/contrib/route.py", line 197, in resolve
    candidates = [r for r in self.route_map.values() if r.match(string)]
  File "/opt/ssd/repos/alpha-omega/omega/oaf/venv/lib/python3.10/site-packages/packageurl/contrib/route.py", line 197, in <listcomp>
    candidates = [r for r in self.route_map.values() if r.match(string)]
  File "/opt/ssd/repos/alpha-omega/omega/oaf/venv/lib/python3.10/site-packages/packageurl/contrib/route.py", line 86, in match
    return self.pattern_match(string)
TypeError: expected string or bytes-like object
2022-12-31 09:54:06,412 ERROR oaf.py:142 (parse_args_generate) - No assertion was generated.
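
A sketch of the kind of guard that could go around the repository lookup so a failed purl-to-URL resolution produces a clear log message instead of an unhandled TypeError; this is an illustration based on the traceback above, not the current utils.py code:

import logging
from packageurl import PackageURL
from packageurl.contrib import purl2url

def find_repository(purl):
    """Best-effort mapping from a PackageURL to a repository URL."""
    try:
        if isinstance(purl, PackageURL):
            purl = purl.to_string()
        if not isinstance(purl, str):
            raise ValueError(f"Unexpected subject type: {type(purl)!r}")
        return purl2url.get_repo_url(purl)
    except Exception as e:  # get_repo_url can fail on unsupported purl types
        logging.warning("Could not resolve repository for %s: %s", purl, e)
        return None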

Protect environment variables better

We need to pass some environment variables into the analysis container so that tools like Snyk can use them, or so the GitHub API can be queried.

We also install packages into the container, so we should expect that environment variables could walk away.

Some options:

  • Trivially encode, or encrypt with a static password. An attacker would have to learn that the payload came from the Omega Analyzer in order to decode it, which is probably more trouble than it's worth.
  • Squirrel away the password inside the image -- maybe the first thing the image does when starting is pull those variables out of env and put them somewhere on disk. An attacker would have to have a malware payload specifically look for it.
  • Have the parts of analysis that involve running untrusted code happen at the very end, and clear environment variables right before it, so there's nothing left to lose.

I'm leaning toward the last one (sketched below), which should be pretty straightforward. Places we run code:

  • For npm only -- CodeQL post-install - We install a module and then run CodeQL across the entire directory, to capture all transitive dependencies. As part of the analysis, CodeQL will attempt to build, running configure/make/etc.
  • For npm, pypi, and nuget -- The strace logic does an "install" (npm i, pip install, or dotnet add).
  • For npm only -- npm audit does an install first.

These already run near the end of the script; we might just need to move Snyk Code and manalyze up a bit.
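
A sketch of that env-scrubbing step: capture whatever the trusted tools needed early, then drop the secrets from the environment immediately before any step that executes untrusted package code. The variable names are placeholders, not the analyzer's actual configuration:

import os
import subprocess

SENSITIVE_VARS = ["SNYK_TOKEN", "GITHUB_TOKEN"]  # placeholder names

def scrub_environment():
    """Remove secrets from the process environment before running untrusted code."""
    for name in SENSITIVE_VARS:
        os.environ.pop(name, None)

def run_untrusted_step(cmd):
    """Run an install/build step with a minimal, secret-free environment."""
    clean_env = {k: v for k, v in os.environ.items() if k not in SENSITIVE_VARS}
    return subprocess.run(cmd, env=clean_env, capture_output=True, text=True)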

Expand export to support other package types

    This code exists to import (en masse) results generated by the Analyzer, stored in a blob store. But any type of package can be used (not just npm). I think I have npm in there for testing -- this is one of the areas that needs a bit of work.

Originally posted by @scovetta in #55 (comment)

Don't create an export directory on run failure

$ tree -d
.
├── npm
│   ├── co
│   │   └── 4.6.0
│   │       └── reference-binaries
│   └── left-pad
│       ├── 1.3.0
│       │   └── reference-binaries
│       └── 1.3.5
...
  • If a package's analysis fails, the run should not create a new directory in the container.
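
One way to get that behaviour, sketched under the assumption that the exporter controls where results are written: stage output in a temporary directory and only move it into the export tree when the run succeeds (paths and names are illustrative):

import shutil
import tempfile
from pathlib import Path

def export_results(run_analysis, export_root, package_path):
    """Create <export_root>/<package_path> only if the analysis succeeds."""
    staging = Path(tempfile.mkdtemp(prefix="omega-export-"))
    try:
        if not run_analysis(staging):         # analysis writes its files into staging
            return False                      # failure: nothing appears under export_root
        destination = Path(export_root) / package_path
        destination.parent.mkdir(parents=True, exist_ok=True)
        shutil.move(str(staging), str(destination))  # publish only on success
        return True
    finally:
        shutil.rmtree(staging, ignore_errors=True)   # no-op if staging was already moved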

Please use double-dash instead of single-dash for long name options

Most Unix/POSIX tools have an option convention of - followed by a collection of single-letter options, and -- followed by a single long-name option. Long options came from GNU but many other tools also support them. E.g.:

ls -ad
ls --all --directory

So I think:

osf generate –assertion reproducible -subject pkg:npm/[email protected] –repository assertions.openssf.org

Should be:

osf generate --assertion reproducible --subject pkg:npm/[email protected] --repository assertions.openssf.org

If this is already happening, and it's just misleading due to markdown formatting, then never mind... :-).

OAF: Improve support for running under Windows

We're currently testing on Linux, and there are a couple places where I'm sure I've made assumptions about things like path separators.

We should test everything on Windows and make fixes to make sure things work there too.

Investigate GUAC

We should dive deep into GUAC to see what kind of alignment makes sense. Some options:

  • We can emit assertions into GUAC (Neo4J).
  • We can run policies via GUAC (Neo4J).
  • We could pivot into GUAC entirely.

Discussion: CWE-based policies for Assurance Assertions

How should we represent CWEs for assertions?

Option 1 - Each CWE gets its own policy (auto-generated):
[screenshot of auto-generated per-CWE policies omitted]

Option 2 - We roll up CWEs according to our own collections, or by the CWE 'childOf' hierarchy.

Option 3 - We don't use CWEs at all.

Thoughts?

Add support for oss-reproducible to Omega Analyzer

The Omega Analyzer runs in a Docker container.

OSS Reproducible uses Docker containers itself in order to attempt to reproduce a package from source.

It would be nice to get OSS Reproducible running within the Docker container (using a nested Docker image) -- I'm not sure how easy this is these days, but it would be worth a bit of noodling.

Docker build is broken (.NET)

It looks like after .NET was added to the default Ubuntu package repository, the way that we install it and then use it to install .NET tools has broken.

There is a workaround at dotnet/core#7699.

SecurityScorecard: Multiple 'expiration' fields

It looks like we pass expiration in twice:

2022-12-31 10:28:41,224 DEBUG Running command: ['python', 'oaf.py', '--verbose', 'generate', '--expiration=2024-12-30T10:28:41.224358Z', '--assertion', 'SecurityScorecard', '--subject', 'pkg:github/madler/[email protected]', '--repository', 'dir:/opt/ssd/repos/alpha-omega/omega/oaf/omega/k8', '--signer', '', '--expiration', '2024-12-30T10:28:41.224358Z']

Design: Should evidence output be structured (JSON) or a blob of text?

We can either keep the evidence as a single escaped string, e.g.

evidence: {
  content: {
    "output": "{\"package\":{\"system\":\"NPM\",\"name\":\"left-pad\"}...

Or we can expand it out when possible:

evidence: {
  content: {
    "output": {"package":
       {
         "system":"NPM",
         "name": "left-pad"
       }...
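
Whichever direction we pick, consumers may see both shapes for a while. A tolerant reader is easy to sketch, assuming the evidence layout shown above (evidence["content"]["output"]); the helper name is hypothetical:

import json

def get_evidence_output(evidence):
    """Return tool output as a dict whether it was stored expanded or as a string blob."""
    output = evidence["content"]["output"]
    if isinstance(output, str):
        try:
            return json.loads(output)       # blob-of-text form
        except json.JSONDecodeError:
            return {"raw_text": output}     # genuinely unstructured output
    return output                           # already-expanded form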

Bug: Timestamps aren't consistent

How should we represent timestamps in assertions?

  • Local time with timezone?
  • UTC?
  • Epoch milliseconds?
  • Include time?

Not sure what the best practice is -- I'm leaning toward UTC since it'll make later filtering easier (e.g. "show me all non-expired assertions") but still keep it readable. Keep time, because why make it harder later?
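
A sketch of the UTC-with-time option, matching the ISO 8601 style already visible in the SecurityScorecard logs (e.g. 2024-12-30T10:28:41.224358Z); the helper names are illustrative:

from datetime import datetime, timezone

def utc_now():
    """Current time in UTC, serialized as ISO 8601 with a trailing 'Z'."""
    return datetime.now(timezone.utc).isoformat().replace("+00:00", "Z")

def is_expired(expiration_str):
    """Supports filters like 'show me all non-expired assertions'."""
    expiration = datetime.fromisoformat(expiration_str.replace("Z", "+00:00"))
    return expiration <= datetime.now(timezone.utc)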

Bug: Subject not updated with latest version

When we analyze a versionless subject (e.g. pkg:npm/left-pad), the subject should be updated to be the latest version (for assertions that analyze a particular version). This doesn't happen.

Example -- npm/xmldom for the SecurityAdvisory assertion.
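
A hedged sketch of how the subject could be pinned before version-specific assertions run, assuming an npm subject and the public registry's dist-tags metadata; other ecosystems would need their own lookup, and scoped npm packages may need URL-encoding:

import requests
from packageurl import PackageURL

def pin_latest_version(purl_str):
    """Return a purl with an explicit version, resolving 'latest' for npm subjects."""
    purl = PackageURL.from_string(purl_str)
    if purl.version:
        return purl_str                       # already pinned
    if purl.type == "npm":
        name = f"{purl.namespace}/{purl.name}" if purl.namespace else purl.name
        metadata = requests.get(f"https://registry.npmjs.org/{name}", timeout=30).json()
        latest = metadata["dist-tags"]["latest"]
        return PackageURL(type=purl.type, namespace=purl.namespace,
                          name=purl.name, version=latest).to_string()
    raise NotImplementedError(f"No version lookup implemented for type {purl.type!r}")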

Support local directory analysis

We should be able to run the Omega Analyzer against a directory and not only a PackageURL.

Support a mapped path (via Docker) so if e.g. /opt/input-src exists, then use that instead of grabbing PackageURLs remotely. There are a bunch of places in runtools.sh that will need to be updated to account for this.

Generic support for container types

    The source code limits this to Docker by design, yet the variable `--toolchain-container` implies a generic container. There could come a point where Docker containers are not an option for consumers and their choice is, e.g., Podman, rkt, Artifactory... and the community decides to create a new container for Assertions.

We should design this so it's generic enough to support plug-and-play of any container type. This functionality could be extracted and defined in an abstract class (file) for the Docker commands and variables, leaving the Docker execution code and default values in place for our development and execution, while providing a standard/"skeleton" for other container types. (See the sketch below.)

Originally posted by @Cyber-JiuJiteria in #31 (comment)
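
A sketch of the suggested abstraction: an abstract runner with the Docker behaviour as the default, so another runtime only has to override the parts that differ. The class names and flags below are illustrative, not the current OAF code:

import subprocess
from abc import ABC, abstractmethod

class ContainerRunner(ABC):
    """Runs the toolchain container; subclasses supply the runtime specifics."""

    def __init__(self, image):
        self.image = image

    @abstractmethod
    def build_command(self, args, volumes):
        """Return the full command line for this runtime."""

    def run(self, args, volumes=None):
        return subprocess.run(self.build_command(args, volumes or {}), check=True)

class DockerRunner(ContainerRunner):
    runtime = "docker"

    def build_command(self, args, volumes):
        cmd = [self.runtime, "run", "--rm"]
        for host_path, container_path in volumes.items():
            cmd += ["-v", f"{host_path}:{container_path}"]
        return cmd + [self.image] + list(args)

class PodmanRunner(DockerRunner):
    runtime = "podman"  # CLI-compatible with Docker for these flags

The default stays Docker for our own development and execution, and a new container type only needs a small subclass.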
