ossmalware's Introduction

ossmalware

Attempts to use dynamic analysis to find malware hosted on package managers.

Current Status

This is currently a personal project, so getting set up isn't streamlined yet. I'll be working to improve this shortly.

Getting Started

First, change the variables in the terraform/ directory to point to an S3 bucket and SQS queue you control.

Then, you'll need to create an EC2 instance with permission to write to S3 and read from SQS.

Once you're SSH'd into that EC2 instance, run the scripts/setup.sh script in this repository to bootstrap the host. This downloads the various Docker images and tooling (like sysdig and tcpdump) that you'll need during analysis.

Next, adjust the environment variables in the scripts/start.sh script in this repository, then run it to start listening for packages on the SQS queue.

At this point, you can upload packages to SQS and the worker(s) will start processing them.
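
As a rough illustration of that last step, here is a minimal Python sketch of queueing one package for analysis. The README does not document the message format the workers expect, so the JSON field names (ecosystem, name, version) and the helper name enqueue_package are assumptions made for this example, not the project's actual schema. The only real API relied on is the boto3 SQS client's send_message(QueueUrl=..., MessageBody=...) call.

```python
import json


def enqueue_package(sqs_client, queue_url, ecosystem, name, version):
    """Send one package-analysis request to the queue and return the body sent.

    The JSON field names below are hypothetical; match them to whatever
    the worker started by scripts/start.sh actually expects.
    """
    body = json.dumps(
        {"ecosystem": ecosystem, "name": name, "version": version},
        sort_keys=True,
    )
    # send_message(QueueUrl=..., MessageBody=...) is the boto3 SQS client call.
    sqs_client.send_message(QueueUrl=queue_url, MessageBody=body)
    return body
```

With boto3 installed and AWS credentials configured, the client would be created with boto3.client("sqs") and passed in together with the URL of the queue you set in the terraform/ directory. Taking the client as a parameter keeps the helper easy to exercise with a stand-in object.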

ossmalware's People

Contributors

jordan-wright

ossmalware's Issues

Make Processing More Resilient

For some reason, hosts would occasionally have defunct sysdig processes lying around. More generally, I saw occasional package-processing errors stall the workers entirely.

If we want this to be a long-term solution, we should clean up the processing workflow so that it fails gracefully wherever possible.
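
One possible direction, sketched here in Python, is to run each analysis step under a hard timeout and to kill and reap the whole process group when it overruns, so that a hung step cannot leave stray or defunct children behind. This is not how the project currently works and it does not explain the sysdig behavior described above. The helper name run_with_timeout is hypothetical, and the sketch assumes a POSIX host.

```python
import os
import signal
import subprocess


def run_with_timeout(argv, timeout_seconds):
    """Run argv in its own process group; kill the whole group on timeout.

    Returns the exit code, or None if the command had to be killed.
    """
    # start_new_session=True puts the child in a new session and process
    # group, so helpers it spawns can be signalled together (POSIX only).
    proc = subprocess.Popen(argv, start_new_session=True)
    try:
        return proc.wait(timeout=timeout_seconds)
    except subprocess.TimeoutExpired:
        try:
            os.killpg(proc.pid, signal.SIGKILL)
        except ProcessLookupError:
            pass  # the group exited between the timeout and the kill
        proc.wait()  # reap the child so it does not linger as a zombie
        return None
```

A worker could treat a None result as a failed package, record it, and move on to the next message instead of stalling.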

Move to Google Cloud

Right now, this system runs on AWS, which was fine for the initial experiment. As this project moves under the umbrella of the OpenSSF, we'll want to move it to GCP to better provide funding and support.

This involves updating the service components to support GCP (ref #8) as well as updating our Terraform configs.

Move from EC2 to a container-based workload

Right now we need to maintain EC2 worker instances for processing, which isn't ideal: errors that occur during processing can propagate and cause the entire host to stall.

Ideally, we would move to a workload-based solution like Fargate so that each package is installed in a totally isolated environment. Most such services seem to support SYS_PTRACE, which is required for sysdig to work. It'll just be a matter of figuring out how to make it work.

Add Worker Bootstrapping to Terraform

Right now, EC2 instances are provisioned manually instead of with Terraform. This is due to my inexperience with Terraform, but I'd like to take the time to change this.

I'm also OK with holding off on this until we move to a workload-based solution.

Make Terraform Configurable

Right now, the S3 and SQS names are hard-coded. We should make these proper variables.

I'm not a Terraform expert, so I'm not sure what else we can do to make this more robust. Let me know if you have ideas!

License?

Hey,

Any chance you could add a license to this repo? Many employers (including mine) only allow us to contribute to repos with an allowed license.

Automatically Bootstrap Workers

While we're still using persistent hosts (EC2 instances) as workers instead of something like Fargate, we should make the setup of workers easy.

Right now, users have to run setup.sh and start.sh manually. Ideally, we could bootstrap both of these from the EC2 user data so that new hosts are set up and start running automatically.
