Provenance Stats

This framework contains code that helps compare different provenance auditing methods. More specifically, this repository is a collection of system setup, test generation, and test stats generation scripts for instrumentation trade-off and comparison work at SRI International.

This framework uses SPADE as the provenance collection tool. The comparison matrix includes both compile-time and run-time instrumentation. For compile-time instrumentation, we use the LLVM-based code injection tool that ships with SPADE. For run-time instrumentation, we use two methods: syscall-level provenance tracking using Strace (shipped along with SPADE) and run-time data tracking using the dtracker tool. The tests are performed on GNU coreutils.

To study application loading time, we have added a null program to coreutils. The C source code for the null program is int main(){return 0;}
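
As a rough illustration of why a null program isolates loading time, the sketch below times repeated launches of a command and reports the mean and standard deviation. It is not the repository's measurement code: the function name time_command, the use of the sample standard deviation, and the Python interpreter standing in for the null binary are all assumptions made for this example.

```python
import statistics
import subprocess
import sys
import time


def time_command(argv, runs=5):
    """Run argv `runs` times and return (mean, stddev) of wall-clock seconds."""
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        subprocess.run(argv, check=True, stdout=subprocess.DEVNULL)
        samples.append(time.perf_counter() - start)
    # The sample standard deviation needs at least two data points.
    stddev = statistics.stdev(samples) if len(samples) > 1 else 0.0
    return statistics.mean(samples), stddev


if __name__ == "__main__":
    # Stand-in for the null program: an interpreter that exits immediately.
    mean, stddev = time_command([sys.executable, "-c", "pass"])
    print(f"mean={mean:.4f}s stddev={stddev:.4f}s")
```

Because the command does no work of its own, whatever time is measured comes from process creation and loading, which is the quantity the null program is meant to expose.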

This repository performs these three tasks:

  • Prepares the system for running tests. It uses Ansible [http://docs.ansible.com/] to automate the process of installing all required packages and their dependencies. It also downloads and builds SPADE, dtracker and other tools.
  • Generates the scripts that are used to run each specific test. A config file for parameterizing test generation is also provided.
  • Collects stats by checking the log files created while running the tests and generates a stats file.

This has been tested only on Ubuntu 14.04 LTS (32-bit) with LLVM version 3.6 and GNU coreutils 8.24.

The layout of this repository is as follows:

  • setup: Scripts related to setting up the system to run tests.
  • setup/ubuntu.yml: Ansible playbook that sets up the development environment for SPADE.
  • setup/ubuntu.sh: Automates the install process by invoking Ansible. This script also builds all required tools and software.
  • setup/localhost: Required for Ansible.
  • setup/buildscripts/*: Scripts for building dtracker, SPADE and coreutils (including the instrumented version).
  • testgen: Scripts related to test generation.
  • testgen/config.py: Config file for parameterizing test generation.
  • testgen/mktests.py: Script that generates test scripts.
  • testgen/mkstats.py: Script to generate stats (a CSV file) from the test.

1. Install Prerequisites

Only Ansible and git are required. You can install them on Ubuntu by executing these commands:

sudo add-apt-repository ppa:ansible/ansible
sudo apt-get update
sudo apt-get install ansible git
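
Before moving on, it can help to confirm that both tools are actually on the PATH. The following is a minimal sketch, not part of the repository; the helper name missing_tools is made up for this example.

```python
import shutil


def missing_tools(tools=("ansible", "git")):
    """Return the tools from `tools` that cannot be found on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Missing prerequisites:", ", ".join(missing))
    else:
        print("ansible and git are both available.")
```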

2. Setup Machine

Set up the machine using this command:

source <(curl -s https://raw.githubusercontent.com/hasanatkazmi/provenance-stats/master/setup/ubuntu.sh)

This installs all required packages using apt-get and/or directly from the provider. It also downloads and builds all of the provenance-auditing tools used. This step takes considerable time. For debug commands, read the top of setup/ubuntu.yml.

For the Strace reporter to report correctly, set /proc/sys/kernel/yama/ptrace_scope to 0.
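
To check the current setting before running any tests, a small script can read the file. This is a minimal sketch rather than anything shipped with the repository; the function name read_ptrace_scope is an assumption for illustration.

```python
from pathlib import Path

PTRACE_SCOPE = Path("/proc/sys/kernel/yama/ptrace_scope")


def read_ptrace_scope(path=PTRACE_SCOPE):
    """Return the integer ptrace_scope value, or None if the file is absent."""
    try:
        return int(Path(path).read_text().strip())
    except FileNotFoundError:
        return None


if __name__ == "__main__":
    value = read_ptrace_scope()
    if value is None:
        print("Yama ptrace_scope is not available on this kernel.")
    elif value == 0:
        print("ptrace_scope is 0, as the Strace reporter requires.")
    else:
        print(f"ptrace_scope is {value}; set it to 0 (requires root).")
```

Changing the value requires root, and a value written under /proc does not persist across reboots.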

3. Configure a test

Edit testgen/config.py to tailor the test to your needs. The file documents and explains each variable.

4. Generate a test

Execute testgen/mktests.py to create a test directory. The default test directory is named test and is located at the root of the repository.

5. Start SPADE

SPADE will be at provenance-stats/staging/SPADE. From that directory, run ./bin/spade start (or ./bin/spade debug) to start SPADE.

6. Run a Test

Execute <test>/run_all.sh to run all tests. You can also run the tests for an individual reporter by running <test>/<reporter>/run_all_<reporter>.sh. Furthermore, you can run the test for an individual utility by executing <test>/<reporter>/<util>/run_util.sh.
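
The three levels of run scripts follow a regular path pattern, which the sketch below encodes. The helper run_script is hypothetical and only builds paths; it does not execute anything, and the reporter and utility names in the usage lines are placeholders rather than values taken from the repository.

```python
from pathlib import Path


def run_script(test_dir, reporter=None, util=None):
    """Return the run script for the whole test, one reporter, or one utility."""
    test_dir = Path(test_dir)
    if reporter is None:
        if util is not None:
            raise ValueError("a utility can only be selected together with a reporter")
        return test_dir / "run_all.sh"
    if util is None:
        return test_dir / reporter / f"run_all_{reporter}.sh"
    return test_dir / reporter / util / "run_util.sh"


if __name__ == "__main__":
    # "strace" and "ls" are placeholder names used only for illustration.
    print(run_script("test"))
    print(run_script("test", "strace"))
    print(run_script("test", "strace", "ls"))
```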

7. Stats generation

Run testgen/mkstats.py to generate a CSV file of test stats. By default, the CSV file is written to <test>/stats.csv. Each column in the CSV file is explained here:

  • reporter: Type of instrumentation that was performed.
  • util: Specific coreutils utility that was used in the test.
  • time_avg: Average time taken across multiple runs of the utility, in seconds.
  • time_stddev: Standard deviation of the time across multiple test runs, in seconds.
  • vertices_avg: Average vertex count of the provenance graph.
  • edges_avg: Average edge count of the provenance graph.
  • timedout_count: Number of times util was killed because of timeout.
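
As an example of consuming the stats file, the sketch below averages time_avg per reporter. It relies only on the column names listed above; the function name mean_time_by_reporter, the reporter names, and the sample rows are made up for illustration and do not come from a real run.

```python
import csv
import io
from collections import defaultdict


def mean_time_by_reporter(csv_file):
    """Average the time_avg column over all utilities, grouped by reporter."""
    times = defaultdict(list)
    for row in csv.DictReader(csv_file):
        times[row["reporter"]].append(float(row["time_avg"]))
    return {reporter: sum(vals) / len(vals) for reporter, vals in times.items()}


if __name__ == "__main__":
    # Made-up rows for illustration only; real data comes from <test>/stats.csv.
    sample = io.StringIO(
        "reporter,util,time_avg,time_stddev,vertices_avg,edges_avg,timedout_count\n"
        "baseline,ls,0.5,0.01,0,0,0\n"
        "baseline,cat,1.5,0.02,0,0,0\n"
        "strace,ls,2.0,0.10,12,20,0\n"
    )
    print(mean_time_by_reporter(sample))
```

To use it on a real run, pass an open file instead of the in-memory sample, for instance the handle returned by open("test/stats.csv", newline="").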
