MIRAGE: Mitigating Conflict-Based Cache Attacks with a Practical Fully-Associative Design

Authors: Gururaj Saileshwar and Moinuddin Qureshi, Georgia Institute of Technology.
Appears in USENIX Security 2021.

Citation

Gururaj Saileshwar and Moinuddin Qureshi. "MIRAGE: Mitigating Conflict-Based Cache Attacks with a Practical Fully-Associative Design". In 30th USENIX Security Symposium (USENIX Security 21). 2021.

Introduction

The artifact covers two aspects of the results from the paper:

  • Security Analysis of MIRAGE: A Bins-and-Buckets model of the Last-Level Cache (LLC) implementing MIRAGE is provided as a C++ program, to quantify its security properties. This aspect can be easily evaluated on a commodity CPU (even a laptop with 4 cores / 8 threads) with a 3-6 hour runtime and no major SW dependencies. You will be able to recreate all the Security-Analysis related tables and graphs: Fig-7, Table-1, Fig-9, Fig-10, and Table-4. (A minimal sketch of the model appears after this list.)
  • Performance Analysis of MIRAGE: An implementation of the MIRAGE LLC is provided as part of the Gem5 CPU Simulator. To run the performance evaluations, a server-grade system is needed (at least 30 threads preferred) and the expected runtime is 12-24 hours. You also need access to the SPEC-2006 benchmark suite (we are unable to provide this due to its restrictive license). You will be able to recreate the performance results provided in Appendix-B.
    • Note that the main performance results in the paper were generated with a cache simulator using an Intel Pin version that is no longer publicly available. Hence, for the artifact, we provide MIRAGE implemented in Gem5 (results in Appendix-B), which is much easier to open-source and replicate. We plan to add the other results shown in the paper to the Gem5 implementation before open-sourcing it.
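
For intuition, below is a minimal, hypothetical Python sketch of such a bins-and-buckets experiment (the artifact's actual model is the C++ program under security_analysis/). It assumes two skews, one randomly chosen candidate bucket per skew, load-aware placement into the less-loaded candidate, and one random eviction per throw to keep the cache full; all names and default parameters are illustrative:

    import random

    def simulate_spills(num_buckets=4096, base_ways=8, extra_ways=2,
                        throws=1_000_000, seed=0):
        # Two skews, each with num_buckets buckets holding up to
        # base_ways + extra_ways tags; average load is base_ways per bucket.
        capacity = base_ways + extra_ways
        rng = random.Random(seed)
        occ = [[0] * num_buckets for _ in (0, 1)]   # occ[skew][bucket]
        balls = []                                  # (skew, bucket) per ball
        # Warm-up: fill the cache to its data capacity.
        while len(balls) < 2 * num_buckets * base_ways:
            c0, c1 = rng.randrange(num_buckets), rng.randrange(num_buckets)
            skew = 0 if occ[0][c0] <= occ[1][c1] else 1
            b = c0 if skew == 0 else c1
            if occ[skew][b] < capacity:
                occ[skew][b] += 1
                balls.append((skew, b))
        # Steady state: evict one random ball, then throw one new ball.
        spills = 0
        for _ in range(throws):
            i = rng.randrange(len(balls))
            skew, b = balls[i]
            occ[skew][b] -= 1
            balls[i] = balls[-1]
            balls.pop()
            c0, c1 = rng.randrange(num_buckets), rng.randrange(num_buckets)
            if occ[0][c0] >= capacity and occ[1][c1] >= capacity:
                spills += 1      # both candidates full: bucket spill
                continue         # (this sketch simply drops the spilled ball)
            skew = 0 if occ[0][c0] <= occ[1][c1] else 1
            b = c0 if skew == 0 else c1
            occ[skew][b] += 1
            balls.append((skew, b))
        return spills / throws

    if __name__ == "__main__":
        print(f"spill rate per throw: {simulate_spills():.2e}")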

Requirements for Security Evaluation

  • SW Dependencies: a C++ compiler, Python 3, Jupyter Notebook, and the Python 3 packages pandas, matplotlib, and seaborn.

  • HW Dependencies:

    • An 8-core CPU desktop/laptop can simulate 10 Billion cache fills in 1-2 hours (the default for artifact evaluation).
    • Note that the run scripts spawn 8 parallel threads by default. If your system supports fewer than 8 threads, please modify security_analysis/results/base/run_base.sh.

Steps for Security Evaluation

Here you can recreate all the Security-Analysis related tables and graphs (Fig-7, Table-1, Fig-9, Fig-10, Table-4) by following these instructions:

  • Compile the binaries: cd security_analysis; make all
  • Run the experiments: ./run_exp.sh. This will run the following scripts for all experiments:
    • Base experiments: cd results/base; ./run_base.sh.
      • This runs the default base configuration of a 16-Way LLC, i.e., 8 Base-Ways-Per-Skew.
      • This spawns 6 parallel experiments for Extra-Ways-Per-Skew = 1 to 6 (if your system cannot support 6 threads, please modify ./run_base.sh).
      • Each experiment defaults to 10 Billion Ball Throws (the default for the artifact evaluation). This can be controlled with arguments as ./run_base.sh <NUM_BILLION_THROWS> <NUM_EXP>, to execute NUM_BILLION_THROWS x NUM_EXP ball throws. We used 500 Billion x 20 to simulate 10 Trillion Ball Throws for the paper, but using 10 Billion provides results of a similar order of magnitude.
    • Sensitivity experiments: cd results/sensitivity; ./run_sensitivity.sh.
      • This runs the evaluations for 8-Way and 32-Way LLCs (4 and 16 Base-Ways-Per-Skew).
      • Only 10 Billion Ball Throws are simulated in these experiments.
  • Visualize the results: jupyter notebook results/visualize_results.ipynb. This will plot the following:
    • Fig-7: Bucket-Spill-Frequency as Bucket-Capacity (Ways-Per-Skew) changes. This comes directly from the simulation results.
    • Fig-9, Fig-10: Empirical and Analytical Bucket-Probabilities and Bucket-Spill-Frequency. The empirical results come directly from the simulations. The analytical values are calculated by plugging the Bucket-Probability(0) from the experiments into the equations in Sections 4.3 and 4.4 of the paper.
    • Table-1: Taken directly from Fig-10.
    • Table-4: Bucket-Spill-Frequency as LLC-Associativity varies. This uses the same analysis as Fig-10, except with values from the sensitivity experiments.
    • (Note: Results may not identically match the paper's, as only 10 Billion cache-fill simulations are performed here, while the paper used 10 Trillion cache fills. However, the results should be of a similar order of magnitude. See the sketch after this list for interpreting spill rates.)
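
To help interpret a measured per-fill spill rate as an expected time between spills, here is an illustrative back-of-envelope conversion. The input values below are placeholders, not results from the paper or this artifact:

    # Convert a per-fill spill probability into an expected time between spills.
    # All inputs are hypothetical placeholders.
    spill_prob_per_fill = 1e-30          # measured/extrapolated spill probability
    fills_per_second    = 1e9            # assumed LLC fill rate
    seconds_per_year    = 3600 * 24 * 365

    years_between_spills = 1.0 / (spill_prob_per_fill
                                  * fills_per_second
                                  * seconds_per_year)
    print(f"~{years_between_spills:.1e} years between bucket spills")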

Requirements for Performance Evaluation in the Gem5 CPU Simulator

  • SW Dependencies: Gem5 dependencies: gcc, Python 2.7, and scons-3.
    • Tested with gcc v4.8.5/v6.4.0 and scons-3.1.2.
    • To install scons-3.1.2 from its release tarball: tar -zxvf scons-3.1.2.tar.gz and cd scons-3.1.2; python setup.py install (use --prefix=<PATH> for a local install).
  • Benchmark Dependencies: SPEC-2006 installed.
  • HW Dependencies:
    • A system with 15 or more CPU cores, to finish the experiments in ~6 hours.
    • A 4-core system may require approximately 1-1.5 days.

Steps for Gem5 Evaluation

Here you will recreate the results in Appendix-B (Fig-15) by executing the following steps:

  • Compile Gem5: cd perf_analysis/gem5; scons -j50 build/X86/gem5.opt (set -j to the number of parallel build threads your system supports).
  • Set Paths in scripts/env.sh. You will set the following:
    • GEM5_PATH: the full path of the gem5 directory (the current directory).
    • SPEC_PATH: the path to your SPEC-CPU2006 installation.
    • CKPT_PATH: the path to a new folder where the checkpoints will be created.
    • Please source the paths with source scripts/env.sh after modifying the file.
  • Test Creating and Running Checkpoints: For each program, we need to create a checkpoint of the program state after its initialization phase is complete; this checkpoint is then used to run the simulations with different hardware configurations.
    • To test the checkpointing process, run cd scripts; ./ckptscript_test.sh perlbench 4. This creates a checkpoint after 100K instructions (it should complete in a couple of minutes). Once it completes, run ./runscript_test.sh perlbench Test Baseline 4 8MB 3. This runs the baseline design for 500K instructions from the checkpoint.
      • If ckptscript_test.sh fails with the error $SPEC_PATH/benchspec/CPU2006/400.perlbench/run/run_base_ref_amd64-m64-gcc41-nn.0000: No such file or directory, the script is unable to find the run-directory for perlbench. Please follow the steps outlined in README_SPEC_INSTALLATION.md to ensure the run-directories are properly set up for all the SPEC benchmarks.
    • To check whether the run completed successfully, run less ../output/multiprogram_8Gmem_100K.C4/Test/Baseline/perlbench/runscript.log. The last line should contain Exiting .. because a thread reached the max instruction count.
  • Run All Experiments: For all the benchmarks, run ./run_all_exp.sh. This will run the following scripts:
    • ./run.perf.4C.sh - This creates checkpoints and runs the experiments for the performance results with an 8MB LLC (shared among 4 cores). Specifically, it runs:
      • Create Checkpoint: For each benchmark, the checkpoints are created using ./ckptscript.sh <BMARK> 4.
        • By default, ckptscript.sh is run for 42 programs in parallel (14 single-program, 14 multi-core, and 14 mixed workloads). Please modify run.perf.4C.sh if your system cannot support 28-42 parallel threads.
        • For each program, execution is fast-forwarded by 10 Billion instructions (by which point the program's initialization should have completed) and the architectural state (e.g., registers, memory) is then checkpointed. Subsequently, when each HW config is simulated, these checkpoints are reloaded.
        • This process can take 12 hours per benchmark; hence, all the benchmarks are run in parallel by default.
        • Please see ../configs/example/spec06_config.py for the list of supported benchmarks.
      • Run experiments: Once all the checkpoints are created, the experiments are run using ./runscript.sh <BMARK> <RUN-NAME> <SCHEME>, where each HW config (Baseline, Scatter-Cache, MIRAGE) is simulated for each benchmark.
        • The arguments for runscript.sh are as follows:
          • RUN-NAME: Any string used to identify this run; it also names the results folder for the run.
          • SCHEME: One of [Baseline, scatter-cache, skew-vway-rand] (skew-vway-rand is MIRAGE).
          • NUM_CORES: Number of cores (default is 4).
          • LLCSZ: Size of the LLC (default is 8MB).
          • ENCRLAT: Encryptor latency (default is 3 cycles).
        • Each program is simulated for 1 Billion instructions. This takes ~8 hours per benchmark, per scheme. Benchmarks in 2-3 schemes are run in parallel, for a total of up to 84 parallel Gem5 runs at a time (please modify run.perf.4C.sh if your system cannot support up to 80 parallel threads).
      • Generate results: cd stats_scripts; ./data_perf.sh. This compares the normalized performance (using the weighted-speedup metric) vs. the baseline; a sketch of this metric appears after these steps.
        • The normalized performance results are stored in stats_scripts/data/perf.stat.
        • A script to collect the LLC misses per thousand instructions (MPKI) for each scheme is also available at stats_scripts/data_mpki.sh.
    • ./run.sensitivity.cachesz.sh - This runs the evaluations for sensitivity to LLC size, from 2MB to 64MB (shared among 4 cores).
      • Experiments are run using the script ./runscript.sh.
      • Results for normalized performance vs. LLC size can be generated using cd stats_scripts; ./data_LLCSz.sh.
      • Results are stored in stats_scripts/data/perf.LLCSz.stat.
    • ./run.sensitivity.encrlat.sh - This runs the evaluations for encryption latencies (used in cache indexing) from 1 to 5 cycles.
      • Experiments are run using the script ./runscript.sh.
      • Results for normalized performance vs. encryption latency can be generated using cd stats_scripts; ./data_EncrLat.sh.
      • Results are stored in stats_scripts/data/perf.EncLat.stat.
  • Visualize the results: Graphs for Performance, LLC-Size vs. Performance, and Encryption-Latency vs. Performance can be generated using jupyter notebook graphs/plot_graphs.ipynb.
  • Note on Simulation Time: Running all the experiments takes 3-4 days on a system supporting 72 threads.
    • To shorten the experiment run time, you may reduce the instruction count in runscript.sh to 500 Million.
    • You can also run only ./run.perf.4C.sh and skip the sensitivity analyses.
    • You can also run many more parallel Gem5 simulations, if your system supports them, by modifying the sleep-loops in run.perf.4C.sh and run.sensitivity.*.sh.
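
The weighted-speedup metric used by the results scripts is standard in multi-program studies. Below is a minimal, illustrative Python sketch of the computation, assuming per-core IPCs have already been extracted from the gem5 stats; the function name and IPC values are hypothetical and do not reflect the actual format of data_perf.sh:

    def weighted_speedup(ipc_shared, ipc_alone):
        # Weighted speedup of one multi-program run:
        # sum over cores of IPC_i(shared) / IPC_i(alone).
        return sum(s / a for s, a in zip(ipc_shared, ipc_alone))

    # Normalized performance of MIRAGE vs. the Baseline (illustrative IPCs):
    ipc_alone  = [1.20, 0.80, 1.50, 0.95]   # each program run alone
    ipc_base   = [1.00, 0.70, 1.30, 0.85]   # 4-core Baseline run
    ipc_mirage = [0.98, 0.69, 1.28, 0.84]   # 4-core MIRAGE run

    norm_perf = (weighted_speedup(ipc_mirage, ipc_alone)
                 / weighted_speedup(ipc_base, ipc_alone))
    print(f"Normalized performance: {norm_perf:.3f}")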
