invoke-evasion's Introduction

Invoke-Evasion

This repository contains the datasets, Jupyter notebooks, and machine learning models that accompany the "Learning Machine Learning" series of blog posts.

Structure

./notebooks/

  • Feature Selection.ipynb - code for performing the various types of feature selection
  • LogisticRegression.ipynb - training a tuned Logistic Regression model on the augmented obfuscated PowerShell dataset
  • TreeModels.ipynb - training various tree ensemble models on the augmented obfuscated PowerShell dataset
  • NeuralNetworks.ipynb - training various Neural Network models on the augmented obfuscated PowerShell dataset
  • WhiteBox.ipynb - white box attacks against the trained Logistic Regression and LightGBM Classifier
  • WhiteBox-NeutalNetwork.ipynb - white box attacks against the trained Neural Network
  • BlackBox.ipynb - black box attacks against the trained models
  • BlackBox-Model3.ipynb - optimization attacks against model 3, the trained Neural Network

./models/

  • tuned_ridge.bin - Pickled tuned L2 (Ridge) regularized Logistic Regression model pipeline trained on the augmented obfuscated PowerShell dataset
  • tuned_lgbm.bin - Pickled tuned LightGBM classifier model trained on the augmented obfuscated PowerShell dataset
  • ./neural_network/ - Saved model weights for a 4-layer, 192-neuron Neural Network with a dropout of 0.5
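
The two `.bin` files above are described as pickled models, so loading one could look like the minimal sketch below. It assumes the files were written with Python's standard `pickle` module (the README does not say whether `pickle` or another serializer was used) and that the libraries the models depend on (scikit-learn, LightGBM) are installed in compatible versions. The helper name `load_pickled_model` is hypothetical. Unpickling executes code embedded in the file, so only load files from a source you trust.

```python
import pickle
from pathlib import Path


def load_pickled_model(model_path):
    """Load a pickled object from disk (hypothetical helper).

    Assumption: the file was produced with the standard pickle module.
    Only call this on files you trust, since unpickling can run code.
    """
    with Path(model_path).open("rb") as handle:
        return pickle.load(handle)
```

A call such as `load_pickled_model("./models/tuned_ridge.bin")` would then return whatever object was pickled; its exact type and methods depend on how the notebooks saved it.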

./datasets/

  • PowerShellCorpus.ast.csv.7z - compressed CSV of AST features extracted from an augmented PowerShell corpus dataset of 14,702 samples
  • BlackBoxData.ast.csv.7z - compressed CSV of AST features extracted from a subset of the PowerShell corpus (3,000 samples)
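
Once an archive has been decompressed with a 7-Zip tool, the resulting CSV can be read with the standard library. The sketch below is illustrative only: the column names `Label` and `Path` are assumptions rather than names confirmed by this README, the helper `load_feature_rows` is hypothetical, and every remaining column is assumed to be numeric.

```python
import csv
from pathlib import Path


def load_feature_rows(csv_path, label_column="Label", drop_columns=()):
    """Split an extracted feature CSV into feature names, rows, and labels.

    Assumptions: label_column and drop_columns are placeholder names, and
    all other columns hold numeric values.
    """
    with Path(csv_path).open(newline="", encoding="utf-8") as handle:
        reader = csv.DictReader(handle)
        if reader.fieldnames is None or label_column not in reader.fieldnames:
            raise ValueError(f"column {label_column!r} not found in {csv_path}")
        excluded = {label_column, *drop_columns}
        feature_names = [n for n in reader.fieldnames if n not in excluded]
        rows, labels = [], []
        for record in reader:
            # Convert each feature value to float; labels stay as strings.
            rows.append([float(record[name]) for name in feature_names])
            labels.append(record[label_column])
    return feature_names, rows, labels
```

Check the header row of the actual file first and adjust `label_column` and `drop_columns` to match it.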

./PS-AST/

  • C# project that integrates the checks from Revoke-Obfuscation (by Daniel Bohannon & Lee Holmes, Apache License 2.0) for AST file generation. It also contains SplitScriptFunctions, which writes every function in a script to a separate file and is used for data augmentation.

./samples/

  • Various adversarial samples generated by white/black box evasion methods

invoke-evasion's People

Contributors

harmj0y

invoke-evasion's Issues

Extract the AST features from script with python

Thank you for this excellent repository and blog post.

I see that you used a dataset that already contains the AST features. How can I extract these features in my own ML pipeline? Can the AST features be extracted from the PowerShell script alone, or do we need the event logs to obtain them? Is there any way to use the Python ast package to get the same features?

I am an experienced ML engineer but I am new to PowerShell. I am interested in creating an end-to-end pipeline for preprocessing a raw PowerShell script, extracting AST features, selecting useful features, and finally feeding them to my model. Can you assist me with the "extract AST features" part, please?
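
Some context on the question: Python's `ast` module parses Python source only, so it cannot parse PowerShell. PowerShell exposes its own parser in the `System.Management.Automation.Language` namespace, and it works on script text alone, with no event logs involved. The sketch below calls that parser from Python through a subprocess and counts AST node types. It is not the feature set used for the datasets in this repository (those come from the C# project in ./PS-AST/ with the Revoke-Obfuscation checks). It assumes a `pwsh` executable is on the PATH; the helper names and the `PS_AST_SCRIPT_PATH` environment variable are hypothetical.

```python
import os
import subprocess
from collections import Counter

# PowerShell snippet: parse the file named in an environment variable and
# print the .NET type name of every AST node, one per line.
PS_SNIPPET = (
    "$ast = [System.Management.Automation.Language.Parser]::"
    "ParseFile($env:PS_AST_SCRIPT_PATH, [ref]$null, [ref]$null); "
    "$ast.FindAll({ $true }, $true) | ForEach-Object { $_.GetType().Name }"
)


def count_node_types(type_names):
    """Count non-empty AST node type names."""
    return Counter(name.strip() for name in type_names if name.strip())


def extract_ast_node_counts(script_path, powershell_exe="pwsh"):
    """Run PowerShell's parser on a script and count its AST node types.

    Assumption: powershell_exe names an installed PowerShell executable.
    """
    env = dict(os.environ, PS_AST_SCRIPT_PATH=os.path.abspath(script_path))
    result = subprocess.run(
        [powershell_exe, "-NoProfile", "-NonInteractive", "-Command", PS_SNIPPET],
        env=env,
        capture_output=True,
        text=True,
        check=True,
    )
    return count_node_types(result.stdout.splitlines())
```

The parser only reads the script and does not execute it. Node-type counts are a small starting point; reproducing the richer features in the provided CSVs would require running the ./PS-AST/ project itself.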
