Malware (a portmanteau for malicious software) is any software intentionally designed to cause damage to a computer, server, client, or computer network. - wiki
Malware detection systems use machine learning models (both in antivirus software and in the cloud) that analyze static features such as imported DLLs and API calls to detect malware.
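As a rough illustration of what "API features" can look like to a detector, the sketch below encodes a sample as a boolean vector over a fixed API vocabulary. The vocabulary `API_VOCAB` and the helper `to_features` are hypothetical names chosen for this example; the actual feature set used by this project is not specified here.

```python
from typing import List

# Hypothetical API vocabulary, for illustration only.
API_VOCAB = [
    "CreateFile",
    "WriteFile",
    "RegSetValue",
    "CreateRemoteThread",
    "MessageBox",
    "GetSystemTime",
]


def to_features(api_calls: List[str]) -> List[int]:
    """Return a 0/1 vector: 1 if the sample uses the API at that position."""
    present = set(api_calls)
    return [1 if name in present else 0 for name in API_VOCAB]


sample = to_features(["WriteFile", "CreateRemoteThread"])
print(sample)  # [0, 1, 0, 1, 0, 0]
```

APIs outside the vocabulary are simply ignored, so every sample maps to a vector of the same length, which is what a fixed-input classifier needs.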
But these models can be fooled by adversarial attacks, that is, by generating malware samples that look benign to the machine learning model.
That's where a GAN becomes useful.
A generative adversarial network is a class of machine learning frameworks designed by Ian Goodfellow and his colleagues in 2014. Two neural networks contest with each other in a game. Given a training set, this technique learns to generate new data with the same statistics as the training set. - wiki
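The "game" in the definition above is usually written as the minimax objective from the original 2014 GAN paper, where $G$ is the generator, $D$ is the discriminator, $p_{\text{data}}$ is the data distribution, and $p_z$ is the noise prior. It is given here only as background; the losses used by this project may differ.

```latex
\min_G \max_D \;
  \mathbb{E}_{x \sim p_{\text{data}}}\bigl[\log D(x)\bigr]
  + \mathbb{E}_{z \sim p_z}\bigl[\log\bigl(1 - D(G(z))\bigr)\bigr]
```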
- A generative adversarial network (GAN) based algorithm can generate adversarial malware examples that bypass black-box, machine-learning-based detection models.
In an ideal lab scenario, we have access to the machine learning model that detects malware (say, an MLP classifier that analyzes boolean API features and makes a prediction). If we have full, direct access to the detection model while training the GAN, we can use the MLP's predictions during optimization to produce robust adversarial examples. This is the white-box setup.
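The white-box idea can be sketched in a few lines of plain Python. This is a deliberately simplified stand-in, not this project's implementation: a logistic model with made-up weights `W` and `B` replaces the MLP, and a per-feature parameter vector `theta` replaces a generator network. The sketch also assumes that features may only be added, never removed, so that the original behaviour of the malware is kept; that constraint is an assumption of this example.

```python
import math
from typing import List


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


# Hypothetical white-box detector (logistic model standing in for the MLP).
W = [2.0, 1.5, -1.0, -2.0, 0.5, -1.5]
B = -0.5


def detector(x: List[float]) -> float:
    """Probability that the feature vector x is malware."""
    return sigmoid(sum(w * xi for w, xi in zip(W, x)) + B)


def train_generator(malware: List[int], steps: int = 200, lr: float = 1.0) -> List[float]:
    """Learn per-feature logits theta by descending log(detector score)."""
    theta = [0.0] * len(malware)
    for _ in range(steps):
        p = [sigmoid(t) for t in theta]
        # Relaxed add-only combination: max(m, p) == m + (1 - m) * p for m in {0, 1}.
        x = [m + (1 - m) * pi for m, pi in zip(malware, p)]
        s = detector(x)
        for i, (m, pi) in enumerate(zip(malware, p)):
            # d log(s) / d theta_i, available only because the detector is white-box.
            grad = (1.0 - s) * W[i] * (1 - m) * pi * (1.0 - pi)
            theta[i] -= lr * grad
    return theta


def generate(malware: List[int], theta: List[float]) -> List[int]:
    """Binarize the learned perturbation and OR it into the original features."""
    return [m | int(sigmoid(t) > 0.5) for m, t in zip(malware, theta)]


malware = [1, 1, 0, 0, 0, 0]
theta = train_generator(malware)
adversarial = generate(malware, theta)
print(detector(malware) > 0.5)      # True: the original sample is flagged
print(adversarial)                  # [1, 1, 1, 1, 0, 1]
print(detector(adversarial) > 0.5)  # False: the modified sample is not flagged
```

The gradient step uses the detector's weights directly, which is exactly the kind of access the white-box setup assumes.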
But in the real world, we may not have direct access to the detection model. In that case the detector is treated as a black box, and an alternate (substitute) model is used to generate the adversarial examples.
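A minimal sketch of the black-box setup, under stated assumptions: the detector `black_box` is a hypothetical function that returns only a 0/1 label, the substitute is a simple perceptron (not an MLP), and the attack is a greedy add-only search on the substitute rather than a GAN. The point is only to show the flow: query labels, fit a substitute, craft against the substitute, then check whether the result transfers.

```python
from itertools import product
from typing import List, Tuple

# Hypothetical black-box detector: the attacker never reads these values.
_HIDDEN_W = [2, 1, -1, -2, 1, -1]
_HIDDEN_B = -0.5


def black_box(x: List[int]) -> int:
    """Return 1 (malware) or 0 (benign); only the label is visible."""
    return int(sum(w * xi for w, xi in zip(_HIDDEN_W, x)) + _HIDDEN_B > 0)


def predict(w: List[int], b: int, x: List[int]) -> int:
    return int(sum(wi * xi for wi, xi in zip(w, x)) + b > 0)


def fit_substitute(samples: List[List[int]], labels: List[int],
                   max_epochs: int = 1000) -> Tuple[List[int], int]:
    """Perceptron trained on labels obtained by querying the black box."""
    w, b = [0] * len(samples[0]), 0
    for _ in range(max_epochs):
        mistakes = 0
        for x, y in zip(samples, labels):
            err = y - predict(w, b, x)
            if err != 0:
                mistakes += 1
                w = [wi + err * xi for wi, xi in zip(w, x)]
                b += err
        if mistakes == 0:
            break
    return w, b


def craft(malware: List[int], w: List[int], b: int) -> List[int]:
    """Add absent features, most negative substitute weight first."""
    adv = list(malware)
    absent = [i for i, m in enumerate(malware) if m == 0 and w[i] < 0]
    for i in sorted(absent, key=lambda i: w[i]):
        if predict(w, b, adv) == 0:
            break
        adv[i] = 1
    return adv


# Query set: every 6-bit feature vector (feasible only for this toy example).
samples = [list(bits) for bits in product([0, 1], repeat=6)]
labels = [black_box(x) for x in samples]
w, b = fit_substitute(samples, labels)
agreement = sum(predict(w, b, x) == y for x, y in zip(samples, labels)) / len(samples)

malware = [1, 1, 0, 0, 0, 0]
adversarial = craft(malware, w, b)
print(agreement)
print(black_box(malware), black_box(adversarial))
```

In this toy case the hidden detector is linearly separable, so the perceptron can match it on every query. A real detector will generally be matched only approximately, and an example crafted on the substitute is not guaranteed to transfer.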
gdown https://drive.google.com/uc?id=1PwsY_T0MT4Mbk6g70l-jMpZrA7XweHEZ
gdown https://drive.google.com/uc?id=1sz12ejCuV9_yEzVI7qhRfUeTu7b4bsXO
With docker
- Build the docker image
docker build .
- Run the container (d1cbaadbc4ea is the image ID from the author's build; use the ID printed by your own docker build)
nvidia-docker run -it -d -v /home/:/malgan --net=host d1cbaadbc4ea /bin/bash
Without docker
- Make sure you have an NVIDIA driver and CUDA >= 9.2 installed (for GPU support)
pip install -r requirements.txt
- Dataset
- Publish the synthetic data generator