
cyberhunters / malware-detection-using-machine-learning


Multi-class malware classification using Deep Learning

Jupyter Notebook 99.25% Python 0.75%
big 15 big15 big-2015 kaggle-competition machine-learning malware-detection malware-analysis microsoft-big-2015 asm

malware-detection-using-machine-learning's Introduction

Malware Detection Using Machine Learning

This repository contains the source code for detecting different types of malware using deep-learning-based feature extraction and a wrapper-based feature selection technique. A research paper describing how it works is available at https://arxiv.org/abs/1910.10958

We used two major approaches for malware classification:

  1. Image representation of the byte file

    • Independent of the platform.

    • Requires no domain knowledge, such as assembly instructions.

  2. Hybrid feature space using both the ASM and byte files

    • Platform dependent, but gives better performance than using the byte file alone.

    • Requires huge resources and processing time.
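As a rough illustration of the first approach, the sketch below turns a hex-dump byte file into a grid of grayscale pixel values. It is not the repository's code. It assumes each line of the byte file starts with an address followed by two-digit hex bytes, with `??` marking unknown bytes, and the function names, the image width, and the choice to map `??` to 0 are all hypothetical.

```python
def parse_bytes_dump(text):
    """Parse a hex-dump string into a flat list of byte values (0-255)."""
    values = []
    for line in text.splitlines():
        tokens = line.split()
        # Assumption: the first token on each line is an address, not data.
        for token in tokens[1:]:
            if token == "??":
                values.append(0)  # assumption: unknown bytes become 0
            else:
                values.append(int(token, 16))
    return values


def to_grayscale_rows(values, width):
    """Split a flat byte list into rows of `width` pixels, zero-padding the last row."""
    if width <= 0:
        raise ValueError("width must be positive")
    rows = [values[i:i + width] for i in range(0, len(values), width)]
    if rows and len(rows[-1]) < width:
        rows[-1] = rows[-1] + [0] * (width - len(rows[-1]))
    return rows


# Tiny made-up sample in the assumed format (not real malware data).
sample = "00401000 56 8D 44 24\n00401004 ?? FF 00 10\n00401008 7F"
image = to_grayscale_rows(parse_bytes_dump(sample), width=4)
```

Each inner list is one image row, so `image` can be handed to an array or imaging library of your choice. Because only raw byte values are used, no disassembly is needed, which is what makes this representation platform independent.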

The data used in this tutorial can be found in the Hybrid(Final) folder at the following Drive link:

https://drive.google.com/drive/folders/1s7EC4s_-hP9q5vEhs-3vAubspcZbBADK?usp=sharing

After downloading the required dataset, run the following files in the hybrid folder, in this order, to obtain the results.

  1. "Creating hybrid dataset"

  2. "Min-max normalization(hybrid dataset)"

  3. "ANN-Results"
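The second step rescales every feature to the range [0, 1]. A minimal sketch of min-max normalization follows. It is not taken from the notebooks. The function names and toy values are hypothetical, and fitting the minimum and maximum on the training split only is an assumption about how the splits are handled.

```python
def fit_min_max(rows):
    """Return per-column minimums and maximums for a list of feature rows."""
    columns = list(zip(*rows))
    return [min(col) for col in columns], [max(col) for col in columns]


def apply_min_max(rows, mins, maxs):
    """Scale each value to (v - min) / (max - min); constant columns map to 0.0."""
    scaled = []
    for row in rows:
        scaled.append([
            (value - lo) / (hi - lo) if hi > lo else 0.0
            for value, lo, hi in zip(row, mins, maxs)
        ])
    return scaled


# Toy feature rows (made up); the third column is constant.
train = [[1.0, 10.0, 5.0], [3.0, 30.0, 5.0], [2.0, 20.0, 5.0]]
valid = [[2.0, 40.0, 5.0]]

# Assumption: statistics come from the training split and are reused for other splits.
mins, maxs = fit_min_max(train)
train_scaled = apply_min_max(train, mins, maxs)
valid_scaled = apply_min_max(valid, mins, maxs)
```

Reusing the training statistics means validation values can fall outside [0, 1], as the second feature of `valid` does here. That is expected when a split contains values beyond the training range.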

The project was done under the guidance of Dr. Asifullah Khan, DCIS, PIEAS.


malware-detection-using-machine-learning's Issues

How to generate CSV Files?

I am trying to recreate this project so I can use my own malicious and benign files to build the CSV files, but there is no explanation or code within the project showing how to get the data for data_to_test.csv, data_to_tra.csv, and data_to_val.csv. Is it possible to know how this data was extracted, or would you be so kind as to include it in the repo? Thank you for any insight.

How to make predictions on files

First of all, thank you very much for supplying all this, as it's very difficult to follow the documentation on the Kaggle challenge. I have the n_final_hybrid_valid.csv and the optimize folder with checkpoints. I'm using Windows 10 x64 and Python 3.6, and the folders it creates won't open, saying they're corrupted or damaged. I have also tried saving them as rar files, but the same thing happens. My main question is: what data must I pass to ann_hybrid to make a prediction on an unknown file? Is that the proper way to use this, or does a prediction module need to be written? Would I need to write something that takes the file path as an input, disassembles the file, extracts the byte code, and normalizes the data from that file to match the output in the CSV? Please help, and thank you for your time.

I hope to hear from you soon!
GREAT WORK!!
