Giter Site home page Giter Site logo

malshare-crawler's Introduction

                        MALSHARE CRAWLER

A CGI program that consumes from the malshare.com API alowing to obtain info and to download malware samples. Also is used the API of VirusTotal to obtain results of antivirus scannings on those files. One of the functions developed on this project download the files on malshare.com and submit it to virustotal for scanning on a eternal loop so can be studied whats the effectiveness of those antivirus on new malwares.

What has been used:

- Python requests
- Threading
- Text Parsing
- Text encoding and decoding
- File handling (read and write)
- Json Parsing
- Working with directories and paths
- Working with console commands (getopt, sys)
- Exception handling
- Sqlite

Usage is: "mlshrcrawler -s/--starting-date Starting date(%Y %m %d) -e/--ending-date Ending date (%Y %m %d) -o/--output-database Output database address -p/--download-to-address Address in the pc where to download files -t/--file-type Filter for type of file -h/--help -d/--download This option will enable the download. -r/--register This option will enable the register into db -k/--api-key Malshare api key, -q/--apikey-from-file Address of a file containing the Malshare API Key -c/--continue-downloading If specified will look for the last element in the db specified and will continue downloading from the rest elements in that date. -n/--notify-each A number of instances after which the program will notify the status of downloaded and or registered. -v/--last-24h-virus-scan Iniside an infinite loop will download last 24 hours found viruses on malshare dataset, send it to virustotal and register the virus metadata and results of the scan. -w/--virustotal-apikey Virus total api key, -a/--virustotal-apikey-fromfile Load Virus total api key from specified file address. -f/--verbose-to-file Print results of any operation to file

Command example:

The next command line will start a loop downloading the most recent files uploaded to malshare to the folder specified with -p. Each of this file are obtained the details (PE details and Magic file type) and the details provided by malshare. Then is requested virus total to obtain scan results, if the file was not scanned before then is sent to virus total for scanning. Each api key is provided with the -q and -a selectors. Also the logs will be printed to the file "logs" specified by the selector -f :

./mlshrcrawler.py -o virustotalscan.db -p vstestfolder -d -r -q my_malshare_api_key -v -a virustotal_apikey -f logs

malshare-crawler's People

Contributors

roj4s avatar

Stargazers

 avatar

Watchers

 avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.