
needmorecowbell / hamburglar


Hamburglar -- collect useful information from urls, directories, and files

License: GNU General Public License v3.0

Python 38.82% YARA 61.18%
information-gathering python3

hamburglar's Introduction

The Hamburglar

Setup

There are two versions of Hamburglar: full and lite. The main branch holds the full version, and Hamburglar Lite lives on the separate hamburglar-lite branch.

Hamburglar

A full-fledged scraping tool for artifact retrieval from multiple sources. It has some dependencies, so install them first:

pip3 install -r requirements.txt

Hamburglar can also check files against known file signatures during a hexdump. This check is skipped if the signature database is not set up. To enable it, first create the database and a user:

CREATE DATABASE fileSign;
CREATE USER 'hamman'@'localhost' IDENTIFIED BY 'deadbeef';
GRANT ALL PRIVILEGES ON fileSign.signatures TO 'hamman'@'localhost';

Then run magic_sig_scraper.py. You can run it once, or schedule it as a cron job to keep the signatures up to date:

python3 magic_sig_scraper.py

Hamburglar Lite

A multithreaded, recursive directory scraping script. It stores each finding along with its filepath. Hamburglar Lite will never require external packages and will always remain a single script. Setup is as simple as downloading the file and running it:

wget https://raw.githubusercontent.com/needmorecowbell/Hamburglar/hamburglar-lite/hamburglar-lite.py

This is designed to be quickly downloaded and executed on a machine.

Operation

usage: hamburglar.py [-h] [-g] [-x] [-v] [-w] [-i] [-o FILE] [-y YARA] path

positional arguments:
  path                  path to directory, url, or file, depending on flag
                        used

optional arguments:
  -h, --help            show this help message and exit
  -g, --git             sets hamburglar into git mode
  -x, --hexdump         give hexdump of file
  -v, --verbose         increase output verbosity
  -w, --web             sets Hamburgler to web request mode, enter url as path
  -i, --ioc             uses iocextract to parse contents
  -o FILE, --out FILE   write results to FILE
  -y YARA, --yara YARA  use yara ruleset for checking

Directory Traversal

  • python3 hamburglar.py ~/Directory/
    • This will recursively scan for files in the given directory, then analyze each file for a variety of findings using regex filters
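
To make the regex-filter step concrete, here is a minimal sketch of a recursive scan. It is not Hamburglar's actual implementation: the function names `scan_text` and `scan_directory` and both patterns in `REGEX_LIST` are assumptions chosen for illustration, and the sketch runs in a single thread.

```python
import os
import re

# Hypothetical filters for illustration; the real regexList in hamburglar.py may differ.
REGEX_LIST = {
    "ipv4": r"\b(?:\d{1,3}\.){3}\d{1,3}\b",
    "emails": r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}",
}


def scan_text(text, regex_list=REGEX_LIST):
    """Return {filter_name: set_of_matches} for every filter that matched."""
    findings = {}
    for name, pattern in regex_list.items():
        matches = set(re.findall(pattern, text))
        if matches:
            findings[name] = matches
    return findings


def scan_directory(root, regex_list=REGEX_LIST):
    """Walk root recursively and scan every readable file."""
    results = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for filename in filenames:
            path = os.path.join(dirpath, filename)
            try:
                with open(path, "r", errors="ignore") as handle:
                    findings = scan_text(handle.read(), regex_list)
            except OSError:
                continue  # unreadable file: skip it
            if findings:
                results[path] = findings
    return results
```

Keying the results by filepath mirrors the shape of the example output in this README.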

Single File Analysis

  • python3 hamburglar.py ~/Directory/file.txt
    • This will analyze the given file for a variety of findings using regex filters

YARA Rule Based Analysis

  • python3 hamburglar.py -y rules/ ~/Directory
    • This will compile the yara rule files in the rules directory and then check them against every item in Directory.

Git Scraping Mode

  • python3 hamburglar.py -g https://www.github.com/needmorecowbell/Hamburglar
    • Adding -y <rulepath> will allow the repo to be scraped using yara rules

Web Request Mode

  • python3 hamburglar.py -w https://google.com
    • Adding -w to hamburglar.py tells the script to handle the path as a URL.
    • Currently this does not spider the page; it only analyzes the requested HTML content

IOC Extraction

  • python3 hamburglar.py -w -i https://pastebin.com/SYisR95m
    • Adding -i will use iocextract to extract any IOCs from the requested URL

Hex Dump Mode

  • python3 hamburglar.py -x ~/file-to-dump
    • Right now this only produces a hex dump, which can be redirected into a file
    • This will eventually be used for binary analysis
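
For readers unfamiliar with the format, the following is a minimal sketch of what a hex dump produces: an offset, the bytes in hex, and their printable ASCII form. The `hexdump` function and its column layout are assumptions for illustration; the output of `-x` may be laid out differently.

```python
def hexdump(data, width=16):
    """Format bytes as offset, hex columns, and printable ASCII."""
    pad = width * 3 - 1  # width of a full hex column, including separators
    lines = []
    for offset in range(0, len(data), width):
        chunk = data[offset:offset + width]
        hex_part = " ".join(f"{byte:02x}" for byte in chunk)
        # Replace non-printable bytes with "." in the text column.
        text_part = "".join(chr(b) if 32 <= b < 127 else "." for b in chunk)
        lines.append(f"{offset:08x}  {hex_part:<{pad}}  {text_part}")
    return "\n".join(lines)


if __name__ == "__main__":
    print(hexdump(b"The Hamburglar\n"))
```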

Tips

  • Adding -v will set the script into verbose mode, and -h will show details of available arguments
  • Adding -o FILENAME sets the results filename. This is especially useful in scripting situations where you might want multiple results tables (e.g. GitHub repo spidering)
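
As a sketch of the scripting case, the helpers below build one git-mode command per repository, each with its own results file. Only the `-g` and `-o` flags come from the usage text above; the helper names, the filename scheme, and the assumption that hamburglar.py sits in the working directory are illustrative.

```python
import subprocess
from urllib.parse import urlparse


def results_filename(repo_url):
    """Derive a per-repo results filename from the URL path (hypothetical scheme)."""
    parts = [part for part in urlparse(repo_url).path.split("/") if part]
    return "-".join(parts) + ".json"


def build_command(repo_url):
    """Build the argument list for one git-mode scan with its own output file."""
    return ["python3", "hamburglar.py", "-g", "-o", results_filename(repo_url), repo_url]


def scan_repos(repo_urls):
    """Run one scan per repository; assumes hamburglar.py is in the working directory."""
    for url in repo_urls:
        subprocess.run(build_command(url), check=True)
```

Calling `scan_repos` with a list of repository URLs would then leave one JSON file per repository instead of overwriting a single results file.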

Settings

  • whitelistOn: turns on or off whitelist checking
  • maxWorkers: number of worker threads to run concurrently when reading file stack
  • whitelist: list of files or directories to scan exclusively (if whitelistOn=True)
  • blacklist: list of files, extensions, or directories to block in scan
  • regexList: dictionary of regex filters with filter type as the key
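
The sketch below shows how these settings might fit together. The setting names come from the list above, but every value is an illustrative assumption rather than a project default, and `should_scan` is a hypothetical helper that assumes entries are matched as plain substrings of the path.

```python
# Illustrative values only: names follow the settings list, values are assumptions.
whitelistOn = False
maxWorkers = 20
whitelist = [".txt", "notes/"]
blacklist = [".git/", ".png", "node_modules/"]
regexList = {
    "ipv4": r"\b(?:\d{1,3}\.){3}\d{1,3}\b",
}


def should_scan(path, whitelist_on=whitelistOn):
    """Decide whether a path is scanned, assuming simple substring matching."""
    if any(entry in path for entry in blacklist):
        return False  # blacklisted paths are always skipped
    if whitelist_on:
        return any(entry in path for entry in whitelist)
    return True
```

With the whitelist off, everything not blacklisted is scanned; with it on, only paths containing a whitelist entry pass.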

The Hamburglar can find

  • ipv4 addresses (public and local)
  • emails
  • private keys
  • urls
  • IOCs (using iocextract)
  • cryptocurrency addresses
  • anything you can imagine using regex filters and yara rules

Example output:

{
    "/home/adam/Dev/test/email.txt": {
        "emails": "{'[email protected]'}"
    },
    "/home/adam/Dev/test/email2.txt": {
        "emails": "{'[email protected]'}"
    },
    "/home/adam/Dev/test/ips.txt": {
        "ipv4": "{'10.0.11.2', '192.168.1.1'}"
    },
    "/home/adam/Dev/test/test2/email.txt": {
        "emails": "{'[email protected]', '[email protected]'}"
    },
    "/home/adam/Dev/test/test2/ips.txt": {
        "ipv4": "{'10.0.11.2', '192.168.1.1'}"
    },
    "/home/adam/Dev/test/test2/links.txt": {
        "site": "{'http://login.web.com'}"
    }
}
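
In the example output, each finding is a string that looks like a Python set literal. Assuming that format holds for every value, a results file can be turned back into real sets with `ast.literal_eval`. The `load_results` helper below is a hypothetical post-processing sketch, not part of Hamburglar.

```python
import ast
import json


def load_results(raw_json):
    """Parse results JSON and turn each stringified set back into a real set."""
    parsed = json.loads(raw_json)
    return {
        path: {kind: ast.literal_eval(value) for kind, value in findings.items()}
        for path, findings in parsed.items()
    }


sample = """{"ips.txt": {"ipv4": "{'10.0.11.2', '192.168.1.1'}"}}"""
results = load_results(sample)
print(sorted(results["ips.txt"]["ipv4"]))  # ['10.0.11.2', '192.168.1.1']
```

`ast.literal_eval` only evaluates literals, so it is a safer choice than `eval` for files that may contain scraped content.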

Contributions

hamburglar's People

Contributors

adi928, dependabot[bot], jaeger-2601, joanbono, needmorecowbell, pingywon, tijko


hamburglar's Issues

use yara rule checking file as flag option

Example: hamburglar.py -y <yarafile.yar> <input/directory/webpath>

Here yarafile.yar is parsed in Python and then checked against the given input.

This would be SUPER helpful and builds nicely on top of the regex filters. However, it means we would need imported dependencies. Maybe have a hamburglar-lite that can be used with Python 2 or 3, then a hamburglar that runs on Python 3 with imports?

all rule matches for a given file should be merged to one dictionary

Currently each rule match is paired with a filepath. This means that if two rules flag the same file, the file is listed twice, once per rule. Instead, the results should be a dictionary keyed by filepath, and any time a rule matches, its ID should be added to a list in the value.

Right now it is:

"path/to/file1" : "ruleName",
"path/to/file1": "rule2Name"

Instead:
"path/to/file1": [ "ruleName", "rule2Name"]

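The proposed merge can be sketched in a few lines. This is one possible approach, not the fix that was applied to the project; `merge_matches` and its `(filepath, rule_name)` input shape are assumptions for illustration.

```python
from collections import defaultdict


def merge_matches(pairs):
    """Group (filepath, rule_name) pairs into {filepath: [rule_name, ...]}."""
    merged = defaultdict(list)
    for filepath, rule_name in pairs:
        if rule_name not in merged[filepath]:  # avoid listing a rule twice
            merged[filepath].append(rule_name)
    return dict(merged)


pairs = [("path/to/file1", "ruleName"), ("path/to/file1", "rule2Name")]
print(merge_matches(pairs))
# {'path/to/file1': ['ruleName', 'rule2Name']}
```
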
fix the output so it's actually useful

Right now there is way too much output, even when verbose mode is off. As more flags get added, especially file ID rules, this way of logging will become useless because there are simply too many results. The more rules and filters the better, so logging has to be designed with that in mind. Currently even non-verbose output might be too much for a verbose mode.

web request mode does not spider the page

Currently web request mode only analyzes the requested HTML content; it does not spider the page. Aggregating any links on the same domain and then scraping those links recursively should be a goal of this script.
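
One way the requested behavior could look is sketched below with the standard library only. Nothing here exists in Hamburglar: `LinkCollector`, `same_domain_links`, and `crawl` are hypothetical names, and `crawl` takes the page fetcher as a parameter so the sketch stays independent of any particular HTTP client.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class LinkCollector(HTMLParser):
    """Collect absolute URLs from every <a href="..."> in a page."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = set()

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value:
                self.links.add(urljoin(self.base_url, value))


def same_domain_links(base_url, html):
    """Return the links in html that stay on base_url's domain."""
    collector = LinkCollector(base_url)
    collector.feed(html)
    domain = urlparse(base_url).netloc
    return {link for link in collector.links if urlparse(link).netloc == domain}


def crawl(start_url, fetch, max_pages=50):
    """Breadth-first crawl of one domain; fetch(url) must return that page's HTML."""
    seen, queue, pages = {start_url}, [start_url], {}
    while queue and len(pages) < max_pages:
        url = queue.pop(0)
        pages[url] = fetch(url)
        for link in sorted(same_domain_links(url, pages[url]) - seen):
            seen.add(link)
            queue.append(link)
    return pages
```

The `seen` set prevents revisiting pages that link to each other, and `max_pages` bounds the crawl on large sites. Each page's HTML in the returned dictionary could then be analyzed the same way the single requested page is today.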

Phone regex only matches if on first line of file

Scans return a phone result only if the phone number is on the first line of the file, and even then the match is an empty string.

Given files:

first/foo.txt

Phone Numbers:

671-342-2121

first/taz.txt

1-(888)-234-1111
(718)-212-5555
978-543-2919
617-241-9990

first/test.txt
781-231-3292

Then running python hamburglar.py first/ gives:

[+] scanning...
[+] adding:first/test.txt (0kb) to stack
[+] adding:first/taz.txt (0kb) to stack
[+] adding:first/foo.txt (0kb) to stack
[+] scan complete
[+] first/test.txt -- 1 result(s) found.
[+] writing to hamburglar-results.json...
[+] The Hamburglar has finished snooping

with hamburglar-results.json:

{
    "first/test.txt": {
        "phone": "{''}"
    }
}
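
The root cause is not stated in this issue, so the following is only a possible explanation based on two documented behaviors of Python's `re` module: `re.match` tries the pattern only at the start of the string, and `re.findall` returns group contents rather than whole matches when the pattern has capture groups, yielding `''` for an optional group that did not participate. The sketch below avoids both. `PHONE_PATTERN` is a hypothetical pattern written for the number formats listed above, not the project's regex.

```python
import re

# Hypothetical pattern covering the number formats listed in this issue.
# It uses only non-capturing groups, so findall returns whole matches.
PHONE_PATTERN = re.compile(r"(?:\b1-)?(?:\(\d{3}\)|\b\d{3})-\d{3}-\d{4}\b")


def find_phones(text):
    """Return every phone-like match anywhere in text, in order of appearance."""
    return PHONE_PATTERN.findall(text)


foo = "Phone Numbers:\n\n671-342-2121\n"
print(PHONE_PATTERN.match(foo))  # None: match() only tries the start of the string
print(find_phones(foo))          # ['671-342-2121']
```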
