
scavenger's Introduction

Scavenger - OSINT Bot - REWORKED



Intro

This repository contains the code of my OSINT bot, which searches paste sites for sensitive data leaks.

Search terms:

  • credentials
  • private RSA keys
  • WordPress configuration files
  • MySQL connect strings
  • onion links
  • SQL dumps
  • API keys
  • complete emails

Search terms can be customized. You can learn more about it in the configuration section.

Main Features

For pastebin.com the bot has two modes:

  • looking for sensitive data in the archive via scraping
  • looking for sensitive data by tracking users who publish leaks

Additional features:

  • customizable search terms
  • scan folders with text files for sensitive information

Configuration

  1. Delete the README.md files in every subfolder as they are only placeholders
  2. The bot searches for email:password combinations and other kinds of sensitive data by default. If you want to add more search terms, edit the configs/searchterms.txt file or use the -3 switch of the control script.

Default configs/searchterms.txt configuration:

mysqli_connect(
BEGIN RSA PRIVATE KEY
The name of the database for WordPress
apiKey:
Return-Path:
insert into
INSERT INTO
.onion

If you want to add other search terms, just add them to the file, one per line. Do you know a useful search term that is missing here? Tell me! :-)

  3. For the user tracking module of pastebin.com, add the target users to the configs/users.txt file, one per line.
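
How a one-term-per-line file such as configs/searchterms.txt could be applied to a paste can be sketched as follows. This is a minimal illustration, not the bot's actual code: the function names `load_search_terms` and `find_matches` are hypothetical, and plain case-sensitive substring matching is an assumption (the default file listing both `insert into` and `INSERT INTO` hints at it, but does not confirm it).

```python
from pathlib import Path


def load_search_terms(path):
    """Read one search term per line, skipping blank lines."""
    lines = Path(path).read_text(encoding="utf-8").splitlines()
    return [line.strip() for line in lines if line.strip()]


def find_matches(text, terms):
    """Return the terms that occur in text (assumed: case-sensitive substring match)."""
    return [term for term in terms if term in text]


if __name__ == "__main__":
    # A few terms from the default configuration shown above.
    terms = ["mysqli_connect(", "BEGIN RSA PRIVATE KEY", ".onion"]
    sample = "$db = mysqli_connect('localhost', 'user', 'secret');"
    print(find_matches(sample, terms))
```

With substring matching, a new term takes effect as soon as it is added to the file, at the cost of false positives for very short terms.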

Usage

Program help:

$ python3 scavenger.py -h

  _________
 /   _____/ ____ _____ ___  __ ____   ____    ____   ___________
 \_____  \_/ ___\\__  \\  \/ // __ \ /    \  / ___\_/ __ \_  __ \
 /        \  \___ / __ \\   /\  ___/|   |  \/ /_/  >  ___/|  | \/
/_______  /\___  >____  /\_/  \___  >___|  /\___  / \___  >__|
        \/     \/     \/          \/     \//_____/      \/       Reworked

usage: scavenger.py [-h] [-0] [-1] [-2] [-3] [-4]

control script

optional arguments:
  -h, --help           show this help message and exit
  -0, --pbincom        Activate pastebin.com archive scraping module
  -1, --pbincomTrack   Activate pastebin.com user tracking module
  -2, --sensitivedata  Search a specific folder for sensitive data. This might
                       be useful if you want to analyze some pastes which
                       were not collected by the bot.
  -3, --editsearch     Edit search terms file for additional search terms
                       (email:password combinations will always be searched)
  -4, --editusers      Edit user file of the pastebin.com user track module

example usage: python3 scavenger.py -0 -1
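
The help text above has the shape of Python's `argparse` output, so the interface can be approximated with the sketch below. It is a reconstruction from the help text only, not the actual scavenger.py source; `build_parser` is a hypothetical name and the help strings are shortened.

```python
import argparse


def build_parser():
    """Approximate the control script's command-line interface."""
    parser = argparse.ArgumentParser(
        prog="scavenger.py",
        description="control script",
        epilog="example usage: python3 scavenger.py -0 -1",
    )
    # Every switch is a simple on/off flag.
    parser.add_argument("-0", "--pbincom", action="store_true",
                        help="Activate pastebin.com archive scraping module")
    parser.add_argument("-1", "--pbincomTrack", action="store_true",
                        help="Activate pastebin.com user tracking module")
    parser.add_argument("-2", "--sensitivedata", action="store_true",
                        help="Search a specific folder for sensitive data")
    parser.add_argument("-3", "--editsearch", action="store_true",
                        help="Edit search terms file")
    parser.add_argument("-4", "--editusers", action="store_true",
                        help="Edit user file of the user track module")
    return parser


if __name__ == "__main__":
    # Parse the example invocation from the help text.
    args = build_parser().parse_args(["-0", "-1"])
    print(args.pbincom, args.pbincomTrack)
```

Because the flags are independent booleans, several modules can be started in one call, as the example usage line does with -0 -1.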

Crawled pastes are stored at different locations depending on their status.

  • Paste crawled but nothing was detected -> data/raw_pastes
  • Paste crawled and an email:password combination was detected -> data/raw_pastes and data/files_with_passwords
  • Paste crawled and other sensitive data was detected -> data/raw_pastes and data/otherSensitivePastes
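
The three cases above amount to a small routing decision, sketched below under stated assumptions. The regular expression for email:password lines is a guess (the bot's real detection rule is not documented here), `target_folders` is a hypothetical name, and treating the two checks independently, so that a paste matching both lands in both folders, is also an assumption.

```python
import re

# Assumed pattern for "email:password"; not taken from the bot's source.
EMAIL_PASSWORD = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}:\S+")


def target_folders(text, terms):
    """Return the folders a crawled paste would be stored in."""
    folders = ["data/raw_pastes"]  # every crawled paste is kept here
    if EMAIL_PASSWORD.search(text):
        folders.append("data/files_with_passwords")
    if any(term in text for term in terms):
        folders.append("data/otherSensitivePastes")
    return folders


if __name__ == "__main__":
    print(target_folders("bob@example.com:hunter2", [".onion"]))
```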

Pastes are stored in data/raw_pastes until they reach a limit of 48000 files. Once there are more than 48000 pastes, they are zipped and moved to the archive folder.
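
A rotation step of this kind could look like the sketch below. It is an illustration only: `archive_if_full` is a hypothetical name, the archive location and file name are passed in because the README does not state them, and deleting the raw files after zipping is an assumption about what "moved" means.

```python
import tempfile
import zipfile
from pathlib import Path

PASTE_LIMIT = 48000  # limit stated in the README


def archive_if_full(raw_dir, archive_dir, archive_name, limit=PASTE_LIMIT):
    """Zip all pastes into archive_dir once raw_dir holds more than limit files."""
    raw_dir, archive_dir = Path(raw_dir), Path(archive_dir)
    pastes = sorted(p for p in raw_dir.iterdir() if p.is_file())
    if len(pastes) <= limit:
        return None  # nothing to do yet
    archive_dir.mkdir(parents=True, exist_ok=True)
    target = archive_dir / archive_name
    with zipfile.ZipFile(target, "w", compression=zipfile.ZIP_DEFLATED) as zf:
        for paste in pastes:
            zf.write(paste, arcname=paste.name)
    # Remove the originals only after the archive has been written and closed.
    for paste in pastes:
        paste.unlink()
    return target


if __name__ == "__main__":
    # Small demonstration with a limit of 2 instead of 48000.
    with tempfile.TemporaryDirectory() as tmp:
        raw = Path(tmp) / "raw_pastes"
        raw.mkdir()
        for i in range(3):
            (raw / f"paste{i}.txt").write_text("example", encoding="utf-8")
        print(archive_if_full(raw, Path(tmp) / "archive", "pastes.zip", limit=2))
```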


Start the pastebin.com archive scraping module

$ python3 scavenger.py -0

Start pastebin.com user tracking module

$ python3 scavenger.py -1

When starting one of these modules, a tmux session with the running module is created in the background.
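
Starting a script in a detached tmux session can be sketched as below. This shows one way to do it with `tmux new-session -d -s NAME COMMAND`; whether the control script uses exactly this invocation is an assumption, and `tmux_command` and `start_module` are hypothetical names. The session and script names are taken from elsewhere in this README.

```python
import shlex
import subprocess


def tmux_command(session_name, module_script):
    """Build a tmux call that runs module_script in a detached, named session."""
    return [
        "tmux", "new-session",
        "-d",                # do not attach; keep the session in the background
        "-s", session_name,  # name shown by `tmux ls`
        f"python3 {shlex.quote(module_script)}",
    ]


def start_module(session_name, module_script):
    """Launch the module; raises CalledProcessError if tmux reports a failure."""
    subprocess.run(tmux_command(session_name, module_script), check=True)


if __name__ == "__main__":
    # Only print the command here; call start_module() to actually launch it.
    print(tmux_command("pastebincomArchive", "pbincomArchiveScrape.py"))
```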

List tmux sessions

$ tmux ls
pastebincomArchive: 1 windows (created Sun Apr 14 06:33:32 2021) [204x58]
pastebincomTrack: 1 windows (created Sun Apr 14 06:33:32 2021) [204x58]

Attach to a tmux session (example)

$ tmux a -t pastebincomArchive
$ tmux a -t pastebincomTrack

To detach from a session, press Ctrl+b, then d.


If you want to start a module without the control script, you can call it directly.

Pastebin.com archive scraper

$ python3 pbincomArchiveScrape.py

Pastebin.com user tracker

$ python3 pbincomTrackUser.py

Search specific folder for sensitive data:

$ python3 findSensitiveData.py TARGET_FOLDER
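
What a folder scan of this kind involves can be sketched as follows. This is not the code of findSensitiveData.py: `scan_folder` is a hypothetical name, descending into subfolders is an assumption, and undecodable bytes are simply ignored so that binary files do not abort the scan.

```python
import tempfile
from pathlib import Path


def scan_folder(folder, terms):
    """Map each file under folder (relative path) to the search terms it contains."""
    folder = Path(folder)
    hits = {}
    for path in sorted(folder.rglob("*")):
        if not path.is_file():
            continue
        text = path.read_text(encoding="utf-8", errors="ignore")
        found = [term for term in terms if term in text]
        if found:
            hits[str(path.relative_to(folder))] = found
    return hits


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        (Path(tmp) / "paste.txt").write_text("-----BEGIN RSA PRIVATE KEY-----", encoding="utf-8")
        print(scan_folder(tmp, ["BEGIN RSA PRIVATE KEY", ".onion"]))
```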

To Do

If anything is missing, or you want me to add features or make changes, just let me know via Twitter or a GitHub issue :-)

scavenger's People

Contributors

rndinfosecguy


scavenger's Issues

pasteorg.py (SSLv3)

I am aware of an SSLv3 issue currently present in the pasteorg.py file.
Working on it...

Never ending Loop error

I can't quite figure out where this error occurs in P_bot.py. It seems to loop infinitely with the message below:

  1. iterator:
    Expecting value: line 1 column 1 (char 0)
  2. iterator:
    Expecting value: line 1 column 1 (char 0)
  3. iterator:
    Expecting value: line 1 column 1 (char 0)
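
For context, the repeated message is the text Python's `json` module produces when it is asked to decode an empty or non-JSON string, which would fit a request that returned an error page or nothing at all. Whether that is what happens in P_bot.py is an assumption; the sketch below only reproduces the message and shows one way to guard a decode step. `parse_json_or_none` is a hypothetical helper.

```python
import json


def parse_json_or_none(body):
    """Decode a response body, returning None instead of raising on non-JSON input."""
    try:
        return json.loads(body)
    except json.JSONDecodeError as err:
        print(f"skipping non-JSON response: {err}")
        return None


if __name__ == "__main__":
    parse_json_or_none("")           # empty body triggers the message from the issue
    print(parse_json_or_none('{"key": "value"}'))
```

A caller that receives None can wait and retry instead of repeating the same failing iteration.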

parser library?

Couldn't find a tree builder with the features you requested: lxml. Do you need to install a parser library?
