rabbithole

Send HTTP scrapers to Wonderland

about

This repository showcases techniques for occupying scrapers and spiders that might otherwise be up to malicious activity. Please be aware that these scripts were written for entertainment, to spite the only traffic my site was getting (automated attacks from Chinese IP addresses), not as an actual web defense. Still, I hope they will slow, frustrate, or crash attackers' poorly written scrapers, and provide a few good test cases for anyone writing their own scraper.
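As one example of a case worth testing in your own scraper, here is a minimal sketch of a response reader that refuses to buffer an endless body. The names `read_bounded` and `ResponseTooLarge`, and the default limits, are made up for illustration and are not part of this repository.

```python
import io
import time


class ResponseTooLarge(Exception):
    """Raised when a response body exceeds the configured byte limit."""


def read_bounded(stream, max_bytes=1_000_000, max_seconds=10.0,
                 chunk_size=8192, clock=time.monotonic):
    """Read a file-like object to the end, but give up on endless bodies.

    The deadline is only checked between reads, so a stalled connection
    still needs its own socket timeout.
    """
    deadline = clock() + max_seconds
    chunks = []
    total = 0
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            # Normal end of body.
            return b"".join(chunks)
        total += len(chunk)
        if total > max_bytes:
            raise ResponseTooLarge(f"body exceeded {max_bytes} bytes")
        if clock() > deadline:
            raise TimeoutError(f"body took longer than {max_seconds} seconds")
        chunks.append(chunk)
```

With the standard library, this could be used as `read_bounded(urllib.request.urlopen(url, timeout=5))`. The `timeout` argument covers the stalled-socket case that the loop itself cannot detect.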

techniques

Below is a list of the techniques I've tested and documented well enough that I'm comfortable sharing them. Unless otherwise stated, assume the webserver is Apache 2.4 and the Python interpreter is Python 3.6 or later.

cgi-bin/junkstream.py

This is a CGI script that supplies an infinite stream of random garbage. By answering requests for URLs an attacker should not be accessing with the output of this endpoint, a webserver can occupy poorly written scrapers for a very long time, and may even crash very poorly written ones once they exhaust available memory. Simply copy junkstream.py to your server's CGI directory and apply a rewrite rule such as the following, written in Apache's mod_rewrite syntax:

RewriteEngine on  
RewriteRule /wp-admin/(.+)$ /cgi-bin/junkstream.py [NC,PT]  
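
To get a feel for which request paths a pattern like this would catch, it can be roughly emulated with Python's `re` module. This is only an approximation: it assumes the rule lives in server or virtual-host context, where the pattern is tested against the URL path including its leading slash, and the helper name `would_rewrite` is made up for illustration.

```python
import re

# The rule's pattern, with re.IGNORECASE standing in for the [NC] flag.
# Assumption: server/virtual-host context, so the path keeps its leading "/".
PATTERN = re.compile(r"/wp-admin/(.+)$", re.IGNORECASE)


def would_rewrite(path: str) -> bool:
    """Return True if the pattern matches somewhere in the URL path."""
    # The pattern has no ^ anchor, so it is an unanchored search.
    return PATTERN.search(path) is not None
```

Under these assumptions, `/wp-admin/install.php` and `/blog/WP-Admin/setup.php` both match, while a bare `/wp-admin/` does not, because `(.+)` requires at least one character after the slash.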

Apache users: remember to exempt this file from mod_deflate, or the output pipe will fill and the CGI process will deadlock, sending no traffic to the client while keeping the HTTP connection open.

<FilesMatch "junkstream\.py$">
        SetEnv no-gzip 1
</FilesMatch>

Remember to adjust the log directories, bitrate, etc. in the CGI script. The default output type is text/plain, but you may wish to change this to text/xml or something similar. (Remember to add a <body> tag if you do this!) You can also change the character encoding and various other features. Check the script itself for more detailed documentation!
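
For readers who want to see the general shape of the idea before opening the script, here is a minimal sketch of a throttled junk-streaming CGI program. It is not the repository's junkstream.py: the names `CONTENT_TYPE`, `BYTES_PER_SECOND`, `CHUNK_SIZE`, `ALPHABET`, `junk_chunk`, and `stream_junk` and all of their values are assumptions chosen for illustration, and logging is omitted.

```python
#!/usr/bin/env python3
import os
import sys
import time

# All settings below are illustrative assumptions, not the script's real ones.
CONTENT_TYPE = "text/plain; charset=utf-8"
BYTES_PER_SECOND = 1024   # throttle, so one client costs little bandwidth
CHUNK_SIZE = 256          # bytes written per flush
ALPHABET = b"abcdefghijklmnopqrstuvwxyz0123456789 \n"


def junk_chunk(size: int) -> bytes:
    """Return `size` bytes of random printable garbage."""
    return bytes(ALPHABET[b % len(ALPHABET)] for b in os.urandom(size))


def stream_junk(out, max_chunks=None, sleep=time.sleep) -> int:
    """Write a CGI header, then throttled junk until the client goes away.

    `max_chunks` and `sleep` exist only so the loop can be exercised offline.
    Returns the number of chunks written.
    """
    out.write(f"Content-Type: {CONTENT_TYPE}\r\n\r\n".encode("ascii"))
    out.flush()
    sent = 0
    while max_chunks is None or sent < max_chunks:
        try:
            out.write(junk_chunk(CHUNK_SIZE))
            out.flush()  # push each chunk out instead of buffering it
        except (BrokenPipeError, ConnectionResetError):
            break  # client disconnected
        sent += 1
        sleep(CHUNK_SIZE / BYTES_PER_SECOND)
    return sent


# Only stream forever when actually run by a webserver as a CGI program;
# CGI sets the GATEWAY_INTERFACE environment variable.
if __name__ == "__main__" and "GATEWAY_INTERFACE" in os.environ:
    stream_junk(sys.stdout.buffer)
```

Writing small chunks and sleeping between them keeps the connection alive while bounding the bandwidth each trapped client can consume.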

Contributors

geofurb
