
vanshyr / atod


This project forked from miguelangelher/atod


Atod is a Python script that analyzes the Alexa top 1 million sites and creates a dictionary and a stats file. The idea is to use the dictionary as input during the information gathering phase and the stats file for analysis purposes.

Python 100.00%

atod's Introduction

atod

Atod (short for Alexa to Dictionary) is a Python script that analyzes the Alexa top 1 million sites and creates a dictionary and a stats file. The idea is to use the dictionary as input during the information gathering phase and the stats file for analysis purposes.

The script is cross-platform (Windows and Linux; it should work on Mac as well, but I haven't tested it) since it is written in Python. The only thing you need from this repo is the .py file, although I have also uploaded the dictionary and stats file produced by running the script (dated 17/09/2016).

Why atod?

I developed this first version with the idea of using the resulting dictionary for brute forcing during my pentests. I hope it will be useful to you as well, as a complement to the ones shipped with other tools such as Skipfish, DirBuster, ZAP or Burp. The results of this script can be taken much further... I hope to have time for a version 2 with more features... :)

What exactly does atod do?

Atod automates the following process:

  1. Download the Alexa top 1M file.
  2. Decompress the file in the current directory.
  3. Create a queue of robots.txt URLs to request, using both schemes (http and https). Only the number of URLs set in the "LIMIT" constant is analysed.
  4. Run NPROCESS processes, NPROCESS being another constant you can edit at your convenience.
  5. Analyse each robots.txt file for its Disallowed folders.
  6. Build two files: one to use as a dictionary, and one for analytics that includes the number of times each folder has been found as disallowed.
  7. Before the folders are written to file, run a clean-up pass so that wildcards and malformed robots.txt entries don't put pointless information in the files.
  8. Wait for all processes to finish, then exit.
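The parsing, clean-up and counting steps can be sketched roughly as follows. This is a minimal Python 3 illustration and not atod's actual code: the function names, the exact filtering rules (skip anything that is not an absolute path or that contains `*`, `$` or `?`), the choice to count a folder once per robots.txt file, and the `folder,count` stats layout are all assumptions made for the example.

```python
from collections import Counter


def disallowed_paths(robots_txt):
    """Yield the raw value of every Disallow rule in one robots.txt body."""
    for line in robots_txt.splitlines():
        line = line.split("#", 1)[0].strip()  # drop trailing comments
        field, sep, value = line.partition(":")
        if sep and field.strip().lower() == "disallow":
            yield value.strip()


def clean(path):
    """Return a normalised folder name, or None for entries to skip.

    Assumed rules (hypothetical, not taken from atod): keep only
    absolute paths without wildcard characters.
    """
    if not path.startswith("/") or any(c in path for c in "*$?"):
        return None
    return path.strip("/") or None


def build_stats(robots_bodies):
    """Count in how many robots.txt files each cleaned folder appears."""
    counts = Counter()
    for body in robots_bodies:
        folders = {clean(p) for p in disallowed_paths(body)}
        folders.discard(None)
        counts.update(folders)
    return counts


def render_outputs(counts):
    """Return (dictionary_text, stats_text), most frequent folder first."""
    ordered = counts.most_common()
    dictionary_text = "".join(f"{folder}\n" for folder, _ in ordered)
    stats_text = "".join(f"{folder},{hits}\n" for folder, hits in ordered)
    return dictionary_text, stats_text
```

Keeping the dictionary and the stats as two renderings of the same counter means both files always list the same folders, whichever clean-up rules are used.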

How to run

Atod is not parameterized (I wrote it as an exercise to learn Python), so once you have downloaded it, just grant it execute permission and run:

]$ ./atod

Atod is initially configured to analyse the Alexa top 1000 using 10 parallel processes, so you get a good base in a matter of minutes. If you want to go further, open the script in your favourite editor and modify the following parameters:

  • LIMIT = X //X being the number of URLs you want to analyze once the alexa file is downloaded
  • NPROCESS = Y //Y being the number of parallel processes you want to run
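To show how these two constants could drive the queue and the worker processes, here is a hedged Python 3 sketch. It is not the script's real code: `build_urls`, `worker` and `run` are hypothetical names, the Alexa file is assumed to contain `rank,domain` lines, and LIMIT is assumed to count entries of that file (each entry yielding one http and one https URL).

```python
import itertools
import multiprocessing

LIMIT = 1000   # entries of the Alexa file to analyse (default named in the text)
NPROCESS = 10  # parallel worker processes (default named in the text)


def build_urls(csv_lines, limit=LIMIT):
    """Turn 'rank,domain' lines into robots.txt URLs for both schemes."""
    urls = []
    for line in itertools.islice(csv_lines, limit):
        _rank, _, domain = line.strip().partition(",")
        if domain:
            urls.append(f"http://{domain}/robots.txt")
            urls.append(f"https://{domain}/robots.txt")
    return urls


def worker(tasks, results, fetch):
    """Consume URLs from the queue until the None sentinel arrives."""
    while True:
        url = tasks.get()
        if url is None:
            break
        results.put((url, fetch(url)))


def run(urls, fetch, nprocess=NPROCESS):
    """Fetch every URL with nprocess workers and wait for all of them.

    fetch must be a module-level function that returns a value instead
    of raising, otherwise the result count below would never be reached.
    """
    tasks = multiprocessing.Queue()
    results = multiprocessing.Queue()
    procs = [
        multiprocessing.Process(target=worker, args=(tasks, results, fetch))
        for _ in range(nprocess)
    ]
    for proc in procs:
        proc.start()
    for url in urls:
        tasks.put(url)
    for _ in procs:
        tasks.put(None)  # one stop sentinel per worker
    # Drain the results before joining so no worker blocks on a full queue.
    collected = [results.get() for _ in urls]
    for proc in procs:
        proc.join()
    return collected
```

Nothing is started at import time; a caller would invoke `run(build_urls(lines), fetch)` under an `if __name__ == "__main__":` guard, which `multiprocessing` needs on platforms that spawn new interpreters.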

How it performed for me

Given that my laptop is fairly recent (16 GB RAM, SSD and Core i7) and my internet connection is not bad (20 Mbps), it took ~6 hours to pull all the information from the Alexa top 1 million using 150 parallel processes.

You can get a consistent analysis with much less than that. I ran the whole file to push the script to its limits, but a configuration with 50 parallel processes and a limit of 100K URLs will give you a great dictionary and results for analysis in ~1 hour.

To take into account

  • Be aware that increasing the number of processes also increases the time the script spends waiting for them to stop.
  • Note as well that the script has been developed with resilience in mind. Some errors are already handled in the script, so you will get meaningful results even if it receives some IOErrors. It will finish and give you its best.
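As an illustration of that kind of resilience, a fetch helper can swallow network failures and return an empty body so that one bad site never stops the run. This is a sketch under assumptions, written for Python 3 (where IOError is an alias of OSError); `fetch_robots`, its 10 second timeout and the injectable `opener` parameter are hypothetical and not taken from atod.

```python
import http.client
import urllib.request


def fetch_robots(url, timeout=10, opener=urllib.request.urlopen):
    """Return the robots.txt body as text, or "" when the request fails."""
    try:
        with opener(url, timeout=timeout) as response:
            return response.read().decode("utf-8", errors="replace")
    except (OSError, ValueError, http.client.HTTPException):
        # OSError covers URLError, HTTPError, timeouts and connection
        # resets; ValueError covers malformed URLs; HTTPException covers
        # broken HTTP responses.
        return ""
```

Returning an empty string instead of raising means a failed site simply contributes no folders to the dictionary or the stats.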

