Giter Site home page Giter Site logo

sinsixx / cloudscraper Goto Github PK

View Code? Open in Web Editor NEW

This project forked from jordanpotti/cloudscraper

0.0 0.0 0.0 3.26 MB

CloudScraper: Tool to enumerate targets in search of cloud resources. S3 Buckets, Azure Blobs, Digital Ocean Storage Space.

License: MIT License

Python 100.00%

cloudscraper's Introduction

logo

CloudScraper is a Tool to spider and scrape targets in search of cloud resources. Plug in a URL and it will spider and search the source of spidered pages for strings such as 's3.amazonaws.com', 'windows.net' and 'digitaloceanspaces'. AWS, Azure, Digital Ocean resources are currently supported.

@ok_bye_now

Pre-Requisites

Non-Standard Python Libraries:

  • requests
  • rfc3987
  • termcolor

Created with Python 3.6

General

This tool was inspired by a recent talk by Bryce Kunz. The talk Blue Cloud of Death: Red Teaming Azure takes us through some of the lesser known common information disclosures outside of the ever common S3 Buckets.

Usage:

usage: CloudScraper.py [-h] [-v] [-p Processes] [-d DEPTH] [-u URL] [-l TARGETLIST]

optional arguments:
  -h, --help     show this help message and exit
  -u URL         Target Scope
  -d DEPTH       Max Depth of links Default: 5
  -l TARGETLIST  Location of text file of Line Delimited targets
  -v Verbose     Verbose output
  -p Processes  Number of processes to be executed in parallel. Default: 2

example: python3 CloudScraper.py -u https://rottentomatoes.com

ToDo

  • Add key word customization

Various:

To add keywords, simply add to the list in the parser function.

Contribute

Sharing is caring! Pull requests welcome, things like adding support for more detections, multithreading etc are highly desired :)

Why

So Bryce Kunz actually made a tool to do something similar but it used scrapy and I wanted to build something myself that didn't depend on Python2 or any scraping modules such as scrapy. I did end up using BeautifulSoup to parse for href links for spidering only. Hence, CloudScraper was born. The benefit of using raw regex's instead of parsing for href links, is that many times, these are not included in href links, they can be buried in JS or other various locations. CloudScraper grabs the entire page and uses a regex to look for links. This also has its flaws such as grabbing too much or too little but at least we know we are covering our bases :)

cloudscraper's People

Contributors

janmasarik avatar jordanpotti avatar turr0n avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.