hakluke / hakrawler

Simple, fast web crawler designed for easy, quick discovery of endpoints and assets within a web application

Home Page: https://hakluke.com

License: GNU General Public License v3.0

Languages: Go 98.42%, Dockerfile 1.58%

Topics: bugbounty, crawling, hacking, osint, pentesting, recon, reconnaissance

hakrawler's Introduction

Hakrawler

Fast golang web crawler for gathering URLs and JavaScript file locations. This is basically a simple implementation of the awesome Gocolly library.

Example usages

Single URL:

echo https://google.com | hakrawler

Multiple URLs:

cat urls.txt | hakrawler

Timeout for each line of stdin after 5 seconds:

cat urls.txt | hakrawler -timeout 5

Send all requests through a proxy:

cat urls.txt | hakrawler -proxy http://localhost:8080

Include subdomains:

echo https://google.com | hakrawler -subs

Note: a common issue is that the tool returns no URLs. This usually happens when a domain is specified (https://example.com) but it redirects to a subdomain (https://www.example.com). The subdomain is not included in the scope, so no URLs are printed. To overcome this, either specify the final URL in the redirect chain or use the -subs option to include subdomains.
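
For example, assuming https://example.com redirects to https://www.example.com, either of the following should return results:

echo https://www.example.com | hakrawler
echo https://example.com | hakrawler -subs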

Example tool chain

Get all subdomains of google.com, find the ones that respond to http(s), and crawl them all.

echo google.com | haktrails subdomains | httpx | hakrawler

Installation

Normal Install

First, you'll need to install go.

Then run this command to download + compile hakrawler:

go install github.com/hakluke/hakrawler@latest

You can now run ~/go/bin/hakrawler. If you'd like to run hakrawler without the full path, you'll need to add Go's bin directory to your PATH, e.g. export PATH="$HOME/go/bin:$PATH" (note that a ~ inside double quotes is not expanded by bash). You can also add this line to your ~/.bashrc file if you'd like the change to persist.
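
For example (a minimal sketch, assuming Go installs binaries to the default ~/go/bin):

echo 'export PATH="$HOME/go/bin:$PATH"' >> ~/.bashrc
source ~/.bashrc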

Docker Install (from dockerhub)

echo https://www.google.com | docker run --rm -i hakluke/hakrawler:v2 -subs

Local Docker Install

It's much easier to use the dockerhub method above, but if you'd prefer to run it locally:

git clone https://github.com/hakluke/hakrawler
cd hakrawler
sudo docker build -t hakluke/hakrawler .
sudo docker run --rm -i hakluke/hakrawler --help

Kali Linux: Using apt

Note: This will install an older version of hakrawler without all the features, and it may be buggy. I recommend using one of the other methods.

sudo apt install hakrawler

Then, to run hakrawler:

echo https://www.google.com | hakrawler -subs

Command-line options

Usage of hakrawler:
  -d int
    	Depth to crawl. (default 2)
  -dr
    	Disable following HTTP redirects.
  -h string
    	Custom headers separated by two semi-colons. E.g. -h "Cookie: foo=bar;;Referer: http://example.com/"
  -i	Only crawl inside path
  -insecure
    	Disable TLS verification.
  -json
    	Output as JSON.
  -proxy string
    	Proxy URL. E.g. -proxy http://127.0.0.1:8080
  -s	Show the source of URL based on where it was found. E.g. href, form, script, etc.
  -size int
    	Page size limit, in KB. (default -1)
  -subs
    	Include subdomains for crawling.
  -t int
    	Number of threads to utilise. (default 8)
  -timeout int
    	Maximum time to crawl each URL from stdin, in seconds. (default -1)
  -u	Show only unique urls.
  -w	Show at which link the URL is found.
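
Several of these flags can be combined; a sketch (the cookie value is a placeholder, not a real session):

cat urls.txt | hakrawler -d 3 -subs -u -insecure -h "Cookie: session=placeholder" -json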

hakrawler's People

Contributors

albonycal, ameenmaali, bladeswords, cablej, cow-watch-hour, daehee, delic, enfinlay, epicfaace, erikowen, garlic0x1, gigarashi, hakluke, haticeerturk, hoenn, jub0bs, lc, omnifocal, random-robbie, shourdev, xvapourx, yuradoc


hakrawler's Issues

Not working properly

Hello, I have been using this tool lately, but it does not appear to be working properly, and the results are inconsistent.

Running either of the following commands:

echo "https://google.com" | hakrawler -plain -depth 3 -scope subs
echo "google.com" | hakrawler -plain -depth 3 -scope subs -wayback

returns only 272 URLs for google.com. This is obviously not working properly, because it does not find any subdomains of google.com, nor even a decent number of directories.

[error] parse "http://\x1b[38;5;1m[-] error occurred: httpsconnectionpool

Hello, when I ran hakrawler I got this error repeating forever without any useful result:

[error] parse "http://\x1b[38;5;1m[-] error occurred: httpsconnectionpool(host='raw.githubusercontent.com', port=443): max retries exceeded with url: /mrzhang960217/www.------l.cn/18c8688ca592dbc8b7d1b570aea3479adfc1eecf/application/index/view/redminote4/redminote4.html (caused by newconnectionerror('<urllib3.connection.httpsconnection object at 0x7fafb9796460>: failed to establish a new connection: [errno -2] name or service not known'))\x1b[0m": net/url: invalid control character in URL

Feature request: list external urls

List URLs on external domains, without needing to run it with "-scope yolo".

Reason: might be useful for second-level domain takeover.

Steps to reproduce:

> hakrawler -urls -depth 1 -scope yolo -url https://github.com | grep twitter
[url] https://twitter.com/github
> hakrawler -urls -depth 1 -scope subs -url https://github.com | grep twitter
>

Max time

Hello! Would it be possible to add a max time per host?

for example running:

cat hosts.txt | hakrawler -plain -depth 2 -usewayback -t 20 > out-hakrawler.txt

meaning that it would spend a maximum of 20 seconds per host.
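
For reference, the -timeout flag listed in the command-line options above appears to cover this; a sketch:

cat hosts.txt | hakrawler -timeout 20 > out-hakrawler.txt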

Show Data Source When Using Plain Mode

Hello,

When using the -plain mode it looks like hakrawler does not show the data source for each link/url found:

[screenshot omitted]

Worth changing so that it shows the data source with no colors?

Subdomain detection isn't completely correct

./hakrawler -domain www.dota2.com

[hakrawler ASCII art banner]
                        Crafted with <3 by hakluke

[url] http://www.dota2.com/
[subdomain] www.dota2.com
[url] http://www.dota2.com/store/
[url] http://www.dota2.com/heroes/
[url] http://www.dota2.com/items/
[url] http://www.dota2.com/workshop/builds
[url] http://www.dota2.com/quiz
[url] http://www.dota2.com/news/updates/
[url] http://www.dota2.com/international/battlepass
[url] http://www.dota2.com/leaderboards/
[url] http://www.dota2.com/dpc/
[url] http://www.dota2.com/mars/
[url] http://www.dota2.com/frosthaven/
[url] http://www.dota2.com/720
[url] http://www.dota2.com/grimstroke
[url] http://www.dota2.com/international/overview
[url] http://www.dota2.com/feastofabscession
[url] http://www.dota2.com/plus
[url] http://www.dota2.com/duelingfates/
[url] http://www.dota2.com/international/overview/
[url] http://www.dota2.com/700
[url] http://www.dota2.com/darkrift
[url] http://www.dota2.com/international/
[url] http://www.dota2.com.cn/theshanghaimajor/english/overview
[subdomain] www.dota2.com.cn
[url] http://www.dota2.com/winter2016
[url] http://www.dota2.com/balanceofpower
[url] http://www.dota2.com/reborn/updates/
[url] http://www.dota2.com/international2015/overview/
[url] http://www.dota2.com/international2015/compendium/
[url] http://www.dota2.com/newbloom/
[url] http://www.dota2.com/shiftingsnows/
[url] http://www.dota2.com/oracle/day2/
[url] http://www.dota2.com/rekindlingsoul/
[url] http://www.dota2.com/techies/
[url] http://www.dota2.com/international2014/
[url] http://www.dota2.com/springcleaning/
[url] http://www.dota2.com/newbloom2014/
[url] http://www.dota2.com/wraithnight/
[url] http://www.dota2.com/threespirits/
[url] http://www.dota2.com/firstblood/
[url] http://www.dota2.com/thebetaisover/
[url] http://www.dota2.com/international2013/
[url] http://www.dota2.com/greeviling/
[url] http://www.dota2.com/diretide/
[url] http://www.dota2.com/aegisofchampions/
[url] http://www.dota2.com/tournaments/international2012/mainevent/results/champions/
[url] http://www.dota2.com/spoilsofwar/
[url] http://www.dota2.com/tournaments/international2011/
[url] http://www.dota2.com/comics/are_we_heroes_yet/
[url] http://www.dota2.com/flockheartsgamble

[subdomain] www.dota2.com.cn is not actually a subdomain of www.dota2.com

Better output options

Need to add some better output options for easier integration with other tools, namely JSON and CSV.

issue:panic: runtime error: slice bounds out of range

Hi, I get the following error when I run it:

goroutine 237 [running]:
bufio.(*Writer).Write(0xc000202840, 0xc00173ae00, 0xf1, 0x100, 0xf1, 0x100, 0xc000d207e0)
/usr/local/go/src/bufio/bufio.go:625 +0x230
github.com/hakluke/hakrawler/pkg/collector.(*Collector).colorPrint(0xc0003217d0, 0xb1b2c0, 0xc0010e83c0, 0xc000d64a50, 0xf0, 0x1)
/root/go_projects/src/github.com/hakluke/hakrawler/pkg/collector/collector.go:393 +0x612
github.com/hakluke/hakrawler/pkg/collector.(*Collector).recordIfInScope(0xc0003217d0, 0xb1b2c0, 0xc0010e83c0, 0xc000318600, 0x19, 0xc000d64a50, 0xf0, 0xc00031e580, 0x1)
/root/go_projects/src/github.com/hakluke/hakrawler/pkg/collector/collector.go:280 +0x19d
github.com/hakluke/hakrawler/pkg/collector.(*Collector).visitWaybackURLs(0xc0003217d0, 0xc000318600, 0x19, 0xc000321830, 0xc00031e580)
/root/go_projects/src/github.com/hakluke/hakrawler/pkg/collector/collector.go:293 +0x13d
github.com/hakluke/hakrawler/pkg/collector.(*Collector).Crawl.func3(0xc000324150, 0xc0003217d0, 0xc000318600, 0x19, 0xc000321830, 0xc00031e580)
/root/go_projects/src/github.com/hakluke/hakrawler/pkg/collector/collector.go:155 +0x7f
created by github.com/hakluke/hakrawler/pkg/collector.(*Collector).Crawl
/root/go_projects/src/github.com/hakluke/hakrawler/pkg/collector/collector.go:153 +0x453
panic: runtime error: slice bounds out of range [:4131] with capacity 4096

goroutine 495 [running]:
bufio.(*Writer).Flush(0xc000202840, 0xc002720810, 0x23)
/usr/local/go/src/bufio/bufio.go:591 +0x1c0
bufio.(*Writer).Write(0xc000202840, 0xc002720810, 0x26, 0x30, 0x26, 0x30, 0xc000d84850)
/usr/local/go/src/bufio/bufio.go:627 +0xfa
github.com/hakluke/hakrawler/pkg/collector.(*Collector).colorPrint(0xc0006514d0, 0xb1b2c0, 0xc001068b80, 0xc001950240, 0x25, 0x1)
/root/go_projects/src/github.com/hakluke/hakrawler/pkg/collector/collector.go:393 +0x612
github.com/hakluke/hakrawler/pkg/collector.(*Collector).recordIfInScope(0xc0006514d0, 0xb1b2c0, 0xc001068b80, 0xc00040eaa0, 0x1b, 0xc001950240, 0x25, 0xc000296dc0, 0x40e4c8)
/root/go_projects/src/github.com/hakluke/hakrawler/pkg/collector/collector.go:280 +0x19d
github.com/hakluke/hakrawler/pkg/collector.(*Collector).visitHTMLFunc.func1(0xc0009e8f00)
/root/go_projects/src/github.com/hakluke/hakrawler/pkg/collector/collector.go:188 +0x235
github.com/gocolly/colly.(*Collector).handleOnHTML.func1(0x9, 0xc001190000)
/root/go_projects/src/github.com/gocolly/colly/colly.go:1074 +0x8c
github.com/PuerkitoBio/goquery.(*Selection).Each(0xc0006fd080, 0xc000243cd0, 0x7)
/root/go_projects/src/github.com/PuerkitoBio/goquery/iteration.go:10 +0x53
github.com/gocolly/colly.(*Collector).handleOnHTML(0xc000294700, 0xc001fe0000, 0x0, 0x0)
/root/go_projects/src/github.com/gocolly/colly/colly.go:1064 +0x21b
github.com/gocolly/colly.(*Collector).fetch(0xc000294700, 0xc000318de0, 0x1b, 0xa370d3, 0x3, 0x1, 0x0, 0x0, 0xc000322a40, 0xc000651620, ...)
/root/go_projects/src/github.com/gocolly/colly/colly.go:676 +0x487
github.com/gocolly/colly.(*Collector).scrape(0xc000294700, 0xc000318de0, 0x1b, 0xa370d3, 0x3, 0x1, 0x0, 0x0, 0x0, 0xc000651620, ...)
/root/go_projects/src/github.com/gocolly/colly/colly.go:577 +0x47e
github.com/gocolly/colly.(*Collector).Visit(0xc000294700, 0xc00040eaa0, 0x1b, 0x0, 0x0)
/root/go_projects/src/github.com/gocolly/colly/colly.go:446 +0x82
github.com/hakluke/hakrawler/pkg/collector.(*Collector).Crawl.func4(0xc000324180, 0xc0006514d0, 0xc00040eaa0, 0x1b)
/root/go_projects/src/github.com/hakluke/hakrawler/pkg/collector/collector.go:163 +0x6f
created by github.com/hakluke/hakrawler/pkg/collector.(*Collector).Crawl
/root/go_projects/src/github.com/hakluke/hakrawler/pkg/collector/collector.go:161 +0x393
panic: runtime error: slice bounds out of range [:4131] with capacity 4096

goroutine 837 [running]:
bufio.(*Writer).Flush(0xc000202840, 0xc0022b70c0, 0x23)
/usr/local/go/src/bufio/bufio.go:597 +0x1b0
bufio.(*Writer).Write(0xc000202840, 0xc0022b70c0, 0x34, 0x40, 0x34, 0x40, 0xc000d690a0)
/usr/local/go/src/bufio/bufio.go:627 +0xfa
github.com/hakluke/hakrawler/pkg/collector.(*Collector).colorPrint(0xc0008f3b90, 0xb1b2c0, 0xc00128af40, 0xc00152b3c0, 0x33, 0x1)
/root/go_projects/src/github.com/hakluke/hakrawler/pkg/collector/collector.go:393 +0x612
github.com/hakluke/hakrawler/pkg/collector.(*Collector).recordIfInScope(0xc0008f3b90, 0xb1b2c0, 0xc00128af40, 0xc000291760, 0x17, 0xc00152b3c0, 0x33, 0xc00031f040, 0xc000291101)
/root/go_projects/src/github.com/hakluke/hakrawler/pkg/collector/collector.go:280 +0x19d
github.com/hakluke/hakrawler/pkg/collector.(*Collector).visitWaybackURLs(0xc0008f3b90, 0xc000291760, 0x17, 0xc0008f3bf0, 0xc00031f040)
/root/go_projects/src/github.com/hakluke/hakrawler/pkg/collector/collector.go:293 +0x13d
github.com/hakluke/hakrawler/pkg/collector.(*Collector).Crawl.func3(0xc00041a2b0, 0xc0008f3b90, 0xc000291760, 0x17, 0xc0008f3bf0, 0xc00031f040)
/root/go_projects/src/github.com/hakluke/hakrawler/pkg/collector/collector.go:155 +0x7f
created by github.com/hakluke/hakrawler/pkg/collector.(*Collector).Crawl
/root/go_projects/src/github.com/hakluke/hakrawler/pkg/collector/collector.go:153 +0x453

I am on go1.14.3.

Error in gocolly

While installing hakrawler with the command:

go get github.com/hakluke/hakrawler

it gives me this error:

go/src/github.com/hakluke/hakrawler/pkg/collector/collector.go:15:9: no Go files in /root/go/src/github.com/gocolly/colly

Ignore SSL verification errors

It would be great to have an option to ignore SSL verification errors, rather than silently exiting when they're encountered.
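
For reference, the -insecure flag in the command-line options above appears to cover this; a sketch:

cat urls.txt | hakrawler -insecure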

some js files are not logged

Running hakrawler -js -url "https://intigriti.com" doesn't return the scripts that are included on the front page. I haven't figured out why yet.

[Minor] Add Versioning

Hello,

Great work on this tool, thank you!

This issue is to add a version number somewhere in the output with -h or with a -v/-version flag.

I got this issue

github.com/hakluke/hakrawler/pkg/collector

.go/src/github.com/hakluke/hakrawler/pkg/collector/collector.go:34:23: undefined: strings.ReplaceAll
.go/src/github.com/hakluke/hakrawler/pkg/collector/collector.go:251:23: undefined: strings.ReplaceAll

Allowing domains AND urls

Hello! Thanks for your really great tool, I really appreciate it.

However, it would be nice to be able to specify a URL as the target, and not only the domain.

For example, your pentest assessment could cover only target.com/thisfolder/ and not all of target.com/. Only being able to specify the domain could bring in out-of-scope items.

Thanks again!
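
For reference, the -i flag ("Only crawl inside path") in the command-line options above seems aimed at this use case; a sketch:

echo https://target.com/thisfolder/ | hakrawler -i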

Feature Request: Show unique URLs

It would be really great if you could add an option to show unique URLs only and not show duplicates.

Thank you for this awesome tool :) kudos to the developers.
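
For reference, the -u flag in the command-line options above covers showing unique URLs only; a sketch:

cat urls.txt | hakrawler -u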

-urls extract more than urls

When run with the -urls argument, it also returns mailto: links, which appear in anchor tags but are not exactly URLs:

# hakrawler -domain ███████.com -scope subs -depth 1 -plain --urls
https://coach.███████.com/login
mailto:contato@███████.com

I think mailto:, ftp:, ssh:, etc. are useful, but they are not URLs; maybe filter them out, or add another argument to extract all protocols.

What do you think?

PS: I can send the particular URL privately; I don't want my client to find their domain in a public repo =]
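
One possible workaround is to restrict the output to http(s) URLs with a simple grep filter; a sketch:

echo https://example.com | hakrawler | grep -E '^https?://'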

Custom headers

Add an option to use custom headers on the crawler, which will enable crawling authenticated pages
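
For reference, the -h flag in the command-line options above covers this (headers separated by two semi-colons); the values below are placeholders:

echo https://example.com | hakrawler -h "Cookie: session=placeholder;;Authorization: Bearer placeholder"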

Crawling wouldn't occur properly for piped URLs

For some of the versions prior to beta9, there was an issue in the program logic which made crawling only work for URLs specified by the -url parameter, and not URLs that were piped through stdin. This has now been resolved; I'm just creating this ticket for logging purposes.

Add page size limits

Some sites have massive HTTP downloads (e.g. ISOs or program installers) that you can't avoid by regexing the URL.
When the crawler follows these URLs, it slows down significantly.
You could perhaps grab the HTTP headers and check Content-Length before continuing to download a page.
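
For reference, the -size flag (page size limit, in KB) in the command-line options above appears to cover this; a sketch limiting pages to roughly 1 MB:

cat urls.txt | hakrawler -size 1024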

undefined: strings.ReplaceAll

github.com/hakluke/hakrawler/pkg/collector
go/src/github.com/hakluke/hakrawler/pkg/collector/collector.go:34:23: undefined: strings.ReplaceAll
go/src/github.com/hakluke/hakrawler/pkg/collector/collector.go:256:23: undefined: strings.ReplaceAll

Blacklist extensions

Hi,
I was wondering if you could add a blacklist regex to exclude some unnecessary extensions, for example (jpg|jpeg|gif|css|tif|tiff|png|ttf|woff|woff2|ico).

It would also be great if this were the default, even when going deep with -depth 5 or even 10.
I guess no one needs those extensions in terminal output.

Thanks
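
As a workaround in the meantime, those extensions can be filtered out of the output with grep; a sketch:

cat urls.txt | hakrawler | grep -viE '\.(jpg|jpeg|gif|css|tif|tiff|png|ttf|woff|woff2|ico)(\?|$)'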

More Sources

I thought you might add other sources like CommonCrawl or OTX.

Also, you may want to look at these repos:
https://github.com/xyele/igoturls
https://github.com/xyele/secretx (for extracting API/Secret Keys?)

Install error golang ver 2:1.11~1

I am getting an error on install; I've never seen this issue before. Love the tool.
golang is already the newest version (2:1.11~1).
0 upgraded, 0 newly installed, 0 to remove and 0 not upgraded.

:~$ go get github.com/hakluke/hakrawler

github.com/hakluke/hakrawler/pkg/collector

go/src/github.com/hakluke/hakrawler/pkg/collector/collector.go:34:23: undefined: strings.ReplaceAll
go/src/github.com/hakluke/hakrawler/pkg/collector/collector.go:251:23: undefined: strings.ReplaceAll

Piping to hakrawler issue

Hello,

When I use hakrawler without piping, it works flawlessly.
But when I pipe input to it, for example:

echo "domain.com" | hakrawler
cat domains.txt | hakrawler

it runs forever. Basically, it outputs everything but won't stop; you can see the process running, but there is no more output.
I would like to run it as follows in my bash script:

assetfinder $domain | sort -u | hakrawler -usewayback -plain

but with that issue it's not possible.

Thank you!
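
For what it's worth, the -timeout flag in current versions (maximum time per URL from stdin, see the command-line options above) can keep a piped run from hanging indefinitely; a sketch:

assetfinder example.com | sort -u | hakrawler -timeout 30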

Form Submit

It would be really cool to be able to submit found forms and get the resulting URL back.

Feature request: support virtual hosts

Setting the Host header doesn't seem to have any effect. Based on other tests, it might be that the -headers flag takes effect later.

 > strace -s 9999 -f -etrace=write hakrawler -depth 2 -headers 'Host: github.com' -insecure -scope subs -url 140.82.113.4 -plain 2>&1 | grep GET
[pid 22395] write(5, "GET / HTTP/1.1\r\nHost: 140.82.113.4\r\nUser-Agent: Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/78.0.3904.108 Safari/537.36\r\nAccept: */*\r\nAccept-Encoding: gzip\r\n\r\n", 193) = 193
[pid 22395] write(3, "GET /robots.txt HTTP/1.1\r\nHost: 140.82.113.4\r\nUser-Agent: Go-http-client/1.1\r\nAccept-Encoding: gzip\r\n\r\n", 103) = 103
[pid 22396] write(6, "GET /sitemap.xml HTTP/1.1\r\nHost: 140.82.113.4\r\nUser-Agent: Go-http-client/1.1\r\nAccept-Encoding: gzip\r\n\r\n", 104) = 104

This particular example happens to work because GitHub accepts the IP as the host and later redirects, I think.
