
protoscan's Introduction

Protoscan

Protocol Scanner for high speed packet crafting and injection in several different protocols. This is intended to test bidirectional censorship mechanisms the world over.

Generating address lists

See the Readme in each of cmd/generate_by_alloc, cmd/generate_from_addrs, and cmd/generate_from_subnets for more details.

./generate_by_alloc -d /data/GeoLite2/ -filter "./zmap-udp53.csv" -all

Scanning

The bidirectional censorship scanner currently supports DNS, HTTP, TLS, and QUIC injection over both IPv4 and IPv6.

See the Readme in cmd/bidi/ for more details on usage.

echo "52.44.73.6" | sudo ./bidi -type http -iface "wlo1" -domains domains.txt -workers 1 -wait 1s

protoscan's People

Contributors

ewust, jmwample

protoscan's Issues

TLS Fingerprint Matching

The current TLS probe uses a random TLS ClientHello that I found on the internet. We may want the ClientHello we send to match a more specific client fingerprint, such as Chrome's. This would mean updating the TLS probe to modify cipher suites, extensions, etc.

We can check tlsfingerprint.io for a specific fingerprint that we want to match.

Raw Socket returning `operation not permitted`

A (typically) small fraction of the probes that we send return the error `operation not permitted` when written into the raw socket. The program sending them runs as root, and the large majority of the other packets are sent without error. It is unclear to me why this happens.

The temporary fix in ba9abc0 is to retry the send when an error occurs. If an error occurs on every attempt, the send fails and returns the error. However, I am not sure whether this results in the packet being sent multiple times (i.e., the send succeeds even when it returns an error, and the retry sends a duplicate), or whether the retryDelay is what helps.

I don't have a good way to reproduce this error other than scanning quickly over a large set of addresses.

relevant code:

protoscan/cmd/bidi/tcp.go

Lines 223 to 240 in ba9abc0

func sendPkt(sockFd int, payload []byte, addr syscall.Sockaddr) error {
	var err error
	retries := 3
	retryDelay := 1 * time.Millisecond
	for i := 0; i < retries; i++ {
		err = syscall.Sendto(sockFd, payload, 0, addr)
		if err == nil {
			stats.incPacketPerSec()
			stats.incBytesPerSec(len(payload))
			return nil
		}
		if err != nil {
			time.Sleep(retryDelay)
		}
	}
	return os.NewSyscallError("sendto", err)
}
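
One way to find out whether the retry is actually helping is to record how many attempts each send needed. The sketch below is a hypothetical variant of the retry loop, not code from the repository: `sendWithRetry`, `sendResult`, and `flakySender` are made-up names, and the send function is injected so that the example runs without a raw socket or root. It also assumes that only EPERM ("operation not permitted") is worth retrying; any other error is returned immediately.

```go
package main

import (
	"errors"
	"fmt"
	"syscall"
	"time"
)

// sendResult records how a send went so that retries can be counted and logged.
type sendResult struct {
	Attempts int   // number of times the send function was called
	Err      error // last error seen, nil on success
}

// sendWithRetry (hypothetical) calls send up to retries times. It retries only
// when the error is EPERM and returns right away on success or any other error.
func sendWithRetry(send func() error, retries int, delay time.Duration) sendResult {
	var res sendResult
	for res.Attempts < retries {
		res.Attempts++
		res.Err = send()
		if res.Err == nil || !errors.Is(res.Err, syscall.EPERM) {
			return res
		}
		time.Sleep(delay)
	}
	return res
}

// flakySender stands in for syscall.Sendto: it fails with EPERM for the first
// `failures` calls and succeeds afterwards.
func flakySender(failures int) func() error {
	calls := 0
	return func() error {
		calls++
		if calls <= failures {
			return syscall.EPERM
		}
		return nil
	}
}

func main() {
	res := sendWithRetry(flakySender(1), 3, time.Millisecond)
	fmt.Println("attempts:", res.Attempts, "err:", res.Err)
}
```

Logging `Attempts` next to the probe would show how often a retry was needed. It would not settle whether a failed attempt still put a packet on the wire; that needs a packet capture on the sending interface.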

ECH / ESNI Probe

Add a new TLS probe that fakes the presence of TLS ECH or TLS ESNI (potentially configurable between the two) in the ClientHello packet.

This test would be different from the others in that we don't have to run for all sensitive domains as the protocol itself is the sensitive element in question.

see:

Init step for `hex.Decode` in probe generation

As demonstrated by the GeneratePayloads perf benchmark, a large share (~30%) of TLS payload build time is spent in hex.Decode, which is avoidable. For now, though, generating payloads is fast enough, and hex is a convenient format for working with the payload. If speed matters in the future, it might make sense to do the hex.Decode calls once in some sort of init step, or to switch to initializing byte slices directly. For now it doesn't matter.

This strategy applies to any of the probes built using hex.Decode and fmt.Sprintf.
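
A minimal sketch of the init-time approach, assuming a probe is a fixed hex template followed by per-probe bytes. The template value, `helloPrefixHex`, `mustDecodeHex`, and `buildPayload` are placeholders invented for illustration; the real probes have a different layout.

```go
package main

import (
	"encoding/hex"
	"fmt"
)

// helloPrefixHex is a placeholder template, not the project's real payload.
const helloPrefixHex = "160301"

// helloPrefix is decoded once at package initialization rather than on
// every payload build.
var helloPrefix = mustDecodeHex(helloPrefixHex)

// mustDecodeHex panics on a malformed template, which surfaces typos at startup.
func mustDecodeHex(s string) []byte {
	b, err := hex.DecodeString(s)
	if err != nil {
		panic(fmt.Sprintf("invalid hex template %q: %v", s, err))
	}
	return b
}

// buildPayload appends the per-probe bytes to a copy of the pre-decoded
// prefix, so callers can modify the result without touching the template.
func buildPayload(domain string) []byte {
	out := make([]byte, 0, len(helloPrefix)+len(domain))
	out = append(out, helloPrefix...)
	out = append(out, domain...)
	return out
}

func main() {
	fmt.Printf("%x\n", buildPayload("example.com"))
}
```

The hex string stays in the source as the readable form, while the hot path only copies bytes.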

Hostname Identifiers and validation

Generating random values as identifiers and then logging them seems to be a costly way to link response packets to hostname when the hostname is not available in the injected response (e.g. TCP RST).

To fix this we can associate each domain with an ID and then use 1000+ID as the source port (works in both TCP and UDP). We will likely want to validate that response packets are in fact associated with a probe we sent. To do so, the "random" fields can be a hash of the target IP and the domain (found on response using a lookup in the ID table). This limits us to ~64k domains in a single run, but that is a lot to be testing anyway. For TCP-based probes (TLS, HTTP) those fields are the seq/ack numbers. For UDP-based protocols it depends on the protocol; for QUIC specifically we should be able to use the Connection ID field, which should be included in any associated response packets.

We do not need to do this for DNS probes as DNS responses contain the hostname.
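
The port and hash scheme for the non-DNS probes could look roughly like the sketch below. All names (`sourcePort`, `domainID`, `validator`) are hypothetical, and FNV-1a is used only as a convenient stand-in for "a hash of the target IP and the domain". It is unkeyed, so a real implementation may want to mix in a per-run secret.

```go
package main

import (
	"fmt"
	"hash/fnv"
	"net"
)

// portBase comes from the proposal: source port = 1000 + ID.
const portBase = 1000

// sourcePort maps a domain ID to a source port. ok is false when the
// resulting port would not fit in 16 bits.
func sourcePort(id int) (port uint16, ok bool) {
	if id < 0 || portBase+id > 65535 {
		return 0, false
	}
	return uint16(portBase + id), true
}

// domainID recovers the ID from the port that a response is addressed to.
func domainID(port uint16) (id int, ok bool) {
	if port < portBase {
		return 0, false
	}
	return int(port) - portBase, true
}

// validator derives a 32-bit value from the target IP and domain. It could
// fill a TCP sequence number or part of a QUIC connection ID.
func validator(target net.IP, domain string) uint32 {
	h := fnv.New32a()
	h.Write(target.To16())
	h.Write([]byte(domain))
	return h.Sum32()
}

func main() {
	domains := []string{"example.com", "example.org"} // the ID table
	target := net.ParseIP("192.0.2.1")

	// Sending side: pick the port and validator for domain ID 1.
	port, _ := sourcePort(1)
	seq := validator(target, domains[1])

	// Receiving side: look up the domain by port and recompute the hash.
	id, _ := domainID(port)
	fmt.Println(domains[id], validator(target, domains[id]) == seq)
}
```

Because the validator is recomputed from the response's source address and the looked-up domain, nothing has to be logged per probe in order to match responses later.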

DTLS probe

Add a new probe that sends a DTLS ClientHello Handshake. Any clients that support DTLS should ignore this since there was not a pre-negotiated session (i.e. for webrtc).

This tests two things:

  • Is the DTLS protocol (and any protocols that use it) blocked or interfered with in general?
  • Do censors check SNI and interfere for DTLS handshakes?

TCP Scan Speed

For some reason TCP scans are injecting packets too slowly. I was running the TLS scan with wait=5ms and workers=2000, which should result in ~400,000 pps. However, the scan ran for 20 hours and produced only 30M lines, which comes out to around 400 pps.

UDP doesn't seem to have the same issue.
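
The arithmetic behind those two figures can be checked with a short sketch. It assumes that each worker sends one packet per wait interval and that each output line corresponds to one packet; `expectedPPS` and `observedPPS` are hypothetical helper names.

```go
package main

import (
	"fmt"
	"time"
)

// expectedPPS assumes every worker sends exactly one packet per wait interval.
func expectedPPS(workers int, wait time.Duration) float64 {
	return float64(workers) / wait.Seconds()
}

// observedPPS assumes one output line per packet sent.
func observedPPS(lines int, elapsed time.Duration) float64 {
	return float64(lines) / elapsed.Seconds()
}

func main() {
	fmt.Printf("expected: %.0f pps\n", expectedPPS(2000, 5*time.Millisecond))
	fmt.Printf("observed: %.0f pps\n", observedPPS(30_000_000, 20*time.Hour))
}
```

Under these assumptions the expected rate is 2000 / 0.005 s = 400,000 pps, while 30M lines over 72,000 s is about 417 pps, roughly a factor of 1000 lower.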

Update send engine

Currently the send flow opens one raw socket associated with the prober that all workers write into. This is fast enough for current sends, however it hits a barrier where an increasing number of workers does not help and may even hurt the overall pps send rate.

To overcome this we should refactor the send engine to have one raw socket per worker. This way adding workers does not create a queue of workers waiting to write into one socket and the send rate goes back to roughly scaling by number of workers.

This may take some pretty serious reorganization but has the potential to significantly raise the limit on send rate. This would be especially useful if (for example) we have the opportunity to scan from a 10 Gbps vantage point.
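
A rough sketch of the one-socket-per-worker shape is given below. It is not the project's send engine: `sender`, `fakeSender`, and `runWorkers` are hypothetical names, and the socket is hidden behind an interface so that the example runs without root. In the scanner, the `open` function would presumably create a raw socket for each worker.

```go
package main

import (
	"fmt"
	"sync"
)

// sender abstracts a socket. Each worker owns exactly one sender, so workers
// never queue up behind a shared file descriptor.
type sender interface {
	Send(payload []byte) error
	Close() error
}

// fakeSender is a stand-in for a raw socket that accepts every packet.
type fakeSender struct{}

func (fakeSender) Send(payload []byte) error { return nil }
func (fakeSender) Close() error              { return nil }

// runWorkers opens one sender per worker, drains jobs, and returns the number
// of successful sends. Failed sends are dropped and not counted.
func runWorkers(n int, open func() (sender, error), jobs <-chan []byte) (int, error) {
	senders := make([]sender, 0, n)
	for i := 0; i < n; i++ {
		s, err := open()
		if err != nil {
			for _, prev := range senders {
				prev.Close()
			}
			return 0, fmt.Errorf("opening sender %d: %w", i, err)
		}
		senders = append(senders, s)
	}

	counts := make([]int, n) // one slot per worker, so no locking is needed
	var wg sync.WaitGroup
	for i, s := range senders {
		wg.Add(1)
		go func(i int, s sender) {
			defer wg.Done()
			defer s.Close()
			for p := range jobs {
				if err := s.Send(p); err == nil {
					counts[i]++
				}
			}
		}(i, s)
	}
	wg.Wait()

	total := 0
	for _, c := range counts {
		total += c
	}
	return total, nil
}

func main() {
	jobs := make(chan []byte, 8)
	for i := 0; i < 8; i++ {
		jobs <- []byte{byte(i)}
	}
	close(jobs)

	open := func() (sender, error) { return fakeSender{}, nil }
	total, err := runWorkers(4, open, jobs)
	fmt.Println(total, err)
}
```

Opening every sender before starting any worker keeps the failure path simple: if one socket cannot be created, the ones already opened are closed and no packets are sent.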
