I can't seem to get some of our anon relays working correctly. I think it's the sa

Anonymous DNS relays not working for some of our servers (encrypted-dns-server issue, closed)

dnscrypt commented on May 20, 2024

Anonymous DNS relays not working for some of our servers

from encrypted-dns-server.

Comments (13)

adit-s commented on May 20, 2024

I had the same problem you described, which also came up in tickets #1578 and #1346. I couldn't figure it out, and it didn't make any sense why the anonymized relaying wasn't working.
The way I got it to work was setting all of the dnscrypt parameters and anonymizing routes to what I wanted, setting force_tcp = true (this is the key), and then restarting the container. Following this procedure, the servers were anonymized and everything worked as expected. Once the connection was established and relaying was working the way I wanted, I simply changed back to force_tcp = false and restarted the container. Since then, I've restarted the container multiple times and the anonymizing has been fine over UDP.
(If, for whatever reason, the relaying stops working again, you should be able to set force_tcp = true again and it should work over TCP.)

jedisct1 commented on May 20, 2024

Unfortunately, I have no idea why this is happening. I've never had any fragmentation issues ever :(

df-cryptostorm commented on May 20, 2024

hrm, if I remove from https://github.com/DNSCrypt/dnscrypt-proxy/blob/master/dnscrypt-proxy/serversInfo.go#L450 the code

	if knownBugs.fragmentsBlocked && relay != nil && relay.Dnscrypt != nil {
		relay = nil
		if proxy.skipAnonIncompatibleResolvers {
			dlog.Infof("[%v] couldn't be reached anonymously, it will be ignored", name)
			return ServerInfo{}, errors.New("Resolver couldn't be reached anonymously")
		}
		dlog.Warnf("[%v] couldn't be reached anonymously", name)
	}

it seems to work fine with the above CS Switzerland relay to scaleway-fr, so I'm guessing fragmentsBlocked is getting set.
Looks like that happens on the line after https://github.com/DNSCrypt/dnscrypt-proxy/blob/master/dnscrypt-proxy/dnsutils.go#L332 -

option := _dnsExchange(proxy, proto, query, serverAddress, relay, 480)
option.fragmentsBlocked = true

If I fiddle with that 480, it starts working at 1440; 1439 or anything below causes the "couldn't be reached anonymously" error. The MTU for eth0 is set to the default 1500 on both servers, so that's probably not it.

After reverting to the original code, sniffing packets on our Moldova relay, which does work correctly, shows:

So fragmentation is happening there.
On our Latvian relay, which does NOT work correctly, it only shows:

So fragmentation isn't happening there. But the weird thing is that fragmentation works fine on that server. If I connect to the VPN there and do things that would cause UDP fragmentation, tcpdump -nni any '((ip[6:2] > 0) and (not ip[6] = 64))' shows it's happening correctly, nothing blocking it. Even simply connecting to the regular (non-relaying) DNSCrypt part of the encrypted-dns-server running there will cause fragmentation to happen, and that goes through just fine. So something in either dnscrypt-proxy or encrypted-dns-server is causing that relay fragment test to fail.
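
For anyone puzzled by that tcpdump expression, here is a minimal Python sketch of what it tests. It assumes a raw IPv4 header as input; the function and helper names are made up for illustration and are not part of tcpdump or dnscrypt-proxy. Bytes 6-7 of the header carry the flags and the fragment offset, and the filter keeps packets where that field is non-zero unless the first of those bytes is exactly 64 (only "Don't Fragment" set).

```python
import struct


def matches_fragment_filter(ip_header: bytes) -> bool:
    """Mimic the tcpdump test ((ip[6:2] > 0) and (not ip[6] = 64))."""
    # Bytes 6-7 hold 3 flag bits followed by the 13-bit fragment offset.
    (flags_and_offset,) = struct.unpack("!H", ip_header[6:8])
    # ip[6] = 64 (0x40): only the Don't Fragment bit is set in byte 6.
    only_df = ip_header[6] == 0x40
    return flags_and_offset > 0 and not only_df


def make_header(flags_and_offset: int) -> bytes:
    """Hypothetical helper: a 20-byte IPv4 header, zeroed except byte 0 and bytes 6-7."""
    return bytes([0x45]) + bytes(5) + struct.pack("!H", flags_and_offset) + bytes(12)


if __name__ == "__main__":
    print(matches_fragment_filter(make_header(0x4000)))  # DF only: False
    print(matches_fragment_filter(make_header(0x2000)))  # More Fragments set: True
    print(matches_fragment_filter(make_header(0x003E)))  # non-zero offset: True
    print(matches_fragment_filter(make_header(0x0000)))  # nothing set: False
```

So the filter shows the first fragment (More Fragments set) as well as the later ones (non-zero offset), and hides ordinary unfragmented packets.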

jedisct1 commented on May 20, 2024

The intent of that code is to simultaneously send packets padded to 1500 bytes (triggering fragmentation) and the same packets padded to only 480 bytes.

If the 1500-byte packets don't get a response, but the 480-byte ones do, it means that the server is functional, but blocks fragments.
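
The decision described above can be sketched in a few lines of Python. This only illustrates the logic as stated in this comment; it is not the dnscrypt-proxy implementation, and the enum and function names are invented for the example.

```python
from enum import Enum


class ProbeResult(Enum):
    OK = "large packets answered, fragments pass"
    FRAGMENTS_BLOCKED = "server works, but fragments are blocked"
    UNREACHABLE = "no response to either probe"


def classify(large_answered: bool, small_answered: bool) -> ProbeResult:
    """Combine the outcome of the 1500-byte and 480-byte padded probes."""
    if large_answered:
        # The fragmented query made it through and was answered.
        return ProbeResult.OK
    if small_answered:
        # Only the small, unfragmented query got a reply.
        return ProbeResult.FRAGMENTS_BLOCKED
    return ProbeResult.UNREACHABLE


if __name__ == "__main__":
    print(classify(large_answered=False, small_answered=True).value)
```

Note that a slow or lossy path can look the same as a blocked one here: if the large probe times out for any reason while the small one succeeds, the result is still "fragments blocked".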

Is the server detected as usable at all when you raise 480 to 1480 or more?

Maybe that parallel check doesn't work as intended. I need to take a closer look at it.

With these checks disabled, can you successfully get large responses? You can check, for example, with test-tcp.dnscrypt.info after having bumped up the max UDP packet size in EDNS.

df-cryptostorm commented on May 20, 2024

Yes, the server is detected as useable with 1480 and up (tested 1480-1499). One thing I noticed is that dnscrypt-proxy starts up almost instantly with a server that's working correctly. The timestamp for startup, "[2021-03-29 16:14:31] [NOTICE] Anonymizing queries for [scaleway-fr]", etc. is all within the same second by the time it reaches "dnscrypt-proxy is ready". Whenever that padding is changed to something below 1440 (for the broken relays that require 1440+), there's at least a 5 second delay before it fails, or if set to 1440 and up, 5 seconds by the time it succeeds. Maybe a timer or timeout somewhere is the culprit?

And yes, large responses work too. I was testing with balancer.cstorm.is since it's got 46 IPs, but test-tcp.dnscrypt.info also resolves. With the padding still set to 1499, dig @127.0.0.1 test-tcp.dnscrypt.info switches over to TCP for the response. This is DiG 9.16.13 btw, which has had a lot of EDNS improvements over the years.
Here's what dig +short rs.dns-oarc.net TXT @127.0.0.1 returns:

rst.x1188.rs.dns-oarc.net.
rst.x1198.x1188.rs.dns-oarc.net.
rst.x1204.x1198.x1188.rs.dns-oarc.net.
"212.47.228.136 DNS reply size limit is at least 1204"
"212.47.228.136 sent EDNS buffer size 1232"

That's with the relay at sdns://gRMxMDkuMjQ4LjE0OS4xMzM6NDQz (109.248.149.133:443, Latvian server, doesn't work with padding below 1440).

jedisct1 commented on May 20, 2024

Testing with TCP is not really useful here.

With dig, I think +bufsize=4096 +notcp is what you should use to avoid TCP.

dig is always running behind, but this extension has been around for 12+ years :)
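
For reference, the buffer size that +bufsize sets is the UDP payload size advertised in the EDNS0 OPT record of the query. Below is a rough, stdlib-only Python sketch of such a query on the wire; the function name, the fixed query ID, and the defaults are assumptions made for illustration, and nothing here is taken from dig or dnscrypt-proxy.

```python
import struct


def build_query(name: str, udp_size: int = 4096, qtype: int = 1,
                query_id: int = 0x1234) -> bytes:
    """Build a DNS query carrying an EDNS0 OPT record (illustrative only)."""
    # Header: ID, flags (RD set), 1 question, 0 answers, 0 authority,
    # 1 additional record (the OPT pseudo-record).
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 1)
    labels = name.rstrip(".").split(".")
    qname = b"".join(bytes([len(l)]) + l.encode("ascii") for l in labels) + b"\x00"
    question = qname + struct.pack("!HH", qtype, 1)  # QTYPE, QCLASS=IN
    # OPT record: root name, TYPE 41, CLASS = advertised UDP payload size,
    # TTL = extended RCODE/version/flags (all zero), RDLENGTH = 0.
    opt = b"\x00" + struct.pack("!HHIH", 41, udp_size, 0, 0)
    return header + question + opt


if __name__ == "__main__":
    query = build_query("test-tcp.dnscrypt.info")
    print(len(query), query.hex())
```

Whether the answer then actually arrives over UDP still depends on everything between the client and the resolver, which is the point of this thread.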

df-cryptostorm commented on May 20, 2024

weird, even with dig +bufsize=4096 +notcp balancer.cryptostorm.pw @127.0.0.1 it still switches over to TCP:

18:44:19.446603 eth0  Out IP 163.172.15.73.50339 > 109.248.149.133.443: UDP, length 604
18:44:19.566126 eth0  In  IP 109.248.149.133.443 > 163.172.15.73.50339: UDP, length 304
18:44:19.566326 eth0  Out IP 163.172.15.73.40774 > 109.248.149.133.443: Flags [S], seq 3679341395, win 64240, options [mss 1460,sackOK,TS val 1152258403 ecr 0,nop,wscale 7], length 0
18:44:19.601262 eth0  In  IP 109.248.149.133.443 > 163.172.15.73.40774: Flags [S.], seq 2682753559, ack 3679341396, win 29200, options [mss 1460,nop,wscale 9], length 0

etc. etc. The response still comes through too. Maybe 212.47.228.136 (scaleway-fr) is forcing TCP somehow?
And yea, EDNS has been supported for more than a decade, but some improvements have been made as recently as last year.

jedisct1 commented on May 20, 2024

I thought you sent queries directly to a resolver.

Relays will switch to TCP if UDP fails, even though responses to clients will eventually be over UDP.

> And yea, EDNS has been supported for more than a decade, but some improvements have been made as recently as last year.

Absolutely zero improvements, besides adding extensions unrelated to UDP packet size :)

df-cryptostorm commented on May 20, 2024

> I thought you sent queries directly to a resolver.

I was sending to dnscrypt-proxy running on localhost on the system I'm testing with (163.172.15.73), which is relaying through the encrypted-dns-server on 109.248.149.133:443 to scaleway-fr. I assume that the upstream resolver the encrypted-dns-server uses isn't relevant, since with relaying it should just be forwarding to scaleway-fr? Anyways, the DNSCrypt server part of the encrypted-dns-server on 109.248.149.133 seems to be working correctly, as is the upstream (PowerDNS recursor on 109.248.149.133:53). The only thing that isn't working is relaying, unless I set that fragmentation to 1440 or higher.

> Absolutely zero improvements, besides adding extensions unrelated to UDP packet size :)

heh, well the bind team calls them improvements, not sure if anyone else would though.

df-cryptostorm commented on May 20, 2024

Just leaving an update here. I migrated our old CentOS Switzerland server to a new Arch Linux setup we've been using on any new servers and the relaying issue went away. So it's definitely something on our end. No clue what it might be though since so many things get changed in that kind of migration.

jedisct1 commented on May 20, 2024

Awesome!

The CentOS project doesn't exist any more, so the migration was worth it anyway :)

df-cryptostorm commented on May 20, 2024

Just wanted to leave one last update here, just in case somebody else runs into this issue later with some other data center.

I switched everything over to Arch Linux, but 3 or so servers still suffered from this issue.
Some data centers do block either incoming or outgoing (or both) IP fragmentation for "security reasons".
My guess is that they still have some forgotten rule in some firewall for that ancient teardrop attack.

Anyways, I used this little python script to figure out if IP fragmentation packets were being blocked, and along with tcpdump, which direction the block was occurring:

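#!/usr/bin/env python3
# Send a deliberately fragmented UDP packet to <remote ip>:443 (raw sockets, so run as root).
import sys
from scapy.all import IP, UDP, Raw, fragment, send

dip = sys.argv[1]
# 996 bytes of payload, large enough to be split into several fragments
payload = b"A" * 496 + b"B" * 500
packet = IP(dst=dip, id=53) / UDP(sport=1500, dport=443) / Raw(load=payload)
# Split the datagram into IP fragments of roughly 500 bytes each
frags = fragment(packet, fragsize=500)
for frag in frags:  # not named "fragment", to avoid shadowing scapy's function
    frag.show()
    send(frag)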

On the affected server I'd run ./that.py [remote ip] then on [remote ip] I'd run tcpdump to see if the IP frag packets came in.
If not, the server was blocking outgoing fragmentation.
Then I'd do the reverse, ./that.py running on [remote ip] and tcpdump on the affected server.
If that failed, the server's blocking incoming fragmentation.
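
The two-way test above boils down to a small decision table. Here it is as a Python sketch, purely as a summary of the procedure; the function name and the wording of the results are mine, not output from any tool.

```python
def diagnose(outbound_fragments_seen: bool, inbound_fragments_seen: bool) -> str:
    """Interpret the two tcpdump observations from the test described above.

    outbound_fragments_seen: script run on the affected server, fragments
        showed up in tcpdump on the remote host.
    inbound_fragments_seen: script run on the remote host, fragments
        showed up in tcpdump on the affected server.
    """
    if outbound_fragments_seen and inbound_fragments_seen:
        return "no fragment blocking detected"
    if not outbound_fragments_seen and not inbound_fragments_seen:
        return "fragments blocked in both directions"
    if not outbound_fragments_seen:
        return "outgoing fragments blocked"
    return "incoming fragments blocked"


if __name__ == "__main__":
    print(diagnose(outbound_fragments_seen=False, inbound_fragments_seen=True))
```

Keep in mind the remote host has to be known-good for this to mean anything: if its own data center also drops fragments, the result points at the wrong end.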

After showing the results of this test to the affected server's DCs, one said:
"It was exactly as your client suspected; we do block fragments for security reasons. We have just started permitting them."
Another said:
"I just removed some port security from the Top Of Rack switch port connected to your server. I think this might solve the problem of the fragmentation packets being blocked."

So if anyone else runs an encrypted-dns-server with Anonymized relaying enabled and dnscrypt-proxy is complaining about something blocking fragmented packets, check with the DC. Use that python script above to show them proof that it's happening and maybe they can remove the block.

I'm still waiting on a response from our Latvia and South Korea DCs, so in the pull request I just submitted that has the most recent resolver/relay IPs, those two aren't included in the relay list. Their DNSCrypt resolvers work fine, but the Anonymized DNS relays would only work with force_tcp = true because of the IP frag blocking.

jedisct1 commented on May 20, 2024

Wow, thanks a lot for the followup!

This is very useful information and something that should probably be added to the main documentation.

That script can also come in handy!

Thanks again, and also for working with these providers to have that issue solved!
