Here is some info from @Noki's router. It's an x86-64 virtual machine.
Fri Aug 11 14:12:07 2023 daemon.err td-client: [b.bbb-vpn.berlin.freifunk.net:8943] Connection lost.
Fri Aug 11 14:12:07 2023 daemon.info td-client: Performing broker selection...
Fri Aug 11 14:12:07 2023 daemon.info olsrd[23577]: Removing interface bbbdigger
Fri Aug 11 14:12:07 2023 daemon.notice netifd: bbbdigger (23281): udhcpc: received SIGTERM
Fri Aug 11 14:12:07 2023 daemon.notice netifd: bbbdigger (23281): udhcpc: unicasting a release of 10.36.193.52 to 10.36.193.6
Fri Aug 11 14:12:07 2023 daemon.notice netifd: bbbdigger (23281): udhcpc: sending release
Fri Aug 11 14:12:07 2023 daemon.notice netifd: bbbdigger (23281): udhcpc: connect: Network unreachable
Fri Aug 11 14:12:07 2023 daemon.notice netifd: bbbdigger (23281): udhcpc: entering released state
Fri Aug 11 14:12:07 2023 daemon.notice netifd: bbbdigger (23281): Command failed: Permission denied
Fri Aug 11 14:12:07 2023 daemon.notice netifd: Interface 'bbbdigger' is now down
Fri Aug 11 14:12:07 2023 daemon.notice netifd: Interface 'bbbdigger' is disabled
Fri Aug 11 14:12:08 2023 daemon.debug td-client: [b.bbb-vpn.berlin.freifunk.net:8943] Broker usage: 63
Fri Aug 11 14:12:08 2023 daemon.info td-client: Selected b.bbb-vpn.berlin.freifunk.net:8943 as the best broker.
Fri Aug 11 14:12:08 2023 daemon.info olsrd: /etc/init.d/olsrd: olsrd_setup_smartgw_rules() Notice: Inserting firewall rules for SmartGateway
Fri Aug 11 14:12:08 2023 daemon.info olsrd[23577]: Tunnel tnl_600b1f0a removed, to -
Fri Aug 11 14:12:08 2023 daemon.info olsrd[23577]: Tunnel tnl_a07f1f0a added, to 10.31.127.160
Fri Aug 11 14:12:08 2023 daemon.info olsrd[23577]: Tunnel tnl_a07f1f0a removed, to -
Fri Aug 11 14:12:08 2023 daemon.info olsrd[23577]: Tunnel tnl_a0821f0a added, to 10.31.130.160
Fri Aug 11 14:12:08 2023 daemon.info olsrd[23577]: Tunnel tnl_a0821f0a removed, to -
Fri Aug 11 14:12:08 2023 daemon.info olsrd[23577]: Writing '1' (was 0) to /proc/sys/net/ipv4/conf/all/send_redirects
Fri Aug 11 14:12:08 2023 daemon.info olsrd[23577]: Writing '1' (was 0) to /proc/sys/net/ipv4/conf/eth2/send_redirects
Fri Aug 11 14:12:08 2023 daemon.info olsrd[23577]: olsr.org - pre-0.9.9-git_0000000-hash_91d5ae30edfe8ee23769434a3154dcd7 stopped
Fri Aug 11 14:12:08 2023 daemon.info olsrd[24105]: Writing '1' (was 1) to /proc/sys/net/ipv4/ip_forward
Fri Aug 11 14:12:08 2023 daemon.info olsrd[24105]: Writing '0' (was 0) to /proc/sys/net/ipv4/conf/tunl0/rp_filter
Fri Aug 11 14:12:08 2023 daemon.info olsrd[24105]: Writing '0' (was 1) to /proc/sys/net/ipv4/conf/all/send_redirects
Fri Aug 11 14:12:08 2023 daemon.info olsrd[24105]: Writing '0' (was 0) to /proc/sys/net/ipv4/conf/all/rp_filter
Fri Aug 11 14:12:08 2023 daemon.info olsrd[24105]: Writing '0' (was 1) to /proc/sys/net/ipv4/conf/eth2/send_redirects
Fri Aug 11 14:12:08 2023 daemon.info olsrd[24105]: Writing '0' (was 0) to /proc/sys/net/ipv4/conf/eth2/rp_filter
Fri Aug 11 14:12:08 2023 daemon.info olsrd[24105]: Adding interface eth2
Fri Aug 11 14:12:08 2023 daemon.info olsrd[24105]: New main address: 10.31.21.185
Fri Aug 11 14:12:08 2023 daemon.info olsrd[24105]: olsr.org - pre-0.9.9-git_0000000-hash_91d5ae30edfe8ee23769434a3154dcd7 successfully started
Fri Aug 11 14:12:09 2023 daemon.info td-client: [b.bbb-vpn.berlin.freifunk.net:8943] Tunnel successfully established.
Fri Aug 11 14:12:09 2023 daemon.notice netifd: Interface 'bbbdigger' is enabled
Fri Aug 11 14:12:09 2023 daemon.notice netifd: Network device 'bbbdigger' link is up
Fri Aug 11 14:12:09 2023 daemon.notice netifd: Interface 'bbbdigger' has link connectivity
Fri Aug 11 14:12:09 2023 daemon.notice netifd: Interface 'bbbdigger' is setting up now
Fri Aug 11 14:12:09 2023 daemon.notice netifd: bbbdigger (24271): udhcpc: started, v1.33.2
Fri Aug 11 14:12:09 2023 daemon.notice netifd: bbbdigger (24271): udhcpc: sending discover
Fri Aug 11 14:12:09 2023 daemon.notice netifd: bbbdigger (24271): udhcpc: sending select for 10.36.193.52
Fri Aug 11 14:12:09 2023 daemon.notice netifd: bbbdigger (24271): udhcpc: lease of 10.36.193.52 obtained, lease time 1800
Fri Aug 11 14:12:09 2023 daemon.notice netifd: Interface 'bbbdigger' is now up
Fri Aug 11 14:12:09 2023 user.notice firewall: Reloading firewall due to ifup of bbbdigger (bbbdigger)
Fri Aug 11 14:12:09 2023 user.notice policyrouting: Add route: ip route add 10.36.193.0/24 dev bbbdigger table localnets
Fri Aug 11 14:12:09 2023 user.notice policyrouting: Add route: ip route add 10.36.193.0/24 dev bbbdigger table olsr
Fri Aug 11 14:12:09 2023 user.notice policyrouting: Use mesh gateway for interface bbbdigger (IPv4)
Fri Aug 11 14:12:10 2023 daemon.info olsrd: /etc/init.d/olsrd: olsrd_setup_smartgw_rules() Notice: Inserting firewall rules for SmartGateway
Fri Aug 11 14:12:10 2023 daemon.info olsrd[24105]: Writing '1' (was 0) to /proc/sys/net/ipv4/conf/all/send_redirects
Fri Aug 11 14:12:10 2023 daemon.info olsrd[24105]: Writing '1' (was 0) to /proc/sys/net/ipv4/conf/eth2/send_redirects
Fri Aug 11 14:12:10 2023 daemon.info olsrd[24105]: olsr.org - pre-0.9.9-git_0000000-hash_91d5ae30edfe8ee23769434a3154dcd7 stopped
Fri Aug 11 14:12:10 2023 daemon.info olsrd[24533]: Writing '1' (was 1) to /proc/sys/net/ipv4/ip_forward
Fri Aug 11 14:12:10 2023 daemon.info olsrd[24533]: Writing '0' (was 0) to /proc/sys/net/ipv4/conf/tunl0/rp_filter
Fri Aug 11 14:12:10 2023 daemon.info olsrd[24533]: Writing '0' (was 1) to /proc/sys/net/ipv4/conf/all/send_redirects
Fri Aug 11 14:12:10 2023 daemon.info olsrd[24533]: Writing '0' (was 0) to /proc/sys/net/ipv4/conf/all/rp_filter
Fri Aug 11 14:12:10 2023 daemon.info olsrd[24533]: Writing '0' (was 1) to /proc/sys/net/ipv4/conf/bbbdigger/send_redirects
Fri Aug 11 14:12:10 2023 daemon.info olsrd[24533]: Writing '0' (was 0) to /proc/sys/net/ipv4/conf/bbbdigger/rp_filter
Fri Aug 11 14:12:10 2023 daemon.info olsrd[24533]: Adding interface bbbdigger
Fri Aug 11 14:12:10 2023 daemon.info olsrd[24533]: New main address: 10.36.193.52
Fri Aug 11 14:12:10 2023 daemon.info olsrd[24533]: Writing '0' (was 1) to /proc/sys/net/ipv4/conf/eth2/send_redirects
Fri Aug 11 14:12:10 2023 daemon.info olsrd[24533]: Writing '0' (was 0) to /proc/sys/net/ipv4/conf/eth2/rp_filter
Fri Aug 11 14:12:10 2023 daemon.info olsrd[24533]: Adding interface eth2
Fri Aug 11 14:12:10 2023 daemon.info olsrd[24533]: Received netlink error code File exists (-17)
Fri Aug 11 14:12:10 2023 daemon.err olsrd[24533]: Error on inserting policy rule aimed to activate RtTable 112!
Fri Aug 11 14:12:10 2023 daemon.info olsrd[24533]: olsr.org - pre-0.9.9-git_0000000-hash_91d5ae30edfe8ee23769434a3154dcd7 successfully started
Fri Aug 11 14:12:14 2023 daemon.err olsrd[24546]: olsrd exit: main: Bad configuration
Fri Aug 11 14:12:17 2023 daemon.info olsrd[24533]: Tunnel tnl_a0821f0a added, to 10.31.130.160
Fri Aug 11 14:12:17 2023 daemon.info olsrd[24533]: Tunnel tnl_a0821f0a removed, to -
Fri Aug 11 14:12:17 2023 daemon.info olsrd[24533]: Tunnel tnl_38531f0a added, to 10.31.83.56
Fri Aug 11 14:12:29 2023 daemon.err olsrd[24670]: olsrd exit: main: Bad configuration
Fri Aug 11 14:12:44 2023 daemon.err olsrd[24673]: olsrd exit: main: Bad configuration
Fri Aug 11 14:13:10 2023 daemon.warn td-client: [b.bbb-vpn.berlin.freifunk.net:8943] Tunnel has timed out, closing down interface.
Fri Aug 11 14:13:10 2023 daemon.notice netifd: Network device 'bbbdigger' link is down
Fri Aug 11 14:13:10 2023 daemon.notice netifd: Interface 'bbbdigger' has link connectivity loss
Fri Aug 11 14:13:10 2023 daemon.err td-client: [b.bbb-vpn.berlin.freifunk.net:8943] Connection lost.
and @kaechele suggested gathering the following info:
root@noki-test:~# cat /proc/net/nf_conntrack
ipv4 2 tcp 6 37 TIME_WAIT src=127.0.0.1 dst=127.0.0.1 sport=57290 dport=2006 packets=8 bytes=431 src=127.0.0.1 dst=127.0.0.1 sport=2006 dport=57290 packets=8 bytes=135529 [ASSURED] mark=0 zone=0 use=2
ipv4 2 udp 17 24 src=10.31.212.38 dst=149.112.112.112 sport=42202 dport=53 packets=1 bytes=71 [UNREPLIED] src=149.112.112.112 dst=172.31.226.250 sport=53 dport=42202 packets=0 bytes=0 mark=0 zone=0 use=2
ipv4 2 udp 17 24 src=10.31.212.38 dst=149.112.112.112 sport=42202 dport=53 packets=1 bytes=71 [UNREPLIED] src=149.112.112.112 dst=172.31.226.250 sport=53 dport=42202 packets=0 bytes=0 mark=0 zone=0 use=2
ipv4 2 tcp 6 73 ESTABLISHED src=10.31.212.199 dst=20.93.28.56 sport=56461 dport=443 packets=10 bytes=410 [UNREPLIED] src=20.93.28.56 dst=172.31.226.250 sport=443 dport=56461 packets=0 bytes=0 mark=0 zone=0 use=2
ipv4 2 tcp 6 29 LAST_ACK src=10.31.212.199 dst=172.217.18.10 sport=56553 dport=443 packets=7 bytes=581 [UNREPLIED] src=172.217.18.10 dst=172.31.226.250 sport=443 dport=56553 packets=0 bytes=0 mark=0 zone=0 use=2
ipv4 2 udp 17 54 src=10.31.212.38 dst=149.112.112.112 sport=42162 dport=53 packets=2 bytes=124 [UNREPLIED] src=149.112.112.112 dst=172.31.226.250 sport=53 dport=42162 packets=0 bytes=0 mark=0 zone=0 use=2
ipv4 2 udp 17 13 src=10.31.212.38 dst=1.0.0.1 sport=46972 dport=53 packets=1 bytes=61 [UNREPLIED] src=1.0.0.1 dst=172.31.226.250 sport=53 dport=46972 packets=0 bytes=0 mark=0 zone=0 use=2
ipv4 2 udp 17 24 src=10.31.212.38 dst=1.0.0.1 sport=53905 dport=53 packets=1 bytes=65 [UNREPLIED] src=1.0.0.1 dst=172.31.226.250 sport=53 dport=53905 packets=0 bytes=0 mark=0 zone=0 use=2
ipv4 2 tcp 6 27 LAST_ACK src=10.31.212.199 dst=172.217.18.10 sport=56556 dport=443 packets=7 bytes=550 [UNREPLIED] src=172.217.18.10 dst=172.31.226.250 sport=443 dport=56556 packets=0 bytes=0 mark=0 zone=0 use=2
ipv4 2 udp 17 38 src=192.168.189.3 dst=194.150.168.168 sport=51147 dport=53 packets=1 bytes=75 src=194.150.168.168 dst=192.168.189.3 sport=53 dport=51147 packets=1 bytes=120 mark=0 zone=0 use=2
ipv4 2 udp 17 15 src=10.31.212.199 dst=142.250.185.67 sport=49784 dport=443 packets=3 bytes=3834 [UNREPLIED] src=142.250.185.67 dst=172.31.226.250 sport=443 dport=49784 packets=0 bytes=0 mark=0 zone=0 use=2
ipv4 2 udp 17 54 src=10.31.212.38 dst=9.9.9.9 sport=42162 dport=53 packets=2 bytes=124 [UNREPLIED] src=9.9.9.9 dst=172.31.226.250 sport=53 dport=42162 packets=0 bytes=0 mark=0 zone=0 use=2
ipv4 2 tcp 6 14 SYN_SENT src=10.31.212.199 dst=40.99.148.82 sport=56629 dport=443 packets=1 bytes=60 [UNREPLIED] src=40.99.148.82 dst=172.31.226.250 sport=443 dport=56629 packets=0 bytes=0 mark=0 zone=0 use=2
ipv4 2 udp 17 59 src=10.31.212.38 dst=255.255.255.255 sport=698 dport=698 packets=10665 bytes=14656064 [UNREPLIED] src=255.255.255.255 dst=10.31.212.38 sport=698 dport=698 packets=0 bytes=0 mark=0 zone=0 use=2
ipv4 2 udp 17 11 src=172.31.226.250 dst=172.31.224.1 sport=68 dport=67 packets=1 bytes=328 src=172.31.224.1 dst=172.31.226.250 sport=67 dport=68 packets=1 bytes=328 mark=0 zone=0 use=2
ipv4 2 udp 17 38 src=192.168.189.3 dst=46.182.19.48 sport=51147 dport=53 packets=1 bytes=75 src=46.182.19.48 dst=192.168.189.3 sport=53 dport=51147 packets=1 bytes=120 mark=0 zone=0 use=2
ipv4 2 tcp 6 28 SYN_SENT src=10.31.212.199 dst=2.19.213.33 sport=56630 dport=443 packets=1 bytes=60 [UNREPLIED] src=2.19.213.33 dst=172.31.226.250 sport=443 dport=56630 packets=0 bytes=0 mark=0 zone=0 use=2
ipv4 2 udp 17 13 src=10.31.212.38 dst=1.1.1.1 sport=46972 dport=53 packets=2 bytes=122 [UNREPLIED] src=1.1.1.1 dst=172.31.226.250 sport=53 dport=46972 packets=0 bytes=0 mark=0 zone=0 use=2
ipv4 2 udp 17 13 src=10.31.212.38 dst=1.0.0.1 sport=49873 dport=53 packets=1 bytes=61 [UNREPLIED] src=1.0.0.1 dst=172.31.226.250 sport=53 dport=49873 packets=0 bytes=0 mark=0 zone=0 use=2
ipv4 2 udp 17 30 src=10.31.212.199 dst=142.250.180.106 sport=54932 dport=443 packets=8 bytes=8100 [UNREPLIED] src=142.250.180.106 dst=172.31.226.250 sport=443 dport=54932 packets=0 bytes=0 mark=0 zone=0 use=2
ipv4 2 udp 17 33 src=192.168.189.3 dst=77.87.49.11 sport=41975 dport=8943 packets=93 bytes=59362 src=77.87.49.11 dst=192.168.189.3 sport=8943 dport=41975 packets=40 bytes=46366 [ASSURED] mark=0 zone=0 use=2
ipv4 2 udp 17 24 src=10.31.212.38 dst=9.9.9.9 sport=42202 dport=53 packets=1 bytes=71 [UNREPLIED] src=9.9.9.9 dst=172.31.226.250 sport=53 dport=42202 packets=0 bytes=0 mark=0 zone=0 use=2
ipv4 2 udp 17 120 src=192.168.189.3 dst=77.87.49.11 sport=46999 dport=8943 packets=370 bytes=447620 src=77.87.49.11 dst=192.168.189.3 sport=8943 dport=46999 packets=171 bytes=193226 [ASSURED] mark=0 zone=0 use=2
ipv4 2 udp 17 13 src=10.31.212.38 dst=149.112.112.112 sport=49873 dport=53 packets=1 bytes=61 [UNREPLIED] src=149.112.112.112 dst=172.31.226.250 sport=53 dport=49873 packets=0 bytes=0 mark=0 zone=0 use=2
ipv4 2 tcp 6 49 ESTABLISHED src=10.31.212.199 dst=52.123.144.67 sport=56601 dport=443 packets=10 bytes=5822 [UNREPLIED] src=52.123.144.67 dst=172.31.226.250 sport=443 dport=56601 packets=0 bytes=0 mark=0 zone=0 use=2
ipv4 2 udp 17 178 src=192.168.189.3 dst=77.87.49.11 sport=47108 dport=8943 packets=91 bytes=59074 src=77.87.49.11 dst=192.168.189.3 sport=8943 dport=47108 packets=107 bytes=116418 [ASSURED] mark=0 zone=0 use=2
ipv4 2 udp 17 24 src=10.31.212.38 dst=1.1.1.1 sport=53905 dport=53 packets=1 bytes=65 [UNREPLIED] src=1.1.1.1 dst=172.31.226.250 sport=53 dport=53905 packets=0 bytes=0 mark=0 zone=0 use=2
ipv4 2 udp 17 34 src=10.31.212.199 dst=142.250.180.106 sport=56187 dport=443 packets=6 bytes=6497 [UNREPLIED] src=142.250.180.106 dst=172.31.226.250 sport=443 dport=56187 packets=0 bytes=0 mark=0 zone=0 use=2
ipv4 2 udp 17 30 src=10.31.212.199 dst=142.250.180.99 sport=54138 dport=443 packets=5 bytes=6390 [UNREPLIED] src=142.250.180.99 dst=172.31.226.250 sport=443 dport=54138 packets=0 bytes=0 mark=0 zone=0 use=2
ipv4 2 udp 17 29 src=10.31.212.199 dst=142.251.143.46 sport=50077 dport=443 packets=5 bytes=6390 [UNREPLIED] src=142.251.143.46 dst=172.31.226.250 sport=443 dport=50077 packets=0 bytes=0 mark=0 zone=0 use=2
ipv4 2 udp 17 59 src=10.31.21.185 dst=255.255.255.255 sport=698 dport=698 packets=1266 bytes=117368 [UNREPLIED] src=255.255.255.255 dst=10.31.21.185 sport=698 dport=698 packets=0 bytes=0 mark=0 zone=0 use=2
ipv4 2 udp 17 36 src=10.31.215.33 dst=77.87.50.8 sport=44899 dport=25826 packets=250 bytes=335807 [UNREPLIED] src=77.87.50.8 dst=172.31.226.250 sport=25826 dport=44899 packets=0 bytes=0 mark=0 zone=0 use=2
ipv4 2 udp 17 58 src=10.31.212.38 dst=1.1.1.1 sport=56122 dport=53 packets=1 bytes=70 [UNREPLIED] src=1.1.1.1 dst=172.31.226.250 sport=53 dport=56122 packets=0 bytes=0 mark=0 zone=0 use=2
ipv4 2 tcp 6 56 SYN_SENT src=10.31.212.199 dst=52.98.240.146 sport=56634 dport=443 packets=1 bytes=60 [UNREPLIED] src=52.98.240.146 dst=172.31.226.250 sport=443 dport=56634 packets=0 bytes=0 mark=0 zone=0 use=2
ipv4 2 udp 17 58 src=10.31.212.38 dst=149.112.112.112 sport=56122 dport=53 packets=1 bytes=70 [UNREPLIED] src=149.112.112.112 dst=172.31.226.250 sport=53 dport=56122 packets=0 bytes=0 mark=0 zone=0 use=2
ipv4 2 udp 17 2 src=10.36.193.6 dst=10.36.193.52 sport=67 dport=68 packets=2 bytes=656 [UNREPLIED] src=10.36.193.52 dst=172.31.226.250 sport=68 dport=67 packets=0 bytes=0 mark=0 zone=0 use=2
ipv4 2 icmp 1 28 src=192.168.189.3 dst=192.36.148.17 type=8 code=0 id=4036 packets=213 bytes=5964 src=192.36.148.17 dst=192.168.189.3 type=0 code=0 id=4036 packets=213 bytes=5964 mark=0 zone=0 use=2
ipv4 2 udp 17 38 src=192.168.189.3 dst=9.9.9.10 sport=51147 dport=53 packets=1 bytes=75 src=9.9.9.10 dst=192.168.189.3 sport=53 dport=51147 packets=1 bytes=120 mark=0 zone=0 use=2
ipv4 2 udp 17 24 src=10.31.212.38 dst=9.9.9.9 sport=53905 dport=53 packets=1 bytes=65 [UNREPLIED] src=9.9.9.9 dst=172.31.226.250 sport=53 dport=53905 packets=0 bytes=0 mark=0 zone=0 use=2
ipv6 10 udp 17 38 src=0000:0000:0000:0000:0000:0000:0000:0001 dst=0000:0000:0000:0000:0000:0000:0000:0001 sport=36240 dport=53 packets=1 bytes=95 src=0000:0000:0000:0000:0000:0000:0000:0001 dst=0000:0000:0000:0000:0000:0000:0000:0001 sport=53 dport=36240 packets=1 bytes=140 mark=0 zone=0 use=2
ipv4 2 udp 17 22 src=10.31.212.38 dst=9.9.9.9 sport=39539 dport=53 packets=1 bytes=66 [UNREPLIED] src=9.9.9.9 dst=172.31.226.250 sport=53 dport=39539 packets=0 bytes=0 mark=0 zone=0 use=2
ipv4 2 icmp 1 4 src=172.31.226.250 dst=46.182.19.48 type=8 code=0 id=26379 packets=1 bytes=84 src=46.182.19.48 dst=172.31.226.250 type=0 code=0 id=26379 packets=1 bytes=84 mark=0 zone=0 use=2
ipv4 2 udp 17 38 src=192.168.189.3 dst=80.67.169.40 sport=51147 dport=53 packets=1 bytes=75 src=80.67.169.40 dst=192.168.189.3 sport=53 dport=51147 packets=1 bytes=120 mark=0 zone=0 use=2
ipv4 2 udp 17 24 src=10.31.212.38 dst=1.1.1.1 sport=42202 dport=53 packets=1 bytes=71 [UNREPLIED] src=1.1.1.1 dst=172.31.226.250 sport=53 dport=42202 packets=0 bytes=0 mark=0 zone=0 use=2
ipv4 2 tcp 6 67 TIME_WAIT src=127.0.0.1 dst=127.0.0.1 sport=58386 dport=2006 packets=5 bytes=275 src=127.0.0.1 dst=127.0.0.1 sport=2006 dport=58386 packets=5 bytes=14805 [ASSURED] mark=0 zone=0 use=2
ipv4 2 udp 17 54 src=10.31.212.38 dst=1.1.1.1 sport=42162 dport=53 packets=2 bytes=124 [UNREPLIED] src=1.1.1.1 dst=172.31.226.250 sport=53 dport=42162 packets=0 bytes=0 mark=0 zone=0 use=2
ipv4 2 udp 17 1 src=127.0.0.1 dst=127.0.0.1 sport=35993 dport=53 packets=1 bytes=75 src=127.0.0.1 dst=127.0.0.1 sport=53 dport=35993 packets=1 bytes=91 mark=0 zone=0 use=2
ipv4 2 udp 17 13 src=10.31.212.38 dst=1.1.1.1 sport=49873 dport=53 packets=2 bytes=122 [UNREPLIED] src=1.1.1.1 dst=172.31.226.250 sport=53 dport=49873 packets=0 bytes=0 mark=0 zone=0 use=2
ipv4 2 udp 17 37 src=192.168.189.3 dst=77.87.50.8 sport=44286 dport=25826 packets=88 bytes=118788 [UNREPLIED] src=77.87.50.8 dst=192.168.189.3 sport=25826 dport=44286 packets=0 bytes=0 mark=0 zone=0 use=2
ipv4 2 udp 17 58 src=10.31.212.38 dst=9.9.9.9 sport=56122 dport=53 packets=1 bytes=70 [UNREPLIED] src=9.9.9.9 dst=172.31.226.250 sport=53 dport=56122 packets=0 bytes=0 mark=0 zone=0 use=2
ipv4 2 udp 17 22 src=10.31.212.38 dst=149.112.112.112 sport=39539 dport=53 packets=1 bytes=66 [UNREPLIED] src=149.112.112.112 dst=172.31.226.250 sport=53 dport=39539 packets=0 bytes=0 mark=0 zone=0 use=2
ipv4 2 udp 17 19 src=10.36.193.6 dst=255.255.255.255 sport=698 dport=698 packets=253 bytes=335244 [UNREPLIED] src=255.255.255.255 dst=10.36.193.6 sport=698 dport=698 packets=0 bytes=0 mark=0 zone=0 use=2
ipv4 2 tcp 6 300 ESTABLISHED src=10.31.212.38 dst=10.31.21.185 sport=37866 dport=22 packets=526 bytes=34882 src=10.31.21.185 dst=10.31.212.38 sport=22 dport=37866 packets=862 bytes=350958 [ASSURED] mark=0 zone=0 use=4
ipv4 2 udp 17 13 src=10.31.212.38 dst=9.9.9.9 sport=49873 dport=53 packets=1 bytes=61 [UNREPLIED] src=9.9.9.9 dst=172.31.226.250 sport=53 dport=49873 packets=0 bytes=0 mark=0 zone=0 use=2
ipv4 2 tcp 6 35 SYN_SENT src=10.31.212.199 dst=52.98.240.114 sport=56631 dport=443 packets=1 bytes=60 [UNREPLIED] src=52.98.240.114 dst=172.31.226.250 sport=443 dport=56631 packets=0 bytes=0 mark=0 zone=0 use=2
ipv4 2 udp 17 39 src=192.168.189.3 dst=149.112.112.10 sport=51147 dport=53 packets=1 bytes=75 src=149.112.112.10 dst=192.168.189.3 sport=53 dport=51147 packets=1 bytes=120 mark=0 zone=0 use=2
ipv4 2 icmp 1 25 src=10.31.212.2 dst=1.1.1.1 type=8 code=0 id=65217 packets=164 bytes=13776 src=1.1.1.1 dst=172.31.226.250 type=0 code=0 id=65217 packets=164 bytes=13776 mark=0 zone=0 use=2
ipv4 2 udp 17 24 src=10.31.212.38 dst=1.0.0.1 sport=42202 dport=53 packets=1 bytes=71 [UNREPLIED] src=1.0.0.1 dst=172.31.226.250 sport=53 dport=42202 packets=0 bytes=0 mark=0 zone=0 use=2
ipv4 2 udp 17 13 src=10.31.212.38 dst=149.112.112.112 sport=46972 dport=53 packets=1 bytes=61 [UNREPLIED] src=149.112.112.112 dst=172.31.226.250 sport=53 dport=46972 packets=0 bytes=0 mark=0 zone=0 use=2
ipv4 2 icmp 1 26 src=192.168.189.3 dst=77.87.50.8 type=8 code=0 id=36208 packets=213 bytes=17892 src=77.87.50.8 dst=192.168.189.3 type=0 code=0 id=36208 packets=213 bytes=17892 mark=0 zone=0 use=2
ipv4 2 udp 17 58 src=10.31.212.38 dst=1.0.0.1 sport=56122 dport=53 packets=1 bytes=70 [UNREPLIED] src=1.0.0.1 dst=172.31.226.250 sport=53 dport=56122 packets=0 bytes=0 mark=0 zone=0 use=2
ipv6 10 udp 17 1 src=0000:0000:0000:0000:0000:0000:0000:0001 dst=0000:0000:0000:0000:0000:0000:0000:0001 sport=35993 dport=53 packets=1 bytes=95 src=0000:0000:0000:0000:0000:0000:0000:0001 dst=0000:0000:0000:0000:0000:0000:0000:0001 sport=53 dport=35993 packets=1 bytes=111 mark=0 zone=0 use=2
And maybe capturing some packets. But what exactly should I be capturing the packets on? Tcpdump on the client, capturing packets going to the server? Is there anything else that could be helpful for debugging this problem?
I would also like to make it clear that this does not happen on older versions of tunneldigger-broker. Specifically, we are using c50ef46, which never causes this problem on multiple servers, each with approx. 90 clients.
from tunneldigger.
Specifically, we are using c50ef46, which never causes this problem on multiple servers, each with approx. 90 clients.
That would put the regression range to c50ef46...4f72b30. The NAT removal is the first thing that sticks out. Old kernels are known to have problems with our multiple UDP sockets on the same port (see #126). What's the kernel version on the host?
Since it already happens with 2 clients, I am fairly sure this is a kernel issue -- that's usually how those manifest.
FWIW we have a big setup with 4 servers with 150 clients each, and I don't see a lot of these issues. (I can't say for sure none of this happens since how would I know, but we don't get user complaints en masse.) So the fact that you have servers that are unaffected doesn't say much.
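For context, the "multiple UDP sockets on the same port" technique referenced here relies on SO_REUSEPORT. A minimal Python sketch of the pattern (illustrative only, not the broker's actual code):

```python
import socket

def bind_udp_reuseport(host, port):
    """Create a UDP socket that can share its port with others (Linux >= 3.9)."""
    s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    # Must be set before bind(), and on every socket sharing the port.
    s.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEPORT, 1)
    s.bind((host, port))
    return s

# First socket picks a free port; the second binds to the very same port.
a = bind_udp_reuseport("127.0.0.1", 0)
port = a.getsockname()[1]
b = bind_udp_reuseport("127.0.0.1", port)
print(a.getsockname() == b.getsockname())  # both sockets share one address
```

The kernel then has to steer each incoming datagram to the right socket, which is exactly the part that older kernels got wrong (see #126).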
reason=0x5
"ERROR_REASON_UNDEFINED". Not very useful... this is the default that will be given when something calls close() on the server without specifying a reason.
I have the tunneldigger-broker running on openwrt (x86-64 Virtual Machine) as a test of the latest commit to tunneldigger-broker and so far we have 2 routers connecting to the broker.
How is that VM exposed to the outside such that the router can connect to it? In #126 (comment) an in-between NAT was mentioned. If that NAT screws up and doesn't properly map the ports back and forth then that could of course also explain the problem. Though I'm not sure how that should happen without screwing up all UDP completely... from the NAT perspective this is just two different clients talking to UDP port 10000 on the server, and it has to keep those data streams apart.
Specifically, we are using c50ef46, which never causes this problem on multiple servers, each with approx. 90 clients.
That would put the regression range to c50ef46...4f72b30. The NAT removal is the first thing that sticks out. Old kernels are known to have problems with our multiple UDP sockets on the same port (see #126). What's the kernel version on the host?
Since it already happens with 2 clients, I am fairly sure this is a kernel issue -- that's usually how those manifest.
FWIW we have a big setup with 4 servers with 150 clients each, and I don't see a lot of these issues. (I can't say for sure none of this happens since how would I know, but we don't get user complaints en masse.) So the fact that you have servers that are unaffected doesn't say much.
reason=0x5
"ERROR_REASON_UNDEFINED". Not very useful... this is the default that will be given when something calls close() on the server without specifying a reason.
The client is:
root@noki-test:~# uname -a
Linux noki-test 5.4.238 #0 SMP Mon Apr 17 13:15:36 2023 x86_64 GNU/Linux
The broker is:
root@b-bbb-bvpn:~# uname -a
Linux b-bbb-bvpn 5.10.176 #0 SMP Thu Apr 27 20:28:15 2023 x86_64 GNU/Linux
I have the tunneldigger-broker running on openwrt (x86-64 Virtual Machine) as a test of the latest commit to tunneldigger-broker and so far we have 2 routers connecting to the broker.
How is that VM exposed to the outside such that the router can connect to it? In #126 (comment) an in-between NAT was mentioned. If that NAT screws up and doesn't properly map the ports back and forth then that could of course also explain the problem. Though I'm not sure how that should happen without screwing up all UDP completely... from the NAT perspective this is just two different clients talking to UDP port 10000 on the server, and it has to keep those data streams apart.
The broker is directly connected to the internet. What I mentioned about a NAT was the client device's connection.
Feel free to make a tunneldigger connection to b.bbb-vpn.berlin.freifunk.net:8943
(using 8943 for now during the test phase, later it will be switched to 8942)
The client is:
root@noki-test:~# uname -a
Linux noki-test 5.4.238 #0 SMP Mon Apr 17 13:15:36 2023 x86_64 GNU/Linux
The broker is:
root@b-bbb-bvpn:~# uname -a
Linux b-bbb-bvpn 5.10.176 #0 SMP Thu Apr 27 20:28:15 2023 x86_64 GNU/Linux
And the client which is not constantly reconnecting is:
root@ptest:~# uname -a
Linux ptest 5.4.238 #0 Mon Apr 17 13:15:36 2023 ppc GNU/Linux
The broker's kernel is the relevant one here. 5.10 should be new enough...
It should even be new enough to have one other bug fixed that existed with UDP and SO_REUSEPORT: https://kernel.dance/#69421bf98482d089e50799f45e48b25ce4a8d154
FYI, I have updated #126 with the above information. The new minimum kernel that should work with the master branch is therefore 5.10.152.
FWIW we have a big setup with 4 servers with 150 clients each, and I don't see a lot of these issues. (I can't say for sure none of this happens since how would I know, but we don't get user complaints en masse.) So the fact that you have servers that are unaffected doesn't say much.
sudo journalctl -u tunneldigger.service | grep Clos
will show you how long the clients stayed connected. Anything less than 100 I would be suspicious of. But mostly I'm getting closings in the 5 to 30 second range.
Also, how else can I debug this issue? Any pointers would be helpful. I would like to get the new broker online.
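As a rough illustration (a hypothetical helper, not part of tunneldigger), the broker's "Closing tunnel ... after N seconds" log lines can be filtered for suspiciously short-lived tunnels:

```python
import re

# Matches broker log lines like:
#   Closing tunnel 101 (b6:e5:...) after 6 seconds (reason=0x5)
CLOSE_RE = re.compile(r"Closing tunnel (\d+) .* after (\d+) seconds \(reason=0x([0-9a-f]+)\)")

def short_lived(lines, threshold=100):
    """Return (tunnel_id, seconds, reason) for tunnels that died quickly."""
    hits = []
    for line in lines:
        m = CLOSE_RE.search(line)
        if m and int(m.group(2)) < threshold:
            hits.append((int(m.group(1)), int(m.group(2)), int(m.group(3), 16)))
    return hits

logs = [
    "Closing tunnel 101 (b6:e5:b7) after 6 seconds (reason=0x5)",
    "Closing tunnel 100 (b6:01:be) after 86400 seconds (reason=0x1)",
]
print(short_lived(logs))  # only the 6-second tunnel is flagged
```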
will show you how long the clients connected.
Hm yes, I am seeing some clients that struggle, and others that last very long. Interesting.
Also, how else can I debug this issue? Any pointers would be helpful. I would like to get the new broker online.
I think a good first step would be trying to figure out why the server is closing the tunnel. So for instance you could modify this function to raise an exception if the code is ERROR_REASON_UNDEFINED. Then the stacktrace will tell us where that close call is coming from.
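A sketch of what such instrumentation could look like (a simplified stand-in, not the real Tunnel.close() in tunnel.py; the 0x5 value is taken from the logs above):

```python
import traceback

# Assumption: ERROR_REASON_UNDEFINED corresponds to the reason=0x5 in the logs.
ERROR_REASON_UNDEFINED = 0x5

def close(reason):
    """Simplified stand-in for the broker's tunnel close path."""
    if reason == ERROR_REASON_UNDEFINED:
        # Option 1: raise, so the traceback shows who called us.
        # raise RuntimeError("tunnel closed with undefined reason")
        # Option 2: log the current call stack and continue shutting down.
        stack = "".join(traceback.format_stack())
        print("close() called with reason 0x5 from:\n" + stack)
    # ... pre-down hook, session teardown etc. would follow here ...

close(ERROR_REASON_UNDEFINED)  # prints the call stack leading here
```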
root@b-bbb-bvpn:/usr/lib/python3.10/site-packages/tunneldigger_broker# /usr/bin/python -m tunneldigger_broker.main /var/etc/tunneldigger-broker.cfg
[INFO/tunneldigger.broker] Initializing the tunneldigger broker.
[INFO/tunneldigger.broker] Registered script '/usr/lib/tunneldigger-broker/hooks/setup' for hook 'session.up'.
[INFO/tunneldigger.broker] Registered script '/usr/lib/tunneldigger-broker/hooks/teardown' for hook 'session.pre-down'.
[INFO/tunneldigger.broker] Registered script '/usr/lib/tunneldigger-broker/hooks/mtu-changed' for hook 'session.mtu-changed'.
[INFO/tunneldigger.broker] Maximum number of tunnels is 1024.
[INFO/tunneldigger.broker] Tunnel identifier base is 100.
[INFO/tunneldigger.broker] Listening on 77.87.49.11:8943.
[INFO/tunneldigger.broker] Broker initialized.
[INFO/tunneldigger.broker] Creating tunnel (b6:01:be:53:f0:d5:71:a2:3a:35) with id 100.
[INFO/tunneldigger.tunnel] Set tunnel 100 MTU to 1446.
[INFO/tunneldigger.hooks] Running hook 'session.up' via script '/usr/lib/tunneldigger-broker/hooks/setup'.
[INFO/tunneldigger.broker] Creating tunnel (b6:e5:b7:75:f8:b4:6d:3d:9f:41) with id 101.
[INFO/tunneldigger.tunnel] Set tunnel 101 MTU to 1446.
[INFO/tunneldigger.hooks] Running hook 'session.up' via script '/usr/lib/tunneldigger-broker/hooks/setup'.
[INFO/tunneldigger.tunnel] Closing tunnel 101 (b6:e5:b7:75:f8:b4:6d:3d:9f:41) after 6 seconds (reason=0x5)
[INFO/tunneldigger.broker] Shutting down tunneldigger broker.
[INFO/tunneldigger.tunnel] Closing tunnel 100 (b6:01:be:53:f0:d5:71:a2:3a:35) after 8 seconds (reason=0x5)
Traceback (most recent call last):
File "/usr/lib/python3.10/site-packages/tunneldigger_broker/broker.py", line 203, in close
tunnel.close()
File "/usr/lib/python3.10/site-packages/tunneldigger_broker/main.py", line 115, in <module>
event_loop.start()
File "/usr/lib/python3.10/site-packages/tunneldigger_broker/eventloop.py", line 61, in start
pollable.close()
File "/usr/lib/python3.10/site-packages/tunneldigger_broker/tunnel.py", line 247, in close
raise
RuntimeError: No active exception to reraise
[INFO/tunneldigger.tunnel] Closing tunnel 101 (b6:e5:b7:75:f8:b4:6d:3d:9f:41) after 6 seconds (reason=0x5)
Traceback (most recent call last):
File "/usr/lib/python3.10/site-packages/tunneldigger_broker/broker.py", line 203, in close
tunnel.close()
File "/usr/lib/python3.10/site-packages/tunneldigger_broker/main.py", line 115, in <module>
event_loop.start()
File "/usr/lib/python3.10/site-packages/tunneldigger_broker/eventloop.py", line 61, in start
pollable.close()
File "/usr/lib/python3.10/site-packages/tunneldigger_broker/tunnel.py", line 247, in close
raise
RuntimeError: No active exception to reraise
Traceback (most recent call last):
File "/usr/lib/python3.10/runpy.py", line 196, in _run_module_as_main
File "/usr/lib/python3.10/runpy.py", line 86, in _run_code
File "/usr/lib/python3.10/site-packages/tunneldigger_broker/main.py", line 115, in <module>
event_loop.start()
File "/usr/lib/python3.10/site-packages/tunneldigger_broker/eventloop.py", line 61, in start
pollable.close()
File "/usr/lib/python3.10/site-packages/tunneldigger_broker/tunnel.py", line 247, in close
raise
RuntimeError: No active exception to reraise
I put raise at the very beginning of the function, after the logger call. If that's too aggressive, let me know.
Ah... something is catching the exception and then initiating a broker shutdown. And we don't get the original stacktrace somehow. Dang.
Is there a way in Python to just print a stacktrace without raising an exception?
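(In Python, the stdlib traceback module can do exactly that, with no exception in flight:)

```python
import traceback

def inner():
    # No exception needed: print the current call stack to stderr...
    traceback.print_stack()
    # ...or capture it as a list of strings for logging:
    return traceback.format_stack()

def outer():
    return inner()

frames = outer()
# Each entry names a frame; the chain outer() -> inner() is visible.
print("".join(frames))
```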
I'm not a Python user...
I changed the raise to traceback.print_stack().
root@b-bbb-bvpn:/usr/lib/python3.10/site-packages/tunneldigger_broker# /usr/bin/python -m tunneldigger_broker.main /var/etc/tunneldigger-broker.cfg
[INFO/tunneldigger.broker] Initializing the tunneldigger broker.
[INFO/tunneldigger.broker] Registered script '/usr/lib/tunneldigger-broker/hooks/setup' for hook 'session.up'.
[INFO/tunneldigger.broker] Registered script '/usr/lib/tunneldigger-broker/hooks/teardown' for hook 'session.pre-down'.
[INFO/tunneldigger.broker] Registered script '/usr/lib/tunneldigger-broker/hooks/mtu-changed' for hook 'session.mtu-changed'.
[INFO/tunneldigger.broker] Maximum number of tunnels is 1024.
[INFO/tunneldigger.broker] Tunnel identifier base is 100.
[INFO/tunneldigger.broker] Listening on 77.87.49.11:8943.
[INFO/tunneldigger.broker] Broker initialized.
[INFO/tunneldigger.broker] Creating tunnel (b6:e5:b7:75:f8:b4:6d:3d:9f:41) with id 100.
[INFO/tunneldigger.tunnel] Set tunnel 100 MTU to 1446.
[INFO/tunneldigger.hooks] Running hook 'session.up' via script '/usr/lib/tunneldigger-broker/hooks/setup'.
[INFO/tunneldigger.broker] Creating tunnel (b6:01:be:53:f0:d5:71:a2:3a:35) with id 101.
[INFO/tunneldigger.tunnel] Set tunnel 101 MTU to 1446.
[INFO/tunneldigger.hooks] Running hook 'session.up' via script '/usr/lib/tunneldigger-broker/hooks/setup'.
[INFO/tunneldigger.tunnel] Closing tunnel 100 (b6:e5:b7:75:f8:b4:6d:3d:9f:41) after 12 seconds (reason=0x5)
File "/usr/lib/python3.10/runpy.py", line 196, in _run_module_as_main
File "/usr/lib/python3.10/runpy.py", line 86, in _run_code
File "/usr/lib/python3.10/site-packages/tunneldigger_broker/main.py", line 115, in <module>
event_loop.start()
File "/usr/lib/python3.10/site-packages/tunneldigger_broker/eventloop.py", line 61, in start
pollable.close()
File "/usr/lib/python3.10/site-packages/tunneldigger_broker/tunnel.py", line 248, in close
traceback.print_stack()
[INFO/tunneldigger.hooks] Running hook 'session.pre-down' via script '/usr/lib/tunneldigger-broker/hooks/teardown'.
[INFO/tunneldigger.tunnel] Set tunnel 101 MTU to 1438.
[INFO/tunneldigger.hooks] Running hook 'session.mtu-changed' via script '/usr/lib/tunneldigger-broker/hooks/mtu-changed'.
[INFO/tunneldigger.broker] Creating tunnel (b6:e5:b7:75:f8:b4:6d:3d:9f:41) with id 102.
[INFO/tunneldigger.tunnel] Set tunnel 102 MTU to 1446.
[INFO/tunneldigger.hooks] Running hook 'session.up' via script '/usr/lib/tunneldigger-broker/hooks/setup'.
[INFO/tunneldigger.tunnel] Closing tunnel 102 (b6:e5:b7:75:f8:b4:6d:3d:9f:41) after 12 seconds (reason=0x5)
File "/usr/lib/python3.10/runpy.py", line 196, in _run_module_as_main
File "/usr/lib/python3.10/runpy.py", line 86, in _run_code
File "/usr/lib/python3.10/site-packages/tunneldigger_broker/main.py", line 115, in <module>
event_loop.start()
File "/usr/lib/python3.10/site-packages/tunneldigger_broker/eventloop.py", line 61, in start
pollable.close()
File "/usr/lib/python3.10/site-packages/tunneldigger_broker/tunnel.py", line 248, in close
traceback.print_stack()
[INFO/tunneldigger.hooks] Running hook 'session.pre-down' via script '/usr/lib/tunneldigger-broker/hooks/teardown'.
It's basically the same trace as before. It's now line 248 in tunnel.py because I had to add an import traceback above it.
from tunneldigger.
Oh, I guess you just did a bare raise in the code? That re-raises an exception rather than raising a new one, so it would lead to confusing errors. But yeah, print_stack sounds good. Here's the diff you should apply:
--- a/broker/src/tunneldigger_broker/tunnel.py
+++ b/broker/src/tunneldigger_broker/tunnel.py
@@ -244,6 +244,8 @@ class Tunnel(protocol.HandshakeProtocolMixin, network.Pollable):
         """
         logger.info("Closing tunnel {} ({}) after {} seconds (reason=0x{:x})".format(self.tunnel_id, self.uuid, int(time.time() - self.created_time), reason))
+        import traceback
+        traceback.print_stack()
         # Run pre-down hook.
         self.broker.hook_manager.run_hook(
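For reference, the distinction being made here: a bare `raise` only works inside an `except` block (it re-raises the active exception), while the `traceback` module can capture the current call stack anywhere. A minimal sketch, with names that are illustrative rather than taken from the broker code:

```python
import traceback

def where_am_i():
    # format_stack() returns the current call stack as a list of strings,
    # without needing an active exception -- unlike a bare `raise`, which
    # only works inside an `except` block.
    return traceback.format_stack()

frames = where_am_i()
```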
from tunneldigger.
Okay, so this is where close gets called:
tunneldigger/broker/src/tunneldigger_broker/eventloop.py
Lines 60 to 61 in b8fd97e
epoll is telling us that the socket went bust, so we're closing the socket. Strange. Sounds like this is related to #148. That PR was a bugfix for #143.
from tunneldigger.
Should I separate the EPOLLERR from the EPOLLHUP so we can see which we are dealing with?
from tunneldigger.
I did the separation...
[INFO/tunneldigger.broker] Broker initialized.
[INFO/tunneldigger.broker] Creating tunnel (b6:01:be:53:f0:d5:71:a2:3a:35) with id 100.
[INFO/tunneldigger.tunnel] Set tunnel 100 MTU to 1446.
[INFO/tunneldigger.hooks] Running hook 'session.up' via script '/usr/lib/tunneldigger-broker/hooks/setup'.
[INFO/tunneldigger.broker] Rejecting tunnel (b6:e5:b7:75:f8:b4:6d:3d:9f:41) due to global rate limiting: last tunnel was created too recently
[WARNING/tunneldigger.protocol] Failed to create tunnel (b6:e5:b7:75:f8:b4:6d:3d:9f:41) while processing prepare request.
[INFO/tunneldigger.broker] Creating tunnel (b6:e5:b7:75:f8:b4:6d:3d:9f:41) with id 101.
[INFO/tunneldigger.tunnel] Set tunnel 101 MTU to 1446.
[INFO/tunneldigger.hooks] Running hook 'session.up' via script '/usr/lib/tunneldigger-broker/hooks/setup'.
[INFO/tunneldigger.tunnel] Set tunnel 100 MTU to 1438.
[INFO/tunneldigger.hooks] Running hook 'session.mtu-changed' via script '/usr/lib/tunneldigger-broker/hooks/mtu-changed'.
[INFO/tunneldigger.tunnel] Closing tunnel 101 (b6:e5:b7:75:f8:b4:6d:3d:9f:41) after 26 seconds (reason=0x5)
File "/usr/lib/python3.10/runpy.py", line 196, in _run_module_as_main
File "/usr/lib/python3.10/runpy.py", line 86, in _run_code
File "/usr/lib/python3.10/site-packages/tunneldigger_broker/main.py", line 115, in <module>
event_loop.start()
File "/usr/lib/python3.10/site-packages/tunneldigger_broker/eventloop.py", line 61, in start
pollable.close()
File "/usr/lib/python3.10/site-packages/tunneldigger_broker/tunnel.py", line 248, in close
traceback.print_stack()
[INFO/tunneldigger.hooks] Running hook 'session.pre-down' via script '/usr/lib/tunneldigger-broker/hooks/teardown'.
[INFO/tunneldigger.broker] Creating tunnel (b6:e5:b7:75:f8:b4:6d:3d:9f:41) with id 102.
[INFO/tunneldigger.tunnel] Set tunnel 102 MTU to 1446.
[INFO/tunneldigger.hooks] Running hook 'session.up' via script '/usr/lib/tunneldigger-broker/hooks/setup'.
[INFO/tunneldigger.tunnel] Closing tunnel 102 (b6:e5:b7:75:f8:b4:6d:3d:9f:41) after 9 seconds (reason=0x5)
File "/usr/lib/python3.10/runpy.py", line 196, in _run_module_as_main
File "/usr/lib/python3.10/runpy.py", line 86, in _run_code
File "/usr/lib/python3.10/site-packages/tunneldigger_broker/main.py", line 115, in <module>
event_loop.start()
File "/usr/lib/python3.10/site-packages/tunneldigger_broker/eventloop.py", line 61, in start
pollable.close()
File "/usr/lib/python3.10/site-packages/tunneldigger_broker/tunnel.py", line 248, in close
traceback.print_stack()
[INFO/tunneldigger.hooks] Running hook 'session.pre-down' via script '/usr/lib/tunneldigger-broker/hooks/teardown'.
eventloop.py", line 61 is the EPOLLERR:

60 elif event & select.EPOLLERR:
61     pollable.close()
62 elif event & select.EPOLLHUP:
63     pollable.close()
from tunneldigger.
Right, so the socket closed in an unexpected way. Why the socket got closed... no idea; not sure if we can get the kernel to tell us a reason.
from tunneldigger.
Can you capture a packet trace of the communication between that one client and the broker (as seen from the broker)?
from tunneldigger.
Here is a capture of 2 full sessions including the time up to when the client tries to reconnect.
I hope it brings some new info.
from tunneldigger.
Might be interesting. The timing is pretty close. The first connection lasted 14 sec, the second 22.
from tunneldigger.
Yeah, that looks like an MTU problem. Can you try decreasing the MTU inside the network?
However, I didn't think it would just cut the connection then... does L2TP not do fragmentation?
from tunneldigger.
No, L2TP does not do fragmentation; it is UDP based. This is why Tunneldigger has maximum packet size discovery. Not sure how you are running it, but we preset several possible packet sizes and Tunneldigger picks the largest one that still works.
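The discovery scheme described here can be sketched roughly like this (the preset sizes and function name are illustrative, not the actual Tunneldigger code):

```python
# Illustrative preset probe payload sizes (not the exact list Tunneldigger uses).
PROBE_SIZES = [1334, 1400, 1438, 1446, 1476, 1492]

def select_pmtu(acked_sizes):
    """Pick the largest probed payload size that was acknowledged.

    `acked_sizes` is the set of probe sizes for which a reply came back;
    probes larger than the path MTU are lost, so they never get acked.
    """
    acked = [size for size in PROBE_SIZES if size in acked_sizes]
    return max(acked) if acked else None
```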
from tunneldigger.
But is it really expected that the entire connection dies when a too-big-packet is sent inside the connection?
from tunneldigger.
Hm, tunnel should not die. But TCP connections inside tunnel might have issues (trying to resend but unsuccessfully). But if there is one too-big-packet then there are also probably more?
from tunneldigger.
What seems to be happening is that the kernel sends us an EPOLLERR when a packet that went through the tunnel leads to an outside-the-tunnel "ICMP fragmentation needed" packet. Tunneldigger used to entirely ignore EPOLLERR, but that caused #143. Now tunneldigger closes the connection on EPOLLERR, and that means such a "fragmentation needed" packet closes the connection. At least that's my current hypothesis.
Though it is also unclear what to do instead. We can drop the packet, but that will lead to timeouts. Ideally we'd generate an ICMP fragmentation-needed inside the tunnel, but that's really something the kernel would have to do...
from tunneldigger.
I prefer to have pmtu=0 in the config so that on the broker I can have a bridge for each mtu size. This works flawlessly with c50ef46
As a test, I set pmtu=1446 and also pmtu=1438. In both cases the tunnel stays up for over 5 minutes.
Perhaps now that NAT is removed, the ICMP message from the client is not being processed in the same way that it was before?
But on the other hand, if I change the MTU on the wan port of my stable client (with pmtu=0 on the broker), the pmtu feature works totally fine.
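For reference, the two MTUs seen throughout these logs (1446 and 1438) are consistent with the usual per-packet tunnel overhead. A back-of-the-envelope sketch; the 12-byte L2TPv3 data header is an assumption about this setup:

```python
# Per-packet overhead between the WAN path MTU and the Ethernet payload
# carried inside the tunnel.
IP_HDR = 20    # outer IPv4 header
UDP_HDR = 8    # outer UDP header
L2TP_HDR = 12  # L2TPv3 data header (assumed for this setup)
ETH_HDR = 14   # inner Ethernet header of the tunneled frame

def tunnel_mtu(path_mtu):
    return path_mtu - IP_HDR - UDP_HDR - L2TP_HDR - ETH_HDR

# A 1500-byte WAN MTU gives 1446 inside the tunnel; a 1492-byte
# (PPPoE-style) path gives 1438 -- the two values seen in the logs.
```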
from tunneldigger.
It has been ages since I have done any work around tunneldigger, so feel free to ignore my thoughts. But I think there might then be a mismatch with the MTU set on the inside/tunnel interface?
that on the broker I can have a bridge for each mtu size.
I think we have it like that as well. I think we also do not use the kernel's PMTU, but the one determined by Tunneldigger (if I remember correctly). I am not sure now how we decide which interface to connect a client to, though.
Anyway, you two seem to have much more recent knowledge about all this, so I will leave you both to it. :-)
from tunneldigger.
The message that is too big is also a strange one -- it is supposedly a control message, and filled up with lots of 0s!
Its msg_type is CONTROL_TYPE_PMTUD. Ah, so the PMTU probe is itself killing the connection... great.
from tunneldigger.
Perhaps now that NAT is removed, the ICMP message from the client is not being processed in the same way that it was before?
I think this has nothing to do with the NAT patch; it is about #148. You could try commenting out these two lines:
tunneldigger/broker/src/tunneldigger_broker/eventloop.py
Lines 60 to 61 in 6cfce8f
However... just ignoring EPOLLERR led to problems in the past, so this is not a proper fix. But it would still be informative to see whether this helps.
from tunneldigger.
You could try commenting out these two lines
tunneldigger/broker/src/tunneldigger_broker/eventloop.py
Lines 60 to 61 in 6cfce8f
However... just ignoring EPOLLERR led to problems in the past, so this is not a proper fix. But it would still be informative to see whether this helps.
With the lines commented out, the problem client stays connected and successfully changes MTU and its associated bridge.
from tunneldigger.
Okay, we learned something!
So... EPOLLERR can indicate a permanent error, as in #143, or a transient error, as in "the control packet you just wanted to send was too big". Dang.
from tunneldigger.
Dang
from tunneldigger.
Could you try this branch? https://github.com/wlanslovenija/tunneldigger/tree/pmtu-fix
It should then ignore the first few EPOLLERR, and that count resets when a message is received (such as the keepalive). Hopefully that is enough to handle #143 while still also fixing your PMTU issues...
FWIW we are not using PMTU discovery in our network so the short-lived clients for us must have a different reason. 🤷
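The idea in that branch can be sketched as a per-socket counter (the class and names here are illustrative, not the branch's actual code):

```python
class EpollErrorCounter:
    """Tolerate a few consecutive EPOLLERRs; reset when traffic flows.

    PMTU discovery sends 6 probes, so a threshold of 10 leaves headroom
    for transient errors while still catching a permanently dead socket.
    """
    THRESHOLD = 10

    def __init__(self):
        self.count = 0

    def on_read(self):
        # A successful receive (e.g. a keepalive) proves the socket works.
        self.count = 0

    def on_error(self):
        # Returns True once the error should be treated as permanent.
        self.count += 1
        return self.count >= self.THRESHOLD
```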
from tunneldigger.
Wed Aug 16 11:39:06 2023 daemon.err python[4098]: [INFO/tunneldigger.broker] Shutting down tunneldigger broker.
Wed Aug 16 11:39:06 2023 daemon.err python[4098]: [INFO/tunneldigger.tunnel] Closing tunnel 101 (b6:01:be:53:f0:d5:71:a2:3a:35) after 20 seconds (reason=0x5)
Wed Aug 16 11:39:06 2023 daemon.err python[4098]: [INFO/tunneldigger.hooks] Running hook 'session.pre-down' via script '/usr/lib/tunneldigger-broker/hooks/teardown'.
Wed Aug 16 11:39:06 2023 kern.info kernel: [497564.843694] br-digger1438: port 1(l2tp101-101) entered blocking state
Wed Aug 16 11:39:06 2023 kern.info kernel: [497564.843697] br-digger1438: port 1(l2tp101-101) entered disabled state
Wed Aug 16 11:39:06 2023 kern.info kernel: [497564.843779] device l2tp101-101 entered promiscuous mode
Wed Aug 16 11:39:06 2023 kern.info kernel: [497564.843829] br-digger1438: port 1(l2tp101-101) entered blocking state
Wed Aug 16 11:39:06 2023 kern.info kernel: [497564.843831] br-digger1438: port 1(l2tp101-101) entered forwarding state
Wed Aug 16 11:39:06 2023 kern.info kernel: [497564.848972] device l2tp101-101 left promiscuous mode
Wed Aug 16 11:39:06 2023 kern.info kernel: [497564.848994] br-digger1438: port 1(l2tp101-101) entered disabled state
Wed Aug 16 11:39:06 2023 daemon.notice netifd: bridge 'br-digger1438' link is down
Wed Aug 16 11:39:06 2023 daemon.notice netifd: Interface 'digger1438' has link connectivity loss
Wed Aug 16 11:39:06 2023 daemon.info olsrd[2611]: Removing interface br-digger1438
Wed Aug 16 11:39:06 2023 daemon.err python[4098]: Traceback (most recent call last):
Wed Aug 16 11:39:06 2023 daemon.err python[4098]: File "/usr/lib/python3.10/runpy.py", line 196, in _run_module_as_main
Wed Aug 16 11:39:06 2023 daemon.err python[4098]: File "/usr/lib/python3.10/runpy.py", line 86, in _run_code
Wed Aug 16 11:39:06 2023 daemon.err python[4098]: File "/usr/lib/python3.10/site-packages/tunneldigger_broker/main.py", line 115, in <module>
Wed Aug 16 11:39:06 2023 daemon.err python[4098]: event_loop.start()
Wed Aug 16 11:39:06 2023 daemon.err python[4098]: File "/usr/lib/python3.10/site-packages/tunneldigger_broker/eventloop.py", line 76, in start
Wed Aug 16 11:39:06 2023 daemon.err python[4098]: pollable.close(reason=protocol.ERROR_REASON_FAILURE)
Wed Aug 16 11:39:06 2023 daemon.err python[4098]: TypeError: HookProcess.close() got an unexpected keyword argument 'reason'
from tunneldigger.
@RalfJung I sent you a private email yesterday to the address listed on your webpage. Did you get a chance to read it?
from tunneldigger.
Also, my stable client is losing its connection frequently with the above patch:
Wed Aug 16 11:47:57 2023 daemon.err python[4288]: [INFO/tunneldigger.broker] Creating tunnel (b6:e5:b7:75:f8:b4:6d:3d:9f:41) with id 106.
Wed Aug 16 11:47:57 2023 daemon.err python[4288]: [INFO/tunneldigger.tunnel] Set tunnel 106 MTU to 1446.
Wed Aug 16 11:47:57 2023 daemon.err python[4288]: [INFO/tunneldigger.hooks] Running hook 'session.up' via script '/usr/lib/tunneldigger-broker/hooks/setup'.
Wed Aug 16 11:47:57 2023 daemon.notice netifd: bridge 'br-digger1446' link is up
Wed Aug 16 11:47:57 2023 daemon.info olsrd[2611]: Writing '0' (was 0) to /proc/sys/net/ipv4/conf/br-digger1446/send_redirects
Wed Aug 16 11:47:57 2023 daemon.notice netifd: Interface 'digger1446' has link connectivity
Wed Aug 16 11:47:57 2023 daemon.info olsrd[2611]: Writing '0' (was 0) to /proc/sys/net/ipv4/conf/br-digger1446/rp_filter
Wed Aug 16 11:47:57 2023 daemon.info olsrd[2611]: Adding interface br-digger1446
Wed Aug 16 11:47:57 2023 kern.info kernel: [498096.085857] br-digger1446: port 1(l2tp106-106) entered blocking state
Wed Aug 16 11:47:57 2023 kern.info kernel: [498096.085859] br-digger1446: port 1(l2tp106-106) entered disabled state
Wed Aug 16 11:47:57 2023 kern.info kernel: [498096.085897] device l2tp106-106 entered promiscuous mode
Wed Aug 16 11:47:57 2023 kern.info kernel: [498096.085926] br-digger1446: port 1(l2tp106-106) entered blocking state
Wed Aug 16 11:47:57 2023 kern.info kernel: [498096.085927] br-digger1446: port 1(l2tp106-106) entered forwarding state
Wed Aug 16 11:48:09 2023 daemon.err python[4288]: [INFO/tunneldigger.tunnel] Closing tunnel 106 (b6:e5:b7:75:f8:b4:6d:3d:9f:41) after 12 seconds (reason=0x4)
Wed Aug 16 11:48:09 2023 daemon.err python[4288]: [INFO/tunneldigger.hooks] Running hook 'session.pre-down' via script '/usr/lib/tunneldigger-broker/hooks/teardown'.
Wed Aug 16 11:48:09 2023 kern.info kernel: [498108.098874] device l2tp106-106 left promiscuous mode
Wed Aug 16 11:48:09 2023 kern.info kernel: [498108.098911] br-digger1446: port 1(l2tp106-106) entered disabled state
Wed Aug 16 11:48:09 2023 daemon.notice netifd: bridge 'br-digger1446' link is down
Wed Aug 16 11:48:09 2023 daemon.notice netifd: Interface 'digger1446' has link connectivity loss
Wed Aug 16 11:48:09 2023 daemon.info olsrd[2611]: Removing interface br-digger1446
from tunneldigger.
HookProcess.close() got an unexpected keyword argument 'reason'
Did you completely switch to the branch? That function does have that argument...
Don't just cherry-pick one file. You need to be on the branch entirely.
from tunneldigger.
HookProcess.close() got an unexpected keyword argument 'reason'
Did you completely switch to the branch? That function does have that argument...
I copied https://raw.githubusercontent.com/wlanslovenija/tunneldigger/51be74a18ae7f0f64a8854ed6023646a338e5cb5/broker/src/tunneldigger_broker/eventloop.py directly into the running system. I will do a fresh install of the python code now.
from tunneldigger.
@RalfJung I sent you a private email yesterday to the address listed on your webpage. Did you get a chance to read it?
I got the mail, yeah. But I don't have the setup to connect a client to your server, so server SSH access isn't really helpful, sorry. But thanks for the offer!
from tunneldigger.
With a complete new checkout...
Wed Aug 16 11:56:17 2023 daemon.err python[4795]: [INFO/tunneldigger.broker] Initializing the tunneldigger broker.
Wed Aug 16 11:56:17 2023 daemon.err python[4795]: [INFO/tunneldigger.broker] Registered script '/usr/lib/tunneldigger-broker/hooks/setup' for hook 'session.up'.
Wed Aug 16 11:56:17 2023 daemon.err python[4795]: [INFO/tunneldigger.broker] Registered script '/usr/lib/tunneldigger-broker/hooks/teardown' for hook 'session.pre-down'.
Wed Aug 16 11:56:17 2023 daemon.err python[4795]: [INFO/tunneldigger.broker] Registered script '/usr/lib/tunneldigger-broker/hooks/mtu-changed' for hook 'session.mtu-changed'.
Wed Aug 16 11:56:17 2023 daemon.err python[4795]: [INFO/tunneldigger.broker] Maximum number of tunnels is 1024.
Wed Aug 16 11:56:17 2023 daemon.err python[4795]: [INFO/tunneldigger.broker] Tunnel identifier base is 100.
Wed Aug 16 11:56:17 2023 daemon.err python[4795]: [INFO/tunneldigger.broker] Listening on 77.87.49.11:8943.
Wed Aug 16 11:56:17 2023 daemon.err python[4795]: [INFO/tunneldigger.broker] Broker initialized.
Wed Aug 16 11:56:22 2023 daemon.err python[4795]: [INFO/tunneldigger.broker] Creating tunnel (b6:e5:b7:75:f8:b4:6d:3d:9f:41) with id 100.
Wed Aug 16 11:56:22 2023 daemon.err python[4795]: [INFO/tunneldigger.tunnel] Set tunnel 100 MTU to 1446.
Wed Aug 16 11:56:22 2023 daemon.err python[4795]: [INFO/tunneldigger.hooks] Running hook 'session.up' via script '/usr/lib/tunneldigger-broker/hooks/setup'.
Wed Aug 16 11:56:22 2023 daemon.notice netifd: bridge 'br-digger1446' link is up
Wed Aug 16 11:56:22 2023 daemon.notice netifd: Interface 'digger1446' has link connectivity
Wed Aug 16 11:56:22 2023 daemon.info olsrd[2611]: Writing '0' (was 0) to /proc/sys/net/ipv4/conf/br-digger1446/send_redirects
Wed Aug 16 11:56:22 2023 daemon.info olsrd[2611]: Writing '0' (was 0) to /proc/sys/net/ipv4/conf/br-digger1446/rp_filter
Wed Aug 16 11:56:22 2023 daemon.info olsrd[2611]: Adding interface br-digger1446
Wed Aug 16 11:56:22 2023 kern.info kernel: [498600.767594] br-digger1446: port 1(l2tp100-100) entered blocking state
Wed Aug 16 11:56:22 2023 kern.info kernel: [498600.767598] br-digger1446: port 1(l2tp100-100) entered disabled state
Wed Aug 16 11:56:22 2023 kern.info kernel: [498600.767648] device l2tp100-100 entered promiscuous mode
Wed Aug 16 11:56:22 2023 kern.info kernel: [498600.767691] br-digger1446: port 1(l2tp100-100) entered blocking state
Wed Aug 16 11:56:22 2023 kern.info kernel: [498600.767693] br-digger1446: port 1(l2tp100-100) entered forwarding state
Wed Aug 16 11:56:38 2023 daemon.err python[4795]: [INFO/tunneldigger.tunnel] Closing tunnel 100 (b6:e5:b7:75:f8:b4:6d:3d:9f:41) after 16 seconds (reason=0x4)
Wed Aug 16 11:56:38 2023 daemon.err python[4795]: [INFO/tunneldigger.hooks] Running hook 'session.pre-down' via script '/usr/lib/tunneldigger-broker/hooks/teardown'.
Wed Aug 16 11:56:38 2023 kern.info kernel: [498616.779932] device l2tp100-100 left promiscuous mode
Wed Aug 16 11:56:38 2023 kern.info kernel: [498616.779978] br-digger1446: port 1(l2tp100-100) entered disabled state
Wed Aug 16 11:56:38 2023 daemon.info olsrd[2611]: Removing interface br-digger1446
Wed Aug 16 11:56:38 2023 daemon.notice netifd: bridge 'br-digger1446' link is down
Wed Aug 16 11:56:38 2023 daemon.notice netifd: Interface 'digger1446' has link connectivity loss
Wed Aug 16 11:56:47 2023 daemon.err python[4795]: [INFO/tunneldigger.broker] Creating tunnel (b6:01:be:53:f0:d5:71:a2:3a:35) with id 101.
Wed Aug 16 11:56:47 2023 daemon.err python[4795]: [INFO/tunneldigger.tunnel] Set tunnel 101 MTU to 1446.
Wed Aug 16 11:56:47 2023 daemon.err python[4795]: [INFO/tunneldigger.hooks] Running hook 'session.up' via script '/usr/lib/tunneldigger-broker/hooks/setup'.
Wed Aug 16 11:56:47 2023 daemon.info olsrd[2611]: Writing '0' (was 0) to /proc/sys/net/ipv4/conf/br-digger1446/send_redirects
Wed Aug 16 11:56:47 2023 daemon.info olsrd[2611]: Writing '0' (was 0) to /proc/sys/net/ipv4/conf/br-digger1446/rp_filter
Wed Aug 16 11:56:47 2023 daemon.info olsrd[2611]: Adding interface br-digger1446
Wed Aug 16 11:56:47 2023 daemon.notice netifd: bridge 'br-digger1446' link is up
Wed Aug 16 11:56:47 2023 daemon.notice netifd: Interface 'digger1446' has link connectivity
Wed Aug 16 11:56:48 2023 kern.info kernel: [498626.235669] br-digger1446: port 1(l2tp101-101) entered blocking state
Wed Aug 16 11:56:48 2023 kern.info kernel: [498626.235673] br-digger1446: port 1(l2tp101-101) entered disabled state
Wed Aug 16 11:56:48 2023 kern.info kernel: [498626.235719] device l2tp101-101 entered promiscuous mode
Wed Aug 16 11:56:48 2023 kern.info kernel: [498626.235757] br-digger1446: port 1(l2tp101-101) entered blocking state
Wed Aug 16 11:56:48 2023 kern.info kernel: [498626.235758] br-digger1446: port 1(l2tp101-101) entered forwarding state
Wed Aug 16 11:57:07 2023 daemon.err python[4795]: [INFO/tunneldigger.tunnel] Set tunnel 101 MTU to 1438.
Wed Aug 16 11:57:07 2023 daemon.err python[4795]: [INFO/tunneldigger.hooks] Running hook 'session.mtu-changed' via script '/usr/lib/tunneldigger-broker/hooks/mtu-changed'.
Wed Aug 16 11:57:08 2023 kern.info kernel: [498646.244150] device l2tp101-101 left promiscuous mode
Wed Aug 16 11:57:08 2023 kern.info kernel: [498646.244177] br-digger1446: port 1(l2tp101-101) entered disabled state
Wed Aug 16 11:57:08 2023 daemon.notice netifd: bridge 'br-digger1446' link is down
Wed Aug 16 11:57:08 2023 daemon.notice netifd: Interface 'digger1446' has link connectivity loss
Wed Aug 16 11:57:08 2023 daemon.info olsrd[2611]: Removing interface br-digger1446
Wed Aug 16 11:57:08 2023 daemon.notice netifd: bridge 'br-digger1438' link is up
Wed Aug 16 11:57:08 2023 daemon.notice netifd: Interface 'digger1438' has link connectivity
Wed Aug 16 11:57:08 2023 daemon.info olsrd[2611]: Writing '0' (was 0) to /proc/sys/net/ipv4/conf/br-digger1438/send_redirects
Wed Aug 16 11:57:08 2023 daemon.info olsrd[2611]: Writing '0' (was 0) to /proc/sys/net/ipv4/conf/br-digger1438/rp_filter
Wed Aug 16 11:57:08 2023 daemon.info olsrd[2611]: Adding interface br-digger1438
Wed Aug 16 11:57:08 2023 kern.info kernel: [498646.283740] br-digger1438: port 1(l2tp101-101) entered blocking state
Wed Aug 16 11:57:08 2023 kern.info kernel: [498646.283743] br-digger1438: port 1(l2tp101-101) entered disabled state
Wed Aug 16 11:57:08 2023 kern.info kernel: [498646.283835] device l2tp101-101 entered promiscuous mode
Wed Aug 16 11:57:08 2023 kern.info kernel: [498646.283886] br-digger1438: port 1(l2tp101-101) entered blocking state
Wed Aug 16 11:57:08 2023 kern.info kernel: [498646.283888] br-digger1438: port 1(l2tp101-101) entered forwarding state
Wed Aug 16 11:57:40 2023 daemon.err python[4795]: [INFO/tunneldigger.tunnel] Set tunnel 102 MTU to 1446.
Wed Aug 16 11:57:40 2023 daemon.err python[4795]: [INFO/tunneldigger.hooks] Running hook 'session.up' via script '/usr/lib/tunneldigger-broker/hooks/setup'.
Wed Aug 16 11:57:40 2023 daemon.info olsrd[2611]: Writing '0' (was 0) to /proc/sys/net/ipv4/conf/br-digger1446/send_redirects
Wed Aug 16 11:57:40 2023 daemon.info olsrd[2611]: Writing '0' (was 0) to /proc/sys/net/ipv4/conf/br-digger1446/rp_filter
Wed Aug 16 11:57:40 2023 daemon.info olsrd[2611]: Adding interface br-digger1446
Wed Aug 16 11:57:40 2023 daemon.notice netifd: bridge 'br-digger1446' link is up
Wed Aug 16 11:57:40 2023 daemon.notice netifd: Interface 'digger1446' has link connectivity
Wed Aug 16 11:57:40 2023 kern.info kernel: [498678.901008] br-digger1446: port 1(l2tp102-102) entered blocking state
Wed Aug 16 11:57:40 2023 kern.info kernel: [498678.901012] br-digger1446: port 1(l2tp102-102) entered disabled state
Wed Aug 16 11:57:40 2023 kern.info kernel: [498678.901061] device l2tp102-102 entered promiscuous mode
Wed Aug 16 11:57:40 2023 kern.info kernel: [498678.901100] br-digger1446: port 1(l2tp102-102) entered blocking state
Wed Aug 16 11:57:40 2023 kern.info kernel: [498678.901101] br-digger1446: port 1(l2tp102-102) entered forwarding state
Wed Aug 16 11:57:51 2023 authpriv.info dropbear[4907]: Child connection from 161.35.114.58:37058
Wed Aug 16 11:57:51 2023 authpriv.warn dropbear[4907]: Login attempt for nonexistent user
Wed Aug 16 11:57:51 2023 authpriv.info dropbear[4907]: Exit before auth from <161.35.114.58:37058>: Exited normally
Wed Aug 16 11:57:57 2023 daemon.err python[4795]: [INFO/tunneldigger.tunnel] Closing tunnel 102 (b6:e5:b7:75:f8:b4:6d:3d:9f:41) after 17 seconds (reason=0x4)
Wed Aug 16 11:57:57 2023 daemon.err python[4795]: [INFO/tunneldigger.hooks] Running hook 'session.pre-down' via script '/usr/lib/tunneldigger-broker/hooks/teardown'.
Wed Aug 16 11:57:57 2023 kern.info kernel: [498695.912741] device l2tp102-102 left promiscuous mode
Wed Aug 16 11:57:57 2023 kern.info kernel: [498695.912767] br-digger1446: port 1(l2tp102-102) entered disabled state
Wed Aug 16 11:57:57 2023 daemon.notice netifd: bridge 'br-digger1446' link is down
Wed Aug 16 11:57:57 2023 daemon.notice netifd: Interface 'digger1446' has link connectivity loss
Wed Aug 16 11:57:57 2023 daemon.info olsrd[2611]: Removing interface br-digger1446
reason 0x4
from tunneldigger.
0x4 is the new 0x5, I changed the error on "connection closed due to epoll error" from ERROR_REASON_UNDEFINED to ERROR_REASON_FAILURE.
But why you would now see this for both clients, I cannot explain.
from tunneldigger.
@RalfJung I sent you a private email yesterday to the address listed on your webpage. Did you get a chance to read it?
I got the mail, yeah. But I don't have the setup to connect a client to your server, so server SSH access isn't really helpful, sorry. But thanks for the offer!
I don't think it's necessary for you to have your own client. You can just rely on the one stable client and the one unstable client which are already connecting. But if you aren't interested in ssh access, that's also fine.
from tunneldigger.
0x4 is the new 0x5, I changed the error on "connection closed due to epoll error" from ERROR_REASON_UNDEFINED to ERROR_REASON_FAILURE.
But why you would now see this for both clients, I cannot explain.
I already updated my comment above. I wrote too quickly. The client with uuid ending in 9f:41 is the unstable client. The client with uuid ending in 3a:35 is the stable client.
from tunneldigger.
Ah okay. So it doesn't make things worse. But it also doesn't help. Dang.
from tunneldigger.
I did a tcpdump on the working client, which changes MTU successfully. The "Destination unreachable (fragmentation needed)" packet is not coming in.
So I assume that the ICMP message is not coming from the problem client but instead from some router between the problem client and the broker.
Just trying to make sure the situation is clear.
from tunneldigger.
And I just checked to make sure: there are no ICMP messages going out of the problem client.
from tunneldigger.
I got it working!
self.error_count[fd] += 1
if self.error_count[fd] >= 10:
    pollable.close(reason=protocol.ERROR_REASON_FAILURE)
pollable.read(file_object)
Read the object to get it out of the buffer??? Anyhow, the problem client is changing MTUs and not disconnecting after just a few seconds.
from tunneldigger.
Probably read is not the right call to make, because after another test I got the following error:
Wed Aug 16 13:14:28 2023 daemon.err python[7737]: [ERROR/tunneldigger.broker] Unhandled top-level exception:
Wed Aug 16 13:14:28 2023 daemon.err python[7737]: [ERROR/tunneldigger.broker] Traceback (most recent call last):
Wed Aug 16 13:14:28 2023 daemon.err python[7737]: File "/usr/lib/python3.10/site-packages/tunneldigger_broker/main.py", line 116, in <module>
Wed Aug 16 13:14:28 2023 daemon.err python[7737]: event_loop.start()
Wed Aug 16 13:14:28 2023 daemon.err python[7737]: File "/usr/lib/python3.10/site-packages/tunneldigger_broker/eventloop.py", line 79, in start
Wed Aug 16 13:14:28 2023 daemon.err python[7737]: pollable.read(file_object)
Wed Aug 16 13:14:28 2023 daemon.err python[7737]: File "/usr/lib/python3.10/site-packages/tunneldigger_broker/hooks.py", line 90, in read
Wed Aug 16 13:14:28 2023 daemon.err python[7737]: data = file_object.read()
Wed Aug 16 13:14:28 2023 daemon.err python[7737]: ValueError: read of closed file
Wed Aug 16 13:14:28 2023 daemon.err python[7737]:
And on the broker, I'm getting the fragmentation-needed message 12 times. Is there a better way to flush the error queue than to read?
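One Linux-specific alternative worth noting (an assumption about what could work here, not what the broker does): pending errors sit on the socket's error queue and can be drained with MSG_ERRQUEUE instead of a regular read, so normal datagrams are never consumed by accident:

```python
import socket

def drain_error_queue(sock):
    """Drain the socket's error queue (Linux-only), as an alternative to
    calling a plain read() just to clear the condition behind EPOLLERR."""
    errors = []
    while True:
        try:
            _data, ancdata, _flags, _addr = sock.recvmsg(
                512, 512, socket.MSG_ERRQUEUE | socket.MSG_DONTWAIT)
            errors.append(ancdata)
        except BlockingIOError:
            break  # error queue is empty
    return errors
```

Note that detailed per-error information (such as the MTU reported by an ICMP fragmentation-needed message) only lands on the error queue if IP_RECVERR was enabled on the socket.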
from tunneldigger.
Ok, so now I separated the ERR and HUP cases and it seems to work reliably:
if event & select.EPOLLIN:
    pollable.read(file_object)
    # we got a message here, reset error count -- if there was an error, it probably was transient
    self.error_count[fd] = 0
elif event & select.EPOLLHUP:
    pollable.close(reason=protocol.ERROR_REASON_FAILURE)
elif event & select.EPOLLERR:
    # Here we have a problem. This can indicate permanent connection failure
    # (https://github.com/wlanslovenija/tunneldigger/issues/143), or it can
    # indicate that we sent a packet that was too big (e.g. the PMTU probe
    # reply, see https://github.com/wlanslovenija/tunneldigger/issues/171).
    # To distinguish the two, we count how many consecutive errors we see without a proper `read` in between.
    # If that reaches a threshold, we consider this error permanent and close the connection.
    # PMTU discovery sends 6 probes, so 10 should be safe as a threshold.
    self.error_count[fd] += 1
    if self.error_count[fd] >= 10:
        pollable.close(reason=protocol.ERROR_REASON_FAILURE)
    pollable.read(file_object)
After 6 connections, manually re-established by me, there are no more "read of closed file" errors.
from tunneldigger.
Hm, interesting that that helps. We certainly shouldn't close-then-read. But I am surprised that read makes any difference when there's no EPOLLIN...
from tunneldigger.
We certainly shouldn't close-then-read.
Very true. Reworked:

self.error_count[fd] += 1
pollable.read(file_object)
if self.error_count[fd] >= 10:
    pollable.close(reason=protocol.ERROR_REASON_FAILURE)
from tunneldigger.
I am surprised that read makes any difference when there's no EPOLLIN
I'm not really sure either. But I guess the error message is something which can somehow be read, and which should in any case be treated differently from normal IN data.
from tunneldigger.
So once you get an EPOLLERR, you can get the underlying errno using getsockopt: https://stackoverflow.com/questions/13150377/how-to-get-errno-when-epoll-wait-returns-epollerr
Maybe that could help to know when the EPOLLERR error is something to ignore and when something to act upon?
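The getsockopt approach looks roughly like this; the transient/permanent split below is an assumption following this discussion, not established behavior:

```python
import errno
import socket

def take_pending_error(sock):
    """Fetch (and thereby clear) the pending error behind EPOLLERR."""
    return sock.getsockopt(socket.SOL_SOCKET, socket.SO_ERROR)

def looks_transient(err):
    # EMSGSIZE: our own probe was too big -- expected during PMTU discovery.
    # ECONNREFUSED: for UDP this just means one packet bounced off a closed
    # port; the peer may well come back (the "[Errno 111]" sightings later
    # in this thread).
    return err in (0, errno.EMSGSIZE, errno.ECONNREFUSED)
```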
from tunneldigger.
We're doing a read on the socket now, which also seems to return that pending error.
from tunneldigger.
Sure, but shouldn't we count based on the error type and not use read to log it?
from tunneldigger.
Not sure how I would factor the error type into the counting...
Do you mean, entirely ignore the error if it's EMSGSIZE, not even add it to the count? I guess we could, though I doubt it is necessary.
from tunneldigger.
Do you mean, entirely ignore the error if it's EMSGSIZE, not even add it to the count?
Yeah, for example. So maybe ignore it if the error is EMSGSIZE, but otherwise immediately close and not even count?
from tunneldigger.
Other errors can be transient too... I'm seeing a bunch of [Errno 111] Connection refused, and at least some of these clients actually have somewhat working connections... just occasionally one of the packets gets rejected for some reason.
from tunneldigger.
And some connections see just a few of these errors initially and then they work fine.
from tunneldigger.
So, after leaving the system on overnight (commit fdbb096), I can report that everything seems to be running stable.
Thank you everyone for helping to squash this bug.
from tunneldigger.
Just updated to d538780 and it seems to be working well.
Maybe it's a good time to update
Line 5 in d538780
from tunneldigger.
I've never been involved in a release, but yeah I'm all for making another one. @mitar any objections? I'd say we should wait a week to see if unexpected issues come up in production (I deployed d538780 to all our servers this night).
@pmelange do you want to make a PR changing the setup.py version and, if necessary, completing the changelog in HISTORY.rst?
No objections.
But we should probably write up a good changelog, covering things like the fact that NAT is no longer necessary, and everything else since the last release. What all has changed since then?
No objections. Tell me if you need any permissions for anything.
Are we talking about going to 0.4.0 or 0.4.0-dev2?
We could go with 0.4.0-dev2 for now and then in a few weeks also do 0.4.0 if people test it. That assumes people will actually test a released 0.4.0-dev2, though. My feeling is that we do not have such a community of eager beta testers, so it might be easier to just go with 0.4.0, and then if bugs are reported we release a bugfix release and that is it.
OK, then I will work on 0.4.0. BTW, there are a LOT of commits since 0.3.0 (Sun Apr 2 13:05:23 2017). It's going to take a while to go through all the commits and create a changelog :/
git diff v0.3.0 --stat
49 files changed, 1332 insertions(+), 1026 deletions(-)
I thought 0.4.0-dev1 was just a name for "somewhere along the way to 0.4.0". I don't see the point to tag a dev version (and there are no such tags in the repo).