nlnog / nlnog-ring
NLNOG Server Ring Project
Home Page: http://ring.nlnog.net/
11:03 @cmouse some kind of automatic disable for nodes that are persistently gone would be nice
11:03 @cmouse so that they would be removed from 'ring-all' etc.
11:03 @cmouse until fixed
11:04 @Teun yeah, they slow down many ring tools
11:10 @cmouse perhaps if node is unreachable for more than 2 days it would be automatically removed
11:10 @cmouse and notified
11:10 @cmouse would get reinserted if it stays reachable for 24 hours
13:39 < sid3windr> does this include revoking ssh keys after 7d unreach? ;>
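The rule proposed in the chat above (remove after more than 2 days unreachable, reinsert after 24 hours reachable) could be sketched roughly as follows. This is only an illustration: `NodeState`, `observe` and the two constants are hypothetical names, and how probe results are collected or how owners are notified is left open.

```python
from datetime import datetime, timedelta

DISABLE_AFTER = timedelta(days=2)     # unreachable longer than this: drop from ring-all
REENABLE_AFTER = timedelta(hours=24)  # reachable at least this long: reinsert


class NodeState:
    """Tracks one node's reachability streak and whether it is listed (hypothetical)."""

    def __init__(self, now):
        self.enabled = True
        self.reachable = True
        self.since = now  # start of the current reachable/unreachable streak

    def observe(self, reachable, now):
        """Record one probe result; return 'disabled', 'enabled' or None."""
        if reachable != self.reachable:
            self.reachable = reachable
            self.since = now
        streak = now - self.since
        if self.enabled and not reachable and streak > DISABLE_AFTER:
            self.enabled = False
            return "disabled"  # the caller would notify the node owner here
        if not self.enabled and reachable and streak >= REENABLE_AFTER:
            self.enabled = True
            return "enabled"
        return None
```

A periodic job could call `observe()` once per node per probe and regenerate the 'ring-all' list from the nodes whose `enabled` flag is still set.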
make 'pastebinit' work with a ring pastebin
It's too much work to maintain all sshkeys for all participants. Participants should be able to add and remove their sshkeys via a web interface, an email interface, or something else.
two nice patches which could be useful for the mtr on the nodes:
http://www.roderick.triple-it.nl/blog/2010/12/29/mtr-with-aslookup/ (ASN lookups)
http://prolixium.com/files/code/mtr-patches/ (MPLS labels)
Suppress the ssh login failure messages on stdout while ring-trace is running, as they give the impression that the application is failing when it is actually busy finishing its routine.
Reported by Ad Trouwborst [email protected]
$ ring-trace -n 2 ring.nlnog.net
ring-trace v1.8.1 - written by Teun Vink [email protected]
picked 2 hosts at random: voxel02 claranet06
Performing ICMP traceroutes towards ring.nlnog.net from 2 ring hosts, ssh-timeout is 10 seconds.
Failed to lookup ASNs at cymru.com.
Traceback (most recent call last):
File "/usr/local/bin/ring-trace", line 1453, in <module>
ns.show_country, ns.remove_broken, ns.highlight_ixp, ns.remove_duplicate, ns.transparent)
File "/usr/local/bin/ring-trace", line 825, in graph
"\n%s" % tracedata[ips[index]]['fqdn'] if (index == 0 or resolve) else "",
KeyError: 'fqdn'
$
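The KeyError above suggests that some `tracedata` entries have no 'fqdn' key, for instance after a failed lookup. A minimal sketch of a tolerant label helper, assuming `tracedata` maps an IP to a dict as in the traceback; `hop_label` is a hypothetical name and not a function in ring-trace:

```python
def hop_label(tracedata, ip, show_name):
    """Return the extra label line for a hop, falling back to the IP itself.

    Hypothetical helper: uses dict.get() so a missing 'fqdn' key (or a missing
    entry for the IP) no longer raises KeyError.
    """
    if not show_name:
        return ""
    return "\n%s" % tracedata.get(ip, {}).get("fqdn", ip)
```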
It would be fantastic if some sort of 'outbound' mode is added to ring-trace, which would mean that ring-trace does an mtr on localhost towards all other RING nodes and makes a map from that. This way you can easily create a forward and a reverse ring-trace to assess the overall topology in both directions.
[11:02:10] host jump01 done in 8.1 seconds.
[11:02:10] host tenet01 done in 4.9 seconds.
[11:02:12] host yourorg01 done in 5.6 seconds.
[11:02:13] host voxel02 done in 6.6 seconds.
ssh: connect to host apnic01.ring.nlnog.net port 22: Connection timed out
[11:02:14] host apnic01 done in 13.5 seconds.
ssh: connect to host globalaxs02.ring.nlnog.net port 22: Connection timed out
[11:02:15] host globalaxs02 done in 13.5 seconds.
[11:02:16] host voxel01 done in 9.3 seconds.
ssh: connect to host ic-hosting01.ring.nlnog.net port 22: Connection timed out
[11:02:16] host ic-hosting01 done in 14.1 seconds.
ssh: connect to host nedzone01.ring.nlnog.net port 22: Connection timed out
[11:02:16] host nedzone01 done in 13.2 seconds.
[11:02:16] host gossamerthreads01 done in 14.6 seconds.
[11:02:16] host seeweb01 done in 12.1 seconds.
ssh: connect to host leaseweb01.ring.nlnog.net port 22: Connection timed out
[11:02:23] host leaseweb01 done in 20.3 seconds.
ssh: connect to host signet01.ring.nlnog.net port 22: Connection timed out
[11:02:25] host signet01 done in 20.4 seconds.
ssh: connect to host surfnet01.ring.nlnog.net port 22: Connection timed out
[11:02:26] host surfnet01 done in 20.3 seconds.
[11:02:26] analysing traces.
[11:02:27] Failed to parse ASN lookup line: Error: no ASN or IP match on line 633.
[11:02:27] Looking up DNS entries.
[11:02:38] 679 DNS lookups done.
[11:02:38] Checking IXP lists.
[11:02:50] generating color table.
[11:02:50] generating graphs.
Traceback (most recent call last):
File "/usr/local/bin/ring-trace", line 1284, in <module>
ns.show_country, ns.remove_broken, ns.highlight_ixp, ns.remove_duplicate)
File "/usr/local/bin/ring-trace", line 745, in graph
asns.insert(0, a[0])
IndexError: list index out of range
job@master01:~$
occaid01 is causing this crash, most probably due to the lack of IPv4.
job: !ring-ping -6 2a02:ce0:9::e2f8:47ff:fe13:3836
nlnog: 2a02:ce0:9::e2f8:47ff:fe13:3836: 99 servers: 181ms average
nlnog: 2a02:ce0:9::e2f8:47ff:fe13:3836: unreachable from: boxed-it01
nlnog: ssh connection failed: apnic01
If one node out of ~100 fails, I don't think that's very interesting to know for each ping.
Proposal:
If fewer than x% (x=5?) of nodes are not responding (ssh connection failed etc.), don't mention it at all, or just report a count on one of the first two lines, to reduce noise.
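One way to express this proposal, as a rough sketch: `failure_summary` and its message format are hypothetical, and the 5% default is only the value floated above.

```python
def failure_summary(failed_hosts, total_hosts, threshold_pct=5.0):
    """Return a one-line note about failed nodes, or None below the threshold."""
    if total_hosts <= 0 or not failed_hosts:
        return None
    pct = 100.0 * len(failed_hosts) / total_hosts
    if pct < threshold_pct:
        return None  # too few failures to be worth reporting
    return "ssh connection failed on %d of %d nodes (%.0f%%)" % (
        len(failed_hosts), total_hosts, pct)
```

With roughly 100 nodes, a single failing node stays silent, while ten failing nodes produce one summary line.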
Need some firewalling.
A broken path causes a crash when using -a and -B together:
pels@fizzix:~/Desktop$ ring-trace -R -6 -a -X -t png -n 0 -i as250net01 -B public01.infra.ring.nlnog.net
ring-trace v1.7.1 - written by Teun Vink [email protected]
Including 1 hosts: as250net01
Performing ICMP traceroutes towards public01.infra.ring.nlnog.net from 1 ring hosts, ssh-timeout is 10 seconds.
Traceback (most recent call last):
File "/home/pels/bin/ring-trace", line 1320, in <module>
ns.show_country, ns.remove_broken, ns.highlight_ixp, ns.remove_duplicate, ns.transparent)
File "/home/pels/bin/ring-trace", line 765, in graph
asns.insert(0, a[0])
IndexError: list index out of range
This is triggered by the unfinished traceroute:
artin@as250net01:~$ mtr --report www.ams-ix.net
HOST: as250net01 Loss% Snt Last Avg Best Wrst StDev
1.|-- 2001:4ce8:fbb0::f128 0.0% 11 0.1 0.2 0.1 1.6 0.5
2.|-- ??? 100.0 10 0.0 0.0 0.0 0.0 0.0
When using only -a or only -B no crash occurs.
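The IndexError on `a[0]` suggests that `a` is empty when the trace contains no usable hops. A minimal sketch of a guard, assuming `a` holds four-element tuples like the ones built in the list comprehension quoted in a later traceback on this page; `prepend_first_asn` and the placeholder tuple are hypothetical:

```python
def prepend_first_asn(asns, a, placeholder=("unknown", False, "unknown", "")):
    """Insert the first entry of `a` at the front of `asns`.

    Falls back to a placeholder tuple when `a` is empty, instead of raising
    IndexError as `asns.insert(0, a[0])` does.
    """
    asns.insert(0, a[0] if a else placeholder)
    return asns
```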
Waiting for the DNS resolver to time out is often the longest wait; doing those lookups in parallel would be faster.
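A sketch of parallel lookups using a thread pool from the standard library. `reverse_lookup` and `resolve_all` are hypothetical names, and the worker count is an arbitrary choice rather than a tuned value.

```python
import socket
from concurrent.futures import ThreadPoolExecutor


def reverse_lookup(ip):
    """Return the PTR name for ip, or the ip itself if the lookup fails."""
    try:
        return socket.gethostbyaddr(ip)[0]
    except OSError:  # covers socket.herror and socket.gaierror
        return ip


def resolve_all(ips, resolver=reverse_lookup, workers=32):
    """Resolve all addresses concurrently and return {ip: name}."""
    unique = list(dict.fromkeys(ips))  # de-duplicate while keeping order
    if not unique:
        return {}
    with ThreadPoolExecutor(max_workers=min(workers, len(unique))) as pool:
        return dict(zip(unique, pool.map(resolver, unique)))
```

Since each lookup mostly waits on the network, threads are enough here; slow or timing-out lookups then overlap instead of adding up.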
ignore IPv6-only hosts when doing a ring-ping to an IPv4-only destination so the results won't get messed up.
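A small sketch of such a filter, assuming the tool has a mapping from node name to that node's addresses (the mapping and the name `nodes_for_target` are hypothetical):

```python
import ipaddress


def nodes_for_target(target_ip, node_addresses):
    """Return the nodes that have an address in the same family as target_ip."""
    version = ipaddress.ip_address(target_ip).version
    return [
        node
        for node, addrs in node_addresses.items()
        if any(ipaddress.ip_address(a).version == version for a in addrs)
    ]
```

For example, a node that only has `2001:db8::1` would be skipped for an IPv4 target, so it would not show up as "unreachable" in the results.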
Feature request: remove specific addresses from ring traces, since sometimes routers with multiple interfaces mess up traces.
It could be good to have DNS LOC entries, with GPS position of nodes directly in DNS, to be able to use this information.
An example of such a thing is on http://hewgill.com/tools/dnsloc (try with nautile01.ring.nlnog.net).
this stuff is mega cool - http://erg.wand.net.nz/amp/matrix.php/ipv4/latency/NZ/
Put a timestamp in the ringtrace comment.
It would be nice if participants could (voluntarily) set up an eBGP multi-hop session to this bird instance and announce everything they have in their IPv4 and IPv6 tables.
ring users could ssh to this machine and will drop straight into looking glass software
[11:39] lochii:#ring wanted to know if possible to run any form of httpd on our ring nodes, serving nothing (just replying 200)
[11:39] lochii:#ring and by run, I mean puppeted
[11:39] lochii:#ring don't care what it is, as long as it can answer 200
[11:40] lochii:#ring reason being, want to integrate it with some external monitoring which is only able to do HTTP health checks
[11:46] lochii:#ring anything 2xx is fine
[12:36] job:#ring lochii: not a bad idea. Please email ring-admins as a reminder
[12:37] job:#ring Or open an issue on github
[12:38] lochii:#ring ack, thanks
[12:41] job:#ring And we can use those pages to promote the ring
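For illustration, a responder that answers every request with 200 can be written with the Python standard library alone. This is a sketch, not a proposal for what should actually be puppeted: `AlwaysOK`, `make_server`, the port and the body text are all assumptions.

```python
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer


class AlwaysOK(BaseHTTPRequestHandler):
    """Answers every GET or HEAD with 200 and a tiny plain-text body."""

    BODY = b"NLNOG RING node, see http://ring.nlnog.net/\n"

    def _reply(self, send_body):
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(self.BODY)))
        self.end_headers()
        if send_body:
            self.wfile.write(self.BODY)

    def do_GET(self):
        self._reply(True)

    def do_HEAD(self):
        self._reply(False)

    def log_message(self, fmt, *args):
        pass  # stay quiet; a real deployment might want logging instead


def make_server(host="127.0.0.1", port=8080):
    """Build the server; call .serve_forever() on the result to run it."""
    return ThreadingHTTPServer((host, port), AlwaysOK)
```

Running it would be `make_server(host="", port=80).serve_forever()` or similar; the body is where a short pointer to the ring could go, as suggested at the end of the chat.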
Add an additional - hop in -a mode so it's clear multiple boxes from one org are in the same ASN.
DNS resolving (-r) doesn't work for text output (-t txt):
tdc@tdc01:~$ ring-trace -4 -t txt -n 1 -r www.cisco.com
ring-trace v1.8.1 - written by Teun Vink <[email protected]>
picked 1 host at random: claranet05
Performing ICMP traceroutes towards www.cisco.com from 1 ring hosts, ssh-timeout is 10 seconds.
traceroutes to 23.78.47.242 generated by ring-trace 1.8.1 at 2017-05-30 19:12:43
Node: claranet05
----------------
1.|-- 62.240.228.2 0.0% 1 0.3 0.3 0.3 0.3 0.0
2.|-- 212.43.193.114 0.0% 1 4.9 4.9 4.9 4.9 0.0
3.|-- 62.240.250.213 0.0% 1 1.3 1.3 1.3 1.3 0.0
4.|-- 213.242.120.69 0.0% 1 1.9 1.9 1.9 1.9 0.0
5.|-- 4.69.140.30 0.0% 1 1.5 1.5 1.5 1.5 0.0
6.|-- 141.136.103.181 0.0% 1 1.3 1.3 1.3 1.3 0.0
7.|-- 89.149.181.146 0.0% 1 1.3 1.3 1.3 1.3 0.0
8.|-- 46.33.95.178 0.0% 1 1.6 1.6 1.6 1.6 0.0
9.|-- 78.152.47.136 0.0% 1 26.1 26.1 26.1 26.1 0.0
10.|-- 78.152.57.10 0.0% 1 29.1 29.1 29.1 29.1 0.0
11.|-- 23.78.47.242 0.0% 1 25.9 25.9 25.9 25.9 0.0
Add a flag to filter IXP hops from graphs to simplify graphs.
Also, highlighting IXP hops when not removed can be useful.
Needed: a list of IXP addresses, Rodecker might be able to provide this.
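Once a list of IXP prefixes is available, matching hops against it is straightforward. A sketch, with hypothetical helper names and documentation prefixes standing in for real IXP peering LANs:

```python
import ipaddress


def load_prefixes(lines):
    """Parse one prefix per line into network objects, skipping blanks."""
    return [ipaddress.ip_network(line.strip()) for line in lines if line.strip()]


def is_ixp_hop(ip, ixp_prefixes):
    """True if ip falls inside any of the given prefixes."""
    addr = ipaddress.ip_address(ip)
    return any(addr in net for net in ixp_prefixes)


def drop_ixp_hops(hops, ixp_prefixes):
    """Return the hop list without hops on an IXP peering LAN."""
    return [ip for ip in hops if not is_ixp_hop(ip, ixp_prefixes)]
```

The same `is_ixp_hop` test could drive highlighting instead of removal.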
All shell commands, and as much else as possible, should be logged (over an encrypted connection) to one or two master servers and stored for future reference in case of abuse.
so we don't manually have to do 'sudo adduser $participant admin' on freshly provisioned machines
ring-trace largely ignores the result of the SSH command it runs. This can result in problems, e.g. when a new host key is offered and user input is needed, but also when the SSH connection fails.
Better handling of the actual SSH command should fix this.
It is not possible to trace only from a specific set of hosts. -n needs to be at least 1 at the moment:
$ ring-trace -i zylon01 -i interconnect01 -i solido01 -i prolocation01 -i totaalnet01 -i cyso01 -i msp01 128.0.0.1
ring-trace v0.9.9 - written by Teun Vink [email protected]
Including 7 hosts: zylon01 interconnect01 solido01 prolocation01 totaalnet01 cyso01 msp01
Performing ICMP traceroutes towards 128.0.0.1 from 49 ring hosts.
$ ring-trace -n 0 -i zylon01 -i interconnect01 -i solido01 -i prolocation01 -i totaalnet01 -i cyso01 -i msp01 128.0.0.1
ring-trace v0.9.9 - written by Teun Vink [email protected]
Including 7 hosts: zylon01 interconnect01 solido01 prolocation01 totaalnet01 cyso01 msp01
Performing ICMP traceroutes towards 128.0.0.1 from 49 ring hosts.
$ ring-trace -n 1 -i zylon01 -i interconnect01 -i solido01 -i prolocation01 -i totaalnet01 -i cyso01 -i msp01 128.0.0.1
ring-trace v0.9.9 - written by Teun Vink [email protected]
picked 1 host at random: portlane01
Including 7 hosts: zylon01 interconnect01 solido01 prolocation01 totaalnet01 cyso01 msp01
Performing ICMP traceroutes towards 128.0.0.1 from 8 ring hosts.
Created trace-128.0.0.1.jpg in 6.6 seconds.
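The behaviour asked for here (a random count of 0 combined with explicit includes) could look like the sketch below. `select_hosts` is a hypothetical name and not how ring-trace actually picks hosts.

```python
import random


def select_hosts(all_hosts, count, include=(), exclude=()):
    """Pick `count` random hosts, then add the explicitly included ones.

    A count of 0 is allowed, in which case only the included hosts are used.
    """
    pool = [h for h in all_hosts if h not in include and h not in exclude]
    picked = random.sample(pool, min(max(count, 0), len(pool)))
    return picked + [h for h in include if h not in exclude]
```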
arrange with puppet facts
"traceroute -I " has been requested multiple times by now :-)
Feature request: make traces from all nodes, but only graph the top/bottom X hosts based on hop count, latency and possibly other characteristics of the trace.
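A sketch of the selection step, assuming traces are held as a dict of host name to hop list. `extremes` is a hypothetical name; hop count is the default ranking, and another metric such as latency could be passed as `key`.

```python
def extremes(traces, x, key=len):
    """Return the x lowest- and x highest-ranked hosts by key(trace)."""
    if x <= 0:
        return []
    ranked = sorted(traces, key=lambda host: key(traces[host]))
    if 2 * x >= len(ranked):
        return ranked  # the two ends overlap, so keep everything
    return ranked[:x] + ranked[-x:]
```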
Give outsiders more access to how we manage the ring
Also master01 should pull from this repo after some checking has been done
All broken hops (resulting in * * *) are graphed as one single node in the graph, though it might actually represent a large number of different hops.
Ignoring these hops results in improper graphs, so the best solution would be to create separate nodes (and edges) for these hops.
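One possible way to keep broken hops apart is to key each of them on the source host and hop number. A sketch under that assumption, with `trace_edges` and the `unknown:` naming scheme being hypothetical:

```python
def trace_edges(source, hops):
    """Turn one trace into graph edges, giving each unanswered hop its own node.

    `hops` is the ordered list of hop IPs, with None for a hop that did not answer.
    """
    nodes = [source]
    for index, ip in enumerate(hops, start=1):
        # Unknown hops get a name unique to this source and position, so they
        # are never merged with unknown hops from other traces.
        nodes.append(ip if ip is not None else "unknown:%s:%d" % (source, index))
    return list(zip(nodes, nodes[1:]))
```

The graph label for such a node could still be drawn as `* * *`; only the node identity differs.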
We've been tracing a reachability issue trying to detect which ASN is performing DPI on incoming DNS packets towards one of our root servers and blocking some of them.
Ultimately we used ping-trace to build a view of the topology facing that server, and then had to manually run tests on the chosen nodes to find out which ones had the query blocked, and from that determine which common ASN was responsible for that blocking.
It would be really useful if ping-trace could run a secondary command alongside the mtr tracer, and then indicate in the 0th hop output what the return status of that command was. This would allow us at a single glance to identify which group of topologically connected nodes are causing an issue.
Here are two identical runs of ring-trace, where the first one crashes and the second one went OK:
tore@stat:~$ /stat/local/nlnog-ring/scripts/ring-trace -ab4X -n 20 dns.i.bitbit.net
ring-trace v1.6.1 - written by Teun Vink <[email protected]>
picked 20 hosts at random: speedpartner01 rrbone01 rootlu01 dcsone01 nexellent01 nxs01 strato01 enestdata01 hosteam01 melbourne01 softlayer06 nuqe01 bluezonejordan01 ebayclassifiedsgroup01 sixdegrees01 onet02 poznan01 dyn01 softlayer01 openminds01
Performing ICMP traceroutes towards dns.i.bitbit.net from 20 ring hosts, ssh-timeout is 10 seconds.
ssh: connect to host softlayer06.ring.nlnog.net port 22: Connection refused
Host softlayer06 returned no trace.
Traceback (most recent call last):
File "/stat/local/nlnog-ring/scripts/ring-trace", line 1293, in <module>
ns.show_country, ns.remove_broken, ns.highlight_ixp, ns.remove_duplicate)
File "/stat/local/nlnog-ring/scripts/ring-trace", line 737, in graph
a = [(tracedata[ip].get("asn", "unknown"),tracedata[ip].get("ix", False),tracedata[ip].get("desc", "unknown"), "") for ip in ips if not (tracedata[ip].get("ix", False) and no_ixp)]
KeyError: ''
tore@stat:~$ /stat/local/nlnog-ring/scripts/ring-trace -ab4X -n 20 dns.i.bitbit.net
ring-trace v1.6.1 - written by Teun Vink <[email protected]>
picked 20 hosts at random: rackfish01 is01 ripe01 hostway01 rbnetwork02 afilias01 claranet06 signet01 softlayer04 dyn01 westnet01 atrato03 tetaneutral01 direcpath01 hurricane01 ualbany01 isc01 cambrium01 siminn01 ispservices01
Performing ICMP traceroutes towards dns.i.bitbit.net from 20 ring hosts, ssh-timeout is 10 seconds.
Image uploaded to https://ring.nlnog.net/paste/p/1v4d2w4tvrm2a4c4
Done in 26.4 seconds.
For use in powerpoint
A .ring-tracerc file would be useful, so we can set default arguments. This might require a rewrite of commandline arguments so each option can be disabled as well as enabled.
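A sketch of how this could work with argparse: arguments from the rc file are placed before the command line, and each boolean option gets an explicit negative form so a default can be switched off again. The long option names, the default of 10 nodes and the function names are hypothetical, and `BooleanOptionalAction` needs Python 3.9 or newer.

```python
import argparse
import os
import shlex


def build_parser():
    parser = argparse.ArgumentParser(prog="ring-trace")
    # BooleanOptionalAction generates both --resolve and --no-resolve.
    parser.add_argument("--resolve", action=argparse.BooleanOptionalAction, default=False)
    parser.add_argument("-n", "--nodes", type=int, default=10)
    parser.add_argument("target")
    return parser


def load_rc_args(path):
    """Return the arguments stored in the rc file, or [] if it does not exist."""
    try:
        with open(path) as handle:
            # comments=True lets the file contain '#' comment lines
            return shlex.split(handle.read(), comments=True)
    except FileNotFoundError:
        return []


def parse_args(argv, rc_path=os.path.expanduser("~/.ring-tracerc")):
    # rc-file arguments go first so that the command line overrides them.
    return build_parser().parse_args(load_rc_args(rc_path) + list(argv))
```

With `--resolve -n 5` in the rc file, running with `--no-resolve example.net` would keep the node count of 5 but turn resolving off for that run.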
Feature request: show only the AS-path trace instead of the actual IPs. This would result in simpler graphs.
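Given the ASN of each hop, the AS path is the hop list with consecutive repeats collapsed. A minimal sketch, with `as_path` being a hypothetical name and the ASN lookup itself out of scope:

```python
from itertools import groupby


def as_path(hop_asns):
    """Collapse consecutive hops in the same ASN into one AS-path entry."""
    return [asn for asn, _ in groupby(hop_asns)]
```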
Some kind of wrapper that can manage the sshkeys for users would be nice. I imagine the following:
user types ring-change-sshkeys on a ring-node, and it will open up the preferred editor and it should look like typing 'crontab -e'.
The first few lines of the authorized_keys file that is displayed should be some comments which contain warnings and what the file means.
The user can then edit the file to their liking, and when quitting the editor the resulting data should be validated:
If an error is found the user should be presented with a choice: discard all changes or open the editor again.
If the file is valid, it can be emailed to ring-admins@ in puppetized format and incorporated in the repos. In the future we can automate this if we have reason to trust the system.
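The validation step could start out as something like the sketch below. It only accepts plain `type base64 [comment]` lines and checks that the key blob begins with the same key type; lines with options in front of the key type are rejected, which is a limitation of this sketch. The function names and the list of accepted key types are assumptions.

```python
import base64
import binascii
import struct

KEY_TYPES = {"ssh-rsa", "ssh-ed25519", "ecdsa-sha2-nistp256",
             "ecdsa-sha2-nistp384", "ecdsa-sha2-nistp521"}


def check_key_line(line):
    """Return None if the line looks like a valid public key, else an error string."""
    line = line.strip()
    if not line or line.startswith("#"):
        return None  # blank lines and comments are allowed
    fields = line.split()
    if len(fields) < 2 or fields[0] not in KEY_TYPES:
        return "unknown key type or missing key data"
    try:
        blob = base64.b64decode(fields[1], validate=True)
    except (binascii.Error, ValueError):
        return "key data is not valid base64"
    if len(blob) < 4:
        return "key data is too short"
    # The blob starts with a length-prefixed copy of the key type.
    (length,) = struct.unpack(">I", blob[:4])
    if blob[4:4 + length] != fields[0].encode():
        return "key type does not match key data"
    return None


def validate(text):
    """Return a list of (line_number, error) for every bad line."""
    errors = []
    for number, line in enumerate(text.splitlines(), start=1):
        error = check_key_line(line)
        if error:
            errors.append((number, error))
    return errors
```

An empty list from `validate()` would mean the file can be submitted; otherwise the wrapper can show the errors and offer the discard-or-edit-again choice.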
So we can make a meaningful ring-dns.
Need to work out how to push the resolver IPs into puppet (field in the webform? retrieve from node's resolv.conf?)
ring-trace needs a TCP-based DNS query to retrieve the list of nodes. If TCP/53 is blocked a better error message should be given, or a workaround should be provided.
We require a db backend for this; habbie has already written some Python to help in that space.
Current steps are
this should be easier...
Optionally remove duplicate edges to reduce graph complexity.
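If edges are held as (from, to) pairs, removing duplicates while keeping the original order is a one-liner. A sketch, with `unique_edges` being a hypothetical name:

```python
def unique_edges(edges):
    """Drop repeated (from, to) pairs while keeping first-seen order."""
    return list(dict.fromkeys(edges))
```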
We need a new logo. I will contact a graphics designer. :-)
standard MOTD should contain at least