nlnog-ring's People

Contributors

arjenz, arkadiusznowicki, habbie, jan-vandenberg, job, jonglezb, kev6565, leoluk, madeddie, marlinc, miekg, nickhilliard, pieterlexis, qvr, raybellis, rbianic, richih, rodecker, sparkeh, sspans, teunvink, wk, ytti

nlnog-ring's Issues

Automatically deactivate unreachable nodes

11:03 @cmouse some kind of automatic disable for nodes that are persistently gone would be nice
11:03 @cmouse so that they would be removed from 'ring-all' etc.
11:03 @cmouse until fixed
11:04 @Teun yeah, they slow down many ring tools
11:10 @cmouse prhaps if node is unreachable for more than 2 days it would be automatically removed
11:10 @cmouse and notified
11:10 @cmouse would get reinserted if it stays reachable for 24 hours
13:39 < sid3windr> does this include revoking ssh keys after 7d unreach? ;>
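
A minimal sketch of the rule discussed above: disable a node after 2 days of unreachability and re-enable it after 24 hours of continuous reachability. The `NodeState` class and `record_probe` function are hypothetical names for illustration, not part of any existing ring tooling, and notification is left out.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional

# Thresholds taken from the IRC discussion; everything else is an assumption.
DISABLE_AFTER = timedelta(days=2)
REENABLE_AFTER = timedelta(hours=24)


@dataclass
class NodeState:
    name: str
    active: bool = True
    # Start of the current uninterrupted reachable/unreachable streak.
    streak_start: Optional[datetime] = None
    streak_reachable: bool = True


def record_probe(node: NodeState, reachable: bool, now: datetime) -> bool:
    """Record one probe result; return True if node.active changed."""
    if node.streak_start is None or node.streak_reachable != reachable:
        # First probe, or the state flipped: restart the clock.
        node.streak_start = now
        node.streak_reachable = reachable
    streak = now - node.streak_start
    if node.active and not reachable and streak >= DISABLE_AFTER:
        node.active = False
        return True
    if not node.active and reachable and streak >= REENABLE_AFTER:
        node.active = True
        return True
    return False
```

A caller would run `record_probe` for every node on each monitoring pass and regenerate the 'ring-all' host list (and notify the owner) whenever it returns True.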

ring pastebin

make 'pastebinit' work with a ring pastebin

  • this pastebin is only accessible from ring IPs
  • good promotion for the ring
  • html viewer option (ring-all output is in markdown notation)

Redesign sshkey control

It's too much work to maintain all sshkeys for all participants. Participants should be able to add and remove their own sshkeys via a web interface, an email interface, or something else.

Suppress failure messages

Suppress the ssh login failure messages written to stdout while ring-trace is running, as they give the impression that the application is failing when it is actually busy finishing its routine.

Reported by Ad Trouwborst [email protected]

ring-trace breaks when whois.cymru.com is down

$ ring-trace -n 2 ring.nlnog.net
ring-trace v1.8.1 - written by Teun Vink [email protected]
picked 2 hosts at random: voxel02 claranet06
Performing ICMP traceroutes towards ring.nlnog.net from 2 ring hosts, ssh-timeout is 10 seconds.
Failed to lookup ASNs at cymru.com.
Traceback (most recent call last):
  File "/usr/local/bin/ring-trace", line 1453, in <module>
    ns.show_country, ns.remove_broken, ns.highlight_ixp, ns.remove_duplicate, ns.transparent)
  File "/usr/local/bin/ring-trace", line 825, in graph
    "\n%s" % tracedata[ips[index]]['fqdn'] if (index == 0 or resolve) else "",
KeyError: 'fqdn'
$
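
One possible way to avoid the KeyError is to treat a missing 'fqdn' as "no label". This is a sketch only: `hop_label_suffix` is a hypothetical helper, and the `tracedata` layout (a dict keyed by IP) is inferred from the traceback, not taken from the ring-trace source.

```python
def hop_label_suffix(tracedata: dict, ip: str, index: int, resolve: bool) -> str:
    """Return a newline plus the FQDN for the hop label, or "" if unwanted or unknown."""
    if not (index == 0 or resolve):
        return ""
    # .get() at both levels: the IP may be missing from tracedata entirely,
    # or present without an 'fqdn' key (e.g. when a lookup step failed).
    fqdn = tracedata.get(ip, {}).get("fqdn")
    return "\n%s" % fqdn if fqdn else ""
```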

outbound mode for ring-trace

It would be fantastic if some sort of 'outbound' mode were added to ring-trace, in which ring-trace runs an mtr from the local host towards all other RING nodes and makes a map from that. This way you can easily create a forward and a reverse ring-trace to assess the overall topology in both directions.

ring-trace crash

[11:02:10] host jump01 done in 8.1 seconds.
[11:02:10] host tenet01 done in 4.9 seconds.
[11:02:12] host yourorg01 done in 5.6 seconds.
[11:02:13] host voxel02 done in 6.6 seconds.
ssh: connect to host apnic01.ring.nlnog.net port 22: Connection timed out
[11:02:14] host apnic01 done in 13.5 seconds.
ssh: connect to host globalaxs02.ring.nlnog.net port 22: Connection timed out
[11:02:15] host globalaxs02 done in 13.5 seconds.
[11:02:16] host voxel01 done in 9.3 seconds.
ssh: connect to host ic-hosting01.ring.nlnog.net port 22: Connection timed out
[11:02:16] host ic-hosting01 done in 14.1 seconds.
ssh: connect to host nedzone01.ring.nlnog.net port 22: Connection timed out
[11:02:16] host nedzone01 done in 13.2 seconds.
[11:02:16] host gossamerthreads01 done in 14.6 seconds.
[11:02:16] host seeweb01 done in 12.1 seconds.
ssh: connect to host leaseweb01.ring.nlnog.net port 22: Connection timed out
[11:02:23] host leaseweb01 done in 20.3 seconds.
ssh: connect to host signet01.ring.nlnog.net port 22: Connection timed out
[11:02:25] host signet01 done in 20.4 seconds.
ssh: connect to host surfnet01.ring.nlnog.net port 22: Connection timed out
[11:02:26] host surfnet01 done in 20.3 seconds.
[11:02:26] analysing traces.
[11:02:27] Failed to parse ASN lookup line: Error: no ASN or IP match on line 633.
[11:02:27] Looking up DNS entries.
[11:02:38] 679 DNS lookups done.
[11:02:38] Checking IXP lists.
[11:02:50] generating color table.
[11:02:50] generating graphs.
Traceback (most recent call last):
  File "/usr/local/bin/ring-trace", line 1284, in <module>
    ns.show_country, ns.remove_broken, ns.highlight_ixp, ns.remove_duplicate)
  File "/usr/local/bin/ring-trace", line 745, in graph
    asns.insert(0, a[0])
IndexError: list index out of range
job@master01:~$

occaid01 is causing this crash, most probably due to the lack of IPv4.
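
The crash comes from indexing an empty list in `asns.insert(0, a[0])`. A guard along these lines would avoid it. This is a sketch, not the actual fix: `prepend_first` is a hypothetical helper, and the placeholder tuple shape (asn, ix, desc, "") is an assumption based on tracebacks from other ring-trace versions.

```python
# Assumed placeholder for a trace that produced no usable hops.
UNKNOWN_HOP = ("unknown", False, "unknown", "")


def prepend_first(asns: list, a: list, placeholder: tuple = UNKNOWN_HOP) -> list:
    """Insert a[0] at the front of asns, or a placeholder when a is empty."""
    asns.insert(0, a[0] if a else placeholder)
    return asns
```

Whether a placeholder or skipping the host entirely is the better behaviour depends on how the graph should show hosts without a usable trace.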

ring-ping (irc): hide errors

  job:  !ring-ping -6 2a02:ce0:9::e2f8:47ff:fe13:3836
nlnog:  2a02:ce0:9::e2f8:47ff:fe13:3836: 99 servers: 181ms average
nlnog:  2a02:ce0:9::e2f8:47ff:fe13:3836: unreachable from: boxed-it01
nlnog:  ssh connection failed: apnic01

If one node out of ~100 fails, I don't think that's very interesting to know for each ping.

Proposal:
If fewer than x% (x=5?) of nodes are not responding (ssh connection failed, etc.), don't mention it at all, or perhaps just report a count on one of the first two lines, to reduce noise.
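
A rough sketch of the proposed threshold, implementing the "don't mention it at all" variant. `failure_summary` and its parameters are hypothetical names; the bot's real code is not shown in this issue.

```python
def failure_summary(failed: list, total: int, threshold_pct: float = 5.0) -> str:
    """Return "" when the share of failing nodes is below the threshold,
    otherwise a single line listing the failing nodes."""
    if total <= 0 or not failed:
        return ""
    pct = 100.0 * len(failed) / total
    if pct < threshold_pct:
        # Below the noise threshold: say nothing.
        return ""
    return "ssh connection failed: %s" % " ".join(sorted(failed))
```

For the "report a count" variant, the branch below the threshold could return something like "1 node(s) skipped" instead of an empty string.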

Crash when -a -B encounters incomplete trace

A broken path causes a crash when using -a and -B together:

pels@fizzix:~/Desktop$ ring-trace -R -6 -a -X -t png -n 0 -i as250net01 -B public01.infra.ring.nlnog.net
ring-trace v1.7.1 - written by Teun Vink [email protected]
Including 1 hosts: as250net01
Performing ICMP traceroutes towards public01.infra.ring.nlnog.net from 1 ring hosts, ssh-timeout is 10 seconds.
Traceback (most recent call last):
  File "/home/pels/bin/ring-trace", line 1320, in <module>
    ns.show_country, ns.remove_broken, ns.highlight_ixp, ns.remove_duplicate, ns.transparent)
  File "/home/pels/bin/ring-trace", line 765, in graph
    asns.insert(0, a[0])
IndexError: list index out of range

This is triggered by the unfinished traceroute:

artin@as250net01:~$ mtr --report www.ams-ix.net
HOST: as250net01 Loss% Snt Last Avg Best Wrst StDev
1.|-- 2001:4ce8:fbb0::f128 0.0% 11 0.1 0.2 0.1 1.6 0.5
2.|-- ??? 100.0 10 0.0 0.0 0.0 0.0 0.0

When using only -a or only -B no crash occurs.

create routeviews.ring.nlnog.net bgp looking glass

It would be nice if participants could (voluntarily) set up an eBGP multi-hop session to this bird instance and announce everything they have in their IPv4 and IPv6 tables.

Ring users could then ssh to this machine and be dropped straight into the looking glass software.

permit running of http daemon on node

[11:39] lochii:#ring wanted to know if possible to run any form of httpd on our ring nodes, serving nothing (just replying 200)
[11:39] lochii:#ring and by run, I mean puppeted
[11:39] lochii:#ring don't care what it is, as long as it can answer 200
[11:40] lochii:#ring reason being, want to integrate it with some external monitoring which is only able to do HTTP health checks
[11:46] lochii:#ring anything 2xx is fine
[12:36] job:#ring lochii: not a bad idea. Please email ring-admins as a reminder
[12:37] job:#ring Or open an issue on github
[12:38] lochii:#ring ack, thanks
[12:41] job:#ring And we can use those pages to promote the ring
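
To illustrate how little is needed for "anything that answers 200", here is a sketch using only the Python standard library. It is not a proposal for what should actually be deployed via puppet; `HealthHandler`, `make_server` and the port are made up for the example.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer


class HealthHandler(BaseHTTPRequestHandler):
    """Answer every GET/HEAD with an empty 200 response."""

    def do_GET(self):
        self.send_response(200)
        self.send_header("Content-Length", "0")
        self.end_headers()

    do_HEAD = do_GET

    def log_message(self, format, *args):
        # Keep the node's logs quiet; health checks are frequent.
        pass


def make_server(host: str = "", port: int = 8080) -> HTTPServer:
    # Note: HTTPServer listens on IPv4 only by default.
    return HTTPServer((host, port), HealthHandler)
```

Running it would be `make_server().serve_forever()`. A static page promoting the ring, as suggested above, would replace the empty body.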

ring-trace -t txt -r does not do resolving

DNS resolving (-r) doesn't work for text output (-t txt):

tdc@tdc01:~$ ring-trace -4 -t txt -n 1 -r www.cisco.com
ring-trace v1.8.1 - written by Teun Vink <[email protected]>
picked 1 host at random: claranet05
Performing ICMP traceroutes towards www.cisco.com from 1 ring hosts, ssh-timeout is 10 seconds.
traceroutes to 23.78.47.242 generated by ring-trace 1.8.1 at 2017-05-30 19:12:43

Node: claranet05
----------------
1.|-- 62.240.228.2               0.0%     1    0.3   0.3   0.3   0.3   0.0
2.|-- 212.43.193.114             0.0%     1    4.9   4.9   4.9   4.9   0.0
3.|-- 62.240.250.213             0.0%     1    1.3   1.3   1.3   1.3   0.0
4.|-- 213.242.120.69             0.0%     1    1.9   1.9   1.9   1.9   0.0
5.|-- 4.69.140.30                0.0%     1    1.5   1.5   1.5   1.5   0.0
6.|-- 141.136.103.181            0.0%     1    1.3   1.3   1.3   1.3   0.0
7.|-- 89.149.181.146             0.0%     1    1.3   1.3   1.3   1.3   0.0
8.|-- 46.33.95.178               0.0%     1    1.6   1.6   1.6   1.6   0.0
9.|-- 78.152.47.136              0.0%     1   26.1  26.1  26.1  26.1   0.0
10.|-- 78.152.57.10               0.0%     1   29.1  29.1  29.1  29.1   0.0
11.|-- 23.78.47.242               0.0%     1   25.9  25.9  25.9  25.9   0.0

Filter/highlight IXP hops

Add a flag to filter IXP hops from graphs to simplify them.
Also, highlighting IXP hops when they are not removed can be useful.

Needed: a list of IXP addresses. Rodecker might be able to provide this.

good tty logging

All shell commands, and as much other activity as possible, should be logged (over an encrypted connection) to one or two master servers and stored for future reference in case of abuse.

ring-trace needs better SSH output handling

ring-trace is quite ignorant of the result of the SSH command it runs. This can result in problems, e.g. when a new host key is offered and user input is needed, but also when the SSH connection fails.

Better handling of the actual SSH command should fix this.
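
A sketch of what stricter handling could look like, assuming ring-trace shells out to the OpenSSH client. `build_ssh_command` and `run_on_host` are hypothetical names. `BatchMode` and `ConnectTimeout` are standard OpenSSH options, and ssh exits with status 255 when the connection itself fails.

```python
import subprocess


def build_ssh_command(host: str, remote_command: str, timeout: int = 10) -> list:
    return [
        "ssh",
        # BatchMode disables interactive prompts (passwords, host key
        # confirmation), so ssh fails instead of waiting for user input.
        "-o", "BatchMode=yes",
        "-o", "ConnectTimeout=%d" % timeout,
        host,
        remote_command,
    ]


def run_on_host(host, remote_command, timeout=10, runner=subprocess.run):
    """Return (stdout, None) on success or (None, error_message) on failure."""
    proc = runner(build_ssh_command(host, remote_command, timeout),
                  capture_output=True, text=True)
    if proc.returncode == 255:
        # 255 is ssh's own error status (connect failure, host key problem).
        return None, proc.stderr.strip()
    if proc.returncode != 0:
        return None, "remote command exited with status %d" % proc.returncode
    return proc.stdout, None
```

Capturing stderr instead of letting it reach the terminal would also let the caller decide whether to show connection failures at all.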

Fix -i flag behaviour

It is not possible to trace only from a specific set of hosts. -n needs to be at least 1 at the moment:

$ ring-trace -i zylon01 -i interconnect01 -i solido01 -i prolocation01 -i totaalnet01 -i cyso01 -i msp01 128.0.0.1
ring-trace v0.9.9 - written by Teun Vink [email protected]
Including 7 hosts: zylon01 interconnect01 solido01 prolocation01 totaalnet01 cyso01 msp01
Performing ICMP traceroutes towards 128.0.0.1 from 49 ring hosts.

$ ring-trace -n 0 -i zylon01 -i interconnect01 -i solido01 -i prolocation01 -i totaalnet01 -i cyso01 -i msp01 128.0.0.1
ring-trace v0.9.9 - written by Teun Vink [email protected]
Including 7 hosts: zylon01 interconnect01 solido01 prolocation01 totaalnet01 cyso01 msp01
Performing ICMP traceroutes towards 128.0.0.1 from 49 ring hosts.

$ ring-trace -n 1 -i zylon01 -i interconnect01 -i solido01 -i prolocation01 -i totaalnet01 -i cyso01 -i msp01 128.0.0.1
ring-trace v0.9.9 - written by Teun Vink [email protected]
picked 1 host at random: portlane01
Including 7 hosts: zylon01 interconnect01 solido01 prolocation01 totaalnet01 cyso01 msp01
Performing ICMP traceroutes towards 128.0.0.1 from 8 ring hosts.
Created trace-128.0.0.1.jpg in 6.6 seconds.
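
The expected behaviour could be sketched as follows: `-n` picks random hosts from the remaining pool, `-i` hosts are always added, and `-n 0` yields only the included hosts. `select_hosts` is a hypothetical function, not the ring-trace implementation.

```python
import random


def select_hosts(all_hosts, count, include=(), rng=random):
    """Pick `count` random hosts that are not in `include`, then append `include`."""
    include = list(dict.fromkeys(include))  # drop duplicates, keep order
    candidates = [h for h in all_hosts if h not in include]
    picked = rng.sample(candidates, min(count, len(candidates))) if count > 0 else []
    return picked + include
```

With this logic, `-n 0 -i zylon01 -i cyso01` would trace from exactly two hosts rather than from all of them.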

broken hops are not graphed correctly

All broken hops (resulting in * * *) are graphed as one single node in the graph, though it might actually represent a large number of different hops.

Ignoring these hops results in improper graphs, so the best solution would be to create separate nodes (and edges) for these hops.
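
One way to get separate nodes is to give every broken hop an identifier derived from the source host and the hop position. This is a sketch under assumptions: `node_ids` is a hypothetical helper, and the markers for a broken hop ("", "???", "*") are guesses at how it shows up in the parsed trace.

```python
BROKEN_MARKERS = ("", "???", "*")


def node_ids(source: str, hops: list) -> list:
    """Map a list of hop addresses to graph node ids, one unique id per broken hop."""
    ids = []
    for position, ip in enumerate(hops, start=1):
        if ip in BROKEN_MARKERS:
            # Unique per source host and hop number, so broken hops from
            # different traces are never merged into one node.
            ids.append("unknown:%s:%d" % (source, position))
        else:
            ids.append(ip)
    return ids
```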

ring-trace - combine command status with mtr output

We've been tracing a reachability issue trying to detect which ASN is performing DPI on incoming DNS packets towards one of our root servers and blocking some of them.

Ultimately we used ring-trace to build a view of the topology facing that server, and then had to manually run tests on the chosen nodes to find out which ones had the query blocked, and from that determine which common ASN was responsible for the blocking.

It would be really useful if ring-trace could run a secondary command alongside the mtr tracer, and then indicate in the 0th hop output what the return status of that command was. This would allow us to identify at a single glance which group of topologically connected nodes is causing an issue.

ring-trace crashes from time to time

Here are two identical runs of ring-trace, where the first one crashes and the second one went OK:

tore@stat:~$ /stat/local/nlnog-ring/scripts/ring-trace -ab4X -n 20 dns.i.bitbit.net
ring-trace v1.6.1 - written by Teun Vink <[email protected]>
picked 20 hosts at random: speedpartner01 rrbone01 rootlu01 dcsone01 nexellent01 nxs01 strato01 enestdata01 hosteam01 melbourne01 softlayer06 nuqe01 bluezonejordan01 ebayclassifiedsgroup01 sixdegrees01 onet02 poznan01 dyn01 softlayer01 openminds01
Performing ICMP traceroutes towards dns.i.bitbit.net from 20 ring hosts, ssh-timeout is 10 seconds.
ssh: connect to host softlayer06.ring.nlnog.net port 22: Connection refused
Host softlayer06 returned no trace.
Traceback (most recent call last):
  File "/stat/local/nlnog-ring/scripts/ring-trace", line 1293, in <module>
    ns.show_country, ns.remove_broken, ns.highlight_ixp, ns.remove_duplicate)
  File "/stat/local/nlnog-ring/scripts/ring-trace", line 737, in graph
    a = [(tracedata[ip].get("asn", "unknown"),tracedata[ip].get("ix", False),tracedata[ip].get("desc", "unknown"), "") for ip in ips if not (tracedata[ip].get("ix", False) and no_ixp)]
KeyError: ''
tore@stat:~$ /stat/local/nlnog-ring/scripts/ring-trace -ab4X -n 20 dns.i.bitbit.net
ring-trace v1.6.1 - written by Teun Vink <[email protected]>
picked 20 hosts at random: rackfish01 is01 ripe01 hostway01 rbnetwork02 afilias01 claranet06 signet01 softlayer04 dyn01 westnet01 atrato03 tetaneutral01 direcpath01 hurricane01 ualbany01 isc01 cambrium01 siminn01 ispservices01
Performing ICMP traceroutes towards dns.i.bitbit.net from 20 ring hosts, ssh-timeout is 10 seconds.

Image uploaded to https://ring.nlnog.net/paste/p/1v4d2w4tvrm2a4c4
Done in 26.4 seconds.
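
The `KeyError: ''` suggests that an empty IP (probably from the host that returned no trace) ends up in `ips`. A defensive version of the list comprehension from the traceback might look like this. It is a sketch only: `build_as_path` is a hypothetical name, and skipping unknown IPs is one possible policy among several.

```python
def build_as_path(tracedata: dict, ips: list, no_ixp: bool = False) -> list:
    """Build (asn, ix, desc, "") tuples, skipping IPs that have no trace data."""
    path = []
    for ip in ips:
        info = tracedata.get(ip)
        if info is None:
            # e.g. ip == '' for a host that returned no trace
            continue
        if info.get("ix", False) and no_ixp:
            continue
        path.append((info.get("asn", "unknown"),
                     info.get("ix", False),
                     info.get("desc", "unknown"),
                     ""))
    return path
```

This would also explain why the crash is intermittent: it only triggers when the random selection includes a host that fails to connect.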

support .ring-tracerc for defaults

A .ring-tracerc file would be useful, so we can set default arguments. This might require a rewrite of commandline arguments so each option can be disabled as well as enabled.
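
A sketch of how this could work with argparse: the rc file content is parsed as extra arguments placed before the real command line, and boolean options get a `--no-` counterpart so a default can be switched off again. The long option names here (`--resolve`, `--nodes`) are hypothetical and not ring-trace's actual flags; `argparse.BooleanOptionalAction` needs Python 3.9 or newer.

```python
import argparse
import shlex


def build_parser() -> argparse.ArgumentParser:
    p = argparse.ArgumentParser(prog="ring-trace")
    # BooleanOptionalAction generates both --resolve and --no-resolve.
    p.add_argument("--resolve", action=argparse.BooleanOptionalAction, default=False)
    p.add_argument("-n", "--nodes", type=int, default=10)
    p.add_argument("target")
    return p


def parse_with_rc(argv, rc_text: str = ""):
    """Parse argv with defaults taken from the text of an rc file."""
    # comments=True lets the rc file contain '# ...' comments.
    rc_args = shlex.split(rc_text, comments=True)
    # rc arguments come first, so the command line overrides them.
    return build_parser().parse_args(rc_args + list(argv))
```

The caller would read `~/.ring-tracerc` if it exists and pass its content as `rc_text`, together with `sys.argv[1:]`.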

ring-change-sshkeys

Some kind of wrapper that can manage the sshkeys for users would be nice. I imagine the following:

The user types ring-change-sshkeys on a ring node, and it opens the preferred editor; it should feel like typing 'crontab -e'.

The first few lines of the authorized_keys file that is displayed should be some comments which contain warnings and what the file means.

The user can then edit the file to their liking, and when quitting the editor the resulting data should be validated:

  • empty file is error
  • invalid keys error
  • etc

If an error is found the user should be presented with a choice: discard all changes or open the editor again.

If the file is valid, it can be emailed to ring-admins@ in puppetized format and incorporated in the repos. In the future we can automate this if we have reason to trust the system.
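
The validation step could be sketched like this. It assumes each key line starts with the key type (no leading options field) and checks that the base64 blob decodes and carries the same key type in its first length-prefixed string, as the SSH public key format does. `validate_authorized_keys` is a hypothetical function name.

```python
import base64
import binascii


def validate_authorized_keys(text: str) -> list:
    """Return a list of error strings; an empty list means the file looks valid."""
    errors = []
    keys = 0
    for lineno, line in enumerate(text.splitlines(), start=1):
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # blank lines and comments are allowed
        fields = line.split()
        if len(fields) < 2:
            errors.append("line %d: expected '<type> <base64-key> [comment]'" % lineno)
            continue
        key_type, blob = fields[0], fields[1]
        try:
            raw = base64.b64decode(blob, validate=True)
        except (binascii.Error, ValueError):
            errors.append("line %d: key data is not valid base64" % lineno)
            continue
        # The blob starts with a 4-byte big-endian length and the key type.
        name_len = int.from_bytes(raw[:4], "big")
        if raw[4:4 + name_len] != key_type.encode():
            errors.append("line %d: key type does not match key data" % lineno)
            continue
        keys += 1
    if keys == 0 and not errors:
        errors.append("file contains no keys")
    return errors
```

If the returned list is non-empty, the wrapper would print the errors and offer the choice described above: discard all changes or reopen the editor.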

ring-trace breaks when TCP/53 is blocked

ring-trace needs a TCP-based DNS query to retrieve the list of nodes. If TCP/53 is blocked a better error message should be given, or a workaround should be provided.
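
For the better error message, a pre-flight check along these lines might help. It is a sketch: `check_tcp_dns` is a hypothetical helper, and which server to test depends on the resolver ring-trace actually uses for the node list.

```python
import socket


def check_tcp_dns(server: str, port: int = 53, timeout: float = 5.0,
                  connect=socket.create_connection):
    """Return None if a TCP connection succeeds, else a readable error message."""
    try:
        conn = connect((server, port), timeout)
    except OSError as exc:
        # socket.timeout and connection-refused errors are both OSError subclasses.
        return ("Cannot reach %s on TCP port %d (%s). ring-trace needs DNS over TCP "
                "to fetch the node list; check whether TCP/%d is filtered."
                % (server, port, exc, port))
    conn.close()
    return None
```

A workaround would still be needed for users who cannot get TCP/53 opened, for example fetching the node list another way.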

DNSSEC

  • sign ring.nlnog.net
  • work with the nlnog.net hostmaster to get our parent signed

make provisioning of machines less work

We need a DB backend for this; habbie has already written some Python to help with that.

Current steps are:

  • check machine cpu (64 bit?)
  • check ubuntu distribution
  • check v6 connectivity and configuration (should be static)
  • add CNAME in DNS on job's private server
  • add machine to /etc/puppet/files/etc/hosts so master can create smokeping configs and not error
  • add ring repo and apt-get update
  • puppetd --test
  • apt-get update
  • puppetd --test a few times
  • add machine to ring.nlnog.net txt record
  • add owner to admin group
  • add news item on ring.nlnog.net
  • notify participant they can subscribe to mailman
  • manually type email to participant to welcome them
  • manually type email to ring-users@ to announce the new node
  • notify teun a new node has arrived and he should add it to map

this should be easier...

new logo

We need a new logo. I will contact a graphics designer. :-)
