jwbensley / dfz_name_and_shame
DFZ Name and Shame - A Twitter bot which tweets stats about the BGP DFZ.
Move logging format to config
Add pid to output
Make output more parseable (e.g. pipe separator).
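A minimal sketch of what a pipe-separated log format with the process ID could look like, using only the standard logging module. The constant name LOG_FORMAT and the helper make_formatter are hypothetical and not taken from the DNAS code; the field order is an assumption.

```python
import logging

# Hypothetical format string: pipe-separated so each field can be split
# reliably, with %(process)d adding the PID to every line.
LOG_FORMAT = "%(asctime)s|%(levelname)s|%(process)d|%(funcName)s|%(message)s"


def make_formatter() -> logging.Formatter:
    """Build a formatter from the format string (which could live in config)."""
    return logging.Formatter(LOG_FORMAT)


if __name__ == "__main__":
    handler = logging.StreamHandler()
    handler.setFormatter(make_formatter())
    logging.basicConfig(level=logging.INFO, handlers=[handler])
    logging.info("Starting MRT downloader")
```

Keeping the format string in one constant means it can later be moved into the config module without touching the scripts.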
Update the test MRT function to find MRTs with no UPDATE in the BGP message.
Example from: /media/usb0/downloads/LINX/updates.20220208.1645.bz2
OrderedDict([('timestamp', {1644339034: '2022-02-08 17:50:34'}), ('type', {17: 'BGP4MP_ET'}), ('subtype', {4: 'BGP4MP_MESSAGE_AS4'}), ('length', 180)])
This BGP message has no UPDATE. It's just an empty message. Is it a problem with the MRT?
Only tweet Tweets from the tweet queue not already in the "tweeted" list
Ensure that all classes have unit tests
rrc19.ripe.net at Johannesburg, South Africa.
Check for bogons/martian/RFC reserved/unallocated IP space.
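One way such a check might look for IPv4, sketched with the standard ipaddress module. The list below covers well-known special-purpose IPv4 blocks but is not exhaustive (it ignores unallocated space and IPv6), and the names BOGONS_V4 and is_bogon_v4 are hypothetical.

```python
import ipaddress

# Well-known special-purpose IPv4 blocks; illustrative, not exhaustive.
BOGONS_V4 = [
    ipaddress.ip_network(p)
    for p in (
        "0.0.0.0/8", "10.0.0.0/8", "100.64.0.0/10", "127.0.0.0/8",
        "169.254.0.0/16", "172.16.0.0/12", "192.0.2.0/24",
        "192.168.0.0/16", "198.18.0.0/15", "198.51.100.0/24",
        "203.0.113.0/24", "224.0.0.0/4", "240.0.0.0/4",
    )
]


def is_bogon_v4(prefix: str) -> bool:
    """Return True if the IPv4 prefix sits inside one of the listed blocks."""
    net = ipaddress.ip_network(prefix, strict=False)
    if net.version != 4:
        raise ValueError("this sketch only handles IPv4 prefixes")
    return any(net.subnet_of(bogon) for bogon in BOGONS_V4)
```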
Need to implement global running stats which will be used for tweets
HTTP 404 for MRT getting raises a class error, so hard fail, need to ignore and carry on.
Example below, Route Views archive is missing MRT files - need to ignore and wait for next update file to become available, and these missing MRTs need to be backfilled later:
$ ./get_mrts.py --continuous --update
2022-02-17 09:18:44,869 INFO Starting MRT downloader with logging level 20
2022-02-17 09:18:44,887 INFO Downloading http://archive.routeviews.org/route-views.linx/bgpdata/2022.02/UPDATES//updates.20220217.0800.bz2 to /media/usb0/downloads/LINX/updates.20220217.0800.bz2
Traceback (most recent call last):
File "/opt/dnas/scripts/get_mrts.py", line 244, in <module>
main()
File "/opt/dnas/scripts/get_mrts.py", line 241, in main
continuous(args)
File "/opt/dnas/scripts/get_mrts.py", line 95, in continuous
replace=args["replace"],
File "/opt/dnas/scripts/../dnas/mrt_getter.py", line 313, in get_rv_latest_upd
mrt_getter.download_mrt(filename=filename, url=url)
File "/opt/dnas/scripts/../dnas/mrt_getter.py", line 455, in download_mrt
req.raise_for_status()
File "/opt/pypy3.8-v7.3.7-aarch64/lib/pypy3.8/site-packages/requests/models.py", line 960, in raise_for_status
raise HTTPError(http_error_msg, response=self)
requests.exceptions.HTTPError: 404 Client Error: Not Found for url: http://archive.routeviews.org/route-views.linx/bgpdata/2022.02/UPDATES//updates.20220217.0800.bz2
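A rough sketch of the "ignore and carry on" behaviour. The traceback shows the real code uses requests; this example swaps in the standard library's urllib so it stays self-contained, and the function name try_download and its opener parameter are hypothetical.

```python
import urllib.error
import urllib.request


def try_download(url, opener=urllib.request.urlopen):
    """Return the response body, or None when the archive answers HTTP 404.

    Any other HTTP error is re-raised so that real failures stay visible.
    """
    try:
        with opener(url) as resp:
            return resp.read()
    except urllib.error.HTTPError as err:
        if err.code == 404:
            # File is not on the archive (yet): skip it and let the caller
            # retry on the next pass or record it for a later backfill.
            return None
        raise
```

A continuous loop could treat a None result as "not available yet" and note the filename for backfilling instead of exiting.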
Change Tweets to be generated based on daily global stats, not daily diff stats.
Also - only Tweet the header, not the full body.
global_stats.py needs a --range option to generate stats for multiple days
Add total count of all advertisements, updates, and withdraws per MRT file / per day
Can't parse updates, and the timestamp is wrong (both for the same reason - path attribute is an ordered dict of dicts, not lists).
--continuous is downloading MRT files from all archives, even ones which aren't enabled.
2022-03-07 16:37:27,725 INFO parse_file Processing /media/usb0/downloads/LINX/updates.20220118.1500.bz2...
Traceback (most recent call last):
File "/opt/dnas/scripts/parse_mrts.py", line 434, in <module>
main()
File "/opt/dnas/scripts/parse_mrts.py", line 429, in main
process_range(args)
File "/opt/dnas/scripts/parse_mrts.py", line 388, in process_range
parse_files(filelist=filelist, args=args)
File "/opt/dnas/scripts/parse_mrts.py", line 188, in parse_files
mrt_s = parse_file(file)
File "/opt/dnas/scripts/parse_mrts.py", line 140, in parse_file
splitter = mrt_splitter(filename)
File "/opt/dnas/scripts/../dnas/mrt_splitter.py", line 47, in __init__
logging.debug("Assuming BZ2 file")
NameError: name 'logging' is not defined
After a long period of continuous MRT parsing (24 hours or more) the kernel OOM killer kicks in and kills the container. Why?
When running parse_mrts.py in a venv, or with PyPy, one must parse the MRT data as OrderedDict.
The following code from method parse_upd_dump() in class mrt_parser produces the following error output (this works in the non-venv scenario though):
class mrt_parser:
    ...
    @staticmethod
    def parse_upd_dump(filename: str = None) -> 'mrt_stats':
        ...
        for idx, mrt_e in enumerate(mrt_entries):
            ...
            print(mrt_e.data["timestamp"])
            ts = mrt_parser.posix_to_ts(
                mrt_e.data["timestamp"][0]
            )
Output:
{1641001800: '2022-01-01 02:50:00'}
multiprocessing.pool.RemoteTraceback:
"""
Traceback (most recent call last):
File "/usr/lib/python3.8/multiprocessing/pool.py", line 125, in worker
result = (True, func(*args, **kwds))
File "/usr/lib/python3.8/multiprocessing/pool.py", line 48, in mapstar
return list(map(*args))
File "/home/bensley/GitHub/dnas/scripts/../dnas/mrt_parser.py", line 281, in parse_upd_dump
mrt_e.data["timestamp"][0]
KeyError: 0
"""
When running with Python3 instead of PyPy, or with Python3 but not in a venv, the MRT data must be parsed as a list.
The following code produces the following output (this works in the venv scenario though):
class mrt_parser:
    ...
    @staticmethod
    def parse_upd_dump(filename: str = None) -> 'mrt_stats':
        ...
        for idx, mrt_e in enumerate(mrt_entries):
            ...
            print(mrt_e.data["timestamp"])
            ts = mrt_parser.posix_to_ts(
                next(iter(mrt_e.data["timestamp"].items()))[0]
            )
Output:
[1641001800, '2022-01-01 02:50:00']
multiprocessing.pool.RemoteTraceback:
"""
Traceback (most recent call last):
File "/usr/lib/python3.8/multiprocessing/pool.py", line 125, in worker
result = (True, func(*args, **kwds))
File "/usr/lib/python3.8/multiprocessing/pool.py", line 48, in mapstar
return list(map(*args))
File "/home/bensley/GitHub/dnas/scripts/../dnas/mrt_parser.py", line 280, in parse_upd_dump
next(iter(mrt_e.data["timestamp"].items()))[0]
AttributeError: 'list' object has no attribute 'items'
"""
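One possible way to cope with both shapes seen in the two outputs above is to check the type before extracting the value. This is only a sketch; the helper name first_value is hypothetical and not part of mrt_parser.

```python
from collections import OrderedDict


def first_value(field):
    """Return the leading numeric value of a decoded MRT field.

    Handles both shapes shown above:
      dict-like: {1641001800: '2022-01-01 02:50:00'}
      list-like: [1641001800, '2022-01-01 02:50:00']
    """
    if isinstance(field, dict):  # OrderedDict is a dict subclass
        return next(iter(field))
    return field[0]


if __name__ == "__main__":
    print(first_value(OrderedDict({1641001800: "2022-01-01 02:50:00"})))
    print(first_value([1641001800, "2022-01-01 02:50:00"]))
```

The result could then be passed to mrt_parser.posix_to_ts() regardless of which interpreter or environment produced the data.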
Check for really short and really long prefixes that are typically filtered.
Add continuous flags and sleep timers to scripts, to allow for continuous pipeline operation.
Enable MRT feed for AS57355 in the DNAS config and add a mini HTTP container to serve the MRTs.
Incorrect use of os.path.normpath on a URL, which collapses the "//" after the scheme:
>>> import os
>>> os.path.normpath("http://www.example.com")
'http:/www.example.com'
This should use urllib.parse instead:
>>> import urllib.parse
>>> urllib.parse.urljoin("http://www.example.com/", "/path/to/file")
'http://www.example.com/path/to/file'
2022-03-07 10:40:57,184 INFO download_mrt Downloading http:/archive.routeviews.org/route-views.linx/bgpdata/2022.03/UPDATES/updates.20220305.1300.bz2 to /media/usb0/downloads/LINX/updates.20220305.1300.bz2
Traceback (most recent call last):
File "/opt/dnas/scripts/get_mrts.py", line 460, in <module>
main()
File "/opt/dnas/scripts/get_mrts.py", line 455, in main
get_range(args)
File "/opt/dnas/scripts/get_mrts.py", line 141, in get_range
get_mrts(replace=args["replace"], url_list=url_list)
File "/opt/dnas/scripts/get_mrts.py", line 92, in get_mrts
url=url
File "/opt/dnas/scripts/../dnas/mrt_getter.py", line 168, in download_mrt
req = requests.get(url, stream=True)
File "/opt/pypy3.8-v7.3.7-aarch64/lib/pypy3.8/site-packages/requests/api.py", line 75, in get
return request('get', url, params=params, **kwargs)
File "/opt/pypy3.8-v7.3.7-aarch64/lib/pypy3.8/site-packages/requests/api.py", line 61, in request
return session.request(method=method, url=url, **kwargs)
File "/opt/pypy3.8-v7.3.7-aarch64/lib/pypy3.8/site-packages/requests/sessions.py", line 515, in request
prep = self.prepare_request(req)
File "/opt/pypy3.8-v7.3.7-aarch64/lib/pypy3.8/site-packages/requests/sessions.py", line 453, in prepare_request
hooks=merge_hooks(request.hooks, self.hooks),
File "/opt/pypy3.8-v7.3.7-aarch64/lib/pypy3.8/site-packages/requests/models.py", line 318, in prepare
self.prepare_url(url, params)
File "/opt/pypy3.8-v7.3.7-aarch64/lib/pypy3.8/site-packages/requests/models.py", line 395, in prepare_url
raise InvalidURL("Invalid URL %r: No host supplied" % url)
requests.exceptions.InvalidURL: Invalid URL 'http:/archive.routeviews.org/route-views.linx/bgpdata/2022.03/UPDATES/updates.20220305.1300.bz2': No host supplied
This is to allow testing of a single MRT file.
Only updates for 2022-01-13 are pulled; the rest, up to 2022-01-19 23:59, aren't pulled:
$ docker run -it --rm --volume /media/usb0/:/media/usb0/ dnas:2022-02-17--001 /opt/pypy3.8-v7.3.7-aarch64/bin/pypy3 /opt/dnas/scripts/get_mrts.py --range --start 20220113.0000 --end 20220119.2359 --update --debug
2022-02-19 15:03:29,648 INFO Starting MRT downloader with logging level 10
2022-02-19 15:03:29,649 DEBUG Archive RV_LINX is enabled
2022-02-19 15:03:29,665 INFO Not overwriting existing file /media/usb0/downloads/LINX/updates.20220113.0000.bz2
2022-02-19 15:03:29,667 INFO Not overwriting existing file /media/usb0/downloads/LINX/updates.20220113.0015.bz2
2022-02-19 15:03:29,668 INFO Not overwriting existing file /media/usb0/downloads/LINX/updates.20220113.0030.bz2
2022-02-19 15:03:29,669 INFO Not overwriting existing file /media/usb0/downloads/LINX/updates.20220113.0045.bz2
2022-02-19 15:03:29,671 INFO Not overwriting existing file /media/usb0/downloads/LINX/updates.20220113.0100.bz2
2022-02-19 15:03:29,673 INFO Not overwriting existing file /media/usb0/downloads/LINX/updates.20220113.0115.bz2
2022-02-19 15:03:29,675 INFO Not overwriting existing file /media/usb0/downloads/LINX/updates.20220113.0130.bz2
2022-02-19 15:03:29,676 INFO Not overwriting existing file /media/usb0/downloads/LINX/updates.20220113.0145.bz2
2022-02-19 15:03:29,677 INFO Not overwriting existing file /media/usb0/downloads/LINX/updates.20220113.0200.bz2
2022-02-19 15:03:29,679 INFO Not overwriting existing file /media/usb0/downloads/LINX/updates.20220113.0215.bz2
2022-02-19 15:03:29,680 INFO Not overwriting existing file /media/usb0/downloads/LINX/updates.20220113.0230.bz2
2022-02-19 15:03:29,681 INFO Not overwriting existing file /media/usb0/downloads/LINX/updates.20220113.0245.bz2
2022-02-19 15:03:29,683 INFO Not overwriting existing file /media/usb0/downloads/LINX/updates.20220113.0300.bz2
2022-02-19 15:03:29,684 INFO Not overwriting existing file /media/usb0/downloads/LINX/updates.20220113.0315.bz2
2022-02-19 15:03:29,685 INFO Not overwriting existing file /media/usb0/downloads/LINX/updates.20220113.0330.bz2
2022-02-19 15:03:29,687 INFO Not overwriting existing file /media/usb0/downloads/LINX/updates.20220113.0345.bz2
2022-02-19 15:03:29,688 INFO Not overwriting existing file /media/usb0/downloads/LINX/updates.20220113.0400.bz2
2022-02-19 15:03:29,689 INFO Not overwriting existing file /media/usb0/downloads/LINX/updates.20220113.0415.bz2
2022-02-19 15:03:29,692 INFO Not overwriting existing file /media/usb0/downloads/LINX/updates.20220113.0430.bz2
2022-02-19 15:03:29,693 INFO Not overwriting existing file /media/usb0/downloads/LINX/updates.20220113.0445.bz2
2022-02-19 15:03:29,695 INFO Not overwriting existing file /media/usb0/downloads/LINX/updates.20220113.0500.bz2
2022-02-19 15:03:29,696 INFO Not overwriting existing file /media/usb0/downloads/LINX/updates.20220113.0515.bz2
2022-02-19 15:03:29,697 INFO Not overwriting existing file /media/usb0/downloads/LINX/updates.20220113.0530.bz2
2022-02-19 15:03:29,699 INFO Not overwriting existing file /media/usb0/downloads/LINX/updates.20220113.0545.bz2
2022-02-19 15:03:29,700 INFO Not overwriting existing file /media/usb0/downloads/LINX/updates.20220113.0600.bz2
2022-02-19 15:03:29,701 INFO Not overwriting existing file /media/usb0/downloads/LINX/updates.20220113.0615.bz2
2022-02-19 15:03:29,703 INFO Not overwriting existing file /media/usb0/downloads/LINX/updates.20220113.0630.bz2
2022-02-19 15:03:29,704 INFO Not overwriting existing file /media/usb0/downloads/LINX/updates.20220113.0645.bz2
2022-02-19 15:03:29,705 INFO Not overwriting existing file /media/usb0/downloads/LINX/updates.20220113.0700.bz2
2022-02-19 15:03:29,707 INFO Not overwriting existing file /media/usb0/downloads/LINX/updates.20220113.0715.bz2
2022-02-19 15:03:29,708 INFO Not overwriting existing file /media/usb0/downloads/LINX/updates.20220113.0730.bz2
2022-02-19 15:03:29,711 INFO Not overwriting existing file /media/usb0/downloads/LINX/updates.20220113.0745.bz2
2022-02-19 15:03:29,712 INFO Not overwriting existing file /media/usb0/downloads/LINX/updates.20220113.0800.bz2
2022-02-19 15:03:29,713 INFO Not overwriting existing file /media/usb0/downloads/LINX/updates.20220113.0815.bz2
2022-02-19 15:03:29,715 INFO Not overwriting existing file /media/usb0/downloads/LINX/updates.20220113.0830.bz2
2022-02-19 15:03:29,716 INFO Not overwriting existing file /media/usb0/downloads/LINX/updates.20220113.0845.bz2
2022-02-19 15:03:29,717 INFO Not overwriting existing file /media/usb0/downloads/LINX/updates.20220113.0900.bz2
2022-02-19 15:03:29,718 INFO Not overwriting existing file /media/usb0/downloads/LINX/updates.20220113.0915.bz2
2022-02-19 15:03:29,720 INFO Not overwriting existing file /media/usb0/downloads/LINX/updates.20220113.0930.bz2
2022-02-19 15:03:29,721 INFO Not overwriting existing file /media/usb0/downloads/LINX/updates.20220113.0945.bz2
2022-02-19 15:03:29,722 INFO Not overwriting existing file /media/usb0/downloads/LINX/updates.20220113.1000.bz2
2022-02-19 15:03:29,724 INFO Not overwriting existing file /media/usb0/downloads/LINX/updates.20220113.1015.bz2
2022-02-19 15:03:29,725 INFO Not overwriting existing file /media/usb0/downloads/LINX/updates.20220113.1030.bz2
2022-02-19 15:03:29,728 INFO Not overwriting existing file /media/usb0/downloads/LINX/updates.20220113.1045.bz2
2022-02-19 15:03:29,729 INFO Not overwriting existing file /media/usb0/downloads/LINX/updates.20220113.1100.bz2
2022-02-19 15:03:29,731 INFO Not overwriting existing file /media/usb0/downloads/LINX/updates.20220113.1115.bz2
2022-02-19 15:03:29,732 INFO Not overwriting existing file /media/usb0/downloads/LINX/updates.20220113.1130.bz2
2022-02-19 15:03:29,733 INFO Not overwriting existing file /media/usb0/downloads/LINX/updates.20220113.1145.bz2
2022-02-19 15:03:29,735 INFO Not overwriting existing file /media/usb0/downloads/LINX/updates.20220113.1200.bz2
2022-02-19 15:03:29,736 INFO Not overwriting existing file /media/usb0/downloads/LINX/updates.20220113.1215.bz2
2022-02-19 15:03:29,737 INFO Not overwriting existing file /media/usb0/downloads/LINX/updates.20220113.1230.bz2
2022-02-19 15:03:29,738 INFO Not overwriting existing file /media/usb0/downloads/LINX/updates.20220113.1245.bz2
2022-02-19 15:03:29,740 INFO Not overwriting existing file /media/usb0/downloads/LINX/updates.20220113.1300.bz2
2022-02-19 15:03:29,741 INFO Not overwriting existing file /media/usb0/downloads/LINX/updates.20220113.1315.bz2
2022-02-19 15:03:29,742 INFO Not overwriting existing file /media/usb0/downloads/LINX/updates.20220113.1330.bz2
2022-02-19 15:03:29,744 INFO Not overwriting existing file /media/usb0/downloads/LINX/updates.20220113.1345.bz2
2022-02-19 15:03:29,746 INFO Not overwriting existing file /media/usb0/downloads/LINX/updates.20220113.1400.bz2
2022-02-19 15:03:29,748 INFO Not overwriting existing file /media/usb0/downloads/LINX/updates.20220113.1415.bz2
2022-02-19 15:03:29,749 INFO Not overwriting existing file /media/usb0/downloads/LINX/updates.20220113.1430.bz2
2022-02-19 15:03:29,750 INFO Not overwriting existing file /media/usb0/downloads/LINX/updates.20220113.1445.bz2
2022-02-19 15:03:29,752 INFO Not overwriting existing file /media/usb0/downloads/LINX/updates.20220113.1500.bz2
2022-02-19 15:03:29,753 INFO Not overwriting existing file /media/usb0/downloads/LINX/updates.20220113.1515.bz2
2022-02-19 15:03:29,754 INFO Not overwriting existing file /media/usb0/downloads/LINX/updates.20220113.1530.bz2
2022-02-19 15:03:29,756 INFO Not overwriting existing file /media/usb0/downloads/LINX/updates.20220113.1545.bz2
2022-02-19 15:03:29,757 INFO Not overwriting existing file /media/usb0/downloads/LINX/updates.20220113.1600.bz2
2022-02-19 15:03:29,758 INFO Not overwriting existing file /media/usb0/downloads/LINX/updates.20220113.1615.bz2
2022-02-19 15:03:29,760 INFO Not overwriting existing file /media/usb0/downloads/LINX/updates.20220113.1630.bz2
2022-02-19 15:03:29,761 INFO Not overwriting existing file /media/usb0/downloads/LINX/updates.20220113.1645.bz2
2022-02-19 15:03:29,762 INFO Not overwriting existing file /media/usb0/downloads/LINX/updates.20220113.1700.bz2
2022-02-19 15:03:29,764 INFO Not overwriting existing file /media/usb0/downloads/LINX/updates.20220113.1715.bz2
2022-02-19 15:03:29,765 INFO Not overwriting existing file /media/usb0/downloads/LINX/updates.20220113.1730.bz2
2022-02-19 15:03:29,767 INFO Not overwriting existing file /media/usb0/downloads/LINX/updates.20220113.1745.bz2
2022-02-19 15:03:29,768 INFO Not overwriting existing file /media/usb0/downloads/LINX/updates.20220113.1800.bz2
2022-02-19 15:03:29,769 INFO Not overwriting existing file /media/usb0/downloads/LINX/updates.20220113.1815.bz2
2022-02-19 15:03:29,771 INFO Not overwriting existing file /media/usb0/downloads/LINX/updates.20220113.1830.bz2
2022-02-19 15:03:29,772 INFO Not overwriting existing file /media/usb0/downloads/LINX/updates.20220113.1845.bz2
2022-02-19 15:03:29,774 INFO Not overwriting existing file /media/usb0/downloads/LINX/updates.20220113.1900.bz2
2022-02-19 15:03:29,775 INFO Not overwriting existing file /media/usb0/downloads/LINX/updates.20220113.1915.bz2
2022-02-19 15:03:29,776 INFO Not overwriting existing file /media/usb0/downloads/LINX/updates.20220113.1930.bz2
2022-02-19 15:03:29,778 INFO Not overwriting existing file /media/usb0/downloads/LINX/updates.20220113.1945.bz2
2022-02-19 15:03:29,779 INFO Not overwriting existing file /media/usb0/downloads/LINX/updates.20220113.2000.bz2
2022-02-19 15:03:29,780 INFO Not overwriting existing file /media/usb0/downloads/LINX/updates.20220113.2015.bz2
2022-02-19 15:03:29,782 INFO Not overwriting existing file /media/usb0/downloads/LINX/updates.20220113.2030.bz2
2022-02-19 15:03:29,783 INFO Not overwriting existing file /media/usb0/downloads/LINX/updates.20220113.2045.bz2
2022-02-19 15:03:29,785 INFO Not overwriting existing file /media/usb0/downloads/LINX/updates.20220113.2100.bz2
2022-02-19 15:03:29,786 INFO Not overwriting existing file /media/usb0/downloads/LINX/updates.20220113.2115.bz2
2022-02-19 15:03:29,787 INFO Not overwriting existing file /media/usb0/downloads/LINX/updates.20220113.2130.bz2
2022-02-19 15:03:29,789 INFO Not overwriting existing file /media/usb0/downloads/LINX/updates.20220113.2145.bz2
2022-02-19 15:03:29,790 INFO Not overwriting existing file /media/usb0/downloads/LINX/updates.20220113.2200.bz2
2022-02-19 15:03:29,791 INFO Not overwriting existing file /media/usb0/downloads/LINX/updates.20220113.2215.bz2
2022-02-19 15:03:29,793 INFO Not overwriting existing file /media/usb0/downloads/LINX/updates.20220113.2230.bz2
2022-02-19 15:03:29,794 INFO Not overwriting existing file /media/usb0/downloads/LINX/updates.20220113.2245.bz2
2022-02-19 15:03:29,795 INFO Not overwriting existing file /media/usb0/downloads/LINX/updates.20220113.2300.bz2
2022-02-19 15:03:29,796 INFO Not overwriting existing file /media/usb0/downloads/LINX/updates.20220113.2315.bz2
2022-02-19 15:03:29,798 INFO Not overwriting existing file /media/usb0/downloads/LINX/updates.20220113.2330.bz2
2022-02-19 15:03:29,799 INFO Not overwriting existing file /media/usb0/downloads/LINX/updates.20220113.2345.bz2
$
orig_filename, which is the full path to the file, is correctly appended to the file list for RIB dumps, but UPDATE dumps are incorrectly appending only the filename without the path:
dfz_name_and_shame/dnas/mrt_parser.py
Line 75 in 6ede7b6
vs.
dfz_name_and_shame/dnas/mrt_parser.py
Line 239 in 6ede7b6
Possible testing to automate:
pycodestyle --show-source --show-pep8
https://www.pydocstyle.org/en/stable/usage.html
Maybe use black?
Start all containers with docker-compose
Add a "wait for" Redis
Can dnas use smaller images like python:slim or python:alpine?
Install packages from apt, run the command (e.g. unzip or pip), then apt uninstall again, all in a single docker RUN command to reduce layers?
Add env variables for Redis database server name/IP, port, and author details?
For some reason this file always needs to be parsed, it isn't being recognised as already in the DB:
$ docker run -it --rm --volume /media/usb0/:/media/usb0/ dnas:2022-02-17--001 /opt/pypy3.8-v7.3.7-aarch64/bin/pypy3 /opt/dnas/scripts/parse_mrts.py --single /media/usb0/downloads/LINX/updates.20220118.1500.bz2 --debug
2022-03-07 08:37:00,437 INFO Starting MRT parser with logging level 10
2022-03-07 08:37:00,438 DEBUG Assuming file is from RV_LINX archive
Done 0/1
2022-03-07 08:37:00,440 INFO Checking file /media/usb0/downloads/LINX/updates.20220118.1500.bz2
2022-03-07 08:37:00,441 DEBUG Assuming file is from RV_LINX archive
2022-03-07 08:37:00,629 INFO Processing /media/usb0/downloads/LINX/updates.20220118.1500.bz2...
2022-03-07 08:37:58,783 INFO Added /media/usb0/downloads/LINX/updates.20220118.1500.bz2 to RV_LINX_UPD:20220118 file list
Done 1/1
^ File should be parsed and now in the DB...
$ docker run -it --rm --volume /media/usb0/:/media/usb0/ dnas:2022-02-17--001 /opt/pypy3.8-v7.3.7-aarch64/bin/pypy3 /opt/dnas/scripts/parse_mrts.py --single /media/usb0/downloads/LINX/updates.20220118.1500.bz2 --debug
2022-03-07 08:38:08,484 INFO Starting MRT parser with logging level 10
2022-03-07 08:38:08,485 DEBUG Assuming file is from RV_LINX archive
Done 0/1
2022-03-07 08:38:08,487 INFO Checking file /media/usb0/downloads/LINX/updates.20220118.1500.bz2
2022-03-07 08:38:08,488 DEBUG Assuming file is from RV_LINX archive
2022-03-07 08:38:08,674 INFO Processing /media/usb0/downloads/LINX/updates.20220118.1500.bz2...
2022-03-07 08:39:12,371 INFO Added /media/usb0/downloads/LINX/updates.20220118.1500.bz2 to RV_LINX_UPD:20220118 file list
Done 1/1
^ File shouldn't be parsed again.
Add global totals for updates/withdraws/advertisements - not specific per-prefix or per-asn stats.
Add type hinting to all functions except for the UPDATE and RIB MRT file parsing functions.
Reading an MRT file and splitting it into chunks in the same target directory is I/O bound by crappy media. Allow for reading the input MRT file from one location and writing the chunks to a separate location (i.e. separate media) to remove the I/O block.
gen_tweets.py needs a --range option to generate tweets for multiple days
Example from the wild - will DNAS pick this up from RIPE or RV? At least one of these two MRT archives (I forgot which) only supports AS paths with a max length of 255.
Mar 10 01:09:58:E:BGP: From Peer 45.11.248.255 received Long AS_PATH= AS_SEQ(2) 34549 6830 174 327708 37616 37616 37616 37616 37616 37616 37616 37616 37616 37616 37616 37616 37616 37616 37616 37616 37616 37616 37616 37616 37616 37616 37616 37616 37616 37616 ... attribute length (428) More than configured MAXAS-LIMIT 300
Mar 10 01:07:52:E:BGP: From Peer 5.226.148.84 received Long AS_PATH= AS_SEQ(2) 6830 174 327708 37616 37616 37616 37616 37616 37616 37616 37616 37616 37616 37616 37616 37616 37616 37616 37616 37616 37616 37616 37616 37616 37616 37616 37616 37616 37616 37616 ... attribute length (373) More than configured MAXAS-LIMIT 300
Mar 10 01:05:38:E:BGP: From Peer 5.226.144.250 received Long AS_PATH= AS_SEQ(2) 174 327708 37616 37616 37616 37616 37616 37616 37616 37616 37616 37616 37616 37616 37616 37616 37616 37616 37616 37616 37616 37616 37616 37616 37616 37616 37616 37616 37616 37616 ... attribute length (326) More than configured MAXAS-LIMIT 300
Add support to push stat reports to GitHub
Check for ASNs reserved by RFCs/unallocated/transition ASN 23456 etc. that shouldn't be in the DFZ.
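A small sketch of how the RFC-reserved part of this check could be expressed. The ranges below are the commonly cited special-purpose ASNs and are not exhaustive (unallocated ASNs would need registry data); the names RESERVED_ASN_RANGES and is_reserved_asn are hypothetical.

```python
# (first, last) inclusive ranges of special-purpose ASNs; not exhaustive.
RESERVED_ASN_RANGES = (
    (0, 0),                    # RFC 7607: AS 0
    (23456, 23456),            # RFC 6793: AS_TRANS
    (64496, 64511),            # RFC 5398: documentation
    (64512, 65534),            # RFC 6996: private use (16-bit)
    (65535, 65535),            # RFC 7300: last 16-bit ASN
    (65536, 65551),            # RFC 5398: documentation
    (4200000000, 4294967294),  # RFC 6996: private use (32-bit)
    (4294967295, 4294967295),  # RFC 7300: last 32-bit ASN
)


def is_reserved_asn(asn: int) -> bool:
    """Return True if the ASN falls inside one of the listed ranges."""
    return any(first <= asn <= last for first, last in RESERVED_ASN_RANGES)
```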
Add test coverage with https://about.codecov.io/
Also, what about GitHub actions?
Error is in gen_upd_rv_url, y = ym[0:5] should be y = ym[0:4].
2022-02-17 12:12:14,323 INFO HTTP error: 404
Traceback (most recent call last):
File "/opt/dnas/scripts/get_mrts.py", line 260, in <module>
main()
File "/opt/dnas/scripts/get_mrts.py", line 251, in main
backfill(args)
File "/opt/dnas/scripts/get_mrts.py", line 71, in backfill
url=arch.gen_upd_url(filename=filename),
File "/opt/dnas/scripts/../dnas/mrt_getter.py", line 435, in download_mrt
req.raise_for_status()
File "/opt/pypy3.8-v7.3.7-aarch64/lib/pypy3.8/site-packages/requests/models.py", line 960, in raise_for_status
raise HTTPError(http_error_msg, response=self)
requests.exceptions.HTTPError: 404 Client Error: Not Found for url: http://archive.routeviews.org/route-views.linx/bgpdata/20220.01/UPDATES/updates.20220104.0000.bz2
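To illustrate the slicing fix, here is a sketch that rebuilds the URL from the filename in the log above. The function name rv_upd_url is hypothetical and the base URL is simply copied from the failing request; the real gen_upd_rv_url may be structured differently.

```python
def rv_upd_url(filename: str) -> str:
    """Build a Route Views LINX UPDATE URL from e.g. 'updates.20220104.0000.bz2'."""
    ym = filename.split(".")[1][0:6]  # "202201"
    y = ym[0:4]                       # "2022" (ym[0:5] would give "20220")
    m = ym[4:6]                       # "01"
    return (
        "http://archive.routeviews.org/route-views.linx/bgpdata/"
        f"{y}.{m}/UPDATES/{filename}"
    )


if __name__ == "__main__":
    print(rv_upd_url("updates.20220104.0000.bz2"))
```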
Initial Tweet generation puts 7 tweets in the queue:
$ ./scripts/gen_tweets.py --debug --generate --ymd 20220102
2022-03-22 08:40:07,395 INFO main Starting Tweet generation and posting with logging level DEBUG
2022-03-22 08:40:19,253 INFO gen_tweets Storing 7 tweets under TWEET_Q:20220102
Running the above again adds the same 7 tweets to the queue leading to duplicate tweets.
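A minimal sketch of de-duplicating before queueing, using plain Python lists in place of the real datastore. It assumes tweets are stored as strings; the function name enqueue_new_tweets and its parameters are hypothetical.

```python
def enqueue_new_tweets(queue, tweeted, candidates):
    """Append candidates that are neither queued nor already tweeted.

    Returns the number of tweets actually added to the queue.
    """
    seen = set(queue) | set(tweeted)
    added = 0
    for tweet in candidates:
        if tweet not in seen:
            queue.append(tweet)
            seen.add(tweet)
            added += 1
    return added
```

With a check like this, running the generation step twice for the same day would add nothing the second time.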
Downloads aren't going to USB mass storage, and get_latest_upd()|get_latest_rib() aren't updated in get_mrts.py to use new arguments.
Parsing this RIPE MRT throws an error - should this stop everything or log and carry on?
2022-03-07 08:36:05,803 INFO Processing /media/usb0/downloads/RCC23/updates.20220118.2020.gz...
multiprocessing.pool.RemoteTraceback:
"""
Traceback (most recent call last):
File "/opt/pypy3.8-v7.3.7-aarch64/lib/pypy3.8/multiprocessing/pool.py", line 125, in worker
result = (True, func(*args, **kwds))
File "/opt/pypy3.8-v7.3.7-aarch64/lib/pypy3.8/multiprocessing/pool.py", line 48, in mapstar
return list(map(*args))
File "/opt/dnas/scripts/../dnas/mrt_parser.py", line 254, in parse_upd_dump
if len(mrt_e.data["bgp_message"]["withdrawn_routes"]) > 0:
KeyError: 'bgp_message'
"""
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/opt/dnas/scripts/parse_mrts.py", line 285, in <module>
main()
File "/opt/dnas/scripts/parse_mrts.py", line 282, in main
process_mrt_files(args)
File "/opt/dnas/scripts/parse_mrts.py", line 177, in process_mrt_files
process_files(filelist=filelist, remove=args["remove"])
File "/opt/dnas/scripts/parse_mrts.py", line 212, in process_files
mrt_s = process_file(file)
File "/opt/dnas/scripts/parse_mrts.py", line 249, in process_file
mrt_chunks = Pool.map(mrt_parser.parse_upd_dump, file_chunks)
File "/opt/pypy3.8-v7.3.7-aarch64/lib/pypy3.8/multiprocessing/pool.py", line 364, in map
return self._map_async(func, iterable, mapstar, chunksize).get()
File "/opt/pypy3.8-v7.3.7-aarch64/lib/pypy3.8/multiprocessing/pool.py", line 771, in get
raise self._value
KeyError: 'bgp_message'
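If the decision is to carry on, one option is to treat an entry without a BGP message as contributing nothing rather than indexing into it directly. A sketch under that assumption; the helper name withdrawn_count is hypothetical and works on a plain dict shaped like mrt_e.data.

```python
def withdrawn_count(mrt_data: dict) -> int:
    """Count withdrawn routes, returning 0 when no BGP message is present."""
    bgp_message = mrt_data.get("bgp_message")
    if not bgp_message:
        # Entry carries no decoded BGP message: nothing to count.
        return 0
    return len(bgp_message.get("withdrawn_routes", []))
```

The caller could also log the filename and entry index at this point so that the odd entries can be inspected later.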
dfz_name_and_shame/scripts/parse_mrts.py
Line 201 in 191c719
This else should be an else if: if the file name is already in the list and overwrite is true, we don't need to re-add the file to the file list.
mrt_archives.py arch_from_url() needs to catch ValueError when enumerating each mrt_archive.
It's a trial and error discovery process for the correct arch type and url, so ValueError is thrown which needs to be caught.
DFZ Account needs to be added and used instead of my personal account.
Add an option to parse_mrts.py to parse between a specific --start and --end. In doing so, move the URL generation functions from the mrt_getter class to the mrt_archive class.
These all say 1 peer but list multiple peer ASNs; it looks like they relate to one peer but reference the origin ASN by mistake:
'{"hdr": "New most BGP withdraws per peer ASN: on the day 2022/01/17 1 peer '
'ASN(s) sent 73820 withdraws", "hdr_id": 0, "body": "Peer ASN(s): AS7 (DSTL) '
'AS5 (SYMBOLICS) AS7 (DSTL) AS5 (SYMBOLICS)", "body_ids": [], "hidden": '
'true}',
'{"hdr": "New most BGP updates per peer ASN: on the day 2022/01/17 1 peer '
'ASN(s) sent 2607299 updates", "hdr_id": 0, "body": "Peer ASN(s): AS1 '
'(LVLT-1) AS3 (MIT-GATEWAYS) AS0 AS3 (MIT-GATEWAYS) AS0", "body_ids": [], '
'"hidden": true}',
'{"hdr": "New most BGP advertisements per peer ASN: on the day 2022/01/17 1 '
'peer ASN(s) sent 2549485 advertisements", "hdr_id": 0, "body": "Peer '
'ASN(s): AS1 (LVLT-1) AS3 (MIT-GATEWAYS) AS0 AS3 (MIT-GATEWAYS) AS0", '
'"body_ids": [], "hidden": true}',
The following URL is being generated by mrt_archive.py gen_upd_url_ripe()
https://data.ris.ripe.net/rrc24/202203/updates.20220307.1140.gz
Missing a dot in the year-month folder name; should be 2022.03:
https://data.ris.ripe.net/rrc24/2022.03/updates.20220307.1140.gz
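As a sketch of the corrected layout, the year-month folder can be produced with strftime so the dot is always present. The function name ripe_upd_url and its parameters are hypothetical; the URL pattern is taken from the corrected example above.

```python
from datetime import datetime


def ripe_upd_url(rrc: str, when: datetime) -> str:
    """Build a RIPE RIS UPDATE URL with a 'YYYY.MM' folder name."""
    month_dir = when.strftime("%Y.%m")                  # e.g. "2022.03"
    filename = when.strftime("updates.%Y%m%d.%H%M.gz")  # e.g. "updates.20220307.1140.gz"
    return f"https://data.ris.ripe.net/{rrc}/{month_dir}/{filename}"


if __name__ == "__main__":
    print(ripe_upd_url("rrc24", datetime(2022, 3, 7, 11, 40)))
```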
Use mypy and check for type errors.
Use black to check for format errors (Does this include PEP8?).
pylint for formatting errors? https://pylint.pycqa.org/en/latest/
Any benefit to checking for warnings and errors in the testing pipeline? https://docs.python.org/3/library/devmode.html
Have the sleep interval be based on the most frequently updating MRT archive interval.
This requires a re-arch of config and other imports to not flood the global name space, so that importlib.reload() can be used.
Fix this so that it is pulling up-to-date MRTs.
parse_mrts needs a --continuous mode