intentionet / netconan
netconan - a Network Configuration Anonymizer
License: Apache License 2.0
FortiOS VPN interface pksecret not being anonymized
config system interface
edit "VPN_1"
set interface "wan1"
set ike-version 2
set peertype any
set net-device disable
set proposal aes256gcm-prfsha512
set dhrp 21
set nattraversal disable
set remote-gw 1.1.1.1
set pksecret ENC dfdgdgfhfhhfffgfgfgs
end
I work with a lot of different customers and so there is a ton of variability in the domain names of the devices. It would be good to be able to anonymize the domain name by default instead of having to find all variations and list them as sensitive words.
For example:
ip domain-name a.b.acme.net
Match ip domain-name \S+$
and replace.
Also, maybe a flag for hostname as well?
Maybe even make it more generic to pick up domain names in any config object definition to further anonymize.
At the moment, I do want to submit parse issues automatically to the developers of Batfish, but I am not comfortable doing so as there is still identifiable information in the configurations after a netconan run.
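A minimal sketch of the "match and replace" idea above, assuming only the Cisco-style `ip domain-name` line from the example. The pattern name, the function name, and the `example.invalid` replacement are illustrative assumptions, not part of netconan.

```python
import re

# Hypothetical pattern: matches "ip domain-name <name>" lines only.
DOMAIN_NAME_RE = re.compile(r"^(\s*ip domain-name )\S+$")


def anonymize_domain_name(line, replacement="example.invalid"):
    """Replace the domain on an 'ip domain-name' line; leave other lines as-is."""
    # A callable replacement avoids backslash-escape surprises in the new text.
    return DOMAIN_NAME_RE.sub(lambda m: m.group(1) + replacement, line)


print(anonymize_domain_name("ip domain-name a.b.acme.net"))
```

A hostname flag could follow the same shape with a second pattern; picking up domain names inside arbitrary config objects would need a broader (and more error-prone) match.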
A given sensitive item (from a config line known to contain sensitive info) is anonymized based on the number of sensitive items encountered before it. This means inserting a new sensitive line and re-anonymizing a file may result in different anonymized values for other sensitive items in that file.
Could apply a similar idea to what is now used for IP address anonymization (hash of original value + salt, so new anon values would depend solely on the salt and unanonymized sensitive item).
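A rough sketch of that idea, assuming SHA-256 and a 12-character truncation (both are choices made here for illustration, not what netconan does). The replacement depends only on the salt and the original item, so adding or removing other sensitive lines does not change it.

```python
import hashlib


def anonymize_item(item, salt, length=12):
    """Derive a replacement from the salt and the item, not from encounter order."""
    digest = hashlib.sha256((salt + item).encode("utf-8")).hexdigest()
    # The "netconanRemoved" prefix mirrors the existing output style.
    return "netconanRemoved" + digest[:length]


# The same item and salt always give the same value, regardless of position.
print(anonymize_item("s3cret", "some-salt") == anonymize_item("s3cret", "some-salt"))
```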
Running netconan -p -i input_file -o output_file
On this input_file
:
set license keys key "something something"
produces a partially anonymized output_file
:
set license keys key "netconanRemoved0 something"
when it would ideally produce a fully anonymized file like this:
set license keys key "netconanRemoved0"
Password/snmp community regexes for Juniper configs in sensitive_item_removal.py
may need tweaking to catch all allowed syntax/options and some do not have any tests.
The JUNOS regexes for md5
, hello-authentication-key
, and ssh
do not have any tests.
The remaining JUNOS regexes:
test_sensitive_item_removal.py
covering a variety of syntaxes/options allowed by (and verified on) JUNOS routers
password blah
versus password 0 blah
or password encrypt sha512 blah
)
snmp community
JUNOS regex
It took me a few minutes before realizing -c
was for the netconan config and not for the input config files to be sanitized.
Confirm the snmp community regexes in sensitive_item_removal.py
catch communities for all possible config lines (covering the variety of syntaxes/options allowed by routers).
In addition to anonymizing IPs and scrambling passwords, users may have sensitive words they wish to hide from the final output. E.g.,
! Use ACME's standard logging servers
Proposal:
--sensitive-words=ACME,Wile,Coyote
which will ensure that for each word, grep -ric <word> <file>
returns 0
results.
I think we need to keep the mapping bijective, so that multiple sensitive words aren't anonymized the same. E.g., if I have route maps secret-Wile
and secret-ACME
, they should be distinct after sensitive word processing.
One proposal would be:
anon_word = hex_encode(hash(salt || word))[:N]
In other words, keep N
characters of a hex-encoded, salted hash.
And replace each instance of word
(case insensitive) with anon_word.
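The proposal could be sketched roughly as follows. SHA-256, the default `n=8`, lowercasing before hashing, and both function names are assumptions made for this example.

```python
import hashlib
import re


def anon_word(word, salt, n=8):
    """First n hex characters of SHA-256(salt || lowercased word)."""
    data = (salt + word.lower()).encode("utf-8")
    return hashlib.sha256(data).hexdigest()[:n]


def replace_sensitive_words(text, words, salt, n=8):
    """Replace every case-insensitive occurrence of each sensitive word."""
    if not words:
        return text
    # Longest words first, so a word that contains another word is matched whole.
    ordered = sorted(words, key=len, reverse=True)
    pattern = re.compile("|".join(re.escape(w) for w in ordered), re.IGNORECASE)
    return pattern.sub(lambda m: anon_word(m.group(0), salt, n), text)


print(replace_sensitive_words("! Use ACME's standard logging servers",
                              ["ACME", "Wile", "Coyote"], "some-salt"))
```

Distinct words map to distinct replacements only with high probability; a truncated hash cannot guarantee a bijection, so a real implementation would probably want to detect collisions.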
Possible issues:
N = len(word)
, since that could lead to collisions [for short words] and may compromise anonymity. (Though I think it shouldn't, if the salt is well-chosen.)
In sensitive_item_removal.py
, add in ability to reverse encrypted passwords (where applicable) to make sure anonymized passwords match when the original encrypted passwords matched.
Specifically, this applies to Cisco type 7 and Juniper type 9 passwords.
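For the Cisco type 7 case, a minimal decoding sketch might look like the following. The key string is the widely published type 7 constant and does not come from this issue; the function name is hypothetical. Juniper type 9 uses a different, more involved scheme and is not covered here.

```python
# Widely published key used by Cisco type 7 obfuscation (not from this issue).
TYPE7_KEY = "dsfd;kfoA,.iyewrkldJKDHSUBsgvca69834ncxv9873254k;fg87"


def decode_type7(encoded):
    """Recover the plaintext from a Cisco type 7 string."""
    offset = int(encoded[:2])  # first two decimal digits: starting key index
    hex_part = encoded[2:]
    chars = []
    for i in range(0, len(hex_part), 2):
        byte = int(hex_part[i:i + 2], 16)
        key_char = TYPE7_KEY[(offset + i // 2) % len(TYPE7_KEY)]
        chars.append(chr(byte ^ ord(key_char)))
    return "".join(chars)


# Two different encodings of the same password decode to the same plaintext.
print(decode_type7("0822455D0A16") == decode_type7("00071A150754"))
```

Anonymizing the decoded plaintext, rather than the encoded string, is what would make two differently encoded copies of one password end up with the same anonymized value.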
(.env3) ➜ configs netconan -h
usage: netconan [-h] [-i INPUTDIRECTORY] [-o OUTPUTDIRECTORY] [-p] [-a]
[-s SALT] [-d DUMPIPADDRMAP] [-u]
[--sensitivewords SENSITIVEWORDS]
[-l {DEBUG,INFO,WARNING,ERROR,CRITICAL}]
implies they are optional.
Currently cannot undo IP anonymization.
After IP anonymization, ordering of IP addresses may change
There's an issue in the ipaddress library that has to do with strings vs bytes. We avoid this issue in our unit tests, but it's triggered when run from the CLI. We need a test here that actually runs using an input file.
netconan -i logs -o anonymized --anonymize-ips
WARNING No salt was provided; using randomly generated "XZF0CbM10AKtFLYX"
INFO Anonymizing parser_warnings.txt
Traceback (most recent call last):
File "/usr/local/bin/netconan", line 11, in <module>
sys.exit(main())
File "/usr/local/lib/python2.7/dist-packages/netconan/netconan.py", line 110, in main
args.dump_ip_map, sensitive_words, args.undo, as_numbers)
File "/usr/local/lib/python2.7/dist-packages/netconan/anonymize_files.py", line 70, in anonymize_files_in_dir
anonymizer6=anonymizer6)
File "/usr/local/lib/python2.7/dist-packages/netconan/anonymize_files.py", line 99, in anonymize_file
output_line = anonymize_ip_addr(anonymizer4, output_line, undo_ip_anon)
File "/usr/local/lib/python2.7/dist-packages/netconan/ip_anonymization.py", line 244, in anonymize_ip_addr
return pattern.sub(lambda match: _anonymize_match(anonymizer, match[0], undo_ip_anon), line)
File "/usr/local/lib/python2.7/dist-packages/netconan/ip_anonymization.py", line 244, in <lambda>
return pattern.sub(lambda match: _anonymize_match(anonymizer, match[0], undo_ip_anon), line)
File "/usr/local/lib/python2.7/dist-packages/netconan/ip_anonymization.py", line 217, in _anonymize_match
ip = anonymizer.make_addr(match)
File "/usr/local/lib/python2.7/dist-packages/netconan/ip_anonymization.py", line 177, in make_addr
return ipaddress.IPv4Address(addr_str)
File "/home/cns-admin/.local/lib/python2.7/site-packages/ipaddress.py", line 1391, in __init__
self._check_packed_address(address, 4)
File "/home/cns-admin/.local/lib/python2.7/site-packages/ipaddress.py", line 554, in _check_packed_address
expected_len, self._version))
ipaddress.AddressValueError: '12.123.123.12' (len 13 != 4) is not permitted as an IPv4 address. Did you pass in a bytes (str in Python 2) instead of a unicode object?
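One possible workaround, sketched under the assumption that the matched text reaches the address constructor as bytes (`str` on Python 2): decode it to text first. The helper name is hypothetical and is not the `make_addr` shown in the traceback.

```python
import ipaddress


def make_ipv4(addr):
    """Build an IPv4Address from text, decoding bytes first."""
    # On Python 2, a plain str is bytes and would be treated as a packed address.
    if isinstance(addr, bytes):
        addr = addr.decode("ascii")
    return ipaddress.IPv4Address(addr)


print(make_ipv4(b"12.123.123.12"))
```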
Netconan replaces multiple whitespace with a single space in order to make regex maintenance and correctness easier. However, it might be nice to preserve leading (and trailing?) spaces so that files "look the same" on the input and the output.
Especially if anonymizing indented structures such as router configuration dumps, Python, JSON, etc.
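A small sketch of how leading and trailing whitespace could be kept while still collapsing whitespace inside the line. The function name is an assumption, and this is not how netconan currently normalizes lines.

```python
def collapse_inner_whitespace(line):
    """Collapse runs of whitespace inside a line, keeping its indentation and tail."""
    stripped = line.strip()
    if not stripped:
        return line  # whitespace-only lines are returned untouched
    lead = line[:len(line) - len(line.lstrip())]
    trail = line[len(line.rstrip()):]
    return lead + " ".join(stripped.split()) + trail


print(repr(collapse_inner_whitespace("    set   password    foo  \n")))
```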
The current example isn't self-illustrative:
there may be other improvements to make.
Remove RANCID content and have separate developer implement missing functionality.
The current behavior of generating output (which looks identical to input) is confusing.
key-hash sha256 - Arista
snmp-server mib community-map XXXXX:1000 context FOO - Cisco [XXX should be anonymized]
snmp-server user <x> <x> auth md5 <x> priv <x> localizedkey - ?
enum34
Python module is not compatible with certain Ansible modules that require the enum
module. Since enum34
is not required for Python 3, we need to adjust the inclusion/usage to just Python 2 deployments.
For users that have a long-ish list of sensitive words, it would be more convenient to be able to specify them in a file rather than on command line.
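A sketch of what reading such a file could look like. The one-word-per-line format, the `#` comment convention, and the function name are all assumptions made here, not an existing netconan option.

```python
def parse_sensitive_words(lines):
    """Return the sensitive words from an iterable of lines, one word per line."""
    words = []
    for line in lines:
        word = line.strip()
        # Skip blank lines and '#' comments (an assumed convention).
        if word and not word.startswith("#"):
            words.append(word)
    return words


print(parse_sensitive_words(["ACME\n", "\n", "# customer names\n", " Wile \n"]))
```

Because the function takes any iterable of lines, an open file object can be passed to it directly.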
@dhalperi for additional context if needed
Currently, lines with sensitive information are completely removed. Need to be able to preserve the non-sensitive information on those lines.
Some files, like JSON files or some network configuration files may have important context in sensitive lines, which is lost during anonymization.
Consider the following JSON file:
"Text": "password FOOBAR",
"Text2": "cable shared-secret FOOBAR",
"OtherField": ...
Which is no longer proper JSON format after anonymizing passwords:
"Text": "password netconanRemoved0
! Sensitive line SCRUBBED by netconan
"OtherField": ...
or consider this Juniper config snippet:
system {
root-authentication {
password FOOBAR;
}
}
Which loses its line-terminating ;
after anonymizing passwords:
system {
root-authentication {
password netconanRemoved0
}
}
In both of these cases, it would be helpful to preserve the trailing context, so the anonymized files will still be formatted like the original (e.g. so the anonymized JSON file can still be interpreted as a valid JSON file).
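One way to keep the trailing context is to replace only the secret token rather than the rest of the line. The pattern below is a deliberately simplified assumption (a single `password <value>` form) and is not one of netconan's actual regexes.

```python
import re

# Hypothetical pattern: the secret stops at whitespace, ';', '"' or ','.
PASSWORD_RE = re.compile(r'(?P<prefix>\bpassword )(?P<secret>[^\s;",]+)')


def anonymize_password(line, replacement="netconanRemoved0"):
    """Swap only the secret, leaving everything after it on the line intact."""
    return PASSWORD_RE.sub(lambda m: m.group("prefix") + replacement, line)


print(anonymize_password('"Text": "password FOOBAR",'))
print(anonymize_password("        password FOOBAR;"))
```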
Would be nice if Netconan kept these subnets intact too.
netconan
seems to currently output valid public IPs when anonymizing. Looking at the ip_anonymization.py, this might take a good chunk of work to restrict things, but it seems like using reserved documentation ranges for IPv4 and IPv6 would be appropriate here, rather than random public IPs.
It would be lovely to have an optional feature that could anonymize Cisco ACL ip/ports/protocols and ACL remarks.
Many of my clients consider ACLs extremely sensitive and I have several clients who use ACL remarks extensively.
This feature should be optional as I suspect doing so might break some things.
fortinet passwords are encrypted in config, using the following syntax:
set password ENC <hash>
Netconan outputs the following after it anonymizes the password:
set password netconanRemoved0 <hash>
The hash is not removed, but the ENC keyword is (i.e. the keyword that just states that an encrypted hash follows).
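A sketch of a pattern that treats `ENC` as an optional keyword and replaces the value after it. The keyword list is taken from the lines quoted in these reports; the pattern and function names are hypothetical.

```python
import re

# Hypothetical pattern: "set <keyword> [ENC ]<value>"; ENC is kept, the value is replaced.
FORTI_SECRET_RE = re.compile(r"^(\s*set (?:password|pksecret) )(ENC )?(\S+)(.*)$")


def anonymize_forti_secret(line, replacement="netconanRemoved0"):
    """Replace the secret on a FortiOS-style line, preserving an ENC marker."""
    def _sub(match):
        prefix, enc, _secret, rest = match.groups()
        return prefix + (enc or "") + replacement + rest
    return FORTI_SECRET_RE.sub(_sub, line)


print(anonymize_forti_secret("set password ENC 535456656ghffgfdgfdgf"))
```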
Currently, anonymized IP addresses do not preserve classes (i.e. all bits of an address are anonymized)
IOS-XR BGP policy allows you to set a BGP community by either:
a) directly referencing the target community in the config
set community 65001:1111
b) referencing a community-set in the config
community-set BAR
65001:1111
set community BAR
A sample configuration snippet for community-sets and use in a BGP policy is shown below:
community-set LOCAL-NOPREPEND
1111:2222
end-set
!
community-set LOCAL-LASTRESORT
1111:4444
end-set
!
community-set LOCAL-HIGHLOCPREF
1111:1111
end-set
!
!
route-policy FOO
if destination in BAR or destination in BAZ then
set community LOCAL-NOPREPEND
if destination in FOOBAR then
set community LOCAL-LASTRESORT
endif
if destination in BAZBAR then
set community LOCAL-HIGHLOCPREF
set community LOCAL-LASTRESORT
else
set community LOCAL-NOPREPEND
set community LOCAL-LASTRESORT
endif
else
drop
endif
end-policy
Netconan will turn that config into:
community-set LOCAL-NOPREPEND
1111:2222
end-set
!
community-set LOCAL-LASTRESORT
1111:4444
end-set
!
community-set LOCAL-HIGHLOCPREF
1111:1111
end-set
!
!
route-policy FOO
if destination in BAR or destination in BAZ then
set community netconanRemoved42
if destination in FOOBAR then
set community netconanRemoved19
endif
if destination in BAZBAR then
set community netconanRemoved20
set community netconanRemoved21
else
set community netconanRemoved22
set community netconanRemoved23
endif
else
drop
endif
end-policy
Right now, Netconan
only accepts directories as input
and output
params. Need to update to allow specifying a file or directory.
FortiOS multiline private keys are not handled correctly: only the first line is handled.
Private keys can be found in multiple sections of a config, but as an example:
config vpn certificate local
edit "fortinet_CA_SSL"
set password ENC 535456656ghffgfdgfdgf
set comments "This is the default CA certificate the SSL Inspection....."
set private-key "-----BEGIN ENCRYPTED PRIVATE KEY-----
gfgGFDBFFFfffffffffffffffffffffffffffffffghhgfhhfhghghghgjjghfh
<continues for several lines>
-----END ENCRYPTED PRIVATE KEY-----"
set certificate "-----BEGIN CERTIFICATE-----
gfgGFDBFFFfffffffffffffffffffffffffffffffghhgfhhfhghghghgjjghfh
<continues for several lines>
-----END CERTIFICATE-----"
next
end
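Handling this needs some state carried across lines. A minimal sketch, assuming the input is a list of lines without trailing newlines and that the quoted value ends on the first later line that ends with a double quote (the function name and defaults are hypothetical):

```python
def scrub_multiline_keys(lines, keyword="set private-key",
                         replacement="netconanRemoved0"):
    """Replace a quoted value that may span several lines with a single token."""
    out = []
    in_key = False
    for line in lines:
        if in_key:
            # Drop continuation lines up to and including the closing quote.
            if line.rstrip().endswith('"'):
                in_key = False
            continue
        stripped = line.lstrip()
        if stripped.startswith(keyword + ' "'):
            indent = line[:len(line) - len(stripped)]
            out.append('{}{} "{}"'.format(indent, keyword, replacement))
            # The value continues on later lines unless the quote closes here.
            rest = stripped[len(keyword) + 2:]
            in_key = not rest.rstrip().endswith('"')
            continue
        out.append(line)
    return out


config = [
    '    set private-key "-----BEGIN ENCRYPTED PRIVATE KEY-----',
    'gfgGFDBFFF',
    '-----END ENCRYPTED PRIVATE KEY-----"',
    '    next',
]
print("\n".join(scrub_multiline_keys(config)))
```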
set community \"<snmp community>\"
key <key>
tacacs-server host <ip> key 7 \"key\"
Hi,
Am I the only one having trouble with the -d argument?
The dump file is created but it is always empty.
Awesome tool besides that :)
The anonymizer should preserve known IPs and/or prefixes as appropriate. The standardized list of known prefixes is here:
https://www.iana.org/assignments/iana-ipv6-special-registry/iana-ipv6-special-registry.xhtml
Decisions may need to be made on a case-by-case basis.
(To implement this, we'll probably want to generalize the should_anonymize
functionality to handle partial preservations.)
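A rough sketch of a whole-address preservation check using the standard `ipaddress` module. The prefix list is a small illustrative subset of the IANA special-purpose registries, `should_preserve` is a hypothetical name (not the existing `should_anonymize`), and partial preservation is not covered.

```python
import ipaddress

# A few well-known special-purpose prefixes; not the full IANA list.
PRESERVED_NETWORKS = [ipaddress.ip_network(prefix) for prefix in (
    u"::1/128",        # IPv6 loopback
    u"fe80::/10",      # IPv6 link-local
    u"ff00::/8",       # IPv6 multicast
    u"2001:db8::/32",  # IPv6 documentation
    u"127.0.0.0/8",    # IPv4 loopback
    u"224.0.0.0/4",    # IPv4 multicast
)]


def should_preserve(addr_str):
    """Return True if the address falls inside a known special-purpose prefix."""
    addr = ipaddress.ip_address(addr_str)
    return any(addr.version == net.version and addr in net
               for net in PRESERVED_NETWORKS)


print(should_preserve(u"fe80::1"), should_preserve(u"2001:4860::1"))
```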
Travis build is failing for many W605
(invalid escape sequence) and a few W504
(line break after binary operator) flake8
failures.
Running netconan -p -w test -i input -o output
(anonymize test
, which is a reserved word) where input
is a file containing:
password test
password "test"
produces the following output
file:
password test
password "netconanRemoved0"
but instead, it should produce an output like:
password test
password "test"
cli, config file.
probably, same setup as sensitive-words would work.
Reason to do this: enable users with ambiguous grammars (like #72) to provide a list of things that should not be treated as passwords.
Today, to use IP-address-anonymization from Netconan
externally, you can do something like:
from netconan.ip_anonymization import IpAnonymizer, anonymize_ip_addr
import uuid
salt = str(uuid.uuid4())
anonymizer = IpAnonymizer(salt=salt)
anon_str = anonymize_ip_addr(anonymizer, '1.2.3.4')
But it would be nice to have a better string-based IP-address anonymization interface in the anonymizer objects themselves.
There is an existing method IpAnonymizer.anonymize(addr)
, but this operates on an unintuitive type (int) and relies on the caller to do all the work (i.e. convert IP address to int, check if the address needs to be anonymized).
If a given line has two passwords and the first is a reserved word, the second will be incorrectly skipped.
e.g. in snmp-server user a b auth md5 ipaddress priv RemoveMe
, RemoveMe
is a password that should be anonymized and ipaddress
is a reserved word that should be left untouched, however Netconan stops searching for additional passwords once it finds the reserved word.
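The intended behavior could be sketched as follows: check every candidate on the line and skip only the ones that are reserved words. The pattern covers just the `auth md5|sha <x>` and `priv <x>` forms from this example and is an assumption, as are the names and the one-word reserved set.

```python
import re

RESERVED_WORDS = {"ipaddress"}  # illustrative; the real reserved list is longer

# Hypothetical pattern: a keyword prefix followed by one candidate password token.
SNMP_SECRET_RE = re.compile(r"(\b(?:auth (?:md5|sha)|priv) )(\S+)")


def anonymize_snmp_user(line, replacement="netconanRemoved0"):
    """Replace every non-reserved candidate instead of stopping at the first."""
    def _sub(match):
        if match.group(2) in RESERVED_WORDS:
            return match.group(0)  # leave reserved words untouched, keep scanning
        return match.group(1) + replacement
    return SNMP_SECRET_RE.sub(_sub, line)


print(anonymize_snmp_user("snmp-server user a b auth md5 ipaddress priv RemoveMe"))
```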
Add support for anonymizing lines like:
Where the anonymized output would be something like:
I see a fairly complicated "next character" regex here -- why can't it just be [^\d]
- or the right thing for "not a digit"?
https://github.com/intentionet/netconan/blob/master/netconan/ip_anonymization.py#L35
It would be useful to include a command-line option that removes "comments" from configs. For example, interface descriptions, BGP neighbor descriptions, remarks in ACLs, etc. Developing keyword lists for this is often non-trivial, so the ability to remove them outright, especially since they don't impact network semantics, would be useful.
(Prompted by a discussion with @sfraint and @progwriter on Slack)
Some proposals:
Use hyphens to separate words (--sensitive-words
, not --sensitivewords
e.g.)
Use names that are longer lasting. E.g., --undo
seems better than --undoipaddr
: what if we add another type of anonymization that is reversible?
Add shortcuts for as many arguments as sensible: -w
for --sensitive-words
?
Update docs.
Hi
I tried netconan
on our juniper configurations but it's failing with the following error message
netconan --anonymize-passwords -c bs1-ash1.candidate.conf
usage: netconan [-h] [-a] [-c CONFIG] [-d DUMP_IP_MAP] -i INPUT
[-l {DEBUG,INFO,WARNING,ERROR,CRITICAL}] [-n AS_NUMBERS] -o
OUTPUT [-p] [-r RESERVED_WORDS] [-s SALT] [-u]
[-w SENSITIVE_WORDS]
netconan: error: Unexpected line 11 in bs1-ash1.candidate.conf: point-to-point;
here is the first part of our config
## Last changed: 2018-12-06 17:09:11 PST
version 17.3R2-S2.1;
groups {
ISIS-BASE {
protocols {
isis {
reference-bandwidth 1000g;
level 1 disable;
level 2 wide-metrics-only;
interface <*-*> {
point-to-point;
bfd-liveness-detection {
minimum-interval 1000;
multiplier 3;
}
}
interface <ae*> {
point-to-point;
}
interface lo0.0 {
passive;
}
}
}
}
My environment
Python 2.7.14
netconan 0.8.1
Please let me know if I'm doing something wrong
thanks
Damien
We need tests that stuff is NOT anonymized. E.g., MAC addresses and various other places ::
may appear.
SNMP Communities are not removed from FortiOS devices:
config system snmp community
edit 100
set name "communityXYZ"
next
end
IP anonymization does not currently support IPv6
Some password regexes for Cisco-like configs in sensitive_item_removal.py
are unverified and untested.
They need to be tested against a variety of config lines generated on a router (to make sure the regex handles different syntaxes allowed in the line) and these config lines should be added to the existing list of test lines in test_sensitive_item_removal.py
.
In sensitive_item_removal.py, MD5 format detection and anonymization currently only supports 4-character salt. Should expand this to support more salt sizes (at least support 8-character salt for Juniper).