intentionet / netconan

netconan - a Network Configuration Anonymizer

License: Apache License 2.0

Python 98.11% Starlark 1.89%

netconan's People

Contributors

Stargazers

Watchers

Forkers

mirzawaqasahmed snuggles shh0613 clay584 utkuozsan gahlberg colgate-cs-research elviraant sonicepk petercrocker kircheneer jmm-repo

netconan's Issues

VPN pksecret key not being anonymized

FortiOS VPN interface pksecret not being anonymized

config system interface
    edit "VPN_1"
        set interface "wan1"
        set ike-version 2
        set peertype any
        set net-device disable
        set proposal aes256gcm-prfsha512
        set dhrp 21
        set nattraversal disable
        set remote-gw 1.1.1.1
        set pksecret ENC  dfdgdgfhfhhfffgfgfgs
end

Feature Request: anonymize domain names by default

I work with a lot of different customers and so there is a ton of variability in the domain names of the devices. It would be good to be able to anonymize the domain name by default instead of having to find all variations and list them as sensitive words.

For example:

ip domain-name a.b.acme.net

Match ip domain-name \S+$ and replace.
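A minimal sketch of the match-and-replace idea above, assuming a salted hash is an acceptable replacement. The function name, the `.example` suffix, and the 12-character label length are illustrative assumptions, not existing netconan behavior.

```python
import hashlib
import re

# Hypothetical pattern: the single token that follows "ip domain-name".
DOMAIN_LINE = re.compile(r"^(\s*ip domain-name\s+)(\S+)\s*$")


def anonymize_domain_line(line, salt):
    """Replace the domain on an 'ip domain-name' line with a salted-hash label."""
    match = DOMAIN_LINE.match(line)
    if match is None:
        return line  # not a domain-name line, leave untouched
    # Lowercase first so that A.B.ACME.NET and a.b.acme.net map to the same value.
    digest = hashlib.sha256((salt + match.group(2).lower()).encode("utf-8")).hexdigest()
    return "{}{}.example".format(match.group(1), digest[:12])
```

Because the output depends only on the salt and the original domain, the same domain is replaced consistently across files anonymized with the same salt.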

Also, maybe a flag for hostname as well?

Maybe even make it more generic to pick up domain names in any config object definition to further anonymize.

At the moment, I do want to submit parse issues automatically to the developers of Batfish, but I am not comfortable doing so as there is still identifiable information in the configurations after a netconan run.

Sensitive line anonymization consistency

A given sensitive item (from a config line known to contain sensitive info) is anonymized based on the number of sensitive items encountered before it. This means inserting a new sensitive line and re-anonymizing a file may result in different anonymized values for other sensitive items in that file.

Could apply a similar idea to what is now used for IP address anonymization (hash of original value + salt, so new anon values would depend solely on the salt and unanonymized sensitive item).
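The hash-plus-salt idea could look roughly like the sketch below. The function name and the choice of SHA-256 with a 12-character suffix are assumptions made for illustration; only the `netconanRemoved` prefix comes from existing output.

```python
import hashlib


def anonymize_sensitive_item(item, salt, length=12):
    """Derive a replacement token from the salt and the original value only."""
    digest = hashlib.sha256((salt + item).encode("utf-8")).hexdigest()
    # No counter is involved, so adding or removing other sensitive
    # lines does not change the token produced for this item.
    return "netconanRemoved" + digest[:length]
```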

Support quoted sensitive phrases with spaces in them

Running netconan -p -i input_file -o output_file

On this input_file:

set license keys key "something something"

produces a partially anonymized output_file:

set license keys key "netconanRemoved0 something"

when it would ideally produce a fully anonymized file like this:

set license keys key "netconanRemoved0"
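One possible way to treat a quoted phrase as a single sensitive value is to let the capture group accept either a quoted string or a bare token. This is a sketch with a hypothetical pattern and function name, not the regex netconan currently uses.

```python
import re

# Group 1: the line prefix. Group 2: a double-quoted string (spaces allowed)
# or, failing that, a single bare token.
LICENSE_KEY = re.compile(r'^(\s*set license keys key\s+)("[^"]*"|\S+)')


def anonymize_license_key(line, replacement="netconanRemoved0"):
    def _replace(match):
        if match.group(2).startswith('"'):
            # Keep the surrounding quotes so the line keeps its shape.
            return match.group(1) + '"' + replacement + '"'
        return match.group(1) + replacement

    return LICENSE_KEY.sub(_replace, line)
```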

Add support for recursively processing files in a folder

Review and add tests for Juniper sensitive lines

Password/snmp community regexes for Juniper configs in sensitive_item_removal.py may need tweaking to catch all allowed syntax/options and some do not have any tests.

The JUNOS regexes for md5, hello-authentication-key, and ssh do not have any tests.

The remaining JUNOS regexes:

- May need to have additional test config lines added in test_sensitive_item_removal.py covering a variety of syntaxes/options allowed by (and verified on) JUNOS routers
  - Right now, the regexes may only handle a subset of allowed syntax and options (e.g. not handling an optional digit or parameter, such as password blah versus password 0 blah or password encrypt sha512 blah)
  - Right now, test config lines are only taken from example test rigs in batfish and may not cover the full range of allowed syntax/options
- Need to have a capture group added and group number specified
  - This is so sensitive info can be extracted and replaced instead of just removing the whole line
  - This has already been completed for the snmp community JUNOS regex

Help is slightly ambiguous on -c vs -i

It took me a few minutes before realizing -c was for the netconan config and not for the input config files to be sanitized.

Verify snmp community regexes

Confirm the snmp community regexes in sensitive_item_removal.py catch communities for all possible config lines (covering the variety of syntaxes/options allowed by routers).

Sensitive keyword removal

In addition to anonymizing IPs and scrambling passwords, users may have sensitive words they wish to hide from the final output. E.g.,

! Use ACME's standard logging servers

Proposal:

--sensitive-words=ACME,Wile,Coyote

which will ensure that for each word, grep -ric <word> <file> returns 0 results.

I think we need to keep the mapping bijective, so that multiple sensitive words aren't anonymized the same. E.g., if I have route maps secret-Wile and secret-ACME, they should be distinct after sensitive word processing.

One proposal would be:

anon_word = hex_encode(hash(salt || word))[:N]

In other words, keep the first N characters of a hex-encoded, salted hash.

And replace each instance of word (case insensitive) with anon_word.

Possible issues:

- replacing accidental occurrences of word, e.g., in hashes. Seems unlikely if words are long enough.
- replacements for word are too long and break a parser, e.g., maybe some ACL name is length-limited. This could break parsing, but maybe parsers should relax their length constraints here.
- I don't think it's necessarily safe to force N = len(word), since that could lead to collisions (for short words) and may compromise anonymity. (Though I think it shouldn't, if the salt is well-chosen.)
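The proposal can be sketched as follows, assuming SHA-256 as the hash and N counted in hex characters. Both function names are hypothetical.

```python
import hashlib
import re


def anon_word(word, salt, n=10):
    """First n hex characters of a salted hash of the lowercased word."""
    return hashlib.sha256((salt + word.lower()).encode("utf-8")).hexdigest()[:n]


def replace_sensitive_words(line, words, salt, n=10):
    """Replace every case-insensitive occurrence of each sensitive word."""
    # Longest words first, so a short word cannot pre-empt a longer one.
    ordered = sorted({w.lower() for w in words}, key=len, reverse=True)
    if not ordered:
        return line
    pattern = re.compile("|".join(re.escape(w) for w in ordered), re.IGNORECASE)
    return pattern.sub(lambda m: anon_word(m.group(0), salt, n), line)
```

Distinct words hash to distinct replacements with high probability, so secret-Wile and secret-ACME stay distinguishable. A short hex replacement can still happen to contain another short sensitive word, which is the accidental-occurrence concern listed above.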

Add support for tracking reversible passwords

In sensitive_item_removal.py, add in ability to reverse encrypted passwords (where applicable) to make sure anonymized passwords match when the original encrypted passwords matched.

Specifically, this applies to Cisco type 7 and Juniper type 9 passwords.
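For the Cisco type 7 case, reversing is a simple XOR against a fixed, widely published key string. The sketch below shows the decoding step only; the function name is hypothetical and the key constant should be double-checked against a trusted reference before relying on it. Juniper type 9 uses a different scheme and is not covered here.

```python
# Widely published type 7 translation key (an obfuscation constant, not a secret).
_TYPE7_KEY = "dsfd;kfoA,.iyewrkldJKDHSUBsgvca69834ncxv9873254k;fg87"


def decode_cisco_type7(encoded):
    """Recover the plaintext from a Cisco type 7 string such as '0822455D0A16'."""
    seed = int(encoded[:2])  # first two decimal digits: starting offset into the key
    payload = encoded[2:]    # remaining characters: hex-encoded XORed bytes
    chars = []
    for i in range(0, len(payload), 2):
        byte = int(payload[i:i + 2], 16)
        key_char = _TYPE7_KEY[(seed + i // 2) % len(_TYPE7_KEY)]
        chars.append(chr(byte ^ ord(key_char)))
    return "".join(chars)
```

With the plaintext recovered, two type 7 strings that encode the same password with different offsets could be mapped to the same anonymized value.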

Make input/output required arguments

(.env3) ➜  configs netconan -h
usage: netconan [-h] [-i INPUTDIRECTORY] [-o OUTPUTDIRECTORY] [-p] [-a]
                [-s SALT] [-d DUMPIPADDRMAP] [-u]
                [--sensitivewords SENSITIVEWORDS]
                [-l {DEBUG,INFO,WARNING,ERROR,CRITICAL}]

implies they are optional.

Add support for undoing IP anonymization

Currently cannot undo IP anonymization.

IP address order is not preserved

After IP anonymization, ordering of IP addresses may change

Netconan doesn't work in python 2.7

There's an issue in the ipaddress library that has to do with strings vs bytes. We avoid this issue in our unit tests, but it's triggered when run from the CLI. We need a test here that actually runs using an input file.

netconan -i logs -o anonymized --anonymize-ips
WARNING No salt was provided; using randomly generated "XZF0CbM10AKtFLYX"
INFO Anonymizing parser_warnings.txt
Traceback (most recent call last):
  File "/usr/local/bin/netconan", line 11, in <module>
    sys.exit(main())
  File "/usr/local/lib/python2.7/dist-packages/netconan/netconan.py", line 110, in main
    args.dump_ip_map, sensitive_words, args.undo, as_numbers)
  File "/usr/local/lib/python2.7/dist-packages/netconan/anonymize_files.py", line 70, in anonymize_files_in_dir
    anonymizer6=anonymizer6)
  File "/usr/local/lib/python2.7/dist-packages/netconan/anonymize_files.py", line 99, in anonymize_file
    output_line = anonymize_ip_addr(anonymizer4, output_line, undo_ip_anon)
  File "/usr/local/lib/python2.7/dist-packages/netconan/ip_anonymization.py", line 244, in anonymize_ip_addr
    return pattern.sub(lambda match: _anonymize_match(anonymizer, match[0], undo_ip_anon), line)
  File "/usr/local/lib/python2.7/dist-packages/netconan/ip_anonymization.py", line 244, in <lambda>
    return pattern.sub(lambda match: _anonymize_match(anonymizer, match[0], undo_ip_anon), line)
  File "/usr/local/lib/python2.7/dist-packages/netconan/ip_anonymization.py", line 217, in _anonymize_match
    ip = anonymizer.make_addr(match)
  File "/usr/local/lib/python2.7/dist-packages/netconan/ip_anonymization.py", line 177, in make_addr
    return ipaddress.IPv4Address(addr_str)
  File "/home/cns-admin/.local/lib/python2.7/site-packages/ipaddress.py", line 1391, in __init__
    self._check_packed_address(address, 4)
  File "/home/cns-admin/.local/lib/python2.7/site-packages/ipaddress.py", line 554, in _check_packed_address
    expected_len, self._version))
ipaddress.AddressValueError: '12.123.123.12' (len 13 != 4) is not permitted as an IPv4 address. Did you pass in a bytes (str in Python 2) instead of a unicode object?
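The final error message points at the cause: the Python 2 backport of ipaddress treats a byte string as a packed address. A minimal sketch of a defensive wrapper follows, assuming the fix is to decode before constructing the address. The helper name `make_ipv4` is hypothetical.

```python
import ipaddress


def make_ipv4(addr):
    """Build an IPv4Address from text, decoding byte strings first."""
    # On Python 2, a plain str is bytes, so it is decoded to unicode here.
    if isinstance(addr, bytes):
        addr = addr.decode("ascii")
    return ipaddress.IPv4Address(addr)
```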

Feature request: preserve leading/trailing spaces

Netconan replaces multiple whitespace with a single space in order to make regex maintenance and correctness easier. However, it might be nice to preserve leading (and trailing?) spaces so that files "look the same" on the input and the output.

Especially if anonymizing indented structures such as router configuration dumps, Python, JSON, etc.
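One way to do this would be to split off the leading and trailing whitespace before collapsing the rest. This is a sketch with a hypothetical function name; it drops the line terminator, which the caller would need to re-add.

```python
import re


def normalize_preserving_indent(line):
    """Collapse internal whitespace runs but keep leading/trailing spaces."""
    body = line.rstrip("\r\n")
    stripped = body.lstrip()
    leading = body[:len(body) - len(stripped)]
    core = stripped.rstrip()
    trailing = stripped[len(core):]
    # Only the middle part is normalized, so regexes still see single spaces.
    return leading + re.sub(r"\s+", " ", core) + trailing
```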

Self-illustrative example in the README

The current example isn't self-illustrative:

- doesn't have multiple addresses with the same prefix (to show prefix-preservation)
- doesn't have an IPv6 address
- could use a more complex password format like a salted hash

There may be other improvements to make.

Remove RANCID content

Remove RANCID content and have separate developer implement missing functionality.

Warn and exit if no anonymization option is turned on?

The current behavior of generating output (which looks identical to input) is confusing.

Some more new password/community regexes

key-hash sha256 - Arista
snmp-server mib community-map XXXXX:1000 context FOO - Cisco [XXX should be anonymized]
snmp-server user <x> <x> auth md5 <x> priv <x> localizedkey - ?

Remove need for enum34 for Python 3

enum34 Python module is not compatible with certain Ansible modules that require the enum module. Since enum34 is not required for Python 3, we need to adjust the inclusion/usage to just Python 2 deployments.

Specify sensitive words in a file

For users that have a long-ish list of sensitive words, it would be more convenient to be able to specify them in a file rather than on command line.

@dhalperi for additional context if needed

Anonymize lines with sensitive information

Currently, lines with sensitive information are completely removed. Need to be able to preserve the non-sensitive information on those lines.

Preserve more line context

Some files, like JSON files or some network configuration files may have important context in sensitive lines, which is lost during anonymization.

Consider the following JSON file:

"Text": "password FOOBAR",
"Text2": "cable shared-secret FOOBAR",
"OtherField": ...

Which is no longer proper JSON format after anonymizing passwords:

"Text": "password netconanRemoved0
! Sensitive line SCRUBBED by netconan
"OtherField": ...

or consider this Juniper config snippet:

system {
  root-authentication {
    password FOOBAR;
  }
}

Which loses its line-terminating ; after anonymizing passwords:

system {
  root-authentication {
    password netconanRemoved0
  }
}

In both of these cases, it would be helpful to preserve the trailing context, so the anonymized files will still be formatted like the original (e.g. so the anonymized JSON file can still be interpreted as a valid JSON file).
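A sketch of how trailing context could be kept: capture only the sensitive token, excluding terminators such as `;` and `"`, and substitute just that group. The pattern and function name are illustrative assumptions and far simpler than the real password regexes.

```python
import re

# Group 1: keyword plus whitespace. Group 2: the secret itself, which stops
# before whitespace, a semicolon, or a double quote.
PASSWORD = re.compile(r'(\bpassword\s+)([^\s;"]+)')


def anonymize_password(line, replacement="netconanRemoved0"):
    # Everything outside the match, including trailing ';' or '",', is kept.
    return PASSWORD.sub(lambda m: m.group(1) + replacement, line)
```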

FR: preserve RFC1918 subnets

10.0.0.0/8 IP addresses: 10.0.0.0 -- 10.255.255.255
172.16.0.0/12 IP addresses: 172.16.0.0 -- 172.31.255.255
192.168.0.0/16 IP addresses: 192.168.0.0 -- 192.168.255.255

Would be nice if Netconan kept these subnets intact too.
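As a sketch of the first step, the standard ipaddress module can report how many leading bits of an address belong to one of these private blocks, so an anonymizer could leave those bits alone. The function name is hypothetical, and how the result would plug into the existing IP anonymizer is left open.

```python
import ipaddress

RFC1918_NETWORKS = [
    ipaddress.ip_network(net)
    for net in (u"10.0.0.0/8", u"172.16.0.0/12", u"192.168.0.0/16")
]


def private_prefix_len(addr):
    """Number of leading bits to preserve: the block's prefix length, or 0."""
    ip = ipaddress.ip_address(addr)
    for net in RFC1918_NETWORKS:
        if ip in net:
            return net.prefixlen
    return 0
```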

Suggestion: Use reserved documentation ranges for anonymized IP addresses

netconan seems to currently output valid public IPs when anonymizing. Looking at the ip_anonymization.py, this might take a good chunk of work to restrict things, but it seems like using reserved documentation ranges for IPv4 and IPv6 would be appropriate here, rather than random public IPs.

Feature Request: anonymize ACLs

It would be lovely to have an optional feature that could anonymize Cisco ACL ip/ports/protocols and ACL remarks.

Many of my clients consider ACLs extremely sensitive and I have several clients who use ACL remarks extensively.

This feature should be optional as I suspect doing so might break some things.

incorrect fortinet password removal

fortinet passwords are encrypted in config, using the following syntax:

set password ENC <hash>

Netconan outputs the following after it anonymizes the password:

set password netconanRemoved0 <hash>

The hash is not removed, but the ENC keyword is (i.e., ENC just indicates that an encrypted hash follows).
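A possible fix is to treat ENC as part of the line prefix and capture the token after it. The pattern below is an assumption for illustration; it also accepts `pksecret`, since FortiOS uses the same `ENC <hash>` form there.

```python
import re

# Group 1: everything up to and including "ENC". Group 2: the encrypted value.
FORTI_SECRET = re.compile(r"^(\s*set (?:password|pksecret)\s+ENC\s+)(\S+)")


def anonymize_forti_secret(line, replacement="netconanRemoved0"):
    return FORTI_SECRET.sub(lambda m: m.group(1) + replacement, line)
```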

Add option to preserve classful networks during IP anonymization

Currently, anonymized IP addresses do not preserve classes (i.e. all bits of an address are anonymized)

Support IOS-XR named community sets as a reserved word during anonymization

IOS-XR BGP policy allows you to set a BGP community by either:
a) directly referencing the target community in the config
   set community 65001:1111
b) referencing a community-set in the config
   community-set BAR
     65001:1111
   set community BAR

A sample configuration snippet for community-sets and use in a BGP policy is shown below:

community-set LOCAL-NOPREPEND
  1111:2222
end-set
!
community-set LOCAL-LASTRESORT
  1111:4444
end-set
!
community-set LOCAL-HIGHLOCPREF
  1111:1111
end-set
!
!
route-policy FOO
  if destination in BAR or destination in BAZ then
    set community LOCAL-NOPREPEND
    if destination in FOOBAR then
      set community LOCAL-LASTRESORT
    endif
    if destination in BAZBAR then
      set community LOCAL-HIGHLOCPREF
      set community LOCAL-LASTRESORT
    else
      set community LOCAL-NOPREPEND
      set community LOCAL-LASTRESORT
    endif
  else
    drop
  endif
end-policy

Netconan will turn that config into:

community-set LOCAL-NOPREPEND
  1111:2222
end-set
!
community-set LOCAL-LASTRESORT
  1111:4444
end-set
!
community-set LOCAL-HIGHLOCPREF
  1111:1111
end-set
!
!
route-policy FOO
  if destination in BAR or destination in BAZ then
    set community netconanRemoved42
    if destination in FOOBAR then
      set community netconanRemoved19
    endif
    if destination in BAZBAR then
      set community netconanRemoved20
      set community netconanRemoved21
    else
      set community netconanRemoved22
      set community netconanRemoved23
    endif
  else
    drop
  endif
end-policy
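One way to avoid this would be a first pass that collects the community-set names defined in the file, then skips any `set community` line that references one of them. The sketch below uses hypothetical names and patterns. It replaces any other value only to show the contrast; whether literal communities should be replaced at all is a separate decision.

```python
import re

COMMUNITY_SET_DEF = re.compile(r"^\s*community-set\s+(\S+)")
SET_COMMUNITY = re.compile(r"^(\s*set community\s+)(\S+)")


def collect_community_set_names(lines):
    """First pass: names declared by 'community-set <name>' lines."""
    return {m.group(1) for m in map(COMMUNITY_SET_DEF.match, lines) if m}


def anonymize_set_community(line, known_names, replacement):
    """Second pass: leave references to declared community-sets untouched."""
    match = SET_COMMUNITY.match(line)
    if match is None or match.group(2) in known_names:
        return line
    return match.group(1) + replacement + line[match.end():]
```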

Add support for anonymizing single file

Right now, Netconan only accepts directories as input and output params. Need to update to allow specifying a file or directory.

FortiOS multiline private-keys & certificates are not handled correctly

FortiOS multiline private-keys are not handled correctly. Only the first line is handled

private-keys can be found in multiple sections of a config, but as an example:

config vpn certificate local
    edit "fortinet_CA_SSL"
        set password ENC 535456656ghffgfdgfdgf
        set comments "This is the default CA certificate the SSL Inspection....."
        set private-key "-----BEGIN ENCRYPTED PRIVATE KEY-----
gfgGFDBFFFfffffffffffffffffffffffffffffffghhgfhhfhghghghgjjghfh
<continues for several lines>
-----END ENCRYPTED PRIVATE KEY-----"
        set certificate "-----BEGIN CERTIFICATE-----
gfgGFDBFFFfffffffffffffffffffffffffffffffghhgfhhfhghghghgjjghfh
<continues for several lines>
-----END CERTIFICATE-----"
    next
end
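Handling these values means matching across line boundaries rather than one line at a time. The sketch below assumes the whole file is available as a single string and that the quoted value is a PEM block. The pattern and function name are illustrative.

```python
import re

# Match from the opening quote through the matching END marker and closing
# quote; DOTALL lets '.' span the intermediate lines.
PEM_BLOCK = re.compile(
    r'(set (?:private-key|certificate) ")-----BEGIN ([A-Z ]+)-----.*?-----END \2-----(")',
    re.DOTALL,
)


def anonymize_pem_blocks(text, replacement="netconanRemoved"):
    """Replace each multi-line quoted PEM value with a single token."""
    return PEM_BLOCK.sub(lambda m: m.group(1) + replacement + m.group(3), text)
```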

Anonymizer could do better detecting nested passwords

set community \"<snmp community>\"
key <key>
tacacs-server host <ip> key 7 \"key\"

-d argument does not seem to work

Hi,

am I the only one having trouble with the -d args ?
The dump file is created but it is always empty.

Awesome tool beside that :)

IPv6 address scrambling should preserve special IPs and IP ranges

The anonymizer should preserve known IPs and/or prefixes as appropriate. The standardized list of known prefixes is here:

https://www.iana.org/assignments/iana-ipv6-special-registry/iana-ipv6-special-registry.xhtml

Decisions may need to be made on a case-by-case basis.

(To implement this, we'll probably want to generalize the should_anonymize functionality to handle partial preservations.)

Travis build failing

Travis build is failing for many W605 (invalid escape sequence) and a few W504 (line break after binary operator) flake8 failures.

Sensitive words overlapping with reserved words may be anonymized

Running netconan -p -w test -i input -o output (anonymize test, which is a reserved word) where input is a file containing:

password test
password "test"

produces the following output file:

password test
password "netconanRemoved0"

but instead, it should produce an output like:

password test
password "test"

Support extending the list of reserved words

This should be possible via both the CLI and the config file.

probably, same setup as sensitive-words would work.

Reason to do this: enable users with ambiguous grammars (like #72) to provide a list of things that should not be treated as passwords.

Better external IP anonymization interface

Today, to use IP-address-anonymization from Netconan externally, you can do something like:

from netconan.ip_anonymization import IpAnonymizer, anonymize_ip_addr
import uuid

salt = str(uuid.uuid4())
anonymizer = IpAnonymizer(salt=salt)
anon_str = anonymize_ip_addr(anonymizer, '1.2.3.4')

But it would be nice to have a better string-based IP-address anonymization interface in the anonymizer objects themselves.

There is an existing method IpAnonymizer.anonymize(addr), but this operates on an unintuitive type (int) and relies on the caller to do all the work (i.e. convert IP address to int, check if the address needs to be anonymized).

Skipping reserved words causes other passwords on the same line to be ignored

If a given line has two passwords and the first is a reserved word, the second will be incorrectly skipped.

e.g. in snmp-server user a b auth md5 ipaddress priv RemoveMe, RemoveMe is a password that should be anonymized and ipaddress is a reserved word that should be left untouched, however Netconan stops searching for additional passwords once it finds the reserved word.
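The desired behavior can be sketched with a substitution callback that is invoked for every match on the line, returning reserved words unchanged and continuing to the next match. The pattern here is deliberately tiny and hypothetical, not the real snmp-server regex.

```python
import itertools
import re

# Hypothetical pattern: a keyword followed by one candidate secret.
SECRET_AFTER_KEYWORD = re.compile(r"\b(md5|priv) (\S+)")


def anonymize_line(line, reserved_words):
    counter = itertools.count()

    def _replace(match):
        if match.group(2) in reserved_words:
            return match.group(0)  # reserved word: keep it, keep scanning
        return "{} netconanRemoved{}".format(match.group(1), next(counter))

    return SECRET_AFTER_KEYWORD.sub(_replace, line)
```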

Add support for anonymizing more snmp-server data

Add support for anonymizing lines like:

snmp-server location sensitive location info here
snmp-server contact sensitive contact info here

Where the anonymized output would be something like:

snmp-server location netconanRemoved1
snmp-server contact netconanRemoved2

IP anonymization doesn't apply with `1.2.3.4;`

I see a fairly complicated "next character" regex here -- why can't it just be [^\d] - or the right thing for "not a digit"?

https://github.com/intentionet/netconan/blob/master/netconan/ip_anonymization.py#L35
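For comparison, here is a sketch of a simpler boundary check using lookarounds. It is not the regex in ip_anonymization.py, and the reasons behind the existing expression are not known here. One reason a plain "not a digit" test may be insufficient is dotted strings such as 1.2.3.4.5, which this version rejects by also refusing a following `.digit`.

```python
import re

# Not preceded by a digit or dot; not followed by a digit or by ".digit".
IPV4_CANDIDATE = re.compile(r"(?<![\d.])(?:\d{1,3}\.){3}\d{1,3}(?!\d|\.\d)")


def find_ipv4(line):
    """Return dotted quads on the line whose octets are all <= 255."""
    return [
        candidate
        for candidate in IPV4_CANDIDATE.findall(line)
        if all(int(octet) <= 255 for octet in candidate.split("."))
    ]
```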

Add option to remove "comments" from configs

It would be useful to include a command-line option that removes "comments" from configs. For example, interface descriptions, BGP neighbor descriptions, remarks in ACLs, etc. Developing keyword lists for this is often non-trivial, so the ability to remove them outright, especially since they don't impact network semantics, would be useful.

(Prompted by a discussion with @sfraint and @progwriter on Slack)

Improve keyword arguments to be more user friendly, memorable, future proof

Some proposals:

- Use hyphens to separate words (--sensitive-words, not --sensitivewords e.g.)
- Use names that are longer lasting. E.g., --undo seems better than --undoipaddr: what if we add another type of anonymization that is reversible?
- Add shortcuts for as many arguments as sensible: -w for --sensitive-words?
- Update docs.

netconan: error: Unexpected line : point-to-point;

I tried netconan on our juniper configurations but it's failing with the following error message

netconan  --anonymize-passwords -c bs1-ash1.candidate.conf

usage: netconan [-h] [-a] [-c CONFIG] [-d DUMP_IP_MAP] -i INPUT
                [-l {DEBUG,INFO,WARNING,ERROR,CRITICAL}] [-n AS_NUMBERS] -o
                OUTPUT [-p] [-r RESERVED_WORDS] [-s SALT] [-u]
                [-w SENSITIVE_WORDS]
netconan: error: Unexpected line 11 in bs1-ash1.candidate.conf: point-to-point;

here is the first part of our config

## Last changed: 2018-12-06 17:09:11 PST
version 17.3R2-S2.1;
groups {
    ISIS-BASE {
        protocols {
            isis {
                reference-bandwidth 1000g;
                level 1 disable;
                level 2 wide-metrics-only;
                interface <*-*> {
                    point-to-point;
                    bfd-liveness-detection {
                        minimum-interval 1000;
                        multiplier 3;
                    }
                }
                interface <ae*> {
                    point-to-point;
                }
                interface lo0.0 {
                    passive;
                }
            }
        }
    }

My environment

Python 2.7.14
netconan         0.8.1

Please let me know if I'm doing something wrong
thanks
Damien

false positive tests

We need tests that stuff is NOT anonymized. E.g., MAC addresses and various other places :: may appear.

FortiOS SNMP Community Removal

SNMP Communities are not removed from FortiOS devices:

config system snmp community
    edit 100
        set name "communityXYZ"
    next
end
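Because the community appears on a generic `set name` line, matching it needs block context. The sketch below tracks nesting depth inside `config system snmp community` and rewrites `set name` only at the top level of that block. The function name and replacement token are assumptions.

```python
import re

HEADER = "config system snmp community"
SET_NAME = re.compile(r'^(\s*set name\s+)"[^"]*"')


def anonymize_snmp_communities(lines, replacement="netconanRemoved"):
    out = []
    depth = 0  # 0 = outside the block, 1 = directly inside it
    count = 0
    for line in lines:
        stripped = line.strip()
        if depth == 0:
            if stripped == HEADER:
                depth = 1
        elif stripped.startswith("config "):
            depth += 1  # nested sub-block, e.g. per-community hosts
        elif stripped == "end":
            depth -= 1
        elif depth == 1:
            match = SET_NAME.match(line)
            if match:
                line = '{}"{}{}"'.format(match.group(1), replacement, count) + line[match.end():]
                count += 1
        out.append(line)
    return out
```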

Add support for IPv6

IP anonymization does not currently support IPv6

Review and add additional tests for Cisco and Arista sensitive lines

Some password regexes for Cisco-like configs in sensitive_item_removal.py are unverified and untested.

They need to be tested against a variety of config lines generated on a router (to make sure the regex handles different syntaxes allowed in the line) and these config lines should be added to the existing list of test lines in test_sensitive_item_removal.py.

Expand MD5 format/anonymization to support variable size salt

In sensitive_item_removal.py, MD5 format detection and anonymization currently only supports 4-character salt. Should expand this to support more salt sizes (at least support 8-character salt for Juniper).
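For reference, md5-crypt strings have the form `$1$<salt>$<hash>`, with a salt of up to 8 characters and a 22-character hash. A detection pattern that accepts any salt length in that range might look like the sketch below. The names are hypothetical and this covers detection only, not generating a replacement hash.

```python
import re

# Salt: 1 to 8 characters; hash: exactly 22 characters of the crypt alphabet.
MD5_CRYPT = re.compile(
    r"\$1\$([./0-9A-Za-z]{1,8})\$([./0-9A-Za-z]{22})(?![./0-9A-Za-z])"
)


def md5_salt_length(text):
    """Length of the salt of the first md5-crypt string in text, or None."""
    match = MD5_CRYPT.search(text)
    return len(match.group(1)) if match else None
```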