cxx-obfuscator is a filter that obfuscates a C/C++ source file or de-obfuscates a previously obfuscated file.
The intended use case is reporting bugs to a compiler vendor without exposing confidential information.
cxx-obfuscator strikes a balance between protecting intellectual property and keeping obfuscated code debuggable: by default, it does not obfuscate standard identifiers.
The conceal subcommand reads source code from its standard input and writes the obfuscated result to its standard output.
It maps each identifier (except whitelisted ones, see below) to a word randomly selected from a pool.
It removes comments.
It leaves other tokens unchanged.
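As an illustration of the three rules above, here is a minimal Python sketch of the conceal step.
It is not the tool's actual implementation: the function name conceal, its parameters, the regular-expression tokenizer and the in-memory dict standing in for the map file are all assumptions made for this example.
The sketch ignores preprocessor directives and raw string literals, and does not check pool words for collisions with identifiers already present in the source.

```python
import random
import re

# Simplified C/C++ tokenizer (an assumption of this sketch, not the real one):
# comments, string/char literals, numbers, identifiers, then any other character.
TOKEN = re.compile(r"""
    (?P<comment>//[^\n]*|/\*.*?\*/)
  | (?P<literal>"(?:\\.|[^"\\])*"|'(?:\\.|[^'\\])*')
  | (?P<number>\.?[0-9][A-Za-z0-9_.]*)
  | (?P<ident>[A-Za-z_][A-Za-z0-9_]*)
  | (?P<other>.)
""", re.VERBOSE | re.DOTALL)


def conceal(source, whitelist, pool, rng=None):
    """Return (obfuscated_source, mapping), where mapping is clear -> obfuscated."""
    rng = rng or random.Random()
    available = list(pool)
    rng.shuffle(available)          # random selection from the pool
    mapping = {}
    out = []
    for match in TOKEN.finditer(source):
        kind, text = match.lastgroup, match.group()
        if kind == "comment":
            out.append(" ")         # drop the comment, keep neighbours separated
        elif kind == "ident" and text not in whitelist and not text.startswith("_"):
            if text not in mapping:
                # Raises IndexError if the pool has fewer words than identifiers.
                mapping[text] = available.pop()
            out.append(mapping[text])
        else:
            out.append(text)        # every other token is left unchanged
    return "".join(out), mapping
```

For example, conceal("int total = count + 1; // note\n", {"int"}, ["alpha", "beta", "gamma"]) keeps int, replaces total and count with two distinct pool words, and drops the comment.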
The whitelist contains identifiers that must not be obfuscated.
It always contains C/C++ keywords (while, for, ...), identifiers starting with _, and some names with special meaning (begin(), end(), main()).
By default, the whitelist also contains all words in cxx-obfuscator.whitelist.txt, if this file is found in the directory containing cxx-obfuscator.
This default whitelist contains all identifiers found in /usr/include on a typical Unix system.
The -whitelist option specifies an alternative whitelist.
By default, obfuscated identifiers are taken from a pool of words stored in cxx-obfuscator.pool.txt in the same directory as cxx-obfuscator.
The -pool option specifies an alternative word pool.
The -map option specifies a file where the conceal subcommand writes the mapping between clear and obfuscated identifiers.
The reveal subcommand de-obfuscates its standard input and writes the result to its standard output, using the map file generated by the conceal subcommand.
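Conceptually, de-obfuscation applies the recorded mapping in reverse.
The following Python sketch shows the idea under stated assumptions: the function name reveal is hypothetical, and the mapping is passed as an in-memory dict from clear to obfuscated identifiers because the on-disk format of the map file is not described here.
For brevity it substitutes every identifier-shaped word, including any that occur inside string literals, which a real tokenizer would skip.

```python
import re

# Identifier-shaped words; a simplification that does not skip string literals.
IDENT = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def reveal(obfuscated, mapping):
    """Restore clear identifiers from a clear -> obfuscated mapping."""
    # Invert the mapping so obfuscated words can be looked up directly.
    inverse = {obf: clear for clear, obf in mapping.items()}
    # Words absent from the map (keywords, whitelisted names) are kept as-is.
    return IDENT.sub(lambda m: inverse.get(m.group(), m.group()), obfuscated)
```

With mapping = {"total": "alpha", "count": "beta"}, calling reveal("int alpha = beta + 1;", mapping) returns "int total = count + 1;".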
The tokenize subcommand extracts tokens from its standard input.
It is used to generate both pool and whitelist files.
To generate a whitelist file containing all tokens in files in /usr/include:
find /usr/include -type f -exec cat {} + | cxx-obfuscator tokenize > whitelist.txt
To generate a pool containing all words from the Sherlock Holmes canon:
curl https://sherlock-holm.es/stories/plain-text/cano.txt | cxx-obfuscator tokenize > pool.txt