
securesystemslib's Issues

Boundary between securesystemslib and TUF is off.

Some of the functions in securesystemslib are really pieces of TUF that I don't think are of use to others and don't belong here.

Some examples that IIUC don't belong in securesystemslib:

  • util.ensure_all_targets_allowed(rolename, list_of_targets, parent_delegations)
  • util.find_delegated_role(roles, delegated_role)
  • formats.NUMBINS_SCHEMA
  • formats.TARGETS_SCHEMA
  • formats.SNAPSHOT_SCHEMA
  • formats.TIMESTAMP_SCHEMA
  • formats.ROOT_SCHEMA
  • formats.MIRRORDICT_SCHEMA
  • formats.MIRRORLIST_SCHEMA

Other functions that could conceivably be useful in general have zero context within the module:

  • util.paths_are_consistent_with_hash_prefixes(paths, path_hash_prefixes)
  • formats.PROJECT_CFG_SCHEMA

Things I'm less sure about:

  • formats.THRESHOLD_SCHEMA
  • exceptions.RoleAlreadyExistsError
  • exceptions.UnknownRoleError
  • formats.ROLENAME_SCHEMA

Add LICENSE file and tests to PyPI sdist

Downstream packagers like to use PyPI for (source) distribution files, and want those sources to contain a LICENSE file and tests, so we can distribute them along with any packages we ship to users and properly QA the package locally.

At least: add LICENSE and graft tests/ in MANIFEST.in, but preferably also add all requirements files and tox.ini/.travis.yml as they are good references for packagers to use, particularly as dependencies are often inconsistently/incorrectly declared in different places.
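For reference, a MANIFEST.in along these lines would cover the files mentioned above (a sketch; the exact file list is an assumption):

```
include LICENSE
include tox.ini
include .travis.yml
include requirements*.txt
graft tests
```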

Our case: I'm updating the existing FreeBSD port for tuf, and currently creating a port for securesystemslib for it to depend on.

Thanks!

import-style prevents vendoring

The import style used in securesystemslib,
import securesystemslib.module
is incompatible with vendoring: a vendoring tool would have to rewrite the import to from my.vendored.path import securesystemslib.module, which is invalid syntax.

Executable failure example:
https://gist.github.com/jku/22d5f08083c8b7b2f3e552f7ca6c301e

This is mostly a "heads-up": I'm exploring solutions in the TUF issue theupdateframework/python-tuf#1165 and will likely suggest whatever solution is found here.
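To illustrate with a stdlib analogue (os.path standing in for securesystemslib.module), only one of the two import styles survives the mechanical rewrite that vendoring tools perform:

```python
# The style securesystemslib uses today:
import os.path  # analogous to: import securesystemslib.module
# A vendoring tool would need "from my.vendored.path import os.path",
# which is a SyntaxError, so this style cannot be rewritten.

# The vendoring-friendly style rewrites cleanly, e.g. to
# "from my.vendored.path.os import path":
from os import path  # analogous to: from securesystemslib import module

assert path is os.path
```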

Update JSON canonicalization (backwards incompatible)

Description of issue or feature request:
Securesystemslib provides a custom json canonicalization function based on this Canonical JSON specification.

The specification seems outdated, or at least is not compatible with newer and more detailed specifications, such as the gibson042/canonicaljson-spec, for which a Go implementation exists.

The Notary Go implementation of TUF uses its own canonical JSON implementation which (IIUC) does not conform with any of above two specifications, but looks similar to the latter.

Current behavior:
securesystemslib uses an outdated JSON canonicalization specification.

Expected behavior:
I wonder if we should update securesystemlib's JSON canonicalization, or, given that there is no single accepted specification, switch to something that has wider cross-language support?

Note: I am well aware that this is a bigger request, as it would break backwards compatibility for metadata signatures used in TUF and in-toto, and would therefore require a TAP or ITE.
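For a sense of what the newer specs converge on, a minimal canonicalization can be sketched with the stdlib (this is NOT securesystemslib's encode_canonical(); the real specifications additionally pin down string escaping, number handling, and UTF-8 details):

```python
import json

def canonicalize(obj):
    """Minimal canonicalization sketch in the spirit of the newer specs:
    lexicographically sorted keys and no insignificant whitespace."""
    return json.dumps(obj, sort_keys=True, separators=(",", ":"))

# canonicalize({"b": 1, "a": 2}) -> '{"a":2,"b":1}'
```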

Add metadata container classes for signed metadata

Description of issue or feature request:
Both TUF and in-toto use a class based model for their metadata, with a similar signature wrapper class. The signature container (i.e. Metadata) and the base class for the signed contents (i.e. Signed) could be hosted in securesystemslib to re-use code and to better draw the line between in-toto/tuf concepts and pure cryptographic signing.

Current behavior:
No class-based metadata model.

Expected behavior:
Add class-based metadata model.

NOTE: I strongly suggest adopting the classes that were recently added to TUF (theupdateframework/python-tuf#1112), which were based on the ones from in-toto but don't use the 3rd-party attrs package and slightly streamline the design (most notably, see the recursive serialisation functions from_dict and to_dict).

Caveat:

A from_dict factory method on a generic Metadata class needs to know about the specific contained metadata, i.e. the subclass of Signed (e.g. tuf.Targets, tuf.Snapshot, in_toto.Link, in_toto.Layout, etc..), which will not be implemented in securesystemslib.

Possible solutions:

  • Provide a way to register a signed._type-to-subclass-of-Signed mapping on the Metadata class
  • Sub-class Metadata in tuf and in-toto and implement the factory there, e.g. class tuf.Metadata(securesystemslib.Metadata)
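The first solution can be sketched as a registry on the base class (all names here are illustrative, not the classes from theupdateframework/python-tuf#1112):

```python
class Signed:
    """Base class for the signed portion of metadata."""
    _type_registry = {}

    @classmethod
    def register(cls, type_name):
        """Class decorator: map a '_type' value to a Signed subclass."""
        def decorator(subclass):
            cls._type_registry[type_name] = subclass
            return subclass
        return decorator


class Metadata:
    """Generic signature wrapper around a Signed payload."""

    def __init__(self, signed, signatures):
        self.signed = signed
        self.signatures = signatures

    @classmethod
    def from_dict(cls, data):
        """Dispatch to the registered Signed subclass via signed['_type']."""
        signed_dict = data["signed"]
        subclass = Signed._type_registry[signed_dict["_type"]]
        return cls(subclass.from_dict(signed_dict), data["signatures"])


@Signed.register("snapshot")
class Snapshot(Signed):
    """Toy example of a concrete Signed subclass (tuf-flavored)."""

    def __init__(self, version):
        self.version = version

    @classmethod
    def from_dict(cls, signed_dict):
        return cls(signed_dict["version"])
```

With this, tuf and in-toto register their own subclasses and the generic Metadata.from_dict never needs to know about them.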

Python 3.3 and 3.5 not tested under Travis

Description of issue or feature request:
Pull request #59 temporarily excluded testing under Python versions 3.3 and 3.5 due to issues with Travis. Although testing under Tox does cover all of our supported Python versions, it is best if continuous integration covers all supported environments. As recommended in #59 (comment), we should open a ticket to keep track of this issue.

Current behavior:
Travis runs our unit tests under Python versions 2.7, 3.4, and 3.6.

Expected behavior:
Travis should run our unit tests under Python versions 2.7, 3.3, 3.4, 3.5, and 3.6.

Fix coveralls integration

Description of issue or feature request:
For each Travis CI build we submit coverage results to sslib's coveralls.io page (see after_success in .travis.yml). This stopped working a few builds ago.

Current behavior:
From the log of a recent Travis build:

0.18s$ coveralls
Submitting coverage to coveralls.io...
Traceback (most recent call last):
  File "/home/travis/virtualenv/python3.5.6/lib/python3.5/site-packages/coverage/sqldata.py", line 1046, in execute
    return self.con.execute(sql, parameters)
sqlite3.DatabaseError: file is encrypted or is not a database
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
  File "/home/travis/virtualenv/python3.5.6/bin/coveralls", line 11, in <module>
    sys.exit(main())
  File "/home/travis/virtualenv/python3.5.6/lib/python3.5/site-packages/coveralls/cli.py", line 77, in main
    result = coveralls.wear()
  File "/home/travis/virtualenv/python3.5.6/lib/python3.5/site-packages/coveralls/api.py", line 176, in wear
    json_string = self.create_report()
  File "/home/travis/virtualenv/python3.5.6/lib/python3.5/site-packages/coveralls/api.py", line 192, in create_report
    data = self.create_data()
  File "/home/travis/virtualenv/python3.5.6/lib/python3.5/site-packages/coveralls/api.py", line 246, in create_data
    self._data = {'source_files': self.get_coverage()}
  File "/home/travis/virtualenv/python3.5.6/lib/python3.5/site-packages/coveralls/api.py", line 261, in get_coverage
    workman.load()
  File "/home/travis/virtualenv/python3.5.6/lib/python3.5/site-packages/coverage/control.py", line 412, in load
    self._data.read()
  File "/home/travis/virtualenv/python3.5.6/lib/python3.5/site-packages/coverage/sqldata.py", line 742, in read
    with self._connect():       # TODO: doesn't look right
  File "/home/travis/virtualenv/python3.5.6/lib/python3.5/site-packages/coverage/sqldata.py", line 297, in _connect
    self._open_db()
  File "/home/travis/virtualenv/python3.5.6/lib/python3.5/site-packages/coverage/sqldata.py", line 265, in _open_db
    self._read_db()
  File "/home/travis/virtualenv/python3.5.6/lib/python3.5/site-packages/coverage/sqldata.py", line 269, in _read_db
    with self._dbs[get_thread_id()] as db:
  File "/home/travis/virtualenv/python3.5.6/lib/python3.5/site-packages/coverage/sqldata.py", line 1024, in __enter__
    self._connect()
  File "/home/travis/virtualenv/python3.5.6/lib/python3.5/site-packages/coverage/sqldata.py", line 1012, in _connect
    self.execute("pragma journal_mode=off").close()
  File "/home/travis/virtualenv/python3.5.6/lib/python3.5/site-packages/coverage/sqldata.py", line 1063, in execute
    raise CoverageException("Couldn't use data file {!r}: {}".format(self.filename, msg))
coverage.misc.CoverageException: Couldn't use data file '/home/travis/build/secure-systems-lab/securesystemslib/.coverage': Looks like a coverage 4.x data file. Are you mixing versions of coverage?

Expected behavior:
Submit coverage results to coveralls.
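The error message suggests the .coverage data file was written by coverage 4.x while a coverage 5.x library tried to read it. One likely remedy (an assumption, not a confirmed fix) is to keep the producer and reader on the same major version, e.g. in .travis.yml:

```yaml
# .travis.yml excerpt (hypothetical): pin coverage so the data file
# format matches what the coveralls client reads.
install:
  - pip install 'coverage<5' coveralls
```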

Add missing git tag for 0.11.3 release

Using the GitHub sources to grab the package's tests until they ship in the PyPI sdist (see #166), I noticed that the git tag for the latest 0.11.3 release is missing.

This is a request to add that git tag for the commit hash which corresponds to the currently released 0.11.3 tarball on PyPI.

Our FreeBSD ports, by default, use the canonical version (0.11.3) as an argument to GitHub's API to fetch a specific version of the sources. In the absence of a matching git tag, we need to use a commit hash, which can be unreliable (not exactly the specific version uploaded to PyPI).

Thanks!

key format inconsistency

Description of issue or feature request:

Based on this PR (#250) we had a discussion around a default or custom key format. The PR surfaced a few inconsistencies regarding public key generation in in-toto-keygen: we generate a key id for private keys, but no key id is written for public keys. This led to the question of whether we want to support key ids in our key format.

Current behavior:

Right now TUF and in-toto behave differently regarding key formats:

  • Neither in-toto nor TUF specs mention the keyid field when defining key object formats.
  • Neither in-toto nor TUF specs show the keyid field in metadata examples.
  • The TUF reference implementation does not include the keyid field in public key objects in root or targets metadata on disk.
  • The in-toto reference implementation does include the keyid field in public key objects in layout metadata on disk.
  • The in-toto reference implementation uses schema checks on public key objects (held in memory) that require the keyid, and checks that allow keyid:
    GPG_PUBKEY_SCHEMA -- requires keyid
    ANYKEY_SCHEMA -- requires keyid
    PUBLIC_KEY_SCHEMA -- does not require keyid, but allows it because unrecognized keys are allowed
    KEY_SCHEMA -- does not require keyid, but allows it because unrecognized keys are allowed.
  • The TUF reference implementation uses the ANYKEY_SCHEMA, which does require keyids, in repository tool and keydb, among others.
  • Both TUF and in-toto reference implementations seem to write ed25519 and ecdsa public keys to disk without keyid.

Also we seem to differ between different stages (in-memory, on-disk, etc):

  • in memory -- with keyid
  • on disk inside layout (in-toto) or root/targets (tuf) metadata -- with keyid in in-toto, w/o keyid in tuf
  • on disk as custom pub key serialization format -- w/o keyid

This comment from @lukpueh might be important for this PR, too (#250 (comment)):

If we make a potentially breaking change, we might want to consider switching to a standard format altogether (we do use PEM for RSA keys on disk), or at least revise the included fields, e.g. why do ed25519/ecdsa private keys on disk include a keyid field and public keys don't, or why do both not include the keyid_hash_algorithms field, etc.... (see theupdateframework/tuf#848 for discussion about the latter field).

Right now, securesystemslib and its dependents do know how to handle their keys. We can load public keys from disk (without keyid), keep them in memory (with keyid) and add them to in-toto layout metadata (with keyid), or in tuf root/targets metadata (without keyid).

It would be good to make it very clear when we use which format and why. The various securesystemslib.formats.*KEY_SCHEMAs are not particularly helpful for that. Some of them require a keyid and others don't, which (unexpectedly) makes it optional (see #183 (comment)).

Moreover, the way we currently use the "serialization" function format_keyval_to_metadata looks like a lot of unnecessary work, e.g.:

# simplified and slightly exaggerated for emphasis
a = key["a"]
b = key["b"]
c = key["c"]
metadata = format_keyval_to_metadata(a, b, c)
# metadata is {"a": "...", "b": "...", "c": "...", "d": "..."}

While working on the Go implementation Lukas found an issue in the PKCS key loading.
In the Go implementation we are able to load a PKCS1 private key; however, while extracting the public key out of the PKCS1 key we store the public key as PKCS8. Therefore we end up with two different PEM representations in our in-memory key format. securesystemslib seems to do it that way, too. We should really have a look at this.

Should we drop PKCS1 support completely for private keys?

Expected behavior:

to be discussed

Signature verification ignores the Key ID in the signature

Summary

keys.verify_signature takes a key and a signature (along with the data over which the signature may have been signed). Signature verification determines only whether or not the signature value (sig) can have been produced by the provided key. It does not check whether or not the signature itself reports that it was made using that key, so the keyid listed in the signature provided need not match the key in order for the signature to be deemed valid. Correct behavior, I would say, is to deem the signature invalid if the key id it lists is not the key id that is expected, without bothering to determine if the signature value ('sig') is correct.

Example

Suppose I have two keys, k1 and k2. I use k1 to sign data, and then change the signature dictionary to claim that I used k2 by listing k2's key id, even though I signed with k1. I send you the signature. You want to know if it was generated by k1. You run keys.verify_signature and everything checks out, even though the signature incorrectly claims to be a signature by k2.

Security Implications

Note that this can mean that an attacker could have a signature deemed valid without knowing what the right keyid is, and that, in upstream code, a single signature can be deemed valid if it could have been generated by any acceptable key, without the attacker having to choose one key or attempt multiple times listing different key ids. That problem depends on the behavior of upstream code, though, which can be written to eliminate that possibility.

While this may not be a large security issue -- the signature value itself must still match a signature produced by the provided key over the provided data -- it's confusing, may break assumptions in upstream code, and may have some security impact upstream (where you can use the same signature in the same signature verification request to try to match any acceptable key's signature without choosing a single key).

def verify_signature(key_dict, signature, data):
"""
<Purpose>
Determine whether the private key belonging to 'key_dict' produced
'signature'. verify_signature() will use the public key found in
'key_dict', the 'sig' objects contained in 'signature', and 'data' to
complete the verification.
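The proposed behavior could be sketched as a wrapper that rejects a keyid mismatch before doing any cryptographic work (verify_signature_strict and verify_fn are hypothetical names, not securesystemslib API):

```python
def verify_signature_strict(key_dict, signature, data, verify_fn):
    """Deem the signature invalid if its reported keyid does not match
    the key we were asked to verify against, without inspecting 'sig';
    only on a match fall through to the actual cryptographic check."""
    if signature.get("keyid") != key_dict.get("keyid"):
        return False  # mismatching keyid: invalid, no crypto performed
    return verify_fn(key_dict, signature, data)
```

This closes the hole from the example above: a signature made with k1 but claiming k2's keyid is rejected when verified against k1.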

Doc improvement for formats._canonical_string_encoder / encode_canonical()

@lukpueh has correctly pointed out that formats._canonical_string_encoder()'s docstring is bad. A bad docstring is worse than none, and a clear one is better still: it should be substantially more specific about the behavior the function intends to provide and why. It seems that this canonicalization policy is part of the motivation.

The docstring for formats.encode_canonical() should also be adjusted slightly to explain this policy.

Enable password confirmation when generating keys

Description of issue or feature request:

The generate_and_write_XXX_keypair() functions should prompt for a password confirmation. The confirm flag is incorrectly set to False; see:

password = _get_password('Enter a password for the encrypted RSA'
' key (' + Fore.RED + filepath + Fore.RESET + '): ',
confirm=False)

Current behavior:

>>> generate_and_write_rsa_keypair("keystore/root_key2")
Enter a password for the RSA key:

Expected behavior:

>>> generate_and_write_rsa_keypair("keystore/root_key2")
Enter a password for the RSA key:
Confirm:
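A self-contained sketch of the confirming prompt, using the stdlib getpass (the real _get_password signature is shown in the excerpt above; this is just an illustration of the confirm loop):

```python
import getpass

def get_password_with_confirmation(prompt):
    """Prompt twice and re-prompt until both entries match."""
    while True:
        password = getpass.getpass(prompt)
        if getpass.getpass("Confirm: ") == password:
            return password
        print("Passwords do not match, try again.")
```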

Fine-tune code coverage measurement for optional builds

Description of issue or feature request:
@joshuagl has recently overhauled sslib's extra dependency handling. This included the configuration of different tests for different sets of installed dependencies (see #200 and #206).

In the course of a code review we discussed the possibility of different coverage profiles that accurately report the tested lines depending on the relevant code branches. See #200 (comment) ff. for some ideas.

Current behavior:
Code branches that are executed due to a missing optional dependency are generally excluded from code coverage.

Expected behavior:
Accurately report the tested lines depending on the relevant code branches for a given build.

Better distinguish between keytype and scheme for ECDSA keys

Description of issue or feature request:

The key format provided by securesystemslib defines a keytype field and a scheme field. Currently, there is no distinction between these fields for ECDSA keys.

Current behavior:

ECDSA keys have the following format:

{
    "keytype" : "ecdsa-sha2-nistp256",
    "scheme" : "ecdsa-sha2-nistp256",
    "keyval" : {"public" : PUBLIC}
}

As noted in 761aded, ecdsa-sha2-nistp384 is also supported by securesystemslib.

Expected behavior:

Update securesystemslib to use just ecdsa as keytype while retaining the current scheme formats. This also aligns with the key format for RSA keys where we have:

{
    "keytype" : "rsa",
    "scheme" : "rsassa-pss-sha256",
    "keyval" : {"public" : PUBLIC}
}

put rewinds (or seeks) the given fileobj to its beginning before attempting to store it

Description of issue or feature request:

Per the title, securesystemslib.storage.FilesystemBackend.put() seeks to the beginning of the file-object in an attempt to ensure the entire file contents are copied. This has the side-effect of modifying the position within the file-object.

See #240

Current behavior:

In securesystemslib.storage.FilesystemBackend.put() the file-like object is read from the beginning, not its current offset (if any).

This is a side-effect of the implementation using shutil.copyfileobj(), which has this same behaviour and was persisted in the initial implementation of FilesystemBackend to better match the currently expected behaviour in tuf.

Expected behavior:

It's not clear to me what the correct behaviour is:

  • as now, but storing the file position before the seek and restoring it after the copy
  • copying from the current position within the file and relying on the caller to ensure the file position is appropriate before calling put()
  • perhaps something else?
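The first option can be sketched as follows (a hypothetical standalone helper, not the FilesystemBackend method itself, which also takes a storage path abstraction):

```python
import shutil

def put(fileobj, destination_path):
    """Copy the whole file from its beginning, as now, but restore the
    caller's position afterwards so the seek is no longer an observable
    side effect."""
    original_position = fileobj.tell()
    try:
        fileobj.seek(0)
        with open(destination_path, "wb") as destination:
            shutil.copyfileobj(fileobj, destination)
    finally:
        fileobj.seek(original_position)
```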

`filepath` arg to `generate_and_write_*_keypair` may already exist

Description of issue or feature request:

As pointed out by @adityasaky in #169 (review), interface.generate_and_write_spx_keypair does not check whether its filepath argument already exists. In the case of an existing directory we get an error (IOError on py2, IsADirectoryError on py3); in the case of a file we silently overwrite it. Moreover, since the function writes two files, i.e. <filepath> and <filepath>.pub, it might succeed in writing one but not the other. The same applies to the rest of the generate_and_write_*_keypair functions in interface.

Current behavior:
The filepath argument in interface.generate_and_write_*_keypair is not checked for existence.

Expected behavior:
Check if <filepath> or <filepath>.pub already exist and act accordingly (unsurprisingly).

  • @adityasaky suggests to warn and/or create the keys under <filepath>/<keyid> and <filepath>/<keyid>.pub. However, these could already exist too.

  • I suggest to just fail if <filepath> or <filepath>.pub exist.
    OTOH, and given that there are many other reasons why the write operations might fail, we could just leave it as it is now, and maybe update the function docstrings (EAFP-style).
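The fail-early option amounts to a pre-flight check like this (ensure_keypair_paths_free is a hypothetical helper name):

```python
import os

def ensure_keypair_paths_free(filepath):
    """Refuse to write if either <filepath> or <filepath>.pub already
    exists, whether as a file or a directory."""
    for path in (filepath, filepath + ".pub"):
        if os.path.exists(path):
            raise FileExistsError("refusing to overwrite existing path: " + path)
```

Note that this check is inherently racy (the path could appear between check and write), which is part of the argument for the EAFP alternative.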

Revise extra dependency handling

[Updated on Jan 23, 2020]

Description of issue or feature request:

securesystemslib lists some dependencies that require C code (cryptography requires openssl, pynacl requires libsodium) as optional to allow for a pure-Python installation. The runtime handling of missing optional dependencies should be revised.

Current behavior:
cryptography and pynacl are listed as optional (extra) dependencies, but securesystemslib does not (consistently) fare well if installed without them.

Expected behavior:

  • Public facing modules (e.g. interface.py and keys.py) must be importable, even if the optional dependencies are not installed. Fixed with #200

  • Each public-facing function should always be callable and present meaningful user feedback if an optional dependency required for that function is not installed. Fixed with #200

  • Also address or keep in mind recently merged or pending functionality, that has non-pure Python dependencies (#174, #169, #170).

  • Optional: colorama was made a strict dependency in #178 to quickfix #155. @SantiagoTorres, consider making it optional again (with respect to the required adaptations outlined above). Fixed with #200

  • It would be nice to fine-tune code coverage measurement (see #200 (comment) ff.)
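The import-guard pattern for the second point can be sketched like this (a common Python idiom, not securesystemslib's actual code; function names are illustrative):

```python
# Guard the optional import at module load time so the module itself
# stays importable without the dependency.
try:
    import nacl.signing  # pynacl, an optional dependency
    HAVE_PYNACL = True
except ImportError:
    HAVE_PYNACL = False

def create_ed25519_signature(private_key, data):
    """Fail with a meaningful message if the optional dependency that
    this particular function needs is missing."""
    if not HAVE_PYNACL:
        raise RuntimeError(
            "ed25519 signing requires the optional 'pynacl' dependency; "
            "install it, e.g.: pip install pynacl")
    ...  # actual signing code would go here
```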

Don't use CLI gpg

Shelling out to CLI tools can be dangerous, and it is neither a clean nor a powerful approach, because CLIs are quite limited.

The better approach is to use some libs directly.

gpgme (the Debian package python3-gpg, imported as gpg) is claimed to be LGPL, which is a bit surprising since gpg itself is GPL.

There is also https://github.com/SecurityInnovation/PGPy, a pure-Python implementation that appears to be abandoned.

BouncyCastle could also be used via JPype.

Create private key files with umask 177

Description of issue or feature request:
Transfer of theupdateframework/python-tuf#279.

According to common practice (see e.g. ssh utilities) private key files should be created with read and write permissions for the user only (umask 177). Securesystemslib's interface.generate_and_write_{rsa, ed25519, ecdsa}_keypair functions should adopt that behavior.

Current behavior:
Private key files are created with the OS' default umask.

Expected behavior:
Private key files are created with umask 177.
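The same effect can be had per file without touching the process-wide umask, by passing an explicit mode to os.open (a sketch; write_private_key_bytes is a hypothetical helper name):

```python
import os

def write_private_key_bytes(path, data):
    """Create the key file with mode 0o600 (owner read/write only, the
    effect of umask 177) and refuse to overwrite an existing file."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o600)
    with os.fdopen(fd, "wb") as f:
        f.write(data)
```

Note that os.open's mode argument is itself still filtered by the umask, but since 0o600 carries no group/other bits the result can only ever be more restrictive.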

Support GPG and SSH keys

securesystemslib provides an API to import and export public and private keys in PEM (RSA, ECDSA, also cf. #54) or in a proprietary format (Ed25519).

It would be convenient to extend the API so that users can load, e.g., their existing GPG or SSH keys. Note that securesystemslib does not necessarily have to provide an API to generate those keys. As @aaaaalbert has mentioned, users might even prefer to use their own well-known toolchain.

Dependency monitoring is broken

Description of issue or feature request:
securesystemslib configures pyup to perform dependency monitoring.

Current behavior:

  • The pyup badge on the README has a status unknown and its link 404s.
  • The last pyup PR was in April '18 (see #143).

Expected behavior:
Configure dependency monitoring to regularly and automatically run securesystemslib's test suite against the latest versions of its dependencies.

See tuf for a working pyup configuration, or in-toto which revises tuf's dependency monitoring and uses dependabot instead of pyup. (in-toto/in-toto#294)

Remove `keys.format_metadata_to_key()`'s dependency on `settings.HASH_ALGORITHMS`

Description of issue or feature request:

It should be possible to control the hash algorithms used for generating keyids in securesystemslib.keys.format_metadata_to_key() without having to change package-level settings in securesystemslib.settings.

Related to #219

Current behavior:

The tuf reference implementation has code like the following, i.e. repository_tool.py#L3062

hash_algorithms = securesystemslib.settings.HASH_ALGORITHMS
securesystemslib.settings.HASH_ALGORITHMS = key_metadata['keyid_hash_algorithms']
key_object, keyids = securesystemslib.keys.format_metadata_to_key(key_metadata)
securesystemslib.settings.HASH_ALGORITHMS = hash_algorithms

Expected behavior:

format_metadata_to_key can be called without having to change securesystemslib.settings.HASH_ALGORITHMS, something like:

key_object, keyids = securesystemslib.keys.format_metadata_to_key(key_metadata, hash_algorithms)

Inconsistent passphrase treatment on key generation / loading

Bug

First, if you use repository_tool and try generating an rsa key (rt.generate_and_write_rsa_keypair) and don't provide a passphrase (by just hitting enter at the prompts), repository_tool generates an unencrypted PEM file (always using pycrypto -- see #43).

It makes some sense to support unencrypted key files, so this is fine. Two problems result from the code, though, aside from those mentioned in #43:

  • There is no way provided to instruct repository_tool to tell pyca_crypto to try to load an unencrypted PEM file, so the keys you just generated (with pycrypto) become unusable.
  • There is no way provided to instruct repository_tool to tell pyca_crypto to generate an unencrypted PEM file.

Analysis and Steps

Three options are desired for repository_tool.generate_and_write_rsa_keypair and repository_tool.import_rsa_privatekey_from_file, and their interaction is complicating things:

  • no passphrase; don't encrypt the PEM file / load an unencrypted PEM file
  • here's a passphrase to use; encrypt the created PEM file or load the encrypted PEM file
  • prompt me for a passphrase; encrypt the created PEM file or load the encrypted PEM file

It must be possible to do each (assuming we support unencrypted PEM files) and clear to the user how to do each. Further, it must be consistent behaviorally across calls to pyca_crypto and pycrypto.

Lack of colorama hard dependency

Description of issue or feature request:
The current release blows up because colorama is not a dependency. Although it should get pulled in by either cryptography or pynacl, nothing tells the user that this is the issue when using securesystemslib.

For example, this is the test log from in-toto using the latest securesystemslib (given that it doesn't hard depend on either library)

======================================================================
ERROR: tests.test_runlib (unittest.loader._FailedTest)
----------------------------------------------------------------------
ImportError: Failed to import test module: tests.test_runlib
Traceback (most recent call last):
  File "/usr/lib64/python3.6/unittest/loader.py", line 428, in _find_test_path
    module = self._get_module_from_name(name)
  File "/usr/lib64/python3.6/unittest/loader.py", line 369, in _get_module_from_name
    __import__(name)
  File "/home/santiago/Documents/personal/programas/in-toto/in-toto/tests/test_runlib.py", line 29, in <module>
    from in_toto.models.metadata import Metablock
  File "/home/santiago/Documents/personal/programas/in-toto/in-toto/in_toto/models/metadata.py", line 34, in <module>
    from in_toto.models.layout import Layout
  File "/home/santiago/Documents/personal/programas/in-toto/in-toto/in_toto/models/layout.py", line 49, in <module>
    import securesystemslib.interface
  File "/home/santiago/Documents/personal/programas/in-toto/in-toto/.tox/py36/lib/python3.6/site-packages/securesystemslib/interface.py", line 51, in <module>
    from colorama import Fore
ModuleNotFoundError: No module named 'colorama'

Needless to say, it's rather dangerous to assume that cryptography or pynacl will depend on colorama forever. Worse, we may add support for other libraries which may or may not pull colorama in.

Current behavior:
Things break unless you explicitly:

  1. Install a dependency that pulls colorama, such as pynacl or cryptography.
  2. Install colorama explicitly to compensate for this lack of hard-dependency

Expected behavior:
The library installs itself with everything it needs and doesn't make any assumptions of the packages that will be pulled and their upstream dependencies.

Remove securesystemslib.settings

NOTE: this is a meta-bug/milestone.

Description of issue or feature request:

securesystemslib originated as part of the tuf reference implementation, and this legacy results in awkward coupling between securesystemslib and its users.

The settings module is a prime example: there are places in securesystemslib where modifying the behaviour of a function requires changing values in securesystemslib.settings, which is non-intuitive, particularly as many of the settings are only documented as comments in that file.

This meta-bug tracks the goal/milestone of removing securesystemslib.settings and should be linked to smaller, more tractable and more specific issues.

Current behavior:

Users have to modify securesystemslib internal settings, rather than being able to pass all required information as arguments to library functions.

Expected behavior:

Users are not required to modify internal library settings and can pass all required configuration values as arguments to the functions they are using.

Capabilities of import_{ed25519,ecdsa}_{private,public}key_from_file don't align with name

Description of issue or feature request:

securesystemslib has a custom json on-disk format for ed25519 and ecdsa private and public keys. The generate_and_write_{ed25519, ecdsa}_keypair(...) and import_{ed25519, ecdsa}_{private, public}key_from_file functions serialize and deserialize keys to and from this format respectively, and according to their names.

However, as discovered while refactoring the related unit tests in #279, the capabilities of these functions exceed the scope that their names suggest.

Current behavior:

  1. import_ed25519_publickey_from_file can import ed25519 private keys (if unencrypted)
  2. import_ed25519_privatekey_from_file can also import public keys
  3. import_ecdsa_publickey_from_file can import ed25519 public keys

Expected behavior:
Fail if the imported key is not what the name of the function suggests.

Also see #251 and secure-systems-lab/dsse#1 for long-term plans of dropping the custom key format.

Split up util.get_file_details()

Description of issue or feature request:

util.get_file_details() returns the length and hashes of the specified file. This makes a lot of sense for generating information about TUF targets, where the length and hashes of each target are required, but is potentially too costly a function when generating information about TUF metadata files for the snapshot role, where lengths and hashes are optional. Snapshot metadata may only want hashes, or only want lengths, and we don't want to force an adopter to implement their own subset of the functionality in get_file_details().

Therefore I suggest we factor out the two features of get_file_details() into two new utility functions -- one for length and one for hashes -- and have the existing get_file_details() wrap them.

Current behavior:

Callers wanting to compute length or hashes, not both, need to either accept the overhead of unwanted computations in get_file_details() (only a significant problem for hashes), or implement their own equivalent functionality.

Expected behavior:

A separate API is provided for at least hash computation, ideally hash and length computation, with get_file_details() conveniently wrapping both.
get_file_details(filepath, hash_algorithms=['sha256'], storage_backend=None)
get_file_length(file_path, storage_backend=None)
get_file_hashes(file_path, hash_algorithms=['sha256'], storage_backend=None)
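
The proposed split could look roughly like this (storage_backend omitted for brevity; a sketch, not the actual securesystemslib implementation):

```python
import hashlib
import os

def get_file_length(file_path):
    """Return only the file size, without touching any hash machinery."""
    return os.path.getsize(file_path)

def get_file_hashes(file_path, hash_algorithms=("sha256",)):
    """Return only the hex digests, computed in a single pass over the file."""
    digests = {alg: hashlib.new(alg) for alg in hash_algorithms}
    with open(file_path, "rb") as f:
        for chunk in iter(lambda: f.read(4096), b""):
            for d in digests.values():
                d.update(chunk)
    return {alg: d.hexdigest() for alg, d in digests.items()}

def get_file_details(file_path, hash_algorithms=("sha256",)):
    """Convenience wrapper combining both, as today."""
    return get_file_length(file_path), get_file_hashes(file_path, hash_algorithms)
```

Snapshot metadata generation could then call get_file_length() alone and skip the hashing cost entirely.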

Remove references to TUF

Description of issue or feature request:

securesystemslib was not always a library. Many of its modules and functions previously lived in the TUF repository. Unfortunately, there are still references to TUF in some of the securesystemslib comments.

Current behavior:
There are references to TUF in comments and there exist schemata that are only relevant to TUF. For example:

A central location for all format-related checking of TUF objects.
Note: 'formats.py' depends heavily on 'schema.py', so the 'schema.py'
module should be read and understood before tackling this module.
'formats.py' can be broken down into three sections. (1) Schemas and object
matching. (2) Classes that represent Role Metadata and help produce
correctly formatted files. (3) Functions that help produce or verify TUF

# Version information specified in "snapshot.json" for each role available on
# the TUF repository. The 'FILEINFO_SCHEMA' object was previously listed in
# the snapshot role, but was switched to this object format to reduce the
# amount of metadata that needs to be downloaded. Listing version numbers in
# "snapshot.json" also prevents rollback attacks for roles that clients have
# not downloaded.
VERSIONINFO_SCHEMA = SCHEMA.Object(
object_name = 'VERSIONINFO_SCHEMA',
version = METADATAVERSION_SCHEMA)

Expected behavior:
TUF references in securesystemslib should be removed and TUF-related schemata relocated to the TUF repository.

Revise and automate docs with sphinx and host on readthedocs

Description of issue or feature request:
Revise docstrings of API functions regarding contents and format (use Google Style docstring as suggested in secure-systems-lab/code-style-guidelines#20), akin to what in-toto did in in-toto/in-toto#369.

Needs coordination with #270

Current behavior:

  • No clear API
  • Docstrings use custom secure systems lab format

Expected behavior:

  • Clearly defined public API with Google Style docstrings
  • API docs auto-generated with sphinx and hosted on readthedocs

Load non-encrypted private keys and toggle password prompt

Description of issue or feature request:
The interface module provides functions to import RSA, ed25519 and ECDSA private keys from files.

The functions take an optional password argument and if no password is passed, the caller is prompted for a password. In either case the received password is used to decrypt the key. Hence, there is no way to load a non-password-encrypted private key without getting a prompt, or to fail if the key is encrypted but no password is passed, rather than prompting for a password.

This ticket proposes the addition of an optional boolean prompt argument to the functions:
import_rsa_privatekey_from_file(filepath, password=None, scheme='rsassa-pss-sha256')
import_ed25519_privatekey_from_file(filepath, password=None)
import_ecdsa_privatekey_from_file(filepath, password=None)

Current behavior:
If a password is passed, that password is used to decrypt the key.
If no password is passed, the user will be prompted for a password, which is used to decrypt the key.
Fail if the key can't be decrypted.

Expected behavior:
If a password is passed and prompt is False, the passed password is used to decrypt the key.
If prompt is True, the caller is prompted for a password, which is used to decrypt the key.
If no password is passed and prompt is False, the key will be loaded as non-encrypted key.
Fail if no password is passed and the key is encrypted.
Fail if a password is passed and the key can't be decrypted.
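
The password/prompt resolution boils down to a small decision table; a sketch of the proposed logic (function and argument names hypothetical; a real implementation would prompt via getpass, and the fail conditions surface when the crypto backend attempts decryption):

```python
def resolve_decryption_password(password=None, prompt=False,
                                ask=lambda: input("Enter password: ")):
    """Return the password to decrypt with, or None to load the key
    as a non-encrypted key. 'ask' stands in for the interactive prompt."""
    if prompt:
        return ask()   # prompt regardless of the password argument
    return password    # may be None: treat the key as non-encrypted
```

Passing the returned value (or None) to the key-loading backend then produces the four pass/fail outcomes listed above.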

Revise schema and formats facility

Description of issue or feature request:
Here are several loosely ordered observations in regards to securesystemslib's schema/formats facility:

Current behavior:

  • Some schemas sound more specific than they are, e.g. RELPATH_SCHEMA == PATH_SCHEMA == AnyString. This is misleading, when assessing a function's capability of sanitizing inputs.

  • Some schemas are an odd replacement for constants, e.g. schemas that define hard-coded strings, like "rsassa-pss-sha256", and are then used to check the same strings, which have been hardcoded into function default arguments.

  • Schema validation seems generally overused. In TUF nearly every function runs check_match for each argument. IMHO argument sanitizing is mostly important in public-facing interfaces. Programming errors, on the other hand, should be caught through extensive testing and code review.
    Using it everywhere makes the code bloated, also because there are a lot of generic comments describing the checks, and the obligatory, equally generic "FormatError, if <arg> is not properly formatted" blocks in the <Exceptions> section of docstrings.

  • Schema checking sometimes makes execution branches unreachable (also see secure-systems-lab/code-style-guidelines#18):

    X_OR_Y_SCHEMA.check_match(arg)
    
    if arg == X:
      # ...
    elif arg == Y:
      # ...
    else:
      raise WillNeverBeRaisedError()

    And I've even seen cases where WillNeverBeRaisedError is also listed in the docstring as an exception that might be raised (documentation rot)

  • The error messages from checking schemas with the check_match method are often not helpful, because they usually don't show the value of the checked variable or lack context.

Expected behavior:

  • Review existing schemas for their validity/strictness, especially when chained.

  • Only use in public facing interfaces (open for discussion).
    This would also require a clearer definition of what functions should be public interfaces, which btw. TUF/in-toto integrators would greatly benefit from.

  • At least don't blindly add <SCHEMA>.check_match(<arg>) to every function. Coordinate with the rest of the function, and its purpose.

  • Make sure the error message is helpful, e.g. by:

    • supporting an error message override argument in check_match, or by
    • using matches plus raise FormatError (or maybe even just ValueError) with a custom message instead of check_match.
  • Disambiguate schemas and constants.
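
The error-message point can be illustrated with a stand-in schema (names hypothetical, not the actual securesystemslib schema classes): matches() plus a hand-written FormatError yields a message that carries both the offending value and its type, unlike a bare check_match():

```python
class FormatError(Exception):
    pass

class HexDigestSchema:
    """Stand-in exposing the matches/check_match interface described above."""
    def matches(self, obj):
        try:
            int(obj, 16)
        except (TypeError, ValueError):
            return False
        return isinstance(obj, str)

    def check_match(self, obj):
        if not self.matches(obj):
            # Typical message today: no value, no context.
            raise FormatError("does not match HEXDIGEST_SCHEMA")

HEXDIGEST_SCHEMA = HexDigestSchema()

def store_digest(digest):
    # matches() plus a custom message names the value and its type:
    if not HEXDIGEST_SCHEMA.matches(digest):
        raise FormatError(
            f"digest must be a hex string, got {type(digest).__name__}: {digest!r}")
```

An error-message override argument on check_match would achieve the same with less boilerplate at each call site.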

Create public high-level interface for securesystemslib.keys

Currently securesystemslib.keys provides PEM import functions for public and private ecdsa and rsa keys.
Let's also support ed25519 and add JSON import functions for all three key types.

UPDATE:
Below discussion has moved the focus of this issue to add/factor out securesystemslib high-level functions from the lower-level securesystemslib.keys module, i.e. what goes into securesystemslib.keys and what goes into securesystemslib.interface.

Handle GPG revocation signatures

(transferred from in-toto/in-toto#263)

Description of issue or feature request:
in-toto/in-toto#257 adds gpg self-signature verification support, for signatures types 0x10-0x13 (certifications) and 0x18 (subkey binding signature). The added infrastructure may be used to also consider other signature types such as,

  • 0x20: Key revocation signature
  • 0x28: Subkey revocation signature
  • 0x30: Certification revocation signature

See RFC4880 5.2.1. Signature Types and 12.1. Key Structures for details about the signatures types and where in the key bundle they occur.

Current behavior:
Revocation type gpg signatures are ignored.

Expected behavior:
Handle revocation type gpg signatures, e.g. ignore revoked keys or certificates and/or warn user about them.
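
As a sketch of where such handling could start, the revocation types can be filtered out of a parsed signature list (the packet representation with a 'sig_type' field is hypothetical; the type constants are from RFC 4880, section 5.2.1):

```python
# Signature type constants from RFC 4880, section 5.2.1:
KEY_REVOCATION_SIG = 0x20
SUBKEY_REVOCATION_SIG = 0x28
CERT_REVOCATION_SIG = 0x30
REVOCATION_SIG_TYPES = {
    KEY_REVOCATION_SIG, SUBKEY_REVOCATION_SIG, CERT_REVOCATION_SIG}

def partition_signatures(signature_packets):
    """Split parsed signature packets into regular and revocation
    signatures, so callers can skip or warn about revoked material."""
    regular, revocations = [], []
    for sig in signature_packets:
        if sig["sig_type"] in REVOCATION_SIG_TYPES:
            revocations.append(sig)
        else:
            regular.append(sig)
    return regular, revocations
```

Actual verification of the revocation signatures themselves would reuse the infrastructure added in in-toto/in-toto#257.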

Rewrite the parsers in Kaitai Struct

Please fill in the fields below to submit an issue or feature request. The
more information that is provided, the better.

Description of issue or feature request:
Currently handcoded parsers are used. This approach is not very flexible and complicates both understanding and modification.

There exists a DSL called Kaitai Struct for describing binary parsers. It allows describing a structure declaratively and then compiling the description into parsers in multiple programming languages.

It may make sense to replace handcoded parsers with generated ones.

BTW, https://github.com/kaitai-io/kaitai_struct_formats/blob/master/security/openpgp_message.ksy may be helpful.

Module is logging warnings on import

Hello 👋

Description of issue or feature request:

When securesystemslib is installed without the colors extra, warnings/error logs are shown.

Current behavior:

On Python3, it complains about the colorama module not being found here

On Python2, it tries to complain about the same thing but fails with No handlers could be found for logger "securesystemslib_interface".

Expected behavior:
The library should not complain about colorama as it is an optional dependency (and should also initialize the logger correctly on Python2)

How to reproduce:

Python2: docker run -it python:2 bash -c "pip install securesystemslib[crypto,pynacl] && python -c 'import securesystemslib.interface'"

Python3: docker run -it python:3 bash -c "pip install securesystemslib[crypto,pynacl] && python -c 'import securesystemslib.interface'"
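
A common pattern that avoids both symptoms is to attach a NullHandler to the library logger and treat the missing optional dependency as a non-event; a sketch (logger name taken from the error message above, not the actual securesystemslib code):

```python
import logging

logger = logging.getLogger("securesystemslib_interface")
# A NullHandler prevents "No handlers could be found" on Python 2 and
# leaves handler configuration to the consuming application.
logger.addHandler(logging.NullHandler())

try:
    import colorama  # optional "colors" extra
    HAVE_COLORAMA = True
except ImportError:
    HAVE_COLORAMA = False
    # Degrade silently; at most log at DEBUG rather than WARNING on import.
    logger.debug("colorama not found, output will not be colorized")
```

Call sites can then branch on HAVE_COLORAMA instead of emitting warnings at import time.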

Print expected type instead of its string representation

Description of issue or feature request:

Moved from the TUF repo. Also see this comment.

Current behavior:

Presently, if a user passes an object O other than a list to a Schema ListOf object, the error message will print the string representation S of O. Unfortunately, the error message is obscured when S is large (e.g. O is a set with hundreds of thousands of strings).

Expected behavior:

I think the error message would be just as effective if we simply printed the type of O.
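
A sketch of that idea: fall back to the type name once repr() exceeds a threshold (helper name and threshold are made up for illustration):

```python
def format_for_error(obj, max_repr_len=80):
    """Use repr() when it is short; fall back to the type name for huge
    objects so error messages stay readable."""
    s = repr(obj)
    return s if len(s) <= max_repr_len else type(obj).__name__

print(format_for_error([1, 2, 3]))          # small object: full repr
print(format_for_error(set(range(10**5))))  # huge object: just "set"
```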

Configure automatic code linter

Description of issue or feature request:

  • Configure linter (e.g. pylint) to analyze all securesystemslib code including tests
  • Configure tox+travis to run linter on PR and require full pass
  • Fix linter errors/warnings in all securesystemslib code including tests

See in-toto/in-toto#289, in-toto/in-toto#296, in-toto/in-toto#279 for an exemplary setup, also see how commits are aggregated by error/warning type, to mitigate the flag day.

Current behavior:
No automatic code linting and unlinted code

Expected behavior:
Automatic code linting and linted code

create_folder does nothing (and does not throw) if filepath is an empty string

Description of issue or feature request:

In order to support existing behaviour in tuf the create_folder method of FilesystemBackend does nothing and does not raise an error if passed an empty string.

This feels like counter-intuitive behaviour that doesn't make sense for the storage backend. Ideally this would be resolved in tuf by removing calls to create_folder with an empty path i.e. performing any special casing in the caller.

See #240

Current behavior:

When securesystemslib.storage.FilesystemBackend.create_folder() is called with an empty string it returns immediately.

Expected behavior:

When securesystemslib.storage.FilesystemBackend.create_folder() is called with an empty string it raises an error, which better mimics the behaviour of os.mkdir():

>>> os.mkdir('')
Traceback (most recent call last):
  File "<input>", line 1, in <module>
    os.mkdir('')
FileNotFoundError: [Errno 2] No such file or directory: ''
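
A sketch of the expected behavior (not the actual storage.py code; raising ValueError here, though mimicking os.mkdir's FileNotFoundError would work equally well):

```python
import errno
import os

def create_folder(filepath):
    """Create 'filepath' and any missing parents; reject empty paths."""
    if not filepath:
        raise ValueError("can't create a folder from an empty path")
    try:
        os.makedirs(filepath)
    except OSError as e:
        if e.errno != errno.EEXIST:  # an already-existing folder is fine
            raise
```

The special-casing for empty paths would then move into the tuf call sites, per #240.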

TempFile doesn't behave like a Python file object

This is replicating the content of theupdateframework/python-tuf#160, as the TempFile class is now implemented in securesystemslib, not tuf.

Description of issue or feature request:

TempFile isn't a file object and its methods aren't consistent with similar methods on a file object.

Current behavior:

There are inconsistencies between file operations on a securesystemslib.util.TempFile and file objects from the Python standard library, for example:

>>> import securesystemslib.util
>>> tf = securesystemslib.util.TempFile()
>>> import tempfile
>>> ptf = tempfile.TemporaryFile()
>>> f = open('tmpfile', 'w+')
>>> tf.write(b'Hello, world!')
>>> ptf.write(b'Hello, world!')
13
>>> f.write('Hello, world!')
13
>>> tf.read(3)
b''
>>> ptf.read(3)
b''
>>> f.read(3)
''
>>> tf.read()
b'Hello, world!'
>>> ptf.read()
b''
>>> f.read()
''
>>> tf.tell()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: 'TempFile' object has no attribute 'tell'
>>> ptf.tell()
13
>>> f.tell()
13
>>> tf.seek(0)
>>> ptf.seek(0)
0
>>> f.seek(0)
0
>>> tf.read(3)
b'Hel'
>>> ptf.read(3)
b'Hel'
>>> f.read(3)
'Hel'
>>> tf.read()
b'Hello, world!'
>>> ptf.read()
b'lo, world!'
>>> f.read()
'lo, world!'

Expected behavior:

Users familiar with Python will expect TempFile to behave like a file object

It would be a usability improvement for Python developers if TempFile behaved like a file object. It should probably implement io.FileIO.
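
One low-effort way to get file-object semantics is to delegate to a real temporary file rather than re-implementing the methods; a sketch, not a drop-in replacement for the existing class:

```python
import tempfile

class TempFile:
    """Wrap a real temporary file and forward everything to it, so
    write() returns the byte count, tell()/seek() behave normally, and
    read() honors the current position."""
    def __init__(self):
        self._file = tempfile.TemporaryFile()

    def __getattr__(self, name):
        # Called only for attributes not found on TempFile itself,
        # i.e. write/read/seek/tell/close/... are delegated.
        return getattr(self._file, name)
```

The securesystemslib-specific extras (move, compression handling, etc.) would be added as methods on top of the delegated file interface.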

PEM parsing is broken

High-level problem
We currently have an incomplete view on PEM formats. Several functions in the modules pycrypto_keys, pyca_crypto_keys, ecdsa_keys and keys expect PEM header and footer to be one of:

-----BEGIN PUBLIC KEY----- ... -----END PUBLIC KEY-----
-----BEGIN RSA PUBLIC KEY----- ... -----END RSA PUBLIC KEY-----
-----BEGIN RSA PRIVATE KEY----- ... -----END RSA PRIVATE KEY-----
-----BEGIN EC PRIVATE KEY----- ... -----END EC PRIVATE KEY-----

However, there are many possible values for VALUE in -----BEGIN {VALUE}----- ... -----END {VALUE}-----, as can be seen e.g. in openssl's pem.h.

Breaking example
Securesystemslib's default crypto library cryptography has changed the PEM header for encrypted RSA private keys from PKCS#5 to PKCS#8 (without our noticing):

# PKCS#5 header example
-----BEGIN RSA PRIVATE KEY-----
Proc-Type: 4,ENCRYPTED
DEK-Info: DES-EDE3-CBC,7D3EC0DB6DFBB404 ....

# PKCS#8 header example
-----BEGIN ENCRYPTED PRIVATE KEY----- ....

As a consequence a newly created encrypted RSA private key is not recognized as such anymore.

from securesystemslib import keys
rsa_key = keys.generate_rsa_key()
rsa_private = rsa_key["keyval"]["private"]
private_pem = keys.create_rsa_encrypted_pem(rsa_private, "1234")
keys.is_pem_private(private_pem)
# Returns `False` (should be `True`)

Note: The unit tests use a key that was generated before the update and therefore pass.

Solutions

  1. Update our PEM parsing functions to account for all formats
    Pro: Quick/Easy fix
    Con: IMHO bad documentation of PEM formats, seems like a shortcut that can easily break

  2. Remove PEM parsing functions in securesystemslib and rely on crypto libraries
    Pro: Seems like a more stable approach
    Con: We support 2 crypto libraries (cryptography and pycrypto) and 3 key types (rsa, ecdsa, ed25519). And the affected functions are currently used to decide which crypto library functions should be called. Hence, this fix would require several larger changes.

Despite all, I would prefer solution (2).
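
Even with (2) as the long-term goal, a stopgap for (1) is to accept any label rather than hard-coding a handful; a sketch (the regex loosely follows the RFC 7468 label grammar; not the actual securesystemslib code):

```python
import re

PEM_HEADER_RE = re.compile(r"-----BEGIN ([A-Z0-9 ]+)-----")

def pem_label(pem):
    """Return the PEM label, e.g. "ENCRYPTED PRIVATE KEY", or None."""
    match = PEM_HEADER_RE.search(pem)
    return match.group(1) if match else None

def is_pem_private(pem):
    label = pem_label(pem)
    # Covers "RSA PRIVATE KEY" (PKCS#1) as well as "PRIVATE KEY" and
    # "ENCRYPTED PRIVATE KEY" (PKCS#8), "EC PRIVATE KEY", ...
    return label is not None and label.endswith("PRIVATE KEY")
```

This at least makes the breaking example above return True; solution (2) would drop the label sniffing entirely and let the crypto library's loaders decide.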

Further reading
Below are some sources about PEM, PKCS#5 and PKCS#8. I didn't find the related RFCs very helpful, but I'll probably have to dig deeper:

Always using pycrypto for key generation

In the current keys.py, line 1702, regardless of which library you are configured to use (pycrypto or pyca_crypto), pycrypto is used to generate key files. This appears to be a typo / minor oversight, since there is logic to choose one or the other.

This is related to issue #44, in that it also complicates inconsistent passphrase treatment.

In any case, this prevents you from generating keys if you don't have pycrypto installed. (That should be supported: you're allowed to use pyca_crypto and not pycrypto if you want.)

It is also worth considering configuring tox.ini to do an integration test for this: run once with only pycrypto (not pyca_crypto) installed, once with only pyca_crypto (not pycrypto) installed, etc.
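
The intended dispatch is straightforward; a sketch with the library-specific generators stubbed out (the setting and helper names are hypothetical, standing in for the real module-level configuration and the pycrypto/pyca-crypto calls):

```python
_RSA_CRYPTO_LIBRARY = "pyca-crypto"  # stands in for the module-level setting

def _pycrypto_generate(bits):        # placeholder for the pycrypto call
    return "pycrypto", bits

def _pyca_crypto_generate(bits):     # placeholder for the pyca/cryptography call
    return "pyca-crypto", bits

def generate_rsa_key(bits=3072):
    """Honor the configured library instead of unconditionally
    calling pycrypto, as the code around line 1702 currently does."""
    if _RSA_CRYPTO_LIBRARY == "pycrypto":
        return _pycrypto_generate(bits)
    elif _RSA_CRYPTO_LIBRARY == "pyca-crypto":
        return _pyca_crypto_generate(bits)
    raise ValueError(f"unsupported crypto library: {_RSA_CRYPTO_LIBRARY!r}")
```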

Revise module architecture and define clear API modules and functions

Description of issue or feature request:
Securesystemslib lacks a clear public API. It should be clear and intuitive to users of securesystemslib which modules and which functions are public API.

Current behavior:
API is scattered across:

  • for general key operations (generate, import, sign, verify)

    • keys.py -- high-level public interface to sign/verify (key type independent), and generate and import keys (key type dependent). Calls into low-level non-public {rsa, ecdsa, ed25519}_keys.py modules, which are (mostly) separated by key type.
    • interface.py -- higher-level public interface (mostly calls into keys.py) to generate and import keys
  • for GPG key operations (import, sign, verify)

    • gpg/functions.py -- public interface for gpg subpackage, independent from above key operations.
  • for other non-key related operations

    • hash.py -- facade for hashlib from stdlib and cryptography.hazmat.primitives.hashes
    • process.py -- thin subprocess wrapper
    • storage.py -- file system abstraction
    • util.py -- mostly I/O related utils
    • formats.py -- schema definition constants (likely to be deprecated, see #183), OLPC canonical json implementation

Expected behavior:
Revise module architecture to use mnemonic names for (public) modules (not interface or functions) appropriate for the interface functions they contain. Also see discussion about import guidelines.

requirements.txt contains dev requirements

Description of issue or feature request:
the requirements.txt file should document what packages are needed in order to run or use this library, but it contains dependencies that are used for testing.

Current behavior: I should be able to run pip install -r requirements.txt if I just want to use the library. I get a bunch of dev packages pulled in that don't seem to be necessary (coverage/tox/coveralls)

Expected behavior: I should only pull the minimal amount of packages to install. I was imagining that we should only depend on pynacl/pycrypto or so.

blake2b digest length is incompatible with Warehouse

I believe the blake2b hashes produced by warehouse and securesystemslib are incompatible

I can't actually run the full update cycle with my pip-branch against the Warehouse tuf-branch yet but I've taken another look at the metadata and I'm fairly sure we will be incompatible in the hash verification step.

This is what Warehouse target metadata currently looks like:

"0b/de/ae38ae748ae74093a0c8abad03d83e179870639d2d1b092ba4b3e82164df/litefeel_pycommon-0.1.1-py3-none-any.whl": {
    "custom": {
     "backsigned": true
    },
    "hashes": {
     "blake2b": "0bdeae38ae748ae74093a0c8abad03d83e179870639d2d1b092ba4b3e82164df"
    },
    "length": 3077
   }

A TUF client compares that "blake2b" hash to one that securesystemslib produces for the given data, something like this simplified example:

>>> dobj = securesystemslib.hash.digest("blake2b")
>>> dobj.update(b'')
>>> dobj.hexdigest()
'786a02f742015903c6c6fd852552d272912f4740e15847618a86e217f71f5419d25e1031afee585313896444934eb04b903a685b1448b755d56f701afe9be2ce'

The digests have different lengths. Warehouse uses blake2b-256 and TUF uses blake2b-512 (which is the maximum).

So:

  • First, is the warehouse choice set in stone? I can see how it kind of might be as the shorter hash is already used in many places (ping @woodruffw)
  • If it is, we probably want to support at least a new "blake2b-256" hash name (and add the little bit of new code needed in hash.py::digest())

I can do the PR if there's consensus.
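
The new code needed in hash.py::digest() would be small; a sketch of the proposed "blake2b-256" mapping (not the actual securesystemslib implementation):

```python
import hashlib

def digest(algorithm):
    """Map "blake2b-256" to blake2b truncated to a 32-byte digest, matching
    Warehouse; everything else goes through hashlib unchanged (so the bare
    "blake2b" name keeps meaning the 64-byte maximum, as TUF uses today)."""
    if algorithm == "blake2b-256":
        return hashlib.blake2b(digest_size=32)
    return hashlib.new(algorithm)

d = digest("blake2b-256")
d.update(b"")
print(len(d.hexdigest()))  # 64 hex chars = 32 bytes, like the Warehouse metadata
```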
