Comments (21)
The hash feature doesn't provide "some security". It provides zero security. Without a signature, it protects against corrupt downloads. That's not security.
It is some level of security. There are scenarios where it matters. Here we simply disagree.
Ultimately, the pip maintainers decide how they want to communicate with their users. That's their right. You may argue as much as you want, but all you have are opinions and proposals; the decisions are not yours to make.
On a human level, I have a suggestion for you, @maltfield, before this goes any further: remember that there are other humans on the other side.
I suggest you consider that you've been listened to, and that your opinion was considered and rejected. It happens.
The maintainers put in a great deal of effort to keep things working well (and better every day) so that you can use the software for free, often in their own personal time, away from their families and from the work they get paid for. I think appreciating that is better than fighting with them over details that are minute and, in the long run, completely meaningless.
If I may offer something from my own experience: even when I finally got the small thing I had fought for, getting confrontational was a bad idea overall. If I regret anything, it is how short-sighted and stupid I was to go down that rabbit hole, and I would gladly go back in time and undo it.
IMHO you will get much further by trying to understand the other side and accepting that other people might have different opinions, and that those who have merit in the project have the right to make the decisions that are right for them and their users.
But that's just my opinion, experience, and advice. Take it or leave it; it's up to you.
from pip.
Note that this is the case because if the package was maliciously altered (eg by a publishing infrastructure compromise), then the attacker could just as easily modify the hashes such that pip will happily install the malicious module.
No, they can't. From the first line under the Hash-checking Mode section:
This mode uses local hashes, embedded in a requirements.txt file, to protect against remote tampering and network issues.
We're not using the hashes in the URLs/index to check for malicious alteration; we're checking it against a known good hash embedded in a local requirements.txt file.
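As a rough illustration of the check described here (a minimal sketch, not pip's actual implementation), the idea is to hash the downloaded bytes locally and compare the result against the digests pinned in the local requirements file. The names `sha256_hex`, `verify_artifact`, and the sample byte strings are hypothetical and exist only for this example.

```python
import hashlib
import hmac


def sha256_hex(data: bytes) -> str:
    """Return the hex-encoded SHA-256 digest of the given bytes."""
    return hashlib.sha256(data).hexdigest()


def verify_artifact(data: bytes, pinned_hashes: set[str]) -> bool:
    """Accept the artifact only if its digest matches one of the locally pinned digests."""
    digest = sha256_hex(data)
    return any(hmac.compare_digest(digest, pinned) for pinned in pinned_hashes)


# Hypothetical stand-ins for a downloaded archive; real artifacts would be file contents.
original = b"contents of samplepackage-0.1.tar.gz"
tampered = original + b" plus injected code"

# The pin is computed once, from the copy the user decided to trust, and stored locally.
pinned = {sha256_hex(original)}

print(verify_artifact(original, pinned))
print(verify_artifact(tampered, pinned))
```

Whatever the index serves later, a file whose bytes differ from the pinned copy fails this comparison. The sketch says nothing about how the pinned copy came to be trusted in the first place.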
from pip.
We're not using the hashes in the URLs/index to check for malicious alteration
It is alarming if the PyPI team thinks it's OK that their documentation misleads users into thinking that their unsigned hashes provide security, especially considering the numerous supply chain incidents in recent years in which open-source projects were affected by a publishing infrastructure compromise.
The documentation should clearly state that hashes do not provide security, as you said above.
Please re-open this issue; it's not OK to lie to users.
from pip.
Checking against a known-good local hash does protect against tampering and compromise of the index/infrastructure, since you're checking that the artifact has not been modified compared to the last known-good copy (which the hash is generated from, assuming a decent enough hash function).
If you disagree with this, I'd like to understand what kind of compromise you're thinking of that would lead to a requirements file such as the following to install a malicious copy of samplepackage:
samplepackage==0.1 \
--hash=sha256:0b93408e04eeef3bdbc97ff29eb819c46a4d610c649f5101999cff7ed9781396
The documentation should clearly state that hashes do not provide security, as you said above.
I have literally quoted the documentation, which states what I have rephrased. I did not say that hashes do not provide security here.
from pip.
Checking against a known-good local hash
Sorry, but the whole point is that there is no such thing as a "known-good" hash if you didn't cryptographically verify the signature of the hash.
How did the hash get from the Internet onto the computer? That's the vulnerability.
Please re-open this ticket so we can properly educate users of the risks and limitations of PyPI.
from pip.
the whole point is that there is no such thing as a "known-good" hash if you didn't cryptographically verify the signature of the hash.
In that case, I believe you're looking past what I'm saying.
How did the hash get from the Internet onto the computer? That's the vulnerability.
I think you're mixing provenance with the ability to tell if something has changed unexpectedly (i.e. integrity).
Provenance guarantees will be available once https://peps.python.org/pep-0458/ and https://peps.python.org/pep-0480/ end up being implemented. Until then, validating that the files haven't been tampered with between two uses has significant and meaningful benefits over the default of not even doing that.
If you want me to reopen this ticket, please provide an answer to the request I made in my previous comment for a clarification/explanation.
from pip.
The security property that I'm referring to is authenticity: verifying that the software is authentic, i.e. that the downloaded software originated from the developers (and not from someone malicious).
The documentation suggests that the hashes provide security that protects the user from downloading maliciously modified software. That's authenticity. And it's wrong; the hashes do not provide any assurance of authenticity.
If you want me to reopen this ticket, please provide an answer to the request I made in my previous comment for a clarification/explanation.
Which question did I not address?
from pip.
I personally do not find the documentation misleading; I find it very well placed, and your request unfounded, @maltfield.
Security is not all-or-nothing. Never. There are various levels of security you can achieve by applying various techniques. This document says nothing about authenticity or provenance. It speaks about "more security", not "ultimate security", and very nicely describes what the hash feature provides, without even hinting at provenance or authenticity. I'm not sure where you see authenticity being hinted at. If all you are going on is the 'Security' header, then your definition of security is pretty narrow.
As a Security Committee member of the Apache Software Foundation, I can say that we are working on achieving more security with our PyPI distribution: likely becoming a trusted publisher and, once the PEPs that @pradyunsg mentioned are implemented, adding provenance and cryptographic signatures (likely with sigstore). We discuss closely with various communities how we can improve security. But that does not mean "some security levels" cannot be achieved without it. For example, the Apache Software Foundation's binaries already have cryptographic signatures and checksums: https://downloads.apache.org/airflow/2.8.2/. What's more, our builds are binary-reproducible, which in connection with cryptographic signatures is even MORE security than what you claim is the only thing that can be called "security".
Yet we are actually looking at adding hashes to our constraint files (https://github.com/apache/airflow/blob/constraints-2.8.2/constraints-3.8.txt) to add more security soon. In fact, the title of that chapter led me to start considering it: not to guard against corruption, but precisely to freeze those constraints for our users, so that in case of future breaches they can rely on the packages not having been modified after we released our software. That definitely adds more security, without providing ultimate security.
Consider that the opinion of someone who helps to oversee security in hundreds of ASF projects, and helps to define policies and the security approach in a Foundation that did security well before it was fashionable.
from pip.
Which question did I not address?
This one:
I'd like to understand what kind of compromise you're thinking of that would lead to a requirements file such as the following to install a malicious copy of samplepackage:
samplepackage==0.1 \
--hash=sha256:0b93408e04eeef3bdbc97ff29eb819c46a4d610c649f5101999cff7ed9781396
from pip.
Security is not all-or-nothing
The hash feature doesn't provide "some security". It provides zero security. Without a signature, it protects against corrupt downloads. That's not security.
from pip.
Which question did I not address?
This one:
I'd like to understand what kind of compromise you're thinking of that would lead to a requirements file such as the following to install a malicious copy of samplepackage:
samplepackage==0.1 \
--hash=sha256:0b93408e04eeef3bdbc97ff29eb819c46a4d610c649f5101999cff7ed9781396
The compromise is that the user downloads maliciously modified software. I'm not sure why this isn't obvious.
How did the user get the hash in the example command?
from pip.
How did the user get the hash in the example command?
How did the user ensure that their copy of Python, and their copy of pip, is not compromised?
The way you are framing everything in terms of absolutes, and accusing the pip maintainers of bad faith in the way we document pip's features, is neither helpful nor welcome. Please consider both your tone and the message you are giving.
If you, personally, don't feel that pip is sufficiently secure, then by all means don't use it. No-one is forcing you to do so. Others can make their own decision.
from pip.
How did the user ensure that their copy of Python, and their copy of pip, is not compromised?
apt-get install python3 python3-pip
Most OS package managers, apt included, verify the authenticity of all packages with cryptographic signatures. This is documented here:
I believe I've answered your question. Please re-open this issue.
from pip.
If this project were open to listening to proposals, this ticket wouldn't have been closed immediately, before any dialog.
The PyPI team is being rude by immediately closing something as "won't fix" when a user informs them of harm that they're causing users, and contributes to the project by taking the time to highlight bugs and their solutions. I am a human taking time out of my day to file this bug to better all python users.
We all make mistakes. Not all bugs are shallow. It shouldn't be considered rude to report bugs and advocate for harm reduction.
Please re-open this ticket.
from pip.
The hash feature doesn't provide "some security". It provides zero security. Without a signature, it protects against corrupt downloads. That's not security.
I don't have an opinion on the larger conversation, but hashes protect you when you recreate an environment, not when you first create it.
For example if you had sourced your hashes before the 25th December for the PyTorch Dependency Confusion attack you would have been safe: https://pytorch.org/blog/compromised-nightly-dependency/
There are many other situations in which an attacker may be able to insert a version that matches but not a hash that matches.
This of course does not fully solve supply chain security, but it does protect against certain attacks in some situations.
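The distinction between recreating and first creating an environment can be sketched as follows. This is a toy model under stated assumptions, not a description of pip internals or of the PyTorch incident itself: `pin`, `install_allowed`, and the byte strings are hypothetical names chosen for illustration.

```python
import hashlib


def pin(data: bytes) -> str:
    """Record the SHA-256 digest of whatever copy was fetched at pin time."""
    return hashlib.sha256(data).hexdigest()


def install_allowed(served: bytes, pinned_hash: str) -> bool:
    """Allow installation only if the served file matches the recorded digest."""
    return hashlib.sha256(served).hexdigest() == pinned_hash


# Hypothetical file contents: both claim the same name and version.
genuine = b"genuine release 0.1"
malicious = b"malicious upload claiming to be 0.1"

# Case 1: the hash was recorded before the compromise, and the environment is
# recreated afterwards. The malicious file no longer matches, so it is rejected.
early_pin = pin(genuine)
recreate_blocked = not install_allowed(malicious, early_pin)

# Case 2: the hash was recorded after the compromise (initial creation).
# The pin describes the malicious file, so the check passes and offers no help.
late_pin = pin(malicious)
first_install_accepted = install_allowed(malicious, late_pin)

print(recreate_blocked, first_install_accepted)
```

In this model the hash only says "same bytes as when I pinned", which is why the outcome depends entirely on when the pin was made.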
from pip.
For example if you had sourced your hashes before the 25th December for the PyTorch Dependency Confusion attack you would have been safe: https://pytorch.org/blog/compromised-nightly-dependency/
AFAIK, the hashes downloaded "before the 25th of December" would still lack signatures. So, no, you would not have been safe from several supply chain vulnerabilities, including a publishing infrastructure compromise.
from pip.
AFAIK, the hashes downloaded "before the 25th of December" would still lack signatures. So, no, you would not have been safe from several supply chain vulnerabilities, including a publishing infrastructure compromise.
You would have been safe from that attack: the attacker's files had different hashes. That is preventing a real-world security attack; I wasn't talking about security as an abstract concept.
from pip.
Also, @maltfield, since you are insisting: I would heartily recommend that you do your homework. Rather than throwing around a bunch of random links and expressing opinions about what "security of supply chain" is, we should refer to standards.
I'd heartily recommend you get familiar with this standard, widely accepted in the industry, describing supply chain security: https://slsa.dev/spec/v1.0/levels
The standard described there does not focus narrowly on having or not having cryptographic signatures. Cryptographic signatures are only a small part of supply chain security, and when you focus exclusively (as you do) on cryptographic signatures as the sign of "security", you are making a huge mistake, one that gives you a false sense of security. "Security of supply chain" is way, way, way more than that: it is processes, build platforms, and a wealth of other things, nicely described in the standard.
And I have a very interesting surprise for you. Regardless of whether the hashes are cryptographically signed or not, it is still L0 (zero) level in SLSA. Signing hashes or packages cryptographically does not, on its own, move the needle when it comes to the SLSA security level. Go and check for yourself.
What DOES change it and move it to L1 is having a verifiable way of determining how the packages were built, which cryptographic signatures tell you absolutely nothing about. Reproducible builds, however, do, and they bring SLSA Level 1. And, which might be even more surprising to you, @maltfield, it does not matter whether the packages are signed at all to be at Level 1:
Provenance may be incomplete and/or unsigned at L1. Higher levels require more complete and trustworthy provenance.
So if we use your arguments, YOUR solution cannot be called "security" either, because it does not provide even Level 1 of SLSA. What's more, nothing that provides L1 could be called security, because there are L2 and L3 levels that provide even more security (note: there are currently no known public platforms of any sort that provide Level 3 security, though a number of platforms strive to achieve it).
So if we take your way of thinking and apply it beyond the pretty narrow and small part of supply chain security where cryptographic signing is the only criterion for calling something "security", we should not call anything "security", because nothing achieves the highest level of security.
I suggest you do some more research and reading on this; it might help you expand your knowledge of supply chain security, which is currently focused pretty narrowly on cryptographic signing.
Security is like Ogres - it has layers. And many of them.
from pip.
potluk, in my python project we have reproducible builds, hashes, and I cryptographically sign all my hashes.
Reproducible builds are important. If you're suggesting that we add a note to the documentation indicating that reproducible builds are an important aspect in supply chain security that PyPI is currently not enforcing, I would agree that is a valuable addition to the documentation.
Cryptographic signatures are necessary. Currently the documentation is spreading misinformation to developers, making them think that adding an unsigned hash to their download provides security.
I encounter a lot of devs that think hashes provide security. I think part of the problem is documentation like this. Let's fix the docs so that devs aren't misinformed, leaving their build process (and therefore all of their users) vulnerable.
Please re-open this issue so the documentation can be fixed.
from pip.
Please re-open this issue so the documentation can be fixed.
Please, just stop. You've made your point, and been heard. We (the pip developers) don't agree with you. I'm sorry if that frustrates you, but it's a reality you have to deal with. Re-stating the same assertions won't change the outcome. Plenty of other people with security expertise have read pip's documentation and no-one other than you has tried to make the claims you've made here.
Is your only concern here with the word "secure"? Because if so, it is only used twice in that whole section. If all you wanted was to change those two occurrences to use a different word, you could have submitted a PR and done that. I don't personally think it's necessary, but we could have assessed a simple terminology change without all of the aggression and name-calling that your original post included.
Pip's documentation is not the place to educate users about security best practices. Nor is pip the only weak link in the Python package distribution chain. People are aware of the weaknesses, and working on them. Yelling specifically about one section of the pip documentation shows a stunning lack of awareness of the bigger picture here.
Edit: If you continue the discussion in the same way as you have up to now, I'm going to lock this issue as "too heated". If you wish to avoid that, moderate your future comments. You know by now the tone we expect from you - it's simply that you're "open, considerate and respectful" as stated in the code of conduct.
from pip.
Yes, I'd like to submit a PR to improve this documentation. Generally, I think it's best-practice to:
- Submit a bug report
- Have some dialog with the devs about the issue
- Wait for the devs to indicate that the PR is welcome
- Do the work
- Submit the PR
Instead of the above expected process, this bug report was closed immediately without any dialog. That's not very encouraging to community members who want to contribute to improve the project.
Please re-open this issue so we can fix the bugs in the documentation, as described above.
from pip.