Comments (21)

potiuk commented on June 17, 2024

The hash feature doesn't provide "some security". It provides zero security. Without a signature, it protects against corrupt downloads. That's not security.

It is some level of security. There are scenarios where it matters. Here we simply disagree.

Ultimately, pip maintainers decide how they want to communicate with their users. That's their right, and you may argue as much as you want, but all you have are opinions and proposals. The decisions are not yours to make.

On a human level, I have a suggestion for you, @maltfield, before it gets any further: remember there are other humans on the other side.

I propose that you consider that you've been listened to, and that your opinion was considered and rejected. It happens.

I think appreciating all the effort maintainers put in to make things work well (and better every day) so that you can use the software for free, often in their own personal time, away from their families and from the things they get paid for, is better than fighting with them over details that are minute and completely meaningless in the long run.

If I may suggest something from my experience: even when I finally got the small thing I fought for, getting confrontational was overall a bad idea for me. If I regret something, it is how short-sighted and stupid I was back then to get into that rabbit hole, and I would gladly go back in time and revert it.

IMHO you will get much more by accepting and trying to understand the other side, and by accepting that other people might have different opinions and that, when they have merit in the project, they have the right to make decisions that are right for them and their users.

But, well, that's my opinion, experience, and advice. You might take it or not; it's up to you.

from pip.

pradyunsg commented on June 17, 2024

Note that this is the case because if the package was maliciously altered (eg by a publishing infrastructure compromise), then the attacker could just as easily modify the hashes such that pip will happily install the malicious module.

No, they can't. From the first line under the Hash-checking Mode section:

This mode uses local hashes, embedded in a requirements.txt file, to protect against remote tampering and network issues.

We're not using the hashes in the URLs/index to check for malicious alteration; we're checking it against a known good hash embedded in a local requirements.txt file.
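As a minimal sketch of the check described above (not pip's actual implementation), the following Python compares an artifact's SHA-256 digest against hashes pinned locally. The function names and the byte strings standing in for package files are hypothetical.

```python
import hashlib
from typing import Iterable


def sha256_hex(data: bytes) -> str:
    """Return the hex SHA-256 digest of the given bytes."""
    return hashlib.sha256(data).hexdigest()


def matches_pinned_hash(artifact: bytes, pinned: Iterable[str]) -> bool:
    """True only if the artifact's digest is one of the locally pinned digests."""
    return sha256_hex(artifact) in set(pinned)


# Hypothetical stand-ins for a downloaded package file.
original = b"samplepackage-0.1 contents"
tampered = b"samplepackage-0.1 contents plus injected payload"

# The pinned digest is stored locally (as in requirements.txt),
# so a change on the remote side cannot rewrite it.
pinned = [sha256_hex(original)]

original_ok = matches_pinned_hash(original, pinned)
tampered_ok = matches_pinned_hash(tampered, pinned)
print(original_ok, tampered_ok)
```

The tampered bytes fail the check because the digest they are compared against lives in the local file rather than on the index, which is the property the quoted documentation describes.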

maltfield commented on June 17, 2024

We're not using the hashes in the URLs/index to check for malicious alteration

It is alarming if the PyPI team thinks it's OK that their documentation misleads users into thinking that their unsigned hashes provide security, especially considering the numerous supply chain incidents that have caused issues for open-source projects through publishing infrastructure compromise in recent years.

The documentation should clearly state that hashes do not provide security, as you said above.

Please re-open this issue; it's not OK to lie to users.

pradyunsg commented on June 17, 2024

Checking against a known-good local hash does protect against tampering and compromise of the index/infrastructure, since you're checking that the artifact has not been modified compared to the last known-good copy (which the hash is generated from, assuming a decent enough hash function).

If you disagree with this, I'd like to understand what kind of compromise you're thinking of that would lead to a requirements file such as the following to install a malicious copy of samplepackage.

samplepackage==0.1 \
    --hash=sha256:0b93408e04eeef3bdbc97ff29eb819c46a4d610c649f5101999cff7ed9781396

The documentation should clearly state that hashes do not provide security, as you said above.

I have literally quoted the documentation, which states what I have rephrased. I did not say that hashes do not provide security here.

maltfield commented on June 17, 2024

Checking against a known-good local hash

Sorry, but the whole point is that there is no such thing as a "known-good" hash if you didn't cryptographically verify the signature of the hash.

How did the hash get from the Internet onto the computer? That's the vulnerability.

Please re-open this ticket so we can properly educate users of the risks and limitations of PyPI.

pradyunsg commented on June 17, 2024

the whole point is that there is no such thing as a "known-good" hash if you didn't cryptographically verify the signature of the hash.

In that case, I believe you're looking past what I'm saying.

How did the hash get from the Internet onto the computer? That's the vulnerability.

I think you're mixing provenance with the ability to tell if something has changed unexpectedly (i.e. integrity).

Provenance guarantees will be available once https://peps.python.org/pep-0458/ and https://peps.python.org/pep-0480/ end up being implemented, and until then, validating that the files haven't been tampered with between two uses has significant and meaningful benefits over the default of not even doing that.

If you want me to reopen this ticket, please provide an answer to the request I made in my previous comment for a clarification/explanation.

maltfield commented on June 17, 2024

The security property that I'm referring to is authenticity: verifying that the software is authentic, that the downloaded software originated from the developers (and not someone malicious).

The documentation suggests that the hashes provide security that protects the user from downloading maliciously modified software. That's authenticity. And it's wrong; the hashes do not provide any assurance of authenticity.

If you want me to reopen this ticket, please provide an answer to the request I made in my previous comment for a clarification/explanation.

Which question did I not address?

potiuk commented on June 17, 2024

I personally do not find the documentation misleading; I find it very well placed, and your request unfounded, @maltfield.

Security is not all-or-nothing. Never. There are various levels of security you can achieve by applying various techniques. This document says nothing about authenticity or provenance. It speaks about "more security", not "ultimate security", and very nicely describes what the hash feature provides, without even hinting at provenance or authenticity. I'm not quite sure where you see authenticity hinted at. If all you are going by is the 'Security' header, then your definition of security is pretty narrow.

I am a member of the Security Committee of the Apache Software Foundation, and we are working on achieving more security with our PyPI distribution: likely becoming a trusted publisher and, once the PEPs that @pradyunsg mentioned are implemented, adding more security through provenance and cryptographic signatures (likely with sigstore). We are in close discussion with various communities on ways we can improve security, but that does not mean that "some security levels" cannot be achieved without it. For example, in the case of the Apache Software Foundation, the binaries already have cryptographic signatures and checksums: https://downloads.apache.org/airflow/2.8.2/. Even more, our builds are binary reproducible, which in connection with cryptographic signatures is even MORE security than what you claim is the only thing that can be named "security".

Yet we are actually looking at adding hashes to our constraint files (https://github.com/apache/airflow/blob/constraints-2.8.2/constraints-3.8.txt) to add more security soon. It was actually the title of that chapter that led me to start considering it: not to guard against corruption, but precisely to freeze those constraints for our users, so that in case of possible future breaches they can rely on the packages not having been modified after we released our software. That definitely adds more security, without providing ultimate security.

Consider that the opinion of someone who helps oversee security in 100s of ASF projects, and who helps define policies and the security approach in a Foundation that did security well before it was fashionable.

pradyunsg commented on June 17, 2024

Which question did I not address?

This one:

I'd like to understand what kind of compromise you're thinking of that would lead to a requirements file such as the following to install a malicious copy of samplepackage.

samplepackage==0.1 \
    --hash=sha256:0b93408e04eeef3bdbc97ff29eb819c46a4d610c649f5101999cff7ed9781396

maltfield commented on June 17, 2024

Security is not all-or-nothing

The hash feature doesn't provide "some security". It provides zero security. Without a signature, it protects against corrupt downloads. That's not security.

maltfield commented on June 17, 2024

Which question did I not address?

This one:

I'd like to understand what kind of compromise you're thinking of that would lead to a requirements file such as the following to install a malicious copy of samplepackage.

samplepackage==0.1 \
    --hash=sha256:0b93408e04eeef3bdbc97ff29eb819c46a4d610c649f5101999cff7ed9781396

The compromise is that the user downloads maliciously modified software. I'm not sure why this isn't obvious.

How did the user get the hash in the example command?

pfmoore commented on June 17, 2024

How did the user get the hash in the example command?

How did the user ensure that their copy of Python, and their copy of pip, is not compromised?

The way you are framing everything in terms of absolutes, and accusing the pip maintainers of bad faith in the way we document pip's features, is neither helpful nor welcome. Please consider both your tone and the message you are giving.

If you, personally, don't feel that pip is sufficiently secure, then by all means don't use it. No-one is forcing you to do so. Others can make their own decision.

maltfield commented on June 17, 2024

How did the user ensure that their copy of Python, and their copy of pip, is not compromised?

apt-get install python3 python3-pip

Most OS package managers, apt included, verify the authenticity of all packages with cryptographic signatures. This is documented here:

I believe I've answered your question. Please re-open this issue.

maltfield commented on June 17, 2024

If this project were open to listening to proposals, this ticket wouldn't have been immediately closed (before any dialog).

The PyPI team is being rude by immediately closing something as "won't fix" when a user informs them of harm that they're causing users, and contributes to the project by taking the time to highlight bugs and their solutions. I am a human taking time out of my day to file this bug for the benefit of all Python users.

We all make mistakes. Not all bugs are shallow. It shouldn't be considered rude to report bugs and advocate for harm reduction.

Please re-open this ticket.

notatallshaw commented on June 17, 2024

The hash feature doesn't provide "some security". It provides zero security. Without a signature, it protects against corrupt downloads. That's not security.

I don't have an opinion on the larger conversation, but hashes protect you when you recreate an environment, though not when you first create it.

For example, if you had sourced your hashes before the 25th of December, you would have been safe from the PyTorch dependency confusion attack: https://pytorch.org/blog/compromised-nightly-dependency/

There are many other situations in which an attacker may be able to insert a version that matches but not a hash that matches.

This of course does not fully solve supply chain security, but it does protect against certain attacks in some situations.
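The distinction between first creation and recreation can be sketched in a few lines of Python. This is an illustration under stated assumptions, not how pip or any index works: the in-memory `index` dictionary, the helper names, and the byte strings are all hypothetical.

```python
import hashlib
from typing import Dict, List


def lock_hashes(index: Dict[str, bytes]) -> Dict[str, str]:
    """Record a SHA-256 digest for every artifact the index serves right now."""
    return {name: hashlib.sha256(data).hexdigest() for name, data in index.items()}


def find_mismatches(index: Dict[str, bytes], locked: Dict[str, str]) -> List[str]:
    """Return locked requirements whose served bytes are missing or changed."""
    return sorted(
        name
        for name, digest in locked.items()
        if name not in index or hashlib.sha256(index[name]).hexdigest() != digest
    )


# Initial creation: digests are taken from whatever the index serves today.
index = {"samplepackage==0.1": b"original release bytes"}
locked = lock_hashes(index)

# Recreation: same name and version, but the served bytes were replaced.
index["samplepackage==0.1"] = b"attacker-controlled bytes"
recreation_mismatches = find_mismatches(index, locked)

# Limitation: if the index was already compromised when the lock was made,
# the lock faithfully records the malicious digest and nothing is flagged.
poisoned_index = {"samplepackage==0.1": b"attacker-controlled bytes"}
poisoned_mismatches = find_mismatches(poisoned_index, lock_hashes(poisoned_index))

print(recreation_mismatches, poisoned_mismatches)
```

The first result flags the replaced artifact even though its version string still matches; the second is empty, which is the "not on initial creation" caveat.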

maltfield commented on June 17, 2024

For example, if you had sourced your hashes before the 25th of December, you would have been safe from the PyTorch dependency confusion attack: https://pytorch.org/blog/compromised-nightly-dependency/

AFAIK, the hashes downloaded "before the 25th of December" would still lack signatures. So, no, you would not have been safe from several supply chain vulnerabilities, including a publishing infrastructure compromise.

notatallshaw commented on June 17, 2024

AFAIK, the hashes downloaded "before the 25th of December" would still lack signatures. So, no, you would not have been safe from several supply chain vulnerabilities, including a publishing infrastructure compromise.

You would have been safe from that attack; the attacker's files had different hashes. That is preventing a real-world security attack. I wasn't talking about security as an abstract concept.

potiuk commented on June 17, 2024

Also, @maltfield, since you are insisting, I would heartily recommend that you do your homework: rather than throwing out a bunch of random links and expressing your opinion about what "security of supply chain" is, we should refer to standards.

I'd heartily recommend that you get familiar with this standard, widely accepted in the industry, which describes supply chain security: https://slsa.dev/spec/v1.0/levels

The standard described there does not focus narrowly on having or not having cryptographic signatures. Cryptographic signatures are only a small part of supply chain security, and when you focus exclusively on cryptographic signatures as a sign of "security" (as you do), you are making a huge mistake, one that gives you a false sense of security. "Security of supply chain" is way, way, way more than that: processes, build platforms, and a wealth of other things, nicely described in the standard.

And I have a very interesting surprise for you. Regardless of whether the hashes are cryptographically signed or not, it is still L0 (level zero) in SLSA. Signing hashes or packages cryptographically does not, on its own, move the needle when it comes to the SLSA security level. Go and check it yourself.

What DOES change it and move it to L1 is having a verifiable way of determining how the packages were built, which cryptographic signatures tell you absolutely nothing about. Reproducible builds, however, do, and they bring SLSA level 1. And, this might be even more surprising to you @maltfield, the packages do not need to be signed at all to be at level 1:

Provenance may be incomplete and/or unsigned at L1. Higher levels require more complete and trustworthy provenance.

So if we use your arguments, YOUR solution cannot be named "security" either, because it does not provide security, not even Level 1 of SLSA. Even more, anything that provides L1 cannot be named security, because there are L2 and L3 levels that provide even more security (note: there are currently no known public platforms of any sort that provide level 3 security, though a number of platforms out there strive to achieve it).

So I think that if we take your way of thinking, where cryptographic signing (a pretty narrow and small part of supply chain security) is the only criterion for naming something "Security", and apply it consistently, we should not name anything "Security", because nothing achieves the highest level of security.

I suggest you do some more research and reading on this. It might help you expand your knowledge of supply chain security, which is currently pretty narrowly focused on cryptographic signing.

Security is like ogres: it has layers. And many of them.

maltfield commented on June 17, 2024

potiuk, in my Python project we have reproducible builds and hashes, and I cryptographically sign all my hashes.

Reproducible builds are important. If you're suggesting that we add a note to the documentation indicating that reproducible builds are an important aspect in supply chain security that PyPI is currently not enforcing, I would agree that is a valuable addition to the documentation.

Cryptographic signatures are necessary. Currently the documentation is spreading misinformation to developers, making them think that adding an unsigned hash to their download provides security.

I encounter a lot of devs that think hashes provide security. I think part of the problem is documentation like this. Let's fix the docs so that devs aren't misinformed, leaving their build process (and therefore all of their users) vulnerable.

Please re-open this issue so the documentation can be fixed.

pfmoore commented on June 17, 2024

Please re-open this issue so the documentation can be fixed.

Please, just stop. You've made your point, and been heard. We (the pip developers) don't agree with you. I'm sorry if that frustrates you, but it's a reality you have to deal with. Re-stating the same assertions won't change the outcome. Plenty of other people with security expertise have read pip's documentation and no-one other than you has tried to make the claims you've made here.

Is your only concern here with the word "secure"? Because if so, it is only used twice in that whole section. If all you wanted was to change those two occurrences to use a different word, you could have submitted a PR and done that. I don't personally think it's necessary, but we could have assessed a simple terminology change without all of the aggression and name-calling that your original post included.

Pip's documentation is not the place to educate users about security best practices. Nor is pip the only weak link in the Python package distribution chain. People are aware of the weaknesses, and working on them. Yelling specifically about one section of the pip documentation shows a stunning lack of awareness of the bigger picture here.

Edit: If you continue the discussion in the same way as you have up to now, I'm going to lock this issue as "too heated". If you wish to avoid that, moderate your future comments. You know by now the tone we expect from you - it's simply that you're "open, considerate and respectful" as stated in the code of conduct.

maltfield commented on June 17, 2024

Yes, I'd like to submit a PR to improve this documentation. Generally, I think it's best-practice to:

  1. Submit a bug report
  2. Have some dialog with the devs about the issue
  3. Wait for the devs to indicate that the PR is welcome
  4. Do the work
  5. Submit the PR

Instead of the above expected process, this bug report was closed immediately without any dialog. That's not very encouraging to community members who want to contribute to improve the project.

Please re-open this issue so we can fix the bugs in the documentation, as described above.
