
Comments (3)

cooperlees commented on June 8, 2024

Howdy,

First off - https://pypi.org/simple/pip/23.0.1/json does not load for me - Did you mean https://pypi.org/pypi/pip/23.0.1/json? I'm going to guess so and work off that.

Has pip moved to using the JSON Simple API as per PEP 691? If so, your mirror also needs to generate PEP 691 JSON and return it when the client (in this case pip) requests it. There is also a huge chance of bugs here, as I have never run (and do not currently run) a bandersnatch mirror with PEP 691 support. I have changed roles and no longer run a bandersnatch mirror anywhere.

Bandersnatch uses the base package JSON API for packages (e.g. for pip https://pypi.org/pypi/pip/json). We loop through the digests and add them to simple API JSON only, per PEP691. Code can be seen here:

    "hashes": {
        digest_name: digest_hash
        for digest_name, digest_hash in r["digests"].items()
    },
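
As a rough sketch of the mapping described above, assuming a release-file record shaped like the entries in the PyPI JSON API: the function name `to_simple_file`, the sample record, and its placeholder URL are hypothetical and are not bandersnatch's actual code.

```python
from typing import Any


def to_simple_file(r: dict[str, Any]) -> dict[str, Any]:
    """Build one PEP 691 style "files" entry from a JSON API file record (sketch)."""
    return {
        "filename": r["filename"],
        # Every digest in the record is copied over verbatim, so a key such as
        # "blake2b_256" ends up in the simple API output unchanged.
        "hashes": {
            digest_name: digest_hash
            for digest_name, digest_hash in r["digests"].items()
        },
        "requires-python": r.get("requires_python"),
        "url": r["url"],
        "yanked": r.get("yanked", False),
    }


# Hypothetical input record; the digests are taken from the pyaib output in this thread.
sample = {
    "filename": "pyaib-2.1.0.tar.gz",
    "digests": {
        "blake2b_256": "0caf0389466685844d95c6f1f857008d4931d14c7937ac8dba689639ccf0cc54",
        "md5": "5a348b49d53cee26925e7204632721b7",
        "sha256": "b6114554fb312f9b0bdeaf6a7498f7da05fc17b9250c0449ed796fac9ab663e2",
    },
    "requires_python": None,
    "url": "https://example.invalid/pyaib-2.1.0.tar.gz",  # placeholder URL
    "yanked": False,
}

entry = to_simple_file(sample)
print(sorted(entry["hashes"]))
```

The point of the sketch is only that no filtering happens: whatever digest names the upstream record carries are passed straight through to the generated JSON.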

Local Test

I created a venv + installed bandersnatch and used the CI config to get a small mirror locally.

python3 -m venv /tmp/tb
/tmp/tb/bin/pip install bandersnatch
mkdir /tmp/pypi
/tmp/tb/bin/bandersnatch -c src/bandersnatch/tests/ci.conf --debug mirror |& tee /tmp/bandersnatch.log

I got both HTML + JSON output

/tmp/pypi/web/simple/
/tmp/pypi/web/simple/index.v1_json
/tmp/pypi/web/simple/index.v1_html
/tmp/pypi/web/simple/index.html
/tmp/pypi/web/simple/b
/tmp/pypi/web/simple/b/black
/tmp/pypi/web/simple/b/black/index.v1_json
/tmp/pypi/web/simple/b/black/index.v1_html
/tmp/pypi/web/simple/b/black/index.html
/tmp/pypi/web/simple/b/black/versions
/tmp/pypi/web/simple/b/black/versions/index_17485572_2023-04-26T214110.171879Z.v1_json
/tmp/pypi/web/simple/b/black/versions/index_17485572_2023-04-26T214110.171879Z.v1_html
/tmp/pypi/web/simple/b/black/versions/index_17485572_2023-04-26T214110.171879Z.html
/tmp/pypi/web/simple/p
/tmp/pypi/web/simple/p/pyaib
/tmp/pypi/web/simple/p/pyaib/index.v1_json
/tmp/pypi/web/simple/p/pyaib/index.v1_html
/tmp/pypi/web/simple/p/pyaib/index.html
/tmp/pypi/web/simple/p/pyaib/versions
/tmp/pypi/web/simple/p/pyaib/versions/index_2328239_2023-04-26T213908.980858Z.v1_json
/tmp/pypi/web/simple/p/pyaib/versions/index_2328239_2023-04-26T213908.980858Z.v1_html
/tmp/pypi/web/simple/p/pyaib/versions/index_2328239_2023-04-26T213908.980858Z.html
/tmp/pypi/web/simple/a
/tmp/pypi/web/simple/a/acmplus
/tmp/pypi/web/simple/a/acmplus/index.v1_json
/tmp/pypi/web/simple/a/acmplus/index.v1_html
/tmp/pypi/web/simple/a/acmplus/index.html
/tmp/pypi/web/simple/a/acmplus/versions
/tmp/pypi/web/simple/a/acmplus/versions/index_5103287_2023-04-26T213907.943124Z.v1_json
/tmp/pypi/web/simple/a/acmplus/versions/index_5103287_2023-04-26T213907.943124Z.v1_html
/tmp/pypi/web/simple/a/acmplus/versions/index_5103287_2023-04-26T213907.943124Z.html

Which include the blake2b_256 hash:

cat /tmp/pypi/web/simple/p/pyaib/index.v1_json | jq
...
    {
      "filename": "pyaib-2.1.0.tar.gz",
      "hashes": {
        "blake2b_256": "0caf0389466685844d95c6f1f857008d4931d14c7937ac8dba689639ccf0cc54",
        "md5": "5a348b49d53cee26925e7204632721b7",
        "sha256": "b6114554fb312f9b0bdeaf6a7498f7da05fc17b9250c0449ed796fac9ab663e2"
      },
      "requires-python": null,
      "url": "../../packages/0c/af/0389466685844d95c6f1f857008d4931d14c7937ac8dba689639ccf0cc54/pyaib-2.1.0.tar.gz",
      "yanked": false
    }
...

Are you running a PEP 691 compatible mirror? It requires a fancier configuration of nginx, or whatever you're using, so that the server respects the Accept header in the HTTP request.

Our reference banderx uses NGINX to do so: https://bandersnatch.readthedocs.io/en/latest/serving.html

  • Maybe we need better docs here, if this is not a bug and pip is indeed using the JSON simple API

EDIT: I have asked pip maintainers on discord for thoughts ...

from bandersnatch.

mbeno commented on June 8, 2024

Hello, and thank you for your reply.

Yes I meant https://pypi.org/pypi/pip/23.0.1/json

This is our nginx config for the mirror

map $http_accept $mirror_suffix {
    default ".html";

    "~*application/vnd\.pypi\.simple\.v1\+json" ".v1_json";
    "~*application/vnd\.pypi\.simple\.v1\+html" ".v1_html";
    "~*text/html" ".html";
}

server {
    listen 443 ssl http2;
    listen [::]:443 ssl http2;
    server_name <name>;
    root /pypi/srv/web;
    autoindex on;
    charset utf-8;

    keepalive_timeout 70;

    location /simple/ {
        index index$mirror_suffix;

        types {
            application/vnd.pypi.simple.v1+json v1_json;
            application/vnd.pypi.simple.v1+html v1_html;
            text/html html;
        }

        try_files $uri$mirror_suffix $uri $uri/ =404;
    }

    # configure default MIME type for JSON data paths
    location /json/ {
        default_type        application/json;
    }
    location /pypi/ {
        default_type        application/json;
    }

}
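
For readers less familiar with nginx, the `map` block above can be approximated in Python as follows. This is only a sketch of the selection logic (first matching pattern wins, `.html` as the default); the names `_SUFFIX_RULES` and `mirror_suffix` are made up for illustration and, like the nginx map, it ignores any `q=` weights in the Accept header.

```python
import re

# Ordered like the nginx map: patterns are tried top to bottom, first match wins.
_SUFFIX_RULES = [
    (re.compile(r"application/vnd\.pypi\.simple\.v1\+json", re.IGNORECASE), ".v1_json"),
    (re.compile(r"application/vnd\.pypi\.simple\.v1\+html", re.IGNORECASE), ".v1_html"),
    (re.compile(r"text/html", re.IGNORECASE), ".html"),
]


def mirror_suffix(accept: str) -> str:
    """Return the index file suffix for a given Accept header value."""
    for pattern, suffix in _SUFFIX_RULES:
        if pattern.search(accept):
            return suffix
    return ".html"  # same as the map's "default" entry


# A client that lists the JSON media type anywhere in Accept gets the JSON index.
print(mirror_suffix("application/vnd.pypi.simple.v1+json, text/html;q=0.01"))
print(mirror_suffix("*/*"))
```

With this logic a request that mentions the JSON media type is served `index.v1_json`, which is the file that carries the `hashes` dictionary discussed in this thread.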

From what I understand, though I am not very familiar with pip or PyPI mirrors in general, adhering to PEP 691 requires the keys in the hashes dict to be names that can be passed to hashlib.new(). This is where I run into my issue using pip.

From https://peps.python.org/pep-0691/

By default, any hash algorithm available via hashlib (specifically any that can be passed to hashlib.new() and do not require additional parameters) can be used as a key for the hashes dictionary. At least one secure algorithm from hashlib.algorithms_guaranteed SHOULD always be included. At the time of this PEP, sha256 specifically is recommended.

This is where pip is raising the unknown hash exception
https://github.com/pypa/pip/blob/main/src/pip/_internal/utils/hashes.py#L77-L83
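
A small sketch of the check implied by that PEP wording, applied to the `hashes` keys from the pyaib output earlier in this thread. The helper name `usable_hash_names` is hypothetical and this is not pip's code; it only tests whether `hashlib.new()` accepts each key.

```python
import hashlib


def usable_hash_names(hashes: dict[str, str]) -> dict[str, bool]:
    """Map each key of a PEP 691 "hashes" dict to whether hashlib.new() accepts it."""
    result = {}
    for name in hashes:
        try:
            hashlib.new(name)
        except (ValueError, TypeError):
            # hashlib raises ValueError for names it does not recognise.
            result[name] = False
        else:
            result[name] = True
    return result


hashes = {
    "blake2b_256": "0caf0389466685844d95c6f1f857008d4931d14c7937ac8dba689639ccf0cc54",
    "md5": "5a348b49d53cee26925e7204632721b7",
    "sha256": "b6114554fb312f9b0bdeaf6a7498f7da05fc17b9250c0449ed796fac9ab663e2",
}

report = usable_hash_names(hashes)
print(report)
```

`sha256` is accepted, while `blake2b_256` is not a name hashlib knows (its BLAKE2 constructors are registered as `blake2b` and `blake2s`), which matches the unknown hash error described above.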


mbeno commented on June 8, 2024

Issue resolved in #1442
