Comments (13)

pombredanne commented on June 11, 2024

The problem stems from the fact that https://sourceforge.net/projects/scribus/files/scribus/1.6.0/scribus-1.6.0.tar.gz/download is not the actual direct download URL: it goes through several URL redirects that end up at a mirror.

The final destination looks something like this, where the first segment of the hostname changes from mirror to mirror:
https://kumisystems.dl.sourceforge.net/project/scribus/scribus/1.6.0/scribus-1.6.0.tar.gz

The stable final URL would be https://master.dl.sourceforge.net/project/scribus/scribus/1.6.0/scribus-1.6.0.tar.gz

None of these are practically visible and accessible. Therefore we should IMHO do these:

  • Convert Sourceforge download URL to PURL.
    Update the code to properly translate a Sourceforge URL to a PURL, either here or in the Python packageurl library, or in both places.
  • Consider updating "legacy" Sourceforge URLs to a canonical URL.
    This should be the one that is visible when browsing, ignoring redirections: https://sourceforge.net/projects/scribus/files/scribus/1.6.0/scribus-1.6.0.tar.gz/download
  • Update MineCode Sourceforge miners to handle and store download URLs correctly
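As an illustration of the relationship between the browsable URL and the stable master URL quoted above, here is a minimal sketch using only the Python standard library. The function name `to_master_url` is hypothetical, and the mapping is inferred from the scribus example alone, so it is an assumption that it holds for other projects.

```python
from urllib.parse import urlsplit


def to_master_url(browse_url: str) -> str:
    """Map a browse-style SourceForge download URL to a master mirror URL.

    Hypothetical helper: the mapping is inferred from a single example
    and may not hold for every SourceForge project.
    """
    segments = [s for s in urlsplit(browse_url).path.split("/") if s]
    # Expected shape: projects/<project>/files/<path...>/download
    if (
        len(segments) < 5
        or segments[0] != "projects"
        or segments[2] != "files"
        or segments[-1] != "download"
    ):
        raise ValueError(f"Not a recognized SourceForge download URL: {browse_url}")
    project = segments[1]
    file_path = "/".join(segments[3:-1])
    return f"https://master.dl.sourceforge.net/project/{project}/{file_path}"


print(to_master_url(
    "https://sourceforge.net/projects/scribus/files/scribus/1.6.0/scribus-1.6.0.tar.gz/download"
))
# https://master.dl.sourceforge.net/project/scribus/scribus/1.6.0/scribus-1.6.0.tar.gz
```

The sketch only rewrites the path; it does not check that the resulting URL actually exists on the master host.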

from dejacode.

DennisClark commented on June 11, 2024

thanks @pombredanne your proposed solution looks good to me!

tdruez commented on June 11, 2024

Note that we already have support for the https://*.sourceforge.net/project/scribus/scribus/1.6.0/scribus-1.6.0.tar.gz URLs in the packageurl library, returning pkg:sourceforge/scribus/scribus@1.6.0

We simply have to add support for this URL syntax: https://sourceforge.net/projects/scribus/files/scribus/1.6.0/scribus-1.6.0.tar.gz/download
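To make the idea concrete, here is a rough sketch of how a URL of this form could be split into PURL parts. It is not the packageurl library's code: `sourceforge_purl` and `VERSION_RE` are hypothetical names, the version detection is a simple heuristic, and the PURLs it produces may differ from what the library actually returns.

```python
import re
from urllib.parse import urlsplit

# Heuristic (assumption): a version is a dotted run of digits such as 1.6.0.
VERSION_RE = re.compile(r"^\d+(\.\d+)+$")


def sourceforge_purl(url):
    """Return a PURL string for a projects/<name>/files/.../download URL.

    Returns None when the URL does not have the expected shape.
    """
    segments = [s for s in urlsplit(url).path.split("/") if s]
    if (
        len(segments) < 5
        or segments[0] != "projects"
        or segments[2] != "files"
        or segments[-1] != "download"
    ):
        return None
    project = segments[1]
    # Directories between "files" and the filename itself.
    directories = segments[3:-2]
    for index, segment in enumerate(directories):
        if VERSION_RE.match(segment):
            if index > 0:
                # The directory just before the version is taken as the name.
                return f"pkg:sourceforge/{project}/{directories[index - 1]}@{segment}"
            return f"pkg:sourceforge/{project}@{segment}"
    # No version-like directory found: fall back to the project alone.
    return f"pkg:sourceforge/{project}"


print(sourceforge_purl(
    "https://sourceforge.net/projects/scribus/files/scribus/1.6.0/scribus-1.6.0.tar.gz/download"
))
# pkg:sourceforge/scribus/scribus@1.6.0
```

A URL with no version directory, such as one where the file sits directly under files/, falls through to the project-only PURL in this sketch.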

tdruez commented on June 11, 2024

@DennisClark I've added support for this type of URL in the purl library, see package-url/packageurl-python#139
Also, as @pombredanne suggested, we are now using the final redirect URL to extract the proper filename.

With those changes, we now generate a proper PURL and filename:
[Image: Screenshot 2024-01-04 at 14 08 37]
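For readers wondering why the final redirect URL matters for the filename, here is a small standard-library sketch. The helper names `resolve_final_url` and `filename_from_url` are hypothetical and this is not the DejaCode implementation, only an illustration of the idea.

```python
import posixpath
import urllib.request
from urllib.parse import urlsplit


def resolve_final_url(url, timeout=10):
    """Follow HTTP redirects and return the URL that finally answered.

    The response body is not read; only the final URL is inspected.
    """
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.geturl()


def filename_from_url(url):
    """Return the last path segment of a URL."""
    return posixpath.basename(urlsplit(url).path)


# The browsable URL ends in "download", which is useless as a filename:
print(filename_from_url(
    "https://sourceforge.net/projects/scribus/files/scribus/1.6.0/scribus-1.6.0.tar.gz/download"
))
# download

# A mirror URL reached after the redirects carries the real filename:
print(filename_from_url(
    "https://kumisystems.dl.sourceforge.net/project/scribus/scribus/1.6.0/scribus-1.6.0.tar.gz"
))
# scribus-1.6.0.tar.gz
```

In practice the two helpers would be chained, as in `filename_from_url(resolve_final_url(url))`, which requires network access.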

DennisClark commented on June 11, 2024

Hi @tdruez I'm getting mixed results in Staging. My original scribus case went just fine, but I then tried another package from SourceForge, turbovnc-3.1.tar.gz, on staging with a download URL of

https://sourceforge.net/projects/turbovnc/files/3.1/turbovnc-3.1.tar.gz/download

and it all went fine, including a scan, except that it did not assign any PURL values. See attached.

[Image: turbovnc-3 1 tar gz test on staging]

tdruez commented on June 11, 2024

@DennisClark I've added support for the following URL formats:

You can give it another try.

DennisClark commented on June 11, 2024

@tdruez I tested the 3 you identified in your comment, plus the scribus package, and they all look rather good, with one small issue.

When I simply click on the download link for the ventoy package, it downloads a file named Ventoy 1.0.96 release source code.tar.gz, which I think is correct and is what they call it on the web site, but in DejaCode the filename is shown as Ventoy%201.0.96%20release%20source%20code.tar.gz with all the escape characters for the spaces. If we simply don't allow spaces in the DejaCode filename field, I guess that's ok, but it does look kind of strange. See attached.

[Image: ventoy package in staging]

DennisClark commented on June 11, 2024

@tdruez one other observation, which is not directly related to this issue, but something that is somewhat perplexing. DejaCode found the existing scans that I created yesterday for the 4 packages (good) and apparently they did not get re-scanned (fine I think) but it did not perform any of the auto-updates to fields on the package (not so good), such as the license-expression, even though 3 of the 4 scans have a declared license. See attached.

[Image: Screenshot 2024-01-05 at 09 25 54]

DennisClark commented on June 11, 2024

In the example above, the geoserver does not have a detected license anyway, so that's not a big deal, but the other 3 all have declared licenses.

DennisClark commented on June 11, 2024

@tdruez Sorry I did not catch this one yesterday, but the results from creating a package with

https://sourceforge.net/projects/spacesniffer/files/spacesniffer_1_3_0_2.zip/download

do not look so great. See attached.

[Image: spacesniffer in staging]

DennisClark commented on June 11, 2024

It appears that there is an unknown number of (arbitrary) variations in the SourceForge download URLs, suggesting we really do not have a satisfactory way to determine whether we have covered them all. I'm sure you would like to finish this one, but it is possibly an unmanageable task. I'm ok if we go with "good enough" once we have fixed the ones we have actually discovered.

tdruez commented on June 11, 2024

@DennisClark changes available for review:

  • Ventoy%201.0.96%20release%20source%20code.tar.gz is now properly unquoted
  • Added support for https://sourceforge.net/projects/spacesniffer/files/spacesniffer_1_3_0_2.zip/download
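The unquoting mentioned in the first item can be illustrated with Python's standard library. This is only a sketch of the general technique, not necessarily the exact change made in DejaCode.

```python
from urllib.parse import unquote

# Filename as it appears in the percent-encoded URL path.
quoted = "Ventoy%201.0.96%20release%20source%20code.tar.gz"

# unquote() decodes %XX escapes; %20 becomes a plain space.
filename = unquote(quoted)
print(filename)
# Ventoy 1.0.96 release source code.tar.gz
```

Note that `unquote` leaves a literal `+` untouched, unlike `unquote_plus`, which is the safer choice for URL path segments.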

> one other observation, which is not directly related to this issue, but something that is somewhat perplexing. DejaCode found the existing scans that I created yesterday for the 4 packages (good) and apparently they did not get re-scanned (fine I think) but it did not perform any of the auto-updates to fields on the package (not so good), such as the license-expression, even though 3 of the 4 scans have a declared license. See attached.

Entered as #30

DennisClark commented on June 11, 2024

@tdruez The spacesniffer package creation works great now. The Ventoy package creation issue is fixed, although it was very slow to complete the Add Package step, with the cursor spinning for more than 2 minutes; I tested it with a different Ventoy version and had the same slow response. So it all appears to be working fine, but you might want to check on the performance problem.
