
pget's Introduction

Pget - The fastest file download client


Ad: I'm currently developing synchro, a new date and time library for the modern era. Please give it a ⭐!

Description

Multi-Connection Download using parallel requests.

  • Fast
  • Resumable
  • Cross-compiled (windows, linux, macOS)

This is an example of downloading the Linux kernel. It finishes in about 15 seconds.

pget

Disclaimer

This program comes with no warranty. You must use this program at your own risk.

Note

  • Using a large number of connections to a single URL can amount to a DoS attack.
  • With the spread of CDNs, it is increasingly common that using multiple connections to a single URL does not increase download speed.
    • For faster downloads, I recommend using multiple mirrors simultaneously, with one connection per mirror.

Installation

Homebrew

$ brew install pget

Go

$ go install github.com/Code-Hex/pget/cmd/pget@latest

Synopsis

This example uses 2 connections per URL.

$ pget -p 2 MIRROR1 MIRROR2 MIRROR3

If you have created a file like this:

cat list.txt
MIRROR1
MIRROR2
MIRROR3

you can do this:

cat list.txt | pget -p 2

Options

  Options:
  -h,  --help                   print usage and exit
  -p,  --procs <num>            the number of connections for a single URL (default 1)
  -o,  --output <filename>      output file to <filename>
  -t,  --timeout <seconds>      timeout of checking request in seconds
  -u,  --user-agent <agent>     identify as <agent>
  -r,  --referer <referer>      identify as <referer>
  --check-update                check if there is update available
  --trace                       display detail error messages

Pget vs Wget

URL: https://mirror.internet.asn.au/pub/ubuntu/releases/21.10/ubuntu-21.10-desktop-amd64.iso

Using

time wget https://mirror.internet.asn.au/pub/ubuntu/releases/21.10/ubuntu-21.10-desktop-amd64.iso
time pget -p 6 https://mirror.internet.asn.au/pub/ubuntu/releases/21.10/ubuntu-21.10-desktop-amd64.iso

Results

wget   3.92s user 23.52s system 3% cpu 13:35.24 total
pget -p 6   10.54s user 34.52s system 25% cpu 2:56.93 total

wget 13:35.24 total, pget -p 6 2:56.93 total (about 4.6x faster)

Binary

You can download prebuilt binaries from the GitHub releases page.

Author

codehex

pget's People

Contributors

chenrui333, code-hex, fisherzrj, littlecxm, lon9, septs, vedhavyas


pget's Issues

Support multiple URLs

For example:

pget -p 6 http://example.com/i_wnant_file.iso http://example1.com/i_wnant_file.iso http://example2.com/i_wnant_file.iso http://example3.com/i_wnant_file.iso

To ensure the mirrors serve the same file, check that the sizes match, and verify with MD5 if the server provides a hash.

The operation would look like this:

(untitled diagram)

I think this approach is friendly to the server, but it is very difficult to implement.

Please help me. Thank you.
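The size check described above can be sketched in Go: issue a HEAD request to each mirror and compare the reported Content-Length values before splitting the download across them. This is an illustrative sketch, not pget's actual implementation; sameSize and headSize are hypothetical helpers.

```go
package main

import (
	"fmt"
	"net/http"
)

// sameSize reports whether every mirror reported the same, known size.
// Unknown sizes (<= 0) are treated as a mismatch.
func sameSize(sizes []int64) bool {
	if len(sizes) == 0 {
		return false
	}
	for _, s := range sizes {
		if s <= 0 || s != sizes[0] {
			return false
		}
	}
	return true
}

// headSize fetches the Content-Length of one mirror via a HEAD request.
func headSize(url string) (int64, error) {
	res, err := http.Head(url)
	if err != nil {
		return 0, err
	}
	defer res.Body.Close()
	return res.ContentLength, nil
}

func main() {
	// With real mirrors you would collect headSize(url) for each one.
	fmt.Println(sameSize([]int64{1048576, 1048576, 1048576})) // all mirrors agree
	fmt.Println(sameSize([]int64{1048576, 1048000}))          // sizes differ
}
```

Only when all sizes agree would the downloader assign ranges across the mirrors; an MD5 check after binding would catch content mismatches the size check misses.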

Support ignoring the WARNING message when procs is larger than 4

Please support suppressing this WARNING message:

[WARNING] Using a large number of connections to 1 URL can lead to DOS attacks.
In most cases, `4` or less is enough. In addition, the case is increasing that if you use multiple conne
ctions to 1 URL does not increase the download speed with the spread of CDNs.
See: https://github.com/Code-Hex/pget#disclaimer

Would you execute knowing these?
 (y/n) [n]: 

[bug] does not support range request

pget 'https://mirrors.tuna.tsinghua.edu.cn/github-release/git-for-windows/git/Git%20for%20Windows%202.35.1%282%29/Git-2.35.1.2-32-bit.exe'
Error:
  https://mirrors.tuna.tsinghua.edu.cn/github-release/git-for-windows/git/Git%20for%20Windows%202.35.1%282%29/Git-2.35.1.2-32-bit.exe: does not support range request

When a URL does not support range requests, pget should fall back to a plain request instead of failing with this error.
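The fallback requested here could probe the server first and only use parallel range requests when they are supported. The Go outline below is a hypothetical sketch, not pget's code; supportsRange and connections are made-up helper names.

```go
package main

import (
	"fmt"
	"net/http"
)

// supportsRange reports whether a server's response to a probe request
// indicates byte-range support: a 206 Partial Content reply to a Range
// probe is definitive, otherwise Accept-Ranges must advertise "bytes".
func supportsRange(status int, acceptRanges string) bool {
	return status == http.StatusPartialContent || acceptRanges == "bytes"
}

// connections picks the number of parallel connections: the requested
// count when ranges work, otherwise a single plain GET.
func connections(requested int, rangeOK bool) int {
	if !rangeOK {
		return 1
	}
	return requested
}

func main() {
	// A server answering 200 with Accept-Ranges: none forces the fallback.
	fmt.Println(connections(6, supportsRange(200, "none"))) // 1
	fmt.Println(connections(6, supportsRange(206, "")))     // 6
}
```

With this shape, the error in the report becomes a silent downgrade to one connection rather than a hard failure.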

specify filename

Could you add a feature to specify the output filename?
Sometimes a web service returns or redirects to a timestamped URL (like index.html?t=20161014),
and pget fails to download in this case.

command like this:
pget -O index.html http://example.com/index.html // wget like
pget -f index.html http://example.com/index.html // friendly-flag? meaning File

thanks,

New benchmark

time wget $url -O wget.iso -q
________________________________________________________
Executed in  388.98 secs    fish           external
   usr time    8.60 secs  774.00 micros    8.59 secs
   sys time   18.56 secs  289.00 micros   18.56 secs



❯ time pget $url -o pget.iso
________________________________________________________
Executed in  312.20 secs    fish           external
   usr time    7.18 secs    0.00 millis    7.18 secs
   sys time   17.47 secs    1.71 millis   17.46 sec



❯ time aria2c $url -o=aria2.iso --file-allocation=falloc
________________________________________________________
Executed in  469.05 secs    fish           external
   usr time   13.69 secs  585.00 micros   13.69 secs
   sys time   14.37 secs  222.00 micros   14.37 secs

Surprisingly, you surpassed aria2 with default settings.
(Well, I did specify a better file-allocation method for aria2; I don't know if that changes the times much.)

An error occurs when a URL contains three consecutive hyphens

Apologies for writing in Japanese.
When I tried to download a YouTube video URL with pget, I got the following error:

sh-3.2$ pget -p 6 "https://r6---sn-nvoxu-ioq6.googlevideo.com/videoplayback?sparams=cnr%2Cdur%2Cgcr%2Cid%2Cinitcwndbps%2Cip%2Cipbits%2Citag%2Clmt%2Cmime%2Cmm(rest omitted)"
Error:
  url has not been set in argument

Investigating the cause, it appears that the error occurs when a URL contains three consecutive hyphens.

pget -p 6 "http://r6-snnvoxuioq6.googlevideo.com"
pget -p 6 "http://r6--snnvoxuioq6.googlevideo.com"

In these cases, a proper "no such host" error is shown.

However, only in this case:

pget -p 6 "http://r6---snnvoxuioq6.googlevideo.com"

does it fail with Error: url has not been set in argument.

Please take a look. Thank you.

sftp support?

As far as I know, lftp is the only tool that can download a file from an SFTP server concurrently. I wish pget supported this feature, based on an sftp library.

Error: file name too long

On Linux, using pget 0.1.1 installed via Linuxbrew, I get the following error when downloading a binary.
How could we fix it?

$ pget https://github.com/42wim/matterbridge/releases/download/v1.25.0/matterbridge-1.25.0-linux-64bit

Error:
  failed to mkdir for download location: mkdir _a44ac081-f540-4721-ae1f-05c2bdd363cf?X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Credential=AKIAIWNJYAX4CSVEH53A%2F20220502%2Fus-east-1%2Fs3%2Faws4_request&X-Amz-Date=20220502T142130Z&X-Amz-Expires=300&X-Amz-Signature=db8edd56f819399d6040d92ced71cba3fa7b2d7b94d5cd6255f33d7e092567f5&X-Amz-SignedHeaders=host&actor_id=0&key_id=0&repo_id=44820350&response-content-disposition=attachment%3B%20filename%3Dmatterbridge-1.25.0-linux-64bit&response-content-type=application%2Foctet-stream.1: file name too long
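The failure happens because the redirect target's entire query string ends up in the temporary directory name. One possible fix is to derive the local name from the URL path only and cap its length. The Go sketch below is hypothetical; safeFilename is not a real pget function.

```go
package main

import (
	"fmt"
	"net/url"
	"path"
)

// safeFilename derives a local file name from a download URL, dropping
// the query string (e.g. signed S3 parameters) and truncating to maxLen
// bytes so mkdir/create cannot fail with "file name too long".
func safeFilename(rawURL string, maxLen int) string {
	u, err := url.Parse(rawURL)
	if err != nil {
		return "download"
	}
	name := path.Base(u.Path) // ignores everything after "?"
	if name == "/" || name == "." || name == "" {
		name = "download"
	}
	if len(name) > maxLen {
		name = name[:maxLen]
	}
	return name
}

func main() {
	// The query string from the signed redirect is discarded entirely.
	fmt.Println(safeFilename("https://example.com/releases/download/v1/app-linux-64bit?X-Amz-Expires=300", 255))
}
```

Most filesystems limit a single path component to 255 bytes, which is why the signed-URL query string in the report overflows the limit; 255 is the cap assumed here.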

Not supported range access

With a lot of files, I get this error:

pget -p 6 <url>
Checking now <url>
Error:
  not supported range access: <url>

For example, this happens when downloading a GitHub repo zip.

I think it may be related to download redirection; wget handles it just fine.

disk space warning

Hey guys, correct me if I'm wrong, but using -p 20 with a 200 MB file (more like 175 MB) would only need about 3.5 GB, right? I don't have a single partition on my box with less than 20 GB of available space, yet if I try to run two of those commands I get the disk space warning.

not supported range access

pget -p 6 https://vagrantcloud.com/cdaf/boxes/WindowsServerStandard/versions/2020.10.12/providers/virtualbox.box
Checking now https://vagrantcloud.com/cdaf/boxes/WindowsServerStandard/versions/2020.10.12/providers/virtualbox.box
Error:
  not supported range access: https://vagrantcloud.com/cdaf/boxes/WindowsServerStandard/versions/2020.10.12/providers/virtualbox.box

Pause/Continue?

Hi, I'm wondering whether there is a way to pause/continue or recover interrupted downloads. Also, if the origin address becomes invalid after a TTL, can I provide a new address and recover my download? Many thanks.

Please add support for recursive download

wget can download files recursively from a website,
but sadly it is not multi-threaded and downloads one file at a time.

Instead of wget -r ftp://example.com/test/
we want
pget -p 4 -r ftp://example.com/test/

Release names of 0.2.x are inconsistent with 0.1.x

Hi, have you accidentally added a blank in the release CI? See the file names in pget_checksums.txt and on the release page: there is a blank after pget_ in every archive name (.tar.gz and .zip), e.g. pget_ 0.2.1_Windows_x86_64.zip, and on the GitHub release page the blank becomes a dot (.).

The process cannot access the file because it is being used by another process.

PS D:\tools\pget_windows_amd64> .\pget.exe -p 6 http://www.nerc.edu.cn/wk///2016/08/02/344a06be-dc8e-4d7b-a5b9-bc1b1edb83b5/Scorm/27c69af9-30da-4f7e-8
f71-407277c10da3.mp4 --trace
Checking now http://www.nerc.edu.cn/wk///2016/08/02/344a06be-dc8e-4d7b-a5b9-bc1b1edb83b5/Scorm/27c69af9-30da-4f7e-8f71-407277c10da3.mp4
Download start from http://www.nerc.edu.cn/wk///2016/08/02/344a06be-dc8e-4d7b-a5b9-bc1b1edb83b5/Scorm/27c69af9-30da-4f7e-8f71-407277c10da3.mp4
 53349059 / 53349059 [===================================================================================================================] 100.00% 6s

binding with files...
 0 / 53349059 [-----------------------------------------------------------------------------------------------------------------------------]   0.00%
Error:
remove _27c69af9-30da-4f7e-8f71-407277c10da3.mp4.6/27c69af9-30da-4f7e-8f71-407277c10da3.mp4.6.0: The process cannot access the file because it is bein
g used by another process.
failed to remove a file in download location
github.com/Code-Hex/pget.(*Data).BindwithFiles
        /Users/CodeHex/Desktop/go/src/github.com/Code-Hex/pget/util.go:251
github.com/Code-Hex/pget.(*Pget).Run
        /Users/CodeHex/Desktop/go/src/github.com/Code-Hex/pget/pget.go:78
main.main
        /Users/CodeHex/Desktop/go/pget/cmd/pget/main.go:13
runtime.main
        /usr/local/opt/go/libexec/src/runtime/proc.go:188
runtime.goexit
        /usr/local/opt/go/libexec/src/runtime/asm_amd64.s:1998

OS: Windows7 64
Pget v0.0.4, parallel file download client

Feature request: support wrapping output / url information in quotes to handle spaces

I haven't verified whether -o supports quoting for paths with spaces, but I can confirm that URLs currently do not. I know I can pass a URL like "https://my.server/has spaces in paths/it sucks/i need this file.exe" by replacing the spaces with %20, but I would like to be able to use pget -o "C:\my file path\my file name.exe" "https://my.server/has spaces in paths/it sucks/i need this file.exe" instead, as is somewhat standard, at least in Windows applications.

Self locked?

C:\vm>pget https://releases.ubuntu.com/20.04.3/ubuntu-20.04.3-live-server-amd64.iso -p 8
[WARNING] Using a large number of connections to 1 URL can lead to DOS attacks.
In most cases, `4` or less is enough. In addition, the case is increasing that if you use multiple connections to 1 URL does not increase the download speed with the spread of CDNs.
See: https://github.com/Code-Hex/pget#disclaimer

Would you execute knowing these?
 (y/n) [n]: y
1.17 GiB / 1.17 GiB [---------------------------------------------------------------------------] 100.00% 54.10 MiB p/s

binding with files...
Error:
  failed to remove "_ubuntu-20.04.3-live-server-amd64.iso.8/ubuntu-20.04.3-live-server-amd64.iso.8.0" in download location: remove _ubuntu-20.04.3-live-server-amd64.iso.8/ubuntu-20.04.3-live-server-amd64.iso.8.0: The process cannot access the file because it is being used by another process.

Only running one command.

Attempt to download Linux kernel fails: does not support range request

$ pget -p 6  https://git.kernel.org/torvalds/t/linux-6.5-rc2.tar.gz
[WARNING] Using a large number of connections to 1 URL can lead to DOS attacks.
In most cases, `4` or less is enough. In addition, the case is increasing that if you use multiple connections to 1 URL does not increase the download speed with the spread of CDNs.
See: https://github.com/Code-Hex/pget#disclaimer

Would you execute knowing these?
 (y/n) [n]: y
Error:
  https://git.kernel.org/torvalds/t/linux-6.5-rc2.tar.gz: does not support range request

go120-1.20.3
FreeBSD 13.2

Resume

I'll create a resume function.

Progress Bar freezes

The progress bar freezes after about 5 seconds and does not update until you interrupt (Ctrl+C) and start the command again.

Any suggestions on how to fix this?

Thanks!

Get url from stdin

This is for command-line pipes.
I want to execute it like this:

[url get command] | pget -p 6

Problems when used in a project

If I use it in my project, I find it impossible to customize the log output method. I think newdownloadclient could be exposed as a public method for external programs to use.

Disable progress bar for scripted applications?

I'm evaluating several download-accelerator tools, found pget, and am enjoying working with it. It would be great, however, if there were an option to disable the progress bar, as the output is otherwise difficult to understand when redirected to a logfile. (Ideally, a user could disable just the progress bar, as other basic info and statistics remain quite useful. With some tools, it's all or nothing: You get nothing at all with e.g. a --quiet switch, or you get everything, including the log-busting progress bar.)

example in readme fails

pget -p 6 http://ubuntutym2.u-toyama.ac.jp/ubuntu/16.04/ubuntu-16.04-desktop-amd64.iso
Checking now http://ubuntutym2.u-toyama.ac.jp/ubuntu/16.04/ubuntu-16.04-desktop-amd64.iso
Error:
  not supported range access: http://ubuntutym2.u-toyama.ac.jp/ubuntu/16.04/ubuntu-16.04-desktop-amd64.iso

Attempt to download pget_0.1.1_Linux_386.deb fails: file name too long

$ pget https://github.com/Code-Hex/pget/releases/download/v0.1.1/pget_0.1.1_Linux_386.deb
Error:
  failed to mkdir for download location: mkdir _09fe5966-0e11-4cbb-a0f9-07bcee8a7fc8?X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Credential=AKIAIWNJYAX4CSVEH53A%2F20230716%2Fus-east-1%2Fs3%2Faws4_request&X-Amz-Date=20230716T221902Z&X-Amz-Expires=300&X-Amz-Signature=334ff57eec995afbcc4a04136bf1933e7248a2282e6338e11bd296e8d2d51b87&X-Amz-SignedHeaders=host&actor_id=0&key_id=0&repo_id=62295217&response-content-disposition=attachment%3B%20filename%3Dpget_0.1.1_Linux_386.deb&response-content-type=application%2Foctet-stream.1: file name too long

The CheckMirrors func has a bug

func (p *Pget) CheckMirrors(ctx context.Context, url string, ch *Ch) {
    res, err := ctxhttp.Head(ctx, http.DefaultClient, url)

Here, ctxhttp.Head sends no User-Agent or Referer, so the response is sometimes 403, which surfaces as a misleading "not supported range access" error.
