notglop / docker-drag
Download image from the Docker Hub HTTPS API
License: GNU General Public License v3.0
Will this feature be supported?
The script cannot resume downloading the target file after a request fails.
Traceback (most recent call last):
File "docker_pull.py", line 93, in <module>
os.mkdir(imgdir)
FileExistsError: [Errno 17] File exists: 'tmp_imagename_latest'
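The traceback above comes from os.mkdir(imgdir) being called when the directory is left over from an earlier run. A minimal sketch of a tolerant replacement, assuming it is acceptable to reuse the existing directory (the helper name ensure_image_dir is hypothetical and not part of the script):

```python
import os


def ensure_image_dir(imgdir):
    """Create imgdir if it is missing.

    Returns True when the directory already existed, so the caller can
    decide whether to reuse or clean up files from a previous run.
    """
    existed = os.path.isdir(imgdir)
    # exist_ok=True suppresses FileExistsError for an existing directory.
    os.makedirs(imgdir, exist_ok=True)
    return existed
```

Reusing the directory only avoids the crash. Whether partially downloaded layers inside it can be trusted is a separate question.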
Hi, I am wondering which part of the code identifies the image source? The default image source has a slow download speed.
Can you please add an open source license so that people may use this software?
Hi,
I am currently porting this tool to a .NET Core tool, so I have experimented with some cases.
The layer hashes are calculated as sha256(layer.tar).
They are also recalculated when image.tar is loaded.
Could you check this?
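For anyone who wants to check the sha256(layer.tar) observation, here is a minimal sketch that hashes an uncompressed layer tarball in chunks. The function name sha256_of_file and the "sha256:" prefix on the result are illustrative choices, not taken from the script:

```python
import hashlib


def sha256_of_file(path, chunk_size=1024 * 1024):
    """Return 'sha256:<hex>' for the file at path, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # iter() with a sentinel stops at the empty bytes object at EOF.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return "sha256:" + digest.hexdigest()
```

The result can be compared with the entries under rootfs.diff_ids in the image config JSON, which are digests of the uncompressed layer tar files.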
This is a good tool. The Docker host Linux machine in my office doesn't have internet access, so I used this tool to download the Docker image from Docker Hub on a Windows machine that does. It works great.
But I would like this tool to keep the full repository name. I hope the repository name is the same as on Docker Hub after I load the tar file into the Docker engine. Maybe a repository name format policy could be applied inside this tool?
Below are my operations:
> python docker_pull.py library/nginx:1.19.3-alpine
Creating image structure in: tmp_nginx_1.19.3-alpine
188c0c94c7c5: Pull complete [2796860]
61c2c0635c35: Pull complete [6761747]
378d0a9d4d5f: Pull complete [601]
2fe865f77305: Pull complete [897]
b92535839843: Pull complete [664]
Docker image pulled: library_nginx.tar
Then I renamed library_nginx.tar to library_nginx-1.19.3-alpine.tar and uploaded it to the Linux machine that has Docker Engine installed.
$ sudo docker load -i library_nginx-1.19.3-alpine2.tar
ace0eda3e3be: Loading layer [==================================================>] 5.843MB/5.843MB
4daeb7840e4d: Loading layer [==================================================>] 17.45MB/17.45MB
835f5b67679c: Loading layer [==================================================>] 3.072kB/3.072kB
d0e26daf1f58: Loading layer [==================================================>] 4.096kB/4.096kB
8d6d1951ab0a: Loading layer [==================================================>] 3.584kB/3.584kB
Loaded image: nginx:1.19.3-alpine
$ sudo docker images
REPOSITORY   TAG             IMAGE ID       CREATED       SIZE
nginx        1.19.3-alpine   4efb29ff172a   13 days ago   21.8MB
As you can see, the "library/" prefix is lost in the REPOSITORY name.
Save to /var/lib/docker/overlay2
and make it visible in docker images
$ python docker_pull.py bkimminich/juice-shop
[-] Cannot fetch manifest for bkimminich/juice-shop [HTTP 404]
b'{"errors":[{"code":"MANIFEST_UNKNOWN","message":"OCI index found, but accept header does not support OCI indexes"}]}\n'
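The registry's message says the Accept header did not list the OCI index type. A minimal sketch of building an Accept value that advertises both the Docker and the OCI media types is below. The four media type strings are the standard ones; the function name build_accept_header is hypothetical:

```python
# Manifest media types a client can advertise when fetching /manifests/<tag>.
MANIFEST_MEDIA_TYPES = (
    "application/vnd.docker.distribution.manifest.v2+json",
    "application/vnd.docker.distribution.manifest.list.v2+json",
    "application/vnd.oci.image.manifest.v1+json",
    "application/vnd.oci.image.index.v1+json",
)


def build_accept_header(media_types=MANIFEST_MEDIA_TYPES):
    """Join the media types into a single Accept header value."""
    return ", ".join(media_types)
```

When the response is an index or a manifest list, it contains per-platform manifest references instead of layers, so the client still has to fetch the manifest for one platform afterwards.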
It would be good to have a progress bar in this for loop:
Lines 98 to 144 in 249fc4f
Maybe something like tqdm? https://github.com/tqdm/tqdm
Hello,
I am trying to download a container image from my Azure Container Registry, which uses username and password authentication.
But I am getting the error below.
Traceback (most recent call last):
File "/home/user/docker_pull.py", line 72, in <module>
auth_head = get_auth_head('application/vnd.docker.distribution.manifest.v2+json')
File "/home/user/docker_pull.py", line 54, in get_auth_head
access_token = resp.json()['token']
KeyError: 'token'
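One possible cause of the KeyError, offered as an assumption and not confirmed for this registry: the Docker token authentication specification allows the token endpoint to return the bearer token under "token", under "access_token" (for OAuth 2.0 compatibility), or both. A minimal sketch that accepts either key (the helper name extract_bearer_token is hypothetical):

```python
def extract_bearer_token(payload):
    """Return the bearer token from a decoded token-endpoint JSON payload.

    Checks 'token' first and then 'access_token'. Raises ValueError with
    the keys that were present so an error body is easier to diagnose.
    """
    for key in ("token", "access_token"):
        value = payload.get(key)
        if value:
            return value
    raise ValueError("no token in response, keys present: %s" % sorted(payload))
```

If the payload is an error document instead (for example because credentials are required), the ValueError shows which keys came back in place of a bare KeyError.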
Hello, thank you for the useful script, super appreciate that.
I got a 404 when downloading velero. The Docker Hub URL is https://hub.docker.com/r/velero/velero
$ python3 docker_pull.py velero/velero
target is https://registry-1.docker.io/v2/velero/velero/manifests/latest
[-] Cannot fetch manifest for velero/velero [HTTP 404]
b'{"errors":[{"code":"MANIFEST_UNKNOWN","message":"OCI index found, but accept header does not support OCI indexes"}]}
Lines 118 to 119 in 5413165
For each layer docker creates a v1 config, and a layer id is basically a digest of the v1 config, another layer id, and the parent layer id. If you're interested I can probably describe it more precisely. And possibly how other parts of docker save work.
Thanks for this wonderful script.
But I have encountered the following problem.
(python2) C:\Users\red.suh>python docker_pull.py centos
Traceback (most recent call last):
File "docker_pull.py", line 40, in <module>
layers = resp.json()['layers']
KeyError: 'layers'
For nginx or sonatype/nexus3, there is no problem.
How can I solve the problem?
Thanks
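A plausible explanation for the missing 'layers' key, stated as an assumption: for multi-architecture images the registry can answer with a manifest list (or OCI index), whose top-level key is "manifests" and which has no "layers". A minimal sketch of picking one platform's digest from such a document (pick_platform_digest is a hypothetical helper, and linux/amd64 is only a default chosen for illustration):

```python
def pick_platform_digest(manifest, os_name="linux", arch="amd64"):
    """Return the digest of the manifest for the requested platform.

    Returns None when the document already is a single-image manifest
    (it has 'layers'), so the caller can use it directly.
    """
    if "layers" in manifest:
        return None
    for entry in manifest.get("manifests", []):
        platform = entry.get("platform", {})
        if platform.get("os") == os_name and platform.get("architecture") == arch:
            return entry["digest"]
    raise LookupError("no manifest for %s/%s" % (os_name, arch))
```

The returned digest would then be requested from the same /manifests/ endpoint in place of the tag to obtain the manifest that does contain 'layers'.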
I am calling the script as follows:
python docker_pull.py registry.access.redhat.com/ubi8/ubi
The script will output the following:
78afc5364ad2: Downloading...
And then it fails:
Traceback (most recent call last):
File "docker_pull.py", line 155, in <module>
unit = int(bresp.headers['Content-Length']) / 50
File "C:\Users\<NTID>\AppData\Local\Programs\Python\Python37\lib\site-packages\requests\structures.py", line 52, in __getitem__
return self._store[key.lower()][1]
KeyError: 'content-length'
I've made a few adjustments to your script, primarily adding the proxies= option for calls to requests functions as I'm stuck behind a proxy. This allows me to communicate with the Redhat registry. So line 155 is actually line 141 as referenced below.
Line 141 in 5413165
I've added a print statement to dump the bresp.headers and Content-Length is not in there:
{
'Accept-Ranges': 'bytes',
'Content-Type': 'text/plain',
'ETag': '"ced9e6d20e1ac931689399b68b0dd6a4:1588112355"',
'Last-Modified': 'Tue, 28 Apr 2020 22:03:16 GMT',
'Server': 'AkamaiNetStorage',
'Vary': 'Accept-Encoding',
'Date': 'Mon, 18 May 2020 17:57:51 GMT',
'Transfer-Encoding': 'chunked',
'X-Docker-Size': '-1',
'Cache-Control': 'proxy-revalidate',
'Connection': 'Keep-Alive',
'Content-Encoding': 'gzip'
}
I've confirmed with a machine at home (different OS and no proxy) that the same error is experienced. I understand that the script relies on the Docker HTTPS API v2 so I apologize if the redhat registry is actually v1. I'm new to docker and do not know how to check for this.
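The header dump shows Transfer-Encoding: chunked, and a chunked response carries no Content-Length. A minimal sketch of computing the progress unit defensively, so the download can proceed without a progress bar when the size is unknown (progress_unit is a hypothetical helper; the divisor 50 is taken from the traceback):

```python
def progress_unit(headers, steps=50):
    """Return the number of bytes per progress step, or None if unknown.

    headers is any mapping with a .get() method. requests exposes response
    headers through a case-insensitive mapping, whereas a plain dict (as
    used in quick tests) is case-sensitive.
    """
    raw = headers.get("Content-Length")
    if raw is None or not raw.isdigit():
        return None
    return int(raw) / steps
```

A caller would skip drawing the bar when the result is None and still write each chunk to disk.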
python docker_pull.py postgres
I ran the script on Windows 11 / Python 3.11 and got the following error:
Traceback (most recent call last):
File "D:\share\docker_pull.py", line 87, in <module>
layers = resp.json()['layers']
~~~~~~~~~~~^^^^^^^^^^
KeyError: 'layers'
very useful, thank you
Logs:
Creating image structure in: tmp_gitlab-ce_12.3.7-ce.0
e80174c8b43b: Pull complete [44144090]
d1072db285cc: Pull complete [529]
858453671e67: Pull complete [849]
3d07b1124f98: Pull complete [170]
655fb0f51b08: Pull complete [26257584]
063c37e78c5c: Pull complete [141]
a0398d68068f: Pull complete [146]
f41e790a20a6: Pull complete [236]
8eb8c4ceb762: Pull complete [4095]
e3a502127d8c: Pull complete [705893899]
Traceback (most recent call last):
File "docker_pull.py", line 129, in <module>
file.write(unzLayer.read())
File "C:\Users\user\AppData\Local\Programs\Python\Python37-32\lib\gzip.py", line 276, in read
return self._buffer.read(size)
File "C:\Users\user\AppData\Local\Programs\Python\Python37-32\lib\gzip.py", line 471, in read
uncompress = self._decompressor.decompress(buf, size)
MemoryError
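The failing line, file.write(unzLayer.read()), decompresses the whole layer into memory at once, which is likely too much for a layer of about 700 MB compressed on a 32-bit Python (the path shows Python37-32). A minimal sketch of decompressing in fixed-size chunks instead (gunzip_to_file and its parameters are hypothetical names):

```python
import gzip
import shutil


def gunzip_to_file(src_path, dst_path, chunk_size=1024 * 1024):
    """Decompress the gzip file at src_path into dst_path chunk by chunk."""
    with gzip.open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        # copyfileobj reads and writes chunk_size bytes at a time, so peak
        # memory stays near chunk_size regardless of the layer size.
        shutil.copyfileobj(src, dst, chunk_size)
```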
I tried to use OAuth2 to get a token, but it failed. Do you have any suggestions?
fake_layerid = hashlib.sha256((parentid+'\n'+ublob+'\n').encode('utf-8')).hexdigest()
layerdir = imgdir + '/' + fake_layerid
os.mkdir(layerdir)
I tested with docker save -o and found that the layer ID it produces is different from the fake_layerid generated by the script.
First, a super useful tool. I made use of it at work after finding it through a Stack Overflow post. Thank you for making such a straightforward but functional tool.
Next, a minor nitpick: while docker load is covered in the main Docker documentation, it might not hurt to add the basic use case to this tool's documentation so people have some context for how to use it effectively.
Hi,
In the file 'docker_pull.py', lines 164 and 165:
# FIXME: json.loads() automatically converts to unicode, thus decoding values whereas Docker doesn't
json_obj = json.loads(confresp.content.decode("utf8"))
Instead of:
json_obj = json.loads(confresp.content)
For me it fixed the failure.
First of all, thanks for the script!
However, I couldn't make it work for
docker pull nvcr.io/nvidia/tensorflow:19.10-py3
via:
python docker_pull.py nvcr.io/nvidia/tensorflow:19.05-py3
Error message:
Traceback (most recent call last):
File "docker_pull.py", line 47, in <module>
reg_service = resp.headers['WWW-Authenticate'].split('"')[3]
IndexError: list index out of range
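The IndexError comes from split('"')[3], which assumes the WWW-Authenticate header always has a second quoted value (the service) after the realm. A minimal sketch that parses the key="value" pairs by name instead (parse_www_authenticate is a hypothetical helper, and the realm URLs in the usage below are placeholders):

```python
import re


def parse_www_authenticate(header):
    """Split a WWW-Authenticate value into (scheme, {param: value}).

    Only quoted key="value" parameters are collected. Missing parameters
    are simply absent from the dict, so callers can use .get().
    """
    scheme, _, params = header.partition(" ")
    fields = dict(re.findall(r'(\w+)="([^"]*)"', params))
    return scheme, fields
```

With this, a header such as Bearer realm="https://example.invalid/token" yields a dict without "service", and the caller can decide whether to omit that query parameter.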
Hi, is there any possibility of using this script with a registry that requires authorization? Thanks
$ python docker_pull.py foo/bar:latest
[-] Cannot fetch manifest for foo/bar [HTTP 401]
b'{"errors":[{"code":"UNAUTHORIZED","message":"authentication required","detail":[{"Type":"repository","Class":"","Name":"foo/bar","Action":"pull"}]}]}\n'
But if I run:
docker pull foo/bar:latest
it works.
I'm not sure how to pass my creds to the script.
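In the Docker token authentication flow, credentials are sent as HTTP Basic auth to the token endpoint, and the returned bearer token is then used against the registry. A minimal sketch of building that token request follows. The function name and its parameters are hypothetical, and whether the script exposes a place to plug this in is not something the traceback confirms:

```python
import base64
from urllib.parse import urlencode


def build_token_request(realm, service, repository, username, password):
    """Return (url, headers) for a pull-scoped token request with Basic auth."""
    query = urlencode({
        "service": service,
        "scope": "repository:%s:pull" % repository,
    })
    raw = ("%s:%s" % (username, password)).encode("utf-8")
    creds = base64.b64encode(raw).decode("ascii")
    return realm + "?" + query, {"Authorization": "Basic " + creds}
```

The realm and service values would come from the registry's WWW-Authenticate challenge. Passing the password on the command line leaves it in shell history, so reading it from a prompt or an environment variable may be preferable.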
trying
524$ python2 docker_pull.py "jupyter/datascience-notebook"
/usr/lib/python2.7/site-packages/urllib3/connectionpool.py:852: InsecureRequestWarning: Unverified HTTPS request is being made. Adding certificate verification is strongly advised. See: https://urllib3.readthedocs.io/en/latest/advanced-usage.html#ssl-warnings
InsecureRequestWarning)
Creating image structure in: tmp_datascience-notebook_latest
a48c500ed24e: Pull complete [30957448]
1e1de00ff7e1: Pull complete [841]
0330ca45a200: Pull complete [412]
471db38bcfbf: Pull complete [849]
0b4aba487617: Pull complete [162]
1bac85b3a63e: Pull complete [19696719]
245be47b44f6: Pull complete [424313]
ef168d10cf08: Pull complete [666]
3f40baab49e8: Pull complete [1891]
1074310668a8: Pull complete [6005]
acab6d938518: Pull complete [147]
d3c413e667b9: Pull complete [72063649]
63b84d46215a: Pull complete [11195]
e2aa43484a2e: Pull complete [93156146]
e45a3ec35504: Pull complete [2160]
b91bbc043eab: Pull complete [434]
8842220992fc: Pull complete [691]
fc8f34d51deb: Pull complete [1006]
5f6edb450186: Pull complete [1015]
44257488fae5: Pull complete [824729788]
540df7774880: Pull complete [71414708]
178f3a1a18b4: Pull complete [271621569]
03528a45986d: Pull complete [456599]
5c52a47b5569: Pull complete [10279]
1f67b31a20f8: Pull complete [17277944]
70de4b41273e: Pull complete [171]
Traceback (most recent call last):
File "docker_pull.py", line 86, in <module>
file.write(unzLayer.read())
File "/usr/lib/python2.7/gzip.py", line 261, in read
self._read(readsize)
File "/usr/lib/python2.7/gzip.py", line 303, in _read
self._read_gzip_header()
File "/usr/lib/python2.7/gzip.py", line 197, in _read_gzip_header
raise IOError, 'Not a gzipped file'
IOError: Not a gzipped file
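"Not a gzipped file" means the downloaded blob did not start with the gzip magic bytes. Possible reasons, none confirmed here, include an uncompressed layer or an error body saved in place of the blob. A minimal sketch that checks the magic bytes and copies the file unchanged when it is not gzip (extract_layer is a hypothetical helper):

```python
import gzip
import shutil

GZIP_MAGIC = b"\x1f\x8b"  # first two bytes of every gzip stream


def extract_layer(src_path, dst_path):
    """Write the layer at src_path to dst_path, gunzipping only if needed.

    Returns True when the source was gzip-compressed.
    """
    with open(src_path, "rb") as probe:
        is_gzip = probe.read(2) == GZIP_MAGIC
    opener = gzip.open if is_gzip else open
    with opener(src_path, "rb") as src, open(dst_path, "wb") as dst:
        shutil.copyfileobj(src, dst)
    return is_gzip
```

Copying a non-gzip blob through unchanged is only correct if it really is a tar archive, so logging the first bytes when the check fails would help tell the cases apart.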
This program fails when downloading Windows server core images. For example,
python docker_pull.py mcr.microsoft.com/windows/nanoserver
will fail.
I believe the issue lies in the fact that Microsoft uses https://mcr.microsoft.com/v2/ as its base API URL. I made a workaround by setting repo = 'windows' in docker_pull.py and using the microsoft.com link as the base API URL, which seems to be working for now. Authentication still goes through https://auth.docker.io and works the same way you wrote it.
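A more general alternative to hard-coding repo would be to split the image reference into registry, repository, and tag. The sketch below follows the usual Docker convention that the first path component is a registry host when it contains a dot or a colon or equals "localhost". The function name and the default registry value are assumptions, and digest references (name@sha256:...) are not handled:

```python
def parse_image_ref(ref, default_registry="registry-1.docker.io"):
    """Split an image reference into (registry, repository, tag)."""
    name, tag = ref, "latest"
    # A colon only marks a tag when it appears after the last slash;
    # otherwise it belongs to a host:port registry prefix.
    if ":" in ref.rsplit("/", 1)[-1]:
        name, tag = ref.rsplit(":", 1)
    parts = name.split("/")
    first = parts[0]
    if len(parts) > 1 and ("." in first or ":" in first or first == "localhost"):
        return first, "/".join(parts[1:]), tag
    # Docker Hub: single-component names live under the library/ namespace.
    repo = name if "/" in name else "library/" + name
    return default_registry, repo, tag
```

The manifest URL would then be built as https://<registry>/v2/<repository>/manifests/<tag>, so the Microsoft registry is reached without editing the script.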