docker-drag's People

Contributors

dane-l, guyomer, jcorvino, jeffque, notglop

docker-drag's Issues

How to change the image source?

Hi, I am wondering which part of the code identifies the image source. The default image source has a slow download speed.

Fake layer ids

Hi,

I am currently porting this tool to a .NET Core tool, so I have experimented with a few cases.
The layer hashes are calculated as sha256(layer.tar).

Also, the layers appear to be recalculated when image.tar is loaded.

Could you check this?
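
For reference, the digest described above can be computed without reading the whole layer into memory. This is a minimal sketch, assuming the reporter means the SHA-256 of the uncompressed layer.tar (the value the image config lists under rootfs.diff_ids); the function name `layer_diff_id` is hypothetical and not part of docker_pull.py.

```python
import hashlib


def layer_diff_id(path, chunk_size=1024 * 1024):
    """Return 'sha256:<hex>' for the file at path, hashing it chunk by chunk."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        # Read fixed-size chunks until EOF so large layers never sit in memory.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return "sha256:" + digest.hexdigest()
```

Comparing this value against the diff_ids in the image config is one way to check whether an extracted layer.tar matches what Docker expects.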

Repository full name is not preserved

This is a good tool. The Docker host (a Linux machine) in my office doesn't have internet access, so I used this tool to download the Docker image from Docker Hub on a Windows machine that does. It works great.
However, I would like the tool to keep the full repository name. I hope the repository name is the same as on Docker Hub after I load the tar file into Docker Engine. Maybe the tool could apply a sensible repository name format policy?
Here is what I did:

> python docker_pull.py library/nginx:1.19.3-alpine
Creating image structure in: tmp_nginx_1.19.3-alpine
188c0c94c7c5: Pull complete [2796860]
61c2c0635c35: Pull complete [6761747]
378d0a9d4d5f: Pull complete [601]
2fe865f77305: Pull complete [897]
b92535839843: Pull complete [664]
Docker image pulled: library_nginx.tar

Then I renamed library_nginx.tar to library_nginx-1.19.3-alpine.tar and uploaded it to the Linux machine, which has Docker Engine installed.

$ sudo docker load -i library_nginx-1.19.3-alpine2.tar
ace0eda3e3be: Loading layer [==================================================>]  5.843MB/5.843MB
4daeb7840e4d: Loading layer [==================================================>]  17.45MB/17.45MB
835f5b67679c: Loading layer [==================================================>]  3.072kB/3.072kB
d0e26daf1f58: Loading layer [==================================================>]  4.096kB/4.096kB
8d6d1951ab0a: Loading layer [==================================================>]  3.584kB/3.584kB
Loaded image: nginx:1.19.3-alpine
$ sudo docker images
REPOSITORY          TAG                 IMAGE ID            CREATED             SIZE
nginx               1.19.3-alpine       4efb29ff172a        13 days ago         21.8MB

As you can see, the "library/" part is lost from the REPOSITORY name.

Cannot fetch manifest

$ python docker_pull.py bkimminich/juice-shop
[-] Cannot fetch manifest for bkimminich/juice-shop [HTTP 404]
b'{"errors":[{"code":"MANIFEST_UNKNOWN","message":"OCI index found, but accept header does not support OCI indexes"}]}\n'
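
The error message says the Accept header does not cover OCI indexes. A minimal sketch of a header builder that advertises both Docker and OCI media types follows; the helper name `manifest_headers` is hypothetical, and whether the rest of the script can then handle the returned index is a separate question.

```python
# Media types a registry may use for single-image manifests and
# for multi-platform manifest lists / OCI indexes.
MANIFEST_TYPES = [
    "application/vnd.docker.distribution.manifest.v2+json",
    "application/vnd.docker.distribution.manifest.list.v2+json",
    "application/vnd.oci.image.manifest.v1+json",
    "application/vnd.oci.image.index.v1+json",
]


def manifest_headers(token=None):
    """Build request headers that accept both Docker and OCI manifests."""
    headers = {"Accept": ", ".join(MANIFEST_TYPES)}
    if token:
        # Bearer token obtained from the registry's auth service.
        headers["Authorization"] = "Bearer " + token
    return headers
```

The resulting dict could be passed as the headers= argument of the manifest request.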

[Feature Request] Progress bar using tqdm

It would be good to have a progress bar in this for loop:

docker-drag/docker_pull.py

Lines 98 to 144 in 249fc4f

for layer in layers:
    ublob = layer['digest']
    # FIXME: Creating fake layer ID. Don't know how Docker generates it
    fake_layerid = hashlib.sha256((parentid+'\n'+ublob+'\n').encode('utf-8')).hexdigest()
    layerdir = imgdir + '/' + fake_layerid
    os.mkdir(layerdir)
    # Creating VERSION file
    file = open(layerdir + '/VERSION', 'w')
    file.write('1.0')
    file.close()
    # Creating layer.tar file
    sys.stdout.write(ublob[7:19] + ': Downloading...')
    sys.stdout.flush()
    bresp = requests.get('https://{}/v2/{}/blobs/{}'.format(registry, repository, ublob), headers=auth_head, verify=False)
    if (bresp.status_code != 200):
        bresp = requests.get(layer['urls'][0], headers=auth_head, verify=False)
        if (bresp.status_code != 200):
            print('\rERROR: Cannot download layer {} [HTTP {}]'.format(ublob[7:19], bresp.status_code, bresp.headers['Content-Length']))
            print(bresp.content)
            exit(1)
    print("\r{}: Pull complete [{}]".format(ublob[7:19], bresp.headers['Content-Length']))
    content[0]['Layers'].append(fake_layerid + '/layer.tar')
    file = open(layerdir + '/layer.tar', "wb")
    mybuff = BytesIO(bresp.content)
    unzLayer = gzip.GzipFile(fileobj=mybuff)
    file.write(unzLayer.read())
    unzLayer.close()
    file.close()
    # Creating json file
    file = open(layerdir + '/json', 'w')
    # last layer = config manifest - history - rootfs
    if layers[-1]['digest'] == layer['digest']:
        # FIXME: json.loads() automatically converts to unicode, thus decoding values whereas Docker doesn't
        json_obj = json.loads(confresp.content)
        del json_obj['history']
        del json_obj['rootfs']
    else: # other layers json are empty
        json_obj = json.loads(empty_json)
    json_obj['id'] = fake_layerid
    if parentid:
        json_obj['parent'] = parentid
    parentid = json_obj['id']
    file.write(json.dumps(json_obj))
    file.close()

Maybe something like tqdm? https://github.com/tqdm/tqdm
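
To make the request concrete, here is a rough standard-library sketch of per-chunk progress reporting. It does not use tqdm itself; `render_bar` and `copy_with_progress` are hypothetical names, and tqdm could stand in for the hand-rolled bar.

```python
import sys


def render_bar(done, total, width=50):
    """Return a fixed-width text progress bar for done out of total bytes."""
    if total <= 0:
        # Size unknown: show a placeholder bar instead of dividing by zero.
        return "[" + "?" * width + "]"
    ratio = min(done / total, 1.0)
    filled = int(ratio * width)
    return "[" + "=" * filled + " " * (width - filled) + "] {:3.0f}%".format(ratio * 100)


def copy_with_progress(chunks, out, total, report=sys.stdout.write):
    """Write each chunk to out, redrawing the bar after every chunk."""
    done = 0
    for chunk in chunks:
        out.write(chunk)
        done += len(chunk)
        report("\r" + render_bar(done, total))
    report("\n")
    return done
```

With requests, the chunks could come from a response opened with stream=True via its iter_content() method, so the bar advances as data arrives rather than after the whole layer is downloaded.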

Downloading container image from Azure Container Registry

Hello,

I am trying to download a container image from my azure container registry. The azure container registry has username and password authentication.

But I am facing the below error.

Traceback (most recent call last):
  File "/home/user/docker_pull.py", line 72, in <module>
    auth_head = get_auth_head('application/vnd.docker.distribution.manifest.v2+json')
  File "/home/user/docker_pull.py", line 54, in get_auth_head
    access_token = resp.json()['token']
KeyError: 'token'
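
One possible cause, stated as an assumption because the issue does not show the response body: the Docker token authentication spec allows the token to be returned under either `token` or `access_token`, and the script only reads `token`. A small sketch that accepts both keys (the helper name `extract_token` is hypothetical):

```python
def extract_token(payload):
    """Return the bearer token from an auth response dict.

    Accepts either 'token' or 'access_token', preferring 'token'.
    """
    for key in ("token", "access_token"):
        value = payload.get(key)
        if value:
            return value
    # Surface the keys that were present to make debugging easier.
    raise KeyError("no token in auth response; keys: {}".format(sorted(payload)))
```

It would be called on the parsed JSON, for example extract_token(resp.json()) in place of resp.json()['token'].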

Can't find velero's docker image

Hello, thank you for the useful script, super appreciate that.

I got a 404 when downloading velero. The Docker Hub URL is https://hub.docker.com/r/velero/velero

$ python3 docker_pull.py velero/velero
target is  https://registry-1.docker.io/v2/velero/velero/manifests/latest
[-] Cannot fetch manifest for velero/velero [HTTP 404]
b'{"errors":[{"code":"MANIFEST_UNKNOWN","message":"OCI index found, but accept header does not support OCI indexes"}]}

How `docker` calculates the layer ids

docker-drag/docker_pull.py

Lines 118 to 119 in 5413165

# FIXME: Creating fake layer ID. Don't know how Docker generates it
fake_layerid = hashlib.sha256((parentid+'\n'+ublob+'\n').encode('utf-8')).hexdigest()

For each layer docker creates a v1 config, and a layer id is basically a digest of the v1 config, another layer id, and the parent layer id. If you're interested I can probably describe it more precisely. And possibly how other parts of docker save work.

python docker_pull.py centos -> KeyError: 'layers'

Thanks for this wonderful script.
But I have encountered the following problem.

(python2) C:\Users\red.suh>python docker_pull.py centos
Traceback (most recent call last):
  File "docker_pull.py", line 40, in <module>
    layers = resp.json()['layers']
KeyError: 'layers'

For nginx or sonatype/nexus3, there is no problem.
How can I solve the problem?
Thanks
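
A KeyError on 'layers' is what one would see if the registry answered with a multi-platform manifest list, which carries a "manifests" array instead of "layers". That this is what happens for centos is an assumption. A sketch of picking one platform's digest from such a list (the function name and the linux/amd64 defaults are illustrative):

```python
def pick_manifest_digest(manifest_list, os_name="linux", arch="amd64"):
    """Return the digest of the entry matching the requested platform."""
    for entry in manifest_list.get("manifests", []):
        platform = entry.get("platform", {})
        if platform.get("os") == os_name and platform.get("architecture") == arch:
            return entry["digest"]
    raise LookupError("no manifest for {}/{}".format(os_name, arch))
```

The returned digest can then be requested from the same manifests endpoint in place of the tag to obtain a manifest that does contain 'layers'.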

KeyError: 'content-length'

I am calling the script as follows:

python docker_pull.py registry.access.redhat.com/ubi8/ubi

The script will output the following:

78afc5364ad2: Downloading...

And then it fails:

Traceback (most recent call last):
  File "docker_pull.py", line 155, in <module>
    unit = int(bresp.headers['Content-Length']) / 50
  File "C:\Users\<NTID>\AppData\Local\Programs\Python\Python37\lib\site-packages\requests\structures.py", line 52, in __getitem__
    return self._store[key.lower()][1]
KeyError: 'content-length'

I've made a few adjustments to your script, primarily adding the proxies= option for calls to requests functions as I'm stuck behind a proxy. This allows me to communicate with the Redhat registry. So line 155 is actually line 141 as referenced below.

unit = int(bresp.headers['Content-Length']) / 50

I've added a print statement to dump the bresp.headers and Content-Length is not in there:

{
  'Accept-Ranges': 'bytes',
  'Content-Type': 'text/plain',
  'ETag': '"ced9e6d20e1ac931689399b68b0dd6a4:1588112355"',
  'Last-Modified': 'Tue, 28 Apr 2020 22:03:16 GMT',
  'Server': 'AkamaiNetStorage',
  'Vary': 'Accept-Encoding',
  'Date': 'Mon, 18 May 2020 17:57:51 GMT',
  'Transfer-Encoding': 'chunked',
  'X-Docker-Size': '-1',
  'Cache-Control': 'proxy-revalidate',
  'Connection': 'Keep-Alive',
  'Content-Encoding': 'gzip'
}

I've confirmed with a machine at home (different OS and no proxy) that the same error is experienced. I understand that the script relies on the Docker HTTPS API v2 so I apologize if the redhat registry is actually v1. I'm new to docker and do not know how to check for this.
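
The dumped headers show Transfer-Encoding: chunked, which would explain the missing Content-Length. A hedged sketch of a fallback that reads the size from the layer's manifest entry instead (image manifests carry a size field per layer); `blob_size` is a hypothetical helper:

```python
def blob_size(headers, layer):
    """Return the blob size in bytes, or None when it cannot be determined."""
    value = headers.get("Content-Length")
    if value is not None and value.isdigit():
        return int(value)
    # Fall back to the size recorded for this layer in the manifest.
    size = layer.get("size")
    if isinstance(size, int) and size >= 0:
        return size
    return None
```

When the result is None, the progress display could simply skip the percentage rather than fail.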

Cannot pull OCI images - KeyError: 'layers'

python docker_pull.py postgres

I run the script in Windows 11/ Python 3.11 and get the following error:

Traceback (most recent call last):
  File "D:\share\docker_pull.py", line 87, in <module>
    layers = resp.json()['layers']
             ~~~~~~~~~~~^^^^^^^^^^
KeyError: 'layers'

Big files (e.g. gitlab/gitlab-ce:12.3.7-ce.0) aren't supported

Logs:

Creating image structure in: tmp_gitlab-ce_12.3.7-ce.0
e80174c8b43b: Pull complete [44144090]
d1072db285cc: Pull complete [529]
858453671e67: Pull complete [849]
3d07b1124f98: Pull complete [170]
655fb0f51b08: Pull complete [26257584]
063c37e78c5c: Pull complete [141]
a0398d68068f: Pull complete [146]
f41e790a20a6: Pull complete [236]
8eb8c4ceb762: Pull complete [4095]
e3a502127d8c: Pull complete [705893899]
Traceback (most recent call last):
  File "docker_pull.py", line 129, in <module>
    file.write(unzLayer.read())
  File "C:\Users\user\AppData\Local\Programs\Python\Python37-32\lib\gzip.py", line 276, in read
    return self._buffer.read(size)
  File "C:\Users\user\AppData\Local\Programs\Python\Python37-32\lib\gzip.py", line 471, in read
    uncompress = self._decompressor.decompress(buf, size)
MemoryError
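
The traceback points at unzLayer.read(), which decompresses the entire layer into memory at once. A minimal sketch of decompressing in fixed-size chunks instead; the function name `gunzip_stream` is hypothetical:

```python
import gzip
import shutil


def gunzip_stream(src, dst, chunk_size=1024 * 1024):
    """Decompress gzip data from file-like src into file-like dst in chunks."""
    with gzip.GzipFile(fileobj=src) as unzipped:
        # copyfileobj reads chunk_size bytes at a time, so memory use stays bounded.
        shutil.copyfileobj(unzipped, dst, chunk_size)
```

Here src would be the downloaded (still compressed) layer and dst the open layer.tar file. To avoid holding the compressed bytes in memory as well, the download itself would need to be streamed, for example with stream=True in requests.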

Is there documentation on how fake_layerid is calculated?

	fake_layerid = hashlib.sha256((parentid+'\n'+ublob+'\n').encode('utf-8')).hexdigest()
	layerdir = imgdir + '/' + fake_layerid
	os.mkdir(layerdir)

I ran docker save -o test and found that the layer ID it produces differs from the fake_layerid the script generates.

[Feature Request] Update documentation to provide a sample for how to use the container after downloading it.

First, this is a super useful tool; I made use of it at work after finding it through a Stack Overflow post. Thank you for making such a straightforward but functional tool.

Next, a minor nitpick: while docker load is covered in the main Docker documentation, it might not hurt to include the basic use case in this tool's documentation so people have some context for how to use the tool effectively.

A slight detail

Hi,

In 'docker_pull.py', lines 164 and 165, I used:

	# FIXME: json.loads() automatically converts to unicode, thus decoding values whereas Docker doesn't
	json_obj = json.loads(confresp.content.decode("utf8"))

Instead of:

	json_obj = json.loads(confresp.content)

For me it fixed the failure.

pulls from nvidia gpu cloud

First of all, thanks for the script!

However, I couldn't make it work for

docker pull nvcr.io/nvidia/tensorflow:19.10-py3

via:

python docker_pull.py nvcr.io/nvidia/tensorflow:19.05-py3

Error message:

Traceback (most recent call last):
  File "docker_pull.py", line 47, in <module>
    reg_service = resp.headers['WWW-Authenticate'].split('"')[3]
IndexError: list index out of range
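
The failing line takes split('"')[3], which only works when the WWW-Authenticate header has at least two quoted parameters in the expected order. What nvcr.io actually returns is not shown in the issue, so the following is only a sketch of a more tolerant parser (the name `parse_www_authenticate` is hypothetical):

```python
import re

# Matches key="value" pairs such as realm="https://..." or service="...".
_PARAM = re.compile(r'(\w+)="([^"]*)"')


def parse_www_authenticate(header):
    """Split a WWW-Authenticate header into (scheme, {param: value})."""
    scheme, _, rest = header.partition(" ")
    return scheme, dict(_PARAM.findall(rest))
```

Callers can then use params.get("service") and handle a missing value explicitly instead of hitting an IndexError.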

UNAUTHORIZED, authentication required, when running against a private image

$ python docker_pull.py foo/bar:latest
[-] Cannot fetch manifest for foo/bar [HTTP 401]
b'{"errors":[{"code":"UNAUTHORIZED","message":"authentication required","detail":[{"Type":"repository","Class":"","Name":"foo/bar","Action":"pull"}]}]}\n'

But if I run:
docker pull foo/bar:latest

it works.

I'm not sure how to pass my creds to the script.
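
One approach, sketched under the assumption that the registry's token endpoint accepts HTTP Basic credentials on the token request: build a Basic Authorization header and send it when asking for the token. The helper name `basic_auth_header` is hypothetical, and the credentials are whatever the user supplies.

```python
import base64


def basic_auth_header(username, password):
    """Return an HTTP Basic Authorization header for the given credentials."""
    raw = "{}:{}".format(username, password).encode("utf-8")
    return {"Authorization": "Basic " + base64.b64encode(raw).decode("ascii")}
```

With requests, passing auth=(username, password) to requests.get has the same effect, so the helper is mainly useful for seeing what gets sent.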

Re-authenticate upon token expiration

trying

524$ python2  docker_pull.py "jupyter/datascience-notebook"
/usr/lib/python2.7/site-packages/urllib3/connectionpool.py:852: InsecureRequestWarning: Unverified HTTPS request is being made. Adding certificate verification is strongly advised. See: https://urllib3.readthedocs.io/en/latest/advanced-usage.html#ssl-warnings
  InsecureRequestWarning)
Creating image structure in: tmp_datascience-notebook_latest
a48c500ed24e: Pull complete [30957448]
1e1de00ff7e1: Pull complete [841]
0330ca45a200: Pull complete [412]
471db38bcfbf: Pull complete [849]
0b4aba487617: Pull complete [162]
1bac85b3a63e: Pull complete [19696719]
245be47b44f6: Pull complete [424313]
ef168d10cf08: Pull complete [666]
3f40baab49e8: Pull complete [1891]
1074310668a8: Pull complete [6005]
acab6d938518: Pull complete [147]
d3c413e667b9: Pull complete [72063649]
63b84d46215a: Pull complete [11195]
e2aa43484a2e: Pull complete [93156146]
e45a3ec35504: Pull complete [2160]
b91bbc043eab: Pull complete [434]
8842220992fc: Pull complete [691]
fc8f34d51deb: Pull complete [1006]
5f6edb450186: Pull complete [1015]
44257488fae5: Pull complete [824729788]
540df7774880: Pull complete [71414708]
178f3a1a18b4: Pull complete [271621569]
03528a45986d: Pull complete [456599]
5c52a47b5569: Pull complete [10279]
1f67b31a20f8: Pull complete [17277944]
70de4b41273e: Pull complete [171]
Traceback (most recent call last):
  File "docker_pull.py", line 86, in <module>
    file.write(unzLayer.read())
  File "/usr/lib/python2.7/gzip.py", line 261, in read
    self._read(readsize)
  File "/usr/lib/python2.7/gzip.py", line 303, in _read
    self._read_gzip_header()
  File "/usr/lib/python2.7/gzip.py", line 197, in _read_gzip_header
    raise IOError, 'Not a gzipped file'
IOError: Not a gzipped file
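
Going by the issue title, the likely cause is that the token expired partway through a long pull, so the registry returned an error body where a gzipped layer was expected; the log itself does not confirm this. A sketch of a small cache that refetches the token shortly before it expires (class name, `fetch` callable and `margin` are all hypothetical):

```python
import time


class TokenCache:
    """Cache a bearer token and refresh it shortly before it expires."""

    def __init__(self, fetch, clock=time.monotonic, margin=10):
        self._fetch = fetch    # callable returning (token, expires_in_seconds)
        self._clock = clock    # injectable clock, mainly for testing
        self._margin = margin  # refresh this many seconds before expiry
        self._token = None
        self._deadline = 0.0

    def get(self):
        """Return a valid token, fetching a new one when needed."""
        now = self._clock()
        if self._token is None or now >= self._deadline:
            token, expires_in = self._fetch()
            self._token = token
            self._deadline = now + max(expires_in - self._margin, 0)
        return self._token
```

Calling cache.get() before each layer request would keep the Authorization header current; the expiry could be taken from the expires_in field of the token response when the registry provides one.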


Fails when downloading Windows images

This program fails when downloading Windows server core images. For example,
python docker_pull.py mcr.microsoft.com/windows/nanoserver
will fail.

I believe the issue lies in the fact that Microsoft uses https://mcr.microsoft.com/v2/ as its base API URL. I made a workaround by setting repo = 'windows' in docker_pull.py and using the microsoft.com link as the base API URL, which seems to be working for now. Authentication still goes through https://auth.docker.io and works the same way you wrote it.
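
Rather than hard-coding the repository, the registry host could be derived from the image reference. This is a minimal sketch, assuming the common convention that a first path component containing a dot or colon (or equal to localhost) names the registry; `parse_image` is a hypothetical helper and digest references (name@sha256:...) are not handled.

```python
def parse_image(reference, default_registry="registry-1.docker.io"):
    """Split an image reference into (registry, repository, tag)."""
    name, tag = reference, "latest"
    # Only a colon in the last path component separates the tag;
    # a colon earlier on belongs to a registry port.
    if ":" in reference.rsplit("/", 1)[-1]:
        name, tag = reference.rsplit(":", 1)
    parts = name.split("/")
    first = parts[0]
    if len(parts) > 1 and ("." in first or ":" in first or first == "localhost"):
        registry, repository = first, "/".join(parts[1:])
    else:
        registry, repository = default_registry, name
        if "/" not in repository:
            # Official Docker Hub images live under the library/ namespace.
            repository = "library/" + repository
    return registry, repository, tag
```

For the example in this issue, the reference would split into registry mcr.microsoft.com and repository windows/nanoserver, which can then be used to build the /v2/ URLs.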
