
image-mirror's Introduction

Mirroring External Images into Rancher Repo on Dockerhub

This repo is dedicated to mirroring images from other organizations into the Rancher organization. There are no packaging changes or changes to the layers of these images.

Mirroring images

The list is maintained in the images-list file, which uses the following format:

<original-image-name> <rancher-image-name> <image-tag>

The basic rancher-image-name structure is mirrored-<org>-<repo>. For example:

banzaicloud/logging-operator rancher/mirrored-banzaicloud-logging-operator 3.7.0

Images are mirrored using the scripts/image-mirror.sh script.
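
For illustration, the naming convention can be expressed as a small helper. This is a hypothetical sketch, not the actual mirroring logic (that lives in scripts/image-mirror.sh); the `mirrored_name` function and its registry-stripping rule are assumptions based on the examples in images-list:

```python
def mirrored_name(original_image):
    """Hypothetical helper illustrating the naming convention; the actual
    mirroring is handled by scripts/image-mirror.sh. Registry hostnames
    (e.g. quay.io) are dropped and path separators become dashes."""
    parts = original_image.split("/")
    # A first component containing a dot or colon is a registry hostname
    if "." in parts[0] or ":" in parts[0]:
        parts = parts[1:]
    return "rancher/mirrored-" + "-".join(parts)

print(mirrored_name("banzaicloud/logging-operator"))
# rancher/mirrored-banzaicloud-logging-operator
```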

Adding New Images

When adding new images to the repo, please indicate so in the pull request.

An EIO team member or manager will need to create the repo in DockerHub and add automatedcipublisher as a team member with write access so that the images can be pushed automatically.

Updating Existing Images

Do not update the tag of an existing entry in the images-list file to get a new version pulled/pushed; instead, add an additional entry with the new tag.

Adding new tags to existing images

Scheduled

There is also a scheduled workflow called Retrieve image tags that can be used if you have images that need new tags added automatically. It checks a configurable source for available tags and uses the found tags to dispatch the workflow Add tag to existing image. The configuration lives in config.json. The basic structure is a descriptive key (pick your own), the list of images for which the available tag(s) need to be looked up (images), the source used to look up those tags (versionSource), and an optional SemVer constraint if you need to limit which tags are used. The current data sources are:

  • github-releases: This will use GitHub releases as source, excluding pre-releases. This can be used if you need to keep all tags from the configured images in sync with GitHub releases
  • github-latest-release: This will use the release on GitHub marked as Latest release. This can be used if you only want one release to be added that is marked as latest.
  • github-tagged-images-file: This will look up GitHub git repository tags, and find the list of images inside a specified file. The tag must have an associated release, with the pre-release flag unset. This can be used if your project maintains a list of images in a file, e.g., https://github.com/longhorn/longhorn/blob/master/deploy/longhorn-images.txt
  • registry: This will use the registry of the first image and look up available tags.
  • helm-latest:helm-repo-fqdn: This will add the helm-repo-fqdn as a Helm repository and use the latest version of the configured Helm chart(s) (helmCharts) to extract the images. It uses helm template and helm show values to extract images. You can specify one or more iterations of helm template by specifying one or more values configurations to make sure all required images are extracted. If you want to block certain images from being extracted, you can use imageDenylist in the configuration. See example below.
  • helm-oci: This is the same as helm-latest, except you don't need to provide a repository but it will use the charts directly from the provided helmCharts (which should be formatted as oci://hostname/chart).
  • helm-directory:/full_path_to_charts_directory: Provide a directory with chart(s) to use, introduced for testing purposes. The full path used by the helm commands is /full_path_to_charts_directory/chart_name_from_config.

The current filters for tags are:

  • versionConstraint: This is a semver constraint that will match the given expression and ignore tags that do not match.
  • versionFilter: This is a regex filter that will match the given expression and ignore tags that do not match.
  • latest: Sorts the found tags numerically and returns only the latest tag
  • latest_entry: Returns the last found (newest) tag only (can be used when tags are not semver/cannot be sorted numerically)
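
As an illustration, the versionFilter regex and the two selection modes above could be sketched like this. This is not the actual retrieve-image-tags.py code; `apply_filters` is a hypothetical helper, and versionConstraint is omitted to avoid a semver dependency:

```python
import re

def apply_filters(tags, version_filter=None, latest=False, latest_entry=False):
    """Hypothetical helper illustrating the tag filters described above
    (not the actual retrieve-image-tags.py implementation)."""
    if version_filter:
        # versionFilter: regex; tags not matching the expression are ignored
        tags = [t for t in tags if re.match(version_filter, t)]
    if latest:
        # latest: sort the tags numerically and keep only the newest one
        def numeric_key(tag):
            return tuple(int(n) for n in re.findall(r"\d+", tag))
        return [max(tags, key=numeric_key)] if tags else []
    if latest_entry:
        # latest_entry: keep the last tag as found, for non-sortable tags
        return tags[-1:]
    return tags

print(apply_filters(["v1.12.0", "v1.9.3", "v1.10.1"], r"^v1\.", latest=True))
# ['v1.12.0']
```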

github-tagged-images-file specific options:

  • imagesFilePath: the path to the list of images inside a GitHub git repository

Helm specific options:

  • imageDenylist: An array of images that will not be added. Use this when the image matching finds images that shouldn't be added; the automation only supports adding tags to existing images, not adding new images, as new images need to be approved first.
  • kubeVersion: What version to pass to --kube-version when running helm template
  • devel: Use chart development versions (adds --devel to helm template and helm show values commands)
  • additionalVersionFilter: In addition to retrieving the latest Helm chart, it will also run the helm template and helm show values commands with --version parameters from this array. This is useful if you want to include images from multiple versions in a single pull request.
  • versionFilter: Specify what version of the Helm chart needs to be used (this will only run helm template and helm show values with the configured versionFilter)
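
The image-extraction step used by the Helm sources can be sketched roughly as follows. This is a hypothetical simplification, assuming the `helm template` output is available as YAML text with plain `image: repository:tag` lines; the real workflow also consults helm show values:

```python
import re

def extract_images(rendered_yaml, image_denylist=()):
    """Hypothetical sketch: collect image references from helm template
    output and drop anything on imageDenylist. Assumes simple
    `image: repository:tag` lines."""
    images = set()
    for m in re.finditer(r'image:\s*"?([\w./-]+):([\w.-]+)"?', rendered_yaml):
        repo, tag = m.groups()
        if repo not in image_denylist:
            images.add((repo, tag))
    return sorted(images)

manifest = """
containers:
  - name: cilium
    image: quay.io/cilium/cilium:v1.12.1
  - name: operator
    image: quay.io/cilium/operator:v1.12.1
"""
print(extract_images(manifest, image_denylist={"quay.io/cilium/operator"}))
# [('quay.io/cilium/cilium', 'v1.12.1')]
```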

See example configuration for github-releases, github-latest-release and registry:

{
  "vsphere-cpi": {
    "images": [
      "gcr.io/cloud-provider-vsphere/cpi/release/manager"
    ],
    "versionSource": "github-releases:kubernetes/cloud-provider-vsphere",
    "versionConstraint": ">1.21.0"
  },
  "flannel": {
    "images": [
      "flannel/flannel"
    ],
    "versionSource": "github-latest-release:flannel-io/flannel"
  },
  "bci-busybox": {
    "images": [
      "registry.suse.com/bci/bci-busybox"
    ],
    "versionSource": "registry",
    "versionFilter": "^15.4.",
    "latest": "true"
  },
  "skopeo": {
    "images": [
      "quay.io/skopeo/stable"
    ],
    "versionSource": "registry",
    "versionFilter": "^v1.\\d{2}.\\d+$",
    "latest": "true"
  },
  "pause": {
    "images": [
      "registry.k8s.io/pause"
    ],
    "versionSource": "registry",
    "versionFilter": "^3.\\d+$",
    "latest": "true"
  },
  "epinio": {
    "images": [
      "ghcr.io/epinio/epinio-server"
    ],
    "versionSource": "registry",
    "versionFilter": "^v1.\\d+.\\d+$",
    "latest": "true"
  },
  "csi-release-syncer": {
    "images": [
      "gcr.io/cloud-provider-vsphere/csi/release/syncer"
    ],
    "versionSource": "registry",
    "versionFilter": "^v2.\\d+.\\d+$",
    "latest": "true"
  }
}

See example configuration for github-tagged-images-file:

{
  "longhorn": {
    "versionSource": "github-tagged-images-file:longhorn/longhorn",
    "imagesFilePath": "deploy/longhorn-images.txt",
    "versionConstraint": ">=1.4.0"
  }
}

See example configuration for helm-latest:helm-repo-fqdn:

{
  "cilium": {
    "versionSource": "helm-latest:https://helm.cilium.io",
    "imageDenylist": [
      "quay.io/cilium/operator",
      "quay.io/cilium/startup-script"
    ],
    "helmCharts": {
      "cilium": {
        "devel": true,
        "chartConfig": {
          "aws": {
            "values": [
              "eni.enabled=true"
            ],
            "kubeVersion": "1.24"
          },
          "azure":  {
            "values": [
              "azure.enabled=true"
            ]
          },
          "generic": {
            "values": [
              "clustermesh.useAPIServer=true",
              "envoy.enabled=true",
              "hubble.ui.enabled=true",
              "hubble.relay.enabled=true",
              "hubble.enabled=true"
            ]
          },
          "kubeversiononly": {
            "kubeVersion": "1.28"
          }
        }
      }
    }
  },
  "epinio": {
    "versionSource": "helm-latest:https://epinio.github.io/helm-charts",
    "helmCharts": {
      "epinio": {
        "chartConfig": {
          "generic": {
            "values": [
              "global.domain=myepiniodomain.org"
            ]
          }
        }
      }
    }
  },
  "kubewarden": {
    "versionSource": "helm-latest:https://charts.kubewarden.io",
    "helmCharts": {
      "kubewarden-controller": {},
      "kubewarden-defaults": {}
    }
  },
  "neuvector": {                                                                                                                                                                                                                                                                                                             
    "versionSource": "helm-latest:https://neuvector.github.io/neuvector-helm",
    "helmCharts": {
      "core": {}
    }
  },
  "longhorn": {
    "versionSource": "helm-latest:https://charts.longhorn.io",
    "additionalVersionFilter": [
      "v1.4.*"
    ],
    "helmCharts": {
      "longhorn": {}
    }
  },
  "longhorn": {
    "versionSource": "helm-latest:https://charts.longhorn.io",
    "helmCharts": {
      "longhorn": {
        "versionFilter": "v1.4.*"
      }
    }
  }

}

See example configuration for helm-oci:

{
  "elemental": {
    "versionSource": "helm-oci",
    "imageDenylist": [
      "registry.suse.com/rancher/elemental-teal-channel"
    ],
    "helmCharts": {
      "oci://registry.suse.com/rancher/elemental-operator-chart": {}
    }
  }
}

See example configuration for helm-directory:

{
  "epinio-directory": {
    "versionSource": "helm-directory:/epinio-charts/chart",
    "helmCharts": {
      "epinio": {
        "chartConfig": {
          "generic": {
            "values": [
              "global.domain=myepiniodomain.org"
            ]
          }
        }
      }
    }
  }
}

If you want to manually test your configuration changes to check if the correct tags are found, you can use the following commands depending on your available runtime:

Docker

docker run -v $PWD:/code -w /code/retrieve-image-tags python:3.10-alpine sh -c "apk -qU add helm && pip install --disable-pip-version-check --root-user-action=ignore -qr requirements.txt && python retrieve-image-tags.py"

podman

podman run -v $PWD:/code -w /code/retrieve-image-tags python:3.10-alpine sh -c "apk -qU add helm && pip install --disable-pip-version-check --root-user-action=ignore -qr requirements.txt && python retrieve-image-tags.py"

containerd

ctr images pull docker.io/library/python:3.10-alpine
ctr run -t --net-host --mount type=bind,src=$PWD,dst=/code,options=rbind:ro --cwd /code/retrieve-image-tags --rm docker.io/library/python:3.10-alpine workflow-test sh -c "apk -qU add helm && pip install --disable-pip-version-check --root-user-action=ignore -qr requirements.txt && python retrieve-image-tags.py"

Using scripts

You can use the following commands/scripts to add a tag to an existing image. Make sure the IMAGES environment variable is set to the image(s) you want to add a tag to, and the TAGS environment variable is set to the tags you want to add to the images. The script will check:

  • That the provided image already exists in images-list; otherwise it fails, because the script only supports adding tags to existing images.
  • That there is only one mapping for the image in images-list; otherwise it fails, because it cannot determine which mapping to use.
  • That the tag for the image is not already present in images-list; otherwise it fails, because the tag is not new.
  • That the tag for the image exists at the source; otherwise it fails, because a non-existent tag cannot be mirrored.

After all checks pass, it will add the tag to images-list. Once all images and tags are added, it will sort images-list.
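
The first three checks can be sketched against the images-list format. This is illustrative only; `add_tag` is a hypothetical helper, not the actual add-tag-to-existing-image.sh logic, and the check that the tag exists at the source registry is omitted:

```python
def add_tag(images_list_lines, image, tag):
    """Hypothetical sketch of the first three checks performed when
    adding a tag: image must already be mapped, exactly once, and the
    tag must not already be present."""
    matches = [l.split() for l in images_list_lines
               if l.strip() and l.split()[0] == image]
    if not matches:
        raise ValueError(f"{image} not in images-list: only existing images are supported")
    mappings = {m[1] for m in matches}
    if len(mappings) != 1:
        raise ValueError(f"{image} has multiple mappings: cannot determine which to use")
    if any(m[2] == tag for m in matches):
        raise ValueError(f"{image} already has tag {tag}: not new")
    return f"{image} {mappings.pop()} {tag}"

lines = ["quay.io/coreos/etcd rancher/mirrored-coreos-etcd v3.4.19"]
print(add_tag(lines, "quay.io/coreos/etcd", "v3.4.20"))
# quay.io/coreos/etcd rancher/mirrored-coreos-etcd v3.4.20
```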

See an example below:

IMAGES=quay.io/coreos/etcd TAGS=v3.4.20 make add-tag-to-existing-image.sh

There is also a wrapper script to support supplying images together with tags. This was added to support the helm-latest version source, which extracts images from Helm charts and does not work with the separate images + tags inputs. The wrapper script for full images can be used as follows:

FULL_IMAGES=quay.io/skopeo/stable:v1.13.3,quay.io/cilium/cilium-envoy:v1.25.9-e198a2824d309024cb91fb6a984445e73033291d make add-full-image-wrapper.sh

The wrapper script will run the add-tag-to-existing-image.sh script for each image, so every image goes through all the required checks.
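
Conceptually, the wrapper just splits each FULL_IMAGES entry on its last colon before handing the pieces to the per-image script. A hypothetical sketch (`split_full_images` is not the actual script):

```python
def split_full_images(full_images):
    """Hypothetical sketch: split the FULL_IMAGES input (comma separated
    image:tag entries) back into image/tag pairs, one per
    add-tag-to-existing-image.sh invocation."""
    pairs = []
    for entry in full_images.split(","):
        # rpartition on the last colon so registry ports would survive
        image, _, tag = entry.rpartition(":")
        pairs.append((image, tag))
    return pairs

print(split_full_images("quay.io/skopeo/stable:v1.13.3"))
# [('quay.io/skopeo/stable', 'v1.13.3')]
```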

Optionally, you can also check whether the newly added image tag exists (this is also run in the GitHub Actions workflow):

make check-new-images-exist.sh

Using GitHub Actions workflow

You can use the Add tag to existing image workflow to provide a comma separated list of existing images and the tags to be added, and it will automatically create a pull request with the changes. See Using scripts for what this does in detail.

Example inputs:

Images: quay.io/cilium/cilium,quay.io/cilium/operator-aws,quay.io/cilium/operator-azure,quay.io/cilium/operator-generic 
Tags: v1.12.1

or

Full Images: quay.io/skopeo/stable:v1.13.3,quay.io/cilium/cilium-envoy:v1.25.9-e198a2824d309024cb91fb6a984445e73033291d

image-mirror's People

Contributors

andreas-kupries, brandond, cbron, chiukapoor, diogoasouza, doflamingo721, fgiudici, github-actions[bot], innobead, jiaqiluo, joshmeranda, kinarashah, krunalhinguu, manuelbuil, nickgerace, oats87, paynejacob, pennyscissors, phanle1010, phillipsj, rayandas, rbrtbnfgl, selvamt70, snasovich, superseb, thardeck, thomasferrandiz, vadorovsky, vardhaman22, weizhe0422


image-mirror's Issues

Improve CI process

For given lines added:

  • Ensure the source exists
  • Ensure the rancher repo we are mirroring to exists
  • Ensure the image does not pre-exist in the rancher mirror. In some cases we are ok with this and will hard merge over it anyway, but that should fail CI.

Also

  • ensure status.docker.com is clean; we have had corrupted images in the past when DockerHub was down.

Silent error on failing to mirror images

The following logs are observed on https://drone-publish.rancher.io/rancher/image-mirror/1058/1/3 on trying to mirror the image added in the PR rancher/rancher#43149.

While the Job passed, the image was never actually mirrored to https://hub.docker.com/r/rancher/mirrored-gmsa-webhook-k8s-gmsa-webhook/tags.

Line: registry.k8s.io/gmsa-webhook/k8s-gmsa-webhook rancher/mirrored-gmsa-webhook-k8s-gmsa-webhook v0.7.0
registry.k8s.io/gmsa-webhook/k8s-gmsa-webhook rancher/mirrored-gmsa-webhook-k8s-gmsa-webhook v0.7.0
registry.k8s.io/gmsa-webhook/k8s-gmsa-webhook:v0.7.0 is schemaVersion 2
registry.k8s.io/gmsa-webhook/k8s-gmsa-webhook:v0.7.0 is mediaType application/vnd.docker.distribution.manifest.list.v2+json
	Copying registry.k8s.io/gmsa-webhook/k8s-gmsa-webhook@sha256:38c99a8efc9c3c4c0dca50a95f26fdcc053e84b960d114b93f9e5a426f51478d => docker.io/rancher/mirrored-gmsa-webhook-k8s-gmsa-webhook:v0.7.0-amd64
	        sha256:38c99a8efc9c3c4c0dca50a95f26fdcc053e84b960d114b93f9e5a426f51478d => MISSING
Getting image source signatures
Copying blob sha256:3144a634950d742e6a2c0d51958e0d81d516fc99868dedc6dcc7391fccce318e
Copying blob sha256:d5696692a0e2e08ec9abe07946c01d5d05ebf39bc7f85a0f6343e2b95d78790f
time="2023-10-16T17:10:49Z" level=fatal msg="writing blob: initiating layer upload to /v2/rancher/mirrored-gmsa-webhook-k8s-gmsa-webhook/blobs/uploads/ in registry-1.docker.io: requested access to the resource is denied"
===
Failed copying image for rancher/mirrored-gmsa-webhook-k8s-gmsa-webhook
===
	Copying registry.k8s.io/gmsa-webhook/k8s-gmsa-webhook@sha256:9d59b3f577a6b0383978f1f957b0930c28580f84c74eaa823ebf27be715d96aa => docker.io/rancher/mirrored-gmsa-webhook-k8s-gmsa-webhook:v0.7.0-arm64
	        sha256:9d59b3f577a6b0383978f1f957b0930c28580f84c74eaa823ebf27be715d96aa => MISSING
Getting image source signatures
Copying blob sha256:3144a634950d742e6a2c0d51958e0d81d516fc99868dedc6dcc7391fccce318e
Copying blob sha256:d5696692a0e2e08ec9abe07946c01d5d05ebf39bc7f85a0f6343e2b95d78790f
time="2023-10-16T17:10:49Z" level=fatal msg="writing blob: initiating layer upload to /v2/rancher/mirrored-gmsa-webhook-k8s-gmsa-webhook/blobs/uploads/ in registry-1.docker.io: requested access to the resource is denied"
===
Failed copying image for rancher/mirrored-gmsa-webhook-k8s-gmsa-webhook
===
	arm NOT FOUND
	s390x NOT FOUND
	Writing manifest list to docker.io/rancher/mirrored-gmsa-webhook-k8s-gmsa-webhook:v0.7.0
sha256:38c99a8efc9c3c4c0dca50a95f26fdcc053e84b960d114b93f9e5a426f51478d
sha256:9d59b3f577a6b0383978f1f957b0930c28580f84c74eaa823ebf27be715d96aa
httpReaderSeeker: failed open: content at https://registry-1.docker.io/v2/rancher/mirrored-gmsa-webhook-k8s-gmsa-webhook/manifests/sha256:9d59b3f577a6b0383978f1f957b0930c28580f84c74eaa823ebf27be715d96aa not found: not found
===
Failed copying image for rancher/mirrored-gmsa-webhook-k8s-gmsa-webhook
===
Updating description for docker.io/rancher/mirrored-gmsa-webhook-k8s-gmsa-webhook

For context, here is what the manifest looks like:

$ docker manifest inspect registry.k8s.io/gmsa-webhook/k8s-gmsa-webhook:v0.7.0
{
   "schemaVersion": 2,
   "mediaType": "application/vnd.docker.distribution.manifest.list.v2+json",
   "manifests": [
      {
         "mediaType": "application/vnd.docker.distribution.manifest.v2+json",
         "size": 698,
         "digest": "sha256:38c99a8efc9c3c4c0dca50a95f26fdcc053e84b960d114b93f9e5a426f51478d",
         "platform": {
            "architecture": "amd64",
            "os": "linux"
         }
      },
      {
         "mediaType": "application/vnd.docker.distribution.manifest.v2+json",
         "size": 698,
         "digest": "sha256:9d59b3f577a6b0383978f1f957b0930c28580f84c74eaa823ebf27be715d96aa",
         "platform": {
            "architecture": "arm64",
            "os": "linux"
         }
      }
   ]
}

not found error while creating manifest list for docker.io/rancher/mirrored-pause:3.8

Seen in https://drone-publish.rancher.io/rancher/image-mirror/754/1/3:

Writing manifest to image destination
Storing signatures
	Unchanged: registry.k8s.io/pause@sha256:566af08540f378a70a03588f3963b035f33c49ebab3e4e13a4f5edbcd78c6689 == docker.io/rancher/mirrored-pause:3.8-arm64
	           sha256:566af08540f378a70a03588f3963b035f33c49ebab3e4e13a4f5edbcd78c6689
	Unchanged: registry.k8s.io/pause@sha256:27295ffe5a75328e8230ff9bcabe2b54ebb9079ff70344d73a7b7c7e163ee1a6 == docker.io/rancher/mirrored-pause:3.8-arm
	           sha256:27295ffe5a75328e8230ff9bcabe2b54ebb9079ff70344d73a7b7c7e163ee1a6
	Unchanged: registry.k8s.io/pause@sha256:7eaeb31509d7f370599ef78d55956e170eafb7f4a75b8dc14b5c06071d13aae0 == docker.io/rancher/mirrored-pause:3.8-s390x
	           sha256:7eaeb31509d7f370599ef78d55956e170eafb7f4a75b8dc14b5c06071d13aae0
	Writing manifest list to docker.io/rancher/mirrored-pause:3.8
sha256:27295ffe5a75328e8230ff9bcabe2b54ebb9079ff70344d73a7b7c7e163ee1a6
sha256:566af08540f378a70a03588f3963b035f33c49ebab3e4e13a4f5edbcd78c6689
sha256:78bfb9d8999c190fca79871c4b2f8d69d94a0605266f0bbb2dbaa1b6dfd03720
sha256:7eaeb31509d7f370599ef78d55956e170eafb7f4a75b8dc14b5c06071d13aae0
sha256:9d05676469a08d6dba9889297333b7d1768e44e38075ab5350a4f8edd97f5be1
sha256:e8fb66bcfe1a85ec1299652d28e6f7f9cfbb01d33c6260582a42971d30dcb77d
sha256:f5944f2d1daf66463768a1503d0c8c5e8dde7c1674d3f85abc70cef9c7e32e95
httpReaderSeeker: failed open: content at https://registry-1.docker.io/v2/rancher/mirrored-pause/manifests/sha256:78bfb9d8999c190fca79871c4b2f8d69d94a0605266f0bbb2dbaa1b6dfd03720 not found: not found
===
Failed copying image for rancher/mirrored-pause
===

I'm not sure if this is intentional, but with multiple amd64 images (Linux and Windows), the logic loops through all found digests, looks up the current digest to diff against the found digest, and since it differs, it syncs all the digests needed for amd64. It doesn't look like it was meant to work that way for Windows, but it does. Except in this case, because the resulting digest after running skopeo copy is different. This is what I saw:

bash-5.0# skopeo inspect docker://registry.k8s.io/pause@sha256:85cfebc79dccc7a0e56680778a614b29a0b1c2ae98d4b1efc746764c04d5656c --raw 2>/dev/null | sha256sum 
85cfebc79dccc7a0e56680778a614b29a0b1c2ae98d4b1efc746764c04d5656c  -
bash-5.0# skopeo inspect docker://docker.io/superseb/mirrored-pause:3.7-amd64  --raw 2>/dev/null | sha256sum
85cfebc79dccc7a0e56680778a614b29a0b1c2ae98d4b1efc746764c04d5656c  -
bash-5.0# skopeo inspect docker://registry.k8s.io/pause@sha256:e8fb66bcfe1a85ec1299652d28e6f7f9cfbb01d33c6260582a42971d30dcb77d --raw 2>/dev/null | sha256sum
e8fb66bcfe1a85ec1299652d28e6f7f9cfbb01d33c6260582a42971d30dcb77d  -
bash-5.0# skopeo inspect docker://docker.io/superseb/mirrored-pause:3.8-amd64  --raw 2>/dev/null | sha256sum
01c7ed0d93c697230ca7b0373f9ae34e25fc8cbbea084191c59fe1108f0e31a1  -

And the diff showed two differences:

  1. The response was newlined (origin) but not newlined when querying DockerHub (single-line JSON)
  2. Only a few layers are gzipped (origin) vs all being gzipped on DockerHub

skopeo inspect docker://registry.k8s.io/pause@sha256:e8fb66bcfe1a85ec1299652d28e6f7f9cfbb01d33c6260582a42971d30dcb77d --raw
{
   "mediaType": "application/vnd.docker.distribution.manifest.v2+json",
   "schemaVersion": 2,
   "config": {
      "mediaType": "application/vnd.docker.container.image.v1+json",
      "digest": "sha256:7effaf5879989caec6af88410f43cb0aac21f7c0b65a966a1977a1e14f6ecf2e",
      "size": 2610
   },
   "layers": [
      {
         "mediaType": "application/vnd.docker.image.rootfs.diff.tar",
         "digest": "sha256:26085ddda8400952278f21eda4de3abb08e6e2b02afaddd77d1dab2b2a3f7c70",
         "size": 303383040
      },
      {
         "mediaType": "application/vnd.docker.image.rootfs.diff.tar",
         "digest": "sha256:61b9d3e54fdf80b25126caf3898b7619f2cc4271b3707e147145c6aa7d7b6663",
         "size": 56320
      },
      {
         "mediaType": "application/vnd.docker.image.rootfs.diff.tar",
         "digest": "sha256:2a2a37705371efd292777b63ae5500db8c667ea6a2269db5a82bd7ddb37c611d",
         "size": 52736
      },
      {
         "mediaType": "application/vnd.docker.image.rootfs.diff.tar",
         "digest": "sha256:51433a8bc864e113031f56050243d96f07ee45ee87cfb769f94878f710fcfc00",
         "size": 1379328
      },
      {
         "mediaType": "application/vnd.docker.image.rootfs.diff.tar",
         "digest": "sha256:ac0271ec295db8efe49fd6d64caa24b7612828d8de3596df8118085a7f9c9f77",
         "size": 1286656
      },
      {
         "mediaType": "application/vnd.docker.image.rootfs.diff.tar",
         "digest": "sha256:0a1660a7fc0943f082ad82712f70f1f9864c10fcd7b6c2aa34b66e320d28df50",
         "size": 52736
      },
      {
         "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip",
         "digest": "sha256:c44b3b94f1619a178691090b2a02c123a5354f863497e64e3cc7b9f77c55ba79",
         "size": 20774
      },
      {
         "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip",
         "digest": "sha256:77fb4c7d81f02ad14b711a776993cd511e78545c5e6416dd91de07d8ebf9a787",
         "size": 1402287
      }
   ]
}
skopeo inspect docker://docker.io/superseb/mirrored-pause:3.8-amd64  --raw 2>/dev/null       
{"schemaVersion":2,"mediaType":"application/vnd.docker.distribution.manifest.v2+json","config":{"mediaType":"application/vnd.docker.container.image.v1+json","size":2610,"digest":"sha256:7effaf5879989caec6af88410f43cb0aac21f7c0b65a966a1977a1e14f6ecf2e"},"layers":[{"mediaType":"application/vnd.docker.image.rootfs.diff.tar.gzip","size":125575393,"digest":"sha256:74930762f9101c7fdb6be93708b93e0bd982831c7feecfc60863d126066dd0df"},{"mediaType":"application/vnd.docker.image.rootfs.diff.tar.gzip","size":1890,"digest":"sha256:553dc5cabf3637632c6d5ea0ed3a417b301c641191823d86ef3058d6279c5492"},{"mediaType":"application/vnd.docker.image.rootfs.diff.tar.gzip","size":1353,"digest":"sha256:966a9745c8adacd90195a617131d8ee7f1037ee2574c21ca70605c658195b59d"},{"mediaType":"application/vnd.docker.image.rootfs.diff.tar.gzip","size":90718,"digest":"sha256:6e157444b00193a8f77eaba8293b00b9c45faf2df32de43cec53e6cf3aaf97c8"},{"mediaType":"application/vnd.docker.image.rootfs.diff.tar.gzip","size":56273,"digest":"sha256:d5434e2517d7db7fede0fd93607d62d305f29d3ad0ca78ee85407f8390e2a612"},{"mediaType":"application/vnd.docker.image.rootfs.diff.tar.gzip","size":1357,"digest":"sha256:1d8c49f4e7a53129f5e03d26ae07373eb74ff5b49ef1255495fd3930f5e4cc4b"},{"mediaType":"application/vnd.docker.image.rootfs.diff.tar.gzip","size":20774,"digest":"sha256:c44b3b94f1619a178691090b2a02c123a5354f863497e64e3cc7b9f77c55ba79"},{"mediaType":"application/vnd.docker.image.rootfs.diff.tar.gzip","size":1402287,"digest":"sha256:77fb4c7d81f02ad14b711a776993cd511e78545c5e6416dd91de07d8ebf9a787"}]}

There are a few things we need to know before we can proceed to a solution:

  • Is/was Windows part of the initial implementation, and does it account for multiple -amd64 images (one for Linux and multiple for Windows)? This is probably best answered by @brandond

Possible solutions:

  • I tested upgrading skopeo and using --preserve-digests, this will solve the problem for this image but needs testing against the whole list
  • We can move to --all when the source image/manifest is a list and just sync the whole list at once. This was tested for the pause:3.8 image but also needs testing for others. This needs input from @brandond as well on whether this was considered before (one "downside" I guess is that it syncs all archs and not just the ones we choose)

[Improvement] Add a new data source to automate retrieving images from an image-list file of a GitHub release

Current issue:

Longhorn would like to automate adding/mirroring images. However, the currently available data sources don't fit Longhorn's use case:

  1. github-releases data source: This one finds new GitHub release tag and adds the images defined in the images field in the config.json file. For example:
    {
      "vsphere-cpi": {
        "images": [
          "gcr.io/cloud-provider-vsphere/cpi/release/manager"
        ],
        "versionSource": "github-releases:kubernetes/cloud-provider-vsphere",
        "versionConstraint": ">1.21.0"
      }
    }
    This config.json file instructs the script to find GitHub release tags in the repo kubernetes/cloud-provider-vsphere, then add only the gcr.io/cloud-provider-vsphere/cpi/release/manager image with the found tags to the images-list. This doesn't fit Longhorn's use case because the list of Longhorn images is not fixed; we add/remove images between releases. It would therefore require frequent manual work to modify the "images" field of the config.json
  2. github-latest-release data source: this data source has the same limitation as github-releases. Additionally, Longhorn maintains multiple minor releases, so a smaller version (e.g., v1.4.5) might be released after the current latest version (e.g., v1.5.3). This data source will not sync and add the smaller version (e.g., v1.4.5)
  3. registry data source: Longhorn doesn't maintain a registry. Not applicable
  4. helm-latest, helm-oci, and helm-directory data sources: with these data sources, the workflow runs helm template and extracts the images from the workload (deployment/daemonset/pod) output of helm template. This approach doesn't work for Longhorn because not all Longhorn images appear in the output of helm template (e.g., the images of Longhorn system-managed components)

Proposal

Add a new data source to automate retrieving images from an image-list file of a GitHub release: github-releases-images-file. This will look up GitHub releases, excluding pre-releases, and find the list of images inside a specified file of the release. This can be used if your project maintains a list of images in a file, e.g., https://github.com/longhorn/longhorn/blob/master/deploy/longhorn-images.txt

An example of configuration for github-releases-images-file could be:

{
  "longhorn": {
    "versionSource": "github-releases-images-file:longhorn/longhorn",
    "imagesFilePath": "deploy/longhorn-images.txt",
    "versionConstraint": ">=1.4.0"
  }
}

With the new github-releases-images-file data source, the above config.json instructs the workflow to:

  1. Look up GitHub releases at the repo longhorn/longhorn, excluding pre-releases
  2. Only consider the releases which are >=1.4.0
  3. For each release, download the image list at deploy/longhorn-images.txt and add the newly found images to the images-list.
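
Step 3 boils down to parsing the downloaded image-list file into image/tag pairs. A sketch, assuming one image:tag per line as in longhorn-images.txt (`parse_images_file` is a hypothetical helper, not part of the proposal itself):

```python
def parse_images_file(text):
    """Hypothetical sketch: parse a release's image-list file (one
    image:tag per line, as in longhorn's deploy/longhorn-images.txt)
    into (image, tag) pairs."""
    pairs = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blanks and comments
        image, _, tag = line.rpartition(":")
        pairs.append((image, tag))
    return pairs

sample = "longhornio/longhorn-engine:v1.4.0\nlonghornio/longhorn-manager:v1.4.0\n"
print(parse_images_file(sample))
# [('longhornio/longhorn-engine', 'v1.4.0'), ('longhornio/longhorn-manager', 'v1.4.0')]
```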

[BUG] The script retrieve-image-tags.py crashes when chartConfig is empty

When running retrieve-image-tags.py with this config.json (similar to the example in README.md), it crashes because chart_values and kube_version are undefined in this code section https://github.com/rancher/image-mirror/blob/186c3e91d0b739f806d7a49fcda7e005aeb94396/retrieve-image-tags/retrieve-image-tags.py#L292C1-L292C1

{
  "longhorn": {
    "versionSource": "helm-latest:https://charts.longhorn.io",
    "additionalVersionFilter": [
      "v1.4.*"
    ],
    "helmCharts": {
      "longhorn": {}
    }
  }
}

Add support for risc-v (and plan for platform addition in general)

We are working in the background to add risc-v support to k3s at k3s-io/k3s#7151. However, one of the challenges of this effort is lack of risc-v platform support in rancher-mirrored images.

While we could simply add this to the arch list, this would change the multi-arch digest for any tags that we currently mirror that have this platform available. We should figure out how to add this platform to only new tags, in a way that is minimally disruptive to the list maintenance process, and can be reused when adding new platforms in the future.

One possibility would be to

  1. Add a new config file that tracks platform:start:end - where either or both start and end may be unspecified to indicate minimum and maximum values for the respective options.
  2. Existing platforms would be listed with an empty start and end date.
  3. Use the output of git blame --date=unix images-list to determine when an entry was added
  4. Add the platform to the list to be mirrored if the entry is newer than that platform's start date, and older than the end date

Note that we are evaluating when the entry was added to the list, NOT when the entry was published to the source registry. This approach should work as long as we continue to not modify existing list entries, such that git blame properly indicates the correct date for the addition of the entry.
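
The proposed platform:start:end rule could be sketched as follows. All names here are hypothetical, and timestamps are assumed to be unix seconds as produced by git blame --date=unix:

```python
def platforms_for_entry(entry_added, platform_ranges):
    """Hypothetical sketch of the proposed rule: mirror a platform for an
    images-list entry only if the entry was added inside that platform's
    [start, end] window (unix timestamps; None means unbounded)."""
    selected = []
    for platform, (start, end) in platform_ranges.items():
        if (start is None or entry_added >= start) and \
           (end is None or entry_added <= end):
            selected.append(platform)
    return sorted(selected)

platform_ranges = {
    "linux/amd64": (None, None),          # existing platform: always mirrored
    "linux/riscv64": (1700000000, None),  # only entries added after the cutoff
}
print(platforms_for_entry(1690000000, platform_ranges))
# ['linux/amd64']
```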

Drone Publish Jobs not Running

The drone publish jobs for this repo are no longer running; as a result, we are missing the images that should have been mirrored when #664 was merged. Without those images, this is a blocker for rancher/rancher#46082

It appears that drone-publish for this repo is inactive: https://drone-publish.rancher.io/rancher/image-mirror/settings

Potential fixes could be re-enabling drone-publish for the repo, migrating to GHA (potential PR), or maybe someone can run it manually?

retrieve-image-tags : Support `--devel`, directory

Ideas:

  • Support --devel flag when accessing helm repositories to see dev charts as well.
  • Support directory as a data source. The directory has to contain an unpacked helm chart.

The second idea would allow trialing the tool against charts that are in development and not yet released anywhere.

Better feedback on image-mirror errors

Currently there is no feedback besides an error from skopeo in the Drone CI log. We should look at a way to have the errors reported back into GitHub issues or similar, so we are aware when something is not right.

Consolidate minio images source to quay.io

While working on #437, the resulting minio images from the epinio chart come from quay.io. As the automation automatically checks whether images are present based on the source (and adds tags if they don't exist yet), this conflicts with the current images-list file. We already added logic preferring the mirrored- images, but in this case, both are mirrored-.

As Minio seems to be using quay.io as their main in the docs (except quickstart?), we should look into consolidating all the minio images to quay.io which will then also resolve the automation for the epinio chart images.

Add check for target repository in CI

We currently check if the source image and tag exists but not if the target repository exists. We should add that check so we don't have to rely on the person submitting the PR or on the reviewer to check it.
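
A minimal sketch of such a check, assuming Docker Hub's v2 repositories endpoint (the helper name is hypothetical):

```shell
#!/bin/sh
# Sketch (hypothetical helper): succeed only if the target Docker Hub
# repository exists, e.g. repo_exists rancher/mirrored-banzaicloud-logging-operator
repo_exists() {
  status=$(curl -s -o /dev/null -w '%{http_code}' \
    "https://hub.docker.com/v2/repositories/$1/")
  [ "$status" = "200" ]
}
```

CI could run this over the target column of every changed images-list line and fail the check before a human ever has to look.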

Neuvector mirrored-prometheus-exporter:5.1.3 is not mirrored

The following image is not mirrored; can you please check?
#395

neuvector@NV-Ubuntu1604-Automation:~$ docker pull rancher/mirrored-prometheus-exporter:5.1.3
Error response from daemon: pull access denied for rancher/mirrored-prometheus-exporter, repository does not exist or may require 'docker login'

Automate adding images from chart as source

The current implementation of the automation is based on the "add tag to existing image" approach, meaning that you can provide images and tags and the logic will add them if they do not already exist. As we only allow automated PRs for existing images, the logic prevents adding new images to the file (it only adds tags for images that are already present).

Later on, logic was added to support a source of images/tags that can be looked up (for example, GitHub releases or registries); with that collected data, the workflow "add tag to existing images" is launched.

The downside of the current approach is that we only support a list of images that require the same tag (usually from the same source). This means that if a "release" consists of multiple images, we can't process it in a single request/PR. Currently, multiple sources and images each create separate PRs that together make up one release.

For example, cilium uses quite a few different images with different release versions. The idea for this issue is to:

  • Implement Helm chart as a new source for images/tags for a component (example project cilium, example source Cilium Helm chart)
  • Retrieve Helm chart and use helm template with configurable values to get the correct rendered YAMLs
  • Extract images from rendered YAMLs
  • Feed the list of images + tags into a workflow to create the PR to add missing tags (we are still not adding new images)

Example mock-up code:

# Lookup latest Helm chart from `index.yaml`, for this example hardcoded version
$ wget https://helm.cilium.io/cilium-1.14.0.tgz
# Helm template and extract images
$ helm template --set hubble.ui.enabled=true --set hubble.relay.enabled=true --set hubble.enabled=true --set azure.enabled=true --set envoy.enabled=true cilium-1.14.0.tgz | yq '.. | .image? | select(.)' | sort -u | awk -F'@' '{ print $1 }' | sort -u
---
quay.io/cilium/cilium-envoy:v1.25.9-f039e2bd380b7eef2f2feea5750676bb36133699
quay.io/cilium/cilium:v1.14.0
quay.io/cilium/hubble-relay:v1.14.0
quay.io/cilium/hubble-ui-backend:v0.12.0
quay.io/cilium/hubble-ui:v0.12.0
quay.io/cilium/operator-azure:v1.14.0
# Launch workflow with comma separated list of images/tags to be processed (this is different from the current logic)
$ gh workflow run add-tag-to-existing-image.yml --ref $GITHUB_REF_NAME -f input=quay.io/cilium/cilium-envoy:v1.25.9-f039e2bd380b7eef2f2feea5750676bb36133699,quay.io/cilium/cilium:v1.14.0,quay.io/cilium/hubble-relay:v1.14.0,quay.io/cilium/hubble-ui-backend:v0.12.0,quay.io/cilium/hubble-ui:v0.12.0,quay.io/cilium/operator-azure:v1.14.0

Allow pinning images

We’ve seen that some images we mirror are being overwritten on the fly, which is breaking users in prod. Specifically nginx changed the manifest type on the 1.24.0-alpine image.
We should either:

  • Default to a pinned state, where every SHA gets recorded and the mirror is skipped if the upstream SHA differs
  • Allow specifying a SHA, and only pin when one is specified
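
Either option boils down to the same decision, sketched here with a hypothetical extra images-list column holding the pinned digest (the live source digest could come from `skopeo inspect --format '{{.Digest}}'`):

```shell
#!/bin/sh
# Sketch of the pinning decision (hypothetical helper): with no pin recorded,
# mirror (and optionally record the digest); with a pin, only mirror when the
# live source digest still matches, so silent upstream overwrites are skipped.
should_mirror() {
  live_digest=$1 pinned_digest=$2
  [ -z "$pinned_digest" ] || [ "$live_digest" = "$pinned_digest" ]
}
```

A skipped mismatch would also be a natural place to raise an alert, since it is exactly the overwrite scenario described above.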

Support more version filter(s) in Helm chart image retrieval

This comes from #504 (comment).

In the case of Longhorn, they have a stable (v1.4.x) and a latest (v1.5.x) release. In the current logic, we only get the latest from Helm (it determines the latest available version by not passing any version to the helm commands). We can easily add support for this, but the question is about the design of the configuration.

Three options I can think of:

  • Add a new source configuration name called helm and specify a versionFilter; this aligns with the other sources. A possible downside is that you need to copy-paste the blocks for helm-latest and helm. This will also result in multiple PRs, as they are different blocks.
  • Add a new configuration option versionFilter to helm-latest. A possible downside is that you need to copy-paste the blocks with and without this versionFilter. This will also result in multiple PRs, as they are different blocks.
  • Add a new configuration option additionalVersionFilter to helm-latest. After getting the images for latest, this would be used to get additional images using the filters from this array. This will result in a single PR with all images for all charts. A possible downside is that images from multiple chart versions get combined, which may not be desirable.

We could also combine options 1 & 3 or 2 & 3 to let the user choose how they want the PRs to end up (combined or separate).
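
As an illustration of option 3, a hypothetical config.json block (the field names besides the descriptive key are assumptions for this sketch, not the current schema):

```json
{
  "longhorn": {
    "versionSource": "helm-latest",
    "additionalVersionFilter": [">= 1.4.0 < 1.5.0"]
  }
}
```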
