openshift / cluster-image-registry-operator
The image registry operator installs and maintains the internal registry on a cluster.
License: Apache License 2.0
Follow up to #52 to resolve the items described in:
#52 (comment)
On re-sync, we need to ensure the bucket exists. That means a few things:
The operator doesn't allow changing the configuration from
storage:
  filesystem: {}
to
storage:
  filesystem:
    volumeSource:
      emptyDir: {}
because it reports storage type change is not supported: expected storage type filesystem, but got emptydir.
/kind bug
@RobertKrawitz reports this operator crashing on both libvirt and AWS with:
time=""2018-11-05T19:42:08Z"" level=info msg=""Cluster Image Registry Operator Version: 601f5c3-dirty""
time=""2018-11-05T19:42:08Z"" level=info msg=""Go Version: go1.10.3""
time=""2018-11-05T19:42:08Z"" level=info msg=""Go OS/Arch: linux/amd64""
time=""2018-11-05T19:42:08Z"" level=info msg=""operator-sdk Version: 0.0.6+git""
time=""2018-11-05T19:42:08Z"" level=info msg=""Metrics service cluster-image-registry-operator created""
time=""2018-11-05T19:42:08Z"" level=info msg=""generating registry custom resource""
time=""2018-11-05T19:42:08Z"" level=fatal msg=""unknown storage backend: """
From the first line of the logs, you can see that's with 601f5c3 from this repository. I'll file a pull request at least expanding the logs for the missing config-map case.
I'm removing most of our GC usage here:
#215
but I don't want to get into the PVC logic. Right now it appears to be using GC to ensure the PVC gets removed when the CR is removed. That won't work because the CR is cluster-scoped and GC can't span namespaces. The PVC needs to be cleaned up via finalization.
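A minimal sketch of what finalizer-based cleanup could look like, assuming a cluster-scoped ImageRegistry CR type; the helper names (hasFinalizer, removeFinalizer, updateCR) and the finalizer string are illustrative, not the operator's actual API:

import (
	kerrors "k8s.io/apimachinery/pkg/api/errors"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

// pvcFinalizer guards PVC cleanup; the name is illustrative.
const pvcFinalizer = "imageregistry.operator.openshift.io/pvc-cleanup"

// ImageRegistry stands in for the operator's CR type.
func (c *Controller) finalize(cr *ImageRegistry) error {
	if cr.ObjectMeta.DeletionTimestamp == nil {
		// CR is live: make sure the finalizer is present.
		if !hasFinalizer(cr, pvcFinalizer) {
			cr.ObjectMeta.Finalizers = append(cr.ObjectMeta.Finalizers, pvcFinalizer)
			return c.updateCR(cr)
		}
		return nil
	}
	// CR is being deleted: remove the PVC in the operand namespace first...
	err := c.kubeClient.CoreV1().
		PersistentVolumeClaims("openshift-image-registry").
		Delete("image-registry-storage", &metav1.DeleteOptions{})
	if err != nil && !kerrors.IsNotFound(err) {
		return err
	}
	// ...then drop the finalizer so the delete can complete.
	removeFinalizer(cr, pvcFinalizer)
	return c.updateCR(cr)
}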
Since we only want one registry per cluster anyway (we can't actually tolerate more than one), and because finalizers deadlock when the controller doing the finalizing and the resource being finalized run in the same namespace, we need to switch the registry CRD from namespace-scoped to cluster-scoped.
/cc @deads2k
/assign @legionus
Based on cluster-image-registry-operator/pkg/storage/s3/s3.go, lines 365 to 377 at 7f6ffaf, it looks like we apply our own encryption settings to the S3 bucket even if the bucket was supplied by the user, and even if the bucket already existed. This means a user can't provide their own S3 bucket with their own encryption settings.
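A sketch of the fix, under the assumption that something like the storageManaged flag (visible elsewhere on this page) records whether the operator created the bucket; setDefaultEncryption is a hypothetical helper wrapping PutBucketEncryption:

import "github.com/aws/aws-sdk-go/service/s3"

// syncBucketEncryption forces our default encryption only onto buckets
// the operator itself created; a user-supplied or pre-existing bucket
// keeps whatever encryption settings it already has.
func syncBucketEncryption(svc *s3.S3, bucket string, storageManaged bool) error {
	if !storageManaged {
		return nil
	}
	return setDefaultEncryption(svc, bucket) // hypothetical helper
}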
Move from logrus and the old github.com/golang/glog to k8s.io/klog.
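A minimal sketch of the target state; klog is call-compatible with glog, so the migration is mostly imports plus flag initialization (the log messages here are just examples):

package main

import (
	"flag"

	"k8s.io/klog"
)

func main() {
	// Unlike logrus, klog is configured through flags (-v, -logtostderr, ...).
	klog.InitFlags(nil)
	flag.Parse()
	defer klog.Flush()

	// logrus.Infof / glog.Infof become:
	klog.Infof("Go Version: %s", "go1.10.3")
	// logrus.Fatalf / glog.Fatalf become:
	// klog.Fatalf("unknown storage backend: %s", backend)
}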
both the image-registry resource and the clusteroperator resource are reporting "progressing=true" when the deployment is complete.
The deployment itself reports:
- lastTransitionTime: 2018-10-22T15:18:03Z
  lastUpdateTime: 2018-10-22T15:18:31Z
  message: ReplicaSet "image-registry-7cbbc9dc8c" has successfully progressed.
  reason: NewReplicaSetAvailable
  status: "True"
  type: Progressing
So either we are misunderstanding what that condition means, or deployments have a bug and should be setting progressing to false/completed=true.
Certainly the message 'ReplicaSet "image-registry-7cbbc9dc8c" has successfully progressed.' would imply this might be a deployments bug.
/assign @legionus
This code (cluster-image-registry-operator/pkg/resource/generator.go, lines 233 to 248 at 6e2fee7) means that we ultimately update the CR even if we haven't actually changed the CR itself as part of an event loop (we may have only updated a secondary resource).
Every time we update the CR, we trigger an event that sends us back through the generator logic, so at a minimum we're doing one extra pass through the event loop every time.
I thought we'd already done this... is there a reason the registry resource doesn't define and use a status subresource?
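A sketch of the guard, using apimachinery's semantic equality; the CR type and client call are illustrative:

import "k8s.io/apimachinery/pkg/api/equality"

// applyCR writes the CR back only when the generator actually changed it,
// so a sync that touched nothing but secondary resources doesn't trigger
// another trip through the event loop.
func (c *Controller) applyCR(oldCR, newCR *ImageRegistry) error {
	if equality.Semantic.DeepEqual(oldCR, newCR) {
		return nil // no-op update avoided
	}
	// With a status subresource, status writes would go through
	// UpdateStatus here instead of rippling through Update.
	_, err := c.client.ImageRegistries().Update(newCR)
	return err
}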
Currently we're using a generated UUID for the bucket name.
@cuppett mentioned that this isn't generally considered good enough on S3 because of the global nature of the bucket namespace. He had some suggestions for additional values we should include in the name to ensure uniqueness.
@cuppett can you please supply the details of the recommendation you were making to me last night?
Thanks.
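In the meantime, one plausible shape for such a name (an assumption, not the recorded recommendation) mixes cluster-specific identifiers into it rather than relying on a bare UUID:

import (
	"fmt"
	"strings"

	"github.com/google/uuid"
)

// bucketName combines cluster-specific parts with a short random suffix,
// since the S3 bucket namespace is shared globally across all accounts.
// Using infraName and region as the components is an illustrative choice.
func bucketName(infraName, region string) string {
	suffix := strings.SplitN(uuid.New().String(), "-", 2)[0]
	name := fmt.Sprintf("%s-image-registry-%s-%s", infraName, region, suffix)
	if len(name) > 63 { // S3 bucket names max out at 63 characters
		name = strings.TrimSuffix(name[:63], "-")
	}
	return name
}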
Is it supported to use a PVC as registry storage on OCP 4, like on OCP 3? If yes, how do we set it up?
By default, the registry uses S3 on AWS:
# oc get configs.imageregistry.operator.openshift.io instance -o yaml | grep storageManaged -B3
  s3:
    bucket: image-registry-us-east-2-<hash>
    region: us-east-2
  storageManaged: true
Thanks.
In order to make the internal registry discoverable by other components (the API server, for example), we should create the ConfigMap openshift-image-registry/(image-registry-)?public-info with:
This ConfigMap should be readable by system:authenticated.
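A sketch of the grant with rbac/v1 types; the role, binding, and ConfigMap names here are illustrative:

import (
	rbacv1 "k8s.io/api/rbac/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

// publicInfoRBAC returns a Role that can read only the public-info
// ConfigMap, plus a RoleBinding handing it to every authenticated user.
func publicInfoRBAC() (*rbacv1.Role, *rbacv1.RoleBinding) {
	role := &rbacv1.Role{
		ObjectMeta: metav1.ObjectMeta{Name: "public-info-reader", Namespace: "openshift-image-registry"},
		Rules: []rbacv1.PolicyRule{{
			APIGroups:     []string{""},
			Resources:     []string{"configmaps"},
			ResourceNames: []string{"image-registry-public-info"},
			Verbs:         []string{"get"},
		}},
	}
	binding := &rbacv1.RoleBinding{
		ObjectMeta: metav1.ObjectMeta{Name: "public-info-readers", Namespace: "openshift-image-registry"},
		Subjects: []rbacv1.Subject{{
			Kind:     rbacv1.GroupKind,
			APIGroup: rbacv1.GroupName,
			Name:     "system:authenticated",
		}},
		RoleRef: rbacv1.RoleRef{APIGroup: rbacv1.GroupName, Kind: "Role", Name: "public-info-reader"},
	}
	return role, binding
}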
The image registry operator works fine after the cluster is set up, but is broken after an upgrade:
oc get clusteroperator cluster-image-registry-operator -o yaml
...
- lastTransitionTime: 2019-01-28T15:48:03Z
  message: "Unable to apply resources: unable to sync storage configuration: InvalidAccessKeyId:
    The AWS Access Key Id you provided does not exist in our records.\n\tstatus
    code: 403, request id: 1BB1FAF4B272F4A2, host id: 5/5II1d/1rTqEEh9HbyiW+P8NM+dqzd0NOCpmTogNIjgFoRf7lSxbdyX0BWhH1B+9ILNI22u9Lw="
  status: "True"
  type: Progressing
...
We settled on:
resource name: config(s)
group name: imageregistry.operator.openshift.io
resource instance name: instance
The operator is hotlooping, as evidenced by the quickly scrolling list of events being handled in the operator log. Changing to emptyDir storage stops the hotloop.
If the clusteroperator defines object refs, the must-gather tool can automatically collect relevant resources/logs/etc. from the referenced objects/namespaces. See:
This operator should be updated to define appropriate object references.
/cc @dmage @adambkaplan
During upgrade testing, the node-ca daemonset is not upgraded.
$ oc get clusterversion
NAME VERSION AVAILABLE PROGRESSING SINCE STATUS
version 4.0.0-0.alpha-2019-02-08-113402 True False 23m Cluster version is 4.0.0-0.alpha-2019-02-08-113402
$ oc get pod -oyaml cluster-image-registry-operator-67587bc6c7-qbsm2
spec:
  containers:
  - command:
    - cluster-image-registry-operator
    env:
    - name: WATCH_NAMESPACE
      valueFrom:
        fieldRef:
          apiVersion: v1
          fieldPath: metadata.namespace
    - name: OPERATOR_NAME
      value: cluster-image-registry-operator
    - name: IMAGE
      value: registry.svc.ci.openshift.org/openshift/origin-v4.0-2019-02-08-113402@sha256:c4cc6c2199cac465f128bf9a0a78a83cfae8302e86e9aeef22b740c1e45780c2
    image: registry.svc.ci.openshift.org/openshift/origin-v4.0-2019-02-08-113402@sha256:9f31485bb394dba9264ee020d07b61f0554d4640b5c625e67904fadb02f5c3c6
    imagePullPolicy: Always
    name: cluster-image-registry-operator
$ oc get ds node-ca -oyaml
spec:
  template:
    spec:
      containers:
      - image: registry.svc.ci.openshift.org/openshift/origin-v4.0-2019-02-08-055616@sha256:c4cc6c2199cac465f128bf9a0a78a83cfae8302e86e9aeef22b740c1e45780c2 <--- should match cluster version
message: "Unable to apply resources: unable to sync storage configuration: InvalidAccessKeyId:
The AWS Access Key Id you provided does not exist in our records.\n\tstatus
code: 403, request id: 9F7CF2E209519A53, host id: KmxOvvgpisxuwNCxf8xm3HQSKHryoECU6n10p2YUQBiW3sytZYtUmny006BW5ypF47nYQTzVSzE="
status: "True"
We should be setting up listwatchers for all the resources we care about, so that when we need to re-get them we can read them from the cache instead of making redundant API calls.
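A minimal sketch with client-go shared informers, so re-gets are served from the local cache; the secret name used below is just an example:

import (
	"time"

	"k8s.io/client-go/informers"
	"k8s.io/client-go/kubernetes"
	corelisters "k8s.io/client-go/listers/core/v1"
)

// startListers wires up cached listers scoped to the operand namespace.
func startListers(kubeClient kubernetes.Interface, stopCh <-chan struct{}) corelisters.SecretLister {
	factory := informers.NewSharedInformerFactoryWithOptions(
		kubeClient, 10*time.Minute,
		informers.WithNamespace("openshift-image-registry"),
	)
	secrets := factory.Core().V1().Secrets().Lister()
	factory.Start(stopCh)
	factory.WaitForCacheSync(stopCh)

	// This read hits the informer cache, not the API server.
	_, _ = secrets.Secrets("openshift-image-registry").Get("image-registry-private-configuration")
	return secrets
}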
Oleg,
Since you added this test, could you check this out?
I'm also not sure how it got merged with this test failing.
test/e2e/recreate_deployment_test.go:50:2: deploy declared but not used
vet: typecheck failures
FAIL github.com/openshift/cluster-image-registry-operator/test/e2e [build failed]
make: *** [Makefile:24: test-e2e] Error 2
Test timeouts (the default is 10 minutes) are applied on a per-package basis. If we organize the tests into proper packages, we can go back to using the default timeout.
I have an installer e2e job where this operator is crash-looping on AWS. More details are in openshift/installer#403, but the parts specific to this operator are this:
$ kubectl logs -n openshift-image-registry cluster-image-registry-operator-869c995bc5-ccrn9
time="2018-10-03T23:09:12Z" level=info msg="Cluster Image Registry Operator Version: c2753e9-dirty"
time="2018-10-03T23:09:12Z" level=info msg="Go Version: go1.10.3"
time="2018-10-03T23:09:12Z" level=info msg="Go OS/Arch: linux/amd64"
time="2018-10-03T23:09:12Z" level=info msg="operator-sdk Version: 0.0.6+git"
E1003 23:09:12.518994 1 memcache.go:153] couldn't get resource list for apps.openshift.io/v1: the server is currently unable to handle the request
E1003 23:09:12.521719 1 memcache.go:153] couldn't get resource list for authorization.openshift.io/v1: the server is currently unable to handle the request
E1003 23:09:12.523561 1 memcache.go:153] couldn't get resource list for build.openshift.io/v1: the server is currently unable to handle the request
E1003 23:09:12.525691 1 memcache.go:153] couldn't get resource list for image.openshift.io/v1: the server is currently unable to handle the request
E1003 23:09:12.535615 1 memcache.go:153] couldn't get resource list for network.openshift.io/v1: the server is currently unable to handle the request
E1003 23:09:12.538190 1 memcache.go:153] couldn't get resource list for oauth.openshift.io/v1: the server is currently unable to handle the request
E1003 23:09:12.662266 1 memcache.go:153] couldn't get resource list for project.openshift.io/v1: the server is currently unable to handle the request
E1003 23:09:12.683553 1 memcache.go:153] couldn't get resource list for quota.openshift.io/v1: the server is currently unable to handle the request
E1003 23:09:12.703499 1 memcache.go:153] couldn't get resource list for route.openshift.io/v1: the server is currently unable to handle the request
E1003 23:09:12.717847 1 memcache.go:153] couldn't get resource list for security.openshift.io/v1: the server is currently unable to handle the request
E1003 23:09:12.719404 1 memcache.go:153] couldn't get resource list for template.openshift.io/v1: the server is currently unable to handle the request
E1003 23:09:12.720938 1 memcache.go:153] couldn't get resource list for user.openshift.io/v1: the server is currently unable to handle the request
time="2018-10-03T23:09:13Z" level=info msg="Metrics service cluster-image-registry-operator created"
time="2018-10-03T23:09:13Z" level=info msg="Watching rbac.authorization.k8s.io/v1, ClusterRole, , 0"
time="2018-10-03T23:09:13Z" level=info msg="Watching rbac.authorization.k8s.io/v1, ClusterRoleBinding, , 0"
time="2018-10-03T23:09:13Z" level=info msg="Watching v1, ConfigMap, openshift-image-registry, 0"
time="2018-10-03T23:09:13Z" level=info msg="Watching v1, Secret, openshift-image-registry, 0"
time="2018-10-03T23:09:13Z" level=info msg="Watching v1, ServiceAccount, openshift-image-registry, 0"
time="2018-10-03T23:09:13Z" level=info msg="Watching route.openshift.io/v1, Route, openshift-image-registry, 0"
time="2018-10-03T23:09:13Z" level=error msg="failed to get resource client for (apiVersion:route.openshift.io/v1, kind:Route, ns:openshift-image-registry): failed to get resource type: failed to get the resource REST mapping for GroupVersionKind(route.openshift.io/v1, Kind=Route): no matches for kind \"Route\" in version \"route.openshift.io/v1\""
panic: failed to get resource type: failed to get the resource REST mapping for GroupVersionKind(route.openshift.io/v1, Kind=Route): no matches for kind "Route" in version "route.openshift.io/v1"
goroutine 1 [running]:
github.com/openshift/cluster-image-registry-operator/vendor/github.com/operator-framework/operator-sdk/pkg/sdk.Watch(0xc4206786a0, 0x15, 0x133c5d9, 0x5, 0xc420040040, 0x18, 0x0, 0x0, 0x0, 0x0)
/go/src/github.com/openshift/cluster-image-registry-operator/vendor/github.com/operator-framework/operator-sdk/pkg/sdk/api.go:49 +0x4a8
main.watch(0xc4206786a0, 0x15, 0x133c5d9, 0x5, 0xc420040040, 0x18, 0x0)
/go/src/github.com/openshift/cluster-image-registry-operator/cmd/cluster-image-registry-operator/main.go:39 +0x228
main.main()
/go/src/github.com/openshift/cluster-image-registry-operator/cmd/cluster-image-registry-operator/main.go:82 +0x496
and this:
$ kubectl describe pods -n openshift-image-registry cluster-image-registry-operator-869c995bc5-ccrn9
Name: cluster-image-registry-operator-869c995bc5-ccrn9
Namespace: openshift-image-registry
Priority: 0
PriorityClassName: <none>
Node: ip-10-0-136-219.ec2.internal/10.0.136.219
Start Time: Wed, 03 Oct 2018 22:45:47 +0000
Labels: name=cluster-image-registry-operator
pod-template-hash=4257551671
Annotations: openshift.io/scc=restricted
Status: Running
IP: 10.2.4.6
Controlled By: ReplicaSet/cluster-image-registry-operator-869c995bc5
Containers:
cluster-image-registry-operator:
Container ID: cri-o://1bd15a65433c6f0cf3674fdf522bd7355c4e42741a7efccfa328fda1fea63ed2
Image: registry.svc.ci.openshift.org/ci-op-lpz1gxwg/stable@sha256:61b10a249a6efcf5ca2affd605365008115c1781fbd857b503f73d7091d23fd2
Image ID: registry.svc.ci.openshift.org/ci-op-lpz1gxwg/stable@sha256:61b10a249a6efcf5ca2affd605365008115c1781fbd857b503f73d7091d23fd2
Port: 60000/TCP
Host Port: 0/TCP
Command:
cluster-image-registry-operator
State: Waiting
Reason: CrashLoopBackOff
Last State: Terminated
Reason: Error
Exit Code: 2
Started: Wed, 03 Oct 2018 23:40:11 +0000
Finished: Wed, 03 Oct 2018 23:40:11 +0000
Ready: False
Restart Count: 15
Environment:
WATCH_NAMESPACE: openshift-image-registry (v1:metadata.namespace)
OPERATOR_NAME: cluster-image-registry-operator
Mounts:
/var/run/secrets/kubernetes.io/serviceaccount from default-token-6p6p5 (ro)
Conditions:
Type Status
Initialized True
Ready False
ContainersReady False
PodScheduled True
Volumes:
default-token-6p6p5:
Type: Secret (a volume populated by a Secret)
SecretName: default-token-6p6p5
Optional: false
QoS Class: BestEffort
Node-Selectors: <none>
Tolerations: <none>
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Warning FailedScheduling 58m (x310 over 1h) default-scheduler 0/4 nodes are available: 4 node(s) had taints that the pod didn't tolerate.
Warning FailedCreatePodSandBox 55m kubelet, ip-10-0-136-219.ec2.internal Failed create pod sandbox: rpc error: code = Unknown desc = failed to create pod network sandbox k8s_cluster-image-registry-operator-869c995bc5-ccrn9_openshift-image-registry_e112b284-c75c-11e8-ad65-1267b6294ade_0(bd2b553ae930d9afea620ac3bc9401828ae7a577b0637f77739324638d3d414e): open /run/flannel/subnet.env: no such file or directory
Warning FailedCreatePodSandBox 55m kubelet, ip-10-0-136-219.ec2.internal Failed create pod sandbox: rpc error: code = Unknown desc = failed to create pod network sandbox k8s_cluster-image-registry-operator-869c995bc5-ccrn9_openshift-image-registry_e112b284-c75c-11e8-ad65-1267b6294ade_0(d27b30b91af951d2b96fcb67b9316b6c820ae1b0d8cd2681bcb2153cba249315): open /run/flannel/subnet.env: no such file or directory
Warning FailedCreatePodSandBox 54m kubelet, ip-10-0-136-219.ec2.internal Failed create pod sandbox: rpc error: code = Unknown desc = failed to create pod network sandbox k8s_cluster-image-registry-operator-869c995bc5-ccrn9_openshift-image-registry_e112b284-c75c-11e8-ad65-1267b6294ade_0(dabf597aba822b351b50772943de49434d51ba26367877881a6a4d03c3b3c88a): open /run/flannel/subnet.env: no such file or directory
Warning FailedCreatePodSandBox 54m kubelet, ip-10-0-136-219.ec2.internal Failed create pod sandbox: rpc error: code = Unknown desc = failed to create pod network sandbox k8s_cluster-image-registry-operator-869c995bc5-ccrn9_openshift-image-registry_e112b284-c75c-11e8-ad65-1267b6294ade_0(5286b6bd6666d5ea8bd1d0e79dd35110598dd46b0f1c6488cbe7738312c71386): open /run/flannel/subnet.env: no such file or directory
Normal Pulling 52m (x4 over 54m) kubelet, ip-10-0-136-219.ec2.internal pulling image "registry.svc.ci.openshift.org/ci-op-lpz1gxwg/stable@sha256:61b10a249a6efcf5ca2affd605365008115c1781fbd857b503f73d7091d23fd2"
Normal Pulled 52m (x4 over 54m) kubelet, ip-10-0-136-219.ec2.internal Successfully pulled image "registry.svc.ci.openshift.org/ci-op-lpz1gxwg/stable@sha256:61b10a249a6efcf5ca2affd605365008115c1781fbd857b503f73d7091d23fd2"
Normal Created 52m (x4 over 54m) kubelet, ip-10-0-136-219.ec2.internal Created container
Normal Started 52m (x4 over 54m) kubelet, ip-10-0-136-219.ec2.internal Started container
Warning BackOff 21s (x242 over 53m) kubelet, ip-10-0-136-219.ec2.internal Back-off restarting failed container
I don't know if the missing /run/flannel/subnet.env is an API server issue (the API servers are also having trouble on this cluster), a kubelet issue, or an operator issue, but I thought I'd post here in case the issue is more obvious to y'all :).
configuration_test.go:69: expected memory limit of 512Mi, found: 0
Given that the test creates the registry resource with the memory limit set, I can't see why we'd ever create a pod that didn't have the limit set, but it's the most consistent flake I'm seeing on the registry operator repo right now.
Configure the S3 bucket to abort multipart uploads by default when we create the bucket.
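For example, via a bucket lifecycle rule with aws-sdk-go; the one-day threshold and the rule ID are assumptions:

import (
	"github.com/aws/aws-sdk-go/aws"
	"github.com/aws/aws-sdk-go/service/s3"
)

// abortStaleUploads installs a lifecycle rule, at bucket-creation time,
// that aborts incomplete multipart uploads after one day.
func abortStaleUploads(svc *s3.S3, bucket string) error {
	_, err := svc.PutBucketLifecycleConfiguration(&s3.PutBucketLifecycleConfigurationInput{
		Bucket: aws.String(bucket),
		LifecycleConfiguration: &s3.BucketLifecycleConfiguration{
			Rules: []*s3.LifecycleRule{{
				ID:     aws.String("abort-incomplete-multipart"),
				Status: aws.String("Enabled"),
				Filter: &s3.LifecycleRuleFilter{Prefix: aws.String("")},
				AbortIncompleteMultipartUpload: &s3.AbortIncompleteMultipartUpload{
					DaysAfterInitiation: aws.Int64(1),
				},
			}},
		},
	})
	return err
}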
Deleting a registry resource results in a hang; it doesn't look like the finalizer is succeeding.
In my case I did not have a registry created; I was simply deleting the default CR that was created (the one that has no storage configured and thus has not deployed a registry).
Today we copy the S3 config from the cluster into the secret in the image-registry namespace, and then drive that into the registry deployment.
Instead the registry operator should be:
We also need to be watching (2) for changes and resyncing it when there is no local secret.
We want to reuse the test framework for the console-operator e2e tests. Is there any plan to refactor it and extract it into https://github.com/openshift/library-go ?
Drivers should use a SyncStorage function to determine what action to take (create, update, remove) based on the requested storage (spec) and current storage (status).
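A sketch of the contract this implies; the type, constant, and field names are illustrative:

// ImageRegistryStorage stands in for the CR's storage configuration type.
type ImageRegistryStorage struct{ /* s3, filesystem, ... */ }

// Action records what a storage driver decided to do during a sync.
type Action int

const (
	ActionNone Action = iota
	ActionCreate
	ActionUpdate
	ActionRemove
)

// Driver converges the requested storage (from spec) with the current
// storage (from status) and reports which action it took.
type Driver interface {
	SyncStorage(spec, status *ImageRegistryStorage) (Action, error)
}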
We should be creating/maintaining the new clusteroperator.config.openshift.io resource, not the old clusteroperators.operatorstatus.openshift.io one.
Also, clusteroperator.config.openshift.io is not namespaced.
This is logging the wrong namespace:
https://github.com/openshift/cluster-image-registry-operator/blob/master/pkg/clusterconfig/clusterconfig.go#L123
"Unable to apply resources: unable to sync storage configuration: unable to get cluster minted credentials \"kube-system/installer-cloud-credentials\": timed out waiting for the condition",
It's not looking in kube-system; it's looking in openshift-image-registry.
It's probably worth double-checking the others as well.
If I run "oc get ImageRegistry -o yaml", I see the Image Registry Operator status conditions such as
status:
  conditions:
  - lastTransitionTime: 2018-12-11T00:30:49Z
    message: deployment has minimum availability
    status: "True"
    type: Available
  - lastTransitionTime: 2018-12-11T00:30:49Z
    message: everything is ready
    status: "False"
    type: Progressing
  - lastTransitionTime: 2018-12-11T00:30:41Z
    status: "False"
    type: Failing
  - lastTransitionTime: 2018-12-11T00:30:41Z
    status: "False"
    type: Removed
But shouldn't these be reported in the ClusterOperator resource?
When I run "oc get ClusterOperator" I see the status for the other operators (including the Samples Operator), and when I run "oc get ClusterOperator -o yaml" I see conditions similar to what we have in our ImageRegistry resource.
And then, I would think, we should be storing things such as the StorageExists condition in the ImageRegistry resource status conditions.
Thoughts?
The operator prevents changing from filesystem to another storage type, but it doesn't prevent me from changing my filesystem volumeSource from PVC to EmptyDir (or presumably making other changes). Such a change also invalidates my storage and therefore needs to be prevented.
docker build -t docker.io/openshift/origin-cluster-image-registry-operator:latest .
Sending build context to Docker daemon 180.2 MB
Step 1/9 : FROM openshift/origin-release:golang-1.10
Trying to pull repository registry.access.redhat.com/openshift/origin-release ...
Trying to pull repository docker.io/openshift/origin-release ...
sha256:76e8479eb8c137cff06a5d35b27e5b5b5d4ce70f7a30de4ef01c5b3159a4d1ce: Pulling from docker.io/openshift/origin-release
256b176beaff: Already exists
2bd622ac2b02: Pull complete
Digest: sha256:76e8479eb8c137cff06a5d35b27e5b5b5d4ce70f7a30de4ef01c5b3159a4d1ce
Status: Downloaded newer image for docker.io/openshift/origin-release:golang-1.10
---> b017c842702b
Step 2/9 : COPY . /go/src/github.com/openshift/cluster-image-registry-operator
---> d442a944130b
Removing intermediate container 830fd06ddc48
Step 3/9 : RUN cd /go/src/github.com/openshift/cluster-image-registry-operator && go build ./cmd/cluster-image-registry-operator
---> Running in 5f596199bb25
---> beb23f9b901d
Removing intermediate container 5f596199bb25
Step 4/9 : FROM centos:7
Trying to pull repository registry.access.redhat.com/centos ...
Trying to pull repository docker.io/library/centos ...
sha256:6f6d986d425aeabdc3a02cb61c02abb2e78e57357e92417d6d58332856024faf: Pulling from docker.io/library/centos
256b176beaff: Already exists
Digest: sha256:6f6d986d425aeabdc3a02cb61c02abb2e78e57357e92417d6d58332856024faf
Status: Downloaded newer image for docker.io/centos:7
---> 5182e96772bf
Step 5/9 : RUN useradd cluster-image-registry-operator
---> Running in 0af16ee1240f
---> af4ca97226ce
Removing intermediate container 0af16ee1240f
Step 6/9 : USER cluster-image-registry-operator
---> Running in 6c2ead775a45
---> 860d888ecc10
Removing intermediate container 6c2ead775a45
Step 7/9 : COPY --from=0 /go/src/github.com/openshift/cluster-image-registry-operator /usr/bin
Unknown flag: from
make: *** [Makefile:7: build-image] Error 1
$ docker version
Client:
Version: 1.13.1
API version: 1.26
Package version: docker-1.13.1-60.git9cb56fd.fc28.x86_64
Go version: go1.10.3
Git commit: bdb8293-unsupported
Built: Sun Jul 8 08:29:45 2018
OS/Arch: linux/amd64
Server:
Version: 1.13.1
API version: 1.26 (minimum version 1.12)
Package version: docker-1.13.1-60.git9cb56fd.fc28.x86_64
Go version: go1.10.3
Git commit: bdb8293-unsupported
Built: Sun Jul 8 08:29:45 2018
OS/Arch: linux/amd64
Experimental: true
Can we remove this modern option?
I deleted the registry resource and saw the registry deployment was cleaned up, but I did not see a new registry resource created until I restarted the registry operator.
=== RUN TestCustomPVC
--- FAIL: TestCustomPVC (304.29s)
pvc_test.go:79: PersistentVolume pv-q4sfdkgprvnr27pz25pfsrqgvfcqt66x766lk8m9z66qj7hczdvswfw4p6675j5j created
pvc_test.go:79: PersistentVolume pv-qn842l7shqhwnrt6l57k7ckk7d44gkv92n6nzrthwjwmlpnl8j7wm6kl9bc9b9xb created
pvc_test.go:224: timed out waiting for the condition
framework.go:44: storageclasses:
    items:
    - metadata:
        annotations:
          storageclass.kubernetes.io/is-default-class: "true"
        creationTimestamp: "2019-02-27T03:08:54Z"
        labels:
          cluster.storage.openshift.io/owner-name: cluster-config-v1
          cluster.storage.openshift.io/owner-namespace: kube-system
        name: gp2
        resourceVersion: "12396"
        selfLink: /apis/storage.k8s.io/v1/storageclasses/gp2
        uid: 04235027-3a3d-11e9-a148-0a177483df06
      parameters:
        type: gp2
      provisioner: kubernetes.io/aws-ebs
      reclaimPolicy: Delete
      volumeBindingMode: WaitForFirstConsumer
    metadata:
      resourceVersion: "40806"
      selfLink: /apis/storage.k8s.io/v1/storageclasses
framework.go:44: persistentvolumes:
    items:
    - metadata:
        annotations:
          pv.kubernetes.io/bound-by-controller: "yes"
        creationTimestamp: "2019-02-27T03:29:30Z"
        finalizers:
        - kubernetes.io/pv-protection
        name: pv-jf95fgm2v5bbzp7z5jxm2mpn2fm8kxgj5pz5r8c66kzmz222v225nr78p5kcwtw4
        resourceVersion: "37743"
        selfLink: /api/v1/persistentvolumes/pv-jf95fgm2v5bbzp7z5jxm2mpn2fm8kxgj5pz5r8c66kzmz222v225nr78p5kcwtw4
        uid: e4d85967-3a3f-11e9-a659-12bbbb61cb90
      spec:
        accessModes:
        - ReadWriteOnce
        - ReadOnlyMany
        - ReadWriteMany
        capacity:
          storage: 100Gi
        claimRef:
          apiVersion: v1
          kind: PersistentVolumeClaim
          name: image-registry-storage
          namespace: openshift-image-registry
          resourceVersion: "36217"
          uid: eb3bbf4c-3a3f-11e9-82b8-0a177483df06
        hostPath:
          path: /tmp/pv-jf95fgm2v5bbzp7z5jxm2mpn2fm8kxgj5pz5r8c66kzmz222v225nr78p5kcwtw4
          type: ""
        persistentVolumeReclaimPolicy: Retain
        storageClassName: gp2
      status:
        phase: Released
    - metadata:
        creationTimestamp: "2019-02-27T03:29:29Z"
        finalizers:
        - kubernetes.io/pv-protection
        name: pv-k76bz94w9fbsz7n6h65s87c9cw8796h6dphq2rnft7nkwr69tq7qgm62l4ktqcdb
        resourceVersion: "36055"
        selfLink: /api/v1/persistentvolumes/pv-k76bz94w9fbsz7n6h65s87c9cw8796h6dphq2rnft7nkwr69tq7qgm62l4ktqcdb
        uid: e4363455-3a3f-11e9-a659-12bbbb61cb90
      spec:
        accessModes:
        - ReadWriteOnce
        - ReadOnlyMany
        - ReadWriteMany
        capacity:
          storage: 100Gi
        hostPath:
          path: /tmp/pv-k76bz94w9fbsz7n6h65s87c9cw8796h6dphq2rnft7nkwr69tq7qgm62l4ktqcdb
          type: ""
        persistentVolumeReclaimPolicy: Retain
      status:
        phase: Available
    - metadata:
        creationTimestamp: "2019-02-27T03:31:07Z"
        finalizers:
        - kubernetes.io/pv-protection
        name: pv-q4sfdkgprvnr27pz25pfsrqgvfcqt66x766lk8m9z66qj7hczdvswfw4p6675j5j
        resourceVersion: "37818"
        selfLink: /api/v1/persistentvolumes/pv-q4sfdkgprvnr27pz25pfsrqgvfcqt66x766lk8m9z66qj7hczdvswfw4p6675j5j
        uid: 1e466d1c-3a40-11e9-a659-12bbbb61cb90
      spec:
        accessModes:
        - ReadWriteOnce
        - ReadOnlyMany
        - ReadWriteMany
        capacity:
          storage: 100Gi
        hostPath:
          path: /tmp/pv-q4sfdkgprvnr27pz25pfsrqgvfcqt66x766lk8m9z66qj7hczdvswfw4p6675j5j
          type: ""
        persistentVolumeReclaimPolicy: Retain
      status:
        phase: Available
    - metadata:
        creationTimestamp: "2019-02-27T03:31:08Z"
        finalizers:
        - kubernetes.io/pv-protection
        name: pv-qn842l7shqhwnrt6l57k7ckk7d44gkv92n6nzrthwjwmlpnl8j7wm6kl9bc9b9xb
        resourceVersion: "37832"
        selfLink: /api/v1/persistentvolumes/pv-qn842l7shqhwnrt6l57k7ckk7d44gkv92n6nzrthwjwmlpnl8j7wm6kl9bc9b9xb
        uid: 1ee7f556-3a40-11e9-a659-12bbbb61cb90
      spec:
        accessModes:
        - ReadWriteOnce
        - ReadOnlyMany
        - ReadWriteMany
        capacity:
          storage: 100Gi
        hostPath:
          path: /tmp/pv-qn842l7shqhwnrt6l57k7ckk7d44gkv92n6nzrthwjwmlpnl8j7wm6kl9bc9b9xb
          type: ""
        persistentVolumeReclaimPolicy: Retain
        storageClassName: gp2
      status:
        phase: Available
    metadata:
      resourceVersion: "40806"
      selfLink: /api/v1/persistentvolumes
framework.go:44: persistentvolumeclaims:
    items:
    - metadata:
        creationTimestamp: "2019-02-27T03:31:09Z"
        finalizers:
        - kubernetes.io/pvc-protection
        name: test-custom-pvc
        namespace: openshift-image-registry
        resourceVersion: "37842"
        selfLink: /api/v1/namespaces/openshift-image-registry/persistentvolumeclaims/test-custom-pvc
        uid: 1f876dd3-3a40-11e9-a659-12bbbb61cb90
      spec:
        accessModes:
        - ReadWriteMany
        dataSource: null
        resources:
          requests:
            storage: 1Gi
        storageClassName: gp2
      status:
        phase: Pending
    metadata:
      resourceVersion: "40806"
      selfLink: /api/v1/namespaces/openshift-image-registry/persistentvolumeclaims
imageregistry.go:146: uninstalling the image registry...
imageregistry.go:150: stopping the operator...
imageregistry.go:154: deleting the image registry resource...
We need a mechanism that will allow us to remove the registry. We have the ManagementState Removed, but there is no indication of the removal progress, and I can't delete the custom resource because the operator will re-bootstrap it.
I need this because I want to uninstall the operator and remove everything that was generated by it.
Recently we had a regression (#205). We need to have a test for AdditionalTrustedCA to avoid such regressions in the future.
Upstream deployments constantly try to roll out the latest state until explicitly asked not to; DeploymentConfigs give up unless triggered again. Given an operator-managed deploymentconfig that should be actively reconciled, you should be retrying failed deployments, since they can later succeed.
The current state in openshift/installer is a good example, where a blip in the first rollout will cause the image registry to be stuck indefinitely. Switching to a deployment will resolve this problem. Alternatively, you can rebuild the oc rollout logic.
Since the entire API has been deleted [1], we are stuck on an unsupported version in the vendor directory that cannot be updated. Of course, we can live with it for now, but eventually we need to rewrite everything using some other API.
I guess this is more a clarification request than an issue or bug, but how do internal pods manage to pull images from the registry?
I have a situation where I installed the registry operator and I can push and pull images from outside OpenShift, but when I try oc new-app namespace/image, the pods simply cannot pull the image from the registry.
First there is a certificate error (Unknown issuer), which I solved by copying the image registry certificate to /etc/docker/certs.d/<host> inside each node, but then it starts to fail with:
Failed to pull image [...] rpc error: code = Unknown desc = unauthorized: authentication required
The image registry pod's own log implies the pods are trying to pull images anonymously:
time="2019-01-15T19:12:26.189081838Z" level=error msg="OpenShift access denied: no RBAC policy
matched" go.version=go1.10.3 openshift.auth.user=anonymous vars.name=default/fedoraa
vars.reference="sha256:d79ddffdce8112f111878afcbe0205ef43e3eead131399194e2cf66fa8f3e5ed"
So my question is: what is the expected way to use the registry operator? Does it require extra configuration beyond what is in https://github.com/openshift/cluster-image-registry-operator/tree/master/deploy ?
@legionus Can we rename openshiftdockerregistries.dockerregistry.operator.openshift.io to openshiftimageregistries.imageregistry.operator.openshift.io?
Priority classes docs:
https://docs.openshift.com/container-platform/3.11/admin_guide/scheduling/priority_preemption.html#admin-guide-priority-preemption-priority-class
Example: https://github.com/openshift/cluster-monitoring-operator/search?q=priority&unscoped_q=priority
Notes: The pre-configured system priority classes (system-node-critical and system-cluster-critical) can only be assigned to pods in kube-system or openshift-* namespaces. Most likely, core operators and their pods should be assigned system-cluster-critical. Please do not assign system-node-critical (the highest priority) unless you are really sure about it.
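A sketch of applying this to the operand's pod template, following the note above (appsv1 types; the function name is illustrative):

import appsv1 "k8s.io/api/apps/v1"

// setPriorityClass assigns the registry pods a system priority class;
// this is allowed here because the deployment lives in an openshift-*
// namespace.
func setPriorityClass(d *appsv1.Deployment) {
	d.Spec.Template.Spec.PriorityClassName = "system-cluster-critical"
}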
Seems #58 is back; I'm seeing this on a BYOR control-plane cluster with no cloudprovider set:
I0111 01:01:08.021751 1 main.go:24] Cluster Image Registry Operator Version: 2e125da-dirty
I0111 01:01:08.021867 1 main.go:25] Go Version: go1.10.3
I0111 01:01:08.021872 1 main.go:26] Go OS/Arch: linux/amd64
I0111 01:01:08.049742 1 controller.go:378] waiting for informer caches to sync
I0111 01:01:09.251837 1 controller.go:387] started events processor
I0111 01:01:09.263690 1 bootstrap.go:102] generating registry custom resource
E0111 01:01:09.267640 1 storage.go:106] unknown storage backend:
E0111 01:01:09.267778 1 runtime.go:66] Observed a panic: "invalid memory address or nil pointer dereference" (runtime error: invalid memory address or nil pointer dereference)
/go/src/github.com/openshift/cluster-image-registry-operator/vendor/k8s.io/apimachinery/pkg/util/runtime/runtime.go:72
/go/src/github.com/openshift/cluster-image-registry-operator/vendor/k8s.io/apimachinery/pkg/util/runtime/runtime.go:65
/go/src/github.com/openshift/cluster-image-registry-operator/vendor/k8s.io/apimachinery/pkg/util/runtime/runtime.go:51
/usr/local/go/src/runtime/asm_amd64.s:573
/usr/local/go/src/runtime/panic.go:502
/usr/local/go/src/runtime/panic.go:63
/usr/local/go/src/runtime/signal_unix.go:388
/go/src/github.com/openshift/cluster-image-registry-operator/pkg/operator/bootstrap.go:128
/go/src/github.com/openshift/cluster-image-registry-operator/pkg/operator/controller.go:122
/go/src/github.com/openshift/cluster-image-registry-operator/pkg/operator/controller.go:213
/go/src/github.com/openshift/cluster-image-registry-operator/pkg/operator/controller.go:220
/go/src/github.com/openshift/cluster-image-registry-operator/pkg/operator/controller.go:385
/go/src/github.com/openshift/cluster-image-registry-operator/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:133
/go/src/github.com/openshift/cluster-image-registry-operator/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:134
/go/src/github.com/openshift/cluster-image-registry-operator/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:88
/usr/local/go/src/runtime/asm_amd64.s:2361
panic: runtime error: invalid memory address or nil pointer dereference [recovered]
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x18 pc=0x12918ad]
goroutine 96 [running]:
github.com/openshift/cluster-image-registry-operator/vendor/k8s.io/apimachinery/pkg/util/runtime.HandleCrash(0x0, 0x0, 0x0)
/go/src/github.com/openshift/cluster-image-registry-operator/vendor/k8s.io/apimachinery/pkg/util/runtime/runtime.go:58 +0x107
panic(0x144be20, 0x223d8d0)
/usr/local/go/src/runtime/panic.go:502 +0x229
github.com/openshift/cluster-image-registry-operator/pkg/operator.(*Controller).Bootstrap(0xc42002c160, 0xc420810480, 0x1)
/go/src/github.com/openshift/cluster-image-registry-operator/pkg/operator/bootstrap.go:128 +0x6ad
github.com/openshift/cluster-image-registry-operator/pkg/operator.(*Controller).sync(0xc42002c160, 0x17f2570, 0xc4200c7460)
/go/src/github.com/openshift/cluster-image-registry-operator/pkg/operator/controller.go:122 +0x1e6
github.com/openshift/cluster-image-registry-operator/pkg/operator.(*Controller).eventProcessor.func1(0xc42002c160, 0x13b60a0, 0x17bb930)
/go/src/github.com/openshift/cluster-image-registry-operator/pkg/operator/controller.go:213 +0x8f
github.com/openshift/cluster-image-registry-operator/pkg/operator.(*Controller).eventProcessor(0xc42002c160)
/go/src/github.com/openshift/cluster-image-registry-operator/pkg/operator/controller.go:220 +0x8e
github.com/openshift/cluster-image-registry-operator/pkg/operator.(*Controller).(github.com/openshift/cluster-image-registry-operator/pkg/operator.eventProcessor)-fm()
/go/src/github.com/openshift/cluster-image-registry-operator/pkg/operator/controller.go:385 +0x2a
github.com/openshift/cluster-image-registry-operator/vendor/k8s.io/apimachinery/pkg/util/wait.JitterUntil.func1(0xc420aa80a0)
/go/src/github.com/openshift/cluster-image-registry-operator/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:133 +0x54
github.com/openshift/cluster-image-registry-operator/vendor/k8s.io/apimachinery/pkg/util/wait.JitterUntil(0xc420aa80a0, 0x3b9aca00, 0x0, 0x9c295639fd35a501, 0xc4200b0de0)
/go/src/github.com/openshift/cluster-image-registry-operator/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:134 +0xbd
github.com/openshift/cluster-image-registry-operator/vendor/k8s.io/apimachinery/pkg/util/wait.Until(0xc420aa80a0, 0x3b9aca00, 0xc4200b0de0)
/go/src/github.com/openshift/cluster-image-registry-operator/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:88 +0x4d
created by github.com/openshift/cluster-image-registry-operator/pkg/operator.(*Controller).Run
/go/src/github.com/openshift/cluster-image-registry-operator/pkg/operator/controller.go:385 +0xc32
As a cluster operator in an enterprise environment, I'm required to encrypt all the things.
It would be nice if I could supply my own KMS key.
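A sketch of what a user-supplied key could look like via the S3 default-encryption API; how the key ID would reach the operator (a spec field, a secret, ...) is left open:

import (
	"github.com/aws/aws-sdk-go/aws"
	"github.com/aws/aws-sdk-go/service/s3"
)

// useCustomerKMSKey switches the bucket's default encryption to SSE-KMS
// with the supplied customer-managed key.
func useCustomerKMSKey(svc *s3.S3, bucket, kmsKeyID string) error {
	_, err := svc.PutBucketEncryption(&s3.PutBucketEncryptionInput{
		Bucket: aws.String(bucket),
		ServerSideEncryptionConfiguration: &s3.ServerSideEncryptionConfiguration{
			Rules: []*s3.ServerSideEncryptionRule{{
				ApplyServerSideEncryptionByDefault: &s3.ServerSideEncryptionByDefault{
					SSEAlgorithm:   aws.String(s3.ServerSideEncryptionAwsKms),
					KMSMasterKeyID: aws.String(kmsKeyID),
				},
			}},
		},
	})
	return err
}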
We don't have handlers for the fields Proxy.HTTP and Proxy.HTTPS.
The operator doesn't restore the finalizer if it was removed from the CR manually.
/kind bug
After the image registry operator pod comes up, I don't see any logs displayed (when running oc logs pod/), not even the output from printversion() that should happen.
Is there a way to set the log level with klog as well (debug vs. info)?
Am I missing something, or are the logs not working after the refactor to use klog?
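For what it's worth, klog has no named debug level; verbosity is controlled by the -v flag, which only takes effect once klog's flags are registered and parsed. A minimal sketch:

package main

import (
	"flag"

	"k8s.io/klog"
)

func main() {
	klog.InitFlags(nil) // registers -v, -logtostderr, etc. on the default FlagSet
	flag.Parse()

	klog.Info("always emitted")
	klog.V(4).Info("emitted only when the operator is started with -v=4 or higher")
	klog.Flush()
}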
similar to the following:
https://github.com/openshift/cluster-dns-operator/pull/67/files
cc @mrunalp
slack exchange:
i’d like to get some style cleanups in both the devex team operators on the progressing message
this is your primary communication channel and the intended receiver is the admin
right now it’s “The samples operator configuration is valid” and for registry it’s something slightly more machine oriented
i’d like the message to be something more affirmative like “The latest sample images and templates are installed” or something
or even shorter “All sample resources are up to date”
the style is sentence without punctuation
upper case leading, no punctuation, human intended recipient
try to avoid kube-isms and overly technical explanations

Gabe Montero [1:41 PM]:
the available condition is the one that says “Samples exist in the openshift project”

Clayton Coleman [1:42 PM]:
Progressing is the generic condition

Gabe Montero [1:42 PM]:
the failing condition is the one that says “The samples operator configuration is valid”

Clayton Coleman [1:43 PM]:
that’s where you put your human focused summary message
Progressing must represent a summary of the other conditions
so it’s fine to have the same message on progressing as available
(it’s fine to have the same messages on available)
but available and failed should explain the details as necessary, while progressing should explain to a human what the current state is
For context, see https://github.com/openshift/cluster-version-operator/blob/master/pkg/cvo/status.go, which has comments and examples of how the CVO works; this pattern is modeled on that.
openshift/cluster-version-operator#72 is docs around what to do
This came from openshift/cluster-samples-operator#69, so see that issue for additional discussion.