
embercsi / ember-csi-operator

Operator to create/configure/manage Ember CSI Driver atop Kubernetes/OpenShift

License: Apache License 2.0

Go 65.44% Dockerfile 0.32% Makefile 1.16% Shell 3.94% Python 29.14%

ember-csi-operator's People

Contributors

akrog, cschwede, danielerez, irosenzw, kirankt


ember-csi-operator's Issues

Image build fails using Docker 1.13.1

Building the operator on RHEL 7.6 with Docker 1.13.1 fails because the Dockerfile's COPY command uses the '--from' flag, which requires multi-stage build support introduced in Docker 17.05.

--
Step 13/13 : COPY --from=0 /go/src/github.com/embercsi/ember-csi-operator/build/ember-csi-operator /usr/local/bin/ember-csi-operator
Unknown flag: from
make: *** [build] Error 1

[root@kt-c7kb7 ember-csi-operator]# docker version
Client:
Version: 1.13.1
API version: 1.26
Package version: docker-1.13.1-88.git07f3374.el7.x86_64
Go version: go1.10.2
Git commit: 07f3374/1.13.1
Built: Thu Dec 6 07:01:49 2018
OS/Arch: linux/amd64

Server:
Version: 1.13.1
API version: 1.26 (minimum version 1.12)
Package version: docker-1.13.1-88.git07f3374.el7.x86_64
Go version: go1.10.2
Git commit: 07f3374/1.13.1
Built: Thu Dec 6 07:01:49 2018
OS/Arch: linux/amd64
Experimental: false
[root@kt-c7kb7 ember-csi-operator]# uname -a
Linux kt-c7kb7.cloud.lab.eng.bos.redhat.com 3.10.0-957.1.3.el7.x86_64 #1 SMP Thu Nov 15 17:36:42 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux
[root@kt-c7kb7 ember-csi-operator]# cat /etc/os-release
NAME="Red Hat Enterprise Linux Server"
VERSION="7.6 (Maipo)"
ID="rhel"
ID_LIKE="fedora"
VARIANT="Server"
VARIANT_ID="server"
VERSION_ID="7.6"
PRETTY_NAME="Employee SKU"
ANSI_COLOR="0;31"
CPE_NAME="cpe:/o:redhat:enterprise_linux:7.6:GA:server"
HOME_URL="https://www.redhat.com/"
BUG_REPORT_URL="https://bugzilla.redhat.com/"

REDHAT_BUGZILLA_PRODUCT="Red Hat Enterprise Linux 7"
REDHAT_BUGZILLA_PRODUCT_VERSION=7.6
REDHAT_SUPPORT_PRODUCT="Red Hat Enterprise Linux"
REDHAT_SUPPORT_PRODUCT_VERSION="7.6"

Operator form incorrectly handling Float values

The current operator form parses float values in the driver configuration as text, so they reach the Ember-CSI code as str. This breaks drivers that use floats in their configuration and reasonably assume they receive a float rather than a str.

As an example, issue #166 in the Ember-CSI repository fails because of the vmware_task_poll_interval configuration option.

The reason we were converting to text is that OLM has no float type; it only has the number type, and using that for floats truncates the value, dropping the decimal portion.

A possible solution is to add a new transform function to the operator's existing ones that converts text to a float, add that transform to the generator, and then use it for the float configuration options.

That way Ember-CSI will be able to receive a float instance instead of a string.
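
A minimal sketch of such a transform, assuming the transforms are plain string functions (the name and signature here are illustrative, not the actual generator API):

    import (
        "fmt"
        "strconv"
    )

    // toFloat is a hypothetical transform: it parses the text value coming
    // from the OLM form and returns it as a float64, so options such as
    // vmware_task_poll_interval reach Ember-CSI with the right type.
    func toFloat(value string) (float64, error) {
        f, err := strconv.ParseFloat(value, 64)
        if err != nil {
            return 0, fmt.Errorf("config value %q is not a valid float: %v", value, err)
        }
        return f, nil
    }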

Images are based on backend names

The current code selects the container image using the backend name as the key:

    Image:   Conf.getDriverImage(ecsi.Spec.Backend),

It should be based on the driver to be used, as some drivers may need a custom container with specific packages.
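
A sketch of the driver-based lookup, assuming X_CSI_BACKEND_CONFIG is the JSON string that shows up in the operator's error messages (the surrounding code is illustrative):

    // Parse the driver name out of the backend config and use it as the
    // image key instead of the backend name.
    var backendConf struct {
        Driver string `json:"driver"` // e.g. "RBD"
    }
    if err := json.Unmarshal([]byte(ecsi.Spec.Config.EnvVars.X_CSI_BACKEND_CONFIG), &backendConf); err != nil {
        return err
    }
    image := Conf.getDriverImage(backendConf.Driver)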

Allow pinning the controller

For the LVM backend we need to be able to pin the controller pod to a specific node.
The operator should support this.
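
A possible shape for this, assuming a hypothetical nodeSelector field on the EmberCSI CR that gets copied into the controller StatefulSet's pod template:

    // Pin the controller pod when the CR asks for it (Spec.NodeSelector
    // is a hypothetical field).
    if len(ecsi.Spec.NodeSelector) > 0 {
        // e.g. {"kubernetes.io/hostname": "node01"}
        ss.Spec.Template.Spec.NodeSelector = ecsi.Spec.NodeSelector
    }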

Missing mountpoints

There are a couple of mountpoints that are missing on the nodes:

  • /var/lib/iscsi
  • /run/udev

Among other things, we need these to share the iSCSI nodes with the host, which ensures we still see them after the container restarts.
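
A sketch of the extra hostPath volumes for the node pods (matching VolumeMounts with the same paths would go into the Ember-CSI container):

    import corev1 "k8s.io/api/core/v1"

    extraVolumes := []corev1.Volume{
        {
            Name: "iscsi-dir",
            VolumeSource: corev1.VolumeSource{
                HostPath: &corev1.HostPathVolumeSource{Path: "/var/lib/iscsi"},
            },
        },
        {
            Name: "run-udev",
            VolumeSource: corev1.VolumeSource{
                HostPath: &corev1.HostPathVolumeSource{Path: "/run/udev"},
            },
        },
    }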

Use CRD as the default persistence

Instead of forcing every deployment to define CRD as the default persistence, we should set this default in the operator.

Since the operator is storing data in the "ember-csi" namespace, we know that it exists, so we can tell Ember-CSI to use it instead of the default one: X_CSI_PERSISTENCE_CONFIG: '{"storage":"crd","namespace":"ember-csi"}'
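
A sketch of the defaulting logic, assuming the env vars end up as corev1.EnvVar entries on the driver container:

    // Default to CRD persistence in the operator's namespace when the CR
    // doesn't set X_CSI_PERSISTENCE_CONFIG explicitly.
    const defaultPersistence = `{"storage":"crd","namespace":"ember-csi"}`

    if ecsi.Spec.Config.EnvVars.X_CSI_PERSISTENCE_CONFIG == "" {
        envVars = append(envVars, corev1.EnvVar{
            Name:  "X_CSI_PERSISTENCE_CONFIG",
            Value: defaultPersistence,
        })
    }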

Stop using kirankt's images

There are multiple places in the repository where we use the kirankt registry. We should change these to Docker Hub's embercsi:

  • In the Makefile: REPO?=quay.io/kirankt/ember-csi-operator
  • In the install manifest: quay.io/kirankt/ember-csi-operator:v0.0.3
  • In the README file: quay.io/kirankt/ember-csi-operator:0.0.3

The same image is available as embercsi/ember-csi-operator:v0.0.3

Operator fails if X_CSI_PERSISTENCE_CONFIG is not set

The operator should use a default if X_CSI_PERSISTENCE_CONFIG is not set. However, the default includes some invalid JSON and therefore it fails with the following error:

E0203 08:41:07.939574 1 reflector.go:134] sigs.k8s.io/controller-runtime/pkg/cache/internal/informers_map.go:126: Failed to list *v1alpha1.EmberCSI: v1alpha1.EmberCSIList.Items: []v1alpha1.EmberCSI: v1alpha1.EmberCSI.Spec: v1alpha1.EmberCSISpec.Config: v1alpha1.EmberCSIConfig.EnvVars: v1alpha1.EnvVars.X_CSI_BACKEND_CONFIG: ReadString: expects " or n, but found {, error found in #10 byte of ...|_CONFIG":{"driver":"|..., bigger context ...|ec":{"config":{"envVars":{"X_CSI_BACKEND_CONFIG":{"driver":"RBD","name":"rbd","rbd_ceph_conf":"/etc/|...

Operator must handle RBAC, ServiceAccounts

Currently we piggy-back on the 'ember-csi-operator' service account and its wide-open RBAC to get a working deployment. Ideally 'ember-csi-operator' should only have the permissions needed to run the Operator itself, and the Operator should create the necessary RBAC and service accounts dynamically for each deployment.
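
A sketch of what the dynamic creation could look like from the reconcile loop (names are illustrative):

    import (
        "context"

        corev1 "k8s.io/api/core/v1"
        metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
        "sigs.k8s.io/controller-runtime/pkg/client"
    )

    // ensureServiceAccount creates a per-deployment ServiceAccount instead
    // of reusing the operator's own wide-open one. A Role/RoleBinding
    // limited to what the driver actually needs would be created the same way.
    func ensureServiceAccount(c client.Client, name, namespace string) error {
        sa := &corev1.ServiceAccount{
            ObjectMeta: metav1.ObjectMeta{
                Name:      name,
                Namespace: namespace,
            },
        }
        return c.Create(context.TODO(), sa)
    }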

serviceaccount "ember-csi-operator" not found

Hi @kirankt ,

Following the document at https://github.com/kirankt/ember-csi-operator, the external-ceph-node and external-ceph-controller pods can't be created.

[cloud-user@cnv-executor-qwang-1016-master1 ember-csi-operator]$ oc get all
NAME                                      READY     STATUS    RESTARTS   AGE
pod/ember-csi-operator-59dbb585db-ckwd4   1/1       Running   0          47m

NAME                         TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)     AGE
service/ember-csi-operator   ClusterIP   172.30.246.142   <none>        60000/TCP   47m

NAME                                DESIRED   CURRENT   READY     UP-TO-DATE   AVAILABLE   NODE SELECTOR   AGE
daemonset.apps/external-ceph-node   0         0         0         0            0           <none>          4m

NAME                                 DESIRED   CURRENT   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/ember-csi-operator   1         1         1            1           47m

NAME                                            DESIRED   CURRENT   READY     AGE
replicaset.apps/ember-csi-operator-59dbb585db   1         1         1         47m

NAME                                        DESIRED   CURRENT   AGE
statefulset.apps/external-ceph-controller   1         0         4m


[cloud-user@cnv-executor-qwang-1016-master1 ember-csi-operator]$ oc describe daemonset.apps/external-ceph-node | grep -A10 Events
Events:
  Type     Reason        Age                From                  Message
  ----     ------        ----               ----                  -------
  Warning  FailedCreate  1m (x20 over 12m)  daemonset-controller  Error creating: pods "external-ceph-node-" is forbidden: error looking up service account ember-csi/ember-csi-operator: serviceaccount "ember-csi-operator" not found

[cloud-user@cnv-executor-qwang-1016-master1 ember-csi-operator]$ oc describe statefulset.apps/external-ceph-controller | grep -A10 Events
Events:
  Type     Reason        Age                 From                    Message
  ----     ------        ----                ----                    -------
  Warning  FailedCreate  54s (x37 over 11m)  statefulset-controller  create Pod external-ceph-controller-0 in StatefulSet external-ceph-controller failed error: pods "external-ceph-controller-0" is forbidden: error looking up service account ember-csi/ember-csi-operator: serviceaccount "ember-csi-operator" not found

[cloud-user@cnv-executor-qwang-1016-master1 ember-csi-operator]$ oc get sa
NAME                SECRETS   AGE
builder             2         50m
csi-controller-sa   2         50m
csi-node-sa         2         50m
default             2         50m
deployer            2         50m

Ember-CSI pod should share IPC with the host

When using multipath, if we don't share the IPC of the Ember-CSI container running on the nodes, we will have problems detaching the volumes, and it will take 5 minutes for the operation to "complete".
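
The fix should be a matter of setting this on the node DaemonSet's pod spec (sketch; the SCC or PodSecurityPolicy used by the node service account must also allow it):

    // Share the host IPC namespace so multipath doesn't hang for ~5
    // minutes when detaching volumes.
    ds.Spec.Template.Spec.HostIPC = true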

RegisterPlugin error -- failed to get plugin info using RPC GetInfo at socket /var/lib/kubelet/plugins/ember-csi.io/csi.sock, err: rpc error: code = Unimplemented desc = Method not found!

As reported on Ember-CSI's repository (embercsi/ember-csi#153), on Kubernetes nodes since v1.15 we have started seeing these continual errors:

Jan 19 13:19:21 kube23.foo.com kubelet: I0119 13:19:21.844405 2107 operation_generator.go:193] parsed scheme: ""
Jan 19 13:19:21 kube23.foo.com kubelet: I0119 13:19:21.844431 2107 operation_generator.go:193] scheme "" not registered, fallback to default scheme
Jan 19 13:19:21 kube23.foo.com kubelet: I0119 13:19:21.844449 2107 passthrough.go:48] ccResolverWrapper: sending update to cc: {[{/var/lib/kubelet/plugins/ember-csi.io/csi.sock 0 }] }
Jan 19 13:19:21 kube23.foo.com kubelet: I0119 13:19:21.848196 2107 clientconn.go:577] ClientConn switching balancer to "pick_first"
Jan 19 13:19:21 kube23.foo.com kubelet: E0119 13:19:21.850493 2107 goroutinemap.go:150] Operation for "/var/lib/kubelet/plugins/ember-csi.io/csi.sock" failed. No retries permitted until 2020-01-19 13:21:23.850457201 -0500 EST m=+1109229.205664694 (durationBeforeRetry 2m2s). Error: "RegisterPlugin error -- failed to get plugin info using RPC GetInfo at socket /var/lib/kubelet/plugins/ember-csi.io/csi.sock, err: rpc error: code = Unimplemented desc = Method not found!"

This is on a system that currently doesn't have any pvc mounts. Rebooting the system will clear it up for a time but then it will start to happen again.

Kubernetes 1.15 introduced a change in the plugin manager that keeps retrying until the registration of every plugin (socket files under /var/lib/kubelet/{plugins,plugins_registry}) succeeds. That wasn't the case in previous releases, where we would only see the error once.

We need to change the location of the CSI socket used for communication between the CSI plugin and its sidecars.

The operator is creating the volume for the nodes as in Ember-CSI's examples https://github.com/embercsi/ember-csi-operator/blame/8c6a833acb97d433b6c1841318ace989b2fe0250/pkg/controller/embercsi/node.go#L156:

fmt.Sprintf("%s/%s/%s", "--kubelet-registration-path=/var/lib/kubelet/plugins", GetPluginDomainName(ecsi.Name), "csi.sock"),

And like the doc's manifest at https://kubernetes-csi.github.io/docs/deploying.html#driver-volume-mounts:

      # This volume is where the socket for kubelet->driver communication is done
      - name: socket-dir
        hostPath:
          path: /var/lib/kubelet/plugins/<driver-name>
          type: DirectoryOrCreate

Though the picture (https://kubernetes-csi.github.io/docs/images/kubelet.png) refers to a different directory: /var/lib/kubelet/<driver-name>

I believe we should be changing the nodes manifest to avoid these errors.

We can use an empty dir, or we can create the sockets under /var/lib/kubelet/.
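
A sketch of the emptyDir variant (only node-driver-registrar still needs a hostPath, for the registration socket kubelet watches under /var/lib/kubelet/plugins_registry/):

    // Keep the plugin<->sidecar socket off the kubelet plugins directory
    // by using a pod-private emptyDir.
    socketDir := corev1.Volume{
        Name: "socket-dir",
        VolumeSource: corev1.VolumeSource{
            EmptyDir: &corev1.EmptyDirVolumeSource{},
        },
    }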

Operator's clusterrole must be able to manipulate 'csinode' resource in storage.k8s.io API group

Kubelet complains that the operator's serviceaccount cannot query/update the 'csinode' resource in the 'storage.k8s.io' API group. Kubelet error:

Aug 28 22:11:16 node1.example.com dockerd-current[1274]: E0829 02:11:16.820017 1 reflector.go:126] k8s.io/client-go/informers/factory.go:133: Failed to list *v1beta1.CSINode: csinodes.storage.k8s.io is forbidden: User "system:serviceaccount:ember-csi:ember-csi-operator" cannot list resource "csinodes" in API group "storage.k8s.io" at the cluster scope

The fix should be easy: add the csinode resource to the 'ember-csi-operator' clusterrole.
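
A sketch of the extra rule, expressed as the Go type the operator would use if it creates the clusterrole itself (the verb list is a guess at what the components need):

    import rbacv1 "k8s.io/api/rbac/v1"

    // Extra rule for the ember-csi-operator clusterrole.
    csinodeRule := rbacv1.PolicyRule{
        APIGroups: []string{"storage.k8s.io"},
        Resources: []string{"csinodes"},
        Verbs:     []string{"get", "list", "watch", "update"},
    }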

Don't mount LVM directories for other drivers

We don't need to mount the LVM directories if we are not using the LVM driver:

		},{
			MountPath: "/etc/lvm",
			Name: "lvm-dir",
			MountPropagation: &bidirectional,
		},{
			MountPath: "/var/lock/lvm",
			Name: "lvm-lock",
			MountPropagation: &bidirectional,
		},{
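
A sketch of the guarded version (isLVMDriver is an illustrative helper that would inspect the configured driver):

    if isLVMDriver(ecsi) {
        volumeMounts = append(volumeMounts,
            corev1.VolumeMount{
                MountPath:        "/etc/lvm",
                Name:             "lvm-dir",
                MountPropagation: &bidirectional,
            },
            corev1.VolumeMount{
                MountPath:        "/var/lock/lvm",
                Name:             "lvm-lock",
                MountPropagation: &bidirectional,
            },
        )
    }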

Operator not reconciling StorageClass and VolumeSnapshotClass

If the StorageClass and VolumeSnapshotClass objects are removed after they are initially created, they are not being re-created by the Operator. However, if either the StatefulSet or DaemonSet is removed, these get recreated immediately in addition to any missing StorageClass and VolumeSnapshotClass objects.

The correct Operator behaviour should be to immediately recreate StorageClass and VolumeSnapshotClass after they are removed.
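
A sketch of the missing check in the reconcile loop (r.client, scName and newStorageClass are illustrative; the VolumeSnapshotClass would get the same treatment):

    sc := &storagev1.StorageClass{}
    err := r.client.Get(context.TODO(), types.NamespacedName{Name: scName}, sc)
    if apierrors.IsNotFound(err) {
        // The StorageClass was deleted out from under us: recreate it.
        if err := r.client.Create(context.TODO(), newStorageClass(ecsi)); err != nil {
            return reconcile.Result{}, err
        }
    }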

Missing directory /var/lib/ember-csi/vols

Commit cd81ebc added support for the shared lock directory; however, this requires an additional subdirectory, or pods will fail to start:

Warning FailedMount 51s (x8 over 2m) kubelet, node01 MountVolume.MountDevice failed for volume "pvc-67f73de2208311e9" : rpc error: code = Unknown desc = Exception calling application: [Errno 2] No such file or directory: u'/var/lib/ember-csi/vols/466e0060-a518-4f57-9f2a-cbe95c9629de'

After creating /var/lib/ember-csi/vols on node01 it works.
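
Rather than documenting a manual mkdir, the operator could let kubelet create the directory by typing the hostPath volume (sketch):

    dirOrCreate := corev1.HostPathDirectoryOrCreate
    volsDir := corev1.Volume{
        Name: "ember-csi-vols",
        VolumeSource: corev1.VolumeSource{
            HostPath: &corev1.HostPathVolumeSource{
                Path: "/var/lib/ember-csi/vols",
                Type: &dirOrCreate, // kubelet creates it if missing
            },
        },
    }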

Remove deprecated parameters

There are some deprecated parameters in use, for example with the external-provisioner sidecar:

Warning: option provisioner="my-ceph.ember-csi.io" is deprecated and has no effect

The same applies to connection-timeout when using external-attacher sidecar versions >= 2.0.

Volume mount fails on OCP 4.2 due to missing directory

This is due to not using /var/lib/kubelet/plugins/ on OCP >= 4.2.

MountVolume.MountDevice failed for volume "pvc-8ba199ce-41f1-11ea-9720-52fdfc072182" : rpc error: code = InvalidArgument desc = Parent staging directory for /var/lib/kubelet/plugins/kubernetes.io/csi/pv/pvc-8ba199ce-41f1-11ea-9720-52fdfc072182/globalmount/stage doesn't exist: [Errno 2] No such file or directory: '/var/lib/kubelet/plugins/kubernetes.io/csi/pv/pvc-8ba199ce-41f1-11ea-9720-52fdfc072182/globalmount'

Deployment fails when host is missing /etc/localtime

According to Ember-CSI issue #167, the Operator's deployment will fail if the host is missing the file /etc/localtime.

We assumed that all hosts would have that file, but it looks like OKD 4.5 beta (Fedora CoreOS) is actually missing it in some cases:

$ ls /etc/localtime
ls: cannot access '/etc/localtime': No such file or directory

We'll see the following misleading error:

Error: container create failed: time="2020-06-24T06:40:26Z" level=warning msg="exit status 1" time="2020-06-24T06:40:26Z" level=error msg="container_linux.go:349: starting container process caused \"process_linux.go:449: container init caused \\\"rootfs_linux.go:58: mounting \\\\\\\"/etc/localtime\\\\\\\" to rootfs \\\\\\\"/var/lib/containers/storage/overlay/c87616da1d0f51f436eacf9e97bc4622c0285aad28edbcc08a1ec7283d7f930c/merged\\\\\\\" at \\\\\\\"/var/lib/containers/storage/overlay/c87616da1d0f51f436eacf9e97bc4622c0285aad28edbcc08a1ec7283d7f930c/merged/usr/share/zoneinfo/UTC\\\\\\\" caused \\\\\\\"not a directory\\\\\\\"\\\"\"" container_linux.go:349: starting container process caused "process_linux.go:449: container init caused \"rootfs_linux.go:58: mounting \\\"/etc/localtime\\\" to rootfs \\\"/var/lib/containers/storage/overlay/c87616da1d0f51f436eacf9e97bc4622c0285aad28edbcc08a1ec7283d7f930c/merged\\\" at \\\"/var/lib/containers/storage/overlay/c87616da1d0f51f436eacf9e97bc4622c0285aad28edbcc08a1ec7283d7f930c/merged/usr/share/zoneinfo/UTC\\\" caused \\\"not a directory\\\"\""

Deleting the /etc/localtime volume from the StatefulSet and DaemonSet solves the problem, but that's not a convenient way of doing it.

The operator should have a way to disable mounting this host file on systems that don't have it, even if the logs will then lack the right timestamps.
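
A sketch of such a toggle, assuming a hypothetical boolean field on the CR (defaulting to true to preserve the current behaviour):

    // Only mount /etc/localtime when the host actually has it.
    if ecsi.Spec.MountLocaltime { // hypothetical field
        volumes = append(volumes, corev1.Volume{
            Name: "localtime",
            VolumeSource: corev1.VolumeSource{
                HostPath: &corev1.HostPathVolumeSource{Path: "/etc/localtime"},
            },
        })
    }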

Automatically build images

Images should be automatically built on pushes to master and when tagging versions.

This will be easier once we resolve issue #9, since we can set up Docker Hub to do this automatically.

Change service names

For rolling upgrades/updates we'll need to have multiple instances of Ember-CSI running against the same backend, so we cannot have a fixed name for the StatefulSet and the DaemonSet of each backend, as that would create a conflict in Kubernetes.

The name could use the hash of the image that's going to be used, some metadata from the image, or the vendor_version reported by Ember-CSI once it has a finer-grained vendor_version (embercsi/ember-csi#108).
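
For the image-hash option, a minimal sketch of the name generation:

    import (
        "crypto/sha256"
        "fmt"
    )

    // objectName derives a per-version suffix from the driver image so
    // several generations of the StatefulSet/DaemonSet can coexist during
    // a rolling upgrade.
    func objectName(backend, image string) string {
        sum := sha256.Sum256([]byte(image))
        return fmt.Sprintf("%s-controller-%x", backend, sum[:4])
    }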

Update socket-dir

Originally the socket-dir for the pods had to be at /var/lib/kubelet/plugins/<driver-name>/csi.sock, as described in the documentation, but this is no longer the case. Now sidecars accept an argument with the address to use to connect to the CSI plugin, so we can use an emptyDir instead of a fixed path.

This will be required for rolling upgrades, as we'll have multiple instances of the plugin running simultaneously while we update/upgrade.

We should update the operator to use emptyDir when the sidecar containers allow it, to support rolling upgrades.
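
A sketch of the wiring, reusing the /csi-data path that already shows up in the provisioner logs (--csi-address is the flag the CSI sidecars take; CSI_ENDPOINT is the usual plugin-side setting):

    // Sidecar container args: reach the plugin through the shared
    // emptyDir instead of a fixed kubelet path.
    args := []string{"--csi-address=/csi-data/csi.sock"}

    // Matching endpoint for the Ember-CSI container itself.
    endpoint := corev1.EnvVar{
        Name:  "CSI_ENDPOINT",
        Value: "unix:///csi-data/csi.sock",
    }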

Graceful shutdown

We need to ensure operator actions have finished successfully before shutting down the operator, e.g. when receiving a SIGTERM. A minimal sketch follows the todo list below.

Todo:

  1. Send a notification to a channel once all operations are done
  2. Add a goroutine & channels to block exiting until notification sent
  3. Test timeouts with OLM and adopt them as needed

Node pod is not created

The node pod is not created, likely due to a wrong HostIPC setting.

For example, when creating a pod using a pvc it will fail with the following error:

kubectl -n sample-project describe pod/busybox-sleep
[...]
AttachVolume.Attach failed for volume "pvc-a2a48c97208111e9" : node "node01" has no NodeID annotation

Looking at the pods running in ember-csi, the "external-ceph-node-*" is missing:

kubectl -n ember-csi get pods
NAME READY STATUS RESTARTS AGE
ember-csi-operator-68844f4988-qfjd6 1/1 Running 0 4m
external-ceph-controller-0 3/3 Running 0 3m

And looking at the DaemonSet it fails due to HostIPC not being allowed:

kubectl -n ember-csi describe daemonset

Warning FailedCreate 59s (x16 over 3m) daemonset-controller Error creating: pods "external-ceph-node-" is forbidden: unable to validate against any security context constraint: [...] spec.containers[0].securityContext.hostIPC: Invalid value: true: Host IPC is not allowed to be used

error: unable to recognize "deploy/examples/lvmdriver.yaml": no matches for kind "EmberCSI" in version "ember-csi.io/v1alpha1"

I am following the README in the devel branch and I am unable to proceed with the deployment when attempting:

oc create -f deploy/examples/lvmdriver.yaml
error: unable to recognize "deploy/examples/lvmdriver.yaml": no matches for kind "EmberCSI" in version "ember-csi.io/v1alpha1"
oc version 
Client Version: 4.4.0-0.nightly-2020-02-17-022408
Server Version: 4.5.0-0.nightly-2020-05-17-163339
Kubernetes Version: v1.18.2+ee20b51

deploy/uninstall.yml needs update

Hi @kirankt ,

Ember CSI operator related resources can't be removed by uninstall.yml. Did I use it correctly?

[cloud-user@cnv-executor-qwang-1016-master1 ember-csi-operator]$ oc create -f deploy/uninstall.yml -n ember-csi
Error from server (Invalid): error when creating "deploy/uninstall.yml": Deployment.apps "ember-csi-operator" is invalid: [spec.selector: Required value, spec.template.metadata.labels: Invalid value: map[string]string(nil): `selector` does not match template `labels`, spec.template.spec.containers: Required value]
Error from server (Invalid): error when creating "deploy/uninstall.yml": Service "ember-csi-operator" is invalid: spec.ports: Required value
Error from server (Invalid): error when creating "deploy/uninstall.yml": CustomResourceDefinition.apiextensions.k8s.io "embercsis.ember-csi.io" is invalid: [metadata.name: Invalid value: "embercsis.ember-csi.io": must be spec.names.plural+"."+spec.group, spec.group: Required value, spec.versions: Invalid value: []apiextensions.CustomResourceDefinitionVersion(nil): must have exactly one version marked as storage version, spec.names.plural: Required value, spec.names.singular: Required value, spec.names.kind: Required value, spec.names.listKind: Required value, status.storedVersions: Invalid value: []string(nil): must have at least one stored version]
Error from server (AlreadyExists): error when creating "deploy/uninstall.yml": roles.rbac.authorization.k8s.io "ember-csi-operator" already exists
Error from server (Invalid): error when creating "deploy/uninstall.yml": RoleBinding.rbac.authorization.k8s.io "ember-csi-operator-rb" is invalid: [roleRef.kind: Unsupported value: "": supported values: "Role", "ClusterRole", roleRef.name: Required value]
Error from server (AlreadyExists): error when creating "deploy/uninstall.yml": clusterroles.rbac.authorization.k8s.io "ember-csi-controller-cr" already exists
Error from server (AlreadyExists): error when creating "deploy/uninstall.yml": clusterrolebindings.rbac.authorization.k8s.io "ember-csi-controller-rb" already exists
Error from server (AlreadyExists): error when creating "deploy/uninstall.yml": clusterroles.rbac.authorization.k8s.io "ember-csi-node-cr" already exists
Error from server (AlreadyExists): error when creating "deploy/uninstall.yml": clusterrolebindings.rbac.authorization.k8s.io "ember-csi-node-rb" already exists
Error from server (AlreadyExists): error when creating "deploy/uninstall.yml": serviceaccounts "csi-controller-sa" already exists
Error from server (AlreadyExists): error when creating "deploy/uninstall.yml": serviceaccounts "csi-node-sa" already exists
Error from server (Invalid): error when creating "deploy/uninstall.yml": SecurityContextConstraints.security.openshift.io "ember-csi-scc" is invalid: [runAsUser.type: Invalid value: "": invalid strategy type.  Valid values are MustRunAs, MustRunAsNonRoot, MustRunAsRange, RunAsAny, seLinuxContext.type: Invalid value: "": invalid strategy type.  Valid values are MustRunAs, RunAsAny]

StorageBackend vSphere Failed as Operator in OKD 4.5

OKD 4.5 beta (Fedora CoreOS)
Ember-csi operator

When I add a new storage backend with the vSphere driver, all the pods start failing:

Error: container create failed: time="2020-06-24T06:40:26Z" level=warning msg="exit status 1" time="2020-06-24T06:40:26Z" level=error msg="container_linux.go:349: starting container process caused \"process_linux.go:449: container init caused \\\"rootfs_linux.go:58: mounting \\\\\\\"/etc/localtime\\\\\\\" to rootfs \\\\\\\"/var/lib/containers/storage/overlay/c87616da1d0f51f436eacf9e97bc4622c0285aad28edbcc08a1ec7283d7f930c/merged\\\\\\\" at \\\\\\\"/var/lib/containers/storage/overlay/c87616da1d0f51f436eacf9e97bc4622c0285aad28edbcc08a1ec7283d7f930c/merged/usr/share/zoneinfo/UTC\\\\\\\" caused \\\\\\\"not a directory\\\\\\\"\\\"\"" container_linux.go:349: starting container process caused "process_linux.go:449: container init caused \"rootfs_linux.go:58: mounting \\\"/etc/localtime\\\" to rootfs \\\"/var/lib/containers/storage/overlay/c87616da1d0f51f436eacf9e97bc4622c0285aad28edbcc08a1ec7283d7f930c/merged\\\" at \\\"/var/lib/containers/storage/overlay/c87616da1d0f51f436eacf9e97bc4622c0285aad28edbcc08a1ec7283d7f930c/merged/usr/share/zoneinfo/UTC\\\" caused \\\"not a directory\\\"\""

After deleting the '/etc/localtime' volume from the StatefulSet and DaemonSet, the pods start successfully.

Add support for Topology

CSI spec v1.0 supports topology. This needs to be incorporated into the operator, ensuring it works with X_CSI_TOPOLOGIES and X_CSI_NODE_TOPOLOGY in the Ember-CSI driver.
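
The operator side could be as simple as forwarding the values from the CR (sketch; the Spec fields are hypothetical, the env var names are the Ember-CSI ones mentioned above):

    topologyEnv := []corev1.EnvVar{
        {Name: "X_CSI_TOPOLOGIES", Value: ecsi.Spec.Topologies},      // controller pods
        {Name: "X_CSI_NODE_TOPOLOGY", Value: ecsi.Spec.NodeTopology}, // node pods
    }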

No pv is created after deploy the ember-csi-operator

Deploy ember-csi-operator to OCP according to https://github.com/embercsi/ember-csi-operator
NAME VERSION AVAILABLE PROGRESSING SINCE STATUS
version 4.3.0-0.nightly-2019-12-29-173422 True False 28h Cluster version is 4.3.0-0.nightly-2019-12-29-173422

1. After deployment, ember-csi-operator is running:
pod/ember-csi-operator-7cfb6b9c67-cqlkl 1/1 Running 0 22h
pod/my-ceph-controller-0 5/5 Running 0 9h
pod/my-ceph-node-0-hf9tr 2/2 Running 0 9h
pod/my-ceph-node-0-pjhpr 2/2 Running 0 9h
oc get pod/my-ceph-controller-0 -o yaml | grep image
image: embercsi/ember-csi:master
imagePullPolicy: Always
image: quay.io/k8scsi/csi-attacher:v1.1.1
imagePullPolicy: IfNotPresent
image: quay.io/k8scsi/csi-provisioner:v1.1.0
imagePullPolicy: IfNotPresent
image: quay.io/k8scsi/csi-cluster-driver-registrar:v1.0.1
imagePullPolicy: IfNotPresent
image: quay.io/k8scsi/csi-snapshotter:v1.1.0
imagePullPolicy: IfNotPresent
imagePullSecrets:
image: quay.io/k8scsi/csi-cluster-driver-registrar:v1.0.1
imageID: quay.io/k8scsi/csi-cluster-driver-registrar@sha256:fafd75ae5442f192cfa8c2e792903aee30d5884b62e802e4464b0a895d21e3ef
image: docker.io/embercsi/ember-csi:master
imageID: docker.io/embercsi/ember-csi@sha256:95ca3849471d65bb9500ad9169fdffccc3c31e468d40de33a746e38418150069
image: quay.io/k8scsi/csi-attacher:v1.1.1
imageID: quay.io/k8scsi/csi-attacher@sha256:e4db94969e1d463807162a1115192ed70d632a61fbeb3bdc97b40fe9ce78c831
image: quay.io/k8scsi/csi-provisioner:v1.1.0
imageID: quay.io/k8scsi/csi-provisioner@sha256:9828f32a1b350bef5f813857c2a3223e8aec79a9762bd78545eaea8fa79735d1
image: quay.io/k8scsi/csi-snapshotter:v1.1.0
imageID: quay.io/k8scsi/csi-snapshotter@sha256:a49e0da1af6f2bf717e41ba1eee8b5e6a1cbd66a709dd92cc43fe475fe2589eb

  2. Create a PVC and an app to test, but no PV is created:

    oc describe pvc ember-csi-pvc
    Name:          ember-csi-pvc
    Namespace:     demoapp
    StorageClass:  my-ceph.ember-csi.io-sc
    Status:        Pending
    Volume:
    Labels:
    Annotations:   volume.beta.kubernetes.io/storage-provisioner: my-ceph.ember-csi.io
    Finalizers:    [kubernetes.io/pvc-protection]
    Capacity:
    Access Modes:
    VolumeMode:    Filesystem
    Mounted By:    my-csi-app
    Events:
      Type    Reason                Age               From                         Message
      ----    ------                ----              ----                         -------
      Normal  ExternalProvisioning  6s (x6 over 69s)  persistentvolume-controller  waiting for a volume to be created, either by external provisioner "my-ceph.ember-csi.io" or manually created by system administrator

The csi-provisioner sidecar logs:

W0102 20:18:28.503719 1 deprecatedflags.go:53] Warning: option provisioner="my-ceph.ember-csi.io" is deprecated and has no effect
I0102 20:18:28.503800 1 feature_gate.go:226] feature gates: &{map[Topology:true]}
I0102 20:18:28.503823 1 csi-provisioner.go:95] Version: v1.1.0-0-gcecb5a96
I0102 20:18:28.503839 1 csi-provisioner.go:109] Building kube configs for running in cluster...
I0102 20:18:28.586631 1 connection.go:151] Connecting to unix:///csi-data/csi.sock
I0102 20:18:32.529469 1 connection.go:261] Probing CSI driver for readiness
I0102 20:18:32.529508 1 connection.go:180] GRPC call: /csi.v1.Identity/Probe
I0102 20:18:32.529518 1 connection.go:181] GRPC request: {}
I0102 20:18:32.534075 1 connection.go:183] GRPC response: {"ready":{"value":true}}
I0102 20:18:32.534578 1 connection.go:184] GRPC error:
I0102 20:18:32.534599 1 connection.go:180] GRPC call: /csi.v1.Identity/GetPluginInfo
I0102 20:18:32.534608 1 connection.go:181] GRPC request: {}
I0102 20:18:32.538798 1 connection.go:183] GRPC response: {"manifest":{"cinder-driver":"RBDDriver","cinder-driver-supported":"True","cinder-driver-version":"1.2.0","cinder-version":"15.1.0.dev125","cinderlib-version":"1.0.1.dev3","mode":"controller","persistence":"CRDPersistence"},"name":"ember-csi.io","vendor_version":"0.9.0-44-gf161c3c+19122019161820487031859"}
I0102 20:18:32.539646 1 connection.go:184] GRPC error:
I0102 20:18:32.539662 1 csi-provisioner.go:149] Detected CSI driver ember-csi.io
I0102 20:18:32.539677 1 connection.go:180] GRPC call: /csi.v1.Identity/GetPluginCapabilities
I0102 20:18:32.539686 1 connection.go:181] GRPC request: {}
I0102 20:18:32.544832 1 connection.go:183] GRPC response: {"capabilities":[{"Type":{"Service":{"type":1}}}]}
I0102 20:18:32.545839 1 connection.go:184] GRPC error:
I0102 20:18:32.545865 1 connection.go:180] GRPC call: /csi.v1.Controller/ControllerGetCapabilities
I0102 20:18:32.545874 1 connection.go:181] GRPC request: {}
I0102 20:18:32.550529 1 connection.go:183] GRPC response: {"capabilities":[{"Type":{"Rpc":{"type":1}}},{"Type":{"Rpc":{"type":2}}},{"Type":{"Rpc":{"type":3}}},{"Type":{"Rpc":{"type":4}}},{"Type":{"Rpc":{"type":5}}},{"Type":{"Rpc":{"type":6}}},{"Type":{"Rpc":{"type":7}}},{"Type":{"Rpc":{"type":8}}}]}
I0102 20:18:32.554534 1 connection.go:184] GRPC error:
I0102 20:18:32.555593 1 controller.go:621] Using saving PVs to API server in background
I0102 20:18:32.555868 1 controller.go:769] Starting provisioner controller ember-csi.io_my-ceph-controller-0_0c2c0085-2d9d-11ea-a1d9-0a580a830013!
I0102 20:18:32.556539 1 reflector.go:123] Starting reflector *v1.PersistentVolumeClaim (15m0s) from sigs.k8s.io/sig-storage-lib-external-provisioner/controller/controller.go:800
I0102 20:18:32.556578 1 reflector.go:161] Listing and watching *v1.PersistentVolumeClaim from sigs.k8s.io/sig-storage-lib-external-provisioner/controller/controller.go:800
I0102 20:18:32.557169 1 volume_store.go:90] Starting save volume queue
I0102 20:18:32.557414 1 reflector.go:123] Starting reflector *v1.PersistentVolume (15m0s) from sigs.k8s.io/sig-storage-lib-external-provisioner/controller/controller.go:803
I0102 20:18:32.557435 1 reflector.go:161] Listing and watching *v1.PersistentVolume from sigs.k8s.io/sig-storage-lib-external-provisioner/controller/controller.go:803
I0102 20:18:32.557869 1 reflector.go:123] Starting reflector *v1.StorageClass (15m0s) from sigs.k8s.io/sig-storage-lib-external-provisioner/controller/controller.go:806
I0102 20:18:32.557895 1 reflector.go:161] Listing and watching *v1.StorageClass from sigs.k8s.io/sig-storage-lib-external-provisioner/controller/controller.go:806
I0102 20:18:32.656432 1 shared_informer.go:123] caches populated
I0102 20:18:32.656650 1 controller.go:979] Final error received, removing PVC 41ae847e-a383-4a56-a7dd-e2a2cb0e7655 from claims in progress
I0102 20:18:32.656668 1 controller.go:818] Started provisioner controller ember-csi.io_my-ceph-controller-0_0c2c0085-2d9d-11ea-a1d9-0a580a830013!
I0102 20:18:32.656673 1 controller.go:902] Provisioning succeeded, removing PVC 41ae847e-a383-4a56-a7dd-e2a2cb0e7655 from claims in progress
I0102 20:22:39.647669 1 streamwatcher.go:107] Unable to decode an event from the watch stream: http2: server sent GOAWAY and closed the connection; LastStreamID=13, ErrCode=NO_ERROR, debug=""
I0102 20:22:39.650857 1 reflector.go:370] sigs.k8s.io/sig-storage-lib-external-provisioner/controller/controller.go:803: Watch close - *v1.PersistentVolume total 0 items received
I0102 20:22:39.723217 1 streamwatcher.go:107] Unable to decode an event from the watch stream: http2: server sent GOAWAY and closed the connection; LastStreamID=13, ErrCode=NO_ERROR, debug=""
I0102 20:22:39.723276 1 reflector.go:370] sigs.k8s.io/sig-storage-lib-external-provisioner/controller/controller.go:806: Watch close - *v1.StorageClass total 0 items received
I0102 20:22:39.723704 1 streamwatcher.go:107] Unable to decode an event from the watch stream: http2: server sent GOAWAY and closed the connection; LastStreamID=13, ErrCode=NO_ERROR, debug=""
I0102 20:22:39.723729 1 reflector.go:370] sigs.k8s.io/sig-storage-lib-external-provisioner/controller/controller.go:800: Watch close - *v1.PersistentVolumeClaim total 0 items received
W0102 20:22:39.900193 1 reflector.go:289] sigs.k8s.io/sig-storage-lib-external-provisioner/controller/controller.go:806: watch of *v1.StorageClass ended with: too old resource version: 316368 (317507)
W0102 20:22:39.900359 1 reflector.go:289] sigs.k8s.io/sig-storage-lib-external-provisioner/controller/controller.go:800: watch of *v1.PersistentVolumeClaim ended with: too old resource version: 314203 (317500)

Build everything in the Dockerfile

We should have a multi-stage Dockerfile that compiles the code and then generates the final image with the result, instead of requiring us to build the binary manually on our system and then build the image.
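
A sketch of what that multi-stage Dockerfile could look like (Go version, base image and paths are illustrative):

    # Stage 1: build the operator binary.
    FROM golang:1.12 AS builder
    WORKDIR /go/src/github.com/embercsi/ember-csi-operator
    COPY . .
    RUN make build

    # Stage 2: copy only the binary into the runtime image.
    FROM centos:7
    COPY --from=builder /go/src/github.com/embercsi/ember-csi-operator/build/ember-csi-operator /usr/local/bin/ember-csi-operator
    ENTRYPOINT ["/usr/local/bin/ember-csi-operator"]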

Support Ember-CSI container restarts and share locks

As we currently deploy the Ember-CSI node containers, restarting them breaks things, because they lose their private volume bind mounts.

If we have multiple Ember-CSI node containers running on the same host (we have multiple backends) we may run into problems since they are not sharing the locks.

To resolve both issues we just need to map the container's /var/lib/ember-csi directory to the host and share this directory between all the Ember-CSI containers.

Consider listing operator in Artifact Hub

Hi! 👋🏻

Have you considered listing the ember-csi operator directly in Artifact Hub?

At the moment it is already listed there, because the Artifact Hub team has added the community-operators repository. However, listing it yourself directly has some benefits:

  • You add your repository once, and new versions (or even new operators) committed to your git repository will be indexed automatically and listed in Artifact Hub, with no extra PRs needed.
  • You can display the Verified Publisher label in your operators, increasing their visibility and potentially the users' trust in your content.
  • Increased visibility of your organization in URLs and search results. Users will be able to see your organization's description, a link to its home page, and search for other content published by you.
  • If something goes wrong indexing your repository, you will be notified and you can even inspect the logs to check what went wrong.

If you decide to go ahead, you just need to sign in and add your repository from the control panel. You can add it using a single user or create an organization for it, whatever suits your needs best.

You can find notes about the expected repository URL format and repository structure in the repositories guide. An example of an operator repository already listed in Artifact Hub is also available in the documentation. Operators are expected to be packaged using the format defined in the Operator Framework documentation to facilitate the process.

Please let me know if you have any questions or if you encounter any issue during the process 🙂
