cloud-ark / kubeplus

Kubernetes Operator for multi-instance multi-tenancy

Home Page: https://cloudark.io/

License: Apache License 2.0

Go 61.09% Shell 8.88% Python 26.83% Dockerfile 0.49% JavaScript 1.49% HTML 1.23%
kubernetes-operator multi-tenancy saas multi-customer helm managed-application platform-engineering application-hosting

kubeplus's People

Contributors

barry8schneider, chiukapoor, dependabot[bot], devdattakulkarni, djarotech, dsuleimenov, eminalparslan, enyachoke, joaocc, mbowen13, omgoswami, tomku1, yil1223


kubeplus's Issues

Operator delete fails on connection timeout

If the connection between operator-deployer and helm times out, the Operator delete action fails.

Error from operator-deployer logs:

Error: rpc error: code = Unavailable desc = transport is closingOperator chart %s %s already deployed
Operators to install:[https://github.com/cloud-ark/operatorcharts/blob/master/moodle-operator-chart-0.0.1.tgz?raw=true]
E1109 21:50:15.647812 1 portforward.go:178] lost connection to pod
Effective Operators to install:[https://github.com/cloud-ark/operatorcharts/blob/master/moodle-operator-chart-0.0.1.tgz?raw=true]
Error: context deadline exceededOperator chart %s %s already deployed
Operators to install:[https://github.com/cloud-ark/operatorcharts/blob/master/moodle-operator-chart-0.0.1.tgz?raw=true]
Operators to delete:[https://github.com/cloud-ark/operatorcharts/blob/master/moodle-operator-chart-0.0.1.tgz?raw=true]
Release Name to delete:newbie-cardinal
Deleting chart:newbie-cardinal

Error from operator-manager logs:

E1109 21:50:20.025104 1 reflector.go:322] github.com/cloud-ark/kubeplus/operator-manager/vendor/k8s.io/client-go/informers/factory.go:130: Failed to watch *v1.Deployment: Get https://10.96.0.1:443/apis/apps/v1/deployments?resourceVersion=3964&timeoutSeconds=525&watch=true: unexpected EOF
E1109 21:50:20.056931 1 reflector.go:322] github.com/cloud-ark/kubeplus/operator-manager/pkg/client/informers/externalversions/factory.go:117: Failed to watch *v1.Operator: Get https://10.96.0.1:443/apis/operatorcontroller.kubeplus/v1/operators?resourceVersion=3375&timeoutSeconds=361&watch=true: dial tcp 10.96.0.1:443: connect: connection refused
E1109 21:50:34.591025 1 reflector.go:205] github.com/cloud-ark/kubeplus/operator-manager/vendor/k8s.io/client-go/informers/factory.go:130: Failed to list *v1.Deployment: Get https://10.96.0.1:443/apis/apps/v1/deployments?limit=500&resourceVersion=0: net/http: TLS handshake timeout
E1109 21:50:35.585139 1 reflector.go:205] github.com/cloud-ark/kubeplus/operator-manager/pkg/client/informers/externalversions/factory.go:117: Failed to list *v1.Operator: Get https://10.96.0.1:443/apis/operatorcontroller.kubeplus/v1/operators?limit=500&resourceVersion=0: net/http: TLS handshake timeout
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Operator to delete:moodle-operator

Mysql operator steps.txt - missing step? and unable to verify mysql pods exist

I followed steps 1-11 here, but was unable to verify the MySQL cluster.

kubectl create -f mysql-operator-chart-0.2.1.yaml
kubectl create -f mysql-cluster.yaml (I think this step is missing, but still)
No mysql pods are displayed

$ kubectl get pods
kubeplus-wnmpc   4/4       Running   0          12m

$ kubectl version
Client Version: version.Info{Major:"1", Minor:"10", GitVersion:"v1.10.0", GitCommit:"fc32d2f3698e36b93322a3465f63a14e9f0eaead", GitTreeState:"clean", BuildDate:"2018-03-26T16:55:54Z", GoVersion:"go1.9.3", Compiler:"gc", Platform:"darwin/amd64"}
Server Version: version.Info{Major:"1", Minor:"10", GitVersion:"v1.10.0", GitCommit:"fc32d2f3698e36b93322a3465f63a14e9f0eaead", GitTreeState:"clean", BuildDate:"2018-03-26T16:44:10Z", GoVersion:"go1.9.3", Compiler:"gc", Platform:"linux/amd64"}

$ helm version
Client: &version.Version{SemVer:"v2.12.3", GitCommit:"eecf22f77df5f65c823aacd2dbd30ae6c65f186e", GitTreeState:"clean"}
Server: &version.Version{SemVer:"v2.12.3", GitCommit:"eecf22f77df5f65c823aacd2dbd30ae6c65f186e", GitTreeState:"clean"}

$ minikube version
minikube version: v0.28.2

OS:
$ sw_vers
ProductName: Mac OS X
ProductVersion: 10.14.2
BuildVersion: 18C54

$ kubectl logs -c operator-manager
error: expected 'logs (POD | TYPE/NAME) [CONTAINER_NAME]'.
POD or TYPE/NAME is a required argument for the logs command
See 'kubectl logs -h' for help and examples.
$ kubectl logs -c operator-deployer
error: expected 'logs (POD | TYPE/NAME) [CONTAINER_NAME]'.
POD or TYPE/NAME is a required argument for the logs command
See 'kubectl logs -h' for help and examples.
$ kubectl logs -c kube-discovery-apiserver
error: expected 'logs (POD | TYPE/NAME) [CONTAINER_NAME]'.
POD or TYPE/NAME is a required argument for the logs command
See 'kubectl logs -h' for help and examples.


$ kubectl logs -n mysql-operator mysql-operator-56cb675b7f-mjnzf
I0208 15:57:04.990969       1 reflector.go:240] Listing and watching *v1.Service from github.com/oracle/mysql-operator/vendor/k8s.io/client-go/informers/factory.go:87
E0208 15:57:04.991764       1 reflector.go:205] github.com/oracle/mysql-operator/vendor/k8s.io/client-go/informers/factory.go:87: Failed to list *v1.Service: services is forbidden: User "system:serviceaccount:mysql-operator:mysql-operator" cannot list services at the cluster scope
I0208 15:57:04.995208       1 reflector.go:240] Listing and watching *v1.Pod from github.com/oracle/mysql-operator/vendor/k8s.io/client-go/informers/factory.go:87
E0208 15:57:04.996013       1 reflector.go:205] github.com/oracle/mysql-operator/vendor/k8s.io/client-go/informers/factory.go:87: Failed to list *v1.Pod: pods is forbidden: User "system:serviceaccount:mysql-operator:mysql-operator" cannot list pods at the cluster scope
I0208 15:57:04.996320       1 reflector.go:240] Listing and watching *v1alpha1.Cluster from github.com/oracle/mysql-operator/pkg/generated/informers/externalversions/factory.go:70
E0208 15:57:04.997434       1 reflector.go:205] github.com/oracle/mysql-operator/pkg/generated/informers/externalversions/factory.go:70: Failed to list *v1alpha1.Cluster: mysqlclusters.mysql.oracle.com is forbidden: User "system:serviceaccount:mysql-operator:mysql-operator" cannot list mysqlclusters.mysql.oracle.com at the cluster scope

YAMLs:
mysql-cluster.yaml
mysql-operator-chart-0.2.1.yaml

Image Tags:
Operator-manager

Incorrect delete behavior after KubePlus has been running for several hours

We are noticing that after several hours of operation, KubePlus fails to correctly process requests to delete a deployed Operator. The Helm chart gets deleted, but the Operator object itself does not.

Here are the log outputs:

operator-manager:

E1202 16:30:01.049852 1 reflector.go:322] github.com/cloud-ark/kubeplus/operator-manager/pkg/client/informers/externalversions/factory.go:117: Failed to watch *v1.Operator: Get https://10.96.0.1:443/apis/operatorcontroller.kubeplus/v1/operators?resourceVersion=41684&timeoutSeconds=513&watch=true: dial tcp 10.96.0.1:443: connect: connection refused
E1202 16:30:01.051118 1 reflector.go:322] github.com/cloud-ark/kubeplus/operator-manager/vendor/k8s.io/client-go/informers/factory.go:130: Failed to watch *v1.Deployment: Get https://10.96.0.1:443/apis/apps/v1/deployments?resourceVersion=41679&timeoutSeconds=469&watch=true: dial tcp 10.96.0.1:443: connect: connection refused
E1202 16:30:02.107242 1 reflector.go:205] github.com/cloud-ark/kubeplus/operator-manager/pkg/client/informers/externalversions/factory.go:117: Failed to list *v1.Operator: Get https://10.96.0.1:443/apis/operatorcontroller.kubeplus/v1/operators?limit=500&resourceVersion=0: dial tcp 10.96.0.1:443: connect: connection refused
E1202 16:30:02.154620 1 reflector.go:205] github.com/cloud-ark/kubeplus/operator-manager/vendor/k8s.io/client-go/informers/factory.go:130: Failed to list *v1.Deployment: Get https://10.96.0.1:443/apis/apps/v1/deployments?limit=500&resourceVersion=0: dial tcp 10.96.0.1:443: connect: connection refused
E1202 16:30:03.143158 1 reflector.go:205] github.com/cloud-ark/kubeplus/operator-manager/pkg/client/informers/externalversions/factory.go:117: Failed to list *v1.Operator: Get https://10.96.0.1:443/apis/operatorcontroller.kubeplus/v1/operators?limit=500&resourceVersion=0: dial tcp 10.96.0.1:443: connect: connection refused
E1202 16:30:03.168676 1 reflector.go:205] github.com/cloud-ark/kubeplus/operator-manager/vendor/k8s.io/client-go/informers/factory.go:130: Failed to list *v1.Deployment: Get https://10.96.0.1:443/apis/apps/v1/deployments?limit=500&resourceVersion=0: dial tcp 10.96.0.1:443: connect: connection refused
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Operator to delete:moodle-operator
Last item:moodle-operator-chart-0.0.1.tgz?raw=true
Candidate:moodle-operator-chart-0.0.1
Version:0.0.1
Name:moodle-operator-chart

operator-deployer:

Setting release-https://github.com/cloud-ark/operatorcharts/blob/master/moodle-operator-chart-0.0.1.tgz?raw=true->killjoy-leopard
E1202 14:41:25.861185 1 portforward.go:178] lost connection to pod
E1202 14:41:25.868215 1 portforward.go:178] lost connection to pod
E1202 14:41:25.878150 1 portforward.go:178] lost connection to pod
E1202 14:41:25.878178 1 portforward.go:178] lost connection to pod
E1202 14:41:25.878183 1 portforward.go:178] lost connection to pod
E1202 14:41:25.784588 1 portforward.go:178] lost connection to pod
E1202 14:41:25.785064 1 portforward.go:178] lost connection to pod
E1202 14:41:25.828966 1 portforward.go:178] lost connection to pod
E1202 14:41:25.950864 1 portforward.go:178] lost connection to pod
Created tunnel using local port: 45119
Release Name to delete:killjoy-leopard
Deleting chart:killjoy-leopard
release "killjoy-leopard" deleted

Actual state:

NAME                                          READY     STATUS    RESTARTS   AGE
kubeplus-2mbxj                                4/4       Running   2          18h
moodle-operator-deployment-64cdcb958c-2fw5s   1/1       Running   0          18h
moodle1-748f7b84c9-l8f6w                      1/1       Running   0          18h
moodle1-mysql-6c8748868c-tq6bz                1/1       Running   0          18h

Notice that the Moodle Operator Pod is still running even though the Helm chart has been deleted.

Helm failure on real cluster

When trying KubePlus on a real cluster, we saw a Helm failure: Helm was not able to install a chart.

It got fixed after running the following command:
kubectl.sh create clusterrolebinding permissive-binding --clusterrole=cluster-admin --user=admin --user=kubelet --group=system:serviceaccounts

as identified on [1]

[1] https://stackoverflow.com/questions/43499971/helm-error-no-available-release-name-found

We also created a service account for tiller as outlined on [2].

[2] helm/helm#3130

What is KubeARK

KubeARK is referenced in the README, but I am not able to find any code or GitHub project for it.

Operator deployer keeps on trying deploying an Operator even after Helm deploy failure

Currently, operator-deployer keeps attempting to deploy a Helm chart even after a Helm failure. We should cap the number of deployment attempts before stopping. The stop can last for some period, after which deployment can be attempted again.

We saw this issue when deploying, deleting, and redeploying the MySQL Operator. This Operator creates a namespace 'mysql-operator', and it takes a long time to delete this namespace. If redeployment is tried before the 'mysql-operator' namespace is completely deleted, the deployment fails with Helm errors (since the namespace is being terminated, a new deployment cannot be done).
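The proposed bounded retry could be sketched as follows (a hypothetical helper, not existing KubePlus code; the deploy callable stands in for the actual Helm deploy call):

```python
import time

def deploy_with_retries(deploy_fn, max_attempts=5, cooldown_seconds=0):
    """Try deploy_fn up to max_attempts times.

    Returns True on the first success.  If every attempt fails, pauses
    for cooldown_seconds (so the caller may retry a whole new cycle
    later) and returns False instead of retrying forever.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            deploy_fn()
            return True
        except Exception as err:
            print(f"deploy attempt {attempt}/{max_attempts} failed: {err}")
    time.sleep(cooldown_seconds)  # back off before the next deployment cycle
    return False
```

With such a cap, a namespace stuck in Terminating would exhaust the attempts and back off instead of hammering tiller indefinitely.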

When attempting to list deployed Helm charts, we see the following error after a while.

helm list --tiller-namespace default
Error: grpc: received message larger than max (6811399 vs. 4194304)

Default Service Account permissions

When deploying the Postgres controller on a real cluster, we got the following error:

E1101 11:06:43.360335 1 reflector.go:205] github.com/cloud-ark/kubeplus/postgres-crd-v2/pkg/client/informers/externalversions/factory.go:117: Failed to list *v1.Postgres: postgreses.postgrescontroller.kubeplus is forbidden: User "system:serviceaccount:default:default" cannot list postgreses.postgrescontroller.kubeplus at the cluster scope

The default service account in the default namespace needs to be granted the cluster-admin role. Applying the following YAML fixes the issue:


apiVersion: rbac.authorization.k8s.io/v1beta1
kind: ClusterRoleBinding
metadata:
  name: default
subjects:
- kind: ServiceAccount
  name: default
  namespace: default
roleRef:
  kind: ClusterRole
  name: cluster-admin
  apiGroup: rbac.authorization.k8s.io

Include the above YAML in the deployment artifacts of postgres-crd-v2, and update the README to include the step to apply it first.

kubectl get composition for Operator CRD

It should be possible to use the 'composition' endpoint with the 'Operator' kind.

For example, the following should work:

$ kubectl get operators
NAME              AGE
moodle-operator   10m

$ kubectl get --raw "/apis/kubeplus.cloudark.io/v1/composition?kind=operator&instance=moodle-operator"
[]

Related: #144

Custom Values.yaml for an Operator

This issue is to track the work required to support custom Values.yaml for an Operator.

The current option for users is to create a custom chart, upload it to a chart repository, and then provide its URL as input in the KubePlus Operator CRD.

Another option that we should explore is the following:

  • User creates a ConfigMap with the new Values.yaml
  • User provides the name of the ConfigMap in the Operator CRD. For this, we add a new attribute to the Operator CRD spec. The Operator controller retrieves the Values.yaml from the ConfigMap and passes it to tiller when deploying the chart.
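A sketch of what this could look like (the spec field names, including the new `valuesConfigMap` attribute, are hypothetical and shown only to illustrate the proposal):

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: mysql-operator-values
data:
  values.yaml: |
    # custom chart values go here
    rbac:
      enabled: true
---
apiVersion: operatorcontroller.kubeplus/v1
kind: Operator
metadata:
  name: mysql-operator
spec:
  chartURL: https://github.com/cloud-ark/operatorcharts/blob/master/mysql-operator-chart-0.2.1.tgz?raw=true
  valuesConfigMap: mysql-operator-values  # hypothetical new attribute
```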

Examples of using other methods of interaction

The primary method of using KubePlus is 'kubectl'. But there might be situations where kubectl is not the best option, for example programmatic access. For such situations we would need something that allows making direct REST calls against the Kube API server.
By default, anything that is possible through kubectl can be accessed using curl (and consequently through any other REST client).

We should verify that this is the case for the 'explain' and 'composition' endpoints as well, and include examples that show how to use them with curl and with the REST library of some programming language.
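As a sketch, the raw paths that kubectl hits could be built and called from any HTTP client. The helpers below only construct the paths (they assume the endpoints shown elsewhere in this document; API server address and authentication are left out):

```python
from urllib.parse import urlencode

API_GROUP = "/apis/kubeplus.cloudark.io/v1"

def explain_path(kind):
    # Path for the 'explain' endpoint, as used with `kubectl get --raw`
    return f"{API_GROUP}/explain?{urlencode({'kind': kind})}"

def composition_path(kind, instance):
    # Path for the 'composition' endpoint
    return f"{API_GROUP}/composition?{urlencode({'kind': kind, 'instance': instance})}"

# The equivalent curl call would look roughly like (token/server omitted):
#   curl -H "Authorization: Bearer $TOKEN" \
#        "$APISERVER/apis/kubeplus.cloudark.io/v1/composition?kind=Postgres&instance=postgres1"
```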

Operator deployment error

Getting the following error when trying to deploy an Operator on a real cluster:

Name:https://github.com/cloud-ark/operatorcharts/blob/master/postgres-crd-v2-chart-0.0.2.tgz?raw=true
panic: Get https://github.com/cloud-ark/operatorcharts/blob/master/postgres-crd-v2-chart-0.0.2.tgz?raw=true: dial tcp: lookup github.com on 10.0.0.10:53: server misbehaving

goroutine 1 [running]:
main.locateChartPath(0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0xc4200b0240, 0x60, 0x0, 0x0, ...)
/home/devdatta/Code/go/src/github.com/cloud-ark/kubeplus/operator-deployer/install.go:517 +0xea1
main.newInstallCmd(0x18da520, 0xc42083cb40, 0x18ab6a0, 0xc42000e018, 0xc4200b0240, 0x60, 0x0, 0x0, 0x0, 0x0, ...)
/home/devdatta/Code/go/src/github.com/cloud-ark/kubeplus/operator-deployer/install.go:178 +0x1cc
main.main()
/home/devdatta/Code/go/src/github.com/cloud-ark/kubeplus/operator-deployer/helm.go:203 +0x9f5

--
Ruling out a DNS issue, as the error persists even after
updating /etc/resolv.conf to include
nameserver 8.8.8.8
nameserver 8.8.4.4

Issues inspecting the custom resource

Hello, here are some issues inspecting the custom resources:

$ kubectl get --raw "/apis/kubeplus.cloudark.io/v1/explain?kind=Postgres"
Error from server (NotFound): the server could not find the requested resource
$ kubectl get --raw "/apis/kubeplus.cloudark.io/v1/"
Error from server (NotFound): the server could not find the requested resource
$ kubectl get --raw "/apis/kubeplus.cloudark.io"
{"kind":"APIGroup","apiVersion":"v1","name":"kubeplus.cloudark.io","versions":[{"groupVersion":"kubeplus.cloudark.io/v1","version":"v1"}],"preferredVersion":{"groupVersion":"kubeplus.cloudark.io/v1","version":"v1"}}

Postgres Operator - No such host on real cluster

When deploying the Postgres Operator on a real cluster we got the following error:

Now setting up the database
Setting up database
Commands:
E1031 16:29:31.750684 1 runtime.go:66] Observed a panic: &net.OpError{Op:"dial", Net:"tcp", Source:net.Addr(nil), Addr:net.Addr(nil), Err:(*net.DNSError)(0xc4204e8ac0)} (dial tcp: lookup port=0: no such host)
/Users/devdatta/go/src/github.com/cloud-ark/kubeplus/postgres-crd-v2/vendor/k8s.io/apimachinery/pkg/util/runtime/runtime.go:72
/Users/devdatta/go/src/github.com/cloud-ark/kubeplus/postgres-crd-v2/vendor/k8s.io/apimachinery/pkg/util/runtime/runtime.go:65
/Users/devdatta/go/src/github.com/cloud-ark/kubeplus/postgres-crd-v2/vendor/k8s.io/apimachinery/pkg/util/runtime/runtime.go:51
/usr/local/Cellar/go/1.10.2/libexec/src/runtime/asm_amd64.s:573
/usr/local/Cellar/go/1.10.2/libexec/src/runtime/panic.go:502
/Users/devdatta/go/src/github.com/cloud-ark/kubeplus/postgres-crd-v2/controller.go:700
/Users/devdatta/go/src/github.com/cloud-ark/kubeplus/postgres-crd-v2/controller.go:462
/Users/devdatta/go/src/github.com/cloud-ark/kubeplus/postgres-crd-v2/controller.go:407
/Users/devdatta/go/src/github.com/cloud-ark/kubeplus/postgres-crd-v2/controller.go:227
/Users/devdatta/go/src/github.com/cloud-ark/kubeplus/postgres-crd-v2/controller.go:235
/Users/devdatta/go/src/github.com/cloud-ark/kubeplus/postgres-crd-v2/controller.go:188
/Users/devdatta/go/src/github.com/cloud-ark/kubeplus/postgres-crd-v2/controller.go:174
/Users/devdatta/go/src/github.com/cloud-ark/kubeplus/postgres-crd-v2/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:133
/Users/devdatta/go/src/github.com/cloud-ark/kubeplus/postgres-crd-v2/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:134
/Users/devdatta/go/src/github.com/cloud-ark/kubeplus/postgres-crd-v2/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:88
/usr/local/Cellar/go/1.10.2/libexec/src/runtime/asm_amd64.s:2361
panic: dial tcp: lookup port=0: no such host [recovered]
panic: dial tcp: lookup port=0: no such host

Not able to access the moodle1 URL from the host machine.

Moodle Operator: Steps Tested with minikube.

OS and Version :

System Memory : 8 GB
Ubuntu 18.04.1 LTS
Release: 18.04
Codename: bionic
Kernel release : Linux ubuntu 4.15.0-42-generic
Kernel : #45-Ubuntu SMP
x86_64

Minikube v0.30.0
Helm v2.11.0

Result :

Able to bring all the pods up (operator, MySQL, and moodle1).
The moodle1 pod took a long time (more than 2 hours) to be ready.

/etc/hosts is updated with the IP address, and I got the admin username and password,

but I am not able to connect to the URL from the host machine:

http://moodle1:32000
IP: 192.168.99.100

Is there any step missing in the doc?

Consolidation of Etcd functionality

Currently, operator-manager and operator-deployer both implement certain etcd-related functions. We should extract these functions into a separate library and use it from both operator-manager and operator-deployer.

Steps to generate CRD/Operator code

Here are the steps to follow to generate your own CRD/Operator code.

Generating Client/Listers/Informers:

To generate the clients/listers/informers, etc., the first step is to create a specific directory structure and define specific files in it. Here is that structure and the required files:

  • pkg/apis/some-controller-name/register.go
  • pkg/apis/some-controller-name/v1/doc.go
  • pkg/apis/some-controller-name/v1/types.go
  • pkg/client
  • pkg/signals

Following are detailed steps:

  1. Choose a name for your CRD's group name -- say 'postgrescontroller'

  2. Create pkg/apis/'CRD group name' directory

  3. cd pkg/apis/'CRD group name'

  4. Inside this directory, create register.go to register the group name. When registering the name, choose some qualifier (for example, in
    https://github.com/cloud-ark/kubeplus/blob/master/postgres-crd-v2/pkg/apis/postgrescontroller/register.go we have used "kubeplus" as the qualifier).

  5. Create v1 directory inside pkg/apis/'CRD group name'

  6. cd pkg/apis/'CRD group name'/v1

  7. Create doc.go. Make sure you set the "+groupName=." in line 20 of doc.go. Leave line 21 as it is. See https://github.com/cloud-ark/kubeplus/blob/master/postgres-crd-v2/pkg/apis/postgrescontroller/v1/doc.go

  8. Copy https://github.com/cloud-ark/kubeplus/tree/master/operator-manager/pkg/apis/operatorcontroller/v1/register.go into pkg/apis/'CRD group name'/v1/.
    Make appropriate changes to this file.

  9. Define your CRD spec in types.go

  10. Nothing else needs to be created in the pkg/apis directory

  11. Create pkg/client directory. Nothing needs to be created inside the client directory -- leave it empty.

  12. Create pkg/signals directory and just copy all the files from https://github.com/cloud-ark/kubeplus/tree/master/postgres-crd-v2/pkg/signals into pkg/signals.
    You don't need to modify anything in any of these files.

  13. Go inside the hack/ directory.

  14. Edit update-codegen.sh to include the path of your folder

  15. Copy Gopkg.lock and Gopkg.toml from
    https://github.com/cloud-ark/kubeplus/tree/master/operator-manager

  16. Install Go dependencies
    dep ensure (this will create the vendor folder)

  17. mkdir vendor/k8s.io/code-generator/hack

  18. cp hack/boilerplate.go.txt vendor/k8s.io/code-generator/hack/.

  19. From the main directory of your CRD code execute following:
    ./hack/update-codegen.sh
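For reference, the code-generation marker comments in doc.go (step 7) typically look like the following. The group name shown follows the linked postgres example and is an assumption; replace it with your own CRD group name:

```go
// +k8s:deepcopy-gen=package

// Package v1 is the v1 version of the API.
// +groupName=postgrescontroller.kubeplus
package v1
```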

Running the controller:

  1. In the controller.go (and other go files), include the CRD client/listers/informers etc. with appropriate path. See https://github.com/cloud-ark/kubeplus/blob/master/postgres-crd-v2/controller.go#L47 for an example.

  2. go run *.go -kubeconfig=$HOME/.kube/config

Reference:

kubernetes/sample-controller#13

Handling Operator CRD and children resource ownership across KubePlus deployments

$ kubectl version
Client Version: version.Info{Major:"1", Minor:"10", GitVersion:"v1.10.0", GitCommit:"fc32d2f3698e36b93322a3465f63a14e9f0eaead", GitTreeState:"clean", BuildDate:"2018-03-26T16:55:54Z", GoVersion:"go1.9.3", Compiler:"gc", Platform:"darwin/amd64"}
Server Version: version.Info{Major:"1", Minor:"10", GitVersion:"v1.10.0", GitCommit:"fc32d2f3698e36b93322a3465f63a14e9f0eaead", GitTreeState:"clean", BuildDate:"2018-03-26T16:44:10Z", GoVersion:"go1.9.3", Compiler:"gc", Platform:"linux/amd64"}

$ helm version
Client: &version.Version{SemVer:"v2.12.3", GitCommit:"eecf22f77df5f65c823aacd2dbd30ae6c65f186e", GitTreeState:"clean"}
Server: &version.Version{SemVer:"v2.12.3", GitCommit:"eecf22f77df5f65c823aacd2dbd30ae6c65f186e", GitTreeState:"clean"}

$ minikube version
minikube version: v0.28.2

OS:
$ sw_vers
ProductName: Mac OS X
ProductVersion: 10.14.2
BuildVersion: 18C54

$ kubectl delete -f moodle-operator.yaml
Error from server (NotFound): error when deleting "moodle-operator.yaml": operators.operatorcontroller.kubeplus "moodle-operator" not found

$ kubectl get operators
No resources found.

$ kubectl get pods
NAME                                          READY     STATUS    RESTARTS   AGE
kubeplus-gkhhk                                4/4       Running   0          39s
moodle-operator-deployment-64cdcb958c-j5wdr   1/1       Running   0          16m

$ kubectl logs moodle-operator-deployment-64cdcb958c-j5wdr


Moodle Name:moodle1
Plugins:[profilecohort]
Inside deployMoodle
Inside createPersistentVolume
Creating persistentVolume...
Created persistentVolume "moodle1".
Inside createPersistentVolumeClaim
Creating persistentVolumeClaim...
Created persistentVolumeClaim "moodle1".
Inside createService
Created service "moodle1".
Moodle Service IP:%s 10.111.160.190
Service URI10.111.160.190:32000
Creating Ingress...
Created Ingress "moodle1".
Inside createDeployment
Generated Password:0PIgyQbQ
Inside createSecret
Secret Name:moodle1
Admin Password:0PIgyQbQ
Creating secrets..
Created Secret "moodle1".
MySQL Service Port int:%d
3306
MySQL Service Port:%d
3306
MySQL Host IP:%s
10.108.57.105
HOST_NAME:%s
moodle1:32000
Creating deployment...
Created deployment "moodle1".
Waiting for Moodle Pod to get ready.
Moodle Pod Name:moodle1-76dbff7b5c-ccg55
Waiting for Moodle Pod to get ready.
Moodle Pod Name:moodle1-76dbff7b5c-ccg55
...
Waiting for Moodle Pod to get ready.
Moodle Pod Name:moodle1-76dbff7b5c-ccg55
Waiting for Moodle Pod to get ready.
Moodle Pod Name:moodle1-76dbff7b5c-ccg55
Waiting for Moodle Pod to get ready.
Moodle Pod Name:moodle1-76dbff7b5c-ccg55
Waiting for Moodle Pod to get ready.
Moodle Pod Name:moodle1-76dbff7b5c-ccg55
Moodle Pod is running.
Pod is ready.
Supported Plugins:%v [profilecohort]
Unsupported Plugins:%v []
Inside installPlugins
Installing plugin profilecohort
Download Link:https://moodle.org/plugins/download.php/17929/local_profilecohort_moodle35_2018092800.zip
Install Folder:/var/www/html/local/
Inside exec
Plugin ZipFile Name:local_profilecohort_moodle35_2018092800.zip
Download Plugin Cmd:wget https://moodle.org/plugins/download.php/17929/local_profilecohort_moodle35_2018092800.zip -O /tmp/local_profilecohort_moodle35_2018092800.zip
Inside executeExecCall
Output:
Unzip Plugin Cmd:unzip /tmp/local_profilecohort_moodle35_2018092800.zip -d /tmp/.
Inside executeExecCall
Output:Archive: /tmp/local_profilecohort_moodle35_2018092800.zip
creating: profilecohort/
inflating: profilecohort/styles.css
creating: profilecohort/classes/
inflating: profilecohort/classes/profilecohort.php
creating: profilecohort/classes/task/
inflating: profilecohort/classes/task/update_cohorts.php
inflating: profilecohort/classes/profilefields.php
creating: profilecohort/classes/privacy/
inflating: profilecohort/classes/privacy/provider.php
inflating: profilecohort/classes/field_checkbox.php
inflating: profilecohort/classes/field_text.php
inflating: profilecohort/classes/field_textarea.php
inflating: profilecohort/classes/field_menu.php
inflating: profilecohort/classes/cohort_form.php
inflating: profilecohort/classes/fields_form.php
inflating: profilecohort/classes/field_base.php
inflating: profilecohort/CHANGES.md
inflating: profilecohort/.travis.yml
creating: profilecohort/lang/
creating: profilecohort/lang/en/
inflating: profilecohort/lang/en/local_profilecohort.php
creating: profilecohort/amd/
creating: profilecohort/amd/src/
inflating: profilecohort/amd/src/reorder.js
creating: profilecohort/amd/build/
inflating: profilecohort/amd/build/reorder.min.js
inflating: profilecohort/COPYING.txt
creating: profilecohort/tests/
inflating: profilecohort/tests/rules_test.php
creating: profilecohort/tests/behat/
inflating: profilecohort/tests/behat/edit_rules.feature
inflating: profilecohort/tests/behat/behat_local_profilecohort.php
inflating: profilecohort/index.php
inflating: profilecohort/cohorts.php
creating: profilecohort/db/
inflating: profilecohort/db/events.php
inflating: profilecohort/db/upgrade.php
inflating: profilecohort/db/tasks.php
inflating: profilecohort/db/install.xml
inflating: profilecohort/version.php
inflating: profilecohort/settings.php
inflating: profilecohort/README.md

Move Plugin Cmd:mv /tmp/profilecohort /var/www/html/local//.
Inside executeExecCall
Output:
Done installing plugin profilecohort
Erred Plugins:[]
Done installing Plugins
Returning from deployMoodle
Moodle URL:http://moodle1:32000


Moodle Name:moodle1
Plugins:[profilecohort]
Spec Plugins:[profilecohort]
Installed Plugins:[profilecohort]
Plugins to install:%v
[]
Plugins to remove:%v
[]
Supported Plugins:%v []
Unsupported Plugins:%v []
Moodle custom resource moodle1 did not change. No plugin installed.

Logs:

$ kubectl logs -c kube-discovery-apiserver
error: expected 'logs (POD | TYPE/NAME) [CONTAINER_NAME]'.
POD or TYPE/NAME is a required argument for the logs command
See 'kubectl logs -h' for help and examples.
same output for others...
$ kubectl logs -c operator-deployer

$ kubectl logs -c operator-manager

So this issue may be because I deleted the deploy/ directory before I tried to delete the moodle-operator-deployment pod. But now, when I re-applied the kubeplus 4/4 pods, I can no longer see the operator or delete this pod anymore. See command inputs:


$ kubectl get pods
NAME                                          READY     STATUS    RESTARTS   AGE
kubeplus-gkhhk                                4/4       Running   0          8m
moodle-operator-deployment-64cdcb958c-j5wdr   1/1       Running   0          24m

$ kubectl delete -f moodle-operator.yaml
Error from server (NotFound): error when deleting "moodle-operator.yaml": operators.operatorcontroller.kubeplus "moodle-operator" not found


YAMLS: moodle-operator.yaml
Image Tags for: Operator-Manager

Certificate validation issue when running behind a firewall/proxy

The machine on which this test was run had firewall software running in the background. KubePlus seems to run into a certificate validation issue in such a setup. Turning off the firewall made this issue go away, and KubePlus deployments then worked.

kubectl logs kubeplus-gzzs7 -c operator-deployer
Created tunnel using local port: 36251
Effective Operators to install:[https://github.com/cloud-ark/operatorcharts/blob/master/postgres-crd-v2-chart-0.0.2.tgz?raw=true]
Error: rpc error: code = Unknown desc = configmaps is forbidden: User "system:serviceaccount:kube-system:default" cannot list configmaps in the namespace "kube-system"Installing chart.
Chart URLhttps://github.com/cloud-ark/operatorcharts/blob/master/postgres-crd-v2-chart-0.0.2.tgz?raw=true
ValuesString:null
ChartValues:[]
Name:https://github.com/cloud-ark/operatorcharts/blob/master/postgres-crd-v2-chart-0.0.2.tgz?raw=true, Version:
Filename:
Name:https://github.com/cloud-ark/operatorcharts/blob/master/postgres-crd-v2-chart-0.0.2.tgz?raw=true
panic: Get https://github.com/cloud-ark/operatorcharts/blob/master/postgres-crd-v2-chart-0.0.2.tgz?raw=true: x509: certificate signed by unknown authority

goroutine 1 [running]:
main.locateChartPath(0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0xc4208b8060, 0x60, 0x0, 0x0, ...)
/Users/devdatta/go/src/github.com/cloud-ark/kubeplus/operator-deployer/install.go:494 +0xe55
main.installChart(0x18cfd20, 0xc420692240, 0x18a0ee0, 0xc42009c008, 0xc4208b8060, 0x60, 0x0, 0x0, 0x0, 0x0, ...)
/Users/devdatta/go/src/github.com/cloud-ark/kubeplus/operator-deployer/install.go:176 +0x1cc
main.main()
/Users/devdatta/go/src/github.com/cloud-ark/kubeplus/operator-deployer/helm.go:220 +0x11b5

Consolidate deployment artifacts

Currently there is some overlap of functionality in the deployment artifacts. For instance, we grant the cluster-admin role to both the default service account and the apiserver service account that we create. The default service account is currently needed for Helm.

We should run everything using the apiserver service account. As part of this, we can also consider deploying tiller as part of the KubePlus deployment artifacts. That would also remove the step of the user having to install Helm first.

Communication between Operator Pod and resource instance Pod

In the Postgres Operator, we are using a Service of type NodePort for the created Postgres Pod. Then, for setting up the database, the Postgres Operator Pod makes calls against Public_IP_of_Node:NodePort of the Postgres Service. There are the following problems with this approach.

  1. We need to know the Public_IP of the Node and pass it as an environment variable to the Postgres Operator deployment YAML. We can use the status.hostIP downward API field, but this does not seem to work everywhere: it works on Minikube, but not on AWS EC2 instances. So some sort of workaround is required to update the deployment YAML with the Public_IP of the EC2 node before applying it.

  2. On a public cloud like AWS, we need to modify the security groups of the EC2 instance to allow traffic to ports 30000-32767 from anywhere. This is an extra step that the cluster administrator needs to perform.

  3. Currently we store the connection string in the Postgres instance Status for later use. In a multi-node cluster, if the Postgres Pod gets scheduled to a different Node after a restart, the saved connection string will not work.

Ideally the Postgres Operator Pod should be able to communicate with the Postgres Pod using the name of the Service object created for that particular Postgres instance.

No resources found for "mysql-operator-0.2.1" namespace.

~/kubeplus/examples/mysql$ kubectl get pods -n mysql-operator-0.2.1
No resources found.

But when we tried to get pods in the "mysql-operator" namespace, it worked.

~/kubeplus/examples/mysql$ kubectl get pods -n mysql-operator
NAME READY STATUS RESTARTS AGE
mysql-operator-56cb675b7f-szmj8 1/1 Running 0 6m

Deleting Operator should delete Operator deployment Pod and any custom resource pods

After running kubectl delete operator postgres-operator, when we do kubectl get pods we see the following output:

kubeplus-q7khv 4/4 Running 21 51d
postgres-operator-deployment-7649cbf58b-7f85c 1/1 Running 1 51d
postgres1-86dbf8b95b-qs6sg 1/1 Running 1 51d

This is incorrect. Once the postgres operator is deleted, the corresponding deployment should get deleted, along with the postgres1 pod.

Handling Operator delete

kubeplus should support delete of an Operator.

The following needs to be done as part of this:

  • Delete the deployed Operator Helm Chart
  • Delete the ConfigMap that is created to store the OpenAPISpec of the custom resources managed by the Operator
  • Delete any custom resource instances that are managed by the Operator

Output of kubectl explain

Currently the output of:

kubectl get --raw "/apis/kubeplus.cloudark.io/v1/explain?kind=..."

shows the entire OpenAPISpec definition of all the custom resources being managed by an Operator.

The output should be restricted to only the Kind that we have provided as input.

In the long term we plan to integrate our work with upstream. This is being tracked in following issue:
kubernetes/kube-openapi#97

Handling Operator version upgrade

We should support upgrading Operator to a different version.

As part of this, the following needs to be done:

  • Find all the resources that correspond to the current version of the Operator and transfer their Owner references to the new Operator.

Failed to list *v1.Postgres: the server could not find the requested resource (get postgreses.postgrescontroller.kubeplus)

Hi,

I tried your Postgres custom resource, but it failed:

E0424 11:08:56.640812   44329 reflector.go:205] github.com/cloud-ark/kubeplus/postgres-crd/pkg/client/informers/externalversions/factory.go:74: Failed to list *v1.Postgres: the server could not find the requested resource (get postgreses.postgrescontroller.kubeplus)
E0424 11:08:57.645547   44329 reflector.go:205] github.com/cloud-ark/kubeplus/postgres-crd/pkg/client/informers/externalversions/factory.go:74: Failed to list *v1.Postgres: the server could not find the requested resource (get postgreses.postgrescontroller.kubeplus)
E0424 11:08:58.648548   44329 reflector.go:205] github.com/cloud-ark/kubeplus/postgres-crd/pkg/client/informers/externalversions/factory.go:74: Failed to list *v1.Postgres: the server could not find the requested resource (get postgreses.postgrescontroller.kubeplus)

I'm using Minikube, Kube version 1.10.

Mysql operator step 12: cannot find three pods

After executing the steps below, we cannot find the three mysql pods running, although the steps mention three pods: mysql0, mysql1, and mysql2.

minikube start --memory 4096
helm init
kubectl get pods -n kube-system
kubectl apply -f deploy/
kubectl get pods
cd examples/mysql
kubectl create -f mysql-operator-chart-0.2.1.yaml
kubectl get ns
kubectl get pods
kubectl get pods -n mysql-operator

kubectl get operators
kubectl describe operators mysql-operator-0.2.1
kubectl describe customresourcedefinition mysqlclusters.mysql.oracle.com
kubectl get --raw "/apis/kubeplus.cloudark.io/v1/explain?kind=Cluster" | python -m json.tool
kubectl get --raw "/apis/kubeplus.cloudark.io/v1/explain?kind=Cluster.ClusterSpec" | python -m json.tool

kubectl describe operators mysql-operator-0.2.1
kubectl describe customresourcedefinition mysqlbackups.mysql.oracle.com
kubectl get --raw "/apis/kubeplus.cloudark.io/v1/explain?kind=Backup" | python -m json.tool
kubectl get --raw "/apis/kubeplus.cloudark.io/v1/explain?kind=Backup.BackupSpec" | python -m json.tool
kubectl get --raw "/apis/kubeplus.cloudark.io/v1/explain?kind=Backup.BackupSpec.StorageProvider" | python -m json.tool
kubectl get --raw "/apis/kubeplus.cloudark.io/v1/explain?kind=Backup.BackupSpec.StorageProvider.S3StorageProvider" | python -m json.tool

kubectl describe operators mysql-operator-0.2.1
kubectl describe customresourcedefinition mysqlrestores.mysql.oracle.com
kubectl get --raw "/apis/kubeplus.cloudark.io/v1/explain?kind=Restore" | python -m json.tool
kubectl get --raw "/apis/kubeplus.cloudark.io/v1/explain?kind=Restore.RestoreSpec" | python -m json.tool

kubectl describe operators mysql-operator-0.2.1
kubectl describe customresourcedefinition mysqlbackupschedules.mysql.oracle.com
kubectl get --raw "/apis/kubeplus.cloudark.io/v1/explain?kind=BackupSchedule" | python -m json.tool
kubectl get --raw "/apis/kubeplus.cloudark.io/v1/explain?kind=BackupSchedule.BackupScheduleSpec" | python -m json.tool

kubectl get pods

NAME READY STATUS RESTARTS AGE
kubeplus-sncw4 4/4 Running 0 1h

Set owner references on underlying Resources/Objects created by Operator

We should set Owner reference on the Objects that are created as part of the Operator deployment. The owner should be the Operator Object.

How to achieve this?
-> Capture the output of the Helm deploy to find all the Objects that got created as part of the Operator deployment. On each of those objects, set the Owner reference using the UID of the Operator object.

kubectl explain for Operator CRD

We should support kubectl explain for Operator CRD itself.

kubectl get --raw "/apis/kubeplus.cloudark.io/v1/explain?kind=Operator" should show the OpenAPISpec for the Operator kind.

Support for Helm chart that is available on local filesystem

Currently KubePlus supports chartURL in the Operator CRD. This requires that the chart be present in some helm repository. However, there are many situations where we may want to deploy a chart package that is available locally on the file system. We should consider adding support for this.

Note that any solution needs to work within the constraint of 'kubectl' -- i.e. it should be possible to use kubectl only.

Keeping this constraint in mind, a solution for supporting a locally available helm chart package would work as follows.

  • We first create a configmap with the locally available chart
  • We teach Operator CRD's chartURL field to understand configmap names. These can be represented as just names without any prefix such as http.

Running the postgres operator without minikube

My setup is that I have a GCE node up and running Kubernetes. I followed this tutorial and cloned the Kubernetes source code into my $GOPATH. At that point I ran hack/local-up-cluster.sh, which runs the Kubernetes cluster.
https://dzone.com/articles/easy-step-by-step-local-kubernetes-source-code-cha

I had an issue with running the operator:
go run *.go -kubeconfig=$HOME/.kube/config

It just complains, saying File not found: config.
On minikube this works fine because minikube puts the file in the .kube directory, but when running an actual Kubernetes cluster, I did not find it in the .kube directory.

I found the location of the config file in the $KUBECONFIG environment variable.
$KUBECONFIG points to: /var/run/kubernetes/admin.kubeconfig (in case the variable is not set, this is where to find it)

go run *.go -kubeconfig=$KUBECONFIG
worked just fine for me


I also set the MINIKUBE_IP variable (in controller.go) to the IP of my server, 35.196.46.32, which I got from my GCP console.

Add Contributing guidelines

Add guidelines for contributing, reporting issues, etc.

For reporting issues we will at least need following information:

  1. Kubernetes server version
  2. kubectl version
  3. helm version
  4. Host details
    • If using minikube, minikube version
    • If using cloud VM, flavor
  5. Error output
  6. kubeplus sha, kubediscovery sha
  7. operator-manager, operator-deployer, kube-discovery-apiserver image tags

How do status type get updated

Where is the code for the controller to update the status type?

AvailableReplicas int32 `json:"availableReplicas"`
ActionHistory []string `json:"actionHistory"`
Users []UserSpec `json:"users"`
Databases []string `json:"databases"`
VerifyCmd string `json:"verifyCommand"`
ServiceIP string `json:"serviceIP"`
ServicePort string `json:"servicePort"`
Status string `json:"status"`

Helm configmaps not deleted after Operator is deleted

Now that we are running tiller in the default namespace, it looks like the configmaps that Tiller creates for each deployed Operator chart are not getting deleted after the Operator is deleted. We should delete these configmaps in the DeleteFunc of operator-manager/controller.go.
