kudobuilder / kudo

Kubernetes Universal Declarative Operator (KUDO)

Home Page: https://kudo.dev

License: Apache License 2.0

Dockerfile 0.21% Makefile 0.76% Go 96.34% Shell 2.69%
cluster cncf crd hacktoberfest kafka kubernetes kubernetes-community kubernetes-controller kubernetes-operator kudo maestro mysql operator sdk zookeeper

kudo's People

Contributors

alenkacz, aneumann82, anthonydahanne, djannot, fabianbaier, gerred, gkleiman, guenter, harryge00, hypnoglow, jbarrick-mesosphere, joerg84, k8s-ci-robot, kensipe, mattj-io, meichstedt, mpereira, nikhita, philips, porridge, rishabh96b, runyontr, shaneutt, sivaramsk, spiffxp, voelzmo, wking, yankcrime, zen-dog, zmalik


kudo's Issues

CreateOrUpdate function fix

In the plan controller, the line:

result, err := controllerutil.CreateOrUpdate(context.TODO(), r.Client, obj, func(runtime.Object) error { return nil })

needs to be fixed. The last argument is a mutate callback that is supposed to apply the desired modifications to the object pulled from the server; a no-op callback means an existing object is never actually updated. It needs to be replaced with something like this pattern from instance_controller.go:

did, err := controllerutil.CreateOrUpdate(context.TODO(), mgr.GetClient(), current, func(o runtime.Object) error {
   t := true
   o.(*maestrov1alpha1.PlanExecution).Spec.Suspend = &t
   return nil
})
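
A minimal sketch of what the corrected call in the plan controller could look like. setDesiredState is a hypothetical helper (not code from the repo) that copies the spec rendered for this step onto the object returned by the server:

desired := obj.DeepCopyObject()
result, err := controllerutil.CreateOrUpdate(context.TODO(), r.Client, obj, func(live runtime.Object) error {
	// CreateOrUpdate Gets the live object first, so the callback has to re-apply
	// the state rendered for this step or an existing object is never changed.
	return setDesiredState(live, desired) // hypothetical helper, not in the repo
})
log.Printf("CreateOrUpdate resulted in: %v (err: %v)", result, err)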

Custom Overrides for PlanExecutions

For some plans (e.g. backup/restore), it may make sense to run against an instance with different values than those specified by the Instance or FrameworkVersion. An example spec might look like:

apiVersion: maestro.k8s.io/v1alpha1
kind: PlanExecution
metadata:
  name: small-backup
  namespace: default
  ownerReferences:
  - apiVersion: maestro.k8s.io/v1alpha1
    blockOwnerDeletion: true
    controller: true
    kind: Instance
    name: small
    uid: a1fc8f64-fa54-11e8-8673-08002795d782
spec:
  instance:
    kind: Instance
    name: small
    namespace: default
  planName: backup
  arguments:
    BACKUP_LOCATION: s3://backup-bucket/data.sql

Some parameters should not be overridable, and we'd want to capture that in the parameter definition in the FrameworkVersion, as sketched below.
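
A rough sketch of what that could look like in the Go API types (field names here are assumptions for illustration, not the actual FrameworkVersion API):

type Parameter struct {
	Name        string `json:"name"`
	Description string `json:"description,omitempty"`
	Default     string `json:"default,omitempty"`
	// Overridable would control whether a PlanExecution may replace the value
	// from the Instance or FrameworkVersion via spec.arguments.
	Overridable bool `json:"overridable,omitempty"`
}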

Call plan explicitly

Some plans are triggered by specific state changes (e.g. a version update triggers the upgrade plan).

For other plans (e.g. createTopic), a mechanism needs to be in place to call a plan explicitly. One option is sketched below.
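
One possible shape for this, sketched with an unstructured object and a controller-runtime client; the spec layout mirrors the PlanExecution example in the previous issue, and none of this is existing code:

pe := &unstructured.Unstructured{}
pe.SetAPIVersion("maestro.k8s.io/v1alpha1")
pe.SetKind("PlanExecution")
pe.SetNamespace("default")
pe.SetName("small-createtopic") // name chosen for illustration only
pe.Object["spec"] = map[string]interface{}{
	"planName": "createTopic",
	"instance": map[string]interface{}{
		"kind":      "Instance",
		"name":      "small",
		"namespace": "default",
	},
}
// c is a controller-runtime client.Client (e.g. mgr.GetClient());
// unstructured comes from k8s.io/apimachinery/pkg/apis/meta/v1/unstructured.
if err := c.Create(context.TODO(), pe); err != nil {
	log.Printf("could not trigger plan createTopic: %v", err)
}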

Upgrade Kubernetes version to support 1.13

Currently this requires Kubebuilder to support Kubernetes 1.12 (see kubernetes-sigs/cluster-api#522).

When trying to update to Kubernetes 1.12, we get the following error:

Solving failure: No versions of sigs.k8s.io/controller-runtime met constraints:
    v0.1.7: Could not introduce sigs.k8s.io/controller-runtime@v0.1.7, as it has a dependency on k8s.io/client-go with constraint kubernetes-1.11.2, which has no overlap with existing constraint ^9.0.0 from (root)
    v0.1.6: Could not introduce sigs.k8s.io/controller-runtime@v0.1.6, as it is not allowed by constraint ^0.1.7 from project github.com/kubernetes-sigs/kubebuilder-maestro.
    v0.1.5: Could not introduce sigs.k8s.io/controller-runtime@v0.1.5, as it is not allowed by constraint ^0.1.7 from project github.com/kubernetes-sigs/kubebuilder-maestro.
    v0.1.4: Could not introduce sigs.k8s.io/controller-runtime@v0.1.4, as it is not allowed by constraint ^0.1.7 from project github.com/kubernetes-sigs/kubebuilder-maestro.
    v0.1.3: Could not introduce sigs.k8s.io/controller-runtime@v0.1.3, as it is not allowed by constraint ^0.1.7 from project github.com/kubernetes-sigs/kubebuilder-maestro.
    v0.1.2: Could not introduce sigs.k8s.io/controller-runtime@v0.1.2, as it is not allowed by constraint ^0.1.7 from project github.com/kubernetes-sigs/kubebuilder-maestro.
    v0.1.1: Could not introduce sigs.k8s.io/controller-runtime@v0.1.1, as it is not allowed by constraint ^0.1.7 from project github.com/kubernetes-sigs/kubebuilder-maestro.
    master: Could not introduce sigs.k8s.io/controller-runtime@master, as it is not allowed by constraint ^0.1.7 from project github.com/kubernetes-sigs/kubebuilder-maestro.
    admissionwebhook: Could not introduce sigs.k8s.io/controller-runtime@admissionwebhook, as it is not allowed by constraint ^0.1.7 from project github.com/kubernetes-sigs/kubebuilder-maestro.
    admissionwebhook-1.11: Could not introduce sigs.k8s.io/controller-runtime@admissionwebhook-1.11, as it is not allowed by constraint ^0.1.7 from project github.com/kubernetes-sigs/kubebuilder-maestro.
    bulk_deposit: Could not introduce sigs.k8s.io/controller-runtime@bulk_deposit, as it is not allowed by constraint ^0.1.7 from project github.com/kubernetes-sigs/kubebuilder-maestro.
    release-0.1: Could not introduce sigs.k8s.io/controller-runtime@release-0.1, as it is not allowed by constraint ^0.1.7 from project github.com/kubernetes-sigs/kubebuilder-maestro.
    review: Could not introduce sigs.k8s.io/controller-runtime@review, as it is not allowed by constraint ^0.1.7 from project github.com/kubernetes-sigs/kubebuilder-maestro.

Once complete, re-running dep ensure should update the client-go library.

Create maestroctl CLI command

Create the initial framework of a Maestro CLI command as a skeleton for future sub-commands that can be used to manipulate Maestro-specific CRDs or an API aggregation layer if one exists.
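
A minimal skeleton for such a command, assuming spf13/cobra as the CLI library (that choice is an assumption, not stated in the issue):

package main

import (
	"fmt"
	"os"

	"github.com/spf13/cobra"
)

func main() {
	rootCmd := &cobra.Command{
		Use:   "maestroctl",
		Short: "Command line client for Maestro frameworks, versions, and instances",
	}

	// Placeholder subcommand; real subcommands would manipulate the Maestro CRDs
	// or talk to an API aggregation layer if one exists.
	rootCmd.AddCommand(&cobra.Command{
		Use:   "version",
		Short: "Print the maestroctl version",
		Run: func(cmd *cobra.Command, args []string) {
			fmt.Println("maestroctl dev")
		},
	})

	if err := rootCmd.Execute(); err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
}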

Create Dockerfile and build instructions

Create a Dockerfile and Docker images for the released version. Building this image can be a manual process for the purposes of the KubeCon release; a separate issue will be created for release-process automation.

Support Dependent Instances

Allow FrameworkInstances to reference other FrameworkInstances to satisfy a dependency. For example, Kafka needs an instance of Zookeeper if zookeeper.url is not provided. A rough sketch of such a reference follows.
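
A sketch of what a dependency reference on the Instance spec could look like (all names here are assumptions for illustration, not the actual API):

type Dependency struct {
	// Name and Namespace of another Instance that must exist (and be healthy)
	// before this instance's deploy plan runs, e.g. a Zookeeper instance whose
	// address could then populate zookeeper.url for Kafka.
	Name      string `json:"name"`
	Namespace string `json:"namespace,omitempty"`
}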

Annotate configurable values for FrameworkVersion

To improve kubectl describe output, facilitate other tooling, and help users discover the configuration for a particular service, let's add descriptions to the FrameworkVersion's default parameter configuration values.

Changing parameters in Instance won't trigger actions

Expected Behavior

When changing parameters in an Instance, e.g. BROKERS_COUNT: "4" for Kafka or FLINK_TASKMANAGER_REPLICAS: "3" for Flink, I would expect Maestro to scale up or down accordingly.

Observed Behavior

Nothing really happens, though I see some (possibly unrelated) error messages as well as information about plans that could put us on the right path.

maestro-demo $ cat flink-instance.yaml 
apiVersion: maestro.k8s.io/v1alpha1
kind: Instance
metadata:
  labels:
    controller-tools.k8s.io: "1.0"
    framework: flink
  name: flink # this is the instance label which will lead the pod name
spec:
  frameworkVersion:
    name: flink-1.7
    namespace: default
    type: FrameworkVersion
maestro-demo $ kubectl apply -f flink-instance.yaml 
instance.maestro.k8s.io/flink created
maestro-demo $ kubectl get pods
NAME                                READY   STATUS    RESTARTS   AGE
flink-jobmanager-69cd6768b9-bjr2v   1/1     Running   0          58s
flink-taskmanager-d57b9c8bc-d49z2   1/1     Running   0          56s
flink-taskmanager-d57b9c8bc-ng8rv   1/1     Running   0          56s
maestro-demo $ cat flink-instance.yaml 
apiVersion: maestro.k8s.io/v1alpha1
kind: Instance
metadata:
  labels:
    controller-tools.k8s.io: "1.0"
    framework: flink
  name: flink # this is the instance label which will lead the pod name
spec:
  frameworkVersion:
    name: flink-1.7
    namespace: default
    type: FrameworkVersion
  parameters:
    FLINK_TASKMANAGER_REPLICAS: "3"
maestro-demo $ kubectl apply -f flink-instance.yaml 
instance.maestro.k8s.io/flink configured

I would expect now to have another taskmanager running, but instead I get:

2019/01/28 21:02:28 Error getting FrameworkVersion flink-1.7 for instance flink: FrameworkVersion.maestro.k8s.io "flink-1.7" not found

although the framework exists:

maestro-demo $ kubectl get frameworkversions
NAME        CREATED AT
flink-1.7   8m

I tested this with Kafka as well and see the same behavior (error message: 2019/01/28 21:28:14 Error getting FrameworkVersion zookeeper-1.0 for instance zk: FrameworkVersion.maestro.k8s.io "zookeeper-1.0" not found).

Full log

2019/01/28 21:02:28 Recieved create event for &{{Instance maestro.k8s.io/v1alpha1} {flink  default /apis/maestro.k8s.io/v1alpha1/namespaces/default/instances/flink 56e2f7a8-2382-11e9-822f-42010a800154 2949 1 2019-01-28 20:57:12 -0800 PST <nil> <nil> map[framework:flink controller-tools.k8s.io:1.0] map[kubectl.kubernetes.io/last-applied-configuration:{"apiVersion":"maestro.k8s.io/v1alpha1","kind":"Instance","metadata":{"annotations":{},"labels":{"controller-tools.k8s.io":"1.0","framework":"flink"},"name":"flink","namespace":"default"},"spec":{"frameworkVersion":{"name":"flink-1.7","namespace":"default","type":"FrameworkVersion"},"parameters":{"FLINK_TASKMANAGER_REPLICAS":"3"}}}
] [] nil [] } {{ default flink-1.7    } [] map[FLINK_TASKMANAGER_REPLICAS:3]} {{ default flink-deploy-997290000 f4bf9ccc-2382-11e9-822f-42010a800154   } COMPLETE}}
2019/01/28 21:02:28 Error getting FrameworkVersion flink-1.7 for instance flink: FrameworkVersion.maestro.k8s.io "flink-1.7" not found
2019/01/28 21:02:28 Adding flink-deploy-997290000 to reconcile
2019/01/28 21:02:28 Adding flink-deploy-997290000 to reconcile
{"level":"info","ts":1548738148.9631062,"logger":"kubebuilder.controller","caller":"controller/controller.go:134","msg":"Starting Controller","Controller":"instance-controller"}
{"level":"info","ts":1548738148.963074,"logger":"kubebuilder.controller","caller":"controller/controller.go:134","msg":"Starting Controller","Controller":"planexecution-controller"}
{"level":"info","ts":1548738148.9630651,"logger":"kubebuilder.controller","caller":"controller/controller.go:134","msg":"Starting Controller","Controller":"framework-controller"}
{"level":"info","ts":1548738148.9630919,"logger":"kubebuilder.controller","caller":"controller/controller.go:134","msg":"Starting Controller","Controller":"frameworkversion-controller"}
{"level":"info","ts":1548738149.0641012,"logger":"kubebuilder.controller","caller":"controller/controller.go:153","msg":"Starting workers","Controller":"planexecution-controller","WorkerCount":1}
2019/01/28 21:02:29 PlanExecution flink-deploy-243510000 has already run to completion, not processing.
2019/01/28 21:02:29 PlanExecution flink-deploy-997290000 has already run to completion, not processing.
{"level":"info","ts":1548738149.0677109,"logger":"kubebuilder.controller","caller":"controller/controller.go:153","msg":"Starting workers","Controller":"instance-controller","WorkerCount":1}
{"level":"info","ts":1548738149.067785,"logger":"kubebuilder.controller","caller":"controller/controller.go:153","msg":"Starting workers","Controller":"frameworkversion-controller","WorkerCount":1}
{"level":"info","ts":1548738149.0677068,"logger":"kubebuilder.controller","caller":"controller/controller.go:153","msg":"Starting workers","Controller":"framework-controller","WorkerCount":1}
2019/01/28 21:02:29 FrameworkController: Recieved Reconcile request for flink
2019/01/28 21:02:29 FrameworkVersionController: Recieved Reconcile request for flink-1.7
2019/01/28 21:02:40 InstanceController: UpdateInstance: Going to call plan deploy
2019/01/28 21:02:40 Current Plan for Instance is already done, wont change the Suspend flag
2019/01/28 21:02:40 InstanceController: Recieved Reconcile request for flink
2019/01/28 21:02:40 Old and new spec matched...
2019/01/28 21:02:40 InstanceController: UpdateInstance: Going to call plan 
2019/01/28 21:02:40 InstanceController: Recieved Reconcile request for flink
2019/01/28 21:02:40 Phase 0 Step 0 has 3 objects
2019/01/28 21:02:40 CreateOrUpdate resulted in: unchanged
2019/01/28 21:02:40 Unkonwn type is marked healthy by default
2019/01/28 21:02:40 CreateOrUpdate resulted in: unchanged
2019/01/28 21:02:40 Unkonwn type is marked healthy by default
2019/01/28 21:02:40 CreateOrUpdate resulted in: unchanged
2019/01/28 21:02:40 Unkonwn type is marked healthy by default
2019/01/28 21:02:40 Phase flink has strategy serial
2019/01/28 21:02:40 Phase flink marked as serial
2019/01/28 21:02:40 Step jobmanager is healthy, so I can continue on
2019/01/28 21:02:40 Step jobmanager looked at
2019/01/28 21:02:40 Phase flink is healthy
2019/01/28 21:02:40 Phase flink marked as healthy
2019/01/28 21:02:40 Phase flink is healthy
2019/01/28 21:02:40 PlanExecution flink-deploy-28047000 has already run to completion, not processing.

Kustomize Label Additions prevent services from finding Deployments

Kustomize provides a feature that automatically adds labels to the service object and into its selector spec. This typically provides consistency across created objects, but it can cause issues when pods are created from a different STEP than the service:

Consider the following:

 plans:
    deploy:
      strategy: serial
      phases:
        - name: all
          strategy: parallel
          steps:
            - name: services
              tasks:
              - services
            - name: deployment
              tasks:
              - deployment

The service object

    service.yaml: |
      apiVersion: v1
      kind: Service
      metadata:
        name: appinfo
        labels:
          app: appinfo
      spec:
        ports:
        - port: 8080
          protocol: TCP
        selector:
          app: appinfo
        type: LoadBalancer

This service ends up with the following step label in its selector:

$ kubectl get svc bird-appinfo -o jsonpath="{.spec.selector['step']}"
services

whereas the deployment gets created with the following label:

$ kubectl get deployments bird-appinfo -o jsonpath="{ .metadata.labels['step']}"
deployment

The step-specific label was added to differentiate Jobs that need to be re-created even when they have the same name.

Naming Conventions

We currently have Frameworks, FrameworkVersions, and Instances. Should Instances be renamed to FrameworkInstances to better indicate that they are part of this set of CRDs?

Kubernetes Objects should be Autogenerated from ServiceSpec

We currently hardcode a set of Kubernetes objects and default values. These should instead be pulled from the following files in the Universe:

  1. svc.yml - This should define most of the Kubernetes objects. The Plan section in this file should dictate the operations as the Instance moves through its lifecycle
  2. config.json - Shows the available parameters for a FrameworkVersion
  3. marathon.json.mustache - Tracks the conversions of parameters to environment variables
  4. resource.json - Tracks binaries required for creation of Docker Image used for FrameworkVersion
  5. package.json - Tracks metadata about the FrameworkVersion
  6. mustache files in src/main/dist/ for the framework. These should map to ConfigMap objects

Are there any other files that are used as input into a FrameworkVersion?

Error Message when framework/frameworkversion does not exist needs to be more descriptive

Right now, when we install an instance of framework1 as part of another framework0 without having both the framework and frameworkversion properly installed, the error message does little to point us in the right direction:

Error getting PlaneExecution /: PlanExecution.maestro.k8s.io "" not found

This can be easily reproduced by trying to install the flink-demo that has zookeeper as a dependency without having the framework/frameworkversion of zookeeper installed.

It would be more intuitive for the user to see an error message saying that the framework or frameworkversion for that instance wasn't found.
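
A sketch of what the lookup could return instead; the field names on the Instance spec are assumptions based on the instance YAML shown earlier, not verified against the actual types:

import (
	"context"
	"fmt"

	apierrors "k8s.io/apimachinery/pkg/api/errors"
	"k8s.io/apimachinery/pkg/types"
	"sigs.k8s.io/controller-runtime/pkg/client"

	maestrov1alpha1 "github.com/maestrosdk/maestro/pkg/apis/maestro/v1alpha1"
)

// getFrameworkVersion resolves the FrameworkVersion referenced by an Instance and,
// when it is missing, returns an error that names exactly what is not installed.
func getFrameworkVersion(c client.Client, instance *maestrov1alpha1.Instance) (*maestrov1alpha1.FrameworkVersion, error) {
	fv := &maestrov1alpha1.FrameworkVersion{}
	key := types.NamespacedName{
		Name:      instance.Spec.FrameworkVersion.Name,      // assumed field names
		Namespace: instance.Spec.FrameworkVersion.Namespace, // assumed field names
	}
	if err := c.Get(context.TODO(), key, fv); err != nil {
		if apierrors.IsNotFound(err) {
			return nil, fmt.Errorf("instance %s/%s references FrameworkVersion %s/%s, which is not installed; install the framework and frameworkversion first",
				instance.Namespace, instance.Name, key.Namespace, key.Name)
		}
		return nil, err
	}
	return fv, nil
}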

The longer error message:

2019/01/22 12:04:20 Obj is NOT healthy: &{{ } {demo-zk  default /apis/maestro.k8s.io/v1alpha1/namespaces/default/instances/demo-zk e73c3016-1e80-11e9-b99e-08002788f190 2444 1 2019-01-22 12:04:19 -0800 PST <nil> <nil> map[heritage:maestro phase:dependencies instance:demo plan:deploy step:zookeeper version: framework:zookeeper app:flink-financial-demo controller-tools.k8s.io:1.0 planexecution:demo-deploy-109789000] map[] [{maestro.k8s.io/v1alpha1 Instance demo e6baa583-1e80-11e9-b99e-08002788f190 0xc000daa7ac 0xc000daa7ad}] nil [] } {{ default zookeeper-1.0    } [] map[ZOOKEEPER_CPUS:0.3]} {{      } }}
2019/01/22 12:04:20 Phase dependencies has strategy serial
2019/01/22 12:04:20 Phase dependencies marked as serial
2019/01/22 12:04:20 Step zookeeper isn't complete, skipping rest of steps in phase until it is
2019/01/22 12:04:20 Phase dependencies is not healthy b/c step zookeeper is not healthy
2019/01/22 12:04:20 Phase dependencies not healthy, and plan marked as serial, so breaking.
2019/01/22 12:04:20 Phase dependencies is not healthy b/c step zookeeper is not healthy

We actually emit that error message in another place already. When we try to install just zookeeper via kubectl apply -f zookeeper-instance.yaml without having the framework/frameworkversion installed as a prerequisite, we end up with a more descriptive message:

2019/01/22 12:07:01 Error getting FrameworkVersion zookeeper-1.0 for instance zk: FrameworkVersion.maestro.k8s.io "zookeeper-1.0" not found

Long error message:

2019/01/22 12:06:50 Could not find planExecution demo-deploy-109789000: PlanExecution.maestro.k8s.io "demo-deploy-109789000" not found
2019/01/22 12:06:50 Error getting instance object: Instance.maestro.k8s.io "demo" not found
2019/01/22 12:07:01 Recieved create event for &{{ } {zk  default /apis/maestro.k8s.io/v1alpha1/namespaces/default/instances/zk 47b56197-1e81-11e9-b99e-08002788f190 2629 1 2019-01-22 12:07:01 -0800 PST <nil> <nil> map[framework:zookeeper controller-tools.k8s.io:1.0] map[kubectl.kubernetes.io/last-applied-configuration:{"apiVersion":"maestro.k8s.io/v1alpha1","kind":"Instance","metadata":{"annotations":{},"labels":{"controller-tools.k8s.io":"1.0","framework":"zookeeper"},"name":"zk","namespace":"default"},"spec":{"frameworkVersion":{"name":"zookeeper-1.0","namespace":"default","type":"FrameworkVersions"},"name":"zk","parameters":{"ZOOKEEPER_CPUS":"0.3"}}}
] [] nil [] } {{ default zookeeper-1.0    } [] map[ZOOKEEPER_CPUS:0.3]} {{      } }}
2019/01/22 12:07:01 Error getting FrameworkVersion zookeeper-1.0 for instance zk: FrameworkVersion.maestro.k8s.io "zookeeper-1.0" not found


Implement binary release process

The binary release process should publish binaries to GitHub releases. This will make it easy for users to get the operator and maestroctl binaries.

Check Deployment Health

Currently stubbed out:

func IsHealthy(c client.Client, obj runtime.Object) error {

	switch obj.(type) {
	case *appsv1.Deployment:
		d := obj.(*appsv1.Deployment)
		log.Printf("Deployment %v is marked healthy\n", d.Name)
		return nil

Will look at: checking that the number of ready pods in the Status matches the number of Replicas in the spec.
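
A minimal sketch of that check (it deliberately ignores paused deployments and generation skew):

import (
	"fmt"
	"log"

	appsv1 "k8s.io/api/apps/v1"
)

// deploymentIsHealthy reports an error until the ready replicas in Status match
// Spec.Replicas.
func deploymentIsHealthy(d *appsv1.Deployment) error {
	want := int32(1) // Kubernetes defaults spec.replicas to 1 when unset
	if d.Spec.Replicas != nil {
		want = *d.Spec.Replicas
	}
	if d.Status.ReadyReplicas < want {
		return fmt.Errorf("deployment %s/%s not healthy: %d of %d replicas ready",
			d.Namespace, d.Name, d.Status.ReadyReplicas, want)
	}
	log.Printf("Deployment %v is marked healthy (%d/%d replicas ready)", d.Name, d.Status.ReadyReplicas, want)
	return nil
}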

make run fails

$ make run
go generate ./pkg/... ./cmd/...
go fmt ./pkg/... ./cmd/...
go vet ./pkg/... ./cmd/...
go run ./cmd/manager/main.go
# github.com/maestrosdk/maestro/vendor/k8s.io/client-go/transport
vendor/k8s.io/client-go/transport/round_trippers.go:437:9: undefined: strings.Builder
# github.com/maestrosdk/maestro/vendor/sigs.k8s.io/kustomize/pkg/target
vendor/sigs.k8s.io/kustomize/pkg/target/kusttarget.go:89:5: dec.DisallowUnknownFields undefined (type *json.Decoder has no field or method DisallowUnknownFields)
make: *** [run] Error 2

Non-Preemptable Plans

Some plans should not be interrupted. For example, a database backup should run to completion before another plan starts.

Add a plan parameter to mark a plan as non-preemptable; by default, plans will be preemptable. A sketch follows.
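
A sketch of the proposed flag (type and field names are assumptions, not the actual maestro API):

type Plan struct {
	// Strategy and Phases as they exist today, omitted here for brevity.

	// NotPreemptable, when true, prevents the controller from suspending this
	// plan (e.g. a running backup) in favor of a newly requested one.
	// Unset/false keeps the default behavior: the plan is preemptable.
	NotPreemptable bool `json:"notPreemptable,omitempty"`
}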

make install doesn't work

$ make install
go run vendor/sigs.k8s.io/controller-tools/cmd/controller-gen/main.go all
Breaking recursion for type github.com/maestrosdk/maestro/pkg/apis/maestro/v1alpha1.FrameworkVersion
CRD manifests generated under '/Users/djannot/Documents/go/src/github.com/maestrosdk/maestro/config/crds'
RBAC manifests generated under '/Users/djannot/Documents/go/src/github.com/maestrosdk/maestro/config/rbac' 
kubectl apply -f config/crds
customresourcedefinition "frameworks.maestro.k8s.io" configured
error validating "config/crds/maestro_v1alpha1_frameworkversion.yaml": error validating data: [ValidationError(CustomResourceDefinition.spec.validation.openAPIV3Schema.properties.spec.properties.dependencies.items): invalid type for io.k8s.apiextensions-apiserver.pkg.apis.apiextensions.v1beta1.JSONSchemaPropsOrArray: got "map", expected "", ValidationError(CustomResourceDefinition.spec.validation.openAPIV3Schema.properties.spec.properties.parameters.items): invalid type for io.k8s.apiextensions-apiserver.pkg.apis.apiextensions.v1beta1.JSONSchemaPropsOrArray: got "map", expected "", ValidationError(CustomResourceDefinition.spec.validation.openAPIV3Schema.properties.spec.properties.upgradableFrom.items): invalid type for io.k8s.apiextensions-apiserver.pkg.apis.apiextensions.v1beta1.JSONSchemaPropsOrArray: got "map", expected ""]; if you choose to ignore these errors, turn validation off with --validate=false
error validating "config/crds/maestro_v1alpha1_instance.yaml": error validating data: ValidationError(CustomResourceDefinition.spec.validation.openAPIV3Schema.properties.spec.properties.dependencies.items): invalid type for io.k8s.apiextensions-apiserver.pkg.apis.apiextensions.v1beta1.JSONSchemaPropsOrArray: got "map", expected ""; if you choose to ignore these errors, turn validation off with --validate=false
error validating "config/crds/maestro_v1alpha1_planexecution.yaml": error validating data: ValidationError(CustomResourceDefinition.spec.validation.openAPIV3Schema.properties.status.properties.phases.items): invalid type for io.k8s.apiextensions-apiserver.pkg.apis.apiextensions.v1beta1.JSONSchemaPropsOrArray: got "map", expected ""; if you choose to ignore these errors, turn validation off with --validate=false
make: *** [install] Error 1

Zookeeper Example on Minikube missing Memory Requirements

Running the zookeeper example on minikube started with the default memory options results in:

kubectl get pod
NAME      READY   STATUS    RESTARTS   AGE
zk-zk-0   1/1     Running   0          7m7s
zk-zk-1   0/1     Pending   0          7m7s
zk-zk-2   0/1     Pending   0          7m7s
Events:
  Type     Reason            Age                  From               Message
  ----     ------            ----                 ----               -------
  Warning  FailedScheduling  105s (x5 over 106s)  default-scheduler  pod has unbound immediate PersistentVolumeClaims
  Warning  FailedScheduling  97s (x20 over 105s)  default-scheduler  0/1 nodes are available: 1 Insufficient memory.

Suggestion: document the memory requirements for the minikube environment, for example minikube start --memory 4096.

Kafka Framework fails when running on a GKE cluster

The current version of Kafka we use in our framework won't work with GKE:

small-kafka-0   0/1       Pending   0         0s
small-kafka-0   0/1       Pending   0         0s
small-kafka-0   0/1       Pending   0         3s
small-kafka-0   0/1       ContainerCreating   0         3s
small-kafka-0   0/1       Running   0         15s
small-kafka-0   0/1       Error     0         17s
small-kafka-0   0/1       Running   1         18s
small-kafka-0   0/1       Error     1         20s
small-kafka-0   0/1       CrashLoopBackOff   1         33s
small-kafka-0   0/1       Running   2         34s
small-kafka-0   0/1       Error     2         36s

The exact error message is:

[2019-01-23 01:16:07,655] INFO Loading logs. (kafka.log.LogManager)
[2019-01-23 01:16:07,663] ERROR There was an error in one of the threads during logs loading: kafka.common.KafkaException: Found directory /var/lib/kafka/lost+found, 'lost+found' is not in the form of topic-partition
If a directory does not contain Kafka topic data it should not exist in Kafka's log directory (kafka.log.LogManager)
[2019-01-23 01:16:07,664] FATAL [Kafka Server 0], Fatal error during KafkaServer startup. Prepare to shutdown (kafka.server.KafkaServer)
kafka.common.KafkaException: Found directory /var/lib/kafka/lost+found, 'lost+found' is not in the form of topic-partition
If a directory does not contain Kafka topic data it should not exist in Kafka's log directory
	at kafka.log.Log$.exception$1(Log.scala:1131)
	at kafka.log.Log$.parseTopicPartitionName(Log.scala:1139)
	at kafka.log.LogManager$$anonfun$loadLogs$2$$anonfun$3$$anonfun$apply$10$$anonfun$apply$1.apply$mcV$sp(LogManager.scala:153)
	at kafka.utils.CoreUtils$$anon$1.run(CoreUtils.scala:57)
	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
	at java.lang.Thread.run(Thread.java:745)
[2019-01-23 01:16:07,667] INFO [Kafka Server 0], shutting down (kafka.server.KafkaServer)
[2019-01-23 01:16:07,670] INFO Terminate ZkClient event thread. (org.I0Itec.zkclient.ZkEventThread)
[2019-01-23 01:16:07,677] INFO EventThread shut down for session: 0x268784681130001 (org.apache.zookeeper.ClientCnxn)
[2019-01-23 01:16:07,677] INFO Session: 0x268784681130001 closed (org.apache.zookeeper.ZooKeeper)
[2019-01-23 01:16:07,681] INFO [Kafka Server 0], shut down completed (kafka.server.KafkaServer)
[2019-01-23 01:16:07,682] FATAL Fatal error during KafkaServerStartable startup. Prepare to shutdown (kafka.server.KafkaServerStartable)
kafka.common.KafkaException: Found directory /var/lib/kafka/lost+found, 'lost+found' is not in the form of topic-partition
If a directory does not contain Kafka topic data it should not exist in Kafka's log directory
	at kafka.log.Log$.exception$1(Log.scala:1131)
	at kafka.log.Log$.parseTopicPartitionName(Log.scala:1139)
	at kafka.log.LogManager$$anonfun$loadLogs$2$$anonfun$3$$anonfun$apply$10$$anonfun$apply$1.apply$mcV$sp(LogManager.scala:153)
	at kafka.utils.CoreUtils$$anon$1.run(CoreUtils.scala:57)
	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
	at java.lang.Thread.run(Thread.java:745)
[2019-01-23 01:16:07,685] INFO [Kafka Server 0], shutting down (kafka.server.KafkaServer)

This is a known problem; see e.g. vmware-archive/kubeless#460

The only workarounds I see are:

  1. A clean-up step, as also stated in the referenced kubeless issue above
  2. Using confluentinc/cp-kafka, which is used by https://github.com/helm/charts/tree/master/incubator/kafka

See also Slack conversation: https://kubernetes.slack.com/archives/C09NXKJKA/p1548207090459900

Kudo is not cleaning up PersistentVolumeClaims after an instance was deleted

Expected Behavior

When uninstalling an instance (e.g. Kafka), I would expect all created artifacts to be cleaned up/garbage collected, so that the next time I install an instance with the same name I start from scratch again.

Observed Behavior

It looks like PVCs are not being cleaned up after a delete.

maestro-demo $ kubectl get pvc
NAME                    STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS   AGE
datadir-small-kafka-0   Bound    pvc-a028c4a1-2384-11e9-822f-42010a800154   1Gi        RWO            standard       24m
datadir-small-kafka-1   Bound    pvc-38f26e09-2386-11e9-822f-42010a800154   1Gi        RWO            standard       12m
datadir-small-kafka-2   Bound    pvc-4b97632c-2386-11e9-822f-42010a800154   1Gi        RWO            standard       12m
datadir-zk-zk-0         Bound    pvc-9fcc977f-2385-11e9-822f-42010a800154   2Gi        RWO            standard       17m
datadir-zk-zk-1         Bound    pvc-9fd0feed-2385-11e9-822f-42010a800154   2Gi        RWO            standard       17m
datadir-zk-zk-2         Bound    pvc-9fd7d9f9-2385-11e9-822f-42010a800154   2Gi        RWO            standard       17m

This leads to behavior where, when you install a frameworkversion (e.g. flink-financial-demo) and then delete it, the data still persists, in this case in Kafka. When re-installing the demo, Kafka shows the data from the previously deleted instance in its logs, and the actor in the demo immediately displays detected fraud entries that are relics of the past.
