kudobuilder / kudo
Kubernetes Universal Declarative Operator (KUDO)
Home Page: https://kudo.dev
License: Apache License 2.0
What is the best way to expose?
In the plan controller, the line:
result, err := controllerutil.CreateOrUpdate(context.TODO(), r.Client, obj, func(runtime.Object) error { return nil })
needs to be fixed. The last argument of this function is supposed to capture the modifications to the object pulled from the server. It needs to be replaced with something like this, from instance_controller.go:
did, err := controllerutil.CreateOrUpdate(context.TODO(), mgr.GetClient(), current, func(o runtime.Object) error {
    t := true
    o.(*maestrov1alpha1.PlanExecution).Spec.Suspend = &t
    return nil
})
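For illustration, a sketch of what the fixed call in the plan controller could look like for one concrete object type. This is an assumption-laden sketch, not code from the repository: it assumes the rendered object is an *appsv1.Deployment, that the controller-runtime v0.1.x MutateFn signature (func(existing runtime.Object) error) is in use as in the snippet above, and that r.Client and log are in scope.

// Sketch only: capture the desired state before CreateOrUpdate overwrites `obj`
// with what it fetches from the server, then copy it back in the mutate function.
desired := obj.(*appsv1.Deployment).DeepCopy()
result, err := controllerutil.CreateOrUpdate(context.TODO(), r.Client, obj, func(existing runtime.Object) error {
    cur := existing.(*appsv1.Deployment)
    cur.Spec = desired.Spec     // apply the spec rendered from the plan templates
    cur.Labels = desired.Labels // keep labels such as `step` and `planexecution` in sync
    return nil
})
if err != nil {
    return reconcile.Result{}, err
}
log.Printf("CreateOrUpdate resulted in: %v", result)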
For some plans, it may make sense to run against an instance with parameter values different from those specified by the Instance or FrameworkVersion, e.g. backup/restore. An example spec might look like:
apiVersion: maestro.k8s.io/v1alpha1
kind: PlanExecution
metadata:
  name: small-backup
  namespace: default
  ownerReferences:
  - apiVersion: maestro.k8s.io/v1alpha1
    blockOwnerDeletion: true
    controller: true
    kind: Instance
    name: small
    uid: a1fc8f64-fa54-11e8-8673-08002795d782
spec:
  instance:
    kind: Instance
    name: small
    namespace: default
  planName: backup
  arguments:
    BACKUP_LOCATION: s3://backup-bucket/data.sql
Some parameters should not be overridable, and we'd want to include that in the parameter definition in the FrameworkVersion.
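A hypothetical sketch of how this could be expressed in the FrameworkVersion parameter definition, mirroring the parameter structure shown later in these issues; the overridable field is an assumption, not an existing field:

parameters:
  - name: BROKER_COUNT
    description: The number of kafka brokers that should be started
    required: false
    default: 3
    overridable: false   # hypothetical flag: a PlanExecution's arguments could not override this value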
Show examples of how to implement Canary or Blue/green deployments.
Some plans are triggered by specific changes in state (e.g. a version update means calling the upgrade plan).
For other plans (e.g. createTopic), a mechanism needs to be in place to explicitly invoke them.
TODO
If a task definition doesn't contain appropriate data:

restore:
  resource:
  - restore.yaml

instead of

restore:
  resources:
  - restore.yaml

Maestro doesn't alert the user of the incorrect data.
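One way to surface this kind of typo, sketched here under the assumption that task definitions are unmarshalled from YAML into a Go struct, is strict decoding that rejects unknown fields. The TaskSpec type is illustrative, not Maestro's actual type:

package main

import (
    "fmt"
    "log"

    yaml "gopkg.in/yaml.v2"
)

// TaskSpec is an illustrative stand-in for however Maestro models a task.
type TaskSpec struct {
    Resources []string `yaml:"resources"`
}

func main() {
    data := []byte("resource:\n- restore.yaml\n") // note the typo: "resource" instead of "resources"

    var task TaskSpec
    // UnmarshalStrict fails on fields that don't map to the struct, so the
    // typo is reported instead of silently producing an empty task.
    if err := yaml.UnmarshalStrict(data, &task); err != nil {
        log.Fatalf("invalid task definition: %v", err)
    }
    fmt.Println(task.Resources)
}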
Currently requires Kubebuilder to support Kubernetes 1.12. kubernetes-sigs/cluster-api#522
When trying to update to Kubernetes 1.12, we get the following error:
Solving failure: No versions of sigs.k8s.io/controller-runtime met constraints:
v0.1.7: Could not introduce sigs.k8s.io/controller-runtime@v0.1.7, as it has a dependency on k8s.io/client-go with constraint kubernetes-1.11.2, which has no overlap with existing constraint ^9.0.0 from (root)
v0.1.6: Could not introduce sigs.k8s.io/controller-runtime@v0.1.6, as it is not allowed by constraint ^0.1.7 from project github.com/kubernetes-sigs/kubebuilder-maestro.
v0.1.5: Could not introduce sigs.k8s.io/controller-runtime@v0.1.5, as it is not allowed by constraint ^0.1.7 from project github.com/kubernetes-sigs/kubebuilder-maestro.
v0.1.4: Could not introduce sigs.k8s.io/controller-runtime@v0.1.4, as it is not allowed by constraint ^0.1.7 from project github.com/kubernetes-sigs/kubebuilder-maestro.
v0.1.3: Could not introduce sigs.k8s.io/controller-runtime@v0.1.3, as it is not allowed by constraint ^0.1.7 from project github.com/kubernetes-sigs/kubebuilder-maestro.
v0.1.2: Could not introduce sigs.k8s.io/controller-runtime@v0.1.2, as it is not allowed by constraint ^0.1.7 from project github.com/kubernetes-sigs/kubebuilder-maestro.
v0.1.1: Could not introduce sigs.k8s.io/controller-runtime@v0.1.1, as it is not allowed by constraint ^0.1.7 from project github.com/kubernetes-sigs/kubebuilder-maestro.
master: Could not introduce sigs.k8s.io/controller-runtime@master, as it is not allowed by constraint ^0.1.7 from project github.com/kubernetes-sigs/kubebuilder-maestro.
admissionwebhook: Could not introduce sigs.k8s.io/controller-runtime@admissionwebhook, as it is not allowed by constraint ^0.1.7 from project github.com/kubernetes-sigs/kubebuilder-maestro.
admissionwebhook-1.11: Could not introduce sigs.k8s.io/controller-runtime@admissionwebhook-1.11, as it is not allowed by constraint ^0.1.7 from project github.com/kubernetes-sigs/kubebuilder-maestro.
bulk_deposit: Could not introduce sigs.k8s.io/controller-runtime@bulk_deposit, as it is not allowed by constraint ^0.1.7 from project github.com/kubernetes-sigs/kubebuilder-maestro.
release-0.1: Could not introduce sigs.k8s.io/controller-runtime@release-0.1, as it is not allowed by constraint ^0.1.7 from project github.com/kubernetes-sigs/kubebuilder-maestro.
review: Could not introduce sigs.k8s.io/controller-runtime@review, as it is not allowed by constraint ^0.1.7 from project github.com/kubernetes-sigs/kubebuilder-maestro.
Once complete, re-running dep ensure should update the client-go library.
finalizer name is frameworks.maestro.io
Create the initial framework of a Maestro CLI command as a skeleton for future sub-commands that can be used to manipulate Maestro-specific CRDs or an API aggregation layer if one exists.
We should join the kubectl maestro plugin and the maestroctl Go binary.
Create a Dockerfile and Docker images for the released version. This can be a manual process for this image for the purposes of a KubeCon release. A separate issue will be created for release-process automation.
If we have an instance referencing a Framework Version, and that FV gets updated, how should we handle changes to the instance?
Allow for FrameworkInstances to reference other FrameworkInstances to satisfy a dependency. For example, Kafka needs an instance of Zookeeper if the zookeeper.url is not provided.
To help with kubectl describe output and facilitate other tooling, as well as help users discover configuration for a particular service, let's add descriptions to the FrameworkVersion default parameter configuration values.
To reproduce, start with a vanilla cluster, apply all CRDs, and then start the controller.
We'd like to be able to type things like:

kubectl maestro plan show deploy --instance=kafka

and get a plan output.
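For illustration only, a hypothetical sketch of what such output might look like, loosely based on the plan/phase/step structure used elsewhere in these issues; the columns and wording are assumptions, not an existing CLI contract:

Plan deploy (strategy: serial)        COMPLETE
  Phase deploy-kafka (serial)         COMPLETE
    Step zookeeper                    COMPLETE
    Step brokers                      COMPLETE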
When changing parameters in an Instance, e.g. BROKERS_COUNT: "4" for Kafka or FLINK_TASKMANAGER_REPLICAS: "3" for Flink, I would expect Maestro to scale up/down accordingly.
Nothing really happens, though I see some (possibly unrelated) error messages as well as information about plans that could lead us in the right direction.
maestro-demo $ cat flink-instance.yaml
apiVersion: maestro.k8s.io/v1alpha1
kind: Instance
metadata:
  labels:
    controller-tools.k8s.io: "1.0"
    framework: flink
  name: flink # this is the instance label which will lead the pod name
spec:
  frameworkVersion:
    name: flink-1.7
    namespace: default
    type: FrameworkVersion
maestro-demo $ kubectl apply -f flink-instance.yaml
instance.maestro.k8s.io/flink created
maestro-demo $ kubectl get pods
NAME READY STATUS RESTARTS AGE
flink-jobmanager-69cd6768b9-bjr2v 1/1 Running 0 58s
flink-taskmanager-d57b9c8bc-d49z2 1/1 Running 0 56s
flink-taskmanager-d57b9c8bc-ng8rv 1/1 Running 0 56s
maestro-demo $ cat flink-instance.yaml
apiVersion: maestro.k8s.io/v1alpha1
kind: Instance
metadata:
  labels:
    controller-tools.k8s.io: "1.0"
    framework: flink
  name: flink # this is the instance label which will lead the pod name
spec:
  frameworkVersion:
    name: flink-1.7
    namespace: default
    type: FrameworkVersion
  parameters:
    FLINK_TASKMANAGER_REPLICAS: "3"
maestro-demo $ kubectl apply -f flink-instance.yaml
instance.maestro.k8s.io/flink configured
I would expect now to have another taskmanager running, but instead I get:
2019/01/28 21:02:28 Error getting FrameworkVersion flink-1.7 for instance flink: FrameworkVersion.maestro.k8s.io "flink-1.7" not found
although the framework exists:
maestro-demo $ kubectl get frameworkversions
NAME CREATED AT
flink-1.7 8m
I tested this as well with Kafka, where I see the same behavior (error message: 2019/01/28 21:28:14 Error getting FrameworkVersion zookeeper-1.0 for instance zk: FrameworkVersion.maestro.k8s.io "zookeeper-1.0" not found).
2019/01/28 21:02:28 Recieved create event for &{{Instance maestro.k8s.io/v1alpha1} {flink default /apis/maestro.k8s.io/v1alpha1/namespaces/default/instances/flink 56e2f7a8-2382-11e9-822f-42010a800154 2949 1 2019-01-28 20:57:12 -0800 PST <nil> <nil> map[framework:flink controller-tools.k8s.io:1.0] map[kubectl.kubernetes.io/last-applied-configuration:{"apiVersion":"maestro.k8s.io/v1alpha1","kind":"Instance","metadata":{"annotations":{},"labels":{"controller-tools.k8s.io":"1.0","framework":"flink"},"name":"flink","namespace":"default"},"spec":{"frameworkVersion":{"name":"flink-1.7","namespace":"default","type":"FrameworkVersion"},"parameters":{"FLINK_TASKMANAGER_REPLICAS":"3"}}}
] [] nil [] } {{ default flink-1.7 } [] map[FLINK_TASKMANAGER_REPLICAS:3]} {{ default flink-deploy-997290000 f4bf9ccc-2382-11e9-822f-42010a800154 } COMPLETE}}
2019/01/28 21:02:28 Error getting FrameworkVersion flink-1.7 for instance flink: FrameworkVersion.maestro.k8s.io "flink-1.7" not found
2019/01/28 21:02:28 Adding flink-deploy-997290000 to reconcile
2019/01/28 21:02:28 Adding flink-deploy-997290000 to reconcile
{"level":"info","ts":1548738148.9631062,"logger":"kubebuilder.controller","caller":"controller/controller.go:134","msg":"Starting Controller","Controller":"instance-controller"}
{"level":"info","ts":1548738148.963074,"logger":"kubebuilder.controller","caller":"controller/controller.go:134","msg":"Starting Controller","Controller":"planexecution-controller"}
{"level":"info","ts":1548738148.9630651,"logger":"kubebuilder.controller","caller":"controller/controller.go:134","msg":"Starting Controller","Controller":"framework-controller"}
{"level":"info","ts":1548738148.9630919,"logger":"kubebuilder.controller","caller":"controller/controller.go:134","msg":"Starting Controller","Controller":"frameworkversion-controller"}
{"level":"info","ts":1548738149.0641012,"logger":"kubebuilder.controller","caller":"controller/controller.go:153","msg":"Starting workers","Controller":"planexecution-controller","WorkerCount":1}
2019/01/28 21:02:29 PlanExecution flink-deploy-243510000 has already run to completion, not processing.
2019/01/28 21:02:29 PlanExecution flink-deploy-997290000 has already run to completion, not processing.
{"level":"info","ts":1548738149.0677109,"logger":"kubebuilder.controller","caller":"controller/controller.go:153","msg":"Starting workers","Controller":"instance-controller","WorkerCount":1}
{"level":"info","ts":1548738149.067785,"logger":"kubebuilder.controller","caller":"controller/controller.go:153","msg":"Starting workers","Controller":"frameworkversion-controller","WorkerCount":1}
{"level":"info","ts":1548738149.0677068,"logger":"kubebuilder.controller","caller":"controller/controller.go:153","msg":"Starting workers","Controller":"framework-controller","WorkerCount":1}
2019/01/28 21:02:29 FrameworkController: Recieved Reconcile request for flink
2019/01/28 21:02:29 FrameworkVersionController: Recieved Reconcile request for flink-1.7
2019/01/28 21:02:40 InstanceController: UpdateInstance: Going to call plan deploy
2019/01/28 21:02:40 Current Plan for Instance is already done, wont change the Suspend flag
2019/01/28 21:02:40 InstanceController: Recieved Reconcile request for flink
2019/01/28 21:02:40 Old and new spec matched...
2019/01/28 21:02:40 InstanceController: UpdateInstance: Going to call plan
2019/01/28 21:02:40 InstanceController: Recieved Reconcile request for flink
2019/01/28 21:02:40 Phase 0 Step 0 has 3 objects
2019/01/28 21:02:40 CreateOrUpdate resulted in: unchanged
2019/01/28 21:02:40 Unkonwn type is marked healthy by default
2019/01/28 21:02:40 CreateOrUpdate resulted in: unchanged
2019/01/28 21:02:40 Unkonwn type is marked healthy by default
2019/01/28 21:02:40 CreateOrUpdate resulted in: unchanged
2019/01/28 21:02:40 Unkonwn type is marked healthy by default
2019/01/28 21:02:40 Phase flink has strategy serial
2019/01/28 21:02:40 Phase flink marked as serial
2019/01/28 21:02:40 Step jobmanager is healthy, so I can continue on
2019/01/28 21:02:40 Step jobmanager looked at
2019/01/28 21:02:40 Phase flink is healthy
2019/01/28 21:02:40 Phase flink marked as healthy
2019/01/28 21:02:40 Phase flink is healthy
2019/01/28 21:02:40 PlanExecution flink-deploy-28047000 has already run to completion, not processing.
When performing a canary or blue/green deployment (#70), there needs to be the ability to clean up objects that are created as part of the rollout.
Kustomize provides a feature to automatically add the labels on the Service object into the selector spec. This typically provides consistency for the objects created, but can cause issues for pods created from a different step than the Service. Consider the following:
plans:
  deploy:
    strategy: serial
    phases:
      - name: all
        strategy: parallel
        steps:
          - name: services
            tasks:
              - services
          - name: deployment
            tasks:
              - deployment
The Service object

service.yaml: |
  apiVersion: v1
  kind: Service
  metadata:
    name: appinfo
    labels:
      app: appinfo
  spec:
    ports:
    - port: 8080
      protocol: TCP
    selector:
      app: appinfo
    type: LoadBalancer
has the following label selector:

$ kubectl get svc bird-appinfo -o jsonpath="{.spec.selector['step']}"
services

whereas the Deployment gets created with the following label:

$ kubectl get deployments bird-appinfo -o jsonpath="{.metadata.labels['step']}"
deployment
This step-specific label was added to differentiate Jobs that need to be re-created even when they have the same name.
We currently have Frameworks, FrameworkVersions, and Instances. Should Instances be renamed to FrameworkInstances to better indicate that it is part of this set of CRDs?
We currently hardcode a set of Kubernetes objects and default values. These should be pulled from the following files in the universe: src/main/dist/ for the framework. These should map to ConfigMap objects. Are there any other files that are used as input into a FrameworkVersion?
Convert to new template/patches example
Reminder to implement a CI/CD pipeline.
Right now, when we install an instance of framework1 as part of another framework0 without having both the framework/frameworkversion properly installed, the presented error message does little to point us in the right direction:
Error getting PlaneExecution /: PlanExecution.maestro.k8s.io "" not found
This can be easily reproduced by trying to install the flink-demo, which has zookeeper as a dependency, without having the framework/frameworkversion of zookeeper installed.
What would be more intuitive to the user is an error message saying that the framework or frameworkversion for that instance wasn't found.
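A minimal sketch of the kind of error wrapping that could produce such a message, assuming the dependency-resolution code has the Instance and its FrameworkVersion reference in scope; the field and variable names are illustrative, not lifted from the repository (imports assumed: apierrors "k8s.io/apimachinery/pkg/api/errors", "k8s.io/apimachinery/pkg/types"):

fv := &maestrov1alpha1.FrameworkVersion{}
err := r.Client.Get(context.TODO(), types.NamespacedName{
    Name:      instance.Spec.FrameworkVersion.Name,
    Namespace: instance.Spec.FrameworkVersion.Namespace,
}, fv)
if apierrors.IsNotFound(err) {
    // Surface the missing prerequisite explicitly instead of the bare "not found".
    return fmt.Errorf("instance %s/%s references FrameworkVersion %s/%s, which is not installed; install the framework and frameworkversion first",
        instance.Namespace, instance.Name,
        instance.Spec.FrameworkVersion.Namespace, instance.Spec.FrameworkVersion.Name)
}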
The longer error message:
2019/01/22 12:04:20 Obj is NOT healthy: &{{ } {demo-zk default /apis/maestro.k8s.io/v1alpha1/namespaces/default/instances/demo-zk e73c3016-1e80-11e9-b99e-08002788f190 2444 1 2019-01-22 12:04:19 -0800 PST <nil> <nil> map[heritage:maestro phase:dependencies instance:demo plan:deploy step:zookeeper version: framework:zookeeper app:flink-financial-demo controller-tools.k8s.io:1.0 planexecution:demo-deploy-109789000] map[] [{maestro.k8s.io/v1alpha1 Instance demo e6baa583-1e80-11e9-b99e-08002788f190 0xc000daa7ac 0xc000daa7ad}] nil [] } {{ default zookeeper-1.0 } [] map[ZOOKEEPER_CPUS:0.3]} {{ } }}
2019/01/22 12:04:20 Phase dependencies has strategy serial
2019/01/22 12:04:20 Phase dependencies marked as serial
2019/01/22 12:04:20 Step zookeeper isn't complete, skipping rest of steps in phase until it is
2019/01/22 12:04:20 Phase dependencies is not healthy b/c step zookeeper is not healthy
2019/01/22 12:04:20 Phase dependencies not healthy, and plan marked as serial, so breaking.
2019/01/22 12:04:20 Phase dependencies is not healthy b/c step zookeeper is not healthy
We actually have such an error message in another place already. When we try to install just zookeeper via kubectl apply -f zookeeper-instance.yaml without having the framework/frameworkversion installed as a prerequisite, we end up with a more descriptive message:
2019/01/22 12:07:01 Error getting FrameworkVersion zookeeper-1.0 for instance zk: FrameworkVersion.maestro.k8s.io "zookeeper-1.0" not found
Long error message:
2019/01/22 12:06:50 Could not find planExecution demo-deploy-109789000: PlanExecution.maestro.k8s.io "demo-deploy-109789000" not found
2019/01/22 12:06:50 Error getting instance object: Instance.maestro.k8s.io "demo" not found
2019/01/22 12:07:01 Recieved create event for &{{ } {zk default /apis/maestro.k8s.io/v1alpha1/namespaces/default/instances/zk 47b56197-1e81-11e9-b99e-08002788f190 2629 1 2019-01-22 12:07:01 -0800 PST <nil> <nil> map[framework:zookeeper controller-tools.k8s.io:1.0] map[kubectl.kubernetes.io/last-applied-configuration:{"apiVersion":"maestro.k8s.io/v1alpha1","kind":"Instance","metadata":{"annotations":{},"labels":{"controller-tools.k8s.io":"1.0","framework":"zookeeper"},"name":"zk","namespace":"default"},"spec":{"frameworkVersion":{"name":"zookeeper-1.0","namespace":"default","type":"FrameworkVersions"},"name":"zk","parameters":{"ZOOKEEPER_CPUS":"0.3"}}}
] [] nil [] } {{ default zookeeper-1.0 } [] map[ZOOKEEPER_CPUS:0.3]} {{ } }}
2019/01/22 12:07:01 Error getting FrameworkVersion zookeeper-1.0 for instance zk: FrameworkVersion.maestro.k8s.io "zookeeper-1.0" not found
Provide a sub-command for maestroctl to enable viewing the status of PlanExecution CRDs.
Currently I've been using a smattering of log and fmt.
e.g.
parameters:
  - name: BROKER_COUNT
    description: The number of kafka brokers that should be started
    required: false
    default: 3
  - name: BROKER_CPUS
    ....
This binary release process should publish binaries to the GitHub releases. This will make it easier for users to get the operator and maestroctl binaries.
Currently stubbed out:
func IsHealthy(c client.Client, obj runtime.Object) error {
    switch obj.(type) {
    case *appsv1.Deployment:
        d := obj.(*appsv1.Deployment)
        log.Printf("Deployment %v is marked healthy\n", d.Name)
        return nil
Will look at:
Checking that the number of ready pods in the Status matches the number of Replicas in the spec.
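A minimal sketch of what that check could look like inside the existing IsHealthy switch; treating a nil Replicas pointer as 1 mirrors the Kubernetes default, and the exact error wording is an assumption:

    case *appsv1.Deployment:
        d := obj.(*appsv1.Deployment)
        desired := int32(1) // Kubernetes defaults spec.replicas to 1 when unset
        if d.Spec.Replicas != nil {
            desired = *d.Spec.Replicas
        }
        if d.Status.ReadyReplicas < desired {
            return fmt.Errorf("deployment %s/%s is not healthy: %d/%d replicas ready",
                d.Namespace, d.Name, d.Status.ReadyReplicas, desired)
        }
        log.Printf("Deployment %v is marked healthy\n", d.Name)
        return nil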
$ make run
go generate ./pkg/... ./cmd/...
go fmt ./pkg/... ./cmd/...
go vet ./pkg/... ./cmd/...
go run ./cmd/manager/main.go
# github.com/maestrosdk/maestro/vendor/k8s.io/client-go/transport
vendor/k8s.io/client-go/transport/round_trippers.go:437:9: undefined: strings.Builder
# github.com/maestrosdk/maestro/vendor/sigs.k8s.io/kustomize/pkg/target
vendor/sigs.k8s.io/kustomize/pkg/target/kusttarget.go:89:5: dec.DisallowUnknownFields undefined (type *json.Decoder has no field or method DisallowUnknownFields)
make: *** [run] Error 2

Both strings.Builder and json.Decoder.DisallowUnknownFields were added in Go 1.10, so this failure most likely means the build is using an older Go toolchain.
Some plans should not be interrupted. For example, database backups should be completed before continuing to another plan.
Add a plan parameter to mark a plan as non-preemptable. By default, plans will be preemptable.
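A hypothetical sketch of how this could appear in a FrameworkVersion plan definition, mirroring the plan structure shown elsewhere in these issues; the preemptable field is an assumption, not an existing field:

plans:
  backup:
    strategy: serial
    preemptable: false   # hypothetical: this plan must run to completion before another plan starts
    phases:
      - name: backup
        strategy: serial
        steps:
          - name: dump
            tasks:
              - backup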
$ make install
go run vendor/sigs.k8s.io/controller-tools/cmd/controller-gen/main.go all
Breaking recursion for type github.com/maestrosdk/maestro/pkg/apis/maestro/v1alpha1.FrameworkVersion
CRD manifests generated under '/Users/djannot/Documents/go/src/github.com/maestrosdk/maestro/config/crds'
RBAC manifests generated under '/Users/djannot/Documents/go/src/github.com/maestrosdk/maestro/config/rbac'
kubectl apply -f config/crds
customresourcedefinition "frameworks.maestro.k8s.io" configured
error validating "config/crds/maestro_v1alpha1_frameworkversion.yaml": error validating data: [ValidationError(CustomResourceDefinition.spec.validation.openAPIV3Schema.properties.spec.properties.dependencies.items): invalid type for io.k8s.apiextensions-apiserver.pkg.apis.apiextensions.v1beta1.JSONSchemaPropsOrArray: got "map", expected "", ValidationError(CustomResourceDefinition.spec.validation.openAPIV3Schema.properties.spec.properties.parameters.items): invalid type for io.k8s.apiextensions-apiserver.pkg.apis.apiextensions.v1beta1.JSONSchemaPropsOrArray: got "map", expected "", ValidationError(CustomResourceDefinition.spec.validation.openAPIV3Schema.properties.spec.properties.upgradableFrom.items): invalid type for io.k8s.apiextensions-apiserver.pkg.apis.apiextensions.v1beta1.JSONSchemaPropsOrArray: got "map", expected ""]; if you choose to ignore these errors, turn validation off with --validate=false
error validating "config/crds/maestro_v1alpha1_instance.yaml": error validating data: ValidationError(CustomResourceDefinition.spec.validation.openAPIV3Schema.properties.spec.properties.dependencies.items): invalid type for io.k8s.apiextensions-apiserver.pkg.apis.apiextensions.v1beta1.JSONSchemaPropsOrArray: got "map", expected ""; if you choose to ignore these errors, turn validation off with --validate=false
error validating "config/crds/maestro_v1alpha1_planexecution.yaml": error validating data: ValidationError(CustomResourceDefinition.spec.validation.openAPIV3Schema.properties.status.properties.phases.items): invalid type for io.k8s.apiextensions-apiserver.pkg.apis.apiextensions.v1beta1.JSONSchemaPropsOrArray: got "map", expected ""; if you choose to ignore these errors, turn validation off with --validate=false
make: *** [install] Error 1
Running the zookeeper example on minikube started with the default memory options results in:
kubectl get pod
NAME READY STATUS RESTARTS AGE
zk-zk-0 1/1 Running 0 7m7s
zk-zk-1 0/1 Pending 0 7m7s
zk-zk-2 0/1 Pending 0 7m7s
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Warning FailedScheduling 105s (x5 over 106s) default-scheduler pod has unbound immediate PersistentVolumeClaims
Warning FailedScheduling 97s (x20 over 105s) default-scheduler 0/1 nodes are available: 1 Insufficient memory.
Suggestion: document memory requirements for the minikube environment. For example: minikube start --memory 4096, etc.
For both dependencies and as an export of a running Instance
Currently it's the autogenerated README from Kubebuilder.
The current version of Kafka we use in our framework won't work with GKE:
small-kafka-0 0/1 Pending 0 0s
small-kafka-0 0/1 Pending 0 0s
small-kafka-0 0/1 Pending 0 3s
small-kafka-0 0/1 ContainerCreating 0 3s
small-kafka-0 0/1 Running 0 15s
small-kafka-0 0/1 Error 0 17s
small-kafka-0 0/1 Running 1 18s
small-kafka-0 0/1 Error 1 20s
small-kafka-0 0/1 CrashLoopBackOff 1 33s
small-kafka-0 0/1 Running 2 34s
small-kafka-0 0/1 Error 2 36s
The exact error message is:
[2019-01-23 01:16:07,655] INFO Loading logs. (kafka.log.LogManager)
[2019-01-23 01:16:07,663] ERROR There was an error in one of the threads during logs loading: kafka.common.KafkaException: Found directory /var/lib/kafka/lost+found, 'lost+found' is not in the form of topic-partition
If a directory does not contain Kafka topic data it should not exist in Kafka's log directory (kafka.log.LogManager)
[2019-01-23 01:16:07,664] FATAL [Kafka Server 0], Fatal error during KafkaServer startup. Prepare to shutdown (kafka.server.KafkaServer)
kafka.common.KafkaException: Found directory /var/lib/kafka/lost+found, 'lost+found' is not in the form of topic-partition
If a directory does not contain Kafka topic data it should not exist in Kafka's log directory
at kafka.log.Log$.exception$1(Log.scala:1131)
at kafka.log.Log$.parseTopicPartitionName(Log.scala:1139)
at kafka.log.LogManager$$anonfun$loadLogs$2$$anonfun$3$$anonfun$apply$10$$anonfun$apply$1.apply$mcV$sp(LogManager.scala:153)
at kafka.utils.CoreUtils$$anon$1.run(CoreUtils.scala:57)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
[2019-01-23 01:16:07,667] INFO [Kafka Server 0], shutting down (kafka.server.KafkaServer)
[2019-01-23 01:16:07,670] INFO Terminate ZkClient event thread. (org.I0Itec.zkclient.ZkEventThread)
[2019-01-23 01:16:07,677] INFO EventThread shut down for session: 0x268784681130001 (org.apache.zookeeper.ClientCnxn)
[2019-01-23 01:16:07,677] INFO Session: 0x268784681130001 closed (org.apache.zookeeper.ZooKeeper)
[2019-01-23 01:16:07,681] INFO [Kafka Server 0], shut down completed (kafka.server.KafkaServer)
[2019-01-23 01:16:07,682] FATAL Fatal error during KafkaServerStartable startup. Prepare to shutdown (kafka.server.KafkaServerStartable)
kafka.common.KafkaException: Found directory /var/lib/kafka/lost+found, 'lost+found' is not in the form of topic-partition
If a directory does not contain Kafka topic data it should not exist in Kafka's log directory
at kafka.log.Log$.exception$1(Log.scala:1131)
at kafka.log.Log$.parseTopicPartitionName(Log.scala:1139)
at kafka.log.LogManager$$anonfun$loadLogs$2$$anonfun$3$$anonfun$apply$10$$anonfun$apply$1.apply$mcV$sp(LogManager.scala:153)
at kafka.utils.CoreUtils$$anon$1.run(CoreUtils.scala:57)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
[2019-01-23 01:16:07,685] INFO [Kafka Server 0], shutting down (kafka.server.KafkaServer)
Which is a known problem, e.g. see here: vmware-archive/kubeless#460
The only workaround I see is confluentinc/cp-kafka, which is used by https://github.com/helm/charts/tree/master/incubator/kafka
See also the Slack conversation: https://kubernetes.slack.com/archives/C09NXKJKA/p1548207090459900
Also provide documentation.
When uninstalling an instance, e.g. Kafka, I would assume all created artifacts are cleaned up/garbage collected, so that when I install an instance with the same name the next time, I start from scratch again.
It looks like PVCs are not being cleaned up after a delete.
maestro-demo $ kubectl get pvc
NAME STATUS VOLUME CAPACITY ACCESS MODES STORAGECLASS AGE
datadir-small-kafka-0 Bound pvc-a028c4a1-2384-11e9-822f-42010a800154 1Gi RWO standard 24m
datadir-small-kafka-1 Bound pvc-38f26e09-2386-11e9-822f-42010a800154 1Gi RWO standard 12m
datadir-small-kafka-2 Bound pvc-4b97632c-2386-11e9-822f-42010a800154 1Gi RWO standard 12m
datadir-zk-zk-0 Bound pvc-9fcc977f-2385-11e9-822f-42010a800154 2Gi RWO standard 17m
datadir-zk-zk-1 Bound pvc-9fd0feed-2385-11e9-822f-42010a800154 2Gi RWO standard 17m
datadir-zk-zk-2 Bound pvc-9fd7d9f9-2385-11e9-822f-42010a800154 2Gi RWO standard 17m
This leads to behavior where, when you install a frameworkversion (e.g. flink-financial-demo) and then delete it, the data still persists, in this case in Kafka. When you then re-install the demo, you will see that Kafka magically shows the data from the previously deleted instance in its logs, which also lets the actor in the demo immediately display detected fraud entries that are relics of the past.
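As a manual workaround until cleanup is implemented, the leftover claims can be deleted by hand. A sketch, assuming the claim names follow the datadir-<instance>-* pattern shown above:

maestro-demo $ kubectl delete pvc datadir-small-kafka-0 datadir-small-kafka-1 datadir-small-kafka-2
maestro-demo $ kubectl delete pvc datadir-zk-zk-0 datadir-zk-zk-1 datadir-zk-zk-2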