gardener / etcd-druid
ETCD operator managing the lifecycle of ETCD clusters for hosted control planes.
License: Apache License 2.0
Feature (What you would like to be added):
Run multi-node ETCD during maintenance operations, so that it can fail over quickly.
Motivation (Why is this needed?):
Shorter ETCD (=API server=cluster) downtimes during maintenance operations that affect ETCD, like rolling the seed node it runs on or updating the ETCD spec.
Approach/Hint to the implement solution (optional):
An operator that scales out (with node anti-affinity) and later scales in again. The main question will be how to orchestrate that with Gardener, as there are hooks and means missing for that at present.
Feature (What you would like to be added):
Make the Etcd CRD's spec.backup.store section immutable.
Motivation (Why is this needed?):
Make the Etcd CRD's spec.backup.store section immutable so that the storage container location isn't allowed to change mid-usage of an etcd, due to the potential mismatch of snapshotting and restoration locations, which would allow restorations to happen from a different etcd's backup and render the shoot cluster unusable. Refer to gardener/gardener#4454 for a fix already made in Gardener, although we still want druid to be resilient to potential undesirable changes to the Etcd resource.
Approach/Hint to the implement solution (optional):
Since CRD immutability is yet to be supported (refer to kubernetes/kubernetes#65973), it might make more sense to use something like a validating webhook on Etcd resource updates.
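A minimal sketch of such a webhook, assuming a controller-runtime admission handler and that the relevant field is Spec.Backup.Store of the v1alpha1 Etcd type (import path and field names may differ):

package webhook

import (
	"context"
	"net/http"
	"reflect"

	druidv1alpha1 "github.com/gardener/etcd-druid/api/v1alpha1"
	"sigs.k8s.io/controller-runtime/pkg/webhook/admission"
)

// EtcdValidator denies updates that modify spec.backup.store.
type EtcdValidator struct {
	decoder *admission.Decoder // injected during webhook setup
}

func (v *EtcdValidator) Handle(ctx context.Context, req admission.Request) admission.Response {
	if req.Operation != "UPDATE" {
		return admission.Allowed("")
	}
	oldEtcd, newEtcd := &druidv1alpha1.Etcd{}, &druidv1alpha1.Etcd{}
	if err := v.decoder.DecodeRaw(req.OldObject, oldEtcd); err != nil {
		return admission.Errored(http.StatusBadRequest, err)
	}
	if err := v.decoder.Decode(req, newEtcd); err != nil {
		return admission.Errored(http.StatusBadRequest, err)
	}
	// Reject any change to the backup store section once the Etcd resource exists.
	if !reflect.DeepEqual(oldEtcd.Spec.Backup.Store, newEtcd.Spec.Backup.Store) {
		return admission.Denied("spec.backup.store is immutable")
	}
	return admission.Allowed("")
}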
/cc @amshuman-kr
Feature (What you would like to be added):
Enhance reconciliation to handle the multi-node scenario in etcd-druid. This should include the following topics:
- Ready and AllMembersReady conditions based on the contents of the members section of the Etcd resource status.
- Lease objects for every member pod.
- Services for client and etcd peer (ref) -> TBD (#147)
Motivation (Why is this needed?):
Pick individually executable pieces of the multi-node proposal.
Approach/Hint to the implement solution (optional):
Feature (What you would like to be added):
Unit tests for etcd-druid reconciliation cycle.
Motivation (Why is this needed?):
We should have both positive and negative scenarios covered in the unit tests to improve our own productivity and to avoid regression.
Approach/Hint to the implement solution (optional):
Replace the kubebuilder way of tests (running kube-apiserver and etcd) with mock APIs.
Feature (What you would like to be added):
The etcd-druid should control both the versions for etcd and the backup-restore sidecar.
Motivation (Why is this needed?):
It controls the manifests and configuration for the statefulset, and the versions used must fit them. Hence, it makes sense to control them.
Approach/Hint to the implement solution (optional):
Please use the image vector approach (https://github.com/gardener/gardener/blob/master/charts/images.yaml) with the use of https://github.com/gardener/gardener/tree/master/pkg/utils/imagevector.
It must be possible to overwrite the image vector during deployment time.
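A rough sketch of how the images could be resolved with the gardener imagevector helpers; the image vector path and the image names "etcd" and "etcd-backup-restore" are assumptions here:

package images

import (
	"github.com/gardener/gardener/pkg/utils/imagevector"
)

// resolveImages reads the image vector bundled with etcd-druid (e.g. charts/images.yaml)
// and honours an overwrite file referenced via the IMAGEVECTOR_OVERWRITE environment
// variable, so the images can be replaced at deployment time without rebuilding etcd-druid.
func resolveImages() (etcdImage, backupRestoreImage string, err error) {
	iv, err := imagevector.ReadGlobalImageVectorWithEnvOverride("charts/images.yaml")
	if err != nil {
		return "", "", err
	}
	etcd, err := iv.FindImage("etcd")
	if err != nil {
		return "", "", err
	}
	br, err := iv.FindImage("etcd-backup-restore")
	if err != nil {
		return "", "", err
	}
	return etcd.String(), br.String(), nil
}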
Feature (What you would like to be added):
Move to etcd v3.3.23 or the latest v3.3.x patch release.
Motivation (Why is this needed?):
Approach/Hint to the implement solution (optional):
Describe the bug:
The StatefulSet API prevents creation of StatefulSets where the labels in the template do not match the selector. The Etcd resource, however, does not have similar validation.
Expected behavior:
Etcd resource creation should throw an error when the labels in the template do not match the selector.
How To Reproduce (as minimally and precisely as possible):
Have the selector field set so that it does not match the template in the statefulset.
Logs:
Screenshots (if applicable):
Environment (please complete the following information):
Anything else we need to know?:
Feature (What you would like to be added):
Integrate the backup compression feature from etcd-backup-restore with etcd-druid (by enabling configuration via the Etcd resource spec) and then integrate with gardener.
Motivation (Why is this needed?):
The backup compression feature will be used primarily in the etcd-druid and gardener context.
Approach/Hint to the implement solution (optional):
Keep the default configuration in etcd-druid to be uncompressed backups (for backward compatibility) and the default configuration in gardener integration to be compressed backups.
Feature (What you would like to be added):
The main reconciliation loop in etcd-druid takes care of everything from updating the owned resources to updating the status in the Etcd resource. We should create a separate controller (still part of the etcd-druid controller manager) which reconciles only the status section of the Etcd resource.
Credit: @rfranzke ❤️
Motivation (Why is this needed?):
The main reconciliation loop is triggered only if the watch events pass some predicates. If the status update during the main reconciliation fails for any reason, the status in the Etcd resource might not be updated until the next gardener reconciliation event that matches the predicates.
Approach/Hint to the implement solution (optional):
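A minimal sketch of such a dedicated status controller, assuming a controller-runtime setup; the name Custodian and the requeue interval are illustrative:

package controllers

import (
	"context"
	"time"

	druidv1alpha1 "github.com/gardener/etcd-druid/api/v1alpha1"
	ctrl "sigs.k8s.io/controller-runtime"
	"sigs.k8s.io/controller-runtime/pkg/client"
)

// Custodian reconciles only the status section of the Etcd resource,
// independently of the main reconciliation loop.
type Custodian struct {
	client.Client
}

func (c *Custodian) Reconcile(ctx context.Context, req ctrl.Request) (ctrl.Result, error) {
	etcd := &druidv1alpha1.Etcd{}
	if err := c.Get(ctx, req.NamespacedName, etcd); err != nil {
		return ctrl.Result{}, client.IgnoreNotFound(err)
	}

	original := etcd.DeepCopy()
	// ... recompute status (conditions, members, readiness) from the owned resources here ...

	// Patch only the status subresource to avoid conflicts with the main controller.
	if err := c.Status().Patch(ctx, etcd, client.MergeFrom(original)); err != nil {
		return ctrl.Result{}, err
	}
	// Resync periodically so a failed status update is retried even without new watch events.
	return ctrl.Result{RequeueAfter: 30 * time.Second}, nil
}

func (c *Custodian) SetupWithManager(mgr ctrl.Manager) error {
	return ctrl.NewControllerManagedBy(mgr).For(&druidv1alpha1.Etcd{}).Complete(c)
}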
Describe the bug:
The VPA recommender is missing the permission to get the scale subresource, because of which vertical autoscaling of etcd is not happening.
Expected behavior:
As load increases, etcd should be scaled based on VPA recommendations.
How To Reproduce (as minimally and precisely as possible):
Logs:
Screenshots (if applicable):
Environment (please complete the following information):
Anything else we need to know?:
Feature (What you would like to be added):
The current multi-node ETCD proposal should handle backup health more explicitly. In particular, consider the impact of not cutting off requests when backup upload fails.
Motivation (Why is this needed?):
Approach/Hint to the implement solution (optional):
Feature (What you would like to be added):
Expose the associated monitoring and logging configuration as per the https://github.com/gardener/gardener/blob/master/docs/extensions/logging-and-monitoring.md
Motivation (Why is this needed?):
Though Druid is a standalone component, it is designed to adhere to the gardener extension contract as well. As a result, it might have to take responsibility for exposing its monitoring configuration to gardener-like projects.
Approach/Hint to the implement solution (optional):
Question: How are you guys dealing with the incremental backup files when restoring a cluster? I am asking, because in Kubify we expect one full snapshot file to trigger a restore operation. What is the best way to compact all the incremental backup files into one? Do you have already something handy?
Feature (What you would like to be added):
If there is any issue with the watch connections used by the informers, etcd-druid should detect this and try to automatically recover from it.
Motivation (Why is this needed?):
Manual intervention is required without such detection and automatic recovery.
Approach/Hint to the implement solution (optional):
We need to revendor client-go and possibly controller-runtime to include the fix (kubernetes/kubernetes#87329) that propagates informer errors to the caller, and then possibly react to them.
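A possible direction once the fix is vendored, sketched with the client-go SetWatchErrorHandler API that the fix introduced; the recovery action shown (exiting so the pod restarts) is just one option:

package main

import (
	"os"

	"k8s.io/client-go/tools/cache"
	"k8s.io/klog/v2"
)

// installWatchErrorHandler registers a handler that reacts to persistent watch/list failures
// of an informer instead of silently retrying forever.
func installWatchErrorHandler(informer cache.SharedIndexInformer) {
	err := informer.SetWatchErrorHandler(func(r *cache.Reflector, err error) {
		// The default behaviour only logs; here we could trigger an explicit recovery,
		// e.g. report unhealthiness or, as a crude option, exit so the pod is restarted.
		klog.Errorf("watch error in informer, triggering recovery: %v", err)
		os.Exit(1)
	})
	if err != nil {
		klog.Errorf("could not set watch error handler (informer already started?): %v", err)
	}
}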
See also:
Feature (What you would like to be added):
In some infrastructures (Azure), abnormal termination of the etcd container/pod leads to the database directory lock not being released, which causes the backup-restore sidecar to hang while opening the database for verification on etcd container restart.
We should try to detect this scenario and recover from it automatically.
Motivation (Why is this needed?):
This happens rarely (so far only a couple of times in Azure) but requires manual intervention. Typically, a pod restart resolves the issue. But we should try and automate this.
Approach/Hint to the implement solution (optional):
Typically, a pod restart resolves the issue.
Please provide stories that we plan to tackle:
We should support provisioning and management of multi-node etcd clusters via etcd-druid to serve the following goals:
First increment (please break-down into different increments rather than by component, if possible)
Documentation/Proposal
Etcd-Druid
Etcd-Backup-Restore
Additional Improvements
Feature (What you would like to be added):
Druid should use and support Server-Side Apply where applicable once Gardener has dropped support for seed clusters with K8s <= 1.17 (gardener/gardener#4083).
Motivation (Why is this needed?):
Server-Side Apply makes working with the etcd resource more efficient when there is more than one actor (motivated here).
Tasks to be done:
- etcd resource
- etcd status updates
Feature (What you would like to be added):
Make it possible to have a smaller auto-compaction-retention period for etcd (both the main etcd and the embedded ETCD during restoration).
Motivation (Why is this needed?):
A high update rate can overflow memory and storage if the auto-compaction-retention period is long. The current value is 24h (see etcd-druid/charts/etcd/templates/etcd-bootstrap-configmap.yaml, lines 129 to 130 at 8307a62).
Approach/Hint to the implement solution (optional):
We can either change the value to be smaller by default (5m?) and/or we can make it configurable via the Etcd resource spec.
Feature (What you would like to be added):
Enhance the Etcd resource status structure according to the changes proposed here, in particular the member status in the Etcd resource status. This need not include the task of cutting off traffic in case of backup failure yet, as the evaluation/decision on that is pending for the scenario of multi-node ETCD with ephemeral persistence. The etcd-backup-restore needs to consider the following scenarios in the implementation.
etcd-druid can maintain the AllMembersReady and Ready conditions as well as the following transitions for the member status where the member's etcd-backup-restore is unable to update its own status. Probably this has to be done in etcd-druid by enhancing the etcd status controller (custodian).
Motivation (Why is this needed?):
Pick individually executable pieces of the multi-node proposal.
Approach/Hint to the implement solution (optional):
Also, it would be preferable to use StatusWriter.Patch() to avoid race conditions.
Feature (What you would like to be added):
Move out etcd bootstrap script to etcd custom image.
Motivation (Why is this needed?):
To avoid the issue of out-of-sync configmap and statefulset spec (and hence etcd image version) during etcd version updates on Gardener landscapes.
Approach/Hint to the implement solution (optional):
Feature (What you would like to be added):
Enhance the Etcd resource status structure according to the changes proposed here while maintaining backward compatibility for the consumers of the Etcd resource status (such as the gardenlet).
Motivation (Why is this needed?):
Pick individually executable pieces of the multi-node proposal.
Approach/Hint to the implement solution (optional):
For backward compatibility, the existing status fields and the values in them need to be maintained as they are, in both the main etcd-druid controller (especially here and here) as well as the newly separated custodian controller.
Feature (What you would like to be added):
Leader election settings should be increased and made configurable in chart manifests.
Motivation (Why is this needed?):
The default leader election settings in controller-runtime seem to create too much load on the apiserver. It should be possible to configure them to reduce the load on the apiserver without having to make any changes to etcd-druid.
Approach/Hint to the implement solution (optional):
We can introduce command-line flags and chart manifest flags along the lines of gardener/gardener#2667.
Also, it would be desirable to switch to Lease for leader election rather than ConfigMap. But controller-runtime still uses ConfigMap. So, for this, we either have to wait till controller-runtime moves to Lease or we override with a custom newResourceLock factory function in the options.
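A sketch of what the configurable settings could look like when wired into the controller-runtime manager options; the flag names and defaults are assumptions, and newer controller-runtime versions additionally allow selecting the Lease resource lock directly:

package main

import (
	"flag"
	"time"

	"k8s.io/client-go/tools/leaderelection/resourcelock"
	ctrl "sigs.k8s.io/controller-runtime"
	"sigs.k8s.io/controller-runtime/pkg/manager"
)

func main() {
	// Hypothetical flags so operators can tune leader election load without code changes.
	leaseDuration := flag.Duration("leader-election-lease-duration", 30*time.Second, "Leader election lease duration.")
	renewDeadline := flag.Duration("leader-election-renew-deadline", 20*time.Second, "Leader election renew deadline.")
	retryPeriod := flag.Duration("leader-election-retry-period", 10*time.Second, "Leader election retry period.")
	flag.Parse()

	mgr, err := ctrl.NewManager(ctrl.GetConfigOrDie(), manager.Options{
		LeaderElection:             true,
		LeaderElectionID:           "druid-leader-election",
		LeaderElectionResourceLock: resourcelock.LeasesResourceLock, // use Lease instead of ConfigMap
		LeaseDuration:              leaseDuration,
		RenewDeadline:              renewDeadline,
		RetryPeriod:                retryPeriod,
	})
	if err != nil {
		panic(err)
	}
	_ = mgr // controllers would be registered and mgr.Start(...) called here
}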
Describe the bug:
Validations for the below scenarios:
Expected behavior:
How To Reproduce (as minimally and precisely as possible):
Logs:
Screenshots (if applicable):
Environment (please complete the following information):
Anything else we need to know?:
Feature (What you would like to be added):
This repository should also benefit from automatic update PRs of dependent components. etcd-druid deploys etcd-backup-restore, hence, when a new version of it is released then automatic update PRs should be opened by CI, similar to gardener/gardener#2260.
Motivation (Why is this needed?):
Less manual actions.
Approach/Hint to the implement solution (optional):
You need such a script: https://github.com/gardener/gardener/blob/master/hack/.ci/set_dependency_version
You don't need to copy it but can also call it like the extensions do, as you already vendor gardener/gardener: https://github.com/gardener/gardener-extension-provider-aws/blob/master/.ci/set_dependency_version, https://github.com/gardener/gardener-extension-provider-aws/blob/master/hack/tools.go#L23
Currently, druid deploys the etcd statefulset with an ownerReference pointing to the etcd resource, but the blockOwnerDeletion field is set to false as you can see here. However, deletion of the etcd resource should be blocked until all the resources deployed by it, like the statefulset, service and configmap, are deleted.
Ideally, deletion of the etcd resource should guarantee that the etcd server is completely down.
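A small sketch of the two pieces involved, assuming controller-runtime/client-go types; this is illustrative, not necessarily how druid builds its ownerReferences today:

package controllers

import (
	"context"

	appsv1 "k8s.io/api/apps/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/utils/pointer"
	"sigs.k8s.io/controller-runtime/pkg/client"

	druidv1alpha1 "github.com/gardener/etcd-druid/api/v1alpha1"
)

// ownerReferenceFor returns an owner reference that, combined with foreground deletion,
// blocks removal of the Etcd resource until the owned object (e.g. the StatefulSet) is gone.
func ownerReferenceFor(etcd *druidv1alpha1.Etcd) metav1.OwnerReference {
	return metav1.OwnerReference{
		APIVersion:         druidv1alpha1.GroupVersion.String(),
		Kind:               "Etcd",
		Name:               etcd.Name,
		UID:                etcd.UID,
		Controller:         pointer.Bool(true),
		BlockOwnerDeletion: pointer.Bool(true),
	}
}

// deleteWithForeground deletes the StatefulSet with the foreground cascading deletion policy,
// so the owner is only removed after its dependents are gone.
func deleteWithForeground(ctx context.Context, c client.Client, sts *appsv1.StatefulSet) error {
	return c.Delete(ctx, sts, client.PropagationPolicy(metav1.DeletePropagationForeground))
}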
Feature (What you would like to be added):
With the newly introduced compaction command in etcd-backup-restore to asynchronously compact backups (latest full snapshot and its following incremental snapshots into a single full snapshot) in gardener/etcd-backup-restore#301, we should enhance etcd-druid to schedule the backup compaction at regular intervals to limit the number of incremental snapshots at any point in time and hence enhance backup restoration performance.
Motivation (Why is this needed?):
Complete the functionality for the issue #88.
Approach/Hint to the implement solution (optional):
etcd-druid's main controller may create a CronJob as part of its reconciliation cycle. There is no need to include the logic for selecting existing cronjobs based on spec.selector (of the Etcd resources) because of #186.
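A rough sketch of the CronJob the main controller could create, assuming the batch/v1beta1 API; the schedule, image and compaction command/flags are assumptions and would come from the Etcd resource and the image vector:

package controllers

import (
	batchv1 "k8s.io/api/batch/v1"
	batchv1beta1 "k8s.io/api/batch/v1beta1"
	corev1 "k8s.io/api/core/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"

	druidv1alpha1 "github.com/gardener/etcd-druid/api/v1alpha1"
)

// compactionCronJobFor builds the CronJob that periodically compacts the backups of the
// given Etcd resource by running the etcd-backup-restore compaction command.
func compactionCronJobFor(etcd *druidv1alpha1.Etcd, backupRestoreImage string) *batchv1beta1.CronJob {
	return &batchv1beta1.CronJob{
		ObjectMeta: metav1.ObjectMeta{
			Name:      etcd.Name + "-backup-compaction",
			Namespace: etcd.Namespace,
		},
		Spec: batchv1beta1.CronJobSpec{
			Schedule:          "0 */6 * * *", // illustrative; could be taken from the Etcd spec
			ConcurrencyPolicy: batchv1beta1.ForbidConcurrent,
			JobTemplate: batchv1beta1.JobTemplateSpec{
				Spec: batchv1.JobSpec{
					Template: corev1.PodTemplateSpec{
						Spec: corev1.PodSpec{
							RestartPolicy: corev1.RestartPolicyOnFailure,
							Containers: []corev1.Container{{
								Name:    "backup-compaction",
								Image:   backupRestoreImage,
								Command: []string{"etcdbrctl", "compact"}, // store/provider flags omitted
							}},
						},
					},
				},
			},
		},
	}
}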
Feature (What you would like to be added):
Use an optimized informer for Lease resources, concretely only for objects which contain a gardener.cloud/owned-by label.
Motivation (Why is this needed?):
#214 fetches Lease objects for performing health checks on etcd members. It uses the standard Controller-Runtime client which is backed by a cache, so all Lease objects will be considered in the informer's ListWatch function. Since Controller-Runtime v0.9.0 it is possible to set up this cache in a more fine-granular way (see here).
Approach/Hint to the implement solution (optional):
Controller-Runtime has been updated to v0.10.2, so the optimization of the lease informer based on a label is supported with the current version.
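A sketch of how the cache could be restricted, assuming controller-runtime v0.9+ cache options and that the member leases carry the gardener.cloud/owned-by label:

package main

import (
	coordinationv1 "k8s.io/api/coordination/v1"
	"k8s.io/apimachinery/pkg/labels"
	ctrl "sigs.k8s.io/controller-runtime"
	"sigs.k8s.io/controller-runtime/pkg/cache"
	"sigs.k8s.io/controller-runtime/pkg/manager"
)

func newManager() (manager.Manager, error) {
	// Only Lease objects carrying the gardener.cloud/owned-by label end up in the cache,
	// so the informer's ListWatch does not have to track unrelated Leases (e.g. node leases).
	selector, err := labels.Parse("gardener.cloud/owned-by")
	if err != nil {
		return nil, err
	}
	return ctrl.NewManager(ctrl.GetConfigOrDie(), manager.Options{
		NewCache: cache.BuilderWithOptions(cache.Options{
			SelectorsByObject: cache.SelectorsByObject{
				&coordinationv1.Lease{}: {Label: selector},
			},
		}),
	})
}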
Describe the bug:
The following test case is failing after the commit 72ec7a0.
Expected behavior:
No test cases should fail.
How To Reproduce (as minimally and precisely as possible):
Run make test on the commit 72ec7a0.
Logs:
• Failure [6.027 seconds]
Druid when etcd resource is created [It] if fields are set in etcd.Spec and TLS enabled, the resources should reflect the spec changes
/tmp/build/a94a8fe5/pull-request-gardener.etcd-druid-pr.master/tmp/src/github.com/gardener/etcd-druid/controllers/etcd_controller_test.go:482
Expected
<string>: ConfigMap
to match fields: {
.Data."bootstrap.sh":
Expected
<string>: "...tus = '143'..."
to equal |
<string>: "...tus == '143..."
}
/tmp/build/a94a8fe5/pull-request-gardener.etcd-druid-pr.master/tmp/src/github.com/gardener/etcd-druid/controllers/etcd_controller_test.go:882
Screenshots (if applicable):
Environment (please complete the following information):
Anything else we need to know?:
Describe the bug:
The etcd controller removes the operation annotation from the Etcd resource after reconciling it, which goes against the gardener extension contract.
Expected behavior:
The etcd controller should remove the operation annotation from the Etcd resource before reconciling it here.
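A sketch of the expected ordering, with an illustrative annotation key and controller-runtime client calls:

package controllers

import (
	"context"

	"sigs.k8s.io/controller-runtime/pkg/client"

	druidv1alpha1 "github.com/gardener/etcd-druid/api/v1alpha1"
)

const operationAnnotation = "gardener.cloud/operation" // illustrative annotation key

// removeOperationAnnotation removes the operation annotation *before* the actual
// reconciliation starts, so that a new operation requested while reconciling is not lost.
func removeOperationAnnotation(ctx context.Context, c client.Client, etcd *druidv1alpha1.Etcd) error {
	if _, ok := etcd.Annotations[operationAnnotation]; !ok {
		return nil
	}
	original := etcd.DeepCopy()
	delete(etcd.Annotations, operationAnnotation)
	return c.Patch(ctx, etcd, client.MergeFrom(original))
}

// Reconcile order (sketch): fetch Etcd -> removeOperationAnnotation -> reconcile owned resources -> update status.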
How To Reproduce (as minimally and precisely as possible):
Logs:
Screenshots (if applicable):
Environment (please complete the following information):
Anything else we need to know?:
Feature (What you would like to be added):
In Alicloud China regions, it is sometimes not possible to run apk get. We need to avoid such statements.
Motivation (Why is this needed?):
Shoot cluster can't be created in China regions sometimes.
Approach/Hint to the implement solution (optional):
Feature (What you would like to be added):
We should have performance/load tests for etcd instances integrated with the etcd-druid CI/CD pipelines, which should test at least the following aspects:
- Database size of 8Gi.
- High rate of writes (>500/s) into etcd and high rate of delta snapshots (>4/m of >100Mi snapshots).
Motivation (Why is this needed?):
This will help us understand the limits and help us configure the alert thresholds.
Approach/Hint to the implement solution (optional):
Describe the bug:
etcd-druid panics with nil pointer.
Expected behavior:
How To Reproduce (as minimally and precisely as possible):
Logs:
E0416 09:08:10.455443 1 runtime.go:78] Observed a panic: "invalid memory address or nil pointer dereference" (runtime error: invalid memory address or nil pointer dereference)
goroutine 317 [running]:
k8s.io/apimachinery/pkg/util/runtime.logPanic(0x14e9740, 0x23fb600)
/go/src/github.com/gardener/etcd-druid/vendor/k8s.io/apimachinery/pkg/util/runtime/runtime.go:74 +0xa3
k8s.io/apimachinery/pkg/util/runtime.HandleCrash(0x0, 0x0, 0x0)
/go/src/github.com/gardener/etcd-druid/vendor/k8s.io/apimachinery/pkg/util/runtime/runtime.go:48 +0x82
panic(0x14e9740, 0x23fb600)
/usr/local/go/src/runtime/panic.go:679 +0x1b2
github.com/gardener/etcd-druid/controllers.(*EtcdReconciler).getMapFromEtcd(0xc0000d4050, 0xc00086c500, 0x3fb999999999999a, 0x4, 0x0)
/go/src/github.com/gardener/etcd-druid/controllers/etcd_controller.go:881 +0x1276
github.com/gardener/etcd-druid/controllers.(*EtcdReconciler).reconcileEtcd(0xc0000d4050, 0xc00086c500, 0xc00086c500, 0x0, 0x0, 0x0)
/go/src/github.com/gardener/etcd-druid/controllers/etcd_controller.go:724 +0x4d
github.com/gardener/etcd-druid/controllers.(*EtcdReconciler).reconcile(0xc0000d4050, 0x18da080, 0xc000048248, 0xc00086c500, 0xc000845c40, 0x2, 0x2, 0x18a9ec0)
/go/src/github.com/gardener/etcd-druid/controllers/etcd_controller.go:227 +0x27c
github.com/gardener/etcd-druid/controllers.(*EtcdReconciler).Reconcile(0xc0000d4050, 0xc000187300, 0x16, 0xc0005c4d00, 0xb, 0xc000758c00, 0x1, 0xc000758cc8, 0x478588)
/go/src/github.com/gardener/etcd-druid/controllers/etcd_controller.go:189 +0x30b
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler(0xc000184840, 0x15400e0, 0xc00061a400, 0x43eb00)
/go/src/github.com/gardener/etcd-druid/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:256 +0x162
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem(0xc000184840, 0x0)
/go/src/github.com/gardener/etcd-druid/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:232 +0xcb
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).worker(0xc000184840)
/go/src/github.com/gardener/etcd-druid/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:211 +0x2b
k8s.io/apimachinery/pkg/util/wait.JitterUntil.func1(0xc00069c530)
/go/src/github.com/gardener/etcd-druid/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:152 +0x5e
k8s.io/apimachinery/pkg/util/wait.JitterUntil(0xc00069c530, 0x3b9aca00, 0x0, 0x1, 0xc00015a0c0)
/go/src/github.com/gardener/etcd-druid/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:153 +0xf8
k8s.io/apimachinery/pkg/util/wait.Until(0xc00069c530, 0x3b9aca00, 0xc00015a0c0)
/go/src/github.com/gardener/etcd-druid/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:88 +0x4d
created by sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func1
/go/src/github.com/gardener/etcd-druid/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:193 +0x328
panic: runtime error: invalid memory address or nil pointer dereference [recovered]
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x18 pc=0x1347ff6]
goroutine 317 [running]:
k8s.io/apimachinery/pkg/util/runtime.HandleCrash(0x0, 0x0, 0x0)
/go/src/github.com/gardener/etcd-druid/vendor/k8s.io/apimachinery/pkg/util/runtime/runtime.go:55 +0x105
panic(0x14e9740, 0x23fb600)
/usr/local/go/src/runtime/panic.go:679 +0x1b2
github.com/gardener/etcd-druid/controllers.(*EtcdReconciler).getMapFromEtcd(0xc0000d4050, 0xc00086c500, 0x3fb999999999999a, 0x4, 0x0)
/go/src/github.com/gardener/etcd-druid/controllers/etcd_controller.go:881 +0x1276
github.com/gardener/etcd-druid/controllers.(*EtcdReconciler).reconcileEtcd(0xc0000d4050, 0xc00086c500, 0xc00086c500, 0x0, 0x0, 0x0)
/go/src/github.com/gardener/etcd-druid/controllers/etcd_controller.go:724 +0x4d
github.com/gardener/etcd-druid/controllers.(*EtcdReconciler).reconcile(0xc0000d4050, 0x18da080, 0xc000048248, 0xc00086c500, 0xc000845c40, 0x2, 0x2, 0x18a9ec0)
/go/src/github.com/gardener/etcd-druid/controllers/etcd_controller.go:227 +0x27c
github.com/gardener/etcd-druid/controllers.(*EtcdReconciler).Reconcile(0xc0000d4050, 0xc000187300, 0x16, 0xc0005c4d00, 0xb, 0xc000758c00, 0x1, 0xc000758cc8, 0x478588)
/go/src/github.com/gardener/etcd-druid/controllers/etcd_controller.go:189 +0x30b
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler(0xc000184840, 0x15400e0, 0xc00061a400, 0x43eb00)
/go/src/github.com/gardener/etcd-druid/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:256 +0x162
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem(0xc000184840, 0x0)
/go/src/github.com/gardener/etcd-druid/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:232 +0xcb
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).worker(0xc000184840)
/go/src/github.com/gardener/etcd-druid/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:211 +0x2b
k8s.io/apimachinery/pkg/util/wait.JitterUntil.func1(0xc00069c530)
/go/src/github.com/gardener/etcd-druid/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:152 +0x5e
k8s.io/apimachinery/pkg/util/wait.JitterUntil(0xc00069c530, 0x3b9aca00, 0x0, 0x1, 0xc00015a0c0)
/go/src/github.com/gardener/etcd-druid/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:153 +0xf8
k8s.io/apimachinery/pkg/util/wait.Until(0xc00069c530, 0x3b9aca00, 0xc00015a0c0)
/go/src/github.com/gardener/etcd-druid/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:88 +0x4d
created by sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func1
/go/src/github.com/gardener/etcd-druid/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:193 +0x328
Screenshots (if applicable):
Environment (please complete the following information):
Anything else we need to know?:
etcd-druid: v0.1.14
Feature (What you would like to be added):
Move ETCD monitoring configuration into etcd-druid according to gardener's extensions monitoring integration.
Motivation (Why is this needed?):
Keep the monitoring configuration close to the component.
Approach/Hint to the implement solution (optional):
Feature (What you would like to be added):
It would be great if we had a standalone helm chart for druid which can be used outside Gardener.
Motivation (Why is this needed?):
Using druid instead of the etcd-operator and utilizing the backup and restore capabilities.
Approach/Hint to the implement solution (optional):
A cool thing would be to have a chart in the charts repository, released inside this github project via https://github.com/helm/chart-releaser-action in the repository's gh-pages branch.
Feature (What you would like to be added):
The status in the etcd resource does not reflect the current snapshot [full/delta]. Update the etcd resource status to reflect the latest snapshot information.
Motivation (Why is this needed?):
It would help in control plane migration to fetch the latest snapshot for update.
Approach/Hint to the implement solution (optional):
Feature (What you would like to be added):
Deploy/maintain the correct PodDisruptionBudget configuration according to the Etcd resource status conditions.
Motivation (Why is this needed?):
Pick individually executable pieces of the multi-node proposal.
Approach/Hint to the implement solution (optional):
The deployment of the PodDisruptionBudget resource is probably best done in the main controller, and the dynamic modification of the resource based on the Etcd resource status is best done in the custodian controller.
It is probably better to deploy the PodDisruptionBudget only for the multi-node case (spec.replicas > 1) because deploying it for the single-node case might block node drain.
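A small sketch of the intended behaviour, assuming the policy/v1beta1 API, that quorum for n members is n/2+1, and illustrative selector labels:

package controllers

import (
	policyv1beta1 "k8s.io/api/policy/v1beta1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/apimachinery/pkg/util/intstr"

	druidv1alpha1 "github.com/gardener/etcd-druid/api/v1alpha1"
)

// pdbFor returns the desired PodDisruptionBudget for a multi-node cluster, or nil for a
// single-node cluster (where a PDB could block node drains).
func pdbFor(etcd *druidv1alpha1.Etcd, replicas int32, allMembersReady bool) *policyv1beta1.PodDisruptionBudget {
	if replicas <= 1 {
		return nil
	}
	// Keep quorum protected while the cluster is healthy; when members are already down,
	// the custodian controller could adjust this value based on the status conditions.
	minAvailable := replicas/2 + 1
	if !allMembersReady {
		minAvailable = replicas // be conservative while the cluster is degraded
	}
	ma := intstr.FromInt(int(minAvailable))
	return &policyv1beta1.PodDisruptionBudget{
		ObjectMeta: metav1.ObjectMeta{Name: etcd.Name, Namespace: etcd.Namespace},
		Spec: policyv1beta1.PodDisruptionBudgetSpec{
			MinAvailable: &ma,
			Selector:     &metav1.LabelSelector{MatchLabels: map[string]string{"instance": etcd.Name}},
		},
	}
}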
Feature (What you would like to be added):
etcd-druid should not hardcode the annotation "cluster-autoscaler.kubernetes.io/safe-to-evict=false" on etcd pods and should let the user configure it using the annotations field in the CRD.
Motivation (Why is this needed?):
Not every etcd is critical to the system. The above-mentioned annotation is specific to cluster-autoscaler and not to etcd. Depending on the use of the etcd, the CRD creator should have the choice to add this annotation.
From gardener's POV, etcd-main is critical but etcd-events is not that critical. So, the annotation should be set for etcd-main but not for etcd-events. If, in the future, we deploy etcd for the cilium networking extension, this annotation would probably not be required there either.
Approach/Hint to the implement solution (optional):
Remove the annotation from https://github.com/gardener/etcd-druid/blob/master/charts/etcd/templates/etcd-statefulset.yaml#L30.
Feature (What you would like to be added):
A new resource EtcdMember should be added to the druid.gardener.cloud/v1alpha1 API group.
Example:
apiVersion: druid.gardener.cloud/v1alpha1
kind: EtcdMember
metadata:
labels:
gardener.cloud/owned-by: etcd-test
name: etcd-test-0 # pod name
namespace: default
ownerReferences:
- apiVersion: druid.gardener.cloud/v1alpha1
blockOwnerDeletion: true
controller: true
kind: etcd
name: etcd-test
uid: <UID>
status:
id: "1"
lastTransitionTime: "2021-07-20T10:34:04Z"
lastUpdateTime: "2021-07-20T10:34:04Z"
name: member1
reason: up and running
role: Member
status: Ready
Every etcd member in a cluster should have a corresponding EtcdMember resource which contains the shown status information. The EtcdMember resource ought to be created and maintained by the backup-restore sidecar. Etcd-Druid may set status: Unknown after heartbeatGracePeriod (ref).
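A rough sketch of how the backup-restore sidecar could maintain its own EtcdMember status, assuming a Go type generated for the proposed resource shown above (the EtcdMember type and its status fields mirror the example and are otherwise assumptions):

package member

import (
	"context"
	"time"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"sigs.k8s.io/controller-runtime/pkg/client"

	druidv1alpha1 "github.com/gardener/etcd-druid/api/v1alpha1"
)

// renewStatus is called periodically by the backup-restore sidecar of pod `podName`
// to report that its etcd member is up; etcd-druid would flip the status to Unknown if
// no update arrives within heartbeatGracePeriod.
func renewStatus(ctx context.Context, c client.Client, namespace, podName string) error {
	member := &druidv1alpha1.EtcdMember{} // hypothetical type for the proposed resource
	if err := c.Get(ctx, client.ObjectKey{Namespace: namespace, Name: podName}, member); err != nil {
		return err
	}
	original := member.DeepCopy()
	member.Status.Status = "Ready"
	member.Status.Reason = "up and running"
	member.Status.LastUpdateTime = metav1.NewTime(time.Now())
	// Patch only this member's own resource, so concurrent members never conflict
	// on a shared Etcd status object.
	return c.Status().Patch(ctx, member, client.MergeFrom(original))
}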
Motivation (Why is this needed?):
The original proposal intended the status information of each etcd member to be part of a []members list in the etcd.status resource. However, this will lead to update conflicts as multiple clients try to update the same resource at nearly the same time, and we cannot use any adequate patch technique (SSA failed for K8s versions <= 1.21, strategic merge not supported for CRDs) to prevent that.
Subtasks
Feature (What you would like to be added):
#163 introduced locking the main and custodian controllers for every update to the Etcd resource and its status.
This should be avoided and the race conditions in the tests should be solved in a different way.
Motivation (Why is this needed?):
Such synchronisation will lead to performance bottlenecks.
Approach/Hint to the implement solution (optional):
Use StatusWriter.Patch()?
Credit: @timuthy
Feature (What you would like to be added):
Functionality to start a job which will use etcd-backup-restore to copy ETCD backups between backup buckets during the restore phase of Control Plane Migration (you can check the revised GEP here).
The ETCD druid can find out whether it should start such a job via an additional field in the etcd resource, providing information about the source backup bucket. All necessary secrets will be handled by the BackupEntry controller and an additional "source" BackupEntry resource.
Motivation (Why is this needed?):
This is needed to start an etcd-backup-restore copy operation which will be used to copy etcd backups between backup buckets. You can check issue 356 on the etcd-backup-restore repo
Approach/Hint to the implement solution (optional):
A POC was already developed for this, however we did not start the etcd-backup-restore copy operation as a job. Still the main functionality and idea is present in the POC. It is outlined here: gardener/gardener#3875
Describe the bug:
After a single-node etcd instance provisioned via etcd-druid terminated abnormally (non-zero exit code), the etcd container restarted and the backup-restore sidecar container (on data directory verification) had the following logs.
current etcd revision (2314180238) is less than latest snapshot revision (2314180239): possible data loss
On circumventing the backup restoration triggered because of this, it was found that the WAL directory (not checked by the backup-restore sidecar) contained more recent revisions which were applied after the restart (without the backup restoration).
Expected behavior:
etcd-druid should try and configure etcd instances to shut down safely (and flush the WAL changes to the database), or soften the impact when that is not possible.
How To Reproduce (as minimally and precisely as possible):
Not known yet.
Logs:
current etcd revision (2314180238) is less than latest snapshot revision (2314180239): possible data loss
Screenshots (if applicable):
Environment (please complete the following information):
Anything else we need to know?:
Document information about how to deploy etcd-druid and which resources are needed by etcd-druid while reconciling an etcd resource.
Feature (What you would like to be added):
Please add validation code for etcd resources, similar to the validation code that already exists for other Gardener extension resources, even though that is technically still dead code.
We are currently working on a new validating webhook in seed-admission-controller for such extension resources (see gardener/gardener#4293); I think we could include the validation of etcd resources there as well. Alternatively, etcd-druid could introduce its own validating webhook if for whatever reason the above option is not good enough.
Motivation (Why is this needed?):
We recently had a rather severe issue that could have been prevented if we had such validation in place, see gardener/gardener-extension-provider-azure#328 (comment). In this particular case, gardenlet was generating an etcd resource with a spec.backup.store.prefix set to -- due to a data race. With validation in place, we could have detected -- as an invalid spec.backup.store.prefix and prevented the reconciliation from continuing. This particular issue is already fixed in gardenlet (see gardener/gardener#4459 and gardener/gardener#4454), but similar issues may occur in the future.
Approach/Hint to the implement solution (optional):
Feature (What you would like to be added):
Currently, the health check of the etcd pods is linked to the backup health (last backup upload succeeded) in addition to just etcd health. But as long as etcd data is backed by persistent volumes (it is now), we can afford for etcd to continue serving incoming requests even when backup upload fails, as long as high-priority alerts are triggered when backup upload fails and follow-up is done to resolve the issue.
Motivation (Why is this needed?):
Avoid bringing down the whole shoot cluster control-plane when backup upload fails, as that basically brings the cluster to a grinding halt. This might be affordable if etcd data is backed by persistent volumes, because for data loss to occur a further data corruption in the persistent volumes is required (while backup upload is failing).
See also https://github.tools.sap/kubernetes-canary/issues-canary/issues/599
Approach/Hint to the implement solution (optional):
The following tasks might have to be checked/evaluated.
Feature (What you would like to be added):
Summarise the roadmap for etcd-druid with links to the corresponding issues.
Motivation (Why is this needed?):
A central place to collect the roadmap as well as the progress.
Approach/Hint to the implement solution (optional):
- StatefulSet (with replicas: 1) with the containers for etcd and etcd-backup-restore the same way it is being done now.
- etcd defragmentation schedule from the CRD to etcd-backup-restore sidecar container.
- etcd cluster
- etcd nodes within the same Kubernetes cluster.
- etcd nodes in the same Kubernetes cluster/namespace as the CRD instance.
- Scale sub-resource implementation for the current CRD
- etcd learners/members during scale up, including quorum adjustment.
- etcd members during scale down, including quorum adjustment.
- etcd cluster
- etcd nodes distributed across availability zones in the hosting Kubernetes cluster
- etcd node in a different Kubernetes cluster.
- etcd node will be provisioned via a separate CRD instance in a different Kubernetes cluster but these nodes will be configured to find each other to form an etcd cluster.
- etcd cluster.
- etcd learners/members during scale up, including quorum adjustment.
- etcd members during scale down, including quorum adjustment.
- etcd cluster
- VerticalPodAutoscaler supports multiple update policies including recreate, initial and off.
- recreate policy is clearly not suitable for a single-node etcd instance because of the implications on frequent, unpredictable and unmanaged down-time.
- initial policy does not make sense for etcd considering the longer database verification time for non-graceful shutdown.
- etcd instance, vertical scaling via the VerticalPodAutoscaler would always be disruptive because of the way scaling is done by VPA. It gives no opportunity to take action before the etcd pod(s) are disrupted for scaling.
- etcd-specific steps to mitigate the disruption during (vertical) scaling if an alternative way is used to vertically scale a CRD instead of the individual pods directly.
- etcd instance, updates would be disruptive.
- etcd-specific steps to mitigate the disruption during updates.
- etcd instance which might mean that the memory requirement for database restoration is almost certain to be proportionate to the database size. However, the memory requirement for backup (full and delta) need not be proportionate to the database size at all. In fact, it is very realistic to expect that the memory requirement for backup be more or less independent of the database size.
Describe the bug:
The main controller reconciles changes to the Etcd resource spec even if the gardener.cloud/reconcile annotation is not added. This is against the gardener extension contract.
Expected behavior:
The main controller should use the predicates in such a way that changes to the Etcd resource spec are reconciled only when the resource is also annotated appropriately. For example, see here.
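A sketch of such a predicate; the annotation key and expected value are illustrative:

package controllers

import (
	"sigs.k8s.io/controller-runtime/pkg/event"
	"sigs.k8s.io/controller-runtime/pkg/predicate"
)

const reconcileAnnotation = "gardener.cloud/operation" // illustrative; expected value: "reconcile"

// hasReconcileAnnotation only lets create/update events pass when the operator (gardenlet)
// has explicitly requested a reconciliation via the annotation, matching the extension contract.
func hasReconcileAnnotation() predicate.Predicate {
	hasAnnotation := func(annotations map[string]string) bool {
		return annotations[reconcileAnnotation] == "reconcile"
	}
	return predicate.Funcs{
		CreateFunc: func(e event.CreateEvent) bool {
			return hasAnnotation(e.Object.GetAnnotations())
		},
		UpdateFunc: func(e event.UpdateEvent) bool {
			return hasAnnotation(e.ObjectNew.GetAnnotations())
		},
	}
}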
How To Reproduce (as minimally and precisely as possible):
Logs:
Screenshots (if applicable):
Environment (please complete the following information):
Anything else we need to know?:
Feature (What you would like to be added):
Move to etcd v3.4.10 or the latest v3.4.x patch release.
Motivation (Why is this needed?):
Approach/Hint to the implement solution (optional):
We might have to build our own custom image for etcd to package the dependencies of the bootstrap script (wget).
Feature (What you would like to be added):
We should create a suite of automated tests that test gardener integration.
Motivation (Why is this needed?):
We should detect as many regression and backward-compatibility issues as possible before merging a PR, to keep the master release-ready at any point in time.
Approach/Hint to the implement solution (optional):
Feature (What you would like to be added):
The reconciliation flow of etcd-druid includes claiming from potentially multiple pre-existing StatefulSet, Service and ConfigMap objects if they exist. This is done by selecting the objects based on spec.selector in the Etcd resource, claiming one of the matching objects (if any) and deleting the rest of the objects (if any). If no matching objects are found then a new object is created.
The logic of claiming from multiple pre-existing objects based on spec.selector was done because of the following reasons.
- The migration scenario when etcd-druid was introduced, i.e. adopting objects created from the time before etcd-druid was introduced minimised and simplified clean up.
- The option of using one StatefulSet for all the members of an ETCD cluster. Another alternative of using one StatefulSet for each member of an ETCD cluster was still open at that time.
Now that the migration scenario as well as the multi-node design don't need the functionality of claiming from multiple pre-existing objects, we can simplify the claim logic to just pick the object to be claimed by the same name as the Etcd resource. We will still need the claim functionality to mark it as claimed, of course.
Motivation (Why is this needed?):
Approach/Hint to the implement solution (optional):
Feature (What you would like to be added): ETCD druid should create multiple ETCD instances (along with ETCDBR instances) as specified in the ETCD CRD.
Motivation (Why is this needed?): To allow bootstrapping of a multi-node ETCD cluster for a shoot cluster.
Approach/Hint to the implement solution (optional):
Refer: #107