foriequal0 / pod-graceful-drain

You don't need `lifecycle: { preStop: { exec: { command: ["sleep", "30"] } } }`

License: Apache License 2.0

Dockerfile 1.19% Makefile 2.55% Smarty 2.98% Go 93.28%

pod-graceful-drain's Introduction


Pod Graceful Drain

You don't need lifecycle: { preStop: { exec: { command: ["sleep", "30"] } } }

Installation

helm install \
  --repo https://foriequal0.github.io/pod-graceful-drain \
  --namespace kube-system \
  pod-graceful-drain \
  pod-graceful-drain

What is this?

Have you ever suffered 5xx errors on your load balancer when you roll out a new deployment? Have you ever applied this ugly mitigation even though your app is able to shut down gracefully?

lifecycle:
  preStop:
    exec:
      command: ["sleep", "30"]

As far as I know, Kubernetes has no facility to notify a pod's dependent subsystems that the pod is about to be terminated and to reach a consensus that it is okay to terminate it. So during a deployment rollout, a pod is terminated first, and the subsystems reconcile after the fact. There is an inevitable delay between the pod termination and that reconciliation. Eventually, endpoints remove the pod from their lists, and load balancer controllers deregister the pod and start to drain its traffic. But what's the point of draining when the pod is already terminated? By the time the deregistration has fully propagated, it is too late. This is the cause of the load balancer 5xx errors. You can reduce the delay, but you can't eliminate it.

So that's why everyone suggests sleep 30 while admitting it is an ugly hack, regardless of whether the app can shut down gracefully. It delays the SIGTERM to the app while the pod is in the terminating state, so the dependent subsystems can reconcile during the delay. However, the sleep command might not be available in some containers. You might need to apply mistake-prone patches to some chart distributions. And it is ugly. The problem doesn't seem likely to be solved in the foreseeable future, and related issues keep getting closed by bots due to inactivity.

pod-graceful-drain solves this by abusing admission webhooks. It intercepts pod deletion/eviction requests and prevents the pod from being terminated for a period. It takes whatever method is appropriate to delay the deletion: denying the admission, responding to the admission very slowly, mutating the eviction request into a dry run, etc. The pod is then eventually terminated by the controller after the designated timeout. Thanks to this delay, traffic is drained safely, since the pod is still alive and can serve misdirected new traffic.
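
For illustration, this interception is registered as an ordinary admission webhook on pod deletions. A rough sketch of such a registration follows; every name, path, and value here is an illustrative assumption, not the actual configuration installed by the chart:

# Sketch of a webhook registration that intercepts pod deletions.
# Names, namespace, and service path are illustrative assumptions.
apiVersion: admissionregistration.k8s.io/v1
kind: ValidatingWebhookConfiguration
metadata:
  name: example-pod-graceful-drain
webhooks:
  - name: pods-delete.example.com
    admissionReviewVersions: ["v1"]
    sideEffects: NoneOnDryRun
    failurePolicy: Ignore      # don't block deletions if the webhook itself is down
    timeoutSeconds: 30         # upper bound on how long a single admission call may stall
    clientConfig:
      service:
        namespace: kube-system
        name: pod-graceful-drain
        path: /validate-pods   # hypothetical path
    rules:
      - apiGroups: [""]
        apiVersions: ["v1"]
        resources: ["pods"]
        operations: ["DELETE"]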

Another goal is to make sure it doesn't interfere with common tasks such as deployment rollouts or kubectl drain. By removing the labels that tie the pod to its ReplicaSet, the rollout proceeds as if the pod had been terminated, without actually terminating it. It also mutates the pods/eviction request that kubectl drain usually makes into a dry run, then isolates and eventually terminates the pod.
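
As a rough sketch of that label isolation (the keys below are assumptions; the controller's actual bookkeeping labels and annotations differ), detaching a pod that a Deployment selects via app: my-app could be expressed as a merge patch that removes the selector label and stashes the original value:

# Hypothetical merge patch: removing the selector label detaches the pod from
# its ReplicaSet, so a replacement is created while this pod keeps serving.
metadata:
  labels:
    app: null                                        # removes the "app: my-app" label
  annotations:
    example.io/original-labels: '{"app":"my-app"}'   # hypothetical bookkeeping key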

I find this more 'graceful' than the brutal sleep. It can still feel ad hoc and hacky, but duct tape is fine if it is hidden in the wall (until it leaks).

pod-graceful-drain's People

Contributors

ajaykumarmandapati, foriequal0, hatemosphere, infusible, nvanheuverzwijn


pod-graceful-drain's Issues

Finalizer support

I'm trying to understand why you made use of admission webhooks instead of finalizers.

With finalizers we wouldn't have to artificially delay anything and could just re-queue the terminating pod to be checked again. Pods would get into the "Terminating" state, the load balancer (e.g. aws-load-balancer-controller) would handle removing the pod from the target group, and as soon as the pod's IP isn't registered in the LB anymore we could release the finalizer.

Just asking myself whether that's something we could contribute.
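
For context, a finalizer-based approach would look roughly like the sketch below (the finalizer name is purely hypothetical):

# Hypothetical finalizer: the pod stays in Terminating until the controller
# confirms the load balancer has deregistered it and removes the finalizer.
apiVersion: v1
kind: Pod
metadata:
  name: my-app-6d4cf56db6-abcde                   # illustrative pod name
  finalizers:
    - example.io/wait-for-target-deregistration   # hypothetical finalizer name

One caveat worth noting: a pod finalizer keeps the API object from being removed, but it does not by itself delay the kubelet from sending SIGTERM once the deletion timestamp is set, which may be part of why the webhook approach was chosen instead.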

deprecation of rbac.authorization.k8s.io/v1beta1

We would need to upgrade our EKS version from 1.16 to 1.17, which deprecates rbac.authorization.k8s.io/v1beta1. We found references to it in your code but could not completely tell whether they are actively used in the project.
Could you confirm whether these would be upgraded in the near future, since we might move to EKS 1.21 in the coming months?
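
For reference, the migration itself is normally just bumping the apiVersion on the RBAC manifests, since rbac.authorization.k8s.io/v1 has been available since Kubernetes 1.8 and v1beta1 was removed in 1.22. The rule contents below are illustrative only:

# Illustrative ClusterRole; only the apiVersion line is the actual migration.
apiVersion: rbac.authorization.k8s.io/v1   # was: rbac.authorization.k8s.io/v1beta1
kind: ClusterRole
metadata:
  name: pod-graceful-drain                 # name for illustration
rules:
  - apiGroups: [""]
    resources: ["pods"]
    verbs: ["get", "list", "watch", "delete"]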

Eviction can be handled differently

We might use a MutatingAdmissionWebhook to set dryRun on the eviction and intercept the pod deletion.
MutatingAdmissionWebhook doesn't work on the DELETE verb (pod deletion), but pod eviction is a CREATE verb.
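
As a sketch of that idea (all field values are illustrative), the webhook would rewrite the incoming Eviction so the API server only performs a dry run, leaving the real deletion to pod-graceful-drain:

# Eviction as submitted by kubectl drain (a CREATE on pods/eviction), with the
# mutation applied: mark it as a dry run so the API server does not delete the pod.
apiVersion: policy/v1             # older clusters use policy/v1beta1
kind: Eviction
metadata:
  name: my-app-6d4cf56db6-abcde   # illustrative pod name
  namespace: default
deleteOptions:
  dryRun: ["All"]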

Proper HA setup

During node draining, pod-graceful-drain itself may also be evicted.
I deliberately chose to ignore webhook admission failures, since otherwise deployments would fail to progress.
Because of this, pods that are evicted/deleted at that time can suffer downtime even with multiple replicas.

To fix this, pod-graceful-drain needs a HA setup.
However, simply giving it replicas: 2 on the deployment won't work.
It would not behave correctly when there are multiple replicas of it.

Unable to upgrade cluster because of a drain error related to admission webhook denied request: no kind "Eviction" is registered for version

Hi,

I've been using your package successfully for a few months now. Today I tried to upgrade my EKS cluster from 1.21 to 1.22 and I think this issue is related to pod graceful drain.

The worker nodes couldn't be drained because I got this type of error for a bunch of pods. I've included the exact error for all of the pods that couldn't be evicted, which prevented the nodes from being drained/upgraded.

It's a test cluster so this is basically everything running on the cluster:

error when evicting pods/"argocd-redis-d486999b7-sgptn" -n "argocd": admission webhook "mpodseviction.pod-graceful-drain.io" denied the request: no kind "Eviction" is registered for version "policy/v1" in scheme "pkg/runtime/scheme.go:100"
error when evicting pods/"argocd-server-cb57f685d-22bng" -n "argocd": admission webhook "mpodseviction.pod-graceful-drain.io" denied the request: no kind "Eviction" is registered for version "policy/v1" in scheme "pkg/runtime/scheme.go:100"
error when evicting pods/"argocd-notifications-controller-5f8c5d6fc5-ldqlp" -n "argocd": admission webhook "mpodseviction.pod-graceful-drain.io" denied the request: no kind "Eviction" is registered for version "policy/v1" in scheme "pkg/runtime/scheme.go:100"
error when evicting pods/"coredns-85d5b4454c-dskk9" -n "kube-system": admission webhook "mpodseviction.pod-graceful-drain.io" denied the request: no kind "Eviction" is registered for version "policy/v1" in scheme "pkg/runtime/scheme.go:100"
error when evicting pods/"argocd-dex-server-64cb85bf46-pfbvx" -n "argocd": admission webhook "mpodseviction.pod-graceful-drain.io" denied the request: no kind "Eviction" is registered for version "policy/v1" in scheme "pkg/runtime/scheme.go:100"
error when evicting pods/"argocd-applicationset-controller-66689cbf4b-5k85t" -n "argocd": admission webhook "mpodseviction.pod-graceful-drain.io" denied the request: no kind "Eviction" is registered for version "policy/v1" in scheme "pkg/runtime/scheme.go:100"
error when evicting pods/"aws-load-balancer-controller-597f47c4df-mskv2" -n "kube-system": admission webhook "mpodseviction.pod-graceful-drain.io" denied the request: no kind "Eviction" is registered for version "policy/v1" in scheme "pkg/runtime/scheme.go:100"
error when evicting pods/"coredns-85d5b4454c-w9m7n" -n "kube-system": admission webhook "mpodseviction.pod-graceful-drain.io" denied the request: no kind "Eviction" is registered for version "policy/v1" in scheme "pkg/runtime/scheme.go:100"
error when evicting pods/"argocd-application-controller-0" -n "argocd": admission webhook "mpodseviction.pod-graceful-drain.io" denied the request: no kind "Eviction" is registered for version "policy/v1" in scheme "pkg/runtime/scheme.go:100"
error when evicting pods/"sealed-secrets-controller-5fb95c87fd-b25g8" -n "kube-system": admission webhook "mpodseviction.pod-graceful-drain.io" denied the request: no kind "Eviction" is registered for version "policy/v1" in scheme "pkg/runtime/scheme.go:100"
error when evicting pods/"argocd-repo-server-8576d68689-rsgww" -n "argocd": admission webhook "mpodseviction.pod-graceful-drain.io" denied the request: no kind "Eviction" is registered for version "policy/v1" in scheme "pkg/runtime/scheme.go:100"
error when evicting pods/"pod-graceful-drain-949674d56-stp7g" -n "kube-system": admission webhook "mpodseviction.pod-graceful-drain.io" denied the request: no kind "Eviction" is registered for version "policy/v1" in scheme "pkg/runtime/scheme.go:100"
error when evicting pods/"aws-load-balancer-controller-597f47c4df-ph64g" -n "kube-system": admission webhook "mpodseviction.pod-graceful-drain.io" denied the request: no kind "Eviction" is registered for version "policy/v1" in scheme "pkg/runtime/scheme.go:100"

Any tips on where to go from here?

Prevent 5xx errors when evicting deployments that have a small replica count

If your deployments are well replicated, this is not a problem.
However, when they are not, there will still be ALB 5xx errors while they are evicted.
These evictions can be triggered by kubectl drain, node termination, etc.

Evicted pods should be handled in this order:

  1. The pod is requested to be evicted.
  2. Increase the replica count of the requested pod's deployment. However, you can't isolate the pod yet, since that would start draining it before a replacement is ready.
  3. When the new pods are ready, you can isolate the pod, restore the replica count, and start the delayed deletion.

Related comment: #31 (comment)

Support other load balancers

Currently it works only with aws-load-balancer-controller. Would it be worth supporting other load balancer controllers, if any of them have the same problem?

What specifically should namespaceSelector values be?

I'm testing out your chart because the 502 issue on draining is getting out of hand.

I am trying to restrict it with the namespaceSelector, but none of the values I've tried work. Can you provide an example of what is expected?

Also, thank you for this chart! I'm glad there is something to help. Hoping that better support makes it into AWS Load Balancer itself.
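
For reference, namespaceSelector on a webhook configuration is a standard Kubernetes label selector, so a value with the following shape is what is expected; the exact Helm values key for this chart is an assumption, so check the chart's values.yaml:

# Hypothetical Helm values snippet: only act on namespaces labeled
# pod-graceful-drain=enabled. The top-level key name is assumed.
namespaceSelector:
  matchLabels:
    pod-graceful-drain: enabled

The target namespaces would then need that label applied to them; matchExpressions work as well if you prefer an expression-based selector.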

Is the helm chart namespace specific?

If my applications are deployed in the test namespace, does this helm chart need to be deployed in the test namespace as well?

I have deployed it in kube-system; will it take care of all the namespaces?

Fix cert-manager

The Kustomize action fails with these errors:

mutatingwebhookconfiguration.admissionregistration.k8s.io/pod-graceful-drain-mutating-webhook-configuration created
Error from server (InternalError): error when creating "STDIN": Internal error occurred: failed calling webhook "webhook.cert-manager.io": Post "https://cert-manager-webhook.cert-manager.svc:443/mutate?timeout=10s": dial tcp 10.110.109.37:443: i/o timeout
validatingwebhookconfiguration.admissionregistration.k8s.io/pod-graceful-drain-validating-webhook-configuration created
Error from server (InternalError): error when creating "STDIN": Internal error occurred: failed calling webhook "webhook.cert-manager.io": Post "https://cert-manager-webhook.cert-manager.svc:443/mutate?timeout=10s": context deadline exceeded
