Comments (17)
I've been busy launching a new service on my team, and I recently got some time to work on this project.
I'm redesigning this project from scratch. It has a simpler and sounder structure.
I'll let you know when I'm ready.
from pod-graceful-drain.
Hi,
Thank you for your consideration! But I don't think I can accept your donation.
My schedule at my current job is very tight right now 😢, so I can't give you a specific time frame.
However, I want you to understand that I haven't given up on this issue.
I'll try to squeeze in as much work as possible, since I also rely on this and have a bunch of pending security patches.
Hi,
I'm really sorry that I haven't replied to you for so long.
I've had a hard time taking care of things due to personal and work issues. They are getting better, and I expect most of them to be resolved by the end of December this year.
I think you might be able to mitigate the downtime risk by carefully orchestrating the process until then.
It might look like this:
- Prepare new worker nodes.
- Migrate pod-graceful-drain to one of those nodes. The rest of the workloads should stay put during this step, and the migration should be quick enough. Any of them might get downtime with 5xx errors if they're evicted during this step, since pod-graceful-drain might not be available (replica < 1) or might behave unexpectedly (replica > 1) depending on how you migrate it.
- Check that pod-graceful-drain is working correctly once it is stable on its new node.
- The rest of the workloads can migrate while pod-graceful-drain is stable.
- Remove the old worker nodes once all migration is finished.
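For reference, the steps above could be sketched with kubectl roughly like this. The node name, namespace, label, and deployment name are hypothetical; adjust them to your cluster and to however you deploy pod-graceful-drain:

```shell
# 1. Prepare new worker nodes (e.g. scale up a new node group), then:

# 2. Migrate pod-graceful-drain: cordon its current node and delete its pod
#    so the scheduler recreates it on one of the new nodes.
kubectl cordon old-node-1
kubectl -n kube-system delete pod -l app.kubernetes.io/name=pod-graceful-drain

# 3. Check it is stable and Ready on its new node before touching the rest.
kubectl -n kube-system rollout status deployment/pod-graceful-drain

# 4. Migrate the remaining workloads by draining the old nodes.
kubectl drain old-node-1 --ignore-daemonsets --delete-emptydir-data

# 5. Remove the old worker nodes once all migration is finished.
```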
Hi,
I know you're not under any obligation to develop this, and I appreciate all you've done already, but is there any work in progress towards this or #33? At the moment this prevents upgrading nodes without hard downtime.
I really wish I could contribute, but low-level custom Kubernetes add-ons are still above my pay grade. That may not be true forever, but it is for the foreseeable future.
Hi,
I'm looking for a way to implement this without external components, but at the moment I'm busy with work, so it will take a few months to get started on this issue.
Hi,
Definitely not rushing you, but:
I'd love to continue using this tool, but not being able to upgrade nodes on our cluster without downtime is becoming an issue: we can't apply security patches to the underlying EC2 boxes by creating new node groups without introducing downtime on our apps.
With that said, are you open to receiving USD donations to help expedite this ticket and any related tickets, with the goal of zero-downtime node upgrades while using pod-graceful-drain on your pods?
If so feel free to email me at [email protected]
from the email address listed in your GitHub profile, and we can go from there. If not, that's totally ok too. I fully understand you're not here to maintain your open source work on anyone's terms except your own. :D
Also if you need any assistance in testing anything (regardless of your decision) let me know. Happy to do anything I can do.
Here's some news:
Good: I told my boss that I'm going to work on this issue during the worktime.
Bad: The schedule situation hasn't changed.
Ugly: We've started crunching.
It's all good. It is nice to see you're using this at work too.
Hi,
Where do you stand on this since it's been a bit? I'm just probing for information, not trying to push you into anything!
On the bright side, your project is so wildly useful that we're happily using it and would love to continue using it.
@nickjj potential workaround:
Using selectors/taints, you can run this on separate nodes from where the workloads requiring the graceful shutdown run.
For example, on AWS we have all our controllers running in one EKS ASG, and then our services running in another. For upgrades or scaling events, we can do these two groups independently.
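A minimal sketch of that isolation, assuming a hypothetical `dedicated=controllers:NoSchedule` taint on the controller node group and a matching `node-role: controllers` label (these names are illustrative, not from the chart):

```yaml
# Deployment pod template fragment pinning pod-graceful-drain to the
# controller node group; the label and taint values below are hypothetical.
spec:
  template:
    spec:
      nodeSelector:
        node-role: controllers
      tolerations:
        - key: dedicated
          operator: Equal
          value: controllers
          effect: NoSchedule
```

With that in place, draining or replacing the service node group never touches pod-graceful-drain, and vice versa.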
@nkkowa We have a small cluster with (2) nodes with no auto-scaling and we run (2) replicas of most of our services. This tool itself ends up always running on 1 of the nodes. We haven't yet configured anything at the K8s level to guarantee 1 replica runs on each node because K8s has handled this pretty well out of the box.
Were you suggesting something that could still work with that in mind? I also update the cluster usually with Terraform (EKS module) which handles the whole taint+drain+etc. process.
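(For what it's worth, one way to guarantee one replica per node is required pod anti-affinity keyed to the node hostname; the `app: my-service` label here is hypothetical:)

```yaml
# Deployment pod template fragment: forbid two replicas of the same app
# from landing on the same node.
spec:
  template:
    spec:
      affinity:
        podAntiAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            - topologyKey: kubernetes.io/hostname
              labelSelector:
                matchLabels:
                  app: my-service   # hypothetical pod label
```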
@nickjj Ah, without an architecture that has dedicated nodes for controllers/operators I don't think my setup would work for you 😞
It's taking more time to write tests and polish up non-critical features.
The core functionality was almost finished in just the first few days.
I'll pre-release it without these non-critical features soon.
Sounds great. This is a critical tool in general, don't feel rushed!
I'd personally be happy to wait a bit longer for extensive test coverage and the features you deem worthy. What are the non-critical features, btw?
Thank you. I was trying some leader election based minor optimizations.
Hi,
Any updates on this? Not trying to be pushy in the slightest. EKS 1.23 will be end of life in a few months and I was wondering if I should upgrade the cluster with a minute of app downtime or potentially wait until this patch is available to do it without downtime.
It's all good, hope things are going well. I ended up taking the downtime approach a few months ago; it was less than a minute or so of downtime, but it was very hard to control. Thanks for the checklist, hopefully in a few months it can go away!