Cloud-ops automation runbooks that are ready to use. Build your own automations using the hundreds of drag-and-drop actions included in the repository. Built on Jupyter Notebooks, our automation platform jumpstarts your SRE RunBook creation. Published by the unSkript community.
Is your feature request related to a problem? Please describe.
A higher-than-expected percentage of the operations kube-apiserver performs are returning errors. Since random errors are inevitable, kube-apiserver has a "budget" of errors that it is allowed to make before this alert is triggered. It would be good to have a runbook that executes the standard set of steps whenever this happens.
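To make the error-budget idea concrete, here is a minimal sketch of the check such a runbook might start with. The function name, the `BudgetStatus` type, and the 99% `slo_target` default are assumptions for illustration, not values taken from the alert definition.

```python
from dataclasses import dataclass


@dataclass
class BudgetStatus:
    error_ratio: float      # fraction of requests that failed
    budget_consumed: float  # 1.0 means the whole budget is used up
    exhausted: bool


def error_budget_status(total_requests: int, failed_requests: int,
                        slo_target: float = 0.99) -> BudgetStatus:
    """Compare the observed error ratio with the budget implied by an SLO.

    slo_target is a hypothetical availability target: 0.99 allows 1% errors.
    """
    if total_requests <= 0:
        # No traffic observed, so nothing has been spent.
        return BudgetStatus(0.0, 0.0, False)
    allowed_ratio = 1.0 - slo_target
    error_ratio = failed_requests / total_requests
    if allowed_ratio <= 0:
        # A 100% target leaves no budget: any error exhausts it.
        consumed = float("inf") if failed_requests > 0 else 0.0
    else:
        consumed = error_ratio / allowed_ratio
    return BudgetStatus(error_ratio, consumed, consumed >= 1.0)
```

A runbook could feed this with request counts from whatever metrics source is available and only proceed to diagnosis steps when `exhausted` is true.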
Is your feature request related to a problem? Please describe.
Kubelets have a configuration setting that limits how many Pods they can run. The default is 110 Pods per Kubelet, but the value is configurable. It would be great to have a runbook that detects when a Kubelet is running more Pods than its desired capacity and mitigates the issue.
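The detection half of that runbook could look roughly like the sketch below. It assumes the caller has already collected the `spec.nodeName` of every Pod (for example from `kubectl get pods -A -o json`); the function and parameter names are hypothetical.

```python
from collections import Counter

DEFAULT_MAX_PODS = 110  # kubelet default mentioned above


def nodes_over_capacity(pod_node_names, max_pods_by_node=None,
                        default_max=DEFAULT_MAX_PODS):
    """Return {node: pod_count} for nodes running more Pods than their limit.

    pod_node_names: iterable of spec.nodeName values, one per Pod
                    (None or "" for Pods that are not scheduled yet).
    max_pods_by_node: optional per-node overrides of the Pod limit.
    """
    counts = Counter(name for name in pod_node_names if name)
    limits = max_pods_by_node or {}
    return {
        node: count
        for node, count in counts.items()
        if count > limits.get(node, default_max)
    }
```

Mitigation (cordoning the node, rescheduling Pods, and so on) is deliberately left out, since the right action depends on the cluster.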
An AMI is region-specific, but we often need the same AMI to be available in multiple regions.
Create a lego that replicates an AMI to the given regions (all regions if none are specified).
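One possible shape for such a lego is sketched below using boto3's EC2 `copy_image` and `describe_regions` calls. The function name, its parameters, and the injectable `client_factory` are assumptions made for illustration; this is not the repository's actual lego.

```python
def replicate_ami(image_id, source_region, name,
                  target_regions=None, client_factory=None):
    """Copy an AMI from source_region into each target region.

    Returns {region: new_image_id}. If target_regions is empty or None,
    every region returned by describe_regions is used.
    client_factory(region) must return an EC2 client; it is injectable
    so the function can be exercised without AWS credentials.
    """
    if client_factory is None:
        import boto3  # imported lazily; only needed for real AWS calls

        def client_factory(region):
            return boto3.client("ec2", region_name=region)

    if not target_regions:
        response = client_factory(source_region).describe_regions()
        target_regions = [r["RegionName"] for r in response["Regions"]]

    copies = {}
    for region in target_regions:
        if region == source_region:
            continue  # the AMI already exists here
        # copy_image is issued against the destination region's client.
        result = client_factory(region).copy_image(
            Name=name,
            SourceImageId=image_id,
            SourceRegion=source_region,
        )
        copies[region] = result["ImageId"]
    return copies
```

The copies are started asynchronously by AWS, so a caller that needs usable images would still have to wait for each new AMI to become available.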
Is your feature request related to a problem? Please describe.
A runbook that checks whether a Kube Node is ready and, if it is not, prints the reason.
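As a rough sketch, the check can be expressed over the Node object's `status.conditions` list, where the `Ready` condition carries a `status`, `reason`, and `message`. The code assumes the node is passed in as a plain dict (for example one item from `kubectl get nodes -o json`); the function name is hypothetical.

```python
def node_readiness(node):
    """Return (name, is_ready, reason) for one Node object given as a dict."""
    name = node.get("metadata", {}).get("name", "<unknown>")
    for cond in node.get("status", {}).get("conditions", []):
        if cond.get("type") != "Ready":
            continue
        if cond.get("status") == "True":
            return name, True, ""
        # status is "False" or "Unknown": report why.
        reason = cond.get("reason", "Unknown")
        message = cond.get("message", "")
        detail = f"{reason}: {message}" if message else reason
        return name, False, detail
    return name, False, "no Ready condition reported"


def print_not_ready(nodes):
    """Print one line per node that is not ready."""
    for node in nodes:
        name, ready, reason = node_readiness(node)
        if not ready:
            print(f"{name} is not ready ({reason})")
```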
Is your feature request related to a problem? Please describe.
There can be various reasons why a volume is filling up. It would be useful to have a runbook that covers the generic reasons why volumes legitimately fill up, along with generic mitigations for specific issues.
Is your feature request related to a problem? Please describe.
The GitHub workflow that is currently checked in runs pylint on all Python files. It can be improved to run pylint on changed files only, which would reduce the GitHub workflow run minutes.
What is to be done?
Enhance the existing workflow so that it runs pylint only on changed .py files.
Make this a pre-check that must pass in order to open a Pull Request.
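A minimal sketch of the "changed files only" step is shown below, as a helper script a workflow could call. The `origin/master` default for `base_ref` and both function names are assumptions; the script relies only on `git diff --name-only` and `python -m pylint`.

```python
import subprocess
import sys


def changed_python_files(diff_output):
    """Keep only the .py paths from `git diff --name-only` output."""
    paths = (line.strip() for line in diff_output.splitlines())
    return [path for path in paths if path.endswith(".py")]


def lint_changed(base_ref="origin/master"):
    """Run pylint on Python files changed relative to base_ref.

    Returns pylint's exit code, or 0 when no Python file changed.
    base_ref is a hypothetical default; use the PR's target branch.
    """
    diff = subprocess.run(
        # ACMR: added, copied, modified, renamed (deleted files are skipped)
        ["git", "diff", "--name-only", "--diff-filter=ACMR",
         f"{base_ref}...HEAD"],
        check=True, capture_output=True, text=True,
    ).stdout
    files = changed_python_files(diff)
    if not files:
        return 0
    return subprocess.run([sys.executable, "-m", "pylint", *files]).returncode
```

A workflow step could call `sys.exit(lint_changed())` so that a non-zero pylint exit code fails the check.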
Is your feature request related to a problem? Please describe.
A runbook that checks the state of the Kube API every few minutes and, whenever the Kube API is down, runs diagnosis and remediation or sends an alert.
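The polling part of this request might be sketched as follows. It assumes the API server's `/livez` health endpoint is reachable from where the runbook runs with the supplied TLS context; `watch`, its parameters, and the five-minute default interval are hypothetical.

```python
import time
import urllib.request


def api_is_healthy(base_url, timeout=5.0, context=None):
    """Return True if GET {base_url}/livez answers with HTTP 200."""
    try:
        with urllib.request.urlopen(f"{base_url}/livez", timeout=timeout,
                                    context=context) as resp:
            return resp.status == 200
    except OSError:
        # URLError, HTTPError, and socket timeouts all derive from OSError.
        return False


def watch(probe, on_down, interval_seconds=300, max_checks=None,
          sleep=time.sleep):
    """Call probe() repeatedly; call on_down() each time it returns False.

    Runs forever when max_checks is None. Returns the number of failed
    checks, which is mainly useful for bounded runs and tests.
    """
    checks = 0
    down_events = 0
    while max_checks is None or checks < max_checks:
        if not probe():
            on_down()  # diagnosis, remediation, or alerting goes here
            down_events += 1
        checks += 1
        if max_checks is None or checks < max_checks:
            sleep(interval_seconds)
    return down_events
```

Passing the probe and the `on_down` handler as callables keeps the loop independent of how the cluster is reached and of which alerting channel is used.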