WIP
This issue is under construction and will be updated as we go. It serves to track the project's design.
## Problem
The problem we are trying to solve is as follows: we want to control which nodes the pods in a given namespace are scheduled on, without having to touch every workload manifest in that namespace.
## Projects with similar solutions
Both of these projects try to solve it using a mutating admission webhook. While the second appears abandoned, the first seems more active. It, however, has a few issues that we'd like to tackle better; see the Requirements section.
Kyverno also allows doing this through its policies. Kyverno, however, is a massive solution with a lot of complexity, and we aim for something simpler.
## Requirements
- This is a simple application with a simple purpose. We expand features around the idea of controlling where pods are scheduled based on their namespace, but nothing else.
- We should minimize configuration whenever possible. For example, if we use a label, it should carry as much information as possible.
- Ideally, the configuration for a namespace should live in the namespace itself. This follows Kubernetes' design, where most resources are configured right where they live. For example: you don't need to change the Ingress Controller to configure an Ingress object. This might be hard in our case, but we should still strive to keep the configuration as concentrated on the namespace as possible.
- Since this is a Kubernetes project, we should package it in a way that's easy to install in Kubernetes, and we should not worry as much about other cases.
- We should strive to affect as little of the cluster as possible.
- We should strive to test as much as possible. What we are doing is "dangerous" in the sense that it can literally break pods. We can't afford to push updates willy-nilly.
## Proposal
### Kubernetes
Similarly to other projects, we can use Kubernetes' mutating admission webhooks to intercept pod creation requests and mutate them. We should be able to use the `namespaceSelector` field to limit operation to only the namespaces we want to match. If we use it with the `Exists` operator, we should even be able to use the same label to configure which nodes the namespace belongs to (more on that in the Configuration section).
If we use that approach, we should be careful to follow all the best practices (a sketch of the webhook registration follows this list):
- Idempotence
- Handling all pods
- Low Latency
- Validating the result (maybe a toggle?)
- Excluding our own namespace (maybe a toggle?)
- No side effects
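To make the registration concrete, here's a minimal sketch of what the webhook configuration might look like; the service name, namespace, and `/mutate` path are placeholders, not decisions:

```yaml
apiVersion: admissionregistration.k8s.io/v1
kind: MutatingWebhookConfiguration
metadata:
  name: pod-director
webhooks:
  - name: pod-director.pod-director.svc
    admissionReviewVersions: ["v1"]
    sideEffects: None            # we only mutate the object in the request
    reinvocationPolicy: IfNeeded # stay correct if other webhooks mutate after us
    timeoutSeconds: 5            # keep latency bounded
    failurePolicy: Fail
    clientConfig:
      service:
        name: pod-director       # placeholder service name
        namespace: pod-director  # placeholder namespace
        path: /mutate            # placeholder path
    rules:
      - apiGroups: [""]
        apiVersions: ["v1"]
        operations: ["CREATE"]
        resources: ["pods"]
    namespaceSelector:
      matchExpressions:
        - key: pod-director/group
          operator: Exists
```

Note how `sideEffects`, `reinvocationPolicy`, and `timeoutSeconds` map directly onto the list above, and how the `Exists` selector keeps the webhook away from any namespace we haven't labeled, which also excludes our own namespace as long as we never label it.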
### Configuration
Inspired by some of the similar projects, I propose a configuration based on "groups". One creates groups in Pod Director's namespace and then uses a label to select which group a namespace belongs to.
For example, a namespace manifest might look like this:
```yaml
apiVersion: v1
kind: Namespace
metadata:
  name: foo
  labels:
    # label name and format up for debate;
    # here we select the "bar" configuration
    pod-director/group: bar
```
This allows us to use a `namespaceSelector` like so on the webhook configuration:
```yaml
namespaceSelector:
  matchExpressions:
    - key: pod-director/group
      operator: Exists
```
To configure group "bar", I propose a simple YAML file. As much as I dislike YAML, it is familiar to the Kubernetes community and meshes well with the existing tooling, such as Helm. The configuration for the "bar" and "bazz" groups might look like this:
```yaml
groups:
  bar:
    nodeSelector: # node selector labels
    affinity: # pod affinity
    tolerations: # to handle taints
  bazz:
    nodeSelector: # node selector labels
    affinity: # pod affinity
    tolerations: # to handle taints
```
An alternative is to use a list with names:
```yaml
groups:
  - name: bar
    nodeSelector: ...
    affinity: ...
    tolerations: ...
```
But, in our case, it may be simpler to use a map of configurations keyed by group name.
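For concreteness, here is what a filled-in "bar" group might look like under the map layout; the node label and taint below are invented for illustration:

```yaml
groups:
  bar:
    nodeSelector:
      node-pool: bar        # hypothetical node label
    tolerations:
      - key: dedicated      # hypothetical taint on the "bar" nodes
        operator: Equal
        value: bar
        effect: NoSchedule
```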
As a side note, and I'm not yet sure how to do this, we can consider a toggle to allow "exclusions": for some reason, some pods in a namespace might not need to (or must not) run on the same nodes as the rest of the group.
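Purely as a strawman (every name here is invented), the toggle could live on the group, with pods opting out via a label:

```yaml
groups:
  bar:
    allowExclusions: true # hypothetical flag; pods may opt out of "bar" rules
    nodeSelector:
      node-pool: bar
```

A pod could then carry something like a `pod-director/exclude: "true"` label, which we'd honor only when the group allows it.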
### Application
A simple HTTP REST server should be more than sufficient (candidate libraries are discussed below). We'll handle things like configuring rustfmt, the toolchain, and editorconfig as we go.
The kube-rs repository has a great example of how to implement an admission controller, which is exactly what we need. While that example uses warp, my general internet searches have pointed towards Axum being a more modern/better HTTP server. Either should fulfill our one-or-two-endpoint needs.
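Whichever server we pick, the contract is small: receive an `AdmissionReview`, respond with a (possibly empty) JSON Patch. Shown as YAML for readability (the wire format is JSON), the response we need to produce looks roughly like this:

```yaml
apiVersion: admission.k8s.io/v1
kind: AdmissionReview
response:
  uid: "<uid copied from the request>"
  allowed: true
  patchType: JSONPatch
  # base64-encoded JSON Patch that adds nodeSelector/affinity/tolerations
  patch: <base64 string>
```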
### Container Image
Since we are running on Kubernetes, a container is required. I propose two images:
- A release image, with debug symbols stripped and aiming for minimal size. If we can do a distroless image, wonderful, but it's not necessary.
- A debug image, with full debug information included. This can have the same tag as above, only with a `-debug` suffix.
### Helm
A Helm chart should be the main way to deploy the application. If a user wants to use their own chart or manifests, or wants to run this off a lambda somewhere, that is their prerogative.
The chart could follow the application's versioning, even if there were no chart changes. That way, every time we release the application, we release the chart. I see no issue with this, since our utility is closely tied to k8s.
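Concretely, keeping the two in lockstep would just mean bumping both fields in `Chart.yaml` on every release (names below are placeholders):

```yaml
apiVersion: v2
name: pod-director
description: Directs pods to nodes based on their namespace
version: 0.1.0      # chart version, bumped on every application release
appVersion: "0.1.0" # application version, kept equal to the above
```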
The chart should also contain helpers for generating the certificates necessary to deploy an admission controller. I propose two options (a cert-manager sketch follows this list):
- Simple, helm or init container generated certificates, for quick and dirty testing
- A better configuration, to be used with cert-manager
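For the cert-manager option, a self-signed setup like the following sketch should do; all names are placeholders and assume cert-manager is already installed:

```yaml
apiVersion: cert-manager.io/v1
kind: Issuer
metadata:
  name: pod-director-selfsigned
  namespace: pod-director
spec:
  selfSigned: {}
---
apiVersion: cert-manager.io/v1
kind: Certificate
metadata:
  name: pod-director-tls
  namespace: pod-director
spec:
  secretName: pod-director-tls # mounted by the webhook server
  dnsNames:
    - pod-director.pod-director.svc
    - pod-director.pod-director.svc.cluster.local
  issuerRef:
    name: pod-director-selfsigned
```

cert-manager can then inject the CA bundle into the webhook configuration via the `cert-manager.io/inject-ca-from` annotation.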
### Testing
Testing the application is the most important part. The axum ecosystem has testing utilities we can use, and the tests should be pretty simple to run.
The Helm chart can include some unit tests to help.
Ideally, by the time we reach 1.0.0, we should have some testing done against an actual cluster. I believe GitHub Actions has a "kind" action we can use to actually spin up a cluster, maybe even with cert-manager.
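A rough sketch of such a workflow, assuming the chart lives in `./chart` (paths and step details are placeholders):

```yaml
name: e2e
on: [push]
jobs:
  e2e:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: helm/kind-action@v1 # spins up a kind cluster
      - run: helm install pod-director ./chart
      # create a labeled namespace and assert that new pods get mutated
      - run: kubectl apply -f tests/e2e/fixtures.yaml
```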
Ideally, we should also host the chart somewhere. GitHub Pages is probably sufficient.
## Nice to Haves
The list of nice-to-haves is massive. I'd like to list some, in rough order of importance:
- A "enforce" mode and a "log mode" . The idea is that a cluster administrator can first enable only the logging mode to see what the application would do.
- Structured logging
- Configuration hot-reloading
- Being able to exclude some pods in a namespace, but only if the global configuration allows
- A "look at all namespaces" mode, which should include a "default" setting
- Automatically testing against newer k8s versions as they come out
- Prometheus metrics
- Distroless image
- Benchmark tests
- Renovate integration, so we keep our dependencies up to date
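For the enforce/log idea from the first bullet, the toggle could be as small as one hypothetical field in the configuration file:

```yaml
mode: log # "log" only reports the patches we would apply; "enforce" applies them
groups:
  bar:
    nodeSelector:
      node-pool: bar
```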