some-natalie / kubernoodles

k8s runners for GitHub Actions in the enterprise, made for humans

Home Page: https://some-natalie.dev/kubernoodles/

License: MIT License

Shell 36.59% PowerShell 1.51% Dockerfile 61.55% Python 0.34%

github-enterprise kubernetes github-actions

kubernoodles's Introduction

Kubernoodles

GHES users prior to 3.9, please navigate back to tag v0.9.6 (release) for the APIs that'll work for you. ❤️

Kubernoodles is a framework for managing custom self-hosted runners for GitHub Actions in Kubernetes at the enterprise-wide scale. The design goal is to easily bootstrap a system where customized self-hosted runners update, build, test, deploy, and scale themselves with minimal interaction from enterprise admins and maximum input from the developers using it.

This is an opinionated reference implementation, designed to be taken and modified to your liking. I use this to test GitHub Actions on my personal account, GitHub Enterprise Cloud (SaaS) or GitHub Enterprise Server (self-hosted) from Docker Desktop, a Raspberry Pi cluster for arm64, a managed Kubernetes provider, and other random platforms as needed. Your implementation may look wildly different.

❓ Are you a GitHub Enterprise admin that's new to GitHub Actions? Don't know how to set up self-hosted runners at scale? Start here!

Pull requests welcome! ❤️

Design goals and compromises

There are a few assumptions that go into this that aren't necessarily true or best practices outside of an enterprise "walled garden". Being approachable and readable are the most important goals of all code and documentation. As a reference implementation, this isn't a turn-key solution, but the amount of fiddling needed should be up to you as much as possible. Links to the appropriate documentation, resources to learn more where needed, and explanations of design choices will be included!

Co-tenanted business systems tend to have small admin teams running services (like GitHub Enterprise) available to a large group of diverse internal users. Such a system values people's time over compute resources. The result is what would elsewhere be an anti-pattern: larger containers capable of lots of different things instead of discrete, "microservices"-type containers.

Moving data around locally is far cheaper and easier than pulling data in from external sources, especially in a larger company. Big containers are not scary if the registry, the compute, and the entire network path are all within the same datacenter or availability zone. Caching on-site is important to prevent rate-limiting by upstream providers, as that can take down other services and users that rely on them. This also provides a mechanism for using a "trusted" package registry, common in enterprise environments, using an .env file as outlined here.

Setup

The admin introduction walks you through some key considerations on how to think about implementing GitHub Actions at the enterprise scale, the implications of those decisions, and why this project is generally built out the way it is.

The admin setup is a mostly copy-and-paste exercise to get a basic deployment up and going.

The customization guide has a quick writeup and links to learn more about the ways you can customize things to your needs.

Tips and tricks has a few more considerations if things aren't quite going according to plan.

Choosing the image(s)

There are currently 4 images that are "prebuilt" by this project, although you can certainly use others or build your own! All images assume that they are ephemeral. If you're copy/pasting out of the deployments, you should be set ... provided you give it the right repository/organization/enterprise to use!

image name	base image	CVE count (crit/high/med)	virtualization?	sudo?	notes
ubi8	ubi8-init:8.9	4/6/74	❌	❌	n/a
ubi9	ubi9-init:9.3	0/6/87	❌	❌	n/a
rootless-ubuntu-jammy	ubuntu:jammy	0/3/45	rootless Docker-in-Docker	nope	common rootless problems
wolfi	wolfi-base:latest	0/0/6	❌	❌	n/a

Note

CVE count was done on 26 April 2024 with the latest versions of grype and runner image tags.
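To compare the images by these numbers, the crit/high/med cells can be split and sorted. This is only a rough sketch in Python: the values are copied from the table above, and the function names are made up for illustration. Sorting on the tuple weighs criticals first, then highs, then mediums, which is one reasonable ordering and not the only one.

```python
# Rows copied from the table above: image name -> "crit/high/med" cell.
CVE_COUNTS = {
    "ubi8": "4/6/74",
    "ubi9": "0/6/87",
    "rootless-ubuntu-jammy": "0/3/45",
    "wolfi": "0/0/6",
}


def parse_counts(cell: str) -> tuple[int, int, int]:
    """Split a "crit/high/med" cell into three integers."""
    crit, high, med = (int(part) for part in cell.split("/"))
    return crit, high, med


def rank_images(counts: dict[str, str]) -> list[str]:
    """Order images from fewest to most findings, comparing severity first."""
    return sorted(counts, key=lambda name: parse_counts(counts[name]))


if __name__ == "__main__":
    for name in rank_images(CVE_COUNTS):
        print(name, parse_counts(CVE_COUNTS[name]))
```

The counts are a snapshot from the scan date in the note, so rerun the scanner before relying on any ordering.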

Sources

These are all excellent reads and can provide more insight into the customization options and updates than are available in this repository. This entire repository is mostly gluing a bunch of these other bits together and explaining how/why to make this your own.

GitHub's official documentation on hosting your own runners.
Kubernetes controller for self-hosted runners, on GitHub, is the glue that makes this entire solution possible.
Docker image for runners that can automatically join, which solved a good bit of getting the runner agent started automatically on each pod (write-up and GitHub).
GitHub's repository used to generate the hosted runners' images (GitHub), where I got the idea of using shell scripts to layer discrete dependency management on top of a base image. The software scripts are (mostly) copy/pasted directly out of that repo.

Learn more

Don't know what the whole Kubernetes thing is about? Here's some help:
- The Kubernetes Aquarium
- The Cloud Native Computing Foundation's book, The Illustrated Children's Guide to Kubernetes
- The official tutorial covering the basics of what Kubernetes is and how it works
- What helped me to understand this whole concept shift is to think that Kubernetes is to containers as KVM/vSphere/Hyper-V is to virtual machines. It's probably not a perfect metaphor, but it helped. 😄
Want to see a whole bunch of other ways to solve this problem? You should check out Awesome Runners for a curated list and amazing matrix comparison of all sorts of other self-hosted runner solutions.
Even if this is 100% on-premises, many of these antipatterns for cloud applications are very relevant to the architecture of CI at scale and these are all well worth the time to read.
Rootful versus rootless containerization in Podman is a bit different from Docker. Learn more at Red Hat's Enable Sysadmin blog post.

Dependencies of note

actions-runner-controller
Helm
Yelp dumb-init
Docker engine and Docker Compose for Debian-based images
Podman, Buildah, and Skopeo for the Red Hat-based images
actions/runner is the runner agent for GitHub Actions

kubernoodles's Issues

Bump dumb-init

Bump Yelp's dumb-init to the latest version

[SECURITY] - Cert Manager Mandatory for Openshift 4.X

Describe the problem

We are trying to set up ARC on OpenShift 4.X. Can we install ARC on OpenShift 4.X without using cert-manager?

Cleanup shell scripts

Make sure shellcheck is clean on all the things!

Add Ubuntu Jammy Jellyfish (22.04 LTS)

Set up a basic "does it build" test to merge to `main`

Set up a basic "does it build" test to merge to main, to be expanded later

Bump dependencies and scripts

ARC has some updates to logger, entrypoint, etc. scripts from where I'd left them. Bump these. :)

Lint the dockerfiles

Lint the dockerfiles with hadolint ... clean everything up and ✨

Move to `test` and `latest` tags for repo use

This is a demo repo, so latest is alright, but test should be used for the build/test jobs

Auto-bump dependencies _within_ Dockerfile

Example - figure out the latest version of Docker Engine or the runner agent, then bump that on a PR
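The "bump" half of this idea could look something like the sketch below. It assumes, hypothetically, that the Dockerfile pins each version on its own `ARG NAME=version` line; `bump_arg` and the `RUNNER_VERSION` name in the usage note are placeholders, and looking up the latest version and opening the PR are left out.

```python
import re


def bump_arg(dockerfile_text: str, arg_name: str, new_version: str) -> str:
    """Return the Dockerfile text with `ARG arg_name=...` set to new_version."""
    # Match a whole `ARG NAME=value` line; MULTILINE lets ^ and $ work per line.
    pattern = re.compile(rf"^(ARG\s+{re.escape(arg_name)}=)\S+$", re.MULTILINE)
    # A function replacement avoids backreference clashes with digits in the version.
    updated, replaced = pattern.subn(lambda m: m.group(1) + new_version, dockerfile_text)
    if replaced == 0:
        raise ValueError(f"no ARG named {arg_name} found")
    return updated
```

Calling `bump_arg(text, "RUNNER_VERSION", "x.y.z")` on the file contents and writing the result back gives a one-line diff that is easy to review in a PR.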

Set up labels for this project

Some of the default issue labels don't really match what I want here. Maybe label things for "tech debt" cleanup, etc.

Add docs on supply chain management

Add documentation on approaches to supply chain management WRT security, reproducibility, and bandwidth usage

Remove `patched` files

Remove "patched" files if no longer needed

Get off `latest`

tbh, this is mostly about growing up a little bit here

[NEW RUNNER] - Invalid value: true: Privileged containers are not allowed

Is your feature request related to a problem? Please describe

I tried a runner deployment with https://github.com/some-natalie/kubernoodles/blob/main/deployments/ghes/rootless-ubuntu-focal.yml and am getting the error below:

2022-09-21T15:44:16Z ERROR actions-runner-controller.runner Failed to create pod resource {"runner": "actions-runner-system/rootless-ubuntu-focal-mbc22-vgbds", "error": "pods "rootless-ubuntu-focal-mbc22-vgbds" is forbidden: PodSecurityPolicy: unable to admit pod: [spec.containers[0].securityContext.privileged: Invalid value: true: Privileged containers are not allowed]"}
github.com/actions-runner-controller/actions-runner-controller/controllers.(*RunnerReconciler).Reconcile

Is there any way to run Docker-in-Docker without privileged mode, or is there another image or solution?
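The field named in the error, `spec.containers[0].securityContext.privileged`, can be checked before a deployment is applied. Below is a small sketch in Python that assumes the pod spec has already been loaded into a dict (for example from the YAML file); the function name is hypothetical, and it only detects the setting, it does not make Docker work without it.

```python
def privileged_containers(pod_spec: dict) -> list[str]:
    """List container names whose securityContext requests privileged mode."""
    names = []
    for container in pod_spec.get("containers", []):
        # A missing securityContext or missing field counts as not privileged.
        if container.get("securityContext", {}).get("privileged") is True:
            names.append(container["name"])
    return names
```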

Kata + Firecracker container test

Test buildout and document (if successful) on Kata containers to provide true workload isolation on a STIG baseline

starting point?

Add docs about D-in-D architecture

Docker-in-Docker isn't the most fun way to do things, but having docs on why would be valuable

Move to multi-stage builds

Investigate movement to multi-stage builds to address transient dependency problems.

Disable automatic runner updates

The runner agent automatically updates itself, which means there's a ton of bandwidth usage on ephemeral pods. Disable this in favor of updating it via the pod images.
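One possible approach is sketched below in Python. It assumes the runner is registered through the agent's `config.sh` script and that your runner version accepts its `--disableupdate` flag; the function name and the URL and token values are placeholders, not anything from this repository.

```python
def runner_config_args(url: str, token: str, disable_update: bool = True) -> list[str]:
    """Build the argument list for the runner agent's config.sh registration."""
    args = ["./config.sh", "--unattended", "--ephemeral", "--url", url, "--token", token]
    if disable_update:
        # Ask the agent not to self-update; new versions then arrive via new pod images.
        args.append("--disableupdate")
    return args


if __name__ == "__main__":
    # Placeholder values for illustration only.
    print(" ".join(runner_config_args("https://github.example.com/my-org", "TOKEN")))
```

With self-updates off, keeping the agent current becomes the job of the image build, so the images need rebuilding whenever a new runner version is required.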

[NEW RUNNER] - running DID with privileged: false

Hi,
I was referring to https://github.com/some-natalie/kubernoodles/blob/main/deployments/ghes/rootless-ubuntu-focal.yml and trying to start Docker inside the runner, but our Kubernetes cluster policy won't allow running containers in privileged mode.

Could you please suggest an alternative way to run Docker or Podman with privileged: false?

2022-09-21T16:07:20Z ERROR Reconciler error {"controller": "runner-controller", "controllerGroup": "actions.summerwind.dev", "controllerKind": "Runner", "runner": {"name":"rootless-ubuntu-focal-s79gm-nznk2","namespace":"actions-runner-system"}, "namespace": "actions-runner-system", "name": "rootless-ubuntu-focal-s79gm-nznk2", "reconcileID": "b8cc10ad-7d9a-4a2a-ab13-4bef0dda41d7", "error": "pods "rootless-ubuntu-focal-s79gm-nznk2" is forbidden: PodSecurityPolicy: unable to admit pod: [spec.containers[0].securityContext.privileged: Invalid value: true: Privileged containers are not allowed]"}

Put together an intro admin guide

Quick overview of implementation, decisions for admins

Rework weekly repo cleanup

Move to actions/delete-package-versions

Remove runner reaper since that's automatic for GHEC now?

Reduce Ubuntu image size

2.something GB is a bit big for a default in GHCR ... time to dive

Community health files

Create contributing.md and such to finish community health files. According to https://github.com/some-natalie/kubernoodles/community, there are 3 things missing:

Code of conduct
Contribution guidelines
PR template

Move into AKS for demos

Move off my laptop Docker Desktop and into AKS for demos

Needs

service account creation in AKS
environment config in GitHub

Set up container scanning

Set up container scanning at build

Set up the super-linter

Set up the super-linter to avoid local linting differences

Shell
Docker
Kubernetes yaml
Powershell?
Markdown

Improve test coverage

Test that the MTU size is set correctly. Maybe also that the expected software is installed?
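Both checks can be small. The sketch below is in Python and assumes a Linux runner, where the kernel exposes each interface's MTU under `/sys/class/net/<interface>/mtu`; the function names, the interface name, the expected value, and the tool list are all placeholders to adapt.

```python
import shutil
from pathlib import Path


def read_mtu(interface: str, sys_net: Path = Path("/sys/class/net")) -> int:
    """Read the MTU that Linux reports for a network interface."""
    return int((sys_net / interface / "mtu").read_text().strip())


def missing_tools(tools: list[str]) -> list[str]:
    """Return the tools from the list that are not found on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]
```

A test job could then assert something like `read_mtu("eth0") == expected_mtu` and `missing_tools(["git", "docker"]) == []` inside the runner pod.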

Revisit rootless podman container problems

Problem - podman works, but podman run does not when used by the runner

Idea - use --userns=keep-id at run to keep uid maps

Links that could be helpful

Logs

##[debug]Evaluating condition for step: 'Container Action test'
##[debug]Evaluating: success()
##[debug]Evaluating success:
##[debug]=> true
##[debug]Result: true
##[debug]Starting: Container Action test
##[debug]Loading inputs
##[debug]Loading env
Run ./tests/container
Building docker image
  Dockerfile for action: '/runner/_work/kubernoodles/kubernoodles/./tests/container/Dockerfile'.
  /usr/bin/docker build -t 60e226:5f9714a018e65ed72405d6d56d722dc4 -f "/runner/_work/kubernoodles/kubernoodles/./tests/container/Dockerfile" "/runner/_work/kubernoodles/kubernoodles/tests/container"
  Emulate Docker CLI using podman. Create /etc/containers/nodocker to quiet msg.
  Resolved "python" as an alias (/etc/containers/registries.conf.d/000-shortnames.conf)
  Trying to pull docker.io/library/python:3-slim...
  STEP 1/5: FROM python:3-slim
  Getting image source signatures
  Copying blob sha256:40c89643d0cd67054846d3f57e5f73755ded7c0f951a4c5b9f392c406784bf62
  Copying blob sha256:e37ebf440f7f53eb0584605f7c63a95b42583a1372913da9061b6fdb7b535663
  Copying blob sha256:40c89643d0cd67054846d3f57e5f73755ded7c0f951a4c5b9f392c406784bf62
  Copying blob sha256:e37ebf440f7f53eb0584605f7c63a95b42583a1372913da9061b6fdb7b535663
  Copying blob sha256:461246efe0a75316d99afdbf348f7063b57b0caeee8daab775f1f08152ea36f4
  Copying blob sha256:912bc51860fbe91d759008e764b6db10299b829ec9789fa873ef9f3dced3390b
  Copying blob sha256:461246efe0a75316d99afdbf348f7063b57b0caeee8daab775f1f08152ea36f4
  Copying blob sha256:912bc51860fbe91d759008e764b6db10299b829ec9789fa873ef9f3dced3390b
  Copying blob sha256:07053eece5a202737fa1c0ee49737a22007fad69699cc0953a9ec4276b33ec7c
  Copying blob sha256:07053eece5a202737fa1c0ee49737a22007fad69699cc0953a9ec4276b33ec7c
  Copying config sha256:ba94a8d11761b3d47a0035819c32c0d42a43bc104734c8ce2a303da8d7f6e700
  Writing manifest to image destination
  Storing signatures
  STEP 2/5: COPY test.py /app/test.py
  --> 142a30aafff
  STEP 3/5: WORKDIR /app
  --> d512516d34e
  STEP 4/5: ENV PYTHONPATH /app
  --> 93e1b2bb5de
  STEP 5/5: CMD ["/app/test.py"]
  COMMIT 60e226:5f9714a018e65ed72405d6d56d722dc4
  --> dca19cf115a
  Successfully tagged localhost/60e226:5f9714a018e65ed72405d6d56d722dc4
  dca19cf115af3b6b38102da6c8fd51922c87ff2e970fa78cd32306c534d05736
/usr/bin/docker run --name e2265f9714a018e65ed72405d6d56d722dc4_d564e3 --label 60e226 --workdir /github/workspace --rm -e HOME -e GITHUB_JOB -e GITHUB_REF -e GITHUB_SHA -e GITHUB_REPOSITORY -e GITHUB_REPOSITORY_OWNER -e GITHUB_RUN_ID -e GITHUB_RUN_NUMBER -e GITHUB_RETENTION_DAYS -e GITHUB_RUN_ATTEMPT -e GITHUB_ACTOR -e GITHUB_WORKFLOW -e GITHUB_HEAD_REF -e GITHUB_BASE_REF -e GITHUB_EVENT_NAME -e GITHUB_SERVER_URL -e GITHUB_API_URL -e GITHUB_GRAPHQL_URL -e GITHUB_REF_NAME -e GITHUB_REF_PROTECTED -e GITHUB_REF_TYPE -e GITHUB_WORKSPACE -e GITHUB_ACTION -e GITHUB_EVENT_PATH -e GITHUB_ACTION_REPOSITORY -e GITHUB_ACTION_REF -e GITHUB_PATH -e GITHUB_ENV -e GITHUB_STEP_SUMMARY -e RUNNER_DEBUG -e RUNNER_OS -e RUNNER_ARCH -e RUNNER_NAME -e RUNNER_TOOL_CACHE -e RUNNER_TEMP -e RUNNER_WORKSPACE -e ACTIONS_RUNTIME_URL -e ACTIONS_RUNTIME_TOKEN -e ACTIONS_CACHE_URL -e GITHUB_ACTIONS=true -e CI=true -v "/var/run/docker.sock":"/var/run/docker.sock" -v "/runner/_work/_temp/_github_home":"/github/home" -v "/runner/_work/_temp/_github_workflow":"/github/workflow" -v "/runner/_work/_temp/_runner_file_commands":"/github/file_commands" -v "/runner/_work/kubernoodles/kubernoodles":"/github/workspace" 60e226:5f9714a018e65ed72405d6d56d722dc4
Emulate Docker CLI using podman. Create /etc/containers/nodocker to quiet msg.
cannot resolve /github/home: lstat /github: no such file or directory
##[debug]Docker Action run completed with exit code 1
##[debug]Finishing: Container Action test

Delete untagged containers automatically

Delete untagged containers automatically from GHCR
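The selection step might look like the sketch below. It is written in Python and assumes the version objects have the shape GitHub's REST API returns when listing versions of a container package, with tags under `metadata.container.tags`; the function name is hypothetical, and fetching the list and issuing the deletes are left out.

```python
def untagged_version_ids(versions: list[dict]) -> list[int]:
    """Pick the IDs of container package versions that carry no tags."""
    ids = []
    for version in versions:
        tags = version.get("metadata", {}).get("container", {}).get("tags", [])
        if not tags:
            # No tag points at this version any more, so it is a cleanup candidate.
            ids.append(version["id"])
    return ids
```

Keeping the filter separate from the API calls makes it easy to dry-run the cleanup and review which IDs would be deleted.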

Add a RHEL-ish flavor deployment

Add a RHEL-ish flavor deployment - maybe Fedora or UBI based?

Add job summaries to each build/test job

Feature

ℹ️ Not yet shipped in GHES

Fix MTU in Rootless Ubuntu Runner

Upstream - actions/actions-runner-controller#1856

Look into RUNNER_TOOL_CACHE as a read-only fast mount in ARC

Problem - setup-LANGUAGE actions want LANGUAGE-versions, so if not available to the runner, it's going to try (and maybe fail) to get that at each run time from github.com. Copying those files in is delicate, tedious, and makes for gigantic pods.

Potential solution - Try to use a persistent volume claim with the ReadOnlyMany access mode to address this problem.

Problems with ☝ may include needing a 2-step export (from github.com) and import (into ARC without internet access), the process still being tedious even with internet access, and whether actions/runner supports a read-only cache at all.
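For reference, a claim requesting that access mode could be generated as in the Python sketch below. The claim name, namespace, and size are invented for illustration, and whether ReadOnlyMany works at all depends on the storage backing the cluster. The output is JSON, which kubectl accepts in place of YAML.

```python
import json


def tool_cache_pvc(name: str, namespace: str, size: str) -> dict:
    """Build a PersistentVolumeClaim manifest that requests ReadOnlyMany access."""
    return {
        "apiVersion": "v1",
        "kind": "PersistentVolumeClaim",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            "accessModes": ["ReadOnlyMany"],
            "resources": {"requests": {"storage": size}},
        },
    }


if __name__ == "__main__":
    # Hypothetical names; pipe the output to `kubectl apply -f -` to try it.
    print(json.dumps(tool_cache_pvc("tool-cache", "runners", "10Gi"), indent=2))
```

This only covers the claim itself. Populating the volume and pointing RUNNER_TOOL_CACHE at the mount path are separate steps, and the open questions above still apply.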

Create docs on ingress control

Create some documentation around using an ingress controller.

General outline ideas

start with HTTP only
add caching with persistent data volume
add TLS using Let's Encrypt

Add rootless Ubuntu runner

Setup pages to look nice

It's lowest of low effort right now, but a custom domain + Let's Encrypt + actually putting a bit of effort in would make the docs much easier to read. :)