xetys / hetzner-kube
A CLI tool for provisioning kubernetes clusters on Hetzner Cloud
License: Apache License 2.0
Hey,
I did the command hetzner-kube cluster kubeconfig --name demo
and my ~/.kube/config was completely replaced with the hetzner-kube cluster's config. All my other clusters' configs were erased.
I'm using version "0.2.1".
Should we implement an addon command, or just work on helm charts, to make them available as helm commands?
In the end, it would be only one extra line, and the helm command would be more widely used.
Hi,
Like for request #78, if we could extend the support to all hetzner-kube addons this could be useful.
After some searching on how to do port forwarding ( #77 ), I've found that the Ingress controller recently added support for TCP/UDP port forwarding ( https://github.com/kubernetes/ingress-nginx/blob/master/docs/user-guide/exposing-tcp-udp-services.md )
But this requires passing the option « --tcp-services-configmap ConfigMap.yaml » to the ingress controller.
Do you know how this could be supported inside hetzner-kube, please?
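For reference, the ConfigMap format from the linked ingress-nginx guide maps an exposed port to a namespace/service:port pair. A minimal sketch (the service name, namespace, and ConfigMap name are illustrative):

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: tcp-services
  namespace: ingress-nginx
data:
  # exposed port -> <namespace>/<service>:<service port>
  "3306": "default/mysql:3306"
```

The ingress controller then needs to be started with --tcp-services-configmap=ingress-nginx/tcp-services.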
Ideally, cluster creation should fail before any servers are created if the SSH passphrase was typed incorrectly, or at least handle the error differently.
2018/03/12 15:20:40 Creating new cluster sour-razororange with 1 master(s), 1 worker(s), HA: false
Enter passphrase for ssh key /Users/orhan/.ssh/id_rsa:
2018/03/12 15:20:43 creating server 'sour-razororange-master-01'...
12s [====================================================================] 100%
2018/03/12 15:21:03 Created node 'sour-razororange-master-01' with IP 78.47.118.201
2018/03/12 15:21:03 creating server 'sour-razororange-worker-01'...
16s [====================================================================] 100%
2018/03/12 15:21:28 Created node 'sour-razororange-worker-01' with IP 88.198.146.146
2018/03/12 15:21:28 sleep for 30s...
parse key failed:x509: decryption password incorrect (printed three times without a line break)
2018/03/12 15:21:58 parse key failed:x509: decryption password incorrect
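The key could be validated once, up front, before any servers exist. A minimal stdlib sketch that detects whether a traditional PEM key is passphrase-protected (assumption: OpenSSL-style keys with a DEK-Info header; the real tool would additionally verify the passphrase against the key):

```go
package main

import (
	"encoding/pem"
	"fmt"
	"os"
)

// isEncryptedPEM reports whether a PEM-encoded private key is
// passphrase-protected (traditional OpenSSL format, DEK-Info header).
// Checking this before provisioning lets the CLI prompt for and verify
// the passphrase up front instead of failing after servers are created.
func isEncryptedPEM(keyBytes []byte) bool {
	block, _ := pem.Decode(keyBytes)
	if block == nil {
		return false
	}
	_, hasDEK := block.Headers["DEK-Info"]
	return hasDEK
}

func main() {
	key, err := os.ReadFile(os.Getenv("HOME") + "/.ssh/id_rsa")
	if err != nil {
		fmt.Println("no key found")
		return
	}
	fmt.Println("encrypted:", isEncryptedPEM(key))
}
```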
Right now when you add a worker the commands stop like this:
jean-philippe@jean-philippe:~$ hetzner-kube cluster add-worker -n 1 --name production
Enter passphrase for ssh key /home/jean-philippe/cubos_kubernetes:
2018/03/01 16:07:12 creating server 'production-worker-06'...
16s [====================================================================] 100%
2018/03/01 16:07:34 Created node 'production-worker-06' with IP 78.47.226.92
2018/03/01 16:07:34 sleep for 30s...
production-worker-06: rewrite kubeconfigs 50.0% [=================================>----------------------------------]
jean-philippe@jean-philippe:~$
The worker seems to work fine after that, but for most users it will be somewhat scary to see the command stop at 50% and just return to the console without any message.
We should bring the bar to 100% and print a message like "Successfully created worker $workername".
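The fix could be as simple as a finishing step that forces the bar to 100% and prints the message. A sketch only, assuming a plain text bar (hetzner-kube uses its own progress library, so names here are hypothetical):

```go
package main

import (
	"fmt"
	"strings"
)

// renderBar draws a textual progress bar at the given percentage.
func renderBar(percent float64, width int) string {
	filled := int(percent / 100 * float64(width))
	return fmt.Sprintf("%5.1f%% [%s%s]", percent,
		strings.Repeat("=", filled), strings.Repeat("-", width-filled))
}

// finishNode forces the bar to 100% and prints a completion message,
// so add-worker never exits silently mid-bar.
func finishNode(name string) {
	fmt.Printf("%s: %s\n", name, renderBar(100, 20))
	fmt.Printf("Successfully created worker %s\n", name)
}

func main() {
	finishNode("production-worker-06")
}
```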
Hi!
Helm recently added better support for Secrets, but this requires installing Helm like this:
helm init --override 'spec.template.spec.containers[0].command'='{/tiller,--storage=secret}'
Would it be possible to allow extra parameters, or to include this by default, please?
Right now, one has to either guess or inspect the source code to get the list/names of available addons.
It would be nice to simply run:
hetzner-kube cluster addon list
…to get a list of available addons, e.g. like this:
NAME          REQUIRES  DESCRIPTION                               URL
helm          -         Kubernetes Package Manager                https://helm.sh
rook          -         File, Block, and Object Storage Services  https://rook.io
cert-manager  helm      Auto-TLS provisioning/management for K8S  https://github.com/jetstack/cert-manager
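Such a subcommand could be backed by a small static registry, rendered with text/tabwriter for aligned columns. A sketch under the assumption that addons are described by the four fields shown above (the struct and variable names are illustrative, not hetzner-kube's actual types):

```go
package main

import (
	"fmt"
	"os"
	"text/tabwriter"
)

// Addon describes one installable addon for the list output.
type Addon struct{ Name, Requires, Description, URL string }

var addons = []Addon{
	{"helm", "-", "Kubernetes Package Manager", "https://helm.sh"},
	{"rook", "-", "File, Block, and Object Storage Services", "https://rook.io"},
	{"cert-manager", "helm", "Auto-TLS provisioning/management for K8S", "https://github.com/jetstack/cert-manager"},
}

func main() {
	// tabwriter aligns the tab-separated columns automatically.
	w := tabwriter.NewWriter(os.Stdout, 0, 0, 2, ' ', 0)
	fmt.Fprintln(w, "NAME\tREQUIRES\tDESCRIPTION\tURL")
	for _, a := range addons {
		fmt.Fprintf(w, "%s\t%s\t%s\t%s\n", a.Name, a.Requires, a.Description, a.URL)
	}
	w.Flush()
}
```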
What it says, really...
About the use of hardware: I plan on writing some documentation for that.
The idea would be to pick 3 nodes from Hetzner's offering and do a Reference Implementation using:
This would be documentation at first, but along the way I'll be able to identify potential candidates for automation.
I'll ask Hetzner if they are kind enough to lend some servers.
I don't care much about CPU/RAM, but I'd like something like:
I know little about Go, but this project is an opportunity to learn :)
And if you looked at http://libre.sh you know I'm serious about this!
I just ran into an API limit, as Hetzner only allows you to run ten CX11 servers. If you want more, you need to open a support ticket. We should catch this error and point out what to do in a message.
Then we should either proceed with the servers we got or delete them.
Hi, it would be nice to add some extra security and enable the firewall (ufw) on Ubuntu.
We will need some extra configuration to allow access to the Kubernetes API and other services you need to expose, but we could easily add those in the provisioning script, no?
Do you want this cluster to have a default ingress with TLS and Let's encrypt?
I think it would be a good idea to offer a default option.
I didn't manage to set it up properly yesterday (I was tired, kube-lego is no longer the recommended way, only the DNS method works in traefik, and I didn't manage to make it work in a reasonable amount of time).
Here is what looks like a good article about how to set it up properly with:
That means running it multiple times in a row doesn't end up in the same state and might fail.
so it outputs raw text and does not render progress bars
How to achieve true HA without the help of a cloud LoadBalancer?
There are not many solutions; the best we have at Hetzner is probably the Failover IP.
All we need for this to work would be the following:
role=edge-router
As we now have a codeclimate config, we should set up the repository on http://codeclimate.com so the checks run on every PR.
Hi,
nice tool you have built!
To integrate the kubernetes cluster with the Hetzner Cloud API, you might also want to deploy the hcloud-cloud-controller-manager.
As you are already using kubeadm, the deployment instructions should be easy to adapt: https://github.com/hetznercloud/hcloud-cloud-controller-manager#deployment
Hi,
Thanks for this great tool and your articles!
We've tried a default HA installation ( 4 CX11 servers: 3 masters/etcd and 1 worker ) and we would like to know how to switch a worker to the CX51 plan, please.
Can we use the Hetzner dashboard to upgrade the worker node from CX11 to CX51?
And how can we add other worker nodes directly as CX51, please?
EDIT: Solved => cluster add-worker --worker-server-type cx51
Thanks!
To achieve better security for node communication, WireGuard should be installed on all nodes.
Hello! Looks cool :) thanks for sharing!
I'm working on libresh/libre.sh#161
And I have various questions :
And you are probably also looking for the sweet spot, right? I think it would be cool to provision masters on VMs and workers on their cheap hardware.
I'll look more at the code once I'm on a desktop, but thanks a lot for sharing!
Hi!
I have just created a cluster and when I'm trying to add the addon rook, I get this error:
% hetzner-kube cluster addon install -n project rook
2018/03/10 15:20:16 installing addon rook
2018/03/10 15:20:35 error: unable to recognize "https://github.com/rook/rook/raw/master/cluster/examples/kubernetes/rook-cluster.yaml": no matches for rook.io/, Kind=Cluster
2018/03/10 15:20:35 > kubectl apply -f https://github.com/rook/rook/raw/master/cluster/examples/kubernetes/rook-cluster.yaml
2018/03/10 15:20:35
2018/03/10 15:20:35 namespace "rook" created
2018/03/10 15:20:35 Run failed:Process exited with status 1
Here is the list of commands used before:
hetzner-kube cluster create --name project --ha-enabled --ssh-key guillaume --cloud-init ./cloud-init.yml --datacenters fsn1-dc8 --worker-server-type cx51 --self-hosted
hetzner-kube cluster add-worker --name project --datacenters fsn1-dc8 --worker-server-type cx51
hetzner-kube cluster addon install -n project helm
hetzner-kube cluster addon install -n project rook
Thanks for your advice!
Still in the 'playing around' mode. One thing I'd like to do is build a cluster backed mostly by my own (already paid-for, fairly hefty) hardware, but with the possibility of adding worker nodes running on Hetzner on-demand.
The best way to do this, I think, would be a Kubernetes daemon that watches for overall load, creating new worker nodes when it reaches some threshold. It would make sense to use hetzner-kube for the heavy lifting.
That much should already be doable; what I'd like to ask for is the ability to link hetzner-kube to a pre-existing (non-Hetzner) cluster, and add workers to that.
There already is support for adding workers to existing clusters, so while this might be somewhat out of scope for the project, hopefully it wouldn't be too difficult to add. If you'd rather not, that's fine; I can probably hack something up. From the looks of config.json it might not even require any new code, but I'd like to confirm that.
(One likely stumbling block is Wireguard, which I've yet to try setting up myself. Should be fun.)
hetzner-kube cluster addon install --name test cert-manager
fails with:
2018/03/04 12:41:45 Error: release cert-manager failed: namespaces "kube-system" is forbidden: User "system:serviceaccount:kube-system:default" cannot get namespaces in the namespace "kube-system"
2018/03/04 12:41:45 > helm install --name cert-manager --namespace kube-system --set ingressShim.extraArgs='{--default-issuer-name=letsencrypt-prod,--default-issuer-kind=ClusterIssuer}' stable/cert-manager
2018/03/04 12:41:45
2018/03/04 12:41:45
2018/03/04 12:41:45 Run failed:Process exited with status 1
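The "cannot get namespaces" error suggests Tiller is running under the default service account, which has no cluster permissions under RBAC. A commonly used fix (sketched here with illustrative names; granting cluster-admin is convenient but broad) is a dedicated service account for Tiller:

```yaml
apiVersion: v1
kind: ServiceAccount
metadata:
  name: tiller
  namespace: kube-system
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: tiller
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: cluster-admin
subjects:
- kind: ServiceAccount
  name: tiller
  namespace: kube-system
```

Helm would then need to be initialized with helm init --service-account tiller so that chart installs run with these permissions.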
On SSHing to a machine, I was greeted with this message:
Welcome to Ubuntu 16.04.3 LTS (GNU/Linux 4.4.0-112-generic x86_64)
* Documentation: https://help.ubuntu.com
* Management: https://landscape.canonical.com
* Support: https://ubuntu.com/advantage
29 packages can be updated.
7 updates are security updates.
At a minimum, these updates should be installed during provisioning.
Additionally, it would be good if unattended-upgrades (which only installs security updates) were enabled.
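Enabling unattended-upgrades on Ubuntu amounts to installing the unattended-upgrades package and creating /etc/apt/apt.conf.d/20auto-upgrades (this is the same file that dpkg-reconfigure unattended-upgrades generates):

```
APT::Periodic::Update-Package-Lists "1";
APT::Periodic::Unattended-Upgrade "1";
```

With the default configuration in 50unattended-upgrades, only packages from the security origin are upgraded automatically.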
I installed nginx-ingress via helm:
helm install --name ingress --set rbac.create=true,controller.kind=DaemonSet,controller.service.type=ClusterIP stable/nginx-ingress
but it does not route traffic to the service. Do I need to configure external IPs in the hetzner-kube setup?
In relation to #33, it would be useful if worker nodes could be removed from a cluster.
After first draining them, of course.
Installation with current master:
2018/03/02 16:25:21 Creating new cluster staging1 with 1 master(s), 1 worker(s), HA: false
2018/03/02 16:25:21 creating server 'staging1-master-01'...
10s [====================================================================] 100%
2018/03/02 16:25:38 Created node 'staging1-master-01' with IP <redacted>
2018/03/02 16:25:38 creating server 'staging1-worker-01'...
15s [====================================================================] 100%
2018/03/02 16:25:59 Created node 'staging1-worker-01' with IP <redacted>
2018/03/02 16:25:59 sleep for 30s...
staging1-master-01 : configure wireguard 5.0% [--------------]
staging1-worker-01 : configure wireguard 27.3% [==>-----------]
2018/03/02 16:28:54 Failed to execute operation: No such file or directory
2018/03/02 16:28:54 > systemctl enable wg-quick@wg0 && systemctl restart wg-quick@wg0
2018/03/02 16:28:54
2018/03/02 16:28:54
2018/03/02 16:28:54 Run failed:Process exited with status 1
systemd log from the worker:
Mar 02 16:43:23 staging1-worker-01 wg-quick[9275]: Warning: `/etc/wireguard/wg0.conf' is world accessible
Mar 02 16:43:23 staging1-worker-01 wg-quick[9275]: [#] ip link add wg0 type wireguard
Mar 02 16:43:23 staging1-worker-01 kernel: wireguard: module verification failed: signature and/or required key missing - tainting kernel
Mar 02 16:43:23 staging1-worker-01 kernel: wireguard: WireGuard 0.0.20180218 loaded. See www.wireguard.com for information.
Mar 02 16:43:23 staging1-worker-01 kernel: wireguard: Copyright (C) 2015-2018 Jason A. Donenfeld <[email protected]>. All Rights Reserved.
Mar 02 16:43:23 staging1-worker-01 wg-quick[9275]: [#] wg setconf wg0 /dev/fd/63
Mar 02 16:43:23 staging1-worker-01 wg-quick[9275]: Line unrecognized: `PrivateKey='
Mar 02 16:43:23 staging1-worker-01 wg-quick[9275]: Configuration parsing error
Mar 02 16:43:23 staging1-worker-01 wg-quick[9275]: [#] ip link delete dev wg0
Mar 02 16:43:23 staging1-worker-01 systemd[1]: [email protected]: Main process exited, code=exited, status=1/FAILURE
Mar 02 16:43:23 staging1-worker-01 systemd[1]: Failed to start WireGuard via wg-quick(8) for wg0.
-- Subject: Unit [email protected] has failed
On the master, systemctl enable wg-quick@wg0 && systemctl restart wg-quick@wg0 cannot be executed.
Looking at the config files on the worker (/etc/wireguard/wg0) and master (/etc/wireguard), it looks like neither PrivateKey nor PublicKey are set.
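For reference, a populated wg0.conf should look roughly like this (keys, addresses, and ports below are illustrative placeholders; keys are generated with wg genkey / wg pubkey, and the empty PrivateKey= line in the log indicates generation failed or its output was never written):

```ini
[Interface]
PrivateKey = <base64 key from `wg genkey`>
Address    = 10.0.1.1/24
ListenPort = 51820

[Peer]
PublicKey  = <base64 key of the worker, from `wg pubkey`>
AllowedIPs = 10.0.1.2/32
Endpoint   = 88.198.146.146:51820
```

wg-quick rejects the file outright when PrivateKey has no value, which matches the "Line unrecognized: `PrivateKey='" parsing error above.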
hetzner-kube cluster addon install --name libresh-staging rook
2018/03/01 18:56:09 installing addon rook
2018/03/01 18:56:35 error: unable to recognize "https://github.com/rook/rook/raw/master/cluster/examples/kubernetes/rook-cluster.yaml": no matches for rook.io/, Kind=Cluster
2018/03/01 18:56:35 > kubectl apply -f https://github.com/rook/rook/raw/master/cluster/examples/kubernetes/rook-cluster.yaml
2018/03/01 18:56:35
2018/03/01 18:56:35 namespace "rook" created
2018/03/01 18:56:35 Run failed:Process exited with status 1
Looks like we need to wait a bit; when I install it a second time, it works.
Here's a puzzler: What happens if, for whatever reason, kube-dns fails to schedule due to overall lack of CPU?
If your answer is "everything breaks", you'd be right. :)
There's a solution hinted at in https://kubernetes.io/docs/concepts/configuration/pod-priority-preemption/, namely enabling priorities and marking all the kube-system pods as critical, especially the ones required for the cluster to keep working. I don't think there's a way to mark an entire namespace as high-priority, but certainly these particular pods should be at maximum priority.
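Following the linked docs, this could look roughly like the manifest below. The API version was alpha at the time (scheduling.k8s.io/v1alpha1, behind the PodPriority feature gate), so treat this as a sketch; the class name and value are illustrative:

```yaml
apiVersion: scheduling.k8s.io/v1alpha1
kind: PriorityClass
metadata:
  name: cluster-critical
value: 1000000
globalDefault: false
description: "For pods the cluster cannot function without, e.g. kube-dns."
```

Each critical pod spec would then set priorityClassName: cluster-critical, allowing the scheduler to preempt lower-priority pods when CPU runs out.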
Currently, this project has a code coverage of ~11% and the working parts are tightly coupled to each other. This also makes testing hard. As this is my first golang project, it's hopefully ok that I started in this state. But I think now is the time to make it better. For the next minor version, I see a huge improvement in code quality by:
- moving the deployment code (nodeUtil.go, config.go) into a cluster-deploy package under pkg
- abstracting runCmd, writeFile and a source for Node, like the hetzner server create stuff: a direct implementation with an SSH client for node communication and the HetznerConfig for providing nodes. With this change, the cluster-deploy logic can be supplied with mock implementations and tested much more widely
- keeping cmd just for calling these tools; here we need tests which prove the validations work correctly
This enables writing better tests and ensures the quality of hetzner-kube in the long run.
I will wait until #45, #44, and #40 are resolved and do a 0.2.2 release. After that I will start my work on this redesign and, for that time, stop merging incoming PRs (or these will have to merge the redesign in advance).
apt list --upgradable on a newly created node shows:
docker-ce/xenial 17.12.0~ce-0~ubuntu amd64 [upgradable from: 17.03.2~ce-0~ubuntu-xenial]
This would upgrade docker-ce to 17.12.*, which is not compatible with current K8S releases.
This can be prevented by creating /etc/apt/preferences.d/docker-ce:
Package: docker-ce
Pin: version 17.03.*
Pin-Priority: 1000
References regarding supported versions:
It looks like hetzner-kube cluster create […] is unable to deal with passphrase-protected SSH keys.
After successfully initializing all servers, the last 2 lines of output were:
2018/01/26 09:11:28 installing docker.io and kubeadm on node 'grfy-k8s-dc5d-master-01'...
2018/01/26 09:11:28 parse key failed:ssh: cannot decode encrypted private keys
Using hetzner-kube eb0790b, I got a 'uniqueness constraint' violation, and the key wasn't added to hetzner-kube's configuration.
In this case, it should be fine to simply continue setup.
The tool actually does everything correctly but doesn't state that it's done, so this mostly looks like a bug.
From the comment in #17:
I forgot that for most cloud storage solutions you need a replication factor of 3 to ensure real failover behavior...
This is already done for OpenEBS, but not for rook.
Hello!
Ingress seems to support only HTTP/HTTPS protocols ( kubernetes/kubernetes#23291 ), and we need to expose TCP port 3306 ( MySQL ), restricted to a whitelist of IP addresses.
Do you have any solution compatible with hetzner-kube, please?
This manifests as such:
homukube-master-01: install packages 4.5% [=>------------------------------------------------------------------]
homukube-master-01: install packages 4.5% [=>------------------------------------------------------------------]
homukube-master-01: install packages 4.5% [=>------------------------------------------------------------------]
homukube-master-01: install packages 4.5% [=>------------------------------------------------------------------]
(times a thousand)
Really, it would be great if much narrower terminals worked too, but 80 columns at least should work.
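One approach is to compute the bar width from the terminal width and drop the bar entirely below a minimum. A sketch only; the column budget for the label and decorations is an assumption, and a real implementation would query the terminal size (e.g. via an ioctl) instead of taking it as a parameter:

```go
package main

import (
	"fmt"
	"strings"
)

// barWidth computes how many columns the bar itself may use so the whole
// line fits the terminal. Below a minimum it returns 0, signalling the
// caller to print only the percentage. The 12-column overhead for
// percentage and brackets is an illustrative assumption.
func barWidth(termCols, labelLen int) int {
	w := termCols - labelLen - 12
	if w < 10 {
		return 0
	}
	return w
}

func render(label string, pct float64, termCols int) string {
	w := barWidth(termCols, len(label))
	if w == 0 {
		// Too narrow for a bar: percentage only, no redraw spam.
		return fmt.Sprintf("%s %5.1f%%", label, pct)
	}
	filled := int(pct / 100 * float64(w))
	return fmt.Sprintf("%s %5.1f%% [%s%s]", label, pct,
		strings.Repeat("=", filled), strings.Repeat("-", w-filled))
}

func main() {
	fmt.Println(render("homukube-master-01: install packages", 4.5, 80))
	fmt.Println(render("homukube-master-01: install packages", 4.5, 40))
}
```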
I just tried to monitor my cluster using prometheus and failed as kubernetes doesn't export any metrics. We should create an addon that adds an endpoint to the cluster.
When trying to create a cluster with an ED25519 SSH key, the command fails with an error:
parse key failed:ssh: cannot decode encrypted private keys (printed three times without a line break)
2018/03/06 00:54:28 parse key failed:ssh: cannot decode encrypted private keys
Is it possible to support non-RSA keys?
Same as for Ingress, I think it would be good if we offered a default option.
Yesterday I set up an (almost) successful cluster (without Let's Encrypt), and I used rook.
It was almost straightforward, and it looks like a very viable option for storage.
The pluses are:
Here are some notes I used:
git clone https://github.com/rook/rook
cd rook/cluster/examples/kubernetes
kubectl create -f rook-operator.yaml
kubectl create -f rook-cluster.yaml
kubectl create -f rook-storageclass.yaml
kubectl create -f mysql.yaml
kubectl create -f wordpress.yaml
The helm chart didn't work, but this way it worked.
(I didn't have time to debug the helm chart further, but I think it should be the way to go.)
Then I deployed a RocketChat instance with the official helm chart, and it worked with the storage \o/ (and the ingress, but without TLS :/ )
In the last 10 hours, I successfully built a true HA cluster manually.
To do this, the steps I noted down are needed to provision a fully HA kubernetes:
--apiserver-count 3
In particular, this setup was tested with heavy network disruptions, and kubernetes was still able to operate and can be considered highly available.
A different approach would be using a Floating IP and assigning it via keepalived. That would cost extra $.
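For the keepalived variant, the core would be a VRRP instance like the one below (interface name, router id, priority, and IP are illustrative). Note one hedge: Hetzner Cloud Floating IPs are not moved by gratuitous ARP, so a keepalived notify script would additionally have to reassign the Floating IP through the Hetzner API on failover.

```
vrrp_instance K8S_API {
    state MASTER            # BACKUP on the other master nodes
    interface eth0
    virtual_router_id 51
    priority 100            # lower values on the backups
    advert_int 1
    virtual_ipaddress {
        203.0.113.10        # the Floating IP (illustrative address)
    }
    notify /usr/local/bin/reassign-floating-ip.sh   # hypothetical API hook
}
```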
Hi !
Would it be possible to support extra parameters when installing the cluster, please?
For example, we would like to use CoreDNS ( it will be the default in k8s v1.11 ), and it can be enabled manually since k8s 1.9 like this:
kubeadm init --feature-gates=CoreDNS=true
I suppose it's not the only case where it could be useful to pass extra parameters to kubeadm.
Thanks!
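Passing such flags through could be as simple as appending user-supplied arguments to the generated kubeadm invocation. A sketch assuming a hypothetical pass-through CLI flag feeds extraArgs (not hetzner-kube's actual implementation):

```go
package main

import (
	"fmt"
	"strings"
)

// kubeadmInitCmd builds the kubeadm init command line, appending any
// user-supplied extra flags after the tool's own defaults so users can
// override or extend them.
func kubeadmInitCmd(baseArgs, extraArgs []string) string {
	args := append(append([]string{"kubeadm", "init"}, baseArgs...), extraArgs...)
	return strings.Join(args, " ")
}

func main() {
	fmt.Println(kubeadmInitCmd(
		[]string{"--pod-network-cidr=10.244.0.0/16"},
		[]string{"--feature-gates=CoreDNS=true"},
	))
}
```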
See rook/rook#1044
This should have been fixed, so perhaps we're installing a too-old version of Ceph/Rook, but in any case I was unable to mount a filesystem using rook-toolbox without first upgrading the kernel.
Upgrading to linux-image-virtual-hwe-16.04 / linux-headers-virtual-hwe-16.04 fixes it, but putting that (and the necessary reboot) in cloud-init makes cluster create fail. It would be good if that could be handled better.
As the addon command mostly executes commands on the nodes, if a passphrase is used for the SSH key, the command fails with this error because the internal map holding the passphrases is not populated:
jean-philippe@jean-philippe:~$ hetzner-kube cluster addon install helm -n production
2018/03/03 18:42:06 installing addon helm
2018/03/03 18:42:06 parse key failed:passphrase not found
I installed kubernetes-dashboard and metrics-server to get node stats. Heapster fails to resolve the worker nodes:
E0317 21:43:05.000376 1 summary.go:374] Node test-worker-04 has no valid hostname and/or IP address: test-worker-04
My understanding of kubernetes is limited and some hours of google searching didn't reveal the root cause.
Maybe this issue is related to the setup of the cluster.
To ensure the quality of this tool over time, the existing functionality should be covered by tests.
We therefore also need a CI integration to achieve a better flow for upcoming PRs.
Some things we could add to make management easier...
hetzner-kube cluster list only lists the IP of the master node; I'm looking for hetzner-kube cluster get.
I say "we", but knowing myself, I probably won't get around to implementing anything anytime soon. So I'll leave this here for now.
Hi,
Thanks again for this amazing project; I just created a new cluster in a few minutes, it's magic!
I installed the addons helm, rook, ingress, and cert-manager, but I don't really know how to deal with SSL certs.
Thanks!
Add support for binding "floating IPs" to specific nodes (worker/master) to have a "static IP" in case the instance reboots and gets assigned a new IP.