Comments (24)

phaer commented on June 15, 2024

@mysticaltech Could be, I am mostly guessing at this point!

x509: certificate is valid for 127.0.0.1, 88.198.105.71, not 10.2.0.1" node="agent-big-0"

I interpret this error as saying that the metrics server (on agent-big-0) is trying to contact the API server on 10.2.0.1, but its certificate is only signed for localhost and the external IP, not the internal control plane IP (10.2.0.1). So the SAN looks suspicious to me.

EDIT: No, that's wrong. It's not about the API server (6443); see port 10250.
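For anyone debugging a similar x509 error: one way to see which addresses a serving certificate actually covers is to print its Subject Alternative Names. The following is only a sketch, assuming `openssl` is installed and the port is reachable from where you run it; the function names `list_cert_sans` and `san_lists_ip` are made up for illustration and are not part of kube-hetzner or k3s.

```shell
#!/bin/sh
# Hypothetical helper: print the Subject Alternative Names of the certificate
# served at HOST:PORT (for example port 10250 from the error message).
list_cert_sans() {
  host="$1"
  port="$2"
  # s_client fetches the served certificate; x509 -text prints it in readable form.
  openssl s_client -connect "${host}:${port}" </dev/null 2>/dev/null \
    | openssl x509 -noout -text \
    | grep -A1 'Subject Alternative Name'
}

# Hypothetical helper: succeed if the SAN text read from stdin lists the given IP.
san_lists_ip() {
  # Put each SAN entry on its own line, trim whitespace, then match exactly.
  tr ',' '\n' \
    | sed 's/^[[:space:]]*//; s/[[:space:]]*$//' \
    | grep -Fxq "IP Address:$1"
}
```

Something like `list_cert_sans 10.2.0.1 10250 | san_lists_ip 10.2.0.1 && echo covered` would then report whether the private IP is in the certificate (the address is just the one from the error above).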

from terraform-hcloud-kube-hetzner.

phaer commented on June 15, 2024

Can confirm that the metrics-server seems to be working in my just-deployed cluster (name-suffixes branch, but that shouldn't matter here)

mysticaltech commented on June 15, 2024

My bad, all IPs that are not private must be made private.

All should be 10.X.0.X...

As soon as you open the file, you will know!

mysticaltech commented on June 15, 2024

The node-ip in config.yaml is wrong; it should be the private IP of your server, so of the form 10.2.0.X.

MartiniMoe commented on June 15, 2024

Thank you so much! Now it looks okay :)

phaer commented on June 15, 2024

Hi @MartiniMoe,

Looks like your control plane's certificate does not include its internal IP. This looks like a bug, but I don't have time to investigate properly at the moment. If you do, please try adding the following line to your control plane's k3s config in https://github.com/kube-hetzner/kube-hetzner/blob/master/control_planes.tf#L45

tls-san = module.control_planes[0].private_ipv4_address

and re-provision. This should include the control plane's private IP in the cert, but is currently untested.

(We had this in an earlier version of kube-hetzner, but there's been a lot going on in this repo lately, hope that it's going to stabilize soon ;))

MartiniMoe commented on June 15, 2024

Thanks, I will try that!
How can I re-provision easily? Or do I have to tear everything down and start over?

phaer commented on June 15, 2024

Should be sufficient to taint your first control node in this case.

MartiniMoe commented on June 15, 2024

Thanks. I added the line and let terraform recreate control-node-0, but the problem persists :/

phaer commented on June 15, 2024

Does the resulting k3s config look correct? Did you check whether the certificate is generated correctly? Did you try re-creating the cluster?
Sadly, I can't provide step-by-step instructions; you need to do some of the digging yourself (or wait until someone else does) ;)

mysticaltech commented on June 15, 2024

@phaer The tls-san defaults to the node-ip; that's why I removed it while fixing another certificate issue, which was just the node agents using their public IP as node-ip. So I believe tls-san is not the problem, or is it?

mysticaltech commented on June 15, 2024

Ahhh.... Yes you are right!

mysticaltech commented on June 15, 2024

@MartiniMoe you are probably using master from 48h ago... Just pull the latest changes, this is fixed already!

MartiniMoe commented on June 15, 2024

@mysticaltech Thanks, I already pulled and did a terraform apply. Can I fix this without recreating the cluster?

mysticaltech commented on June 15, 2024

@MartiniMoe Yes, probably. Log in via SSH to each agent (see the readme), then:

systemctl stop k3s-agent

Then edit /etc/rancher/k3s/config.yaml and change the server IP, basically all IPs, to the private IP.

systemctl start k3s-agent
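
The stop, edit, start sequence above could be scripted roughly as follows. This is a sketch under assumptions, not a tested procedure from the project: it assumes the config uses the quoted `"node-ip": "..."` layout seen in the config.yaml posted in this thread, it only touches the node-ip line (the server line is left alone), and `set_node_ip` is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: rewrite the "node-ip" entry of a k3s config file in place.
# Assumes the quoted key/value layout, e.g.  "node-ip": "10.1.0.1"
set_node_ip() {
  config_file="$1"
  new_ip="$2"
  tmp_file="$(mktemp)" || return 1
  # Replace only the value on the node-ip line; all other lines pass through unchanged.
  sed "s|^\"node-ip\": \".*\"\$|\"node-ip\": \"${new_ip}\"|" "$config_file" > "$tmp_file" \
    && cat "$tmp_file" > "$config_file"
  status=$?
  rm -f "$tmp_file"
  return "$status"
}

# Intended use on an agent node (run as root; the IP is a placeholder):
#   systemctl stop k3s-agent
#   set_node_ip /etc/rancher/k3s/config.yaml 10.2.0.1
#   systemctl start k3s-agent
```

Writing through a temporary file and then `cat`-ing it back keeps the original file's owner and permissions, which matters because the file contains the cluster token.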

MartiniMoe commented on June 15, 2024

@mysticaltech Do I change "server": or "node-ip": or both?

mysticaltech commented on June 15, 2024

Maybe you need to drain the node before and uncordon it afterwards. I think that's best.
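
A drain-then-uncordon wrapper around a maintenance step might look like the sketch below. It assumes `kubectl` is configured against the cluster on the machine where it runs; `with_drained_node` and the `KUBECTL` override variable are hypothetical names introduced here. The `--delete-emptydir-data` flag is the newer name of `--delete-local-data`, so older kubectl versions would need the old spelling.

```shell
#!/bin/sh
# Hypothetical wrapper: drain a node, run a maintenance command, then uncordon it.
# KUBECTL may be overridden (e.g. for testing); it defaults to plain kubectl.
with_drained_node() {
  node="$1"
  shift
  kubectl_bin="${KUBECTL:-kubectl}"
  # Evict workloads first; DaemonSet-managed pods are left in place.
  "$kubectl_bin" drain "$node" --ignore-daemonsets --delete-emptydir-data || return 1
  # Run the maintenance step passed as the remaining arguments.
  "$@"
  status=$?
  # Make the node schedulable again even if the maintenance step failed.
  "$kubectl_bin" uncordon "$node"
  return "$status"
}
```

For example, `with_drained_node agent-big-0 ssh root@<agent-address> 'systemctl restart k3s-agent'`, where `<agent-address>` is a placeholder for however you reach the agent.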

MartiniMoe commented on June 15, 2024

I'm not sure what happened here. My agent has the private IP "10.1.0.1" in the config file, but actually it has "10.2.0.1" 😕

phaer commented on June 15, 2024

What's your network_ipv4_subnets and agent_nodepools?

MartiniMoe commented on June 15, 2024

network_ipv4_subnets = {
  control_plane = "10.1.0.0/16"
  agent_big     = "10.2.0.0/16"
#  agent_small   = "10.3.0.0/16"
}

agent_nodepools = {
  agent-big = {
    server_type = "cpx21",
    count       = 1,
    subnet      = "agent_big",
  }
#  agent-small = {
#    server_type = "cpx11",
#    count       = 2,
#    subnet      = "agent_small",
#  }
}
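
With these subnets, an agent in agent_big should carry an address inside 10.2.0.0/16, while the 10.1.0.1 found in the agent's config belongs to the control_plane range. A quick mechanical check of "is this IP inside that subnet" can be sketched in plain POSIX shell; `ip_to_int` and `ip_in_cidr` are hypothetical helper names, and the sketch assumes well-formed IPv4 input without leading zeros.

```shell
#!/bin/sh
# Hypothetical helper: convert a dotted IPv4 address to a single integer.
ip_to_int() {
  # Split on dots inside a subshell so the caller's IFS is left untouched.
  ( IFS=.; set -- $1; echo $(( ($1 << 24) + ($2 << 16) + ($3 << 8) + $4 )) )
}

# Hypothetical helper: succeed if IP ($1) lies inside CIDR ($2, e.g. 10.2.0.0/16).
ip_in_cidr() {
  ip_int="$(ip_to_int "$1")"
  net_int="$(ip_to_int "${2%/*}")"
  prefix="${2#*/}"
  # Build the netmask for the prefix length, limited to 32 bits.
  mask=$(( (0xFFFFFFFF << (32 - prefix)) & 0xFFFFFFFF ))
  [ $(( ip_int & mask )) -eq $(( net_int & mask )) ]
}
```

Here `ip_in_cidr 10.2.0.1 10.2.0.0/16` succeeds and `ip_in_cidr 10.1.0.1 10.2.0.0/16` fails, which matches the mismatch described in this thread.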

MartiniMoe commented on June 15, 2024

If I change the agent's IP in its config to its actual IP, the node is shown as not ready afterwards in kubectl get nodes.

mysticaltech commented on June 15, 2024

Please post your config.yaml file here. Your server IP should be 10.1.0.1. Also check the node events with kubectl describe; what does it say? And did you drain and cordon it beforehand?

mysticaltech commented on June 15, 2024

Any updates on this @MartiniMoe?

MartiniMoe commented on June 15, 2024

Ah yes, sorry for the late reply, I was a little busy.

So, this is my network:
[image]

This is the config.yaml on agent-big-0:

static:~ # cat /etc/rancher/k3s/config.yaml
"flannel-iface": "eth1"
"kubelet-arg": "cloud-provider=external"
"node-ip": "10.1.0.1"
"node-label":
- "k3s_upgrade=true"
"node-name": "agent-big-0"
"server": "https://10.1.0.1:6443"
"token": "<token>"

There are no events:

~ ❯ kubectl describe node agent-big-0
[...]
Events:              <none>

Regarding draining and cordoning, I tried to do it, but it did not really work because I had too much workload in the cluster. I then deleted some deployments, tried again, and made the changes to config.yaml. To be honest, at this point I'm not exactly sure whether the node was actually drained 😕
