k8s HA cluster setup (kubeadm-ha) · HOT · 25 comments · CLOSED

cookeem commented on May 21, 2024

k8s HA cluster setup

Comments (25)

cookeem commented on May 21, 2024

@kcao3
For your questions:

  1. Sorry, I haven't found out how to make it work with the NodeRestriction admission control setting. It may involve the kubelet settings and RBAC settings; I'm not sure.

  2. I think there are lots of new features in 1.9, so these instructions probably won't work on 1.9, but some friends have told me they also work on 1.8.

  3. I will try it later, thanks for your great advice :)

KeithTt commented on May 21, 2024

Borrowing this thread for a moment........

@cookeem A couple of questions for the OP:

1. The etcd cluster here doesn't use certificates? etcd can run without certificates anyway, right? Without certificates I can bring up the masters and create applications normally.

2. The architecture diagram is great 💯, but I have a question. Both keepalived and nginx are used here: keepalived health-checks the apiserver, which is fine for HA, but what does the nginx proxy add as a load balancer? From the diagram, nginx sits below keepalived, so keepalived and nginx are both monitoring the apiserver? Wouldn't it be more appropriate to use haproxy as a layer-4 proxy and load balancer that checks the apiservers, and have keepalived check haproxy (see the sketch below)? And if nginx is doing the load balancing, then keepalived should be checking nginx instead.

Just some thoughts, hoping for a reply!
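
The arrangement being suggested here, as a rough sketch (hypothetical interface name, priority, and placeholder addresses; haproxy does layer-4 load balancing across the apiservers, and keepalived only tracks haproxy):

# haproxy.cfg (sketch)
frontend k8s-api
    bind *:8443
    mode tcp
    default_backend k8s-api-backend

backend k8s-api-backend
    mode tcp
    balance roundrobin
    server master1 <master1-ip>:6443 check
    server master2 <master2-ip>:6443 check
    server master3 <master3-ip>:6443 check

# keepalived.conf (sketch): the VIP follows whichever node has a healthy haproxy
vrrp_script chk_haproxy {
    script "killall -0 haproxy"
    interval 2
}
vrrp_instance VI_1 {
    state MASTER
    interface eth0
    virtual_router_id 51
    priority 100
    virtual_ipaddress {
        <VIP>
    }
    track_script {
        chk_haproxy
    }
}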


KeithTt commented on May 21, 2024

I gave it a try, and the OP's document really is good ~ 👍

Environment:

kube version: 1.8.3
etcd version: 3.2.9
os version: debian stretch

I did not disable node authorization; the two Node-related parameters were left enabled by default:

--admission-control=Initializers,NamespaceLifecycle,LimitRanger,ServiceAccount,PersistentVolumeLabel,DefaultStorageClass,DefaultTolerationSeconds,NodeRestriction,ResourceQuota
--authorization-mode=Node,RBAC

With that, the other two masters were able to join the cluster, as shown below:

# kubectl get csr
NAME        AGE       REQUESTOR             CONDITION
csr-5xtwv   1h        system:node:uy05-13   Approved,Issued
csr-k8b9v   1h        system:node:uy05-13   Approved,Issued

# kubectl certificate approve $CSR
# kubectl get no
NAME      STATUS    ROLES     AGE       VERSION
uy05-13   Ready     master    2h        v1.8.3
uy08-07   Ready     <none>    1h        v1.8.3
uy08-08   Ready     <none>    1h        v1.8.3
# kubectl get po --all-namespaces -o wide
NAMESPACE     NAME                                       READY     STATUS    RESTARTS   AGE       IP                NODE
kube-system   calico-etcd-fmj7x                          1/1       Running   0          1h        192.168.5.42      uy05-13
kube-system   calico-kube-controllers-55449f8d88-sxk6c   1/1       Running   0          1h        192.168.5.42      uy05-13
kube-system   calico-node-4dqbj                          2/2       Running   1          26m       192.168.5.104     uy08-07
kube-system   calico-node-p4bl2                          2/2       Running   0          26m       192.168.5.105     uy08-08
kube-system   calico-node-v496h                          2/2       Running   0          1h        192.168.5.42      uy05-13
kube-system   heapster-59ff54b574-5nd8c                  1/1       Running   0          19m       192.168.122.200   uy08-07
kube-system   heapster-59ff54b574-8lfwh                  1/1       Running   0          1h        192.168.122.22    uy05-13
kube-system   heapster-59ff54b574-qpcv2                  1/1       Running   0          19m       192.168.122.132   uy08-08
kube-system   kube-apiserver-uy05-13                     1/1       Running   0          1h        192.168.5.42      uy05-13
kube-system   kube-apiserver-uy08-07                     1/1       Running   0          21m       192.168.5.104     uy08-07
kube-system   kube-apiserver-uy08-08                     1/1       Running   0          26m       192.168.5.105     uy08-08
kube-system   kube-controller-manager-uy05-13            1/1       Running   1          1h        192.168.5.42      uy05-13
kube-system   kube-controller-manager-uy08-07            1/1       Running   0          25m       192.168.5.104     uy08-07
kube-system   kube-controller-manager-uy08-08            1/1       Running   0          26m       192.168.5.105     uy08-08
kube-system   kube-dns-545bc4bfd4-7n4jv                  3/3       Running   0          18m       192.168.122.201   uy08-07
kube-system   kube-dns-545bc4bfd4-sngzc                  3/3       Running   0          18m       192.168.122.133   uy08-08
kube-system   kube-dns-545bc4bfd4-z7b4p                  3/3       Running   0          1h        192.168.122.19    uy05-13
kube-system   kube-proxy-8v97k                           1/1       Running   0          26m       192.168.5.104     uy08-07
kube-system   kube-proxy-9rxmb                           1/1       Running   0          1h        192.168.5.42      uy05-13
kube-system   kube-proxy-zgs89                           1/1       Running   0          26m       192.168.5.105     uy08-08
kube-system   kube-scheduler-uy05-13                     1/1       Running   0          1h        192.168.5.42      uy05-13
kube-system   kube-scheduler-uy08-07                     1/1       Running   0          25m       192.168.5.104     uy08-07
kube-system   kube-scheduler-uy08-08                     1/1       Running   0          26m       192.168.5.105     uy08-08
kube-system   kubernetes-dashboard-69c5c78645-67bhb      1/1       Running   0          12m       192.168.122.134   uy08-08
kube-system   kubernetes-dashboard-69c5c78645-74v4j      1/1       Running   0          12m       192.168.122.202   uy08-07
kube-system   kubernetes-dashboard-69c5c78645-fn2wd      1/1       Running   0          1h        192.168.122.21    uy05-13


The cluster is up and everything looks fine; the dashboard also opens normally. Unfortunately, there is still a fatal problem: when controller-manager and scheduler run leader election, their requests to the apiserver are refused. Logs:

# kubectl logs -f kube-controller-manager-uy08-07 -n kube-system
I1119 22:08:55.373366       1 controllermanager.go:109] Version: v1.8.3
I1119 22:08:55.378179       1 leaderelection.go:174] attempting to acquire leader lease...
E1119 22:08:55.378541       1 leaderelection.go:224] error retrieving resource lock kube-system/kube-controller-manager: Get https://192.168.5.104:6443/api/v1/namespaces/kube-system/endpoints/kube-controller-manager: dial tcp 192.168.5.104:6443: getsockopt: connection refused
E1119 22:08:58.830547       1 leaderelection.go:224] error retrieving resource lock kube-system/kube-controller-manager: Get https://192.168.5.104:6443/api/v1/namespaces/kube-system/endpoints/kube-controller-manager: dial tcp 192.168.5.104:6443: getsockopt: connection refused
# kubectl logs -f kube-scheduler-uy08-07 -n kube-system
E1119 22:36:19.612211       1 reflector.go:205] k8s.io/kubernetes/vendor/k8s.io/client-go/informers/factory.go:73: Failed to list *v1.ReplicationController: Get https://192.168.5.104:6443/api/v1/replicationcontrollers?resourceVersion=0: dial tcp 192.168.5.104:6443: getsockopt: connection refused
E1119 22:36:19.613257       1 reflector.go:205] k8s.io/kubernetes/vendor/k8s.io/client-go/informers/factory.go:73: Failed to list *v1beta1.ReplicaSet: Get https://192.168.5.104:6443/apis/extensions/v1beta1/replicasets?resourceVersion=0: dial tcp 192.168.5.104:6443: getsockopt: connection refused
E1119 22:36:19.614281       1 reflector.go:205] k8s.io/kubernetes/vendor/k8s.io/client-go/informers/factory.go:73: Failed to list *v1beta1.StatefulSet: Get https://192.168.5.104:6443/apis/apps/v1beta1/statefulsets?resourceVersion=0: dial tcp 192.168.5.104:6443: getsockopt: connection refused

Not sure whether this is related to the Node policy... sad
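
One way to narrow this down (a sketch; the address and file paths below come from the logs above and the standard kubeadm layout, so treat them as assumptions) is to check whether the local apiserver is listening at all, and whether the client certificate embedded in controller-manager.conf is accepted:

# is anything listening on the local apiserver port?
ss -tlnp | grep 6443

# extract the embedded client cert/key from the kubeconfig and probe the apiserver with it
grep client-certificate-data /etc/kubernetes/controller-manager.conf | awk '{print $2}' | base64 -d > /tmp/cm.crt
grep client-key-data /etc/kubernetes/controller-manager.conf | awk '{print $2}' | base64 -d > /tmp/cm.key
curl --cacert /etc/kubernetes/pki/ca.crt --cert /tmp/cm.crt --key /tmp/cm.key https://192.168.5.104:6443/healthz

A connection refused on the curl would point at the local apiserver not listening rather than at the admission policy.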


KeithTt commented on May 21, 2024

update:

After some digging it finally works!!! I manually generated a separate set of certificates for the other two nodes. I used openssl directly; many guides online use cfssl, and the official docs seem to use easyrsa.

# kubectl logs -f kube-controller-manager-uy08-07 -n kube-system
I1120 10:10:13.172558       1 controllermanager.go:109] Version: v1.8.3
I1120 10:10:13.177602       1 leaderelection.go:174] attempting to acquire leader lease...

root@uy05-13:~# kubectl logs -f kube-controller-manager-uy08-08 -n kube-system
I1121 02:38:05.051239       1 controllermanager.go:109] Version: v1.8.3
I1121 02:38:05.055504       1 leaderelection.go:174] attempting to acquire leader lease...

# kubectl logs -f kube-scheduler-uy08-07 -n kube-system
I1120 10:10:13.916338       1 controller_utils.go:1041] Waiting for caches to sync for scheduler controller
I1120 10:10:14.016530       1 controller_utils.go:1048] Caches are synced for scheduler controller
I1120 10:10:14.016584       1 leaderelection.go:174] attempting to acquire leader lease...

# kubectl logs -f kube-scheduler-uy08-08 -n kube-system
I1121 02:38:05.729041       1 controller_utils.go:1041] Waiting for caches to sync for scheduler controller
I1121 02:38:05.829314       1 controller_utils.go:1048] Caches are synced for scheduler controller
I1121 02:38:05.829398       1 leaderelection.go:174] attempting to acquire leader lease...

The only remaining piece is load balancing and high availability, which I plan to do with lvs + keepalived. Many thanks to the OP for the document!!!

cookeem commented on May 21, 2024

@KeithTt Did you keep all the default admission-control plugins and solve the inter-node communication problem by creating the certificates yourself?

KeithTt commented on May 21, 2024

Yes, the Node policy was kept, and I generated a separate set of certificates for each of the two newly added nodes.

You could try generating a separate set of certificates for each node: keep the CA unchanged and keep the sa public/private keys unchanged; the thing to watch is the SAN of the apiserver certificate.

If you are not familiar with openssl this may take some careful study. Looking forward to an updated 1.8 document.
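
For the apiserver SAN part, a rough sketch of what the extension file needs to contain (hypothetical file name and placeholder values; the actual entries must match your own service CIDR, hostnames, and VIP):

# apiserver-openssl.cnf (illustrative only)
[ v3_req ]
keyUsage = keyEncipherment, dataEncipherment
extendedKeyUsage = serverAuth
subjectAltName = @alt_names

[ alt_names ]
DNS.1 = kubernetes
DNS.2 = kubernetes.default
DNS.3 = kubernetes.default.svc
DNS.4 = kubernetes.default.svc.cluster.local
DNS.5 = <this master's hostname>
IP.1 = <first IP of the service CIDR, e.g. 10.96.0.1>
IP.2 = <this master's IP>
IP.3 = <keepalived VIP>

Signing then follows the same openssl x509 -req pattern shown later in this thread.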


cookeem commented on May 21, 2024

@KeithTt
In v1.8, is the admission-control setting still the following?

--admission-control=Initializers,NamespaceLifecycle,LimitRanger,ServiceAccount,PersistentVolumeLabel,DefaultStorageClass,DefaultTolerationSeconds,NodeRestriction,ResourceQuota

This is exactly the problem that was left unresolved in the v1.7 HA setup. Could you describe your steps in detail?

KeithTt commented on May 21, 2024

Yes, the default policy is unchanged.

Apart from the certificate/communication part, the steps are the same as in your document.

In other words, on top of the existing document, generating a set of certificates for each node is all that is needed.

ghulevishal commented on May 21, 2024

While joining a worker node using the virtual IP address, I get the following error.

$ kubeadm join --token e6fa03.81bc4d202f817f32 104.236.222.113:8443

[kubeadm] WARNING: kubeadm is in beta, please do not use it for production clusters.
[preflight] Running pre-flight checks
[discovery] Trying to connect to API Server "104.236.222.113:8443"
[discovery] Created cluster-info discovery client, requesting info from "https://104.236.222.113:8443"
[discovery] Failed to request cluster info, will try again: [Get https://104.236.222.113:8443/api/v1/namespaces/kube-public/configmaps/cluster-info: dial tcp 104.236.222.113:8443: getsockopt: connection refused]
[discovery] Failed to request cluster info, will try again: [Get https://104.236.222.113:8443/api/v1/namespaces/kube-public/configmaps/cluster-info: dial tcp 104.236.222.113:8443: getsockopt: connection refused]
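
Before looking at certificates, it may be worth confirming from the worker node that something is answering on the VIP at all (plain connectivity checks; the address is the one from the log above):

nc -vz 104.236.222.113 8443
curl -k https://104.236.222.113:8443/healthz

Even a TLS or authorization error from curl would at least show the VIP and port are reachable; a connection refused here points at the keepalived/proxy layer on the VIP rather than at kubeadm itself.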


cookeem commented on May 21, 2024

Is the 104.236.222.113:8443 port already active?

ghulevishal commented on May 21, 2024

Yes, the port is active. Is there any extra configuration needed for the virtual IP node?

cookeem commented on May 21, 2024

@vishalcloudyuga Check your apiserver logs, and make sure your certificates are created correctly.

cookeem commented on May 21, 2024

@KeithTt Could you explain how you did this? I know how to create the apiserver certificate manually with openssl, but I'm really not sure how to create the ones for controller-manager and scheduler. Is there a concrete guide?

https://kubernetes.io/docs/concepts/cluster-administration/certificates/#distributing-self-signed-ca-certificate
The official page above describes creating certificates, but only the apiserver certificate.

KeithTt commented on May 21, 2024

@cookeem I hadn't realized the official docs were this detailed..... only just saw it. Impressive.

If you know how to create the apiserver certificate, then the controller-manager and scheduler ones are even simpler: the only differences are that these two are client-authentication certificates and that no SAN needs to be configured, so a small change to the config file is enough.

This is what I did:

#controller-manager
openssl genrsa -out controller-manager.key 2048
openssl req -new -key controller-manager.key -out controller-manager.csr -subj "/CN=system:kube-controller-manager"
openssl x509 -req -set_serial $(date +%s%N) -in controller-manager.csr -CA ca.crt -CAkey ca.key -out controller-manager.crt -days 365 -extensions v3_req -extfile controller-manager-openssl.cnf

#scheduler
openssl genrsa -out scheduler.key 2048
openssl req -new -key scheduler.key -out scheduler.csr -subj "/CN=system:kube-scheduler"
openssl x509 -req -set_serial $(date +%s%N) -in scheduler.csr -CA ca.crt -CAkey ca.key -out scheduler.crt -days 365 -extensions v3_req -extfile scheduler-openssl.cnf
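
The two -extfile configs referenced above were not posted in the thread; based on the commands, a minimal extension file along these lines should match (illustrative; the scheduler one would be identical). Since these are pure client certificates, clientAuth is enough and no subjectAltName is needed:

# controller-manager-openssl.cnf (sketch)
[ v3_req ]
basicConstraints = CA:FALSE
keyUsage = digitalSignature, keyEncipherment
extendedKeyUsage = clientAuth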


cookeem commented on May 21, 2024

@KeithTt
How are controller-manager-openssl.cnf and scheduler-openssl.cnf defined?
Also, I noticed that kube-scheduler.conf in the manifests has no place to define a crt. Does that part of the manifests need to be changed as well?

KeithTt commented on May 21, 2024

@cookeem Just take the default config file and modify it slightly.

In the manifests you probably mean kube-scheduler.yaml. The certificates are not referenced there by file path; they are embedded directly in controller-manager.conf and scheduler.conf.

ghulevishal commented on May 21, 2024

@cookeem Sorry, I misunderstood the setup. I was trying this setup in a cloud environment and had created a dedicated node for the virtual IP. I will revise it and let you know.

cookeem commented on May 21, 2024

@KeithTt
Two questions:
1. What do controller-manager-openssl.cnf and scheduler-openssl.cnf, the ext files you use to create controller-manager.crt and scheduler.crt, look like?
2. Do you paste the certificates directly into controller-manager.conf and scheduler.conf, i.e. into the client-certificate-data and client-key-data fields?

KeithTt commented on May 21, 2024

I have tidied things up; this is my configuration script. What needs to be set and substituted can be seen from the script:

#!/bin/bash

VIP=192.168.5.42
APISERVER_PORT=6443
ETCDSERVER=http://192.168.5.42:2379,http://192.168.5.104:2379,http://192.168.5.105:2379
HOSTNAME=$(hostname)
CA_CRT=$(cat ca.crt |base64 -w0)
CA_KEY=$(cat ca.key |base64 -w0)
ADMIN_CRT=$(cat admin.crt |base64 -w0)
ADMIN_KEY=$(cat admin.key |base64 -w0)
CONTROLLER_CRT=$(cat controller-manager.crt |base64 -w0)
CONTROLLER_KEY=$(cat controller-manager.key |base64 -w0)
KUBELET_CRT=$(cat $(hostname).crt |base64 -w0)
KUBELET_KEY=$(cat $(hostname).key |base64 -w0)
SCHEDULER_CRT=$(cat scheduler.crt |base64 -w0)
SCHEDULER_KEY=$(cat scheduler.key |base64 -w0)

mkdir -p /etc/kubernetes/pki/
mkdir -p /etc/kubernetes/manifests/

#admin
sed -e "s/VIP/$VIP/g" -e "s/APISERVER_PORT/$APISERVER_PORT/g" -e "s/CA_CRT/$CA_CRT/g" -e "s/ADMIN_CRT/$ADMIN_CRT/g" -e "s/ADMIN_KEY/$ADMIN_KEY/g" admin.temp > admin.conf
cp -a admin.conf /etc/kubernetes/admin.conf

#kubelet
sed -e "s/VIP/$VIP/g" -e "s/APISERVER_PORT/$APISERVER_PORT/g" -e "s/HOSTNAME/$HOSTNAME/g" -e "s/CA_CRT/$CA_CRT/g" -e "s/CA_KEY/$CA_KEY/g" -e "s/KUBELET_CRT/$KUBELET_CRT/g" -e "s/KUBELET_KEY/$KUBELET_KEY/g" kubelet.temp > kubelet.conf
cp -a kubelet.conf /etc/kubernetes/kubelet.conf

#controller-manager
sed -e "s/VIP/$VIP/g" -e "s/APISERVER_PORT/$APISERVER_PORT/g" -e "s/CA_CRT/$CA_CRT/g" -e "s/CONTROLLER_CRT/$CONTROLLER_CRT/g" -e "s/CONTROLLER_KEY/$CONTROLLER_KEY/g" controller-manager.temp > controller-manager.conf
cp -a controller-manager.conf /etc/kubernetes/controller-manager.conf

#scheduler
sed -e "s/VIP/$VIP/g" -e "s/APISERVER_PORT/$APISERVER_PORT/g" -e "s/CA_CRT/$CA_CRT/g" -e "s/SCHEDULER_CRT/$SCHEDULER_CRT/g" -e "s/SCHEDULER_KEY/$SCHEDULER_KEY/g" scheduler.temp > scheduler.conf
cp -a scheduler.conf /etc/kubernetes/scheduler.conf

#manifest pub
cp -a ca.crt /etc/kubernetes/pki/
cp -a ca.key /etc/kubernetes/pki/
cp -a sa.pub /etc/kubernetes/pki/
cp -a sa.key /etc/kubernetes/pki/
cp -a apiserver.crt /etc/kubernetes/pki/
cp -a apiserver.key /etc/kubernetes/pki/
cp -a front-proxy-client.key /etc/kubernetes/pki/
cp -a front-proxy-client.crt /etc/kubernetes/pki/
cp -a front-proxy-ca.key /etc/kubernetes/pki/
cp -a front-proxy-ca.crt /etc/kubernetes/pki/

#manifest kube-apiserver-client
cp -a apiserver-kubelet-client.key /etc/kubernetes/pki/
cp -a apiserver-kubelet-client.crt /etc/kubernetes/pki/

#backup
cp -a admin.crt /etc/kubernetes/pki/
cp -a admin.key /etc/kubernetes/pki/
cp -a controller-manager.crt /etc/kubernetes/pki/
cp -a controller-manager.key /etc/kubernetes/pki/
cp -a scheduler.crt /etc/kubernetes/pki/
cp -a scheduler.key /etc/kubernetes/pki/
cp -a $(hostname).crt /etc/kubernetes/pki/
cp -a $(hostname).key /etc/kubernetes/pki/
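
The *.temp files used above are kubeconfig templates containing the placeholders that the sed commands substitute. They were not posted, but based on those substitutions a controller-manager.temp would look roughly like this (illustrative sketch):

apiVersion: v1
kind: Config
clusters:
- cluster:
    certificate-authority-data: CA_CRT
    server: https://VIP:APISERVER_PORT
  name: kubernetes
contexts:
- context:
    cluster: kubernetes
    user: system:kube-controller-manager
  name: system:kube-controller-manager@kubernetes
current-context: system:kube-controller-manager@kubernetes
users:
- name: system:kube-controller-manager
  user:
    client-certificate-data: CONTROLLER_CRT
    client-key-data: CONTROLLER_KEY

This also matches the earlier question: the per-component certificates end up in the client-certificate-data and client-key-data fields.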


cookeem commented on May 21, 2024

@KeithTt Impressive, you have solved the admission policy problem that I never managed to solve. Nicely done. I will test it tomorrow.

KeithTt commented on May 21, 2024

As for the openssl config file, it is hard to explain briefly; without some prior intuition you may need to read a book on it. The mutual authentication here splits into serverAuth and clientAuth. You can start with openssl's default config file (on Debian it is /etc/ssl/openssl.cnf); what I used is just a slightly modified version of that default config. Once you understand that file, my usage will make sense.

KeithTt commented on May 21, 2024

Yep, looking forward to the updated document ~ 👍

cookeem commented on May 21, 2024

@KeithTt 😁

cookeem commented on May 21, 2024

@kcao3 I only regenerated the controller-manager and scheduler certificates, without changing the CA certificate, and updated controller-manager.conf and scheduler.conf, but I still get the connection refused errors.
CentOS also has /etc/pki/tls/openssl.cnf, but there are far too many options and I don't know which ones need to be set. Based on the apiserver section of the Kubernetes certificates doc, I only set openssl.cnf to:

[ v3_ext ]
authorityKeyIdentifier=keyid,issuer:always
basicConstraints=CA:FALSE
keyUsage=keyEncipherment,dataEncipherment
extendedKeyUsage=serverAuth,clientAuth
#subjectAltName=@alt_names

Unfortunately, although the certificates were generated, the other master nodes still fail to connect. How is your cnf configured?

cookeem commented on May 21, 2024

kubeadm v1.9 now officially supports HA, so this issue can be closed. A new document about kubeadm v1.9 HA will be published soon.
