kubecube-io / kubecube
KubeCube is an open source enterprise-level container platform
Home Page: https://kubecube.io
License: Apache License 2.0
CentOS 7.4: the all-in-one install script reports an error and cannot pull images
2021-07-13 14:27:52 DEBUG enable and start docker
Created symlink from /etc/systemd/system/multi-user.target.wants/docker.service to /etc/systemd/system/docker.service.
2021-07-13 14:27:57 INFO downloading images
I0713 14:27:59.066899 15693 version.go:252] remote version is much newer: v1.21.2; falling back to: stable-1.19
W0713 14:27:59.839835 15693 configset.go:348] WARNING: kubeadm cannot validate component configs for API groups [kubelet.config.k8s.io kubeproxy.config.k8s.io]
2021-07-13 14:27:59 DEBUG spin pid: 15728 -Error response from daemon: Get https://registry.cn-hangzhou.aliyuncs.com/v2/google_containers/kube-apiserver/manifests/v1.19.12: Get https://dockerauth.cn-hangzhou.aliyuncs.com/auth?scope=repository%3Agoogle_containers%2Fkube-apiserver%3Apull&service=registry.aliyuncs.com%3Acn-hangzhou%3A26842: dial tcp: lookup dockerauth.cn-hangzhou.aliyuncs.com on 10.198.141.241:53: no answer from DNS server
2021-07-13 14:32:41 ERROR install kubernetes failed
How should pivotKubeConfig and localKubeConfig be configured? Advice appreciated.
Error: INSTALLATION FAILED: failed to create resource: Service "frontend-nodeport" is invalid: spec.ports[0].nodePort: Invalid value: 30080: provided port is already allocated
When installing with Helm, you must provide the IP of the server that exposes the NodePort, but the installation conflicts with the NodePort of an ingress that already exists in the cluster. There is no way to avoid this port, so installation cannot proceed.
Port 30080 above is the ingress's NodePort.
When the deployment environment needs a proxy to reach the internet, the Helm charts cannot be downloaded.
Consider packaging the helm-charts directly into the container image, or supporting optional http_proxy / https_proxy environment variables for the init container, as sketched below.
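One hedged illustration of the second suggestion: if the component that downloads the charts is a Go program (Helm itself is, and it honors the standard proxy variables through http.ProxyFromEnvironment), then exposing optional http_proxy / https_proxy / no_proxy settings on the init container would already be enough. A minimal sketch; CHART_URL is a placeholder, not a real KubeCube setting:

package main

import (
    "fmt"
    "net/http"
    "os"
)

// http.DefaultTransport resolves the proxy via http.ProxyFromEnvironment,
// so HTTP_PROXY / HTTPS_PROXY / NO_PROXY set on the init container are
// picked up without any extra code in the downloader.
func main() {
    url := os.Getenv("CHART_URL") // placeholder for the real chart location
    resp, err := http.Get(url)
    if err != nil {
        fmt.Fprintf(os.Stderr, "download failed: %v\n", err)
        os.Exit(1)
    }
    defer resp.Body.Close()
    fmt.Println("chart download status:", resp.Status)
}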
Continuing to look around:
[Hotplug] hotplugs.hotplug.kubecube.io v1
Both common and pivot-cluster are enabled:
spec:
  component:
  - name: audit
    status: enabled
  - env: "address: elasticsearch-master-headless.elasticsearch.svc\n"
    name: logseer
    namespace: logseer
    pkgName: logseer-v1.0.0.tgz
    status: enabled
  - env: "clustername: \"{{.cluster}}\"\n"
    name: logagent
    namespace: logagent
    pkgName: logagent-v1.0.0.tgz
    status: enabled
  - name: elasticsearch
    namespace: elasticsearch
    pkgName: elasticsearch-7.8.1.tgz
    status: enabled
  - env: "grafana:\n enabled: false\nprometheus:\n prometheusSpec:\n externalLabels:\n cluster: \"{{.cluster}}\"\n remoteWrite:\n - url: http://172.31.0.171:31291/api/v1/receive\n"
    name: kubecube-monitoring
    namespace: kubecube-monitoring
    pkgName: kubecube-monitoring-15.4.12.tgz
    status: enabled
  - name: kubecube-thanos
    namespace: kubecube-monitoring
    pkgName: thanos-3.18.0.tgz
    status: enabled
spec:
  component:
  - env: "address: elasticsearch-master.elasticsearch.svc \n"
    name: logseer
    status: enabled
  - env: "grafana:\n enabled: true\nprometheus:\n prometheusSpec:\n externalLabels:\n cluster: \"{{.cluster}}\"\n remoteWrite:\n - url: http://kubecube-thanos-receive:19291/api/v1/receive\n"
    name: kubecube-monitoring
  - env: "receive:\n tsdbRetention: 7d\n replicaCount: 1\n replicationFactor: 1\n"
    name: kubecube-thanos
    status: enabled
With the default configuration I tried both elasticsearch-master-headless.elasticsearch.svc and elasticsearch-master.elasticsearch.svc; in theory it should make no difference, but it still did not work, so I started debugging.
Problem 1: querying logs fails with "request elasticsearch fail".
Problem 2: the operation audit has no data (resolved after debugging).
The process was as follows:
The container logs of the running logseer pod show the following:
2022-09-24 20:40:47.299 [http-nio-8080-exec-10] c.n.logseer.engine.impl.ElasticSearchEngineImpl:52 INFO - [getLogs] request to es, url: /*/_search?ignore_unavailable=true, requestBody: {
"size": 50,
"from": 0,
"query": {
"bool" : {
"filter" : [
{"term": {"cluster_name" : "pivot-cluster"}},
{"term": {"namespace" : "wordpress"}}
],
"must" : [
{
"query_string" : {
"default_field" : "message",
"query" : "elasticsearch-master.elasticsearch.svc:9200"
}
},
{
"range" : {
"@timestamp" : {
"gte" : 1664019350313,
"lte" : 1664022950313,
"format": "epoch_millis"
}
}
}
]
}
},
"aggs": {
"2": {
"date_histogram": {
"field": "@timestamp",
"interval": "1m",
"time_zone": "Asia/Shanghai",
"min_doc_count": 1
}
}
},
"highlight" : {
"fields" : {
"message" : {}
},
"fragment_size": 2147483647
},
"sort" : [
{ "@timestamp" : "asc"}
],
"_source" : {
"excludes": "tags"
},
"timeout": "30000ms"
}
2022-09-24 20:40:48.302 [http-nio-8080-exec-10] c.n.logseer.engine.impl.ElasticSearchEngineImpl:65 ERROR - request elasticsearch exception: {}
java.net.ConnectException: null
at org.elasticsearch.client.RestClient$SyncResponseListener.get(RestClient.java:959)
at org.elasticsearch.client.RestClient.performRequest(RestClient.java:233)
at com.netease.logseer.engine.impl.ElasticSearchEngineImpl.getLogs(ElasticSearchEngineImpl.java:53)
at com.netease.logseer.service.impl.LogSearchServiceImpl.commonSearch(LogSearchServiceImpl.java:154)
at com.netease.logseer.service.impl.LogSearchServiceImpl.searchLog(LogSearchServiceImpl.java:79)
at com.netease.logseer.api.controller.LogSearchController.searchLog(LogSearchController.java:50)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.springframework.web.method.support.InvocableHandlerMethod.doInvoke(InvocableHandlerMethod.java:205)
at org.springframework.web.method.support.InvocableHandlerMethod.invokeForRequest(InvocableHandlerMethod.java:133)
at org.springframework.web.servlet.mvc.method.annotation.ServletInvocableHandlerMethod.invokeAndHandle(ServletInvocableHandlerMethod.java:116)
at org.springframework.web.servlet.mvc.method.annotation.RequestMappingHandlerAdapter.invokeHandlerMethod(RequestMappingHandlerAdapter.java:827)
at org.springframework.web.servlet.mvc.method.annotation.RequestMappingHandlerAdapter.handleInternal(RequestMappingHandlerAdapter.java:738)
at org.springframework.web.servlet.mvc.method.AbstractHandlerMethodAdapter.handle(AbstractHandlerMethodAdapter.java:85)
at org.springframework.web.servlet.DispatcherServlet.doDispatch(DispatcherServlet.java:963)
at org.springframework.web.servlet.DispatcherServlet.doService(DispatcherServlet.java:897)
at org.springframework.web.servlet.FrameworkServlet.processRequest(FrameworkServlet.java:970)
at org.springframework.web.servlet.FrameworkServlet.doPost(FrameworkServlet.java:872)
at javax.servlet.http.HttpServlet.service(HttpServlet.java:660)
at org.springframework.web.servlet.FrameworkServlet.service(FrameworkServlet.java:846)
at javax.servlet.http.HttpServlet.service(HttpServlet.java:741)
at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:231)
at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:166)
at org.apache.tomcat.websocket.server.WsFilter.doFilter(WsFilter.java:52)
at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:193)
at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:166)
at com.netease.logseer.api.filter.FillWebContextHolderFilter.doFilter(FillWebContextHolderFilter.java:35)
at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:193)
at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:166)
at com.netease.logseer.api.filter.AuthFilter.doFilter(AuthFilter.java:92)
at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:193)
at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:166)
at org.springframework.web.filter.RequestContextFilter.doFilterInternal(RequestContextFilter.java:99)
at org.springframework.web.filter.OncePerRequestFilter.doFilter(OncePerRequestFilter.java:107)
at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:193)
at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:166)
at org.springframework.web.filter.HttpPutFormContentFilter.doFilterInternal(HttpPutFormContentFilter.java:105)
at org.springframework.web.filter.OncePerRequestFilter.doFilter(OncePerRequestFilter.java:107)
at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:193)
at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:166)
at org.springframework.web.filter.HiddenHttpMethodFilter.doFilterInternal(HiddenHttpMethodFilter.java:81)
at org.springframework.web.filter.OncePerRequestFilter.doFilter(OncePerRequestFilter.java:107)
at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:193)
at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:166)
at org.springframework.web.filter.CharacterEncodingFilter.doFilterInternal(CharacterEncodingFilter.java:197)
at org.springframework.web.filter.OncePerRequestFilter.doFilter(OncePerRequestFilter.java:107)
at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:193)
at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:166)
at org.springframework.boot.web.support.ErrorPageFilter.doFilter(ErrorPageFilter.java:115)
at org.springframework.boot.web.support.ErrorPageFilter.access$000(ErrorPageFilter.java:59)
at org.springframework.boot.web.support.ErrorPageFilter$1.doFilterInternal(ErrorPageFilter.java:90)
at org.springframework.web.filter.OncePerRequestFilter.doFilter(OncePerRequestFilter.java:107)
at org.springframework.boot.web.support.ErrorPageFilter.doFilter(ErrorPageFilter.java:108)
at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:193)
at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:166)
at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:199)
at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:96)
at org.apache.catalina.authenticator.AuthenticatorBase.invoke(AuthenticatorBase.java:528)
at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:139)
at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:81)
at org.apache.catalina.valves.AbstractAccessLogValve.invoke(AbstractAccessLogValve.java:678)
at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:87)
at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:343)
at org.apache.coyote.http11.Http11Processor.service(Http11Processor.java:798)
at org.apache.coyote.AbstractProcessorLight.process(AbstractProcessorLight.java:66)
at org.apache.coyote.AbstractProtocol$ConnectionHandler.process(AbstractProtocol.java:810)
at org.apache.tomcat.util.net.NioEndpoint$SocketProcessor.doRun(NioEndpoint.java:1498)
at org.apache.tomcat.util.net.SocketProcessorBase.run(SocketProcessorBase.java:49)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at org.apache.tomcat.util.threads.TaskThread$WrappingRunnable.run(TaskThread.java:61)
at java.lang.Thread.run(Thread.java:748)
Caused by: java.net.ConnectException: null
at org.apache.http.nio.pool.RouteSpecificPool.timeout(RouteSpecificPool.java:168)
at org.apache.http.nio.pool.AbstractNIOConnPool.requestTimeout(AbstractNIOConnPool.java:561)
at org.apache.http.nio.pool.AbstractNIOConnPool$InternalSessionRequestCallback.timeout(AbstractNIOConnPool.java:822)
at org.apache.http.impl.nio.reactor.SessionRequestImpl.timeout(SessionRequestImpl.java:183)
at org.apache.http.impl.nio.reactor.DefaultConnectingIOReactor.processTimeouts(DefaultConnectingIOReactor.java:210)
at org.apache.http.impl.nio.reactor.DefaultConnectingIOReactor.processEvents(DefaultConnectingIOReactor.java:155)
at org.apache.http.impl.nio.reactor.AbstractMultiworkerIOReactor.execute(AbstractMultiworkerIOReactor.java:348)
at org.apache.http.impl.nio.conn.PoolingNHttpClientConnectionManager.execute(PoolingNHttpClientConnectionManager.java:192)
at org.apache.http.impl.nio.client.CloseableHttpAsyncClientBase$1.run(CloseableHttpAsyncClientBase.java:64)
Running curl http://elasticsearch-master.elasticsearch.svc:9200/*/_search?ignore_unavailable=true directly returns a large amount of data, which proves connectivity to ES is fine. Admittedly no query parameters were added, since curl makes it awkward to add them, and with parameters it might well have returned an empty result or an error.
I suspected the environment variable was not taking effect, so I kept adjusting the environment variable format, and even the ConfigMap and internal configuration files, hoping to see a log line like:
request to es, url: http://elasticsearch-master.elasticsearch.svc:9200/*/_search?ignore_unavailable=true
Perhaps the "address: elasticsearch-master.elasticsearch.svc" setting was never read. In the end I gave up; maybe the log was simply written that way.
Moving on to logagent's filebeat ConfigMap, I found:
output.elasticsearch:
  hosts: [elasticsearch-master.elasticsearch.svc:30435]
This is not reachable at all, so I changed it to:
output.elasticsearch:
  hosts: [elasticsearch-master.elasticsearch.svc:9200]
Tried again; still not working (though at least filebeat no longer throws connection errors).
I then went through the documentation again and could not find anything wrong, so I moved on to fixing auditing.
I had installed the internal ES, but decided to configure it as if it were external anyway:
kubectl edit deploy audit -n kubecube-system
env:
- name: AUDIT_WEBHOOK_HOST
  value: http://elasticsearch-master.elasticsearch:9200
- name: AUDIT_WEBHOOK_INDEX
  value: audit
- name: AUDIT_WEBHOOK_TYPE
  value: logs
But log search still does not work. It looks like the only option is to expose ES port 9200 and use a tool to check whether the data is not being uploaded or simply not being found by the query.
Overall I have roughly narrowed it down to a few possible problems:
My current guess is the ripple and filebeat configuration, which seems the most suspicious. After creating a new log collection task, I did not see any file changes under /etc/filebeat/inputs.d.
I would also suggest fixing the empty error message so the hint is clearer; for now the only option is to go read the source code.
Warden should only have the basic authority to update and watch resource changes in the pivot cluster. Otherwise, the owner of a warden cluster could operate the pivot cluster, which is a security risk for KubeCube.
Why do we need non-k8s resource authz expansion?
Assume we have a bookinfo service and we want to decide who can access bookinfo, and how.
How do we expand authz to non-k8s resources?
Introduce a new CRD, ExternalResource, for mapping non-k8s resources so that we can use them like regular k8s resources with RBAC:
apiVersion: extension.kubecube.io/v1
kind: ExternalResource
metadata:
  name: bookinfo
spec:
  namespaced: false
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: platform-admin
rules:
  ...
- apiGroups:
  - "extension.kubecube.io/v1"
  resources:
  - externalresources
  resourceNames:
  - bookinfo
  verbs:
  - get
  - list
  - watch
  - create
  - delete
  - deletecollection
  - patch
  - update
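As a hedged sketch of how this mapping could be consumed (the proposal does not spell this out, and the function name below is made up): a gateway in front of the real bookinfo service could ask the Kubernetes RBAC layer whether the caller may "get" the bookinfo ExternalResource by issuing a SubjectAccessReview. Note that RBAC group names are normally written without the version, so the group below is extension.kubecube.io.

package authz

import (
    "context"
    "fmt"

    authorizationv1 "k8s.io/api/authorization/v1"
    metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
    "k8s.io/client-go/kubernetes"
)

// canAccessBookinfo asks the RBAC layer whether `user` may "get" the
// bookinfo ExternalResource defined above. A gateway could call this
// before proxying the request to the real bookinfo service.
func canAccessBookinfo(ctx context.Context, cs kubernetes.Interface, user string) (bool, error) {
    sar := &authorizationv1.SubjectAccessReview{
        Spec: authorizationv1.SubjectAccessReviewSpec{
            User: user,
            ResourceAttributes: &authorizationv1.ResourceAttributes{
                Group:    "extension.kubecube.io",
                Resource: "externalresources",
                Name:     "bookinfo",
                Verb:     "get",
            },
        },
    }
    resp, err := cs.AuthorizationV1().SubjectAccessReviews().Create(ctx, sar, metav1.CreateOptions{})
    if err != nil {
        return false, fmt.Errorf("subject access review failed: %w", err)
    }
    return resp.Status.Allowed, nil
}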
Following the All-in-one installation doc, the following errors occur:
Is your feature request related to a problem? Please describe.
During kubecube deployment there are quite a few steps where the previous step fails but the script does not stop; it keeps executing, which causes failures later on.
For example, after a helm command fails, the shell script should abort instead of moving on to the next step.
https://github.com/kubecube-io/kubecube-installer/blob/main/install_kubecube.sh#L121
Describe the bug
I created a kind cluster without the external IP signed into the certificate.
When KubeCube visits this kind cluster it gets the error: Get "https://192.168.4.124:57300/api?timeout=32s": x509: certificate is valid for 10.96.0.1, 172.18.0.3, not xxxx
Of course this is a problem with my configuration, but the kubecube program should not panic.
Deploying with the Helm installation method from the official site, the latest version that can be deployed is 1.8.2, roughly the April 2023 release, but the version on GitHub is already 1.9.0.
If an older version is already installed, is there an upgrade procedure to 1.9.0?
Some of the documentation on the official site dates from 2021; it is unclear whether it is up to date.
Looking forward to an update.
Also, does the product provide a Helm management module, including Helm repos and management of applications installed via Helm? If some middleware is deployed via Helm, how can it be managed and controlled in the platform?
On the tenant page there is only a create-project function and no delete-project function; a project can be renamed but its project identifier cannot be changed.
Describe the bug
Scenario 1:
Prerequisites:
Three servers form a 3-node cluster, Kubernetes v1.27.6 is already deployed, and metrics-server is already installed via Helm.
A Ceph storage cluster is also deployed on the three nodes via cephadm and runs normally.
Steps:
Install kubecube via Helm.
1. kubecube-monitoring-prometheus-node-exporter-XXX could not start because port 9100 conflicts with Ceph's built-in node monitoring service. Resolved by changing Ceph's default port so kubecube can use 9100.
2. prometheus-kubecube-monitoring-prometheus-0 did not start, and there was no image pull or similar activity. The monitoring-related releases are shown below:
helm list -A
The query shows that some releases failed to install:
kubecube-monitoring kubecube-monitoring 1 2024-02-29 08:48:09.902374875 +0000 UTC failed kubecube-monitoring-15.4.12 0.47.0
kubecube-thanos kubecube-monitoring 1 2024-02-29 08:48:58.754995455 +0000 UTC failed thanos-3.18.0 0.21.1
Scenario 2:
Since kubecube could not be used normally in Scenario 1, the setup was adjusted as follows:
Three servers form a 3-node cluster, Kubernetes v1.27.6 is already deployed, and metrics-server is already installed via Helm.
The cephadm cluster is uninstalled and not started.
Steps:
1. Install kubecube via Helm; kubecube installs all components normally and runs normally.
2. Install the Ceph storage cluster with cephadm, initialize the cluster, and change the port of Ceph's built-in node monitoring service to 9111 to avoid conflicts. kubecube keeps running normally.
3. Use Ceph to initialize the storage disks on each machine and start the OSD service. At this point the kubecube-related pods on the corresponding host go into CrashLoopBackOff/Error; they crash and cannot restart automatically, while other pods on that machine, such as kube-proxy and kube-controller-manager, remain normal.
4. For step 3, rebooting the problematic server node brings the kubecube-related pods back to normal, and the Ceph service and OSD service remain normal.
5. All three nodes behave this way: when Ceph initializes the OSD service, the kubecube pods on the corresponding host crash and cannot restart automatically. After the whole server is rebooted, kubecube recovers and Ceph is normal; after running for a day there are no further anomalies so far.
Internal error occurred: failed calling webhook "vresourcequota.kb.io": failed to call webhook: Post "https://warden.kubecube-system.svc:8443/validate-core-kubernetes-v1-resource-quota?timeout=10s": service "warden" not found
Problem
The problem with integrating the OLM app market is authorization: users in KubeCube need the related RBAC for the new operator CRDs in order to access them.
A possible solution
An OLM OperatorGroup records all of the GVKs provided in a specified namespace. We can use a controller to watch OperatorGroup objects and create a Role and RoleBinding for the user according to the OperatorGroup info:
apiVersion: operators.coreos.com/v1
kind: OperatorGroup
metadata:
  annotations:
    kubectl.kubernetes.io/last-applied-configuration: |
      {"apiVersion":"operators.coreos.com/v1","kind":"OperatorGroup","metadata":{"annotations":{},"name":"og-single","namespace":"default"},"spec":{"targetNamespaces":["default"]}}
    olm.providedAPIs: EtcdBackup.v1beta2.etcd.database.coreos.com,EtcdCluster.v1beta2.etcd.database.coreos.com,EtcdRestore.v1beta2.etcd.database.coreos.com
  creationTimestamp: "2021-09-28T09:27:20Z"
  generation: 1
  name: og-single
  namespace: default
  resourceVersion: "17745074"
  uid: 23d1f838-f3df-4025-ac82-51fa69212606
spec:
  targetNamespaces:
  - default
status:
  lastUpdated: "2021-09-28T09:27:20Z"
  namespaces:
  - default
Meanwhile, the controller should aggregate the OperatorGroup's ClusterRole into platform-admin so that the platform admin can access the operator's new CRDs (a controller sketch follows the example below):
aggregationRule:
  clusterRoleSelectors:
  - matchLabels:
      olm.opgroup.permissions/aggregate-to-c571d720f17289d3-admin: "true"
  - matchLabels:
      olm.opgroup.permissions/aggregate-to-2c1e6f7e17c07035-admin: "true"
  - matchLabels:
      olm.opgroup.permissions/aggregate-to-2fdc3540750c4d2b-admin: "true"
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  creationTimestamp: "2021-09-28T09:27:20Z"
  labels:
    olm.owner: og-single
    olm.owner.kind: OperatorGroup
    olm.owner.namespace: default
    rbac.authorization.k8s.io/aggregate-to-platform-admin: "true"
  name: og-single-admin
  resourceVersion: "20351534"
  uid: 3e1689c1-272e-4d36-b3c4-310ac7dbf884
rules:
- apiGroups:
  - etcd.database.coreos.com
  resources:
  - etcdclusters
  verbs:
  - '*'
- apiGroups:
  - etcd.database.coreos.com
  resources:
  - etcdbackups
  verbs:
  - '*'
- apiGroups:
  - etcd.database.coreos.com
  resources:
  - etcdrestores
  verbs:
  - '*'
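A rough sketch of the controller described above, assuming (as in the example) that OLM names the generated role "<operatorgroup>-admin" and that platform-admin aggregates roles labeled rbac.authorization.k8s.io/aggregate-to-platform-admin. The OperatorGroup type comes from the operator-framework API module. This is illustrative, not KubeCube's actual implementation.

package controllers

import (
    "context"
    "fmt"

    operatorsv1 "github.com/operator-framework/api/pkg/operators/v1"
    rbacv1 "k8s.io/api/rbac/v1"
    "k8s.io/apimachinery/pkg/types"
    ctrl "sigs.k8s.io/controller-runtime"
    "sigs.k8s.io/controller-runtime/pkg/client"
)

// OperatorGroupReconciler watches OperatorGroups and exposes the admin
// ClusterRole that OLM generates for them to platform-admin by adding
// the aggregation label shown in the example above.
type OperatorGroupReconciler struct {
    client.Client
}

func (r *OperatorGroupReconciler) Reconcile(ctx context.Context, req ctrl.Request) (ctrl.Result, error) {
    og := &operatorsv1.OperatorGroup{}
    if err := r.Get(ctx, req.NamespacedName, og); err != nil {
        return ctrl.Result{}, client.IgnoreNotFound(err)
    }

    // OLM creates "<operatorgroup>-admin" holding the aggregated rules
    // for all provided APIs of the group (naming based on the example).
    cr := &rbacv1.ClusterRole{}
    key := types.NamespacedName{Name: fmt.Sprintf("%s-admin", og.Name)}
    if err := r.Get(ctx, key, cr); err != nil {
        return ctrl.Result{}, client.IgnoreNotFound(err)
    }

    if cr.Labels == nil {
        cr.Labels = map[string]string{}
    }
    // Let the platform-admin aggregated ClusterRole pick up these rules.
    cr.Labels["rbac.authorization.k8s.io/aggregate-to-platform-admin"] = "true"
    return ctrl.Result{}, r.Update(ctx, cr)
}

func (r *OperatorGroupReconciler) SetupWithManager(mgr ctrl.Manager) error {
    return ctrl.NewControllerManagedBy(mgr).
        For(&operatorsv1.OperatorGroup{}).
        Complete(r)
}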
The following problems have come up during current usage:
1. The documentation only covers installation; there is nothing about how to remove kubecube.
2. The documentation for installing into an existing Kubernetes cluster should explain how to use the hotplug mechanism up front to skip installing certain components, such as ingress-nginx and monitoring, to avoid unknowingly conflicting with what is already installed in the cluster.
In KubeCube's resource management, a tenant leases resources, and the next step is to distribute the tenant's resources. Logically this should follow the organization's hierarchy and distribute the tenant's resources to the projects inside the tenant.
However, at this point the extra concepts of a "space" and "creating a space to allocate resources" appear, which feels odd. In an enterprise organization, people rarely say "I am in a space"; they say "I am in a project". A lead starts a project, and once the project is approved, company resources are granted. It is rare for a project, once approved, to then create one or more spaces just to receive the company's resources.
So for resource management, why not allocate tenant resources directly by project? Wouldn't that match enterprise organizational logic more clearly? What is the best-practice case for spaces and for creating spaces?
Is your feature request related to a problem? Please describe.
The latest HNC 1.1 release already supports hncresourcequota; could hncresourcequota be supported later on for quota control?
Does the all-in-one installation automatically create a default Kubernetes cluster?
Using log collection:
apiVersion: netease.com/v1
kind: Logconfig
metadata:
  creationTimestamp: '2022-06-13T05:59:27Z'
  generation: 1
  labels:
    app: dep-staff
  managedFields:
  - apiVersion: netease.com/v1
    fieldsType: FieldsV1
    fieldsV1:
      'f:metadata':
        'f:labels':
          .: {}
          'f:app': {}
      'f:spec':
        .: {}
        'f:inputs': {}
    manager: Mozilla
    operation: Update
    time: '2022-06-13T05:59:27Z'
  name: stufflog
  namespace: cqdx
  resourceVersion: '50486'
  uid: 4c2ba645-762e-4354-bae4-d9cddd1e18b4
spec:
  inputs:
  - enable: true
    type:
      name: dockerStdout
Check the hotplug components:
[root@zpfrltgup4tujpi1-0001 network-scripts]# kubectl get hotplug
NAME PHASE AGE
common fail 2d2h
pivot-cluster fail 2d2h
Inspect the components:
spec:
  component:
  - name: audit
    status: disabled
  - name: logseer
    namespace: logseer
    pkgName: logseer-v1.0.0.tgz
    status: disabled
  - env: "clustername: \"{{.cluster}}\"\n"
    name: logagent
    namespace: logagent
    pkgName: logagent-v1.0.0.tgz
    status: disabled
  - name: elasticsearch
    namespace: elasticsearch
    pkgName: elasticsearch-7.8.1.tgz
    status: enabled
  - env: "grafana:\n enabled: false\nprometheus:\n prometheusSpec:\n externalLabels:\n cluster: \"{{.cluster}}\"\n remoteWrite:\n - url: http://10.10.10.44:31291/api/v1/receive\n"
    name: kubecube-monitoring
    namespace: kubecube-monitoring
    pkgName: kubecube-monitoring-15.4.12.tgz
    status: enabled
  - name: kubecube-thanos
    namespace: kubecube-monitoring
    pkgName: thanos-3.18.0.tgz
    status: disabled
status:
  phase: fail
  results:
  - message: 'audit is disabled'
    name: audit
    result: success
    status: disabled
  - message: uninstalled
    name: logseer
    result: success
    status: disabled
  - message: 'release is running'
    name: logagent
    result: success
    status: enabled
  - message: 'release is running'
    name: elasticsearch
    result: success
    status: enabled
  - message: 'helm install fail, cannot re-use a name that is still in use'
    name: kubecube-monitoring
    result: fail
    status: enabled
  - message: 'release is running'
    name: kubecube-thanos
    result: success
    status: enabled
I looked at the official docs and they do not explain how to enable log search.
Can a tenant that has been created be deleted? There is no delete-tenant button in tenant management.
Is your feature request related to a problem? Please describe.
If users already have their own authentication system and do not want to use KubeCube's authentication, they will want KubeCube to integrate with their own system. A generic third-party authentication interface needs to be defined so that users can plug in their own authentication system.
Describe the solution you'd like
Define a generic third-party authentication method:
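One possible shape for such a generic interface, purely as a sketch; the type and method names here are illustrative, not KubeCube's actual API:

package auth

import (
    "context"
    "net/http"
)

// UserInfo is what KubeCube would need to know about an authenticated
// caller. Field names here are illustrative.
type UserInfo struct {
    Username string
    Groups   []string
}

// Authenticator is one possible shape for a generic third-party
// authentication hook: given the incoming request, return the user it
// represents, or ok=false if the request is anonymous or invalid.
type Authenticator interface {
    Authenticate(ctx context.Context, req *http.Request) (user *UserInfo, ok bool, err error)
}

// Chain tries each configured authenticator in order, so the built-in
// token auth and an external system can coexist.
type Chain []Authenticator

func (c Chain) Authenticate(ctx context.Context, req *http.Request) (*UserInfo, bool, error) {
    for _, a := range c {
        user, ok, err := a.Authenticate(ctx, req)
        if err != nil {
            return nil, false, err
        }
        if ok {
            return user, true, nil
        }
    }
    return nil, false, nil
}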
Why should we remove the dependency on modifying the k8s apiserver?
At present, we have to modify the args of the k8s apiserver so that the auth token can be verified by our auth webhook, but this works against easy deployment.
Assume there are many k8s master nodes: we would have to modify each master node one by one. This is unbearable.
A proposal that may make sense:
Integrate an auth proxy into warden.
Warden would parse the token into a user and use impersonation when forwarding requests to the k8s apiserver.
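A minimal sketch of the forwarding step, assuming the token has already been resolved to a user name and groups; client-go's rest config adds the Impersonate-User / Impersonate-Group headers. This is not warden's actual code, and the function name is made up:

package proxy

import (
    "net/http/httputil"
    "net/url"

    "k8s.io/client-go/rest"
)

// NewImpersonatingProxy builds a reverse proxy to the k8s apiserver that
// forwards every request as the given user, using warden's own (privileged)
// credentials from cfg.
func NewImpersonatingProxy(cfg *rest.Config, apiserver *url.URL, username string, groups []string) (*httputil.ReverseProxy, error) {
    // Impersonate the resolved user so RBAC is evaluated against them,
    // not against warden's service account.
    cfg = rest.CopyConfig(cfg)
    cfg.Impersonate = rest.ImpersonationConfig{
        UserName: username,
        Groups:   groups,
    }

    // rest.TransportFor wires up TLS, the bearer token, and the
    // impersonation headers for us.
    rt, err := rest.TransportFor(cfg)
    if err != nil {
        return nil, err
    }

    p := httputil.NewSingleHostReverseProxy(apiserver)
    p.Transport = rt
    return p, nil
}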
Is your feature request related to a problem? Please describe.
Is onboarding a Kubernetes cluster with edge nodes supported?
Why do CubeOptions need validation?
Before running KubeCube, we should validate the options and exit quickly if validation fails.
Start = func(c *cli.Context) error {
    if errs := flags.CubeOpts.Validate(); len(errs) > 0 {
        return utilerrors.NewAggregate(errs)
    }
    run(flags.CubeOpts, signals.SetupSignalHandler())
    return nil
}
How to do it?
We already have Validate methods, but each validate func actually does nothing; we need to complete them.
// Validate verify options for every component
// todo(weilaaa): complete it
func (s *CubeOptions) Validate() []error {
    var errs []error
    errs = append(errs, s.APIServerOpts.Validate()...)
    errs = append(errs, s.ClientMgrOpts.Validate()...)
    errs = append(errs, s.CtrlMgrOpts.Validate()...)
    return errs
}

func (c *Config) Validate() []error {
    return nil
}
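For illustration, completing one of these stubs could look like the following; the Config fields (BindPort, TlsCert) are made up for the example and are not the actual CubeOptions fields:

package options

import (
    "fmt"
    "os"
)

// Config stands in for one component's options; the fields are
// illustrative, not KubeCube's actual option fields.
type Config struct {
    BindPort int
    TlsCert  string
}

// Validate checks the component's options and returns every problem found,
// so the caller can aggregate them and exit early.
func (c *Config) Validate() []error {
    var errs []error
    if c.BindPort < 1 || c.BindPort > 65535 {
        errs = append(errs, fmt.Errorf("invalid bind port %d: must be in 1-65535", c.BindPort))
    }
    if c.TlsCert != "" {
        if _, err := os.Stat(c.TlsCert); err != nil {
            errs = append(errs, fmt.Errorf("tls cert %q not readable: %v", c.TlsCert, err))
        }
    }
    return errs
}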
### Most of the time in a production environment
Describe the bug
Deploying with the All-in-one doc, it stops at the last step, deploy kubecube.
Looking into it reveals the following error.
Deployed following v1.1.x:
https://www.kubecube.io/docs/installation-guide/all-in-one/
Describe the bug
A failed connection to a member cluster causes kubecube startup to fail.
2021-08-09T10:34:48.179+0800 error kubernetes/kubernetes.go:81 problem new k8s client: Get "https://10.173.32.130:6443/api?timeout=32s": dial tcp 10.173.32.130:6443: connect: no route to host
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x40 pc=0x160106e]
goroutine 1034 [running]:
github.com/kubecube-io/kubecube/pkg/clients/kubernetes.NewClientFor.func1(0xc000df1280, 0x1ea3930, 0xc001c9e1c0)
/workspace/pkg/clients/kubernetes/kubernetes.go:102 +0x2e
created by github.com/kubecube-io/kubecube/pkg/clients/kubernetes.NewClientFor
/workspace/pkg/clients/kubernetes/kubernetes.go:101 +0x252
Expected behavior
It should set the status of the cluster to abnormal until the member cluster reconnects to kubecube, and refresh the InternalCluster at that point.
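For illustration only: the function name and log message below follow the stack trace above, but the signature and body are a guess at the kind of guard that would surface the connection error instead of dereferencing a nil client:

package kubernetes

import (
    "fmt"

    kubeclient "k8s.io/client-go/kubernetes"
    "k8s.io/client-go/rest"
)

// NewClientFor returns an error instead of a half-initialized client when
// the member cluster is unreachable, so the caller can mark the cluster
// abnormal and retry later rather than panic on a nil pointer.
func NewClientFor(cfg *rest.Config) (kubeclient.Interface, error) {
    c, err := kubeclient.NewForConfig(cfg)
    if err != nil {
        return nil, fmt.Errorf("problem new k8s client: %w", err)
    }
    // Probe the apiserver once so connection problems show up here,
    // not later in a background goroutine.
    if _, err := c.Discovery().ServerVersion(); err != nil {
        return nil, fmt.Errorf("member cluster unreachable: %w", err)
    }
    return c, nil
}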
apiVersion: v1
kind: Pod
metadata:
  annotations:
    kubeadm.kubernetes.io/kube-apiserver.advertise-address.endpoint: 10.206.0.10:6443
  creationTimestamp: null
  labels:
    component: kube-apiserver
    tier: control-plane
  name: kube-apiserver
  namespace: kube-system
spec:
  containers:
  - command:
    - kube-apiserver
    - --audit-log-format=json
    - --audit-log-maxage=10
    - --audit-log-maxbackup=10
    - --audit-log-maxsize=100
    - --audit-log-path=/var/log/audit
    - --audit-policy-file=/etc/cube/audit/audit-policy.yaml
    - --audit-webhook-config-file=/etc/cube/audit/audit-webhook.config
    - --authentication-token-webhook-config-file=/etc/cube/warden/webhook.config
    - --advertise-address=10.206.0.10
    - --allow-privileged=true
    - --authorization-mode=Node,RBAC
    - --client-ca-file=/etc/kubernetes/pki/ca.crt
    - --enable-admission-plugins=NodeRestriction
    - --enable-bootstrap-token-auth=true
    - --etcd-cafile=/etc/kubernetes/pki/etcd/ca.crt
    - --etcd-certfile=/etc/kubernetes/pki/apiserver-etcd-client.crt
    - --etcd-keyfile=/etc/kubernetes/pki/apiserver-etcd-client.key
    - --etcd-servers=https://127.0.0.1:2379
    - --kubelet-client-certificate=/etc/kubernetes/pki/apiserver-kubelet-client.crt
    - --kubelet-client-key=/etc/kubernetes/pki/apiserver-kubelet-client.key
    - --kubelet-preferred-address-types=InternalIP,ExternalIP,Hostname
    - --proxy-client-cert-file=/etc/kubernetes/pki/front-proxy-client.crt
    - --proxy-client-key-file=/etc/kubernetes/pki/front-proxy-client.key
    - --requestheader-allowed-names=front-proxy-client
    - --requestheader-client-ca-file=/etc/kubernetes/pki/front-proxy-ca.crt
    - --requestheader-extra-headers-prefix=X-Remote-Extra-
    - --requestheader-group-headers=X-Remote-Group
    - --requestheader-username-headers=X-Remote-User
    - --secure-port=6443
    - --service-account-issuer=https://kubernetes.default.svc.cluster.local
    - --service-account-key-file=/etc/kubernetes/pki/sa.pub
    - --service-account-signing-key-file=/etc/kubernetes/pki/sa.key
    - --service-cluster-ip-range=10.16.0.0/12
    - --tls-cert-file=/etc/kubernetes/pki/apiserver.crt
    - --tls-private-key-file=/etc/kubernetes/pki/apiserver.key
    image: registry.aliyuncs.com/google_containers/kube-apiserver:v1.22.1
    imagePullPolicy: IfNotPresent
    livenessProbe:
      failureThreshold: 8
      httpGet:
        host: 10.206.0.10
        path: /livez
        port: 6443
        scheme: HTTPS
      initialDelaySeconds: 10
      periodSeconds: 10
      timeoutSeconds: 15
    name: kube-apiserver
    readinessProbe:
      failureThreshold: 3
      httpGet:
        host: 10.206.0.10
        path: /readyz
        port: 6443
        scheme: HTTPS
      periodSeconds: 1
      timeoutSeconds: 15
    resources:
      requests:
        cpu: 250m
    startupProbe:
      failureThreshold: 24
      httpGet:
        host: 10.206.0.10
        path: /livez
        port: 6443
        scheme: HTTPS
      initialDelaySeconds: 10
      periodSeconds: 10
      timeoutSeconds: 15
    volumeMounts:
    - mountPath: /var/log/audit
      name: audit-log
    - mountPath: /etc/cube
      name: cube
      readOnly: true
    - mountPath: /etc/ssl/certs
      name: ca-certs
      readOnly: true
    - mountPath: /etc/ca-certificates
      name: etc-ca-certificates
      readOnly: true
    - mountPath: /etc/kubernetes/pki
      name: k8s-certs
      readOnly: true
    - mountPath: /usr/local/share/ca-certificates
      name: usr-local-share-ca-certificates
      readOnly: true
    - mountPath: /usr/share/ca-certificates
      name: usr-share-ca-certificates
      readOnly: true
  hostNetwork: true
  priorityClassName: system-node-critical
  securityContext:
    seccompProfile:
      type: RuntimeDefault
  volumes:
  - hostPath:
      path: /var/log/audit
      type: DirectoryOrCreate
    name: audit-log
  - hostPath:
      path: /etc/cube
      type: DirectoryOrCreate
    name: cube
  - hostPath:
      path: /etc/ssl/certs
      type: DirectoryOrCreate
    name: ca-certs
  - hostPath:
      path: /etc/ca-certificates
      type: DirectoryOrCreate
    name: etc-ca-certificates
  - hostPath:
      path: /etc/kubernetes/pki
      type: DirectoryOrCreate
    name: k8s-certs
  - hostPath:
      path: /usr/local/share/ca-certificates
      type: DirectoryOrCreate
    name: usr-local-share-ca-certificates
  - hostPath:
      path: /usr/share/ca-certificates
      type: DirectoryOrCreate
    name: usr-share-ca-certificates
status: {}
Configuring directly through the UI results in an error:
the ingressClassName cannot be found.
Since the company machines are cloud servers, I used two with public internet access for the experiment:
123.123.123.111 / 10.10.10.31 (a virtual NIC was added to bind the public IP to the host)
123.123.123.222 / 10.10.10.32 (a virtual NIC was added to bind the public IP to the host)
10.10.10.31 was installed directly in all-in-one mode.
10.10.10.32 was joined with node-join-master.
The final output of kubectl get node is:
10.10.10.31 master
123.123.123.222 node (presumably filling in the node IP 10.10.10.32, i.e. the internal IP, would also have worked)
The guess in the parentheses above has since been tested:
even when running node-join-master with KUBERNETES_BIND_ADDRESS="10.10.10.32", the node is still shown with the public IP 123.123.123.222.
My NIC information is:
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
valid_lft forever preferred_lft forever
inet6 ::1/128 scope host
valid_lft forever preferred_lft forever
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
link/ether fa:16:3e:a0:9e:2b brd ff:ff:ff:ff:ff:ff
inet 10.10.10.32/24 brd 10.10.10.255 scope global dynamic eth0
valid_lft 25917702sec preferred_lft 25917702sec
inet 123.123.123.222/24 brd 123.123.123.255 scope global eth0:1
valid_lft forever preferred_lft forever
inet6 fe80::f816:3eff:fea0:9e2b/64 scope link
valid_lft forever preferred_lft forever
eth0:1 is the added virtual NIC bound to my public IP.
Then I added deployment dep-ng (nginx, port 80),
added service svc-ng for dep-ng (80 -> 80),
added ingress ing-ng for svc-ng port 80, with domain a.cn and forwarding rule "/",
and resolved the domain a.cn to 123.123.123.111.
It could not be accessed.
The ingress logs say the ingressClassName cannot be found.
I modified the YAML of ingress ing-ng and added ingressClassName: nginx.
Checking the logs again, there are no more errors, but the domain still cannot be accessed.
Please provide installation docs for 1.7; why do they stop at 1.4?
Problem 1: running the install.sh script while adding a worker node to the cluster reports an error
2021-08-18 15:45:41 INFO get docker binary from local
/bin/mv: cannot stat ‘/etc/kubecube/packages/docker-ce/linux/static/stable/x86_64/docker-19.03.8.tgz’: No such file or directory
2021-08-18 15:45:41 ERROR install kubernetes failed
The actual packages directory is in packages-master:
[root@test-ec2 x86_64]# pwd
/etc/kubecube/packages-master/docker-ce/linux/static/stable/x86_64
[root@gtlm-ec2 x86_64]# ls
docker-19.03.8.tgz
Problem 2: when adding a new node, the link given for the steps returns a 404.
Link: https://www.kubecube.io/docs/部署指南/添加节点/#向集群添加工作节点
Problem 3:
When creating a new cluster, the process does not match the documentation at https://www.kubecube.io/docs/installation-guide/add-member-k8s at all!
A newcomer who hits this can easily go mad!
I'd like to deploy and debug kubecube locally. I want an easier way to set up a deployment environment and debug; something like an install script and a Makefile is needed. The required manifests are listed below:
I installed kubecube into an existing Kubernetes cluster. How can I view the resources in kube-system, including resource usage, container logs, pod status, and so on?
Can kubecube only manage resources in spaces created through kubecube?
When third-party authentication is integrated via the generic auth mechanism, the user is not created automatically in kubecube, so they cannot be added to a project afterwards. Should the user be created automatically, or does the third party also need to call kubecube's API to create the user?
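A very rough sketch of the auto-create option, assuming KubeCube exposes a User custom resource under user.kubecube.io/v1; the group/version and the spec fields here are assumptions to adjust, not confirmed API:

package auth

import (
    "context"

    apierrors "k8s.io/apimachinery/pkg/api/errors"
    metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
    "k8s.io/apimachinery/pkg/apis/meta/v1/unstructured"
    "k8s.io/apimachinery/pkg/runtime/schema"
    "k8s.io/client-go/dynamic"
)

// userGVR is assumed here; adjust to whatever group/version KubeCube
// actually uses for its User resource.
var userGVR = schema.GroupVersionResource{
    Group:    "user.kubecube.io",
    Version:  "v1",
    Resource: "users",
}

// EnsureUser creates a user record the first time a third-party identity
// shows up, so the account can later be added to projects.
func EnsureUser(ctx context.Context, dc dynamic.Interface, name string) error {
    u := &unstructured.Unstructured{Object: map[string]interface{}{
        "apiVersion": "user.kubecube.io/v1",
        "kind":       "User",
        "metadata":   map[string]interface{}{"name": name},
        // "loginType" is an assumed field used only for illustration.
        "spec": map[string]interface{}{"loginType": "external"},
    }}
    _, err := dc.Resource(userGVR).Create(ctx, u, metav1.CreateOptions{})
    if apierrors.IsAlreadyExists(err) {
        return nil // user already registered, nothing to do
    }
    return err
}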