Comments (17)
from cloudpods.
@chenjacken Did you take the compute node offline by following the procedure in this document?
https://www.cloudpods.org/zh/docs/setup/removehost/
from cloudpods.
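Yes, I followed that procedure. Afterwards I reinstalled the operating system and redeployed the node.
During redeployment the host service on that node failed. On investigation, the cause was that the node's old storage record still existed:
climc storage-list
It looks like a storage with the same name was already registered, so the new one could not be created. After deleting it with climc storage-delete {ID}, I deleted the host service pod.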
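The cleanup described in this comment (delete the stale storage record, then delete the host pod so that it is recreated) could be wrapped roughly as follows. This is only a sketch: the function name `cleanup_stale_storage` is hypothetical, the storage ID has to be looked up by hand with `climc storage-list`, and the pod name is whatever `kubectl get pods -n onecloud` shows for the host service on that node.

```shell
# Hypothetical helper, not part of Cloudpods.
# Deletes a stale storage record and then the host pod so it restarts.
# Arguments:
#   $1  storage ID taken from `climc storage-list`
#   $2  name of the host pod in the onecloud namespace
cleanup_stale_storage() {
    storage_id="$1"
    host_pod="$2"
    if [ -z "$storage_id" ] || [ -z "$host_pod" ]; then
        echo "usage: cleanup_stale_storage STORAGE_ID HOST_POD" >&2
        return 2
    fi
    # Stop here if the storage record cannot be deleted.
    climc storage-delete "$storage_id" || return 1
    # Deleting the pod lets Kubernetes recreate it.
    kubectl delete pod -n onecloud "$host_pod"
}
```

Both arguments are required on purpose, so that an empty variable cannot turn into an unintended delete.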
from cloudpods.
I will go through the procedure again and see whether the same thing happens.
@zexi Thanks!
from cloudpods.
`[root@master01 ~]# kubectl exec -ti -n onecloud $(kubectl get pods -n onecloud | grep climc | awk '{print $1}') sh
cd /opt/yunion/scripts/tools/
/opt/yunion/scripts/tools # clean_host.sh de2751d7-3367-4a16-80ed-75614fef6256
sh: clean_host.sh: not found
/opt/yunion/scripts/tools # ./clean_host.sh de2751d7-3367-4a16-80ed-75614fef6256
./clean_host.sh: line 17: climc: command not found
./clean_host.sh: line 19: climc: command not found
Error: Cannot find host de2751d7-3367-4a16-80ed-75614fef6256`
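The `climc: command not found` lines in this session suggest that `climc` is not on `PATH` inside the `sh` shell, so `clean_host.sh` cannot call it. A minimal sketch of a check that could be run before the script is shown below. The helper name `ensure_on_path` is hypothetical, and the directory passed to it is an assumption that has to be verified inside the pod (for example with `find / -name climc`).

```shell
# Hypothetical helper: make sure a command is reachable before running a
# script that depends on it.
# Arguments:
#   $1  command name, e.g. climc
#   $2  directory assumed to contain it (verify this inside the pod)
ensure_on_path() {
    cmd="$1"
    dir="$2"
    # Already reachable: nothing to do.
    if command -v "$cmd" >/dev/null 2>&1; then
        return 0
    fi
    # Found in the candidate directory: prepend it to PATH.
    if [ -x "$dir/$cmd" ]; then
        PATH="$dir:$PATH"
        export PATH
        return 0
    fi
    echo "cannot find $cmd on PATH or in $dir" >&2
    return 1
}
```

For example, `ensure_on_path climc /opt/yunion/bin && ./clean_host.sh <host_id>`, where `/opt/yunion/bin` is an assumed location and not something this thread confirms.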
from cloudpods.
@chenjacken Thanks for the feedback. We will verify this on our side as well.
from cloudpods.
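I took the compute node offline by following this document. I did not reinstall the operating system on the server; I rebooted it and then added it as a compute node again.
https://www.cloudpods.org/zh/docs/setup/removehost/
The host pod reports an error.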
[root@master01 ocboot]# kubectl logs -n onecloud default-host-lpg6j -c host --tail 100 -f
[info 231007 06:09:44 procutils.WaitZombieLoop(zombie_others.go:36)] My pid is not 1 and no need to wait zombies
[info 231007 06:09:44 options.ParseOptions(options.go:318)] Use configuration file: /etc/yunion/host.conf
[warning 231007 06:09:44 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1214)] Cannot find argument enable-qmp-monitor
[warning 231007 06:09:44 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1214)] Cannot find argument enable-health-checker
[warning 231007 06:09:44 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1214)] Cannot find argument disk-is-ssd
[warning 231007 06:09:44 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1214)] Cannot find argument enable-rbac
[warning 231007 06:09:44 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1214)] Cannot find argument health-driver
[warning 231007 06:09:44 structarg.(*ArgumentParser).parseJSONKeyValue(structarg.go:1214)] Cannot find argument start-host-ignore-sys-error
[info 231007 06:09:44 options.ParseOptions(options.go:340)] Set log level to "info"
[info 2023-10-07 06:09:44 options.ParseOptions(options.go:318)] Use configuration file: /etc/yunion/common/common.conf
[info 2023-10-07 06:09:44 options.ParseOptions(options.go:340)] Set log level to "info"
[info 2023-10-07 06:09:44 hostman.(*SHostService).InitService(host_services.go:63)] exec socket path: /var/run/onecloud/exec.sock
[info 2023-10-07 06:09:44 app.InitApp(app.go:32)] RequestWorkerCount: 8
[info 2023-10-07 06:09:44 appsrv.NewApplication(appsrv.go:118)] App hostId: 5BWX32sB5eKN5WDLxFgwC28vpxk= (host,node1,172.16.1.8)
2023/10/07 06:09:44 Allow hosts []
[info 2023-10-07 06:09:44 appsrv.(*Application).SetDefaultTimeout(appsrv.go:134)] adjust application default timeout to 60.000000 seconds
[info 2023-10-07 06:09:44 hostinfo.DetectCpuInfo(hostinfohelper.go:77)] cpuinfo freq 2197
[info 2023-10-07 06:09:44 hostinfo.NewHostInfo(hostinfo.go:2318)] CPU Model Intel(R) Xeon(R) CPU E5-2680 v4 @ 2.40GHz Microcode 0xb00001e
[info 2023-10-07 06:09:44 hostinfo.NewHostInfo(hostinfo.go:2338)] Get kubelet container image Fs: /opt/docker, eviction config: {"evictionHard":{"imagefs.available":{"Signal":"imagefs.available","Operator":"LessThan","Value":{"Quantity":null,"Percentage":0.05}},"memory.available":{"Signal":"memory.available","Operator":"LessThan","Value":{"Quantity":"100Mi","Percentage":0}},"nodefs.available":{"Signal":"nodefs.available","Operator":"LessThan","Value":{"Quantity":null,"Percentage":0.05}},"nodefs.inodesFree":{"Signal":"nodefs.inodesFree","Operator":"LessThan","Value":{"Quantity":null,"Percentage":0.05}}}}
[error 2023-10-07 06:09:47 fileutils2.GetAllBlkdevsIoSchedulers(fileutils.go:170)] no block device avaiable
[info 2023-10-07 06:09:47 hostinfo.(*SHostInfo).prepareEnv(hostinfo.go:403)] I/O Scheduler switch to none
[info 2023-10-07 06:09:47 hostinfo.(*SHostInfo).getKubeReservedMemMb(hostinfo.go:1512)] Kubelet memory threshold subtracted: 100MB
[info 2023-10-07 06:09:47 hostinfo.(*SHostInfo).Init(hostinfo.go:195)] Start detectHostInfo
[info 2023-10-07 06:09:47 hostinfo.(*SHostInfo).detectKVMMaxCpus(hostinfo.go:864)] KVM API VERSION 12
[info 2023-10-07 06:09:47 hostinfo.(*SHostInfo).detectKVMMaxCpus(hostinfo.go:869)] KVM CAP MAX VCPUS: 288
[info 2023-10-07 06:09:47 hostinfo.(*SHostInfo).detectKVMMaxCpus(hostinfo.go:877)] KVM CAP NR VCPUS: 240
[info 2023-10-07 06:09:47 sysutils.detectNestSupport(kvm.go:146)] Host is support kvm nest ...
[info 2023-10-07 06:09:47 sysutils.detectNestSupport(kvm.go:151)] Host kvm nest is enabled ...
[info 2023-10-07 06:09:47 hostinfo.(*SHostInfo).detectOsDist(hostinfo.go:757)] DetectOsDist CentOS Linux 7.9.2009
[info 2023-10-07 06:09:47 hostinfo.(*SHostInfo).detectQemuVersion(hostinfo.go:831)] Detect qemu version is 4.2.0
[info 2023-10-07 06:09:47 hostinfo.(*SHostInfo).detectOvsVersion(hostinfo.go:972)] Detect OVS version is 2.12.4
[info 2023-10-07 06:09:47 hostinfo.(*SHostInfo).detectOvsKOVersion(hostinfo.go:989)] kernel module openvswitch vermagic: 5.4.130-1.yn20221208.el7.x86_64 SMP mod_unload modversions
[error 2023-10-07 06:09:47 hostinfo.(*SHostInfo).Init(hostinfo.go:202)] Prepare host bridge "openvswitch" error: exit status 1
[fatal 2023-10-07 06:09:47 hostman.(*SHostService).RunService(host_services.go:80)] Host instance init error: Prepare host bridge "openvswitch" error: exit status 1
from cloudpods.
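After taking the compute node offline according to the documented steps, do I also need to uninstall software such as openvswitch? And do I need to delete any files, for example those under /opt?
The goal is to add the compute node again afterwards without reinstalling the operating system.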
from cloudpods.
https://www.cloudpods.org/zh/docs/setup/uninstall/
https://www.cloudpods.org/zh/docs/setup/removehost/
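On the compute node, should these two procedures be combined? That is, should I also remove the software installed on the compute node?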
from cloudpods.
@chenjacken You should not need to uninstall any software; this is just the procedure for taking a compute node offline.
[error 2023-10-07 06:09:47 hostinfo.(*SHostInfo).Init(hostinfo.go:202)] Prepare host bridge "openvswitch" error: exit status 1
[fatal 2023-10-07 06:09:47 hostman.(*SHostService).RunService(host_services.go:80)] Host instance init error: Prepare host bridge "openvswitch" error: exit status 1
This error means that creating the OVS bridge failed. Check the status of the following two services:
systemctl status yunion-executor openvswitch.service
If both services are running, check the yunion-executor log:
journalctl -u yunion-executor --no-pager
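The two checks above can be combined into one pass. The following is a minimal sketch; the function name `check_services` is hypothetical, and it relies only on `systemctl is-active`, which prints the unit state and returns non-zero when the unit is not active.

```shell
# Hypothetical helper: report the state of each given systemd unit and
# return non-zero if any of them is not active.
check_services() {
    failed=0
    for svc in "$@"; do
        # `|| true` keeps the loop going when a unit is inactive or failed.
        state=$(systemctl is-active "$svc" 2>/dev/null || true)
        if [ "$state" = "active" ]; then
            echo "$svc: active"
        else
            echo "$svc: ${state:-unknown}"
            failed=1
        fi
    done
    return "$failed"
}
```

Usage on the compute node would be along the lines of `check_services yunion-executor openvswitch`, followed by `journalctl -u yunion-executor --no-pager` only if both units report active.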
from cloudpods.
yunion-executor is running.
openvswitch is not running.
I tried to start it, but it failed:
-- The result is assert.
10月 07 14:27:23 node1 polkitd[1595]: Unregistered Authentication Agent for unix-process:37797:132166 (system bus name :1.369, object path /org/freedesktop/PolicyKit1/AuthenticationAgent, locale zh_CN.UTF-8) (disconnected from
10月 07 14:27:25 node1 kubelet[1588]: E1007 14:27:25.501672 1588 pod_workers.go:190] Error syncing pod 9fc9ecb6-e33c-42c8-a2f5-2f3171a5f165 ("default-host-hhpj5_onecloud(9fc9ecb6-e33c-42c8-a2f5-2f3171a5f165)"), skipping: fa
10月 07 14:27:28 node1 kubelet[1588]: E1007 14:27:28.173240 1588 summary_sys_containers.go:47] Failed to get system container stats for "/system.slice/docker.service": failed to get cgroup stats for "/system.slice/docker.se
10月 07 14:27:38 node1 kubelet[1588]: E1007 14:27:38.193096 1588 summary_sys_containers.go:47] Failed to get system container stats for "/system.slice/docker.service": failed to get cgroup stats for "/system.slice/docker.se
10月 07 14:27:40 node1 ovs-vsctl[37981]: ovs|00001|vsctl|INFO|Called as ovs-vsctl -- --may-exist add-br brtap
10月 07 14:27:40 node1 ovs-vsctl[37981]: ovs|00002|db_ctl_base|ERR|unix:/var/run/openvswitch/db.sock: database connection failed (No such file or directory)
10月 07 14:27:40 node1 kubelet[1588]: E1007 14:27:40.501414 1588 pod_workers.go:190] Error syncing pod 9fc9ecb6-e33c-42c8-a2f5-2f3171a5f165 ("default-host-hhpj5_onecloud(9fc9ecb6-e33c-42c8-a2f5-2f3171a5f165)"), skipping: fa
10月 07 14:27:45 node1 ovs-vsctl[38118]: ovs|00001|db_ctl_base|ERR|unix:/var/run/openvswitch/db.sock: database connection failed (No such file or directory)
10月 07 14:27:48 node1 kubelet[1588]: E1007 14:27:48.212677 1588 summary_sys_containers.go:47] Failed to get system container stats for "/system.slice/docker.service": failed to get cgroup stats for "/system.slice/docker.se
10月 07 14:27:52 node1 kubelet[1588]: E1007 14:27:52.501290 1588 pod_workers.go:190] Error syncing pod 9fc9ecb6-e33c-42c8-a2f5-2f3171a5f165 ("default-host-hhpj5_onecloud(9fc9ecb6-e33c-42c8-a2f5-2f3171a5f165)"), skipping: fa
10月 07 14:27:58 node1 kubelet[1588]: E1007 14:27:58.231465 1588 summary_sys_containers.go:47] Failed to get system container stats for "/system.slice/docker.service": failed to get cgroup stats for "/system.slice/docker.se
10月 07 14:28:03 node1 kubelet[1588]: E1007 14:28:03.501544 1588 pod_workers.go:190] Error syncing pod 9fc9ecb6-e33c-42c8-a2f5-2f3171a5f165 ("default-host-hhpj5_onecloud(9fc9ecb6-e33c-42c8-a2f5-2f3171a5f165)"), skipping: fa
10月 07 14:28:05 node1 ovs-vsctl[38389]: ovs|00001|db_ctl_base|ERR|unix:/var/run/openvswitch/db.sock: database connection failed (No such file or directory)
10月 07 14:28:07 node1 ovs-vsctl[38400]: ovs|00001|vsctl|INFO|Called as ovs-vsctl -- --may-exist add-br brtap
10月 07 14:28:07 node1 ovs-vsctl[38400]: ovs|00002|db_ctl_base|ERR|unix:/var/run/openvswitch/db.sock: database connection failed (No such file or directory)
10月 07 14:28:08 node1 kubelet[1588]: E1007 14:28:08.250624 1588 summary_sys_containers.go:47] Failed to get system container stats for "/system.slice/docker.service": failed to get cgroup stats for "/system.slice/docker.se
10月 07 14:28:16 node1 kubelet[1588]: E1007 14:28:16.501684 1588 pod_workers.go:190] Error syncing pod 9fc9ecb6-e33c-42c8-a2f5-2f3171a5f165 ("default-host-hhpj5_onecloud(9fc9ecb6-e33c-42c8-a2f5-2f3171a5f165)"), skipping: fa
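Most of the journal output above is unrelated kubelet noise. The relevant lines are the `ovs-vsctl` errors reporting that `/var/run/openvswitch/db.sock` does not exist, which is consistent with openvswitch not running. A small filter such as the one below, with the hypothetical name `filter_ovs_errors`, can isolate them. The pattern is an assumption based only on the line format visible in this log.

```shell
# Hypothetical filter: read journal lines on stdin and keep only the
# ovs-vsctl entries logged at ERR level.
filter_ovs_errors() {
    grep -E 'ovs-vsctl\[[0-9]+\]: .*[|]ERR[|]'
}
```

It can be fed from the journal, for example `journalctl --no-pager | filter_ovs_errors`. It returns non-zero when no matching line is found.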
from cloudpods.
This step takes the network down, and I can no longer SSH to the server.
from cloudpods.
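Before running step 3, you can reboot the compute node first. Losing the network means that OVS is still managing the physical NIC. After the reboot OVS will not start, and you can delete OVS at that point.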
from cloudpods.
@chenjacken Is the network loss related to the problem you reported in #18221?
Is there still a problem now?
from cloudpods.
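The problem is solved. I uninstalled the node by following the document https://www.cloudpods.org/zh/docs/setup/removehost/
and also deleted the node's storage on the web page (the documented procedure does not seem to delete the node's storage information). After that I redeployed the node and brought it back online, which resolved the problem.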
from cloudpods.
OK, I will close this issue for now. If you run into other problems, just open a new issue.
from cloudpods.
Understood. The document did not mention rebooting, so I continued with the steps and lost the network.
from cloudpods.