1.16.1 join err (kubeedge, CLOSED)

thinkeng commented on May 25, 2024
1.16.1 join err

from kubeedge.

Comments (14)

Shelley-BaoYue commented on May 25, 2024

Add --remote-runtime-endpoint=unix:///var/run/cri-dockerd.sock to the join command.

thinkeng commented on May 25, 2024

After adding --remote-runtime-endpoint=unix:///var/run/cri-dockerd.sock:

I0410 19:58:32.759022  875478 command.go:901] 1. Check KubeEdge edgecore process status
I0410 19:58:32.772668  875478 command.go:901] 2. Check if the management directory is clean
I0410 19:58:32.772759  875478 join.go:94] 3. Create the necessary directories
I0410 19:58:32.861311  875478 join_others.go:183] 4. Pull Images
Pulling kubeedge/installation-package:v1.16.1 ...
Successfully pulled kubeedge/installation-package:v1.16.1
I0410 19:59:22.256568  875478 join_others.go:183] 5. Copy resources from the image to the management directory
E0410 19:59:42.276078  875478 remote_runtime.go:176] "RunPodSandbox from runtime service failed" err="rpc error: code = DeadlineExceeded desc = context deadline exceeded"
Error: edge node join failed: copy resources failed: rpc error: code = DeadlineExceeded desc = context deadline exceeded
execute keadm command failed:  edge node join failed: copy resources failed: rpc error: code = DeadlineExceeded desc = context deadline exceeded

Shelley-BaoYue commented on May 25, 2024

I guess the cri-dockerd service needs to be active (running).
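A quick way to sanity-check the runtime endpoint before retrying the join is to confirm that the socket file behind it exists. The following is a minimal sketch, not part of keadm or KubeEdge; the helper names socket_path and is_unix_socket are hypothetical. A present socket file does not prove that cri-dockerd is answering requests, so the service status is still worth checking separately.

```python
import os
import stat
import sys
from urllib.parse import urlparse


def socket_path(endpoint: str) -> str:
    """Extract the filesystem path from a unix:// endpoint string."""
    parsed = urlparse(endpoint)
    if parsed.scheme != "unix":
        raise ValueError(f"not a unix endpoint: {endpoint}")
    return parsed.path


def is_unix_socket(path: str) -> bool:
    """Return True if path exists and is a Unix domain socket."""
    try:
        return stat.S_ISSOCK(os.stat(path).st_mode)
    except OSError:
        # Missing file, permission problem, etc.
        return False


if __name__ == "__main__" and len(sys.argv) == 2:
    # Example: python check_socket.py unix:///var/run/cri-dockerd.sock
    path = socket_path(sys.argv[1])
    state = "is a socket" if is_unix_socket(path) else "is missing or not a socket"
    print(f"{path} {state}")
```

If the socket is missing, the runtime service is most likely not running or is listening on a different path.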

thinkeng commented on May 25, 2024

I added the following to the cri-docker.service configuration:

ExecStart=/usr/bin/cri-dockerd --network-plugin=cni --pod-infra-container-image=registry.aliyuncs.com/google_containers/pause:3.7

journalctl -u edgecore.service -xe

The log shows:

"Starting to sync pod status with apiserver"
44] "Starting kubelet main sync loop"
68] "Skipping pod synchronization" err="[container runtime status check may not have completed yet, PLEG is not healthy: pleg has yet to be successful]"
status.go:68] "Attempting to register node" node="barry-edge-01"
66] failed to unmarshal message content to unstructured obj: Object 'Kind' is missing in '{"metadata":{"name":"barry-edge-01","creationTimestamp":null,"labels":{"beta.>
8] process remote failed, req[msgID[c1178973-2eaf-4f93-b725-52a9d6c59acb] resource[default/node/barry-edge-01]], err: not connected
el.go:164] Get bad anonName: when sendresp message, do nothing
o:214] "Starting CPU manager" policy="none"
o:215] "Reconciling" reconcilePeriod="10s"
36] "Initialized new in-memory state store"
88] "Updated default CPUSet" cpuSet=""
96] "Updated CPUSet assignments" assignments={}
o:49] "None policy: Start"
r.go:169] "Starting memorymanager" policy="None"
35] "Initializing new in-memory state store"
75] "Updated machine memory state"
1] "Failed to read data from checkpoint" checkpoint="kubelet_internal_checkpoint" err="checkpoint is not found"
r.go:118] "Starting Kubelet Plugin Manager"
ger.go:262] "Eviction manager: failed to get summary stats" err="failed to get node info: node length from meta db is 0"
:295] Failed to initialize CSINode: error updating CSINode annotation: timed out waiting for the condition; caused by: node length from meta db is 0
:295] Failed to initialize CSINode: error updating CSINode annotation: timed out waiting for the condition; caused by: node length from meta db is 0
:295] Failed to initialize CSINode: error updating CSINode annotation: timed out waiting for the condition; caused by: node length from meta db is 0
:295] Failed to initialize CSINode: error updating CSINode annotation: timed out waiting for the condition; caused by: node length from meta db is 0
 start connect to mqtt server with client id: hub-client-sub-1712802162
 client hub-client-sub-1712802162 isconnected: false
] connect error: Network Error : dial tcp 127.0.0.1:1883: connect: connection refused
:295] Failed to initialize CSINode: error updating CSINode annotation: timed out waiting for the condition; caused by: node length from meta db is 0

Shelley-BaoYue commented on May 25, 2024

Check whether the edge node can reach cloudcore with curl, and whether there are any error logs in cloudcore.log.
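The reachability check can also be done without curl. The sketch below only tests whether a TCP connection to a given host and port can be opened; can_connect is a hypothetical helper, and the host and port are whatever your cloudcore actually listens on (10000 is assumed here to be the default CloudHub websocket port, so verify it against your cloudcore configuration).

```python
import socket
import sys


def can_connect(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False


if __name__ == "__main__" and len(sys.argv) == 3:
    # Example: python check_port.py <cloudcore-ip> 10000
    host, port = sys.argv[1], int(sys.argv[2])
    print(f"{host}:{port} reachable: {can_connect(host, port)}")
```

A successful TCP connection only shows that the port is open from the edge node; it does not validate certificates or tokens, so cloudcore.log is still the place to look for handshake errors.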

thinkeng commented on May 25, 2024

Does port 9091 need to be exposed externally? What is this port used for?

thinkeng commented on May 25, 2024

It works now, thanks.

thinkeng commented on May 25, 2024

One more question: does port 9091 need to be exposed externally? What is this port used for?

Shelley-BaoYue commented on May 25, 2024

https://github.com/kubeedge/kubeedge/blob/master/pkg/apis/componentconfig/cloudcore/v1alpha1/types.go#L54

thinkeng commented on May 25, 2024

After the node joined successfully, the pod stays in ContainerCreating.
Error:

4月 11 19:35:56 rootk edgecore[1864043]: E0411 19:35:56.965075 1864043 pod_workers.go:1294] "Error syncing pod, skipping" err="failed to \"CreatePodSandbox\" for \"edge-node-exporter-g72hr_monitoring(ecf3a3ef-bee0-412d-8a6c-84d8cdb8c055)\" with CreatePodSandboxError: \"Failed to create sandbox for pod \\\"edge-node-exporter-g72hr_monitoring(ecf3a3ef-bee0-412d-8a6c-84d8cdb8c055)\\\": rpc error: code = Unknown desc = failed to create a sandbox for pod \\\"edge-node-exporter-g72hr\\\": Error response from daemon: cgroup-parent for systemd cgroup should be a valid slice named as \\\"xxx.slice\\\"\"" pod="monitoring/edge-node-exporter-g72hr" podUID="ecf3a3ef-bee0-412d-8a6c-84d8cdb8c055"

Docker configuration (/etc/docker/daemon.json):

{
  "registry-mirrors":["https://bycacelf.mirror.aliyuncs.com"],
  "exec-opts": ["native.cgroupdriver=systemd"],
  "log-driver": "json-file",
  "log-opts": {
    "max-size": "100m"
  },
  "storage-driver": "overlay2"
}
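The error above typically indicates a cgroup driver mismatch: Docker is configured for systemd here, while the component creating the pod sandbox appears to pass a cgroup parent that is not in slice form. To confirm which driver Docker is set to, the daemon.json can be inspected with a small sketch like the one below. docker_cgroup_driver is a hypothetical helper, and it only reads the exec-opts entry shown in this thread; it returns None when the driver is not set explicitly, in which case Docker falls back to its own default.

```python
import json
import sys
from typing import Optional


def docker_cgroup_driver(daemon_json_text: str) -> Optional[str]:
    """Return the cgroup driver named in exec-opts, or None if it is not set."""
    config = json.loads(daemon_json_text)
    for opt in config.get("exec-opts", []):
        key, _, value = opt.partition("=")
        if key == "native.cgroupdriver":
            return value
    return None


if __name__ == "__main__" and len(sys.argv) == 2:
    # Example: python cgroup_driver.py /etc/docker/daemon.json
    with open(sys.argv[1], encoding="utf-8") as f:
        print(docker_cgroup_driver(f.read()))
```

For the configuration posted above this reports systemd, so the edgecore side would need to use the same driver.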

Shelley-BaoYue commented on May 25, 2024

https://kubeedge.io/docs/setup/prerequisites/runtime#configure-the-runtime-for-edgecore-using-keadm-2

thinkeng commented on May 25, 2024

That fixed it, thank you very much.

cosmosiwi commented on May 25, 2024

Could you share how you configured cri-dockerd, and which platform you used to bootstrap k8s?

thinkeng commented on May 25, 2024

The edge node was installed with keadm.
On the edge node, I changed the following setting in cri-docker.service:

ExecStart=/usr/local/bin/cri-dockerd --network-plugin=cni --cni-bin-dir=/opt/cni/bin --cni-cache-dir=/var/lib/cni/cache --cni-conf-dir=/etc/cni/net.d   --pod-infra-container-image=registry.aliyuncs.com/google_containers/pause:3.7

The k8s cluster was installed with kubekey: https://github.com/kubesphere/kubekey
cri-docker.service on the k8s cluster:

[Unit]
Description=CRI Interface for Docker Application Container Engine
Documentation=https://docs.mirantis.com

[Service]
Type=notify
ExecStart=/usr/bin/cri-dockerd --pod-infra-container-image registry.cn-beijing.aliyuncs.com/kubesphereio/pause:3.9
ExecReload=/bin/kill -s HUP $MAINPID
TimeoutSec=0
RestartSec=2
Restart=always

# Note that StartLimit* options were moved from "Service" to "Unit" in systemd 229.
# Both the old, and new location are accepted by systemd 229 and up, so using the old location
# to make them work for either version of systemd.
StartLimitBurst=3

# Note that StartLimitInterval was renamed to StartLimitIntervalSec in systemd 230.
# Both the old, and new name are accepted by systemd 230 and up, so using the old name to make
# this option work for either version of systemd.
StartLimitInterval=60s

# Having non-zero Limit*s causes performance problems due to accounting overhead
# in the kernel. We recommend using cgroups to do container-local accounting.
LimitNOFILE=infinity
LimitNPROC=infinity
LimitCORE=infinity

# Comment TasksMax if your systemd version does not support it.
# Only systemd 226 and above support this option.
TasksMax=infinity
Delegate=yes
KillMode=process

[Install]
WantedBy=multi-user.target
