intel / afxdp-plugins-for-kubernetes
License: Apache License 2.0
Change the DP syncer from optional to mandatory at DP launch time and remove the BPF unloading functionality from the CNI.
Leverage the existing DelNetDev() function in the DPServer server to unload the BPF program on the interface.
Depends on #67 being merged.
Hey,
I'm trying to deploy a pod with 2 AF_XDP interfaces and am facing this issue:
INFO[2023-04-20 18:09:16] Reading config file: /afxdp/config/config.json
INFO[2023-04-20 18:09:16] Unmarshalling config data
INFO[2023-04-20 18:09:16] Config Data:
{
"Pools": [
{
"Name": "access",
"Mode": "primary",
"Drivers": [
{
"Name": "ice",
"Primary": 0,
"Secondary": 0,
"ExcludeDevices": [
{
"Name": "ens4f1",
"Pci": "",
"Mac": "",
"Secondary": 0
}
],
"ExcludeAddressed": false
}
],
"Devices": null,
"Nodes": null,
"UdsServerDisable": false,
"UdsTimeout": 0,
"UdsFuzz": false,
"RequiresUnprivilegedBpf": false,
"uid": 0,
"ethtoolCmds": null
},
{
"Name": "core",
"Mode": "primary",
"Drivers": [
{
"Name": "ice",
"Primary": 0,
"Secondary": 0,
"ExcludeDevices": [
{
"Name": "ens4f0",
"Pci": "",
"Mac": "",
"Secondary": 0
}
],
"ExcludeAddressed": false
}
],
"Devices": null,
"Nodes": null,
"UdsServerDisable": false,
"UdsTimeout": 0,
"UdsFuzz": false,
"RequiresUnprivilegedBpf": false,
"uid": 0,
"ethtoolCmds": null
}
],
"LogFile": "afxdp-dp.log",
"LogLevel": "debug"
}
INFO[2023-04-20 18:09:16] Validating config data
INFO[2023-04-20 18:09:16] Setting log directory: /var/log/afxdp-k8s-plugins/
INFO[2023-04-20 18:09:16] Setting log file: afxdp-dp.log
INFO[2023-04-20 18:09:16] Setting log level: debug
INFO[2023-04-20 18:09:16] Switching to debug log format
INFO[2023-04-20 18:09:16] [main.go:75] [main] Starting AF_XDP Device Plugin
INFO[2023-04-20 18:09:16] [main.go:78] [main] Checking if host meets requirements
DEBU[2023-04-20 18:09:16] [main.go:171] [checkHost] Checking kernel version
DEBU[2023-04-20 18:09:16] [main.go:197] [checkHost] Kernel version: 5.13.0-1009-oem meets minimum requirements
DEBU[2023-04-20 18:09:16] [main.go:200] [checkHost] Checking host for Libbpf
DEBU[2023-04-20 18:09:16] [host.go:85] [HasLibbpf] Directory /usr/lib64/ does not exist
DEBU[2023-04-20 18:09:16] [main.go:207] [checkHost] Libbpf found on host:
DEBU[2023-04-20 18:09:16] [main.go:209] [checkHost] /usr/lib/libbpf.so.0
DEBU[2023-04-20 18:09:16] [main.go:209] [checkHost] /usr/lib/libbpf.so.0.5.0
INFO[2023-04-20 18:09:16] [main.go:88] [main] Host meets requirements
INFO[2023-04-20 18:09:16] [main.go:91] [main] Getting device pools
DEBU[2023-04-20 18:09:16] [config.go:111] [GetPoolConfigs] Unprivileged BPF is allowed on this host
DEBU[2023-04-20 18:09:16] [config.go:130] [GetPoolConfigs] docker0 is not a physical device, removing from list of host devices
DEBU[2023-04-20 18:09:16] [config.go:130] [GetPoolConfigs] calibfc702d80c1 is not a physical device, removing from list of host devices
DEBU[2023-04-20 18:09:16] [config.go:135] [GetPoolConfigs] eno1 a globally prohibited device, removing from list of host devices
DEBU[2023-04-20 18:09:16] [config.go:130] [GetPoolConfigs] calibd8ae8df0e5 is not a physical device, removing from list of host devices
DEBU[2023-04-20 18:09:16] [config.go:130] [GetPoolConfigs] data is not a physical device, removing from list of host devices
DEBU[2023-04-20 18:09:16] [config.go:130] [GetPoolConfigs] cali09d96a0e9e0 is not a physical device, removing from list of host devices
DEBU[2023-04-20 18:09:16] [config.go:130] [GetPoolConfigs] lo is not a physical device, removing from list of host devices
DEBU[2023-04-20 18:09:16] [config.go:135] [GetPoolConfigs] eno2 a globally prohibited device, removing from list of host devices
DEBU[2023-04-20 18:09:16] [config.go:130] [GetPoolConfigs] cali537aa6a676c is not a physical device, removing from list of host devices
DEBU[2023-04-20 18:09:16] [config.go:130] [GetPoolConfigs] calie47cf898f7c is not a physical device, removing from list of host devices
DEBU[2023-04-20 18:09:16] [config.go:130] [GetPoolConfigs] worker is not a physical device, removing from list of host devices
DEBU[2023-04-20 18:09:16] [config.go:130] [GetPoolConfigs] vxlan.calico is not a physical device, removing from list of host devices
DEBU[2023-04-20 18:09:16] [config.go:130] [GetPoolConfigs] cali561dddf72ed is not a physical device, removing from list of host devices
DEBU[2023-04-20 18:09:16] [config.go:145] [GetPoolConfigs] Host devices:
{
"ens4f0": {},
"ens4f1": {}
}
INFO[2023-04-20 18:09:16] [config.go:149] [GetPoolConfigs] Processing Pool: access
DEBU[2023-04-20 18:09:16] [config.go:163] [GetPoolConfigs] Using default UDS timeout: 30 seconds
INFO[2023-04-20 18:09:16] [config.go:264] [getDeviceListOfDriverType] ens4f0 added to pool
DEBU[2023-04-20 18:09:16] [config.go:332] [validateDevice] ens4f1 is an excluded device for ice driver
DEBU[2023-04-20 18:09:16] [config.go:273] [getDeviceListOfDriverType] Exit discovery.
INFO[2023-04-20 18:09:16] [config.go:149] [GetPoolConfigs] Processing Pool: core
DEBU[2023-04-20 18:09:16] [config.go:163] [GetPoolConfigs] Using default UDS timeout: 30 seconds
DEBU[2023-04-20 18:09:16] [config.go:316] [validateDevice] Device ens4f0 is fully assigned
INFO[2023-04-20 18:09:16] [config.go:264] [getDeviceListOfDriverType] ens4f1 added to pool
DEBU[2023-04-20 18:09:16] [config.go:273] [getDeviceListOfDriverType] Exit discovery.
DEBU[2023-04-20 18:09:16] [poolManager.go:327] [startGRPC] afxdp/access started serving on /var/lib/kubelet/device-plugins/afxdp-access.sock
INFO[2023-04-20 18:09:16] [poolManager.go:88] [Init] Pool afxdp/access started serving
INFO[2023-04-20 18:09:16] [poolManager.go:93] [Init] Pool afxdp/access registered with Kubelet
DEBU[2023-04-20 18:09:16] [poolManager.go:123] [ListAndWatch] Pool afxdp/access ListAndWatch started
DEBU[2023-04-20 18:09:16] [poolManager.go:327] [startGRPC] afxdp/core started serving on /var/lib/kubelet/device-plugins/afxdp-core.sock
INFO[2023-04-20 18:09:16] [poolManager.go:88] [Init] Pool afxdp/core started serving
INFO[2023-04-20 18:09:16] [poolManager.go:93] [Init] Pool afxdp/core registered with Kubelet
DEBU[2023-04-20 18:09:16] [poolManager.go:123] [ListAndWatch] Pool afxdp/core ListAndWatch started
DEBU[2023-04-20 18:21:06] [poolManager.go:152] [Allocate] New allocate request on pool access
INFO[2023-04-20 18:21:06] [poolManager.go:155] [Allocate] Creating new UDS server
DEBU[2023-04-20 18:21:06] [poolManager.go:180] [Allocate] Device: {
"Name": "ens4f0",
"Mode": "primary",
"Driver": "ice",
"Pci": "0000:d8:00.0",
"MacAddress": "3c:ec:ef:d9:62:f2",
"FullyAssigned": true,
"EthtoolFilters": null,
"Primary": {
"Name": "ens4f0",
"Mode": "primary",
"Driver": "ice",
"Pci": "0000:d8:00.0",
"MacAddress": "3c:ec:ef:d9:62:f2",
"FullyAssigned": true,
"EthtoolFilters": null,
"Primary": null
}
}
DEBU[2023-04-20 18:21:06] [poolManager.go:190] [Allocate] Primary mode
DEBU[2023-04-20 18:21:06] [poolManager.go:202] [Allocate] Cycling state of device ens4f0
INFO[2023-04-20 18:21:06] [poolManager.go:209] [Allocate] Loading BPF program on device: ens4f0
INFO[2023-04-20 18:21:06] [bpfWrapper.go:104] [Infof] Load_bpf_send_xsk_map: disovering if_index for interface ens4f0
INFO[2023-04-20 18:21:06] [bpfWrapper.go:104] [Infof] Load_bpf_send_xsk_map: if_index for interface ens4f0 is 4
INFO[2023-04-20 18:21:06] [bpfWrapper.go:104] [Infof] Load_bpf_send_xsk_map: starting setup of xdp program on interface ens4f0 (4)
INFO[2023-04-20 18:21:06] [bpfWrapper.go:104] [Infof] Load_bpf_send_xsk_map: loaded xdp program on interface ens4f0 (4), file descriptor 11
INFO[2023-04-20 18:21:06] [poolManager.go:215] [Allocate] BPF program loaded on: ens4f0 File descriptor: 11
DEBU[2023-04-20 18:21:06] [poolManager.go:233] [Allocate] Container environment variables: {
"AFXDP_DEVICES": "ens4f0"
}
DEBU[2023-04-20 18:21:06] [udsserver.go:166] [start] Initialising Unix domain socket: /tmp/afxdp_dp/afxdp_access/13ab88a3-4751-40fb-a9b5-2b3118180c4e.sock
INFO[2023-04-20 18:21:06] [udsserver.go:174] [start] Unix domain socket initialised. Listening for new connection.
DEBU[2023-04-20 18:21:06] [poolManager.go:152] [Allocate] New allocate request on pool core
INFO[2023-04-20 18:21:06] [poolManager.go:155] [Allocate] Creating new UDS server
DEBU[2023-04-20 18:21:06] [poolManager.go:180] [Allocate] Device: {
"Name": "ens4f1",
"Mode": "primary",
"Driver": "ice",
"Pci": "0000:d8:00.1",
"MacAddress": "3c:ec:ef:d9:62:f3",
"FullyAssigned": true,
"EthtoolFilters": null,
"Primary": {
"Name": "ens4f1",
"Mode": "primary",
"Driver": "ice",
"Pci": "0000:d8:00.1",
"MacAddress": "3c:ec:ef:d9:62:f3",
"FullyAssigned": true,
"EthtoolFilters": null,
"Primary": null
}
}
DEBU[2023-04-20 18:21:06] [poolManager.go:190] [Allocate] Primary mode
DEBU[2023-04-20 18:21:06] [poolManager.go:202] [Allocate] Cycling state of device ens4f1
INFO[2023-04-20 18:21:06] [poolManager.go:209] [Allocate] Loading BPF program on device: ens4f1
INFO[2023-04-20 18:21:06] [bpfWrapper.go:104] [Infof] Load_bpf_send_xsk_map: disovering if_index for interface ens4f1
INFO[2023-04-20 18:21:06] [bpfWrapper.go:104] [Infof] Load_bpf_send_xsk_map: if_index for interface ens4f1 is 5
INFO[2023-04-20 18:21:06] [bpfWrapper.go:104] [Infof] Load_bpf_send_xsk_map: starting setup of xdp program on interface ens4f1 (5)
INFO[2023-04-20 18:21:06] [bpfWrapper.go:104] [Infof] Load_bpf_send_xsk_map: loaded xdp program on interface ens4f1 (5), file descriptor 15
INFO[2023-04-20 18:21:06] [poolManager.go:215] [Allocate] BPF program loaded on: ens4f1 File descriptor: 15
DEBU[2023-04-20 18:21:06] [poolManager.go:233] [Allocate] Container environment variables: {
"AFXDP_DEVICES": "ens4f1"
}
DEBU[2023-04-20 18:21:06] [udsserver.go:166] [start] Initialising Unix domain socket: /tmp/afxdp_dp/afxdp_core/36add699-c5fb-4edb-9609-68c4d6f7a0ee.sock
INFO[2023-04-20 18:21:06] [udsserver.go:174] [start] Unix domain socket initialised. Listening for new connection.
ERRO[2023-04-20 18:21:36] [uds.go:134] [Listen] Listener timed out: accept unixpacket /tmp/afxdp_dp/afxdp_access/13ab88a3-4751-40fb-a9b5-2b3118180c4e.sock: i/o timeout
ERRO[2023-04-20 18:21:36] [udsserver.go:179] [start] Listener timed out: accept unixpacket /tmp/afxdp_dp/afxdp_access/13ab88a3-4751-40fb-a9b5-2b3118180c4e.sock: i/o timeout
DEBU[2023-04-20 18:21:36] [uds.go:298] [cleanup] Closing Unix listener
DEBU[2023-04-20 18:21:36] [uds.go:304] [cleanup] Closing socket file
DEBU[2023-04-20 18:21:36] [uds.go:306] [cleanup] Removing socket file
ERRO[2023-04-20 18:21:36] [uds.go:134] [Listen] Listener timed out: accept unixpacket /tmp/afxdp_dp/afxdp_core/36add699-c5fb-4edb-9609-68c4d6f7a0ee.sock: i/o timeout
ERRO[2023-04-20 18:21:36] [udsserver.go:179] [start] Listener timed out: accept unixpacket /tmp/afxdp_dp/afxdp_core/36add699-c5fb-4edb-9609-68c4d6f7a0ee.sock: i/o timeout
DEBU[2023-04-20 18:21:36] [uds.go:298] [cleanup] Closing Unix listener
DEBU[2023-04-20 18:21:36] [uds.go:304] [cleanup] Closing socket file
DEBU[2023-04-20 18:21:36] [uds.go:306] [cleanup] Removing socket file
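The log above reports "Using default UDS timeout: 30 seconds", and both listeners give up without a client ever connecting. As a rough check, the gap between socket creation and the timeout can be computed from the timestamps. This is a minimal sketch that assumes the log prefix format shown above; the function name `uds_wait_times` is hypothetical and not part of the plugin.

```python
import re
from datetime import datetime

# Matches the log prefix seen above, e.g.
# "ERRO[2023-04-20 18:21:36] [uds.go:134] [Listen] Listener timed out: ..."
LINE_RE = re.compile(r"^(?P<level>[A-Z]{4})\[(?P<ts>[^\]]+)\]\s+(?P<msg>.*)$")
# First path in the message that ends in ".sock".
SOCK_RE = re.compile(r"(/\S+\.sock)")


def uds_wait_times(lines):
    """Return {socket_path: seconds between socket creation and listener timeout}."""
    started = {}
    waits = {}
    for line in lines:
        m = LINE_RE.match(line.strip())
        if not m:
            continue  # skip JSON continuation lines and anything without a prefix
        sock = SOCK_RE.search(m["msg"])
        if not sock:
            continue
        path = sock.group(1)
        ts = datetime.strptime(m["ts"], "%Y-%m-%d %H:%M:%S")
        if "Initialising Unix domain socket" in m["msg"]:
            started[path] = ts
        elif "Listener timed out" in m["msg"] and path in started:
            waits[path] = (ts - started[path]).total_seconds()
    return waits
```

Feeding it the lines above should give a wait equal to the configured timeout for each socket, which would indicate that the workload in the pod never opened the socket rather than that the handshake failed midway.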
daemonset file:
apiVersion: v1
kind: ConfigMap
metadata:
name: afxdp-dp-config
namespace: kube-system
data:
config.json: |
{
"logLevel":"debug",
"logFile":"afxdp-dp.log",
"pools":[
{
"name":"access",
"mode":"primary",
"drivers":[
{
"name":"ice",
"excludeDevices":[
{
"name":"ens4f1"
}
]
}
]
},
{
"name":"core",
"mode":"primary",
"drivers":[
{
"name":"ice",
"excludeDevices":[
{
"name":"ens4f0"
}
]
}
]
}
]
}
dpdk-devbind.py -s:
Network devices using kernel driver
===================================
0000:60:00.0 'Ethernet Connection X722 for 1GbE 37d1' if=eno1 drv=i40e unused=vfio-pci *Active*
0000:60:00.1 'Ethernet Connection X722 for 1GbE 37d1' if=eno2 drv=i40e unused=vfio-pci
0000:d8:00.0 'Device 159b' if= drv=ice unused=vfio-pci
0000:d8:00.1 'Device 159b' if= drv=ice unused=vfio-pci
cc @garyloug
Hey,
I'm facing this issue after the AF_XDP plugin deployed successfully.
Logs:
INFO[2023-01-27 09:49:48] Reading config file: /afxdp/config/config.json
INFO[2023-01-27 09:49:48] Unmarshalling config data
INFO[2023-01-27 09:49:48] Config Data:
{
"Pools": [
{
"Name": "eastPool",
"Mode": "primary",
"Drivers": [
{
"Name": "i40e",
"Primary": 0,
"Secondary": 0,
"ExcludeDevices": [
{
"Name": "eno2",
"Pci": "",
"Mac": "",
"Secondary": 0
}
],
"ExcludeAddressed": false
}
],
"Devices": null,
"Nodes": null,
"UdsServerDisable": false,
"UdsTimeout": 0,
"UdsFuzz": false,
"RequiresUnprivilegedBpf": false,
"uid": 0,
"ethtoolCmds": null
},
{
"Name": "westPool",
"Mode": "primary",
"Drivers": [
{
"Name": "i40e",
"Primary": 0,
"Secondary": 0,
"ExcludeDevices": [
{
"Name": "eno1",
"Pci": "",
"Mac": "",
"Secondary": 0
}
],
"ExcludeAddressed": false
}
],
"Devices": null,
"Nodes": null,
"UdsServerDisable": false,
"UdsTimeout": 0,
"UdsFuzz": false,
"RequiresUnprivilegedBpf": false,
"uid": 0,
"ethtoolCmds": null
}
],
"LogFile": "afxdp-dp.log",
"LogLevel": "debug"
}
INFO[2023-01-27 09:49:48] Validating config data
INFO[2023-01-27 09:49:48] Setting log directory: /var/log/afxdp-k8s-plugins/
INFO[2023-01-27 09:49:48] Setting log file: afxdp-dp.log
INFO[2023-01-27 09:49:48] Setting log level: debug
INFO[2023-01-27 09:49:48] Switching to debug log format
INFO[2023-01-27 09:49:48] [main.go:75] [main] Starting AF_XDP Device Plugin
INFO[2023-01-27 09:49:48] [main.go:78] [main] Checking if host meets requriements
DEBU[2023-01-27 09:49:48] [main.go:171] [checkHost] Checking kernel version
DEBU[2023-01-27 09:49:48] [main.go:197] [checkHost] Kernel version: 5.13.0-1009-oem meets minimum requirements
DEBU[2023-01-27 09:49:48] [main.go:200] [checkHost] Checking host for Libbpf
DEBU[2023-01-27 09:49:48] [host.go:85] [HasLibbpf] Directory /usr/lib64/ does not exist
DEBU[2023-01-27 09:49:48] [main.go:207] [checkHost] Libbpf found on host:
DEBU[2023-01-27 09:49:48] [main.go:209] [checkHost] /usr/lib/libbpf.so.0
DEBU[2023-01-27 09:49:48] [main.go:209] [checkHost] /usr/lib/libbpf.so.0.5.0
INFO[2023-01-27 09:49:48] [main.go:88] [main] Host meets requriements
INFO[2023-01-27 09:49:48] [main.go:91] [main] Getting device pools
DEBU[2023-01-27 09:49:48] [config.go:111] [GetPoolConfigs] Unprivileged BPF is allowed on this host
DEBU[2023-01-27 09:49:48] [config.go:135] [GetPoolConfigs] eno2 a globally prohibited device, removing from list of host devices
DEBU[2023-01-27 09:49:48] [config.go:130] [GetPoolConfigs] docker0 is not a physical device, removing from list of host devices
DEBU[2023-01-27 09:49:48] [config.go:130] [GetPoolConfigs] cali57cbcb24c31 is not a physical device, removing from list of host devices
DEBU[2023-01-27 09:49:48] [config.go:130] [GetPoolConfigs] cali231ee6496e0 is not a physical device, removing from list of host devices
DEBU[2023-01-27 09:49:48] [config.go:135] [GetPoolConfigs] eno1 a globally prohibited device, removing from list of host devices
DEBU[2023-01-27 09:49:48] [config.go:130] [GetPoolConfigs] caliadc8f19fd1c is not a physical device, removing from list of host devices
DEBU[2023-01-27 09:49:48] [config.go:130] [GetPoolConfigs] caliae6307e8d8d is not a physical device, removing from list of host devices
DEBU[2023-01-27 09:49:48] [config.go:130] [GetPoolConfigs] cali903687db04a is not a physical device, removing from list of host devices
DEBU[2023-01-27 09:49:48] [config.go:130] [GetPoolConfigs] lo is not a physical device, removing from list of host devices
DEBU[2023-01-27 09:49:48] [config.go:130] [GetPoolConfigs] iface is not a physical device, removing from list of host devices
DEBU[2023-01-27 09:49:48] [config.go:130] [GetPoolConfigs] vxlan.calico is not a physical device, removing from list of host devices
DEBU[2023-01-27 09:49:48] [config.go:145] [GetPoolConfigs] Host devices:
{
"ens1f0": {},
"ens1f1": {}
}
INFO[2023-01-27 09:49:48] [config.go:149] [GetPoolConfigs] Processing Pool: eastPool
DEBU[2023-01-27 09:49:48] [config.go:163] [GetPoolConfigs] Using default UDS timeout: 30 seconds
DEBU[2023-01-27 09:49:48] [config.go:254] [getDeviceListOfDriverType] ens1f0 is the wrong driver type: igb
DEBU[2023-01-27 09:49:48] [config.go:254] [getDeviceListOfDriverType] ens1f1 is the wrong driver type: igb
DEBU[2023-01-27 09:49:48] [config.go:273] [getDeviceListOfDriverType] Exit discovery.
INFO[2023-01-27 09:49:48] [config.go:149] [GetPoolConfigs] Processing Pool: westPool
DEBU[2023-01-27 09:49:48] [config.go:163] [GetPoolConfigs] Using default UDS timeout: 30 seconds
DEBU[2023-01-27 09:49:48] [config.go:254] [getDeviceListOfDriverType] ens1f0 is the wrong driver type: igb
DEBU[2023-01-27 09:49:48] [config.go:254] [getDeviceListOfDriverType] ens1f1 is the wrong driver type: igb
DEBU[2023-01-27 09:49:48] [config.go:273] [getDeviceListOfDriverType] Exit discovery.
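In the log above, both pools ask for the i40e driver, but the only remaining host devices (ens1f0, ens1f1) use igb, and eno1 and eno2 were removed earlier as globally prohibited, so neither pool ends up with a device. To see which driver each interface uses before writing the pool config, something like the following may help. It is a sketch only: it reads the standard Linux sysfs layout, the function names are hypothetical, and it does not reproduce the plugin's own filtering (for example the globally prohibited check).

```python
import os


def interface_drivers(sysfs_net="/sys/class/net"):
    """Map each interface name to its kernel driver name.

    Interfaces without a backing device (virtual ones such as lo) map to None.
    """
    drivers = {}
    for name in sorted(os.listdir(sysfs_net)):
        link = os.path.join(sysfs_net, name, "device", "driver")
        # On Linux, <iface>/device/driver is a symlink to the driver directory.
        if os.path.islink(link):
            drivers[name] = os.path.basename(os.path.realpath(link))
        else:
            drivers[name] = None
    return drivers


def devices_for_pool(drivers, wanted_drivers, excluded=()):
    """Return interfaces whose driver is wanted and that are not excluded by name."""
    return sorted(
        name
        for name, driver in drivers.items()
        if driver in wanted_drivers and name not in excluded
    )


if __name__ == "__main__":
    found = interface_drivers()
    print(found)
    print(devices_for_pool(found, {"i40e"}, excluded={"eno2"}))
```

An empty list from `devices_for_pool` for the driver named in the config would match the "wrong driver type" messages in the log.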
#81 creates a UDS per device attached to the pod. The UDS server was originally designed to handle multiple devices, so there may be an opportunity to simplify the code here, for example by removing unnecessary loops.
I am trying to understand how the AF_XDP plugin for K8s works. Please correct me if I am wrong.
packets -> primary device(physical NIC) -> NIC Driver(applies the XDP program at the hook) -> AF_XDP socket(outside the pod) -> pod
packets -> primary device(physical NIC) -> NIC Driver(applies the XDP program at the hook) -> subfunction(resides outside the pod but in the userspace) -> AF_XDP socket(outside the pod) -> pod
Note: I have seen the AF_XDP high-level architecture image, but I am not sure how the subfunction and the AF_XDP socket inside the pod are implemented. Please correct me if the implementation differs from what is shown there.
Thank you in advance!
I tried CDQ mode and got the following log, which says that the i40e driver does not support CDQ:
ERRO[2023-06-19 10:53:09] [config.go:293] [getSecondaryDevices] Error assigning subfunctions from device ens259f1: Device has an incompatible driver, i40e does not support CDQ
Is there a list of device drivers supported by CDQ mode?
Hi,
I have some questions regarding the behavior of CNDP-based interfaces.
Once the cndp based interface is injected to the POD:
Hi, I have a question, and I'm a newbie here.
Using bpftool prog list, I can see a program that I want to attach to a specific interface.
Is there a way, through the network attachment definition or something else, to name the BPF program (or binary file) in the manifest so that it is attached to the interface?
After rebooting my Kubernetes nodes with the AF_XDP CNI, the pod doesn't work anymore.
While Multus is adding the network after the reboot, I get this error:
[...] error adding container to network "afxdp-network": cmdAdd(): failed to find device: Link not found
This node was running fine with AF_XDP for 5 days before the reboot.
My configuration is as follows (note that for the moment I'm just running some tests; this is not a production node).
daemon.set
apiVersion: v1
kind: ConfigMap
metadata:
name: afxdp-dp-config
namespace: kube-system
data:
config.json: |
{
"logLevel":"debug",
"logFile":"afxdp-dp.log",
"pools":[
{
"name":"myPool",
"UdsTimeout":-1,
"mode":"primary",
"devices":[
{
"name":"enp2s0"
}
],
"drivers":[
{
"name":"virtio_net",
"ExcludeDevices":[
{
"name":"enp1s0"
}
]
}
]
}
]
}
---
apiVersion: v1
kind: ServiceAccount
metadata:
name: afxdp-device-plugin
namespace: kube-system
---
apiVersion: apps/v1
kind: DaemonSet
metadata:
[...etc, this is the classic one]
My network attach definition
apiVersion: "k8s.cni.cncf.io/v1"
kind: NetworkAttachmentDefinition
metadata:
name: afxdp-network
annotations:
k8s.v1.cni.cncf.io/resourceName: afxdp/myPool
spec:
config: '{
"cniVersion": "0.3.0",
"type": "afxdp",
"mode": "primary",
"logFile": "afxdp-cni.log",
"logLevel": "debug",
"ipam": {
"type": "whereabouts",
"range": "10.99.99.0/24"
}
}'
the deployment of the pod
apiVersion: apps/v1
kind: Deployment
metadata:
name: afxdp-deployment
spec:
replicas: 1
selector:
matchLabels:
app: afxdp
template:
metadata:
labels:
app: afxdp
annotations:
k8s.v1.cni.cncf.io/networks: kube-system/afxdp-network
spec:
containers:
- name: afxdptest
image: travelping/nettools
imagePullPolicy: IfNotPresent
command: ["tail", "-f", "/dev/null"]
resources:
requests:
afxdp/myPool: '1'
limits:
afxdp/myPool: '1'
securityContext:
privileged: true
I've tried removing Multus, Whereabouts and all the files related to AF_XDP. It still doesn't work.
Note that libbpf is installed on my machine, and sysctl is configured permanently with the following:
sysctl kernel.unprivileged_bpf_disabled=0
sysctl net.core.bpf_jit_enable=1
Am I missing something in how this CNI should be used?
Thanks for your help!
I was trying out the afxdp-plugin with CNDP to deploy a sample application in Kubernetes, and I hit the following error inside the pod.
When I add NET_ADMIN and SYS_ADMIN, it works without any issue, but I thought no extra privileges were required to run the pod. Can you please help me out here?
These are the YAML files I used.
POD.YAML
apiVersion: v1
kind: Pod
metadata:
name: cndp-0-0
annotations:
k8s.v1.cni.cncf.io/networks: cndp-cni-afxdp0
spec:
volumes:
- name: shared-data
emptyDir: {}
- name: unixsock
hostPath:
path: /tmp/afxdp_dp/
containers:
- name: cndp-0
command:
- sleep
- inf
image: cndp
imagePullPolicy: Never
securityContext:
capabilities:
add:
- NET_RAW
- IPC_LOCK
- NET_ADMIN
- SYS_ADMIN
ports:
- containerPort: 8094
hostPort: 8094
resources:
requests:
afxdp/pool1: '1'
limits:
afxdp/pool1: '1'
hugepages-2Mi: 512Mi
memory: 2Gi
volumeMounts:
- name: shared-data
mountPath: /var/run/cndp/
- name: unixsock
mountPath: /tmp/afxdp_dp/
NAD.YAML
apiVersion: "k8s.cni.cncf.io/v1"
kind: NetworkAttachmentDefinition
metadata:
name: cndp-cni-afxdp0
annotations:
k8s.v1.cni.cncf.io/resourceName: afxdp/pool1
spec:
config: '{
"cniVersion": "0.3.0",
"type": "afxdp",
"mode": "primary",
"queues": "1",
"logLevel": "debug",
"ipam": {
"type": "host-local",
"subnet": "192.168.1.0/24",
"rangeStart": "192.168.1.200",
"rangeEnd": "192.168.1.216",
"routes": [
{ "dst": "0.0.0.0/0" }
],
"gateway": "192.168.1.1"
}
}'
DAEMONSET.YAML
apiVersion: v1
kind: ConfigMap
metadata:
name: afxdp-dp-config
namespace: kube-system
data:
config.json: |
{
"clusterType": "physical",
"mode": "primary",
"logLevel": "debug",
"pools":[
{
"name":"pool1",
"mode":"primary",
"udsTimeout":300,
"drivers":[
{
"name":"i40e"
},
{
"name":"ice"
}
]
}
]
}
---
apiVersion: v1
kind: ServiceAccount
metadata:
name: afxdp-device-plugin
namespace: kube-system
---
apiVersion: apps/v1
kind: DaemonSet
metadata:
name: kube-afxdp-device-plugin
namespace: kube-system
labels:
tier: node
app: afxdp
spec:
selector:
matchLabels:
name: afxdp-device-plugin
template:
metadata:
labels:
name: afxdp-device-plugin
tier: node
app: afxdp
spec:
hostNetwork: true
nodeSelector:
kubernetes.io/arch: amd64
tolerations:
- key: node-role.kubernetes.io/master
operator: Exists
effect: NoSchedule
serviceAccountName: afxdp-device-plugin
containers:
- name: kube-afxdp
image: intel/afxdp-plugins-for-kubernetes:latest
imagePullPolicy: IfNotPresent
securityContext:
capabilities:
drop:
- all
add:
- SYS_ADMIN
- NET_ADMIN
resources:
requests:
cpu: "250m"
memory: "40Mi"
limits:
cpu: "1"
memory: "200Mi"
volumeMounts:
- name: unixsock
mountPath: /tmp/afxdp_dp/
- name: bpfmappinning
mountPath: /var/run/afxdp_dp/
- name: devicesock
mountPath: /var/lib/kubelet/device-plugins/
- name: resources
mountPath: /var/lib/kubelet/pod-resources/
- name: config-volume
mountPath: /afxdp/config
- name: log
mountPath: /var/log/afxdp-k8s-plugins/
- name: cnibin
mountPath: /opt/cni/bin/
volumes:
- name: unixsock
hostPath:
path: /tmp/afxdp_dp/
- name: bpfmappinning
hostPath:
path: /var/run/afxdp_dp/
- name: devicesock
hostPath:
path: /var/lib/kubelet/device-plugins/
- name: resources
hostPath:
path: /var/lib/kubelet/pod-resources/
- name: config-volume
configMap:
name: afxdp-dp-config
items:
- key: config.json
path: config.json
- name: log
hostPath:
path: /var/log/afxdp-k8s-plugins/
- name: cnibin
hostPath:
path: /opt/cni/bin/
I've browsed a little bit through the repo source code and I've seen the internal udsserver and uds packages. The uds package seems to be the part I've marked in the above illustration, taken from the documentation in this repository.
What I don't understand yet, and haven't found any explicit documentation or examples for (if I'm not mistaken), is how to use this in my container workload. How do I integrate the UDS client into my own workload code?
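As a starting point only, here is a sketch of what a client might look like. The one thing taken from this page is the socket type: the device plugin logs elsewhere in these issues show "accept unixpacket", which corresponds to a SOCK_SEQPACKET Unix socket. Everything else is an assumption: the helper names are hypothetical, the socket path inside the container depends on how the pod mounts it, and the request strings of the handshake are not shown here, so they are left as parameters to be filled in from the repository's own UDS documentation.

```python
import array
import socket


def connect_uds(path, timeout=5.0):
    """Open a SOCK_SEQPACKET Unix socket to `path` (path supplied by the caller)."""
    sock = socket.socket(socket.AF_UNIX, socket.SOCK_SEQPACKET)
    sock.settimeout(timeout)
    sock.connect(path)
    return sock


def request_fd(sock, request, bufsize=1024):
    """Send one request string and return (reply_text, fd or None).

    If the reply carries a file descriptor as SCM_RIGHTS ancillary data,
    the first descriptor is returned; otherwise fd is None.
    """
    sock.send(request.encode())
    fds = array.array("i")
    data, ancdata, _flags, _addr = sock.recvmsg(bufsize, socket.CMSG_LEN(fds.itemsize))
    for level, ctype, cdata in ancdata:
        if level == socket.SOL_SOCKET and ctype == socket.SCM_RIGHTS:
            # Ignore any trailing bytes that do not form a whole int.
            usable = len(cdata) - (len(cdata) % fds.itemsize)
            fds.frombytes(cdata[:usable])
    return data.decode(errors="replace"), (fds[0] if len(fds) else None)
```

The descriptor returned this way would then be handed to whatever AF_XDP library the workload uses. Whether the server expects a hostname check or other messages before the descriptor request is something to confirm against the uds package itself.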
I was working with afxdp-plugins-for-kubernetes.
For my use case I needed to attach 2 devices to the pod.
I used the following deployment:
apiVersion: v1
kind: Pod
metadata:
name: afxdp-pod
annotations:
k8s.v1.cni.cncf.io/networks: cndp-cni-afxdp0
spec:
containers:
- name: afxdp1
image: ubuntu:latest
command: ["tail", "-f", "/dev/null"]
resources:
requests:
afxdp/mypool: '2'
limits:
afxdp/mypool: '2'
I expected 2 devices to be available for use in the pod, but inside the container I could see only one.
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 13s default-scheduler Successfully assigned default/afxdp-pod to turing-04
Normal AddedInterface 10s multus Add eth0 [10.244.0.11/24] from cbr0
Normal AddedInterface 10s multus Add net1 [192.168.1.204/24] from default/cndp-cni-afxdp0
Normal Pulling 9s kubelet Pulling image "ubuntu:latest"
Normal Pulled 7s kubelet Successfully pulled image "ubuntu:latest" in 2.386145893s (2.386155184s including waiting)
Normal Created 7s kubelet Created container afxdp1
Normal Started 7s kubelet Started container afxdp1
This is the kubectl describe output for the pod.
root@cndp-0-0:~# ip link
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN mode DEFAULT group default qlen 1000
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
2: eth0@if145: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1450 qdisc noqueue state UP mode DEFAULT group default
link/ether ce:0f:94:1a:d2:fc brd ff:ff:ff:ff:ff:ff link-netnsid 0
6: ens259f0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 xdp qdisc mq state UP mode DEFAULT group default qlen 1000
link/ether 3c:fd:fe:9e:7b:5c brd ff:ff:ff:ff:ff:ff
prog/xdp id 325 tag 992d9ddc835e5629 jited
Running ip link inside the container shows only one device.
The afxdp-plugin logs show that 3 devices were allocated and added to the container:
DEBU[2023-07-17 06:37:47] [poolManager.go:233] [Allocate] Container environment variables: {
"AFXDP_DEVICES": "ens259f1 ens259f0 enp7s0f0"
}
The BPF program is loaded on each allocated interface, but the other devices are still in the host namespace.
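One thing worth checking, offered as an assumption rather than a confirmed diagnosis: the pod events show a single "Add net1" from Multus, and the networks annotation names the network attachment only once. If the CNI is invoked once per annotation entry, only one of the allocated devices would be moved into the pod. The sketch below builds a pod manifest in which the annotation repeats the network name once per requested device; the function names are hypothetical, and the behaviour should be verified against the plugin's documentation.

```python
def networks_annotation(nad_name, device_count):
    """Repeat the network attachment name once per requested device."""
    if device_count < 1:
        raise ValueError("device_count must be at least 1")
    return ", ".join([nad_name] * device_count)


def pod_manifest(pod_name, image, nad_name, resource_name, device_count):
    """Build a pod manifest whose annotation and resource count agree."""
    count = str(device_count)
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {
            "name": pod_name,
            "annotations": {
                "k8s.v1.cni.cncf.io/networks": networks_annotation(nad_name, device_count),
            },
        },
        "spec": {
            "containers": [
                {
                    "name": "afxdp1",
                    "image": image,
                    "command": ["tail", "-f", "/dev/null"],
                    "resources": {
                        "requests": {resource_name: count},
                        "limits": {resource_name: count},
                    },
                }
            ]
        },
    }


if __name__ == "__main__":
    import json

    manifest = pod_manifest("afxdp-pod", "ubuntu:latest", "cndp-cni-afxdp0", "afxdp/mypool", 2)
    print(json.dumps(manifest, indent=2))
```

Since JSON is valid YAML, the printed output can be passed to kubectl apply -f - directly.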
In internal/bpf/bpfWrapper.c, there is a call to bpf_set_link_xdp_fd() at line 153. This function has been replaced in newer versions of libbpf.
A change to DPDK code (net/af_xdp) for this same issue can be seen at:
https://patchwork.dpdk.org/project/dpdk/patch/[email protected]/
RHEL 9.2 includes libbpf v1.0.0, causing this code to break when moving to a newer OS.