afxdp-plugins-for-kubernetes's Issues

[Listen] Listener timed out: accept unixpacket

Hey,
I'm trying to deploy a pod with 2 AF_XDP interfaces and am facing this issue:

INFO[2023-04-20 18:09:16] Reading config file: /afxdp/config/config.json 
INFO[2023-04-20 18:09:16] Unmarshalling config data                    
INFO[2023-04-20 18:09:16] Config Data:
{
  "Pools": [
    {
      "Name": "access",
      "Mode": "primary",
      "Drivers": [
        {
          "Name": "ice",
          "Primary": 0,
          "Secondary": 0,
          "ExcludeDevices": [
            {
              "Name": "ens4f1",
              "Pci": "",
              "Mac": "",
              "Secondary": 0
            }
          ],
          "ExcludeAddressed": false
        }
      ],
      "Devices": null,
      "Nodes": null,
      "UdsServerDisable": false,
      "UdsTimeout": 0,
      "UdsFuzz": false,
      "RequiresUnprivilegedBpf": false,
      "uid": 0,
      "ethtoolCmds": null
    },
    {
      "Name": "core",
      "Mode": "primary",
      "Drivers": [
        {
          "Name": "ice",
          "Primary": 0,
          "Secondary": 0,
          "ExcludeDevices": [
            {
              "Name": "ens4f0",
              "Pci": "",
              "Mac": "",
              "Secondary": 0
            }
          ],
          "ExcludeAddressed": false
        }
      ],
      "Devices": null,
      "Nodes": null,
      "UdsServerDisable": false,
      "UdsTimeout": 0,
      "UdsFuzz": false,
      "RequiresUnprivilegedBpf": false,
      "uid": 0,
      "ethtoolCmds": null
    }
  ],
  "LogFile": "afxdp-dp.log",
  "LogLevel": "debug"
} 
INFO[2023-04-20 18:09:16] Validating config data                       
INFO[2023-04-20 18:09:16] Setting log directory: /var/log/afxdp-k8s-plugins/ 
INFO[2023-04-20 18:09:16] Setting log file: afxdp-dp.log               
INFO[2023-04-20 18:09:16] Setting log level: debug                     
INFO[2023-04-20 18:09:16] Switching to debug log format                
INFO[2023-04-20 18:09:16] [main.go:75] [main] Starting AF_XDP Device Plugin                
INFO[2023-04-20 18:09:16] [main.go:78] [main] Checking if host meets requirements          
DEBU[2023-04-20 18:09:16] [main.go:171] [checkHost] Checking kernel version                      
DEBU[2023-04-20 18:09:16] [main.go:197] [checkHost] Kernel version: 5.13.0-1009-oem meets minimum requirements 
DEBU[2023-04-20 18:09:16] [main.go:200] [checkHost] Checking host for Libbpf                     
DEBU[2023-04-20 18:09:16] [host.go:85] [HasLibbpf] Directory /usr/lib64/ does not exist         
DEBU[2023-04-20 18:09:16] [main.go:207] [checkHost] Libbpf found on host:                        
DEBU[2023-04-20 18:09:16] [main.go:209] [checkHost] 	/usr/lib/libbpf.so.0                        
DEBU[2023-04-20 18:09:16] [main.go:209] [checkHost] 	/usr/lib/libbpf.so.0.5.0                    
INFO[2023-04-20 18:09:16] [main.go:88] [main] Host meets requirements                      
INFO[2023-04-20 18:09:16] [main.go:91] [main] Getting device pools                         
DEBU[2023-04-20 18:09:16] [config.go:111] [GetPoolConfigs] Unprivileged BPF is allowed on this host     
DEBU[2023-04-20 18:09:16] [config.go:130] [GetPoolConfigs] docker0 is not a physical device, removing from list of host devices 
DEBU[2023-04-20 18:09:16] [config.go:130] [GetPoolConfigs] calibfc702d80c1 is not a physical device, removing from list of host devices 
DEBU[2023-04-20 18:09:16] [config.go:135] [GetPoolConfigs] eno1 a globally prohibited device, removing from list of host devices 
DEBU[2023-04-20 18:09:16] [config.go:130] [GetPoolConfigs] calibd8ae8df0e5 is not a physical device, removing from list of host devices 
DEBU[2023-04-20 18:09:16] [config.go:130] [GetPoolConfigs] data is not a physical device, removing from list of host devices 
DEBU[2023-04-20 18:09:16] [config.go:130] [GetPoolConfigs] cali09d96a0e9e0 is not a physical device, removing from list of host devices 
DEBU[2023-04-20 18:09:16] [config.go:130] [GetPoolConfigs] lo is not a physical device, removing from list of host devices 
DEBU[2023-04-20 18:09:16] [config.go:135] [GetPoolConfigs] eno2 a globally prohibited device, removing from list of host devices 
DEBU[2023-04-20 18:09:16] [config.go:130] [GetPoolConfigs] cali537aa6a676c is not a physical device, removing from list of host devices 
DEBU[2023-04-20 18:09:16] [config.go:130] [GetPoolConfigs] calie47cf898f7c is not a physical device, removing from list of host devices 
DEBU[2023-04-20 18:09:16] [config.go:130] [GetPoolConfigs] worker is not a physical device, removing from list of host devices 
DEBU[2023-04-20 18:09:16] [config.go:130] [GetPoolConfigs] vxlan.calico is not a physical device, removing from list of host devices 
DEBU[2023-04-20 18:09:16] [config.go:130] [GetPoolConfigs] cali561dddf72ed is not a physical device, removing from list of host devices 
DEBU[2023-04-20 18:09:16] [config.go:145] [GetPoolConfigs] Host devices:
{
  "ens4f0": {},
  "ens4f1": {}
} 
INFO[2023-04-20 18:09:16] [config.go:149] [GetPoolConfigs] Processing Pool: access                      
DEBU[2023-04-20 18:09:16] [config.go:163] [GetPoolConfigs] Using default UDS timeout: 30 seconds        
INFO[2023-04-20 18:09:16] [config.go:264] [getDeviceListOfDriverType] ens4f0 added to pool                         
DEBU[2023-04-20 18:09:16] [config.go:332] [validateDevice] ens4f1 is an excluded device for ice driver  
DEBU[2023-04-20 18:09:16] [config.go:273] [getDeviceListOfDriverType] Exit discovery.                              
INFO[2023-04-20 18:09:16] [config.go:149] [GetPoolConfigs] Processing Pool: core                        
DEBU[2023-04-20 18:09:16] [config.go:163] [GetPoolConfigs] Using default UDS timeout: 30 seconds        
DEBU[2023-04-20 18:09:16] [config.go:316] [validateDevice] Device ens4f0 is fully assigned              
INFO[2023-04-20 18:09:16] [config.go:264] [getDeviceListOfDriverType] ens4f1 added to pool                         
DEBU[2023-04-20 18:09:16] [config.go:273] [getDeviceListOfDriverType] Exit discovery.                              
DEBU[2023-04-20 18:09:16] [poolManager.go:327] [startGRPC] afxdp/access started serving on /var/lib/kubelet/device-plugins/afxdp-access.sock 
INFO[2023-04-20 18:09:16] [poolManager.go:88] [Init] Pool afxdp/access started serving            
INFO[2023-04-20 18:09:16] [poolManager.go:93] [Init] Pool afxdp/access registered with Kubelet    
DEBU[2023-04-20 18:09:16] [poolManager.go:123] [ListAndWatch] Pool afxdp/access ListAndWatch started       
DEBU[2023-04-20 18:09:16] [poolManager.go:327] [startGRPC] afxdp/core started serving on /var/lib/kubelet/device-plugins/afxdp-core.sock 
INFO[2023-04-20 18:09:16] [poolManager.go:88] [Init] Pool afxdp/core started serving              
INFO[2023-04-20 18:09:16] [poolManager.go:93] [Init] Pool afxdp/core registered with Kubelet      
DEBU[2023-04-20 18:09:16] [poolManager.go:123] [ListAndWatch] Pool afxdp/core ListAndWatch started         
DEBU[2023-04-20 18:21:06] [poolManager.go:152] [Allocate] New allocate request on pool access          
INFO[2023-04-20 18:21:06] [poolManager.go:155] [Allocate] Creating new UDS server                      
DEBU[2023-04-20 18:21:06] [poolManager.go:180] [Allocate] Device: {
  "Name": "ens4f0",
  "Mode": "primary",
  "Driver": "ice",
  "Pci": "0000:d8:00.0",
  "MacAddress": "3c:ec:ef:d9:62:f2",
  "FullyAssigned": true,
  "EthtoolFilters": null,
  "Primary": {
    "Name": "ens4f0",
    "Mode": "primary",
    "Driver": "ice",
    "Pci": "0000:d8:00.0",
    "MacAddress": "3c:ec:ef:d9:62:f2",
    "FullyAssigned": true,
    "EthtoolFilters": null,
    "Primary": null
  }
} 
DEBU[2023-04-20 18:21:06] [poolManager.go:190] [Allocate] Primary mode                                 
DEBU[2023-04-20 18:21:06] [poolManager.go:202] [Allocate] Cycling state of device ens4f0               
INFO[2023-04-20 18:21:06] [poolManager.go:209] [Allocate] Loading BPF program on device: ens4f0        
INFO[2023-04-20 18:21:06] [bpfWrapper.go:104] [Infof] Load_bpf_send_xsk_map: disovering if_index for interface ens4f0 
INFO[2023-04-20 18:21:06] [bpfWrapper.go:104] [Infof] Load_bpf_send_xsk_map: if_index for interface ens4f0 is 4 
INFO[2023-04-20 18:21:06] [bpfWrapper.go:104] [Infof] Load_bpf_send_xsk_map: starting setup of xdp program on interface ens4f0 (4) 
INFO[2023-04-20 18:21:06] [bpfWrapper.go:104] [Infof] Load_bpf_send_xsk_map: loaded xdp program on interface ens4f0 (4), file descriptor 11 
INFO[2023-04-20 18:21:06] [poolManager.go:215] [Allocate] BPF program loaded on: ens4f0 File descriptor: 11 
DEBU[2023-04-20 18:21:06] [poolManager.go:233] [Allocate] Container environment variables: {
  "AFXDP_DEVICES": "ens4f0"
} 
DEBU[2023-04-20 18:21:06] [udsserver.go:166] [start] Initialising Unix domain socket: /tmp/afxdp_dp/afxdp_access/13ab88a3-4751-40fb-a9b5-2b3118180c4e.sock 
INFO[2023-04-20 18:21:06] [udsserver.go:174] [start] Unix domain socket initialised. Listening for new connection. 
DEBU[2023-04-20 18:21:06] [poolManager.go:152] [Allocate] New allocate request on pool core            
INFO[2023-04-20 18:21:06] [poolManager.go:155] [Allocate] Creating new UDS server                      
DEBU[2023-04-20 18:21:06] [poolManager.go:180] [Allocate] Device: {
  "Name": "ens4f1",
  "Mode": "primary",
  "Driver": "ice",
  "Pci": "0000:d8:00.1",
  "MacAddress": "3c:ec:ef:d9:62:f3",
  "FullyAssigned": true,
  "EthtoolFilters": null,
  "Primary": {
    "Name": "ens4f1",
    "Mode": "primary",
    "Driver": "ice",
    "Pci": "0000:d8:00.1",
    "MacAddress": "3c:ec:ef:d9:62:f3",
    "FullyAssigned": true,
    "EthtoolFilters": null,
    "Primary": null
  }
} 
DEBU[2023-04-20 18:21:06] [poolManager.go:190] [Allocate] Primary mode                                 
DEBU[2023-04-20 18:21:06] [poolManager.go:202] [Allocate] Cycling state of device ens4f1               
INFO[2023-04-20 18:21:06] [poolManager.go:209] [Allocate] Loading BPF program on device: ens4f1        
INFO[2023-04-20 18:21:06] [bpfWrapper.go:104] [Infof] Load_bpf_send_xsk_map: disovering if_index for interface ens4f1 
INFO[2023-04-20 18:21:06] [bpfWrapper.go:104] [Infof] Load_bpf_send_xsk_map: if_index for interface ens4f1 is 5 
INFO[2023-04-20 18:21:06] [bpfWrapper.go:104] [Infof] Load_bpf_send_xsk_map: starting setup of xdp program on interface ens4f1 (5) 
INFO[2023-04-20 18:21:06] [bpfWrapper.go:104] [Infof] Load_bpf_send_xsk_map: loaded xdp program on interface ens4f1 (5), file descriptor 15 
INFO[2023-04-20 18:21:06] [poolManager.go:215] [Allocate] BPF program loaded on: ens4f1 File descriptor: 15 
DEBU[2023-04-20 18:21:06] [poolManager.go:233] [Allocate] Container environment variables: {
  "AFXDP_DEVICES": "ens4f1"
} 
DEBU[2023-04-20 18:21:06] [udsserver.go:166] [start] Initialising Unix domain socket: /tmp/afxdp_dp/afxdp_core/36add699-c5fb-4edb-9609-68c4d6f7a0ee.sock 
INFO[2023-04-20 18:21:06] [udsserver.go:174] [start] Unix domain socket initialised. Listening for new connection. 
ERRO[2023-04-20 18:21:36] [uds.go:134] [Listen] Listener timed out: accept unixpacket /tmp/afxdp_dp/afxdp_access/13ab88a3-4751-40fb-a9b5-2b3118180c4e.sock: i/o timeout 
ERRO[2023-04-20 18:21:36] [udsserver.go:179] [start] Listener timed out: accept unixpacket /tmp/afxdp_dp/afxdp_access/13ab88a3-4751-40fb-a9b5-2b3118180c4e.sock: i/o timeout 
DEBU[2023-04-20 18:21:36] [uds.go:298] [cleanup] Closing Unix listener                        
DEBU[2023-04-20 18:21:36] [uds.go:304] [cleanup] Closing socket file                          
DEBU[2023-04-20 18:21:36] [uds.go:306] [cleanup] Removing socket file                         
ERRO[2023-04-20 18:21:36] [uds.go:134] [Listen] Listener timed out: accept unixpacket /tmp/afxdp_dp/afxdp_core/36add699-c5fb-4edb-9609-68c4d6f7a0ee.sock: i/o timeout 
ERRO[2023-04-20 18:21:36] [udsserver.go:179] [start] Listener timed out: accept unixpacket /tmp/afxdp_dp/afxdp_core/36add699-c5fb-4edb-9609-68c4d6f7a0ee.sock: i/o timeout 
DEBU[2023-04-20 18:21:36] [uds.go:298] [cleanup] Closing Unix listener                        
DEBU[2023-04-20 18:21:36] [uds.go:304] [cleanup] Closing socket file                          
DEBU[2023-04-20 18:21:36] [uds.go:306] [cleanup] Removing socket file

daemonset file:

apiVersion: v1
kind: ConfigMap
metadata:
  name: afxdp-dp-config
  namespace: kube-system
data:
  config.json: |
    {
        "logLevel":"debug",
        "logFile":"afxdp-dp.log",
        "pools":[
          {
             "name":"access",
             "mode":"primary",
             "drivers":[
                {
                   "name":"ice",
                   "excludeDevices":[
                    {
                        "name":"ens4f1"
                    }
                    ]
                }
             ]
          },
          {
             "name":"core",
             "mode":"primary",
             "drivers":[
                {
                   "name":"ice",
                   "excludeDevices":[
                    {
                        "name":"ens4f0"
                    }
                    ]
                }
             ]
          }
       ]
    }

dpdk-devbind.py -s:

Network devices using kernel driver
===================================
0000:60:00.0 'Ethernet Connection X722 for 1GbE 37d1' if=eno1 drv=i40e unused=vfio-pci *Active*
0000:60:00.1 'Ethernet Connection X722 for 1GbE 37d1' if=eno2 drv=i40e unused=vfio-pci 
0000:d8:00.0 'Device 159b' if= drv=ice unused=vfio-pci 
0000:d8:00.1 'Device 159b' if= drv=ice unused=vfio-pci 

cc @garyloug
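
The log above is consistent with no client connecting to either socket within the default 30-second UDS timeout: both allocations start at 18:21:06 and both listeners fail at 18:21:36. The error text has the shape Go produces when a listener deadline expires on a `unixpacket` socket. A minimal sketch of that behaviour, using only the Go standard library (the function names and the 200 ms timeout are illustrative assumptions, not the plugin's code):

```go
package main

import (
	"errors"
	"fmt"
	"net"
	"os"
	"path/filepath"
	"time"
)

// acceptOnce listens on a unixpacket socket at path and waits up to
// timeout for a single client. It returns the accept error, if any.
func acceptOnce(path string, timeout time.Duration) error {
	addr := &net.UnixAddr{Name: path, Net: "unixpacket"}
	l, err := net.ListenUnix("unixpacket", addr)
	if err != nil {
		return err
	}
	defer l.Close() // closing the listener also removes the socket file

	if err := l.SetDeadline(time.Now().Add(timeout)); err != nil {
		return err
	}
	conn, err := l.AcceptUnix()
	if err != nil {
		return err
	}
	return conn.Close()
}

// isTimeout reports whether err is a network timeout.
func isTimeout(err error) bool {
	var ne net.Error
	return errors.As(err, &ne) && ne.Timeout()
}

// demo listens on a temporary socket that nobody connects to.
func demo() error {
	dir, err := os.MkdirTemp("", "uds-demo")
	if err != nil {
		return err
	}
	defer os.RemoveAll(dir)
	return acceptOnce(filepath.Join(dir, "demo.sock"), 200*time.Millisecond)
}

func main() {
	err := demo()
	fmt.Println("timed out:", isTimeout(err))
	fmt.Println(err)
}
```

If this is what is happening, the question becomes why the workload in the pod never connects to the socket, rather than why the device plugin fails.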

eno1 a globally prohibited device, removing from list of host devices

Hey,
I'm facing this issue after the AF_XDP plugin was deployed successfully.
Logs:

INFO[2023-01-27 09:49:48] Reading config file: /afxdp/config/config.json 
INFO[2023-01-27 09:49:48] Unmarshalling config data                    
INFO[2023-01-27 09:49:48] Config Data:
{
  "Pools": [
    {
      "Name": "eastPool",
      "Mode": "primary",
      "Drivers": [
        {
          "Name": "i40e",
          "Primary": 0,
          "Secondary": 0,
          "ExcludeDevices": [
            {
              "Name": "eno2",
              "Pci": "",
              "Mac": "",
              "Secondary": 0
            }
          ],
          "ExcludeAddressed": false
        }
      ],
      "Devices": null,
      "Nodes": null,
      "UdsServerDisable": false,
      "UdsTimeout": 0,
      "UdsFuzz": false,
      "RequiresUnprivilegedBpf": false,
      "uid": 0,
      "ethtoolCmds": null
    },
    {
      "Name": "westPool",
      "Mode": "primary",
      "Drivers": [
        {
          "Name": "i40e",
          "Primary": 0,
          "Secondary": 0,
          "ExcludeDevices": [
            {
              "Name": "eno1",
              "Pci": "",
              "Mac": "",
              "Secondary": 0
            }
          ],
          "ExcludeAddressed": false
        }
      ],
      "Devices": null,
      "Nodes": null,
      "UdsServerDisable": false,
      "UdsTimeout": 0,
      "UdsFuzz": false,
      "RequiresUnprivilegedBpf": false,
      "uid": 0,
      "ethtoolCmds": null
    }
  ],
  "LogFile": "afxdp-dp.log",
  "LogLevel": "debug"
} 
INFO[2023-01-27 09:49:48] Validating config data                       
INFO[2023-01-27 09:49:48] Setting log directory: /var/log/afxdp-k8s-plugins/ 
INFO[2023-01-27 09:49:48] Setting log file: afxdp-dp.log               
INFO[2023-01-27 09:49:48] Setting log level: debug                     
INFO[2023-01-27 09:49:48] Switching to debug log format                
INFO[2023-01-27 09:49:48] [main.go:75] [main] Starting AF_XDP Device Plugin                
INFO[2023-01-27 09:49:48] [main.go:78] [main] Checking if host meets requriements          
DEBU[2023-01-27 09:49:48] [main.go:171] [checkHost] Checking kernel version                      
DEBU[2023-01-27 09:49:48] [main.go:197] [checkHost] Kernel version: 5.13.0-1009-oem meets minimum requirements 
DEBU[2023-01-27 09:49:48] [main.go:200] [checkHost] Checking host for Libbpf                     
DEBU[2023-01-27 09:49:48] [host.go:85] [HasLibbpf] Directory /usr/lib64/ does not exist         
DEBU[2023-01-27 09:49:48] [main.go:207] [checkHost] Libbpf found on host:                        
DEBU[2023-01-27 09:49:48] [main.go:209] [checkHost] 	/usr/lib/libbpf.so.0                        
DEBU[2023-01-27 09:49:48] [main.go:209] [checkHost] 	/usr/lib/libbpf.so.0.5.0                    
INFO[2023-01-27 09:49:48] [main.go:88] [main] Host meets requriements                      
INFO[2023-01-27 09:49:48] [main.go:91] [main] Getting device pools                         
DEBU[2023-01-27 09:49:48] [config.go:111] [GetPoolConfigs] Unprivileged BPF is allowed on this host     
DEBU[2023-01-27 09:49:48] [config.go:135] [GetPoolConfigs] eno2 a globally prohibited device, removing from list of host devices 
DEBU[2023-01-27 09:49:48] [config.go:130] [GetPoolConfigs] docker0 is not a physical device, removing from list of host devices 
DEBU[2023-01-27 09:49:48] [config.go:130] [GetPoolConfigs] cali57cbcb24c31 is not a physical device, removing from list of host devices 
DEBU[2023-01-27 09:49:48] [config.go:130] [GetPoolConfigs] cali231ee6496e0 is not a physical device, removing from list of host devices 
DEBU[2023-01-27 09:49:48] [config.go:135] [GetPoolConfigs] eno1 a globally prohibited device, removing from list of host devices 
DEBU[2023-01-27 09:49:48] [config.go:130] [GetPoolConfigs] caliadc8f19fd1c is not a physical device, removing from list of host devices 
DEBU[2023-01-27 09:49:48] [config.go:130] [GetPoolConfigs] caliae6307e8d8d is not a physical device, removing from list of host devices 
DEBU[2023-01-27 09:49:48] [config.go:130] [GetPoolConfigs] cali903687db04a is not a physical device, removing from list of host devices 
DEBU[2023-01-27 09:49:48] [config.go:130] [GetPoolConfigs] lo is not a physical device, removing from list of host devices 
DEBU[2023-01-27 09:49:48] [config.go:130] [GetPoolConfigs] iface is not a physical device, removing from list of host devices 
DEBU[2023-01-27 09:49:48] [config.go:130] [GetPoolConfigs] vxlan.calico is not a physical device, removing from list of host devices 
DEBU[2023-01-27 09:49:48] [config.go:145] [GetPoolConfigs] Host devices:
{
  "ens1f0": {},
  "ens1f1": {}
} 
INFO[2023-01-27 09:49:48] [config.go:149] [GetPoolConfigs] Processing Pool: eastPool                    
DEBU[2023-01-27 09:49:48] [config.go:163] [GetPoolConfigs] Using default UDS timeout: 30 seconds        
DEBU[2023-01-27 09:49:48] [config.go:254] [getDeviceListOfDriverType] ens1f0 is the wrong driver type: igb         
DEBU[2023-01-27 09:49:48] [config.go:254] [getDeviceListOfDriverType] ens1f1 is the wrong driver type: igb         
DEBU[2023-01-27 09:49:48] [config.go:273] [getDeviceListOfDriverType] Exit discovery.                              
INFO[2023-01-27 09:49:48] [config.go:149] [GetPoolConfigs] Processing Pool: westPool                    
DEBU[2023-01-27 09:49:48] [config.go:163] [GetPoolConfigs] Using default UDS timeout: 30 seconds        
DEBU[2023-01-27 09:49:48] [config.go:254] [getDeviceListOfDriverType] ens1f0 is the wrong driver type: igb         
DEBU[2023-01-27 09:49:48] [config.go:254] [getDeviceListOfDriverType] ens1f1 is the wrong driver type: igb         
DEBU[2023-01-27 09:49:48] [config.go:273] [getDeviceListOfDriverType] Exit discovery.
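
Reading the log, both pools end up empty: eno1 and eno2 are removed as globally prohibited devices, and the two remaining host devices (ens1f0, ens1f1) use the igb driver while both pools ask for i40e. The following is a simplified sketch of that kind of per-pool filtering. The `device` type and `selectForPool` function are hypothetical and only mirror the checks visible in the log; they are not the plugin's implementation.

```go
package main

import "fmt"

// device is a hypothetical, simplified view of a host NIC.
type device struct {
	Name   string
	Driver string
}

// selectForPool returns the devices that match driver and are not in exclude,
// along with a reason for every device that was skipped.
func selectForPool(devs []device, driver string, exclude []string) (picked []string, skipped map[string]string) {
	excluded := make(map[string]bool, len(exclude))
	for _, name := range exclude {
		excluded[name] = true
	}
	skipped = make(map[string]string)
	for _, d := range devs {
		switch {
		case d.Driver != driver:
			skipped[d.Name] = "wrong driver type: " + d.Driver
		case excluded[d.Name]:
			skipped[d.Name] = "excluded device"
		default:
			picked = append(picked, d.Name)
		}
	}
	return picked, skipped
}

func main() {
	// Host devices left after the global filtering step in the log above.
	host := []device{{"ens1f0", "igb"}, {"ens1f1", "igb"}}
	picked, skipped := selectForPool(host, "i40e", []string{"eno2"})
	fmt.Println("picked:", picked)
	for name, why := range skipped {
		fmt.Println(name, "skipped:", why)
	}
}
```

With the inputs taken from the log, nothing is picked for the pool, which matches the "Exit discovery." lines that follow without any "added to pool" message.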

Simplify UDS Server

#81 creates a UDS per device attached to the pod. The UDS Server was originally designed to handle multiple devices. There may be an opportunity for us to go in here and simplify the code, avoid some unnecessary loops, etc.

working of AF_XDP plugin

I am trying to understand how the AF_XDP plugin for K8s works. Please correct me if I am wrong.

In Primary mode

Packets Flow:

packets -> primary device(physical NIC) -> NIC Driver(applies the XDP program at the hook) -> AF_XDP socket(outside the pod) -> pod

Components Role:

  1. The AF_XDP plugin is responsible for creating AF_XDP sockets for each pod.
  2. The AF_XDP CNI is responsible for configuring the network interfaces of the pods.

In CDQ mode

Packets Flow:

packets -> primary device(physical NIC) -> NIC Driver(applies the XDP program at the hook) -> subfunction(resides outside the pod but in the userspace) -> AF_XDP socket(outside the pod) -> pod

Components Role:

  1. The NIC driver creates the subfunction and assigns it to the AF_XDP CNI to manage.
  2. The AF_XDP plugin creates the AF_XDP socket in userspace, outside the pod. It commands the driver to create a new subfunction (outside the pod) based on requirements.
  3. The AF_XDP CNI is responsible for assigning subfunctions to the respective pods.

Note: I have seen the AF_XDP high-level architecture image, but I am not sure how the subfunction and the AF_XDP socket are implemented inside the pod. Please tell me whether the implementation matches that image.

Thank you in advance!

Which device drivers are supported by cdq mode?

I tried the CDQ mode and got the following log: the i40e driver does not support CDQ mode.

ERRO[2023-06-19 10:53:09] [config.go:293] [getSecondaryDevices] Error assigning subfunctions from device ens259f1: Device has an incompatible driver, i40e does not support CDQ

Is there a list of device drivers supported by CDQ mode?

Question: Will The CNDP Interface Be Visible In The Kernel Space

Hi,

I have some questions regarding the behavior of CNDP-based interfaces.

Once the CNDP-based interface is injected into the pod:

  • Will the interface be visible in kernel space, i.e. to normal Linux tools such as ifconfig/iproute2?
  • Is there a strict need for additional software or a library to be installed before the interface can be used?

use our own xdp program

Hi, I have a question, and I'm a newbie here.
I have a program, visible with bpftool prog list, that I want to attach to a specific interface.

I want to know whether, by using a network attachment definition or some other mechanism, I can reference the BPF program (or binary file) in the manifest file so that it is attached to the interface.

CNI doesn't find link after reboot

After rebooting my Kubernetes nodes with the AF_XDP CNI, the pod doesn't work anymore.

After the reboot, while Multus is adding the network, I get this error:

 [...] error adding container to network "afxdp-network": cmdAdd(): failed to find device: Link not found

This node was running fine with AF_XDP before the reboot, for 5 days.

My configuration is as follows. (Note that for the moment I'm just doing some tests; this is not a production node.)

daemon.set

apiVersion: v1
kind: ConfigMap
metadata:
  name: afxdp-dp-config
  namespace: kube-system
data:
  config.json: |
    {
       "logLevel":"debug",
       "logFile":"afxdp-dp.log",
       "pools":[
          {
             "name":"myPool",
             "UdsTimeout":-1,
             "mode":"primary",
             "devices":[
               {
                   "name":"enp2s0"
               }
             ],
             "drivers":[
                {
                 "name":"virtio_net",
                 "ExcludeDevices":[
                    {
                       "name":"enp1s0"
                    }
                  ]
                }
             ]
          }
       ]
    }
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: afxdp-device-plugin
  namespace: kube-system
---
apiVersion: apps/v1
kind: DaemonSet
metadata:
[...etc, this is the classic one]

My network attachment definition

apiVersion: "k8s.cni.cncf.io/v1"
kind: NetworkAttachmentDefinition
metadata:
  name: afxdp-network
  annotations:
    k8s.v1.cni.cncf.io/resourceName: afxdp/myPool
spec:
  config: '{
      "cniVersion": "0.3.0",
      "type": "afxdp",
      "mode": "primary",
      "logFile": "afxdp-cni.log",
      "logLevel": "debug",
      "ipam": {
        "type": "whereabouts",
        "range": "10.99.99.0/24"
      }
  }'

The deployment of the pod

apiVersion: apps/v1
kind: Deployment
metadata:
  name: afxdp-deployment
spec:
  replicas: 1
  selector:
    matchLabels:
      app: afxdp
  template:
    metadata:
      labels:
        app: afxdp
      annotations:
        k8s.v1.cni.cncf.io/networks: kube-system/afxdp-network
    spec:
      containers:
      - name: afxdptest
        image: travelping/nettools
        imagePullPolicy: IfNotPresent
        command: ["tail", "-f", "/dev/null"]
        resources:
          requests:
            afxdp/myPool: '1'
          limits:
            afxdp/myPool: '1'
        securityContext:
          privileged: true

I've tried removing Multus, Whereabouts and all the files related to AF_XDP. It still doesn't work.

Note that libbpf is installed on my machine, and sysctl is configured permanently with the following:

sysctl kernel.unprivileged_bpf_disabled=0
sysctl net.core.bpf_jit_enable=1

Am I missing something in the use of this CNI?

Thanks for your help!
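
One thing that can be checked from the host after the reboot is whether the interface names used in the config are still visible there. The "Link not found" message suggests a lookup by name failed, though the cause is not established in this report. A small sketch using Go's standard `net.InterfaceByName` (the `linkExists` helper is a hypothetical name; the device names are the ones from the ConfigMap above):

```go
package main

import (
	"fmt"
	"net"
)

// linkExists reports whether an interface with this name is visible in the
// network namespace of the calling process.
func linkExists(name string) bool {
	_, err := net.InterfaceByName(name)
	return err == nil
}

func main() {
	for _, name := range []string{"enp2s0", "enp1s0"} {
		fmt.Printf("%s present: %v\n", name, linkExists(name))
	}
}
```

If enp2s0 is reported as absent on the host, the device may have been renamed at boot or may still sit in another network namespace. Either would need to be confirmed separately.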

Pod Requiring Privilege in order to run cndpfwd application in Kubernetes.

[image]

I was trying out the afxdp-plugin with CNDP to deploy a sample application in Kubernetes. I got the following error inside the pod:
[image]

When I add NET_ADMIN and SYS_ADMIN it works without any issue, but I thought the pod did not require any privileges. Can you please help me out here?
[image]

These are the yaml files I have used.

POD.YAML

apiVersion: v1
kind: Pod
metadata:
  name: cndp-0-0
  annotations:
    k8s.v1.cni.cncf.io/networks: cndp-cni-afxdp0
spec:
  volumes:
  - name: shared-data
    emptyDir: {}
  - name: unixsock
    hostPath:
      path: /tmp/afxdp_dp/
  containers:
    - name: cndp-0
      command: 
      - sleep
      - inf
      image: cndp
      imagePullPolicy: Never
      securityContext:
        capabilities:
          add:
            - NET_RAW
            - IPC_LOCK
            - NET_ADMIN
            - SYS_ADMIN
      ports:
      - containerPort: 8094
        hostPort: 8094
      resources:
        requests:
          afxdp/pool1: '1'
        limits:
          afxdp/pool1: '1'
          hugepages-2Mi: 512Mi
          memory: 2Gi
      volumeMounts:
        - name: shared-data
          mountPath: /var/run/cndp/
        - name: unixsock
          mountPath: /tmp/afxdp_dp/

NAD.YAML

apiVersion: "k8s.cni.cncf.io/v1"
kind: NetworkAttachmentDefinition
metadata:
  name: cndp-cni-afxdp0
  annotations:
    k8s.v1.cni.cncf.io/resourceName: afxdp/pool1
spec:
  config: '{
      "cniVersion": "0.3.0",
      "type": "afxdp",
      "mode": "primary",
      "queues": "1",
      "logLevel": "debug",
      "ipam": {
        "type": "host-local",
        "subnet": "192.168.1.0/24",
        "rangeStart": "192.168.1.200",
        "rangeEnd": "192.168.1.216",
        "routes": [
          { "dst": "0.0.0.0/0" }
        ],
        "gateway": "192.168.1.1"
      }
    }'

DAEMONSET.YAML

apiVersion: v1
kind: ConfigMap
metadata:
  name: afxdp-dp-config
  namespace: kube-system
data:
  config.json: |
    {
      "clusterType": "physical",
      "mode": "primary",
      "logLevel": "debug",
      "pools":[
          {
             "name":"pool1",
             "mode":"primary",
              "udsTimeout":300,
             "drivers":[
                {
                   "name":"i40e"
                },
                {
                   "name":"ice"
                }
             ]
          }
       ]
    }
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: afxdp-device-plugin
  namespace: kube-system
---
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: kube-afxdp-device-plugin
  namespace: kube-system
  labels:
    tier: node
    app: afxdp
spec:
  selector:
    matchLabels:
      name: afxdp-device-plugin
  template:
    metadata:
      labels:
        name: afxdp-device-plugin
        tier: node
        app: afxdp
    spec:
      hostNetwork: true
      nodeSelector:
        kubernetes.io/arch: amd64
      tolerations:
        - key: node-role.kubernetes.io/master
          operator: Exists
          effect: NoSchedule
      serviceAccountName: afxdp-device-plugin
      containers:
        - name: kube-afxdp
          image: intel/afxdp-plugins-for-kubernetes:latest
          imagePullPolicy: IfNotPresent
          securityContext:
            capabilities:
              drop:
                - all
              add:
                - SYS_ADMIN
                - NET_ADMIN
          resources:
            requests:
              cpu: "250m"
              memory: "40Mi"
            limits:
              cpu: "1"
              memory: "200Mi"
          volumeMounts:
            - name: unixsock
              mountPath: /tmp/afxdp_dp/
            - name: bpfmappinning
              mountPath: /var/run/afxdp_dp/
            - name: devicesock
              mountPath: /var/lib/kubelet/device-plugins/
            - name: resources
              mountPath: /var/lib/kubelet/pod-resources/
            - name: config-volume
              mountPath: /afxdp/config
            - name: log
              mountPath: /var/log/afxdp-k8s-plugins/
            - name: cnibin
              mountPath: /opt/cni/bin/
      volumes:
        - name: unixsock
          hostPath:
            path: /tmp/afxdp_dp/
        - name: bpfmappinning
          hostPath:
            path: /var/run/afxdp_dp/
        - name: devicesock
          hostPath:
            path: /var/lib/kubelet/device-plugins/
        - name: resources
          hostPath:
            path: /var/lib/kubelet/pod-resources/
        - name: config-volume
          configMap:
            name: afxdp-dp-config
            items:
              - key: config.json
                path: config.json
        - name: log
          hostPath:
            path: /var/log/afxdp-k8s-plugins/
        - name: cnibin
          hostPath:
            path: /opt/cni/bin/

[newbie] UDS client or API specification?

image

I've browsed a little bit through the repo source code and I've seen the internal udsserver and uds packages. The uds package seems to be the part I've marked in the above illustration, taken from the documentation in this repository.

What I don't understand yet, and haven't found any explicit documentation or examples for (if I'm not mistaken), is how to use this in my container workload: how do I integrate the UDS client into my own workload code?
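
As a starting point, the transport side of such a client can be sketched with the Go standard library alone. The device plugin logs elsewhere on this page show the sockets being created as `unixpacket` sockets under /tmp/afxdp_dp/<pool>/, and the example pod specs mount that host path into the container. The sketch below only shows dialing a `unixpacket` socket and exchanging one message. The `request`, `echoOnce` and `demo` names, the message content, and the echo server are all assumptions made for the demo; the actual messages the plugin expects are not covered here and would need to be taken from the repository's `uds` package.

```go
package main

import (
	"fmt"
	"net"
	"os"
	"path/filepath"
)

// request dials a unixpacket (SOCK_SEQPACKET) socket, sends one message and
// returns the single reply. The message contents are up to the caller.
func request(sockPath, msg string) (string, error) {
	conn, err := net.DialUnix("unixpacket", nil, &net.UnixAddr{Name: sockPath, Net: "unixpacket"})
	if err != nil {
		return "", err
	}
	defer conn.Close()

	if _, err := conn.Write([]byte(msg)); err != nil {
		return "", err
	}
	buf := make([]byte, 1024)
	n, err := conn.Read(buf)
	if err != nil {
		return "", err
	}
	return string(buf[:n]), nil
}

// echoOnce is a stand-in server for the demo only: it accepts one client and
// echoes one message back. It is not the device plugin's UDS server.
func echoOnce(l *net.UnixListener) {
	conn, err := l.AcceptUnix()
	if err != nil {
		return
	}
	defer conn.Close()
	buf := make([]byte, 1024)
	if n, err := conn.Read(buf); err == nil {
		conn.Write(buf[:n])
	}
}

// demo starts the stand-in server on a temporary socket and sends it one message.
func demo(msg string) (string, error) {
	dir, err := os.MkdirTemp("", "uds-client")
	if err != nil {
		return "", err
	}
	defer os.RemoveAll(dir)

	path := filepath.Join(dir, "demo.sock")
	l, err := net.ListenUnix("unixpacket", &net.UnixAddr{Name: path, Net: "unixpacket"})
	if err != nil {
		return "", err
	}
	defer l.Close()
	go echoOnce(l)

	return request(path, msg)
}

func main() {
	reply, err := demo("hello")
	fmt.Println(reply, err)
}
```

In a real workload, `request` would be pointed at the socket path mounted into the container instead of the temporary demo socket.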

Cannot allocate multiple devices to pod

I was working with afxdp-plugins-for-kubernetes.
For my use case I needed to attach 2 devices to the pod.

I used the following pod spec:

apiVersion: v1
kind: Pod
metadata:
  name: afxdp-pod
  annotations:
    k8s.v1.cni.cncf.io/networks: cndp-cni-afxdp0
spec:
  containers:
  - name: afxdp1
    image: ubuntu:latest
    command: ["tail", "-f", "/dev/null"]
    resources:
      requests:
        afxdp/mypool: '2'
      limits:
        afxdp/mypool: '2'

I expected that 2 devices would be available for use in the pod. But inside the container, I could only see one device.
Events:
  Type    Reason          Age   From               Message
  ----    ------          ----  ----               -------
  Normal  Scheduled       13s   default-scheduler  Successfully assigned default/afxdp-pod to turing-04
  Normal  AddedInterface  10s   multus             Add eth0 [10.244.0.11/24] from cbr0
  Normal  AddedInterface  10s   multus             Add net1 [192.168.1.204/24] from default/cndp-cni-afxdp0
  Normal  Pulling         9s    kubelet            Pulling image "ubuntu:latest"
  Normal  Pulled          7s    kubelet            Successfully pulled image "ubuntu:latest" in 2.386145893s (2.386155184s including waiting)
  Normal  Created         7s    kubelet            Created container afxdp1
  Normal  Started         7s    kubelet            Started container afxdp1
This is the kubectl describe output for the pod.

root@cndp-0-0:~# ip link
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN mode DEFAULT group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
2: eth0@if145: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1450 qdisc noqueue state UP mode DEFAULT group default
    link/ether ce:0f:94:1a:d2:fc brd ff:ff:ff:ff:ff:ff link-netnsid 0
6: ens259f0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 xdp qdisc mq state UP mode DEFAULT group default qlen 1000
    link/ether 3c:fd:fe:9e:7b:5c brd ff:ff:ff:ff:ff:ff
    prog/xdp id 325 tag 992d9ddc835e5629 jited

Running ip link inside the container shows only one device.

The afxdp-plugin logs show that 3 devices were allocated and added to the container:
DEBU[2023-07-17 06:37:47] [poolManager.go:233] [Allocate] Container environment variables: {
  "AFXDP_DEVICES": "ens259f1 ens259f0 enp7s0f0"
}

The BPF program is loaded on both interfaces, but the other device is still in the host namespace.
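
To make the mismatch easy to reproduce, the allocated list can be compared with what is actually visible inside the container. The sketch below assumes the `AFXDP_DEVICES` value shown in the plugin log is set in the container's environment and is space-separated, as it appears there. The `parseDevices` and `missing` helpers are hypothetical names, not part of the plugin.

```go
package main

import (
	"fmt"
	"net"
	"os"
	"strings"
)

// parseDevices splits the space-separated value seen in the plugin log.
func parseDevices(v string) []string { return strings.Fields(v) }

// missing returns the names for which present reports false.
func missing(names []string, present func(string) bool) []string {
	var out []string
	for _, n := range names {
		if !present(n) {
			out = append(out, n)
		}
	}
	return out
}

func main() {
	names := parseDevices(os.Getenv("AFXDP_DEVICES"))
	// inNamespace checks the network namespace this process runs in.
	inNamespace := func(name string) bool {
		_, err := net.InterfaceByName(name)
		return err == nil
	}
	fmt.Println("allocated:", names)
	fmt.Println("not visible here:", missing(names, inNamespace))
}
```

Run inside the container, this would list the allocated devices that were not moved into the pod's network namespace.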
