Comments (7)

lifupan commented on June 2, 2024

First of all, is your network device for the container or for the sandbox? If it is for the sandbox, how is the device passed to kata? A network device should, in theory, be placed in the pod's netns by the CNI, together with its network address. If it is for the container, is it passed through the devices list of the OCI spec? And if it is passed that way, who is responsible for setting its network address?

from kata-containers.

l8huang commented on June 2, 2024

The VFIO network device is for the k8s Pod; the ovn-k8s CNI is supposed to configure the network interfaces through the Sandbox Hotplug API.

Isn't the Sandbox Hotplug API meant for this kind of use case?

lifupan commented on June 2, 2024

> The VFIO network device is for the k8s Pod; the ovn-k8s CNI is supposed to configure the network interfaces through the Sandbox Hotplug API.
>
> Isn't the Sandbox Hotplug API meant for this kind of use case?

Hi @l8huang

Yes, the sandbox hotplug API is used for device hotplug, but kata does not support network device hotplug yet. For other devices such as block or GPU devices, kata only needs the device type to choose the proper operation in the guest. But for network devices, kata not only needs to obtain the type of device, but also needs to know the network address and other information of the device. Therefore, kata currently only goes to the pod netns to obtain all network endpoints when starting a pod, and the type of each obtained network endpoint determines how the network device is added to kata.

Regarding your question, what I want to confirm is how the VFIO network device here is passed to kata runtime? Only by knowing how your network device information is transmitted can we decide how to hot-plug your network device.

lifupan commented on June 2, 2024

@l8huang BTW, do you mean that the network device info is passed to the kata runtime through ENV, as described in #9605?

l8huang commented on June 2, 2024

Thanks for your reply.

> how the VFIO network device here is passed to kata runtime?

Long story short, the VFIO network device info is in the OCI spec config.json when Sandbox.CreateContainer() is called, e.g.:

  "linux": {
    "devices": [
      {
        "path": "/dev/vfio/vfio",
        "type": "c",
        "major": 10,
        "minor": 196,
        "fileMode": 438,
        "uid": 0,
        "gid": 0
      },
      {
        "path": "/dev/vfio/130",
        "type": "c",
        "major": 238,
        "minor": 3,
        "fileMode": 384,
        "uid": 0,
        "gid": 0
      }
    ],

Then GetAllVFIODevicesFromIOMMUGroup() is called to get VFIODev when attaching the device. The env PCIDEVICE_<prefix>_<resource-name>_INFO in #9605 is not relevant here.
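
To make the lookup concrete, here is a minimal Python sketch of picking the VFIO group nodes out of the linux.devices list of an OCI spec. It is only an illustration, not kata's code: the helper name vfio_group_devices is made up, and it assumes group nodes are character devices named /dev/vfio/<number>, with /dev/vfio/vfio being the shared container node.

```python
import json
import posixpath


def vfio_group_devices(oci_spec: dict) -> list:
    """Return the /dev/vfio/<group> paths listed under linux.devices."""
    devices = oci_spec.get("linux", {}).get("devices", [])
    groups = []
    for dev in devices:
        path = dev.get("path", "")
        parent, name = posixpath.split(path)
        # /dev/vfio/vfio is the shared VFIO container node, not an IOMMU
        # group, so only numeric names are kept (an assumption of this sketch).
        if dev.get("type") == "c" and parent == "/dev/vfio" and name.isdigit():
            groups.append(path)
    return groups


# Trimmed version of the config.json excerpt above.
spec = json.loads("""
{"linux": {"devices": [
  {"path": "/dev/vfio/vfio", "type": "c", "major": 10, "minor": 196},
  {"path": "/dev/vfio/130", "type": "c", "major": 238, "minor": 3}
]}}
""")
print(vfio_group_devices(spec))  # ['/dev/vfio/130']
```

Each returned path identifies one IOMMU group; resolving the group to its PCI devices is the step GetAllVFIODevicesFromIOMMUGroup() performs and is not reproduced here.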

> But for network devices, kata not only needs to obtain the type of device, but also needs to know the network address and other information of the device.

The network address and route config for the VFIO interface are in the Pod's annotations:

    k8s.ovn.org/pod-networks: '{
      "default":{
        "ip_addresses":["10.1.2.62/26"],
        "mac_address":"0a:58:0a:c0:02:3e",
        "gateway_ips":["10.1.2.1"],
        "routes":[
          {"dest":"10.1.0.0/16","nextHop":"10.1.2.1"},
          {"dest":"10.2.0.0/16","nextHop":"10.1.2.1"},
          {"dest":"10.3.0.0/16","nextHop":"10.1.2.1"}],
        "mtu":"1500",
        "ip_address":"10.1.2.62/26",
        "gateway_ip":"10.1.2.1"}}'

    k8s.v1.cni.cncf.io/network-status: |-
      [{
          "name": "default/ovn-primary-vfio",
          "interface": "eth0",
          "ips": [
              "10.1.2.62"
          ],
          "mac": "0a:0a:0a:c0:02:3e",
          "default": true,
          "dns": {},
          "device-info": {
              "type": "pci",
              "version": "1.0.0",
              "pci": {
                  "pci-address": "0000:84:02.1"
              }
          }
      }]

If the Sandbox Hotplug API is no longer exposed to external controllers, then the kata runtime needs to interpret the above annotations and invoke the corresponding APIs to configure the network in the guest VM. In this case, the problem becomes determining which annotations the kata runtime should look at to gather the network configuration (k8s.ovn.org/pod-networks + k8s.v1.cni.cncf.io/network-status in the case of ovn-kubernetes).
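
For illustration, interpreting the two annotations could look roughly like the Python sketch below. It is not kata code: guest_network_config is a hypothetical helper, and pairing the "default" entry of k8s.ovn.org/pod-networks with the network-status entry flagged "default": true is an assumption about how the two annotations relate.

```python
import json


def guest_network_config(annotations: dict, network: str = "default") -> dict:
    """Merge addressing info and PCI address for the pod's primary interface."""
    pod_networks = json.loads(annotations["k8s.ovn.org/pod-networks"])
    status = json.loads(annotations["k8s.v1.cni.cncf.io/network-status"])
    net = pod_networks[network]
    # Assumption: the network-status entry marked as default describes
    # the same interface as the chosen pod-networks entry.
    primary = next((s for s in status if s.get("default")), None)
    if primary is None:
        raise ValueError("no default interface in network-status")
    pci = primary.get("device-info", {}).get("pci", {}).get("pci-address")
    return {
        "interface": primary["interface"],
        "pci_address": pci,
        "mac": net["mac_address"],
        "addresses": net["ip_addresses"],
        "mtu": int(net["mtu"]),
        "routes": [(r["dest"], r["nextHop"]) for r in net.get("routes", [])],
    }


# Values trimmed from the annotations shown above.
annotations = {
    "k8s.ovn.org/pod-networks": json.dumps({
        "default": {
            "ip_addresses": ["10.1.2.62/26"],
            "mac_address": "0a:58:0a:c0:02:3e",
            "gateway_ips": ["10.1.2.1"],
            "routes": [{"dest": "10.1.0.0/16", "nextHop": "10.1.2.1"}],
            "mtu": "1500",
        }
    }),
    "k8s.v1.cni.cncf.io/network-status": json.dumps([{
        "name": "default/ovn-primary-vfio",
        "interface": "eth0",
        "default": True,
        "device-info": {"type": "pci", "pci": {"pci-address": "0000:84:02.1"}},
    }]),
}
cfg = guest_network_config(annotations)
print(cfg["interface"], cfg["pci_address"], cfg["addresses"])
```

The resulting dictionary holds what the runtime would need to configure the interface in the guest; which runtime or agent API consumes it is left open here.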

Another way is to expose the Sandbox Hotplug API, so that the CNI can call it to configure the network in the guest VM.

Please kindly let me know how to proceed, thanks.

lifupan commented on June 2, 2024

Hi @l8huang

I roughly understand your needs.

The CNI currently supported by Kata creates the network during the create sandbox stage, and sets the network to pod netns. Then, kata scans network devices from netns when creating pause container, and cold-plugs the scanned network devices into the hypervisor.

What I need to confirm is this: are the VFIO network device and the corresponding network address and other information from your CNI all generated when the business container is created, rather than when the sandbox is created?

l8huang commented on June 2, 2024

Indeed, the allocation of the VFIO network device (done by kubelet and the sriov network device plugin) and of the corresponding network address (done by the CNI) occurs before containerd creates the sandbox. But the VFIO device info only appears in the first container's OCI spec, when that container is being created.

We have a mutation webhook that checks the VFIO device resource requirement based on a Pod's network config. For example, the annotation below sets the Pod's primary network interface to a VFIO device:

    v1.multus-cni.io/default-network: default/ovn-primary-vfio

The mutation webhook parses the annotation and patches the first container's resources as below:

    resources:
      limits:
        nvidia.com/bf2_vfio: "1" 
      requests:
        nvidia.com/bf2_vfio: "1"

There are no resource settings at the Pod level, so the resources of the first container are patched instead, even though the VFIO network device is intended for the Pod's network interface.
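
As an illustration only (this is not the webhook's real code), the patch computation could be sketched in Python as follows. The VFIO_RESOURCES mapping from network name to device-plugin resource and the helper name vfio_patch are assumptions made for the example.

```python
# Hypothetical mapping from a default-network name to the device-plugin
# resource that backs it; a real webhook would obtain this elsewhere.
VFIO_RESOURCES = {"default/ovn-primary-vfio": "nvidia.com/bf2_vfio"}
ANNOTATION = "v1.multus-cni.io/default-network"


def vfio_patch(pod: dict) -> list:
    """Build JSON Patch ops requesting one VFIO device on the first container."""
    network = pod.get("metadata", {}).get("annotations", {}).get(ANNOTATION)
    resource = VFIO_RESOURCES.get(network)
    if resource is None:
        return []  # the pod does not ask for a VFIO-backed primary network
    resources = pod["spec"]["containers"][0].get("resources", {})
    merged = dict(resources)
    for section in ("limits", "requests"):
        # Keep any existing limits/requests and add the VFIO resource.
        merged[section] = {**resources.get(section, {}), resource: "1"}
    # In JSON Patch, "add" on an existing object member replaces its value,
    # so one op covers containers with and without a resources field.
    return [{"op": "add", "path": "/spec/containers/0/resources", "value": merged}]


pod = {
    "metadata": {"annotations": {ANNOTATION: "default/ovn-primary-vfio"}},
    "spec": {"containers": [{"name": "app", "resources": {"limits": {"cpu": "2"}}}]},
}
patch = vfio_patch(pod)
print(patch[0]["value"])
```

Only the first container is touched, which mirrors the behaviour described above: the device is requested through a container's resources because the Pod itself has no resource field.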
