avnu / detd

Proof-of-concept for a developer-friendly system service to handle time-sensitive applications.

License: BSD 3-Clause "New" or "Revised" License

Python 94.07% Shell 5.51% Dockerfile 0.41%

detd developer-experience time-sensitive time-sensitive-networking tsn

detd's People

puminski koalo jeppex1 nilsalon100 junyang0412 kamber-intel

detd's Issues

Unify talker_manager and listener_manager within InterfaceManager

Manager's add_talker and add_listener create their own InterfaceManager instances. We should have a single InterfaceManager instance that encapsulates the Tx and Rx resources needed on both directions.

For example, the following items should be considered as "per interface settings" that the unified InterfaceManager would need to control:

VLAN interfaces
Sockets
Mappings (in some cases Rx and Tx mappings might differ, we would handle that once it occurs)
ethtool channels configuration
. . .

Add remove_listener method

remove_listener should:

Remove the listener stream from the configuration
Free up any resources being used, like filters
Etc

Add helper script to build the deb package for Debian Bookworm

The script would:

Trigger a containerized build based on the Bookworm Dockerfile
When the build is complete, move the generated deb to the host filesystem
Provide information to the user about where to find the generated deb package

Add basic test scripts for real hardware

Add a test script that runs a sequence of talker stream configurations on target. This will help to expose issues specific to device capabilities.

The script will take as argument the name of a network interface.

It will then run the following sequence, handling the responses:

Add a stream at offset 0
Add a stream at offset 250us
Add a stream at offset 500us

For the remaining values like VID, it is fine to re-use the same as in the regular examples for the time being.
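
A minimal sketch of how the sequence above could be driven. The `add_stream` callable is a hypothetical stand-in for whatever detd call the script ends up using to add a talker; its signature (interface name, offset in nanoseconds, returning a truthy value on success) is an assumption made only for this illustration.

```python
import sys
from typing import Callable, Iterable, List, Tuple

# Offsets from the ticket, expressed in nanoseconds (0, 250 us, 500 us).
OFFSETS_NS = (0, 250_000, 500_000)


def run_sequence(interface: str,
                 add_stream: Callable[[str, int], bool],
                 offsets_ns: Iterable[int] = OFFSETS_NS) -> List[Tuple[int, bool]]:
    """Request one talker stream per offset and record whether each succeeded.

    add_stream is a hypothetical callable (not part of detd) that wraps the
    real "add talker" request for the interface under test.
    """
    results = []
    for offset in offsets_ns:
        try:
            ok = bool(add_stream(interface, offset))
        except Exception as exc:
            # A device-specific failure is a test result, not a reason to abort.
            print(f"offset {offset} ns on {interface}: {exc}", file=sys.stderr)
            ok = False
        results.append((offset, ok))
    return results
```

Collecting the results per offset, instead of stopping at the first failure, makes it easier to see which configurations a given device rejects.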

Unify StreamQos messages

After the merge of listener support, we should refactor the protobuf stream request messages.

More details to come in this ticket once the PR is merged. The key ideas are:

We can reuse almost all the info for talker and listener in the same request, with a field to discriminate the type
Most fields can be used without further change. But txmin and txmax may be renamed, e.g. as offsetmin and offsetmax, so they would fit on Rx as well.

Add Alder Lake S integrated MAC support

Some Alder Lake SKUs include integrated TSN MACs.

The EHL plugin can be used as a reference: https://github.com/Avnu/detd/blob/30aeb789903ae66424d0f1c3ce87406d863a0467/detd/devices/intel_mgbeehl.py

Some suggested changes:

Rename as intel_mgbeadls.py
Rename class and references as IntelMgbeAdlS, Alder Lake S, etc
Use the device IDs as in the Linux driver, e.g. in the dwmac-intel.c file: https://github.com/torvalds/linux/blob/f016f7547aeedefed9450499d002ba983b8fce15/drivers/net/ethernet/stmicro/stmmac/dwmac-intel.c#L1198 and https://github.com/torvalds/linux/blob/f016f7547aeedefed9450499d002ba983b8fce15/drivers/net/ethernet/stmicro/stmmac/dwmac-intel.c#L1199
Change the number of Tx and Rx queues to 4 each

That should be enough for a first round of testing.

We are leaving out aspects like launch time control or ingress optimization until the listener stream support and launch time control support are merged.

Null bytes in protobuf messages sent via unix datagram socket

Hi,
I just noticed that when the service sends a StreamQosResponse with ok = False, an empty packet is received on the receiver side.
The reason for that is that protobuf messages are allowed to contain null bytes (and a False boolean is encoded as a null byte \0).
Unfortunately, that is also the terminator of a datagram. In the current implementation that probably went unnoticed, because if ok = False the subsequent fields are ignored anyway. However, I still think we should avoid sending raw protobuf messages via unix datagram sockets.

I am happy to provide a fix, but since there are many different options to solve this, I would appreciate your opinion on this topic:

We could further encode the protobuf message to prevent null bytes (e.g. as Base64).
We could avoid using datagram sockets and use stream sockets instead. We would then need a method for separating the messages in the stream again.
We could completely avoid protobuf and use another IPC method. One option that you already mentioned in the README would be to replace it with DBus.

What do you think?
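
As a rough illustration of the second option (stream sockets plus a way to separate messages again), one common approach is to prefix each message with its length. The sketch below is not taken from detd; the 4-byte big-endian header and the function names are assumptions. The payload is treated as opaque bytes, so it could be the output of a protobuf message's SerializeToString(), including an empty one.

```python
import socket
import struct

HEADER = struct.Struct("!I")  # assumed framing: 4-byte big-endian payload length


def send_msg(sock: socket.socket, payload: bytes) -> None:
    """Send one length-prefixed message over a stream socket."""
    sock.sendall(HEADER.pack(len(payload)) + payload)


def _recv_exact(sock: socket.socket, count: int) -> bytes:
    """Read exactly count bytes, looping because recv may return less."""
    data = bytearray()
    while len(data) < count:
        chunk = sock.recv(count - len(data))
        if not chunk:
            raise ConnectionError("peer closed the connection mid-message")
        data.extend(chunk)
    return bytes(data)


def recv_msg(sock: socket.socket) -> bytes:
    """Receive one length-prefixed message; an empty payload is valid."""
    (length,) = HEADER.unpack(_recv_exact(sock, HEADER.size))
    return _recv_exact(sock, length)
```

With this framing the receiver can tell a zero-length message apart from a closed connection, because the header is always present.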

Add Raptor Lake integrated MAC support

Some Raptor Lake SKUs include integrated TSN MACs.

The EHL plugin can be used as a reference: https://github.com/Avnu/detd/blob/30aeb789903ae66424d0f1c3ce87406d863a0467/detd/devices/intel_mgbeehl.py

Some suggested changes:

Rename as intel_mgberpl.py
Rename class and references as IntelMgbeRpl, Raptor Lake, etc
Use the device ID as in the Linux driver, e.g. in the dwmac-intel.c file: https://github.com/torvalds/linux/blob/f016f7547aeedefed9450499d002ba983b8fce15/drivers/net/ethernet/stmicro/stmmac/dwmac-intel.c#L1201
Change the number of Tx and Rx queues to 4 each

That should be enough for a first round of testing.

We are leaving out aspects like launch time control or ingress optimization until the listener stream support and launch time control support are merged.

Add instrumentation to run real-hardware experiments

Developing and testing QoS configurations involve at least two end-stations connected back-to-back, plus potentially a development host.

This ticket is about adding some helper tools based on ansible, to streamline testing on several target machines, by driving the tests from the development host. The only requirement on the end-stations is to have ansible and ssh available.

Listener streams

Implement listener streams

Clean-up resources when stopping the service

When the service is stopped, the interface should be returned to its original state, or at least to one close to it. E.g.:

All the streams and the resources allocated to them will be cleaned up
- Depends on #15
All the device tuning performed will be reverted (e.g. EEE will be re-enabled)
Any qdisc mappings, VLAN associations, etc, will be removed
Etc

Add deb package for Ubuntu 24.04

Replace call to ethtool by ioctl

The code currently executes ethtool commands with the Ethtool class:

detd/detd/ethtool.py

Line 23 in 30aeb78

class CommandEthtool:

In some cases, this leads to problems when two commands are issued consecutively. A workaround is to sleep one second, like in:

detd/detd/devices/intel_i225.py

Line 67 in 30aeb78

def get_rate(self, interface):

This is not optimal: the 1 second delay accounts for a big chunk of the execution time.

The code should use the SIOCETHTOOL ioctl to implement the ethtool operations instead. One suggested implementation could be:

Create ioctl.py with Ioctl wrapper class
Reimplement the required methods
Replace the commands in SystemInformation, e.g. for get_rate replace

detd/detd/systemconf.py

Line 349 in 30aeb78

def get_rate(self, interface):
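
A minimal sketch of what a get_rate based on SIOCETHTOOL could look like, assuming Linux and the legacy ETHTOOL_GSET command (newer kernels also offer ETHTOOL_GLINKSETTINGS, which is not shown here). The function names, and the choice to return the speed in Mb/s exactly as the kernel reports it, are assumptions for illustration and not the existing detd interface.

```python
import array
import fcntl
import socket
import struct

SIOCETHTOOL = 0x8946        # from <linux/sockios.h>
ETHTOOL_GSET = 0x00000001   # from <linux/ethtool.h>, legacy "get settings"
SPEED_UNKNOWN = 0xFFFFFFFF  # kernel reports -1 when the speed is unknown

# Layout of struct ethtool_cmd (44 bytes): cmd, supported, advertising,
# speed, duplex, port, phy_address, transceiver, autoneg, mdio_support,
# maxtxpkt, maxrxpkt, speed_hi, eth_tp_mdix, eth_tp_mdix_ctrl,
# lp_advertising, reserved[2]
ETHTOOL_CMD = struct.Struct("=IIIHBBBBBBIIHBBI2I")


def parse_speed(raw: bytes):
    """Return the link speed in Mb/s from a raw ethtool_cmd, or None if unknown."""
    fields = ETHTOOL_CMD.unpack(raw)
    speed = fields[3] | (fields[12] << 16)  # speed (low 16 bits) | speed_hi
    return None if speed == SPEED_UNKNOWN else speed


def get_rate(interface: str):
    """Query the link speed of interface through the SIOCETHTOOL ioctl."""
    buf = array.array("B", ETHTOOL_CMD.pack(ETHTOOL_GSET, *([0] * 17)))
    address, _ = buf.buffer_info()
    # struct ifreq: 16-byte interface name followed by a pointer to the command
    ifreq = struct.pack("16sP16x", interface.encode(), address)
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        fcntl.ioctl(sock.fileno(), SIOCETHTOOL, ifreq)
    return parse_speed(buf.tobytes())
```

Keeping the parsing separate from the ioctl call allows the struct handling to be unit tested without real hardware.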

Add backend hint to the configuration information

Related to this PR, specifically to decoupling the backend selection from the implementation:
https://github.com/Avnu/detd/pull/20/files#r1582042749

Please note that detd will still default to the most sensible (e.g. hardware accelerated) setup. This class is intended to allow overriding that configuration.

This class allows the customization of the specific implementation of the traffic specification and qos provided in the configuration. Based on the hint and the device capabilities, the initialization sequence will instantiate the specific classes to implement it. E.g. it may decide to use specific mappings, taprio modes, etc.

This ticket only contains the changes to provide the info and get that to the InterfaceManager for usage.

The processing of the hint to implement the specific spec/QoS will be described in a separated ticket.

Task 1: add new class Hints

We introduce a new class Hints. This class allows selecting how the traffic specifications have to be implemented. It just holds the following attributes, with the described possible values:

tx_selection: ENHANCEMENTS_FOR_SCHEDULED_TRAFFIC, STRICT_PRIORITY (mention in the docstring that EST is 802.1Qbv)
tx_selection_offload: true, false (mention in docstring that true means that a Hardware offload for the tx_selection mechanism must be used, as opposed to a software based one)
data_path: AF_PACKET, AF_XDP_ZC (mention in docstring that other options can be added for the data path, e.g. DPDK)
preemption: true, false
launch_time_control: true, false

We may add a class Hints in the scheduler module. This is a temporary location to enable the merge of the PR:
#20
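
One possible shape for the class described in Task 1, as a sketch only. The attribute names and allowed values follow the list above; modelling them as enums inside a frozen dataclass, and the enum class names TxSelection and DataPath, are assumptions.

```python
from dataclasses import dataclass
from enum import Enum


class TxSelection(Enum):
    # EST is the mechanism specified in IEEE 802.1Qbv
    ENHANCEMENTS_FOR_SCHEDULED_TRAFFIC = "EST"
    STRICT_PRIORITY = "SP"


class DataPath(Enum):
    # Other data paths (e.g. DPDK) could be added later
    AF_PACKET = "AF_PACKET"
    AF_XDP_ZC = "AF_XDP_ZC"


@dataclass(frozen=True)
class Hints:
    """Selects how the traffic specification should be implemented.

    tx_selection_offload: True means a hardware offload must be used for the
    tx_selection mechanism, as opposed to a software based one.
    """
    tx_selection: TxSelection
    tx_selection_offload: bool
    data_path: DataPath
    preemption: bool
    launch_time_control: bool


# Example values only
example = Hints(
    tx_selection=TxSelection.ENHANCEMENTS_FOR_SCHEDULED_TRAFFIC,
    tx_selection_offload=True,
    data_path=DataPath.AF_PACKET,
    preemption=False,
    launch_time_control=False,
)
```

A frozen dataclass gives value equality for free, which is convenient if two hints ever need to be compared.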

Task 2: provide a default Hints class per device plugin

Each device plugin will provide a default Hints instantiation that will define how the traffic spec / QoS will be implemented if no other Hints are provided by the caller.

It should be possible to calculate the default hints based on the device capabilities, so a method could be added to the device for that. But it should also be possible to override them. Other parameters, like AF_XDP_ZC or other data path implementations, would depend on the framework as well.

Plugin  TxSelection  TxSelectionOffload  DataPath   LaunchTimeControl  Preemption
I210    Qbv          false               AF_PACKET  false              false
I225    Qbv          true                AF_PACKET  false              false
EHL     Qbv          true                AF_PACKET  false              false

Task 3: provide a default parameter flow until InterfaceManager

If the Configuration constructor contains a hint, it will be assigned to Configuration.hints
Otherwise, the default value will be None

This hints value will be passed through directly, without further processing, across Proxy, Service, Manager and up to InterfaceManager. I.e. no change to the hints will be made on its way to InterfaceManager.

The InterfaceManager will have an attribute hints, that will be initialized to the default hints for the device associated to the specific interface.

When a new talker or listener is added:

If hints is None, the hints associated to the InterfaceManager will be used (i.e. left unmodified)
If hints is not None and no other talker or listener has been added, the InterfaceManager hints will be updated to the user provided ones
If hints is not None, there are already talkers or listeners added, and the InterfaceManager hints differ from the ones provided by the user, the operation will fail, and the user will be informed via a logging message
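
The three rules above can be sketched as follows. The class and method names are hypothetical, hints are treated as any value that supports equality comparison, and raising an exception to signal the failure is an assumption: the ticket only states that the operation fails and a message is logged.

```python
import logging

logger = logging.getLogger(__name__)


class HintsConflict(Exception):
    """Raised when new hints contradict the ones already in use (assumption)."""


class InterfaceHints:
    """Hypothetical helper holding the hints state of one InterfaceManager."""

    def __init__(self, default_hints):
        self.hints = default_hints  # device default until a caller overrides it
        self.streams = 0            # talkers and listeners added so far

    def register_stream(self, hints=None):
        """Apply the rules for a new talker or listener and return the hints in use."""
        if hints is not None and hints != self.hints:
            if self.streams > 0:
                logger.error("Hints %r conflict with active hints %r",
                             hints, self.hints)
                raise HintsConflict("hints differ from the ones already in use")
            # No stream added yet: the user provided hints replace the default
            self.hints = hints
        self.streams += 1
        return self.hints
```

Passing hints equal to the ones already in use is accepted at any point, since nothing would change for the existing streams.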

Timestamping/Diagnostics mode

Diagnostics mode for performance measuring.
For timestamping we could do timestamp in talker -> send to listener, or timestamp in both talker and listener and compare. What do you think/do you have any other ideas?

Unify ansible-based target testing scripts

This PR introduces an ansible playbook, and some scripts to perform testing on target systems:

The code could be reduced by merging:

Python target testing scripts: e.g. have a single test_detd_target.py that accepts as first parameter "talker" or "listener".
Bash setup scripts: e.g. having a single test_txrx.sh that accepts as first parameter "talker" or "listener".

The above may also have the potential to reduce the size of the ansible playbook.

Add add_talkers() method

Currently, we offer an interface to add a single TimeAware stream:

detd/detd/ipc.proto

Line 4 in 1a327a1

message StreamQosRequest {

That is exposed via the InterfaceManager's add_talker method:
https://github.com/Avnu/detd/tree/master?tab=readme-ov-file#current-functionality

This ticket is about adding a new method that allows adding multiple talkers on the same interface simultaneously. E.g.:
add_talkers([config1, config2, ...])

Or with some additional changes:
add_talkers(interface, [config1, config2, ...])

This is especially beneficial when the use case involves multiple streams with a fixed configuration, but the device does not allow the streams to be updated on the fly.

Set VLAN interfaces up after creating them

Check that the same DMAC cannot be used for two streams on the same VLAN

Apply backend hint when configuring the device

On top of providing the hints (#21), this issue adds the actual configuration for the interface.

For the initial version, it is enough to use the device hints from the SystemConfigurator to select the right Qdisc variant. Initially, cover just:

Offload
Software

In the future, we may add the tx-assisted variant. Another future improvement may be to initialize different configurator objects in the InterfaceManager constructor depending on the backend to use, so there is no selection at runtime.

On top of using the right taprio backend, the mapping also needs to be customized in the InterfaceManager initialization. The reason is, software taprio allows more flexibility than actual hardware, like modifying the traffic classes, and the meaning of the mapping is also different. In order to implement the change:

Rename Mapping as MappingFixed
Rename MappingNaive as MappingFlexible
If the hint uses hardware Qbv, select the MappingFixed, otherwise the MappingFlexible
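
A sketch of the selection described above, under stated assumptions: the two mapping classes are empty placeholders for the renamed ones, QdiscVariant and select_backend are hypothetical names, and the enum values mirror the taprio qdisc flags (0x2 for full offload, no flags for the software mode).

```python
from enum import Enum


class MappingFixed:
    """Placeholder for the renamed Mapping class (hardware Qbv)."""


class MappingFlexible:
    """Placeholder for the renamed MappingNaive class (software taprio)."""


class QdiscVariant(Enum):
    SOFTWARE = 0x0  # taprio without flags
    OFFLOAD = 0x2   # taprio full offload flag


def select_backend(tx_selection_offload: bool):
    """Return the (qdisc variant, mapping class) pair for the given hint value."""
    if tx_selection_offload:
        return QdiscVariant.OFFLOAD, MappingFixed
    return QdiscVariant.SOFTWARE, MappingFlexible
```

A tx-assisted variant could later be added as a third enum member without changing the callers.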

Add i210 support

i210 support is missing and would be useful to add. I'm currently working on an implementation.

Add remove_talker method

remove_talker should:

Remove the talker stream from the configuration
Recompute the schedule
Modify the schedule if needed
Free up any resources being used