Comments (39)
What FPGA board and what host CPU? Can you show the full lspci output of the card? And is this TX, RX, or simultaneous? And did you use iperf or iperf3? (iperf, while actually being multithreaded, is not as efficient as iperf3). Corundum supports IP checksum offload for RX and TX and Toeplitz hashing for RSS. LSO support is not planned.
from corundum.
Also, what's the link partner, and have you done a test with a commercial 100G NIC with the same host system and link partner?
from corundum.
It's a U200 board with an Intel(R) Xeon(R) Platinum 8163 CPU. I only tried one-to-one, so I am not sure whether TX or RX is the limit. iperf was used for the test.
lspci output is as follows:
25:00.0 Ethernet controller: Device 1234:1001
Subsystem: Device 1234:1001
Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR+ FastB2B- DisINTx+
Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
Latency: 0, Cache Line Size: 32 bytes
Interrupt: pin ? routed to IRQ 282
NUMA node: 0
Region 0: Memory at 4bfff000000 (64-bit, prefetchable) [size=16M]
Capabilities: [40] Power Management version 3
Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot-,D3cold-)
Status: D0 NoSoftRst+ PME-Enable- DSel=0 DScale=0 PME-
Capabilities: [48] MSI: Enable+ Count=32/32 Maskable+ 64bit+
Address: 00000000fee00418 Data: 0000
Masking: 00000000 Pending: 00000000
Capabilities: [70] Express (v2) Endpoint, MSI 00
DevCap: MaxPayload 1024 bytes, PhantFunc 0, Latency L0s <64ns, L1 <1us
ExtTag+ AttnBtn- AttnInd- PwrInd- RBE+ FLReset- SlotPowerLimit 0.000W
DevCtl: Report errors: Correctable- Non-Fatal- Fatal- Unsupported-
RlxdOrd+ ExtTag+ PhantFunc- AuxPwr- NoSnoop+
MaxPayload 256 bytes, MaxReadReq 512 bytes
DevSta: CorrErr- UncorrErr- FatalErr- UnsuppReq- AuxPwr- TransPend-
LnkCap: Port #0, Speed 8GT/s, Width x16, ASPM not supported, Exit Latency L0s unlimited, L1 unlimited
ClockPM- Surprise- LLActRep- BwNot- ASPMOptComp+
LnkCtl: ASPM Disabled; RCB 64 bytes Disabled- CommClk+
ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
LnkSta: Speed 8GT/s, Width x16, TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
DevCap2: Completion Timeout: Range BC, TimeoutDis+, LTR-, OBFF Not Supported
DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis-, LTR-, OBFF Disabled
LnkCtl2: Target Link Speed: 8GT/s, EnterCompliance- SpeedDis-
Transmit Margin: Normal Operating Range, EnterModifiedCompliance- ComplianceSOS-
Compliance De-emphasis: -6dB
LnkSta2: Current De-emphasis Level: -6dB, EqualizationComplete+, EqualizationPhase1+
EqualizationPhase2+, EqualizationPhase3+, LinkEqualizationRequest-
Capabilities: [100 v1] Advanced Error Reporting
UESta: DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
UEMsk: DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
UESvrt: DLP+ SDES+ TLP- FCP+ CmpltTO- CmpltAbrt- UnxCmplt- RxOF+ MalfTLP+ ECRC- UnsupReq- ACSViol-
CESta: RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr-
CEMsk: RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr+
AERCap: First Error Pointer: 00, GenCap- CGenEn- ChkCap- ChkEn-
Capabilities: [1c0 v1] #19
Kernel driver in use: mqnic
from corundum.
That looks like a reasonably performant CPU, and it looks like the card is running at full gen 3 x 16 bandwidth. Max payload of 256 is also reasonable. Is this a dual-socket system with NUMA, or is there only one CPU? If you have two CPUs, then you need to use numactl to run iperf on the same node, which in this case is node 0 (as reported by lspci). Also, the machine hosting the link partner also plays a role - is that an identical machine?
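As a concrete sketch of the numactl suggestion (the PCI address 25:00.0 comes from the lspci output above; the iperf flags and the server address 192.168.0.2 are placeholders):

```shell
# Confirm which NUMA node owns the card (lspci already reported node 0)
cat /sys/bus/pci/devices/0000:25:00.0/numa_node

# Pin both CPU and memory allocation to node 0 when running iperf
numactl --cpunodebind=0 --membind=0 iperf -s                  # receiver side
numactl --cpunodebind=0 --membind=0 iperf -c 192.168.0.2 -P 8 # sender side
```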
from corundum.
There is only one CPU, and the two servers are identical. The performance of iperf3 is worse, only 8Gbps with default settings.
from corundum.
What's the link partner? And what kind of performance do you get with two commercial 100G NICs cross-connected in the same machines?
from corundum.
What does "link partner" mean? A server or switch, or something else? And how can I get the info of the link partner?
from corundum.
Link partner is whatever is on the other end of the QSFP28 cable that's plugged in to the FPGA board.
from corundum.
Ok, the two FPGA cards are connected through a P4 switch.
Also, it seems that RX is the bottleneck: in the 1-to-2 case, each iperf sender can get 26G, but in the 2-to-1 case, each iperf sender only gets 12G.
from corundum.
Ah, so you only have corundum running? Do you have any commercial 100G NICs, perhaps something from Intel or Mellanox? It would be good to do a quick sanity check to make sure you're seeing reasonable performance with a commercial NIC.
from corundum.
Also, IOMMU on or off? Have you tried with the IOMMU turned off?
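A quick way to check (a sketch; exact kernel messages vary by distribution and CPU vendor):

```shell
# Populated only when an IOMMU is active
ls /sys/class/iommu/

# Boot-time messages mention DMAR (Intel VT-d) or AMD-Vi when enabled
dmesg | grep -i -e DMAR -e IOMMU

# Check whether intel_iommu=off (or amd_iommu=off) was passed at boot;
# to disable it, add the option to GRUB_CMDLINE_LINUX and reboot
cat /proc/cmdline
```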
from corundum.
Hi, I replaced an AU200 card with a Mellanox CX6 NIC and found that when running iperf from the AU200 to the CX6, 16 connections achieve 75.8Gbps in total, but running iperf from the CX6 to the AU200, the total throughput is only 23.7Gbps. It seems that when Corundum is used as the receiver, the performance is low. Do you have any idea about the cause of the receiver bottleneck?
In addition, IOMMU is off.
from corundum.
I also noticed that RX perf. is lower than TX. For example, in a 10G variation on an AMD platform, single-thread TX perf. is 9.x Gbps and quite stable, but RX is about 7.x Gbps with swings.
Currently I have no idea what is happening; I should dig deeper.
from corundum.
Interesting that the RX performance is that low. This is with 1500 byte MTU frames? Have you tried 9KB just for comparison? And I suppose if the CX6 can receive at 75.8 in an identical machine, then the machine in question should definitely be capable of receiving at that rate. Can you check the interrupt stats to make sure that all of the interrupts from the card are being used? There are also a few things that you can put an ILA on to get some idea of where the bottleneck might be. Some of these should probably be brought out as performance counters of some sort, for others I'm not sure the best way to go about doing that.
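One quick way to check the interrupt distribution (a sketch; the label mqnic matches the kernel driver name reported by lspci above, but the exact string in /proc/interrupts may differ):

```shell
# Show the card's interrupt vectors and their per-CPU counts; with
# MSI Count=32/32, all 32 lines should accumulate counts under load
grep mqnic /proc/interrupts

# Watch the counters live while iperf is running
watch -n 1 "grep mqnic /proc/interrupts"
```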
Also, I will note that both the driver and the gateware can probably use some optimizations to improve performance. Variable-length descriptor support should help at least by supporting descriptor block reads, and there are definitely memory management improvements that can be made to the driver. Unfortunately, I am not an expert at kernel development; I'm hoping that there will be enough community interest in Corundum to turn up some potential contributors with more experience in that area who can help out.
from corundum.
I added a log before https://github.com/ucsdsysnet/corundum/blob/7c8abe261b2ec3e653da7bc881f769668a231bde/modules/mqnic/mqnic_rx.c#L296 to get info about cq_ring->head_ptr and cq_ring->tail_ptr. I found there are some bursts in head-tail, and only a few CQEs are in the CQ ring most of the time. More details are shown in the figure.
from corundum.
Another thing that you can try: increase the number of in-flight RX and TX operations like so:
parameter TX_DESC_TABLE_SIZE = 64;
parameter TX_PKT_TABLE_SIZE = 16;
parameter RX_DESC_TABLE_SIZE = 64;
parameter RX_PKT_TABLE_SIZE = 16;
And you are seeing multiple RX queues in use, correct? You just picked one for plotting?
from corundum.
Yes, multiple RX queues are used. The figure is plotted with data from an iperf run with only one connection from the CX6 to the U200. I also tested with more connections, and each queue is used normally. When I tested with two U200 cards, head-tail has smaller fluctuations than with one CX6 NIC and one U200 card.
from corundum.
> Another thing that you can try: increase the number of in-flight RX and TX operations like so:
> parameter TX_DESC_TABLE_SIZE = 64; parameter TX_PKT_TABLE_SIZE = 16; parameter RX_DESC_TABLE_SIZE = 64; parameter RX_PKT_TABLE_SIZE = 16;
> And you are seeing multiple RX queues in use, correct? You just picked one for plotting?
Hi, Alex. I tried setting the parameters to the values you mentioned above, but it achieved the same performance as the default parameters.
from corundum.
Hi, Alex,
Sometimes the one-to-one throughput can be 17Gbps, while it is 10Gbps most of the time. I added a log to get the cq index, rx queue head, and rx queue tail before line 296 of corundum/modules/mqnic/mqnic_rx.c (at commit 32abea8). The following two figures show the elapsed time between two consecutive outputs and the corresponding head, tail, and cq index. Hope this can be helpful in locating the reason.
from corundum.
The performance issue I met is different, but I figure it might be reasonable to ask here.
The CPU is an Intel i5 and the memory size is 8G. I use a 690T to implement Corundum. The Corundum NIC is connected to an Intel 10G NIC back to back. The issue is that every time the throughput gets higher than 100Mbps (1500B packet size), there is around a 1% drop rate (I've also tried it at 1Gbps and 4Gbps).
Another issue that I observed (not sure if they are related) is that the IP addr (configured using the ifconfig cmd) on eth0 gets lost from time to time with the message "Activation of network connection failed".
Do you have any idea what caused these issues?
from corundum.
I am getting similar performance as @wangshuaizs on an Alveo U50 running at PCIe Gen3 (8GT/s) x16.
The CPU is an Intel Xeon E5-2690v3 @ 2.60GHz, 24C48T; the memory is DDR4-2400 (running at 2133MT/s). The Linux host of the U50 is Ubuntu Server 20.04 LTS running on an SSD.
The peer uses the same CPU, motherboard, memory, and storage. The commercial NIC is a Mellanox CX4 100GbE. The Linux host of the CX4 is CentOS 7.
The hardware and software are compiled from the default configuration at commit 38f7666.
I am setting the MTU of Corundum to 9000, as shown below.
enp4s0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 9000
inet 192.168.0.5 netmask 255.255.255.0 broadcast 0.0.0.0
inet6 fe80::20a:35ff:fe06:792e prefixlen 64 scopeid 0x20<link>
ether 00:0a:35:06:79:2e txqueuelen 1000 (Ethernet)
RX packets 123213551 bytes 185574188431 (185.5 GB)
RX errors 0 dropped 0 overruns 0 frame 0
TX packets 50241829 bytes 70609934522 (70.6 GB)
TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0
The output of sudo lspci -d 1234:1001 -vvv is shown below.
04:00.0 Ethernet controller: Device 1234:1001
Subsystem: Device 1234:1001
Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+
Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
Latency: 0
Interrupt: pin ? routed to IRQ 116
NUMA node: 0
Region 0: Memory at 3bffe000000 (64-bit, prefetchable) [size=16M]
Capabilities: [40] Power Management version 3
Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot-,D3cold-)
Status: D0 NoSoftRst+ PME-Enable- DSel=0 DScale=0 PME-
Capabilities: [48] MSI: Enable+ Count=32/32 Maskable+ 64bit+
Address: 00000000fee01018 Data: 0000
Masking: 00000000 Pending: 00000000
Capabilities: [70] Express (v2) Endpoint, MSI 00
DevCap: MaxPayload 1024 bytes, PhantFunc 0, Latency L0s <64ns, L1 <1us
ExtTag+ AttnBtn- AttnInd- PwrInd- RBE+ FLReset- SlotPowerLimit 25.000W
DevCtl: CorrErr- NonFatalErr+ FatalErr+ UnsupReq+
RlxdOrd+ ExtTag+ PhantFunc- AuxPwr- NoSnoop+
MaxPayload 256 bytes, MaxReadReq 4096 bytes
DevSta: CorrErr- NonFatalErr- FatalErr- UnsupReq- AuxPwr- TransPend-
LnkCap: Port #0, Speed 8GT/s, Width x16, ASPM not supported
ClockPM- Surprise- LLActRep- BwNot- ASPMOptComp+
LnkCtl: ASPM Disabled; RCB 64 bytes Disabled- CommClk+
ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
LnkSta: Speed 8GT/s (ok), Width x16 (ok)
TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
DevCap2: Completion Timeout: Range BC, TimeoutDis+, NROPrPrP-, LTR-
10BitTagComp-, 10BitTagReq-, OBFF Not Supported, ExtFmt-, EETLPPrefix-
EmergencyPowerReduction Not Supported, EmergencyPowerReductionInit-
FRS-, TPHComp-, ExtTPHComp-
AtomicOpsCap: 32bit- 64bit- 128bitCAS-
DevCtl2: Completion Timeout: 65ms to 210ms, TimeoutDis-, LTR-, OBFF Disabled
AtomicOpsCtl: ReqEn-
LnkCtl2: Target Link Speed: 8GT/s, EnterCompliance- SpeedDis-
Transmit Margin: Normal Operating Range, EnterModifiedCompliance- ComplianceSOS-
Compliance De-emphasis: -6dB
LnkSta2: Current De-emphasis Level: -6dB, EqualizationComplete+, EqualizationPhase1+
EqualizationPhase2+, EqualizationPhase3+, LinkEqualizationRequest-
Capabilities: [100 v1] Advanced Error Reporting
UESta: DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
UEMsk: DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt+ UnxCmplt+ RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
UESvrt: DLP+ SDES+ TLP+ FCP+ CmpltTO+ CmpltAbrt- UnxCmplt- RxOF+ MalfTLP+ ECRC- UnsupReq- ACSViol-
CESta: RxErr- BadTLP- BadDLLP- Rollover- Timeout- AdvNonFatalErr-
CEMsk: RxErr+ BadTLP+ BadDLLP+ Rollover+ Timeout+ AdvNonFatalErr+
AERCap: First Error Pointer: 00, ECRCGenCap- ECRCGenEn- ECRCChkCap- ECRCChkEn-
MultHdrRecCap- MultHdrRecEn- TLPPfxPres- HdrLogCap-
HeaderLog: 00000000 00000000 00000000 00000000
Capabilities: [1c0 v1] Secondary PCI Express
LnkCtl3: LnkEquIntrruptEn-, PerformEqu-
LaneErrStat: 0
Capabilities: [1f0 v1] Virtual Channel
Caps: LPEVC=0 RefClk=100ns PATEntryBits=1
Arb: Fixed- WRR32- WRR64- WRR128-
Ctrl: ArbSelect=Fixed
Status: InProgress-
Port Arbitration Table [500] <?>
VC0: Caps: PATOffset=00 MaxTimeSlots=1 RejSnoopTrans-
Arb: Fixed- WRR32- WRR64- WRR128- TWRR128- WRR256-
Ctrl: Enable+ ID=0 ArbSelect=Fixed TC/VC=ff
Status: NegoPending- InProgress-
Kernel driver in use: mqnic
The IOMMU seems to be turned off, judging from the lspci result compared to the output in https://github.com/corundum/corundum/wiki/Performance-Tuning
The RX performance is roughly 10Gbit/s, and the TX performance is roughly 20Gbit/s. Below, the left is the CX4 peer, and the right is the Ubuntu host with Corundum on the U50. I started an iperf client from the CentOS peer and an iperf server from Ubuntu; iperf3 measures Corundum RX performance, and iperf3 -R measures Corundum TX performance.
I did a CX4 100GbE to CX4 100GbE test and reached roughly 90Gbps in my hardware setup, so I am wondering where the bottleneck might be and whether there are some configurations I missed. Thanks.
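To restate the measurement directions as commands (192.168.0.5 is the Corundum host's address from the ifconfig output above; the duration flag is an example):

```shell
# On the Corundum (U50) host:
iperf3 -s

# On the CX4 peer: data flows peer -> Corundum, measuring Corundum RX
iperf3 -c 192.168.0.5 -t 30

# Same setup with -R reverses the stream: Corundum -> peer, measuring TX
iperf3 -c 192.168.0.5 -t 30 -R
```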
from corundum.
After a cold reboot of the peer server, making sure that no other process was tapping on the Linux network stack, and changing the peer MTU to 9000, the Corundum RX bandwidth managed to reach around 45Gbps, as below. The white-bg terminal is the peer, and the black-bg terminal is Corundum: iperf3 -s on the Corundum host and iperf3 -c on the peer.
But if I do the reverse test (iperf3 -c -R on the Corundum host and iperf3 -s on the peer, or iperf3 -s on the peer and iperf3 -c on the Corundum host), iperf3 is unable to send data after the initial several hundred KBytes. Moreover, ping fails afterwards as well. It seems that the TX of the current build fails on jumbo packets.
I tried dual-side MTU=6000: the RX bandwidth is around 39Gbps, and Corundum TX fails as well. Tests at dual-side MTU=1500 are stable, though.
from corundum.
Well, that's a bit concerning. When it gets stuck, what does mqnic-dump report?
from corundum.
I attach the report below, taken after a clean Corundum boot and a TCP TX request at dual-side MTU=9000. It seems that most queues are not moving. Under normal conditions, if I understand correctly, the head and the tail of the ring buffers should be non-zero. Is that right?
from corundum.
If you only run one instance of iperf and you don't specify -P, then only one TX queue should be used. Looks like it used 4853. However, it doesn't look like the card is hung as both TXQ 4853 and TXCQ 4853 are empty, as well as all of the event queues. Did network manager delete the IP address or something?
from corundum.
The IP address is still there in ifconfig on Ubuntu, but the outbound port seems unresponsive. As below, I tried to capture packets after a halted iperf test from the peer. It seems that there are no outbound packets from the FPGA -- that's why ping fails as well.
On the other side, on the Ubuntu machine running Corundum, the Wireshark capture of a ping request to the peer is as below. It seems that the peer is not receiving ARP packets and generating ARP responses, hence blocking all following packets.
From these observations, it looks more like a failure on the TX path from my perspective.
The dmesg output seems okay, btw.
After using taskset to set the CPU affinity, at dual-side MTU=1500, Corundum RX can reach 32Gbps with 1 iperf process, but as I increase the parallel streams with -P 2 and taskset -c 31,32, for example, the RX bandwidth does not seem to change noticeably.
The TX bandwidth is around 20Gbps with a reversed iperf setup, and as I increase the parallel streams the bandwidth does not seem to change much either.
from corundum.
With iperf3, -P only opens multiple connections (and as such will use more TX/RX queues), but you'll have to explicitly run multiple iperf processes in parallel to use more than 1 CPU core.
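A minimal sketch of running several independent clients, each pinned to its own core and using its own port (core numbers, ports, and the server address are examples):

```shell
# Each loop iteration starts a separate iperf3 process on its own core;
# the peer must run matching servers: iperf3 -s -p 5201 ... 5204
for i in 0 1 2 3; do
  taskset -c "$i" iperf3 -c 192.168.0.2 -p $((5201 + i)) -t 30 &
done
wait
```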
Anyway, it's interesting that the TX queues are empty when it's hung. Usually when I have seen a hang in the past, there will be packets stuck in the TX queues. So it's not clear what's going on. My initial thought is that maybe the packets are not being handed off to the NIC at all. But maybe the packets are being sent, but then they're getting dropped for some reason before reaching the TX MAC? Very strange.
from corundum.
Hmm, I just noticed that your lspci output lists MaxReadReq as 4096 (the maximum possible value). I do not think I have any machines that set MaxReadReq that high. It's possible that there is a bug somewhere wrt. the DMA engine; I may have to add some additional tests around that. I suspect that if that were an issue, we would see a different failure mode. But perhaps not. At any rate, try an MTU setting less than 4096 (perhaps 3000) and see if that makes a difference.
from corundum.
Okay Alex, thanks for your help. I can use 1500 for now, because that bandwidth should be enough for most circumstances.
BTW, I just made a script to run multiple iperf clients from the NIC's dedicated NUMA node on each side, and managed to reach 63Gbps for RX and 71Gbps for TX. That's really impressive! 👍 👍
from corundum.
I tried to gradually increase the dual-side MTU value, and starting from 4148, corundum TX fails. Hope it helps.
| Host and peer MTU | TX function |
|---|---|
1500 | good |
2000 | good |
3000 | good |
4000 | good |
4096 | good |
4100 | good |
4104 | good |
4108 | good |
4112 | good |
4116 | good |
4120 | good |
4124 | good |
4128 | good |
4132 | good |
4136 | good |
4140 | good |
4144 | good |
4148 | fail |
4172 | fail |
4500 | fail |
6000 | fail |
9000 | fail |
from corundum.
Yep, that definitely looks like an issue related to the max read request size of 4096. I don't have a script to force the max read request size to something else, but what you can try is poking that register in the PCIe config space with setpci and setting it to something in the range of 512-2048. Also, one of our testbeds actually does set the max read request size to 4096, but it's only set up for 10G, so I have never tried running with an MTU larger than 1500 before. I will see if I can replicate the issue on my end so I can get it sorted out. I also have a possible lead on the bug: it looks like I am not interpreting byte count field value 0 as 4096 as the spec requires; hopefully that's the only thing I am missing.
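As a sketch of the setpci suggestion (the BDF 04:00.0 is taken from the lspci output above; verify it on your system): MaxReadReq lives in bits 14:12 of the PCIe Device Control register, reachable as CAP_EXP+8, with encoded value n meaning 128 << n bytes. The helper below only does the decoding arithmetic; the setpci lines that poke live hardware are left commented out.

```shell
# Decode the MaxReadReq field (bits 14:12 of Device Control) into bytes:
# encoded 0 = 128B, 1 = 256B, ... 5 = 4096B
mrrs_bytes() { echo $((128 << ((($1) >> 12) & 0x7))); }

# Read the current Device Control register (run as root):
# devctl=0x$(setpci -s 04:00.0 CAP_EXP+8.w)
# echo "MaxReadReq: $(mrrs_bytes "$devctl") bytes"

# Force MaxReadReq to 512 bytes (encoded value 2): clear bits 14:12, set to 2
# new=$(( (devctl & ~0x7000) | (2 << 12) ))
# setpci -s 04:00.0 CAP_EXP+8.w=$(printf '%04x' "$new")
```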
from corundum.
I was not able to replicate any strange behavior in my standalone DMA test, but I was able to replicate the hang with Corundum after changing the settings to support jumbo frames. I'll let you know when I have that particular issue fixed. And apparently the machine in question only sets the max read request size to 4096 on boot, if I do a hot reset later it sets it to 512. So maybe I should also create a script to change the max read request size setting.
from corundum.
I think I have it fixed; try the latest commit on my fork and see if that works correctly for jumbo frames (https://github.com/alexforencich/corundum)
from corundum.
I tried the latest commit on your fork and it works! 👍 🎉
from corundum.
Oops, I think the PCIe hot reset changes the max read req to 512. I may need to reboot the machine and try it again.
from corundum.
It works, once again, at MaxReadReq=4096 and MTU=9000.
82:00.0 Ethernet controller: Device 1234:1001
Subsystem: Device 1234:1001
Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
Latency: 0
NUMA node: 1
Region 0: Memory at 3ffff000000 (64-bit, prefetchable) [size=16M]
Capabilities: [40] Power Management version 3
Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot-,D3cold-)
Status: D0 NoSoftRst+ PME-Enable- DSel=0 DScale=0 PME-
Capabilities: [48] MSI: Enable- Count=1/32 Maskable+ 64bit+
Address: 0000000000000000 Data: 0000
Masking: 00000000 Pending: 00000000
Capabilities: [70] Express (v2) Endpoint, MSI 00
DevCap: MaxPayload 1024 bytes, PhantFunc 0, Latency L0s <64ns, L1 <1us
ExtTag+ AttnBtn- AttnInd- PwrInd- RBE+ FLReset- SlotPowerLimit 25.000W
DevCtl: CorrErr- NonFatalErr+ FatalErr+ UnsupReq+
RlxdOrd+ ExtTag+ PhantFunc- AuxPwr- NoSnoop+
MaxPayload 256 bytes, MaxReadReq 4096 bytes
DevSta: CorrErr- NonFatalErr- FatalErr- UnsupReq- AuxPwr- TransPend-
LnkCap: Port #0, Speed 8GT/s, Width x16, ASPM not supported
ClockPM- Surprise- LLActRep- BwNot- ASPMOptComp+
LnkCtl: ASPM Disabled; RCB 64 bytes Disabled- CommClk+
ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
LnkSta: Speed 8GT/s (ok), Width x16 (ok)
TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
DevCap2: Completion Timeout: Range BC, TimeoutDis+, NROPrPrP-, LTR-
10BitTagComp-, 10BitTagReq-, OBFF Not Supported, ExtFmt-, EETLPPrefix-
EmergencyPowerReduction Not Supported, EmergencyPowerReductionInit-
FRS-, TPHComp-, ExtTPHComp-
AtomicOpsCap: 32bit- 64bit- 128bitCAS-
DevCtl2: Completion Timeout: 65ms to 210ms, TimeoutDis-, LTR-, OBFF Disabled
AtomicOpsCtl: ReqEn-
LnkCtl2: Target Link Speed: 8GT/s, EnterCompliance- SpeedDis-
Transmit Margin: Normal Operating Range, EnterModifiedCompliance- ComplianceSOS-
Compliance De-emphasis: -6dB
LnkSta2: Current De-emphasis Level: -6dB, EqualizationComplete+, EqualizationPhase1+
EqualizationPhase2+, EqualizationPhase3+, LinkEqualizationRequest-
Capabilities: [100 v1] Advanced Error Reporting
UESta: DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
UEMsk: DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt+ UnxCmplt+ RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
UESvrt: DLP+ SDES+ TLP+ FCP+ CmpltTO+ CmpltAbrt- UnxCmplt- RxOF+ MalfTLP+ ECRC- UnsupReq- ACSViol-
CESta: RxErr- BadTLP- BadDLLP- Rollover- Timeout- AdvNonFatalErr-
CEMsk: RxErr+ BadTLP+ BadDLLP+ Rollover+ Timeout+ AdvNonFatalErr+
AERCap: First Error Pointer: 00, ECRCGenCap- ECRCGenEn- ECRCChkCap- ECRCChkEn-
MultHdrRecCap- MultHdrRecEn- TLPPfxPres- HdrLogCap-
HeaderLog: 00000000 00000000 00000000 00000000
Capabilities: [1c0 v1] Secondary PCI Express
LnkCtl3: LnkEquIntrruptEn-, PerformEqu-
LaneErrStat: 0
Capabilities: [1f0 v1] Virtual Channel
Caps: LPEVC=0 RefClk=100ns PATEntryBits=1
Arb: Fixed- WRR32- WRR64- WRR128-
Ctrl: ArbSelect=Fixed
Status: InProgress-
Port Arbitration Table [500] <?>
VC0: Caps: PATOffset=00 MaxTimeSlots=1 RejSnoopTrans-
Arb: Fixed- WRR32- WRR64- WRR128- TWRR128- WRR256-
Ctrl: Enable+ ID=0 ArbSelect=Fixed TC/VC=ff
Status: NegoPending- InProgress-
shishi@ubuntu-r730➜ ~ cd corundum/modules/mqnic
shishi@ubuntu-r730➜ mqnic git:(master) ✗
shishi@ubuntu-r730➜ mqnic git:(master) ✗ ls
iperf0.log iperf6.log mqnic_board.o mqnic_eq.o mqnic_i2c.o mqnic.mod.c mqnic_port.o mqnic_tx.o
iperf1.log iperf7.log mqnic_cq.c mqnic_ethtool.c mqnic_ioctl.h mqnic.mod.o mqnic_ptp.c
iperf2.log Makefile mqnic_cq.o mqnic_ethtool.o mqnic.ko mqnic_netdev.c mqnic_ptp.o
iperf3.log modules.order mqnic_dev.c mqnic.h mqnic_main.c mqnic_netdev.o mqnic_rx.c
iperf4.log Module.symvers mqnic_dev.o mqnic_hw.h mqnic_main.o mqnic.o mqnic_rx.o
iperf5.log mqnic_board.c mqnic_eq.c mqnic_i2c.c mqnic.mod mqnic_port.c mqnic_tx.c
shishi@ubuntu-r730➜ mqnic git:(master) ✗ sudo insmod mqnic.ko
shishi@ubuntu-r730➜ mqnic git:(master) ✗ sudo rmmod mqnic.ko
shishi@ubuntu-r730➜ mqnic git:(master) ✗ sudo insmod mqnic.ko
shishi@ubuntu-r730➜ mqnic git:(master) ✗ sudo ip link set dev enp130s0 up
shishi@ubuntu-r730➜ mqnic git:(master) ✗ sudo ip addr add 192.168.0.5/24 dev enp130s0
shishi@ubuntu-r730➜ mqnic git:(master) ✗ sudo ip link set mtu 9000 dev enp130s0
shishi@ubuntu-r730➜ mqnic git:(master) ✗ ifconfig
eno1: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 1500
inet 10.110.101.144 netmask 255.255.255.0 broadcast 10.110.101.255
inet6 fe80::1618:77ff:fe56:3b6c prefixlen 64 scopeid 0x20<link>
ether 14:18:77:56:3b:6c txqueuelen 1000 (Ethernet)
RX packets 1236 bytes 895381 (895.3 KB)
RX errors 0 dropped 43 overruns 0 frame 0
TX packets 717 bytes 80054 (80.0 KB)
TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0
device interrupt 38
eno2: flags=4099<UP,BROADCAST,MULTICAST> mtu 1500
ether 14:18:77:56:3b:6d txqueuelen 1000 (Ethernet)
RX packets 0 bytes 0 (0.0 B)
RX errors 0 dropped 0 overruns 0 frame 0
TX packets 0 bytes 0 (0.0 B)
TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0
device interrupt 91
eno3: flags=4099<UP,BROADCAST,MULTICAST> mtu 1500
ether 14:18:77:56:3b:6e txqueuelen 1000 (Ethernet)
RX packets 0 bytes 0 (0.0 B)
RX errors 0 dropped 0 overruns 0 frame 0
TX packets 0 bytes 0 (0.0 B)
TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0
device interrupt 93
eno4: flags=4099<UP,BROADCAST,MULTICAST> mtu 1500
ether 14:18:77:56:3b:6f txqueuelen 1000 (Ethernet)
RX packets 0 bytes 0 (0.0 B)
RX errors 0 dropped 0 overruns 0 frame 0
TX packets 0 bytes 0 (0.0 B)
TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0
device interrupt 95
enp130s0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 9000
inet 192.168.0.5 netmask 255.255.255.0 broadcast 0.0.0.0
inet6 fe80::20a:35ff:fe06:792e prefixlen 64 scopeid 0x20<link>
ether 00:0a:35:06:79:2e txqueuelen 1000 (Ethernet)
RX packets 0 bytes 0 (0.0 B)
RX errors 0 dropped 0 overruns 0 frame 0
TX packets 10 bytes 836 (836.0 B)
TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0
lo: flags=73<UP,LOOPBACK,RUNNING> mtu 65536
inet 127.0.0.1 netmask 255.0.0.0
inet6 ::1 prefixlen 128 scopeid 0x10<host>
loop txqueuelen 1000 (Local Loopback)
RX packets 92 bytes 7100 (7.1 KB)
RX errors 0 dropped 0 overruns 0 frame 0
TX packets 92 bytes 7100 (7.1 KB)
TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0
shishi@ubuntu-r730➜ mqnic git:(master) ✗ cd ~
shishi@ubuntu-r730➜ ~ ./batch_iperf_c.sh
shishi@ubuntu-r730➜ ~ ------------------------------------------------------------
Client connecting to 192.168.0.2, TCP port 5001
TCP window size: 682 KByte (default)
------------------------------------------------------------
[ 3] local 192.168.0.5 port 39242 connected with 192.168.0.2 port 5001
------------------------------------------------------------
Client connecting to 192.168.0.2, TCP port 5001
TCP window size: 1.84 MByte (default)
------------------------------------------------------------
[ 3] local 192.168.0.5 port 39244 connected with 192.168.0.2 port 5001
------------------------------------------------------------
Client connecting to 192.168.0.2, TCP port 5001
TCP window size: 715 KByte (default)
------------------------------------------------------------
[ 3] local 192.168.0.5 port 39250 connected with 192.168.0.2 port 5001
------------------------------------------------------------
Client connecting to 192.168.0.2, TCP port 5001
TCP window size: 390 KByte (default)
------------------------------------------------------------
------------------------------------------------------------
Client connecting to 192.168.0.2, TCP port 5001
TCP window size: 715 KByte (default)
------------------------------------------------------------
[ 3] local 192.168.0.5 port 39246 connected with 192.168.0.2 port 5001
------------------------------------------------------------
Client connecting to 192.168.0.2, TCP port 5001
TCP window size: 715 KByte (default)
------------------------------------------------------------
[ 3] local 192.168.0.5 port 39256 connected with 192.168.0.2 port 5001
------------------------------------------------------------
Client connecting to 192.168.0.2, TCP port 5001
TCP window size: 715 KByte (default)
------------------------------------------------------------
[ 3] local 192.168.0.5 port 39248 connected with 192.168.0.2 port 5001
[ 3] local 192.168.0.5 port 39254 connected with 192.168.0.2 port 5001
------------------------------------------------------------
Client connecting to 192.168.0.2, TCP port 5001
TCP window size: 715 KByte (default)
------------------------------------------------------------
[ 3] local 192.168.0.5 port 39258 connected with 192.168.0.2 port 5001
[ ID] Interval Transfer Bandwidth
[ 3] 0.0-10.0 sec 11.6 GBytes 9.97 Gbits/sec
[ ID] Interval Transfer Bandwidth
[ 3] 0.0-10.0 sec 11.6 GBytes 9.96 Gbits/sec
[ ID] Interval Transfer Bandwidth
[ 3] 0.0-10.0 sec 11.6 GBytes 9.96 Gbits/sec
[ ID] Interval Transfer Bandwidth
[ 3] 0.0-10.0 sec 11.6 GBytes 9.96 Gbits/sec
[ ID] Interval Transfer Bandwidth
[ 3] 0.0-10.0 sec 11.6 GBytes 9.96 Gbits/sec
[ ID] Interval Transfer Bandwidth
[ 3] 0.0-10.0 sec 11.6 GBytes 9.96 Gbits/sec
[ ID] Interval Transfer Bandwidth
[ 3] 0.0-10.0 sec 11.6 GBytes 9.96 Gbits/sec
[ ID] Interval Transfer Bandwidth
[ 3] 0.0-10.0 sec 11.6 GBytes 9.96 Gbits/sec
from corundum.
Excellent, good to hear!
from corundum.
I wonder, with the MRRS 0 now correctly interpreted as 4096, was this also the underlying cause of lowered performance for the 100 Gbps case of the original issue, i.e. opening topic?
from corundum.
No. First, this seems to cause the design to hang. So it's not low performance, it's no performance. Second, the bug was introduced recently with the new generic PCIe DMA engine due to an oversight; the ultrascale "descriptor" format uses a wider field and as such does not have this ambiguity so the old ultrascale-specific DMA engine is unaffected.
from corundum.