
Comments (6)

vkrasnov commented on May 30, 2024

Yes, this is something we would consider.


bingh0 commented on May 30, 2024

Would it be more appropriate to use existing standards such as FECFRAME with UDP packets (https://tools.ietf.org/html/rfc6681) and more modern codes with better computational and error-correcting efficiency such as RaptorQ (https://tools.ietf.org/html/rfc6330)?

Qualcomm's IPR declaration (https://datatracker.ietf.org/ipr/2554/) is very clear that it would not assert any licensing claims in this use case.

There's even an actively developed Apache 2.0 licensed implementation of RaptorQ in Rust (https://github.com/cberner/raptorq).
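For a feel of what that crate looks like in use, here is a minimal sketch loosely following the raptorq crate's documented API (the payload, the 1400-byte MTU, the 15 repair packets, and the 5 dropped packets are arbitrary example values; exact signatures may differ between crate versions):

```rust
use raptorq::{Decoder, Encoder, EncodingPacket};

fn main() {
    // Arbitrary example payload; in a tunnel this would be real packet data.
    let data: Vec<u8> = (0..10_000u32).map(|i| (i % 256) as u8).collect();

    // Encode one block: source packets plus 15 repair packets, 1400-byte MTU.
    let encoder = Encoder::with_defaults(&data, 1400);
    let mut packets: Vec<Vec<u8>> = encoder
        .get_encoded_packets(15)
        .iter()
        .map(|p| p.serialize())
        .collect();

    // Simulate losing the first 5 packets in transit.
    packets.drain(0..5);

    // The receiver needs the transmission parameters (sent out of band or in a header).
    let mut decoder = Decoder::new(encoder.get_config());
    let mut recovered = None;
    for packet in packets {
        recovered = decoder.decode(EncodingPacket::deserialize(&packet));
        if recovered.is_some() {
            break;
        }
    }
    // With enough repair packets received, decoding succeeds with overwhelming probability.
    assert_eq!(recovered, Some(data));
}
```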


commented on May 30, 2024

@bingh0 Reed-Solomon is an optimal code (maximum distance separable) for this purpose; this is why all storage solutions use RS, for example. Doing better is simply not possible. RaptorQ is not suitable here: it is mostly for streaming chunks of data with low decoding complexity, and it is slightly less bandwidth-efficient than RS, by the way. That lower decoding cost (and thus battery usage) is perhaps why Qualcomm wants to use it, though more likely they just want something with a patent and a justification for purchasing the company responsible for the code.

As for LDPC, it is one of the best codes at the level of individual bits. With UDP, however, we are dealing with packet erasure/drops: corrupt packets will fail the checksum and are unlikely to even make it to your PC, much less past your OS, so we don't care about correcting individual bits, only about recovering whole packets or payloads.
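For reference, the "maximum distance" claim above is the standard Singleton bound from coding theory (general background, not something stated in this thread): for any $(n, k)$ erasure code,

$$d_{\min} \le n - k + 1,$$

and a code meeting the bound with equality (an MDS code, which Reed-Solomon is) recovers the $k$ source packets from any $k$ of the $n$ transmitted packets, i.e. it tolerates any $n - k$ lost packets; no code of the same length and rate can tolerate more.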


bingh0 commented on May 30, 2024

I am certainly not the most qualified to discuss the technical merits of various erasure codes, but who better to ask than Mike Luby, the inventor of LT codes (Luby Transform) who supervised the team that invented tornado codes? I have reprinted his reply without modification below (with his permission).


I’m no longer with Qualcomm, and I’ve rejoined the non-profit research institute (International Computer Science Institute, ICSI, affiliated with UC Berkeley) that I was with over 20 years ago. At ICSI, we have developed an optimized implementation of RaptorQ (see www.codornices.info for more details). We are offering deployment licenses to our implementation (ICSI is soft-funded so we need to raise enough funding to support our team), which is a flat annual licensing fee that allows unlimited deployment.

It is true that Reed-Solomon is used in many deployed distributed storage systems, although there are alternative solutions that have benefits over a Reed-Solomon based solution, including solutions based on RaptorQ, see for example https://dl.acm.org/citation.cfm?doid=3311821.3281276.

For protection of data sent from point A to point B using UDP over lossy high speed networks, you are correct that having a high performing FEC erasure code is a good idea, and RaptorQ is a great choice. The overhead differences between RaptorQ and Reed-Solomon are essentially non-existent for the same code parameters. The primary difference is that RaptorQ can scale efficiently to protect high bandwidth lossy connections, which Reed-Solomon will struggle with due to CPU issues (for the Cloudflare application, a good approach would be to protect all data flowing from location A to location B instead of individual streams, and thus the bandwidth of the protected stream can be very high, i.e., multiple Gbps). For example, the RaptorQ code can support data blocks with up to 56,403 source packets and up to 2 billion repair packets (in most applications, the number of repair packets generated and sent is a small fraction of the number of source packets), the time for decoding is proportional to the data block size, and the time for encoding is proportional to the sum of the data block size and the total size of the repair packets. In contrast, the underlying field often used with Reed-Solomon codes is GF(256), which means that the Reed-Solomon code can support a data block with at most 256 source plus repair packets, and the time for encoding and decoding depends on the product of the data block size and the total size of the repair packets (quadratic).

RaptorQ is specifically designed to operate with respect to erasures, RaptorQ does not error-correct bits in packets (the lower layer LDPC physical layer codes do the error-correction of packets, and if the LDPC codes fail to correct all the errors within a packet then the packet is discarded and there is packet loss, which is handled by RaptorQ at an upper layer). One way to use RaptorQ is to collect data in blocks for encoding, where the data in each block corresponds to the data that flows through the sender over a configurable interval of time, e.g., interval = 10 ms, and then RaptorQ would be applied to each block of data as it flows through the sender (the original source data of the block flows through the sender without any delay), where the number of packets in each block could range from 10s to 1000s depending on the bandwidth of the connection and the duration of each block. For example, if the traffic from point A to point B is 10 Gbps and each block consists of 10 ms of data, then a block of data would be 100 Mbits, and, assuming each packet is 10,000 bits, this data block would be sent within 10,000 source packets, and some number of repair packets, e.g., 1,111, could be generated using RaptorQ and sent just after the source packets (in this case protecting against 10% packet loss, i.e., any up to 1,111 out of the 11,111 source and repair packets can be lost and the RaptorQ decoder can recover the original data block). Trying to do this with a Reed-Solomon code would be extremely challenging from a CPU perspective.

WRT Intellectual Property Rights (IPR), Qualcomm has IPR on RaptorQ. Qualcomm has made an IPR declaration in the IETF with respect to RaptorQ, which is a mirror of the IPR commitment that Qualcomm has made for MPEG DASH. Note that MPEG DASH has been widely deployed around the world for internet streaming by many companies with no problems from Qualcomm (see Legal Status on https://en.wikipedia.org/wiki/Raptor_code).
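A quick sanity check of the arithmetic in that worked example (all numbers are the hypothetical values from the quote, not measurements):

```rust
fn main() {
    // Hypothetical values from the example above.
    let link_rate_bps: u64 = 10_000_000_000; // 10 Gbps from A to B
    let block_interval_ms: u64 = 10;         // one FEC block per 10 ms
    let packet_bits: u64 = 10_000;           // 10,000 bits per packet

    let block_bits = link_rate_bps * block_interval_ms / 1_000; // 100 Mbit per block
    let source_packets = block_bits / packet_bits;              // 10,000 source packets
    let repair_packets: u64 = 1_111;                            // as in the quote
    let total_packets = source_packets + repair_packets;        // 11,111 packets sent

    // Per the quote, up to `repair_packets` of the `total_packets` may be lost
    // and the block is still recoverable, i.e. roughly 10% packet-loss protection.
    println!("{block_bits} bits/block: {source_packets} source + {repair_packets} repair = {total_packets}");
    println!("loss tolerance ~ {:.1}%", 100.0 * repair_packets as f64 / total_packets as f64);
}
```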


commented on May 30, 2024

> (for the Cloudflare application, a good approach would be to protect all data flowing from location A to location B instead of individual streams, and thus the bandwidth of the protected stream can be very high, i.e., multiple Gbps)

This does not apply here: since we have cryptography at play, every UDP packet will be unique.

> In contrast, the underlying field often used with Reed-Solomon codes is GF(256), which means that the Reed-Solomon code can support a data block with at most 256 source plus repair packets, and the time for encoding and decoding depends on the product of the data block size and the total size of the repair packets (quadratic).

True, but nowhere close to 256 will ever be used, much less more. Cloudflare will never agree to increase bandwidth consumption by as much as two orders of magnitude to accommodate some poor guy with an effectively impossibly bad connection, or to introduce massive latency by batching too much. I anticipate 10 is going to be the absolute maximum tolerated.

Perhaps you didn't fully explain the context to him, because RaptorQ is probabilistic and relies on a large number of packets to make up for that (so that the probability of successful decoding is effectively "guaranteed" to be 1). It is not suitable for an application where you just want to make sure that an individual packet makes it through with minimal overhead. I suppose it would be possible to force large internal packets and break them down into the fountain (the actual UDP packets sent), but no one is going to want to do that extra work here and introduce latency in the process.


commented on May 30, 2024

This would be a nice addition; there is even a Rust RS crate to do this: https://github.com/darrenldl/reed-solomon-erasure

Regarding RaptorQ, you will not find any public implementations that impress, and I'm somewhat doubtful the private ones that require licensing are any better. RaptorQ has some theoretical appeal and the marketing behind fountain codes is strong, but when you don't need to stream massive amounts of static data, RS over GF(256) is more than sufficient: batch 128 packets, add 128 parity packets, and be protected from 50% packet loss. Fountain codes only make sense when packet loss is expected to be unrealistically bad and for some reason you want to stream a massive chunk of data all at once (and where burst errors can last for long periods of time).
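For illustration, a minimal sketch of that batching idea using the reed-solomon-erasure crate linked above (the 4+2 shard counts and tiny shard sizes are toy example values, not a recommendation; the calls follow the crate's README and may differ between versions):

```rust
use reed_solomon_erasure::galois_8::ReedSolomon;

fn main() {
    // 4 data shards + 2 parity shards over GF(256): any 2 of the 6 packets may be lost.
    // The 128 data + 128 parity scheme above works the same way, just with larger counts.
    let r = ReedSolomon::new(4, 2).unwrap();

    let mut shards: Vec<Vec<u8>> = vec![
        vec![0, 1, 2, 3],
        vec![4, 5, 6, 7],
        vec![8, 9, 10, 11],
        vec![12, 13, 14, 15],
        vec![0; 4], // parity, filled in by encode()
        vec![0; 4], // parity, filled in by encode()
    ];
    r.encode(&mut shards).unwrap();

    // Simulate two lost packets (one data, one parity).
    let mut received: Vec<Option<Vec<u8>>> = shards.iter().cloned().map(Some).collect();
    received[1] = None;
    received[4] = None;

    // Reconstruct the missing shards in place from the surviving ones.
    r.reconstruct(&mut received).unwrap();
    assert_eq!(received[1].as_deref(), Some(&[4u8, 5, 6, 7][..]));
}
```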

You can always use RS over GF(2^64) or larger and mimic fountain codes while not being probabilistic, though at that point performance may finally swing noticeably towards RaptorQ.

If I'm beaming a 1 GB file into the ether continuously (total throughput >> 1 TB) with the expectation that pretty much anyone listening at some point will be able to get it, even over an abysmal channel, then RaptorQ may finally be a good choice. But for such a niche application performance is likely not very important, and I would want to use RS anyway for a guarantee that nothing can go wrong.

