Giter Site home page Giter Site logo

Path MTU Discovery about base-drafts HOT 23 CLOSED

quicwg avatar quicwg commented on May 25, 2024
Path MTU Discovery

from base-drafts.

Comments (23)

ianswett avatar ianswett commented on May 25, 2024

Agreed that (a) seems largely impractical. (b) is semi-practical, but the checksum is still computable by a passive observer.

Something along the lines of (c) is what I would recommend. It's somewhat similar to what our current experiment does today, though we ignore ICMP entirely and rely on loss as a signal. The negative of course is that

I don't think we want to support small MTUs for QUIC, so we should probably pick some minimum size that we expect all forthcoming CHLOs to fit into. Multipacket CHLOs make stateless rejects impractical, and networks that don't support packets at least ~1200 bytes in size are exceedingly rare.

from base-drafts.

alagoutte avatar alagoutte commented on May 25, 2024

@ianswett if i remember CHLO is always padded to maximum size (1370 bytes for IPv4 and 1350 bytes for IPv6) ? For avoid amplification ?

from base-drafts.

ianswett avatar ianswett commented on May 25, 2024

All handshake packets are always padded to the maximum packet size to ensure the path supports the chosen MTU. So if the path doens't, the handshake fails. This has proven to be a very practical approach, only removing <1% of users who don't support 1350/1370 MTU sizes.

Conveniently, the CHLO needs to be padded for anti-amplifications reasons as well.

from base-drafts.

martinduke avatar martinduke commented on May 25, 2024

"(b) is semi-practical, but the checksum is still computable by a passive observer."

Indeed, all ICMP is vulnerable to a passive observer, but if the header echo isn't there it's vulnerable to off-path attacks that guess the 4-tuple.

It seems odd to spend an enormous amount of energy to save a handful of bytes in headers and frames, and then turn around and use a conservatively low MTU, when a very common IPv4 UDP MSS is 1472 Bytes.

Regarding option (c), it makes a lot of sense for CHLO to be a max-size packet for these purposes, but it's not sufficient. Furthermore, using loss as the signal instead of ICMP seems like an inefficient solution.

from base-drafts.

ianswett avatar ianswett commented on May 25, 2024

In practice today, we're losing less than 10% of the potential packet size, but I agree it isn't ideal.

Can you clarify why padding the CHLO is not sufficient?

from base-drafts.

martinduke avatar martinduke commented on May 25, 2024
  1. The server must probe the server->client path.

  2. If the path changes (particularly with multipath), there must be some means of probing the MTU or you will have lots of packet fragmentation.

from base-drafts.

vasilvv avatar vasilvv commented on May 25, 2024

I am not sure QUIC can in general rely on the ICMP messages for path MTU discovery, since the ICMP messages are normally consumed by the kernel, which may not expose them to individual clients.

from base-drafts.

martinduke avatar martinduke commented on May 25, 2024

I agree that user-space implementations will be reliant on OSes that do whatever they want with UDP ICMP messages. Perhaps the Path MTU discovery section needs lots of SHOULDs instead of MUSTs, because as QUIC migrates into the kernel (?) it can fix these problems. But at the very least, we can set the sockopts to not use the DF bit most of the time.

I should really put something together in a detailed proposal. But the outline should be something like this.

  1. Conditions in which QUIC MUST or MAY send a packet at full-size (using PAD frames as necessary) with the DF bit set. -- this would certainly include the first CHLO and, probably, the second server-generated packet, in addition to packets involving new 4-tuples.
  2. Reactions to loss of those packets
  3. Reactions to ICMP packet too big messages -- would include SHOULDs involving storage of IP header fields, which would be most secure. I think some would say that QUIC SHOULD ignore these packets entirely, but I would disagree.

Relatedly, we could set a relatively high minimum MTU for QUIC connections (~1000 Bytes?). For legitimately low-MTU links, this would cause lots of fragmentation, but it would substantially mitigate off-path attackers.

What do all of you think?

from base-drafts.

martinduke avatar martinduke commented on May 25, 2024

I should also add that the very first packet is a poor one to use to wait for loss, as the RTT is entirely unknown and must use a conservative RTO value. It might be better to use a relatively conservative value, as the draft does, at first, and probe upwards to see if there's more capacity to unlock.

from base-drafts.

rjshade avatar rjshade commented on May 25, 2024

What's the benefit of relying on ICMP responses vs. doing MTU discovery at the QUIC layer ("packetization layer PMTUD" RFC 4821)?

from base-drafts.

martinduke avatar martinduke commented on May 25, 2024

What's the benefit of relying on ICMP responses vs. doing MTU discovery at the QUIC layer ("packetization layer PMTUD" RFC 4821)?

Thanks for the reference; I had not seen that RFC.

Though ICMP messages have their disadvantages, there are four advantages over a loss-based scheme.

  1. Immediate reporting of the actual PMTU, rather than a search process that is likely to undershoot the actual PMTU.
  2. Precision only comes with many probes -> many losses that must be retransmitted.
  3. Loss is an overloaded signal, meaning congestion or RF problems in different contexts. Therefore there is the possibility of error in interpreting a loss.
  4. In many cases, detecting a loss will take much longer than an ICMP message sent from an in-path router.

Meanwhile, a MTU underestimate has not only packet overhead considerations, but also directly impacts the gross throughput possible via congestion control (which operates in multiples of the MSS).

from base-drafts.

ianswett avatar ianswett commented on May 25, 2024

I think I'd like a 'trust but verify" approach to path MTU. In an ideal case, QUIC would get the ICMP message and verify that it really could get a non-fragmented packet through with that size. As long as the size was larger than the chosen handshake size, it would try it.

Some of what we've discussed(ie: padding the CHLO and SHLO and setting the DF bit) is what the implementation does today, and not including it in the draft was an oversight we really need to fix.

Actually, QUIC's congestion control(and I believe FreeBSD's) operates in bytes, not MSS. But I agree it's likely the network and host are more efficient with larger packets.

But please do a pull request with what you describe above, because I think you're going in a good direction, and it's just a matter of working out some details, which is easy to do in the comments of a PR.

from base-drafts.

martinduke avatar martinduke commented on May 25, 2024

I think I'd like a 'trust but verify" approach to path MTU. In an ideal case, QUIC would get the ICMP message and verify that it really could get a non-fragmented packet through with that size. As long as the size was larger than the chosen handshake size, it would try it.

Stacks should ignore ICMP messages that increase the PMTU. The "only" issues with ICMP are non-conforming routers, and attackers (especially "off-path" attackers) that drive the MTU down to the minimum value.

Some of what we've discussed(ie: padding the CHLO and SHLO and setting the DF bit) is what the implementation does today, and not including it in the draft was an oversight we really need to fix.

That's great, but again, CHLO and SHLO will often have long RTOs, so loss-based MTU discovery is uniquely ill-suited to these packets.

Actually, QUIC's congestion control(and I believe FreeBSD's) operates in bytes, not MSS. But I agree it's likely the network and host are more efficient with larger packets.

I believe there's already a comment that QUIC congestion control is poorly spelled out in the draft. But the draft says it uses TCP congestion controls, which define their initial cwnd in multiples of MSS. In the absence of ABC, which is not listed in the draft, then acknowledgments increment cwnd in multiples of MSS as well.

But please do a pull request with what you describe above, because I think you're going in a good direction, and it's just a matter of working out some details, which is easy to do in the comments of a PR.

It might take me a week or two to get to it, but I will do so. Thanks for the encouragement!

from base-drafts.

ianswett avatar ianswett commented on May 25, 2024

from base-drafts.

martinduke avatar martinduke commented on May 25, 2024

I'm not sure if the PR pings people who are tracking this issue, but I submitted two pull requests:
#105
#106

The first is my preferred version, which strongly recommends ICMP-based PMTU discovery. It makes somewhat bold assumptions about the real world:

  • ICMP black holes are rare enough to be handled by a MAY if people want to use PLPMTUD in addition to ICMP.
  • I personally tend to work in kernel space, but my quick glance at UDP socket APIs suggest that it's not a very hard problem to modify normal DF settings/ICMP handling in a user space implementation.
  • It strongly discourages a fixed, conservative PMTU. IMO it seems perverse to leave ~100 bytes on the table given all the complexity we're introduced to save a handful of bytes in packet and frame headers.

The second PR trashes all those assumptions, and is a very permissive (and wordy) spec that basically allows anything. It still adds a bunch of SHOULDs that make ICMP-based discovery work better in a QUIC context. This is the one piece where I feel strongly that QUIC's packetization section should not just reference a bunch of MTU RFCs.

I am interested in feedback on one or both, in particular which PR is a better basis for further editing.

from base-drafts.

MikeBishop avatar MikeBishop commented on May 25, 2024

I think the spirit of QUIC so far has been "Be as efficient as possible; if we break something, there's always TCP." In that vein, the first seems more in keeping. On a purely technical basis, I don't have enough context to opine.

from base-drafts.

martinduke avatar martinduke commented on May 25, 2024

from base-drafts.

mcmanus avatar mcmanus commented on May 25, 2024

both 105 and 106 shift us from MAY use some kind of pmtud onto SHOULD use some kind of pmtud (the details of which vary). Functionally, that's creating a requirement of implementations that I don't think is justified as necessary by the experience so far and the complexity laid out in the PR.

given ian's experience of 90% effectiveness in comment in #64 (comment) I would be wary of introducing ICMP into this at all.

from base-drafts.

martinduke avatar martinduke commented on May 25, 2024

given ian's experience of 90% effectiveness in comment in #64 (comment) I would be wary of introducing ICMP into this at all.

'90% effectiveness' means we're leaving about 150 bytes per datagram on the table. It would be fine to have a protocol that didn't packetize data all that efficiently in the name of simplicity, but if that is the goal then we should absolutely get rid of the many variable-length header fields, which introduce a ton of complexity for less than 100 bytes of savings in most cases.

from base-drafts.

mcmanus avatar mcmanus commented on May 25, 2024

from base-drafts.

martinthomson avatar martinthomson commented on May 25, 2024

I think that #106 is closer now, we should discuss at the interim.

from base-drafts.

mnot avatar mnot commented on May 25, 2024

As discussed in Tokyo, @martinduke to propose text for PR #106 to reduce the default packet size to the IPv6 default and recommend PLPMTUD with optional usage of ICMP information

from base-drafts.

martinthomson avatar martinthomson commented on May 25, 2024

#106 was merged, so this is now done.

from base-drafts.

Related Issues (20)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.