
seanmonstar commented on May 27, 2024

Hey there! Thanks for the write-up.

When a major version of HTTP does not define any minor versions, the minor version "0" is implied.

This says that when a specification defines a version without a minor version, such as HTTP/2, it can be treated as HTTP/2.0 when required. However:

  • The specifications for HTTP/1.0 and HTTP/1.1 do define the minor versions.
  • This doesn't refer to parsing, but rather implied definition.

So, if you look at RFC 9112 §2.3, it explicitly defines how the version must be parsed:

  HTTP-version  = HTTP-name "/" DIGIT "." DIGIT
  HTTP-name     = %s"HTTP"

Therefore, a message that starts like GET / HTTP/1\r\n is not valid.
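To make the grammar above concrete, here is a minimal sketch of a strict check for the version token. The function name `parse_http_version` is made up for illustration and is not part of httparse; it only mirrors the `HTTP-name "/" DIGIT "." DIGIT` rule.

```rust
/// Hypothetical helper: accepts exactly `HTTP/` DIGIT `.` DIGIT and
/// returns (major, minor). Anything else, including `HTTP/1`, is rejected.
fn parse_http_version(token: &[u8]) -> Option<(u8, u8)> {
    match token {
        [b'H', b'T', b'T', b'P', b'/', major @ b'0'..=b'9', b'.', minor @ b'0'..=b'9'] => {
            // Convert the ASCII digits to their numeric values.
            Some((*major - b'0', *minor - b'0'))
        }
        _ => None,
    }
}
```

With this rule, `HTTP/1.1` yields `Some((1, 1))`, while `HTTP/1`, `HTTP/2`, and lowercase `http/1.1` all yield `None`.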

I would expect nested HTTP messages to use the same version as the root level. I could just accept that the nested messages have to be HTTP/1.0 or HTTP/1.1, but that seems a bit odd and specific.

I don't quite understand what you're referring to with this, sorry.

It's also worth mentioning that this library won't parse HTTP/0.9 either.

That's correct.

from httparse.

commonsensesoftware commented on May 27, 2024

I suppose that's fair. RFC 9110 is the new spec for all of HTTP, but specs like RFC 9112 update the existing specs for older versions. Since the older specs require <major>.<minor>, it makes sense that the minor version is still required there, since omitting it would break older clients. RFC 9110 should indicate that the new semantics only apply to HTTP/2 and beyond.

Let me outline my scenario a bit more for you. I'm parsing multipart/mixed messages. Sadly, there are no existing crates for this. All of the existing crates only support multipart/form-data (which is a common, but limited use case). A message would look like this:

POST /batch HTTP/2
Content-Length: 420
Content-Type: multipart/mixed; boundary=9036ca8fc2f1473091f5ed273ef1b472

--9036ca8fc2f1473091f5ed273ef1b472
Content-Type: application/http; msgtype=request
Content-Length: 120

POST /item HTTP/2
Content-Type: application/json
Content-Length: 42

{"id":42,"name":"Example 1"}
--9036ca8fc2f1473091f5ed273ef1b472
Content-Type: application/http; msgtype=request
Content-Length: 120

POST /item HTTP/2
Content-Type: application/json
Content-Length: 42

{"id":43,"name":"Example 2"}
--9036ca8fc2f1473091f5ed273ef1b472
Content-Type: application/http; msgtype=request
Content-Length: 120

POST /item HTTP/2
Content-Type: application/json
Content-Length: 42

{"id":44,"name":"Example 3"}
--9036ca8fc2f1473091f5ed273ef1b472--

I have all the functionality built to parse the parts, but it doesn't make sense to create my own parser for the HTTP request message.
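For readers following along, splitting a body like the one above into its parts could look roughly like this. It is only a sketch, not the code referred to in this comment: `split_parts` is a hypothetical helper, it assumes the body is valid UTF-8, and it ignores transport padding after the boundary.

```rust
/// Hypothetical helper: returns the raw text of each part (part headers,
/// blank line, part body) between the boundary delimiter lines.
fn split_parts<'a>(body: &'a str, boundary: &str) -> Vec<&'a str> {
    let delimiter = format!("--{boundary}");
    body.split(delimiter.as_str())
        // Whatever precedes the first delimiter is the preamble.
        .skip(1)
        // The closing delimiter is `--boundary--`, so the chunk after it starts with `--`.
        .take_while(|chunk| !chunk.starts_with("--"))
        .map(|chunk| {
            // Drop the line break that ends the delimiter line ...
            let chunk = chunk
                .strip_prefix("\r\n")
                .or_else(|| chunk.strip_prefix('\n'))
                .unwrap_or(chunk);
            // ... and the one that precedes the next delimiter.
            chunk
                .strip_suffix("\r\n")
                .or_else(|| chunk.strip_suffix('\n'))
                .unwrap_or(chunk)
        })
        .collect()
}
```

Each returned part still has to be split into its own headers and content, and for `application/http` parts the content is the nested request that needs an HTTP message parser.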

I would expect nested HTTP messages to use the same version as the root level. I could just accept that the nested messages have to be HTTP/1.0 or HTTP/1.1, but that seems a bit odd and specific.

I don't quite understand what you're referring to with this, sorry.

It makes logical sense that a client would use/generate whatever HTTP version they are currently connected with. If they connect with HTTP/2 or otherwise know that they will, it's more than reasonable to expect that they would use HTTP/2. For my purposes, and in general, it doesn't really matter which HTTP version is specified at this point. I can work around the issue by forcing the start line to always use HTTP/1.0 for a part that is application/http; msgtype=request. This requirement is, unfortunately, not that obvious to clients and there doesn't seem to be an escape hatch for parsing other HTTP versions.
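One way such a workaround might look on the receiving side is to rewrite the version token before handing the nested message to an HTTP/1.x parser. This is a sketch under assumptions: `force_http11` is a made-up name, it operates on the request line only, and it substitutes HTTP/1.1 (HTTP/1.0 would work the same way).

```rust
/// Hypothetical helper: replaces the trailing version token of a request
/// line with `HTTP/1.1`. Returns None if the last token is not `HTTP/...`.
fn force_http11(request_line: &str) -> Option<String> {
    // The version is the last space-separated token of the request line.
    let (method_and_target, version) = request_line.rsplit_once(' ')?;
    // Only rewrite lines whose last token actually names an HTTP version.
    version.strip_prefix("HTTP/")?;
    Some(format!("{method_and_target} HTTP/1.1"))
}
```

The original version is discarded here; a caller that cares about it would need to capture the token before rewriting.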

This library covers the 99% of use cases that people are interested in, which is a high-performance parse of the start line, headers, and start of the body. It doesn't deal with, nor is it intended to deal with, connections, negotiations, prologue, epilogue, or even trailer headers (I believe). All that being said, I believe it is possible to support the same parsing semantics for a message that says it is HTTP/2, HTTP/3, or some future version without sacrificing any of the performance it already has. Supporting it would enable potential callback behavior for edge cases or simply put full control in the user's hands as to what special handling is required for other HTTP versions, if any.

Ultimately, there doesn't appear to be a compelling reason to not support parsing any version. I can't think of any other version-specific parsing behavior required, but limiting that to the behavior of HTTP/1.1 seems more than reasonable now and - perhaps - forever. Since version: u8 already exists as a member, I would suggest adding a full_version or http_version as a simple f32 or something like:

use std::cmp::Ordering;
use std::fmt::{Debug, Display, Formatter, Result as FormatResult};

// Ord and PartialOrd are implemented by hand below so that ordering agrees
// with PartialEq: a derived Ord would sort a missing minor before Some(0).
#[derive(Clone, Copy, Eq)]
pub struct HttpVersion(u8, Option<u8>);

impl HttpVersion {
    pub fn new(major: u8, minor: Option<u8>) -> Self {
        Self(major, minor)
    }

    pub fn major(&self) -> u8 {
        self.0
    }

    // A missing minor version is reported as 0.
    pub fn minor(&self) -> u8 {
        self.1.unwrap_or_default()
    }
}

impl PartialEq for HttpVersion {
    fn eq(&self, other: &Self) -> bool {
        self.major() == other.major() && self.minor() == other.minor()
    }
}

impl Ord for HttpVersion {
    fn cmp(&self, other: &Self) -> Ordering {
        // Compare major versions first, then fall back to the minor versions.
        self.major()
            .cmp(&other.major())
            .then_with(|| self.minor().cmp(&other.minor()))
    }
}

impl PartialOrd for HttpVersion {
    fn partial_cmp(&self, other: &Self) -> Option<Ordering> {
        Some(self.cmp(other))
    }
}

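impl Display for HttpVersion {
    fn fmt(&self, f: &mut Formatter<'_>) -> FormatResult {
        // write! avoids an ambiguous `fmt` call, since both Debug and Display are in scope.
        write!(f, "{}", self.0)?;

        // Only print the minor version if one was actually given.
        if let Some(minor) = self.1 {
            write!(f, ".{minor}")?;
        }

        Ok(())
    }
}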

impl Debug for HttpVersion {
    fn fmt(&self, f: &mut Formatter<'_>) -> FormatResult {
        f.debug_struct("HttpVersion")
         .field("major", &self.0)
         .field("minor", &self.1.unwrap_or_default())
         .finish()
    }
}

This approach would still be in line with avoiding allocations since everything would still be on the stack.

The logical way to think about this approach (at least in my small 🐿️ 🧠) is that the parsed message can be any HTTP version, but the parsing semantics are [currently or forever] restricted to HTTP/1.1.
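As a rough illustration of that idea, a relaxed version parser could accept a missing minor version and leave the interpretation to the caller. This is a sketch only: `parse_relaxed_version` is a hypothetical name, it is not how httparse behaves today, and it deliberately goes beyond the RFC 9112 grammar by allowing `HTTP/<major>`.

```rust
/// Hypothetical helper: accepts `HTTP/<digit>` or `HTTP/<digit>.<digit>`
/// and returns the major version plus the minor version, if one was present.
fn parse_relaxed_version(token: &[u8]) -> Option<(u8, Option<u8>)> {
    match token {
        // Major version only, e.g. `HTTP/2`.
        [b'H', b'T', b'T', b'P', b'/', major @ b'0'..=b'9'] => Some((*major - b'0', None)),
        // Major and minor version, e.g. `HTTP/1.1`.
        [b'H', b'T', b'T', b'P', b'/', major @ b'0'..=b'9', b'.', minor @ b'0'..=b'9'] => {
            Some((*major - b'0', Some(*minor - b'0')))
        }
        _ => None,
    }
}
```

The returned pair has the same shape as the fields of the `HttpVersion` struct proposed above, so `HTTP/2` and `HTTP/2.0` stay distinguishable for display while still comparing as equal.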

Thoughts?


seanmonstar commented on May 27, 2024

Oh, I see what you mean now by nested messages, multipart/mixed. That made it all click, I knew I must have been missing something.

So, on one level, I don't believe that is valid HTTP/2, since HTTP/2 defines a binary protocol with HEADERS and DATA frames.
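To show what "binary protocol" means here, this is a small sketch of decoding the fixed 9-octet frame header that precedes every HTTP/2 frame (24-bit length, 8-bit type, 8-bit flags, one reserved bit and a 31-bit stream identifier). The struct and function names are invented for illustration and have nothing to do with httparse.

```rust
/// Illustrative type: the fields of an HTTP/2 frame header.
#[derive(Debug, PartialEq)]
struct FrameHeader {
    length: u32,    // payload length, 24 bits on the wire
    kind: u8,       // frame type, e.g. 0x0 = DATA, 0x1 = HEADERS
    flags: u8,      // type-specific flags
    stream_id: u32, // 31-bit stream identifier
}

/// Hypothetical helper: decodes the 9-octet frame header.
fn parse_frame_header(bytes: &[u8; 9]) -> FrameHeader {
    FrameHeader {
        // The length is a 24-bit big-endian integer, so pad it to 32 bits.
        length: u32::from_be_bytes([0, bytes[0], bytes[1], bytes[2]]),
        kind: bytes[3],
        flags: bytes[4],
        // Mask off the reserved high bit of the stream identifier.
        stream_id: u32::from_be_bytes([bytes[5], bytes[6], bytes[7], bytes[8]]) & 0x7FFF_FFFF,
    }
}
```

There is no textual start line such as `POST /item HTTP/2` anywhere in this format; the method and path travel as header fields inside HEADERS frames.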

Ultimately, there doesn't appear to be a compelling reason to not support parsing any version.

The reason so far is that HTTP/1.0 and HTTP/1.1 have specifically defined parsing rules, and the parser doesn't "know" about the rules for any other version.

