Comments (6)
I feel like the bytes that httparse initially allowed were from an RFC, but I no longer remember which one it was (7230? URIs? WHATWG Fetch?)
from httparse.
But if actual user agents, including Safari, do send requests with UTF-8 paths, shouldn't we support them in server-side libraries, even if there is an RFC saying they are invalid?
The general answer to that question is not always easy. There are sometimes good reasons to reject such requests.
In this case, it could be fine to make the parser more relaxed. I was trying to remember which document originally said to make it strict, so I could check whether it included any comments as to why.
I believe this comes from RFC 3986, section 3.3, where the definition of a path segment does not allow non-US-ASCII characters.
In particular, it's the pchar definition: characters outside the allowed set need to be percent-encoded.
    path          = path-abempty    ; begins with "/" or is empty
                  / path-absolute   ; begins with "/" but not "//"
                  / path-noscheme   ; begins with a non-colon segment
                  / path-rootless   ; begins with a segment
                  / path-empty      ; zero characters
    path-abempty  = *( "/" segment )
    path-absolute = "/" [ segment-nz *( "/" segment ) ]
    path-noscheme = segment-nz-nc *( "/" segment )
    path-rootless = segment-nz *( "/" segment )
    path-empty    = 0<pchar>
    segment       = *pchar
    segment-nz    = 1*pchar
    segment-nz-nc = 1*( unreserved / pct-encoded / sub-delims / "@" )
                  ; non-zero-length segment without any colon ":"
    pchar         = unreserved / pct-encoded / sub-delims / ":" / "@"
    pct-encoded   = "%" HEXDIG HEXDIG
    unreserved    = ALPHA / DIGIT / "-" / "." / "_" / "~"
    sub-delims    = "!" / "$" / "&" / "'" / "(" / ")"
                  / "*" / "+" / "," / ";" / "="
The whole ABNF can be viewed at either https://abnf-uri.edgecompute.app/ or https://www.rfc-editor.org/rfc/rfc3986.html#appendix-A
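To make the pchar rule concrete, here is a minimal sketch of what a client would have to do to send a UTF-8 path segment in an RFC 3986-compliant way. The function name `percent_encode_segment` is made up for illustration and is not part of httparse. It assumes the input is a raw, not yet encoded segment, so a literal "%" is itself encoded as "%25".

```rust
/// Hypothetical helper (not httparse API): true if `b` may appear literally
/// in a path segment, i.e. unreserved / sub-delims / ":" / "@".
fn is_literal_pchar(b: u8) -> bool {
    b.is_ascii_alphanumeric()
        || matches!(
            b,
            b'-' | b'.' | b'_' | b'~'                       // unreserved
            | b'!' | b'$' | b'&' | b'\'' | b'(' | b')'      // sub-delims
            | b'*' | b'+' | b',' | b';' | b'='
            | b':' | b'@'
        )
}

/// Percent-encodes every byte of `segment` that is not a literal pchar.
/// Non-ASCII characters are encoded byte by byte from their UTF-8 form.
fn percent_encode_segment(segment: &str) -> String {
    let mut out = String::with_capacity(segment.len());
    for &b in segment.as_bytes() {
        if is_literal_pchar(b) {
            out.push(b as char);
        } else {
            out.push_str(&format!("%{:02X}", b));
        }
    }
    out
}

fn main() {
    // "é" is U+00E9, which is the two bytes 0xC3 0xA9 in UTF-8.
    println!("{}", percent_encode_segment("café"));
}
```

A request for `/café` sent without this encoding step contains the raw bytes 0xC3 0xA9, which fall outside pchar and are what a strict parser rejects.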
Right now, applications that use httparse can be sure that it returns a valid URI according to RFC 3986.
Perhaps a feature flag could be added to allow UTF-8 (RFC 3986-non-compliant) paths, so that applications can choose whether they want httparse to be strict or relaxed?
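As a rough sketch of what such a strict/relaxed switch could look like at the byte level (this is not httparse's actual implementation or API; `is_valid_path` and its `allow_non_ascii` parameter are hypothetical names), the check below accepts "/" plus the pchar set, and optionally any byte >= 0x80. Note that the relaxed mode as written does not verify that those bytes form well-formed UTF-8.

```rust
/// Hypothetical check (not httparse API): returns true if every byte of
/// `path` is "/", a literal pchar, or part of a "%" HEXDIG HEXDIG triplet.
/// With `allow_non_ascii`, bytes >= 0x80 are accepted as well.
fn is_valid_path(path: &[u8], allow_non_ascii: bool) -> bool {
    let mut i = 0;
    while i < path.len() {
        let b = path[i];
        if b == b'%' {
            // pct-encoded = "%" HEXDIG HEXDIG
            match path.get(i + 1..i + 3) {
                Some(hex) if hex.iter().all(|c| c.is_ascii_hexdigit()) => i += 3,
                _ => return false,
            }
            continue;
        }
        let allowed = b == b'/'
            || b.is_ascii_alphanumeric()
            || matches!(
                b,
                b'-' | b'.' | b'_' | b'~'                   // unreserved
                | b'!' | b'$' | b'&' | b'\'' | b'(' | b')'  // sub-delims
                | b'*' | b'+' | b',' | b';' | b'='
                | b':' | b'@'
            )
            || (allow_non_ascii && b >= 0x80);
        if !allowed {
            return false;
        }
        i += 1;
    }
    true
}

fn main() {
    let path = "/café".as_bytes();
    println!("strict:  {}", is_valid_path(path, false));
    println!("relaxed: {}", is_valid_path(path, true));
}
```

Whether the relaxed behaviour should be opt-in or the default is exactly the open question in this thread; the sketch only shows that the difference between the two modes can be a single extra byte-range test.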
But if there is a flag, is it realistic to think people will care enough to activate it? I only learned about this after getting bitten by it, and httparse is probably a sub-sub-dependency for most people who just want to handle HTTP requests made by web browsers. I think it would be great if httparse could parse requests made by Safari by default, and if the update silently fixed this issue, which is currently present by transitivity almost everywhere in the Rust server framework ecosystem.