seanmonstar / httparse
A push parser for the HTTP 1.x protocol in Rust.
Home Page: https://docs.rs/httparse
License: Apache License 2.0
Hello, the end of a chunked body should be terminated by two sets of \r\n, but parse_chunk_size considers one set to be enough, which makes it impossible to know when message transmission in the TCP stream is complete.
For example:
Correct:
let buf = b"0\r\n\r\n";
assert_eq!(httparse::parse_chunk_size(buf), Ok(httparse::Status::Complete((3, 0))));
Unexpected:
let buf = b"0\r\n";
assert_eq!(httparse::parse_chunk_size(buf), Ok(httparse::Status::Partial));
let buf = b"0\r\n\r";
assert_eq!(httparse::parse_chunk_size(buf), Ok(httparse::Status::Partial));
Hello, I'm trying to use this crate to implement a parser, but I'm having difficulties implementing the response-reading loop.
Here's my crate's code: https://github.com/MOZGIII/http-proxy-client-async
And I'm having issues with this section in particular:
https://github.com/MOZGIII/http-proxy-client-async/blob/d5d29ec06c5cd912e17ec358ee77860c7e8b4f61/src/http.rs#L39-L54
pub async fn receive_response<'buf, ARW>(stream: &mut ARW) -> io::Result<Vec<u8>>
where
ARW: AsyncRead + AsyncWrite + Unpin,
{
let mut response_headers = [httparse::EMPTY_HEADER; 16];
let mut buf = [0u8; 1024];
let mut response = httparse::Response::new(&mut response_headers);
let (consumed, total) = loop {
let total = stream.read(&mut buf).await?;
let result = response.parse(&buf[..total]);
match result {
Err(err) => return Err(io::Error::new(io::ErrorKind::InvalidData, err)),
Ok(httparse::Status::Complete(consumed)) => break (consumed, total),
Ok(httparse::Status::Partial) => continue,
};
};
let leftovers = Vec::from(&buf[consumed..total]);
Ok(leftovers)
}
The issue is with borrowing buf:
error[E0502]: cannot borrow `buf` as mutable because it is also borrowed as immutable
--> src/http.rs:44:33
|
44 | let total = stream.read(&mut buf).await?;
| ^^^^^^^^ mutable borrow occurs here
45 | let result = response.parse(&buf[..total]);
| -------- --- immutable borrow occurs here
| |
| immutable borrow later used here
Since this crate has a kind of unique API, how would you recommend solving this issue?
While RFC 7230 deprecated multiline headers (search for obs-fold in the RFC), they're still something you sometimes encounter. I noticed this while using the multipart_mime library, which in turn uses httparse to handle headers in a MIME message. Here's a failing test case:
req! {
test_multiline_header,
b"GET / HTTP/1.1\r\nX-Received: by 10.84.217.214 with SMTP id whatever;\r\n Wed, 21 Jun 2017 09:04:21 -0700 (PDT)",
|req| {
assert_eq!(req.headers.len(), 1);
}
}
I'm not sure how to actually fix the issue, but I figured I'd at least report the bug.
If the HTTP message begins with whitespace, the resulting error is "invalid HTTP version". This felt misleading, or at least less helpful than it could be, since the version (HTTP/1.1) in the start-line was fine. It would help if the error could ideally call out leading whitespace (as it may be subtle to spot), or at least say something like "invalid start-line" or "doesn't match HTTP format"; I think that could save people some time in tracking down this particular issue.
I am trying to use reqwest to parse a response from a server I don't control. reqwest uses hyper, which uses httparse for parsing HTTP/1.x headers. Anyway, this server has a weird bug where it consistently returns a single corrupted header line in an otherwise completely valid response (the header contains unescaped non-token characters). Specifically, for some reason it tries to send the DOCTYPE as a header. The bug is unlikely to be fixed (this is old software), but it isn't really a problem because the page displays fine in all major browsers.
It seems that all major browsers simply ignore invalid header lines. However, httparse returns an error that aborts the entire parsing process. IMO this is a problem and should be fixed.
Here's a screenshot from Chrome that shows the invalid header being ignored:
In fact, Chrome's behavior is commented as: "skip malformed header".
Although technically changing this could be breaking, in this case, I can't imagine that any code would rely on response parsing to fail in this particular case.
Here are the relevant lines:
Lines 594 to 595 in 6f696f5
Lines 613 to 614 in 6f696f5
Line 672 in 6f696f5
Lines 613 to 614 in 6f696f5
I think all of these would be fixed by consuming b until the next newline, then continue 'headers. I would open a PR, but I just want to check that you agree that this change should be made.
Hello Team,
This doesn't qualify as an issue but is more of an example request. I am trying to build a Rust client, based on no-std, for making an HTTP API call. I am new to this.
Please can you give me an example for the following API calls with the httparse crate?
Call to get an OAuth token in the response, passing username and password:
curl -d "username=hello&password=world" -H "Content-Type: application/x-www-form-urlencoded" -X POST http://localhost:8080/login
Call to get data using the token:
curl --get http://localhost:8080/user/api/data --header "Authorization: Bearer eyJ0eXAiOiJKV1QiLCJhbGciOiJSUzI1NiJ9.eyJzdWIiOiJyZW5hdWx0IiwibmJmIjoxNjMxNzEzMTcyLCJleHAiOjE2MzE3OTk1NzIsInVzZXJJZCI6IjEiLCJhdXRob3JpdGllcyI6IlVTRVIiLCJ1c2VybmFtZSI6InJlbmF1bHQifQ.Ktsg_084LPg8KSZnKqdloRjdHBQzEeuBGAka8CHcrUIA6kubvMGBq03qWYKhUP-_FrBZKOd5eb2DuUt24K0TLcaM-meGNtUvSDU-0wVZIxEwgSTHbVZ2QRf9eNuSkcW7s1QHg29hzxZ2_f2KHNZWqVjSs4JxqExXYPkxLhidYT8d_22oLWcDMtnfdUZ6fhmsRZ-jV0h-sB_zV0z3dBY9ZNL_KduYhdCGzXpGPjfpieYJAieDqc2P1Gy2N1gk88eCKvsYAs011egSBmhRGy-fJuU_Y4rlvdxa5I6pjec_vvWBMVMxGtyxLjHCFn8VQW59DnUA5hEOVRzgv7l_IiZrvA"
Thanks for the help.
Hello, first, thanks for making this tool.
I wanted to point out your benchmark is a bit unfair as you compare httparse sse4 against picohttpparser without sse4. The reason picohttpparser doesn't have sse4 is because your dependency 'pico-sys' does not compile picohttpparser with sse4 enabled.
Your benchmark showed a ~60% improvement in performance for 'bench_pico' once sse4 was enabled in the underlying crate.
I forked the underlying crate 'pico-sys' and made a few modifications if you want to verify my results:
AFAICT the HTTP spec does not limit the chunk size. Yet, parse_chunk_size tries to parse it into a u64. If a value is provided that doesn't fit in a u64, the multiplication operator will overflow in release mode or panic in debug mode. Instead, the function should return an error. This might be a good use for checked_mul.
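The fix direction seems straightforward. Here is a minimal sketch of overflow-safe hex accumulation with checked_mul (illustrative only, not httparse's actual internals, which also handle chunk extensions and CRLF framing):

```rust
/// Fold hex digits into a u64, returning an error instead of overflowing.
/// Illustrative sketch only; the real parser has more states to handle.
fn parse_hex_size(digits: &[u8]) -> Result<u64, &'static str> {
    let mut size: u64 = 0;
    for &b in digits {
        let d = match b {
            b'0'..=b'9' => u64::from(b - b'0'),
            b'a'..=b'f' => u64::from(b - b'a' + 10),
            b'A'..=b'F' => u64::from(b - b'A' + 10),
            _ => return Err("invalid chunk size digit"),
        };
        // checked_mul / checked_add turn the silent wrap (or debug panic)
        // into a recoverable parse error.
        size = size
            .checked_mul(16)
            .and_then(|s| s.checked_add(d))
            .ok_or("chunk size overflows u64")?;
    }
    Ok(size)
}
```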
Hello!
I discovered recently that many user agents send raw UTF-8 bytes in HTTP paths, and proxies forward them without issues. See actix/actix-web#3102
However, it looks like this library fails to parse such HTTP requests, preventing web servers from handling them.
Would it be possible to handle such requests in this library?
This code does not uphold Rust safety invariants: either debug_assert!() should be assert!() or the function must be marked unsafe fn:
Lines 38 to 43 in 6f696f5
Also, it's weird to see a custom function for this - slice[..len] looks like it should be sufficient, but I'm probably missing something.
I'm trying to parse multipart/form-data form submissions. I can extract and pass on the header lines of each part into a parser. But which parser? parse_headers() is private and it's not obvious how to use it. Advice?
It might be inconsistent to have a space in header names, but some legacy systems do. Is it possible to support them in httparse?
For my purposes I need to completely ignore headers; I am only interested in the request method and path.
Is there a way to ignore the TooManyHeaders error and get all other request data?
We're using actix-web library in our project and it uses this library for HTTP-parsing.
We are in a situation where we need to accept requests with non-RFC2396 characters (like the caret ^) in query parameters.
Urls are considered human interface and humans shouldn't be expected to handle urlencoding in these situations.
We checked other parser implementations from Nginx (https://github.com/nginx/nginx/blob/master/src/http/ngx_http_parse.c#L13) and Node.js (https://github.com/nodejs/http-parser/blob/master/http_parser.c#L187) and they seem to be more liberal, allowing more characters than the httparse implementation.
Should we find out some workaround or would same kind of implementation be in the scope of httparse?
Hi,
First, thanks for this great library!
Is it possible to parse around the edge of a ring buffer boundary? E.g by providing two buffers as input to the parser? Currently I'm shifting all remaining bytes down after every successful parse but it would be nice to not have to do that. Any advice?
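Absent two-buffer support, the shifting can at least be done in place; a small sketch of the compaction step the question describes (plain std, no httparse involved):

```rust
// Move the unparsed bytes at buf[consumed..filled] down to the front of the
// buffer, returning the new fill level. copy_within handles overlapping
// ranges safely, so no second buffer is needed.
fn compact(buf: &mut [u8], consumed: usize, filled: usize) -> usize {
    buf.copy_within(consumed..filled, 0);
    filled - consumed
}
```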
code:
#![feature(plugin)]
#![plugin(afl_coverage_plugin)]
extern crate afl_coverage;
extern crate httparse;
use std::io::{self, Read};
fn main() {
let mut input = String::new();
let result = io::stdin().read_to_string(&mut input);
if result.is_ok() {
/*
{
let mut headers = [httparse::EMPTY_HEADER; 16];
let mut req = httparse::Request::new(&mut headers);
req.parse(input.as_bytes());
}
*/
{
let mut headers = [httparse::EMPTY_HEADER; 16];
let mut res = httparse::Response::new(&mut headers);
res.parse(input.as_bytes());
}
}
input: (this is encoded in base64, decode it before feeding it in)
SFRUUC8xLjESMjAw
error:
root@vultr:~/afl-staging-area2# cargo run < outputs/crashes/id:000002,sig:04,src:000001,op:havoc,rep:2
Running `target/debug/afl-staging-area2`
thread '<main>' panicked at 'arithmetic operation overflowed', /root/httparse/src/lib.rs:34
An unknown error occurred
To learn more, run the command again with --verbose.
This bug was found using https://github.com/kmcallister/afl.rs
RFC7230#Field parsing states that parsers should remove leading and trailing whitespace from header field values. Currently httparse only removes leading whitespace; removing trailing whitespace would make httparse easier to use for other crates.
The following two requests should be parsed the same:
GET / HTTP/1.1\r\nHost: foo.com\r\nUser-Agent: foobarsoft\r\n\r\n
GET / HTTP/1.1\r\nHost: foo.com\r\nUser-Agent: foobarsoft \t \t \r\n\r\n
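Until httparse trims it, callers can strip the trailing optional whitespace (RFC 7230 "OWS": SP and HTAB) themselves; a minimal sketch:

```rust
// Trim trailing SP / HTAB (RFC 7230 "OWS") from a header value slice.
fn trim_trailing_ows(value: &[u8]) -> &[u8] {
    let mut end = value.len();
    while end > 0 && (value[end - 1] == b' ' || value[end - 1] == b'\t') {
        end -= 1;
    }
    &value[..end]
}
```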
I was trying to use hyper and reqwest on a project yesterday to communicate with a server internal to my workplace. When trying to create a client using either crate, it returns an HTTP(Status) error, stating "Invalid Status provided".
Here's the strace of the executable, changed only to remove private server details from the request and response headers:
sendto(3, "GET /{OMITTED} HTTP/1.1\r\nHost: {OMITTED}\r\nAccept: */*\r\nUser-Agent: reqwest/0.1.0\r\n\r\n", 103, 0, NULL, 0) = 103
read(3, "HTTP/1.1 200\r\nServer: nginx/1.4.6 (Ubuntu)\r\nDate: Fri, 02 Dec 2016 21:18:20 GMT\r\nContent-Type: text/html\r\nTransfer-Encoding: chunked\r\nConnection: keep-alive\r\nContent-Language: en\r\n\r\n1fba\r\n<html>\n<head>
{Output Truncated}
write(1, "Failed because: Invalid Status provided\n", 40Failed because: Invalid Status provided
) = 40
+++ exited with 0 +++
After discussing this in the Rust IRC, we suspect that httparse might be failing on the fact that the custom webserver is returning "HTTP/1.1 200\r\n", with no SP or Reason Phrase following the status code before the CRLF.
CCing @joshtriplett on this, as well, since he helped me find the issue
Sorry, I edited my issue, mixed up two problems at once.
curl -v "https://videoroll.net/vpaut_option_get.php?pl_id=6577"
Note the space after Access-Control-Allow-Credentials.
< HTTP/1.1 200 OK
< Server: nginx/1.16.0
< Date: Tue, 16 Mar 2021 09:03:39 GMT
< Content-Type: text/json;charset=utf-8
< Transfer-Encoding: chunked
< Connection: keep-alive
< Access-Control-Allow-Origin: *
< Access-Control-Allow-Credentials : true
< Expires: Tue, 23 Mar 2021 09:03:39 GMT
< Cache-Control: max-age=604800
What does the spec say about this?
No whitespace is allowed between the header field-name and colon. In the past, differences in the handling of such whitespace have led to security vulnerabilities in request routing and response handling. A server MUST reject any received request message that contains whitespace between a header field-name and colon with a response code of 400 (Bad Request). A proxy MUST remove any such whitespace from a response message before forwarding the message downstream.
So first, I want to point out that the security vulnerabilities can't really happen through httparse (or at the very least, through Hyper), given Hyper does not keep around the literal text of the HTTP request, and thus those pesky spaces cannot be passed to anyone downstream AFAIK.
Second, and this is what matters to me (wearing my work hat), is that the spec itself implies that a proxy has to be able to parse those anyway to remove the headers, thus it needs to successfully be able to parse such a response. httparse (and thus Hyper) does not let any of that pass through, which is a problem for people implementing proxies.
Making a patch that makes those spaces not fail the entire parse is pretty trivial, but I wonder if we want a relaxed_response_parsing option of some sort. Such an option exists in Squid.
In the browser space, Firefox (and AFAIK Chrome too) happily ignores spaces between the header name and the colon, and will also ignore spaces in header names themselves (ignoring the whole "name: value" pair and just skipping to the next header in the response).
curl -v https://crlog-crcn.adobe.com/crcn/PvbPreference -X POST
Note the famous Updated Preferences: [] header in the response.
> POST /crcn/PvbPreference HTTP/1.1
> Host: crlog-crcn.adobe.com
> User-Agent: curl/7.64.1
> Accept: */*
>
< HTTP/1.1 200
< Updated Preferences: []
< Content-Length: 2
< Date: Tue, 16 Mar 2021 08:43:38 GMT
<
We can discuss whether or not to let that through at a later date.
I would like to build a function
fn read_headers<'a,'b,R: BufRead>(
stream: &mut R,
buf: &'b mut Vec<u8>,
headers: &'a mut [Header<'b>]
) -> Result<Request<'a,'b>,E>
that reads as much from stream into buf as necessary to get a Complete return from Request::parse. This turns out to not be trivial and requires lots of extra allocations and work.
Here's what I came up with:
fn read_headers<'a,'b,R: BufRead>(clnt: &mut R, buf: &'b mut Vec<u8>, headers: &'a mut [Header<'b>]) -> Result<Request<'a,'b>,String> {
fn extend_and_parse<R: BufRead>(clnt: &mut R, headers: &mut [Header]) -> Result<Vec<u8>,String> {
let mut buf=Vec::<u8>::new();
let len=headers.len();
loop {
let buf_orig_len=buf.len();
let additional_len={
let additional=try!(clnt.fill_buf().map_err(|e|e.to_string()));
buf.extend_from_slice(additional);
additional.len()
};
let mut headers=Vec::with_capacity(len);
headers.resize(len,httparse::EMPTY_HEADER);
let mut req=Request::new(&mut headers);
match req.parse(&buf) {
Ok(httparse::Status::Complete(n)) => {
clnt.consume(n-buf_orig_len);
break
},
Ok(httparse::Status::Partial) => {
clnt.consume(additional_len);
}
Err(e) => return Err(format!("HTTP parse error {:?}",e)),
};
}
Ok(buf)
}
let result=extend_and_parse(clnt,headers);
result.map(move|nb|{
::core::mem::replace(buf,nb);
let mut req=Request::new(headers);
req.parse(buf);
req
})
}
The main issues are having to allocate a new array of temporary headers for every iteration, and having to parse the successful result twice.
I think this is partially Rust's fault, but also partially httparse's, for having a not-so-great API. For example, the lifetime of everything is fixed upon Request creation, so a parse failure doesn't release the borrow of the input buffer.
Context: hyper passes an uninitialized array of httparse::Header to Request::new.
This is undefined behavior, and I discovered it while working on replacing mem::uninitialized with MaybeUninit.
I acknowledge that changing the headers field will be a breaking change, because it is public, so is there a chance that a Request type copy could be implemented, but with headers: &mut [MaybeUninit<Header>]?
EDIT: I just realized that the author of httparse is also the author of hyper.
These were added in #40, but understanding exactly what is happening is difficult. It'd be best to document what in the world is happening :)
Beginner here; I have not understood lifetimes completely. I am trying to return a parsed Request from a function like this:
fn read_and_parse(mut stream: TcpStream) -> Option<Request> {
..
return req;
}
I get this error -
error[E0107]: wrong number of lifetime parameters: expected 2, found 0
--> src/main.rs:64:61
|
64 | fn read_and_parse(mut stream: TcpStream) -> Option<&'static Request> {
| ^^^^^^^ expected 2 lifetime parameters
error: aborting due to previous error
How do I solve this?
Miri is currently choking on the use of is_x86_feature_detected!() here: https://github.com/seanmonstar/httparse/blob/master/src/simd/mod.rs#L74
Miri is probably not going to be supporting SIMD anytime soon anyway, so it'd be nice if we could use #[cfg(miri)] to turn off feature detection entirely and just use the naive algorithms.
This could potentially happen automatically but this crate may need attention in other places in order to work in Miri.
It seems that commit alexcrichton/libc@1791046 in the libc dependency broke the bench_pico benchmark. The type alias size_t now stands for usize, not for u32/u64 (architecture dependent) as it did before.
In the "Usage" part of the README file:
assert!(req.parse(buf)?.is_partial());
throws an error:
cannot use the ? operator in a function that returns ()
The correct usage is as in the documentation:
req.parse(buf).unwrap().is_partial()
I am trying to run:
cargo +nightly clippy
on my project and I get this error:
error[E0658]: macro is_x86_feature_detected! is unstable (see issue #0)
--> /Users/marco/.cargo/registry/src/github.com-1ecc6299db9ec823/httparse-1.3.2/src/simd/mod.rs:72:48
|
72 | if cfg!(target_arch = "x86_64") && is_x86_feature_detected!("avx2") {
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
|
= help: add #![feature(stdsimd)] to the crate attributes to enable
error[E0658]: macro is_x86_feature_detected! is unstable (see issue #0)
--> /Users/marco/.cargo/registry/src/github.com-1ecc6299db9ec823/httparse-1.3.2/src/simd/mod.rs:75:23
|
75 | } else if is_x86_feature_detected!("sse4.2") {
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
|
= help: add #![feature(stdsimd)] to the crate attributes to enable
error: aborting due to 2 previous errors
I have tried to add httparse in my dependencies with:
#![feature(stdsimd)]
extern crate httparse;
but it stays the same.
Is it possible to get some help with this?
Thanks,
Marc-Antoine
Lines 284 to 285 in 6f696f5
It suggests version could contain HTTP/1.1, but version is a u8. I believe it should say something like "For HTTP/1.1 this will contain 1".
tldr: Please change the documentation demo line from
let mut headers = [httparse::EMPTY_HEADER; 16];
let mut req = httparse::Request::new(&mut headers);
to
let mut headers = [httparse::EMPTY_HEADER; 64];
let mut req = httparse::Request::new(&mut headers);
This is a really silly bug that bit us.
Our server uses httparse. Since we weren't quite sure how many headers we needed to pre-allocate, we used the demo code's suggested 16 headers. Unfortunately, with work going on around the new Sec-* headers, we suddenly saw a spike of clients trying to connect with 17 headers. It took us a bit to figure out what was going on and why these connections were mysteriously failing.
Boosting the count might help others not suddenly hit this problem.
It would be nice if one could (or at least had an option to) get the body of the request while parsing. The body of an HTTP request is a byte sequence anyway, so it could just be stored in the Request struct as &[u8], and further manipulation could be left to the user.
I've got a pcap file from tcpdump.
I tried to use httparse to process those packets. However, I found it will return Status::Complete even if the packet is not complete.
I think this is caused by not processing the body.
As a Rust novice, I translated the relevant code from Go's http library and it works fine now. How do I submit this code? Or can someone help me submit it?
Here is my hyper trace
TRACE hyper::client::pool > checkout waiting for idle connection: "http://10.200.14.75:8000"
TRACE hyper::client::connect::http > Http::connect; scheme=http, host=10.200.14.75, port=Some(8000)
DEBUG hyper::client::connect::http > connecting to 10.200.14.75:8000
DEBUG hyper::client::connect::http > connected to Some(V4(10.200.14.75:8000))
TRACE hyper::client::conn > client handshake HTTP/1
TRACE hyper::client > handshake complete, spawning background dispatcher task
TRACE hyper::proto::h1::conn > flushed({role=client}): State { reading: Init, writing: Init, keep_alive: Busy }
TRACE hyper::client::pool > checkout dropped for "http://10.200.14.75:8000"
TRACE hyper::proto::h1::role > Client::encode method=POST, body=Some(Known(6))
DEBUG hyper::proto::h1::io > flushed 152 bytes
TRACE hyper::proto::h1::conn > flushed({role=client}): State { reading: Init, writing: KeepAlive, keep_alive: Busy }
TRACE hyper::proto::h1::conn > Conn::read_head
DEBUG hyper::proto::h1::io > read 172 bytes
TRACE hyper::proto::h1::role > Response.parse([Header; 100], [u8; 172])
TRACE hyper::proto::h1::conn > State::close_read()
DEBUG hyper::proto::h1::conn > parse error (invalid HTTP status-code parsed) with 172 bytes
DEBUG hyper::proto::h1::dispatch > read_head error: invalid HTTP status-code parsed
Error invalid HTTP status-code parsed
TRACE hyper::proto::h1::conn > State::close()
TRACE hyper::proto::h1::conn > flushed({role=client}): State { reading: Closed, writing: Closed, keep_alive: Disabled }
TRACE hyper::proto::h1::conn > shut down IO complete
The same call works fine with the curl command:
curl -d 'abcdef1234abcdef=27062019112303|' http://10.200.14.75:8000 -v
POST / HTTP/1.1
Host: 10.200.14.75:8000
User-Agent: curl/7.47.0
Accept: */*
Content-Length: 32
Content-Type: application/x-www-form-urlencoded
Any idea what is wrong here? My Rust code is:
extern crate hyper;
extern crate pretty_env_logger;
use std::io::{self, Write};
use hyper::{Client, Request, Body};
use hyper::rt::{self, Future, Stream};
fn main() {
pretty_env_logger::init();
let url = "http://10.200.14.75:8000".to_string();
let url = url.parse::<hyper::Uri>().unwrap();
if url.scheme_part().map(|s| s.as_ref()) != Some("http") {
println!("This example only works with 'http' URLs.");
return;
}
rt::run(fetch_url(url));
}
fn fetch_url(url: hyper::Uri) -> impl Future<Item=(), Error=()> {
let mut request = Request::builder();
let req = request
.method("POST")
.uri("http://10.200.14.75:8000")
.header("User-Agent", "my-awesome-agent/1.0")
.header("Content-Type", "application/x-www-form-urlencoded").body(Body::from("Hallo!"))
.expect("request builder");
let client = Client::builder()
.keep_alive(true)
.build_http();
client
.request(req)
.and_then(|res| {
println!("Response: {}", res.status());
println!("Headers: {:#?}", res.headers());
res.into_body().for_each(|chunk| {
io::stdout().write_all(&chunk)
.map_err(|e| panic!("example expects stdout is open, error={}", e))
})
})
.map(|_| {
println!("\n\nDone.");
})
.map_err(|err| {
eprintln!("Error {}", err);
})
}
This issue was automatically generated. Feel free to close without ceremony if
you do not agree with re-licensing or if it is not possible for other reasons.
Respond to @cmr with any questions or concerns, or pop over to
#rust-offtopic
on IRC to discuss.
You're receiving this because someone (perhaps the project maintainer)
published a crates.io package with the license as "MIT" xor "Apache-2.0" and
the repository field pointing here.
TL;DR the Rust ecosystem is largely Apache-2.0. Being available under that
license is good for interoperation. The MIT license as an add-on can be nice
for GPLv2 projects to use your code.
The MIT license requires reproducing countless copies of the same copyright
header with different names in the copyright field, for every MIT library in
use. The Apache license does not have this drawback. However, this is not the
primary motivation for me creating these issues. The Apache license also has
protections from patent trolls and an explicit contribution licensing clause.
However, the Apache license is incompatible with GPLv2. This is why Rust is
dual-licensed as MIT/Apache (the "primary" license being Apache, MIT only for
GPLv2 compat), and doing so would be wise for this project. This also makes
this crate suitable for inclusion and unrestricted sharing in the Rust
standard distribution and other projects using dual MIT/Apache, such as my
personal ulterior motive, the Robigalia project.
Some ask, "Does this really apply to binary redistributions? Does MIT really
require reproducing the whole thing?" I'm not a lawyer, and I can't give legal
advice, but some Google Android apps include open source attributions using
this interpretation. Others also agree with
it.
But, again, the copyright notice redistribution is not the primary motivation
for the dual-licensing. It's stronger protections to licensees and better
interoperation with the wider Rust ecosystem.
To do this, get explicit approval from each contributor of copyrightable work
(as not all contributions qualify for copyright, due to not being a "creative
work", e.g. a typo fix) and then add the following to your README:
## License
Licensed under either of
* Apache License, Version 2.0, ([LICENSE-APACHE](LICENSE-APACHE) or http://www.apache.org/licenses/LICENSE-2.0)
* MIT license ([LICENSE-MIT](LICENSE-MIT) or http://opensource.org/licenses/MIT)
at your option.
### Contribution
Unless you explicitly state otherwise, any contribution intentionally submitted
for inclusion in the work by you, as defined in the Apache-2.0 license, shall be dual licensed as above, without any
additional terms or conditions.
and in your license headers, if you have them, use the following boilerplate
(based on that used in Rust):
// Copyright 2016 httparse Developers
//
// Licensed under the Apache License, Version 2.0, <LICENSE-APACHE or
// http://apache.org/licenses/LICENSE-2.0> or the MIT license <LICENSE-MIT or
// http://opensource.org/licenses/MIT>, at your option. This file may not be
// copied, modified, or distributed except according to those terms.
It's commonly asked whether license headers are required. I'm not comfortable
making an official recommendation either way, but the Apache license
recommends it in their appendix on how to use the license.
Be sure to add the relevant LICENSE-{MIT,APACHE}
files. You can copy these
from the Rust repo for a plain-text
version.
And don't forget to update the license
metadata in your Cargo.toml
to:
license = "MIT OR Apache-2.0"
I'll be going through projects which agree to be relicensed and have approval
by the necessary contributors and doing this changes, so feel free to leave
the heavy lifting to me!
To agree to relicensing, comment with:
I license past and future contributions under the dual MIT/Apache-2.0 license, allowing licensees to chose either at their option.
Or, if you're a contributor, you can check the box in this repo next to your
name. My scripts will pick this exact phrase up and check your checkbox, but
I'll come through and manually review this issue later as well.
Hi,
It is not an issue, but more of a question. There is no function to easily search/find a header in the header array. Is it because the need was not there, or is it because I missed something that could help me do such a job?
Same thing with the method. It returns an Option<&str>, but it could be useful to know quickly whether it is a GET, a POST...
For now, I wrote functions on my side to help me, but I'm wondering if they could be useful in the crate itself.
Thanks
In the header name map, index 34 is false. So when parsing headers, the program will return Error::HeaderName when meeting double quotes (ASCII 34). However, double quotes in headers can be parsed correctly in Chrome. Is this intentional or a bug?
Test Example:
#[test]
fn test_double_quotes() {
    let bytes = b"HTTP/1.1 200 OK\r\nServer: nginx/1.14.2\r\nDate: Mon, 25 Jan 2021 06:20:06 GMT\r\nContent-Type: image/png\r\nContent-Length: 24623\r\nConnection: keep-alive\r\n\"Access-Control-Allow-Origin: *\"\r\nAccept-Ranges: bytes\r\nAccess-Control-Allow-Origin: *\r\nCache-Control: 2592000\r\n\r\n";
    // mem::uninitialized() is undefined behavior; EMPTY_HEADER works fine.
    let mut headers = [httparse::EMPTY_HEADER; 10];
let mut res = Response::new(&mut headers);
let parsed_res = res.parse(bytes);
println!("parsed res= {:?}", parsed_res);
}
I am building an HTTP server on top of tokio that needs to perform minimal memory allocations per client TCP connection, regardless of how many HTTP requests are received through the connection. Therefore I have chosen to use a fixed-size circular buffer to store the raw request data read from the wire, and now I am trying to use httparse to parse the request information. The problem I have run into is that the Request::parse function takes in a single &[u8], but because I'm using a circular buffer I have two slices: one for the bytes in the remainder of the buffer, and one (optionally) for the bytes which wrapped around to the front of the buffer. This two-buffer approach works very well with vectored IO reads, but not so well so far with httparse.
At first I was hoping that httparse's Request type would be persistent, so I could call parse in turn for both of the slices. But that appears to not be how the API works: it expects the one slice to have all the data, and when you call it a second time the same data should still be present, only with more added to the end.
Consequently, the only way I can find to use httparse today is to perform a copy of the data from the circular buffer into a secondary contiguous buffer. But the cost of such copying is potentially significant and I'd prefer to avoid it where possible. How feasible would it be to add some sort of parse_vectored function to httparse which takes a &[std::io::IoSlice]?
According to RFC7230, header field values seem to be able to include horizontal tabs.
While updating the rust-httparse package in Debian, I noticed a failure building the tests with --no-default-features, due to the use of std::u64::MAX; replacing it with core::u64::MAX fixed the issue.
Noticed this in RFC-7230:
A user agent that receives an obs-fold in a response message that is
not within a message/http container MUST replace each received
obs-fold with one or more SP octets prior to interpreting the field
value.
I think currently httparse will just reject a line fold. This paragraph seems to suggest it should reinterpret it with spaces. Is this interpretation correct?
I am running into this problem (and an invalid SOH character) with an internal server.
Thanks.
Our app parses a request but doesn't need to do anything with it until the body is also received. In order to allow the underlying buffer to be used for reading the body, and also to avoid parsing the request twice, the app converts the various slices (method, path, headers) into integer indexes within the underlying buffer and then drops the Request in order to unborrow the buffer. Later on, the slices can be reestablished using the indexes.
As of httparse 1.8, our app started failing due to slice locations potentially existing outside of the buffer, likely due to "GET" and "POST" now being returned as static strings.
In hindsight I suppose we were abusing the API. httparse never guaranteed the slices would always point within the buffer. It was just an easy assumption to make since httparse is known to do in-place parsing without copying. I'm not sure if anything should be changed in httparse and we will look at reworking our code. Posting this in case anyone else ran into the same issue.
I'm not sure if this is the right place to ask, but I'm looking through the documentation and I don't see how to use this library to get the body of a request or a response. Neither of the two structs has a field for the body, and the parse method just gives back an index; is that the byte index at which the body begins?
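For what it's worth: yes, on Status::Complete(n) the index n is the number of bytes the head consumed, so &buf[n..] is where the body starts (however much of it has been read so far). The split can be sketched without httparse, assuming the head ends at the first blank line:

```rust
/// Split a raw HTTP/1.x message into (head, body) at the blank line.
/// The position of the split plays the same role as the index httparse
/// returns in `Status::Complete(n)`: everything from that offset on
/// belongs to the body.
fn split_head_body(raw: &[u8]) -> Option<(&[u8], &[u8])> {
    raw.windows(4)
        .position(|w| w == b"\r\n\r\n")
        // Include the terminating CRLFCRLF in the head, as httparse does.
        .map(|pos| (&raw[..pos + 4], &raw[pos + 4..]))
}
```

How much body to then read is governed by Content-Length or chunked encoding, which the caller has to handle from the parsed headers.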
The specific error I get is "error[E0425]: cannot find function _tzcnt_u64 in this scope". I assume this is because it's a 32-bit platform, but it could be something else.
Simplest reproduction:
extern crate httparse;
use httparse::{EMPTY_HEADER, parse_headers};

fn main() {
    let mut buf = *b"Foo: Bar\r\n\r\n";
    let mut headers = [EMPTY_HEADER];
    let headers_len = {
        let (_, headers) = parse_headers(&mut buf, &mut headers).unwrap().unwrap();
        headers.len()
    };
    assert_eq!(headers_len, 1);
    buf[0] = b'B';
    // Prints "Boo"
    println!("{:?}", headers[0].name);
}
As you can see, parse_headers() allows borrows of buf to escape in headers, creating a double borrow where the original buffer can be mutated while views into it still exist.
Discovered by accident: I was working on some infinite-loop bugs in multipart when I did a double-take at this function and thought, "Wait a minute, how the hell did this work to begin with?" The r.consume() at line 80 shouldn't be allowed, but the borrow is escaping.
This crate could implement the parsing mechanism for the http crate, which everyone already uses. It would be really cool if this lib could serialize and deserialize http::Request and http::Response.
When parsing a response, the number of headers is limited. However, if an application knows upfront which headers it cares about, all other headers could be ignored and simply omitted from the result. It would be great if the user could provide a list of "expected" (or "allowed") headers, and only those headers would be recorded in the result.
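Until something like that exists, callers can filter after parsing. The sketch below uses plain (name, value) tuples as a stand-in for httparse::Header so it stays self-contained; the function and its shape are hypothetical, not httparse API:

```rust
/// Keep only the headers whose names appear in `allowed`.
/// Header names are compared ASCII case-insensitively, as HTTP requires.
fn filter_headers<'a>(
    headers: &[(&'a str, &'a [u8])],
    allowed: &[&str],
) -> Vec<(&'a str, &'a [u8])> {
    headers
        .iter()
        .filter(|(name, _)| allowed.iter().any(|a| a.eq_ignore_ascii_case(name)))
        .copied()
        .collect()
}
```

This doesn't save the parsing work or the header-array slots the way an in-parser allow-list would, which is presumably the point of the feature request.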
Currently, it appears to return Status::Partial even though a zero-sized buf generally means the end of a stream. It should probably return Status::Complete((0, &[])) instead.
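In the meantime, the caller can detect end-of-stream itself: a blocking read returning 0 means EOF, so the I/O loop can stop feeding the parser rather than waiting for the parser to report completion. A sketch of that loop shape, assuming a blocking std::io::Read source (names are mine):

```rust
use std::io::Read;

/// Read until EOF, collecting everything that arrives. A caller driving
/// httparse would treat the `n == 0` case the same way: stop looping and
/// decide completion from what has been buffered, instead of handing the
/// parser an empty slice and expecting Complete back.
fn read_until_eof<R: Read>(mut src: R) -> std::io::Result<Vec<u8>> {
    let mut out = Vec::new();
    let mut chunk = [0u8; 1024];
    loop {
        let n = src.read(&mut chunk)?;
        if n == 0 {
            // Zero bytes from a blocking reader means end of stream.
            break;
        }
        out.extend_from_slice(&chunk[..n]);
    }
    Ok(out)
}
```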
It would be nice to be able to access the error descriptions as &'static str via the now-private Error::description_str, especially because std::error::Error::description is deprecated.
Typical offenders:
HTTP/1.1 101 Switching Protocols
Upgrade: websocket
Connection: Upgrade
or
HTTP/1.1 404 Not Found
I'm not sure if this is worth fixing, but it will almost certainly break some things (it broke our tests, at least): in 1.4, parsing a request would extract headers even if the string was terminated with a single newline; that behavior changed in 1.5, which now requires two newlines. A reproducer is below:
#[test]
fn test_httparse() {
    let works_1_4_1 = b"GET /?Param2=value2&Param1=value1 HTTP/1.1\nHost:example.com\n";
    let works_1_5 = b"GET /?Param2=value2&Param1=value1 HTTP/1.1\nHost:example.com\n\n";
    let mut headers = [httparse::EMPTY_HEADER; 64];
    let n_headers =
        |req: httparse::Request| req.headers.iter().filter(|h| !h.name.is_empty()).count();
    {
        // test for 1.4.1
        let mut req = httparse::Request::new(&mut headers);
        let _ = req.parse(works_1_4_1).unwrap();
        assert_eq!(n_headers(req), 1, "failed on 1.4.1 test");
    }
    {
        // test for 1.5
        let mut req = httparse::Request::new(&mut headers);
        let _ = req.parse(works_1_5).unwrap();
        assert_eq!(n_headers(req), 1);
    }
}