
ureq's Introduction

ureq

A simple, safe HTTP client.

Ureq's first priority is being easy for you to use. It's great for anyone who wants a low-overhead HTTP client that just gets the job done. Works very well with HTTP APIs. Its features include cookies, JSON, HTTP proxies, HTTPS, interoperability with the http crate, and charset decoding.

Ureq is in pure Rust for safety and ease of understanding. It avoids using unsafe directly. It uses blocking I/O instead of async I/O, because that keeps the API simple and keeps dependencies to a minimum. For TLS, ureq uses rustls or native-tls.

See the changelog for details of recent releases.

Usage

In its simplest form, ureq looks like this:

fn main() -> Result<(), ureq::Error> {
    let body: String = ureq::get("http://example.com")
        .set("Example-Header", "header value")
        .call()?
        .into_string()?;
    Ok(())
}

For more involved tasks, you'll want to create an Agent. An Agent holds a connection pool for reuse, and a cookie store if you use the "cookies" feature. An Agent can be cheaply cloned due to an internal Arc and all clones of an Agent share state among each other. Creating an Agent also allows setting options like the TLS configuration.

  use ureq::{Agent, AgentBuilder};
  use std::time::Duration;

  let agent: Agent = ureq::AgentBuilder::new()
      .timeout_read(Duration::from_secs(5))
      .timeout_write(Duration::from_secs(5))
      .build();
  let body: String = agent.get("http://example.com/page")
      .call()?
      .into_string()?;

  // Reuses the connection from previous request.
  let response: String = agent.put("http://example.com/upload")
      .set("Authorization", "example-token")
      .call()?
      .into_string()?;

Ureq supports sending and receiving JSON, if you enable the "json" feature:

  // Requires the `json` feature enabled.
  let resp: String = ureq::post("http://myapi.example.com/ingest")
      .set("X-My-Header", "Secret")
      .send_json(ureq::json!({
          "name": "martin",
          "rust": true
      }))?
      .into_string()?;

Error handling

ureq returns errors via Result<T, ureq::Error>. That includes I/O errors, protocol errors, and status code errors (when the server responded 4xx or 5xx).

use ureq::Error;

match ureq::get("http://mypage.example.com/").call() {
    Ok(response) => { /* it worked */},
    Err(Error::Status(code, response)) => {
        /* the server returned an unexpected status
           code (such as 400, 500 etc) */
    }
    Err(_) => { /* some kind of io/transport error */ }
}

More details on the Error type.

Features

To enable a minimal dependency tree, some features are off by default. You can control them when including ureq as a dependency.

ureq = { version = "*", features = ["json", "charset"] }

  • tls enables https. This is enabled by default.
  • native-certs makes the default TLS implementation use the OS' trust store (see TLS doc below).
  • cookies enables cookies.
  • json enables Response::into_json() and Request::send_json() via serde_json.
  • charset enables interpreting the charset part of the Content-Type header (e.g. Content-Type: text/plain; charset=iso-8859-1). Without this, the library defaults to Rust's built-in utf-8.
  • socks-proxy enables proxy config using the socks4://, socks4a://, socks5:// and socks:// (equal to socks5://) prefix.
  • native-tls enables an adapter so you can pass a native_tls::TlsConnector instance to AgentBuilder::tls_connector. Due to the risk of diamond dependencies accidentally switching on an unwanted TLS implementation, native-tls is never picked up as a default or used by the crate level convenience calls (ureq::get etc) – it must be configured on the agent. The native-certs feature does nothing for native-tls.
  • gzip enables requests of gzip-compressed responses and decompresses them. This is enabled by default.
  • brotli enables requesting brotli-compressed responses and decompresses them.
  • http-interop enables conversion methods to and from http::Response and http::request::Builder (v0.2).
  • http enables conversion methods to and from http::Response and http::request::Builder (v1.0).

Plain requests

Most standard methods (GET, POST, PUT, etc.) are supported as functions from the top of the library (get(), post(), put(), etc.).

These top-level HTTP method functions create a Request instance that follows a builder pattern. The builders are finished using .call() for bodyless requests, or one of the body-sending finishers: .send_string(), .send_bytes(), .send_form(), .send_json(), or .send() for a streaming body (see Content-Length and Transfer-Encoding below).
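For illustration, here is a minimal sketch of the two kinds of finishers (the URLs are placeholders):

// Finished with .call(): no request body.
let body: String = ureq::get("http://example.com/").call()?.into_string()?;

// Finished with .send_string(): a body of known size.
let resp = ureq::post("http://example.com/echo").send_string(&body)?;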

JSON

By enabling the ureq = { version = "*", features = ["json"] } feature, the library supports serde JSON: Request::send_json() for request bodies and Response::into_json() for response bodies.
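A short sketch of the receiving side, deserializing into a generic serde_json::Value (the URL is a placeholder):

// Requires the `json` feature.
let v: serde_json::Value = ureq::get("http://example.com/data.json")
    .call()?
    .into_json()?;
println!("{}", v["name"]);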

Content-Length and Transfer-Encoding

The library will send a Content-Length header on requests with bodies of known size, in other words, those sent with .send_string(), .send_bytes(), .send_form(), or .send_json(). If you send a request body with .send(), which takes a Read of unknown size, ureq will send Transfer-Encoding: chunked, and encode the body accordingly. Bodyless requests (GETs and HEADs) are sent with .call() and ureq adds neither a Content-Length nor a Transfer-Encoding header.
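For example, streaming a body from a Read of unknown size (the file name here is a placeholder) results in a chunked request:

use std::fs::File;

let file = File::open("upload.bin")?; // any std::io::Read works
// ureq adds Transfer-Encoding: chunked, since the size is unknown.
let resp = ureq::post("http://example.com/upload").send(file)?;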

If you set your own Content-Length or Transfer-Encoding header before sending the body, ureq will respect that header by not overriding it, and by encoding the body or not, as indicated by the headers you set.

let resp = ureq::post("http://my-server.com/ingest")
    .set("Transfer-Encoding", "chunked")
    .send_string("Hello world");

Character encoding

By enabling the ureq = { version = "*", features = ["charset"] } feature, the library supports sending/receiving other character sets than utf-8.

For response.into_string(), we read the Content-Type header (e.g. Content-Type: text/plain; charset=iso-8859-1) and, if it contains a charset specification, we try to decode the body using that encoding. If the charset is absent, or we fail to interpret it, we fall back on utf-8.

Similarly, when using request.send_string(), we first check whether the user has set a Content-Type header with a ; charset=<whatwg charset> parameter, and attempt to encode the request body using that charset.
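A sketch of the sending side, assuming the "charset" feature is enabled (the URL is a placeholder):

// The body is encoded as ISO-8859-1 because of the charset parameter.
let resp = ureq::post("http://example.com/submit")
    .set("Content-Type", "text/plain; charset=iso-8859-1")
    .send_string("Hällo wörld")?;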

Proxying

ureq supports two kinds of proxies: HTTP (CONNECT) and SOCKS (SOCKS4/SOCKS5). The former is always available, while the latter must be enabled using the feature ureq = { version = "*", features = ["socks-proxy"] }.

Proxy settings are configured on an Agent (using AgentBuilder). All requests sent through the agent will be proxied.

Example using HTTP

fn proxy_example_1() -> std::result::Result<(), ureq::Error> {
    // Configure an http connect proxy. Notice we could have used
    // the http:// prefix here (it's optional).
    let proxy = ureq::Proxy::new("user:[email protected]:9090")?;
    let agent = ureq::AgentBuilder::new()
        .proxy(proxy)
        .build();

    // This is proxied.
    let resp = agent.get("http://cool.server").call()?;
    Ok(())
}

Example using SOCKS5

fn proxy_example_2() -> std::result::Result<(), ureq::Error> {
    // Configure a SOCKS proxy.
    let proxy = ureq::Proxy::new("socks5://user:[email protected]:9090")?;
    let agent = ureq::AgentBuilder::new()
        .proxy(proxy)
        .build();

    // This is proxied.
    let resp = agent.get("http://cool.server").call()?;
    Ok(())
}

HTTPS / TLS / SSL

On platforms that support rustls, ureq uses rustls. On other platforms, native-tls can be manually configured using AgentBuilder::tls_connector.

You might want to use native-tls if you need to interoperate with servers that only support less-secure TLS configurations (rustls doesn't support TLS 1.0 and 1.1, for instance). You might also want to use it if you need to validate certificates for IP addresses, which are not currently supported in rustls.

Here's an example of constructing an Agent that uses native-tls. It requires the "native-tls" feature to be enabled.

  use std::sync::Arc;
  use ureq::Agent;

  let agent = ureq::AgentBuilder::new()
      .tls_connector(Arc::new(native_tls::TlsConnector::new()?))
      .build();

Trusted Roots

When you use rustls (tls feature), ureq defaults to trusting webpki-roots, a copy of the Mozilla Root program that is bundled into your program (and so won't update if your program isn't updated). You can alternately configure rustls-native-certs which extracts the roots from your OS' trust store. That means it will update when your OS is updated, and also that it will include locally installed roots.
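For example, switching the rustls default to the OS trust store is just a matter of enabling the native-certs feature in Cargo.toml:

ureq = { version = "*", features = ["native-certs"] }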

When you use native-tls, ureq will use your OS' certificate verifier and root store.

Blocking I/O for simplicity

Ureq uses blocking I/O rather than Rust's newer asynchronous (async) I/O. Async I/O allows serving many concurrent requests without high costs in memory and OS threads. But it comes at a cost in complexity. Async programs need to pull in a runtime (usually async-std or tokio). They also need async variants of any method that might block, and of any method that might call another method that might block. That means async programs usually have a lot of dependencies - which adds to compile times, and increases risk.

The costs of async are worth paying, if you're writing an HTTP server that must serve many many clients with minimal overhead. However, for HTTP clients, we believe that the cost is usually not worth paying. The low-cost alternative to async I/O is blocking I/O, which has a different price: it requires an OS thread per concurrent request. However, that price is usually not high: most HTTP clients make requests sequentially, or with low concurrency.

That's why ureq uses blocking I/O and plans to stay that way. Other HTTP clients offer both an async API and a blocking API, but we want to offer a blocking API without pulling in all the dependencies required by an async API.


Ureq is inspired by other great HTTP clients like superagent and the fetch API.

If ureq is not what you're looking for, check out these other Rust HTTP clients: surf, reqwest, isahc, attohttpc, actix-web, and hyper.

ureq's People

Contributors

airkek, algesten, alyoshavasilieva, bbx0, deluvi, dependabot-preview[bot], dependabot[bot], edevil, ekardnt, fauxfaux, johan-bjareholt, jsha, jyn514, k3d3, kade-robertson, kkazuo, kubuzetto, llde, lolgesten, marijns95, mcr, messense, orf, owez, paolobarbolini, rawler, robyoung, rustysec, soruh, zu1k


ureq's Issues

Add a Resolver interface

Right now ureq will always use to_socket_addrs to look up hostnames. For testing, it would be useful to force all lookups to resolve to localhost and a test server running there. This is particularly useful when a variety of hostnames are needed, or when testing large transfers that would be expensive to send over a real network.

We should add an internal Resolver interface that uses to_socket_addrs by default, but can be overridden during testing to provide mocked-out results.
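A minimal sketch of what such an interface might look like (all names here are illustrative, not an existing API):

use std::io;
use std::net::{SocketAddr, ToSocketAddrs};

// The proposed interface: resolve a "host:port" string to addresses.
trait Resolver: Send + Sync {
    fn resolve(&self, netloc: &str) -> io::Result<Vec<SocketAddr>>;
}

// Default implementation backed by the standard library.
struct StdResolver;

impl Resolver for StdResolver {
    fn resolve(&self, netloc: &str) -> io::Result<Vec<SocketAddr>> {
        Ok(netloc.to_socket_addrs()?.collect())
    }
}

// Test implementation: force every lookup to localhost on a fixed port.
struct LocalhostResolver(u16);

impl Resolver for LocalhostResolver {
    fn resolve(&self, _netloc: &str) -> io::Result<Vec<SocketAddr>> {
        Ok(vec![([127, 0, 0, 1], self.0).into()])
    }
}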

Cookies don't seem to get picked up within a chain of redirect requests

Hi,

I'm using ureq in a project to do ADFS authentication for AWS. The login mechanics basically work like this:

  1. GET the login page, which will set an auth-request cookie
  2. POST the credentials (along with the auth-request cookie)
  3. If the credentials are valid the server responds with a 302 to the same login page and an auth-response cookie
  4. The client then GETs the redirect URL with the auth request and response cookies, to which the server replies with an HTML page that includes a form with a SAML assertion that can be used to log in to AWS.

(If this sounds somewhat circuitous, it is, but that's how ADFS works.)

I was having trouble getting this to work with ureq. I'm using an Agent (for automatic cookie persistence) but the login procedure kept failing at step 4. On a hunch I used redirect(0) on the POST in step 2, extracted the cookies from the response and did the redirect request "manually" and things suddenly worked. This seems to indicate that cookies set during a request in a chain of redirects are not used in subsequent requests.

Access all response headers

Hey!

It seems as if the only way to access response headers right now is explicitly by header name. There are some cases where I'd like to iterate over all existing headers.

Maybe I've missed something, but otherwise it would be nice if there was an accessor for the headers field.

Cheers!
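For reference, a sketch of the kind of accessor being requested; later ureq versions provide a headers_names() method along these lines (treat the exact API as an assumption):

let resp = ureq::get("http://example.com/").call()?;
for name in resp.headers_names() {
    if let Some(value) = resp.header(&name) {
        println!("{}: {}", name, value);
    }
}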

Upgrade webpki 0.19 -> 0.21

Would you be able to upgrade the current dependency on webpki from 0.19 to the latest, 0.21? I'm getting version conflicts when combining with other crates that would be solved by this upgrade.

Stream (socket) pooling depends on reading streams to completion

In one of my apps, I use ureq to read/write from a key/value server. When I perform writes, the status response is sufficient to know failure or success. However, if I do not read the response body, the connection doesn't return to the pool, since the data remains on the stream. This seems odd, and it requires my app to add an explicit read of the response in order to allow the stream to return to the pool. I'm curious why this is the case and whether or not we could just flush the buffer:

https://github.com/algesten/ureq/blob/master/src/pool.rs#L218
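The workaround looks roughly like this (a sketch; it assumes an agent is in scope and the URL is a placeholder):

use std::io::Read;

let resp = agent.put("http://kv.example.com/key").send_string("value")?;
// Drain the body so the socket can return to the connection pool.
let mut sink = Vec::new();
resp.into_reader().read_to_end(&mut sink)?;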

Connection pool fails if server closes connection

The connection pool does not cope with HTTPS servers closing the connection, due to rustls being mean.

If I make a request, then wait for the connection to time out, then make another request, I see the error: BadStatus. This means that every other request I make fails.

This test fails, because https://fau.xxx/ is currently running an nginx with a keepalive_timeout of 2s:

#[test]
fn connection_reuse() {
    let agent = ureq::Agent::default().build();
    let resp = agent.get("https://fau.xxx/").call();

    // use up the connection so it gets returned to the pool
    assert_eq!(resp.status(), 200);
    resp.into_reader().read_to_end(&mut vec![]).unwrap();

    // wait for the server to close the connection
    std::thread::sleep(Duration::from_secs(3));

    // try and make a new request on the pool
    let resp = agent.get("https://fau.xxx/").call();
    if let Some(err) = resp.synthetic_error() {
        panic!("boom! {:?}", err);
    }
    assert_eq!(resp.status(), 200);
}

nginx defaults to 75s, some servers have much longer timeouts, some much shorter, but everyone will eventually see this problem.

I was hoping that attempting a write during send_prelude would trigger the retry code, #8, but this does not help.

I do not see a way to fix this right now. A read(&mut [])? in send_prelude doesn't trigger it, and we aren't expecting any data to be readable at that point, so clever buffering wouldn't help.

Retry only idempotent requests

At https://github.com/algesten/ureq/blob/master/src/unit.rs#L167, requests that fail on bad_status_read are retried, but only if no body bytes were sent. The HTTP RFC says:

A user agent MUST NOT automatically retry a request with a non-
idempotent method unless it has some means to know that the request
semantics are actually idempotent, regardless of the method, or some
means to detect that the original request was never applied.

It's possible to have a POST request with an empty body; that would be non-idempotent, but would also have zero body bytes sent.

Relatedly, the comments in unit.rs discuss "body bytes sent" which suggests this code could be run if a request with a body was made, but the error happened before any bytes of the body were sent. However, body_bytes_sent is only set if the whole body is successfully sent. I think it would be clearer to run the retry only if body's size is set and is zero.
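A sketch of the check the RFC calls for: gate retries on the method being idempotent, not merely on no body bytes having been sent (illustrative, not the current code):

// Idempotent methods per RFC 7231 section 4.2.2.
fn is_idempotent(method: &str) -> bool {
    matches!(method, "GET" | "HEAD" | "PUT" | "DELETE" | "OPTIONS" | "TRACE")
}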

request does not check Accept-Encoding, Accept-Encoding is not respected

When sending a request with 'Accept-Encoding: identity', a response with "Transfer-Encoding: gzipped" still seems to be possible.

When running the same request with curl on the same server, the response is not gzipped.

One possible source of error seems to be here:

.header("transfer-encoding")

"Transfer-Encoding" seems to be a response header, not a request header, so should this be checking Accept-Encoding? (reference)

Proxy env variables

I was curious about your thoughts on having the proxy code support reading from environment variables (http_proxy, https_proxy)? It's easy enough to do in my library, but I wanted to sync here first to see if it makes sense to upstream it.
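A sketch of the suggested behavior, built on the existing Proxy API (the fallback logic is illustrative):

use std::env;

// Honor http_proxy if set; otherwise build a plain agent.
fn agent_from_env() -> Result<ureq::Agent, ureq::Error> {
    let mut builder = ureq::AgentBuilder::new();
    if let Ok(proxy_url) = env::var("http_proxy") {
        builder = builder.proxy(ureq::Proxy::new(proxy_url)?);
    }
    Ok(builder.build())
}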

header: disallow white space between field name and colon

https://tools.ietf.org/html/rfc7230#section-3.2.4

No whitespace is allowed between the header field-name and colon. In
the past, differences in the handling of such whitespace have led to
security vulnerabilities in request routing and response handling. A
server MUST reject any received request message that contains
whitespace between a header field-name and colon with a response code
of 400 (Bad Request). A proxy MUST remove any such whitespace from a
response message before forwarding the message downstream.
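A sketch of the rule as a validation function (illustrative, not ureq's parser):

// RFC 7230 section 3.2.4: no whitespace between field-name and colon.
fn valid_field_name(line: &str) -> bool {
    match line.split_once(':') {
        Some((name, _)) => !name.is_empty() && !name.ends_with(|c: char| c.is_ascii_whitespace()),
        None => false,
    }
}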

Support compression

It would be nice to support DEFLATE compression.

miniz_oxide is the fastest Rust DEFLATE crate and it's 100% safe code. It is already used for this purpose in reqwest, attohttpc, etc. This dependency can be easily made optional if desired.

Support run-time provided certificate in addition to compile time default.

Right now certificate management is a compile-time decision. Ideally we could specify an optional server certificate via the Request API and if it is set, the https would leverage that certificate otherwise it would leverage the default provided by configure_certs. I'm happy to write this code but would like feedback before submitting a PR.

Panic in ureq::stream::connect_https on some websites

I've tested ureq by downloading homepages of the top million websites with it. I've found a panic in ring, and 13 out of 1,000,000 websites triggered a panic in ureq::stream::connect_https.

Steps to reproduce:
Run this simple program with "yardmaster2020.com" given as the only command-line argument.

The same website opens fine in Chrome. Full list of websites where this happens: amadriapark.com, bda.org.uk, egain.cloud, gdczt.gov.cn, hsu.edu.hk, mathewingram.com, roadrover.cn, srichinmoyraces.org, thetouchx.com, tradekorea.com, utest.com, wlcbcgs.cn, yardmaster2020.com

Backtrace:

thread 'main' panicked at 'called `Result::unwrap()` on an `Err` value: InvalidDNSNameError', src/libcore/result.rs:1189:5
stack backtrace:
   0: backtrace::backtrace::libunwind::trace
             at /cargo/registry/src/github.com-1ecc6299db9ec823/backtrace-0.3.40/src/backtrace/libunwind.rs:88
   1: backtrace::backtrace::trace_unsynchronized
             at /cargo/registry/src/github.com-1ecc6299db9ec823/backtrace-0.3.40/src/backtrace/mod.rs:66
   2: std::sys_common::backtrace::_print_fmt
             at src/libstd/sys_common/backtrace.rs:77
   3: <std::sys_common::backtrace::_print::DisplayBacktrace as core::fmt::Display>::fmt
             at src/libstd/sys_common/backtrace.rs:59
   4: core::fmt::write
             at src/libcore/fmt/mod.rs:1057
   5: std::io::Write::write_fmt
             at src/libstd/io/mod.rs:1426
   6: std::sys_common::backtrace::_print
             at src/libstd/sys_common/backtrace.rs:62
   7: std::sys_common::backtrace::print
             at src/libstd/sys_common/backtrace.rs:49
   8: std::panicking::default_hook::{{closure}}
             at src/libstd/panicking.rs:195
   9: std::panicking::default_hook
             at src/libstd/panicking.rs:215
  10: std::panicking::rust_panic_with_hook
             at src/libstd/panicking.rs:472
  11: rust_begin_unwind
             at src/libstd/panicking.rs:376
  12: core::panicking::panic_fmt
             at src/libcore/panicking.rs:84
  13: core::result::unwrap_failed
             at src/libcore/result.rs:1189
  14: ureq::stream::connect_https
  15: ureq::unit::connect_socket

style: empty comment lines?

I had a question about coding style. I noticed there are a bunch of empty comments, like so:

    fn from_str(s: &str) -> Result<Self, Self::Err> {
        //
        let line = s.to_string();

Are these meant to denote something? Should I clean them up?

Error building with `default-features = false`

Hi there!

I tried to build without default features:

[dependencies]
ureq = { version = "0.4", default-features = false }

Sadly, this results in:

error[E0432]: unresolved import `native_tls`
 --> /home/lukas/.cargo/registry/src/github.com-1ecc6299db9ec823/ureq-0.4.8/src/error.rs:1:5
  |
1 | use native_tls::Error as TlsError;
  |     ^^^^^^^^^^ Maybe a missing `extern crate native_tls;`?

error[E0432]: unresolved import `native_tls`
 --> /home/lukas/.cargo/registry/src/github.com-1ecc6299db9ec823/ureq-0.4.8/src/error.rs:2:5
  |
2 | use native_tls::HandshakeError;
  |     ^^^^^^^^^^ Maybe a missing `extern crate native_tls;`?

error[E0599]: no variant named `Https` found for type `stream::Stream` in the current scope
  --> /home/lukas/.cargo/registry/src/github.com-1ecc6299db9ec823/ureq-0.4.8/src/stream.rs:28:17
   |
12 | pub enum Stream {
   | --------------- variant `Https` not found here
...
28 |                 Stream::Https(_) => "https",
   |                 ^^^^^^^^^^^^^^^^ variant not found in `stream::Stream`
   |
   = note: did you mean `stream::Stream::Http`?

error[E0599]: no variant named `Https` found for type `stream::Stream` in the current scope
  --> /home/lukas/.cargo/registry/src/github.com-1ecc6299db9ec823/ureq-0.4.8/src/stream.rs:41:13
   |
12 | pub enum Stream {
   | --------------- variant `Https` not found here
...
41 |             Stream::Https(_) => true,
   |             ^^^^^^^^^^^^^^^^ variant not found in `stream::Stream`
   |
   = note: did you mean `stream::Stream::Http`?

error[E0599]: no variant named `Https` found for type `stream::Stream` in the current scope
  --> /home/lukas/.cargo/registry/src/github.com-1ecc6299db9ec823/ureq-0.4.8/src/stream.rs:59:13
   |
12 | pub enum Stream {
   | --------------- variant `Https` not found here
...
59 |             Stream::Https(stream) => stream.read(buf),
   |             ^^^^^^^^^^^^^^^^^^^^^ variant not found in `stream::Stream`
   |
   = note: did you mean `stream::Stream::Http`?

error[E0599]: no variant named `Https` found for type `stream::Stream` in the current scope
  --> /home/lukas/.cargo/registry/src/github.com-1ecc6299db9ec823/ureq-0.4.8/src/stream.rs:71:13
   |
12 | pub enum Stream {
   | --------------- variant `Https` not found here
...
71 |             Stream::Https(stream) => stream.write(buf),
   |             ^^^^^^^^^^^^^^^^^^^^^ variant not found in `stream::Stream`
   |
   = note: did you mean `stream::Stream::Http`?

error[E0599]: no variant named `Https` found for type `stream::Stream` in the current scope
  --> /home/lukas/.cargo/registry/src/github.com-1ecc6299db9ec823/ureq-0.4.8/src/stream.rs:80:13
   |
12 | pub enum Stream {
   | --------------- variant `Https` not found here
...
80 |             Stream::Https(stream) => stream.flush(),
   |             ^^^^^^^^^^^^^^^^^^^^^ variant not found in `stream::Stream`
   |
   = note: did you mean `stream::Stream::Http`?

Connection closed too early on some websites

Some websites transmit data really slowly, but ureq drops the connection almost immediately after establishing it, without actually downloading the content, and without reporting an error either.

Example of where this happens is 7911game.com (warning: malicious website, ships some kind of VBScript so I assume it exploits Internet Explorer). curl takes a long time to download it and loads it gradually, as does https://github.com/jayjamesjay/http_req

cargo test doesn't pass

I'm probably doing something stupid. I cloned the repo (my HEAD is at 09dabbd) and ran cargo test. I get

failures:

---- src/lib.rs -  (line 9) stdout ----
error[E0432]: unresolved import `ureq::json`
 --> src/lib.rs:11:5
  |
5 | use ureq::json;
  |     ^^^^^^^^^^ no `json` in the root

error: cannot determine resolution for the macro `json`
  --> src/lib.rs:16:16
   |
10 |     .send_json(json!({
   |                ^^^^
   |
   = note: import resolution is stuck, try simplifying macro imports

error[E0599]: no method named `send_json` found for type `&mut ureq::Request` in the current scope
  --> src/lib.rs:16:6
   |
10 |     .send_json(json!({
   |      ^^^^^^^^^ method not found in `&mut ureq::Request`

error: aborting due to 3 previous errors

Some errors have detailed explanations: E0432, E0599.
For more information about an error, try `rustc --explain E0432`.
Couldn't compile the test.

failures:
    src/lib.rs -  (line 9)

test result: FAILED. 47 passed; 1 failed; 0 ignored; 0 measured; 0 filtered out

I assume it's supposed to pass because CI is working. My rustc is stable-x86_64-apple-darwin unchanged - rustc 1.40.0 (73528e339 2019-12-16)

No way to set timeout for the entire request, which allows DoS attacks

ureq currently does not allow specifying a timeout for the entire request (i.e. until the request body is finished transferring), which means an established connection will keep going indefinitely if the remote host keeps replying.

This is fine for the simple use cases, like a user downloading something interactively, but enables denial-of-service attacks in automated scenarios: if the remote host keeps transferring data at a really low speed, e.g. several bytes per second, the connection will be open indefinitely. This makes it easy for an attacker who can submit URLs to the server to cause denial of service through some kind of resource exhaustion - running out of RAM, networking ports, etc.
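For reference, later ureq versions address this with an overall per-request timeout on the agent; a sketch:

use std::time::Duration;

let agent = ureq::AgentBuilder::new()
    .timeout(Duration::from_secs(10)) // caps the entire request, body included
    .build();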

ureq::Error doesn't implement std::error::Error

It looks like ureq::Error doesn't implement the std::error::Error trait (or failure::Fail). That makes it somewhat problematic to use as the cause of other errors. Do you think it is possible to implement it?
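For context, implementing std::error::Error would allow propagating ureq errors into boxed errors with ?, e.g. (a sketch, assuming the Result-based API shown in the README above):

fn fetch() -> Result<String, Box<dyn std::error::Error>> {
    Ok(ureq::get("http://example.com/").call()?.into_string()?)
}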

Remove default host of localhost

Request::new can be called with something that's not a URL, e.g. /path, and it will automatically prepend http://localhost/. I think that's the wrong thing in most cases. We should instead make it an error to construct a Request with a string that doesn't parse as a URL.

This will probably break a lot of doctests, but I think we should update the doctests to use real URLs. For now those can be http://localhost/; when #82 is fixed, those can be http://example.com/, with an override to point example.com to localhost so the tests run quickly and don't hit the network.

Limit maximum size of connection pool

Right now, connections stay in the pool indefinitely. If someone makes requests to a wide variety of hosts, that can quickly fill up the pool. Each entry in the pool uses an FD, and eventually a program will hit its open-file limit (RLIMIT_NOFILE) and stop working.

We should set a max size for the pool, and when a new connection needs to be added but the pool is full, remove one of the existing connections. A couple of options for how to remove:

  • We could scan the pool for connections where the server has already closed them. Such connections are taking up resources needlessly. But if the connection pool is full in steady state, this could be a lot of scanning.
  • We could use an LRU cache (see the sketch after this list).
  • We could remove an entry at random.
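A minimal sketch of the LRU option (names and structure are illustrative):

use std::collections::VecDeque;

// A bounded pool: when full, drop the least recently returned connection.
struct Pool<T> {
    max: usize,
    entries: VecDeque<(String, T)>, // (pool key, stream)
}

impl<T> Pool<T> {
    fn put(&mut self, key: String, stream: T) {
        if self.entries.len() >= self.max {
            self.entries.pop_front(); // evict the oldest entry
        }
        self.entries.push_back((key, stream));
    }
}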

Doesn't work with some websites?

Hi! Seems like ureq is not compatible with some websites. Any idea why?
Here is an example:

let resp = ureq::get("https://www.okex.com/api/spot/v3/products")
    .set("Connection", "keep-alive")
    .set("Accept-Encoding", "identity")
    .timeout_connect(5_000)
    .timeout_read(5_000)
    .timeout_write(5_000)
    .call();
if !resp.ok() {
    eprintln!("Error! Code {}, line {}", resp.status(), resp.status_line());
}

It prints: Error! Code 500, line HTTP/1.1 500 Bad Status

Same URL works fine via firefox/chrome/curl/hyper/mio_httpc.

Same thing happens for URL https://api.fcoin.com/v2/public/symbols

response: combine header fields with the same field name

https://tools.ietf.org/html/rfc7230#section-3.2.2

A recipient MAY combine multiple header fields with the same field
name into one "field-name: field-value" pair, without changing the
semantics of the message, by appending each subsequent field value to
the combined field value in order, separated by a comma.
Note: In practice, the "Set-Cookie" header field ([RFC6265]) often
appears multiple times in a response message and does not use the
list syntax, violating the above requirements on multiple header
fields with the same name. Since it cannot be combined into a
single field-value, recipients ought to handle "Set-Cookie" as a
special case while processing header fields.

Right now, response.header() only returns the value of the first header field with the requested name. It would be good to handle the case where there are repeats. Fortunately, the HTTP spec allows doing so by concatenation, which lets us preserve a nice simple API.
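A sketch of the combining rule (illustrative):

// RFC 7230 section 3.2.2: join repeated field values with a comma,
// except Set-Cookie, which cannot be combined and needs special casing.
fn combine(name: &str, values: &[&str]) -> Option<String> {
    if name.eq_ignore_ascii_case("set-cookie") {
        return None; // handle separately
    }
    Some(values.join(", "))
}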

`cargo-crev` review

I am looking for an HTTP request crate that is tiny (LoC-wise) compared to reqwest, with the ability to disable anything other than the simplest plain HTTP, suitable for security-conscious environments (like querying the Bitcoin Core RPC port, etc.). I was suggested ureq, so I decided to review the source.

Here's the result: dpc/crev-proofs@42a3b5c

Sorry for giving a negative review, but the point of reviewing is to judge and point out problems. I also reserve the right to be wrong about some parts. :)

I hope at least it will help you improve some stuff.

Two different timeout error types when using .into_json()

In my ureq application, when I set the timeout to 20ms, I get an Io(Custom { kind: TimedOut, error: "timed out reading response" }) error. However, when I set the timeout to 100ms, I get a Custom { kind: InvalidData, error: "Failed to read JSON: timed out reading response" } error in the response.into_json() part.

I expect there to be only a single error associated with a timeout, not two.

Timeouts for DNS lookups

Right now ureq has no way to time out DNS lookups. It uses to_socket_addrs, which says:

Note that this function may block the current thread while resolution is performed.

Under the hood, I believe this uses getaddrinfo on Linux, which does not allow setting a timeout.

Some documentation about how curl handles this is here: https://github.com/curl/curl/blob/26d2755d7c3181e90e46014778941bff53d2309f/lib/hostip.c#L91-L115. It sounds like the options are:

  • Use an asynchronous resolver (probably not an option for us).
  • Run the resolve in a thread. Currently the only place ureq uses threads outside of tests is when handling a socks stream, and there's a comment on that code that suggests it would be better to set a read timeout on the upstream socks implementation. But some judicious use of threads in ureq is probably not a bad idea.
  • Use alarm() / SIGALRM. It can only set alarms in units of seconds, and curl mentions there might be some compatibility problems due to running the rest of the program in a signal handler: https://github.com/curl/curl/blob/26d2755d7c3181e90e46014778941bff53d2309f/lib/hostip.c#L625-L637

This may not be a terribly big priority because in practice getaddrinfo does have built-in timeouts on many systems. For instance, on Linux the default config has a timeout of 5s for name resolution. The Windows default is 15s.
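For completeness, a sketch of the thread-based option from the list above (note the lookup thread is leaked if getaddrinfo never returns):

use std::net::{SocketAddr, ToSocketAddrs};
use std::sync::mpsc;
use std::thread;
use std::time::Duration;

fn resolve_with_timeout(netloc: String, timeout: Duration) -> std::io::Result<Vec<SocketAddr>> {
    let (tx, rx) = mpsc::channel();
    thread::spawn(move || {
        // The blocking lookup runs on its own thread.
        let result = netloc.to_socket_addrs().map(|addrs| addrs.collect::<Vec<_>>());
        let _ = tx.send(result);
    });
    match rx.recv_timeout(timeout) {
        Ok(result) => result,
        Err(_) => Err(std::io::Error::new(
            std::io::ErrorKind::TimedOut,
            "DNS lookup timed out",
        )),
    }
}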

Provide clean shutdown for test case servers

#67 introduced TCP listeners that act as HTTP servers for the purpose of testing. These rely on spawning threads. Ideally we'd like the thread running the accept loop to exit when the test case is over. That's a bit challenging because the listener.incoming() iterator blocks.

Right now there are a few possible solutions:

  1. Use Unix-specific APIs to get a raw FD, and close that. This wouldn't work on Windows.
  2. Set the listener to non-blocking and busy-loop if we get WouldBlock, checking on each loop whether it's time to exit.

(2) is unsatisfying for a real server because you need a sleep() to avoid spinning the CPU, but that sleep necessarily delays acceptance of new connections. However, it is probably good enough for test cases.
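A sketch of option (2) for a test server (illustrative):

use std::net::TcpListener;
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Arc;
use std::thread;
use std::time::Duration;

fn serve_until(listener: TcpListener, shutdown: Arc<AtomicBool>) -> std::io::Result<()> {
    listener.set_nonblocking(true)?;
    loop {
        if shutdown.load(Ordering::SeqCst) {
            return Ok(()); // the test case asked us to exit
        }
        match listener.accept() {
            Ok((stream, _)) => drop(stream), // handle the test connection here
            Err(e) if e.kind() == std::io::ErrorKind::WouldBlock => {
                thread::sleep(Duration::from_millis(10)); // avoid spinning the CPU
            }
            Err(e) => return Err(e),
        }
    }
}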

Support HTTP Proxies

Many thanks for your endeavors. We have actually included ureq into our trusted computing base of Libra. I would prefer we standardize its usage across our code base, but there are concerns that it does not currently support HTTP Proxies. Is this something you could consider? And if not, worst-case scenario, accept a 3rd party contribution?

Maintain a changelog of some sort

Disclaimer: this is a nitpick.

Since ureq seems to get a new update about every week, every week I get a ping from dependabot. Updating is no big deal, but the lack of a changelog makes it a bit less convenient, since I have to check the commits to see what changed.

It would be more convenient to either have a maintained CHANGELOG.md or just some short bullet points on the tags.

Consider proxy when caching a stream

It looks like PoolKeys only consider host and port, which can lead to problems if a proxy is being used as a gateway to internal addresses – e.g. proxy1 connects to 1.2.3.4 in one private network, while proxy2 connects to 1.2.3.4 in another private network. Another (perhaps more limiting) option might be to make proxies agent-scoped, so that different connection pools are used.
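A sketch of the first option, extending the pool key (field names are illustrative):

// Include the proxy in the key so connections made through different
// proxies to the same host:port are never conflated.
#[derive(PartialEq, Eq, Hash)]
struct PoolKey {
    scheme: String,
    hostname: String,
    port: u16,
    proxy: Option<String>, // the proposed addition
}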

No automatic chunked Transfer Encoding when using Request.send

On ureq version 0.12.0:

When using the Request.send method, no headers are set to indicate some body content to the server. It causes some servers to completely ignore the body (like tiny_http).
The Request.send method should automatically enable the chunked encoding.

A workaround for this issue is to set the Transfer-Encoding header prior to sending the request, to enable the chunked transfer:

request.set("Transfer-Encoding", "chunked");
let response = request.send(body_reader);

I am filing this issue more as a warning for people like me who will stumble upon this gotcha. I have noticed that you are working on a brand new version of ureq that has probably fixed this issue already, so it's maybe not a good idea to spend time fixing it. Maybe it should just be mentioned in the documentation of Request.send that the user may want to set the chunked transfer.

I can't send cookies with Agent::set_cookie()

I'm using ureq 0.11.2, and I can't seem to set cookies on a request correctly.

I have the following project:

# Cargo.toml
[package]
name = "ureq-cookies"
version = "0.1.0"
authors = ["Alex Chan <[email protected]>"]
edition = "2018"

[dependencies]
ureq = "0.11.2"
// main.rs
extern crate ureq;

fn main() {
    let agent = ureq::agent();
    
    let cookie = ureq::Cookie::new("name", "value");
    agent.set_cookie(cookie);
    
    let resp = agent.get("http://127.0.0.1:5000/").call();
  
    println!("{:?}", resp);
}

This code is based on the example given in https://github.com/algesten/ureq/blob/master/src/agent.rs#L189-L196.

I'd expect this snippet to send the cookie name=value to http://127.0.0.1:5000, but the server isn't receiving the cookie. Am I doing something wrong?

If I set the Cookie header manually, the server does receive the cookie, but this seems to defeat the point of having a set_cookie() method:

extern crate ureq;

fn main() {
    let mut agent = ureq::agent();

    agent.set("Cookie", "name=value");
    
    let resp = agent.get("http://127.0.0.1:5000/").call();
  
    println!("{:?}", resp);
}

Conjecture

This is the body of set_cookie():

ureq/src/agent.rs, lines 199 to 205 (at da42f2e):

let mut state = self.state.lock().unwrap();
match state.as_mut() {
    None => (),
    Some(state) => {
        state.jar.add_original(cookie);
    }
}

If it can't acquire the state as mutable, the cookie is quietly dropped. I wonder if that's what's happening here?

How I'm inspecting the cookies

At http://127.0.0.1:5000, I'm running a small Python web server. On every request, it prints the headers and the cookies it received:

#!/usr/bin/env python
# -*- encoding: utf-8

import flask

app = flask.Flask(__name__)


@app.route("/")
def index():
    print("\nGot a request!")
    print("Headers: %r" % dict(flask.request.headers))
    print("Cookies: %r" % flask.request.cookies)
    return "hello world"


if __name__ == "__main__":
    app.run(port=5000)

This is the output:

# Using set_cookie("name", "value")
Got a request!
Headers: {'Host': '127.0.0.1', 'User-Agent': 'ureq', 'Accept': '*/*'}
Cookies: {}

# Using set("Cookie", "name=value")
Got a request!
Headers: {'Host': '127.0.0.1', 'User-Agent': 'ureq', 'Accept': '*/*', 'Cookie': 'name=value'}
Cookies: {'name': 'value'}

Make 'cookie' optional

Is it possible to make the cookie dependency optional? When making server-to-server requests it's not often that you use cookies, and the cookie dependency brings in quite a few others:

│   ├── cookie v0.12.0
│   │   ├── time v0.1.42
│   │   │   └── libc v0.2.65 (*)
│   │   │   [dev-dependencies]
│   │   │   └── winapi v0.3.8
│   │   └── url v1.7.2
│   │       ├── idna v0.1.5
│   │       │   ├── matches v0.1.8
│   │       │   ├── unicode-bidi v0.3.4
│   │       │   │   └── matches v0.1.8 (*)
│   │       │   └── unicode-normalization v0.1.8
│   │       │       └── smallvec v0.6.10
│   │       ├── matches v0.1.8 (*)
│   │       └── percent-encoding v1.0.1

So, would it be acceptable to add a cookie feature (enabled by default perhaps) that disables this dependency?

"Host" header set incorectly (port missing)

It seems that if a URL with a port is used – for example, "http://localhost:9222/json/version" – then the "Host" header is not set correctly (the port seems to be missing).

I connected to the HTTP front end of a DevTools protocol server, which returns a WebSocket URL.

When I connected via ureq, this WebSocket URL did not include the port of the server (and connecting to the WebSocket URL therefore failed).

When I inspected the response of the server with different clients (Chrome and Curl), the server returned the correct WebSocket URL.

After I added the correct "Host" header to the ureq request myself, the server returned the correct WebSocket URL as well.

Therefore I assume that ureq does not set the "Host" header correctly.

According to MDN, if no port is included in the "Host" header, it defaults to 80 (for HTTP) or 443 (for HTTPS), which in the above case is incorrect.
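A sketch of the expected behavior (RFC 7230 section 5.4: include the port unless it is the scheme default):

fn host_header(host: &str, port: u16, https: bool) -> String {
    let default_port = if https { 443 } else { 80 };
    if port == default_port {
        host.to_string()
    } else {
        format!("{}:{}", host, port)
    }
}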

Solved

Hi, I have this error when making a POST:

No cached session for DNSNameRef("discordapp.com") 
Not resuming any session

Code

let resp = ureq::post(&url) // &url == discord webhook
    .send_json(json!({ "content": "test" }));

info!("Response: {} Status: {} StatusLine: {}", resp.status(), resp.status_text(), resp.status_line());

Doc build failing

The doc build for 1.2.0 at https://docs.rs/crate/ureq/1.2.0/builds/261220 is failing due to:

[INFO] [stderr]  Documenting ureq v1.2.0 (/opt/rustwide/workdir)
[INFO] [stderr] error: You have both the "tls" and "native-tls" features enabled on ureq. Please disable one of these features.
[INFO] [stderr]    --> src/lib.rs:185:1
[INFO] [stderr]     |
[INFO] [stderr] 185 | / std::compile_error!(
[INFO] [stderr] 186 | |     "You have both the \"tls\" and \"native-tls\" features enabled on ureq. Please disable one of these features."
[INFO] [stderr] 187 | | );
[INFO] [stderr]     | |__^

Ability to debug internal errors

Hi @algesten, we are the Veloren team (a Rust game) and we decided to use your ureq crate a few months ago for our auth API, replacing reqwest. We love the fact that it's so small and straight to the point.

But we just discovered an issue. We are getting a 500 synthetic error BadStatus, similar to #10 (but not exactly).
We updated to 1.3 just to be sure, but the error persists.
It comes sporadically, about every 5 minutes, and looks TLS-related.

I've read your reasoning for the synthetic errors, but when something is bad inside ureq, or in the way we are handling it, it makes things super hard to debug.

Do you have a recommended approach for errors like this? By the way, we are using default-features = false with only the "tls" feature.

Handle error in into_reader

I cannot handle the error in this case:

use std::io;

fn main() {
    let mut reader = ureq::get("https:://123").call().into_reader();
    let mut out = Vec::new(); // error message will be there
    io::copy(&mut reader, &mut out).unwrap(); // no error

    dbg!(String::from_utf8_lossy(&out));
}

It looks like no error propagates from read to io::copy.

Any example of multipart post?

Hi! Thanks to ureq, I have achieved what I wanted in my project to a large extent. Is there any multipart support/example right now?

Can't connect to website via https

I can't connect to a website:

    let agent = ureq::agent();
    agent.get("https://redacted.com/Default.asp?procedura=ERRORE").call();

I get Network Error: Connection reset by peer (os error 54), but curl handles it fine.

*   Trying 217.19.150.244...
* TCP_NODELAY set
* Connected to redacted.com (0.0.0.0) port 443 (#0)
* ALPN, offering h2
* ALPN, offering http/1.1
* Cipher selection: ALL:!EXPORT:!EXPORT40:!EXPORT56:!aNULL:!LOW:!RC4:@STRENGTH
* successfully set certificate verify locations:
*   CAfile: /etc/ssl/cert.pem
  CApath: none
* TLSv1.2 (OUT), TLS handshake, Client hello (1):
* TLSv1.2 (IN), TLS handshake, Server hello (2):
* TLSv1.2 (IN), TLS handshake, Certificate (11):
* TLSv1.2 (IN), TLS handshake, Server key exchange (12):
* TLSv1.2 (IN), TLS handshake, Server finished (14):
* TLSv1.2 (OUT), TLS handshake, Client key exchange (16):
* TLSv1.2 (OUT), TLS change cipher, Client hello (1):
* TLSv1.2 (OUT), TLS handshake, Finished (20):
* TLSv1.2 (IN), TLS change cipher, Client hello (1):
* TLSv1.2 (IN), TLS handshake, Finished (20):
* SSL connection using TLSv1.2 / ECDHE-RSA-AES256-SHA384
* ALPN, server did not agree to a protocol
* Server certificate:
*  subject: OU=Domain Control Validated; CN=*.redacted.com
*  start date: Feb  3 18:49:20 2020 GMT
*  expire date: Mar  6 12:51:01 2022 GMT
*  subjectAltName: host "redacted.com" matched cert's "*.redacted.com"
*  issuer: C=BE; O=GlobalSign nv-sa; CN=AlphaSSL CA - SHA256 - G2
*  SSL certificate verify ok.
> GET /Default.asp?procedura=ERRORE HTTP/1.1
> Host: redacted.com
> User-Agent: curl/7.54.0
> Accept: */*
> 
< HTTP/1.1 200 OK
