
aesiniath / http-streams


Haskell HTTP client library for use with io-streams

Home Page: https://hackage.haskell.org/package/http-streams

License: BSD 3-Clause "New" or "Revised" License

Haskell 99.92% HTML 0.08%

http-streams's People

Contributors

23skidoo, 3noch, alissa-tung, andreasabel, brandon-leapyear, chemist, ddssff, emmanueltouzery, erikd, gregorycollins, hololeap, hvr, iliastsi, istathar, ixmatus, juhp, kgardas, lukehoersten, lukerandall, michaelxavier, nanotech, noteed, rtrvrtg, singpolyma, snoyberg, tanakh, theunixman, vekhir, werehamster


http-streams's Issues

Way to get at underlying response headers

I'm writing a utility that abstracts over http-streams into my own request/response types, which expose fields so that I can have a version that fakes HTTP calls/responses.

All of this is pretty straightforward, except I find myself hindered by not being able to get at the full list of headers for a Response so I can export it into my own Response. I read your comment on why you hide the underlying representation, but even outside of my special use case, there are reasons to want the full response headers. For instance: say you wanted to iterate over the X-* headers the server gave back to you. You'd have no way of knowing which ones were there, since they are not part of the spec, so you would have to probe the response for them.
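
For illustration, here is a minimal sketch of that probing, assuming a hypothetical accessor like responseHeaders :: Response -> [(ByteString, ByteString)] were exported (no such function exists today, which is the point of this issue):

import qualified Data.ByteString.Char8 as B

-- Keep only the extension headers the server sent back.
extensionHeaders :: [(B.ByteString, B.ByteString)] -> [(B.ByteString, B.ByteString)]
extensionHeaders = filter (\(k, _) -> B.pack "X-" `B.isPrefixOf` k)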

Convenience functions don't support authentication

When writing a small spider application for a webservice, I had to re-implement some code from the http-streams convenience functions because they do not support authentication.
Would it make sense to support URLs like

https://user:[email protected]/blub

to enable that use case?
I volunteer to look into that if it makes sense to you. I'm also open to other approaches to solve this problem.
(I didn't check the code on GitHub, just the released tarball, so feel free to close this issue if this is already supported.)
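
In the meantime, a workaround is to set the Authorization header by hand. A rough sketch, with illustrative credentials and path, using the base64-bytestring package that http-streams already depends on:

{-# LANGUAGE OverloadedStrings #-}

import qualified Data.ByteString.Base64 as B64
import qualified Data.ByteString.Char8 as B
import Network.Http.Client

-- Build a request carrying HTTP Basic credentials set by hand.
authedRequest :: IO Request
authedRequest = buildRequest $ do
    http GET "/blub"
    setHeader "Authorization" ("Basic " `B.append` B64.encode "user:password")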

Segfault when retrieving `https` urls via `get`

The following program segfaults for me:

module Main where

import qualified Data.ByteString.Char8 as B
import           Data.Monoid
import           Network.Http.Client

main :: IO ()
main = do
    content <- get "http://www.heise.de" concatHandler'
    putStrLn $ "HTTP: got response with " <> show (B.length content) <> " bytes"
    content2 <- get "https://www.heise.de" concatHandler'
    putStrLn $ "HTTPS: got response with " <> show (B.length content2) <> " bytes"
    return ()

When compiled and executed in gdb, I get the following (the http://-fetch succeeds, the https://-fetch segfaults):

HTTP: got response with 98401 bytes

Program received signal SIGSEGV, Segmentation fault.
0x00007ffff75c206f in SSL_CTX_set_cipher_list () from /lib/x86_64-linux-gnu/libssl.so.1.0.0
(gdb) bt
#0  0x00007ffff75c206f in SSL_CTX_set_cipher_list () from /lib/x86_64-linux-gnu/libssl.so.1.0.0
#1  0x000000000069b50c in sUI7_info ()
#2  0x0000000000000000 in ?? ()

The following package versions were used:

"http-streams-0.6.1.1" -> "HsOpenSSL-0.10.3.4"
"http-streams-0.6.1.1" -> "aeson-0.6.2.0"
"http-streams-0.6.1.1" -> "attoparsec-0.10.4.0"
"http-streams-0.6.1.1" -> "base-4.6.0.1"
"http-streams-0.6.1.1" -> "base64-bytestring-1.0.0.1"
"http-streams-0.6.1.1" -> "blaze-builder-0.3.1.1"
"http-streams-0.6.1.1" -> "bytestring-0.10.0.2"
"http-streams-0.6.1.1" -> "case-insensitive-1.1"
"http-streams-0.6.1.1" -> "io-streams-1.1.2.0"
"http-streams-0.6.1.1" -> "mtl-2.1.2"
"http-streams-0.6.1.1" -> "network-2.4.1.2"
"http-streams-0.6.1.1" -> "openssl-streams-1.1.0.0"
"http-streams-0.6.1.1" -> "text-0.11.3.1"
"http-streams-0.6.1.1" -> "transformers-0.3.0.0"
"http-streams-0.6.1.1" -> "unordered-containers-0.2.3.2"

Multi-line headers may not be properly handled

I have a feeling the header part of the response parser may not be handling multi-line values. Someone needs to write a test case to check for the correct behaviour. Then we can fix it if necessary.
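
For reference, RFC 2616 (section 2.2) allows a header value to be folded across multiple lines by starting each continuation line with a space or a tab, so a test case would need to feed the parser a response along these lines:

HTTP/1.1 200 OK
X-Example: first part of the value
    second part of the same value
Content-Length: 0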

AfC

Expose hidden modules

I am using http-streams to write tests against an HTTP API. I would like to simulate an interrupted transmission in sendRequest. I wanted to copy/paste/modify the original sendRequest function, but I cannot do that because it relies on hidden modules.

Would you agree to expose the hidden modules?

Otherwise, would you accept a pull request to add such a misbehaving version of sendRequest?

Otherwise I may be able to achieve what I want by passing an appropriately broken handler argument to sendRequest.

Chunk size for chunk encoded content should not be limited

(No code for this issue attached, since I think it should be discussed first.)
I was using http-streams to spider our company's webservice and discovered that it couldn't download some documents because of the chunk size selected by the webserver:

  *** Exception: HttpParseException "parseTransferChunk: chunk of size 1589419 too long."

I had a look at the HTTP spec, and no maximum chunk size is defined there.
Now, I understand why you built this limitation into http-streams, but it really limits the usability of the package in real-life scenarios.
I would vote for removing this limit.

HTTPS connections throw a segmentation fault

I'm just using:
get "https://google.com" concatHandler

When the program is run with gdb it gives me this:

Program received signal EXC_BAD_ACCESS, Could not access memory.
Reason: KERN_INVALID_ADDRESS at address: 0x0000000000000000
0x00007fff8acad807 in SSL_CTX_set_cipher_list ()

I'm currently trying to track it down and figure out if it's a bad openssl version.

openConnectionSSL can fail if the system supports ipv6

openConnectionSSL assumes that the socket's family is AF_INET.

Specifically it is failing for the case of google.

import qualified Data.ByteString.Char8 as BS
import Network.Http.Client
import OpenSSL (withOpenSSL)

main :: IO ()
main = withOpenSSL $ do
    ctx <- baselineContextSSL
    c <- openConnectionSSL ctx (BS.pack "www.google.com") 443
    closeConnection c

buildRequest is in IO and requires an open connection

buildRequest requires an open connection as a parameter, to ensure that the Host header gets the correct value. However, this design forces the connection to be opened before beginning to build the request.

kstt suggested that instead, the value of the Host header should be a Maybe value, Nothing by default. If Host is still Nothing when the request is sent, it would be filled in with the value taken from the connection object.

kstt also pointed out that buildRequest should be a pure function with a Monoid instance, not in the IO monad.
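
A minimal sketch of that suggestion, with the types stripped down to the relevant fields (names hypothetical):

import Data.ByteString (ByteString)
import Data.Maybe (fromMaybe)

-- The request carries an optional Host; Nothing means "not set yet".
data Request = Request { qHost :: Maybe ByteString }

data Connection = Connection { cHost :: ByteString }

-- At send time, fall back to the connection's host if the caller never
-- set one, so building the request no longer needs the Connection.
effectiveHost :: Connection -> Request -> ByteString
effectiveHost c q = fromMaybe (cHost c) (qHost q)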

form-data encoding support?

It seems the current encodedFormBody implementation doesn't support file input, while my company's file-upload API doesn't support PUTting a raw file. I'd like to request a feature that allows building a POST request with a file using multipart/form-data encoding. :)
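
Lacking built-in support, one can hand-assemble a multipart/form-data body. A rough sketch, with illustrative boundary, field name, and filename:

{-# LANGUAGE OverloadedStrings #-}

import qualified Data.ByteString.Char8 as B

-- Frame one file part between the opening and closing boundary markers.
multipartBody :: B.ByteString -> B.ByteString -> B.ByteString -> B.ByteString
multipartBody boundary field contents = B.concat
    [ "--", boundary, "\r\n"
    , "Content-Disposition: form-data; name=\"", field, "\"; filename=\"upload.bin\"\r\n"
    , "Content-Type: application/octet-stream\r\n"
    , "\r\n"
    , contents, "\r\n"
    , "--", boundary, "--\r\n"
    ]

The request would then also need its Content-Type header set to multipart/form-data with the same boundary parameter.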

Regression from #25

Hey guys,

I think there might have been a regression from issue #25. Using the latest http-streams 0.6.0.2 and io-streams 1.1.0.3, when making a request (in this case a GET) to a URL that returns an empty response body (status code 204 no content), the client hangs forever. The same gist I filed in the original ticket #26 still reproduces the error for me:

https://gist.github.com/MichaelXavier/5475167

Segfault when using with SSL on Ubuntu 13.04

Any attempt to perform an HTTPS request results in a segfault on my system:

Program received signal SIGSEGV, Segmentation fault.
0xb7dd55c9 in SSL_CTX_set_cipher_list ()
   from /lib/i386-linux-gnu/libssl.so.1.0.0
(gdb) bt
#0  0xb7dd55c9 in SSL_CTX_set_cipher_list ()
   from /lib/i386-linux-gnu/libssl.so.1.0.0
#1  0x0821fa0c in sUBm_info ()
#2  0x00000000 in ?? ()

Fails to compile with GHC 7.0.4

cabal install http-streams on GHC 7.0.4 fails with

Configuring http-streams-0.7.2.6...
Preprocessing library http-streams-0.7.2.6...
Preprocessing test suites for http-streams-0.7.2.6...
Building http-streams-0.7.2.6...

lib/Network/Http/Utilities.hs:27:14:
    Unsupported extension: Trustworthy
Failed to install http-streams-0.7.2.6
cabal: Error: some packages failed to install:
http-streams-0.7.2.6 failed during the building phase. The exception was:
ExitFailure 1

It works fine in GHC 7.2.2, though. Hence, if you don't want to support GHC 7.0.4 anymore, please set a lower bound of base >= 4.4 in future http-streams releases.
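
That is, something along these lines in the library stanza of http-streams.cabal (a sketch):

library
  build-depends:
    base >= 4.4 && < 5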

release 0.7.2.5 doesn't include test modules

In-place registering http-streams-0.7.2.5...
Preprocessing test suite 'check' for http-streams-0.7.2.5...

tests/check.hs:19:8:
    Could not find module ‘MockServer’
    Use -v to see a list of the files searched for.

tests/check.hs:20:8:
    Could not find module ‘TestSuite’
    Use -v to see a list of the files searched for.

Does not work with GHC 7.0.x

I just migrated snap to use http-streams, and on the 2011.4.0.0 platform build, we're getting a failure with this error:

src/Network/Http/Inconvenience.hs:49:34:
    Module `Data.Monoid' does not export `(<>)'

((<>) was only added to Data.Monoid in base 4.5, which shipped with GHC 7.4.)

SSL CA certificates not found on fedora

If I run this test program on fedora 18:
https://dl.dropbox.com/u/22600720/Test.hs

ghc Test.hs -threaded

The output for me is:
Test: ProtocolError "error:14090086:SSL routines:SSL3_GET_SERVER_CERTIFICATE:certificate verify failed"

curl and http-conduit on the same computer have no problems, therefore I think the CA certificates are present on the computer.

$ ls -lh /etc/ssl/certs/
total 1.5M
-rw-r--r-- 1 root root 697K Jan 4 19:10 ca-bundle.crt
-rw-r--r-- 1 root root 778K Jan 4 19:10 ca-bundle.trust.crt
-rwxr-xr-x 1 root root 610 Mar 18 22:21 make-dummy-cert
-rw-r--r-- 1 root root 2.2K Mar 18 22:21 Makefile
-rwxr-xr-x 1 root root 829 Mar 18 22:21 renew-dummy-cert

And if in Inconvenience.hs, I change the linux code to this:

SSL.contextSetCAFile ctx "/etc/ssl/certs/ca-bundle.crt"

Then it works.
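
Put differently, a context built along the following lines works on Fedora. A sketch using HsOpenSSL directly; the verification settings are illustrative:

import OpenSSL (withOpenSSL)
import qualified OpenSSL.Session as SSL

-- Build an SSLContext that trusts Fedora's bundled CA file.
fedoraContextSSL :: IO SSL.SSLContext
fedoraContextSSL = withOpenSSL $ do
    ctx <- SSL.context
    SSL.contextSetCAFile ctx "/etc/ssl/certs/ca-bundle.crt"
    SSL.contextSetVerificationMode ctx (SSL.VerifyPeer True True Nothing)
    return ctx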

I tracked this down in one of the dependencies of http-conduit:
https://github.com/vincenthz/hs-certificate/tree/master/System/Certificate/X509

The Unix.hs version seems to open every file under that certs folder by hand. I guess OpenSSL will only open single certificates, not bundles, hence the problem?

And finally:
$ ls -l /usr/lib/ssl
ls: cannot access /usr/lib/ssl: No such file or directory

$ ls -l /etc/ssl/certs
lrwxrwxrwx 1 root root 16 Jan 18 21:56 /etc/ssl/certs -> ../pki/tls/certs

Content-Length: 0 causes hang

For some reason, when following a redirect, get delivers the content but then hangs. Requesting the target URL directly works fine.

AfC

Can we export the entire Request object from Network.Http.Client?

Hi,

@afcowie, thanks for writing this library.

I'm writing bindings to Amazon's dynamodb service with http-streams.

It seems like the buildRequest function adds additional headers to requests (Accept-Encoding, Transfer-Encoding), shown below. When making requests with Amazon's V4 signing algorithm the headers must be exact or the signature will be invalid, and there are only specific headers that are allowed.

In order to get around this I tried to set the headers manually but it appears I cannot access the Request object constructor, it seems only the type is exported!

-- Line 122: Network.Http.Client
, Request -- would need to be Request (..)

Here's the request I'm making

req <- buildRequest $ do
          http POST "/"
          mapM_ makeHeader headers

-- these are the only headers I'm using, but additional ones are added
headers :: RequestHeaders
headers = [ ("host", "dynamodb.us-east-1.amazonaws.com")
          , ("x-amz-target", "DynamoDB_20120810." <> operation)
          , ("connection", "Keep-Alive")
          , ("content-type", "application/x-amz-json-1.0")
          ]

makeHeader :: Header -> RequestBuilder ()
makeHeader (key, val) = setHeader (original key) val

Result:

POST / HTTP/1.1
Host: <default>
Authorization: AWS4-HMAC-SHA256 Credential=AKIAIEIPUFMLQGTUVCPA/20141104/us-east-1/dynamodb/aws4_request, SignedHeaders=connection;content-type;host;x-amz-date;x-amz-target, Signature=c00daa9dede346bbf721664d7b8ebe42197ec2fbb2de8f698c8a2dd752f5478c
x-amz-target: DynamoDB_20120810.ListTables
Accept-Encoding: gzip // <-- I didn't specify this one 
Transfer-Encoding: chunked // <-- or this one 
content-type: application/x-amz-json-1.0
host: dynamodb.us-east-1.amazonaws.com
connection: Keep-Alive
x-amz-date: 20141104T214118Z

Which makes my signature invalid.

All I need to be able to do is this:

let req = request { qHeaders = headers } -- but I get "qHeaders is not a visible constructor" ;-(

Can we please re-export the entire Request object from Network.Http.Types? Exporting Response wouldn't hurt either. I don't care if you force the request to have additional HTTP/1.1 headers, but please let us override them if we don't want them by exporting the qHeaders constructor, or explain to me how to remove them another way.

Thanks,

David

Exception wrapping stream to catch attoparsec errors

In a comment on snapframework/io-streams#3 (comment), @gregorycollins wrote:

Attoparsec has limited error reporting support. One thing you could do is write an exception wrapping stream that would catch any exception generated by read and wrap some kind of informative error message around it. More practically, what you will want to do for http-streams is to catch the exceptions you expect (ParseException being chief amongst them) and then do something smart about converting that error into a sensible http-streams exception.
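
A minimal sketch of what such a wrapper might look like; the exception name and message are illustrative, not the library's actual API:

import Control.Exception (Exception, handle, throwIO)
import System.IO.Streams.Attoparsec (ParseException (..))

newtype HttpParseException = HttpParseException String
    deriving (Show)

instance Exception HttpParseException

-- Re-throw io-streams' attoparsec failure as a library-level exception
-- carrying a more informative message.
wrapParseError :: IO a -> IO a
wrapParseError = handle $ \(ParseException msg) ->
    throwIO (HttpParseException ("malformed response: " ++ msg))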

This would be a Good Thing™.

AfC

`HostName` should not be represented by `String`

Currently, the following definitions exist in Network.Http.Client:

type Hostname = String
type Port = Int

which are used by the following functions:

openConnection :: Hostname -> Port -> IO Connection
setHostname :: Hostname -> Port -> RequestBuilder ()
openConnectionSSL :: SSLContext -> Hostname -> Port -> IO Connection

The Hostname type should rather be represented by ByteString, as hostnames are not supposed to be Unicode (except for things like punycode), and moreover ByteString is a more efficient representation. I assume the reason for Hostname being String is that the network package's Network.HostName is being mimicked.

PS: on a related matter, it might be sensible to represent Port with something like Word16 or network's Network.PortNumber (which is a newtype around Word16), as that is what valid port numbers are for TCP over IPv4/IPv6.
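
In other words, something like this (a sketch of the proposal, not the current API):

import Data.ByteString (ByteString)
import Data.Word (Word16)

type Hostname = ByteString
type Port = Word16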

Possibly not handling 204 responses properly

I'm doing some testing with a server that returns 204 no content. I've implemented the test server in scotty and snap and for both, http-streams hangs indefinitely on 204 responses with no content. Here's a minimal example:

https://gist.github.com/MichaelXavier/5475167

Just for fun, I made another server in Ruby's sinatra library to return a 204. http-streams handled it fine. I hit both with curl and saw this difference:

ruby:
* Adding handle: conn: 0xc46de0
* Adding handle: send: 0
* Adding handle: recv: 0
* Curl_addHandleToPipeline: length: 1
* - Conn 0 (0xc46de0) send_pipe: 1, recv_pipe: 0
* About to connect() to localhost port 4567 (#0)
*   Trying 127.0.0.1...
* Connected to localhost (127.0.0.1) port 4567 (#0)
> GET / HTTP/1.1
> User-Agent: curl/7.30.0
> Host: localhost:4567
> Accept: */*
> 
< HTTP/1.1 204 No Content
< X-Content-Type-Options: nosniff
< Connection: close
* Server thin 1.5.1 codename Straight Razor is not blacklisted
< Server: thin 1.5.1 codename Straight Razor
< 
* Closing connection 0


scotty:
* Adding handle: conn: 0x76ade0
* Adding handle: send: 0
* Adding handle: recv: 0
* Curl_addHandleToPipeline: length: 1
* - Conn 0 (0x76ade0) send_pipe: 1, recv_pipe: 0
* About to connect() to localhost port 4568 (#0)
*   Trying 127.0.0.1...
* Connected to localhost (127.0.0.1) port 4568 (#0)
> GET / HTTP/1.1
> User-Agent: curl/7.30.0
> Host: localhost:4568
> Accept: */*
> 
< HTTP/1.1 204 No Content
* Server Warp/1.3.8 is not blacklisted
< Server: Warp/1.3.8
< 
* Connection #0 to host localhost left intact

I think the difference is that warp and snap are leaving the connection intact and http-streams isn't closing it?

add FreeBSD support

Hint: Mac OS X is built on FreeBSD. The gcc compiler define is __FreeBSD__.

SSL example fails with parse exception

Checking out the source and loading the SSL test file fails with "not enough bytes".

cabal repl
Network.Http.Client> :l tests/SecureSocketsSnippet.hs
Snippet> main
Exception: Parse exception: not enough bytes

I am running the Haskell Platform on 32-bit OS X using GHC 7.6.3.

Generic request helper

It would be nice to have something similar to post, but that takes an HTTP method as an argument and makes the ContentType optional with a Maybe.
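
A rough sketch of such a helper, written against the primitives that appear elsewhere in these issues; the hostname and port are hard-wired for illustration rather than parsed from a URL:

{-# LANGUAGE OverloadedStrings #-}

import Data.ByteString (ByteString)
import Network.Http.Client
import System.IO.Streams (InputStream)

-- Like post, but parameterised over the method, with an optional
-- Content-Type.
requestWith :: Method -> ByteString -> Maybe ContentType
            -> (Response -> InputStream ByteString -> IO b) -> IO b
requestWith method path mtype handler = do
    c <- openConnection "www.example.com" 80
    q <- buildRequest $ do
        http method path
        maybe (return ()) setContentType mtype
    sendRequest c q emptyBody
    b <- receiveResponse c handler
    closeConnection c
    return b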

Safe JSON response helper

I currently use:

safeJSONresponse :: (Aeson.FromJSON a) => Response -> InputStream ByteString -> IO (Either Error (Response, a))
safeJSONresponse resp i = runUnexceptionalIO $ runEitherT $ do
    v <- fmapLT (\e -> handle e (fromException e)) $ fromIO $
        parseFromStream Aeson.json' i
    case Aeson.fromJSON v of
        Aeson.Success a -> return (resp, a)
        Aeson.Error _ -> throwT parseError
  where
    parseError = Error Unavailable (T.pack "JSON parse error")
    handle _ (Just (ParseException _)) = parseError
    handle e _ = Error Unavailable (T.pack $ "Exception: " ++ show e)

For something generic to go into the library, this would probably be simplified to use Maybe and only catch the parse error -- other exceptions would still go through.

Cannot send requests concurrently

After establishing a connection to another HTTP server, when two send/receive actions are run concurrently, the results can be unpredictable (e.g. parse failures). I was wondering what the recommended way to send requests concurrently is (if possible). These tests were run under ghci and an executable compiled with ghc -rtsopts -threaded.

Example

------------------------------------------------------------------------------
-- | Request Builder for API
buildHNRequest :: FromJSON a => Text -> HackerNews (Maybe a)
buildHNRequest url = do
    con <- ask
    liftIO $ do
      req <- buildRequest $ do
        http GET $ "/v0`/" <> T.encodeUtf8 url <> ".json"
        setHeader "Connection" "Keep-Alive"
        setAccept "application/json"
      let action = do sendRequest con req emptyBody
                      receiveResponse con $ const Streams.read
      c <- concurrently action action
      return $ case fst c of
        Nothing -> Nothing
        Just bs -> do
          let xs = rights [parseOnly value bs, parseOnly json bs]
          case xs of
            []    -> Nothing
            x : _ ->
              case fromJSON x of
                Success a -> Just a
                _         -> Nothing

Result: the first works fine, the second fails (it goes back and forth like this).

λ> main
Just (User {userAbout = Nothing, userCreated = 2013-08-06 16:49:23 UTC, userDelay = 0, userId = UserId "dmjio", userKarma = 7, userSubmitted = [8433827,8429256,8429161,8429069,8374809,8341570,7919268,7825469,7350544,7327291,6495994,6352317,6168527,6168524,6167639], userDeleted = False, userDead = False})
λ> main
*** Exception: Parse exception: Failed reading: takeWith
λ>
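
For what it's worth, a Connection wraps a single socket, so the usual workaround is to give each concurrent request its own connection. A sketch, with the host and paths assumed for illustration:

{-# LANGUAGE OverloadedStrings #-}

import Control.Concurrent.Async (concurrently)
import qualified Data.ByteString.Char8 as B
import Network.Http.Client

-- Open a fresh connection per request so the two response parsers
-- never interleave on one socket.
fetch :: B.ByteString -> IO B.ByteString
fetch path = do
    c <- openConnection "hacker-news.firebaseio.com" 80
    q <- buildRequest $ do
        http GET path
        setAccept "application/json"
    sendRequest c q emptyBody
    b <- receiveResponse c concatHandler
    closeConnection c
    return b

main :: IO ()
main = do
    (a, b) <- concurrently (fetch "/v0/item/1.json") (fetch "/v0/item/2.json")
    print (B.length a, B.length b)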

`get` doesn't handle relative redirects correctly

e.g.

> get "http://www.melding-monads.com/redirect/example.php" concatHandler
*** Exception: Can't parse URI /redirect/success.html

Where example.php is just:

<?php
  header('Location: /redirect/success.html');
?>

While relative redirects are not specifically allowed by RFC 2616, this is considered a bug in the RFC, and almost every HTTP client supports them.

Feature request: behavior similar curl --resolve

Using http-streams' setHostname makes it possible, for instance, to test an Nginx virtual server served from 127.0.0.1. This is similar to curl -H 'Host: hostname' 127.0.0.1.

But to work with HTTPS, it is better to use curl --resolve 'hostname:80:127.0.0.1' hostname so that certificates can be checked against the correct hostname instead of 127.0.0.1.

It would be great to have something similar to --resolve in http-streams.

Example programs don't run out of the box

I'm using GHC 7.4.1.

The following example doesn't work for me as listed:

import System.IO.Streams (InputStream, OutputStream, stdout)
import qualified System.IO.Streams as Streams
import qualified Data.ByteString as S
import Network.Http.Client

main :: IO ()
main = do
    c <- openConnection "www.example.com" 80

    q <- buildRequest $ do
        http GET "/"
        setAccept "text/html"

    sendRequest c q emptyBody

    receiveResponse c (\p i -> do
        x <- Streams.read i
        S.putStr $ fromMaybe "" x)

    closeConnection c

Firstly, fromMaybe requires importing Data.Maybe. Secondly, for most of the strings, I get errors like:

    Couldn't match expected type `Hostname' with actual type `[Char]'
    In the first argument of `openConnection', namely
      `"www.example.com"'

The two things that fixed it were:

{-# LANGUAGE OverloadedStrings #-}

or:

import qualified Data.ByteString.Char8 as S8
...
...(S8.pack "www.example.com")...

Both of these seemed to be considered capitulations of some sort by the folks in the Haskell IRC channel; I'm too much of a beginner to say. But there you have it: the examples don't quite work out of the box.

buildRequest in IO?

Why is buildRequest in IO? Shouldn't it be possible to build requests without IO?

`http` constructs invalid request-line

The builder http :: Method -> ByteString -> RequestBuilder (), in its current form, results in invalid request lines if called with an empty string for the URL path. At the very least, the documentation should mention the preconditions for the ByteString parameter. Alternatively, it could fix up an empty string to mean "/".

Requests to pathless urls make empty paths in request line

I'm not sure if this is intentional or not, but using get "http://www.google.com" will generate a request whose request line looks like GET HTTP/1.1, which I believe is invalid.

Personally, I expected a request line of GET / HTTP/1.1. While I think a blank path could be expected if I did something silly like http GET "", when using an interface that takes the whole URL, I believe it should supply the / automatically.

What do you think?

Weird build setup interferes with use unpacked in a sandbox

I'm using http-streams unpacked, with patches applied, in a sandbox. The weird build stuff in http-streams is sufficiently odd that it prevents cabal from installing any other packages.

$ cabal unpack http-streams
$ cabal sandbox add-source http-streams*
$ git clone [email protected]:/me/myrepo
$ cabal sandbox add-source myrepo
$ cabal install http-streams
<snip>

$ cabal install myrepo
setup: snippet.hs doesn't exist

To make the install work I have to remove the http-streams directory as a source:

$ cabal sandbox delete-source http-streams-*
$ cabal install myrepo
<snip>

While this seems to be a bug in Cabal or cabal-install (it runs the wrong Setup binary), not having the crazy build stuff would stop it from exhibiting.

Support HTTP/1.0 responses

If the server responds with HTTP/1.0, the response parser fails - it appears that only HTTP/1.1 is supported. See Network.Http.ResponseParser.parseStatusLine

DELETE does not send body

Streaming a body works correctly for POST and PUT requests, however if you wire up a DELETE request with a body stream, the server never gets it.

https://gist.github.com/MichaelXavier/5564392

I did the server in Ruby just for a minimal, obvious test case, but you could do it in Haskell. A POST or a PUT request will have the server echo back the request body it was sent. A DELETE will not send a body at all.

Segfaults on post

Sadly, my first try with this caused a segfault :(

import qualified Data.ByteString.Char8 as C
import Network.Http.Client as HTTP

main :: IO ()
main = do
    body <- HTTP.post (C.pack "https://www.google.com")
                      (C.pack "multipart/form-data")
                      HTTP.emptyBody HTTP.concatHandler
    C.putStrLn body

Either I'm doing something wrong (likely) or there is a bug?

Expose more granular header control

I'll throw down a scenario here, in which I've had some altercations with http-streams:

Say you're generating comprehensive request signatures for a 3rd party service such as some of the newer Amazon Services, which require you to audit/order and comprehensively sign what constitutes your request.

When using http-streams, the User-Agent and Accept-Encoding headers are set by default... so far, no problem... one can just poke around in the source and then hardcode the values, no?

But then, when the http-streams version is incremented after a package upgrade, the signing process breaks!

There is also a tangentially related issue (I'd rather not argue HTTP specs here): by setting the Host header with a port number, if :80 is not used (let's say :443), more flaming hoops must be leaped through.

A solution would be some additional mechanism(s) for having more control over headers before the request is finalised.

Some ideas in no particular order, regardless of sanity:

  • Add getHeaders to the RequestBuilder monad.
  • Expose deleteHeader.
  • Make setHeader "Host" <value> override the default Host header.
  • Get rid of the default User-Agent header.

Thanks for listening!

openConnection and openConnectionSSL leak file descriptors on connect failures

When attempting to open a connection to an address which can be successfully resolved but not connected to, openConnection and openConnectionSSL leave the file descriptor belonging to the socket open.

Test-case:

module Main where

import Control.Exception
import Network.Http.Client
import qualified System.IO.Streams as S

main :: IO ()
main = sequence_
     . take 2048
     . repeat $ handle (\e -> print (e :: IOException))
         (get "http://localhost:60000" (\_ i -> S.connect i S.stdout))

One has to handle exceptions within openConnection* to ensure the socket is closed, e.g.

diff --git a/src/Network/Http/Connection.hs b/src/Network/Http/Connection.hs
index d2eb1a4..d1078ce 100644
--- a/src/Network/Http/Connection.hs
+++ b/src/Network/Http/Connection.hs
@@ -39,7 +39,7 @@ import Blaze.ByteString.Builder (Builder)
 import qualified Blaze.ByteString.Builder as Builder (flush, fromByteString,
                                                       toByteString)
 import qualified Blaze.ByteString.Builder.HTTP as Builder (chunkedTransferEncoding, chunkedTransferTerminator)
-import Control.Exception (bracket)
+import Control.Exception (bracket, bracketOnError, onException)
 import Data.ByteString (ByteString)
 import qualified Data.ByteString.Char8 as S
 import Data.Monoid (mappend, mempty)
@@ -176,19 +176,18 @@ openConnection h1' p = do
     is <- getAddrInfo (Just hints) (Just h1) (Just $ show p)
     let addr = head is
     let a = addrAddress addr
-    s <- socket (addrFamily addr) Stream defaultProtocol
-
-    connect s a
-    (i,o1) <- Streams.socketToStreams s
-
-    o2 <- Streams.builderStream o1
-
-    return Connection {
-        cHost  = h2',
-        cClose = close s,
-        cOut   = o2,
-        cIn    = i
-    }
+    bracketOnError (socket (addrFamily addr) Stream defaultProtocol) close $ \s -> do
+        connect s a
+
+        (i, o1) <- Streams.socketToStreams s
+        o2      <- Streams.builderStream o1
+
+        return Connection {
+            cHost  = h2',
+            cClose = close s,
+            cOut   = o2,
+            cIn    = i
+        }
   where
     hints = defaultHints {addrFlags = [AI_ADDRCONFIG, AI_NUMERICSERV]}
     h2' = if p == 80
@@ -234,21 +233,20 @@ openConnectionSSL ctx h1' p = do
         f = addrFamily $ head is
     s <- socket f Stream defaultProtocol

-    connect s a
-
-    ssl <- SSL.connection ctx s
-    SSL.connect ssl
+    connect s a `onException` (close s)

-    (i,o1) <- Streams.sslToStreams ssl
+    bracketOnError (SSL.connection ctx s) (closeSSL s) $ \ssl -> do
+        SSL.connect ssl

-    o2 <- Streams.builderStream o1
+        (i, o1) <- Streams.sslToStreams ssl
+        o2      <- Streams.builderStream o1

-    return Connection {
-        cHost  = h2',
-        cClose = closeSSL s ssl,
-        cOut   = o2,
-        cIn    = i
-    }
+        return Connection {
+            cHost  = h2',
+            cClose = closeSSL s ssl,
+            cOut   = o2,
+            cIn    = i
+        }
   where
     h2' :: ByteString
     h2' = if p == 443

Need to send content chunked if length unknown

Fairly serious weakness -- at the moment we rely on the user to set Content-Length: when PUTting or POSTing, because we haven't implemented chunked transfer-encoding for sending.

The convenience APIs run the OutputStream to calculate the length and set it appropriately, so they are correct at least.

The library should probably just send chunked unless a length is explicitly provided, though working out a sensible public API taking this into account might take a little doing.
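
For reference, chunked transfer-encoding frames each piece of the body with its size in hexadecimal and terminates with a zero-sized chunk, along these lines (CRLFs shown explicitly):

4\r\n
Wiki\r\n
5\r\n
pedia\r\n
0\r\n
\r\n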

AfC

HsOpenSSL license problematic

Hey Andrew! We were looking into getting http-streams packaged for Debian, and noticed that there is a bit of a licencing gotcha with its use of openssl.

The problem is that the OpenSSL license is not compatible with licenses like the GPL. Now, HsOpenSSL and http-streams are under licenses which do not have any compatibility problems with the OpenSSL license. You're in the clear there.

But, suppose I have an existing GPL code base (I do), and I'd like to make it use http-streams (I would), and other people have contributed patches (they have) and I don't require license assignment (I don't). It's legally impossible for me to link my GPLed code with openssl and distribute the result. To make it legal, I'd need to change my code's license, adding a special license exception to the GPL to allow linking it with openssl. But since I am not the sole copyright holder, I can't do that, and would have to track down everyone who had improved my code, and get them to agree too.

Looking at HsOpenSSL, its own documentation encourages using TLS instead. If you could switch to TLS, these nasty license problems would go away. Note that http-conduit uses TLS, apparently with success!

I think we could package http-streams for Debian regardless. But if I can't use it in my existing code, some of the motivation is gone..

http-streams defines an `IsString Builder` orphan instance

Since Haskell does not allow controlling the import/export of typeclass instances, or as is written in the Haskell Wiki:

Type class instances are special in that they don't have a name and cannot be imported explicitly. This also means that they cannot be excluded explicitly. All instances defined in a module A are imported automatically when importing A, or importing any module that imports A, directly or indirectly.

And here's evidence showing that http-streams leaks the IsString Builder instance as soon as Network.Http.Types is imported (indirectly):

$ ghci
GHCi, version 7.6.2: http://www.haskell.org/ghc/  :? for help
Loading package ghc-prim ... linking ... done.
Loading package integer-gmp ... linking ... done.
Loading package base ... linking ... done.
λ> import Data.String
λ> import Blaze.ByteString.Builder (Builder)
λ> :info IsString
class IsString a where
  fromString :: String -> a
    -- Defined in `Data.String'
instance IsString [Char] -- Defined in `Data.String'

λ> import Network.Http.Client 
λ> :info IsString
class IsString a where
  fromString :: String -> a
    -- Defined in `Data.String'
instance IsString Builder -- Defined in `http-streams-0.4.0.1:Network.Http.Types'
instance IsString [Char] -- Defined in `Data.String'
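
For context, the offending instance presumably looks something like the following reconstruction (not copied from the http-streams source):

{-# OPTIONS_GHC -fno-warn-orphans #-}

import Blaze.ByteString.Builder (Builder)
import qualified Blaze.ByteString.Builder.Char.Utf8 as Utf8
import Data.String (IsString (..))

-- An orphan: neither IsString nor Builder is defined in this module,
-- so the instance leaks to every importer.
instance IsString Builder where
    fromString = Utf8.fromString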

' ' is not an ascii digit

This test currently causes the error "' ' is not an ascii digit":

{-# LANGUAGE OverloadedStrings #-}
import System.IO.Streams
import Network.Http.Client

main :: IO ()
main = get "http://www.google.com" $ \_ i -> connect i stdout

Here is the output from Wireshark:

GET / HTTP/1.1
Host: www.google.com
User-Agent: http-streams/0.6.0.1
Accept-Encoding: gzip
Accept: */*

HTTP/1.1 302 Moved Temporarily
Content-Length: 189      
Content-Encoding: gzip
Location: http://www.google.de/
Cache-Control: private
...

The stack trace looks as follows:

...
</html>*** Exception (reporting due to +RTS -xc): (THUNK_2_0), stack trace: 
  Network.Http.ResponseParser.readResponseHeader,
  called from Network.Http.Connection.receiveResponse,
  called from Network.Http.Inconvenience.getN,
  called from Network.Http.Inconvenience.get,
  called from Main.main,
  called from Main.CAF
  ...
test: ' ' is not an ascii digit

The reason is that the response header "Content-Length" contains trailing whitespace. According to RFC 2616, section 4.2, "[...] leading or trailing LWS MAY be removed without changing the semantics of the field value".

So, when adding header values, should we maybe trim these?

diff --git a/src/Network/Http/Types.hs b/src/Network/Http/Types.hs
index 509bcfd..de0c557 100644
--- a/src/Network/Http/Types.hs
+++ b/src/Network/Http/Types.hs
@@ -370,9 +370,11 @@ addHeader
     -> (ByteString,ByteString)
     -> HashMap (CI ByteString) ByteString
 addHeader m (k,v) =
-    insertWith f (mk k) v m
+    insertWith f (mk k) (trim v) m
   where
     f new old = S.concat [old, ",", new]
+    trim      = S.dropWhile lws . fst . S.spanEnd lws
+    lws       = flip elem [' ', '\t', '\n', '\r']

 lookupHeader :: Headers -> ByteString -> Maybe ByteString
 lookupHeader x k =
