This project is a fork of geventhttpclient/geventhttpclient.

License: Other

geventhttpclient

Fork

This fork automatically builds wheels and uploads them to the geventhttpclient-wheels package on PyPI.

A high-performance, concurrent HTTP client library for Python using gevent.

gevent.httplib support was removed in gevent 1.0; geventhttpclient now provides that missing functionality.

geventhttpclient uses a fast HTTP parser, written in C, originating from nginx, extracted and modified by Joyent.

geventhttpclient has been specifically designed for high concurrency and streaming, and it supports HTTP 1.1 persistent connections. More generally, it is designed for efficiently pulling from REST APIs and streaming APIs like Twitter's.

Safe SSL support is provided by default. geventhttpclient depends on the certifi CA Bundle. This is the same CA Bundle which ships with the Requests codebase, and is derived from Mozilla Firefox's canonical set.

Python 2.7 and 3.4+ are supported. Python 2.6 is no longer supported.

Use of SSL/TLS with Python 2.7.9 is not recommended and may be broken.

A simple example:

#!/usr/bin/python

from geventhttpclient import HTTPClient
from geventhttpclient.url import URL

url = URL('http://gevent.org/')

http = HTTPClient(url.host)

# issue a get request
response = http.get(url.request_uri)

# read status_code
response.status_code

# read response body
body = response.read()

# close connections
http.close()
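
The example above splits the URL into a host, used to open the connection, and a request URI, sent with the request. As a rough illustration of that split, and not geventhttpclient's own URL implementation, the same decomposition can be sketched with Python 3's standard urllib.parse. The helper name split_for_request is hypothetical.

```python
from urllib.parse import urlsplit


def split_for_request(url_string):
    """Return (host, port, request_uri) for a URL string.

    Hypothetical helper for illustration only; it mirrors the idea of
    url.host and url.request_uri without using geventhttpclient.
    """
    parts = urlsplit(url_string)
    # The request URI is the path plus the query string, if there is one.
    request_uri = parts.path or '/'
    if parts.query:
        request_uri += '?' + parts.query
    return parts.hostname, parts.port, request_uri


host, port, request_uri = split_for_request('http://gevent.org/')
print(host, port, request_uri)
```

The request URI always includes the path, which is why it is passed to get() rather than the query string alone.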

httplib compatibility and monkey patch

The geventhttpclient.httplib module contains drop-in replacements for the httplib connection and response objects. If you use httplib directly, you can replace the httplib imports with geventhttpclient.httplib.

# from httplib import HTTPConnection
from geventhttpclient.httplib import HTTPConnection

If you use httplib2, urllib or urllib2, you can patch httplib to use the wrappers from geventhttpclient. For httplib2, make sure you patch before you import it, or the super calls will fail.

import geventhttpclient.httplib
geventhttpclient.httplib.patch()

import httplib2

High Concurrency

HTTPClient has a built-in connection pool and is greenlet-safe by design. You can share the same instance among several greenlets.

#!/usr/bin/env python

import gevent.pool
import json

from geventhttpclient import HTTPClient
from geventhttpclient.url import URL


# go to http://developers.facebook.com/tools/explorer and copy the access token
TOKEN = '<go to http://developers.facebook.com/tools/explorer and copy the access token>'

url = URL('https://graph.facebook.com/me/friends')
url['access_token'] = TOKEN

# setting the concurrency to 10 allows the client to create up to
# 10 connections and reuse them.
http = HTTPClient.from_url(url, concurrency=10)

response = http.get(url.request_uri)
assert response.status_code == 200

# the response complies with the read protocol, so the stream is
# passed to the json parser as it is being read.
data = json.load(response)['data']

def print_friend_username(http, friend_id):
    friend_url = URL('/' + str(friend_id))
    friend_url['access_token'] = TOKEN
    # the greenlet will block until a connection is available
    response = http.get(friend_url.request_uri)
    assert response.status_code == 200
    friend = json.load(response)
    if 'username' in friend:
        print('%s: %s' % (friend['username'], friend['name']))
    else:
        print('%s has no username.' % friend['name'])

# allow up to 20 greenlets to run at a time. This is more than the
# concurrency of the http client, but that isn't a problem since the
# client has its own connection pool.
pool = gevent.pool.Pool(20)
for item in data:
    friend_id = item['id']
    pool.spawn(print_friend_username, http, friend_id)

pool.join()
http.close()
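
The relationship between the 20 greenlets and the 10 pooled connections can be illustrated with a standard-library analogue. The sketch below uses threads and a semaphore rather than greenlets and HTTPClient's real pool, so it shows only the blocking behaviour described in the comments above: a worker waits until a slot is free. All names in it are hypothetical.

```python
import threading
import time

MAX_CONNECTIONS = 10  # plays the role of concurrency=10
WORKERS = 20          # plays the role of gevent.pool.Pool(20)

slots = threading.BoundedSemaphore(MAX_CONNECTIONS)
lock = threading.Lock()
in_flight = 0
peak_in_flight = 0
completed = []


def worker(worker_id):
    global in_flight, peak_in_flight
    # Block here until one of the MAX_CONNECTIONS slots is free.
    with slots:
        with lock:
            in_flight += 1
            peak_in_flight = max(peak_in_flight, in_flight)
        time.sleep(0.01)  # stand-in for the HTTP round trip
        with lock:
            in_flight -= 1
            completed.append(worker_id)


threads = [threading.Thread(target=worker, args=(i,)) for i in range(WORKERS)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print('completed:', len(completed), 'peak in flight:', peak_in_flight)
```

Every worker finishes, but the number in flight at any moment never exceeds the slot count, which is why running more workers than connections is safe.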

Streaming

geventhttpclient supports streaming. Response objects have read(N) and readline() methods that read the stream incrementally. See src/examples/twitter_streaming.py for an example that pulls from the Twitter streaming API.

Here is an example of how to download a big file chunk by chunk to save memory:

#!/usr/bin/env python

from geventhttpclient import HTTPClient, URL

url = URL('http://127.0.0.1:80/100.dat')
http = HTTPClient.from_url(url)
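# request the path of the file (request_uri), not just the query string
response = http.get(url.request_uri)
assert response.status_code == 200

CHUNK_SIZE = 1024 * 16 # 16KB
# open in binary mode: the response body is read as bytes
with open('/tmp/100.dat', 'wb') as f: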
    data = response.read(CHUNK_SIZE)
    while data:
        f.write(data)
        data = response.read(CHUNK_SIZE)
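
The read loop above works with any object that exposes read(n), so it can be wrapped in a small generator. This is a minimal sketch under that assumption; iter_chunks is a hypothetical helper, and an in-memory io.BytesIO stands in for a real response so the snippet runs without a server.

```python
import io


def iter_chunks(response, chunk_size=16 * 1024):
    """Yield successive chunks from any object exposing read(n)."""
    while True:
        data = response.read(chunk_size)
        if not data:
            # An empty read means the stream is exhausted.
            break
        yield data


fake_response = io.BytesIO(b'x' * 40000)  # stand-in for a response body
destination = io.BytesIO()                # stand-in for the output file
chunk_lengths = []
for chunk in iter_chunks(fake_response):
    chunk_lengths.append(len(chunk))
    destination.write(chunk)

print(chunk_lengths)
```

Only one chunk is held in memory at a time, regardless of the total size of the body.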

Benchmarks

The benchmark issues 1000 GET requests against a local nginx server with a concurrency of 10. See the benchmarks folder.

  • httplib2 with geventhttpclient monkey patch (benchmarks/httplib2_patched.py): ~2500 req/s
  • geventhttpclient.HTTPClient (benchmarks/httpclient.py): ~4000 req/s
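
For scale, 1000 requests at ~4000 req/s take about 0.25 s, and at ~2500 req/s about 0.4 s. A requests-per-second figure of this kind can be obtained by timing a fixed number of calls. The sketch below is not one of the scripts in the benchmarks folder: it runs sequentially, and fake_request is a hypothetical stand-in for a real HTTP call.

```python
import time


def measure_throughput(do_request, total_requests=1000):
    """Call do_request total_requests times and return requests per second."""
    start = time.perf_counter()
    for _ in range(total_requests):
        do_request()
    elapsed = time.perf_counter() - start
    # Guard against a zero reading on very coarse timers.
    return total_requests / elapsed if elapsed > 0 else float('inf')


def fake_request():
    """Hypothetical stand-in; a real run would issue an HTTP GET here."""
    return 200


rate = measure_throughput(fake_request, total_requests=1000)
print('%.0f req/s' % rate)
```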

Contributors

amorgun, cloudaice, cyberw, graingert, gwik, heyman, janr, jimmyr, joshblum, krallin, lichray, llabatut, lucidfrontier45, lvella, magupov, methane, ml31415, monsterxx03, nanki, northisup, ojomio, own3dh4rd, rmohr, sbraz, scarabeusiv, sirkonst, strakh, thanethomson, timclicks, tirkarthi
