Scrapy Nimble Middleware

scrapy-nimble is a Scrapy Downloader Middleware that integrates Scrapy with the Nimble Web API.

Install

You can install scrapy-nimble as a regular Python package from PyPI using:

pip install scrapy-nimble

Configuration

  1. If you don't have it yet, open an account with Nimble.

  2. Provide your credentials and enable the middleware through Scrapy settings.

    # settings.py
    NIMBLE_ENABLED = True
    
    NIMBLE_USERNAME = "username"
    NIMBLE_PASSWORD = "password"

  3. Add the downloader middleware to your DOWNLOADER_MIDDLEWARES Scrapy setting.

    # settings.py
    DOWNLOADER_MIDDLEWARES = {
        "scrapy_nimble.middlewares.NimbleWebApiMiddleware": 570,
    }

    If you have scrapy.downloadermiddlewares.httpcompression.HttpCompressionMiddleware enabled (it is enabled by default in the DOWNLOADER_MIDDLEWARES_BASE setting, with order 590), give the scrapy-nimble middleware a lower order value, such as 570, so that it is placed before HttpCompressionMiddleware.

Basic Usage

Once the downloader middleware is properly configured, every request goes through Nimble's Web API. There is no need to change anything in your spider's code.

Real-time URL request

scrapy-nimble uses the Nimble Web API with Real-time URL requests. In addition to the default GET request for a specific URL, this API provides extra options that allow you to, among other things, send geolocated requests and render dynamic content.

The following request options are currently supported. Check the documentation for their usage and valid values. If an option is not given, the Web API's default value is used.

  • method
  • country
  • locale
  • headers
  • cookies
  • render
  • render_options

Add the options you want to use to the meta key of your request, prefixing each option name with nimble_, for example:

# Inside your spider
yield scrapy.Request(
    "https://nimbleway.com",
    meta={
        "nimble_country": "DE",
        "nimble_locale": "uk",
        "nimble_render": True,
    },
)
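If you set these options in many requests, a small helper can apply the nimble_ prefix for you. The sketch below is only an illustration: nimble_meta and NIMBLE_OPTIONS are hypothetical names and are not part of scrapy-nimble. It accepts only the option names listed above and leaves out options set to None, so the Web API default applies to them.

```python
# Option names listed in this README; anything else is rejected early.
NIMBLE_OPTIONS = {
    "method", "country", "locale", "headers",
    "cookies", "render", "render_options",
}


def nimble_meta(**options):
    """Build a Request.meta dict, prefixing each option name with ``nimble_``.

    Options set to None are left out, so the Web API default is used for them.
    """
    unknown = set(options) - NIMBLE_OPTIONS
    if unknown:
        raise ValueError(f"Unsupported Nimble option(s): {sorted(unknown)}")
    return {
        f"nimble_{name}": value
        for name, value in options.items()
        if value is not None
    }


meta = nimble_meta(country="DE", locale="uk", render=True, method=None)
print(meta)
# {'nimble_country': 'DE', 'nimble_locale': 'uk', 'nimble_render': True}
```

The resulting dictionary can be passed as the meta argument of scrapy.Request, exactly like the hand-written dictionary in the example above.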

Development

We suggest using pyenv to manage your Python version and create an isolated environment where you can safely develop (the pyenv virtualenv command used below is provided by the pyenv-virtualenv plugin). After installing it, prepare the environment with the following commands:

$ pyenv virtualenv 3.11.6 myvenv
$ pyenv activate myvenv
$ python -m pip install -e .

To keep code formatting consistent and run linter checks, we use pre-commit hooks. Install the pre-commit package, then install the project hooks using:

$ pre-commit install

Now you are ready to start development.

Contributors

aivrit, rennerocha
