scrappydoo-node

Companion crawl server for the ScrappyDoo Chrome Extension. Given a URL and a set of CSS selectors, the server fetches the page, parses its markup, and returns the values found at those selectors.

Dependencies

  1. Node.js (>= 6.9.0)
  2. yarn (npm i yarn --global)

Installation

$ yarn install

Usage

To start the server (on localhost, default port 6969), run:

$ npm start

API

POST /api/data

Request Headers

{
    "Content-Type": "application/json"
}

Request

url

Type: String (Page URL)
Required: true

Fully-qualified URL to crawl.

data

Type: Array (Selector information)
Required: true

Array of selectors.

data > name

Type: String (Unique name for the selector)
Required: true

Unique name for the selector. It is used as the key that identifies this selector's value in the response.

data > selector

Type: String (CSS selector to select from page)
Required: true

CSS selector identifying the element to read from the page.

data > attribute

Type: String (HTML attribute to pick from selected element)
Required: true

HTML attribute to pick from the selected element.
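
The field rules above can be checked on the client before a request is sent. The following is a minimal sketch, not the server's own validation: `validateCrawlRequest` is a hypothetical helper, and reading "fully-qualified" as "starts with http:// or https://" is an assumption.

```javascript
// Hypothetical helper: checks a request body against the documented fields.
// Returns a list of problems; an empty list means the body looks valid.
function validateCrawlRequest(body) {
  const errors = [];
  if (body === null || typeof body !== 'object') {
    return ['body must be a JSON object'];
  }
  // Assumption: "fully-qualified" is read as an absolute http(s) URL.
  if (typeof body.url !== 'string' || !/^https?:\/\//i.test(body.url)) {
    errors.push('url must be a fully-qualified http(s) URL string');
  }
  if (!Array.isArray(body.data)) {
    errors.push('data must be an array of selector entries');
    return errors;
  }
  const seen = new Set();
  body.data.forEach((entry, i) => {
    // Every entry needs name, selector and attribute as non-empty strings.
    ['name', 'selector', 'attribute'].forEach((field) => {
      if (!entry || typeof entry[field] !== 'string' || entry[field] === '') {
        errors.push(`data[${i}].${field} must be a non-empty string`);
      }
    });
    // Names must be unique because they become keys in the response.
    if (entry && typeof entry.name === 'string') {
      if (seen.has(entry.name)) {
        errors.push(`data[${i}].name "${entry.name}" is not unique`);
      }
      seen.add(entry.name);
    }
  });
  return errors;
}
```

A caller would send the request only when the returned list is empty.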

Sample Request

{
    "url": "https://www.reddit.com",
    "data": [
        {
            "name": "header_logo",
            "selector": "#header-img",
            "attribute": "href"
        },
        {
            "name": "sidebar_donate_link",
            "selector": "html>body>div:eq(2)>div:eq(7)>div>div>a",
            "attribute": "href"
        }
    ]
}
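
A request like this could be sent from Node.js with the built-in `http` module. This is a sketch under stated assumptions: the server runs on localhost at the default port 6969, and `buildRequestOptions` and `crawl` are hypothetical names, not part of the project. How the server reports errors is not documented here, so the sketch only parses whatever JSON comes back.

```javascript
const http = require('http');

// Hypothetical helper: serializes the payload and prepares http.request options.
// Host and port defaults follow the values stated in the Usage section.
function buildRequestOptions(payload, host = 'localhost', port = 6969) {
  const body = JSON.stringify(payload);
  return {
    body,
    options: {
      host,
      port,
      path: '/api/data',
      method: 'POST',
      headers: {
        'Content-Type': 'application/json',
        'Content-Length': Buffer.byteLength(body)
      }
    }
  };
}

// Hypothetical helper: POSTs the payload and passes the parsed JSON to callback(err, result).
function crawl(payload, callback) {
  const { body, options } = buildRequestOptions(payload);
  const req = http.request(options, (res) => {
    let raw = '';
    res.setEncoding('utf8');
    res.on('data', (chunk) => { raw += chunk; });
    res.on('end', () => {
      let parsed;
      try {
        parsed = JSON.parse(raw);
      } catch (err) {
        return callback(err);
      }
      callback(null, parsed);
    });
  });
  req.on('error', callback);
  req.end(body);
}
```

With the server running, `crawl(sampleRequest, (err, result) => console.log(err || result))` would print the response object, where `sampleRequest` is the JSON above parsed into an object.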

Sample Response

{
    "header_logo": "/",
    "sidebar_donate_link": "/gold?goldtype=code&source=progressbar"
}
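
The values in this sample are relative paths, exactly as they appear in the page markup. A client that needs absolute links could resolve them against the crawled URL. The sketch below assumes a Node.js version that ships the WHATWG `URL` class in the `url` module, which is newer than the 6.9.0 minimum listed under Dependencies. `resolveLinks` is a hypothetical helper, and it only makes sense for URL-valued attributes such as `href` or `src`.

```javascript
const { URL } = require('url');

// Hypothetical helper: resolves each string value in the response
// against the page URL that was crawled. Non-string values pass through.
function resolveLinks(pageUrl, result) {
  const resolved = {};
  Object.keys(result).forEach((name) => {
    const value = result[name];
    resolved[name] = typeof value === 'string'
      ? new URL(value, pageUrl).href
      : value;
  });
  return resolved;
}
```

For the sample above, `resolveLinks('https://www.reddit.com', response)` would map `header_logo` to `https://www.reddit.com/`.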

License

WTFPL © GP
