Website Control Graph

Website Control Graph is a web scraper that lets you control websites through GraphQL.

Setup

git clone https://github.com/meinto/website-control-graph.git
cd website-control-graph
go run github.com/99designs/gqlgen
go run server/server.go

Test it

Open http://localhost:4000 and paste in the query below. The result contains the h1 and h2 headlines of the "Hello, World!" program article on Wikipedia.

query {
  control(
    actions:[
      {navigate:"https://en.wikipedia.org/wiki/%22Hello,_World!%22_program"},
      {waitVisible:"h1"}
    ]
    output: [
      {
        name: "Überschriften"
        selectors: [
          {
            type: string_array
            cssSelector: "h1"
            key: "headline1"
          }
          {
            type: string_array
            cssSelector: "h2"
            key: "headline2"
          }
        ]
      }
    ]
  ) {
    output
  }
}
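The same query can also be sent from a program instead of the browser. The following is a minimal sketch, not part of the project: it assumes the GraphQL endpoint is served at /query (the path that gqlgen's generated server commonly uses; the README itself only names http://localhost:4000), and the helper name buildBody is hypothetical. The query is shortened to the h1 selector.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"io"
	"log"
	"net/http"
)

// endpoint is an assumption: adjust the path if your server exposes
// the GraphQL API somewhere else.
const endpoint = "http://localhost:4000/query"

// query is a shortened version of the "Test it" query above.
const query = `query {
  control(
    actions: [
      {navigate: "https://en.wikipedia.org/wiki/%22Hello,_World!%22_program"}
      {waitVisible: "h1"}
    ]
    output: [
      {
        name: "headlines"
        selectors: [
          {type: string_array, cssSelector: "h1", key: "headline1"}
        ]
      }
    ]
  ) {
    output
  }
}`

// buildBody wraps a GraphQL query in the usual {"query": "..."} JSON envelope.
func buildBody(q string) ([]byte, error) {
	return json.Marshal(map[string]string{"query": q})
}

func main() {
	body, err := buildBody(query)
	if err != nil {
		log.Println("encoding request:", err)
		return
	}

	resp, err := http.Post(endpoint, "application/json", bytes.NewReader(body))
	if err != nil {
		log.Println("request failed (is the server running?):", err)
		return
	}
	defer resp.Body.Close()

	raw, err := io.ReadAll(resp.Body)
	if err != nil {
		log.Println("reading response:", err)
		return
	}
	// Print the raw JSON response; its exact shape depends on the server.
	fmt.Println(string(raw))
}
```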

Actions

The Action input type has several properties that you can use as input for your query. Actions are executed in the order you list them, as a queue.

input Action {
  navigate: String     # navigate to url
  sleep: Int           # sleep n seconds
  waitVisible: String  # wait till a specific element is visible on page
  sendKeys: Input      # fill data into an input
  click: String        # click a specific element on a page
  evalJS: String       # execute javascript
  runtimeVar: Selector # save a string from the website to use it in a following action
}
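To illustrate the queue semantics, here is a hypothetical Go mirror of the Action input and a loop that walks the actions front to back. This is only a sketch of the idea, not the project's implementation: the struct, the describe and plan functions, and the strPtr helper are all assumed names, the Input fields are taken from the login example below, and runtimeVar is left out for brevity.

```go
package main

import "fmt"

// Input mirrors the sendKeys payload (selector and value).
type Input struct {
	Selector string
	Value    string
}

// Action is a hypothetical mirror of the GraphQL input type.
// A nil field means the property was not set on that action.
type Action struct {
	Navigate    *string
	Sleep       *int
	WaitVisible *string
	SendKeys    *Input
	Click       *string
	EvalJS      *string
}

// describe returns a label for the first property that is set on a.
func describe(a Action) string {
	switch {
	case a.Navigate != nil:
		return "navigate " + *a.Navigate
	case a.Sleep != nil:
		return fmt.Sprintf("sleep %ds", *a.Sleep)
	case a.WaitVisible != nil:
		return "waitVisible " + *a.WaitVisible
	case a.SendKeys != nil:
		return "sendKeys " + a.SendKeys.Selector
	case a.Click != nil:
		return "click " + *a.Click
	case a.EvalJS != nil:
		return "evalJS"
	default:
		return "noop"
	}
}

// plan processes the queue strictly in order and records each step.
func plan(actions []Action) []string {
	steps := make([]string, 0, len(actions))
	for _, a := range actions {
		steps = append(steps, describe(a))
	}
	return steps
}

// strPtr is a small helper for building optional string fields.
func strPtr(s string) *string { return &s }

func main() {
	queue := []Action{
		{Navigate: strPtr("https://your-website-with-login.com/login")},
		{WaitVisible: strPtr("input[name='user']")},
		{SendKeys: &Input{Selector: "input[name='user']", Value: "your-name"}},
		{Click: strPtr("input[type='submit']")},
	}
	for i, step := range plan(queue) {
		fmt.Printf("%d: %s\n", i+1, step)
	}
}
```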

Example login

query {
  control(
    actions:[
      {navigate:"https://your-website-with-login.com/login"},
      {waitVisible:"input[name='user']"}
      {sendKeys:{
        selector: "input[name='user']"
        value:"your-name"
      }}
      {sendKeys:{
        selector: "input[name='password']"
        value:"your-pass"
      }}
      {click:"input[type='submit']"}
      {waitVisible:"p.content-you-want-to-query"}
    ]
    output: [
      {
        name: "Name of collection",
        selectors: [ ... ]
      }
    ]
  ) {
    output
  }
}

Example runtime variable

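The following query returns the h1 headline of the Hello_World_(disambiguation) page, which is the first link in the "Hello, World!" program article on Wikipedia. The runtimeVar action stores that link's href, and the later navigate action inserts it through the $0 placeholder.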

query {
  control(
    actions:[
      {navigate:"https://en.wikipedia.org/wiki/%22Hello,_World!%22_program"},
      {waitVisible:"h1"}
      {runtimeVar: {                              # store the link to Hello_World_(disambiguation) page
        cssSelector: ".mw-disambig"
        HTMLAttribute: "href"
        key: "runtimevar"
        type: string_prop
      }}
      {navigate:"https://en.wikipedia.org$0"},    # use the link to Hello_World_(disambiguation) page ($0)
      {waitVisible:"h1"}
    ]
    output: [
      {
        name: "Überschriften"
        selectors: [
          {
            type: string_array
            cssSelector: "h1"
            key: "headline1"
          }
        ]
      }
    ]
  ) {
    runtimeVars {
      name
      value
    }
    output
  }
}
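The $0 placeholder can be thought of as a positional reference to a stored runtime variable. The sketch below shows one way such a substitution could work. It assumes that $N refers to the N-th stored variable in order, which the example above only confirms for $0; the expand function is hypothetical, and the href value is an illustrative stand-in for whatever the page actually returns.

```go
package main

import (
	"fmt"
	"regexp"
	"strconv"
)

// placeholder matches $0, $1, $2, ...
var placeholder = regexp.MustCompile(`\$(\d+)`)

// expand replaces each $N in s with vars[N].
// Placeholders without a matching variable are left untouched.
func expand(s string, vars []string) string {
	return placeholder.ReplaceAllStringFunc(s, func(m string) string {
		i, err := strconv.Atoi(m[1:]) // strip the leading "$"
		if err != nil || i >= len(vars) {
			return m
		}
		return vars[i]
	})
}

func main() {
	// Illustrative value for the stored href.
	vars := []string{"/wiki/Hello_World_(disambiguation)"}
	fmt.Println(expand("https://en.wikipedia.org$0", vars))
	// prints "https://en.wikipedia.org/wiki/Hello_World_(disambiguation)"
}
```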

Docker

Build and run the GraphQL server as a Docker container:

docker build -t website-control-graph .
docker run -p 4000:4000 website-control-graph
