disqusScraper

disqusScraper is a console-based Go application, built around goroutines, that extracts all popular threads from any public Disqus forum, given the forum's name. For private Disqus forums, disqusScraper builds a list of the forum's top commenters and parses each commenter's activity to locate the threads that belong to the given forum.
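As a rough illustration of the goroutine-oriented approach, the sketch below fans a list of commenter names out to a fixed number of worker goroutines and gathers the links each one reports. It is not taken from the project's source: `runWorkers`, the `process` callback, and the sample user names are all hypothetical, and the real fetching and parsing logic is replaced by a placeholder.

```go
package main

import (
	"fmt"
	"sort"
	"sync"
)

// runWorkers is a hypothetical helper: it distributes jobs across `workers`
// goroutines and merges whatever each call to process returns.
func runWorkers(workers int, jobs []string, process func(string) []string) []string {
	if workers < 1 {
		workers = 1
	}
	jobCh := make(chan string)
	resultCh := make(chan []string)

	var wg sync.WaitGroup
	for i := 0; i < workers; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for job := range jobCh {
				resultCh <- process(job)
			}
		}()
	}

	// Feed the jobs, then close the channel so workers leave their range loops.
	go func() {
		for _, j := range jobs {
			jobCh <- j
		}
		close(jobCh)
	}()

	// Close the result channel once every worker has finished.
	go func() {
		wg.Wait()
		close(resultCh)
	}()

	var out []string
	for r := range resultCh {
		out = append(out, r...)
	}
	sort.Strings(out) // completion order is nondeterministic, so sort for stable output
	return out
}

func main() {
	users := []string{"alice", "bob", "carol"} // hypothetical commenter names
	links := runWorkers(10, users, func(user string) []string {
		// Placeholder for fetching and parsing one user's activity.
		return []string{"thread-found-via-" + user}
	})
	for _, l := range links {
		fmt.Println(l)
	}
}
```

Sorting the merged results is a choice made here for reproducible output; a scraper that only appends links to a file could skip it.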

This project was created under contract with removeyourmedia.com to make the Disqus platform more open to scrutiny and thereby help fight the pirated content that thrives on it.

Tech

disqusScraper uses the following components to work properly:

Installation

disqusScraper requires Go 1.7 or later to build. Install the dependencies, then compile the project with the following command:

$ go build

Usage Instructions

$ c:\myEpicFolder\disqusScraper.exe -forum=kissanime -worker=10 -debug=true -nonoise=true

where:

  • -forum is the name of the forum to scrape for users and links; defaults to “fiestaonline”
  • -worker sets how many simultaneous “threads” (goroutines) do the work; defaults to 10
  • -debug enables verbose output; defaults to true
  • -nonoise controls whether only links that belong to the given forum are collected (true) or every link found is collected (false); treated as true if the forum is public; defaults to true
  • -parseusers forces users to be parsed even if the forum is public; treated as true if the forum is private; defaults to false
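For readers unfamiliar with Go's standard `flag` package, the switches listed above could be declared roughly as follows. This is a minimal sketch, not the project's actual code: the `options` struct and `parseOptions` function are hypothetical names, and only the flag names and defaults come from the list above.

```go
package main

import (
	"flag"
	"fmt"
	"os"
)

// options is a hypothetical container for the documented switches.
type options struct {
	Forum      string
	Worker     int
	Debug      bool
	NoNoise    bool
	ParseUsers bool
}

// parseOptions parses args using the flag names and defaults listed in the README.
func parseOptions(args []string) (options, error) {
	var o options
	fs := flag.NewFlagSet("disqusScraper", flag.ContinueOnError)
	fs.StringVar(&o.Forum, "forum", "fiestaonline", "name of the forum to scrape")
	fs.IntVar(&o.Worker, "worker", 10, "number of simultaneous workers")
	fs.BoolVar(&o.Debug, "debug", true, "verbose output")
	fs.BoolVar(&o.NoNoise, "nonoise", true, "only keep links from the given forum")
	fs.BoolVar(&o.ParseUsers, "parseusers", false, "parse users even if the forum is public")
	err := fs.Parse(args)
	return o, err
}

func main() {
	opts, err := parseOptions(os.Args[1:])
	if err != nil {
		os.Exit(2)
	}
	fmt.Printf("%+v\n", opts)
}
```

Using a dedicated `flag.FlagSet` instead of the package-level flags keeps the parsing testable, since the argument slice can be passed in directly.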

Todos

  • Filter user comments to show only links that belong to the provided forum name. (Done)
  • Get all links from a given Disqus forum without fetching its users if the forum isn’t private. (Probably done)
  • Write tests
  • Add a web interface

N.B. Further discussion can be found in the source code.

Example output text-file contents:

https://disqus.com/home/discussion/kissmanga/read_manga_happy_if_you_died_ch014_online_in_high_quality
https://disqus.com/home/discussion/kissmanga/revival_man_manga_read_revival_man_manga_online_in_high_quality
https://disqus.com/home/discussion/kissmanga/read_manga_young_gun_ch003_online_in_high_quality
https://disqus.com/home/discussion/kissmanga/read_manga_i_am_my_wife_ch002_online_in_high_quality
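The example links above all share the shape `https://disqus.com/home/discussion/<forum>/<thread>`. Assuming that shape holds in general (it is inferred from the sample output only), the kind of filtering that -nonoise describes could be sketched as a simple prefix check. `belongsToForum` and `filterLinks` are hypothetical helpers, not functions from the project, and the sample thread names in `main` are made up.

```go
package main

import (
	"fmt"
	"strings"
)

// discussionPrefix is inferred from the example output; treat it as an assumption.
const discussionPrefix = "https://disqus.com/home/discussion/"

// belongsToForum reports whether link points at a thread of the given forum.
func belongsToForum(link, forum string) bool {
	return strings.HasPrefix(link, discussionPrefix+forum+"/")
}

// filterLinks keeps only the links that belong to the given forum.
func filterLinks(links []string, forum string) []string {
	var kept []string
	for _, l := range links {
		if belongsToForum(l, forum) {
			kept = append(kept, l)
		}
	}
	return kept
}

func main() {
	links := []string{
		"https://disqus.com/home/discussion/kissmanga/some_thread",
		"https://disqus.com/home/discussion/otherforum/another_thread",
	}
	for _, l := range filterLinks(links, "kissmanga") {
		fmt.Println(l)
	}
}
```

The trailing slash after the forum name matters: without it, a forum whose name merely starts with the requested name would also match.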

Contact:

License

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.
