WikiTrends

UPenn ESE Senior Design Project

WikiTrends was originally created as a Senior Design Project at the University of Pennsylvania by Abhiti Prabahar, Cristina Buenahora, Alice Serfati, and Sierra Yit during the 2016-2017 school year. Their advisor was Professor Chris Callison-Burch.

The goal of the project is to find and analyze spikes in Wikipedia page count data to create a completely unbiased news source across different languages.

Usage:

downloadDay.sh 10 15 (downloads October 15)

downloadMonth.sh 10 (downloads October)

Both scripts download unzipped pages into pageviews/ and then put the English parts in enpageviews/.
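The "English parts" step can be illustrated with a minimal Python sketch. It assumes each line of a pageviews file is space-separated with the project code in the first field (for example `en Main_Page 500 0`), as in the public Wikimedia pageviews dumps. The function name `english_lines` and the sample lines are hypothetical, and the download scripts may filter differently.

```python
from typing import Iterable, Iterator


def english_lines(lines: Iterable[str], domain: str = "en") -> Iterator[str]:
    """Yield only the lines whose first space-separated field equals `domain`."""
    for line in lines:
        # Split off the first field only; page titles never contain spaces
        # in the assumed format, but we do not need the rest here anyway.
        project = line.split(" ", 1)[0]
        if project == domain:
            yield line


# Hypothetical sample lines in the assumed "project title count bytes" layout.
sample = [
    "de Berlin 12 0\n",
    "en Main_Page 500 0\n",
    "en.m Main_Page 300 0\n",  # mobile site: a different project code
    "en Python_(programming_language) 42 0\n",
]
kept = list(english_lines(sample))
```

Matching the whole first field, rather than testing whether the line starts with "en", keeps codes such as `en.m` out of the result.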

How to run the web app locally on a Mac

  1. Clone the WikiTrends GitHub repo.
  2. The WikiTrends web app requires Node.js. You can download Node.js from https://nodejs.org/en/ (it's usually better to choose the version that is recommended for most users). Alternatively, you can install Node.js with a package manager for your Mac such as Homebrew:
  • To install Homebrew, follow the steps in the How to Install Homebrew on a Mac instruction guide.
  • Open the Terminal app and type brew update. This refreshes Homebrew's package list so that it knows about the latest version of Node.
  • Type brew install node.
  3. You'll then need to install several packages.
  • First change directories in Terminal so that you are in the WikiTrends/node_modules/ directory. Then use the Node package manager (npm) to install these packages:
  • npm install express
  • npm install express-session
  • npm install body-parser
  • npm install cookie-parser
  • npm install method-override
  • npm install ejs
  4. After that, change directories to WikiTrends/webapp and run the terminal command node app.js. You should see a message saying "App listening on port 8000". If you get an error about a missing package, running npm install again with that package's name may resolve it.
  5. Point your web browser to http://localhost:8000/now. There you should see the WikiTrends web app using example data from some time ago.

How to update data in the webpage

  • backend/hourly.sh is the key script for updating the counts and finding spikes every hour.
  • However, it requires the Wikipedia pageview counts for the past 2 weeks.
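To get a feel for how much history the two-week requirement implies, the following Python sketch lists one stamp per hour for the 14 days before a given time. The function name `hours_needed` and the `YYYYMMDD-HH` stamp format are assumptions made for illustration. The sketch also assumes the counts are stored per hour, which this README does not state, and hourly.sh may organize its input differently.

```python
from datetime import datetime, timedelta
from typing import List


def hours_needed(now: datetime, days: int = 14) -> List[str]:
    """Return one YYYYMMDD-HH stamp per full hour in the `days` before `now`, oldest first."""
    # Round down to the start of the current hour; that hour is still in progress.
    end = now.replace(minute=0, second=0, microsecond=0)
    start = end - timedelta(days=days)
    return [
        (start + timedelta(hours=i)).strftime("%Y%m%d-%H")
        for i in range(days * 24)
    ]


stamps = hours_needed(datetime(2016, 10, 15, 9, 30))
```

With the default of 14 days this yields 336 stamps, so under the per-hour assumption a first run needs that many hours of counts available before spikes can be computed.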

How to get and process the current page views

Change directories to WikiTrends/backend, then make the scripts in the folder executable:

  • chmod u+x *.sh

Process:

  1. Use dictMaker.py to create a file whose lines have the form pagename '\t' count1 count2 .....
  2. Use spikeFinder.py to find spikes in the dictMaker output
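The two steps above can be sketched in Python as follows. This is not the code in dictMaker.py or spikeFinder.py; it is a minimal illustration under stated assumptions: the counts after the tab are space-separated, and a "spike" means the latest count exceeds `factor` times the mean of the preceding `window` counts. The names `make_rows`, `format_row`, `parse_row`, and `find_spikes`, as well as the default `window` and `factor` values, are hypothetical.

```python
from statistics import mean
from typing import Dict, Iterable, List, Tuple


def make_rows(daily_counts: Iterable[Dict[str, int]]) -> Dict[str, List[int]]:
    """Merge per-day {page: count} mappings into {page: [count1, count2, ...]}."""
    days = list(daily_counts)
    pages = sorted({page for day in days for page in day})
    # A page missing on a given day is recorded as 0 views.
    return {page: [day.get(page, 0) for day in days] for page in pages}


def format_row(page: str, counts: List[int]) -> str:
    """Render one line: pagename, a tab, then space-separated counts (assumed layout)."""
    return page + "\t" + " ".join(str(c) for c in counts)


def parse_row(line: str) -> Tuple[str, List[int]]:
    """Inverse of format_row."""
    page, _, rest = line.rstrip("\n").partition("\t")
    return page, [int(c) for c in rest.split()]


def find_spikes(rows: Dict[str, List[int]], window: int = 7, factor: float = 3.0) -> List[str]:
    """Return pages whose latest count exceeds factor * mean of the previous `window` counts."""
    spikes = []
    for page, counts in rows.items():
        if len(counts) < 2:
            continue  # no history to compare against
        history = counts[-(window + 1):-1]  # up to `window` counts before the latest
        baseline = mean(history)
        # Pages with an all-zero history are skipped rather than flagged.
        if baseline > 0 and counts[-1] > factor * baseline:
            spikes.append(page)
    return spikes


# Hypothetical three-day example.
days = [{"A": 10, "B": 5}, {"A": 12, "B": 5}, {"A": 11, "B": 40}]
rows = make_rows(days)
line_b = format_row("B", rows["B"])
spiking = find_spikes(rows, window=2, factor=3.0)
```

Here page B jumps from an average of 5 views to 40 and is flagged, while page A stays near its average. Changing `window` is one way to experiment with the number of days averaged.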

Done: Backend

  • Script to find the views every day for a whole month
  • Working on script to find the spikes for the month

TODO:

  • Write spikeFinder.py; vary the number of days averaged
  • Download a small chunk of files to test spikeFinder.py
