alickrxu / aaai-2015-demographics

This project is forked from tapilab/aaai-2015-demographics.

Code and data for the AAAI 2015 paper entitled: "Predicting the demographics of Twitter users from social evidence using website traffic data"

License: GNU General Public License v3.0

Languages: Jupyter Notebook 99.71%, Python 0.29%

aaai-2015-demographics's Introduction

This repository contains data and code to reproduce the results of the paper "Predicting the demographics of Twitter users from website traffic data", by Aron Culotta, Nirmal Ravi, and Jennifer Cutler, presented at the 29th AAAI Conference on Artificial Intelligence (AAAI-15), 2015.

Abstract

Understanding the demographics of users of online social networks has important applications for health, marketing, and public messaging. In this paper, we predict the demographics of Twitter users based on whom they follow. Whereas most prior approaches rely on a supervised learning approach, in which individual users are labeled with demographics, we instead create a distantly labeled dataset by collecting audience measurement data for 1,500 websites (e.g., 50% of visitors to gizmodo.com are estimated to have a bachelor's degree). We then fit a regression model to predict these demographics using information about the followers of each website on Twitter. The resulting average held-out correlation is .77 across six different variables (gender, age, ethnicity, education, income, and child status). We additionally validate the model on a smaller set of Twitter users labeled individually for ethnicity and gender, finding performance that is surprisingly competitive with a fully supervised approach.
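The evaluation idea in the abstract (fit a regression on some websites, predict the held-out ones, and correlate predictions with the reported demographics) can be illustrated with a small sketch. Everything here is an assumption made for illustration: the data is synthetic, a single hypothetical feature (`follow_rate`) stands in for the follower information, and a one-feature least-squares line stands in for the paper's regression model, which the abstract does not specify.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def fit_line(xs, ys):
    """Least-squares slope and intercept for a single feature."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def held_out_correlation(xs, ys, n_folds=5):
    """Predict each fold from a line fit on the other folds, then
    correlate all held-out predictions with the true values."""
    predictions = [0.0] * len(xs)
    for fold in range(n_folds):
        train = [i for i in range(len(xs)) if i % n_folds != fold]
        test = [i for i in range(len(xs)) if i % n_folds == fold]
        slope, intercept = fit_line([xs[i] for i in train],
                                    [ys[i] for i in train])
        for i in test:
            predictions[i] = slope * xs[i] + intercept
    return pearson(predictions, ys)


# Synthetic stand-in data (not from the paper): one row per website.
rng = random.Random(0)
follow_rate = [rng.random() for _ in range(60)]            # hypothetical feature
pct_bachelors = [30 + 40 * x + rng.gauss(0, 3) for x in follow_rate]

r = held_out_correlation(follow_rate, pct_bachelors)
print(f"held-out correlation: {r:.2f}")
```

Pooling the held-out predictions before computing one correlation is a design choice of this sketch; averaging per-fold correlations would be an equally reasonable reading of "average held-out correlation".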

Content

This repository is organized as follows:

  • data: some of the data used in the experiments.
  • src: IPython notebooks to reproduce the results.

data

  • brands.json: names of the 1,500 websites
  • demo.json: demographic variables (gender, age, etc.) for the 1,500 websites
  • twitter-cred.json: Twitter account credentials used in data collection
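Before opening the notebooks, it can help to peek at these files. The sketch below is a hedged example: the helper names are hypothetical, the internal schema of `brands.json` and `demo.json` is not documented here, and the code only assumes each file is either a single JSON document or one JSON value per line.

```python
import json
from pathlib import Path


def load_json_or_lines(path):
    """Parse a file as one JSON document, falling back to one value per line."""
    text = Path(path).read_text(encoding="utf-8")
    try:
        return json.loads(text)
    except json.JSONDecodeError:
        # Assumed fallback: line-delimited JSON, skipping blank lines.
        return [json.loads(line) for line in text.splitlines() if line.strip()]


def summarize(path):
    """Return the top-level type name and the number of top-level entries."""
    data = load_json_or_lines(path)
    size = len(data) if isinstance(data, (list, dict)) else 1
    return type(data).__name__, size


# Paths are relative to the repository root.
for name in ("brands.json", "demo.json"):
    path = Path("data") / name
    if path.exists():
        kind, size = summarize(path)
        print(f"{name}: {kind} with {size} top-level entries")
    else:
        print(f"{name}: not found (run from the repository root)")
```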

src

  • data_collection.ipynb : data collection code
  • data_processing.ipynb : data processing code

Installation and configuration

  1. Install MongoDB
  2. Install Python modules
$ pip install anaconda
$ pip install pymongo
$ pip install twython
$ pip install ipython
$ pip install twutil
  3. Add your Twitter credentials to data/twitter-cred.json
  4. Clone this repo
$ git clone https://github.com/tapilab/aaai-2015-demographics.git
  5. Run the notebooks
$ cd aaai-2015-demographics/src
$ ipython notebook --matplotlib inline data_collection.ipynb
$ ipython notebook --matplotlib inline data_processing.ipynb
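A quick way to confirm the setup before launching the notebooks is a small check script. This is only a sketch under stated assumptions: the import names below are guesses based on the pip package names (`ipython` is imported as `IPython`; `twutil` is assumed to import under its pip name), and the credentials file is only checked for being parseable JSON, since its expected keys are not documented here. It does not test whether MongoDB is running.

```python
import importlib.util
import json
from pathlib import Path

# Assumed import names for the pip packages listed in the installation steps.
REQUIRED_MODULES = ("pymongo", "twython", "IPython", "twutil")


def missing_modules(names=REQUIRED_MODULES):
    """Return the module names that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


def is_valid_json_file(path):
    """True if the file exists and parses as JSON; its keys are not inspected."""
    try:
        json.loads(Path(path).read_text(encoding="utf-8"))
    except (OSError, ValueError):
        return False
    return True


# Run from the repository root so the relative path resolves.
missing = missing_modules()
print("missing modules:", ", ".join(missing) if missing else "none")
print("credentials file readable:", is_valid_json_file("data/twitter-cred.json"))
```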

Static Notebooks

Static versions of the IPython notebooks are here:

Contributors

aronwc, nkravi, alickrxu
