pogags / polarization-twint-project

This repository contains the files and info used in a Data Science project that focuses on the continued polarization of Americans on Twitter.

Python 0.01% Jupyter Notebook 99.99%
abortion culture data-mining education guns healthcare immigration natural-language-processing nltk political-analysis politics python twint

polarization-twint-project's Introduction

Polarization & a Divided America

Project by Paul Gagliardi and James Duchesneau

Overview

To many Americans, it can seem that we are more divided than we have ever been when it comes to politics. Anecdotally, the civil unrest of 2020 and the January 6th riot at the Capitol point toward a public that is increasingly polarized and partisan. In this study, we look for objective, numerical evidence that we are truly living in more divided times.

Have we become more polarized?

Strategy

The most direct way to understand what people think about a topic on Twitter is to look at the language they use. To address our research question, we build a dataset by pulling tweets and then analyze the text with natural language processing.

Content

Reference the Project Writeup for a detailed summary of findings and more info on this project.

To see how the visualizations were generated and how data was processed, see Generate Graphs.

To see how the Tweets were pulled, see Pull Tweets.

Data

Tweets

We examined tweets from November of each of five years (2016-2020) across five topics to see how language has changed. The five topics were chosen for their political significance and for their tendency to be hotly debated and to act as catalysts for polarization. The topics are:
• Abortion
• Education
• Gun Control
• Healthcare
• Immigration

Roughly 500 tweets were pulled for each topic in each year, for about 2,500 per topic and 12,500 in total.
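The arithmetic behind those totals can be checked with a short sketch. The topic strings and variable names below are illustrative assumptions, not the search terms or identifiers used in the project notebooks.

```python
from itertools import product

# Illustrative topic labels (assumed); the notebooks may use different search terms.
TOPICS = ["abortion", "education", "gun control", "healthcare", "immigration"]
YEARS = list(range(2016, 2021))  # 2016 through 2020 inclusive
TWEETS_PER_PULL = 500            # approximate target for one topic in one year

# One pull per (topic, year) pair.
pulls = list(product(TOPICS, YEARS))

per_topic = TWEETS_PER_PULL * len(YEARS)
total = TWEETS_PER_PULL * len(pulls)

print(len(pulls), per_topic, total)  # 25 2500 12500
```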

Metrics

For this project, we examined two metrics and used a word cloud as a third, visual strategy to build a picture of how the language around these topics has changed.
• Sentiment Analysis using NLTK
• Extreme Word Count
• Word Cloud (visual representation for subjective analysis)
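As a rough illustration of the two numeric metrics, the sketch below counts words from a small "extreme" list and buckets a sentiment score into negative, neutral, or positive. The word list, the 0.05 threshold, and the choice of NLTK's VADER analyzer are all assumptions made for this example; the README only says that NLTK was used, and the authors' actual word list and settings are in the Generate Graphs notebook.

```python
import re
from collections import Counter

# Hypothetical word list for illustration only; not the authors' list.
EXTREME_WORDS = {"never", "always", "destroy", "evil", "insane"}


def extreme_word_count(tweets, extreme_words=EXTREME_WORDS):
    """Count occurrences of each extreme word across a list of tweet strings."""
    counts = Counter()
    for tweet in tweets:
        # Lowercase and split into simple alphabetic tokens.
        for token in re.findall(r"[a-z']+", tweet.lower()):
            if token in extreme_words:
                counts[token] += 1
    return counts


def label_sentiment(compound, threshold=0.05):
    """Bucket a compound score in [-1, 1] as negative, neutral, or positive."""
    if compound >= threshold:
        return "positive"
    if compound <= -threshold:
        return "negative"
    return "neutral"


def score_tweets(tweets):
    """Label tweets with NLTK's VADER analyzer (assumed; needs the vader_lexicon resource)."""
    from nltk.sentiment.vader import SentimentIntensityAnalyzer  # imported lazily

    analyzer = SentimentIntensityAnalyzer()
    return [label_sentiment(analyzer.polarity_scores(t)["compound"]) for t in tweets]


sample = ["They will NEVER stop. Never!", "Schools reopen next week."]
print(extreme_word_count(sample))  # Counter({'never': 2})
```

Comparing the share of "neutral" labels and the extreme word totals from year to year is one way to read the two metrics side by side.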

Twint

This project relied heavily on twint, a Python package that allows older tweets to be scraped from Twitter without the use of the Twitter API. An example is listed here.
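For orientation, a single pull might be configured roughly as follows. This is a minimal sketch, not the code from the Pull Tweets notebook: the helper names, the output file naming, and the exact date bounds are assumptions. The attribute names (Search, Since, Until, Limit, Store_csv, Output) are twint configuration options, and twint is imported lazily so the option-building helper can be run without it installed.

```python
def build_search_options(topic, year, limit=500, out_dir="."):
    """Collect the options for one pull: one topic, November of one year."""
    return {
        "Search": topic,
        "Since": f"{year}-11-01",
        "Until": f"{year}-11-30",
        "Limit": limit,
        "Store_csv": True,
        # Hypothetical naming scheme for the output file.
        "Output": f"{out_dir}/{topic.replace(' ', '_')}_{year}.csv",
    }


def pull_tweets(topic, year, **kwargs):
    """Run one twint search using the options above."""
    import twint  # imported here so the helper above works without twint

    config = twint.Config()
    for name, value in build_search_options(topic, year, **kwargs).items():
        setattr(config, name, value)
    twint.run.Search(config)


options = build_search_options("gun control", 2018)
print(options["Output"])  # ./gun_control_2018.csv
```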

Limitations

Because of how the twint package scrapes Twitter, the PullTweets file can occasionally fail to generate every CSV correctly the first time. If a CSV is not generated, rerun the cell or the specific line of code that generates it. This may take one or several tries.

If this does not work, simply use the files listed in the TweetCSV-archive folder.
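The retry-then-fall-back routine described above could be automated along these lines. This is a sketch under assumptions: `ensure_csv` and its parameters are hypothetical names, and `pull` stands in for whatever call generates one CSV. The demo at the bottom uses a fake pull that only succeeds on its third call.

```python
import os
import shutil
import tempfile


def ensure_csv(pull, csv_path, archive_path, attempts=3):
    """Call `pull` until `csv_path` exists and is non-empty; else copy an archived file."""
    for _ in range(attempts):
        try:
            pull()
        except Exception as error:  # scraping failures vary, so catch broadly
            print(f"pull failed: {error}")
        if os.path.exists(csv_path) and os.path.getsize(csv_path) > 0:
            return "pulled"
    if os.path.exists(archive_path):
        shutil.copy(archive_path, csv_path)
        return "archive"
    raise FileNotFoundError(f"no CSV produced and no archive at {archive_path}")


# Demo with a fake pull that writes the file only on its third call.
workdir = tempfile.mkdtemp()
target = os.path.join(workdir, "abortion_2016.csv")
calls = []


def flaky_pull():
    calls.append(1)
    if len(calls) == 3:
        with open(target, "w") as handle:
            handle.write("id,tweet\n")


result = ensure_csv(flaky_pull, target, os.path.join(workdir, "missing.csv"))
print(result, len(calls))  # pulled 3
```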

Conclusion

Based on the data we collected, and on the extreme word counts in particular, it is hard to deny that there is significant evidence of continuing polarization among the American public, at least in the politics around Abortion, Education, Gun Control, Healthcare, and Immigration. Extreme word use increased for every topic and doubled for nearly every one.

The evidence for continued division from sentiment analysis is weaker, but some is present. Negative sentiment on Education indicates more division and passion in that area (Education's extreme words also doubled), and we saw much less neutrality and more strongly opinionated tweets on Healthcare (extreme words more than doubled here).

Combining these observations, we can confidently conclude that Healthcare and Education show strong evidence of quickly becoming more controversial and polarizing, whereas Gun Control, Immigration, and Abortion show some evidence of the same trend, though it is not substantial.
