FiveThirtyEight Analysis

I have been a regular reader of FiveThirtyEight since its relaunch under ESPN in March 2014. I was drawn to its style of journalism, which combined data analytics with the subjects I was interested in: politics, economics, and sports. In the beginning I read every word of every article on the site. However, as time went on and the site grew, I found that I could no longer keep up with the daily content. I began to skip articles, and eventually found myself skimming those that I did read. I recently realized that I no longer enjoy reading FiveThirtyEight the way I originally did.

I have a theory about why I don't enjoy FiveThirtyEight's articles as much as I once did: the articles have become longer and less data-driven. I dislike reading long articles that have few tables, graphics, and other figures to help explain the data. If this hypothesis is true, then FiveThirtyEight has become more like other news outlets and less like a source of data journalism.

In true FiveThirtyEight fashion, I decided to analyze data scraped from their website. The code provided here shows how I scraped and analyzed all articles from http://fivethirtyeight.com since its official launch on March 17, 2014. You can run this code yourself to verify my results, and you can change my configuration choices if the structure of FiveThirtyEight's website changes.

Results

My findings suggest that my theory about FiveThirtyEight may be justified. As the chart below of monthly average word count shows, there is a clear upward trend in the length of FiveThirtyEight's articles. The trendline (with an r-squared coefficient of 0.376) rises from around 850 words in March 2014 to almost 1100 words in June 2018. Put another way, the average of the monthly word count averages for the first year of FiveThirtyEight's publication was 883.7, whereas it was 1039.5 in the most recent year (through the end of June 2018). That is an increase of 17.6% in article length since the first year of publication. Furthermore, a statistical test on the linear regression gives a p-value of 1.34e-06, meaning the trend is statistically significant. Thus, we reject the null hypothesis that FiveThirtyEight articles have remained the same length.
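The trendline and r-squared value above come from a linear regression of the monthly averages against time. As a minimal sketch of that kind of fit (not necessarily the code used in this repository), the function below runs an ordinary least-squares regression against the month index. The name `linearTrend` and the sample values are assumptions made for illustration only; they are not the scraped data.

```javascript
// Ordinary least-squares fit of values against x = 0, 1, 2, ... (month index).
// Returns the slope, intercept, and r-squared of the fitted line.
function linearTrend(values) {
  const n = values.length;
  if (n < 2) throw new Error("need at least two data points");

  const meanX = (n - 1) / 2; // mean of 0..n-1
  const meanY = values.reduce((sum, y) => sum + y, 0) / n;

  let sxy = 0; // sum of (x - meanX) * (y - meanY)
  let sxx = 0; // sum of (x - meanX)^2
  let syy = 0; // sum of (y - meanY)^2
  for (let i = 0; i < n; i++) {
    const dx = i - meanX;
    const dy = values[i] - meanY;
    sxy += dx * dy;
    sxx += dx * dx;
    syy += dy * dy;
  }

  const slope = sxy / sxx;
  const intercept = meanY - slope * meanX;
  const rSquared = (sxy * sxy) / (sxx * syy);
  return { slope, intercept, rSquared };
}

// Hypothetical monthly averages, for illustration only.
const sampleMonthlyWordCounts = [850, 900, 870, 950, 1000, 980];
const trend = linearTrend(sampleMonthlyWordCounts);
console.log(`slope per month: ${trend.slope.toFixed(2)}, r-squared: ${trend.rSquared.toFixed(3)}`);
```

A positive slope means the monthly average is rising over time, and r-squared measures how much of the month-to-month variation the straight line explains.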

word count chart

I also analyzed the occurrences of charts and figures in FiveThirtyEight's articles, as that is my favorite aspect of their data journalism. In the chart below of monthly average figure count, we can see a negative trendline with an r-squared coefficient of 0.1 (admittedly not a strong correlation). In the first year of publication the average of the monthly figure count averages was 1.83, while for the most recent year (through the end of June 2018) it was 1.4. That is a decline of 23.5% since the first year of publication. Additionally, a statistical test on the linear regression gives a p-value of 2.01e-02, meaning the trend is statistically significant. Thus, we reject the null hypothesis that FiveThirtyEight articles have maintained a steady number of charts and figures.
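The percentage changes quoted in this section follow directly from the first-year and most-recent-year averages. A short sketch of that arithmetic, using the averages reported above (the helper name `percentChange` is an assumption for illustration):

```javascript
// Relative change from `before` to `after`, expressed as a percentage.
function percentChange(before, after) {
  return ((after - before) / before) * 100;
}

// Averages of the monthly averages, as reported in the Results section.
const wordCountChange = percentChange(883.7, 1039.5);
const figureCountChange = percentChange(1.83, 1.4);

console.log(`word count change: ${wordCountChange.toFixed(1)}%`);
console.log(`figure count change: ${figureCountChange.toFixed(1)}%`);
```

A negative result indicates a decline relative to the first year.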

figure count chart

This analysis shows that, while slight, there has been a statistically significant increase in article length with a simultaneous decrease in the use of figures and charts. In my opinion, that shows that FiveThirtyEight has moved away from its data journalism ideals. There is more traditional writing and less use of charts and figures across the articles. An example of this is the weekly FiveThirtyEight Slack chat, an article that is mostly a log of a Slack conversation between political writers rather than a data-driven piece. I'm not very interested in reading articles like these, whereas in the first year of publication I read every article on FiveThirtyEight. Now I have a data-driven explanation for why I find FiveThirtyEight less interesting.

How to run

Prerequisites:

  • Node.js
  • nvm
  • R (optional, only needed to run the statistical test)

Install and run:

$ nvm install
$ npm install
$ npm start

This scrapes the articles and writes all article information as JSON to results.json. It then analyzes those articles and writes the analysis to results.csv.
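To illustrate the analysis step, here is a minimal sketch of how per-article records could be grouped into monthly averages and written out as CSV rows. It is not the repository's actual implementation: the field names `date`, `wordCount`, and `figureCount`, the CSV column names, and the sample records are all assumptions made for this example.

```javascript
// Group articles by publication month (YYYY-MM) and average their counts.
// The field names date, wordCount, and figureCount are assumed, not taken
// from the real results.json.
function monthlyAverages(articles) {
  const buckets = new Map();
  for (const { date, wordCount, figureCount } of articles) {
    const month = date.slice(0, 7); // "2014-03-17" -> "2014-03"
    const bucket = buckets.get(month) || { words: 0, figures: 0, count: 0 };
    bucket.words += wordCount;
    bucket.figures += figureCount;
    bucket.count += 1;
    buckets.set(month, bucket);
  }
  return [...buckets.entries()]
    .sort(([a], [b]) => a.localeCompare(b)) // chronological order
    .map(([month, b]) => ({
      month,
      avgWords: b.words / b.count,
      avgFigures: b.figures / b.count,
    }));
}

// Render the monthly rows as CSV text with a header line.
function toCsv(rows) {
  const header = "month,avgWords,avgFigures";
  const lines = rows.map(
    (r) => `${r.month},${r.avgWords.toFixed(1)},${r.avgFigures.toFixed(2)}`
  );
  return [header, ...lines].join("\n");
}

// Hypothetical records, for illustration only.
const sampleArticles = [
  { date: "2014-04-02", wordCount: 1000, figureCount: 1 },
  { date: "2014-03-17", wordCount: 800, figureCount: 2 },
  { date: "2014-03-20", wordCount: 900, figureCount: 0 },
];
console.log(toCsv(monthlyAverages(sampleArticles)));
```

Averaging per month first, and only then comparing years, matches the "average of the monthly averages" figures reported in the Results section.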

You can modify the selectors (HTML element and class) used to scrape articles in config.js.
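As a rough illustration of what an element-plus-class selector means in practice, the sketch below turns such a pair into a CSS selector string. The object shape, the key names, and the class name `entry-content` are hypothetical placeholders; check config.js for the real structure and values.

```javascript
// Hypothetical selector entries: an HTML element name plus an optional class.
// These keys and values are placeholders, not the contents of config.js.
const exampleSelectors = {
  articleBody: { element: "div", className: "entry-content" },
  figure: { element: "figure", className: "" },
};

// Build a CSS selector such as "div.entry-content" from one entry.
// An empty class name yields a plain element selector.
function toCssSelector({ element, className }) {
  return className ? `${element}.${className}` : element;
}

for (const [name, entry] of Object.entries(exampleSelectors)) {
  console.log(`${name}: ${toCssSelector(entry)}`);
}
```

If FiveThirtyEight changes its page markup, updating the element or class in the configuration is what keeps the scraper matching the right parts of each article.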

You can then run a statistical analysis on the results using R:

$ R -f statistical-test.r

Contributors

  • jdd1260
