uni_value's Introduction

What is the Worth of Your Education?

Monetary Value of University Degree

Tony Paek
March 22, 2015

Introduction


This graph critique is based on the Economist's article titled "It depends on what you study, not where". The article presents a scatter plot of the average rate of return on investment for 240 educational institutions in the United States (shown below) and makes the following two claims.

  1. What people study affects the future earnings of graduates.
  2. Where people study does not affect the future earnings of graduates.

[Figure: the Economist's scatter plot of average rate of return against school selectivity]

Graph Critique


The graph above has many issues, some of which are addressed below.

1. Two Claims in One Graph

The author makes two claims from one graph. The first is that what people study affects graduates' earnings. The second is that where people study does not. I intend to check each claim with its own graph, so that the two questions are answered separately.

2. Choice of Variables

X-Axis : Selectivity of a School

Selectivity is important, but is it the most important consideration when high school students evaluate colleges? Brown University, for example, is more selective in admitting students than Caltech, but do students think Brown offers better earning potential than Caltech? I do not think so. However flawed rankings may be, a university's ranking is a more natural factor to look at when people weigh the potential earnings from a degree at those colleges.

Therefore, my data exploration looks not only at the selectivity of a school, but also at its ranking, graduation rate, and tuition, in order to draw inferences about how those factors correlate with students' earning potential.

Y-Axis : Average Rate of Return on Educational Investment

The 20-year annual average return is calculated as follows:

$$ \text{Average Rate of Return} = \frac{(\text{Aggregate Earnings of Degree Holder}) - (\text{Aggregate Earnings of High School Graduate}) - (\text{4-Year Tuition})}{20 \times (\text{4-Year Tuition})} $$

The choice of y-axis makes the graph uninformative. I don't think the rate of return on college investment is the appropriate variable. Since the graph does not show how much it costs to obtain a bachelor's degree at a particular institution, an annual average return expressed as a percentage does not tell me how much a given institution helps its graduates earn. The aggregate return on investment over 20 years, on the other hand, would be a more useful piece of information.
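To make the point concrete, here is a minimal sketch of the formula above in Python. The two schools and all dollar figures are hypothetical, chosen only to show that the same percentage return can hide very different aggregate gains.

```python
def net_gain(grad_earnings, hs_earnings, tuition_4yr):
    """Aggregate 20-year gain: the numerator of the formula above."""
    return grad_earnings - hs_earnings - tuition_4yr


def average_rate_of_return(grad_earnings, hs_earnings, tuition_4yr, years=20):
    """Annual average return: aggregate gain divided by (years * 4-year tuition)."""
    return net_gain(grad_earnings, hs_earnings, tuition_4yr) / (years * tuition_4yr)


# Hypothetical figures (USD), not taken from PayScale or the Economist.
school_a = {"grad_earnings": 1_500_000, "hs_earnings": 900_000, "tuition_4yr": 200_000}
school_b = {"grad_earnings": 1_050_000, "hs_earnings": 900_000, "tuition_4yr": 50_000}

for label, school in (("A", school_a), ("B", school_b)):
    rate = average_rate_of_return(**school)
    gain = net_gain(**school)
    print(f"School {label}: rate = {rate:.1%}, aggregate gain = ${gain:,}")
```

Under these assumed numbers both schools show the same annual rate, yet the aggregate gain at School A is four times that at School B, which is the information the percentage alone does not convey.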

3. Direction of Selected Variables

In addition, the direction of the axis did not feel natural to me. It would seem more natural to place more selective schools on the right. For features that may be inversely correlated with earnings, I changed the direction of the axis so that positive slopes are expected.

4. Omission of Variables

I'm not sure if it was intentional, but there are noticeable omissions of data in the graph. For arts/humanities, there are no data points for universities with an admission rate below 10%. I think this severely skews the results in favor of engineering/computer science/maths degrees, and also affects the slope of the curve. I intend to address it by including data points without such bias.

Data Collection


PayScale Data

Data indicating the overall earnings for each institution were extracted from PayScale. Sources are here. Since the data were all on one HTML page, I only had to copy and paste them into Excel and do some data cleaning in Python. 796 colleges were extracted from PayScale.

U.S. News Data

U.S. News had the most comprehensive ranking of national universities. Sources are here.

The data were scattered across multiple pages, which necessitated the use of Python and Beautiful Soup to extract them. As the tables were not clean, a lot of exception handling was needed to scrape the data required for the analysis correctly. Here is the link to my IPython notebook for scraping the data. 273 colleges were extracted from U.S. News.
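The kind of exception handling that messy table cells call for can be sketched as below. This is not the code from the linked notebook: the function names, the four-column row layout, and the sample values are all assumptions made for illustration, and the sketch starts from cell strings that have already been pulled out of the HTML.

```python
def parse_number(cell):
    """Convert a scraped cell such as '$45,000', '#12', or '12.5%' to a float.

    Returns None when the cell is missing or not numeric (for example 'N/A').
    """
    if cell is None:
        return None
    text = cell.strip()
    for symbol in ("$", ",", "#", "%"):
        text = text.replace(symbol, "")
    try:
        return float(text)
    except ValueError:
        return None


def parse_row(cells):
    """Parse one row of [name, rank, tuition, acceptance rate] cell strings.

    Returns None when the row does not have exactly four cells.
    """
    try:
        name, rank, tuition, acceptance = cells
    except ValueError:  # too few or too many cells in this row
        return None
    return {
        "name": name.strip(),
        "rank": parse_number(rank),
        "tuition": parse_number(tuition),
        "acceptance_rate": parse_number(acceptance),
    }


# Hypothetical rows, including one malformed row and one missing value.
rows = [
    ["  Example University ", "#12", "$45,000", "12.5%"],
    ["Another College", "N/A", "$30,500", "40%"],
    ["Broken row"],
]
parsed = [parse_row(r) for r in rows]
print(parsed)
```

Returning None instead of raising keeps a single bad cell from aborting a multi-page scrape; the missing values can then be reviewed by hand afterwards.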

Entity Matching

Matching universities across the two sources was the most challenging part of this project. The only key available for matching was the university name. While the names of private universities were more or less consistent, many state universities were written very differently and inconsistently in the two sources. Quite a lot of manual work was required. The code to perform such tasks is shown here.

Eventually, except for 24 colleges that did not have a match, all the universities scraped from U.S. News were merged with data from PayScale, so that a comprehensive set of data points is plotted.
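One way to reduce the manual work in name matching is to normalize the names and then fall back on fuzzy string matching. The sketch below uses `difflib` from the Python standard library; it is an illustrative approach, not the linked code, and the normalization rules, the 0.85 cutoff, and the sample names are assumptions.

```python
import difflib
import re


def normalize(name):
    """Lowercase a name, expand a common abbreviation, and drop punctuation."""
    name = name.lower().replace("&", " and ")
    name = re.sub(r"[^a-z0-9 ]", " ", name)          # commas, hyphens, periods
    name = re.sub(r"\buniv\b", "university", name)   # "Univ" -> "university"
    return " ".join(name.split())                    # collapse repeated spaces


def match_names(us_news_names, payscale_names, cutoff=0.85):
    """Map each U.S. News name to its closest PayScale name, if close enough."""
    lookup = {normalize(n): n for n in payscale_names}
    matches, unmatched = {}, []
    for name in us_news_names:
        candidates = difflib.get_close_matches(
            normalize(name), list(lookup), n=1, cutoff=cutoff
        )
        if candidates:
            matches[name] = lookup[candidates[0]]
        else:
            unmatched.append(name)  # left for manual review
    return matches, unmatched


# Hypothetical spellings of the same schools in the two sources.
us_news = ["University of California--Berkeley", "Brown University",
           "Hypothetical State University"]
payscale = ["University of California - Berkeley", "Brown University",
            "California Institute of Technology"]

matches, unmatched = match_names(us_news, payscale)
print(matches)
print(unmatched)
```

A high cutoff keeps false matches rare at the cost of leaving more names unmatched; those leftovers still need the kind of manual review described above.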

Clean data used for the analysis are below.

U.S. News College Rank Data + PayScale College Value Data

PayScale College Value Data Divided by Major

PayScale College Value Data Divided by Career

The code used to scrape and process the data is here.

Implementation


1. Static Graph (Rank vs ROI)

A simple, mostly static graph was created, with controls that let me choose the x-axis variable. It can be accessed here.

The code used to produce the graph above is here.

2. Interactive Graph (Rank vs ROI)

The graph lets users (i) filter the results by rank, admission rate, graduation rate, tuition cost, and name, and (ii) select the variables to be displayed on both the x and y axes. It can be accessed here.

The code used to produce the graph above is here.
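The filter-then-select logic behind such a graph can be sketched in Python as follows. This illustrates the idea only and is not the linked implementation; the `College` fields, the function names, and the sample records are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class College:
    name: str
    rank: int
    admission_rate: float    # percent
    graduation_rate: float   # percent
    tuition: float           # USD
    roi_20yr: float          # USD


def filter_colleges(colleges, max_rank=None, max_admission_rate=None,
                    min_graduation_rate=None, max_tuition=None,
                    name_contains=None):
    """Keep the colleges that pass every filter that was actually set."""
    kept = []
    for c in colleges:
        if max_rank is not None and c.rank > max_rank:
            continue
        if max_admission_rate is not None and c.admission_rate > max_admission_rate:
            continue
        if min_graduation_rate is not None and c.graduation_rate < min_graduation_rate:
            continue
        if max_tuition is not None and c.tuition > max_tuition:
            continue
        if name_contains and name_contains.lower() not in c.name.lower():
            continue
        kept.append(c)
    return kept


def select_axes(colleges, x_field, y_field):
    """Return (x, y) pairs for the two fields chosen for the axes."""
    return [(getattr(c, x_field), getattr(c, y_field)) for c in colleges]


# Hypothetical records, not taken from the project's data.
colleges = [
    College("Example Tech", 5, 9.0, 93.0, 180_000, 900_000),
    College("Sample State University", 60, 55.0, 78.0, 90_000, 450_000),
    College("Demo College", 120, 70.0, 65.0, 140_000, 200_000),
]

top = filter_colleges(colleges, max_rank=100, max_tuition=200_000)
print(select_axes(top, "rank", "roi_20yr"))
```

Leaving a filter as None means it is not applied, which mirrors a control the user has not touched.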

3. Interactive Graph Grouped by Major

The graph lets users filter the results by ROI, total tuition, and university name. It can be accessed here.

The code used to produce the graph above is here.

4. Interactive Graph Grouped by Career

The graph lets users filter the results by ROI, total tuition, and university name. It can be accessed here.

The code used to produce the graph above is here.
