sf_salaries_gender_bias's Introduction

205Final

Gender Gap in SF government employee salaries

I started this project hoping to help prospective SF government employees decide whether to take a job by showing them the standard of living their salary would support (proposal PowerPoint: Should you work as a SF government Employee.ppt). I realized that most employees do not earn enough to afford housing within the city of SF, and that getting data for the suburban neighborhoods would be out of scope for this project. I also found an interesting analysis on Kaggle that looked for patterns using the gender of the employees. That led me to write a Python script (cleanSFCsalaries.php) that cleans the data and uses the sexmachine package to determine each employee's gender from their name. One drawback is that I had to drop the records for which the package could not determine a gender. I used the job title to determine each employee's career track, grade level, and associated department. I then uploaded the newly formatted data into Hive.
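The cleaning step described above can be sketched roughly as follows. This is a minimal illustration, not the actual script: the `EmployeeName` and `JobTitle` column names, the tiny name lookup standing in for the sexmachine detector, and the rule that a trailing Roman numeral or digit in the title marks the grade level are all assumptions made for the example.

```python
import re

# Hypothetical stand-in for a name-based gender detector such as sexmachine;
# a real run would query the package instead of this tiny lookup table.
SAMPLE_NAME_GENDERS = {"maria": "female", "john": "male"}


def guess_gender(first_name):
    """Return 'male', 'female', or None when the name is not recognized."""
    return SAMPLE_NAME_GENDERS.get(first_name.strip().lower())


# Assumed title convention: a trailing Roman numeral or digit is the grade level.
GRADE_PATTERN = re.compile(r"^(?P<track>.*?)\s+(?P<grade>[IVX]+|\d+)$")


def split_title(job_title):
    """Split a title into (career_track, grade_level); grade is None if absent."""
    title = job_title.strip()
    match = GRADE_PATTERN.match(title)
    if match:
        return match.group("track"), match.group("grade")
    return title, None


def clean_records(records, detector=guess_gender):
    """Add gender and title fields; drop records whose gender is undetermined."""
    cleaned = []
    dropped = 0
    for record in records:
        name_parts = record["EmployeeName"].split()
        gender = detector(name_parts[0]) if name_parts else None
        if gender is None:
            dropped += 1  # the detector could not decide, so the record is skipped
            continue
        track, grade = split_title(record["JobTitle"])
        cleaned.append(
            {**record, "Gender": gender, "CareerTrack": track, "GradeLevel": grade}
        )
    return cleaned, dropped
```

Passing the detector in as a parameter keeps the dropping logic independent of the lookup, so a real name-based detector could be swapped in without touching the rest. Returning the dropped count makes it easy to report how many records were lost to undetermined names.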

I connected Tableau to the data to look for patterns. I came up with a data visualization, implemented it in D3, and hosted it at

http://people.ischool.berkeley.edu/~sahab/w209final/

My future plan is to get more datasets for other major cities and run similar scripts on them. I would use PySpark for cleaning, because sexmachine took a whole day on an EC2 server. Once the data set is large, I will use Spark for data aggregation and create a serving layer for users of my website, where they can pick and choose which factors and cities they want to compare. I will also apply machine learning to predict pay based on job title, department, and city (summer project). I will also sample outliers to find wrongly associated departments, career tracks, and grade levels.

sf_salaries_gender_bias's People

Contributors

sahaba