cs418-s20-the-parselmouths's Introduction

The Parselmouths

Repository for CS 418 Group Project.

  • Project: Analyzing H1B acceptance trends and the factors related to them.

The H1B visa is a nonimmigrant visa issued to graduate-level applicants, allowing them to work in the United States. The employer sponsors the H1B visa for workers with theoretical or technical expertise in specialized fields such as IT, finance, and accounting. An interesting fact about immigrant workers is that they founded about 52 percent of the new Silicon Valley companies started between 1995 and 2005.

Some famous CEOs, such as Indra Nooyi (PepsiCo), Elon Musk (Tesla), Sundar Pichai (Google), and Satya Nadella (Microsoft), once arrived in the US on an H1B visa.

Motivation: Our team consists of five international graduate students who will be applying for H1B visas in the future. The visa application process seems long, complicated, and uncertain, so we decided to study it and use machine learning algorithms to predict the acceptance rate and trends of H1B visa petitions.

Aim:

The goal of the project is to analyze trends in the H1B visa dataset. We have two ML models: one predicts acceptance or rejection of a visa application, and the other predicts the wage_rate of an applicant.
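As a point of reference for the acceptance/rejection model, a common sanity check is a majority-class baseline: always predict the most frequent outcome and see what accuracy that alone achieves. The sketch below is only an illustration of that idea, not the code in baseline.py; the function name `majority_baseline` and the sample statuses are hypothetical.

```python
from collections import Counter


def majority_baseline(labels):
    """Return the most common label and the accuracy of always predicting it."""
    if not labels:
        raise ValueError("labels must not be empty")
    label, count = Counter(labels).most_common(1)[0]
    return label, count / len(labels)


# Hypothetical case statuses for illustration only (not taken from the OFLC data).
statuses = ["CERTIFIED", "CERTIFIED", "DENIED", "CERTIFIED"]
label, accuracy = majority_baseline(statuses)
print(label, accuracy)
```

Any trained classifier should beat this accuracy to be considered useful, which matters for heavily imbalanced outcomes.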

The visualizations in the notebook provide the following insights into the H1B visa:

  • Salary distribution for Data Science Domain.

  • Number of jobs in Data Science Domain.

  • Top 10 employers who sponsor H1B Visa.

  • Job Distribution in the US for each State.

  • Education level comparison for each Degree.
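A ranking such as the top 10 employers comes down to counting petitions per employer. The sketch below shows one way to do that with the standard library; it is not the code in visualizations.py, and the `EMPLOYER_NAME` column name, the `top_employers` helper, and the sample rows are assumptions made for illustration.

```python
import csv
import io
from collections import Counter


def top_employers(rows, n=10, column="EMPLOYER_NAME"):
    """Count petitions per employer (case-insensitive) and return the n largest."""
    counts = Counter(
        row[column].strip().upper() for row in rows if row.get(column)
    )
    return counts.most_common(n)


# Made-up sample rows; the real files are far larger.
sample = io.StringIO(
    "EMPLOYER_NAME,CASE_STATUS\n"
    "Acme Corp,CERTIFIED\n"
    "acme corp,DENIED\n"
    "Globex,CERTIFIED\n"
)
print(top_employers(csv.DictReader(sample), n=2))
```

Normalizing case and whitespace before counting keeps spelling variants of the same employer from being split across several entries.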

Data Set:

The data used in the project was collected from the Office of Foreign Labor Certification (OFLC). It provides insight into each petition, with information such as the job title, wage, employer, and worksite location. To download the data, follow the steps below:

  1. Click on the above link to open the OFLC webpage.
  2. Click on the Disclosure data tab.
  3. Scroll down to find the LCA/H1B data.

Files in the repository:

  • cleaning.py - A collection of methods used for data cleaning and feature engineering.

  • baseline.py - A Python file used to calculate the baseline prediction.

  • read_files.py - A Python program that reads the CSV files for 2015, 2016, 2017, 2018 and 2019 efficiently, to avoid running out of RAM.

  • H1B_Visa_Analysis.ipynb - The main notebook of the project.

  • visualizations.py - Code for the visualizations shown in the notebook.

  • mid_progress_report.ipynb - Progress report submitted in April.

  • H1b_perm.ipnyb - PERM dataset used to create the Education.png visualization, which is attached to the H1B_Visa_Analysis notebook.

  • read_me_python_files.pdf - Instructions for loading the Python files into Google Colab.
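One common way to read large yearly CSV files without exhausting memory is to stream them in fixed-size chunks and keep only the columns that are needed. The sketch below illustrates that idea with the standard library; it is not the implementation in read_files.py, and the `iter_chunks` helper, the chunk size, and the sample columns are assumptions.

```python
import csv
import io
from itertools import islice


def iter_chunks(handle, chunk_size=50_000, columns=None):
    """Yield lists of row dicts, holding at most chunk_size rows in memory."""
    reader = csv.DictReader(handle)
    while True:
        chunk = list(islice(reader, chunk_size))
        if not chunk:
            return
        if columns is not None:
            # Drop unneeded columns early to reduce memory use further.
            chunk = [{name: row[name] for name in columns} for row in chunk]
        yield chunk


# Made-up in-memory sample standing in for one yearly file.
sample = io.StringIO(
    "CASE_STATUS,JOB_TITLE\n"
    "CERTIFIED,Analyst\n"
    "DENIED,Engineer\n"
    "CERTIFIED,Scientist\n"
)
sizes = [len(chunk) for chunk in iter_chunks(sample, chunk_size=2)]
print(sizes)
```

With real files, the handle would come from `open(path, newline="")`, and each chunk can be cleaned or aggregated before the next one is read.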

cs418-s20-the-parselmouths's People

Contributors

sheetal-prasad, swathi-965, mjagad2, ginogustavo, charicf, itsmeccr
