Salary Prediction Using Machine Learning

This project focuses on predicting salary based on various factors using machine learning techniques. It serves as the final project for the Ironhack bootcamp.

Project Overview

The goal of this project is to develop accurate models that can predict salary based on occupation, education, and experience-related features. It involves data preprocessing, feature engineering, model selection, evaluation, and tuning.

The main objectives of the project are:

  • Exploring and understanding the dataset
  • Performing data cleaning and preprocessing
  • Applying various machine learning algorithms for salary prediction
  • Evaluating and comparing the performance of different models
  • Selecting the best-performing model and fine-tuning it
  • Providing insights and recommendations based on the results

Dataset

The dataset used for this project is a pre-processed and cleaned dataset obtained from reliable sources. It contains information about monthly salaries and related attributes of individuals. The features include gender, education level, years of experience, and more.
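As a minimal sketch of how such a file might be loaded, the snippet below reads a CSV with Python's standard `csv` module and separates the target from the remaining features. The column name `salary` and both helper names are assumptions made for illustration; the actual columns in the project's data file may differ.

```python
import csv


def load_rows(path):
    """Read a CSV file into a list of dicts, one dict per row."""
    with open(path, newline="", encoding="utf-8") as handle:
        return list(csv.DictReader(handle))


def split_features_target(rows, target="salary"):
    """Separate the target column from the other columns.

    The default target name "salary" is hypothetical, not taken from the dataset.
    """
    features = [{k: v for k, v in row.items() if k != target} for row in rows]
    targets = [float(row[target]) for row in rows]
    return features, targets
```

Feature values are returned as strings here; categorical columns such as gender or education level would still need to be encoded before training.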

Installation

To run the project locally, please follow these steps:

  1. Clone the repository to your local machine.
  2. Install the required dependencies by running the following command: pip install -r requirements.txt.

Usage

  1. Make sure you have the necessary data file, salary_cleaned.csv, placed in the appropriate location.
  2. Run the main code file, salary_prediction.py, to train and evaluate the models.
  3. View the results, including MAE, MSE, and R-squared scores, printed in the console.
  4. Explore the code and modify it as needed for further analysis or experimentation.

Results

The project evaluates and compares the performance of three machine learning models: Gradient Boosting, K-Nearest Neighbors, and Random Forest. The key evaluation metrics include Mean Absolute Error (MAE), Mean Squared Error (MSE), and Coefficient of Determination (R-squared).
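For reference, the three metrics can be written out in a few lines of plain Python. This is a sketch of the standard definitions, not the project's own code, which may rely on a library implementation instead; the function names are chosen for illustration.

```python
def mean_absolute_error(y_true, y_pred):
    """Average absolute difference between actual and predicted values."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def mean_squared_error(y_true, y_pred):
    """Average squared difference; penalizes large errors more than MAE."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - (residual sum of squares / total sum of squares)."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot
```

Lower MAE and MSE indicate smaller prediction errors, while an R-squared closer to 1 means the model explains more of the variance in salary.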

Based on the evaluation, the best-performing model for salary prediction is the Gradient Boosting model, which achieved the lowest MAE and MSE values and the highest R-squared score on the testing set.
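A comparison of this kind could be sketched with scikit-learn as follows. The example assumes scikit-learn is installed, uses synthetic data from `make_regression` rather than the salary dataset, and leaves every model at its default hyperparameters, so its scores say nothing about the project's actual results. The helper names `compare_models` and `best_by_mae` are hypothetical.

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import GradientBoostingRegressor, RandomForestRegressor
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsRegressor


def compare_models(X, y, random_state=0):
    """Fit three regressors on a train split and score each on the held-out test split."""
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.2, random_state=random_state
    )
    models = {
        "Gradient Boosting": GradientBoostingRegressor(random_state=random_state),
        "K-Nearest Neighbors": KNeighborsRegressor(),
        "Random Forest": RandomForestRegressor(random_state=random_state),
    }
    results = {}
    for name, model in models.items():
        model.fit(X_train, y_train)
        predictions = model.predict(X_test)
        results[name] = {
            "mae": mean_absolute_error(y_test, predictions),
            "mse": mean_squared_error(y_test, predictions),
            "r2": r2_score(y_test, predictions),
        }
    return results


def best_by_mae(results):
    """Return the name of the model with the lowest test-set MAE."""
    return min(results, key=lambda name: results[name]["mae"])


# Synthetic stand-in data; replace with the real features and salaries.
X, y = make_regression(n_samples=300, n_features=5, noise=10.0, random_state=0)
results = compare_models(X, y)
for name, scores in results.items():
    print(f"{name}: MAE={scores['mae']:.2f} MSE={scores['mse']:.2f} R2={scores['r2']:.3f}")
print("Lowest MAE:", best_by_mae(results))
```

Scoring on a held-out test split, rather than on the training data, is what makes the comparison between models meaningful.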

Contributing

Contributions to this project are welcome. If you have any suggestions, improvements, or bug fixes, please open an issue or submit a pull request.

Credits

This analysis was conducted by Shahbaz Mazhar, with guidance from Ignacio, Lukaz, Sandra, and my amazing fellow classmates.
