
💸Credit Card Default Prediction

Economic development not only improves people's living standards but also changes how people consume. Consumers are increasingly inclined to spend ahead of time, mortgaging their "credit" to a bank in order to enjoy goods and services in advance. When doing so, however, they often lack rational judgment and overestimate their ability to repay the bank on time. This both increases the bank's loan risk and deepens the consumer's own credit crisis. With banks issuing credit cards in large numbers, credit card defaults occur one after another, so it is very important for banks to effectively identify high-risk credit card users.

📊 Data Source

Our dataset has 25 columns reflecting various attributes of the customer. The target column is default.payment.next.month, which indicates whether the customer defaulted. Our aim is to predict the probability of default given the customer's payment history. I have built my model using a public dataset available on Kaggle:

https://www.kaggle.com/datasets/uciml/default-of-credit-card-clients-dataset

🖥 Web UI

App Screenshot

🎯Approach

Notebook Name : 1.1_EDA_DATA_PREPROCESSING

Custom Defined Modules Used : None

Notebook Description :

• General data visualisation, analysing the relation between features and the target.
• Using boxplots to visualize outliers.
• Data sanity checks.
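The boxplot step above flags outliers using the interquartile-range rule that a boxplot draws. A minimal sketch with pandas (the toy series and the helper name are illustrative; the notebook applies this to columns such as LIMIT_BAL and BILL_AMT1..6):

```python
import pandas as pd

def iqr_outlier_mask(s: pd.Series, k: float = 1.5) -> pd.Series:
    """Flag values outside [Q1 - k*IQR, Q3 + k*IQR] -- the whisker rule a boxplot uses."""
    q1, q3 = s.quantile([0.25, 0.75])
    iqr = q3 - q1
    return (s < q1 - k * iqr) | (s > q3 + k * iqr)

# Toy column: only the extreme value 100 falls outside the whiskers
s = pd.Series([1, 2, 3, 4, 5, 100])
print(iqr_outlier_mask(s).tolist())  # [False, False, False, False, False, True]

# The visual equivalent in a notebook would be:
# pd.DataFrame({"LIMIT_BAL": s}).boxplot()
```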

Notebook Name : 1.2_FEATURE ENGINEERING FOR CATEGORICAL FEATURES

Custom Defined Modules Used : Data_Ingestion_And_Preprocessing

Notebook Description :

• Load and pre-process the data using the custom-defined module. This module performs data sanity checks, replaces unknowns, removes outliers and balances the data.
• To handle our categorical features I created a basic random forest model and tried one-hot encoding, count encoding, target mean encoding, and leaving the categories as discrete ordinal features.
• The best results were obtained by target mean encoding, so the categorical features have been target mean encoded.
• For logistic regression we have scaled the data.
• The pre-processed data has been saved as train.csv and test.csv.
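Target mean encoding replaces each category with the mean of the target over that category, computed on the training split only (using the full data would leak test labels). A minimal sketch on toy data (the column names here are illustrative; the real categorical columns include SEX, EDUCATION and MARRIAGE):

```python
import pandas as pd

def target_mean_encode(train: pd.DataFrame, col: str, target: str) -> dict:
    """Map each category to the mean target rate, fit on the training split only."""
    return train.groupby(col)[target].mean().to_dict()

train = pd.DataFrame({
    "EDUCATION": [1, 1, 2, 2, 2, 3],
    "default":   [0, 1, 0, 0, 1, 1],
})
mapping = target_mean_encode(train, "EDUCATION", "default")
train["EDUCATION_enc"] = train["EDUCATION"].map(mapping)
print(mapping)  # {1: 0.5, 2: 0.333..., 3: 1.0}
```

At prediction time the same mapping is applied to unseen rows, with a fallback (e.g. the global default rate) for categories not seen in training.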

Notebook Number : 2-5

Custom Defined Modules Used : Build_Evaluate_Model

Notebook Description :

• We have built logistic regression, random forest, balanced random forest, XGBoost and AdaBoost classifier models.
• To build each of the models we have used a custom-defined module, Build_Evaluate_Model.
• For each model we start by building a base model using the model's default parameters.
• We then perform hyperparameter tuning and find the best model.
• We save the train and test scores for model comparison.
• Model evaluation: for every model built we record the train and test roc_auc scores.
• We choose the best model based on the train and test roc_auc scores and the difference between them, to ensure that there is no overfitting.

  • Final Model is stored as pickle file Final_Model.pkl.
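The internals of Build_Evaluate_Model are not shown here, but the workflow described above (base model → hyperparameter tuning → train/test roc_auc → overfitting gap → pickle) can be sketched with scikit-learn; the synthetic data and the tiny parameter grid are stand-ins, not the project's actual settings:

```python
import pickle
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import GridSearchCV, train_test_split

# Synthetic stand-in for the project's train.csv / test.csv
X, y = make_classification(n_samples=600, n_features=10, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25, random_state=0)

# 1. Base model with default parameters
base = RandomForestClassifier(random_state=0).fit(X_tr, y_tr)

# 2. Hyperparameter tuning (illustrative grid)
grid = GridSearchCV(
    RandomForestClassifier(random_state=0),
    {"n_estimators": [50, 100], "max_depth": [3, None]},
    scoring="roc_auc", cv=3,
).fit(X_tr, y_tr)
best = grid.best_estimator_

# 3. Record train/test roc_auc and the gap between them (overfitting check)
train_auc = roc_auc_score(y_tr, best.predict_proba(X_tr)[:, 1])
test_auc = roc_auc_score(y_te, best.predict_proba(X_te)[:, 1])
print(f"train={train_auc:.3f} test={test_auc:.3f} gap={train_auc - test_auc:.3f}")

# 4. Persist the chosen model
with open("Final_Model.pkl", "wb") as f:
    pickle.dump(best, f)
```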

Custom Defined Modules

  • Data_Ingestion_And_Preprocessing - Data loading and preprocessing.
  • Build_Evaluate_Model - Used in building classifiers and evaluating model performance.
  • Deployment_inputs - Transforms user inputs into the features of our model.
  • app.py - Used for building and deploying the app.

Deployment Files

  • requirements.txt
  • Procfile
  • app.py
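The Procfile tells Heroku which command starts the web process. A typical one for a Flask app.py served by gunicorn looks like this (a sketch; the repo's actual Procfile may differ, and `app:app` assumes app.py exposes a Flask object named `app`):

```
web: gunicorn app:app
```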

📑Documentation

Detailed Project Report

High Level Document

Low Level Document

Architecture Document

Wireframe Document

⚡Deployment

Deployed on the web using Heroku. URL: https://credit-default-prob.herokuapp.com/

⚡Demo Video

For Project Demo Click Here

Author✍

sanikadharwadker