shuvamjoy34,Shuvam Sanyal,github

bankruptcy-prediction

Since the Lehman Brothers bankruptcy during the 2008 global financial crisis, estimating the risk of corporate bankruptcy in advance has been of great importance to creditors and investors. Although this is a relatively new research topic, artificial intelligence and machine learning methods have achieved promising results in corporate bankruptcy prediction in recent years. In this summer research project, I built a machine learning model for predicting upcoming bankruptcies using a US Corporate Bankruptcy Dataset covering around 46 years. After thorough cleaning, missing value imputation and feature engineering, the final dataset contains 23320 observations with 210 features related to financial and management statements, derived from an original 93837 observations with 15 features. I compared nine machine learning techniques on the dataset: Logistic Regression, KNN, SVM, Naïve Bayes, Decision Tree, Random Forest, AdaBoost, XGBoost and CatBoost. For evaluation, I used accuracy, the ROC-AUC curve, the confusion matrix, precision, recall, F-score and the cumulative gain chart. The best models turned out to be Random Forest, XGBoost, AdaBoost, CatBoost and Decision Tree. An ensemble voting method applied to these top five algorithms selected Random Forest and the boosting algorithms as the two best predictors of bankruptcy cases on both training and testing data, which reduces the overfitting problem. To cross-check for overfitting, I used cross-validation to compute the mean CV score of each algorithm. Again, the Random Forest and Gradient Boosting algorithms topped the list. Both algorithms yield an overall accuracy of ∼93%, a training accuracy of ∼99% and a class-independent test accuracy of ∼92% on the balanced, imputed and feature-engineered dataset. To achieve better accuracy, I oversampled this dataset using SMOTE and reduced its dimensionality using PCA before fitting the machine learning models.
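The SMOTE oversampling step mentioned above can be illustrated with a minimal pure-Python sketch. This is not the project's code and not the imbalanced-learn implementation; the function name `smote_oversample`, the toy points and the parameter values are assumptions made up for illustration. It shows only the core idea: each synthetic minority sample is placed at a random position on the segment between a minority sample and one of its nearest minority neighbours.

```python
import math
import random


def smote_oversample(minority, n_new, k=3, seed=0):
    """Return n_new synthetic points interpolated between minority samples
    and their k nearest minority neighbours (the core idea behind SMOTE)."""
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority samples by Euclidean distance to the base point.
        others = [p for j, p in enumerate(minority) if j != i]
        others.sort(key=lambda p: math.dist(base, p))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append(
            tuple(b + gap * (n - b) for b, n in zip(base, neighbour))
        )
    return synthetic


# Toy data (hypothetical): 4 bankrupt firms against 10 non-bankrupt ones.
bankrupt = [(0.1, 0.9), (0.2, 0.8), (0.15, 0.85), (0.3, 0.7)]
non_bankrupt_count = 10

new_points = smote_oversample(bankrupt, non_bankrupt_count - len(bankrupt))
balanced_bankrupt = bankrupt + new_points
print(len(balanced_bankrupt), "bankrupt samples after oversampling")
```

Because every synthetic point lies between two existing minority samples, the oversampled class stays inside the region already occupied by real bankrupt firms rather than duplicating rows exactly.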
My final model correctly predicts all 8033 bankrupt firms and 17208 of the 17217 non-bankrupt firms. Only 9 corporations have been misclassified as bankrupt when they actually are not. KNN, SVM, Naïve Bayes and Logistic Regression take a long time to train when the dataset is this large and, despite a few good accuracy scores, do not perform well on either the training or the test data. Hence I do not recommend these models for corporate bankruptcy or other financial predictions. Furthermore, I found that my model assigns importance to a few individual components of the financial ratios, in particular components related to assets, liquidity, profitability and productivity.
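As a worked check, the confusion-matrix counts quoted above can be turned into the evaluation metrics the project reports. The sketch below assumes "bankrupt" is the positive class, so the counts are 8033 true positives, 0 false negatives, 17208 true negatives and 9 false positives; the helper name `classification_metrics` is hypothetical, and the source does not state which data split these counts come from.

```python
def classification_metrics(tp, fp, tn, fn):
    """Compute accuracy, precision, recall and F1 from confusion-matrix counts."""
    total = tp + fp + tn + fn
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return {
        "accuracy": (tp + tn) / total,
        "precision": precision,
        "recall": recall,
        "f1": 2 * precision * recall / (precision + recall),
    }


# Counts taken from the description, with "bankrupt" as the positive class.
metrics = classification_metrics(tp=8033, fp=9, tn=17208, fn=0)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

With no false negatives, recall is exactly 1, so the only errors that lower precision and accuracy are the 9 non-bankrupt firms flagged as bankrupt.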

covid-chestxray-dataset

We are building an open database of COVID-19 cases with chest X-ray or CT images.

data

dataset

loandefaultprediction

lstm-music-genre-classification

Music genre classification with LSTM Recurrent Neural Nets in Keras & PyTorch

mlflow2

mlflow3

mlflow_tutorial

multi-class-image-classification-using-cnn-and-ann

As part of my third-semester deep learning project assignment, I collected 100 random images: 25 images each of cars and bikes, and 50 other images from various categories. I then created two deep learning models, a Convolutional Neural Network and an Artificial Neural Network, and compared their classification results using different metrics.

numbercrunchers

Hardworking and enthusiastic final-year, second-semester master's student working towards a degree in Data Science, Machine Learning and Artificial Intelligence at Symbiosis International (Deemed University), Pune, India.

shuvam
