jeffersonlbr / bisnode_bankrupt_prediction

Bisnode business information. Data analysis to predict whether a company might go bankrupt within a period of up to two years.

Jupyter Notebook 55.62% HTML 44.38%
machine-learning python r-markdown r-language

bisnode_bankrupt_prediction's Introduction

Project context

This project builds a complete data preprocessing and predictive classification pipeline aimed at forecasting whether a company will cease operations within the next two years. The data was collected and curated by Bisnode, a European company specializing in business information. The dataset spans 2005 to 2016 and covers companies across various sectors of the economy (such as electronics, electrical equipment, and engines) and services (food, beverages, and lodging). Companies with revenue exceeding 100 million euros were anonymized to minimize the possibility of identification.

Preprocessing Activities (Python, R)

Data preprocessing, exploratory analysis, and modeling were carried out in Python scripts. R Markdown was used for the final delivery of the predictive models.

Information

  • Records from the year 2016 were removed from the data.
  • A column was created for the dependent variable that is the target of prediction.
  • A company is considered to have ceased operations if it was active in year X but did not report any sales in year X + 2.
  • The data was filtered to the year 2012.
  • The data was checked for inconsistencies (null, blank, and NaN values; negative numbers), and the mean, median, and mode were reviewed.
  • np.where was used for adjustments, so that where Sales < 0, the value is replaced with 0.
  • Open question: if Sales is highly skewed, is it worth creating a new column that holds the log value of this column?
  • The company age was created using feature engineering (year - founded_year).
  • Data was filtered to include companies with revenue below 10 million euros and above 1,000 euros.
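
The steps above can be sketched in pandas as follows. This is a minimal illustration, not the repository's actual code: the column names `comp_id`, `year`, `sales`, and `founded_year`, the function names, and the toy data are all assumptions. The sketch also assumes one row per company and year, reads the 2012 filter as keeping only observations from 2012, and takes the log after the revenue filter so that every value is positive.

```python
import numpy as np
import pandas as pd


def add_target(df: pd.DataFrame, horizon: int = 2) -> pd.DataFrame:
    """Add `ceased`: 1 if active in year X with no sales in year X + horizon."""
    # Build a lookup of each company's sales, shifted back by `horizon` years,
    # so that a row for year X can be matched with the sales of year X + horizon.
    future = df[["comp_id", "year", "sales"]].copy()
    future["year"] = future["year"] - horizon
    future = future.rename(columns={"sales": "sales_future"})

    out = df.merge(future, on=["comp_id", "year"], how="left")
    active = out["sales"] > 0
    # A missing row or non-positive sales both count as "no sales reported".
    no_future_sales = out["sales_future"].isna() | (out["sales_future"] <= 0)
    out["ceased"] = (active & no_future_sales).astype(int)
    return out.drop(columns="sales_future")


def preprocess(df: pd.DataFrame, year: int = 2012) -> pd.DataFrame:
    """Apply the listed preprocessing steps (hypothetical column names)."""
    df = df[df["year"] != 2016].copy()        # drop 2016 records
    df = add_target(df)                       # target needs the later years
    df = df[df["year"] == year].copy()        # keep the 2012 cross-section
    # Replace negative sales with 0, as described for np.where.
    df["sales"] = np.where(df["sales"] < 0, 0, df["sales"])
    # Company age in years: observation year minus founding year.
    df["age"] = df["year"] - df["founded_year"]
    # Keep revenue above 1,000 and below 10 million euros.
    df = df[(df["sales"] > 1_000) & (df["sales"] < 10_000_000)].copy()
    # Log of sales as an extra column for the skewed variable.
    df["ln_sales"] = np.log(df["sales"])
    return df


# Toy data, invented for illustration only.
demo = pd.DataFrame({
    "comp_id": [1, 1, 1, 2, 2, 3, 3, 3],
    "year": [2012, 2013, 2014, 2012, 2013, 2012, 2014, 2016],
    "sales": [50_000, 60_000, 70_000, 20_000, 10_000, -5, 100, 100],
    "founded_year": [2000, 2000, 2000, 2005, 2005, 2010, 2010, 2010],
})
result = preprocess(demo)
print(result[["comp_id", "year", "sales", "age", "ln_sales", "ceased"]])
```

The target is computed before the year filter because it needs the rows from year X + 2. Computing it after filtering to a single year would mark every company as having ceased operations.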

Final model

  • Jupyter Notebook and VS Code were used; the notebook contains all the modeling code.
  • Markdown was used to explain ideas, code, graphs, etc.
  • Descriptive statistics and graphs were employed.
  • Files are available in the formats: .ipynb and .html.

ML model objective

The objective of this project is to develop a predictive model to indicate whether a company will cease operations within a period of up to two years.

  • The predictive models used were decision trees and random forests, evaluated with ROC curve analysis.
  • Hyperparameter tuning was also performed.
  • The models were compared in terms of their predictive performance.
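
A comparison of this kind could look roughly like the sketch below. It assumes scikit-learn, which the text does not name, and it uses synthetic data from `make_classification` as a stand-in for the prepared Bisnode features. The parameter grids, the 3-fold cross-validation, and the 75/25 split are illustrative choices, not the project's settings.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic, imbalanced binary data standing in for the real features.
X, y = make_classification(
    n_samples=600, n_features=8, weights=[0.8, 0.2], random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Candidate models, each with a small illustrative hyperparameter grid.
candidates = {
    "decision_tree": (
        DecisionTreeClassifier(random_state=0),
        {"max_depth": [3, 5], "min_samples_leaf": [5, 20]},
    ),
    "random_forest": (
        RandomForestClassifier(n_estimators=50, random_state=0),
        {"max_depth": [3, 5], "min_samples_leaf": [5, 20]},
    ),
}

test_auc = {}
for name, (estimator, grid) in candidates.items():
    # Tune on the training set with cross-validated ROC AUC.
    search = GridSearchCV(estimator, grid, scoring="roc_auc", cv=3)
    search.fit(X_train, y_train)
    # Score the refitted best model on the held-out test set.
    proba = search.predict_proba(X_test)[:, 1]
    test_auc[name] = roc_auc_score(y_test, proba)

for name, auc in test_auc.items():
    print(f"{name}: test ROC AUC = {auc:.3f}")
```

Scoring on predicted probabilities rather than hard labels is what makes the ROC comparison meaningful, since the curve is traced by varying the classification threshold.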
