-Banking-Campaign-Output-Prediction

Introduction

The data relates to direct marketing campaigns of a banking institution. The goal of this mini project is to train a neural network model to predict whether a client will respond positively or negatively to the campaign, that is, whether the client will subscribe to a bank term deposit. Note: the project was done in Google Colab, so before running the code you need to upload the .csv file to the Google Colab file explorer.

Data processing

The data was checked for the following, and any occurrences were removed:

  1. Duplicates

  2. “unknown” values

  3. Null values

The deposit data was found to be imbalanced: far more clients do not subscribe to a deposit than do.
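The cleaning steps above can be sketched with pandas roughly as follows. This is a minimal illustration, not the notebook's actual code: the sample rows are made up, and the column names `age`, `job` and `deposit` are assumptions standing in for the real CSV columns.

```python
import numpy as np
import pandas as pd

# Made-up sample standing in for the campaign CSV (not the real data).
raw = pd.DataFrame({
    "age": [30, 30, 45, 52, np.nan, 61],
    "job": ["admin.", "admin.", "unknown", "retired", "services", "retired"],
    "deposit": ["no", "no", "no", "yes", "no", "yes"],
})


def clean(df: pd.DataFrame) -> pd.DataFrame:
    """Drop duplicate rows, rows containing "unknown", and rows with nulls."""
    df = df.drop_duplicates()
    df = df.replace("unknown", np.nan)  # treat "unknown" as a missing value
    return df.dropna().reset_index(drop=True)


cleaned = clean(raw)
print(cleaned)

# Share of each class in the target, to check how imbalanced it is.
print(cleaned["deposit"].value_counts(normalize=True))
```

On the real data, the last line is one quick way to see the imbalance shown in Fig. 1.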

image

Fig. 1. Deposit distribution overall.

Furthermore, the features were correlated with the deposit information (Fig. 2, 3, 4, 5). For the full analysis of the categories, please see the Google Colab notebook.

image

Fig. 2. Deposit distribution in every feature.

image image

Fig. 3. Deposit distribution in every feature.

image

Fig. 4. Heat map

image

Fig. 5. Box plot visualization for age.

Conclusion: overall, we can see that, among all clients, older people are more responsive to deposit marketing. We also note that emp.var.rate, cons.price.idx, euribor3m and nr.employed are features with very high correlation to deposit status.
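A correlation check of this kind could look like the sketch below. The numbers are invented purely to show the mechanics, so the printed values say nothing about the real dataset. The target column name `deposit` and its "yes"/"no" coding are assumptions.

```python
import pandas as pd

# Invented values, only to demonstrate the calculation.
df = pd.DataFrame({
    "age": [25, 38, 47, 59, 66, 71],
    "euribor3m": [4.9, 4.8, 4.1, 1.3, 0.9, 0.7],
    "deposit": ["no", "no", "no", "yes", "yes", "yes"],
})

# Encode the target as 0/1 so it can be correlated with numeric features.
target = (df["deposit"] == "yes").astype(int)

# Pearson correlation of each numeric feature with the target,
# ordered by absolute strength.
corr = df[["age", "euribor3m"]].corrwith(target).sort_values(key=abs, ascending=False)
print(corr)
```

A heat map like Fig. 4 is typically drawn from the full correlation matrix (`df.corr(numeric_only=True)`), of which this is a single column.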

Now we need to process the data before using it to train the model. The columns duration, campaign, month, day_of_week and contact were dropped because they were not relevant to predicting the deposit status of a client.

Because we have categorical data, One Hot Encoding was used to make the data more useful and expressive, and so that it can be rescaled easily. With numeric values, it is easier to determine a probability for each value. An example of One Hot Encoded data is shown in Fig. 6.
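As a rough sketch of these two steps (dropping columns, then One Hot Encoding), assuming pandas and a target column named `deposit` with "yes"/"no" values; the sample rows are made up and only some of the real columns appear:

```python
import pandas as pd

# Made-up rows with a few of the columns mentioned in the text.
df = pd.DataFrame({
    "age": [30, 45, 52],
    "marital": ["married", "single", "married"],
    "contact": ["cellular", "telephone", "cellular"],
    "duration": [210, 95, 400],
    "deposit": ["no", "no", "yes"],
})

# Columns the text says were dropped; errors="ignore" skips any that are absent.
dropped = ["duration", "campaign", "month", "day_of_week", "contact"]
features = df.drop(columns=dropped, errors="ignore")

# Separate the target and encode it as 0/1.
labels = (features.pop("deposit") == "yes").astype(int)

# One Hot Encode the remaining categorical columns; numeric ones pass through.
encoded = pd.get_dummies(features, dtype=int)
print(encoded)
```

Each category value becomes its own 0/1 column (here `marital_married` and `marital_single`), while `age` is left unchanged.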

image

Fig. 6. One Hot Encoded data.

The data is now ready for creating the model.

Modelling

For network modelling I used Keras, an open-source library that provides a Python interface for artificial neural networks. The performance of the model was improved by changing the selected features and deleting irrelevant ones (for example, ‘contact’). The performance of the model was then evaluated. The accuracy on the test set reached 0.8569, and the difference between train set accuracy and test set accuracy was small, suggesting that the model was not overfitted.
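The exact architecture is in the notebook and is not reproduced here. As a minimal sketch of how such a binary classifier could be built and checked for overfitting in Keras, the example below uses assumed layer sizes, an assumed epoch count and random stand-in data. None of these values come from the project.

```python
import numpy as np
from tensorflow import keras


def build_model(n_features: int) -> keras.Model:
    """Small feed-forward binary classifier (layer sizes are assumptions)."""
    model = keras.Sequential([
        keras.Input(shape=(n_features,)),
        keras.layers.Dense(16, activation="relu"),
        keras.layers.Dense(8, activation="relu"),
        keras.layers.Dense(1, activation="sigmoid"),  # probability of "yes"
    ])
    model.compile(
        optimizer="adam",
        loss="binary_crossentropy",
        metrics=["accuracy", keras.metrics.AUC(name="auc")],
    )
    return model


# Random stand-in data with an imbalanced target (not the real dataset).
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 10)).astype("float32")
y = (X[:, 0] > 0.8).astype("float32")

model = build_model(X.shape[1])
history = model.fit(
    X[:160], y[:160],
    validation_data=(X[160:], y[160:]),
    epochs=5, batch_size=32, verbose=0,
)

# A small gap between these two numbers is the overfitting check described above.
train_acc = history.history["accuracy"][-1]
val_acc = history.history["val_accuracy"][-1]
print(f"train accuracy: {train_acc:.4f}, held-out accuracy: {val_acc:.4f}")
```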

image

Fig. 7. Accuracy scores of the model.

The model was also checked using the AUC score and the ROC curve. The higher the AUC value, the better the performance of the model. The AUC for the train set is 0.7219. Note that this is not the accuracy of the model.
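To illustrate why AUC and accuracy are different numbers, here is a small self-contained sketch in plain Python. The labels and scores are invented, and the helper functions `accuracy` and `roc_auc` are written for this example only (in practice a library routine such as scikit-learn's `roc_auc_score` would normally be used).

```python
def accuracy(y_true, y_prob, threshold=0.5):
    """Fraction of correct predictions after thresholding the scores."""
    preds = [1 if p >= threshold else 0 for p in y_prob]
    return sum(int(p == t) for p, t in zip(preds, y_true)) / len(y_true)


def roc_auc(y_true, y_prob):
    """AUC as the chance that a random positive outscores a random negative."""
    pos = [p for p, t in zip(y_prob, y_true) if t == 1]
    neg = [p for p, t in zip(y_prob, y_true) if t == 0]
    # A tie between a positive and a negative counts as half a win.
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Imbalanced toy labels: 8 negatives followed by 2 positives.
y_true = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]
y_prob = [0.1, 0.3, 0.15, 0.35, 0.05, 0.25, 0.4, 0.12, 0.2, 0.45]

acc = accuracy(y_true, y_prob)
auc = roc_auc(y_true, y_prob)
print(acc, auc)  # 0.8 0.75
```

In this toy case every client is predicted as "no", which already gives 0.8 accuracy on imbalanced labels, while the AUC of 0.75 measures how well the scores rank positives above negatives.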

image

Fig. 7. AUC scores of the model.

As for the ROC curve, it was visualized:

image

Fig. 8. ROC plot.

Furthermore, the network architecture was also visualized with the help of keras.utils:

image

Fig. 9. Network architecture.

Conclusion

During the work on the project, there were several bottlenecks.

• The data type

Most variables in the dataset were categorical, but machine learning and deep learning models, like those in Keras, require all input and output variables to be numeric. For that reason, I used One Hot Encoding to encode the data into numeric variables. An alternative is to use other encoding methods, such as embeddings for categorical data.

• Network visualization

Unfortunately, the beautiful visualizer ANN_viz is no longer supported by Keras, so I had to find another way to visualize the network. I used the visualization built into keras.utils instead.

In conclusion, a neural network model was built and used to predict the deposit status of clients.

Contributors

dariabutyrskaya
