Credit Card Fraud Detection: Combination of both Unsupervised and Supervised Algorithm to detect Fraud in credit cards

License: MIT License

Data-Analytics-Project

Credit Card Fraud Detection

Combination of Unsupervised and Supervised Technique in Credit Card Fraud Detection

This is a mini project at NITK Surathkal.

Dataset-

The dataset, creditcard.csv, comes from Kaggle. It contains credit card transactions made by European cardholders in September 2013. The dataset is extremely unbalanced: the positive class (fraud) accounts for only 0.172% of all transactions. It holds only numeric variables; there are no object (string) columns. Most of the features (V1 to V28) are principal components obtained from a PCA transformation; Time and Amount are left untransformed. Link to dataset: https://www.kaggle.com/mlg-ulb/creditcardfraud
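
To make the imbalance concrete, here is a minimal sketch that counts fraud rows in a CSV with the same label column as creditcard.csv (`Class`, where 1 marks fraud). The helper name `class_balance` and the four-row in-memory sample are illustrative assumptions, not code or data from the notebooks.

```python
import csv
import io


def class_balance(csv_file, label_column="Class"):
    """Return (total rows, fraud rows, fraud percentage) for an open CSV file."""
    total = 0
    fraud = 0
    for row in csv.DictReader(csv_file):
        total += 1
        # The label is 1 for fraud and 0 otherwise.
        if int(float(row[label_column])) == 1:
            fraud += 1
    pct = 100.0 * fraud / total if total else 0.0
    return total, fraud, pct


# Tiny made-up sample standing in for the real file (illustration only).
demo = io.StringIO(
    "Time,V1,Amount,Class\n"
    "0,-1.36,149.62,0\n"
    "1,1.19,2.69,0\n"
    "2,-0.50,10.00,1\n"
    "3,0.20,5.00,0\n"
)
total, fraud, pct = class_balance(demo)
print(f"{total} transactions, {fraud} fraud ({pct:.3f}%)")
```

To run it on the real data, pass an open file instead, for example `with open("creditcard.csv", newline="") as f: class_balance(f)`.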

Idea-

This project implements a hybrid approach that uses unsupervised outlier scores to extend the feature set of a fraud detection classifier. We computed outlier scores with unsupervised outlier detection models, added them as new features, and fed the extended data to the classifier. We then compared several classifiers with and without the outlier scores as features.

Among the plain classifiers, the random forest classifier (RFC) and the support vector classifier (SVC) performed well on accuracy, recall, and F1 score, so we selected them as base models for the outlier-score comparison. When we compared SVC + outlier score against RFC + outlier score, the CBLOF (cluster-based local outlier factor) score increased the accuracy of both models.

We did not reach our aim of 100% fraud detection, but we did end up with a method that, given sufficient time and data, can get very close to it. As with any such project, there is scope for improvement. The design allows several algorithms to be combined as modules and their results merged to improve the accuracy of the final result.
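
The idea of appending an outlier score as an extra feature can be sketched as follows. This is a rough illustration under stated assumptions, not the notebooks' code: it uses synthetic data from scikit-learn instead of creditcard.csv, and it swaps CBLOF for a much simpler cluster-based score (distance to the nearest KMeans centroid). The helper names `outlier_score` and `add_score` are hypothetical.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import f1_score, recall_score
from sklearn.model_selection import train_test_split

# Synthetic, imbalanced stand-in for the real dataset (about 2% positives).
X, y = make_classification(
    n_samples=3000, n_features=10, n_informative=5,
    weights=[0.98, 0.02], random_state=0,
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0
)

# Unsupervised step: fit clusters on the training features only (no labels).
kmeans = KMeans(n_clusters=8, n_init=10, random_state=0).fit(X_train)


def outlier_score(model, features):
    """Simplified cluster-based score: distance to the nearest centroid."""
    return model.transform(features).min(axis=1)


def add_score(model, features):
    """Append the outlier score as one extra feature column."""
    return np.column_stack([features, outlier_score(model, features)])


X_train_aug = add_score(kmeans, X_train)
X_test_aug = add_score(kmeans, X_test)

# Supervised step: train the classifier on the extended feature set.
clf = RandomForestClassifier(n_estimators=100, random_state=0)
clf.fit(X_train_aug, y_train)
pred = clf.predict(X_test_aug)
print("recall:", recall_score(y_test, pred), "F1:", f1_score(y_test, pred))
```

The outlier model is fitted on the training split only and then applied to the test split, so no test information leaks into the extra feature. Whether the added column helps depends on the data; the sketch makes no claim about the size of the effect.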

A simple flow of the project-

Screenshot


Results only with supervised algorithms-

Screenshot


Results with Sampling Algorithms (SMOTE)-

Screenshot
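
For readers unfamiliar with SMOTE, the core step is to create synthetic minority samples by interpolating between a minority sample and one of its nearest minority neighbours. The sketch below shows that step in plain NumPy. It is a simplified illustration, not the imbalanced-learn implementation and not necessarily what the notebooks run; the function name `smote_sketch` and the sample points are made up.

```python
import numpy as np


def smote_sketch(X_min, n_new, k=5, seed=0):
    """Generate n_new synthetic points from minority samples X_min (needs >= 2 rows)."""
    rng = np.random.default_rng(seed)
    X_min = np.asarray(X_min, dtype=float)
    n = len(X_min)
    k = min(k, n - 1)

    # Pairwise Euclidean distances among minority samples.
    diff = X_min[:, None, :] - X_min[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=2))
    np.fill_diagonal(dist, np.inf)  # a point is not its own neighbour

    # Indices of the k nearest minority neighbours of each sample.
    neighbours = np.argsort(dist, axis=1)[:, :k]

    # Pick a random base sample and one of its neighbours for each new point.
    base = rng.integers(0, n, size=n_new)
    picked = neighbours[base, rng.integers(0, k, size=n_new)]

    # Interpolate at a random position on the segment between the two points.
    gap = rng.random((n_new, 1))
    return X_min[base] + gap * (X_min[picked] - X_min[base])


# Made-up minority samples in two dimensions (illustration only).
minority = np.array([[0.0, 0.0], [1.0, 0.2], [0.8, 1.0],
                     [0.1, 0.9], [0.5, 0.5], [0.3, 0.2]])
synthetic = smote_sketch(minority, n_new=10, k=3)
print(synthetic.shape)
```

Oversampling of this kind should be applied to the training split only, so that the evaluation data keeps the original class balance.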


Results with combination of all (Unsupervised + SMOTE + Supervised)-

Screenshot

How to set up the project

This project is built with Python 3 in Jupyter Notebook. All the required libraries are listed at the top of every code file. Make sure to install all of them so that the project runs smoothly.

Clone the project
$ git clone https://github.com/Akbhobhiya/Data-Analytics-Project.git
$ cd Data-Analytics-Project
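
After cloning, one quick way to check whether the libraries a notebook imports are installed is sketched below. The module names in `required` are an assumed example list; replace them with the imports found at the top of the notebook you want to run. The helper name `missing_modules` is hypothetical.

```python
import importlib.util


def missing_modules(module_names):
    """Return the top-level module names that cannot be found in this environment."""
    return [name for name in module_names
            if importlib.util.find_spec(name) is None]


# Assumed example list: replace with the imports used by the notebook.
required = ["numpy", "pandas", "sklearn"]

missing = missing_modules(required)
if missing:
    print("Install before running the notebooks:", ", ".join(missing))
else:
    print("All listed modules are importable.")
```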

Contributing

  • Want to contribute?
  • Please feel free to send a pull request or create an issue if you find any problems.

data-analytics-project's Issues

Question about code

Hi, can you show me how to visualize the final results across the algorithms?
Thanks
