charumakhijani / credit-card-fraud-detection

credit-card-fraud-detection kaggle-dataset kaggle exploratory-data-analysis tsne-visualization tsne-plot heatmap standard-scalar oversampling logistic-regression decision-tree random-forest adaboost xgboost neural-networks roc-curve feature-extraction feature-engineering feature-importance softmax

Credit-Card-Fraud-Detection

Every year, millions of people fall victim to fraud, which costs the global economy billions of dollars. For an individual victim, it can wreak havoc on personal finances. Fortunately, many financial institutions now use modern fraud detection techniques to help protect cardholders from credit card fraud.

The dataset is available on Kaggle:
https://www.kaggle.com/mlg-ulb/creditcardfraud

Fraud Detection

Fraud detection is the task of identifying unusual patterns that differ from the rest of the population and do not behave as expected. These unusual patterns are also called outliers.

Fraud detection involves in-depth data analysis and data mining to recognize these unusual patterns. In this dataset, most of the data analysis has already been done and most of the features are already scaled. The feature names are withheld for privacy reasons.

Hence, the main focus here is to balance the data and perform predictive analysis.

Problem Statement

The Credit Card Fraud Detection dataset contains transactions made with credit cards in September 2013 by European cardholders. It covers transactions that occurred over two days, with 492 frauds out of 284,807 transactions. The dataset is highly unbalanced: the positive class (frauds) accounts for 0.172% of all transactions.
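
The imbalance quoted above can be checked with a few lines of arithmetic. This is only a sketch using the two counts from the problem statement; the variable names are illustrative and not taken from the notebook.

```python
# Counts quoted in the problem statement.
n_fraud = 492
n_total = 284_807

# Share of transactions that are fraudulent (the positive class).
fraud_rate = n_fraud / n_total
print(f"Fraud share: {fraud_rate:.4%}")

# Equivalent view: how many transactions there are per fraud on average.
transactions_per_fraud = n_total / n_fraud
print(f"Roughly one fraud per {transactions_per_fraud:.0f} transactions")
```

The result is a little over 0.17%, in line with the figure quoted above.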

Goals

The goal is to identify as many fraudulent credit card transactions as possible. As recommended in the dataset's inspiration section, I will measure performance using the Area Under the Precision-Recall Curve (AUPRC). Accuracy computed from the confusion matrix is not meaningful for unbalanced classification.
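
To illustrate why accuracy can mislead here, the sketch below compares it with average precision, one common estimate of the AUPRC. The labels and scores are made-up toy values, and `average_precision` is a hypothetical helper written for this example; it is not the notebook's code.

```python
def average_precision(y_true, scores):
    """Mean of precision@k over the ranks k at which a true positive appears."""
    n_pos = sum(y_true)
    if n_pos == 0:
        raise ValueError("need at least one positive label")
    # Rank transactions from most to least suspicious.
    ranked = sorted(zip(scores, y_true), key=lambda pair: pair[0], reverse=True)
    hits = 0
    total = 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label == 1:
            hits += 1
            total += hits / rank  # precision among the top `rank` items
    return total / n_pos


# Toy data (assumption): 2 frauds among 10 transactions.
y_true = [0, 0, 1, 0, 0, 0, 0, 1, 0, 0]
scores = [0.10, 0.20, 0.90, 0.30, 0.05, 0.40, 0.15, 0.35, 0.25, 0.12]

# A model that never flags fraud still gets high accuracy but finds nothing.
always_legit = [0] * len(y_true)
accuracy = sum(p == t for p, t in zip(always_legit, y_true)) / len(y_true)
frauds_caught = sum(p == 1 and t == 1 for p, t in zip(always_legit, y_true))

ap = average_precision(y_true, scores)
print(f"always-legitimate accuracy: {accuracy:.2f}, frauds caught: {frauds_caught}")
print(f"average precision of the scored model: {ap:.3f}")
```

Average precision depends only on how well the frauds are ranked above the legitimate transactions, so it is not inflated by the large majority class.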

Table of Contents

  1. Import Libraries
  2. Read Data
  3. Understand the data
  4. Exploratory Data Analysis
  5. Label Data
  6. Cluster data using Dimensionality reduction
  7. Split into train and test sets
  8. Scaling
  9. Predictive Analysis on unbalanced data
  10. Validate Unbalanced Data
  11. Balance Data using oversampling method
  12. Predictive Analysis on Balanced Data
  13. Validate Balanced Data
  14. Feature Importance
  15. Conclusion
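
Steps 7 and 11 above split the data before balancing it. A minimal sketch of that ordering follows, assuming plain random oversampling with replacement; the notebook may use a different oversampling method. The helper names and toy labels are hypothetical.

```python
import random
from collections import Counter


def stratified_split(labels, test_fraction, rng):
    """Return (train_idx, test_idx) with a similar class ratio in both parts."""
    train_idx, test_idx = [], []
    for cls in sorted(set(labels)):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        n_test = max(1, round(len(idx) * test_fraction))
        test_idx.extend(idx[:n_test])
        train_idx.extend(idx[n_test:])
    return train_idx, test_idx


def random_oversample(indices, labels, rng):
    """Resample each smaller class with replacement up to the largest class size."""
    by_class = {}
    for i in indices:
        by_class.setdefault(labels[i], []).append(i)
    target = max(len(idx) for idx in by_class.values())
    balanced = []
    for idx in by_class.values():
        balanced.extend(idx)
        balanced.extend(rng.choices(idx, k=target - len(idx)))
    return balanced


rng = random.Random(42)
labels = [1] * 20 + [0] * 980  # toy labels (assumption): 2% positives

# Split first, then oversample only the training indices.
train_idx, test_idx = stratified_split(labels, 0.25, rng)
balanced_train = random_oversample(train_idx, labels, rng)

train_counts = Counter(labels[i] for i in balanced_train)
test_counts = Counter(labels[i] for i in test_idx)
print("balanced train:", dict(train_counts))
print("untouched test:", dict(test_counts))
```

Oversampling only the training part keeps the test set at its original class ratio, so no duplicated fraud row can appear in both training and evaluation.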
