abhigyan1120024 / document_verification

This project forked from shiva2410/document_verification

Visual recognition project for automating the KYC process in banks

License: Apache License 2.0

Python 100.00%

document_verification's Introduction

Document_verification

Medium Blog - https://medium.com/swlh/document-verification-for-kyc-with-ai-ocr-computer-vision-tool-3485d85d75f6

Visual_Recognition

DOCUMENT VALIDATION:

An RCNN-based image classifier that classifies Aadhaar cards, PAN cards, and other documents. The model was trained on a dataset of Aadhaar cards, PAN cards, and other documents such as gas bills, voter ID cards, and driving licences, collected from customer data. Training covered several variations of the images, such as blurred or tilted images. The model has an accuracy of 94%.

Requirements

  • Keras
  • TensorFlow
  • OpenCV
  • PIL
  • Tesseract-OCR
  • Google OCR

Training

  • Due to privacy reasons I was given a very limited dataset, so I had to upsample it through augmentation to cover as many scenarios as possible during training. I used the ImageDataGenerator class from Keras to add variations such as tilting or selective blurring, which increased the dataset size for each class.

  • Freezing the 4 layers allowed me to use only the convolutional layers as a feature extractor and send their output to a custom fully connected neural network, which would be tuned for only these specific images. Because the dataset was limited, I included dropout in the fully connected network to prevent overfitting on the data.

  • This process ensured that the model learned only the specific document types it needed to classify.
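
As a rough illustration of the dropout step mentioned above, here is a minimal pure-Python sketch of inverted dropout, the scheme Keras's Dropout layer applies during training. It is not the project's training code: the function name `inverted_dropout`, the feature values, and the rate of 0.5 are all assumptions made for the example.

```python
import random


def inverted_dropout(activations, rate, rng):
    """Zero each activation with probability `rate` and scale the
    survivors by 1 / (1 - rate) so the expected sum stays the same."""
    if not 0.0 <= rate < 1.0:
        raise ValueError("rate must be in [0, 1)")
    keep_scale = 1.0 / (1.0 - rate)
    return [a * keep_scale if rng.random() >= rate else 0.0 for a in activations]


rng = random.Random(0)  # fixed seed so the sketch is reproducible
# Hypothetical outputs of the frozen convolutional layers for one image.
features = [0.5, 1.0, 1.5, 2.0]
dropped = inverted_dropout(features, rate=0.5, rng=rng)
print(dropped)
```

Because a different random subset of units is silenced on every training step, the fully connected head cannot rely on any single feature, which is why dropout tends to help when the dataset is small. At inference time no units are dropped.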

Evaluation

  • The trained model was then evaluated on accuracy and through manual checks to ensure that the right cards and documents were being classified. The model has an accuracy of 94%.

  • The operations team manually verified several images using the classifier to ensure that the right documents were classified.
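
The accuracy check described above could be sketched as follows. This is a minimal example under stated assumptions: the class names and the five sample labels are made up for illustration, and the helper names `accuracy` and `confusion_counts` are hypothetical, not part of the project.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    if not y_true or len(y_true) != len(y_pred):
        raise ValueError("label lists must be non-empty and of equal length")
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def confusion_counts(y_true, y_pred):
    """Count each (true label, predicted label) pair."""
    return Counter(zip(y_true, y_pred))


# Hypothetical labels for five documents (not real project data).
y_true = ["aadhaar", "pan", "other", "pan", "aadhaar"]
y_pred = ["aadhaar", "pan", "pan", "pan", "aadhaar"]

print(accuracy(y_true, y_pred))  # 4 of the 5 predictions match
for (true_label, predicted), count in sorted(confusion_counts(y_true, y_pred).items()):
    print(f"true={true_label} predicted={predicted} count={count}")
```

The per-pair counts show which document types get confused with each other, which a single accuracy figure hides and which can help direct the manual checks.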

Impact of the project

  • This project was packaged into a ready-to-use library, which helped save more than 50 hours of manual customer document verification. This let the operations team focus on other high-priority tasks.

Future Improvements

  • The model can be trained on fake documents, which could help in detecting fraudulent documents.

document_verification's People

Contributors

shiva2410
