Giter Site home page Giter Site logo

deepfake-detect's Introduction

DeepFake-Detect

https://deepfake-detect.com/

Description

This project aims to guide developers to train a deep learning-based deepfake detection model from scratch using Python, Keras and TensorFlow. The proposed deepfake detector is based on the state-of-the-art EfficientNet structure with some customizations on the network layers, and the sample models provided were trained against a massive and comprehensive set of deepfake datasets.

The proposed deepfake detection model is also served via a standard web-based interface at DF-Detect to assist both the general Internet users and digital media providers in identifying potential deepfake contents. It is hoped that such approachable solution could remind Internet users to stay vigilant against fake contents, and ultimately help counter the emergence of deepfakes.

Deepfake Datasets

Due to the nature of deep neural networks being data-driven, it is necessary to acquire massive deepfake datasets with various different synthesis methods in order to achieve promising results. The following deepfake datasets were used in the final model at DF-Detect:

Combining all the datasets from different sources would provide us a total of 134,446 videos with approximately 1,140 unique identities and around 20 deepfake synthesis methods.


Getting Started

Prerequisites

  • Python 3
  • Keras
  • TensorFlow
  • EfficientNet for TensorFlow Keras
  • OpenCV on Wheels
  • MTCNN

Installation

pip install -r requirements.txt

Usage

Step 0 - Convert video frames to individual images

python 00-convert_video_to_image.py

Extract all the video frames from the acquired deepfake datasets above, saving them as individual images for further processing. In order to cater for different video qualities and to optimize for the image processing performance, the following image resizing strategies were implemented:

  • 2x resize for videos with width less than 300 pixels
  • 1x resize for videos with width between 300 and 1000 pixels
  • 0.5x resize for videos with width between 1000 and 1900 pixels
  • 0.33x resize for videos with width greater than 1900 pixels

Step 1 - Extract faces from the deepfake images with MTCNN

python 01a-crop_faces_with_mtcnn.py

Further process the frame images to crop out the facial parts in order to allow the neural network to focus on capturing the facial manipulation artifacts. In cases where there are more than one subject appearing in the same video frame, each detection result is saved separately to provide better variety for the training dataset.

  • The pre-trained MTCNN model used is coming from this GitHub repo: https://github.com/ipazc/mtcnn
  • Added 30% margins from each side of the detected face bounding box
  • Used 95% as the confidence threshold to capture the face images

(Optional) Step 1b - Extract faces from the deepfake images with Azure Computer Vision API

In case you do not have a good enough hardware to run MTCNN, or you want to achieve a faster execution time, you may choose to run 01b instead of 01a to leverage the Azure Computer Vision API for facial recognition.

python 01b-crop_faces_with_azure-vision-api.py

Replace the missing parts (API Name & API Key) before running

Step 2 - Balance and split datasets into various folders

python 02-prepare_fake_real_dataset.py

As we observed that the number of fakes are much larger than the number of real faces (due to the fact that one real video is usually used for creating multiple deepfakes), we need to perform a down-sampling on the fake dataset based on the number of real crops, in order to tackle for possible class imbalance issues during the training phase.

We also need to split the dataset into training, validation and testing sets (for example, in the ratio of 80:10:10) as the final step in the data preparation phase.

Step 3 - Model training

python 03-train_cnn.py

EfficientNet is used as the backbone for the development work. Given that most of the deepfake videos are synthesized using a frame-by-frame approach, we have formulated the deepfake detection task as a binary classification problem such that it would be generally applicable to both video and image contents.

In this code sample, we have adapted the EfficientNet B0 model in several ways: The top input layer is replaced by an input size of 128x128 with a depth of 3, and the last convolutional output from B0 is fed to a global max pooling layer. In addition, 2 additional fully connected layers have been introduced with ReLU activations, followed by a final output layer with Sigmoid activation to serve as a binary classifier.

Thus, given a colored square image as the network input, we would expect the model to compute an output between 0 and 1 that indicates the probability of the input image being either deepfake (0) or pristine (1).

Authors

See also the list of contributors who participated in this project.

License

This project is licensed under the MIT License - see the LICENSE file for details

Acknowledgments

This project is built using the following packages and libraries as listed here

deepfake-detect's People

Contributors

aaronchong888 avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.