deepfakes-faceswap-GAN

Adds adversarial loss and perceptual loss (VGGFace) to the deepfakes autoencoder architecture.

Descriptions

  • FaceSwap_GAN_github.ipynb: This Jupyter notebook does the following:

    1. Builds a GAN model.
    2. Trains the GAN from scratch.
    3. Detects faces in an image using dlib's CNN model.
    4. Uses the GAN to transform each detected face into the target face.
    5. Uses the moviepy module to output a video clip with the swapped face.
  • dlib_video_face_detection.ipynb: This Jupyter notebook does the following:

    1. Detects and crops faces in a video using dlib's CNN model.
    2. Packs the cropped face images into a zip file.
  • Training data: Training images should be placed in the ./TE/ and ./SH/ folders, one folder per target. Face images can be of any size.

Results

Below are results that show trained models transforming Hinako Sano (佐野ひなこ, left) to Emi Takei (武井咲, right).

1. Autoencoder

It should be mentioned that the autoencoder (AE) results could be much better if it were trained for longer.

AE GIF  AE_results

2. Generative Adversarial Network, GAN (adding VGGFace perceptual loss)

Adversarial loss improves the resolution of generated images. In addition, when perceptual loss is applied, the movement of the eyeballs becomes more realistic and consistent with the input face.

GAN_PL_GIF  GAN_PL_results

Perceptual loss (PL): The following figure shows the subtle differences in eyeball direction between outputs of models trained with and without PL.

Comp PL

Smoothed bounding box: An exponential moving average of the bounding box position over frames is introduced to eliminate jitter on the swapped face. See the GIF below for a comparison. (Updated Dec. 29, 2017)

bbox

  • A. Source face
  • B. Swapped face, using smoothing mask
  • C. Swapped face, using smoothing mask and face alignment
  • D. Swapped face, using smoothing mask and smoothed bounding box
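The bounding box smoothing described above can be sketched in a few lines. This is only an illustration, not the notebook's actual code: the function name `smooth_bboxes`, the `(x0, y0, x1, y1)` box layout, and the smoothing factor `alpha` are all assumptions made for the example.

```python
def smooth_bboxes(bboxes, alpha=0.65):
    """Apply an exponential moving average to a sequence of boxes.

    bboxes: iterable of (x0, y0, x1, y1) tuples, one per video frame
            (assumed layout, chosen for this sketch).
    alpha:  weight of the previous smoothed box (hypothetical default);
            higher values give smoother but slower-reacting boxes.
    """
    smoothed = []
    prev = None
    for box in bboxes:
        if prev is None:
            # The first frame has no history, so use the detection as-is.
            prev = tuple(float(v) for v in box)
        else:
            # Blend each coordinate with its previous smoothed value.
            prev = tuple(alpha * p + (1.0 - alpha) * v for p, v in zip(prev, box))
        smoothed.append(prev)
    return smoothed


# A detection that jumps by 10 pixels only moves 5 pixels with alpha=0.5.
boxes = smooth_bboxes([(0, 0, 10, 10), (10, 10, 20, 20)], alpha=0.5)
```

A larger `alpha` suppresses more jitter but makes the box lag behind fast head movement, so the value would need tuning per video.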

WIP

Mask generation: The model learns a proper mask that helps with handling occlusion.

mask0

mask1  mask2

  • Left: Source face
  • Middle: Swapped face, before masking
  • Right: Swapped face, after masking
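The text does not say how the learned mask is applied to the swapped face. One common choice, assumed here purely for illustration, is per-pixel alpha blending, where the mask decides how much of the generated pixel replaces the source pixel. The function name `blend_with_mask` and the nested-list image layout are hypothetical.

```python
def blend_with_mask(source, generated, mask):
    """Alpha-blend two single-channel images using a mask in [0, 1].

    Assumption for this sketch: mask == 1 keeps the generated pixel,
    mask == 0 keeps the source pixel (e.g. an occluding hand or hair).
    All three arguments are lists of rows with identical shapes.
    """
    return [
        [m * g + (1.0 - m) * s for s, g, m in zip(src_row, gen_row, mask_row)]
        for src_row, gen_row, mask_row in zip(source, generated, mask)
    ]


# A 1x3 image: fully swapped, half blended, and fully preserved pixels.
out = blend_with_mask([[0.0, 0.0, 0.0]], [[1.0, 1.0, 1.0]], [[1.0, 0.5, 0.0]])
```

In practice this would be done on arrays with three color channels, but the arithmetic per pixel is the same.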

Mask visualization: Makes video clips that show the mask heatmap and the face bounding box.

mask_vis

  • Left: Source face
  • Middle: Swapped face, after masking
  • Right: Mask heatmap & face bounding box

Requirements

Notes:

  1. BatchNorm/InstanceNorm: Caused input/output skin color inconsistency when the two training datasets had different skin color distributions (lighting conditions, shadows, etc.).
  2. Increasing the perceptual loss weighting factor (to 1) destabilized training. But the weighting [.01, .1, .1] I used is not optimal either.
  3. In the encoder architecture, flattening the Conv2D output and shrinking it to Dense(1024) is crucial for the model to learn semantic features, or a face representation. If we used Conv layers only (which means a larger dimension), would it learn features like visual descriptors? (source paper, last paragraph of sec 3.1)
  4. Transforming Emi Takei to Hinako Sano gave suboptimal results because the training data was imbalanced: over 65% of the Hinako Sano images came from the same video series.
  5. The mixup technique (arXiv) and the least squares loss function (arXiv) are adopted for training the GAN. However, I did not run any ablation experiments on them, so I don't know how much impact they had on the outputs.
  6. Since human faces are not 100% symmetric, should we remove random flipping from the data augmentation so that the model learns better features? Maybe the generated faces would look more like the target.
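Note 5 mentions mixup and a least squares loss. The sketch below shows one way the two can be combined for a discriminator update, following the general formulation in the mixup and LSGAN papers rather than the notebook's actual code. The function names, the flat-list sample representation, and the default `alpha=0.2` are assumptions made for the example.

```python
import random


def mixup(x_a, x_b, y_a, y_b, alpha=0.2, rng=random):
    """Mix two samples and their labels with a Beta(alpha, alpha) weight.

    x_a, x_b: flat lists of floats (e.g. a real and a generated image).
    y_a, y_b: scalar labels (e.g. 1.0 for real, 0.0 for fake).
    rng:      anything with a betavariate method, such as random.Random.
    """
    lam = rng.betavariate(alpha, alpha)
    x = [lam * a + (1.0 - lam) * b for a, b in zip(x_a, x_b)]
    y = lam * y_a + (1.0 - lam) * y_b
    return x, y, lam


def least_squares_loss(predictions, targets):
    """Mean squared difference between discriminator outputs and targets."""
    return sum((p - t) ** 2 for p, t in zip(predictions, targets)) / len(predictions)


# Mix a "real" and a "fake" sample; the mixed label is the target that the
# discriminator's output on the mixed sample would be regressed toward.
real, fake = [1.0, 1.0], [0.0, 0.0]
mixed_x, mixed_y, lam = mixup(real, fake, 1.0, 0.0, rng=random.Random(0))
```

With real labeled 1.0 and fake labeled 0.0, the mixed label equals the mixing weight itself, so the discriminator is trained to predict how "real" the blended input is.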

TODO

  1. Use a Kalman filter to track the bounding box.

Acknowledgments

Code borrows from tjwei and deepfakes. The generative network is adopted from CycleGAN.
