hongshuochen / defakehop

Official code for DefakeHop: A Light-Weight High-Performance Deepfake Detector

Home Page: https://arxiv.org/abs/2103.06929

deepfake-detection successive-subspace-learning green-learning

defakehop's Introduction

DefakeHop: A Light-Weight High-Performance Deepfake Detector

This is the official Python implementation of our work: "DefakeHop: A Light-Weight High-Performance Deepfake Detector" accepted at ICME 2021.

State-of-the-art Deepfake detection methods are built upon deep neural networks. In this work, we propose a non-deep-learning method for detecting Deepfake videos. It uses the successive subspace learning (SSL) principle to extract features from various parts of face images. These features are further distilled by our feature distillation module to derive a concise representation of fake and real faces.

Required packages

conda install -c anaconda pandas 
conda install -c conda-forge opencv
conda install -c anaconda scikit-image
conda install -c conda-forge matplotlib
conda install -c conda-forge scikit-learn

Because we use the GPU to accelerate processing, please install xgboost with pip:

pip install xgboost

Data

Please put your videos in the following folders:

train
- real
- fake
test
- real
- fake
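
As a quick sanity check before preprocessing, the layout above can be created or verified with a few lines of Python. This is only a sketch: the helper name `ensure_layout` and its `root` argument are assumptions made for illustration and are not part of this repository.

```python
from pathlib import Path

SPLITS = ("train", "test")
CLASSES = ("real", "fake")


def ensure_layout(root):
    """Create any missing split/class folders under root and return all four.

    Hypothetical helper, not part of the DefakeHop code base.
    """
    folders = []
    for split in SPLITS:
        for cls in CLASSES:
            folder = Path(root) / split / cls
            # exist_ok=True leaves folders that already hold videos untouched
            folder.mkdir(parents=True, exist_ok=True)
            folders.append(folder)
    return folders
```

Calling `ensure_layout(".")` from the repository root should leave existing videos in place and only add the folders that are missing.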

Preprocessing

Extract the facial landmarks using OpenFace (see the OpenFace project for more details).

python landmark_extractor.py

Align the faces and crop the facial regions

python patch_extractor.py

Get the training and testing data

python data.py

How to run

We use the UADFV dataset as an example to show how to train and test the model with our code.

python model.py

Training the model requires three inputs:

Images: 4D numpy array (N,H,W,C).
Labels: 1D numpy array where 1 is Fake and 0 is Real.
Names: 1D numpy array storing frame names.
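
To make the expected shapes concrete, here is a minimal sketch that builds the three arrays from dummy data. The function name `make_dummy_inputs`, the 32x32 RGB patch size, and the placeholder video name `0000` are assumptions chosen for illustration; real inputs come from the preprocessing scripts.

```python
import numpy as np


def make_dummy_inputs(n_real, n_fake, height=32, width=32, channels=3):
    """Build placeholder (images, labels, names) arrays with the expected shapes.

    Hypothetical helper for illustration only.
    """
    n = n_real + n_fake
    # Images: 4D array laid out as (N, H, W, C)
    images = np.zeros((n, height, width, channels), dtype=np.uint8)
    # Labels: 0 is Real, 1 is Fake
    labels = np.concatenate([np.zeros(n_real, dtype=int), np.ones(n_fake, dtype=int)])
    # Names: one frame name per image, {video_name}_{frame_number}
    names = np.array(
        [f"real/0000_{i:04d}.bmp" for i in range(n_real)]
        + [f"fake/0000_{i:04d}.bmp" for i in range(n_fake)]
    )
    return images, labels, names


images, labels, names = make_dummy_inputs(n_real=3, n_fake=2)
print(images.shape, labels.shape, names.shape)
```

The three arrays share the same first dimension N, so entry i of each array describes the same frame.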

The frame name should follow the format of {video_name}_{frame_number}.

Example: real/0047_0786.bmp is the 786th frame of real/0047.mp4.
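
The naming convention can be checked with a small parser. This is a sketch under the assumption that the frame number is everything after the last underscore in the file stem; `parse_frame_name` is a hypothetical helper and is not part of this repository.

```python
from pathlib import PurePosixPath


def parse_frame_name(name):
    """Split '{video_name}_{frame_number}' into (video_name, frame_number).

    Hypothetical helper: assumes the frame number follows the LAST underscore,
    so video names that contain underscores themselves still parse.
    """
    path = PurePosixPath(name)
    video_stem, _, frame = path.stem.rpartition("_")
    if not video_stem or not frame.isdigit():
        raise ValueError(f"not of the form video_frame: {name!r}")
    # Keep the folder prefix (e.g. 'real/') and drop the frame suffix
    video_name = str(path.with_name(video_stem))
    return video_name, int(frame)


video, frame = parse_frame_name("real/0047_0786.bmp")
print(video, frame)
```

Grouping frames by the returned video name is one way to aggregate frame-level predictions into a per-video decision.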

Cite us

If you use this repository, please consider citing our paper.

@misc{chen2021defakehop,
      title={DefakeHop: A Light-Weight High-Performance Deepfake Detector}, 
      author={Hong-Shuo Chen and Mozhdeh Rouhsedaghat and Hamza Ghani and Shuowen Hu and Suya You and C. -C. Jay Kuo},
      year={2021},
      eprint={2103.06929},
      archivePrefix={arXiv},
      primaryClass={cs.CV}
}

Acknowledgment

This work was supported by the Army Research Laboratory (ARL) under agreement W911NF2020157.

defakehop's Issues

How to train with image dataset?

Below is my file structure, each folder contains several images.

.
├── train
│   ├── fake
│   └── real
├── val
│   ├── fake
│   └── real

Can I use images to train the model directly?

Unable to train the model

Hey,
This is nice work, and it is really appreciated.

I am getting an error when I run python model.py. May I ask what the problem could be?
Thanks!

Running model.py crashes

Running model.py crashes with an out-of-memory error on a system with 32 GB of RAM.

The log: https://bin.snopyta.org/?0b74cdff60bad1ed#5BJRf22DBYiAc7Zw1CdfXbK6EejXyW9Aqx5qmn7UeioJ

Hang in soft classifiers?

I tried to run your code, but it hangs after this output (on Ubuntu with Anaconda):
"
(4360, 32, 32, 3)
==============================left_eye==============================
===============DefakeHop Training===============
===============MultiChannelWiseSaab Training===============
Hop1
Input shape: (4360, 32, 32, 3)
Output shape: (4360, 15, 15, 12)
Hop2
SaabID: 0 ChannelID: 0 Energy: 0.3909475878148454
Input shape: (4360, 15, 15, 1)
Output shape: (4360, 7, 7, 7)
SaabID: 0 ChannelID: 1 Energy: 0.3447543175276042
Input shape: (4360, 15, 15, 1)
Output shape: (4360, 7, 7, 8)
SaabID: 0 ChannelID: 2 Energy: 0.10680955446396825
Input shape: (4360, 15, 15, 1)
Output shape: (4360, 7, 7, 8)
SaabID: 0 ChannelID: 3 Energy: 0.05905506598019355
Input shape: (4360, 15, 15, 1)
Output shape: (4360, 7, 7, 5)
SaabID: 0 ChannelID: 4 Energy: 0.03872900013611077
Input shape: (4360, 15, 15, 1)
Output shape: (4360, 7, 7, 5)
SaabID: 0 ChannelID: 5 Energy: 0.023054650053579685
Input shape: (4360, 15, 15, 1)
Output shape: (4360, 7, 7, 8)
Hop3
SaabID: 0 ChannelID: 0 Energy: 0.24112964726659797
Input shape: (4360, 7, 7, 1)
Output shape: (4360, 3, 3, 8)
SaabID: 0 ChannelID: 1 Energy: 0.09732579918720609
Input shape: (4360, 7, 7, 1)
Output shape: (4360, 3, 3, 8)
SaabID: 0 ChannelID: 2 Energy: 0.02775095300424238
Input shape: (4360, 7, 7, 1)
Output shape: (4360, 3, 3, 5)
SaabID: 0 ChannelID: 3 Energy: 0.015179178866751176
Input shape: (4360, 7, 7, 1)
Output shape: (4360, 3, 3, 5)
SaabID: 1 ChannelID: 0 Energy: 0.19901943188942417
Input shape: (4360, 7, 7, 1)
Output shape: (4360, 3, 3, 8)
SaabID: 1 ChannelID: 1 Energy: 0.08213302772662819
Input shape: (4360, 7, 7, 1)
Output shape: (4360, 3, 3, 8)
SaabID: 1 ChannelID: 2 Energy: 0.02246967960746773
Input shape: (4360, 7, 7, 1)
Output shape: (4360, 3, 3, 8)
SaabID: 1 ChannelID: 3 Energy: 0.018863600077674066
Input shape: (4360, 7, 7, 1)
Output shape: (4360, 3, 3, 8)
SaabID: 1 ChannelID: 4 Energy: 0.013172183331202066
Input shape: (4360, 7, 7, 1)
Output shape: (4360, 3, 3, 7)
SaabID: 2 ChannelID: 0 Energy: 0.04175134624454228
Input shape: (4360, 7, 7, 1)
Output shape: (4360, 3, 3, 8)
SaabID: 2 ChannelID: 1 Energy: 0.01873118654102847
Input shape: (4360, 7, 7, 1)
Output shape: (4360, 3, 3, 6)
SaabID: 2 ChannelID: 2 Energy: 0.01694263964772816
Input shape: (4360, 7, 7, 1)
Output shape: (4360, 3, 3, 6)
SaabID: 2 ChannelID: 3 Energy: 0.013554258645960802
Input shape: (4360, 7, 7, 1)
Output shape: (4360, 3, 3, 7)
SaabID: 3 ChannelID: 0 Energy: 0.024407171663723276
Input shape: (4360, 7, 7, 1)
Output shape: (4360, 3, 3, 5)
SaabID: 3 ChannelID: 1 Energy: 0.02307696785774328
Input shape: (4360, 7, 7, 1)
Output shape: (4360, 3, 3, 5)
SaabID: 4 ChannelID: 0 Energy: 0.014326932815538707
Input shape: (4360, 7, 7, 1)
Output shape: (4360, 3, 3, 4)
SaabID: 4 ChannelID: 1 Energy: 0.012846123839138926
Input shape: (4360, 7, 7, 1)
Output shape: (4360, 3, 3, 5)
spent 7.283803701400757 s
===============MultiChannelWiseSaab Transformation===============
Hop1
Input shape: (4360, 32, 32, 3)
Output shape: (4360, 15, 15, 12)
Hop2
SaabID: 0 ChannelID: 0
Input shape: (4360, 15, 15, 1)
Output shape: (4360, 7, 7, 7)
SaabID: 0 ChannelID: 1
Input shape: (4360, 15, 15, 1)
Output shape: (4360, 7, 7, 8)
SaabID: 0 ChannelID: 2
Input shape: (4360, 15, 15, 1)
Output shape: (4360, 7, 7, 8)
SaabID: 0 ChannelID: 3
Input shape: (4360, 15, 15, 1)
Output shape: (4360, 7, 7, 5)
SaabID: 0 ChannelID: 4
Input shape: (4360, 15, 15, 1)
Output shape: (4360, 7, 7, 5)
SaabID: 0 ChannelID: 5
Input shape: (4360, 15, 15, 1)
Output shape: (4360, 7, 7, 8)
Hop3
SaabID: 0 ChannelID: 0
Input shape: (4360, 7, 7, 1)
Output shape: (4360, 3, 3, 8)
SaabID: 0 ChannelID: 1
Input shape: (4360, 7, 7, 1)
Output shape: (4360, 3, 3, 8)
SaabID: 0 ChannelID: 2
Input shape: (4360, 7, 7, 1)
Output shape: (4360, 3, 3, 5)
SaabID: 0 ChannelID: 3
Input shape: (4360, 7, 7, 1)
Output shape: (4360, 3, 3, 5)
SaabID: 1 ChannelID: 0
Input shape: (4360, 7, 7, 1)
Output shape: (4360, 3, 3, 8)
SaabID: 1 ChannelID: 1
Input shape: (4360, 7, 7, 1)
Output shape: (4360, 3, 3, 8)
SaabID: 1 ChannelID: 2
Input shape: (4360, 7, 7, 1)
Output shape: (4360, 3, 3, 8)
SaabID: 1 ChannelID: 3
Input shape: (4360, 7, 7, 1)
Output shape: (4360, 3, 3, 8)
SaabID: 1 ChannelID: 4
Input shape: (4360, 7, 7, 1)
Output shape: (4360, 3, 3, 7)
SaabID: 2 ChannelID: 0
Input shape: (4360, 7, 7, 1)
Output shape: (4360, 3, 3, 8)
SaabID: 2 ChannelID: 1
Input shape: (4360, 7, 7, 1)
Output shape: (4360, 3, 3, 6)
SaabID: 2 ChannelID: 2
Input shape: (4360, 7, 7, 1)
Output shape: (4360, 3, 3, 6)
SaabID: 2 ChannelID: 3
Input shape: (4360, 7, 7, 1)
Output shape: (4360, 3, 3, 7)
SaabID: 3 ChannelID: 0
Input shape: (4360, 7, 7, 1)
Output shape: (4360, 3, 3, 5)
SaabID: 3 ChannelID: 1
Input shape: (4360, 7, 7, 1)
Output shape: (4360, 3, 3, 5)
SaabID: 4 ChannelID: 0
Input shape: (4360, 7, 7, 1)
Output shape: (4360, 3, 3, 4)
SaabID: 4 ChannelID: 1
Input shape: (4360, 7, 7, 1)
Output shape: (4360, 3, 3, 5)
spent 3.1094441413879395 s
===============Features Dimensions===============
Hop1 (4360, 15, 15, 12)
Hop2 (4360, 7, 7, 41)
Hop3 (4360, 3, 3, 111)
===============Spatial Dimension Reduction===============
Input shape: (15, 15) 225
Output shape: 32
Input shape: (7, 7) 49
Output shape: 12
Input shape: (3, 3) 9
Output shape: 5
===============Soft Classifiers===============
"
After pressing Ctrl+C:
"
^CTraceback (most recent call last):
File "model.py", line 163, in
model.fit_region(region, train_images, train_labels, train_names, multi_cwSaab_parm)
File "model.py", line 34, in fit_region
features = defakehop.fit(images, labels)
File "/home/user21/workspace/DefakeHop/defakeHop.py", line 54, in fit
fit_all_channel_wise_clf(self.features, labels, n_jobs=4)
File "/home/user21/workspace/DefakeHop/defakeHop.py", line 150, in fit_all_channel_wise_clf
pool.starmap(fit_channel_wise_clf, parameters)
File "/home/user21/anaconda3/lib/python3.8/multiprocessing/pool.py", line 372, in starmap
return self._map_async(func, iterable, starmapstar, chunksize).get()
File "/home/user21/anaconda3/lib/python3.8/multiprocessing/pool.py", line 765, in get
self.wait(timeout)
File "/home/user21/anaconda3/lib/python3.8/multiprocessing/pool.py", line 762, in wait
self._event.wait(timeout)
File "/home/user21/anaconda3/lib/python3.8/threading.py", line 558, in wait
signaled = self._cond.wait(timeout)
File "/home/user21/anaconda3/lib/python3.8/threading.py", line 302, in wait
waiter.acquire()
KeyboardInterrupt
"

How to train and test with an image dataset

About Model Size / Saving Model

I wanted to ask about the model size you got after training on the Celeb-DF and FF++ datasets. I want to save the model and then use it for single predictions. As I understand it, prediction requires saving both the "classifier" and "defakeHop" objects. However, the size of the defakeHop object depends on the training data: I end up with a 10 GB defakeHop object and a 760 KB classifier. Did I do something wrong? How would you save the model for future prediction? If you have some time, could you explain it to me?

About dataset

Hey, thanks for your great work! I have a question about the experimental datasets.
Do you train on the Celeb-DF training set and test on the Celeb-DF test set, or train on the FF++ training set and test on the Celeb-DF test set?
Best wishes!

A small issue with column names

There is a small issue in the patch_extractor.py file: the column names are slightly different.
Instead of df['success'], we have to use df[' success'] (with an extra leading space). In my case, I used the Docker approach to extract features from video frames in landmark_extractor.py. For some reason, every column name except the first ('frame') has an extra leading space. Perhaps you will not face the same problem if you use another method for feature extraction.
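
One way to make the lookup independent of the leading space is to normalize the column names right after loading the CSV. The sketch below uses a tiny in-memory CSV as a stand-in for a landmark file; `load_landmarks` is a hypothetical helper, and this is a possible workaround rather than the fix adopted in the repository.

```python
import io

import pandas as pd

# Stand-in for a landmark CSV whose headers carry a leading space
csv_text = "frame, success, x_0\n1, 1, 10.5\n2, 0, 11.0\n"


def load_landmarks(source):
    """Read a landmark CSV and strip surrounding whitespace from its headers.

    Hypothetical helper, not part of patch_extractor.py.
    """
    df = pd.read_csv(source)
    # ' success' becomes 'success', so df['success'] works in both cases
    df.columns = df.columns.str.strip()
    return df


df = load_landmarks(io.StringIO(csv_text))
print(list(df.columns))
```

Passing `skipinitialspace=True` to `pd.read_csv` is an alternative that drops the spaces following each delimiter while parsing.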
