
deepfakedetection's People

Contributors

carmeteam, keuperj, ricarddurall


deepfakedetection's Issues

What is epsilon used for? I get overlapping 1D spectra

Hi again,
Can you please tell me what exactly epsilon is used for? I didn't find any mention of it in the paper.
It is added to 'fshift' of real images but is not added to 'fshift' of fake images. This is in the code for the FaceForensics dataset: https://github.com/cc-hpc-itwm/DeepFakeDetection/blob/master/Experiments_DeepFakeDetection/FaceForensic.ipynb

I can't link to the lines directly as it is an ipynb, but here are the lines of code:

  1. For fake images:
    fshift = np.fft.fftshift(f)
    magnitude_spectrum = 20*np.log(np.abs(fshift))
  2. For real images:
    fshift = np.fft.fftshift(f)
    fshift += epsilon
    magnitude_spectrum = 20*np.log(np.abs(fshift))

The above is from the FaceForensics notebook; in the CelebA notebook, epsilon is added in both cases.
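
My current guess is that epsilon only keeps np.log away from zero-magnitude frequency bins. If so, a guard like the following would be equivalent (the epsilon value, and placing it on the magnitude rather than on the complex fshift as in the notebook, are my assumptions):

import numpy as np

epsilon = 1e-8                         # illustrative value, not from the notebook
img = np.random.rand(128, 128)         # stand-in for a grayscale face crop

f = np.fft.fft2(img)
fshift = np.fft.fftshift(f)            # move the zero frequency to the center
# adding epsilon to the magnitude guards against log(0) at empty bins
magnitude_spectrum = 20 * np.log(np.abs(fshift) + epsilon)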

Also, is it okay to use MTCNN for cropping faces from the frames of a video? I'm concerned because I get the same 1D spectrum curve for real and fake images when I run the code on the Celeb-DF dataset (they overlap almost completely). I don't know whether the MTCNN implementation I use resizes the image, because the documentation doesn't mention whether the output image is resized/rescaled.

Can you please let me know what face detection library you used for cropping the images?

Question on robustness

Thank you for your good work.

Any re-sampling/re-scaling of the input images might distort the frequency spectrum: Do NOT resize the images, resize the spectra afterwards! Also: some prominent face detectors do resizing, don't use them if you can't turn it off.

In your description, you mention that re-scaling will distort the frequency spectrum. Does this mean that this detection approach does not work well when the fake images have been resized?

In a practical scenario, this would be a common case.
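
For what it's worth, my reading of the README note is that the spectrum should be computed on the unresized crop, and only the 1D profile brought to a fixed length afterwards. A minimal sketch of that workflow, assuming the radialProfile helper shipped with this repo (the interpolation step and the value of N are my assumptions):

import numpy as np
import radialProfile   # helper shipped with this repository

N = 300   # target feature length; illustrative value

def features_from_crop(img):
    # compute the power spectrum on the *unresized* crop
    fshift = np.fft.fftshift(np.fft.fft2(img))
    psd1D = radialProfile.azimuthalAverage(np.abs(fshift) ** 2)
    # resize the spectrum, not the image: interpolate the profile to N points
    return np.interp(np.linspace(0, 1, N), np.linspace(0, 1, psd1D.size), psd1D)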

Update the metrics!

This is a repost as my issues were closed without being resolved. I am not happy with this behavior. It's not scientific.

As already mentioned, the training data is balanced and this is the important point.

I disagree. The important point is the test performance. The chosen metric, accuracy, cannot measure the performance properly because the test data is unbalanced. I therefore propose that you switch to roc_auc_score as the metric! Using it is as simple as from sklearn.metrics import roc_auc_score. I look forward to your updated scores with this meaningful metric for unbalanced data!
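
For concreteness, this is all the change I am asking for (a minimal sketch on synthetic stand-in data; on your side, X_test/y_test would be the unbalanced test split and clf the fitted classifier):

import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)                      # synthetic stand-in data
X, y = rng.normal(size=(200, 5)), rng.integers(0, 2, size=200)
X_test, y_test = rng.normal(size=(100, 5)), rng.integers(0, 2, size=100)

clf = LogisticRegression().fit(X, y)
scores = clf.predict_proba(X_test)[:, 1]            # scores, not hard labels
print("ROC AUC:", roc_auc_score(y_test, scores))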

The SVM parameters

I wonder whether you have any suggestions on choosing the SVM parameters for different resolutions. I capture video clips and convert the frames to 1024x1024 .jpg files. With the FacesHQ parameters, the accuracy is only about 45% on a balanced dataset.
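
In case it helps others, the parameters can be re-tuned per resolution with a grid search instead of reusing the FacesHQ values. A minimal sketch (the parameter ranges are my guess, not from the paper; X and y stand for the spectrum features and labels at the new resolution):

import numpy as np
from sklearn.svm import SVC
from sklearn.model_selection import GridSearchCV

X = np.random.rand(200, 300)          # stand-in for 1D spectrum features
y = np.random.randint(0, 2, 200)      # stand-in labels

param_grid = {"C": [0.1, 1, 6.37, 10, 100], "gamma": [0.01, 0.1, 0.86, 1, 10]}
search = GridSearchCV(SVC(kernel="rbf"), param_grid, cv=5)
search.fit(X, y)
print(search.best_params_, search.best_score_)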

Using the script on images of size 128x128 or 256x256

I tried the Visualization ipynb on images generated from CelebA but resized to 256x256 and 128x128, and I was not able to visualize the high-frequency output image. I did change the mask to match the dimensions of the input images. I noticed that the normalization inv2 /= inv.max() fills the array with NaN values. Please let me know what you think.

Thanks.
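
In case it helps, a guard like the following avoids the NaN when the masked inverse transform is all zeros (variable names mirror the notebook; the illustrative mask and the guard itself are my additions):

import numpy as np

img = np.random.rand(256, 256)                    # stand-in for an input image
fshift = np.fft.fftshift(np.fft.fft2(img))

mask = np.ones_like(img)                          # illustrative high-pass mask:
c = img.shape[0] // 2                             # zero out a central square of
mask[c - 32:c + 32, c - 32:c + 32] = 0            # low frequencies

inv = np.abs(np.fft.ifft2(np.fft.ifftshift(fshift * mask)))
peak = inv.max()
inv2 = inv / peak if peak > 0 else inv            # 0/0 would give NaN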

Can we use RNNs for exploiting temporal inconsistencies in Videos?

Hi, thank you for the great work!

Can we use this approach along with RNNs to also exploit the temporal inconsistencies in the frames? Or will these inconsistencies get lost in the frequency space?
The idea is that if the frequency fingerprint (after azimuthal averaging) that GANs/autoencoders leave is not consistent across frames, an RNN could pick that up. The RNN would be fed the azimuthally averaged power-spectrum values of the spatial frequencies.
This assumes high-quality videos like those in DFDC and Celeb-DF, since your method needs the high frequencies.

Is it worth giving this analysis a shot?
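
In case it clarifies the proposal, a minimal sketch of what I have in mind (all shapes, the LSTM choice, and the feature count are illustrative; nothing here is from the repo):

import torch
import torch.nn as nn

# hypothetical batch of per-video sequences of azimuthally averaged spectra,
# shape (batch, frames, N) with N spectrum bins per frame
x = torch.randn(8, 30, 300)

rnn = nn.LSTM(input_size=300, hidden_size=64, batch_first=True)
head = nn.Linear(64, 1)                 # real/fake logit

_, (h, _) = rnn(x)                      # h: last hidden state, (1, batch, 64)
logits = head(h.squeeze(0))             # one logit per video
print(logits.shape)                     # torch.Size([8, 1])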

How do I open the images in a pkl file

I want to take a look at each of the images in the pkl file, so I need to unpack it, but I can't figure out how.

Is there a way to unpack/unzip a pkl file?
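
For anyone landing here: the .pkl files in this repo load with the standard pickle module, and as far as I can tell they hold feature dictionaries rather than raw images. A minimal sketch (key names follow the training notebooks):

import pickle

with open('train_3200.pkl', 'rb') as f:   # path as used in the notebooks
    data = pickle.load(f)

print(data.keys())          # inspect what the file actually contains
X = data["data"]            # feature vectors (1D power spectra)
y = data["label"]           # 0/1 labels
print(len(X), len(y))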

Faces-HQ dataset cannot be downloaded properly

Hi, downloading the Faces-HQ dataset from Google Drive keeps failing; I have tried three times (my Internet connection is not stable enough). Is there another way to download the dataset? Thanks a lot, and I hope for your reply.

Update the code please

I am trying to reproduce your results, but it's impossible: you just load a bunch of precomputed features. Please update your code and provide clear instructions on how the features were computed.

For example:

# load feature file
pkl_file = open('train_3200.pkl', 'rb')
data = pickle.load(pkl_file)
pkl_file.close()
X = data["data"]
y = data["label"]

Where is the code that generates these pkl files? Why not use a simple csv file instead of folderNames.npy? Why do you copy radialProfile.py into every folder? Please also add a demo with a pretrained model that takes either an image or a video as input. How can anybody use this project otherwise?
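
To be concrete about what is missing: based on the notebooks, the features appear to be azimuthally averaged power spectra, so I would expect the generation script to look roughly like the sketch below. This is my reconstruction, not your code; the dataset loader producing images and labels is exactly the part that is absent.

import pickle
import numpy as np
import radialProfile   # the helper copied into every folder

def spectrum_features(img, N=300):
    # 1D power-spectrum profile, interpolated to N points
    fshift = np.fft.fftshift(np.fft.fft2(img))
    psd1D = radialProfile.azimuthalAverage(np.abs(fshift) ** 2)
    return np.interp(np.linspace(0, 1, N), np.linspace(0, 1, psd1D.size), psd1D)

# images, labels: whatever loader produced the published pkl files (missing!)
data = {"data": np.stack([spectrum_features(im) for im in images]),
        "label": np.array(labels)}
with open('train_3200.pkl', 'wb') as f:
    pickle.dump(data, f)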

The value for N (# features)

Hello, if we want to run this method on other datasets, how can we compute the value of N, the number of features, for a given dataset? I see that all three ipynb files have the number of features hard-coded in the script, so I was wondering how to figure out the right value. Thanks.
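
One workaround I can think of: compute N empirically from a single sample, since the profile length follows from the image resolution. A minimal sketch, assuming the radialProfile helper from this repo:

import numpy as np
import radialProfile

img = np.zeros((128, 128))   # one sample image at the new dataset's resolution
fshift = np.fft.fftshift(np.fft.fft2(img))
psd1D = radialProfile.azimuthalAverage(np.abs(fshift) ** 2)
print("N =", psd1D.size)     # candidate feature count for that resolution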

why not train a neural network?

Is there a specific reason for not training a neural network on the datasets where the accuracy is lower than 100%, like the DeepFakeDetection dataset?

Will the frequency-domain distribution be influenced by video compression mechanisms?

Hi,

Thanks for releasing this repository! From the README.md I learned that the frequency-domain distribution can easily be distorted by resizing. Could you tell me, further, whether the frequency-domain features change if the images are compressed into mp4 videos, or simply encoded and decoded with JPEG? Will the method proposed in the paper still work?
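
In case a quick self-check is useful, the JPEG part at least can be tested by re-encoding an image and comparing the 1D spectra before and after. A sketch using OpenCV, assuming the repo's radialProfile helper (the quality setting and the synthetic image are placeholders):

import numpy as np
import cv2
import radialProfile

img = (np.random.rand(256, 256) * 255).astype(np.uint8)   # stand-in face crop

def psd1d(im):
    fshift = np.fft.fftshift(np.fft.fft2(im))
    return radialProfile.azimuthalAverage(np.abs(fshift) ** 2)

ok, buf = cv2.imencode('.jpg', img, [cv2.IMWRITE_JPEG_QUALITY, 75])
jpg = cv2.imdecode(buf, cv2.IMREAD_GRAYSCALE)

# heavy compression mainly damps the high-frequency end of the profile
print(np.abs(psd1d(img) - psd1d(jpg)).mean())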

Classifiers are very bad?

You report accuracy on unbalanced data splits. Why? Accuracy is not a good metric for unbalanced data; please update the results with roc_auc_score! For example, a trivial all-zero prediction beats your classifiers. I only added one line ("SIMPLE" below), and this "amazing" model gets 0.9340659340659341 accuracy ...

import numpy as np
from sklearn.svm import SVC
from sklearn.linear_model import LogisticRegression
import pickle

# train on the balanced training split
pkl_file = open('train_3200.pkl', 'rb')
data = pickle.load(pkl_file)
pkl_file.close()
X = data["data"]
y = data["label"]

svclassifier_r = SVC(C=6.37, kernel='rbf', gamma=0.86)
svclassifier_r.fit(X, y)
logreg = LogisticRegression(solver='liblinear', max_iter=1000)
logreg.fit(X, y)

# evaluate on the full, unbalanced test split
pkl_file = open('test_full.pkl', 'rb')
data = pickle.load(pkl_file)
pkl_file.close()
X_ = data["data"]
y_ = data["label"]

SVM = svclassifier_r.score(X_, y_)
LR = logreg.score(X_, y_)
print("SVM: "+str(SVM))
print("LR: "+str(LR))
print("SIMPLE:"+str((np.zeros_like(y_) == y_).mean()))

Please let me know if I missed something. I look forward to your response.

What is “real data” in CelebA?

Hi!
I tried to run CelebA.ipynb, but found that the "real data" used in it does not seem to be provided. Could you please tell me which dataset was used? Thank you!

Bug in radialProfile.py

In line 18 of radialProfile.py,

it should be
center = np.array([(x.max()-x.min())/2.0, (y.max()-y.min())/2.0])
instead of
center = np.array([(x.max()-x.min())/2.0, (x.max()-x.min())/2.0]).

Since the current experiments use square N×N images, the bug does not affect the results.

How many frames to use when input is a video?

Results based on videos. (We apply a simple majority vote over the single frame classifications).

So you used a single SVM to predict each frame and then took the mode (majority vote) as the video-level prediction? How many frames did you use? All of them?
Thanks
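
For reference, my current understanding of the vote, as a sketch (the per-frame predictions and the tie-breaking rule are my assumptions):

import numpy as np

# per-frame 0/1 predictions from the single-frame classifier,
# e.g. frame_preds = clf.predict(frame_features)
frame_preds = np.array([0, 1, 1, 0, 1, 1])   # illustrative values

# majority vote over all classified frames; ties resolve to "fake" (1) here
video_pred = int(frame_preds.mean() >= 0.5)
print("video label:", video_pred)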
