Giter Site home page Giter Site logo

pfnet-research / tgan2 Goto Github PK

View Code? Open in Web Editor NEW
76.0 17.0 11.0 3.56 MB

The official implementation of "Train Sparsely, Generate Densely: Memory-efficient Unsupervised Training of High-resolution Temporal GAN"

Home Page: https://arxiv.org/abs/1811.09245

License: MIT License

Dockerfile 1.08% Jupyter Notebook 38.96% Python 59.80% Shell 0.17%
generative-adversarial-network video-generation machine-learning deep-learning

tgan2's Introduction

Temporal Generative Adversarial Nets v2

Generated samples by TGANv2

  • Unconditional video generation results by TGANv2 (trained on the FaceForensics dataset)

This repository contains the implementation of TGANv2 (see the details in "Train Sparsely, Generate Densely: Memory-efficient Unsupervised Training of High-resolution Temporal GAN") and scripts to reproduce experiments in the paper.

TGANv2

Requirements

  • docker
  • nvidia-docker
  • unrar

Tested environment

Ubuntu 18.04 w/

  • 4 x V100 (32GB) GPUs for UCF101 experiments
  • 8 x V100/P100 (16GB) GPUs for FaceForensics experiments

Setup the environment with Docker

cd docker
docker build -t tgan2 .

Prepare datasets

UCF101

Please run the download.sh script as followings.

bash scripts/download_ucf101.sh

It retrieves the UCF101 dataset from the official server and extract all the videos contained in it under datasets/ucf101 directory which is automatically created by the script.

datasets
└── ucf101
    ├── ucfTrainTestlist
    │   ├── classInd.txt
    │   ├── testlist01.txt
    │   ├── testlist02.txt
    │   ├── testlist03.txt
    │   ├── trainlist01.txt
    │   ├── trainlist02.txt
    │   └── trainlist03.txt
    ├── v_ApplyEyeMakeup_g01_c01.avi
    ├── v_ApplyEyeMakeup_g01_c02.avi
    ├── v_ApplyEyeMakeup_g01_c03.avi
    ⋮
    ├── v_YoYo_g25_c03.avi
    ├── v_YoYo_g25_c04.avi
    └── v_YoYo_g25_c05.avi

Then, please run the make_ucf101.py script to apply preprocessings:

docker run --rm -v $PWD:/tgan2 -t tgan2 \
bash -c "cd /tgan2 && python3 scripts/make_ucf101.py --img-rows 192 --img-cols 256"

FaceForensics

To download the FaceForensics (v1) dataset, please fill out this form: FaceForensics, FaceForensics++, and DeepFakes Detection Dataset. Once the request is accepted, you should be able to get the download script named faceforensics_download_v1.py by following the instruction given in an e-mail from the distributor, so please place it under the scripts dir.

(You can also find more details in this official README of the FaceForensics dataset.)

Then, please run the following commands to download the FaceForensics dataset.

python scripts/faceforensics_download_v1.py -d compressed datasets

It creates FaceForensics_compressed directory automatically under the datasets dir:

datasets
├── FaceForensics_compressed
│   ├── test
│   │   ├── altered
│   │   │   ├── 0r4uhJdcIQA_1_cpywXpZVP6o_6.avi
│   │   │   ├── 1aJO2VkfZiY_2_EMLALfhSftA_0.avi
│   │   │   ⋮
│   │   ├── mask
│   │   │   ├── 0r4uhJdcIQA_1_cpywXpZVP6o_6.avi
│   │   │   ├── 1aJO2VkfZiY_2_EMLALfhSftA_0.avi
│   │   │   ⋮
│   │   └── original
│   │       ├── 0r4uhJdcIQA_1_cpywXpZVP6o_6.avi
│   │       ├── 1aJO2VkfZiY_2_EMLALfhSftA_0.avi
│   │       ⋮
│   ├── train
│   │   ├── altered
│   │   │   ├── 00JT2rwquIE_1_10r7mRtkE0s_0.avi
│   │   │   ├── 03pM6Vgv8-g_3_Bpa3wwb5Dt8_1.avi
│   │   │   ⋮
│   │   ├── mask
│   │   │   ├── 00JT2rwquIE_1_10r7mRtkE0s_0.avi
│   │   │   ├── 03pM6Vgv8-g_3_Bpa3wwb5Dt8_1.avi
│   │   │   ⋮
│   │   └── original
│   │       ├── 00JT2rwquIE_1_10r7mRtkE0s_0.avi
│   │       ├── 03pM6Vgv8-g_3_Bpa3wwb5Dt8_1.avi
│   │       ⋮
│   └── val
│       ├── altered
│       │   ├── 093EdKDP7Zs_5_8-n1eJ93hLo_1.avi
│       │   ├── 0GvV_83Wlf8_2_aqHwzt9uceI_0.avi
│       │   ⋮
│       ├── mask
│       │   ├── 093EdKDP7Zs_5_8-n1eJ93hLo_1.avi
│       │   ├── 0GvV_83Wlf8_2_aqHwzt9uceI_0.avi
│       │   ⋮
│       └── original
│           ├── 093EdKDP7Zs_5_8-n1eJ93hLo_1.avi
│           ├── 0GvV_83Wlf8_2_aqHwzt9uceI_0.avi
│           ⋮
├── faceforensics_download_v1.py
⋮

Finally, please run the following command to create .h5 files from those downloaded videos:

docker run --rm -v $PWD:/tgan2 -t tgan2 \
bash -c "cd /tgan2 && python3 scripts/make_face_forensics.py"

It generates some .h5 files and .json files under datasets/face256px directory like this:

datasets
├── face256px
│   ├── test.h5
│   ├── test.json
│   ├── train.h5
│   ├── train.json
│   ├── val.h5
│   └── val.json
⋮

Pre-trained weights

Please download a pre-trained weights for C3D model that is used for calculating the Inception scores.

cd datasets
if [ ! -d models ]; then mkdir -p models; fi
wget https://github.com/rezoo/tgan2/releases/download/v1.0/mean2.npz
wget https://github.com/rezoo/tgan2/releases/download/v1.0/conv3d_deepnetA_ucf.npz

Experiments

Train TGANv2 on the UCF101 dataset

docker run --rm -v $PWD:/tgan2 -t tgan2 \
bash -c "cd /tgan2 && \
mpiexec -n 4 \
python3 train.py \
conf/dirac.yml \
conf/dset/ucf101_192x256.yml \
conf/gen/tgan_multi_192x256.yml \
conf/dis/resnet_multi.yml \
-a \
'out=results/full-bs-128' \
'batchsize=32' \
'gen.args.n_layers=1' \
'gen.args.clstm_ch=1024' \
'updater.args.n_dis=1' \
'updater.args.lam=0.5' \
'linear_decay.start=0' \
'gen.args.subsample_frame=[2,2,2]' \
'dataset.args.subsample_frame=[2,2,2]' \
'gen.args.subsample_batch=false' \
'updater.args.subsample_batch=false'"

Train TGANv2 on the FaceForensics dataset

docker run --rm -v $PWD:/tgan2 -t tgan2 \
bash -c "cd /tgan2 && \
mpiexec -n 8 \
python3 train.py \
conf/dirac.yml \
conf/dset/face256px.yml \
conf/gen/tgan_multi_192x256.yml \
conf/dis/resnet_multi.yml \
-a \
'out=results/full-bs-128_faceforensics3' \
'batchsize=16' \
'dataset.args.subsample_frame=[2,2,2]' \
'gen.args.n_layers=1' \
'gen.args.clstm_ch=1024' \
'gen.args.subsample_batch=false' \
'gen.args.subsample_frame=[2,2,2]' \
'gen.args.t_size=4' \
'updater.args.n_dis=1' \
'updater.args.lam=0.5' \
'updater.args.subsample_batch=false' \
'linear_decay.start=0'"

Evaluation

The inception score on UCF101

docker run --rm -v $PWD:/tgan2 -t tgan2 \
bash -c "cd /tgan2 && \
python3 scripts/compute_inception_score.py \
results/full-bs-128/config.yml \
-m results/full-bs-128/generator_iter_82000.npz \
-o results/full-bs-128/result.yml"

The inception score will be saved in the path given as the -o option.

FID scores

  1. Calculate statistics of the UCF101 dataset
docker run --rm -v $PWD:/tgan2 -t tgan2 \
bash -c "cd /tgan2 && \
python3 scripts/compute_fid.py \
--ucf101-h5path-train datasets/ucf101_192x256/train.h5 \
--ucf101-config-train datasets/ucf101_192x256/train.json \
--c3d-pretrained-model datasets/models/conv3d_deepnetA_ucf.npz \
--stat-output datasets/ucf101_192x256/ucf101_192px_stat.npz"

This command saves the resulting statistics of UCF101 (192x192 resolution) to the path given by --stat-output option.

  1. Calculate the FID score of a given trained model
docker run --rm -v $PWD:/tgan2 -t tgan2 \
bash -c "cd /tgan2 && \
python3 scripts/compute_fid.py \
--stat-filename datasets/ucf101_192x256/ucf101_192x256_stat.npz \
--config results/full-bs-128/config.yml \
--gen-snapshot results/full-bs-128/generator_iter_82000.npz \
--n-samples 2048 \
--batchsize 32 \
--result-dir results/full-bs-128"

--batchsize 32 comsumes about 13245MiB of the GPU memory, while --batchsize 64 requires about 24261MiB and --batchsize 80 requries about 32171MiB. If you run this on a 32 GB V100 GPU, choose 64 or 80 for the batchsize to make the calculation faster.

Results

Model UCF101
Conv3D (gen: conv3d, dis: conv3d) 9.53
Conv-ResNet (gen: conv3d, dis: resnet) 9.74
ResNet3D (gen: resnet, dis: resnet) 9.62
TGAN (v1) 11.85 +- 0.07
TGANv2 (gen: convLSTM+MultiResNet dis: MultiResNet) 25.07 +- 0.47

The FID score of TGANv2 on UCF101 is 3457.42 +- 26.81.

Citation

Please cite this paper when you use the code in this repository:

@journal{TGAN2020,
    author = {Saito, Masaki and Saito, Shunta and Koyama, Masanori and Kobayashi, Sosuke},
    title = {Train Sparsely, Generate Densely: Memory-efficient Unsupervised Training of High-resolution Temporal GAN},
    booktitle = {International Journal of Computer Vision},
    year = {2020},
    month = may,
    doi = {10.1007/s11263-020-01333-y},
    url = {https://doi.org/10.1007/s11263-020-01333-y},
}

You can find our paper also on arxiv.

tgan2's People

Contributors

mitmul avatar rezoo avatar

Stargazers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar

Watchers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar

tgan2's Issues

link to pre-trained weights for the c3d model do not work

Hi,

Great Work,

The readme provides the following links for the C3D model that currently do not work.

wget https://github.com/mitmul/tgan2/releases/download/untagged-d2d4900c6c1bb09d9f5a/conv3d_deepnetA_ucf.npz

wget https://github.com/mitmul/tgan2/releases/download/untagged-d2d4900c6c1bb09d9f5a/mean2.npz

replicating conditional UCF101 results

Hi, Great work,

The current code base does not appear to support the replication of the conditional UCF101 results..

I have successfully replicated the unconditional UCF-101 results. I modified the generator to support a 1-hot label concatenated with the latent vector for the conv-lstm input and additional conditioning via conditional batch normalization as detailed in the paper but cannot replicate any of the conditional UCF-101 results.

Could you provide more details on how to replicate the UCF-101 conditional results

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.