ha0tang / gesturegan

[ACM MM 2018 Oral] GestureGAN for Hand Gesture-to-Gesture Translation in the Wild

Home Page: http://disi.unitn.it/~hao.tang/project/GestureGAN.html

License: Other

MATLAB 9.70% Lua 6.01% Shell 2.04% Python 82.25%

pytorch acmmm2018 computer-vision gans generative-model generative-adversarial-network deep-learning image-generation image-translation image-manipulation

gesturegan's Introduction

👯 We are looking for self-motivated researchers to join/visit our group.

Hao Tang

[Homepage] [Google Scholar] [Twitter]

I am currently a postdoctoral researcher at the Computer Vision Lab, ETH Zurich, Switzerland.

⚡ News

We released the code of XingVTON and CIT for virtual try-on, the code of TransDA for source-free domain adaptation using Transformer, the code of IEPGAN for 3D pose transfer, the code of TransDepth for monocular depth prediction using Transformer, the code of GLANet for unpaired image-to-image translation, and the code of MHFormer for 3D human pose estimation.

🌱 My Repositories

3D-Aware Image/Video Generation

3D-SGAN (ECCV 2022)

3D Human Pose Estimation

MHFormer (CVPR 2022)

Text-to-Image Synthesis

DF-GAN (CVPR 2022 Oral)
PPE (CVPR 2022)

3D Object Generation

CGT (AAAI 2022)
IEPGAN (ICCV 2021)
AniFormer (BMVC 2021)

Monocular Depth Prediction

TransDepth (ICCV 2021)
StructuredAttention (CVPR 2018 Spotlight)

Face Anonymisation

AnonyGAN (ICIAP 2021)

Person Image Generation

XingGAN (ECCV 2020)
BiGraphGAN (BMVC 2020 Oral)
C2GAN (ACM MM 2019 Oral)
GestureGAN (ACM MM 2018 Oral & Best Paper Candidate)

Scene Image Generation

LGGAN (CVPR 2020)
DAGAN (ACM MM 2020)
DPGAN (TIP 2021)
SelectionGAN (CVPR 2019 Oral)
CrossMLP (BMVC 2021 Oral)
EdgeGAN
PanoGAN (TMM 2022)

Unsupervised Image Translation

GLANet
AttentionGAN (IJCNN 2019 Oral)
GazeAnimation (ACM MM 2020)
AsymmetricGAN (ACCV 2018 Oral)

Deep Dictionary Learning

DDLCN (WACV 2019 Oral)

Virtual Try-On

HCANet
CIT

Hand Gesture Recognition

HandGestureRecognition (Neurocomputing 2019)

Source-Free Domain Adaptation

TransDA

gesturegan's Issues

Not able to download the pretrained model for CVUSA

Hi,

First of all, great work.

I am not able to download any pre-trained model except ntu_gesturegan_twocycle_pretrained.tar.gz
Could you please provide an updated link for the other models?

Is there any way I can use the pretrained model to fine-tune it?

I'm trying to use the pre-trained model in my training process with my own dataset. It keeps throwing the error: No file found latest_net_D1.pth. I only used the flag --continue-training.
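One way to narrow down this error is to check which weight files the checkpoint folder actually contains before resuming. The sketch below assumes the `<epoch>_net_<name>.pth` naming suggested by the error message; the network names `G`, `D1`, `D2`, the helper `missing_checkpoints`, and the folder layout are assumptions for illustration, not confirmed details of this repository.

```python
import os
import tempfile


def missing_checkpoints(checkpoint_dir, epoch="latest", net_names=("G", "D1", "D2")):
    """Return the expected weight files that are absent from checkpoint_dir.

    The network names are hypothetical; adjust them to whatever the
    training script reports when it tries to load a model.
    """
    expected = [f"{epoch}_net_{name}.pth" for name in net_names]
    if os.path.isdir(checkpoint_dir):
        present = set(os.listdir(checkpoint_dir))
    else:
        present = set()
    return [name for name in expected if name not in present]


# Demo on a throwaway folder that holds only a generator file.
with tempfile.TemporaryDirectory() as demo_dir:
    open(os.path.join(demo_dir, "latest_net_G.pth"), "wb").close()
    demo_missing = missing_checkpoints(demo_dir)
    print(demo_missing)
```

If a downloaded archive turns out to contain only generator weights, resuming training may fail on the discriminator files in just this way, since there would be nothing to load for them.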

Can't download NTU hand digit dataset

The download link in the paper is broken. Do you know where else it can be downloaded?

L2 loss channel-wise missing?

Hello!

I've been reading the GestureGAN paper and how calculating the L2 loss in the generator channel-wise avoids "channel pollution". However, I cannot find it implemented in the code. Only the L1 loss is calculated channel-wise, which the paper states is not necessary. Is this an error in the code, and was the channel-wise L1 loss meant to be an L2 loss?

Thanks!
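For readers following this question, a channel-wise L2 loss could look roughly like the sketch below: the squared error is averaged separately for each color channel and the per-channel values are then summed. This is only an illustration of the idea raised in the issue, not the paper's exact formulation or the repository's code; the function name `channelwise_l2` and the `[C][H][W]` nested-list layout are assumptions, and plain Python is used instead of PyTorch so the example runs without dependencies.

```python
def channelwise_l2(fake, real):
    """Sum of per-channel mean squared errors.

    Both images are nested lists shaped [C][H][W] (hypothetical layout).
    Returns (total, per_channel) so each channel's share stays visible.
    """
    per_channel = []
    for fake_c, real_c in zip(fake, real):
        squared = [
            (f - r) ** 2
            for fake_row, real_row in zip(fake_c, real_c)
            for f, r in zip(fake_row, real_row)
        ]
        per_channel.append(sum(squared) / len(squared))
    return sum(per_channel), per_channel


# Tiny 3-channel, 1x2 example in which only the first channel differs.
fake_img = [[[1.0, 1.0]], [[0.0, 0.0]], [[0.5, 0.5]]]
real_img = [[[0.0, 0.0]], [[0.0, 0.0]], [[0.5, 0.5]]]
total_loss, channel_losses = channelwise_l2(fake_img, real_img)
print(total_loss, channel_losses)
```

Keeping the per-channel values separate makes it easy to see which channel contributes the error before they are combined.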

Hope the code can be released soon

Filter failure cases manually from train set?

I am interested in understanding how the gestureGAN_twocycle model works. I downloaded the senz3d dataset and prepared the training and test data as indicated. When I looked at the training output, I noticed that the pose is wrongly identified for many images. Did you manually delete them from the 135,504 training samples? Could you please provide a .txt file with the filenames of the correct samples?
Thanks and best regards,
jysa01

How/Where is the skeleton image embedding implemented?

Hello @Ha0Tang,
I read through the GestureGAN paper and noticed under Experimental setup that the skeleton images are embedded by passing them through an encoder. I cannot identify where this has been implemented in the code.
The dataloader segregates the incoming input image into four 256x256 blocks: RealA, RealB, RealC, RealD. These are then fed to the model using the set_input function. Am I missing something here? Kindly help.

Thanks,
jysa01
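The four-way split described in this issue can be pictured with the sketch below. It assumes the four 256x256 blocks sit side by side in a single 1024x256 image, which the issue does not state explicitly; the helper `split_boxes` is hypothetical and is not the repository's dataloader. The boxes use the (left, upper, right, lower) order that Pillow's `Image.crop` expects.

```python
def split_boxes(width, height, tile=256):
    """Return (left, upper, right, lower) crop boxes for tiles laid out left to right.

    Assumes one row of square tiles; raises if the image size does not fit.
    """
    if height != tile or width % tile != 0:
        raise ValueError("expected a single row of %dx%d tiles" % (tile, tile))
    return [(i * tile, 0, (i + 1) * tile, tile) for i in range(width // tile)]


# Map the block names used in the issue onto the assumed layout.
block_names = ["RealA", "RealB", "RealC", "RealD"]
boxes = dict(zip(block_names, split_boxes(1024, 256)))
for name in block_names:
    print(name, boxes[name])
```

Under this assumed layout, each block is simply a crop of the combined image, so any embedding of the skeleton blocks would have to happen later, inside the networks that receive them.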

When will the code be released?

Hi, I'd like to know when this code will be available.