lucidrains / pi-gan-pytorch

Implementation of π-GAN, for 3d-aware image synthesis, in Pytorch

License: MIT License

Python 100.00%

artificial-intelligence deep-learning generative-adversarial-network nerf film

pi-gan-pytorch's Introduction

π-GAN - Pytorch (wip)

Implementation of π-GAN, for 3d-aware image synthesis, in Pytorch.

Project video from authors

Install

$ pip install pi-gan-pytorch

Usage

from pi_gan_pytorch import piGAN, Trainer

gan = piGAN(
    image_size = 128,
    dim = 512
).cuda()

trainer = Trainer(
    gan = gan,
    folder = '/path/to/images'
)

trainer()

Citations

@misc{chan2020pigan,
    title={pi-GAN: Periodic Implicit Generative Adversarial Networks for 3D-Aware Image Synthesis}, 
    author={Eric R. Chan and Marco Monteiro and Petr Kellnhofer and Jiajun Wu and Gordon Wetzstein},
    year={2020},
    eprint={2012.00926},
    archivePrefix={arXiv},
    primaryClass={cs.CV}
}

pi-gan-pytorch's Issues

3D coordinates missing

Hi,

Thank you for the implementation.
I noticed that you use pixel coordinates (coors in the Generator module), whereas the paper uses 3D coordinates in space, x = (x, y, z).
Is there a reason for this difference?
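
For readers unfamiliar with the distinction, here is a minimal NumPy sketch of how 3D sample points along a camera ray are typically formed in NeRF-style renderers: each point is the ray origin plus a depth times the ray direction. The function name `points_along_ray` and the near/far values are illustrative assumptions and are not taken from this repository.

```python
import numpy as np

def points_along_ray(origin, direction, near, far, num_samples):
    """Return (num_samples, 3) points origin + t * direction for evenly spaced depths t."""
    direction = direction / np.linalg.norm(direction)  # make the direction unit length
    depths = np.linspace(near, far, num_samples)       # depths t in [near, far]
    return origin[None, :] + depths[:, None] * direction[None, :]

origin = np.array([0.0, 0.0, 0.0])
direction = np.array([0.0, 0.0, -2.0])  # looking down -z, not yet normalized
points = points_along_ray(origin, direction, near=1.0, far=3.0, num_samples=5)
print(points.shape)  # (5, 3)
```

Each row of `points` is an (x, y, z) location that an implicit network could be queried at, in contrast to a 2D pixel coordinate.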

Any update on completing this?

Seems to be missing the training code

Issues

My desktop machine broke in the middle of a pandemic, so this project will have to be on hold until the current COVID-19 wave is over :(

Mapping network

Hi, I was looking at your code (I'm also thinking of implementing pi-gan) and had questions about the mapping network.

What is the idea behind the EqualLinear module?
Why are you normalizing the latent vector before passing it through the network?

As far as I can tell neither of these steps is mentioned in the paper so if you have another source for those steps that would be really helpful. Thanks!
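
One plausible reading, offered as an assumption rather than the author's stated rationale: both steps resemble conventions from StyleGAN-family mapping networks, where the latent is L2-normalized and linear layers multiply their parameters by a learning-rate multiplier at call time. A minimal NumPy sketch of that pattern follows; `EqualLinearSketch` and `normalize_latent` are hypothetical stand-ins, not the repository's `EqualLinear`.

```python
import numpy as np

class EqualLinearSketch:
    """Hypothetical linear layer whose parameters are scaled by lr_mul on every call."""

    def __init__(self, in_dim, out_dim, lr_mul=0.1, seed=0):
        rng = np.random.default_rng(seed)
        self.weight = rng.standard_normal((out_dim, in_dim))  # stored unscaled
        self.bias = np.zeros(out_dim)
        self.lr_mul = lr_mul

    def __call__(self, x):
        # Scaling at call time shrinks the effective weights, which in turn
        # lowers the effective learning rate of this layer.
        return x @ (self.weight * self.lr_mul).T + self.bias * self.lr_mul

def normalize_latent(z, eps=1e-12):
    """Project each latent vector onto the unit sphere (L2 normalization)."""
    norms = np.linalg.norm(z, axis=1, keepdims=True)
    return z / np.maximum(norms, eps)

z = normalize_latent(np.random.default_rng(1).standard_normal((4, 8)))
out = EqualLinearSketch(8, 16)(z)
print(out.shape)  # (4, 16)
```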

Unlabeled data?

Hi,
in the paper, it's stated that the training happens on unlabeled data. My question, which I can't get my head around, is: how does that work?
Doesn't a NeRF model give a 3D scene representation? It does, but only for one scene, and it requires a fair amount of data with known poses. How does pi-GAN do this without labels and from different images? The Training Details (Section 3.4 of the paper) are not very thorough.
Thank you.
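
As a hedged illustration of the general idea behind 3D-aware GANs: instead of relying on a known pose for each training image, the generator renders every fake image from a camera pose drawn at random from an assumed prior, and the discriminator only judges whether the rendered image looks real. The sketch below samples such camera positions on a sphere; the Gaussian prior, its standard deviations, and the function name are illustrative assumptions, not values from the paper or this repository.

```python
import numpy as np

def sample_camera_origins(batch, yaw_std=0.3, pitch_std=0.15, radius=1.0, seed=None):
    """Sample camera positions on a sphere of the given radius, facing the origin."""
    rng = np.random.default_rng(seed)
    yaw = rng.normal(0.0, yaw_std, size=batch)      # left/right angle in radians
    pitch = rng.normal(0.0, pitch_std, size=batch)  # up/down angle in radians
    x = radius * np.cos(pitch) * np.sin(yaw)
    y = radius * np.sin(pitch)
    z = radius * np.cos(pitch) * np.cos(yaw)
    return np.stack([x, y, z], axis=1)              # shape (batch, 3)

origins = sample_camera_origins(batch=8, seed=0)
print(origins.shape)  # (8, 3)
```

No image in the dataset needs a pose label under this scheme, because the poses are only ever attached to generated images.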

RuntimeError

Hi guys, thanks for the pi-GAN,

I faced the following problem after the 40th epoch: "Stack expects each tensor to be equal size, but got [1, 32, 32] at entry 0 and [3, 32, 32] at entry 1". How can I fix that?

P.S. My bad. I found an image with 1 color channel in my dataset.
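
For anyone hitting the same error: a batch can only be stacked when every image has the same number of channels. A minimal NumPy sketch of one way to guard against a stray grayscale image is shown below; `to_three_channels` is a hypothetical helper, not part of this repository. When loading with Pillow, calling `Image.convert('RGB')` on each image achieves the same thing at load time.

```python
import numpy as np

def to_three_channels(image):
    """Return a (3, H, W) array from a (1, H, W) or (3, H, W) array."""
    if image.ndim != 3 or image.shape[0] not in (1, 3):
        raise ValueError(f"unexpected image shape {image.shape}")
    if image.shape[0] == 1:
        image = np.repeat(image, 3, axis=0)  # copy the gray channel into R, G and B
    return image

# One grayscale and one RGB image, mirroring the shapes in the error message.
batch = [np.zeros((1, 32, 32)), np.ones((3, 32, 32))]
stacked = np.stack([to_three_channels(img) for img in batch])
print(stacked.shape)  # (2, 3, 32, 32)
```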

Question in FilmSiren

Hello.

Your paper is very impressive. However, I have a question.

The original SIREN module is designed on theoretical grounds, with equations and statistical arguments, as below:

(equation image not available)

If the input "x" is modulated by gamma and beta, "var(x)" changes.
As a consequence, the CLT condition would no longer be fulfilled in the modulated SIREN (FilmSiren).

Could you briefly explain why FilmSiren still works well?

Thank you :)
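
To make the question concrete, the FiLM-modulated sine layer described in the pi-GAN paper computes sin(gamma * (W x + b) + beta). The NumPy sketch below is an illustration under that formula only; the function name and tensor shapes are assumptions and may differ from this repository's FilmSiren.

```python
import numpy as np

def film_siren_layer(x, weight, bias, gamma, beta):
    """Compute sin(gamma * (x @ W^T + b) + beta); gamma and beta have shape (batch, dim_out)."""
    return np.sin(gamma * (x @ weight.T + bias) + beta)

rng = np.random.default_rng(0)
x = rng.uniform(-1.0, 1.0, size=(4, 3))  # four 3D input points
weight = rng.standard_normal((16, 3))
bias = np.zeros(16)

# With gamma = 1 and beta = 0 the layer reduces to a plain sine layer.
plain = np.sin(x @ weight.T + bias)
unmodulated = film_siren_layer(x, weight, bias, np.ones((4, 16)), np.zeros((4, 16)))

# Any other gamma rescales the pre-activation before the sine is applied.
modulated = film_siren_layer(x, weight, bias, np.full((4, 16), 2.0), np.zeros((4, 16)))
```

Because gamma multiplies the pre-activation, its variance is scaled by gamma squared, which is the change in "var(x)" that the question refers to.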

training script

Hi :) I am new to NeRF and am having trouble understanding 'ray_direction'. Can you provide an example of this variable (or its shape)? An example training script would be a great help. Many thanks!
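
As a rough illustration of what a ray-direction tensor usually looks like in NeRF-style code, the sketch below builds one unit direction per pixel for a pinhole camera looking down the negative z axis. The function name, the field of view, and the (image_size * image_size, 3) layout are assumptions; the shape this repository actually expects is not confirmed here.

```python
import numpy as np

def pinhole_ray_directions(image_size, fov_degrees=30.0):
    """Return (image_size * image_size, 3) unit ray directions for a camera looking down -z."""
    half = np.tan(np.radians(fov_degrees) / 2.0)  # half-width of the image plane at z = -1
    coords = np.linspace(-half, half, image_size)
    xs, ys = np.meshgrid(coords, -coords)         # negate y so the first row is the top
    dirs = np.stack([xs, ys, -np.ones_like(xs)], axis=-1)
    dirs = dirs / np.linalg.norm(dirs, axis=-1, keepdims=True)  # normalize each ray
    return dirs.reshape(-1, 3)

ray_directions = pinhole_ray_directions(image_size=5)
print(ray_directions.shape)  # (25, 3)
```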

A bug in mapping network structure?

Hi, thanks for sharing the implementation of pi-gan.

I quickly went through the network code and found what might be a bug in the mapping network:

pi-GAN-pytorch/pi_gan_pytorch/pi_gan_pytorch.py

Line 130 in 0067bff

return self.to_gamma(x), self.to_beta(x)

In my understanding, the gamma and beta for each SIREN block should be different. In the original StyleGAN implementation, the gamma and beta for each AdaIN block are obtained via separate fully connected layers. In the implementation here, however, gamma and beta appear to be the same for all layers.

The original pi-GAN paper does not describe this part clearly; it only says that gamma and beta come from the mapping network. I wonder whether this difference from StyleGAN is a bug or not.
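
If per-layer modulation is indeed what is intended (the issue leaves this open), one way to get it is to give every SIREN layer its own pair of projection matrices from the shared latent. The NumPy sketch below illustrates that layout; `PerLayerFiLMHeads` and its dimensions are hypothetical and are not the repository's code.

```python
import numpy as np

class PerLayerFiLMHeads:
    """Hypothetical heads mapping one latent w to a separate (gamma, beta) per SIREN layer."""

    def __init__(self, dim_latent, dim_hidden, num_layers, seed=0):
        rng = np.random.default_rng(seed)
        scale = 1.0 / np.sqrt(dim_latent)
        shape = (num_layers, dim_hidden, dim_latent)
        self.gamma_weights = rng.standard_normal(shape) * scale
        self.beta_weights = rng.standard_normal(shape) * scale

    def __call__(self, w):
        # einsum applies each layer's own matrix to the shared latent w,
        # giving outputs of shape (num_layers, batch, dim_hidden).
        gammas = np.einsum('lhd,bd->lbh', self.gamma_weights, w)
        betas = np.einsum('lhd,bd->lbh', self.beta_weights, w)
        return gammas, betas

heads = PerLayerFiLMHeads(dim_latent=8, dim_hidden=16, num_layers=3)
w = np.random.default_rng(1).standard_normal((4, 8))
gammas, betas = heads(w)
print(gammas.shape)  # (3, 4, 16)
```

Layer i would then be modulated with `gammas[i]` and `betas[i]` rather than with a single shared pair.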