Do you have any suggestion on how to implement this model for 2D infinite generation? (alis issue, open, 6 comments)

universome commented on August 24, 2024

Do you have any suggestion on how to implement this model for 2D infinite generation?

from alis.

Comments (6)

universome commented on August 24, 2024

I agree that the code is not very transparent and might take a lot of time to get one's head around. I went through the places that I think are the main ones to change, and these are the ones to think about:

The GridInput class. It looks like you can remove input_column altogether: for 2D generation one would need to generate vertical patches independently too, so input_column would become an "input_pixel", which makes it quite useless.
The patchwise_op function, which is hard-coded for vertical patches only.
(The most difficult one) The fast_bilinear_mult_row function, which is hard-coded to interpolate only in 1D.
Right now, we feed the central latent code w and its context ws_context in a very inconvenient way, as separate arguments everywhere (I did it to preserve backward compatibility with StyleGAN2). Maybe it's worth dropping this and having a single w_grid argument of shape [batch_size, grid_h, grid_w, num_ws, w_dim] (or something like this), but I am not sure about that. The problem with the current variant is that it is quite annoying to arrange w and ws_context into a grid every time you need to interpolate.
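
To make the single w_grid idea from the last point concrete, here is a minimal sketch in plain Python (nested lists instead of tensors, for readability). The helper names to_w_grid and from_w_grid are hypothetical, and the assumed layout of ws_context as [batch_size, 2, num_ws, w_dim] with the left neighbour first and the right neighbour second is an assumption for illustration, not something checked against the repository.

```python
def to_w_grid(w, ws_context):
    """Pack separate latents into one grid (hypothetical helper).

    w:          [batch][num_ws][w_dim], the central latent code
    ws_context: [batch][2][num_ws][w_dim], ASSUMED order (left, right)
    returns:    [batch][grid_h=1][grid_w=3][num_ws][w_dim]
    """
    return [[[ctx[0], w_b, ctx[1]]] for w_b, ctx in zip(w, ws_context)]


def from_w_grid(w_grid):
    """Inverse of to_w_grid for the current 1x3 layout (hypothetical helper)."""
    w = [sample[0][1] for sample in w_grid]
    ws_context = [[sample[0][0], sample[0][2]] for sample in w_grid]
    return w, ws_context
```

With such a pair of helpers, the grid could be built once at the entry point and the old (w, ws_context) pair recovered wherever StyleGAN2-compatible code still expects it. A 2D variant would use grid_h=3 as well.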

Also, it might be useful to think what tricks can be borrowed from the recent Alias-Free GAN which is also a coordinate-based GAN model.

Feel completely free to ask any further questions if you have any!

universome commented on August 24, 2024

Hi! In my head, it was supposed to look as follows. You sample 9 anchors and assemble them into a 3x3 grid (right now, 3 anchors are sampled and assembled into a 1x3 grid). This grid is projected onto the coordinate plane so that its lower-left point is at (-d, -d) and its upper-right point is at (d, d). Here, d is a hyperparameter denoting the distance between anchors; in the paper we used d=2 (assuming that a frame has a size of 1x1 units).

Now, you randomly sample a square frame on this grid, render it, and pass it to the discriminator. For each coordinate location, you interpolate the style codes from the above 3x3 anchor grid.
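
The procedure above can be sketched in plain Python as follows. This is only an illustration under stated assumptions: the function names are hypothetical, the anchors are stand-in random vectors rather than real style codes, bilinear interpolation is assumed as the interpolation scheme, and a real implementation would be vectorized on the GPU.

```python
import random


def make_anchor_grid(w_dim, rng):
    """Sample a 3x3 grid of stand-in style codes (random Gaussian vectors)."""
    return [[[rng.gauss(0.0, 1.0) for _ in range(w_dim)]
             for _ in range(3)] for _ in range(3)]


def interpolate_style(anchors, x, y, d=2.0):
    """Bilinearly interpolate a style code at (x, y) inside [-d, d]^2.

    anchors[i][j] is assumed to sit at (x, y) = ((j - 1) * d, (i - 1) * d),
    so anchors[0][0] is the lower-left point (-d, -d).
    """
    gx, gy = (x + d) / d, (y + d) / d      # continuous grid indices in [0, 2]
    j0 = max(0, min(int(gx), 1))           # index of the cell's left column
    i0 = max(0, min(int(gy), 1))           # index of the cell's bottom row
    tx, ty = gx - j0, gy - i0              # position inside the cell, in [0, 1]
    w00, w01 = anchors[i0][j0], anchors[i0][j0 + 1]
    w10, w11 = anchors[i0 + 1][j0], anchors[i0 + 1][j0 + 1]
    return [
        (1 - ty) * ((1 - tx) * a + tx * b) + ty * ((1 - tx) * c + tx * e)
        for a, b, c, e in zip(w00, w01, w10, w11)
    ]


def sample_frame_origin(rng, d=2.0, frame_size=1.0):
    """Sample the lower-left corner of a square frame that fits in the grid."""
    hi = d - frame_size
    return rng.uniform(-d, hi), rng.uniform(-d, hi)


def frame_styles(anchors, x0, y0, resolution, d=2.0, frame_size=1.0):
    """Return a resolution x resolution grid of per-pixel style codes."""
    step = frame_size / resolution
    return [[interpolate_style(anchors,
                               x0 + (col + 0.5) * step,   # pixel-centre x
                               y0 + (row + 0.5) * step,   # pixel-centre y
                               d)
             for col in range(resolution)]
            for row in range(resolution)]
```

For example, with rng = random.Random(0) and anchors = make_anchor_grid(8, rng), calling frame_styles(anchors, *sample_frame_origin(rng), resolution=4) gives a 4x4 grid of 8-dimensional codes that the generator would consume per pixel.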

Note that without a specialized CUDA kernel for the fused interpolate+multiply operation, training will be quite slow (roughly 2x slower) compared to the current 1D implementation.

ivanlen commented on August 24, 2024

What you say makes complete sense. Thanks!
I am still reading the code, comparing it with StyleGAN2 and with your manuscript, and trying to understand some lines before starting the implementation. It's a big piece of code and it is taking me a while, but I am advancing slowly.
As soon as I understand the important lines, I will start to code and run some tests.

By the way, congrats on the manuscript; it is super interesting and very well written, and I enjoyed it a lot.

I will try not to spam here, but if I have questions about the implementation I'll come back.

Cheers!

ivanlen commented on August 24, 2024

Hi @universome, thank you very much for your hints.
I have been quite busy with other things, but over the next few days I will go through the code together with your suggestions and see if I can make progress on the 2D implementation.

Thank you again for your answer; I think it was what I needed to move forward. I will also check out Alias-Free GAN.
If I have further questions during the implementation, I will definitely let you know.
Cheers!

zengxianyu commented on August 24, 2024

Have you made any progress on the 2D version? I'm also interested

ivanlen commented on August 24, 2024

> Have you made any progress on the 2D version? I'm also interested

Not really. I have some draft notebooks in which I was testing things, but they are still very far from something that can be shared or turned into a PR.

I don't have much free time lately... I hope to be able to continue this soon, but I don't know when I will be able to.
