scaling-up-stylegan2's Issues

Attention Layer

Hey l4rz,
thanks for the extensive research on this topic.

Have you considered adding attention layer(s) instead of increasing capacity to achieve higher quality?
For example, lucidrains (https://github.com/lucidrains/stylegan2-pytorch) claims this greatly improves results.

However, adding attention to the XXL model will probably yield OOMs. It would be interesting to see which improves results more: more convolutional filters or attention.
We also know that on harder tasks (more classes and poses), e.g. ImageNet, StyleGANs fail at modeling larger structures.
We also know that BigGAN, which employs attention at the 64² scale (correct me if I'm wrong), does a far better job here.

I would just like to know what your experiences and thoughts are here.
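
For reference, the kind of self-attention block I have in mind is roughly the following (a SAGAN-style sketch in PyTorch, not lucidrains' exact code). The B x HW x HW attention map is the memory hog, which is why I expect OOMs at higher resolutions:

# SAGAN-style self-attention block (a sketch, not lucidrains' exact implementation).
# At 64x64 the attention map is already 4096 x 4096 per sample.
import torch
import torch.nn as nn

class SelfAttention(nn.Module):
    def __init__(self, channels):
        super().__init__()
        self.query = nn.Conv2d(channels, channels // 8, 1)
        self.key = nn.Conv2d(channels, channels // 8, 1)
        self.value = nn.Conv2d(channels, channels, 1)
        self.gamma = nn.Parameter(torch.zeros(1))  # starts as an identity mapping

    def forward(self, x):
        b, c, h, w = x.shape
        q = self.query(x).view(b, -1, h * w).permute(0, 2, 1)  # B x HW x C//8
        k = self.key(x).view(b, -1, h * w)                     # B x C//8 x HW
        attn = torch.softmax(torch.bmm(q, k), dim=-1)          # B x HW x HW
        v = self.value(x).view(b, -1, h * w)                   # B x C x HW
        out = torch.bmm(v, attn.permute(0, 2, 1)).view(b, c, h, w)
        return self.gamma * out + x                            # residual connection

# e.g. inserted once at the 64x64 feature maps, as BigGAN does:
# x = SelfAttention(512)(x)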

Too many artifacts on generated images - will try your recommendation

This is my original StyleGAN2-ADA setup; results remain poor even at 24,000 kimg (24 million images shown to the network).

  • 47,000 total images (256x256) with 19 conditional classes
  • 4x V100, batch size 64
  • RAM/CPU more than sufficient
  • Ubuntu 18 / PyTorch 1.8 / CUDA 11.1
  • Used paper256, R1 gamma = 10, mit-han-lab augmentation with all three policies (see the sketch after this list)
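
By "all three" I mean the color, translation and cutout policies from mit-han-lab/data-efficient-gans (DiffAugment). A minimal sketch of how it is applied, with both real and generated images augmented before the discriminator sees them:

# Applying DiffAugment with all three policies; run_D is an illustrative helper.
from DiffAugment_pytorch import DiffAugment

policy = 'color,translation,cutout'

def run_D(D, img, c):
    # c: conditioning labels (19 classes in this dataset)
    return D(DiffAugment(img, policy=policy), c)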

REAL
[image attachment: real training samples]

FAKE
[image attachment: generated samples (fakes000000)]

Not good AT ALL!
I am now trying your setup; any further hints for this dataset are welcome. My planned three-stage schedule (chained as sketched below):
'paper256_STEVE_START': dict(ref_gpus=8, kimg=10000, mb=64, mbstd=8, fmaps=0.5, lrate=2e-3, gamma=10, ema=20, ramp=None, map=8),
'paper256_STEVE_MID_10000': dict(ref_gpus=8, kimg=10000, mb=64, mbstd=8, fmaps=0.5, lrate=2e-4, gamma=10, ema=20, ramp=None, map=8),
'paper256_STEVE_END_20000': dict(ref_gpus=8, kimg=50000, mb=64, mbstd=8, fmaps=0.5, lrate=1e-4, gamma=5, ema=20, ramp=None, map=8),
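
The intent is to chain these stages, each run resuming from the previous stage's last snapshot. Roughly, with hypothetical helper names (launch_run and latest_snapshot are illustrative wrappers around train.py, not real APIs):

# Hypothetical driver for the three-stage schedule above.
stages = ['paper256_STEVE_START', 'paper256_STEVE_MID_10000', 'paper256_STEVE_END_20000']
resume = None
for cfg in stages:
    run_dir = launch_run(cfg=cfg, resume_pkl=resume)  # wraps train.py --cfg=... --resume=...
    resume = latest_snapshot(run_dir)                 # e.g. network-snapshot-*.pkl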

I am also setting D_lrate = 2x G_lrate:
args.G_opt_kwargs = dnnlib.EasyDict(class_name='torch.optim.Adam', lr=spec.lrate, betas=[0,0.99], eps=1e-8)    # generator LR
args.D_opt_kwargs = dnnlib.EasyDict(class_name='torch.optim.Adam', lr=spec.lrate*2, betas=[0,0.99], eps=1e-8)  # discriminator LR = 2x generator (TTUR-style)

So at each kimg milestone I will try the settings above. My dataset seems somewhat CIFAR-like, so I am also disabling style mixing and path length (pl_weight) regularization. Should I?
args.loss_kwargs.pl_weight = 0 # disable path length regularization
args.loss_kwargs.style_mixing_prob = 0 # disable style mixing
args.D_kwargs.architecture = 'orig' # disable residual skip connections

I am not re-training from scratch but resuming from the last pre-trained snapshot (which used style mixing, gamma=10, etc.). Is that a problem, or should I re-train completely?
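
From what I can tell, resuming in stylegan2-ada-pytorch only copies the network weights from the pickle; optimizer state and the old loss settings (gamma, style mixing probability, etc.) are not restored, so the new config takes effect immediately. The pattern in training_loop.py looks roughly like:

# Paraphrased resume pattern from stylegan2-ada-pytorch's training_loop.py:
# only parameters/buffers are copied, not optimizer state or loss settings.
import dnnlib
import legacy
from torch_utils import misc

if resume_pkl is not None:
    with dnnlib.util.open_url(resume_pkl) as f:
        resume_data = legacy.load_network_pkl(f)
    for name, module in [('G', G), ('D', D), ('G_ema', G_ema)]:
        misc.copy_params_and_buffers(resume_data[name], module, require_all=False)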

Thanks, mate, any help is appreciated.
Steve

Great article, would love to see if you could try a diffusion model?

Great articles, from "Practical aspects of StyleGAN2 training" to "Scaling up StyleGAN2"; a lot of the insights are really helpful.
Just wondering if you could try something like a diffusion model, or another state-of-the-art model, for better synthetic image quality?

Thanks.
