l4rz / scaling-up-stylegan2
Achieving photorealistic quality by scaling up StyleGAN2
Hey l4rz,
thanks for the extensive research on this topic.
Have you considered adding attention layer(s) instead of increasing the capacity to achieve higher quality?
E.g. lucidrains (https://github.com/lucidrains/stylegan2-pytorch) claims this greatly improves results.
However, adding attention to the XXL model would probably yield OOMs. It would be interesting to see which yields the greater benefit: more convolutional filters or attention.
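For reference, the kind of block meant here is roughly the following (a minimal SAGAN-style self-attention sketch, not taken from either repo; lucidrains' version uses an efficient linear-attention variant, if I recall correctly):

import torch
import torch.nn as nn

class SelfAttention(nn.Module):
    # SAGAN-style self-attention over feature maps (Zhang et al., 2018)
    def __init__(self, channels):
        super().__init__()
        self.query = nn.Conv2d(channels, channels // 8, kernel_size=1)
        self.key   = nn.Conv2d(channels, channels // 8, kernel_size=1)
        self.value = nn.Conv2d(channels, channels, kernel_size=1)
        self.gamma = nn.Parameter(torch.zeros(1))  # residual weight, learned from zero

    def forward(self, x):
        b, c, h, w = x.shape
        q = self.query(x).flatten(2).transpose(1, 2)  # (b, h*w, c//8)
        k = self.key(x).flatten(2)                    # (b, c//8, h*w)
        v = self.value(x).flatten(2)                  # (b, c, h*w)
        attn = torch.softmax(q @ k, dim=-1)           # (b, h*w, h*w) attention map
        out = (v @ attn.transpose(1, 2)).view(b, c, h, w)
        return self.gamma * out + x                   # residual connection

Note the (h*w) x (h*w) attention map: at 64^2 resolution that is a 4096x4096 matrix per sample, which is exactly where the OOMs on the XXL model would come from.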
We also know that for harder tasks (more classes and poses), e.g. ImageNet, StyleGANs fail at modeling larger structures, while BigGAN, which employs attention at the 64^2 scale (correct me if I'm wrong), does a far better job here.
I would just like to know what your experiences and thoughts are here.
This is my original setup with StyleGAN2-ADA. Even at 24,000 kimg (24 million images passed through the network), the results are not good AT ALL!
I am trying your setup now. Any further hints you can recommend for this dataset:
'paper256_STEVE_START': dict(ref_gpus=8, kimg=10000, mb=64, mbstd=8, fmaps=0.5, lrate=2e-3, gamma=10, ema=20, ramp=None, map=8),
'paper256_STEVE_MID_10000': dict(ref_gpus=8, kimg=10000, mb=64, mbstd=8, fmaps=0.5, lrate=2e-4, gamma=10, ema=20, ramp=None, map=8),
'paper256_STEVE_END_20000': dict(ref_gpus=8, kimg=50000, mb=64, mbstd=8, fmaps=0.5, lrate=1e-4, gamma=5, ema=20, ramp=None, map=8),
I also set D_lrate = 2x G_lrate:
args.G_opt_kwargs = dnnlib.EasyDict(class_name='torch.optim.Adam', lr=spec.lrate, betas=[0,0.99], eps=1e-8)
args.D_opt_kwargs = dnnlib.EasyDict(class_name='torch.optim.Adam', lr=spec.lrate*2, betas=[0,0.99], eps=1e-8)
So at each kimg stage I will try the settings above. Since my dataset seems somewhat CIFAR-like, I also eliminated style mixing and pl_weight regularization. Should I?
args.loss_kwargs.pl_weight = 0 # disable path length regularization
args.loss_kwargs.style_mixing_prob = 0 # disable style mixing
args.D_kwargs.architecture = 'orig' # disable residual skip connections
I do not re-train from scratch but resume from the last pre-trained checkpoint (which had style mixing, gamma=10, etc.), roughly as in the sketch below. Is that a problem, or should I re-train completely?
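Resume sketch (stylegan2-ada-pytorch style, if I read train.py right; the snapshot path is a placeholder):

args.resume_pkl = '/path/to/network-snapshot-last.pkl'  # placeholder: previous snapshot to resume from
args.ema_rampup = None                      # no EMA ramp-up when resuming
args.ada_kimg = 100                         # let ADA react faster after a resume
args.loss_kwargs.pl_weight = 0              # keep path length regularization disabled
args.loss_kwargs.style_mixing_prob = 0      # keep style mixing disabled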
Thanks, mate, any help is appreciated.
Steve
Great articles, from "Practical aspects of StyleGAN2 training" to "Scaling up StyleGAN2". Lots of really helpful insights.
Just wondering if you could try something like a diffusion model or another state-of-the-art model for better synthetic image quality?
Thanks.
Could you publish the code you modified?