
Comments (3)

Zhendong-Wang commented on July 20, 2024

Actually the same. We wrote our paper from the vanilla GAN objective: for the vanilla GAN, the discriminator outputs a probability in [0, 1], so we add the -0.5 in the paper for consistency. In the implementation, we follow StyleGAN-ADA, where the discriminator outputs logits, which can be negative, hence we use the sign of the logits directly (without the -0.5 shift)

from diffusion-gan.
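To make the two conventions concrete, here is a small sketch (the tensor values are made up for illustration) showing that thresholding logits at 0 and thresholding sigmoid probabilities at 0.5 yield the same r_d statistic:

```python
import torch

# Hypothetical discriminator outputs in the StyleGAN-ADA convention:
# raw logits, which can be negative.
d_logits = torch.tensor([2.3, -0.7, 1.1, -0.2])

# Logit convention: a positive logit means "classified as real",
# so the threshold is 0.
r_d_logits = d_logits.detach().sign().mean()

# Vanilla-GAN convention: sigmoid maps logits to probabilities in [0, 1],
# so the threshold is 0.5.
d_probs = torch.sigmoid(d_logits)
r_d_probs = (d_probs.detach() - 0.5).sign().mean()

# The two agree, since sigmoid(x) > 0.5 exactly when x > 0.
```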

Zhendong-Wang commented on July 20, 2024

(1) If you use sigmoid as the final activation function to rescale the output to [0, 1], then you should use r_d = (d_real.detach() - 0.5).sign().mean().

(2) We use concat and it works in our case, but I think a better architecture design could help more here.


shoh4486 commented on July 20, 2024

@Zhendong-Wang Hello again! I have a few questions:

(1) If I use r_d with the LSGAN loss, that is, the discriminator D learns to regress to the targets 0.0 / 1.0 without a final activation layer (raw logits):

class Discriminator(nn.Module):
    def __init__(self):
        super(Discriminator, self).__init__()
        self.mynet = MyNetwork()  # has NO final activation layer such as sigmoid

    def forward(self, img, t):
        batch_size, H, W = img.size(0), img.size(-2), img.size(-1)
        # broadcast the per-sample time step to a constant feature map
        t = torch.ones(batch_size, 1, H, W, device=img.device, dtype=img.dtype) * t.view(-1, 1, 1, 1)
        x = torch.cat((img, t), dim=1)  # Is this correct?

        x = self.mynet(x)

        return x  # logits, without a final activation layer


loss = nn.MSELoss()

# ... extra code omitted
real_diffused, t = diffusion(real)
d_real = discriminator(real_diffused, t)

# ... extra code omitted
d_loss_real = loss(d_real, torch.ones_like(d_real))
d_loss_fake = loss(d_fake, torch.zeros_like(d_fake))

r_d = (d_real.detach() - 0.5).sign().mean()

In this case, is r_d = (d_real.detach() - 0.5).sign().mean() correct?
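For context on how r_d is consumed: in Diffusion-GAN it serves as an overfitting signal that adapts the maximum diffusion step T, in the spirit of StyleGAN-ADA's augmentation controller. A minimal sketch of such a controller (the function name, target value, and step sizes here are illustrative assumptions, not the paper's exact values):

```python
def update_T(T, r_d, target=0.6, step=32, T_min=5, T_max=500):
    # ADA-style bang-bang controller (illustrative constants):
    # raise T when the discriminator is too confident on reals
    # (r_d above target), lower it otherwise, clamped to [T_min, T_max].
    direction = 1 if r_d > target else -1
    return max(T_min, min(T_max, T + direction * step))
```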

(2) In the above code block, is it okay to introduce the time step t via concatenation, just like conditional concatenation?
Or would a trainable structure, such as a fully-connected linear layer or a single convolutional layer, be required?
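The trainable alternative mentioned in (2) could look like the following sketch, which projects t through a small learnable MLP and adds it as a per-channel bias to an intermediate feature map. The class name, layer sizes, and 3-channel input are hypothetical, not from the Diffusion-GAN code:

```python
import torch
import torch.nn as nn

class TimeConditionedDisc(nn.Module):
    """Illustrative alternative to channel-concat conditioning:
    embed t with a learnable MLP instead of tiling it as an extra channel."""
    def __init__(self, channels=64):
        super().__init__()
        self.stem = nn.Conv2d(3, channels, 3, padding=1)
        self.t_embed = nn.Sequential(
            nn.Linear(1, channels), nn.SiLU(), nn.Linear(channels, channels)
        )
        self.head = nn.Conv2d(channels, 1, 3, padding=1)

    def forward(self, img, t):
        h = self.stem(img)
        # (B, C) -> (B, C, 1, 1) so the embedding broadcasts over H and W
        emb = self.t_embed(t.float().view(-1, 1)).view(-1, h.size(1), 1, 1)
        h = torch.relu(h + emb)
        return self.head(h).mean(dim=(1, 2, 3))  # one logit per image
```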

Thank you !! :-)

