explainingai-code / ddpm-pytorch

This repo implements Denoising Diffusion Probabilistic Models (DDPM) in Pytorch

Python 100.00%
ddpm denoising-diffusion-models denoising-diffusion-probabilistic-models diffusion-model diffusion-models

ddpm-pytorch's Issues

Question on the training process of a Diffusion Model

Hi Sir, I'm working on an LDM for my thesis and your video was very helpful in figuring out how DDPM works. I have one doubt about the training process. Right now I:

  1. Sample a batch of images and their captions
  2. Pass the images through the encoder of the diffusion model (to obtain the latents) and the captions through the CLIP encoder
  3. Sample a random timestep t and add noise to the latents with the scheduler
  4. Pass the noisy latents through the UNet to obtain the predicted noise
  5. Compute the loss between the real noise and the predicted noise

My question is: is that all I have to do? During training, do I not need to run every step of the forward and reverse processes, and can I instead limit each training step to the single t I randomly sample?
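
In the standard DDPM training procedure, each training step does use a single randomly sampled timestep per example, because the noised sample at any t can be computed from the clean sample in closed form; the step-by-step reverse process is only run when sampling. The five steps above can be sketched as follows. This is a minimal sketch, not this repository's code: `TinyNoisePredictor`, the linear beta range, and the tensor shapes are hypothetical stand-ins, the latents are random tensors in place of real encoder output, and the caption conditioning is omitted.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)

num_timesteps = 1000
# Linear beta schedule (assumed values, not taken from the repository config).
betas = torch.linspace(1e-4, 0.02, num_timesteps)
alpha_cum_prod = torch.cumprod(1.0 - betas, dim=0)


def add_noise(x0, noise, t):
    """Closed-form forward process: jump straight from x0 to x_t."""
    # Reshape per-sample coefficients to (B, 1, 1, 1) so they broadcast.
    shape = (x0.shape[0],) + (1,) * (x0.dim() - 1)
    sqrt_acp = alpha_cum_prod[t].sqrt().reshape(shape)
    sqrt_one_minus_acp = (1.0 - alpha_cum_prod[t]).sqrt().reshape(shape)
    return sqrt_acp * x0 + sqrt_one_minus_acp * noise


class TinyNoisePredictor(nn.Module):
    """Hypothetical stand-in for a UNet; it takes no caption conditioning."""

    def __init__(self, channels=4):
        super().__init__()
        self.conv = nn.Conv2d(channels + 1, channels, kernel_size=3, padding=1)

    def forward(self, xt, t):
        # Feed the normalized timestep as one extra constant channel.
        t_map = (t.float() / num_timesteps).reshape(-1, 1, 1, 1)
        t_map = t_map.expand(-1, 1, xt.shape[2], xt.shape[3])
        return self.conv(torch.cat([xt, t_map], dim=1))


model = TinyNoisePredictor()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

latents = torch.randn(8, 4, 16, 16)                # stand-in for encoder output
t = torch.randint(0, num_timesteps, (8,))          # one random t per sample
noise = torch.randn_like(latents)
noisy_latents = add_noise(latents, noise, t)

loss = F.mse_loss(model(noisy_latents, t), noise)  # predicted vs. real noise
optimizer.zero_grad()
loss.backward()
optimizer.step()
```

Over many batches the random choice of t covers the whole schedule, so no single step has to iterate through all timesteps.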

RuntimeError: indices should be either on cpu or on the same device as the indexed tensor (cpu)

I get this error when using this repository.
This seems to fix it, but it's probably not the most efficient way to do it:

    def add_noise(self, original, noise, t):
        original_shape = original.shape
        batch_size = original_shape[0]

        # Index the CPU schedule tensors with a CPU copy of t.
        sqrt_alpha_cum_prod = self.sqrt_alpha_cum_prod[t.cpu()].reshape(batch_size)
        sqrt_one_minus_alpha_cum_prod = self.sqrt_one_minus_alpha_cum_prod[t.cpu()].reshape(batch_size)

        # Add trailing dimensions so the coefficients broadcast over the image.
        for _ in range(len(original_shape)-1):
            sqrt_alpha_cum_prod = sqrt_alpha_cum_prod.unsqueeze(-1)

        for _ in range(len(original_shape)-1):
            sqrt_one_minus_alpha_cum_prod = sqrt_one_minus_alpha_cum_prod.unsqueeze(-1)

        return (sqrt_alpha_cum_prod.to(original.device) * original + sqrt_one_minus_alpha_cum_prod.to(original.device) * noise)

    def backward(self, xt, noise_pred, t):
        # Estimate x0 from xt and the predicted noise.
        x0 = (xt - (self.sqrt_one_minus_alpha_cum_prod[t.cpu()].to(noise_pred.device) * noise_pred)) / torch.sqrt(self.alpha_cum_prod[t.cpu()].to(noise_pred.device))
        x0 = torch.clamp(x0, -1., 1.)

        # Mean of the reverse-step distribution.
        mean = xt - ((self.betas[t.cpu()]).to(noise_pred.device)*noise_pred).to(noise_pred.device)/(self.sqrt_one_minus_alpha_cum_prod[t.cpu()]).to(noise_pred.device)
        mean = mean / torch.sqrt(self.alphas[t.cpu()].to(noise_pred.device))

        if t == 0:
            return mean, mean
        else:
            variance = (1-self.alpha_cum_prod[(t-1).cpu()]).to(noise_pred.device) / (1. - self.alpha_cum_prod[t.cpu()]).to(noise_pred.device)
            variance = variance * self.betas[t.cpu()].to(noise_pred.device)
            sigma = variance ** 0.5
            z = torch.randn(xt.shape).to(xt.device)

            return mean + sigma*z, x0
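
The error message indicates that the schedule tensors are on the CPU while the index `t` is on the GPU. An alternative to calling `t.cpu()` and `.to(...)` at every lookup is to create the schedule tensors on the target device once. A minimal sketch of that idea follows; the class name `LinearNoiseSchedulerOnDevice` and its `device` argument are hypothetical and not the repository's API, and only `add_noise` is shown.

```python
import torch


class LinearNoiseSchedulerOnDevice:
    """Hypothetical scheduler that keeps its lookup tables on one device."""

    def __init__(self, num_timesteps, beta_start, beta_end, device="cpu"):
        self.betas = torch.linspace(beta_start, beta_end, num_timesteps, device=device)
        self.alphas = 1.0 - self.betas
        self.alpha_cum_prod = torch.cumprod(self.alphas, dim=0)
        self.sqrt_alpha_cum_prod = torch.sqrt(self.alpha_cum_prod)
        self.sqrt_one_minus_alpha_cum_prod = torch.sqrt(1.0 - self.alpha_cum_prod)

    def add_noise(self, original, noise, t):
        # Make sure t lives on the same device as the tables, then reshape to
        # (B, 1, ..., 1) so the coefficients broadcast over the image dims.
        t = t.to(self.betas.device)
        shape = (original.shape[0],) + (1,) * (original.dim() - 1)
        sqrt_acp = self.sqrt_alpha_cum_prod[t].reshape(shape)
        sqrt_one_minus_acp = self.sqrt_one_minus_alpha_cum_prod[t].reshape(shape)
        return sqrt_acp * original + sqrt_one_minus_acp * noise


device = "cuda" if torch.cuda.is_available() else "cpu"
scheduler = LinearNoiseSchedulerOnDevice(1000, 1e-4, 0.02, device=device)

x0 = torch.randn(4, 1, 28, 28, device=device)
noise = torch.randn_like(x0)
t = torch.randint(0, 1000, (4,), device=device)
xt = scheduler.add_noise(x0, noise, t)
```

This avoids a host/device copy on every call, at the cost of having to pass the device when the scheduler is constructed.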

What parameter changes would I need to make so that it runs on our dataset?

I am running this code on a set of images but I get this error:
"CUDA out of memory. Tried to allocate 150.06 GiB (GPU 0; 15.89 GiB total capacity; 720.18 MiB already allocated; 14.31 GiB free; 736.00 MiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation."
I have changed the batch size and also resized the images to 224x224, but I still get this CUDA error.

Can you please tell me what I should do?

Thanks
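
A single 150 GiB allocation suggests one very large intermediate tensor, not the batch of images itself. One possible explanation, offered as an assumption and not confirmed from this repository's code, is a self-attention layer applied to full-resolution feature maps: its score matrix grows with the square of the number of pixels. The back-of-the-envelope calculation below uses hypothetical values (float32 scores, batch size times attention heads equal to 16) to show how large that tensor would be at several image sizes.

```python
def attention_scores_gib(height, width, batch_times_heads, bytes_per_value=4):
    """Size in GiB of a (B * heads, H*W, H*W) attention score tensor."""
    tokens = height * width
    return batch_times_heads * tokens * tokens * bytes_per_value / 2**30


# Assumed: float32 scores and batch_size * num_heads == 16.
for side in (224, 64, 32, 28):
    size = attention_scores_gib(side, side, batch_times_heads=16)
    print(f"{side}x{side}: {size:.2f} GiB")
```

Under these assumed values the 224x224 case comes to about 150.06 GiB, the same figure as in the error message, while 64x64 needs 1 GiB. If this is the cause, lowering the batch size only shrinks the tensor linearly, whereas a smaller image size shrinks it with the fourth power of the side length.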

Noise on all generated images

image
image
(Epoch 5)

No matter how many epochs I train the model for, I keep getting noise on my images. I think this is because there is still noise at timestep 0.
image
Note: I still get noise on my images even with 1000 timesteps instead of 300.
