explainingai-code / ddpm-pytorch

This repo implements Denoising Diffusion Probabilistic Models (DDPM) in PyTorch

ddpm-pytorch's Issues

Question on the training process of a Diffusion Model

Hi Sir, I'm working on an LDM for my thesis, and your video was very helpful in figuring out how DDPM works. I have one doubt about the training process. Right now I:

1. Sample a batch of images and their captions.
2. Pass the images through the encoder of the diffusion model (to obtain the latents) and the captions through the CLIP encoder.
3. Sample a random timestep t and add noise to the latents with the scheduler.
4. Pass the noisy latents through the U-Net to obtain the predicted noise.
5. Compute the loss between the real noise and the predicted noise.

My question: is that all I have to do? During training, do I not need to run every step of the forward and reverse process, and can I instead use only the single t that I randomly sample?
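The steps listed in this question can be sketched as a single training step. This is a minimal sketch, not the repository's code: it follows the standard DDPM training recipe, in which each sample gets one random timestep and the noisy input is produced in closed form, so no step-by-step forward or reverse pass is run during training. `training_step`, `TinyDenoiser`, and the tensor shapes are hypothetical names and values chosen for illustration, and the latents and text embeddings are assumed to be computed beforehand by the encoder and CLIP.

```python
import torch
import torch.nn.functional as F


class TinyDenoiser(torch.nn.Module):
    """Hypothetical stand-in for the U-Net; it ignores t and the text embedding."""

    def __init__(self, channels=4):
        super().__init__()
        self.conv = torch.nn.Conv2d(channels, channels, kernel_size=3, padding=1)

    def forward(self, x, t, text_emb):
        return self.conv(x)


def training_step(unet, latents, text_emb, alpha_cum_prod, optimizer):
    """Run one optimisation step using a single random timestep per sample."""
    batch_size = latents.shape[0]
    num_timesteps = alpha_cum_prod.shape[0]

    # One random timestep per sample, plus the noise the model must predict.
    t = torch.randint(0, num_timesteps, (batch_size,), device=latents.device)
    noise = torch.randn_like(latents)

    # Closed-form forward process: jump straight from x_0 to x_t.
    a_bar = alpha_cum_prod.to(latents.device)[t].view(batch_size, 1, 1, 1)
    noisy_latents = a_bar.sqrt() * latents + (1.0 - a_bar).sqrt() * noise

    # Predict the noise and compare it with the noise that was actually added.
    noise_pred = unet(noisy_latents, t, text_emb)
    loss = F.mse_loss(noise_pred, noise)

    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```

The iterative reverse process is only needed at sampling time, when the trained network is applied once per timestep from t = T - 1 down to 0.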

What changes would we need to make to use our own dataset?

Thanks for the awesome explanation. Could you tell me which changes we need to make before training the model on our own data?

RuntimeError: indices should be either on cpu or on the same device as the indexed tensor (cpu)

I get this error when using this repository.
This seems to fix it, but it's probably not the most efficient way to do it:

def add_noise(self, original, noise, t):
    original_shape = original.shape
    batch_size = original_shape[0]

    # Index on the CPU, where the schedule tensors live, then move the result to the data's device.
    sqrt_alpha_cum_prod = self.sqrt_alpha_cum_prod[t.cpu()].reshape(batch_size).to(original.device)
    sqrt_one_minus_alpha_cum_prod = self.sqrt_one_minus_alpha_cum_prod[t.cpu()].reshape(batch_size).to(original.device)

    # Add trailing singleton dimensions so both coefficients broadcast over the image dimensions.
    for _ in range(len(original_shape) - 1):
        sqrt_alpha_cum_prod = sqrt_alpha_cum_prod.unsqueeze(-1)
        sqrt_one_minus_alpha_cum_prod = sqrt_one_minus_alpha_cum_prod.unsqueeze(-1)

    return sqrt_alpha_cum_prod * original + sqrt_one_minus_alpha_cum_prod * noise

def backward(self, xt, noise_pred, t):
    device = noise_pred.device
    t_cpu = t.cpu()

    # Estimate x0 from xt and the predicted noise, then clamp it to the image range.
    x0 = (xt - self.sqrt_one_minus_alpha_cum_prod[t_cpu].to(device) * noise_pred) / torch.sqrt(self.alpha_cum_prod[t_cpu].to(device))
    x0 = torch.clamp(x0, -1., 1.)

    # Mean of the reverse step.
    mean = xt - (self.betas[t_cpu].to(device) * noise_pred) / self.sqrt_one_minus_alpha_cum_prod[t_cpu].to(device)
    mean = mean / torch.sqrt(self.alphas[t_cpu].to(device))

    if t == 0:
        # No noise is added at the last step.
        return mean, mean
    else:
        variance = (1. - self.alpha_cum_prod[(t - 1).cpu()].to(device)) / (1. - self.alpha_cum_prod[t_cpu].to(device))
        variance = variance * self.betas[t_cpu].to(device)
        sigma = variance ** 0.5
        z = torch.randn(xt.shape).to(xt.device)

        return mean + sigma * z, x0
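A possible alternative to calling `t.cpu()` on every access is to move the index to wherever the schedule tensors live, and to give the scheduler a `to()` method so the whole schedule can be placed on the training device once. The sketch below assumes a linear beta schedule; `SimpleNoiseScheduler` and its `to()` method are hypothetical and are not the repository's API.

```python
import torch


class SimpleNoiseScheduler:
    """Hypothetical minimal scheduler that tolerates indices on any device."""

    _TENSORS = ("betas", "alphas", "alpha_cum_prod",
                "sqrt_alpha_cum_prod", "sqrt_one_minus_alpha_cum_prod")

    def __init__(self, num_timesteps=1000, beta_start=1e-4, beta_end=0.02):
        self.betas = torch.linspace(beta_start, beta_end, num_timesteps)
        self.alphas = 1.0 - self.betas
        self.alpha_cum_prod = torch.cumprod(self.alphas, dim=0)
        self.sqrt_alpha_cum_prod = torch.sqrt(self.alpha_cum_prod)
        self.sqrt_one_minus_alpha_cum_prod = torch.sqrt(1.0 - self.alpha_cum_prod)

    def to(self, device):
        # Move every schedule tensor once, e.g. scheduler.to("cuda").
        for name in self._TENSORS:
            setattr(self, name, getattr(self, name).to(device))
        return self

    def add_noise(self, original, noise, t):
        # Shape (B, 1, 1, ...) so the coefficients broadcast over the image dimensions.
        shape = (original.shape[0],) + (1,) * (original.dim() - 1)

        # Put the index on the schedule's device so indexing never mixes devices.
        t = t.to(self.sqrt_alpha_cum_prod.device)
        signal = self.sqrt_alpha_cum_prod[t].reshape(shape).to(original.device)
        sigma = self.sqrt_one_minus_alpha_cum_prod[t].reshape(shape).to(original.device)
        return signal * original + sigma * noise
```

When the scheduler and the batch are on the same device, the `.to(...)` calls inside `add_noise` return the tensors unchanged, so no host-device copies occur in the training loop.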

In Diffusion Models: why is 'Positional Encoding' not used in self-attention layers?

Thanks for the nice hands-on material on diffusion models.
I'm curious why 'Positional Encoding' (as used in ViT, the vanilla Transformer, etc.) is not used in the self-attention layers.
Is there a reason for this, and can we be sure that self-attention in the DDPM U-Net preserves positional (pixel-wise) information?
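The concern behind this question can be made concrete. Self-attention with no positional signal is permutation-equivariant: shuffling the input tokens only shuffles the outputs in the same way, so the layer by itself cannot tell where a token came from. The sketch below checks this property with PyTorch's `torch.nn.MultiheadAttention`; the layer sizes and the `attend` helper are arbitrary choices for illustration, and this is not the repository's attention block. Whether the convolutions around the attention layers supply enough positional information is not answered in this thread.

```python
import torch

torch.manual_seed(0)

# A small self-attention layer with no positional encoding of any kind.
attn = torch.nn.MultiheadAttention(embed_dim=8, num_heads=2, batch_first=True)
attn.eval()


def attend(tokens):
    """Apply self-attention (query = key = value = tokens)."""
    with torch.no_grad():
        out, _ = attn(tokens, tokens, tokens)
    return out


# For example, a 4x4 feature map with 8 channels flattened to 16 tokens.
tokens = torch.randn(1, 16, 8)
perm = torch.randperm(16)

out = attend(tokens)
out_shuffled = attend(tokens[:, perm])

# Shuffling the inputs shuffles the outputs identically, up to float rounding.
is_equivariant = torch.allclose(out[:, perm], out_shuffled, atol=1e-5)
```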

What parameter changes would I need to make so that it runs on our dataset?

I am running this code on a set of images but I am getting this error:
"CUDA out of memory. Tried to allocate 150.06 GiB (GPU 0; 15.89 GiB total capacity; 720.18 MiB already allocated; 14.31 GiB free; 736.00 MiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation." I have changed the batch size and also resized the images to 224x224, but it still gives me this CUDA error.

Can you please tell me what I should do?

Thanks
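A rough estimate suggests where an allocation of this size could come from. The sketch assumes that the U-Net applies self-attention at the full input resolution and materialises the full attention score matrix in float32; neither is confirmed in this thread. `attention_scores_gib` is a hypothetical helper, and the batch size and head count used below are guesses, not values reported by the poster.

```python
def attention_scores_gib(height, width, batch_size, num_heads, bytes_per_element=4):
    """Memory in GiB for one (tokens x tokens) score matrix per sample and head."""
    tokens = height * width  # every pixel of the feature map becomes one token
    total_bytes = batch_size * num_heads * tokens * tokens * bytes_per_element
    return total_bytes / 2**30


# 224x224 inputs with batch_size * num_heads = 16 (for example 4 and 4).
full_resolution = attention_scores_gib(224, 224, batch_size=4, num_heads=4)

# The same settings at 64x64.
reduced_resolution = attention_scores_gib(64, 64, batch_size=4, num_heads=4)
```

Under these assumptions `full_resolution` comes out at about 150.06 GiB, which matches the size in the error message, while `reduced_resolution` is 1 GiB. The cost grows with the fourth power of the image side length but only linearly with batch size, which would explain why changing the batch size alone did not help.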

Noise on all generated images

(Epoch 5)

No matter how many epochs I train the model for, the generated images stay noisy. I think this is because noise is still present at timestep 0.

Note: the images are still noisy even with 1000 timesteps instead of 300.
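One thing that can be checked independently of the model is how much of the original image survives at the last timestep of the schedule. Sampling starts from pure Gaussian noise, so the cumulative product of alphas at the final step should be close to zero. The sketch below assumes a linear beta schedule from 1e-4 to 0.02, the defaults of the DDPM paper; the thread does not say which values the poster used. `final_alpha_cum_prod` is a hypothetical helper.

```python
def final_alpha_cum_prod(num_timesteps, beta_start=1e-4, beta_end=0.02):
    """Cumulative product of (1 - beta_t) over a linear beta schedule."""
    alpha_bar = 1.0
    for i in range(num_timesteps):
        # Linearly interpolate beta between its start and end values.
        beta = beta_start + (beta_end - beta_start) * i / (num_timesteps - 1)
        alpha_bar *= 1.0 - beta
    return alpha_bar


remaining_300 = final_alpha_cum_prod(300)
remaining_1000 = final_alpha_cum_prod(1000)
```

With this beta range, 300 steps leave a few percent of the signal variance at the final timestep, whereas 1000 steps leave well under 1e-4. That is a possible mismatch for the 300-step run, but the poster reports noisy images with 1000 timesteps as well, so this cannot be the whole explanation.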
