moosquibe / diffusionlitrev

Interactive Literature Review on Diffusion Processes

Home Page: https://moosquibe.github.io/DiffusionLitrev/

License: MIT License

Starlark 0.13% Python 0.19% Shell 0.06% Jupyter Notebook 99.62%

diffusionlitrev's Introduction

🚧 Interactive Diffusion Literature Review - a tour of diffusion models 🚧

This repo is an ongoing educational journey into diffusion models. The goal is to provide a literature review together with a progressively expanding collection of PyTorch reference implementations of the most important diffusion model milestones. Whenever possible, I prefer to keep training and inference runnable locally on an average modern MacBook, so I will rely mostly on small popular datasets and occasionally on synthetic data. The name implies that by the time I finish, these models will be fairly "retro" (some of them already are). I am also writing a comprehensive survey post.

Roadmap

I will first do a thorough literature review to get a better understanding of what is worth implementing. The result will be a deep-dive survey on a GitHub page.

Then I will start making some educational implementations of the most important architectures. Here is a long list of papers I tentatively plan to cover. This list will probably be heavily edited or somewhat pruned. No promises.

| Name | Authors | arXiv Link | Year | Note |
|---|---|---|---|---|
| "Deep Unsupervised Learning using Nonequilibrium Thermodynamics" | Sohl-Dickstein et al. | arXiv:1503.03585 | 2015 | The paper that started it all, the first one to introduce the idea of diffusion probabilistic models. |
| "Generative Modeling by Estimating Gradients of the Data Distribution" | Song & Ermon | arXiv:1907.05600 | 2019 | Takes a somewhat parallel approach using Langevin dynamics. |
| "Denoising Diffusion Probabilistic Models" | Ho et al. | arXiv:2006.11239 | 2020 | The one that made diffusion models take off. Introduced predicting the noise instead of the reverse process mean, etc. |
| "Score-Based Generative Modeling through Stochastic Differential Equations" | Song et al. | arXiv:2011.13456 | 2021 | Experiments with continuous-time SDEs (stochastic differential equations) for generation. |
| "Diffusion Models Beat GANs on Image Synthesis" | Dhariwal & Nichol | arXiv:2105.05233 | 2021 | Improves the architecture through several ablations of the original setup to outperform GANs. Also improves conditional generation through classifier guidance. |
| "Classifier-Free Diffusion Guidance" | Ho & Salimans | arXiv:2207.12598 | 2021 | Shows that diffusion guidance can be done without a classifier. |
| "Learning Transferable Visual Models From Natural Language Supervision" | Radford et al. | arXiv:2103.00020 | 2021 | Introduces CLIP (Contrastive Language-Image Pre-training), which jointly trains representations of text and images. |
| "GLIDE: Towards Photorealistic Image Generation and Editing with Text-Guided Diffusion Models" | Nichol et al. | arXiv:2112.10741 | 2022 | Compares CLIP guidance and classifier-free guidance. |
| "High-Resolution Image Synthesis with Latent Diffusion Models" | Rombach et al. | arXiv:2112.10752 | 2022 | Stable Diffusion: latent variable diffusion models. |
| "Hierarchical Text-Conditional Image Generation with CLIP Latents" | Ramesh et al. | arXiv:2204.06125 | 2022 | CLIP + latent variables. |
| "Denoising Diffusion Implicit Models" | Song et al. | arXiv:2010.02502 | 2022 | Speeds up sampling by using a non-Markovian diffusion process. |
| "Progressive Distillation for Fast Sampling of Diffusion Models" | Salimans & Ho | arXiv:2202.00512 | 2022 | Another approach to speeding up sampling, through distillation (train a lighter student model on the output of a heavier teacher model). |
| "Consistency Models" | Song et al. | arXiv:2303.01469 | 2023 | Yet another approach to speeding up sampling, using models that directly map noise to data. |
| "Scalable Diffusion Models with Transformers" | Peebles & Xie | arXiv:2212.09748 | 2023 | Replaces the U-Net in the network architecture with Transformers. |
| "Adding Conditional Control to Text-to-Image Diffusion Models" | Zhang et al. | arXiv:2302.05543 | 2023 | Proposes ControlNet, which adds strong conditioning control to diffusion models. |
| "Causal Diffusion Autoencoders: Toward Counterfactual Generation via Diffusion Probabilistic Models" | Komanduri et al. | arXiv:2404.17735 | 2024 | Proposes a model for counterfactual generation according to a pre-specified causal model. |
| "Discrete Diffusion Modeling by Estimating the Ratios of the Data Distribution" | Lou et al. | arXiv:2310.16834 | 2024 | Diffusion for LLMs! |
| "Scaling Rectified Flow Transformers for High-Resolution Image Synthesis" | Esser et al. | arXiv:2403.03206 | 2024 | Stable Diffusion 3. |
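
To give a flavor of what the "Denoising Diffusion Probabilistic Models" entry refers to, here is a minimal sketch of the closed-form forward noising step and the noise-prediction loss. It is written in plain Python (not PyTorch) so that it runs without any dependencies, and it is not taken from the notebooks in this repo. All function names are illustrative assumptions; the linear schedule endpoints (1e-4 to 0.02 over 1000 steps) are the values reported by Ho et al.

```python
import math
import random


def linear_beta_schedule(num_steps, beta_start=1e-4, beta_end=0.02):
    """Linearly spaced noise variances beta_1..beta_T (endpoints as in Ho et al.)."""
    step = (beta_end - beta_start) / (num_steps - 1)
    return [beta_start + i * step for i in range(num_steps)]


def cumulative_alphas(betas):
    """alpha_bar_t = product over s <= t of (1 - beta_s)."""
    alpha_bars, running = [], 1.0
    for beta in betas:
        running *= 1.0 - beta
        alpha_bars.append(running)
    return alpha_bars


def q_sample(x0, t, alpha_bars, rng):
    """Sample x_t from q(x_t | x_0) in one shot; returns (x_t, eps).

    x_t = sqrt(alpha_bar_t) * x_0 + sqrt(1 - alpha_bar_t) * eps, eps ~ N(0, I).
    """
    eps = [rng.gauss(0.0, 1.0) for _ in x0]
    signal = math.sqrt(alpha_bars[t])
    noise = math.sqrt(1.0 - alpha_bars[t])
    x_t = [signal * x + noise * e for x, e in zip(x0, eps)]
    return x_t, eps


def noise_prediction_loss(eps_true, eps_pred):
    """Mean squared error between the true and the predicted noise."""
    return sum((a - b) ** 2 for a, b in zip(eps_true, eps_pred)) / len(eps_true)


# Toy usage: a "model" that always predicts zero noise stands in for a network.
rng = random.Random(0)
alpha_bars = cumulative_alphas(linear_beta_schedule(1000))
x0 = [1.0, -0.5, 0.25]
x_t, eps = q_sample(x0, 500, alpha_bars, rng)
loss = noise_prediction_loss(eps, [0.0] * len(eps))
```

In a real implementation the zero predictor would be replaced by a neural network that takes `x_t` and `t` as input, and the loss would be averaged over random timesteps and data batches.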

Usage

To read the main document, see

To run the notebooks:

bazel run jupyterlab

Using Bazel

To update package versions:

Start by updating the versions in requirements.in, then run

bazel run requirements.update

The result can be validated by

bazel test requirements_test

To have access to requirements, add them to the jupyterlab target in //tools/jupyter/BUILD.
