Generalized Consistency Trajectory Models

Official PyTorch implementation of Generalized Consistency Trajectory Models for Image Manipulation by Beomsu Kim*, Jaemin Kim*, Jeongsol Kim, and Jong Chul Ye (*Equal contribution).

Diffusion models suffer from two limitations.

  • They require a large number of function evaluations (NFEs) to generate high-fidelity images.
  • They only enable noise-to-image generation.

We propose the Generalized Consistency Trajectory Model (GCTM), which learns the probability flow ODE (PFODE) between arbitrary distributions via Flow Matching theory. Thus, GCTMs are capable of

  • Noise-to-image and image-to-image translation,
  • Score or velocity evaluation with NFE = 1,
  • Traversal between arbitrary points of the PFODE with NFE = 1.

Consequently, GCTMs are applicable to a wide variety of tasks, such as but not limited to

  • Unconditional generation
  • Image-to-image translation
  • Zero-shot and supervised image restoration
  • Image editing
  • Latent manipulation
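To make the idea of one-step traversal concrete, here is a minimal toy sketch. It is not the repository's API: the function names, the time convention (t = 0 at the source sample, t = 1 at the target sample), and the NumPy stand-in samples are all assumptions for illustration. For a single coupled pair, the Flow Matching interpolation path is a straight line with constant velocity, so a jump between any two times is exact in one step. The PFODE between whole distributions is generally curved, which is why GCTM trains a network to perform the analogous jump.

```python
import numpy as np

def interpolate(x0, x1, t):
    """Point at time t on the straight line from x0 (t=0) to x1 (t=1)."""
    return (1.0 - t) * x0 + t * x1

def velocity(x0, x1):
    """Velocity of the straight-line path; it does not depend on t."""
    return x1 - x0

def jump(x_t, t, s, v):
    """Move from time t to time s in one step along a constant velocity v."""
    return x_t + (s - t) * v

rng = np.random.default_rng(0)
x0 = rng.normal(size=4)  # hypothetical sample from the source distribution
x1 = rng.normal(size=4)  # hypothetical sample from the target distribution

x_t = interpolate(x0, x1, 0.3)
x_s = jump(x_t, 0.3, 0.8, velocity(x0, x1))

# The one-step jump lands on the same point as direct interpolation at s = 0.8.
print(np.allclose(x_s, interpolate(x0, x1, 0.8)))
```

The same `jump` works in either direction (s < t moves back toward the source), which mirrors the "traversal between arbitrary points" property in this toy setting.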

Unconditional Generation

Image-to-Image Translation

Zero-shot and Supervised Image Restoration

Image Editing

Latent Manipulation

Environment

  • CUDA version 12.0
  • NVCC version 11.5.119
  • Python version 3.11.5
  • PyTorch version 2.0.1+cu118
  • Torchvision version 0.15.2+cu118
  • Torchaudio version 2.0.2+cu118
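A quick way to compare a local environment against the versions listed above is sketched below. It uses only the standard library; the helper name `check_versions` is hypothetical, and the comparison ignores local build tags such as `+cu118`. A mismatch does not necessarily mean the code will fail, only that the environment differs from the one listed here.

```python
from importlib import metadata

# Versions listed in this README; local build tags such as "+cu118" are ignored.
EXPECTED = {"torch": "2.0.1", "torchvision": "0.15.2", "torchaudio": "2.0.2"}

def check_versions(expected=EXPECTED):
    """Return {package: (installed version or None, expected version, match)}."""
    report = {}
    for name, want in expected.items():
        try:
            have = metadata.version(name)
        except metadata.PackageNotFoundError:
            have = None  # package is not installed in this environment
        base = have.split("+")[0] if have else None
        report[name] = (have, want, base == want)
    return report

for name, (have, want, ok) in check_versions().items():
    print(f"{name}: installed={have} expected={want} match={ok}")
```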

Datasets

Training

Use train_gctm.py to train unconditional and image-to-image models, and use train_gctm_inverse.py to train supervised image restoration models. To train unconditional or image-to-image models, first create a FID_stats directory and save the Inception activation statistics there in the format (dataset name)_(resolution).npz. The statistics can be computed with the save_fid_stats function in ./pytorch_fid/fid_score.py. Alternatively, comment out the FID evaluation lines in the training code.
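The sketch below illustrates the file naming convention and the kind of statistics such a file typically holds. It does not call the repository's save_fid_stats, whose signature is not shown here. The helper `save_stats`, the array keys `mu` and `sigma` (the convention of the upstream pytorch-fid package, assumed to carry over), the dataset name `cifar10`, and the random placeholder activations are all assumptions. The example writes to a temporary directory so that it cannot overwrite real statistics.

```python
import os
import tempfile

import numpy as np

def save_stats(activations, dataset, resolution, out_dir):
    """Save mean and covariance of (N, D) activations as <dataset>_<resolution>.npz."""
    os.makedirs(out_dir, exist_ok=True)
    mu = np.mean(activations, axis=0)          # per-feature mean, shape (D,)
    sigma = np.cov(activations, rowvar=False)  # feature covariance, shape (D, D)
    path = os.path.join(out_dir, f"{dataset}_{resolution}.npz")
    np.savez_compressed(path, mu=mu, sigma=sigma)
    return path

# Random placeholder activations; real ones come from an Inception network.
fake_activations = np.random.default_rng(0).normal(size=(64, 8))

out_dir = os.path.join(tempfile.mkdtemp(), "FID_stats")
path = save_stats(fake_activations, "cifar10", 32, out_dir)
print(os.path.basename(path))
```

For actual training, the statistics should be computed from real dataset images with the repository's own function so that the stored format matches what the training code loads.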

Example training scripts are provided in the ./configs directory. For instance, to train a CIFAR10 unconditional model with independent coupling, one may use the command

sh ./configs/unconditional/cifar10.sh

References

If you find this paper useful for your research, please consider citing

@article{
  kim2024gctm,
  title={Generalized Consistency Trajectory Models for Image Manipulation},
  author={Beomsu Kim and Jaemin Kim and Jeongsol Kim and Jong Chul Ye},
  journal={arXiv preprint arXiv:2403.12510},
  year={2024}
}

gctm's Issues

About the scripts in pix2pix and unconditional task

Hi there,

Fantastic work on generalizing consistency trajectory models!

However, I have a few small questions about the training scripts. It seems that the Python scripts referenced in cifar10.sh and edges2shoes.sh do not exist. I also tried train_gctm.py, but it does not work either, whether I run it directly or remove the unnecessary arguments.

I just wonder how I can conduct the unconditional and pix2pix experiments.

A million thanks.

Yours sincerely.
Weijian
