MagicMix with Stable Diffusion

Output from this repo's implementation of MagicMix: the original image, then the results with $\nu$ = 0.75 and $\nu$ = 0.9, respectively.

Implementation of MagicMix with Stable Diffusion (https://arxiv.org/abs/2210.16056) in PyTorch. This is an unofficial implementation.

Installation

pip install git+https://github.com/cloneofsimo/magicmix.git

To run on a CUDA GPU, install PyTorch and CUDA versions that are compatible with each other and with your hardware.

Explanations

MagicMix has three main parameters: $K_{min} = k_{min ratio} T$, $K_{max} = k_{max ratio} T$, and $\nu$, where $T$ is the number of sampling steps for the scheduler.
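As a quick check of the relation between the ratios and the step counts, here is a minimal sketch. The helper name `mixing_range` is hypothetical and not part of this package, and rounding to the nearest integer is an assumption; the repo may convert the ratios differently.

```python
def mixing_range(num_inference_steps: int, k_min_ratio: float, k_max_ratio: float) -> tuple[int, int]:
    """Return (K_min, K_max) as step counts for a schedule of T steps.

    Hypothetical helper for illustration only.
    """
    if not 0.0 <= k_min_ratio <= k_max_ratio <= 1.0:
        raise ValueError("expected 0 <= k_min_ratio <= k_max_ratio <= 1")
    # Assumption: round to the nearest step; the actual code may truncate instead.
    k_min = round(k_min_ratio * num_inference_steps)
    k_max = round(k_max_ratio * num_inference_steps)
    return k_min, k_max


# T = 50 with ratios 0.3 and 0.6 gives K_min = 15 and K_max = 30.
print(mixing_range(50, 0.3, 0.6))
```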

$\nu$ determines how little the layout image (in the photo above, the corgi) affects the diffusion process. The greater the $\nu$, the more strongly the content semantics affect the result.

$k_{min ratio}$ and $k_{max ratio}$ determine the range of the mixing process. A large $K_{max}$ has the same effect as losing much of the information in the original layout image. A large $K_{min}$ lets the content semantics take effect more freely.
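To make the role of $\nu$ concrete, the sketch below assumes the mixing step is a linear interpolation between the content-conditioned latent and the noised layout latent, as described in the MagicMix paper. The function name `mix_latents` is hypothetical, and plain Python lists stand in for latent tensors; the code in this repo operates on PyTorch tensors and may differ in detail.

```python
def mix_latents(content_latent, layout_latent, nu):
    """Element-wise interpolation: nu * content + (1 - nu) * layout.

    Hypothetical helper; lists of floats stand in for latent tensors.
    """
    if not 0.0 <= nu <= 1.0:
        raise ValueError("nu must lie in [0, 1]")
    return [nu * c + (1.0 - nu) * l for c, l in zip(content_latent, layout_latent)]


content = [1.0, 1.0, 1.0]  # stand-in for the content-conditioned latent
layout = [0.0, 0.0, 0.0]   # stand-in for the noised layout latent

# As nu grows, the mixed latent moves from the layout toward the content.
for nu in (0.0, 0.5, 1.0):
    print(nu, mix_latents(content, layout, nu))
```

Under this assumption, $\nu = 0$ reproduces the layout latent and $\nu = 1$ discards it entirely, which matches the description above.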

Basic Usage

The package magic_mix contains the implementation of MagicMix with Stable Diffusion. Before running, set the variable HF_TOKEN in the .env file to your Hugging Face token for Stable Diffusion, and call load_dotenv().

from dotenv import load_dotenv
from PIL import Image

from magic_mix import magic_mix_single_image

# Load HF_TOKEN from the .env file.
load_dotenv(verbose=True)

input_image_path = "./examples/inputs/1.jpg"
image = Image.open(input_image_path).convert("RGB")

mixed_semantics = magic_mix_single_image(
    layout_image=image,
    num_inference_steps=50,
    content_semantics_prompts=["coffee machine", "tiger"],
    k_min=20,
    k_max=30,
    nu=0.5,
    guidance_scale_at_mix=7.5,
    seed=0,
    device="cuda:0",
)  # mixed_semantics holds the output PIL images

# Save the first output image.
mixed_semantics[0].save("mixed_semantics.png")

Or simply run the following command to generate mixed images.

python scripts/run_text_image_mix.py \
    --input_image ./examples/inputs/1.jpg \
    --output_dir ./examples/outputs \
    --num_inference_steps 50 \
    --content_semantics_prompts "coffee machine" "tiger" \
    --k_min_ratio 0.3 \
    --k_max_ratio 0.6 \
    --nu 0.5 \
    --guidance_scale_at_mix 7.5 \
    --seed 0
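To compare several values of $\nu$, the command above can be generated in a loop. The sketch below only builds and prints the commands, using the flags shown above. The helper `build_command` and the per-value output directory are assumptions for illustration; whether the script creates missing output directories is not confirmed here.

```python
import shlex


def build_command(nu, prompts=("coffee machine", "tiger")):
    """Build the argument list for scripts/run_text_image_mix.py (hypothetical helper)."""
    return [
        "python", "scripts/run_text_image_mix.py",
        "--input_image", "./examples/inputs/1.jpg",
        # Assumption: one output directory per nu value, to keep results separate.
        "--output_dir", f"./examples/outputs/nu_{nu}",
        "--num_inference_steps", "50",
        "--content_semantics_prompts", *prompts,
        "--k_min_ratio", "0.3",
        "--k_max_ratio", "0.6",
        "--nu", str(nu),
        "--guidance_scale_at_mix", "7.5",
        "--seed", "0",
    ]


# Print one shell-quoted command per nu value; run them manually or via subprocess.run.
for nu in (0.5, 0.75, 0.9):
    print(shlex.join(build_command(nu)))
```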
