sd-perturbed-attention's Introduction

Perturbed-Attention Guidance for ComfyUI / SD WebUI (Forge)

Implementation of Self-Rectifying Diffusion Sampling with Perturbed-Attention Guidance (D. Ahn et al.) as an extension for ComfyUI and SD WebUI (Forge).

Works with SD1.5 and SDXL.

Doesn't work with Stable Cascade.

Note

PAG may produce striped "noise". Setting sigma_end to 0.7 or higher may reduce the striped patterns.

Note

The paper and demo suggest using CFG scale 4.0 with PAG scale 3.0 applied to the U-Net's middle layer 0, but feel free to experiment.
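To illustrate how the two scales interact, here is a minimal sketch of one common way to combine a CFG term with a PAG term. The function name `combine_guidance` and the list-of-floats representation are hypothetical and chosen for illustration only; real implementations operate on tensors and may differ in detail.

```python
def combine_guidance(uncond, cond, pag, cfg_scale=4.0, pag_scale=3.0):
    """Combine CFG and PAG terms element-wise (illustrative sketch).

    uncond: prediction for the empty/negative prompt
    cond:   prediction for the prompt
    pag:    prediction for the prompt with perturbed self-attention
    """
    result = []
    for u, c, p in zip(uncond, cond, pag):
        cfg_result = u + cfg_scale * (c - u)            # classifier-free guidance
        result.append(cfg_result + pag_scale * (c - p))  # push away from the perturbed prediction
    return result


# With pag_scale=0 the PAG term vanishes and only CFG remains.
cfg_only = combine_guidance([0.0, 0.0], [1.0, 2.0], [0.5, 2.0], pag_scale=0.0)
with_pag = combine_guidance([0.0, 0.0], [1.0, 2.0], [0.5, 2.0])
```

Under this formulation, a larger PAG scale amplifies the difference between the normal and the perturbed prediction, which is consistent with high values over-saturating the image.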

Sampling speed without adaptive_scale or sigma_start / sigma_end is similar to Self-Attention Guidance (about 0.6x the usual it/s).
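A sigma window can speed up sampling because the extra PAG forward pass is skipped on steps outside the window. The sketch below shows one way such a check could look. The helper `pag_is_active`, the treatment of a single negative bound, the boundary handling (exclusive lower bound, inclusive upper bound) and the sample sigma values are all assumptions for illustration, not the extension's actual code.

```python
def pag_is_active(sigma, sigma_start=-1.0, sigma_end=-1.0):
    """Return True if PAG should run at this noise level (illustrative sketch).

    Sigma decreases during sampling, so sigma_start is the upper bound
    and sigma_end is the lower bound of the active window.
    """
    # Assumed convention: a negative value disables that bound.
    upper = float("inf") if sigma_start < 0 else sigma_start
    lower = float("-inf") if sigma_end < 0 else sigma_end
    return lower < sigma <= upper


# Hypothetical sigma schedule, from high noise to low noise.
sigmas = [14.6, 5.0, 1.5, 0.7, 0.3, 0.0]
active = [pag_is_active(s, sigma_end=0.7) for s in sigmas]
```

With sigma_end set to 0.7 in this sketch, PAG runs only on the first three steps of the hypothetical schedule, so the later steps cost the same as plain CFG.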

Installation

ComfyUI

A basic PAG node is now included in ComfyUI, so you don't have to install this extension unless you want to experiment with the additional parameters.

To install the advanced PAG node from this repo, you can either:

  • git clone https://github.com/pamparamm/sd-perturbed-attention.git into the ComfyUI/custom_nodes/ folder.

  • Install it via ComfyUI Manager (search for custom node named "Perturbed-Attention Guidance").

  • Install it via comfy-cli with comfy node registry-install sd-perturbed-attention

SD WebUI (Forge)

git clone https://github.com/pamparamm/sd-perturbed-attention.git into stable-diffusion-webui-forge/extensions/ folder.

Note

You can override CFG Scale and PAG Scale for Hires. fix by opening and enabling the Override for Hires. fix tab. To disable PAG during Hires. fix, set PAG Scale under Override to 0.

SD WebUI (Auto1111)

As an alternative for A1111 WebUI, you can use the PAG implementation from the sd-webui-incantations extension.

Parameters

  • scale: PAG scale. It resembles CFG scale: higher values can increase the structural coherence of the image, but can also oversaturate or "fry" it entirely.
  • adaptive_scale: PAG dampening factor. It penalizes PAG during late denoising stages, resulting in an overall speedup: 0.0 means no penalty and 1.0 removes PAG completely.
  • unet_block: Part of the U-Net to which PAG is applied. The original paper suggests using middle.
  • unet_block_id: ID of the U-Net layer in the selected block to which PAG is applied. PAG can be applied only to layers containing self-attention blocks.
  • sigma_start / sigma_end: PAG is active only between sigma_start and sigma_end. Set both values to negative numbers to disable this feature.
  • rescale_pag: Acts similarly to the RescaleCFG node: it prevents over-exposure at high scale values. Based on Algorithm 2 from Common Diffusion Noise Schedules and Sample Steps are Flawed (Lin et al.). Set to 0 to disable this feature.
  • rescale_mode:
    • full - takes into account both CFG and PAG.
    • partial - depends only on PAG.
  • unet_block_list: Replaces both unet_block and unet_block_id and lets you select multiple U-Net layers, separated by commas. SDXL U-Net layers have multiple indices, which you can specify with a dot (if no index is given, PAG is applied to the whole layer). Example value: m0,u0.4 (PAG is applied to middle block 0 and to output block 0 with index 4).
    • d means input, m means middle and u means output.
    • SD1.5 U-Net has layers d0-d5, m0, u0-u8.
    • SDXL U-Net has layers d0-d3, m0, u0-u5. In addition, each block except d0 and d1 has index values 0-9 (like m0.7 or u0.4). d0 and d1 have index values 0-1.
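The unet_block_list syntax described above can be made concrete with a small parser. This is only a sketch of the format as documented here. The names `BlockRef` and `parse_unet_block_list` are hypothetical and do not come from the extension, and the sketch does not check that a layer actually exists in a given model.

```python
import re
from typing import List, NamedTuple, Optional


class BlockRef(NamedTuple):
    block: str            # "input", "middle" or "output"
    layer: int            # layer id within the block, e.g. 0 in "m0"
    index: Optional[int]  # optional sub-index, e.g. 4 in "u0.4"; None means the whole layer


# d = input, m = middle, u = output, as listed in the parameter description.
_BLOCK_NAMES = {"d": "input", "m": "middle", "u": "output"}
_PATTERN = re.compile(r"^([dmu])(\d+)(?:\.(\d+))?$")


def parse_unet_block_list(value: str) -> List[BlockRef]:
    """Parse a comma-separated value such as "m0,u0.4" into block references."""
    refs = []
    for item in value.split(","):
        item = item.strip()
        if not item:
            continue  # tolerate stray commas and surrounding spaces
        match = _PATTERN.match(item)
        if match is None:
            raise ValueError(f"Unrecognized block spec: {item!r}")
        prefix, layer, index = match.groups()
        refs.append(
            BlockRef(_BLOCK_NAMES[prefix], int(layer), None if index is None else int(index))
        )
    return refs


refs = parse_unet_block_list("m0,u0.4")
```

For the example value m0,u0.4, the sketch yields middle block 0 (whole layer) and output block 0 with index 4, matching the description in the list above.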

ComfyUI TensorRT

To use PAG together with ComfyUI_TensorRT, you'll need to:

  1. Build a static/dynamic TRT engine of the desired model.
  2. Build a static/dynamic TRT engine of the same model with the same TRT parameters, but with fixed PAG injection in the selected U-Net blocks (TensorRT Attach PAG node).
  3. Use the TensorRT Perturbed-Attention Guidance node with two model inputs: one for the base engine and one for the PAG engine.

sd-perturbed-attention's Issues

Question about LORA

Sorry, this is not an issue, just a question: should I connect the PAG node's model input to the CheckpointLoader or to the LoRA's output? Thanks.

Disable for High-Res Fix?

Using this with Forge. Could you add an option to disable it during the Highres-fix?

It seems to look good when used without highres fix, but when used with highres fix it seems to lead to a lot of extra duplications.

Advanced node breaking ComfyUI install

Hey there, thanks for the amazing work!

I updated ComfyUI today and installed the Advanced node from this repo. This broke my ComfyUI install: my image generations come out deep fried, both with and without the PAG (Advanced) node: https://imgur.com/a/JJanqFK
When I disable the extension/custom node, things go back to normal.

Just thought I'd mention it!

Running out of VRAM

Hello, I just installed the repo to check out your node, but on a workflow with 3 IPAdapters and a ControlNet I'm running out of memory on an RTX 3090. I can only use it on very basic workflows. Is that normal?

Question about sigma_start / sigma_end setting, and other advanced settings.

I have noticed that if I add your more advanced PAG node to my comfyui SVD workflow 5 times on blocks 0 - 5 on the unet output setting that it really seems to enhance the video. The issue I'm having is it gets very slow then. I was wanting to mess around with the sigma_start / sigma_end setting but I don't really know what numbers you would suggest for a regular ksampler.

I also have been messing around with a separate ksampler with Nvidia align your steps, and wondering if that would also affect what I set sigma_start / sigma_end to.

And last question, I was wondering about rescale setting and how high it goes, is 1 the highest and what you recommend there.

Basically just trying to get an idea for a good balanced tradeoff where you can still get good benefits without it slowing down too much. Thanks again for making this repo.

Adaptive scale behaviour

The adaptive scale seems to behave in a non-intuitive way, I added a print(signal_scale) after line 79 in pag_nodes.py to inspect the values and with 11 steps and 4 PAG scale this is what I get
Adaptive scale .1: 3.9, 3.9, 3.9, 3.8, 3.8, 3.8, 3.7, 3.7, 3.7, 3.6, 3.6
Adaptive scale .3: 3.9, 1.0, -1., -4., -7., -10, -13, -16, -19, -22, -25
Adaptive scale .5: 3.7, -19, -41, -64, -87, -10, -13, -15, -17, -20, -22

Both .3 and .5 disable PAG after the first step, where one (or at least I) would expect it to decay and zero out only after ~30% and ~50% of the steps respectively, or maybe decay by 10%, 30% and 50% every step, idk, but the way it currently works seems very unintuitive, am I missing something?

Img2img error

*** Error running post_sample: E:\stable-diffusion-webui-forge\extensions\sd-perturbed-attention\scripts\pag.py
    Traceback (most recent call last):
      File "E:\stable-diffusion-webui-forge\modules\scripts.py", line 867, in post_sample
        script.post_sample(p, ps, *script_args)
      File "E:\stable-diffusion-webui-forge\extensions\sd-perturbed-attention\scripts\pag.py", line 81, in post_sample
        if p.enable_hr and hr_override:
    AttributeError: 'StableDiffusionProcessingImg2Img' object has no attribute 'enable_hr'
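
The traceback indicates that the img2img processing object has no enable_hr attribute, so the check on line 81 fails before hr_override is even evaluated. One possible guard, sketched below with stand-in classes, is to read the attribute with a default. The class names and the helper `hires_override_active` are hypothetical and only mimic the situation; this is not the extension's actual fix.

```python
class Txt2ImgProcessing:
    """Stand-in for a txt2img processing object, which has enable_hr."""
    enable_hr = True


class Img2ImgProcessing:
    """Stand-in for an img2img processing object, which lacks enable_hr."""


def hires_override_active(p, hr_override):
    # getattr with a default avoids AttributeError when enable_hr is missing.
    return bool(getattr(p, "enable_hr", False) and hr_override)


txt2img_result = hires_override_active(Txt2ImgProcessing(), True)
img2img_result = hires_override_active(Img2ImgProcessing(), True)
```

In this sketch, the img2img case simply evaluates to False instead of raising, so the Hires. fix override is skipped for objects that have no Hires. fix setting.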
