
Comments (6)

dunkeroni commented on May 24, 2024

BUG FOUND:
Thanks @grunblatt-git for bringing this up and getting us to look at it. It turns out this is a long-standing issue with how Invoke has implemented inpainting, which is especially bad at very low step counts and also happens to mess up XL model inpainting in general; that has been causing some grief lately. I have found a solution that avoids the problem, and it will be fixed for the next release/RC.

Technical notes for me later on when I make a PR:
An off-by-one error: masking applies noise based on "timestep", which is not updated until the next iteration.

step_output = guidance(step_output, timestep, conditioning_data)
This call needs to be updated to work on the latents input instead of step_output, and moved to the beginning of the step() logic, before the scale operation.
if mask is not None and not gradient_mask:
This needs a where() operation for gradient masks.
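For reference, here is a minimal sketch of the intended fix, assuming a diffusers-style scheduler; `apply_mask_guidance`, `orig_latents`, and `noise` are hypothetical names, while `mask`, `gradient_mask`, and `timestep` mirror the snippets above:

```python
import torch

def apply_mask_guidance(latents, orig_latents, noise, mask, timestep,
                        scheduler, gradient_mask: bool) -> torch.Tensor:
    """Re-impose the unmasked region of the original image, noised to the
    CURRENT timestep. Intended to run at the start of step(), before the
    scale operation, rather than on step_output at the end (the off-by-one).

    mask is 1.0 where inpainting happens, 0.0 where the original is kept.
    """
    # Noise the original latents to the level the scheduler expects *now*.
    noised_orig = scheduler.add_noise(orig_latents, noise, timestep)
    if gradient_mask:
        # Fractional masks get the element-wise where(): only pixels that
        # are currently active keep their denoised values.
        return torch.where(mask > 0, latents, noised_orig)
    # Binary masks blend directly: lerp(a, b, w) == a * (1 - w) + b * w.
    return torch.lerp(noised_orig, latents, mask)
```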


dunkeroni commented on May 24, 2024

> Could you elaborate a bit more on how the coherence pass used to work?

There's a lot of the same word used over and over in this, so here are the definitions we'll use:
Canvas_Mask: This is the mask that you as the user set in the canvas interface.
Denoise_Mask: This is the mask that makes it to the Denoise Latents node for the first pass.
Coherence_Mask: This is the mask that makes it to the Denoise Latents node for the second pass.
Denoise_1: The primary denoise operation, using your denoise strength setting.
Denoise_2: The coherence denoise operation, using your coherence denoise strength setting.
Canvas_Image: The starting image in your canvas selection box.
Inpaint_Result: The output of the inpaint operation.
Canvas_Result: The composited output image that you as the user see in your canvas after inpainting.

The previous canvas inpaint pipelines worked like this:

  1. Upscale Canvas_Image and Canvas_Mask to the desired inpaint resolution
  2. Denoise_Mask = Blur(Canvas_Mask)
  3. VAE encode Canvas_Image -> latents_1 (this actually happened twice because of some dependencies for inpainting models)
  4. Denoise_1(Denoise_Mask, latents_1) -> latents_2
  5. Coherence_Mask gets built based on Canvas_Mask and user settings
  6. Denoise_2(Coherence_Mask, latents_2) -> latents_3
  7. VAE decode latents_3 -> Inpaint_Result
  8. Canvas_Result = Inpaint_Result composited onto Canvas_Image using Canvas_Mask
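In code form, that pipeline looks roughly like this. Every callable here is a hypothetical stand-in for a canvas graph node (not InvokeAI's actual API), and step 1's upscaling is assumed to have already happened:

```python
import torch
from typing import Callable

Tensor = torch.Tensor

def old_canvas_inpaint(
    canvas_image: Tensor,   # (C, H, W) float image in [0, 1], already upscaled
    canvas_mask: Tensor,    # (1, H, W), 1.0 inside the region to inpaint
    blur: Callable[[Tensor], Tensor],
    vae_encode: Callable[[Tensor], Tensor],
    vae_decode: Callable[[Tensor], Tensor],
    denoise: Callable[..., Tensor],                  # Denoise Latents stand-in
    build_coherence_mask: Callable[[Tensor], Tensor],
    denoise_strength: float,
    coherence_strength: float,
) -> Tensor:
    denoise_mask = blur(canvas_mask)                            # step 2
    latents_1 = vae_encode(canvas_image)                        # step 3
    latents_2 = denoise(latents_1, mask=denoise_mask,           # step 4
                        strength=denoise_strength)              # (Denoise_1)
    coherence_mask = build_coherence_mask(canvas_mask)          # step 5
    latents_3 = denoise(latents_2, mask=coherence_mask,         # step 6
                        strength=coherence_strength)            # (Denoise_2)
    inpaint_result = vae_decode(latents_3)                      # step 7
    # Step 8: composite back onto the original using the un-blurred mask.
    return torch.lerp(canvas_image, inpaint_result, canvas_mask)
```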

The key differences are in what type of coherence mask is selected. It could be the same Canvas_Mask again, a mask that covers just the edge of the Canvas_Mask, or no mask at all (Unmasked). In that third option, what's happening is first an inpaint pass, then an img2img pass, then the Canvas_Mask region of that img2img output gets cut and pasted back onto the initial image.
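For the edge-only option, one plausible construction (illustrative only; the actual Invoke implementation may differ) is a band around the mask border, built as a dilation minus an erosion:

```python
import torch
import torch.nn.functional as F

def edge_only_mask(canvas_mask: torch.Tensor, radius: int = 8) -> torch.Tensor:
    """canvas_mask: (N, 1, H, W) float, 1.0 inside the masked region.
    Returns a mask covering only a band of `radius` pixels around the edge."""
    k = 2 * radius + 1
    # Morphological dilation/erosion via max pooling.
    dilated = F.max_pool2d(canvas_mask, kernel_size=k, stride=1, padding=radius)
    eroded = 1.0 - F.max_pool2d(1.0 - canvas_mask, kernel_size=k,
                                stride=1, padding=radius)
    return (dilated - eroded).clamp(0.0, 1.0)
```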

The new strategy skips Denoise_2 and instead builds Denoise_Mask in a way that lets it expand during the process: pixels farther out from the mask affect denoising later, so they blend a bit better with their surroundings. I have tried adding an Unmasked analogue to this, and it helps but still has some quality issues on low-step LCM; I will probably include that in the 4.0 release. There are also some deeper issues with how inpainting is being handled that seem to generically affect SDXL models and possibly SD1.5 LCM, creating a lumpy texture on otherwise smooth backgrounds. I am still searching for the exact cause and a way around it, but I don't anticipate that fix being out before 4.0.
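The expanding behavior can be pictured as thresholding the gradient mask against denoising progress; this is assumed mechanics for illustration, not the exact InvokeAI code:

```python
import torch

def active_mask_at_step(gradient_mask: torch.Tensor,
                        step: int, total_steps: int) -> torch.Tensor:
    """gradient_mask is 1.0 at the mask core and falls toward 0.0 outward.
    A pixel joins the denoised region once progress passes its value, so the
    outer ring is denoised for fewer steps and blends with its surroundings."""
    progress = (step + 1) / total_steps  # runs from ~0 to 1 over the process
    return (gradient_mask >= (1.0 - progress)).float()
```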

My questions to you:
When you were getting good results in 3.7, what were your coherence settings?
Is that with SDXL or just SD1.5 models?


dunkeroni commented on May 24, 2024

Part of the issue here is that even when the edge radius is set to 0, it still treats the mask as a gradient, which does odd things at the edges when it interpolates down to the latent scale. Some things that shouldn't be included get included only for the last step. That also means that the minimum denoise is not strictly followed on the border pixels either.

I'll have to experiment to see if there is some LCM-friendly way to handle gradients, but also I can make some changes to the node that should improve the experience:

  1. Edge radius 0 (in Canvas) will disable gradient processing and use the standard mask behavior instead.
  2. Minimum levels will be passed with the mask to enforce again after downscaling.
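A sketch of item 2, under the assumption that the mask is downscaled with bilinear interpolation (hypothetical helper, not the actual node code):

```python
import torch
import torch.nn.functional as F

def downscale_mask(mask: torch.Tensor, latent_h: int, latent_w: int,
                   minimum: float) -> torch.Tensor:
    """mask: (N, 1, H, W) float in [0, 1]. Interpolating down to the latent
    scale smears hard edges into fractional values, which is how border
    pixels end up below the user's minimum denoise level."""
    small = F.interpolate(mask, size=(latent_h, latent_w),
                          mode="bilinear", align_corners=False)
    # Re-enforce the minimum: any pixel that is masked at all must be at
    # least `minimum` after downscaling.
    return torch.where(small > 0, small.clamp(min=minimum), small)
```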

At this point we are not planning to re-implement the previous coherence pass options, since they add so much complexity (and processing time) to the canvas graphs. However, if we end up having too many unresolvable edge cases then we can look into more options.


dunkeroni commented on May 24, 2024

I've been doing some extensive testing and code modifications in both 3.7 and 4.0 to figure out the many roots of this issue. Here's what I found:

  1. LCM models do not take kindly to inpainting. You get the same artifacts on 3.7 depending on the settings. The best way to avoid them is to use img2img or mask the entire image (which is what the Unmasked coherence mode did in 3.7). The issues are much more pronounced at low step counts. By default, the coherence pass in 3.7 is a 20-step denoise process after the 8-step one that the user has requested; for anyone in that use case, it would be more efficient to just use a non-LCM model at 20 steps in the 4.0 release.
  2. The severity of the artifacting in the new coherence modes is not caused by the gradient denoise itself. The artifacts are actually just outside the mask region. They are being uncovered by the Mask Blur parameter in the paste-back operation, which blends the masked region with the original image to avoid a visible VAE discrepancy.

Changes to make the situation better:

  1. Standard masking denoise ends the process with a torch.lerp() pass that replaces the unmasked region. This is skipped for gradient masks; reintroducing it as a torch.where() helps, but still leaves some messiness on the border pixels (see the sketch after this list). Some investigation is still to be done on whether there are mid-process ways to improve this.
  2. Setting Edge Radius to 0 will revert to the normal denoise mask behavior.
  3. Adding an "Unmasked" gradient coherence mode that switches to full img2img midway through the process. The paste-back mask is handled as normal.
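A minimal sketch of the final-pass change from item 1, with hypothetical names; `mask` is 1.0 where denoising applied:

```python
import torch

def finalize_masked_latents(latents: torch.Tensor, orig_latents: torch.Tensor,
                            mask: torch.Tensor, gradient: bool) -> torch.Tensor:
    if not gradient:
        # Standard masks: smooth blend that replaces the unmasked region.
        # lerp(a, b, w) == a * (1 - w) + b * w.
        return torch.lerp(orig_latents, latents, mask)
    # Gradient masks are fractional mid-process, but at the end any pixel
    # that was never denoised (mask == 0) should be hard-replaced; where()
    # selects element-wise without re-blending the denoised region.
    return torch.where(mask > 0, latents, orig_latents)
```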


grunblatt-git commented on May 24, 2024

> or mask the entire image (which is what the Unmasked coherence mode did in 3.7).

Could you elaborate a bit more on how the coherence pass used to work?

Up to 3.7 I used LCM models with 6 steps for inpainting and usually reduced the number of steps for the coherence pass to 3. This solution was fast enough to run on a mediocre GPU and produced decent (or good) results without any notable artifacts.

However, if I use inpainting with LCM (which creates artifacts) and then use img2img within the same bounding box at 3-6 steps, it produces a very noticeable edge on the area selected by the bounding box.

Is there any way to emulate the previous coherence pass results manually?

Since inpainting with LCM could be extremely fast before, it would be great to keep a similar behaviour in 4.0.0 and later versions.

Edit:
Maybe I just have to figure out the proper settings for 1.5 models.
I just removed all LCM-related settings and did inpainting with a normal SD 1.5 model at 20 steps. The results are better (as in, I don't see artifacts on the edge of the mask), but the masked area itself still contains artifacts. There are also some minor changes to the image outside of the masked area that I didn't expect when inpainting.


grunblatt-git commented on May 24, 2024

> When you were getting good results in 3.7, what were your coherence settings?

I just tried downgrading back to 3.7 to reproduce this, but ran into some exceptions, so I can't answer this for sure.
However, I left the inpainting settings mostly untouched.

As far as I can remember, I set the coherence steps down to 3 (or 6) and decreased the blur radius.

> Is that with SDXL or just SD1.5 models?

Due to hardware restrictions I only used SD1.5 models (almost exclusively with the LCM LoRA and scheduler).

Edit:
So I tried to get inpainting working for me in 4.0.0rc2, and it seems like it's really mostly an issue with the step count.

I tried some other Schedulers and they behave mostly the same:

DPM++ SDE Karras was the only one that looked great to me when inpainting with 12 steps.

All the other schedulers I tried (LCM, DPM++ 2M SDE Karras, LMS) produced notable artifacts at low step counts (starting at 6 steps) and decent results at the default setting of 20 steps for inpainting.
However, all of them (including LCM) produced good inpainting results at around 30 steps.

Some more schedulers (Euler, Euler a, Heun) caused exceptions during inpainting, so I'm not sure how they would hold up.
