timothybrooks / instruct-pix2pix
I'm getting the following error when running the cli tool:
Loading model from checkpoints/instruct-pix2pix-00-22000.ckpt
Global Step: 22000
Traceback (most recent call last):
File "edit_cli.py", line 128, in
main()
File "edit_cli.py", line 79, in main
model = load_model_from_config(config, args.ckpt, args.vae_ckpt)
File "edit_cli.py", line 52, in load_model_from_config
model = instantiate_from_config(config.model)
File "/ingest/ImageDiffuserService/proto/instruct2pix2pix/instruct-pix2pix/stable_diffusion/ldm/util.py", line 85, in instantiate_from_config
return get_obj_from_str(config["target"])(**config.get("params", dict()))
File "/ingest/ImageDiffuserService/proto/instruct2pix2pix/instruct-pix2pix/stable_diffusion/ldm/util.py", line 93, in get_obj_from_str
return getattr(importlib.import_module(module, package=None), cls)
File "/home/elliot/anaconda3/envs/pytorch-env/lib/python3.8/importlib/init.py", line 127, in import_module
return _bootstrap._gcd_import(name[level:], package, level)
File "", line 1014, in _gcd_import
File "", line 991, in _find_and_load
File "", line 973, in _find_and_load_unlocked
ModuleNotFoundError: No module named 'ldm.models.diffusion.ddpm_edit'
I've googled "ldm.models.diffusion.ddpm_edit" and don't see any references to this module existing. Any idea as to what I'm doing wrong?
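One quick way to narrow this down (just a diagnostic sketch, not something from the repo) is to check which ldm package Python actually imports; it should resolve to the stable_diffusion submodule vendored inside this repo, which is where ddpm_edit.py lives (the repo's scripts append that path to sys.path, if I read them correctly):
# Diagnostic sketch (assumes it is run from the instruct-pix2pix repo root)
import os, sys
sys.path.append("./stable_diffusion")
import ldm
print(ldm.__file__)  # expected to point inside .../instruct-pix2pix/stable_diffusion/ldm/
print(os.path.exists("stable_diffusion/ldm/models/diffusion/ddpm_edit.py"))
If the first print points at some other ldm installation, that other copy has no ddpm_edit module and would produce exactly this error.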
'bash' is not recognized as an internal or external command,
operable program or batch file.
Hello, when running
import PIL
import requests
import torch
from diffusers import StableDiffusionInstructPix2PixPipeline, EulerAncestralDiscreteScheduler
model_id = "timbrooks/instruct-pix2pix"
pipe = StableDiffusionInstructPix2PixPipeline.from_pretrained(model_id, torch_dtype=torch.float16, revision="fp16", safety_checker=None)
pipe.to("cuda")
pipe.enable_attention_slicing()
I keep encountering "AttributeError: module transformers has no attribute CLIPImageProcessor".
I tried installing CLIP and updating transformers, but got the same error. The only similar issue I could find was
and Google Translate didn't help much lol
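For what it's worth, CLIPImageProcessor only exists in newer transformers releases than the one pinned by this repo, so a quick version check can tell whether the upgrade actually took effect in the environment diffusers is using (a hedged sketch, not an official fix):
# Sanity check: does the installed transformers provide CLIPImageProcessor?
import transformers
print(transformers.__version__)
try:
    from transformers import CLIPImageProcessor  # only present in newer releases
    print("CLIPImageProcessor available")
except ImportError:
    # upgrading (e.g. pip install -U transformers) inside the same environment is the usual remedy
    print("CLIPImageProcessor missing; this transformers version is too old for this diffusers build")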
Hi folks,
I tried to run it, but the app exits with a 'killed' message.
dmesg states that this is because of an Out Of Memory error.
Can I run this app in system memory, using CPU? How could I do it?
Thank you!
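If it helps, a rough sketch of running the model purely on CPU through the diffusers pipeline used elsewhere in this thread (this is my assumption, not the repo's edit_cli.py, and it will be slow):
# Hedged sketch: CPU-only inference via diffusers (fp32, no CUDA required)
import torch
from PIL import Image
from diffusers import StableDiffusionInstructPix2PixPipeline

pipe = StableDiffusionInstructPix2PixPipeline.from_pretrained(
    "timbrooks/instruct-pix2pix", torch_dtype=torch.float32, safety_checker=None
)
pipe.to("cpu")  # keep everything in system memory

image = Image.open("input.jpg").convert("RGB")  # hypothetical input path
out = pipe("turn him into a cyborg", image=image, num_inference_steps=20).images[0]
out.save("output.jpg")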
Hi, this is a minor nit, and I'm asking to see if there are explicit motivations for it. In the process of supporting this model in Draw Things, I noticed that unlike inpainting models, where the encoded image is multiplied by the "scale factor": https://github.com/runwayml/stable-diffusion/blob/main/ldm/models/diffusion/ddpm.py#L550 in instruct-pix2pix we don't: https://github.com/timothybrooks/instruct-pix2pix/blob/main/edit_cli.py#L103
Not a big deal to me, as I figured it out, modified it a bit, and it worked exactly as expected. But I want to call it out and see whether there are considerations behind it, so I know where my modifications should be applied (whether to treat the edit model as a special case, or just modify the first-layer conv2d weights).
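To make the difference concrete, here is a rough illustration of the two conventions as I understand the LDM API (assuming `model` is the loaded LatentDiffusion checkpoint and `input_image` a normalized image tensor; treat this as a sketch, not the authors' statement):
# Inpainting-style conditioning: get_first_stage_encoding multiplies the latent by model.scale_factor
c_concat_inpaint = model.get_first_stage_encoding(model.encode_first_stage(input_image))

# instruct-pix2pix (as read from edit_cli.py): the posterior mode is used directly, no scale factor applied
c_concat_ip2p = model.encode_first_stage(input_image).mode()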
This is not an issue.
I hope you can add this tutorial to the readme section of the page.
Thank you so much for this great new AI model. I hope you release an improved version soon.
My tutorial could be the first one showing the easiest way to use it.
I'm using your Colab and I'm trying to figure out how to automatically save all the images generated from one prompt into a folder.
Any help?
Appreciate
Windows 10 (check whether bash or wsl is working)
If not working:
If you run wsl --install and see the WSL help text, please try running wsl --list --online to see a list of available distros, and run wsl --install -d <DistroName> to install a distro. To uninstall WSL, see "Uninstall legacy version of WSL" or unregister or uninstall a Linux distribution.
after installation test
try
wsl bash -c "echo hi from simple script"
the echo should work.
The model name instruct-pix2pix-00-22000.ckpt looks like it was trained for 22 epochs. But when I trained the model, it was trained for more than that. I just want to confirm how many epochs we should train to get the final results. Thanks.
Is it possible to use this with the automatic1111 webui?
I tried downloading the ckpt and used the config file, but it seems it is not enough.
On Hugging Face:
timbrooks/instruct-pix2pix
Trying to turn David into a cyborg with the same settings as your readme does not work.
It returns a multi-color blur.
Fix CFG: ON
Text CFG: 7.5
Image CFG: 1.2
Hi!
I did all the steps required from the readme, and when running python edit_app.py it appears:
Loading model from checkpoints/instruct-pix2pix-00-22000.ckpt
and then it just closes the terminal and shows nothing. It should open the Gradio interface, right?
Appreciate
ImportError: cannot import name 'VectorQuantizer2' from 'taming.modules.vqvae.quantize'
I'm getting an error like ... What could be the reason for this?
I get this error. How can I correct it?
╭─────────────────────────────── Traceback (most recent call last) ────────────────────────────────╮
│ F:\Work area\instruct-pix2pix\edit_cli.py:128 in <module> │
│ │
│ 125 │
│ 126 │
│ 127 if __name__ == "__main__": │
│ ❱ 128 │ main() │
│ 129 │
│ │
│ F:\Work area\instruct-pix2pix\edit_cli.py:80 in main │
│ │
│ 77 │ │
│ 78 │ config = OmegaConf.load(args.config) │
│ 79 │ model = load_model_from_config(config, args.ckpt, args.vae_ckpt) │
│ ❱ 80 │ model.eval().cuda() │
│ 81 │ model_wrap = K.external.CompVisDenoiser(model) │
│ 82 │ model_wrap_cfg = CFGDenoiser(model_wrap) │
│ 83 │ null_token = model.get_learned_conditioning([""]) │
│ │
│ C:\Users\user\miniconda3\lib\site-packages\pytorch_lightning\core\mixins\device_dtype_mixin.py:1 │
│ 27 in cuda │
│ │
│ 124 │ │ if device is None or isinstance(device, int): │
│ 125 │ │ │ device = torch.device("cuda", index=device) │
│ 126 │ │ self.__update_properties(device=device) │
│ ❱ 127 │ │ return super().cuda(device=device) │
│ 128 │ │
│ 129 │ def cpu(self) -> "DeviceDtypeModuleMixin": │
│ 130 │ │ """Moves all model parameters and buffers to the CPU. │
│ │
│ C:\Users\user\miniconda3\lib\site-packages\torch\nn\modules\module.py:749 in cuda │
│ │
│ 746 │ │ Returns: │
│ 747 │ │ │ Module: self │
│ 748 │ │ """ │
│ ❱ 749 │ │ return self._apply(lambda t: t.cuda(device)) │
│ 750 │ │
│ 751 │ def ipu(self: T, device: Optional[Union[int, device]] = None) -> T: │
│ 752 │ │ r"""Moves all model parameters and buffers to the IPU. │
│ │
│ C:\Users\user\miniconda3\lib\site-packages\torch\nn\modules\module.py:641 in _apply │
│ │
│ 638 │ │
│ 639 │ def _apply(self, fn): │
│ 640 │ │ for module in self.children(): │
│ ❱ 641 │ │ │ module._apply(fn) │
│ 642 │ │ │
│ 643 │ │ def compute_should_use_set_data(tensor, tensor_applied): │
│ 644 │ │ │ if torch._has_compatible_shallow_copy_type(tensor, tensor_applied): │
│ │
│ C:\Users\user\miniconda3\lib\site-packages\torch\nn\modules\module.py:641 in _apply │
│ │
│ 638 │ │
│ 639 │ def _apply(self, fn): │
│ 640 │ │ for module in self.children(): │
│ ❱ 641 │ │ │ module._apply(fn) │
│ 642 │ │ │
│ 643 │ │ def compute_should_use_set_data(tensor, tensor_applied): │
│ 644 │ │ │ if torch._has_compatible_shallow_copy_type(tensor, tensor_applied): │
│ │
│ C:\Users\user\miniconda3\lib\site-packages\torch\nn\modules\module.py:641 in _apply │
│ │
│ 638 │ │
│ 639 │ def _apply(self, fn): │
│ 640 │ │ for module in self.children(): │
│ ❱ 641 │ │ │ module._apply(fn) │
│ 642 │ │ │
│ 643 │ │ def compute_should_use_set_data(tensor, tensor_applied): │
│ 644 │ │ │ if torch._has_compatible_shallow_copy_type(tensor, tensor_applied): │
│ │
│ C:\Users\user\miniconda3\lib\site-packages\torch\nn\modules\module.py:641 in _apply │
│ │
│ 638 │ │
│ 639 │ def _apply(self, fn): │
│ 640 │ │ for module in self.children(): │
│ ❱ 641 │ │ │ module._apply(fn) │
│ 642 │ │ │
│ 643 │ │ def compute_should_use_set_data(tensor, tensor_applied): │
│ 644 │ │ │ if torch._has_compatible_shallow_copy_type(tensor, tensor_applied): │
│ │
│ C:\Users\user\miniconda3\lib\site-packages\torch\nn\modules\module.py:664 in _apply │
│ │
│ 661 │ │ │ # track autograd history of `param_applied`, so we have to use │
│ 662 │ │ │ # `with torch.no_grad():` │
│ 663 │ │ │ with torch.no_grad(): │
│ ❱ 664 │ │ │ │ param_applied = fn(param) │
│ 665 │ │ │ should_use_set_data = compute_should_use_set_data(param, param_applied) │
│ 666 │ │ │ if should_use_set_data: │
│ 667 │ │ │ │ param.data = param_applied │
│ │
│ C:\Users\user\miniconda3\lib\site-packages\torch\nn\modules\module.py:749 in <lambda> │
│ │
│ 746 │ │ Returns: │
│ 747 │ │ │ Module: self │
│ 748 │ │ """ │
│ ❱ 749 │ │ return self._apply(lambda t: t.cuda(device)) │
│ 750 │ │
│ 751 │ def ipu(self: T, device: Optional[Union[int, device]] = None) -> T: │
│ 752 │ │ r"""Moves all model parameters and buffers to the IPU. │
│ │
│ C:\Users\user\miniconda3\lib\site-packages\torch\cuda\__init__.py:221 in _lazy_init │
│ │
│ 218 │ │ │ │ "Cannot re-initialize CUDA in forked subprocess. To use CUDA with " │
│ 219 │ │ │ │ "multiprocessing, you must use the 'spawn' start method") │
│ 220 │ │ if not hasattr(torch._C, '_cuda_getDeviceCount'): │
│ ❱ 221 │ │ │ raise AssertionError("Torch not compiled with CUDA enabled") │
│ 222 │ │ if _cudart is None: │
│ 223 │ │ │ raise AssertionError( │
│ 224 │ │ │ │ "libcudart functions unavailable. It looks like you have a broken build? │
╰──────────────────────────────────────────────────────────────────────────────────────────────────╯
AssertionError: Torch not compiled with CUDA enabled
I am also attaching my video card information. Just to make sure that it has enough resources.
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 528.24 Driver Version: 528.24 CUDA Version: 12.0 |
|-------------------------------+----------------------+----------------------+
| GPU Name TCC/WDDM | Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|===============================+======================+======================|
| 0 NVIDIA GeForce ... WDDM | 00000000:01:00.0 On | N/A |
| 21% 45C P0 52W / 200W | 1398MiB / 8192MiB | 1% Default |
| | | N/A |
+-------------------------------+----------------------+----------------------+
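"Torch not compiled with CUDA enabled" usually means a CPU-only PyTorch build ended up in the environment, independently of the driver shown above. A quick hedged check:
# Verify whether the installed torch build has CUDA support at all
import torch
print(torch.__version__)         # a "+cpu" suffix indicates a CPU-only wheel
print(torch.version.cuda)        # None for CPU-only builds
print(torch.cuda.is_available())
# If this prints None/False, reinstalling PyTorch with CUDA support
# (e.g. the cudatoolkit=11.3 build from the repo's environment.yaml) is the usual remedy.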
When I try to generate an image from a 512x512 image (tried both jpg and png), I get the following error output. The error appears to be roughly the same whether I use the gradio webui or just straight from the command line. Any idea what might be causing this?
╭─────────────────────────────── Traceback (most recent call last) ────────────────────────────────╮
│ /home/wh33t/instruct-pix2pix/edit_cli.py:128 in <module> │
│ │
│ 125 │
│ 126 │
│ 127 if __name__ == "__main__": │
│ ❱ 128 │ main() │
│ 129 │
│ │
│ /home/wh33t/instruct-pix2pix/edit_cli.py:98 in main │
│ │
│ 95 │ │ input_image.save(args.output) │
│ 96 │ │ return │
│ 97 │ │
│ ❱ 98 │ with torch.no_grad(), autocast("cuda"), model.ema_scope(): │
│ 99 │ │ cond = {} │
│ 100 │ │ cond["c_crossattn"] = [model.get_learned_conditioning([args.edit])] │
│ 101 │ │ input_image = 2 * torch.tensor(np.array(input_image)).float() / 255 - 1 │
│ │
│ /home/wh33t/anaconda3/envs/ip2p/lib/python3.8/contextlib.py:113 in __enter__ │
│ │
│ 110 │ │ # they are only needed for recreation, which is not possible anymore │
│ 111 │ │ del self.args, self.kwds, self.func │
│ 112 │ │ try: │
│ ❱ 113 │ │ │ return next(self.gen) │
│ 114 │ │ except StopIteration: │
│ 115 │ │ │ raise RuntimeError("generator didn't yield") from None │
│ 116 │
│ │
│ /home/wh33t/instruct-pix2pix/./stable_diffusion/ldm/models/diffusion/ddpm_edit.py:185 in │
│ ema_scope │
│ │
│ 182 │ @contextmanager │
│ 183 │ def ema_scope(self, context=None): │
│ 184 │ │ if self.use_ema: │
│ ❱ 185 │ │ │ self.model_ema.store(self.model.parameters()) │
│ 186 │ │ │ self.model_ema.copy_to(self.model) │
│ 187 │ │ │ if context is not None: │
│ 188 │ │ │ │ print(f"{context}: Switched to EMA weights") │
│ │
│ /home/wh33t/instruct-pix2pix/./stable_diffusion/ldm/modules/ema.py:62 in store │
│ │
│ 59 │ │ parameters: Iterable of `torch.nn.Parameter`; the parameters to be │
│ 60 │ │ │ temporarily stored. │
│ 61 │ │ """ │
│ ❱ 62 │ │ self.collected_params = [param.clone() for param in parameters] │
│ 63 │ │
│ 64 │ def restore(self, parameters): │
│ 65 │ │ """ │
│ │
│ /home/wh33t/instruct-pix2pix/./stable_diffusion/ldm/modules/ema.py:62 in <listcomp> │
│ │
│ 59 │ │ parameters: Iterable of `torch.nn.Parameter`; the parameters to be │
│ 60 │ │ │ temporarily stored. │
│ 61 │ │ """ │
│ ❱ 62 │ │ self.collected_params = [param.clone() for param in parameters] │
│ 63 │ │
│ 64 │ def restore(self, parameters): │
│ 65 │ │ """ │
╰──────────────────────────────────────────────────────────────────────────────────────────────────╯
RuntimeError: CUDA out of memory. Tried to allocate 50.00 MiB (GPU 0; 11.75 GiB total capacity; 9.65 GiB already allocated; 30.00 MiB free; 9.83 GiB reserved in total by PyTorch) If reserved memory is >> allocated
memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
Running WSL Ubuntu 18, RTX 2080.
Initially I had #19; I switched to fp16 and now I'm getting (excuse the paste):
╭─────────────────────────────── Traceback (most recent call last) ────────────────────────────────╮
│ /home/:)/instruct-pix2pix/edit_app.py:270 in <module> │
│ │
│ 267 │
│ 268 │
│ 269 if __name__ == "__main__": │
│ ❱ 270 │ main() │
│ 271 │
│ │
│ /home/:)/instruct-pix2pix/edit_app.py:115 in main │
│ │
│ 112 │ model.eval().cuda() │
│ 113 │ model_wrap = K.external.CompVisDenoiser(model) │
│ 114 │ model_wrap_cfg = CFGDenoiser(model_wrap) │
│ ❱ 115 │ null_token = model.get_learned_conditioning([""]) │
│ 116 │ example_image = Image.open("imgs/example.jpg").convert("RGB") │
│ 117 │ │
│ 118 │ def load_example( │
│ │
│ /home/:)/instruct-pix2pix/./stable_diffusion/ldm/models/diffusion/ddpm_edit.py:588 in │
│ get_learned_conditioning │
│ │
│ 585 │ def get_learned_conditioning(self, c): │
│ 586 │ │ if self.cond_stage_forward is None: │
│ 587 │ │ │ if hasattr(self.cond_stage_model, 'encode') and callable(self.cond_stage_mod │
│ ❱ 588 │ │ │ │ c = self.cond_stage_model.encode(c) │
│ 589 │ │ │ │ if isinstance(c, DiagonalGaussianDistribution): │
│ 590 │ │ │ │ │ c = c.mode() │
│ 591 │ │ │ else: │
│ │
│ /home/:)/instruct-pix2pix/./stable_diffusion/ldm/modules/encoders/modules.py:162 in encode │
│ │
│ 159 │ │ return z │
│ 160 │ │
│ 161 │ def encode(self, text): │
│ ❱ 162 │ │ return self(text) │
│ 163 │
│ 164 │
│ 165 class FrozenCLIPTextEmbedder(nn.Module): │
│ │
│ /home/rei/micromamba/envs/ip2p/lib/python3.8/site-packages/torch/nn/modules/module.py:1110 in │
│ _call_impl │
│ │
│ 1107 │ │ # this function, and just call forward. │
│ 1108 │ │ if not (self._backward_hooks or self._forward_hooks or self._forward_pre_hooks o │
│ 1109 │ │ │ │ or _global_forward_hooks or _global_forward_pre_hooks): │
│ ❱ 1110 │ │ │ return forward_call(*input, **kwargs) │
│ 1111 │ │ # Do not call functions when jit is used │
│ 1112 │ │ full_backward_hooks, non_full_backward_hooks = [], [] │
│ 1113 │ │ if self._backward_hooks or _global_backward_hooks: │
│ │
│ /home/:)/instruct-pix2pix/./stable_diffusion/ldm/modules/encoders/modules.py:156 in forward │
│ │
│ 153 │ │ batch_encoding = self.tokenizer(text, truncation=True, max_length=self.max_lengt │
│ 154 │ │ │ │ │ │ │ │ │ │ return_overflowing_tokens=False, padding="max_le │
│ 155 │ │ tokens = batch_encoding["input_ids"].to(self.device) │
│ ❱ 156 │ │ outputs = self.transformer(input_ids=tokens) │
│ 157 │ │ │
│ 158 │ │ z = outputs.last_hidden_state │
│ 159 │ │ return z │
│ │
│ /home/:)/micromamba/envs/ip2p/lib/python3.8/site-packages/torch/nn/modules/module.py:1110 in │
│ _call_impl │
│ │
│ 1107 │ │ # this function, and just call forward. │
│ 1108 │ │ if not (self._backward_hooks or self._forward_hooks or self._forward_pre_hooks o │
│ 1109 │ │ │ │ or _global_forward_hooks or _global_forward_pre_hooks): │
│ ❱ 1110 │ │ │ return forward_call(*input, **kwargs) │
│ 1111 │ │ # Do not call functions when jit is used │
│ 1112 │ │ full_backward_hooks, non_full_backward_hooks = [], [] │
│ 1113 │ │ if self._backward_hooks or _global_backward_hooks: │
│ │
│ /home/:)/micromamba/envs/ip2p/lib/python3.8/site-packages/transformers/models/clip/modeling_cli │
│ p.py:722 in forward │
│ │
│ 719 │ │ >>> last_hidden_state = outputs.last_hidden_state │
│ 720 │ │ >>> pooled_output = outputs.pooler_output # pooled (EOS token) states │
│ 721 │ │ ```""" │
│ ❱ 722 │ │ return self.text_model( │
│ 723 │ │ │ input_ids=input_ids, │
│ 724 │ │ │ attention_mask=attention_mask, │
│ 725 │ │ │ position_ids=position_ids, │
│ │
│ /home/:)/micromamba/envs/ip2p/lib/python3.8/site-packages/torch/nn/modules/module.py:1110 in │
│ _call_impl │
│ │
│ 1107 │ │ # this function, and just call forward. │
│ 1108 │ │ if not (self._backward_hooks or self._forward_hooks or self._forward_pre_hooks o │
│ 1109 │ │ │ │ or _global_forward_hooks or _global_forward_pre_hooks): │
│ ❱ 1110 │ │ │ return forward_call(*input, **kwargs) │
│ 1111 │ │ # Do not call functions when jit is used │
│ 1112 │ │ full_backward_hooks, non_full_backward_hooks = [], [] │
│ 1113 │ │ if self._backward_hooks or _global_backward_hooks: │
│ │
│ /home/:)/micromamba/envs/ip2p/lib/python3.8/site-packages/transformers/models/clip/modeling_cli │
│ p.py:643 in forward │
│ │
│ 640 │ │ │ # [bsz, seq_len] -> [bsz, 1, tgt_seq_len, src_seq_len] │
│ 641 │ │ │ attention_mask = _expand_mask(attention_mask, hidden_states.dtype) │
│ 642 │ │ │
│ ❱ 643 │ │ encoder_outputs = self.encoder( │
│ 644 │ │ │ inputs_embeds=hidden_states, │
│ 645 │ │ │ attention_mask=attention_mask, │
│ 646 │ │ │ causal_attention_mask=causal_attention_mask, │
│ │
│ /home/:)/micromamba/envs/ip2p/lib/python3.8/site-packages/torch/nn/modules/module.py:1110 in │
│ _call_impl │
│ │
│ 1107 │ │ # this function, and just call forward. │
│ 1108 │ │ if not (self._backward_hooks or self._forward_hooks or self._forward_pre_hooks o │
│ 1109 │ │ │ │ or _global_forward_hooks or _global_forward_pre_hooks): │
│ ❱ 1110 │ │ │ return forward_call(*input, **kwargs) │
│ 1111 │ │ # Do not call functions when jit is used │
│ 1112 │ │ full_backward_hooks, non_full_backward_hooks = [], [] │
│ 1113 │ │ if self._backward_hooks or _global_backward_hooks: │
│ │
│ /home/:)/micromamba/envs/ip2p/lib/python3.8/site-packages/transformers/models/clip/modeling_cli │
│ p.py:574 in forward │
│ │
│ 571 │ │ │ │ │ causal_attention_mask, │
│ 572 │ │ │ │ ) │
│ 573 │ │ │ else: │
│ ❱ 574 │ │ │ │ layer_outputs = encoder_layer( │
│ 575 │ │ │ │ │ hidden_states, │
│ 576 │ │ │ │ │ attention_mask, │
│ 577 │ │ │ │ │ causal_attention_mask, │
│ │
│ /home/:)/micromamba/envs/ip2p/lib/python3.8/site-packages/torch/nn/modules/module.py:1110 in │
│ _call_impl │
│ │
│ 1107 │ │ # this function, and just call forward. │
│ 1108 │ │ if not (self._backward_hooks or self._forward_hooks or self._forward_pre_hooks o │
│ 1109 │ │ │ │ or _global_forward_hooks or _global_forward_pre_hooks): │
│ ❱ 1110 │ │ │ return forward_call(*input, **kwargs) │
│ 1111 │ │ # Do not call functions when jit is used │
│ 1112 │ │ full_backward_hooks, non_full_backward_hooks = [], [] │
│ 1113 │ │ if self._backward_hooks or _global_backward_hooks: │
│ │
│ /home/:)/micromamba/envs/ip2p/lib/python3.8/site-packages/transformers/models/clip/modeling_cli │
│ p.py:317 in forward │
│ │
│ 314 │ │ residual = hidden_states │
│ 315 │ │ │
│ 316 │ │ hidden_states = self.layer_norm1(hidden_states) │
│ ❱ 317 │ │ hidden_states, attn_weights = self.self_attn( │
│ 318 │ │ │ hidden_states=hidden_states, │
│ 319 │ │ │ attention_mask=attention_mask, │
│ 320 │ │ │ causal_attention_mask=causal_attention_mask, │
│ │
│ /home/:)/micromamba/envs/ip2p/lib/python3.8/site-packages/torch/nn/modules/module.py:1110 in │
│ _call_impl │
│ │
│ 1107 │ │ # this function, and just call forward. │
│ 1108 │ │ if not (self._backward_hooks or self._forward_hooks or self._forward_pre_hooks o │
│ 1109 │ │ │ │ or _global_forward_hooks or _global_forward_pre_hooks): │
│ ❱ 1110 │ │ │ return forward_call(*input, **kwargs) │
│ 1111 │ │ # Do not call functions when jit is used │
│ 1112 │ │ full_backward_hooks, non_full_backward_hooks = [], [] │
│ 1113 │ │ if self._backward_hooks or _global_backward_hooks: │
│ │
│ /home/:)/micromamba/envs/ip2p/lib/python3.8/site-packages/transformers/models/clip/modeling_cli │
│ p.py:257 in forward │
│ │
│ 254 │ │ │
│ 255 │ │ attn_probs = nn.functional.dropout(attn_weights, p=self.dropout, training=self.t │
│ 256 │ │ │
│ ❱ 257 │ │ attn_output = torch.bmm(attn_probs, value_states) │
│ 258 │ │ │
│ 259 │ │ if attn_output.size() != (bsz * self.num_heads, tgt_len, self.head_dim): │
│ 260 │ │ │ raise ValueError( │
╰──────────────────────────────────────────────────────────────────────────────────────────────────╯
RuntimeError: expected scalar type Half but found Float
I managed to get this running on windows.
The first of the two primary issues is that the checkpoint download script is written for bash, which can be solved by either installing bash or downloading the link manually. Then you create a checkpoints directory and put the checkpoint in there.
The second issue is that the specified version of transformers does not work on Windows, failing with a CLIP issue. To solve that, edit requirements.txt and environment.yaml to change transformers from 4.19.2 to 4.25.1.
After that it should fire right up and you can use the webui fine.
Hi!
I changed the Google Colab version a bit: https://colab.research.google.com/github/huggingface/notebooks/blob/main/diffusers/InstructPix2Pix_using_diffusers.ipynb
I made it so that it accepts a folder of images (frames) in order to create a video afterwards.
I would like to share it publicly so others can use it, but without changing the main notebook. Also, when they load the notebook again it shouldn't lose their changes, and it shouldn't pop up a warning when running the cells saying it isn't secure and showing my email as the owner.
Just like the link I provided. Any suggestions? I have searched everywhere and Google Colab support doesn't help. I also asked ChatGPT and it doesn't give me a proper answer.
Any suggestion will be much appreciated!
Hi!
I get an error trying to make it work. I git cloned the repository, created and activated the environment, and ran python edit_app.py.
Got this error:
Loading model from checkpoints/instruct-pix2pix-00-22000.ckpt
Traceback (most recent call last):
File "edit_app.py", line 268, in <module>
main()
File "edit_app.py", line 109, in main
model = load_model_from_config(config, args.ckpt, args.vae_ckpt)
File "edit_app.py", line 78, in load_model_from_config
pl_sd = torch.load(ckpt, map_location="cpu")
File "/home/zaesarpo/anaconda3/envs/ip2p/lib/python3.8/site-packages/torch/serialization.py", line 705, in load
with _open_zipfile_reader(opened_file) as opened_zipfile:
File "/home/zaesarpo/anaconda3/envs/ip2p/lib/python3.8/site-packages/torch/serialization.py", line 243, in __init__
super(_open_zipfile_reader, self).__init__(torch._C.PyTorchFileReader(name_or_buffer))
RuntimeError: PytorchStreamReader failed reading zip archive: failed finding central directory
Any suggestions?
Appreciate
when I run
conda env create -f environment.yaml
I get
Collecting package metadata (repodata.json): done
Solving environment: failed
ResolvePackageNotFound:
- cudatoolkit=11.3
- numpy=1.19.2
- python=3.8.5
- torchvision=0.12.0
- pytorch=1.11.0
- pip=20.3
My device info:
MacBook Air M1 2020
Hi, is it possible to use negative prompts?
I was mentioned in a discussion about this project being an extension for automatic1111/stable-diffusion-webui.
In a quick look around your code, I determined that it could take about a day to implement it. Really, it could take as little as fifteen minutes, but I'm not very familiar with your code, so I put in an estimate of a long day's worth of time to hash out the little details.
But, I figured that you might be interested in doing this yourself.
So the first thing is, I noticed the attached license already gives permission for reuse as long as I release it under the same conditions.
I have no money to gain from this, nor have any plans to. If it's something I decide to take on, it'll be for the experience. So if you are not interested in doing it, may I have your blessing?
The second thing, here's some information about how extensions work, and the approach to turn it into an extension if you choose to do so.
During loading of the webui, it looks in its directory labeled extensions; each subdirectory is considered an extension.
If a user installed the extension using a GitHub URL, or clicked on a provided name (from known extensions, from a file in the project that has URLs), it installs them with the folder name the same as the GitHub repository.
The first thing it checks for is a file called install.py in the project. The intention of this file is to check if dependencies are installed, and install them if they are not. This doesn't work as cleanly as you'd expect, because it only runs on a reboot of the app.
In my experience, users will check for an update using the extensions tab in the UI and hit the "Apply and Restart" button. This button does a soft restart, which does not run the install.py file, but it does reload the other Python scripts and JavaScript files.
My solution was to check and handle this in the other Python files.
In your extension's project directory, it will assume that the Python scripts it should load are in the scripts folder, and the JavaScript in the javascript folder.
I'm mentioning JavaScript here first because it has less to mention; I don't see that you use it, but for completeness I'll mention it.
It reads the javascript folder, scrapes the names, and creates an HTML <script src=yourfilename.js> tag in the head. Since each of these files will be loaded in alphabetical order, you don't need to import them from one to another, just know they are loaded in the DOM.
For the Python files, it reads the scripts folder files by name, reads each as a file, appends it into an object, and runs exec. This is to create a namespace, so each file is loaded independently. It will then read each namespace for an object that has a type of Script, which a script can inherit from modules/scripts.Script. Those types will be added to the scripts dropdown on either txt2img or img2img, or both, depending on what you return from the show method.
For the other files, they will be in the project's namespace. But you have options with them, such as putting things in settings, interfering with an image generation via preprocess or postprocess, or even having it as a tab.
These can be done by using a callback defined in the project's modules/script_callbacks.py file.
The callback you'd be interested in is on_ui_tabs.
https://github.com/AUTOMATIC1111/stable-diffusion-webui/blob/master/modules/script_callbacks.py#L236
This callback allows you to give it a function that defines the ui. Here's an example from one that I did.
tab = MyTab(basedir)
script_callbacks.on_ui_tabs(tab.ui)
From this file, notice that I instantiated the object first since I wasn't using a function. You don't need to separate the component and row declarations like I did, but notice that my ui method starts with a gr.Blocks.
https://github.com/Gerschel/sd-web-ui-quickcss/blob/master/scripts/quickcss.py#L128
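To make that concrete, here is a minimal sketch of a tab-style extension script (names like make_ui are hypothetical; the only real contract is passing a function to script_callbacks.on_ui_tabs that returns (component, tab title, element id) tuples, as in the linked example):
# scripts/ip2p_tab.py -- hypothetical minimal extension, sketch only
import gradio as gr
from modules import script_callbacks

def make_ui():
    with gr.Blocks(analytics_enabled=False) as ui:
        instruction = gr.Textbox(label="Edit instruction")
        input_image = gr.Image(label="Input image", type="pil")
        output = gr.Image(label="Edited image")
        run = gr.Button("Generate")
        # run.click(...) would call into the instruct-pix2pix model here
    # the webui expects a list of (component, tab title, element id) tuples
    return [(ui, "InstructPix2Pix", "instruct_pix2pix_tab")]

script_callbacks.on_ui_tabs(make_ui)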
You might have to clean up extra files, rename some directories, and sys.path.append some namespaces. But that's primarily it.
You can probably do this in about 15 to 20 minutes.
If you want to know more, I've spent way too much time learning their codebase, and I can answer some questions.
Hi,
I'm working on styling a video. For this, I extract the frames of the video and want to use instruct-pix2pix on all the frames, outputting all of them with the same style.
To make it clear, the frames show the same place with one person moving. If I give the same prompt for all the frames as input at once, will it give me the same style for all of them?
Let me know if I explained myself well.
I basically want to guide the prompt by an image.
eg.
"make it look like [IMAGE UPLOADED]"
I guess this will take another white paper to get there, or you must use CLIP or image-to-prompt - but it will probably lose something in translation.
Kind of like the img2img button on this webui -
I really like this solution and have been looking for something similar for a long time!
Could you make this an extension for Automatic1111? The Gradio UI is a given there too.
I would really like it!
RuntimeError: CUDA out of memory. Tried to allocate 30.00 MiB (GPU 0; 6.00 GiB total capacity; 4.95 GiB already allocated; 0 bytes free; 5.02 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
Please create a free-tier Colab to run this.
Is there any command to keep the same image size?
Is there a way to run this on Apple Silicon (M1 Pro, 32GB)? I tried the first conda command in your instructions, and it naturally returned a CUDA error. I will try now on Colab Pro, which uses NVIDIA cards and CUDA.
Created a Colab notebook, but there is no requirements file. Inserted !pip3 install einops k_diffusion omegaconf based on errors. But now I get the following (when running inference with edit_cli.py). Any suggestions? Maybe I need to install another package?
Loading model from checkpoints/instruct-pix2pix-00-22000.ckpt
Traceback (most recent call last):
File "edit_cli.py", line 128, in
main()
File "edit_cli.py", line 79, in main
model = load_model_from_config(config, args.ckpt, args.vae_ckpt)
File "edit_cli.py", line 41, in load_model_from_config
pl_sd = torch.load(ckpt, map_location="cpu")
File "/usr/local/lib/python3.8/dist-packages/torch/serialization.py", line 777, in load
with _open_zipfile_reader(opened_file) as opened_zipfile:
File "/usr/local/lib/python3.8/dist-packages/torch/serialization.py", line 282, in init
super(_open_zipfile_reader, self).init(torch._C.PyTorchFileReader(name_or_buffer))
RuntimeError: PytorchStreamReader failed reading zip archive: failed finding central directory
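"failed finding central directory" generally means the .ckpt file itself is not a valid zip archive, i.e. the download was truncated or saved as something else (for example an HTML error page), rather than a missing package. A quick hedged check:
# Sketch: sanity-check the downloaded checkpoint (path is an assumption)
import os
path = "checkpoints/instruct-pix2pix-00-22000.ckpt"
print(os.path.getsize(path) / 1e9, "GB")   # a truncated download will be far smaller than expected
with open(path, "rb") as f:
    print(f.read(4))  # a valid zip-based checkpoint starts with b'PK\x03\x04'
If the size or magic bytes look wrong, re-downloading the checkpoint is the likely fix.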
First of all congrats on your work!
I've been looking to see whether you created a Space on Hugging Face, but nothing was found.
Would you like to create one?
If not, can I create one?
Of course giving you full credit
Appreciate
In the appendix section "A.2. Paired Image Generation",
"We generation 100 pairs of images for each pair of captions" > "We generate 100 pairs of images for each pair of captions"
Dear researchers, please also consider our newly introduced Dynamic-Pix2Pix architecture, which increases the modeling ability of pix2pix, especially in extremely limited data scenarios.
For more information:
https://www.researchgate.net/publication/365448869_Dynamic-Pix2Pix_Noise_Injected_cGAN_for_Modeling_Input_and_Target_Domain_Joint_Distributions_with_Limited_Training_Data
Would be nice to have a Colab notebook to try it out. I have unsuccessfully tried patching one together using the img2img diffusers pipeline.
My level of expertise does not allow me to diagnose the errors.
import requests
import torch
from PIL import Image
from io import BytesIO
from diffusers import StableDiffusionImg2ImgPipeline
# load the pipeline
device = "cuda"
model_id_or_path = "timbrooks/instruct-pix2pix"
pipe = StableDiffusionImg2ImgPipeline.from_pretrained(model_id_or_path, torch_dtype=torch.float16)
# or download via git clone https://huggingface.co/runwayml/stable-diffusion-v1-5
# and pass `model_id_or_path="./stable-diffusion-v1-5"`.
pipe = pipe.to(device)
# let's download an initial image
url = "https://raw.githubusercontent.com/CompVis/stable-diffusion/main/assets/stable-samples/img2img/sketch-mountains-input.jpg"
response = requests.get(url)
init_image = Image.open(BytesIO(response.content)).convert("RGB")
init_image = init_image.resize((768, 512))
prompt = "make the sky red"
images = pipe(prompt=prompt, image=init_image, strength=1, guidance_scale=7.5).images
images[0].save("red sky")
RuntimeError Traceback (most recent call last)
[<ipython-input-19-1db39db9ed03>](https://localhost:8080/#) in <module>
24 prompt = "make the sky red"
25
---> 26 images = pipe(prompt=prompt, image=init_image, strength=1, guidance_scale=7.5).images
27
28 images[0].save("red sky")
6 frames
[/usr/local/lib/python3.8/dist-packages/torch/nn/modules/conv.py](https://localhost:8080/#) in _conv_forward(self, input, weight, bias)
457 weight, bias, self.stride,
458 _pair(0), self.dilation, self.groups)
--> 459 return F.conv2d(input, weight, bias, self.stride,
460 self.padding, self.dilation, self.groups)
461
RuntimeError: Given groups=1, weight of size [320, 8, 3, 3], expected input[2, 4, 64, 96] to have 8 channels, but got 4 channels instead
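The 8-vs-4 channel mismatch is consistent with the instruct-pix2pix UNet expecting an extra image-conditioning latent that StableDiffusionImg2ImgPipeline never supplies; the dedicated pipeline shown earlier in this thread is the likely fix. A hedged sketch of the same cell using it:
# Sketch: same example, but with the InstructPix2Pix pipeline (assumes a diffusers release that includes it)
import requests, torch
from io import BytesIO
from PIL import Image
from diffusers import StableDiffusionInstructPix2PixPipeline

pipe = StableDiffusionInstructPix2PixPipeline.from_pretrained(
    "timbrooks/instruct-pix2pix", torch_dtype=torch.float16
).to("cuda")

url = "https://raw.githubusercontent.com/CompVis/stable-diffusion/main/assets/stable-samples/img2img/sketch-mountains-input.jpg"
init_image = Image.open(BytesIO(requests.get(url).content)).convert("RGB").resize((768, 512))

images = pipe("make the sky red", image=init_image, guidance_scale=7.5).images
images[0].save("red_sky.png")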
I get slightly lower quality compared to what is shown on the project page when using the default parameters.
Can you give a hint about good values for some of the parameters and explain their effect?
For example, what are
cfg_text = 7.5
cfg_image = 1.5
I got these for some of the example inputs from the project page (the images might also be slightly different from what you used).
For some other inputs, like the "girl with a pearl earring", the output is unchanged.
Should I expect better results with other parameter settings?
Thanks
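For context (my paraphrase of the paper's classifier-free guidance, not an authoritative statement): cfg_text weights how strongly the output follows the edit instruction, and cfg_image weights how strongly it stays close to the input image; the denoiser combines three noise predictions roughly like this:
# Rough sketch of the dual classifier-free guidance described in the paper
# e_uncond: no image, no text; e_img: image only; e_full: image + text (names are mine)
def combine(e_uncond, e_img, e_full, cfg_image=1.5, cfg_text=7.5):
    return e_uncond + cfg_image * (e_img - e_uncond) + cfg_text * (e_full - e_img)
Raising cfg_text pushes harder toward the instruction; raising cfg_image keeps the output closer to the input image.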
Hi!
I have seen you have created the Colab version:
https://colab.research.google.com/github/huggingface/notebooks/blob/main/diffusers/InstructPix2Pix_using_diffusers.ipynb
I would like to know how to make it work by providing a folder with many photos, and have it process all of them with just one prompt.
Appreciate
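A rough sketch of how that could look with the diffusers pipeline used in that notebook (folder paths, step count, and the fixed seed are my assumptions, added so every photo is processed with the same settings):
# Sketch: apply one instruction to every image in a folder (assumed paths)
import os, torch
from PIL import Image
from diffusers import StableDiffusionInstructPix2PixPipeline

pipe = StableDiffusionInstructPix2PixPipeline.from_pretrained(
    "timbrooks/instruct-pix2pix", torch_dtype=torch.float16
).to("cuda")

prompt = "turn it into a watercolor painting"   # hypothetical instruction
os.makedirs("out_frames", exist_ok=True)
for name in sorted(os.listdir("frames")):       # hypothetical input folder
    image = Image.open(os.path.join("frames", name)).convert("RGB")
    generator = torch.Generator("cuda").manual_seed(0)  # same seed per image for a more consistent style
    result = pipe(prompt, image=image, num_inference_steps=20, generator=generator).images[0]
    result.save(os.path.join("out_frames", name))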
I installed the extension, but after restarting auto 1111 I get this error:
Error loading script: instruct-pix2pix.py
Traceback (most recent call last):
File "E:\MyProject\A.I\StableDiffusion\SD 2.0 install\stable-diffusion-webui\modules\scripts.py", line 205, in load_scripts
module = script_loading.load_module(scriptfile.path)
File "E:\MyProject\A.I\StableDiffusion\SD 2.0 install\stable-diffusion-webui\modules\script_loading.py", line 13, in load_module
exec(compiled, module.__dict__)
File "E:\MyProject\A.I\StableDiffusion\SD 2.0 install\stable-diffusion-webui\extensions\stable-diffusion-webui-instruct-pix2pix\scripts\instruct-pix2pix.py", line 24, in <module>
from modules.ui_common import create_output_panel
ModuleNotFoundError: No module named 'modules.ui_common'
Can someone tell me what to do? Maybe reinstall everything?
I tried to duplicate the Space on HF using an A10G small, and it's not working with that arch.
Fetching 15 files: 100%|██████████| 15/15 [01:26<00:00, 5.74s/it]
You have disabled the safety checker for <class 'diffusers.pipelines.stable_diffusion.pipeline_stable_diffusion_instruct_pix2pix.StableDiffusionInstructPix2PixPipeline'> by passing `safety_checker=None`. Ensure that you abide to the conditions of the Stable Diffusion license and do not expose unfiltered results in services or applications open to the public. Both the diffusers team and Hugging Face strongly recommend to keep the safety filter enabled in all public facing circumstances, disabling it only for use-cases that involve analyzing network behavior or auditing its results. For more information, please have a look at https://github.com/huggingface/diffusers/pull/254 .
/home/user/.pyenv/versions/3.8.9/lib/python3.8/site-packages/torch/cuda/__init__.py:145: UserWarning:
NVIDIA A10G with CUDA capability sm_86 is not compatible with the current PyTorch installation.
The current PyTorch install supports CUDA capabilities sm_37 sm_50 sm_60 sm_70.
If you want to use the NVIDIA A10G GPU with PyTorch, please check the instructions at https://pytorch.org/get-started/locally/
warnings.warn(incompatible_device_warn.format(device_name, capability, " ".join(arch_list), device_name))
Running on local URL: http://0.0.0.0:7860
To create a public link, set `share=True` in `launch()`.
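The warning indicates the Space's PyTorch wheel was built only for sm_37–sm_70, so an A10G (sm_86) cannot run CUDA kernels with it. A hedged way to confirm, and the usual remedy (the exact wheel/index URL is an assumption; check pytorch.org for the current command):
# Check which GPU architectures the installed torch wheel supports
import torch
print(torch.cuda.get_arch_list())   # should include 'sm_86' for an A10G
# If it does not, install a build compiled against CUDA 11.1+, e.g.:
#   pip install torch --extra-index-url https://download.pytorch.org/whl/cu116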
How do I run it locally? :(
The setup instructions are very sparse for a layman like me, because I am very illiterate when it comes to coding. I have, however, installed SD locally, and I was wondering if there is an easy way to get this to work in a WebGUI similar to that. Thanks!
Hi, I could not download the checkpoint. Could you please share it with other ways? Thanks! @timothybrooks
Platform: Windows 10 x64 v22H2
Software: Python 3.8.5
Terminal: Anaconda Powershell Prompt v22.9.0
The system just had Windows installed yesterday, so it should be in a very vanilla state.
When running command:
(ip2p) PS C:\Users\me\src\instruct-pix2pix> python edit_cli.py --input ..\..\Pictures\input.jpg --output ..\..\Pictures\modded.jpg --edit "turn him into a cyborg"
It produces the following error:
Loading model from checkpoints/instruct-pix2pix-00-22000.ckpt
Traceback (most recent call last):
File "edit_cli.py", line 128, in <module>
main()
File "edit_cli.py", line 79, in main
model = load_model_from_config(config, args.ckpt, args.vae_ckpt)
File "edit_cli.py", line 41, in load_model_from_config
pl_sd = torch.load(ckpt, map_location="cpu")
File "C:\Users\Me\anaconda3\envs\ip2p\lib\site-packages\torch\serialization.py", line 705, in load
with _open_zipfile_reader(opened_file) as opened_zipfile:
File "C:\Users\Me\anaconda3\envs\ip2p\lib\site-packages\torch\serialization.py", line 243, in __init__
super(_open_zipfile_reader, self).__init__(torch._C.PyTorchFileReader(name_or_buffer))
RuntimeError: PytorchStreamReader failed reading zip archive: failed finding central directory
Similarly, when running the following command:
(ip2p) PS C:\Users\me\src\instruct-pix2pix> python edit_app.py
It produces the following error:
Loading model from checkpoints/instruct-pix2pix-00-22000.ckpt
Traceback (most recent call last):
File "edit_app.py", line 268, in <module>
main()
File "edit_app.py", line 109, in main
model = load_model_from_config(config, args.ckpt, args.vae_ckpt)
File "edit_app.py", line 78, in load_model_from_config
pl_sd = torch.load(ckpt, map_location="cpu")
File "C:\Users\me\anaconda3\envs\ip2p\lib\site-packages\torch\serialization.py", line 705, in load
with _open_zipfile_reader(opened_file) as opened_zipfile:
File "C:\Users\me\anaconda3\envs\ip2p\lib\site-packages\torch\serialization.py", line 243, in __init__
super(_open_zipfile_reader, self).__init__(torch._C.PyTorchFileReader(name_or_buffer))
RuntimeError: PytorchStreamReader failed reading zip archive: failed finding central directory
Loading model from checkpoints/instruct-pix2pix-00-22000.ckpt
Global Step: 22000
Traceback (most recent call last):
File "edit_cli.py", line 129, in <module>
main()
File "edit_cli.py", line 80, in main
model = load_model_from_config(config, args.ckpt, args.vae_ckpt)
File "edit_cli.py", line 53, in load_model_from_config
model = instantiate_from_config(config.model)
File "/home/ubuntu/projects/txt2img/instruct-pix2pix/stable_diffusion/ldm/util.py", line 85, in instantiate_from_config
return get_obj_from_str(config["target"])(**config.get("params", dict()))
File "/home/ubuntu/projects/txt2img/instruct-pix2pix/stable_diffusion/ldm/util.py", line 93, in get_obj_from_str
return getattr(importlib.import_module(module, package=None), cls)
File "/usr/lib/python3.8/importlib/__init__.py", line 127, in import_module
return _bootstrap._gcd_import(name[level:], package, level)
File "<frozen importlib._bootstrap>", line 1014, in _gcd_import
File "<frozen importlib._bootstrap>", line 991, in _find_and_load
File "<frozen importlib._bootstrap>", line 975, in _find_and_load_unlocked
File "<frozen importlib._bootstrap>", line 671, in _load_unlocked
File "<frozen importlib._bootstrap_external>", line 843, in exec_module
File "<frozen importlib._bootstrap>", line 219, in _call_with_frames_removed
File "/home/ubuntu/projects/txt2img/instruct-pix2pix/./stable_diffusion/ldm/models/diffusion/ddpm_edit.py", line 15, in <module>
import pytorch_lightning as pl
File "/home/ubuntu/projects/txt2img/instruct-pix2pix/venv38/lib/python3.8/site-packages/pytorch_lightning/__init__.py", line 35, in <module>
from pytorch_lightning.callbacks import Callback # noqa: E402
File "/home/ubuntu/projects/txt2img/instruct-pix2pix/venv38/lib/python3.8/site-packages/pytorch_lightning/callbacks/__init__.py", line 28, in <module>
from pytorch_lightning.callbacks.pruning import ModelPruning
File "/home/ubuntu/projects/txt2img/instruct-pix2pix/venv38/lib/python3.8/site-packages/pytorch_lightning/callbacks/pruning.py", line 31, in <module>
from pytorch_lightning.core.module import LightningModule
File "/home/ubuntu/projects/txt2img/instruct-pix2pix/venv38/lib/python3.8/site-packages/pytorch_lightning/core/__init__.py", line 16, in <module>
from pytorch_lightning.core.module import LightningModule
File "/home/ubuntu/projects/txt2img/instruct-pix2pix/venv38/lib/python3.8/site-packages/pytorch_lightning/core/module.py", line 50, in <module>
from pytorch_lightning.trainer.connectors.logger_connector.fx_validator import _FxValidator
File "/home/ubuntu/projects/txt2img/instruct-pix2pix/venv38/lib/python3.8/site-packages/pytorch_lightning/trainer/__init__.py", line 17, in <module>
from pytorch_lightning.trainer.trainer import Trainer
File "/home/ubuntu/projects/txt2img/instruct-pix2pix/venv38/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 57, in <module>
from pytorch_lightning.loops import PredictionLoop, TrainingEpochLoop
File "/home/ubuntu/projects/txt2img/instruct-pix2pix/venv38/lib/python3.8/site-packages/pytorch_lightning/loops/__init__.py", line 15, in <module>
from pytorch_lightning.loops.batch import TrainingBatchLoop # noqa: F401
File "/home/ubuntu/projects/txt2img/instruct-pix2pix/venv38/lib/python3.8/site-packages/pytorch_lightning/loops/batch/__init__.py", line 15, in <module>
from pytorch_lightning.loops.batch.training_batch_loop import TrainingBatchLoop # noqa: F401
File "/home/ubuntu/projects/txt2img/instruct-pix2pix/venv38/lib/python3.8/site-packages/pytorch_lightning/loops/batch/training_batch_loop.py", line 20, in <module>
from pytorch_lightning.loops.optimization.manual_loop import _OUTPUTS_TYPE as _MANUAL_LOOP_OUTPUTS_TYPE
File "/home/ubuntu/projects/txt2img/instruct-pix2pix/venv38/lib/python3.8/site-packages/pytorch_lightning/loops/optimization/__init__.py", line 15, in <module>
from pytorch_lightning.loops.optimization.manual_loop import ManualOptimization # noqa: F401
File "/home/ubuntu/projects/txt2img/instruct-pix2pix/venv38/lib/python3.8/site-packages/pytorch_lightning/loops/optimization/manual_loop.py", line 23, in <module>
from pytorch_lightning.loops.utilities import _build_training_step_kwargs, _extract_hiddens
File "/home/ubuntu/projects/txt2img/instruct-pix2pix/venv38/lib/python3.8/site-packages/pytorch_lightning/loops/utilities.py", line 29, in <module>
from pytorch_lightning.strategies.parallel import ParallelStrategy
File "/home/ubuntu/projects/txt2img/instruct-pix2pix/venv38/lib/python3.8/site-packages/pytorch_lightning/strategies/__init__.py", line 15, in <module>
from pytorch_lightning.strategies.bagua import BaguaStrategy # noqa: F401
File "/home/ubuntu/projects/txt2img/instruct-pix2pix/venv38/lib/python3.8/site-packages/pytorch_lightning/strategies/bagua.py", line 30, in <module>
from pytorch_lightning.strategies.ddp import DDPStrategy
File "/home/ubuntu/projects/txt2img/instruct-pix2pix/venv38/lib/python3.8/site-packages/pytorch_lightning/strategies/ddp.py", line 65, in <module>
from torch.distributed.algorithms.model_averaging.averagers import ModelAverager
ModuleNotFoundError: No module named 'torch.distributed.algorithms.model_averaging'
After installation with some adventures (mentioned in other issues :) ) I got the Web UI to run, but not the generation process. I am getting a CUDA out of memory error message, and so far googling told me about editing the code to send data in batches or changing environment variables.
I tried to set PYTORCH_CUDA_ALLOC_CONF to max_split_size_mb:128 and max_split_size_mb:512 with no change.
I am on windows with 2080ti
My error when I press the "Load Example" button (or try to run the direct python command with it). The same happens with any other image when I load it in, add a text prompt, and press the "Generate" button.
Traceback (most recent call last):
File "C:\Users\***\AppData\Local\Programs\Python\Python310\lib\site-packages\gradio\routes.py", line 337, in run_predict
output = await app.get_blocks().process_api(
File "C:\Users\***\AppData\Local\Programs\Python\Python310\lib\site-packages\gradio\blocks.py", line 1015, in process_api
result = await self.call_function(
File "C:\Users\***\AppData\Local\Programs\Python\Python310\lib\site-packages\gradio\blocks.py", line 833, in call_function
prediction = await anyio.to_thread.run_sync(
File "C:\Users\***\AppData\Local\Programs\Python\Python310\lib\site-packages\anyio\to_thread.py", line 31, in run_sync
return await get_asynclib().run_sync_in_worker_thread(
File "C:\Users\***\AppData\Local\Programs\Python\Python310\lib\site-packages\anyio\_backends\_asyncio.py", line 937, in run_sync_in_worker_thread
return await future
File "C:\Users\***\AppData\Local\Programs\Python\Python310\lib\site-packages\anyio\_backends\_asyncio.py", line 867, in run
result = context.run(func, *args)
File "E:\Instruct-pix2pix\instruct-pix2pix-main\edit_app.py", line 125, in load_example
return [example_image, example_instruction] + generate(
File "E:\Instruct-pix2pix\instruct-pix2pix-main\edit_app.py", line 160, in generate
with torch.no_grad(), autocast("cuda"), model.ema_scope():
File "C:\Users\***\AppData\Local\Programs\Python\Python310\lib\contextlib.py", line 135, in __enter__
return next(self.gen)
File "E:\Instruct-pix2pix\instruct-pix2pix-main\./stable_diffusion\ldm\models\diffusion\ddpm_edit.py", line 185, in ema_scope
self.model_ema.store(self.model.parameters())
File "E:\Instruct-pix2pix\instruct-pix2pix-main\./stable_diffusion\ldm\modules\ema.py", line 62, in store
self.collected_params = [param.clone() for param in parameters]
File "E:\Instruct-pix2pix\instruct-pix2pix-main\./stable_diffusion\ldm\modules\ema.py", line 62, in <listcomp>
self.collected_params = [param.clone() for param in parameters]
RuntimeError: CUDA out of memory. Tried to allocate 58.00 MiB (GPU 0; 11.00 GiB total capacity; 10.04 GiB already allocated; 0 bytes free; 10.21 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
Any recommendations on how to get pass this?
Thanks.
Hi,
After cloning the repository and setting up the environment, I keep getting the following error when trying to run edit_app.py:
(ip2p) PS C:\Users\julia\instruct-pix2pix> python edit_app.py
╭─────────────────────────────── Traceback (most recent call last) ────────────────────────────────╮
│ C:\Users\julia\instruct-pix2pix\edit_app.py:9 in │
│ │
│ 6 import gradio as gr │
│ 7 import torch │
│ 8 from PIL import Image, ImageOps │
│ ❱ 9 from diffusers import StableDiffusionInstructPix2PixPipeline │
│ 10 │
│ 11 │
│ 12 help_text = """ │
╰──────────────────────────────────────────────────────────────────────────────────────────────────╯
ImportError: cannot import name 'StableDiffusionInstructPix2PixPipeline' from 'diffusers'
(C:\Users\julia\.conda\envs\ip2p\lib\site-packages\diffusers\__init__.py)
Do you have a suggestion for how I could fix this? Thank you very much in advance
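This import only exists in newer diffusers releases, so the most likely cause is an older diffusers in the ip2p environment. A hedged check:
# Check the installed diffusers version and whether the pipeline is present
import diffusers
print(diffusers.__version__)
print(hasattr(diffusers, "StableDiffusionInstructPix2PixPipeline"))
# If False, upgrading diffusers (pip install -U diffusers) inside this environment should make the import available.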
Is there any way to change the sampling method?
I noticed only two lines responsible for this:
z = K.sampling.sample_euler_ancestral(model_wrap_cfg, z, sigmas, extra_args=extra_args)
generation_params = { "ip2p": "Yes", "Prompt": instruction, "Negative Prompt": negative_prompt, "steps": steps, "sampler": "Euler A", ....
Will it be enough to make changes to these lines? Please provide an example (dpm adaptive(sigma_min, sigma_max)), thank you very much.
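As a rough sketch (assuming the k-diffusion version installed here exposes sample_dpm_adaptive with this signature; please verify against K.sampling in your checkout), the Euler ancestral call could be swapped roughly like this:
# Sketch: replacing Euler ancestral with DPM adaptive in the sampling call
# sigma_min/sigma_max are taken from the wrapped model's discrete noise schedule
sigma_min = model_wrap.sigmas[0].item()
sigma_max = model_wrap.sigmas[-1].item()
z = K.sampling.sample_dpm_adaptive(model_wrap_cfg, z, sigma_min, sigma_max, extra_args=extra_args)
# the metadata string would then change accordingly, e.g. "sampler": "DPM adaptive"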
You have created an incredible model for editing images based on instructions.
RuntimeError: CUDA out of memory. Tried to allocate 58.00 MiB (GPU 0; 11.00 GiB total capacity; 10.04 GiB already allocated; 0 bytes free; 10.21 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
I've tried closing everything else that uses any GPU memory, but it always says "0 bytes free".
Trying to use edit_app.py in PowerShell on Windows 10
conda 22.11.1
2080Ti 11GB