diffbir's People

Contributors

0x3f3f3f3fun, dashbe, ziyannchen

diffbir's Issues

RuntimeError: User specified an unsupported autocast device_type 'cuda:0'

Hello Team,

When trying to run the following command: python inference.py --config configs/model/cldm.yaml --ckpt weights/general_full_v1.ckpt --steps 50 --sr_scale 1 --image_size 512 --input results/maxout/ --color_fix_type wavelet --resize_back --output results/detailed/ --disable_preprocess_model --device cuda in a Linux Docker container (with an AMD 7900 XT), I get the following error:

RuntimeError: User specified an unsupported autocast device_type 'cuda:0'

Before commit 30355a1 I was able to perform some image restoration; after pulling the latest commit, I get the error above.

Best regards,

Nikos
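
A hedged observation: torch.autocast expects a device type string such as "cuda", not an indexed device like "cuda:0", so if the script forwards its resolved device string straight into autocast, this RuntimeError appears. A minimal sketch of the kind of workaround (variable names are mine, not from the repo):

import torch

device = "cuda:0"                       # e.g. what a --device argument might resolve to
autocast_device = device.split(":")[0]  # strip the index -> "cuda"

with torch.autocast(device_type=autocast_device):
    pass  # the model forward pass would go here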

24GB GPU out of memory

How much GPU RAM does it need to run? I have a 24 GB 3090 and it still runs out of memory.

Installation problems on Windows

1. When I try to run conda install xformers==0.0.16 -c xformers,

I receive the following message:

Collecting package metadata (current_repodata.json): done
Solving environment: failed with initial frozen solve. Retrying with flexible solve.
Collecting package metadata (repodata.json): done
Solving environment: failed with initial frozen solve. Retrying with flexible solve.

PackagesNotFoundError: The following packages are not available from current channels:

  - xformers==0.0.16

Current channels:

  - https://conda.anaconda.org/xformers/win-64
  - https://conda.anaconda.org/xformers/noarch
  - https://conda.anaconda.org/conda-forge/win-64
  - https://conda.anaconda.org/conda-forge/noarch
  - https://conda.anaconda.org/pytorch/win-64
  - https://conda.anaconda.org/pytorch/noarch
  - https://repo.anaconda.com/pkgs/main/win-64
  - https://repo.anaconda.com/pkgs/main/noarch
  - https://repo.anaconda.com/pkgs/r/win-64
  - https://repo.anaconda.com/pkgs/r/noarch
  - https://repo.anaconda.com/pkgs/msys2/win-64
  - https://repo.anaconda.com/pkgs/msys2/noarch

To search for alternate channels that may provide the conda package you're
looking for, navigate to

    https://anaconda.org

and use the search bar at the top of the page.

2. When I try to run pip install -r requirements.txt,

the following error appears:

ERROR: Could not find a version that satisfies the requirement triton (from versions: none)
ERROR: No matching distribution found for triton

Can you please explain what these errors could be related to, and how I can avoid both of them?

Thanks.
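
A hedged workaround, not verified on every Windows setup: as the conda output shows, xformers 0.0.16 is not available from those channels for win-64, and the pip error indicates triton has no Windows wheels at all. Installing xformers from PyPI and removing the triton line from requirements.txt usually gets past both errors (other logs in this list show "No module 'xformers'. Proceeding without it.", so xformers itself appears to be optional):

pip install xformers
# edit requirements.txt: delete or comment out the line that says "triton"
pip install -r requirements.txt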

Question about video memory consumption, and computational resource requirements

There is a similar work, StableSR, which also uses the Stable Diffusion base model, and its GPU memory requirements are huge; sometimes even tiling does not help.

StableSR is not able to properly process images with resolution higher than 190 px. I tried it personally with different data and parameters, and it was just a huge waste of time.

How much more accessible would your work be for most graphics accelerators, such as those with 8 or 16 GB of video memory?

Thank you very much for your attention.

Is the calculation of loss influenced by I_{HQ} during the training of LAControlNet?

Does I_{HQ} only play a role during SwinIR training? I couldn't locate any involvement of I_{HQ} in the LAControlNet training code. Is "|I_{HQ} - I_{reg}|" computed as part of the LAControlNet training process, or does LAControlNet only need I_{reg} as input? If I train LAControlNet directly from I_{HQ} and I_{reg} pairs, will I_{HQ} be involved in the process?
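
For what it's worth, my (hedged) reading of the paper: the pixel-space term |I_{HQ} - I_{reg}| belongs to the stage-1 restoration (SwinIR) training, while stage-2 LAControlNet is trained with the usual latent-diffusion noise-prediction objective, in which I_{HQ} still enters as the clean target latent and I_{reg} as the condition:

\mathcal{L}_{diff} = \mathbb{E}_{z_0,\, c,\, t,\, \epsilon \sim \mathcal{N}(0, I)} \left[ \left\| \epsilon - \epsilon_\theta(z_t, c, t) \right\|_2^2 \right], \qquad z_0 = \mathcal{E}(I_{HQ}), \quad c = \mathcal{E}(I_{reg})

So, on this reading, I_{HQ} is not compared with I_{reg} in stage 2, but it is still needed as the diffusion target.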

PermissionError: [Errno 13] Permission denied

(diffbir) E:\AI>python e:\ai\diffbir\inference.py --config E:\AI\DiffBIR\configs\model\cldm.yaml --ckpt E:\AI\DiffBIR\ckpt --reload_swinir --swinir_ckpt E:\AI\DiffBIR\ckpt --steps 50 --input E:\AI\DiffBIR\lq_dir --sr_scale 1 --image_size 512 --color_fix_type wavelet --resize_back --output E:\AI\DiffBIR\hq_dir
E:\Anaconda3\envs\diffbir\lib\site-packages\torchaudio\backend\utils.py:62: UserWarning: No audio backend is available.
warnings.warn("No audio backend is available.")
No module 'xformers'. Proceeding without it.
Global seed set to 231
ControlLDM: Running in eps-prediction mode
DiffusionWrapper has 865.91 M params.
making attention of type 'vanilla' with 512 in_channels
Working with z of shape (1, 4, 32, 32) = 4096 dimensions.
making attention of type 'vanilla' with 512 in_channels
E:\Anaconda3\envs\diffbir\lib\site-packages\torch\functional.py:478: UserWarning: torch.meshgrid: in an upcoming release, it will be required to pass the indexing argument. (Triggered internally at C:\cb\pytorch_1000000000000\work\aten\src\ATen\native\TensorShape.cpp:2895.)
return _VF.meshgrid(tensors, **kwargs) # type: ignore[attr-defined]
Setting up [LPIPS] perceptual loss: trunk [alex], v[0.1], spatial [off]
E:\Anaconda3\envs\diffbir\lib\site-packages\torchvision\models_utils.py:208: UserWarning: The parameter 'pretrained' is deprecated since 0.13 and will be removed in 0.15, please use 'weights' instead.
warnings.warn(
E:\Anaconda3\envs\diffbir\lib\site-packages\torchvision\models_utils.py:223: UserWarning: Arguments other than a weight enum or None for 'weights' are deprecated since 0.13 and will be removed in 0.15. The current behavior is equivalent to passing weights=AlexNet_Weights.IMAGENET1K_V1. You can also use weights=AlexNet_Weights.DEFAULT to get the most up-to-date weights.
warnings.warn(msg)
Loading model from: E:\Anaconda3\envs\diffbir\lib\site-packages\lpips\weights\v0.1\alex.pth
Traceback (most recent call last):
File "e:\ai\diffbir\inference.py", line 212, in
main()
File "e:\ai\diffbir\inference.py", line 140, in main
load_state_dict(model, torch.load(args.ckpt, map_location="cpu"), strict=True)
File "E:\Anaconda3\envs\diffbir\lib\site-packages\torch\serialization.py", line 699, in load
with _open_file_like(f, 'rb') as opened_file:
File "E:\Anaconda3\envs\diffbir\lib\site-packages\torch\serialization.py", line 230, in _open_file_like
return _open_file(name_or_buffer, mode)
File "E:\Anaconda3\envs\diffbir\lib\site-packages\torch\serialization.py", line 211, in init
super(_open_file, self).init(open(name, mode))
PermissionError: [Errno 13] Permission denied: 'E:\AI\DiffBIR\ckpt'
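
A hedged guess: --ckpt and --swinir_ckpt are both set to E:\AI\DiffBIR\ckpt, which looks like a folder, and opening a directory on Windows raises exactly this Errno 13. Pointing each flag at a checkpoint file itself should help; the filenames below are the released weights, adjust to wherever you actually saved them:

python e:\ai\diffbir\inference.py --config E:\AI\DiffBIR\configs\model\cldm.yaml --ckpt E:\AI\DiffBIR\ckpt\general_full_v1.ckpt --reload_swinir --swinir_ckpt E:\AI\DiffBIR\ckpt\general_swinir_v1.ckpt --steps 50 --input E:\AI\DiffBIR\lq_dir --sr_scale 1 --image_size 512 --color_fix_type wavelet --resize_back --output E:\AI\DiffBIR\hq_dir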

About this error? (run on M2)

(diffbir) pwoj@pwoj-mbpro DiffBIR % python gradio_diffbir.py --ckpt general_full_v1.ckpt --config configs/model/cldm.yaml --reload_swinir --swinir_ckpt general_swinir_v1.ckpt
Intel MKL WARNING: Support of Intel(R) Streaming SIMD Extensions 4.2 (Intel(R) SSE4.2) enabled only processors has been deprecated. Intel oneAPI Math Kernel Library 2025.0 will require Intel(R) Advanced Vector Extensions (Intel(R) AVX) instructions.
Intel MKL WARNING: Support of Intel(R) Streaming SIMD Extensions 4.2 (Intel(R) SSE4.2) enabled only processors has been deprecated. Intel oneAPI Math Kernel Library 2025.0 will require Intel(R) Advanced Vector Extensions (Intel(R) AVX) instructions.
OMP: Error #15: Initializing libiomp5.dylib, but found libomp.dylib already initialized.
OMP: Hint This means that multiple copies of the OpenMP runtime have been linked into the program. That is dangerous, since it can degrade performance or cause incorrect results. The best thing to do is to ensure that only a single OpenMP runtime is linked into the process, e.g. by avoiding static linking of the OpenMP runtime in any library. As an unsafe, unsupported, undocumented workaround you can set the environment variable KMP_DUPLICATE_LIB_OK=TRUE to allow the program to continue to execute, but that may cause crashes or silently produce incorrect results. For more information, please see http://www.intel.com/software/products/support/.
zsh: abort      python gradio_diffbir.py --ckpt general_full_v1.ckpt --config --reload_swini
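
The log itself points at the usual workaround: two OpenMP runtimes are loaded, and setting KMP_DUPLICATE_LIB_OK (unsafe, but commonly used) lets the process continue. A minimal sketch:

export KMP_DUPLICATE_LIB_OK=TRUE
python gradio_diffbir.py --ckpt general_full_v1.ckpt --config configs/model/cldm.yaml --reload_swinir --swinir_ckpt general_swinir_v1.ckpt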

Failure during training

I started training on an A100 GPU with about 2000 training images. It completed about 900 epochs, then the process ended abruptly without any errors. I can see several checkpoint step files.
I also tried to restart the training by setting the resume path to the folder containing the step files, but it gives the error that {folder} is a directory.
Any help would be highly appreciated.

Thanks
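
A hedged guess: the resume option likely expects one specific checkpoint file rather than the directory holding them, hence the "{folder} is a directory" error. A small sketch for picking the newest step file to pass as the resume path (the folder path below is hypothetical):

import glob
import os

ckpt_dir = "path/to/checkpoints"  # hypothetical folder containing the step=*.ckpt files
ckpts = glob.glob(os.path.join(ckpt_dir, "*.ckpt"))
latest = max(ckpts, key=os.path.getmtime)
print(latest)  # use this file path (not the folder) as the resume checkpoint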

Error in inference

Thanks for the excellent work, but I get some errors here (error screenshot attached).
I put LQ images with size 512 in /home/notebook/data/group/DiffusionFace/DiffBIR/image/TestCrop.
I have no idea what causes the error. Thanks in advance.

failed finding central directory

for general image inference

/home/bc/Projects/OpenSource/DiffBIR/venvDiffBIR/lib/python3.10/site-packages/torch/functional.py:504: UserWarning: torch.meshgrid: in an upcoming release, it will be required to pass the indexing argument. (Triggered internally at ../aten/src/ATen/native/TensorShape.cpp:3190.)
return _VF.meshgrid(tensors, **kwargs) # type: ignore[attr-defined]
Setting up [LPIPS] perceptual loss: trunk [alex], v[0.1], spatial [off]
/home/bc/Projects/OpenSource/DiffBIR/venvDiffBIR/lib/python3.10/site-packages/torchvision/models/_utils.py:208: UserWarning: The parameter 'pretrained' is deprecated since 0.13 and may be removed in the future, please use 'weights' instead.
warnings.warn(
/home/bc/Projects/OpenSource/DiffBIR/venvDiffBIR/lib/python3.10/site-packages/torchvision/models/_utils.py:223: UserWarning: Arguments other than a weight enum or None for 'weights' are deprecated since 0.13 and may be removed in the future. The current behavior is equivalent to passing weights=AlexNet_Weights.IMAGENET1K_V1. You can also use weights=AlexNet_Weights.DEFAULT to get the most up-to-date weights.
warnings.warn(msg)
Loading model from: /home/bc/Projects/OpenSource/DiffBIR/venvDiffBIR/lib/python3.10/site-packages/lpips/weights/v0.1/alex.pth
Traceback (most recent call last):
File "/home/bc/Projects/OpenSource/DiffBIR/inference.py", line 216, in
main()
File "/home/bc/Projects/OpenSource/DiffBIR/inference.py", line 141, in main
load_state_dict(model, torch.load(args.ckpt, map_location="cpu"), strict=True)
File "/home/bc/Projects/OpenSource/DiffBIR/venvDiffBIR/lib/python3.10/site-packages/torch/serialization.py", line 777, in load
with _open_zipfile_reader(opened_file) as opened_zipfile:
File "/home/bc/Projects/OpenSource/DiffBIR/venvDiffBIR/lib/python3.10/site-packages/torch/serialization.py", line 282, in init
super(_open_zipfile_reader, self).init(torch._C.PyTorchFileReader(name_or_buffer))
RuntimeError: PytorchStreamReader failed reading zip archive: failed finding central directory
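
A hedged note: this error usually means the checkpoint file is a truncated or corrupted download, since torch.save checkpoints are zip archives. A quick sanity check before re-downloading the weights:

import os
import zipfile

ckpt_path = "weights/general_full_v1.ckpt"  # adjust to the checkpoint being loaded
print(os.path.getsize(ckpt_path))           # a complete download should be gigabytes, not kilobytes
print(zipfile.is_zipfile(ckpt_path))        # False here means the file is incomplete or corrupted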

`requirements.txt`

Thanks for this work! Now, I'm getting a strange error where it's telling me ModuleNotFoundError: No module named 'pytorch_lightning.utilities.distributed' but I already have pytorch-lightning==1.8.5.post0. Any idea?

Here's the full stack trace

Traceback (most recent call last):
  File "/content/diffbir/inference.py", line 15, in <module>
    from model.cldm import ControlLDM
  File "/content/diffbir/model/cldm.py", line 18, in <module>
    from ldm.models.diffusion.ddpm import LatentDiffusion
  File "/content/diffbir/ldm/models/diffusion/ddpm.py", line 20, in <module>
    from pytorch_lightning.utilities.distributed import rank_zero_only
ModuleNotFoundError: No module named 'pytorch_lightning.utilities.distributed'
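
A hedged fix that has worked on similar setups: in newer pytorch-lightning releases, rank_zero_only no longer lives in pytorch_lightning.utilities.distributed, so if the installed version doesn't match the pinned one, editing the import in ldm/models/diffusion/ddpm.py to fall back to the newer location avoids the error (alternatively, reinstall the pinned pytorch-lightning version):

try:
    from pytorch_lightning.utilities.distributed import rank_zero_only
except ImportError:
    # location used by newer pytorch-lightning releases
    from pytorch_lightning.utilities.rank_zero import rank_zero_only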

Model downloading problems: HuggingFace connection error

When I tried to run inference, this error happened. I don't know how to solve this problem.

huggingface_hub.utils._errors.LocalEntryNotFoundError: An error happened while trying to locate the file on the Hub and we cannot find the requested files in the local cache. Please check your connection and try again or make sure your Internet connection is on.
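
A hedged workaround: the missing file is most likely the OpenCLIP text-encoder weights that the model config pulls from the Hugging Face Hub when the model is built. Pre-fetching them into the local cache on a machine/network that can reach the Hub lets a later run find them; the repo id and filename below are my assumption of the LAION ViT-H/14 weights that open_clip resolves to:

from huggingface_hub import hf_hub_download

path = hf_hub_download(
    repo_id="laion/CLIP-ViT-H-14-laion2B-s32B-b79K",  # assumed OpenCLIP weights repo
    filename="open_clip_pytorch_model.bin",           # assumed filename
)
print(path)  # cached location; subsequent runs read from the cache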

Issue when try to train model

I created the split lists train.list and val.list and edited the config files according to the instructions in the README, but I got an error. It seems like there is a problem reading the image data, but I checked again and couldn't find anything wrong. Can you help me?

(error screenshot attached)

Does DiffBIR support input at arbitrary resolutions?

Excellent work, but I noticed while reading the paper and code that it seems to not support arbitrary resolution input (instead, it forcibly scales the image). This feature is supported in both PatchDM and StableSR.

If it indeed doesn't support large image restoration, are there plans to include this feature?

Multi GPU Support?

Loving the results, but I'm maxing out my GPU's 24 GB of VRAM. Can this be run with multiple GPUs, continuing on the second GPU so it doesn't run out of VRAM? Do I have to make any changes to this:

python inference.py \
--input inputs/general \
--config configs/model/cldm.yaml \
--ckpt weights/general_full_v1.ckpt \
--reload_swinir --swinir_ckpt weights/general_swinir_v1.ckpt \
--steps 50 \
--sr_scale 4 \
--image_size 512 \
--color_fix_type wavelet --resize_back \
--output results/general

?

6gb vram LAPTOP, CUDA out of memory

PS E:\AI\DiffBIR\DiffBIR> venv\Scripts\activate.ps1
(venv) PS E:\AI\DiffBIR\DiffBIR> python gradio_diffbir.py --ckpt ./general_full_v1.ckpt --config configs/model/cldm.yaml
--reload_swinir --swinir_ckpt ./general_swinir_v1.ckpt --device cuda
ControlLDM: Running in eps-prediction mode
Setting up MemoryEfficientCrossAttention. Query dim is 320, context_dim is None and using 5 heads.
Setting up MemoryEfficientCrossAttention. Query dim is 320, context_dim is 1024 and using 5 heads.
Setting up MemoryEfficientCrossAttention. Query dim is 320, context_dim is None and using 5 heads.
Setting up MemoryEfficientCrossAttention. Query dim is 320, context_dim is 1024 and using 5 heads.
Setting up MemoryEfficientCrossAttention. Query dim is 640, context_dim is None and using 10 heads.
Setting up MemoryEfficientCrossAttention. Query dim is 640, context_dim is 1024 and using 10 heads.
Setting up MemoryEfficientCrossAttention. Query dim is 640, context_dim is None and using 10 heads.
Setting up MemoryEfficientCrossAttention. Query dim is 640, context_dim is 1024 and using 10 heads.
Setting up MemoryEfficientCrossAttention. Query dim is 1280, context_dim is None and using 20 heads.
Setting up MemoryEfficientCrossAttention. Query dim is 1280, context_dim is 1024 and using 20 heads.
Setting up MemoryEfficientCrossAttention. Query dim is 1280, context_dim is None and using 20 heads.
Setting up MemoryEfficientCrossAttention. Query dim is 1280, context_dim is 1024 and using 20 heads.
Setting up MemoryEfficientCrossAttention. Query dim is 1280, context_dim is None and using 20 heads.
Setting up MemoryEfficientCrossAttention. Query dim is 1280, context_dim is 1024 and using 20 heads.
Setting up MemoryEfficientCrossAttention. Query dim is 1280, context_dim is None and using 20 heads.
Setting up MemoryEfficientCrossAttention. Query dim is 1280, context_dim is 1024 and using 20 heads.
Setting up MemoryEfficientCrossAttention. Query dim is 1280, context_dim is None and using 20 heads.
Setting up MemoryEfficientCrossAttention. Query dim is 1280, context_dim is 1024 and using 20 heads.
Setting up MemoryEfficientCrossAttention. Query dim is 1280, context_dim is None and using 20 heads.
Setting up MemoryEfficientCrossAttention. Query dim is 1280, context_dim is 1024 and using 20 heads.
Setting up MemoryEfficientCrossAttention. Query dim is 640, context_dim is None and using 10 heads.
Setting up MemoryEfficientCrossAttention. Query dim is 640, context_dim is 1024 and using 10 heads.
Setting up MemoryEfficientCrossAttention. Query dim is 640, context_dim is None and using 10 heads.
Setting up MemoryEfficientCrossAttention. Query dim is 640, context_dim is 1024 and using 10 heads.
Setting up MemoryEfficientCrossAttention. Query dim is 640, context_dim is None and using 10 heads.
Setting up MemoryEfficientCrossAttention. Query dim is 640, context_dim is 1024 and using 10 heads.
Setting up MemoryEfficientCrossAttention. Query dim is 320, context_dim is None and using 5 heads.
Setting up MemoryEfficientCrossAttention. Query dim is 320, context_dim is 1024 and using 5 heads.
Setting up MemoryEfficientCrossAttention. Query dim is 320, context_dim is None and using 5 heads.
Setting up MemoryEfficientCrossAttention. Query dim is 320, context_dim is 1024 and using 5 heads.
Setting up MemoryEfficientCrossAttention. Query dim is 320, context_dim is None and using 5 heads.
Setting up MemoryEfficientCrossAttention. Query dim is 320, context_dim is 1024 and using 5 heads.
DiffusionWrapper has 865.91 M params.
making attention of type 'vanilla-xformers' with 512 in_channels
building MemoryEfficientAttnBlock with 512 in_channels...
Working with z of shape (1, 4, 32, 32) = 4096 dimensions.
making attention of type 'vanilla-xformers' with 512 in_channels
building MemoryEfficientAttnBlock with 512 in_channels...
Setting up MemoryEfficientCrossAttention. Query dim is 320, context_dim is None and using 5 heads.
Setting up MemoryEfficientCrossAttention. Query dim is 320, context_dim is 1024 and using 5 heads.
Setting up MemoryEfficientCrossAttention. Query dim is 320, context_dim is None and using 5 heads.
Setting up MemoryEfficientCrossAttention. Query dim is 320, context_dim is 1024 and using 5 heads.
Setting up MemoryEfficientCrossAttention. Query dim is 640, context_dim is None and using 10 heads.
Setting up MemoryEfficientCrossAttention. Query dim is 640, context_dim is 1024 and using 10 heads.
Setting up MemoryEfficientCrossAttention. Query dim is 640, context_dim is None and using 10 heads.
Setting up MemoryEfficientCrossAttention. Query dim is 640, context_dim is 1024 and using 10 heads.
Setting up MemoryEfficientCrossAttention. Query dim is 1280, context_dim is None and using 20 heads.
Setting up MemoryEfficientCrossAttention. Query dim is 1280, context_dim is 1024 and using 20 heads.
Setting up MemoryEfficientCrossAttention. Query dim is 1280, context_dim is None and using 20 heads.
Setting up MemoryEfficientCrossAttention. Query dim is 1280, context_dim is 1024 and using 20 heads.
Setting up MemoryEfficientCrossAttention. Query dim is 1280, context_dim is None and using 20 heads.
Setting up MemoryEfficientCrossAttention. Query dim is 1280, context_dim is 1024 and using 20 heads.
E:\AI\DiffBIR\DiffBIR\venv\lib\site-packages\torch\functional.py:504: UserWarning: torch.meshgrid: in an upcoming release, it will be required to pass the indexing argument. (Triggered internally at ..\aten\src\ATen\native\TensorShape.cpp:3484.)
return _VF.meshgrid(tensors, **kwargs) # type: ignore[attr-defined]
Setting up [LPIPS] perceptual loss: trunk [alex], v[0.1], spatial [off]
E:\AI\DiffBIR\DiffBIR\venv\lib\site-packages\torchvision\models_utils.py:208: UserWarning: The parameter 'pretrained' is deprecated since 0.13 and may be removed in the future, please use 'weights' instead.
warnings.warn(
E:\AI\DiffBIR\DiffBIR\venv\lib\site-packages\torchvision\models_utils.py:223: UserWarning: Arguments other than a weight enum or None for 'weights' are deprecated since 0.13 and may be removed in the future. The current behavior is equivalent to passing weights=AlexNet_Weights.IMAGENET1K_V1. You can also use weights=AlexNet_Weights.DEFAULT to get the most up-to-date weights.
warnings.warn(msg)
Loading model from: E:\AI\DiffBIR\DiffBIR\venv\lib\site-packages\lpips\weights\v0.1\alex.pth
reload swinir model from ./general_swinir_v1.ckpt
Traceback (most recent call last):
File "E:\AI\DiffBIR\DiffBIR\gradio_diffbir.py", line 40, in
model.to(args.device)
File "E:\AI\DiffBIR\DiffBIR\venv\lib\site-packages\pytorch_lightning\core\mixins\device_dtype_mixin.py", line 109, in to
return super().to(*args, **kwargs)
File "E:\AI\DiffBIR\DiffBIR\venv\lib\site-packages\torch\nn\modules\module.py", line 1145, in to
return self._apply(convert)
File "E:\AI\DiffBIR\DiffBIR\venv\lib\site-packages\torch\nn\modules\module.py", line 797, in _apply
module._apply(fn)
File "E:\AI\DiffBIR\DiffBIR\venv\lib\site-packages\torch\nn\modules\module.py", line 797, in _apply
module._apply(fn)
File "E:\AI\DiffBIR\DiffBIR\venv\lib\site-packages\torch\nn\modules\module.py", line 797, in _apply
module._apply(fn)
[Previous line repeated 3 more times]
File "E:\AI\DiffBIR\DiffBIR\venv\lib\site-packages\torch\nn\modules\module.py", line 820, in _apply
param_applied = fn(param)
File "E:\AI\DiffBIR\DiffBIR\venv\lib\site-packages\torch\nn\modules\module.py", line 1143, in convert
return t.to(device, dtype if t.is_floating_point() or t.is_complex() else None, non_blocking)
torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 58.00 MiB (GPU 0; 6.00 GiB total capacity; 5.30 GiB already allocated; 0 bytes free; 5.34 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF

How much VRAM does it need to run?

finetune stable diffusion

Hi, thanks for this excellent work. I am working on my own dataset (medical images) now. From the paper I see that Stable Diffusion is frozen, and so is the autoencoder. Could you please share the Stable Diffusion fine-tuning code if possible?
Thanks a lot for your time.

Training duration on A100

I started training with about 2000 images with a batch size of 10.

  1. How long does training take for a set of 2000 images? Currently it is at about epoch 800 and each epoch takes about 2.5 minutes; it doesn't show the total number of epochs to process.
  2. I can see files like step=49999.ckpt etc. created every 10000 steps. If the training is stopped and started again, will it resume from where it stopped?
  3. Can the training be done on CPU only?

Thanks,
Kiran

Plans for other stable diffusion models

Hello, I wanted to ask if there are any plans regarding using other Stable Diffusion models.

I can imagine that Stable Diffusion models fine-tuned on higher-resolution or higher-quality images would work better than the base model.

The sample_log function in cldm.py

@torch.no_grad()
def sample_log(self, cond, steps):
    sampler = SpacedSampler(self)
    b, c, h, w = cond["c_concat"][0].shape
    shape = (b, self.channels, h // 8, w // 8)
    samples = sampler.sample(steps, shape, cond, unconditional_guidance_scale=1.0, unconditional_conditioning=None)
    return samples

Is there a problem with the arguments passed to sampler.sample, or with how the sampler is defined? Could you please clarify?

Is it possible to use the models from JS?

Is there a JS interface? Or maybe the models are hosted on HuggingFace?

PS: I wasn't sure if this is the best way to reach out, happy to use a more suitable channel. Thanks!

Where to host?

Team,

When I try to run this code, I get the following error when pressing run after uploading a photo:

RuntimeError: CUDA out of memory. Tried to allocate 20.00 MiB (GPU 0; 7.43 GiB total capacity; 6.75 GiB already allocated; 10.44 MiB free; 6.82 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF

Any recommendations on the kind of EC2 instance I would need to run this code? Can you recommend any platforms in addition to AWS?

Will other samplers be supported in the future?

I noticed that you've removed the DDIM option in the current version of the code, even though it didn't seem to work in the initial version. Sampling efficiency is one of the obstacles to practicality, especially for high-resolution images. Will DDIM and DPM-Solver samplers be supported in the future?

No support for Apple M1

cutlassF is not supported because:
    device=cpu (supported: {'cuda'})
flshattF is not supported because:
    device=cpu (supported: {'cuda'})
    dtype=torch.float32 (supported: {torch.bfloat16, torch.float16})
    max(query.shape[-1] != value.shape[-1]) > 128
    Operator wasn't built - see python -m xformers.info for more info
tritonflashattF is not supported because:
    device=cpu (supported: {'cuda'})
    dtype=torch.float32 (supported: {torch.bfloat16, torch.float16})
    max(query.shape[-1] != value.shape[-1]) > 128
    Operator wasn't built - see python -m xformers.info for more info
    triton is not available
smallkF is not supported because:
    max(query.shape[-1] != value.shape[-1]) > 32
    unsupported embed per head: 512

OpenXLab issues

I see that OpenXLab has issues every day. Is it possible for you to host DiffBIR on Hugging Face or Replicate?
They're more reliable.

about inference_face.py

Line 178 in inference_face.py is

restored_img = restored_img[:lq_resized.height, :lq_resized.width, :]

However, the place where restored_img is assigned seems to be line 158:

restored_img = face_helper.paste_faces_to_input_image(
    upsample_img=bg_img
)

but this is inside of

if not args.has_aligned:

If this condition is not satisfied, restored_img is never defined, which causes an error.
What is this variable supposed to be? Is it restored_face?
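
A hedged sketch of one possible edit (variable names are taken from inference_face.py; the else branch is my assumption about the intent), keeping restored_img defined on both paths before the crop on line 178:

if not args.has_aligned:
    restored_img = face_helper.paste_faces_to_input_image(
        upsample_img=bg_img
    )
    # only the pasted-background result needs cropping back to the LQ size
    restored_img = restored_img[:lq_resized.height, :lq_resized.width, :]
else:
    # aligned input: there is no background image to paste onto, so fall back
    # to the restored face crop itself (assumption about what was intended)
    restored_img = restored_face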

Problem with Pytorch Lightning

Traceback (most recent call last):
File "C:\Users\wdson\Downloads\Compressed\DiffBIR-main\inference.py", line 15, in
from model.cldm import ControlLDM
File "C:\Users\wdson\Downloads\Compressed\DiffBIR-main\model\cldm.py", line 18, in
from ldm.models.diffusion.ddpm import LatentDiffusion
File "C:\Users\wdson\Downloads\Compressed\DiffBIR-main\ldm\models\diffusion\ddpm.py", line 20, in
from pytorch_lightning.utilities.distributed import rank_zero_only
ModuleNotFoundError: No module named 'pytorch_lightning.utilities.distributed'

I am getting this error even though I have PyTorch Lightning installed. Also, I could not find requirements.txt in the repository, so I installed the modules as they were being required.

open clip model erro

When I run the following command:
python inference.py \
--input inputs/demo/general \
--config configs/model/cldm.yaml \
--ckpt weights/general_full_v1.ckpt \
--reload_swinir --swinir_ckpt weights/general_swinir_v1.ckpt \
--steps 50 \
--sr_scale 4 \
--image_size 512 \
--color_fix_type wavelet --resize_back \
--output results/demo/general \
--device cuda

An error occurred (error screenshot attached).

My cache files are shown in a second screenshot (also attached).

Can you help me? How should I solve this problem?

Where is the code for Latent Image Guidance?

I'm sorry to bother you again, as I have already started research based on your work.
I'm very interested in the Latent Image Guidance in Section 3.3, but I only found the classifier-free guidance strength in the inference code. I further found the relevant code in the sampler, but it seems like it's not enabled in the current version?

Installation

Hi, how can I download and install this on Windows? I use Windows; can you please guide me? Thanks.

request.txt

Hello, when I install DiffBIR, I run into a problem.

When I run "pip install -r requirements.txt", cmd shows "ERROR: Could not find a version that satisfies the requirement triton".

The Python version I'm using is 3.11; I don't know if this is the cause of the problem.

Thank you for your time.

(error screenshot attached)

What's more, when I enter "conda install pytorch==1.12.1 torchvision==0.13.1 cudatoolkit=11.3 -c pytorch", cmd shows the output in the following screenshot:
(screenshot attached)

Sorry for my beginner question.

about urls for RealESRGAN checkpoint

Line 326 in realesrganer.py is

model_path=f"https://github.com/xinntao/Real-ESRGAN/releases/download/v0.1.0/RealESRGAN_x{scale}plus.pth",

When scale == 2, it tries to download RealESRGAN_x2plus.pth from the v0.1.0 release, but v0.1.0 doesn't have the x2plus checkpoint. It should be downloaded from v0.2.1 instead.
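
A hedged sketch of a possible fix, picking the release tag per scale instead of hard-coding v0.1.0 (scale is the variable already used on that line):

# x4plus is published under the v0.1.0 release, x2plus under v0.2.1
release_tag = {2: "v0.2.1", 4: "v0.1.0"}.get(scale, "v0.1.0")
model_path = (
    f"https://github.com/xinntao/Real-ESRGAN/releases/download/"
    f"{release_tag}/RealESRGAN_x{scale}plus.pth"
)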

About train

(config screenshot attached)
I want to know what the hq_dir_path and validation_set_size parameters are. Can you illustrate with a specific example? I'd appreciate it if you'd get back to me.
