Thanks for your great work. When I try to train the model on eight RTX 3090 GPUs with the following command:
./tools/train_net.py --config-file configs/Panoptic/odise_label_coco_50e.py --num-gpus 8 --amp --ref 32
I hit the following error:
Starting training from iteration 0
Exception during training:
[06/17 03:38:05 d2.engine.hooks]: Total training time: 0:00:25 (0:00:00 on hooks)
[06/17 03:38:05 d2.utils.events]: odise_label_coco_50e_bs16x8/default iter: 0/368752 lr: N/A max_mem: 19297M
Traceback (most recent call last):
File "/home/zoloz/8T-1/zitong/code/ODISE/./tools/train_net.py", line 392, in <module>
launch(
File "/home/dazhi/miniconda3/envs/odise/lib/python3.9/site-packages/detectron2/engine/launch.py", line 67, in launch
mp.spawn(
File "/home/dazhi/miniconda3/envs/odise/lib/python3.9/site-packages/torch/multiprocessing/spawn.py", line 240, in spawn
return start_processes(fn, args, nprocs, join, daemon, start_method='spawn')
File "/home/dazhi/miniconda3/envs/odise/lib/python3.9/site-packages/torch/multiprocessing/spawn.py", line 198, in start_processes
while not context.join():
File "/home/dazhi/miniconda3/envs/odise/lib/python3.9/site-packages/torch/multiprocessing/spawn.py", line 160, in join
raise ProcessRaisedException(msg, error_index, failed_process.pid)
torch.multiprocessing.spawn.ProcessRaisedException:
-- Process 1 terminated with the following error:
Traceback (most recent call last):
File "/home/dazhi/miniconda3/envs/odise/lib/python3.9/site-packages/torch/multiprocessing/spawn.py", line 69, in _wrap
fn(i, *args)
File "/home/dazhi/miniconda3/envs/odise/lib/python3.9/site-packages/detectron2/engine/launch.py", line 126, in _distributed_worker
main_func(*args)
File "/home/zoloz/8T-1/zitong/code/ODISE/tools/train_net.py", line 363, in main
do_train(args, cfg)
File "/home/zoloz/8T-1/zitong/code/ODISE/tools/train_net.py", line 309, in do_train
trainer.train(start_iter, cfg.train.max_iter)
File "/home/dazhi/miniconda3/envs/odise/lib/python3.9/site-packages/detectron2/engine/train_loop.py", line 149, in train
self.run_step()
File "/home/zoloz/8T-1/zitong/code/ODISE/odise/engine/train_loop.py", line 297, in run_step
grad_norm = self.grad_scaler(
File "/home/zoloz/8T-1/zitong/code/ODISE/odise/engine/train_loop.py", line 207, in __call__
self._scaler.scale(loss).backward(create_graph=create_graph)
File "/home/dazhi/miniconda3/envs/odise/lib/python3.9/site-packages/torch/_tensor.py", line 488, in backward
torch.autograd.backward(
File "/home/dazhi/miniconda3/envs/odise/lib/python3.9/site-packages/torch/autograd/__init__.py", line 197, in backward
Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass
File "/home/dazhi/miniconda3/envs/odise/lib/python3.9/site-packages/torch/autograd/function.py", line 267, in apply
return user_fn(self, *args)
File "/home/dazhi/miniconda3/envs/odise/lib/python3.9/site-packages/ldm/modules/diffusionmodules/util.py", line 142, in backward
input_grads = torch.autograd.grad(
File "/home/dazhi/miniconda3/envs/odise/lib/python3.9/site-packages/torch/autograd/__init__.py", line 300, in grad
return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass
File "/home/dazhi/miniconda3/envs/odise/lib/python3.9/site-packages/torch/autograd/function.py", line 267, in apply
return user_fn(self, *args)
File "/home/dazhi/miniconda3/envs/odise/lib/python3.9/site-packages/torch/autograd/function.py", line 414, in wrapper
outputs = fn(ctx, *args)
File "/home/dazhi/miniconda3/envs/odise/lib/python3.9/site-packages/xformers/ops/fmha/__init__.py", line 111, in backward
grads = _memory_efficient_attention_backward(
File "/home/dazhi/miniconda3/envs/odise/lib/python3.9/site-packages/xformers/ops/fmha/__init__.py", line 382, in _memory_efficient_attention_backward
grads = op.apply(ctx, inp, grad)
File "/home/dazhi/miniconda3/envs/odise/lib/python3.9/site-packages/xformers/ops/fmha/cutlass.py", line 184, in apply
(grad_q, grad_k, grad_v,) = cls.OPERATOR(
File "/home/dazhi/miniconda3/envs/odise/lib/python3.9/site-packages/torch/_ops.py", line 442, in __call__
return self._op(*args, **kwargs or {})
RuntimeError: CUDA error: invalid argument
CUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1.
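A minimal standalone check of the failing op might help isolate this. Here is a sketch that runs xformers' memory-efficient attention forward + backward on a single GPU in fp16 (to mimic --amp); the shapes below are arbitrary placeholders, not the ones ODISE actually feeds the kernel:

```python
# Standalone check of the op that crashes above: xformers memory-efficient
# attention forward + backward on one GPU, in fp16 to mimic --amp training.
# Shapes are placeholders, not ODISE's actual ones.
import torch
import xformers.ops as xops

dev = torch.device("cuda:0")
print(torch.__version__, torch.version.cuda, torch.cuda.get_device_name(dev))
print("compute capability:", torch.cuda.get_device_capability(dev))  # 3090 -> (8, 6)

# (batch, seq_len, num_heads, head_dim)
q = torch.randn(2, 1024, 8, 64, device=dev, dtype=torch.float16, requires_grad=True)
k = torch.randn_like(q, requires_grad=True)
v = torch.randn_like(q, requires_grad=True)

# Default op dispatch; the traceback above shows the cutlass backward being picked.
out = xops.memory_efficient_attention(q, k, v)
out.sum().backward()  # the crash above happens inside this backward pass
print("backward OK; grad norm =", q.grad.norm().item())
```

If this snippet reproduces the same "CUDA error: invalid argument", the problem presumably sits in the installed xformers build for sm_86 (e.g. a wheel compiled without kernels for this architecture) rather than in ODISE itself; if it passes, the failure may depend on the exact shapes/dtypes ODISE uses. Rerunning training with CUDA_LAUNCH_BLOCKING=1 prepended to the command above should also give a synchronous, more trustworthy stack trace.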