foundationvision / generateu


[CVPR2024] Generative Region-Language Pretraining for Open-Ended Object Detection

Python 90.82% C++ 2.96% Cuda 6.20% Shell 0.02%

mllm multimodality object-detection open-vocabulary open-vocabulary-detection open-world


generateu's Issues

Some weights of T5ForConditionalGeneration were not initialized from the model checkpoint at google/flan-t5-base and are newly initialized: ['temp']

Thank you very much for your outstanding work. However, while training with vg_swinT.yaml, I encountered the following warning:

Some weights of T5ForConditionalGeneration were not initialized from the model checkpoint at google/flan-t5-base and are newly initialized: ['temp']
You should probably TRAIN this model on a downstream task to be able to use it for predictions and inference.

Moreover, the training results were only:

AP	AP50	AP75	APs	APm	APl	APr	APc	APf
0.013	0.027	0.010	0.001	0.007	0.057	0.000	0.001	0.025

Is this issue caused by the flan-t5-base model not loading correctly? I hope to get your advice on this matter. Thank you.

Where is "google/flan-t5-base" in vg_swinL.yaml?

Can you tell me how to load it? Thanks.

About multi-modal large model initialization.

Thanks for your excellent work. Due to limited resources, I would like to study the part that trains the detection head initialized from a multi-modal large model. Could you please share that code when you have time? It would be used for learning and academic research only.

Evaluation results of the model.

This is interesting work. When I evaluate the pretrained model provided by the authors, I get lower results.

Is there anything wrong with my testing process? Could the authors provide the Python code for "DDETRSVLUniWithTTA"? Were the results in the paper obtained with the "DDETRSVLUniWithTTA" process?

Thanks!

torch.cuda.OutOfMemoryError

File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/linear.py", line 114, in forward
return F.linear(input, self.weight, self.bias)
torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 496.00 MiB. GPU 0 has a total capacty of 14.75 GiB of which 235.06 MiB is free. Process 20398 has 14.52 GiB memory in use. Of the allocated memory 14.06 GiB is allocated by PyTorch, and 330.42 MiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
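
The error message itself points to max_split_size_mb in PYTORCH_CUDA_ALLOC_CONF. A minimal sketch of setting it from Python follows. The value 128 is an arbitrary assumption, not a recommendation from the repository, and the helper name configure_cuda_allocator is hypothetical. The variable has to be set before PyTorch makes its first CUDA allocation. Since only about 330 MiB is reserved but unallocated in the log above, fragmentation may not be the main cause, and a smaller batch size or input resolution may still be needed on a GPU of this size.

```python
import os

# Assumption: 128 MiB is an illustrative split size, not a tuned value.
ALLOC_CONF = "max_split_size_mb:128"


def configure_cuda_allocator(conf: str = ALLOC_CONF) -> str:
    """Set PYTORCH_CUDA_ALLOC_CONF unless the user already set it.

    Call this before PyTorch makes its first CUDA allocation
    (simplest: before importing torch). Returns the value in effect.
    """
    return os.environ.setdefault("PYTORCH_CUDA_ALLOC_CONF", conf)


if __name__ == "__main__":
    print(configure_cuda_allocator())
```

The same setting can be exported in the shell before launching training. Using setdefault keeps any value the user has already chosen.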

Run error

When I run pip3 install -r requirements.txt, I get the following error:
ERROR: Invalid requirement: -e . --user
pip3: error: no such option: --user

I would appreciate a solution, if possible. Thanks.

Inference code for user-defined data

I am interested in your work. Thank you very much for it.
I would like to compare your results qualitatively with those of other models.
Do you plan to release inference code for user-defined data?

COCO zero-shot

@clin1223
Hi, thanks for your significant work!
We want to reproduce the COCO zero-shot results in Table 3.
We generated the text embeddings with clip-vit-large-patch14-336 and replaced ZERO_SHOT_WEIGHT with the generated embeddings.
Unfortunately, the results are 0.
Could you please give us some pointers? Could you also provide the corresponding COCO-80 embeddings?
Thanks! Have a nice day!

By the way, we generate the COCO-80-embeddings as follows.

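import torch
from transformers import AutoTokenizer, CLIPTextModel
model_path = "clip-vit-large-patch14-336"
model = CLIPTextModel.from_pretrained(model_path)
tokenizer = AutoTokenizer.from_pretrained(model_path)
# class_names: list of the 80 COCO category names, assumed to be defined ("class" is a reserved word in Python)
inputs = tokenizer(["a " + name for name in class_names], padding=True, return_tensors="pt")
with torch.no_grad():  # inference only, no gradients needed
    outputs = model(**inputs)
text_features = outputs.pooler_output  # one 768-dimensional vector per class name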

We obtain a NumPy array of shape 80 × 768.

Where are inference_propainter.py and the inputs directory? Can you share them?

Is there inference code?

Hello, could you open-source the inference and visualization code for a single image?

Cannot reproduce results as shown in the paper

Hello,

Thanks for the good work.

I was trying to reproduce the results on LVIS, but I am getting different numbers. Could you please check this?

Thanks

Why is there no response after running python3 launch.py --nn 1 --uni 1 \ --config-file projects/DDETRS/configs/vg_swinT.yaml OUTPUT_DIR outputs/${EXP_NAME}?

About t5_loss

Thank you for your fascinating work. I noticed in the code that t5_loss is not included in the weight_dict, which implies that t5_loss does not get optimized, meaning the T5 model does not get updated. Is this an oversight, or is there a specific reason for this configuration?
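
The concern above can be illustrated with the DETR-style convention, in which only the loss terms whose keys appear in weight_dict are summed into the total that is backpropagated. The sketch below assumes the repository follows that convention. The function name weighted_total and all loss values are hypothetical and chosen only for illustration.

```python
from typing import Dict


def weighted_total(loss_dict: Dict[str, float], weight_dict: Dict[str, float]) -> float:
    """Sum only the losses that have an entry in weight_dict (DETR-style)."""
    return sum(loss_dict[k] * weight_dict[k] for k in loss_dict if k in weight_dict)


# Hypothetical values for illustration only.
losses = {"loss_ce": 0.5, "loss_bbox": 0.25, "t5_loss": 2.0}
weights = {"loss_ce": 2.0, "loss_bbox": 4.0}  # no "t5_loss" entry

# t5_loss is silently dropped from the total when its key is missing.
total_without = weighted_total(losses, weights)
# Adding a weight for t5_loss makes it part of the optimized objective.
total_with = weighted_total(losses, {**weights, "t5_loss": 1.0})
```

Under this convention, a term missing from weight_dict contributes nothing to the total and therefore receives no gradient. Whether the repository adds t5_loss to the total somewhere else is not something this sketch can confirm.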
