zhanggongjie / sam-detr

[CVPR'2022] SAM-DETR & SAM-DETR++: Official PyTorch Implementation

License: MIT License

Python 72.10% Shell 3.25% C++ 2.23% Cuda 22.42%

pytorch object-detection detr computer-vision transformer detection vision-transformer deep-learning vision machine-learning


sam-detr's Issues

Why the q_content is the same as the q_content_points?

Hi,
What's the difference between the q_content and q_content_points?

Obtaining Query POS Embeddings in the Semantics Aligner

Thank you very much for your contribution!
I would like to know why the query POS embedding in the Semantics Aligner has the "* 0.5" operation applied before it passes through the sine POS embedding.

SAM-DETR/models/transformer_decoder.py

Line 256 in df4e567

    
           q_pos_scale = reference_boxes[:, :, 2:].reshape(bs, num_queries, 1, 2).expand(-1, -1, self.nheads, -1) * 0.5

I look forward to your valuable reply. Thank you.

Import errors on different versions of PyTorch, and a visualization question

First of all, thanks for your open-source code!

Known issues

There are some errors with the imports in attention.py.
1.

if float(torch.__version__[:3]) < 1.7:
   from torch._overrides import has_torch_function, handle_torch_function
else:
  from torch.overrides import has_torch_function, handle_torch_function

It seems quite simple, but when torch.__version__ is something like 1.11.0, this leads to a version error, because float(torch.__version__[:3]) < 1.7 is evaluated as 1.1 < 1.7, which is True.
I recommend splitting the torch.__version__ string on '.' to compare versions more precisely.
2. If the PyTorch version is >= 1.8.1,

from torch.nn.modules.linear import  _LinearWithBias

will lead to an import error, as _LinearWithBias does not seem to be accessible in PyTorch >= 1.8.1. It can be replaced by

from torch.nn.modules.linear import NonDynamicallyQuantizableLinear as _LinearWithBias

I would appreciate it if you could add version compatibility handling like the above for this too.
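The two points above can be sketched in plain Python. This is a minimal illustration and not the repository's code: `version_tuple` and `first_available` are hypothetical helper names, and the only PyTorch names involved are the ones quoted in this issue.

```python
import importlib
import re


def version_tuple(version: str) -> tuple:
    """Turn a version string such as '1.11.0+cu113' into (1, 11)."""
    match = re.match(r"(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    return int(match.group(1)), int(match.group(2))


def first_available(module_name: str, candidates):
    """Return the first attribute in `candidates` that the module defines."""
    module = importlib.import_module(module_name)
    for name in candidates:
        if hasattr(module, name):
            return getattr(module, name)
    raise ImportError(f"none of {candidates} found in {module_name}")


# The string-slicing check misreads 1.11 as 1.1; tuple comparison does not.
assert float("1.11.0"[:3]) < 1.7             # the buggy comparison is True
assert not version_tuple("1.11.0") < (1, 7)  # the tuple comparison is False
assert version_tuple("1.6.0") < (1, 7)
```

With PyTorch installed, `version_tuple(torch.__version__) < (1, 7)` could replace the float comparison, and `first_available("torch.nn.modules.linear", ["_LinearWithBias", "NonDynamicallyQuantizableLinear"])` would pick whichever of the two names the installed version provides, without hard-coding a version threshold.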

My question

I'm new to CV, so forgive me if my question is easy.
After training on my own dataset, I get many .pth files in my output directory, which I think contain the model parameters.
Given the trained model parameters, how can I load the model and run detection on one specific image?
That is to say, I want to give the model an image and get back the model's detection results in some format.
Thanks in advance for any reply.
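How the checkpoint is loaded depends on the repository's own model-building code and is not shown here. As a rough sketch of the step after the forward pass, assuming the model follows the DETR convention of predicting one score, one label, and one normalized (cx, cy, w, h) box per query, the raw predictions could be turned into pixel-space detections like this. The function names and the sample numbers are made up for illustration.

```python
def cxcywh_to_xyxy(box, img_w, img_h):
    """Convert one normalized (cx, cy, w, h) box to absolute (x0, y0, x1, y1)."""
    cx, cy, w, h = box
    return (
        (cx - 0.5 * w) * img_w,
        (cy - 0.5 * h) * img_h,
        (cx + 0.5 * w) * img_w,
        (cy + 0.5 * h) * img_h,
    )


def keep_confident(scores, labels, boxes, img_w, img_h, threshold=0.5):
    """Return (label, score, xyxy_box) for every prediction above `threshold`."""
    detections = []
    for score, label, box in zip(scores, labels, boxes):
        if score >= threshold:
            detections.append((label, score, cxcywh_to_xyxy(box, img_w, img_h)))
    return detections


# Made-up predictions for a 640x480 image, for illustration only.
dets = keep_confident(
    scores=[0.9, 0.2],
    labels=[3, 7],
    boxes=[(0.5, 0.5, 0.25, 0.5), (0.1, 0.1, 0.1, 0.1)],
    img_w=640,
    img_h=480,
)
print(dets)
```

The threshold of 0.5 is an arbitrary choice; a lower value returns more, less certain detections.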

How to introduce DN-DETR denoising method into SAM-DETR?

Thanks for the wonderful work.
I have seen that DN is introduced into SAM-DETR++, but the corresponding code has not been released. Could you briefly describe the method, if it is convenient? Can I simply replace the decoder input with something like the DN queries?
Thank you.

Does SAM help Deformable DETR?

Thanks for the wonderful work.
I would like to know whether SAM, used as a plug-in module, would improve Deformable DETR.

The question about emb_dim in cross_attention module

Hi, I found that, compared to other DETR variants, the q and k dimensions in SAM's cross-attention are higher because SPx8 is used. I would like to ask whether it would be fairer to compare using SPx1.

Generalized Box IoU consistently reporting degenerate boxes

Hello. First off, thank you for making your work's code available here on GitHub. It is well organized and maintained.

My question is this: when applying your SAM-DETR model to my custom object detection dataset for training and validation, I consistently see the generalized_box_iou utility function raise AssertionErrors saying that the model's predicted bounding boxes are degenerate, because the (lx, ly) coordinates are greater than the (rx, ry) coordinates. This check makes sense to me, but I am not sure how to solve the issue. I have also added a check that len(boxes1) > 0 to make sure at least one box was predicted in a batch of images.

SAM-DETR/util/box_ops.py

Line 44 in aaf1936

def generalized_box_iou(boxes1, boxes2):

Would you have any ideas why the model would be predicting degenerate bounding box coordinates from time to time, ending training prematurely?
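One way to narrow this down is to inspect the boxes just before the check and separate two cases: boxes whose corners are genuinely inverted, and boxes containing NaN or infinite values. Any comparison against NaN is false, so a NaN coordinate would fail a "right/bottom >= left/top" test even though nothing is actually inverted. The sketch below is a plain-Python illustration with a hypothetical helper name; whether NaNs are the cause in this report is not established.

```python
import math


def find_degenerate(boxes):
    """Return (index, reason) for each (x0, y0, x1, y1) box that is invalid."""
    problems = []
    for i, (x0, y0, x1, y1) in enumerate(boxes):
        if any(math.isnan(v) or math.isinf(v) for v in (x0, y0, x1, y1)):
            problems.append((i, "non-finite"))
        elif x1 < x0 or y1 < y0:
            problems.append((i, "inverted"))
    return problems


boxes = [
    (0.1, 0.1, 0.4, 0.5),           # valid
    (0.6, 0.2, 0.3, 0.9),           # x1 < x0
    (float("nan"), 0.0, 1.0, 1.0),  # NaN coordinate
]
print(find_degenerate(boxes))
```

If the failing boxes turn out to be non-finite, the problem is more likely upstream (for example in the loss or the learning rate) than in the box utility itself.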

Can you please provide the GFLOPs calculation code

Thank you for your work!
Can you please provide the GFLOPs calculation code?

Using only object detection? My custom dataset does not have segmentation labels

Thanks for the work. If only detection is used instead of both segmentation and detection, will the mAP decrease? Can you estimate the magnitude of the decline?

Swin Transformer backbone

Hi. Can I use Swin Transformer as a backbone instead of ResNet-50? If so, what changes should be made to the Swin Transformer (pretrained on ImageNet-22k)?

How does the reference box work?

I want to know how the reference boxes work. Why can they locate the object to be detected so precisely?
That is important to me. Thank you very much!

What does “the same embedding space” mean?

No results in detecting

Thanks for your great work.
When I run "bash scripts/r50_smca_e12_4gpu.sh", I get the error "FileNotFoundError: [Errno 2] No such file or directory: 'data/coco/train2017/000000151988.jpg'". So I added "--eval" to "scripts/r50_smca_e12_4gpu.sh" and ran it again. The code then runs successfully, but I get no results, like this:

code_root/
└── data/
    └── coco/
        ├── train2017/
        ├── val2017/
        └── annotations/
            ├── instances_train2017.json
            └── instances_val2017.json

This does not seem to match the directory structure expected by your code.
The correct tree seems to be:

code_root/
└── data/
    └── coco/
        ├── images/
        │   ├── train2017/
        │   └── val2017/
        └── annotations/
            ├── instances_train2017.json
            └── instances_val2017.json

Please check it. Thanks.
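A quick way to see which of the two layouts a local checkout actually has is to test for the directories directly. This is a small sketch using only the paths quoted in this issue; the layout names "flat" and "images_subdir" are labels invented here, and which layout the code expects is the reporter's reading, not something verified.

```python
from pathlib import Path

# Directory lists copied from the two trees in this issue.
LAYOUTS = {
    "flat": ["train2017", "val2017", "annotations"],
    "images_subdir": ["images/train2017", "images/val2017", "annotations"],
}


def matching_layouts(coco_root):
    """Return the names of the layouts whose directories all exist."""
    root = Path(coco_root)
    return [
        name
        for name, subdirs in LAYOUTS.items()
        if all((root / sub).is_dir() for sub in subdirs)
    ]


if __name__ == "__main__":
    # Run from code_root; an empty list means neither layout is complete.
    print(matching_layouts("data/coco"))
```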