idea-research / detrex

detrex is a research platform for DETR-based object detection, segmentation, pose estimation and other visual recognition tasks.

Home Page: https://detrex.readthedocs.io/en/latest/

License: Apache License 2.0

Languages: Python 93.88% · C++ 0.66% · Cuda 5.42% · Shell 0.05%
Topics: detr, object-detection, pytorch, dino, state-of-the-art, dab-detr, deformable-detr, conditional-detr, dn-detr, group-detr

detrex's Introduction

🦖detrex: Benchmarking Detection Transformers


Introduction

detrex is an open-source toolbox that provides state-of-the-art Transformer-based detection algorithms. It is built on top of Detectron2, and its module design is partially borrowed from MMDetection and DETR. Many thanks for their nicely organized code. The main branch works with PyTorch 1.10 or higher (we recommend PyTorch 1.12).

Major Features
  • Modular Design. detrex decomposes the Transformer-based detection framework into reusable components, which helps users easily build their own customized models.

  • Strong Baselines. detrex provides a series of strong baselines for Transformer-based detection models. For most of the supported algorithms, we have further boosted performance by 0.2 to 1.1 AP through hyper-parameter tuning.

  • Easy to Use. detrex is designed to be lightweight and easy to use; a minimal config sketch follows this list.
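For illustration, here is a minimal customization sketch in detrex's lazy-config style. It is a sketch only: the absolute import path of the DINO-R50 model config is an assumption, and the values are illustrative, mirroring the DINO config shown further down this page.

# A minimal lazy-config customization sketch (illustrative values; model import path assumed).
from detrex.config import get_config
from projects.dino.configs.models.dino_r50 import model  # hypothetical absolute import path

# reuse the common building blocks shipped with detrex
dataloader = get_config("common/data/coco_detr.py").dataloader
optimizer = get_config("common/optim.py").AdamW
lr_multiplier = get_config("common/coco_schedule.py").lr_multiplier_12ep
train = get_config("common/train.py").train

# override only what you need, without touching library code
train.init_checkpoint = "detectron2://ImageNetPretrained/torchvision/R-50.pkl"
train.max_iter = 90000
optimizer.lr = 1e-4
dataloader.train.total_batch_size = 16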

Apart from detrex, we have also released Awesome Detection Transformer, a repo that collects papers on Transformers for detection and segmentation.

Fun Facts

The repo name detrex has several interpretations:

  • detr-ex : We take our hats off to DETR and regard this repo as an extension of Transformer-based detection algorithms.

  • det-rex : rex literally means 'king' in Latin. We hope this repo can help advance the state of the art on object detection by providing the best Transformer-based detection algorithms from the research community.

  • de-t.rex : de means 'the' in Dutch. T.rex, also called Tyrannosaurus Rex, means 'king of the tyrant lizards' and connects to our research work 'DINO', which is short for Dinosaur.

What's New

v0.5.0 was released on 16/07/2023.

Please see changelog.md for details and release history.

Installation

Please refer to the Installation Instructions for details.

Getting Started

Please refer to Getting Started with detrex for the basic usage of detrex. We also provide other tutorials.

Some of the tutorials are still relatively brief, but we will keep improving the documentation to give users a better experience.

Documentation

Please see documentation for full API documentation and tutorials.

Model Zoo

Results and models are available in model zoo.

Supported methods

Please see projects for details on the projects built on top of detrex.

License

This project is released under the Apache 2.0 license.

Acknowledgement

  • detrex is an open-source toolbox for Transformer-based detection algorithms created by researchers at IDEACVR. We appreciate all contributions to detrex!
  • detrex is built on top of Detectron2, and part of its module design is borrowed from MMDetection, DETR, and Deformable-DETR.

Citation

If you use this toolbox in your research or wish to refer to the baseline results published here, please use the following BibTeX entries:

  • Citing detrex:
@misc{ren2023detrex,
      title={detrex: Benchmarking Detection Transformers}, 
      author={Tianhe Ren and Shilong Liu and Feng Li and Hao Zhang and Ailing Zeng and Jie Yang and Xingyu Liao and Ding Jia and Hongyang Li and He Cao and Jianan Wang and Zhaoyang Zeng and Xianbiao Qi and Yuhui Yuan and Jianwei Yang and Lei Zhang},
      year={2023},
      eprint={2306.07265},
      archivePrefix={arXiv},
      primaryClass={cs.CV}
}
  • Citing supported algorithms:
@inproceedings{carion2020end,
  title={End-to-end object detection with transformers},
  author={Carion, Nicolas and Massa, Francisco and Synnaeve, Gabriel and Usunier, Nicolas and Kirillov, Alexander and Zagoruyko, Sergey},
  booktitle={European conference on computer vision},
  pages={213--229},
  year={2020},
  organization={Springer}
}

@inproceedings{
  zhu2021deformable,
  title={Deformable {DETR}: Deformable Transformers for End-to-End Object Detection},
  author={Xizhou Zhu and Weijie Su and Lewei Lu and Bin Li and Xiaogang Wang and Jifeng Dai},
  booktitle={International Conference on Learning Representations},
  year={2021},
  url={https://openreview.net/forum?id=gZ9hCDWe6ke}
}

@inproceedings{meng2021-CondDETR,
  title       = {Conditional DETR for Fast Training Convergence},
  author      = {Meng, Depu and Chen, Xiaokang and Fan, Zejia and Zeng, Gang and Li, Houqiang and Yuan, Yuhui and Sun, Lei and Wang, Jingdong},
  booktitle   = {Proceedings of the IEEE International Conference on Computer Vision (ICCV)},
  year        = {2021}
}

@inproceedings{
  liu2022dabdetr,
  title={{DAB}-{DETR}: Dynamic Anchor Boxes are Better Queries for {DETR}},
  author={Shilong Liu and Feng Li and Hao Zhang and Xiao Yang and Xianbiao Qi and Hang Su and Jun Zhu and Lei Zhang},
  booktitle={International Conference on Learning Representations},
  year={2022},
  url={https://openreview.net/forum?id=oMI9PjOb9Jl}
}

@inproceedings{li2022dn,
  title={Dn-detr: Accelerate detr training by introducing query denoising},
  author={Li, Feng and Zhang, Hao and Liu, Shilong and Guo, Jian and Ni, Lionel M and Zhang, Lei},
  booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
  pages={13619--13627},
  year={2022}
}

@inproceedings{
  zhang2023dino,
  title={{DINO}: {DETR} with Improved DeNoising Anchor Boxes for End-to-End Object Detection},
  author={Hao Zhang and Feng Li and Shilong Liu and Lei Zhang and Hang Su and Jun Zhu and Lionel Ni and Heung-Yeung Shum},
  booktitle={The Eleventh International Conference on Learning Representations },
  year={2023},
  url={https://openreview.net/forum?id=3mRwyG5one}
}

@InProceedings{Chen_2023_ICCV,
  author    = {Chen, Qiang and Chen, Xiaokang and Wang, Jian and Zhang, Shan and Yao, Kun and Feng, Haocheng and Han, Junyu and Ding, Errui and Zeng, Gang and Wang, Jingdong},
  title     = {Group DETR: Fast DETR Training with Group-Wise One-to-Many Assignment},
  booktitle = {Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV)},
  month     = {October},
  year      = {2023},
  pages     = {6633-6642}
}

@InProceedings{Jia_2023_CVPR,
  author    = {Jia, Ding and Yuan, Yuhui and He, Haodi and Wu, Xiaopei and Yu, Haojun and Lin, Weihong and Sun, Lei and Zhang, Chao and Hu, Han},
  title     = {DETRs With Hybrid Matching},
  booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
  month     = {June},
  year      = {2023},
  pages     = {19702-19712}
}

@InProceedings{Li_2023_CVPR,
  author    = {Li, Feng and Zhang, Hao and Xu, Huaizhe and Liu, Shilong and Zhang, Lei and Ni, Lionel M. and Shum, Heung-Yeung},
  title     = {Mask DINO: Towards a Unified Transformer-Based Framework for Object Detection and Segmentation},
  booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
  month     = {June},
  year      = {2023},
  pages     = {3041-3050}
}

@article{yan2023bridging,
  title={Bridging the Gap Between End-to-end and Non-End-to-end Multi-Object Tracking},
  author={Yan, Feng and Luo, Weixin and Zhong, Yujie and Gan, Yiyang and Ma, Lin},
  journal={arXiv preprint arXiv:2305.12724},
  year={2023}
}

@InProceedings{Chen_2023_CVPR,
  author    = {Chen, Fangyi and Zhang, Han and Hu, Kai and Huang, Yu-Kai and Zhu, Chenchen and Savvides, Marios},
  title     = {Enhanced Training of Query-Based Object Detection via Selective Query Recollection},
  booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
  month     = {June},
  year      = {2023},
  pages     = {23756-23765}
}

detrex's People

Contributors

alrightkami, charles-xie, czczup, eltociear, fangyi-chen, felixcaae, fengli-ust, fengxiuyaun, froestiago, haozhang534, hukkelas, jasam-sheja, jiadingcn, pakcheera, rayleizhu, rentainhe, rodriguhe, shenyi0220, slongliu, thangngoc89, tosemml, triple-mu, xx025, zengzhaoyang


detrex's Issues

SwinV2 support

Is SwinV2 on your roadmap? Will you release DINO-SwinV2-L and DINO-SwinV2-G models?

Plan to support for ONNX

Hello,

Thanks for releasing these DETR-like models! As far as I know, DETR can already be converted to ONNX; do you have plans to support ONNX export for these DETR-like models as well?
I am looking forward to your reply, thanks!

Custom data set training using pre-trained weights

Hi there, thank you so much for your hard work on this release.
Could you please provide a tutorial for training on custom data with pre-trained weights?
And is it currently possible to fine-tune DINO with Swin backbones?

Fine-tune on custom dataset

Hello, I tried to train on a custom dataset with detrex, but an RTX 3090 still runs out of memory. The same device trains fine with the original DINO source code, so I guess this is caused by extra memory consumption in the detrex framework, or by some of my parameter settings?

About the difference between epochs and iters

Hi, thanks for releasing this repo. I'm a user of the original DINO, DAB-DETR, DN-DETR, and other DETR-like models. I really like the work your team has done.

However, I have some questions about how epochs work here. In the original DINO code we can set the number of epochs to something like 12, 24, or 36, which is easy to understand, and we can set the batch_size to 1, 2, or more.
For instance, if my dataset has 1000 pictures in total and the GPU processes 2 pictures at a time, then one epoch iterates over the whole dataset in 500 steps when batch_size is 2. That is easy for me to understand.

In detrex, which inherits some of its implementation from detectron2, the training length is configured through max_iter:
train.max_iter = 90000
train.eval_period = 5000
train.log_period = 20
As the code shows, it's hard for me to understand what these numbers (e.g. 90000) stand for. To put it more clearly: what is the relationship between max_iter and epochs, and what exactly does one iter mean?

If I have 1000 pictures and I want to train for 12 epochs with batch_size 2, how should I set max_iter?
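A minimal sketch of the usual epoch-to-iteration conversion for detectron2/detrex-style trainers, using the numbers from the question above (this is an illustration, not an official detrex utility):

# Convert "epochs" into a detectron2/detrex-style max_iter.
# One iter = one optimizer step on one batch of `total_batch_size` images (summed over all GPUs).
num_images = 1000          # images in the training set (from the question)
total_batch_size = 2       # dataloader.train.total_batch_size
epochs = 12                # desired number of passes over the dataset

iters_per_epoch = num_images // total_batch_size   # 1000 / 2 = 500 iterations per epoch
max_iter = epochs * iters_per_epoch                # 12 * 500 = 6000

print(f"train.max_iter = {max_iter}")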

Custom MaskDINO training crashes when setting number of classes

My goal is to fine-tune MaskDINO on a custom dataset with only one class.
For that I changed the config following the instructions in this issue:

import datetime
from detrex.config import get_config
from .models.maskdino_r50 import model
from .data.coco_instance_seg import dataloader

from fvcore.common.param_scheduler import MultiStepParamScheduler
from detectron2.config import LazyCall as L
from detectron2.solver import WarmupParamScheduler

# get default config
train = get_config("common/train.py").train

# max training iterations
train.max_iter = 36875

# warmup lr scheduler
lr_multiplier = L(WarmupParamScheduler)(
    scheduler=L(MultiStepParamScheduler)(
        values=[1.0, 0.1],
        milestones=[32777, 35509],
    ),
    warmup_length=10 / train.max_iter,
    warmup_factor=1.0,
)

optimizer = get_config("common/optim.py").AdamW

# initialize checkpoint to be loaded
train.init_checkpoint = "./projects/maskdino/maskdino_r50_50ep_300q_hid2048_3sd1_instance_maskenhanced_mask46.3ap_box51.7ap.pth" 
train.output_dir = "./output/maskdino/" + datetime.datetime.now().strftime("%d%m_%H%M")

# run evaluation every n iters
train.eval_period = 500

# log training information every n iters
train.log_period = 20

# save checkpoint every n iters
train.checkpointer.period = 9999

# gradient clipping for training
train.clip_grad.enabled = True
train.clip_grad.params.max_norm = 0.01
train.clip_grad.params.norm_type = 2

# set training devices
train.device = "cuda" # or "cuda:1" or "cpu"

# modify optimizer config
optimizer.lr = 1e-4
optimizer.betas = (0.9, 0.999)
optimizer.weight_decay = 1e-4
optimizer.params.lr_factor_func = lambda module_name: 0.1 if "backbone" in module_name else 1

# modify dataloader config
dataloader.train.num_workers = 1
dataloader.train.total_batch_size = 1

# dump the testing results into output_dir for visualization
dataloader.evaluator.output_dir = train.output_dir

# data
dataloader.train.dataset.names = 'graffiti_train'
dataloader.test.dataset.names = 'graffiti_test'

# number of classes
model.num_classes = 1

However, training crashes with the following log:

ERROR [12/06 15:54:21 d2.config.instantiate]: Error when instantiating projects.maskdino.maskdino.MaskDINO!
Traceback (most recent call last):
  File "tools/train_net_graffiti.py", line 232, in <module>
    launch(
  File "/home/jovyan/data/kamila/detrex/detectron2/detectron2/engine/launch.py", line 82, in launch
    main_func(*args)
  File "tools/train_net_graffiti.py", line 227, in main
    do_train(args, cfg)
  File "tools/train_net_graffiti.py", line 161, in do_train
    model = instantiate(cfg.model)
  File "/home/jovyan/data/kamila/detrex/detectron2/detectron2/config/instantiate.py", line 83, in instantiate
    return cls(**cfg)
TypeError: __init__() got an unexpected keyword argument 'num_classes'

If I don't include the number of classes in the config, training runs normally but crashes at inference:

[12/06 11:08:47] d2.evaluation.evaluator INFO: Inference done 483/483. Dataloading: 0.0023 s/iter. Inference: 0.1729 s/iter. Eval: 1.3440 s/iter. Total: 1.5193 s/iter. ETA=0:00:00
[12/06 11:08:47] d2.evaluation.evaluator INFO: Total inference time: 0:12:06.363428 (1.519589 s / iter per device, on 1 devices)
[12/06 11:08:47] d2.evaluation.evaluator INFO: Total inference pure compute time: 0:01:22 (0.172937 s / iter per device, on 1 devices)
[12/06 11:08:47] d2.evaluation.coco_evaluation INFO: Preparing results for COCO format ...
[12/06 11:08:47] d2.engine.train_loop ERROR: Exception during training:
Traceback (most recent call last):
  File "/home/jovyan/data/kamila/detrex/detectron2/detectron2/engine/train_loop.py", line 150, in train
    self.after_step()
  File "/home/jovyan/data/kamila/detrex/detectron2/detectron2/engine/train_loop.py", line 180, in after_step
    h.after_step()
  File "/home/jovyan/data/kamila/detrex/detectron2/detectron2/engine/hooks.py", line 555, in after_step
    self._do_eval()
  File "/home/jovyan/data/kamila/detrex/detectron2/detectron2/engine/hooks.py", line 528, in _do_eval
    results = self._func()
  File "tools/train_net_graffiti.py", line 194, in <lambda>
    hooks.EvalHook(cfg.train.eval_period, lambda: do_test(cfg, model)),
  File "tools/train_net_graffiti.py", line 135, in do_test
    ret = inference_on_dataset(
  File "/home/jovyan/data/kamila/detrex/detectron2/detectron2/evaluation/evaluator.py", line 204, in inference_on_dataset
    results = evaluator.evaluate()
  File "/home/jovyan/data/kamila/detrex/detectron2/detectron2/evaluation/coco_evaluation.py", line 206, in evaluate
    self._eval_predictions(predictions, img_ids=img_ids)
  File "/home/jovyan/data/kamila/detrex/detectron2/detectron2/evaluation/coco_evaluation.py", line 240, in _eval_predictions
    assert category_id < num_classes, (
AssertionError: A prediction has class=9, but the dataset only has 1 classes and predicted class id should be in [0, 0].

If you could fix this or help me figure out what I am doing wrong I would be very thankful!
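As a debugging aid (my own sketch, not a confirmed fix from the maintainers): the traceback says MaskDINO's __init__ does not accept num_classes, so the class count is probably configured on a nested sub-module (such as a head or the criterion) rather than on the top-level model. Dumping the lazy config makes it easy to find the right key; the config path below is hypothetical.

# Hedged sketch for locating where num_classes actually lives in the MaskDINO lazy config.
from detectron2.config import LazyConfig

cfg = LazyConfig.load("projects/maskdino/configs/your_config.py")  # hypothetical config path

# LazyConfig.to_py (available in recent detectron2) renders the config as Python-like code,
# so we can search it for "num_classes" and override the matching nested key instead of
# setting model.num_classes at the top level.
dumped = LazyConfig.to_py(cfg.model, prefix="model.")
for line in dumped.splitlines():
    if "num_classes" in line:
        print(line)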

DINO evaluation on a custom dataset crashes

When fine-tuning with the following config:

import datetime
from detrex.config import get_config
from .models.dino_swin_large_384 import model


# number of classes
model.num_classes = 1

# get default config
dataloader = get_config("common/data/coco_detr_graffiti.py").dataloader
optimizer = get_config("common/optim.py").AdamW
lr_multiplier = get_config("common/coco_schedule.py").lr_multiplier_12ep
train = get_config("common/train.py").train

# modify training config
train.init_checkpoint = "./projects/dino/dino_swin_large_4scale_12ep.pth"
train.output_dir = "./output/graffiti_dino/satellit" + datetime.datetime.now().strftime("%d%m_%H%M")

# max training iterations
train.max_iter = 50000

# run evaluation every n iters
train.eval_period = 100000  # FYI evaluation crashes! TODO fix

# log training information every 20 iters
train.log_period = 20

# save checkpoint every n iters
train.checkpointer.period = 9999

# gradient clipping for training
train.clip_grad.enabled = True
train.clip_grad.params.max_norm = 0.1
train.clip_grad.params.norm_type = 2

# set training devices
train.device = "cuda" # or "cuda:1" or "cpu"
model.device = train.device

# modify optimizer config
optimizer.lr = 1e-4
optimizer.betas = (0.9, 0.999)
optimizer.weight_decay = 1e-4
optimizer.params.lr_factor_func = lambda module_name: 0.1 if "backbone" in module_name else 1

# modify dataloader config
dataloader.train.num_workers = 2
dataloader.train.dataset.filter_empty = False 

# please notice that this is total batch size
# suppose you're using 4 gpus for training and the batch size for
# each gpu is 16/4 = 4
dataloader.train.total_batch_size = 2

# data
dataloader.train.dataset.names = 'graffiti_train'
dataloader.test.dataset.names = 'graffiti_test'
dataloader.evaluator.dataset_name = 'graffiti_val'

# dump the testing results into output_dir for visualization
dataloader.evaluator.output_dir = train.output_dir

I'm getting an exception:

[12/11 10:28:13 d2.utils.events]:  eta: 20:01:48  iter: 79  total_loss: 11.8  loss_class: 0.3539  loss_bbox: 0.1169  loss_giou: 0.3934  loss_class_0: 0.5056  loss_bbox_0: 0.1242  loss_giou_0: 0.4234  loss_class_1: 0.5262  loss_bbox_1: 0.09443  loss_giou_1: 0.3764  loss_class_2: 0.409  loss_bbox_2: 0.111  loss_giou_2: 0.3898  loss_class_3: 0.3466  loss_bbox_3: 0.1062  loss_giou_3: 0.3604  loss_class_4: 0.3582  loss_bbox_4: 0.1168  loss_giou_4: 0.3903  loss_class_enc: 0.6693  loss_bbox_enc: 0.147  loss_giou_enc: 0.5353  loss_class_dn: 0.01112  loss_bbox_dn: 0.1165  loss_giou_dn: 0.4171  loss_class_dn_0: 0.1569  loss_bbox_dn_0: 0.219  loss_giou_dn_0: 0.6382  loss_class_dn_1: 0.03851  loss_bbox_dn_1: 0.1345  loss_giou_dn_1: 0.4488  loss_class_dn_2: 0.02079  loss_bbox_dn_2: 0.1128  loss_giou_dn_2: 0.407  loss_class_dn_3: 0.01118  loss_bbox_dn_3: 0.1179  loss_giou_dn_3: 0.4205  loss_class_dn_4: 0.008087  loss_bbox_dn_4: 0.1166  loss_giou_dn_4: 0.4164  time: 1.3978  data_time: 0.0050  lr: 0.0001  max_mem: 35711M
[12/11 10:28:40 d2.data.datasets.coco]: Loaded 37 images in COCO format from /home/jovyan/data/kamila/data/satellit/splitted/test/annotations/instances_default.json
[12/11 10:28:40 d2.data.build]: Distribution of instances among all 1 categories:
|  category  | #instances   |
|:----------:|:-------------|
|  graffiti  | 91           |
|            |              |
[12/11 10:28:40 d2.data.common]: Serializing 37 elements to byte tensors and concatenating them all ...
[12/11 10:28:40 d2.data.common]: Serialized dataset takes 0.09 MiB
[12/11 10:28:40 d2.evaluation.evaluator]: Start inference on 37 batches
[12/11 10:28:43 d2.evaluation.evaluator]: Inference done 11/37. Dataloading: 0.0010 s/iter. Inference: 0.2747 s/iter. Eval: 0.0006 s/iter. Total: 0.2763 s/iter. ETA=0:00:07
[12/11 10:28:49 d2.evaluation.evaluator]: Inference done 30/37. Dataloading: 0.0013 s/iter. Inference: 0.2739 s/iter. Eval: 0.0005 s/iter. Total: 0.2757 s/iter. ETA=0:00:01
[12/11 10:28:51 d2.evaluation.evaluator]: Total inference time: 0:00:08.947697 (0.279616 s / iter per device, on 1 devices)
[12/11 10:28:51 d2.evaluation.evaluator]: Total inference pure compute time: 0:00:08 (0.273675 s / iter per device, on 1 devices)
[12/11 10:28:51 d2.evaluation.coco_evaluation]: Preparing results for COCO format ...
[12/11 10:28:51 d2.evaluation.coco_evaluation]: Saving results to ./output/graffiti_dino/satellit1112_1026/coco_instances_results.json
[12/11 10:28:51 d2.evaluation.coco_evaluation]: Evaluating predictions with unofficial COCO API...
Loading and preparing results...
ERROR [12/11 10:28:51 d2.engine.train_loop]: Exception during training:
Traceback (most recent call last):
  File "/home/jovyan/data/kamila/detrex/detectron2/detectron2/engine/train_loop.py", line 150, in train
    self.after_step()
  File "/home/jovyan/data/kamila/detrex/detectron2/detectron2/engine/train_loop.py", line 180, in after_step
    h.after_step()
  File "/home/jovyan/data/kamila/detrex/detectron2/detectron2/engine/hooks.py", line 555, in after_step
    self._do_eval()
  File "/home/jovyan/data/kamila/detrex/detectron2/detectron2/engine/hooks.py", line 528, in _do_eval
    results = self._func()
  File "tools/train_net_satellit_graffiti.py", line 194, in <lambda>
    hooks.EvalHook(cfg.train.eval_period, lambda: do_test(cfg, model)),
  File "tools/train_net_satellit_graffiti.py", line 135, in do_test
    ret = inference_on_dataset(
  File "/home/jovyan/data/kamila/detrex/detectron2/detectron2/evaluation/evaluator.py", line 204, in inference_on_dataset
    results = evaluator.evaluate()
  File "/home/jovyan/data/kamila/detrex/detectron2/detectron2/evaluation/coco_evaluation.py", line 206, in evaluate
    self._eval_predictions(predictions, img_ids=img_ids)
  File "/home/jovyan/data/kamila/detrex/detectron2/detectron2/evaluation/coco_evaluation.py", line 266, in _eval_predictions
    _evaluate_predictions_on_coco(
  File "/home/jovyan/data/kamila/detrex/detectron2/detectron2/evaluation/coco_evaluation.py", line 590, in _evaluate_predictions_on_coco
    coco_dt = coco_gt.loadRes(coco_results)
  File "/opt/conda/lib/python3.8/site-packages/pycocotools/coco.py", line 327, in loadRes
    assert set(annsImgIds) == (set(annsImgIds) & set(self.getImgIds())), \
AssertionError: Results do not correspond to current coco set
[12/11 10:28:51 d2.engine.hooks]: Overall training speed: 97 iterations in 0:02:16 (1.4051 s / it)
[12/11 10:28:51 d2.engine.hooks]: Total training time: 0:02:27 (0:00:11 on hooks)
[12/11 10:28:51 d2.utils.events]:  eta: 20:00:31  iter: 99  total_loss: 8.968  loss_class: 0.3145  loss_bbox: 0.07012  loss_giou: 0.3334  loss_class_0: 0.4652  loss_bbox_0: 0.08328  loss_giou_0: 0.3327  loss_class_1: 0.5017  loss_bbox_1: 0.06474  loss_giou_1: 0.2563  loss_class_2: 0.3895  loss_bbox_2: 0.07384  loss_giou_2: 0.3273  loss_class_3: 0.3095  loss_bbox_3: 0.06951  loss_giou_3: 0.329  loss_class_4: 0.3126  loss_bbox_4: 0.07025  loss_giou_4: 0.3316  loss_class_enc: 0.5665  loss_bbox_enc: 0.1357  loss_giou_enc: 0.4332  loss_class_dn: 0.002992  loss_bbox_dn: 0.07017  loss_giou_dn: 0.3381  loss_class_dn_0: 0.1288  loss_bbox_dn_0: 0.1546  loss_giou_dn_0: 0.5158  loss_class_dn_1: 0.02353  loss_bbox_dn_1: 0.08345  loss_giou_dn_1: 0.3674  loss_class_dn_2: 0.01641  loss_bbox_dn_2: 0.07573  loss_giou_dn_2: 0.3332  loss_class_dn_3: 0.009212  loss_bbox_dn_3: 0.07121  loss_giou_dn_3: 0.3396  loss_class_dn_4: 0.00324  loss_bbox_dn_4: 0.07031  loss_giou_dn_4: 0.3378  time: 1.3907  data_time: 0.0049  lr: 0.0001  max_mem: 35711M
Traceback (most recent call last):
  File "tools/train_net_satellit_graffiti.py", line 232, in <module>
    launch(
  File "/home/jovyan/data/kamila/detrex/detectron2/detectron2/engine/launch.py", line 82, in launch
    main_func(*args)
  File "tools/train_net_satellit_graffiti.py", line 227, in main
    do_train(args, cfg)
  File "tools/train_net_satellit_graffiti.py", line 211, in do_train
    trainer.train(start_iter, cfg.train.max_iter)
  File "/home/jovyan/data/kamila/detrex/detectron2/detectron2/engine/train_loop.py", line 150, in train
    self.after_step()
  File "/home/jovyan/data/kamila/detrex/detectron2/detectron2/engine/train_loop.py", line 180, in after_step
    h.after_step()
  File "/home/jovyan/data/kamila/detrex/detectron2/detectron2/engine/hooks.py", line 555, in after_step
    self._do_eval()
  File "/home/jovyan/data/kamila/detrex/detectron2/detectron2/engine/hooks.py", line 528, in _do_eval
    results = self._func()
  File "tools/train_net_satellit_graffiti.py", line 194, in <lambda>
    hooks.EvalHook(cfg.train.eval_period, lambda: do_test(cfg, model)),
  File "tools/train_net_satellit_graffiti.py", line 135, in do_test
    ret = inference_on_dataset(
  File "/home/jovyan/data/kamila/detrex/detectron2/detectron2/evaluation/evaluator.py", line 204, in inference_on_dataset
    results = evaluator.evaluate()
  File "/home/jovyan/data/kamila/detrex/detectron2/detectron2/evaluation/coco_evaluation.py", line 206, in evaluate
    self._eval_predictions(predictions, img_ids=img_ids)
  File "/home/jovyan/data/kamila/detrex/detectron2/detectron2/evaluation/coco_evaluation.py", line 266, in _eval_predictions
    _evaluate_predictions_on_coco(
  File "/home/jovyan/data/kamila/detrex/detectron2/detectron2/evaluation/coco_evaluation.py", line 590, in _evaluate_predictions_on_coco
    coco_dt = coco_gt.loadRes(coco_results)
  File "/opt/conda/lib/python3.8/site-packages/pycocotools/coco.py", line 327, in loadRes
    assert set(annsImgIds) == (set(annsImgIds) & set(self.getImgIds())), \
AssertionError: Results do not correspond to current coco set
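One observation, offered as a guess rather than a confirmed fix: the config above uses 'graffiti_test' for dataloader.test but 'graffiti_val' for dataloader.evaluator.dataset_name, and the failing assertion in pycocotools checks that the predicted image ids belong to the ground-truth set the evaluator was built from, so a mismatch like this can trigger exactly this error. A minimal sketch of keeping the two consistent:

# Hedged sketch: build the evaluator on the same dataset that the test dataloader uses,
# so prediction image ids and the COCO ground truth refer to the same image set.
# (`dataloader` is the same object obtained from get_config(...) in the config above.)
dataloader.test.dataset.names = "graffiti_test"
dataloader.evaluator.dataset_name = dataloader.test.dataset.names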

ImportError: Cannot import 'detrex._C', therefore 'MultiScaleDeformableAttention' is not available

I installed detrex==0.1.0 successfully, but when running train_net I get the following error.

error info

WARNING:root:Pytorch pre-release version 1.13.0.dev20220914+cu113 - assuming intent to test it
projects/dino/configs/dino_r50_4scale_12ep.py
[09/29 14:20:44 detectron2]: Rank of current process: 0. World size: 1
cuobjdump info    : File '/data/lr/anaconda3/envs/detrex/lib/python3.10/site-packages/torchvision/_C.so' does not contain device code
[09/29 14:20:46 detectron2]: Environment info:
----------------------  ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
sys.platform            linux
Python                  3.10.4 (main, Mar 31 2022, 08:41:55) [GCC 7.5.0]
numpy                   1.23.3
detectron2              0.6 @/data/lr/detrex/detectron2/detectron2
detectron2._C           not built correctly: /data/lr/detrex/detectron2/detectron2/_C.cpython-310-x86_64-linux-gnu.so: undefined symbol: _ZN2at4_ops19empty_memory_format4callEN3c108ArrayRefIlEENS2_8optionalINS2_10ScalarTypeEEENS5_INS2_6LayoutEEENS5_INS2_6DeviceEEENS5_IbEENS5_INS2_12MemoryFormatEEE
Compiler ($CXX)         c++ (Ubuntu 9.4.0-1ubuntu1~20.04.1) 9.4.0
CUDA compiler           Build cuda_11.6.r11.6/compiler.31057947_0
detectron2 arch flags   7.0
DETECTRON2_ENV_MODULE   <not set>
PyTorch                 1.13.0.dev20220914+cu113 @/data/lr/anaconda3/envs/detrex/lib/python3.10/site-packages/torch
PyTorch debug build     False
GPU available           Yes
GPU 0,1,2,3             NVIDIA A40 (arch=8.6)
Driver version          515.65.01
CUDA_HOME               /usr/local/cuda
Pillow                  9.2.0
torchvision             0.14.0.dev20220928 @/data/lr/anaconda3/envs/detrex/lib/python3.10/site-packages/torchvision
torchvision arch flags  /data/lr/anaconda3/envs/detrex/lib/python3.10/site-packages/torchvision/_C.so
fvcore                  0.1.5.post20220512
iopath                  0.1.9
cv2                     4.5.5
----------------------  ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
PyTorch built with:
  - GCC 9.3
  - C++ Version: 201402
  - Intel(R) Math Kernel Library Version 2020.0.0 Product Build 20191122 for Intel(R) 64 architecture applications
  - Intel(R) MKL-DNN v2.6.0 (Git Hash 52b5f107dd9cf10910aaa19cb47f3abf9b349815)
  - OpenMP 201511 (a.k.a. OpenMP 4.5)
  - LAPACK is enabled (usually provided by MKL)
  - NNPACK is enabled
  - CPU capability usage: AVX2
  - CUDA Runtime 11.3
  - NVCC architecture flags: -gencode;arch=compute_37,code=sm_37;-gencode;arch=compute_50,code=sm_50;-gencode;arch=compute_60,code=sm_60;-gencode;arch=compute_70,code=sm_70;-gencode;arch=compute_75,code=sm_75;-gencode;arch=compute_80,code=sm_80;-gencode;arch=compute_86,code=sm_86
  - CuDNN 8.4.1  (built against CUDA 11.6)
    - Built with CuDNN 8.3.2
  - Magma 2.5.2
  - Build settings: BLAS_INFO=mkl, BUILD_TYPE=Release, CUDA_VERSION=11.3, CUDNN_VERSION=8.3.2, CXX_COMPILER=/opt/rh/devtoolset-9/root/usr/bin/c++, CXX_FLAGS= -fabi-version=11 -Wno-deprecated -fvisibility-inlines-hidden -DUSE_PTHREADPOOL -fopenmp -DNDEBUG -DUSE_KINETO -DUSE_FBGEMM -DUSE_QNNPACK -DUSE_PYTORCH_QNNPACK -DUSE_XNNPACK -DSYMBOLICATE_MOBILE_DEBUG_HANDLE -DEDGE_PROFILER_USE_KINETO -O2 -fPIC -Wno-narrowing -Wall -Wextra -Werror=return-type -Wno-missing-field-initializers -Wno-type-limits -Wno-array-bounds -Wno-unknown-pragmas -Wno-unused-parameter -Wno-unused-function -Wno-unused-result -Wno-strict-overflow -Wno-strict-aliasing -Wno-error=deprecated-declarations -Wno-stringop-overflow -Wno-psabi -Wno-error=pedantic -Wno-error=redundant-decls -Wno-error=old-style-cast -fdiagnostics-color=always -faligned-new -Wno-unused-but-set-variable -Wno-maybe-uninitialized -fno-math-errno -fno-trapping-math -Werror=format -Werror=cast-function-type -Wno-stringop-overflow, LAPACK_INFO=mkl, PERF_WITH_AVX=1, PERF_WITH_AVX2=1, PERF_WITH_AVX512=1, TORCH_VERSION=1.13.0, USE_CUDA=ON, USE_CUDNN=ON, USE_EXCEPTION_PTR=1, USE_GFLAGS=OFF, USE_GLOG=OFF, USE_MKL=ON, USE_MKLDNN=ON, USE_MPI=OFF, USE_NCCL=ON, USE_NNPACK=ON, USE_OPENMP=ON, USE_ROCM=OFF, 

[09/29 14:20:46 detectron2]: Command line arguments: Namespace(config_file='projects/dino/configs/dino_r50_4scale_12ep.py', resume=False, eval_only=False, num_gpus=1, num_machines=1, machine_rank=0, dist_url='tcp://127.0.0.1:50156', opts=[])
[09/29 14:20:46 detectron2]: Contents of args.config_file=projects/dino/configs/dino_r50_4scale_12ep.py:
from detrex.config import get_config
from .models.dino_r50 import model

# get default config
dataloader = get_config("common/data/coco_detr.py").dataloader
optimizer = get_config("common/optim.py").AdamW
lr_multiplier = get_config("common/coco_schedule.py").lr_multiplier_12ep
train = get_config("common/train.py").train

# modify training config
train.init_checkpoint = "detectron2://ImageNetPretrained/torchvision/R-50.pkl"
train.output_dir = "./output/dino_r50_4scale_12ep"
train.max_iter = 90000
train.clip_grad.enabled = True
train.clip_grad.params.max_norm = 0.1
train.clip_grad.params.norm_type = 2
train.seed = 42

# modify optimizer config
optimizer.weight_decay = 1e-4
optimizer.params.lr_factor_func = lambda module_name: 0.1 if "backbone" in module_name else 1

# modify dataloader config
dataloader.train.dataset.filter_empty = False
dataloader.train.num_workers = 16

WARNING [09/29 14:20:46 d2.config.lazy]: The config contains objects that cannot serialize to a valid yaml. ./output/dino_r50_4scale_12ep/config.yaml is human-readable but cannot be loaded.
WARNING [09/29 14:20:46 d2.config.lazy]: Config is saved using cloudpickle at ./output/dino_r50_4scale_12ep/config.yaml.pkl.
[09/29 14:20:46 detectron2]: Full config saved to ./output/dino_r50_4scale_12ep/config.yaml
Traceback (most recent call last):
  File "/data/lr/detrex/tools/train_net.py", line 231, in <module>
    launch(
  File "/data/lr/detrex/detectron2/detectron2/engine/launch.py", line 82, in launch
    main_func(*args)
  File "/data/lr/detrex/tools/train_net.py", line 225, in main
    do_train(args, cfg)
  File "/data/lr/detrex/tools/train_net.py", line 159, in do_train
    model = instantiate(cfg.model)
  File "/data/lr/detrex/detectron2/detectron2/config/instantiate.py", line 67, in instantiate
    cfg = {k: instantiate(v) for k, v in cfg.items()}
  File "/data/lr/detrex/detectron2/detectron2/config/instantiate.py", line 67, in <dictcomp>
    cfg = {k: instantiate(v) for k, v in cfg.items()}
  File "/data/lr/detrex/detectron2/detectron2/config/instantiate.py", line 67, in instantiate
    cfg = {k: instantiate(v) for k, v in cfg.items()}
  File "/data/lr/detrex/detectron2/detectron2/config/instantiate.py", line 67, in <dictcomp>
    cfg = {k: instantiate(v) for k, v in cfg.items()}
  File "/data/lr/detrex/detectron2/detectron2/config/instantiate.py", line 83, in instantiate
    return cls(**cfg)
  File "/data/lr/detrex/projects/dino/modeling/dino_transformer.py", line 45, in __init__
    attn=MultiScaleDeformableAttention(
  File "/data/lr/anaconda3/envs/detrex/lib/python3.10/site-packages/detrex-0.1.0-py3.10-linux-x86_64.egg/detrex/layers/multi_scale_deform_attn.py", line 373, in __init__
    raise ImportError(err)
ImportError: Cannot import 'detrex._C', therefore 'MultiScaleDeformableAttention' is not available. detrex is not compiled successfully, please build following the instructions!
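A quick sanity check I would suggest (my own sketch, not an official diagnostic): try importing the compiled extension directly; if the import fails, the CUDA/C++ ops were not built into the installation that is actually on sys.path. Note that the environment dump above also reports detectron2._C as "not built correctly" with an undefined symbol, which usually points to extensions compiled against a different PyTorch build than the one used at runtime.

# Minimal import check for the compiled ops (module name taken from the error message above).
import torch
print("torch:", torch.__version__, "cuda:", torch.version.cuda, "available:", torch.cuda.is_available())

import detrex
print("detrex imported from:", detrex.__file__)  # confirm which installation is being picked up

import detrex._C  # raises ImportError if the CUDA/C++ extension was not compiled into this install
print("detrex._C loaded successfully")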

Install log

Building wheel detrex-0.1.0
running install
/data/lr/anaconda3/envs/detrex/lib/python3.10/site-packages/setuptools/command/install.py:34: SetuptoolsDeprecationWarning: setup.py install is deprecated. Use build and pip and other standards-based tools.
  warnings.warn(
/data/lr/anaconda3/envs/detrex/lib/python3.10/site-packages/setuptools/command/easy_install.py:144: EasyInstallDeprecationWarning: easy_install command is deprecated. Use build and pip and other standards-based tools.
  warnings.warn(
running bdist_egg
running egg_info
writing detrex.egg-info/PKG-INFO
writing dependency_links to detrex.egg-info/dependency_links.txt
writing requirements to detrex.egg-info/requires.txt
writing top-level names to detrex.egg-info/top_level.txt
/data/lr/anaconda3/envs/detrex/lib/python3.10/site-packages/torch/utils/cpp_extension.py:472: UserWarning: Attempted to use ninja as the BuildExtension backend but we could not find ninja.. Falling back to using the slow distutils backend.
  warnings.warn(msg.format('we could not find ninja.'))
reading manifest file 'detrex.egg-info/SOURCES.txt'
adding license file 'LICENSE'
writing manifest file 'detrex.egg-info/SOURCES.txt'
installing library code to build/bdist.linux-x86_64/egg
running install_lib
running build_py
copying detrex/version.py -> build/lib.linux-x86_64-cpython-310/detrex
running build_ext
/data/lr/anaconda3/envs/detrex/lib/python3.10/site-packages/torch/utils/cpp_extension.py:383: UserWarning: The detected CUDA version (11.6) has a minor version mismatch with the version that was used to compile PyTorch (11.3). Most likely this shouldn't be a problem.
  warnings.warn(CUDA_MISMATCH_WARN.format(cuda_str_version, torch.version.cuda))
creating build/bdist.linux-x86_64/egg
creating build/bdist.linux-x86_64/egg/detrex
creating build/bdist.linux-x86_64/egg/detrex/utils
copying build/lib.linux-x86_64-cpython-310/detrex/utils/dist.py -> build/bdist.linux-x86_64/egg/detrex/utils
copying build/lib.linux-x86_64-cpython-310/detrex/utils/misc.py -> build/bdist.linux-x86_64/egg/detrex/utils
copying build/lib.linux-x86_64-cpython-310/detrex/utils/__init__.py -> build/bdist.linux-x86_64/egg/detrex/utils
copying build/lib.linux-x86_64-cpython-310/detrex/version.py -> build/bdist.linux-x86_64/egg/detrex
creating build/bdist.linux-x86_64/egg/detrex/modeling
creating build/bdist.linux-x86_64/egg/detrex/modeling/losses
copying build/lib.linux-x86_64-cpython-310/detrex/modeling/losses/dice_loss.py -> build/bdist.linux-x86_64/egg/detrex/modeling/losses
copying build/lib.linux-x86_64-cpython-310/detrex/modeling/losses/smooth_l1_loss.py -> build/bdist.linux-x86_64/egg/detrex/modeling/losses
copying build/lib.linux-x86_64-cpython-310/detrex/modeling/losses/focal_loss.py -> build/bdist.linux-x86_64/egg/detrex/modeling/losses
copying build/lib.linux-x86_64-cpython-310/detrex/modeling/losses/giou_loss.py -> build/bdist.linux-x86_64/egg/detrex/modeling/losses
copying build/lib.linux-x86_64-cpython-310/detrex/modeling/losses/__init__.py -> build/bdist.linux-x86_64/egg/detrex/modeling/losses
copying build/lib.linux-x86_64-cpython-310/detrex/modeling/losses/utils.py -> build/bdist.linux-x86_64/egg/detrex/modeling/losses
copying build/lib.linux-x86_64-cpython-310/detrex/modeling/losses/cross_entropy_loss.py -> build/bdist.linux-x86_64/egg/detrex/modeling/losses
creating build/bdist.linux-x86_64/egg/detrex/modeling/criterion
copying build/lib.linux-x86_64-cpython-310/detrex/modeling/criterion/criterion.py -> build/bdist.linux-x86_64/egg/detrex/modeling/criterion
copying build/lib.linux-x86_64-cpython-310/detrex/modeling/criterion/__init__.py -> build/bdist.linux-x86_64/egg/detrex/modeling/criterion
copying build/lib.linux-x86_64-cpython-310/detrex/modeling/criterion/base_criterion.py -> build/bdist.linux-x86_64/egg/detrex/modeling/criterion
creating build/bdist.linux-x86_64/egg/detrex/modeling/matcher
copying build/lib.linux-x86_64-cpython-310/detrex/modeling/matcher/match_cost.py -> build/bdist.linux-x86_64/egg/detrex/modeling/matcher
copying build/lib.linux-x86_64-cpython-310/detrex/modeling/matcher/matcher.py -> build/bdist.linux-x86_64/egg/detrex/modeling/matcher
copying build/lib.linux-x86_64-cpython-310/detrex/modeling/matcher/modified_matcher.py -> build/bdist.linux-x86_64/egg/detrex/modeling/matcher
copying build/lib.linux-x86_64-cpython-310/detrex/modeling/matcher/__init__.py -> build/bdist.linux-x86_64/egg/detrex/modeling/matcher
creating build/bdist.linux-x86_64/egg/detrex/modeling/backbone
copying build/lib.linux-x86_64-cpython-310/detrex/modeling/backbone/timm_backbone.py -> build/bdist.linux-x86_64/egg/detrex/modeling/backbone
copying build/lib.linux-x86_64-cpython-310/detrex/modeling/backbone/convnext.py -> build/bdist.linux-x86_64/egg/detrex/modeling/backbone
copying build/lib.linux-x86_64-cpython-310/detrex/modeling/backbone/resnet.py -> build/bdist.linux-x86_64/egg/detrex/modeling/backbone
copying build/lib.linux-x86_64-cpython-310/detrex/modeling/backbone/focalnet.py -> build/bdist.linux-x86_64/egg/detrex/modeling/backbone
copying build/lib.linux-x86_64-cpython-310/detrex/modeling/backbone/__init__.py -> build/bdist.linux-x86_64/egg/detrex/modeling/backbone
copying build/lib.linux-x86_64-cpython-310/detrex/modeling/backbone/torchvision_backbone.py -> build/bdist.linux-x86_64/egg/detrex/modeling/backbone
copying build/lib.linux-x86_64-cpython-310/detrex/modeling/__init__.py -> build/bdist.linux-x86_64/egg/detrex/modeling
creating build/bdist.linux-x86_64/egg/detrex/modeling/neck
copying build/lib.linux-x86_64-cpython-310/detrex/modeling/neck/__init__.py -> build/bdist.linux-x86_64/egg/detrex/modeling/neck
copying build/lib.linux-x86_64-cpython-310/detrex/modeling/neck/channel_mapper.py -> build/bdist.linux-x86_64/egg/detrex/modeling/neck
copying build/lib.linux-x86_64-cpython-310/detrex/_C.cpython-310-x86_64-linux-gnu.so -> build/bdist.linux-x86_64/egg/detrex
creating build/bdist.linux-x86_64/egg/detrex/config
creating build/bdist.linux-x86_64/egg/detrex/config/configs
creating build/bdist.linux-x86_64/egg/detrex/config/configs/common
copying build/lib.linux-x86_64-cpython-310/detrex/config/configs/common/coco_schedule.py -> build/bdist.linux-x86_64/egg/detrex/config/configs/common
copying build/lib.linux-x86_64-cpython-310/detrex/config/configs/common/train.py -> build/bdist.linux-x86_64/egg/detrex/config/configs/common
creating build/bdist.linux-x86_64/egg/detrex/config/configs/common/data
copying build/lib.linux-x86_64-cpython-310/detrex/config/configs/common/data/coco_detr.py -> build/bdist.linux-x86_64/egg/detrex/config/configs/common/data
copying build/lib.linux-x86_64-cpython-310/detrex/config/configs/common/data/coco.py -> build/bdist.linux-x86_64/egg/detrex/config/configs/common/data
copying build/lib.linux-x86_64-cpython-310/detrex/config/configs/common/optim.py -> build/bdist.linux-x86_64/egg/detrex/config/configs/common
copying build/lib.linux-x86_64-cpython-310/detrex/config/__init__.py -> build/bdist.linux-x86_64/egg/detrex/config
copying build/lib.linux-x86_64-cpython-310/detrex/config/config.py -> build/bdist.linux-x86_64/egg/detrex/config
copying build/lib.linux-x86_64-cpython-310/detrex/__init__.py -> build/bdist.linux-x86_64/egg/detrex
creating build/bdist.linux-x86_64/egg/detrex/data
copying build/lib.linux-x86_64-cpython-310/detrex/data/detr_dataset_mapper.py -> build/bdist.linux-x86_64/egg/detrex/data
copying build/lib.linux-x86_64-cpython-310/detrex/data/__init__.py -> build/bdist.linux-x86_64/egg/detrex/data
creating build/bdist.linux-x86_64/egg/detrex/layers
copying build/lib.linux-x86_64-cpython-310/detrex/layers/shape_spec.py -> build/bdist.linux-x86_64/egg/detrex/layers
copying build/lib.linux-x86_64-cpython-310/detrex/layers/position_embedding.py -> build/bdist.linux-x86_64/egg/detrex/layers
copying build/lib.linux-x86_64-cpython-310/detrex/layers/layer_norm.py -> build/bdist.linux-x86_64/egg/detrex/layers
copying build/lib.linux-x86_64-cpython-310/detrex/layers/transformer.py -> build/bdist.linux-x86_64/egg/detrex/layers
copying build/lib.linux-x86_64-cpython-310/detrex/layers/conv.py -> build/bdist.linux-x86_64/egg/detrex/layers
copying build/lib.linux-x86_64-cpython-310/detrex/layers/box_ops.py -> build/bdist.linux-x86_64/egg/detrex/layers
copying build/lib.linux-x86_64-cpython-310/detrex/layers/mlp.py -> build/bdist.linux-x86_64/egg/detrex/layers
copying build/lib.linux-x86_64-cpython-310/detrex/layers/denoising.py -> build/bdist.linux-x86_64/egg/detrex/layers
copying build/lib.linux-x86_64-cpython-310/detrex/layers/attention.py -> build/bdist.linux-x86_64/egg/detrex/layers
copying build/lib.linux-x86_64-cpython-310/detrex/layers/multi_scale_deform_attn.py -> build/bdist.linux-x86_64/egg/detrex/layers
copying build/lib.linux-x86_64-cpython-310/detrex/layers/__init__.py -> build/bdist.linux-x86_64/egg/detrex/layers
byte-compiling build/bdist.linux-x86_64/egg/detrex/utils/dist.py to dist.cpython-310.pyc
byte-compiling build/bdist.linux-x86_64/egg/detrex/utils/misc.py to misc.cpython-310.pyc
byte-compiling build/bdist.linux-x86_64/egg/detrex/utils/__init__.py to __init__.cpython-310.pyc
byte-compiling build/bdist.linux-x86_64/egg/detrex/version.py to version.cpython-310.pyc
byte-compiling build/bdist.linux-x86_64/egg/detrex/modeling/losses/dice_loss.py to dice_loss.cpython-310.pyc
byte-compiling build/bdist.linux-x86_64/egg/detrex/modeling/losses/smooth_l1_loss.py to smooth_l1_loss.cpython-310.pyc
byte-compiling build/bdist.linux-x86_64/egg/detrex/modeling/losses/focal_loss.py to focal_loss.cpython-310.pyc
byte-compiling build/bdist.linux-x86_64/egg/detrex/modeling/losses/giou_loss.py to giou_loss.cpython-310.pyc
byte-compiling build/bdist.linux-x86_64/egg/detrex/modeling/losses/__init__.py to __init__.cpython-310.pyc
byte-compiling build/bdist.linux-x86_64/egg/detrex/modeling/losses/utils.py to utils.cpython-310.pyc
byte-compiling build/bdist.linux-x86_64/egg/detrex/modeling/losses/cross_entropy_loss.py to cross_entropy_loss.cpython-310.pyc
byte-compiling build/bdist.linux-x86_64/egg/detrex/modeling/criterion/criterion.py to criterion.cpython-310.pyc
byte-compiling build/bdist.linux-x86_64/egg/detrex/modeling/criterion/__init__.py to __init__.cpython-310.pyc
byte-compiling build/bdist.linux-x86_64/egg/detrex/modeling/criterion/base_criterion.py to base_criterion.cpython-310.pyc
byte-compiling build/bdist.linux-x86_64/egg/detrex/modeling/matcher/match_cost.py to match_cost.cpython-310.pyc
byte-compiling build/bdist.linux-x86_64/egg/detrex/modeling/matcher/matcher.py to matcher.cpython-310.pyc
byte-compiling build/bdist.linux-x86_64/egg/detrex/modeling/matcher/modified_matcher.py to modified_matcher.cpython-310.pyc
byte-compiling build/bdist.linux-x86_64/egg/detrex/modeling/matcher/__init__.py to __init__.cpython-310.pyc
byte-compiling build/bdist.linux-x86_64/egg/detrex/modeling/backbone/timm_backbone.py to timm_backbone.cpython-310.pyc
byte-compiling build/bdist.linux-x86_64/egg/detrex/modeling/backbone/convnext.py to convnext.cpython-310.pyc
byte-compiling build/bdist.linux-x86_64/egg/detrex/modeling/backbone/resnet.py to resnet.cpython-310.pyc
byte-compiling build/bdist.linux-x86_64/egg/detrex/modeling/backbone/focalnet.py to focalnet.cpython-310.pyc
byte-compiling build/bdist.linux-x86_64/egg/detrex/modeling/backbone/__init__.py to __init__.cpython-310.pyc
byte-compiling build/bdist.linux-x86_64/egg/detrex/modeling/backbone/torchvision_backbone.py to torchvision_backbone.cpython-310.pyc
byte-compiling build/bdist.linux-x86_64/egg/detrex/modeling/__init__.py to __init__.cpython-310.pyc
byte-compiling build/bdist.linux-x86_64/egg/detrex/modeling/neck/__init__.py to __init__.cpython-310.pyc
byte-compiling build/bdist.linux-x86_64/egg/detrex/modeling/neck/channel_mapper.py to channel_mapper.cpython-310.pyc
byte-compiling build/bdist.linux-x86_64/egg/detrex/config/configs/common/coco_schedule.py to coco_schedule.cpython-310.pyc
byte-compiling build/bdist.linux-x86_64/egg/detrex/config/configs/common/train.py to train.cpython-310.pyc
byte-compiling build/bdist.linux-x86_64/egg/detrex/config/configs/common/data/coco_detr.py to coco_detr.cpython-310.pyc
byte-compiling build/bdist.linux-x86_64/egg/detrex/config/configs/common/data/coco.py to coco.cpython-310.pyc
byte-compiling build/bdist.linux-x86_64/egg/detrex/config/configs/common/optim.py to optim.cpython-310.pyc
byte-compiling build/bdist.linux-x86_64/egg/detrex/config/__init__.py to __init__.cpython-310.pyc
byte-compiling build/bdist.linux-x86_64/egg/detrex/config/config.py to config.cpython-310.pyc
byte-compiling build/bdist.linux-x86_64/egg/detrex/__init__.py to __init__.cpython-310.pyc
byte-compiling build/bdist.linux-x86_64/egg/detrex/data/detr_dataset_mapper.py to detr_dataset_mapper.cpython-310.pyc
byte-compiling build/bdist.linux-x86_64/egg/detrex/data/__init__.py to __init__.cpython-310.pyc
byte-compiling build/bdist.linux-x86_64/egg/detrex/layers/shape_spec.py to shape_spec.cpython-310.pyc
byte-compiling build/bdist.linux-x86_64/egg/detrex/layers/position_embedding.py to position_embedding.cpython-310.pyc
byte-compiling build/bdist.linux-x86_64/egg/detrex/layers/layer_norm.py to layer_norm.cpython-310.pyc
byte-compiling build/bdist.linux-x86_64/egg/detrex/layers/transformer.py to transformer.cpython-310.pyc
byte-compiling build/bdist.linux-x86_64/egg/detrex/layers/conv.py to conv.cpython-310.pyc
byte-compiling build/bdist.linux-x86_64/egg/detrex/layers/box_ops.py to box_ops.cpython-310.pyc
byte-compiling build/bdist.linux-x86_64/egg/detrex/layers/mlp.py to mlp.cpython-310.pyc
byte-compiling build/bdist.linux-x86_64/egg/detrex/layers/denoising.py to denoising.cpython-310.pyc
byte-compiling build/bdist.linux-x86_64/egg/detrex/layers/attention.py to attention.cpython-310.pyc
byte-compiling build/bdist.linux-x86_64/egg/detrex/layers/multi_scale_deform_attn.py to multi_scale_deform_attn.cpython-310.pyc
byte-compiling build/bdist.linux-x86_64/egg/detrex/layers/__init__.py to __init__.cpython-310.pyc
creating stub loader for detrex/_C.cpython-310-x86_64-linux-gnu.so
byte-compiling build/bdist.linux-x86_64/egg/detrex/_C.py to _C.cpython-310.pyc
creating build/bdist.linux-x86_64/egg/EGG-INFO
copying detrex.egg-info/PKG-INFO -> build/bdist.linux-x86_64/egg/EGG-INFO
copying detrex.egg-info/SOURCES.txt -> build/bdist.linux-x86_64/egg/EGG-INFO
copying detrex.egg-info/dependency_links.txt -> build/bdist.linux-x86_64/egg/EGG-INFO
copying detrex.egg-info/requires.txt -> build/bdist.linux-x86_64/egg/EGG-INFO
copying detrex.egg-info/top_level.txt -> build/bdist.linux-x86_64/egg/EGG-INFO
writing build/bdist.linux-x86_64/egg/EGG-INFO/native_libs.txt
zip_safe flag not set; analyzing archive contents...
detrex.__pycache__._C.cpython-310: module references __file__
creating 'dist/detrex-0.1.0-py3.10-linux-x86_64.egg' and adding 'build/bdist.linux-x86_64/egg' to it
removing 'build/bdist.linux-x86_64/egg' (and everything under it)
Processing detrex-0.1.0-py3.10-linux-x86_64.egg
removing '/data/lr/anaconda3/envs/detrex/lib/python3.10/site-packages/detrex-0.1.0-py3.10-linux-x86_64.egg' (and everything under it)
creating /data/lr/anaconda3/envs/detrex/lib/python3.10/site-packages/detrex-0.1.0-py3.10-linux-x86_64.egg
Extracting detrex-0.1.0-py3.10-linux-x86_64.egg to /data/lr/anaconda3/envs/detrex/lib/python3.10/site-packages
detrex 0.1.0 is already the active version in easy-install.pth

Installed /data/lr/anaconda3/envs/detrex/lib/python3.10/site-packages/detrex-0.1.0-py3.10-linux-x86_64.egg
Processing dependencies for detrex==0.1.0
Searching for psutil==5.9.2
Best match: psutil 5.9.2
Processing psutil-5.9.2-py3.10-linux-x86_64.egg
psutil 5.9.2 is already the active version in easy-install.pth

Using /data/lr/anaconda3/envs/detrex/lib/python3.10/site-packages/psutil-5.9.2-py3.10-linux-x86_64.egg
Searching for scipy==1.9.1
Best match: scipy 1.9.1
Adding scipy 1.9.1 to easy-install.pth file

Using /data/lr/anaconda3/envs/detrex/lib/python3.10/site-packages
Searching for pytest==7.1.3
Best match: pytest 7.1.3
Adding pytest 7.1.3 to easy-install.pth file
Installing py.test script to /data/lr/anaconda3/envs/detrex/bin
Installing pytest script to /data/lr/anaconda3/envs/detrex/bin

Using /data/lr/anaconda3/envs/detrex/lib/python3.10/site-packages
Searching for timm==0.6.7
Best match: timm 0.6.7
Adding timm 0.6.7 to easy-install.pth file

Using /data/lr/anaconda3/envs/detrex/lib/python3.10/site-packages
Searching for autoflake==1.6.1
Best match: autoflake 1.6.1
Adding autoflake 1.6.1 to easy-install.pth file
Installing autoflake script to /data/lr/anaconda3/envs/detrex/bin

Using /data/lr/anaconda3/envs/detrex/lib/python3.10/site-packages
Searching for black==22.3.0
Best match: black 22.3.0
Adding black 22.3.0 to easy-install.pth file
Installing black script to /data/lr/anaconda3/envs/detrex/bin
Installing blackd script to /data/lr/anaconda3/envs/detrex/bin

Using /data/lr/anaconda3/envs/detrex/lib/python3.10/site-packages
Searching for isort==4.3.21
Best match: isort 4.3.21
Adding isort 4.3.21 to easy-install.pth file
Installing isort script to /data/lr/anaconda3/envs/detrex/bin

Using /data/lr/anaconda3/envs/detrex/lib/python3.10/site-packages
Searching for flake8==3.8.1
Best match: flake8 3.8.1
Adding flake8 3.8.1 to easy-install.pth file
Installing flake8 script to /data/lr/anaconda3/envs/detrex/bin

Using /data/lr/anaconda3/envs/detrex/lib/python3.10/site-packages
Searching for pybind11==2.10.0
Best match: pybind11 2.10.0
Adding pybind11 2.10.0 to easy-install.pth file
Installing pybind11-config script to /data/lr/anaconda3/envs/detrex/bin

Using /data/lr/anaconda3/envs/detrex/lib/python3.10/site-packages
Searching for omegaconf==2.1.0
Best match: omegaconf 2.1.0
Adding omegaconf 2.1.0 to easy-install.pth file

Using /data/lr/anaconda3/envs/detrex/lib/python3.10/site-packages
Searching for hydra-core==1.1.2
Best match: hydra-core 1.1.2
Adding hydra-core 1.1.2 to easy-install.pth file

Using /data/lr/anaconda3/envs/detrex/lib/python3.10/site-packages
Searching for cloudpickle==2.2.0
Best match: cloudpickle 2.2.0
Adding cloudpickle 2.2.0 to easy-install.pth file

Using /data/lr/anaconda3/envs/detrex/lib/python3.10/site-packages
Searching for numpy==1.23.3
Best match: numpy 1.23.3
Adding numpy 1.23.3 to easy-install.pth file
Installing f2py script to /data/lr/anaconda3/envs/detrex/bin
Installing f2py3 script to /data/lr/anaconda3/envs/detrex/bin
Installing f2py3.10 script to /data/lr/anaconda3/envs/detrex/bin

Using /data/lr/anaconda3/envs/detrex/lib/python3.10/site-packages
Searching for tomli==2.0.1
Best match: tomli 2.0.1
Adding tomli 2.0.1 to easy-install.pth file

Using /data/lr/anaconda3/envs/detrex/lib/python3.10/site-packages
Searching for py==1.11.0
Best match: py 1.11.0
Adding py 1.11.0 to easy-install.pth file

Using /data/lr/anaconda3/envs/detrex/lib/python3.10/site-packages
Searching for pluggy==1.0.0
Best match: pluggy 1.0.0
Adding pluggy 1.0.0 to easy-install.pth file

Using /data/lr/anaconda3/envs/detrex/lib/python3.10/site-packages
Searching for packaging==21.3
Best match: packaging 21.3
Adding packaging 21.3 to easy-install.pth file

Using /data/lr/anaconda3/envs/detrex/lib/python3.10/site-packages
Searching for iniconfig==1.1.1
Best match: iniconfig 1.1.1
Adding iniconfig 1.1.1 to easy-install.pth file

Using /data/lr/anaconda3/envs/detrex/lib/python3.10/site-packages
Searching for attrs==22.1.0
Best match: attrs 22.1.0
Adding attrs 22.1.0 to easy-install.pth file

Using /data/lr/anaconda3/envs/detrex/lib/python3.10/site-packages
Searching for torchvision==0.14.0.dev20220928
Best match: torchvision 0.14.0.dev20220928
Adding torchvision 0.14.0.dev20220928 to easy-install.pth file

Using /data/lr/anaconda3/envs/detrex/lib/python3.10/site-packages
Searching for torch==1.13.0.dev20220914+cu113
Best match: torch 1.13.0.dev20220914+cu113
Adding torch 1.13.0.dev20220914+cu113 to easy-install.pth file
Installing convert-caffe2-to-onnx script to /data/lr/anaconda3/envs/detrex/bin
Installing convert-onnx-to-caffe2 script to /data/lr/anaconda3/envs/detrex/bin
Installing torchrun script to /data/lr/anaconda3/envs/detrex/bin

Using /data/lr/anaconda3/envs/detrex/lib/python3.10/site-packages
Searching for pyflakes==2.2.0
Best match: pyflakes 2.2.0
Adding pyflakes 2.2.0 to easy-install.pth file
Installing pyflakes script to /data/lr/anaconda3/envs/detrex/bin

Using /data/lr/anaconda3/envs/detrex/lib/python3.10/site-packages
Searching for mypy-extensions==0.4.3
Best match: mypy-extensions 0.4.3
Adding mypy-extensions 0.4.3 to easy-install.pth file

Using /data/lr/anaconda3/envs/detrex/lib/python3.10/site-packages
Searching for pathspec==0.10.1
Best match: pathspec 0.10.1
Adding pathspec 0.10.1 to easy-install.pth file

Using /data/lr/anaconda3/envs/detrex/lib/python3.10/site-packages
Searching for platformdirs==2.5.2
Best match: platformdirs 2.5.2
Adding platformdirs 2.5.2 to easy-install.pth file

Using /data/lr/anaconda3/envs/detrex/lib/python3.10/site-packages
Searching for click==8.1.3
Best match: click 8.1.3
Adding click 8.1.3 to easy-install.pth file

Using /data/lr/anaconda3/envs/detrex/lib/python3.10/site-packages
Searching for mccabe==0.6.1
Best match: mccabe 0.6.1
Adding mccabe 0.6.1 to easy-install.pth file

Using /data/lr/anaconda3/envs/detrex/lib/python3.10/site-packages
Searching for pycodestyle==2.6.0
Best match: pycodestyle 2.6.0
Adding pycodestyle 2.6.0 to easy-install.pth file
Installing pycodestyle script to /data/lr/anaconda3/envs/detrex/bin

Using /data/lr/anaconda3/envs/detrex/lib/python3.10/site-packages
Searching for PyYAML==6.0
Best match: PyYAML 6.0
Adding PyYAML 6.0 to easy-install.pth file

Using /data/lr/anaconda3/envs/detrex/lib/python3.10/site-packages
Searching for antlr4-python3-runtime==4.8
Best match: antlr4-python3-runtime 4.8
Adding antlr4-python3-runtime 4.8 to easy-install.pth file

Using /data/lr/anaconda3/envs/detrex/lib/python3.10/site-packages
Searching for pyparsing==3.0.9
Best match: pyparsing 3.0.9
Adding pyparsing 3.0.9 to easy-install.pth file

Using /data/lr/anaconda3/envs/detrex/lib/python3.10/site-packages
Searching for Pillow==9.2.0
Best match: Pillow 9.2.0
Adding Pillow 9.2.0 to easy-install.pth file

Using /data/lr/anaconda3/envs/detrex/lib/python3.10/site-packages
Searching for requests==2.28.1
Best match: requests 2.28.1
Adding requests 2.28.1 to easy-install.pth file

Using /data/lr/anaconda3/envs/detrex/lib/python3.10/site-packages
Searching for typing-extensions==4.3.0
Best match: typing-extensions 4.3.0
Adding typing-extensions 4.3.0 to easy-install.pth file

Using /data/lr/anaconda3/envs/detrex/lib/python3.10/site-packages
Searching for certifi==2022.9.14
Best match: certifi 2022.9.14
Adding certifi 2022.9.14 to easy-install.pth file

Using /data/lr/anaconda3/envs/detrex/lib/python3.10/site-packages
Searching for urllib3==1.26.12
Best match: urllib3 1.26.12
Adding urllib3 1.26.12 to easy-install.pth file

Using /data/lr/anaconda3/envs/detrex/lib/python3.10/site-packages
Searching for idna==3.4
Best match: idna 3.4
Adding idna 3.4 to easy-install.pth file

Using /data/lr/anaconda3/envs/detrex/lib/python3.10/site-packages
Searching for charset-normalizer==2.1.1
Best match: charset-normalizer 2.1.1
Adding charset-normalizer 2.1.1 to easy-install.pth file
Installing normalizer script to /data/lr/anaconda3/envs/detrex/bin

Using /data/lr/anaconda3/envs/detrex/lib/python3.10/site-packages
Finished processing dependencies for detrex==0.1.0

RuntimeError: Undefined backend is not a valid device type

No error occurs during installation. However, when I use a model containing ms_deform_attn to test the code, losses.backward() raises the following error.

Traceback (most recent call last):
  File "/DATA1/code/detrex/detectron2/detectron2/engine/train_loop.py", line 149, in train
    self.run_step()
  File "/DATA1/code/detrex/tools/train_net_debug.py", line 115, in run_step
    losses.backward()
  File "/home/ubuntu/anaconda3/envs/detrex/lib/python3.8/site-packages/torch/_tensor.py", line 307, in backward
    torch.autograd.backward(self, gradient, retain_graph, create_graph, inputs=inputs)
  File "/home/ubuntu/anaconda3/envs/detrex/lib/python3.8/site-packages/torch/autograd/__init__.py", line 154, in backward
    Variable._execution_engine.run_backward(
  File "/home/ubuntu/anaconda3/envs/detrex/lib/python3.8/site-packages/torch/autograd/function.py", line 199, in apply
    return user_fn(self, *args)
  File "/home/ubuntu/anaconda3/envs/detrex/lib/python3.8/site-packages/torch/autograd/function.py", line 340, in wrapper
    outputs = fn(ctx, *args)
  File "/DATA1/code/detrex/detrex/layers/multi_scale_deform_attn.py", line 83, in backward
    grad_value, grad_sampling_loc, grad_attn_weight = _C.ms_deform_attn_backward(
RuntimeError: Undefined backend is not a valid device type
python-BaseException

Process finished with exit code -1

Is it possible to use features with more than 3 channels as input?

Hi there,
Thanks for your excellent work!
I'm trying to train DINO on my own data, which are not regular RGB images but images with more channels (more than 7). I found that in PyTorch the number of input channels can be modified by adding or widening a conv layer and connecting it to the original model, but I didn't find examples of doing so in detrex (or detectron2). ;(
So I wonder if I can do this with detrex. If possible, would you mind giving me an example?

Thanks a lot! XD
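A minimal, hedged sketch of the idea described in the question above (widening the first conv so the network accepts more than 3 channels), using a plain torchvision ResNet-50 rather than the detrex/detectron2 backbone API; the channel count and weight-copy scheme are illustrative assumptions. In a detrex config you would likely also need to adjust pixel_mean/pixel_std to match the extra channels.

import torch
import torch.nn as nn
from torchvision.models import resnet50

num_in_channels = 8  # hypothetical number of input channels
model = resnet50()

# Build a wider stem conv with the same hyper-parameters as the original one.
old_conv = model.conv1
new_conv = nn.Conv2d(
    num_in_channels,
    old_conv.out_channels,
    kernel_size=old_conv.kernel_size,
    stride=old_conv.stride,
    padding=old_conv.padding,
    bias=old_conv.bias is not None,
)

# Reuse the original 3-channel weights for the first 3 channels (useful when
# starting from pretrained weights) and initialize the rest with their mean.
with torch.no_grad():
    new_conv.weight[:, :3] = old_conv.weight
    new_conv.weight[:, 3:] = old_conv.weight.mean(dim=1, keepdim=True)

model.conv1 = new_conv
out = model(torch.randn(1, num_in_channels, 224, 224))
print(out.shape)  # torch.Size([1, 1000])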

About training with DN-Deformable-DETR-R50

When I train DN-Deformable-DETR-R50 for 12 epochs (since I only have one Tesla A100 GPU, I set dataloader.train.total_batch_size = 4 and train.max_iter = 360000), the AP and AP50 reach 46.5559 and 64.1131 at iteration 334999, which seems better than expected.
Why is the detection accuracy so high?

GPU memory consumption increases a lot over the epochs

Hi,

thanks for the great work. I successfully trained the 4-scale Swin-L model on a smaller dataset with a few thousand images of 1024x1024 each. Now, when training with 100k images per epoch (same size as before), the GPU memory slowly increases until the GPU runs out of memory. Is that normal behavior? I train with batch size 1 per GPU on TITAN RTX 24GB GPUs.

Thinking it might be due to the larger serialized dataset (which takes 435.86 MiB), I switched from a Swin-L to a Swin-B backbone, with the same result, only after more iterations. Starting at around 12GB allocated, memory grows until it no longer fits into 24GB after ~20k iterations. I attached the log of that training with 1 GPU and batch size 1.

Support would be highly appreciated :)

example_log.LOG

Unable to reproduce the results of DINO-R50-4scale-12ep (48.9 vs. 49.2)

    > > Thanks, maybe I didn't convert the torchvision model to a d2 model. Thanks again, I will close this issue.

You're welcome. Feel free to reopen it if you run into other problems. You can also try setting the training seed to 42:

# your config
train.seed = 42

which aligns with the official training settings; this may help you reproduce the 49.95 AP result @adamqian111

Hello, after fixing the random seed to 42, I still cannot reproduce the 49.2 AP of dino_r50_4scale_12ep reported in your repo. Do you have any ideas where the problem might be?
(screenshot of the evaluation results attached)

Here is the full log:
log_detrex_dino_r50_4scale_12ep_48.87.txt

Originally posted by @backseason in #118 (comment)

Detrex python package

Hi, is there a plan to release a Python package of detrex? That would be a massive help for installation and would also solve the Detectron2 submodule install.
Many thanks for your work!

Frequently Asked Questions

We keep this issue open to collect frequently asked questions and their solutions from the users.

Feel free to leave your comment here if you find any frequent issues and have ways to help others to solve them.

Notes

  • If you meet convergence problems when training with fewer GPUs, it's better to use a larger batch size (batch-size=8/16) by setting dataloader.train.total_batch_size, as mentioned in this issue: #219

FAQs

1. ImportError: Cannot import 'detrex._C', therefore 'MultiScaleDeformableAttention' is not available.

detrex needs the CUDA runtime to build the MultiScaleDeformableAttention operator. In most cases you do not need to specify the CUDA_HOME environment variable if CUDA is installed correctly; the default path of the CUDA runtime is /usr/local/cuda. If you find that your CUDA_HOME is None, you can solve it as follows:

  • If you've already installed the CUDA runtime in your environment, specify the environment variable (here we take cuda-11.3 as an example):
export CUDA_HOME=/path/to/cuda-11.3/
  • If you do not find the CUDA runtime in your environment, install it following the CUDA Toolkit Installation guide, then specify the environment variable CUDA_HOME.
  • After setting CUDA_HOME, rebuild detrex by running pip install -e . again.

You can also refer to these issues for more details: #98, #85

2. How to keep (not filter out) empty annotations during training.

There are three ways to keep empty annotations during training.

  1. Modify the config in configs/common/data/coco_detr.py as follows:
dataloader.train = L(build_detection_train_loader)(
    dataset=L(get_detection_dataset_dicts)(names="coco_2017_train", filter_empty=False),
    ...,
)
  2. Modify the config in your project, e.g. dino_r50_4scale_24ep.py:
# your config.py
dataloader = get_config("common/data/coco_detr.py").dataloader

# modify dataloader config
# not filter empty annotations during training
dataloader.train.dataset.filter_empty = False
  3. Override the config in your training command:
cd detrex
python tools/train_net.py --config-file projects/dino/configs/path/to/config.py --num-gpus 8 dataloader.train.dataset.filter_empty=False

You can also refer to this issue for more details: #78 (comment)

3. RuntimeError: The server socket has failed to listen on any local network address. The server socket has failed to bind to [::]:54980 (errno: 98 - Address already in use).

This means that a process you started earlier did not exit correctly. There are two solutions:

  1. completely kill the previously started process
  2. change the port by setting --dist-url
python tools/train_net.py \
    --config-file path/to/config.py \
    --num-gpus 8 \
    --dist-url tcp://127.0.0.1:12345 \
4. DINO CPU inference: please refer to PR #157 for more details.
5. Training on a COCO-like custom dataset: please refer to PR #186 for more details.

AssertionError: Attribute 'thing_classes' in the metadata of 'coco_2017_train' cannot be set to a different value!

Traceback (most recent call last):
  File "tools/train_net.py", line 229, in <module>
    launch(
  File "/root/detrex/detectron2/detectron2/engine/launch.py", line 82, in launch
    main_func(*args)
  File "tools/train_net.py", line 224, in main
    do_train(args, cfg)
  File "tools/train_net.py", line 167, in do_train
    train_loader = instantiate(cfg.dataloader.train)
  File "/root/detrex/detectron2/detectron2/config/instantiate.py", line 67, in instantiate
    cfg = {k: instantiate(v) for k, v in cfg.items()}
  File "/root/detrex/detectron2/detectron2/config/instantiate.py", line 67, in <dictcomp>
    cfg = {k: instantiate(v) for k, v in cfg.items()}
  File "/root/detrex/detectron2/detectron2/config/instantiate.py", line 83, in instantiate
    return cls(**cfg)
  File "/root/detrex/detectron2/detectron2/data/build.py", line 241, in get_detection_dataset_dicts
    dataset_dicts = [DatasetCatalog.get(dataset_name) for dataset_name in names]
  File "/root/detrex/detectron2/detectron2/data/build.py", line 241, in <listcomp>
    dataset_dicts = [DatasetCatalog.get(dataset_name) for dataset_name in names]
  File "/root/detrex/detectron2/detectron2/data/catalog.py", line 58, in get
    return f()
  File "/root/detrex/detectron2/detectron2/data/datasets/coco.py", line 500, in <lambda>
    DatasetCatalog.register(name, lambda: load_coco_json(json_file, image_root, name))
  File "/root/detrex/detectron2/detectron2/data/datasets/coco.py", line 80, in load_coco_json
    meta.thing_classes = thing_classes
  File "/root/detrex/detectron2/detectron2/data/catalog.py", line 148, in __setattr__
    assert oldval == val, (
AssertionError: Attribute 'thing_classes' in the metadata of 'coco_2017_train' cannot be set to a different value!
['person', 'bicycle', 'car', 'motorcycle', 'airplane', 'bus', 'train', 'truck', 'boat', 'traffic light', 'fire hydrant', 'stop sign', 'parking meter', 'bench', 'bird', 'cat', 'dog', 'horse', 'sheep', 'cow', 'elephant', 'bear', 'zebra', 'giraffe', 'backpack', 'umbrella', 'handbag', 'tie', 'suitcase', 'frisbee', 'skis', 'snowboard', 'sports ball', 'kite', 'baseball bat', 'baseball glove', 'skateboard', 'surfboard', 'tennis racket', 'bottle', 'wine glass', 'cup', 'fork', 'knife', 'spoon', 'bowl', 'banana', 'apple', 'sandwich', 'orange', 'broccoli', 'carrot', 'hot dog', 'pizza', 'donut', 'cake', 'chair', 'couch', 'potted plant', 'bed', 'dining table', 'toilet', 'tv', 'laptop', 'mouse', 'remote', 'keyboard', 'cell phone', 'microwave', 'oven', 'toaster', 'sink', 'refrigerator', 'book', 'clock', 'vase', 'scissors', 'teddy bear', 'hair drier', 'toothbrush'] != ['mouse', 'laptop', 'keyboard']
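The usual cause of this assertion is that a custom dataset with its own classes was registered or loaded under the built-in name coco_2017_train, whose metadata already holds the 80 COCO classes. A hedged sketch of one way around it, registering the custom data under its own name with detectron2's register_coco_instances (the dataset name and paths below are hypothetical):

from detectron2.data.datasets import register_coco_instances

# Register the custom COCO-format dataset under a fresh name so it does not
# clash with the built-in coco_2017_train metadata.
register_coco_instances(
    "my_dataset_train",                                       # hypothetical name
    {},                                                       # extra metadata
    "datasets/my_dataset/annotations/instances_train.json",   # hypothetical path
    "datasets/my_dataset/images/train",                       # hypothetical path
)

# Then point the dataloader at it in your config:
# dataloader.train.dataset.names = "my_dataset_train"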

Roadmap of detrex

We keep this issue open to collect feature requests from users and hear your voice.

You can either:

  • Suggest a new feature by leaving a comment.
  • Vote for a feature request with 👍. (Remember that developers are sometimes busy and cannot respond to every feature request, so vote for the one you want most!)
  • Tell us that you would like to help implement new features or review the PRs. (This is the greatest thing to hear about!)

v0.2.1

  • MaskDINO. (Pre-release in #154)
  • Support SwinV2 backbone (#152)
  • Add a basic usage notebook for inference on custom images. ( #119 )
  • ViT backbone (#138)

Collection

  • DQ-DET: https://arxiv.org/pdf/2307.12239.pdf
  • RT-DETR
  • Co-DETR
  • Better data augmentation code format: #204
  • Support wandb for logging training information and visualizing sampled results
  • TensorRT for fast inference
  • ONNX for DETR models
  • Docker or python package for detrex (High Priority)
  • Object365 Dataloader
  • Semantic Seg/Instance Seg/Key points tasks

Cannot reproduce the reported 49.2 AP with DINO-R50

There seems to be a problem with the open-sourced DINO-R50 config. If the R50 .pkl weights are used, the input should be in BGR mode, and the mean and std should be changed accordingly. After fixing this I trained for 12 epochs and only got about 48.3 AP, not the 49.2 you report. Training another three or four epochs only reached about 48.7. I also saw a branch in MMDetection where someone reproduced DINO and tried a bunch of seeds, and the AP only reached about 48.9. Did you use any special tricks during training? Could you also share the training log?

About fp16 in training

I have one V100 32GB GPU. Can I configure training with train.amp.enabled=True?
As far as I know, the V100 supports AMP (Automatic Mixed Precision).
Will fp16 accelerate training if I can use it?
Hoping for your reply!
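For reference, a hedged config sketch: in detectron2-style lazy configs (which detrex reuses) the mixed-precision flag is usually train.amp.enabled, so enabling it in a project config would look roughly like this:

# in your project config, e.g. projects/dino/configs/dino_r50_4scale_12ep.py
# (assumes a detectron2-style `train` config with an `amp` field)
train.amp.enabled = True  # enable fp16 / Automatic Mixed Precision training

On a V100 this typically lowers memory usage and can shorten iteration time, though the actual speedup depends on the model.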

Potential problem of GroupConditionalSelfAttention

Hi there,

I looked at the implementation of GroupConditionalSelfAttention and found what may be a problem.

N, B, C = query_content.shape
q = query_content + query_pos
k = key_content + key_pos
v = value
# hack in attention layer to implement group-detr
if self.training:
    q = torch.cat(q.split(N // self.group_nums, dim=0), dim=1)
    k = torch.cat(k.split(N // self.group_nums, dim=0), dim=1)
    v = torch.cat(v.split(N // self.group_nums, dim=0), dim=1)
q = q.reshape(N, B, self.num_heads, C // self.num_heads).permute(
    1, 2, 0, 3
)  # (B, num_heads, N, head_dim)
k = k.reshape(N, B, self.num_heads, C // self.num_heads).permute(1, 2, 0, 3)
v = v.reshape(N, B, self.num_heads, C // self.num_heads).permute(1, 2, 0, 3)

In lines 154-158, q, k, and v are reshaped back with N = num_groups * num_queries_per_group. In this way, all queries in all groups interact with each other, rather than only within each group. Is this a potential mistake in the implementation, or did I misunderstand something?
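A standalone, hedged sketch of the shape bookkeeping being questioned (toy sizes only, no detrex code): after the split/cat "group" hack, the sequence length becomes N // group_nums and the batch becomes B * group_nums, which is the crux of the concern above.

import torch

N, B, C, group_nums = 12, 2, 8, 3  # toy sizes for illustration
q = torch.randn(N, B, C)

# The "group" hack: split the query dimension into groups and fold them into the batch.
q_grouped = torch.cat(q.split(N // group_nums, dim=0), dim=1)
print(q_grouped.shape)  # torch.Size([4, 6, 8]) == (N // group_nums, B * group_nums, C)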

About the number of params in detrex

Hi there, I'm back. I'd like to find out where I can see the total number of parameters of the entire model in detrex.

I'm using DINO. In the original repo I can find it in the info.txt file that stores the logged information; the number of parameters is printed below the code block.

[08/19 12:39:32.347]: number of params: 46397196
[08/19 12:39:32.349]: params:

In detrex I can't find it in log.txt, so where can I find this number, or from which .py file in detrex can I compute it?

Looking forward to your reply.
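One way to get this number yourself (a hedged sketch: the config path is the standard DINO config referenced elsewhere in this document, the counting logic is plain PyTorch rather than a detrex utility, and building the model assumes detrex is compiled):

from detectron2.config import LazyConfig, instantiate

# Instantiate the model from a lazy config and count its parameters.
cfg = LazyConfig.load("projects/dino/configs/dino_r50_4scale_12ep.py")
model = instantiate(cfg.model)

total = sum(p.numel() for p in model.parameters())
trainable = sum(p.numel() for p in model.parameters() if p.requires_grad)
print(f"number of params: {total} (trainable: {trainable})")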

Roadmap of detrex documentation and tutorials

We keep this issue open to collect documentation requests from users and hear your voice.

You can either:

  • Suggest new documentation content or tutorial by leaving a comment.
  • Vote for a documentation request with 👍. (Remember that developers are sometimes busy and cannot respond to every documentation request, so vote for the one you want most!)

Collection

  • Custom dataset tutorial: #78
  • Register a COCO-like custom dataset for training: please refer to #186

how to change the number of classes

Hi, thanks for this helpful project. I am using a custom dataset. Everything goes well except that the number of classes cannot be changed. I tried adding "model.num_classes=3" to projects/dab_detr/configs/dab_detr_r50_50ep.py, but that does not seem to be enough; the following error occurs when I add this line to the config file:

  File "/home/binxiao/code/py3_env/lib/python3.8/site-packages/detectron2/engine/train_loop.py", line 149, in train
    self.run_step()
  File "/home/binxiao/code/detrex/tools/train_net.py", line 96, in run_step
    loss_dict = self.model(data)
  File "/home/binxiao/code/py3_env/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/binxiao/code/py3_env/lib/python3.8/site-packages/torch/nn/parallel/distributed.py", line 886, in forward
    output = self.module(*inputs[0], **kwargs[0])
  File "/home/binxiao/code/py3_env/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/binxiao/code/detrex/projects/dab_detr/modeling/dab_detr.py", line 198, in forward
    loss_dict = self.criterion(output, targets)
  File "/home/binxiao/code/py3_env/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/binxiao/code/detrex/detrex/modeling/criterion/criterion.py", line 230, in forward
    losses.update(self.get_loss(loss, outputs, targets, indices, num_boxes))
  File "/home/binxiao/code/detrex/detrex/modeling/criterion/criterion.py", line 198, in get_loss
    return loss_map[loss](outputs, targets, indices, num_boxes, **kwargs)
  File "/home/binxiao/code/detrex/detrex/modeling/criterion/criterion.py", line 162, in loss_boxes
    src_boxes = outputs["pred_boxes"][idx]
RuntimeError: CUDA error: device-side assert triggered
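A hedged sketch of what usually needs to change together (key names vary per project, and model.criterion.num_classes is an assumption that should be checked against projects/dab_detr/configs/models/dab_detr_r50.py). A device-side assert like the one above is typical of a class index exceeding the number of logits, so the dataset labels, the detector head, and the criterion all have to agree:

# your_config.py -- illustrative only
model.num_classes = 3
# If the criterion in your project's model config carries its own num_classes
# (several detrex projects define one), keep it in sync as well:
model.criterion.num_classes = 3  # hypothetical key; verify it exists in your config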

Some questions on DAB-DETR

Hi there,
Thanks for the great repo and the nice compilation of so many powerful DETR methods. When looking into the code, though, I have trouble understanding a few parts and hope to get some advice here.

One is

query = layer(
    query,
    key,
    value,
    query_pos=query_pos,
    key_pos=key_pos,
    query_sine_embed=query_sine_embed,
    attn_masks=attn_masks,
    query_key_padding_mask=query_key_padding_mask,
    key_padding_mask=key_padding_mask,
    is_first_layer=(idx == 0),
    **kwargs,
)

where the first layer is indicated by is_first_layer. Only in the first decoder layer does cross_attn add query_pos to query_content. I wonder why that is the case. Isn't query_pos already added to the query in self_attn before cross_attn in the first layer? What happens if I remove the addition of query_pos in the first cross_attn?

Another one is

if self.bbox_embed is not None:
    # predict offsets and add them to the input normalized anchor boxes.
    offsets = self.bbox_embed(query)
    offsets[..., : self.embed_dim] += inverse_sigmoid(reference_boxes)
    new_reference_boxes = offsets[..., : self.embed_dim].sigmoid()
    if idx != self.num_layers - 1:
        intermediate_ref_boxes.append(new_reference_boxes)
    reference_boxes = new_reference_boxes.detach()

where reference_boxes for the next layer is a detached version of new_reference_boxes. What's the idea behind detach() here? Does removing it affect the performance a lot?

The third one is about initialization:

nn.init.constant_(self.bbox_embed.layers[-1].weight.data, 0)
nn.init.constant_(self.bbox_embed.layers[-1].bias.data, 0)

What's the idea of initializing the last fc layer of the MLP to all zeros?

Sorry for asking so many questions at once. It's totally fine if there's no 'standard' answer; any advice that helps improve the understanding of this great model would be much appreciated.

Thanks in advance!

Reproduce DINO-R50 and get a higher result (49.9) with batch_size=1 and nGPU=8

I reproduced DINO with dino_r50_4scale_12ep.py and set batch_size=1. I used max_iter = 90000 x 2 and dropped the learning rate at the 165000th iteration. The result I got is higher than the one this repo reports. Since this result (49.9) is clearly better than the current result (49.2), is there something wrong with my setting? Or is this simply a better training setting than the default one (batch size = 2)?

[11/30 21:53:00 d2.evaluation.fast_eval_api]: COCOeval_opt.accumulate() finished in 4.95 seconds.
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.499
 Average Precision  (AP) @[ IoU=0.50      | area=   all | maxDets=100 ] = 0.674
 Average Precision  (AP) @[ IoU=0.75      | area=   all | maxDets=100 ] = 0.546
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.326
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.531
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.645
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=  1 ] = 0.380
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets= 10 ] = 0.659
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.731
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.573
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.772
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.883
[11/30 21:53:00 d2.evaluation.coco_evaluation]: Evaluation results for bbox: 
|   AP   |  AP50  |  AP75  |  APs   |  APm   |  APl   |
|:------:|:------:|:------:|:------:|:------:|:------:|
| 49.890 | 67.436 | 54.627 | 32.601 | 53.056 | 64.499 |

Would DINO work with ConvNext?

Hi there,
I noticed that detrex has ConvNext.py under detrex/modeling/backbone. However, in projects/dino there aren't any configs using ConvNeXt. I wonder if we can use ConvNeXt as the backbone in place of R50 or Swin Transformer. If so, would it perform comparably to Swin?
Thanks for your consideration!

Question about DAB DETR Implementation

Dear authors, thanks for your great work! I have several doubts about the implementation of DAB-DETR, specifically the following code in the DAB decoder:

if self.modulate_hw_attn:
    ref_hw_cond = self.ref_anchor_head(query).sigmoid()
    query_sine_embed[..., self.embed_dim // 2 :] *= (
        ref_hw_cond[..., 0] / obj_center[..., 2]
    ).unsqueeze(-1)
    query_sine_embed[..., : self.embed_dim // 2] *= (
        ref_hw_cond[..., 1] / obj_center[..., 3]
    ).unsqueeze(-1)

My understanding of the code:
query_sine_embed = PE(A_q), shape = (300, bs, 512), representing (300, bs, (PE(x), PE(y), PE(w), PE(h)))
self.embed_dim = 256
obj_center: shape = (300, bs, 4), representing the anchor coordinates (300, bs, (x, y, h, w))
ref_hw_cond: shape = (300, bs, 2), representing the estimated h' and w' of the reference point (300, bs, (h', w'))

Question:

Regarding the line query_sine_embed[..., self.embed_dim // 2 :] *= (ref_hw_cond[..., 0] / obj_center[..., 2]).unsqueeze(-1):

query_sine_embed[..., self.embed_dim // 2 :]: shape = (300, bs, 384), representing (300, bs, (PE(y), PE(h), PE(w)))
ref_hw_cond[..., 0]: shape = (300, bs, 1), representing (300, bs, w')
obj_center[..., 2]: shape = (300, bs, 1), representing (300, bs, w)

So why is all of PE(y), PE(h), PE(w) normalized with w' / w?

Wrong place of "sys.path.insert(0, "./") " in demo.py

Hi, I found that demo.py returns an error, and I believe it's because someone changed the content of demo.py and put "sys.path.insert(0, "./")" in the wrong place. It should come before "from demo.predictors import VisualizationDemo", otherwise the import cannot be resolved. Please check it.
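A hedged sketch of the ordering the report suggests: the sys.path insert has to execute before any import from the local demo package.

# demo.py (top of file) -- illustrative ordering only
import sys

sys.path.insert(0, "./")  # must run before importing from the local demo package

from demo.predictors import VisualizationDemo  # noqa: E402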

How can I get TP, FP, FN and a loss graph? And what do the metrics in metrics.json mean?

Hi, I trained on a custom dataset in COCO format using "dino_swin_large_224_4scale_12ep.py".

1. I want to get the TP, FP, and FN values during training.

2. I would like to see a loss curve for both the training and the validation dataset after training.

The picture below is the performance graph that YOLOv5 produces after training; I want to get a similar graph when I run detrex.
(screenshot attached)

3. After training, loss_bbox (0-4), loss_class (0-4), loss_giou (0-4), and total_loss are reported, and I want to know what each of them means.
(screenshot attached)
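As for question 3: in DETR-style models such as DINO, loss_class, loss_bbox, and loss_giou are typically the classification, L1 box-regression, and generalized-IoU losses of the final decoder layer, the _0 to _4 variants are the same losses computed on the intermediate decoder layers (auxiliary losses), and total_loss is the sum of all the individual terms. Regarding question 2, a hedged sketch of how loss curves can be pulled out of the metrics.json that detectron2/detrex writes to the output directory (one JSON object per line; the output path and matplotlib usage are assumptions, not a detrex utility):

import json
import matplotlib.pyplot as plt

# Each line of metrics.json is a JSON object with the scalars logged at one step.
records = []
with open("output/metrics.json") as f:  # adjust to your train.output_dir
    for line in f:
        line = line.strip()
        if line:
            records.append(json.loads(line))

iters = [r["iteration"] for r in records if "total_loss" in r]
total = [r["total_loss"] for r in records if "total_loss" in r]

plt.plot(iters, total, label="total_loss")
plt.xlabel("iteration")
plt.ylabel("loss")
plt.legend()
plt.savefig("loss_curve.png")

The per-layer keys such as loss_class_0 can be plotted the same way by swapping the key name.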

ImportError: Cannot import 'detrex._C', therefore 'MultiScaleDeformableAttention' is not available.

Hello!

I'm following the installation guide but running into issues when trying to run models that use MultiScaleDeformableAttention. As the title states, I'm receiving the following error:

ImportError: Cannot import 'detrex._C', therefore 'MultiScaleDeformableAttention' is not available.

Relevant packages from conda list:

detectron2 0.6 dev_0
detrex 0.1.0 dev_0
pytorch 1.10.1 py3.7_cuda10.2_cudnn7.6.5_0 pytorch
pytorch-mutex 1.0 cuda pytorch
torchaudio 0.10.1 py37_cu102 pytorch
torchvision 0.11.2 py37_cu102 pytorch

Steps I followed:

  1. Clean conda environment with Python 3.7
  2. Install pytorch according to the PyTorch installation guides.
  3. Clone detrex
  4. Init and update submodules
  5. python -m pip install -e detectron2
  6. pip install -e .

However when running:

python tools/train_net.py --config-file projects/dino/configs/dino_r50_4scale_12ep.py

I get the error above.

Are there any known solutions to this issue?

Custom MaskDINO training crashes with a RuntimeError: Global alloc not supported yet

When I run:
cd /home/jovyan/data/kamila/detrex && python tools/train_net.py --config-file projects/maskdino/configs/maskdino_r50_coco_instance_seg_50ep.py

I get the following exception:

[12/06 07:56:38 d2.engine.train_loop]: Starting training from iteration 0
/opt/conda/lib/python3.8/site-packages/torch/functional.py:445: UserWarning: torch.meshgrid: in an upcoming release, it will be required to pass the indexing argument. (Triggered internally at  /opt/pytorch/pytorch/aten/src/ATen/native/TensorShape.cpp:2156.)
  return _VF.meshgrid(tensors, **kwargs)  # type: ignore[attr-defined]
ERROR [12/06 07:56:48 d2.engine.train_loop]: Exception during training:
Traceback (most recent call last):
  File "/home/jovyan/data/kamila/detrex/detectron2/detectron2/engine/train_loop.py", line 149, in train
    self.run_step()
  File "tools/train_net_graffiti.py", line 95, in run_step
    loss_dict = self.model(data)
  File "/opt/conda/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1110, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/jovyan/data/kamila/detrex/projects/maskdino/maskdino.py", line 162, in forward
    losses = self.criterion(outputs, targets,mask_dict)
  File "/opt/conda/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1110, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/jovyan/data/kamila/detrex/projects/maskdino/modeling/criterion.py", line 388, in forward
    indices = self.matcher(aux_outputs, targets)
  File "/opt/conda/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1110, in _call_impl
    return forward_call(*input, **kwargs)
  File "/opt/conda/lib/python3.8/site-packages/torch/autograd/grad_mode.py", line 28, in decorate_context
    return func(*args, **kwargs)
  File "/home/jovyan/data/kamila/detrex/projects/maskdino/modeling/matcher.py", line 223, in forward
    return self.memory_efficient_forward(outputs, targets, cost)
  File "/opt/conda/lib/python3.8/site-packages/torch/autograd/grad_mode.py", line 28, in decorate_context
    return func(*args, **kwargs)
  File "/home/jovyan/data/kamila/detrex/projects/maskdino/modeling/matcher.py", line 165, in memory_efficient_forward
    cost_dice = batch_dice_loss_jit(out_mask, tgt_mask)
RuntimeError: The following operation failed in the TorchScript interpreter.
Traceback of TorchScript (most recent call last):
RuntimeError: Global alloc not supported yet

[12/06 07:56:48 d2.engine.hooks]: Overall training speed: 3 iterations in 0:00:04 (1.3375 s / it)
[12/06 07:56:48 d2.engine.hooks]: Total training time: 0:00:04 (0:00:00 on hooks)
[12/06 07:56:49 d2.utils.events]: eta: 4 days, 17:05:31  iter: 5  total_loss: 109.9  loss_ce: 4.103  loss_mask: 1.045  loss_dice: 1.17  loss_bbox: 0.2272  loss_giou: 0.1185  loss_ce_dn: 0.3122  loss_mask_dn: 1.603  loss_dice_dn: 1.205  loss_bbox_dn: 0.5139  loss_giou_dn: 0.328  loss_ce_0: 3.23  loss_mask_0: 1.752  loss_dice_0: 1.341  loss_bbox_0: 0.132  loss_giou_0: 0.1527  loss_ce_dn_0: 0.3276  loss_mask_dn_0: 2.752  loss_dice_dn_0: 3.534  loss_bbox_dn_0: 1.171  loss_giou_dn_0: 0.8217  loss_ce_1: 3.531  loss_mask_1: 1.334  loss_dice_1: 0.9349  loss_bbox_1: 0.09554  loss_giou_1: 0.111  loss_ce_dn_1: 0.2215  loss_mask_dn_1: 1.734  loss_dice_dn_1: 1.629  loss_bbox_dn_1: 0.795  loss_giou_dn_1: 0.4993  loss_ce_2: 3.148  loss_mask_2: 1.184  loss_dice_2: 1.401  loss_bbox_2: 0.1707  loss_giou_2: 0.1778  loss_ce_dn_2: 0.3696  loss_mask_dn_2: 1.644  loss_dice_dn_2: 1.608  loss_bbox_dn_2: 0.6091  loss_giou_dn_2: 0.4159  loss_ce_3: 3.638  loss_mask_3: 1.113  loss_dice_3: 1.233  loss_bbox_3: 0.2145  loss_giou_3: 0.1722  loss_ce_dn_3: 0.3632  loss_mask_dn_3: 1.652  loss_dice_dn_3: 1.442  loss_bbox_dn_3: 0.5413  loss_giou_dn_3: 0.3912  loss_ce_4: 3.436  loss_mask_4: 1.092  loss_dice_4: 1.122  loss_bbox_4: 0.2232  loss_giou_4: 0.1488  loss_ce_dn_4: 0.297  loss_mask_dn_4: 1.637  loss_dice_dn_4: 1.272  loss_bbox_dn_4: 0.5192  loss_giou_dn_4: 0.3572  loss_ce_5: 3.891  loss_mask_5: 1.315  loss_dice_5: 1.075  loss_bbox_5: 0.2148  loss_giou_5: 0.1458  loss_ce_dn_5: 0.2682  loss_mask_dn_5: 1.616  loss_dice_dn_5: 1.213  loss_bbox_dn_5: 0.5183  loss_giou_dn_5: 0.3461  loss_ce_6: 3.985  loss_mask_6: 1.168  loss_dice_6: 1.15  loss_bbox_6: 0.245  loss_giou_6: 0.1321  loss_ce_dn_6: 0.2713  loss_mask_dn_6: 1.57  loss_dice_dn_6: 1.197  loss_bbox_dn_6: 0.514  loss_giou_dn_6: 0.3357  loss_ce_7: 4.099  loss_mask_7: 1.093  loss_dice_7: 1.2  loss_bbox_7: 0.2284  loss_giou_7: 0.1184  loss_ce_dn_7: 0.3286  loss_mask_dn_7: 1.586  loss_dice_dn_7: 1.212  loss_bbox_dn_7: 0.5173  loss_giou_dn_7: 0.3303  loss_ce_8: 4.038  loss_mask_8: 1.049  loss_dice_8: 1.206  loss_bbox_8: 0.2284  loss_giou_8: 0.1187  loss_ce_dn_8: 0.3104  loss_mask_dn_8: 1.605  loss_dice_dn_8: 1.197  loss_bbox_dn_8: 0.51  loss_giou_dn_8: 0.3282  loss_ce_interm: 3.285  loss_mask_interm: 1.477  loss_dice_interm: 1.158  loss_bbox_interm: 0.6809  loss_giou_interm: 0.4867  time: 1.1044  data_time: 0.1023  lr: 0.0001  max_mem: 19044M
Traceback (most recent call last):
  File "tools/train_net_graffiti.py", line 232, in <module>
    launch(
  File "/home/jovyan/data/kamila/detrex/detectron2/detectron2/engine/launch.py", line 82, in launch
    main_func(*args)
  File "tools/train_net_graffiti.py", line 227, in main
    do_train(args, cfg)
  File "tools/train_net_graffiti.py", line 211, in do_train
    trainer.train(start_iter, cfg.train.max_iter)
  File "/home/jovyan/data/kamila/detrex/detectron2/detectron2/engine/train_loop.py", line 149, in train
    self.run_step()
  File "tools/train_net_graffiti.py", line 95, in run_step
    loss_dict = self.model(data)
  File "/opt/conda/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1110, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/jovyan/data/kamila/detrex/projects/maskdino/maskdino.py", line 162, in forward
    losses = self.criterion(outputs, targets,mask_dict)
  File "/opt/conda/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1110, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/jovyan/data/kamila/detrex/projects/maskdino/modeling/criterion.py", line 388, in forward
    indices = self.matcher(aux_outputs, targets)
  File "/opt/conda/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1110, in _call_impl
    return forward_call(*input, **kwargs)
  File "/opt/conda/lib/python3.8/site-packages/torch/autograd/grad_mode.py", line 28, in decorate_context
    return func(*args, **kwargs)
  File "/home/jovyan/data/kamila/detrex/projects/maskdino/modeling/matcher.py", line 223, in forward
    return self.memory_efficient_forward(outputs, targets, cost)
  File "/opt/conda/lib/python3.8/site-packages/torch/autograd/grad_mode.py", line 28, in decorate_context
    return func(*args, **kwargs)
  File "/home/jovyan/data/kamila/detrex/projects/maskdino/modeling/matcher.py", line 165, in memory_efficient_forward
    cost_dice = batch_dice_loss_jit(out_mask, tgt_mask)
RuntimeError: The following operation failed in the TorchScript interpreter.
Traceback of TorchScript (most recent call last):
RuntimeError: Global alloc not supported yet

So far I have looked into why it may appear. It seems that the workaround of using batch_dice_loss instead of batch_dice_loss_jit, as discussed in the linked issue, fixes it; however, training becomes slower.

Would really appreciate you looking at it.
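For reference, a hedged sketch of the workaround described above, inside projects/maskdino/modeling/matcher.py (illustrative only; the trade-off is that the eager, non-JIT path is slower):

# in memory_efficient_forward(), swap the TorchScript dice cost for the eager one:
# cost_dice = batch_dice_loss_jit(out_mask, tgt_mask)   # TorchScript path, fails with "Global alloc not supported yet"
cost_dice = batch_dice_loss(out_mask, tgt_mask)          # eager fallback: slower, but avoids the error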

DINO inference on a CPU only machine fails

Hi,

I always end up with the common error: Cannot import 'detrex._C', therefore 'MultiScaleDeformableAttention' is not available.

The inference script with train.device and model.device = "cpu" works, but it still requires a CUDA-enabled machine.

Is there a way to bypass this? Is there a way to deploy/run DINO on a CPU-only machine or Docker image?

Many thanks!

About DINO change on 0.2.0

Hi, I'm glad that you and your team have released the new version 0.2.0.
I found that the major update in this version is to DAB-DETR, which modified some configurations. The README says the new version will "Rebuild cleaner config files for projects". Do you mean DAB-DETR? As a DINO user, I'd like to know whether the DINO update is just the updated training weights of DINO-R50-12ep (since they give a better result); the main model configuration files were not adjusted.

Run on CPU

Hi,
Is it possible to train the model using only a CPU? I tried changing projects/dab_detr/configs/models/dab_detr_r50.py so that the device is cpu, but it still seems to try to use GPUs.
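A hedged sketch of the config-side changes the issues above mention (both keys appear in detrex configs, but note that models using the MultiScaleDeformableAttention CUDA op may still fail on a CPU-only machine, as discussed in the CPU-inference issue earlier):

# your_config.py -- illustrative only
train.device = "cpu"
model.device = train.device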

CUDA out of memory error when training MaskDINO

Hello,

I am trying to train your newly released MaskDINO on a new dataset (still in COCO format).

I am using 8 Tesla V100-SXM2-16GB GPUs and I get a CUDA out-of-memory error, even when setting the batch size to 1 per GPU (I am training with the detrex training script).

Importantly, the code crashes only after a few training iterations, not immediately.

Is this behavior expected? Which hardware did you use to train the model?

Thanks,

(screenshot attached)

Error when training DINO: Cannot import 'detrex._C', therefore 'MultiScaleDeformableAttention' is not available.

Traceback (most recent call last):
  File "/share/home/wut_xujw/detrex/tools/train_net.py", line 229, in <module>
    launch(
  File "/share/home/wut_xujw/detrex/detectron2/detectron2/engine/launch.py", line 82, in launch
    main_func(*args)
  File "/share/home/wut_xujw/detrex/tools/train_net.py", line 224, in main
    do_train(args, cfg)
  File "/share/home/wut_xujw/detrex/tools/train_net.py", line 159, in do_train
    model = instantiate(cfg.model)
  File "/share/home/wut_xujw/detrex/detectron2/detectron2/config/instantiate.py", line 67, in instantiate
    cfg = {k: instantiate(v) for k, v in cfg.items()}
  File "/share/home/wut_xujw/detrex/detectron2/detectron2/config/instantiate.py", line 67, in <dictcomp>
    cfg = {k: instantiate(v) for k, v in cfg.items()}
  File "/share/home/wut_xujw/detrex/detectron2/detectron2/config/instantiate.py", line 67, in instantiate
    cfg = {k: instantiate(v) for k, v in cfg.items()}
  File "/share/home/wut_xujw/detrex/detectron2/detectron2/config/instantiate.py", line 67, in <dictcomp>
    cfg = {k: instantiate(v) for k, v in cfg.items()}
  File "/share/home/wut_xujw/detrex/detectron2/detectron2/config/instantiate.py", line 83, in instantiate
    return cls(**cfg)
  File "/share/home/wut_xujw/detrex/projects/dino/modeling/dino_transformer.py", line 45, in __init__
    attn=MultiScaleDeformableAttention(
  File "/share/home/wut_xujw/detrex/detrex/layers/multi_scale_deform_attn.py", line 373, in __init__
    raise ImportError(err)
ImportError: Cannot import 'detrex._C', therefore 'MultiScaleDeformableAttention' is not available. detrex is not compiled successfully, please build following the instructions!
