
tsd's People

Contributors

a157801, sense-x, songguanglu

tsd's Issues

Can TSD head be used on top of RetinaNet?

As I understand it, TSD refines the proposals produced by the RPN in Faster R-CNN, Mask R-CNN, Cascade R-CNN, etc.
RetinaNet has no region proposals; instead, its head convolves over the different FPN levels using anchors.
In theory, a certain spatial location could be good for classification while a slightly offset one is better for regression, just as in the TSD case.
Wouldn't RetinaNet benefit from a TSD head as well?
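
To make the question concrete, here is a toy sketch of the idea in plain Python: one anchor location, with the classification and regression features each sampled at their own offset. This is not TSD's implementation and not RetinaNet code; `bilinear_sample`, `task_specific_features`, and the offsets are hypothetical names and values chosen only for illustration.

```python
def bilinear_sample(fmap, y, x):
    """Bilinearly interpolate a 2D feature map (list of rows) at a fractional (y, x)."""
    h, w = len(fmap), len(fmap[0])
    # Clamp to the valid range so an offset cannot sample outside the map.
    y = min(max(y, 0.0), h - 1.0)
    x = min(max(x, 0.0), w - 1.0)
    y0, x0 = int(y), int(x)
    y1, x1 = min(y0 + 1, h - 1), min(x0 + 1, w - 1)
    dy, dx = y - y0, x - x0
    top = fmap[y0][x0] * (1 - dx) + fmap[y0][x1] * dx
    bottom = fmap[y1][x0] * (1 - dx) + fmap[y1][x1] * dx
    return top * (1 - dy) + bottom * dy


def task_specific_features(fmap, y, x, cls_offset, reg_offset):
    """Sample one feature per task, each at its own (dy, dx) offset from the anchor location."""
    cls_feat = bilinear_sample(fmap, y + cls_offset[0], x + cls_offset[1])
    reg_feat = bilinear_sample(fmap, y + reg_offset[0], x + reg_offset[1])
    return cls_feat, reg_feat


fmap = [[0.0, 1.0, 2.0],
        [3.0, 4.0, 5.0],
        [6.0, 7.0, 8.0]]
# Classification reads the anchor location itself; regression reads a shifted point.
cls_feat, reg_feat = task_specific_features(fmap, 1, 1, (0.0, 0.0), (0.5, 0.5))
print(cls_feat, reg_feat)
```

In a real head the two offsets would be predicted by the network rather than fixed by hand.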

Could you share the training log and evaluation script?

I have trained TSD in Docker.

  • baseline: TSD Faster R-CNN
  • dataset: Stanford Cars

envs:

  • cuda=10.1
  • pytorch=1.3.0
  • mmcv=0.4.3

My training log is below; the model looks promising.

{"mode": "train", "epoch": 12, "iter": 800, "lr": 0.0002, "time": 0.65021, "data_time": 0.00734, "memory": 3249, "loss_rpn_cls": 0.00914, "loss_rpn_bbox": 0.00379, "loss_cls": 0.16967, "acc": 97.57617, "loss_TSD_cls": 0.16994, "TSD_acc": 97.57617, "loss_bbox": 0.04268, "loss_TSD_bbox": 0.04255, "loss_pc_cls": 0.01316, "loss_pc_loc": 0.22021, "loss": 0.67113}
{"mode": "train", "epoch": 12, "iter": 850, "lr": 0.0002, "time": 0.65292, "data_time": 0.00706, "memory": 3249, "loss_rpn_cls": 0.00864, "loss_rpn_bbox": 0.00349, "loss_cls": 0.1724, "acc": 97.51123, "loss_TSD_cls": 0.17322, "TSD_acc": 97.51123, "loss_bbox": 0.04279, "loss_TSD_bbox": 0.04291, "loss_pc_cls": 0.01329, "loss_pc_loc": 0.22873, "loss": 0.68547}
{"mode": "train", "epoch": 12, "iter": 900, "lr": 0.0002, "time": 0.64935, "data_time": 0.00684, "memory": 3249, "loss_rpn_cls": 0.00942, "loss_rpn_bbox": 0.00344, "loss_cls": 0.16686, "acc": 97.57764, "loss_TSD_cls": 0.16697, "TSD_acc": 97.57764, "loss_bbox": 0.04192, "loss_TSD_bbox": 0.04198, "loss_pc_cls": 0.01312, "loss_pc_loc": 0.2191, "loss": 0.6628}
{"mode": "train", "epoch": 12, "iter": 950, "lr": 0.0002, "time": 0.65193, "data_time": 0.00701, "memory": 3249, "loss_rpn_cls": 0.00878, "loss_rpn_bbox": 0.00356, "loss_cls": 0.17192, "acc": 97.50977, "loss_TSD_cls": 0.17272, "TSD_acc": 97.50977, "loss_bbox": 0.04384, "loss_TSD_bbox": 0.04405, "loss_pc_cls": 0.01319, "loss_pc_loc": 0.22383, "loss": 0.68188}
{"mode": "train", "epoch": 12, "iter": 1000, "lr": 0.0002, "time": 0.64752, "data_time": 0.00715, "memory": 3249, "loss_rpn_cls": 0.00823, "loss_rpn_bbox": 0.00349, "loss_cls": 0.17404, "acc": 97.48682, "loss_TSD_cls": 0.17474, "TSD_acc": 97.48682, "loss_bbox": 0.04349, "loss_TSD_bbox": 0.04376, "loss_pc_cls": 0.01332, "loss_pc_loc": 0.21945, "loss": 0.68052}

But I can't test the model using the given command:

./tools/dist_test.sh configs/stanford_cars/faster_rcnn_r50_fpn_TSD_1x_stanford_cars.py work_dirs/faster_rcnn_r50_fpn_TSD_1x_stanford_cars/latest.pth 4 --eval bbox

The output is frustrating:

Evaluating bbox...
Loading and preparing results...
The testing results of the whole dataset is empty.

  1. Could you share your training log and evaluation script here?
  2. How can I evaluate the model after each epoch? (I have already enabled evaluation and set interval=1 in my config.)

Here is my config:

# model settings
model = dict(
    type="FasterRCNN",
    pretrained="torchvision://resnet50",
    backbone=dict(
        type="ResNet",
        depth=50,
        num_stages=4,
        out_indices=(0, 1, 2, 3),
        frozen_stages=1,
        norm_cfg=dict(type="BN", requires_grad=True),
        style="pytorch",
    ),
    neck=dict(
        type="FPN", in_channels=[256, 512, 1024, 2048], out_channels=256, num_outs=5
    ),
    rpn_head=dict(
        type="RPNHead",
        in_channels=256,
        feat_channels=256,
        anchor_scales=[8],
        anchor_ratios=[0.5, 1.0, 2.0],
        anchor_strides=[4, 8, 16, 32, 64],
        target_means=[0.0, 0.0, 0.0, 0.0],
        target_stds=[1.0, 1.0, 1.0, 1.0],
        loss_cls=dict(type="CrossEntropyLoss", use_sigmoid=True, loss_weight=1.0),
        loss_bbox=dict(type="SmoothL1Loss", beta=1.0 / 9.0, loss_weight=1.0),
    ),
    bbox_roi_extractor=dict(
        type="SingleRoIExtractor",
        roi_layer=dict(type="RoIAlign", out_size=7, sample_num=2),
        out_channels=256,
        featmap_strides=[4, 8, 16, 32],
    ),
    bbox_head=dict(
        type="TSDSharedFCBBoxHead",
        featmap_strides=[4, 8, 16, 32],
        num_fcs=2,
        in_channels=256,
        fc_out_channels=1024,
        roi_feat_size=7,
        num_classes=197,  # fg + bg = 196 + 1
        cls_pc_margin=0.3,
        loc_pc_margin=0.3,
        target_means=[0.0, 0.0, 0.0, 0.0],
        target_stds=[0.1, 0.1, 0.2, 0.2],
        reg_class_agnostic=False,
        loss_cls=dict(type="CrossEntropyLoss", use_sigmoid=False, loss_weight=1.0),
        loss_bbox=dict(type="SmoothL1Loss", beta=1.0, loss_weight=1.0),
    ),
)
# model training and testing settings
train_cfg = dict(
    rpn=dict(
        assigner=dict(
            type="MaxIoUAssigner",
            pos_iou_thr=0.7,
            neg_iou_thr=0.3,
            min_pos_iou=0.3,
            ignore_iof_thr=-1,
        ),
        sampler=dict(
            type="RandomSampler",
            num=256,
            pos_fraction=0.5,
            neg_pos_ub=-1,
            add_gt_as_proposals=False,
        ),
        allowed_border=0,
        pos_weight=-1,
        debug=False,
    ),
    rpn_proposal=dict(
        nms_across_levels=False,
        nms_pre=2000,
        nms_post=2000,
        max_num=2000,
        nms_thr=0.7,
        min_bbox_size=0,
    ),
    rcnn=dict(
        assigner=dict(
            type="MaxIoUAssigner",
            pos_iou_thr=0.5,
            neg_iou_thr=0.5,
            min_pos_iou=0.5,
            ignore_iof_thr=-1,
        ),
        sampler=dict(
            type="RandomSampler",
            num=512,
            pos_fraction=0.25,
            neg_pos_ub=-1,
            add_gt_as_proposals=True,
        ),
        pos_weight=-1,
        debug=False,
    ),
)
test_cfg = dict(
    rpn=dict(
        nms_across_levels=False,
        nms_pre=1000,
        nms_post=1000,
        max_num=1000,
        nms_thr=0.7,
        min_bbox_size=0,
    ),
    rcnn=dict(score_thr=0.05, nms=dict(type="nms", iou_thr=0.5), max_per_img=1)
    # soft-nms is also supported for rcnn testing
    # e.g., nms=dict(type='soft_nms', iou_thr=0.5, min_score=0.05)
)
# dataset settings
dataset_type = "StanfordcarsDataset"
data_root = 'data/stanford_car'
img_norm_cfg = dict(
    mean=[123.675, 116.28, 103.53], std=[58.395, 57.12, 57.375], to_rgb=True
)
train_pipeline = [
    dict(type="LoadImageFromFile"),
    dict(type="LoadAnnotations", with_bbox=True),
    dict(type="Resize", img_scale=(1333, 800), keep_ratio=True),
    dict(type="RandomFlip", flip_ratio=0.5),
    dict(type="Normalize", **img_norm_cfg),
    dict(type="Pad", size_divisor=32),
    dict(type="DefaultFormatBundle"),
    dict(type="Collect", keys=["img", "gt_bboxes", "gt_labels"]),
]
test_pipeline = [
    dict(type="LoadImageFromFile"),
    dict(
        type="MultiScaleFlipAug",
        img_scale=(1333, 800),
        flip=False,
        transforms=[
            dict(type="Resize", keep_ratio=True),
            dict(type="RandomFlip"),
            dict(type="Normalize", **img_norm_cfg),
            dict(type="Pad", size_divisor=32),
            dict(type="ImageToTensor", keys=["img"]),
            dict(type="Collect", keys=["img"]),
        ],
    ),
]
data = dict(
    imgs_per_gpu=2,
    workers_per_gpu=2,
    train=dict(
        type=dataset_type,
        ann_file=data_root+'/annotations/train.json',
        img_prefix=data_root+'/cars_train/',
        pipeline=train_pipeline),
    val=dict(
        type=dataset_type,
        ann_file=data_root+'/annotations/test.json',
        img_prefix=data_root+'/cars_test/',
        pipeline=test_pipeline),
    test=dict(
        type=dataset_type,
        ann_file=data_root+'/annotations/test.json',
        img_prefix=data_root+'/cars_test/',
        pipeline=test_pipeline))

evaluation = dict(interval=1, metric="bbox")

# optimizer
optimizer = dict(type="SGD", lr=0.02, momentum=0.9, weight_decay=0.0001)
# optimizer_config = dict(grad_clip=dict(max_norm=35, norm_type=2))
optimizer_config = dict(grad_clip=None)
# learning policy
lr_config = dict(
    policy="step",
    warmup="linear",
    warmup_iters=500,
    warmup_ratio=0.001,
    step=[8, 11],
)

checkpoint_config = dict(interval=1)
# yapf:disable
log_config = dict(
    interval=50,
    hooks=[
        dict(type="TextLoggerHook"),
        # dict(type='TensorboardLoggerHook')
    ],
)
# yapf:enable
# runtime settings
total_epochs = 12
dist_params = dict(backend="nccl")
log_level = "INFO"
work_dir = "./work_dirs/faster_rcnn_r50_fpn_TSD_1x_stanford_cars"
load_from = None
resume_from = None
workflow = [("train", 1)]

I hope to hear from you soon, thanks.
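
One thing that may be worth ruling out when the evaluator reports an empty result set is a mismatch between the annotation file and the config. The sketch below is only a guess at a useful sanity check, not a diagnosis of this issue. It assumes a COCO-style annotation dict and the convention in the config above, where num_classes counts the background class. `check_annotations` is a hypothetical helper, not part of mmdet.

```python
def check_annotations(coco, num_classes):
    """Return a list of problems found in a COCO-style annotation dict.

    num_classes follows the config above: foreground classes + 1 for background.
    """
    problems = []
    annotations = coco.get("annotations", [])
    cat_ids = {c["id"] for c in coco.get("categories", [])}
    image_ids = {im["id"] for im in coco.get("images", [])}

    num_fg = num_classes - 1
    if len(cat_ids) != num_fg:
        problems.append(
            "config expects %d foreground classes, file has %d" % (num_fg, len(cat_ids))
        )
    if not annotations:
        problems.append("annotation list is empty")

    bad_cat = sum(1 for a in annotations if a["category_id"] not in cat_ids)
    if bad_cat:
        problems.append("%d annotations use an unknown category_id" % bad_cat)
    bad_img = sum(1 for a in annotations if a["image_id"] not in image_ids)
    if bad_img:
        problems.append("%d annotations use an unknown image_id" % bad_img)
    return problems


# Tiny in-memory example; a real file would be loaded with json.load().
toy = {
    "images": [{"id": 1}],
    "categories": [{"id": 1, "name": "a"}, {"id": 2, "name": "b"}],
    "annotations": [{"image_id": 1, "category_id": 2, "bbox": [0, 0, 10, 10]}],
}
print(check_annotations(toy, num_classes=3))    # consistent with 2 fg + 1 bg
print(check_annotations(toy, num_classes=197))  # reports the class-count mismatch
```

If the file passes such a check, the test-time settings in the config (for example score_thr and max_per_img) would be the next place to look.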

Why is the rcnn score_thr set to 0.00 in all TSD configs?

I really like your work! However, I have a question about it.
If score_thr is set to 0.00, the mAP can increase by 0.2~0.3. So I wonder whether the Faster R-CNN based TSD was compared with the standard Faster R-CNN under the same score_thr. Thanks.
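
As background for why the threshold matters: lowering score_thr only appends low-confidence detections to the tail of the ranked list, which can extend recall without changing the earlier part of the precision-recall curve. The toy sketch below uses a simplified all-point interpolated AP, not the COCO evaluator, and the detections are made up for illustration.

```python
def average_precision(dets, num_gt, score_thr):
    """All-point interpolated AP for a list of (score, is_true_positive) detections."""
    kept = sorted((d for d in dets if d[0] >= score_thr), key=lambda d: -d[0])
    tp = 0
    precisions, recalls = [], []
    for rank, (_, is_tp) in enumerate(kept, start=1):
        tp += int(is_tp)
        precisions.append(tp / rank)
        recalls.append(tp / num_gt)
    # Interpolation: each precision becomes the max precision at any higher recall.
    for i in range(len(precisions) - 2, -1, -1):
        precisions[i] = max(precisions[i], precisions[i + 1])
    ap, prev_recall = 0.0, 0.0
    for p, r in zip(precisions, recalls):
        ap += p * (r - prev_recall)
        prev_recall = r
    return ap


# Made-up detections for one class with two ground-truth boxes.
dets = [(0.9, True), (0.6, False), (0.04, True), (0.01, False)]
ap_thr_005 = average_precision(dets, num_gt=2, score_thr=0.05)
ap_thr_000 = average_precision(dets, num_gt=2, score_thr=0.0)
print(ap_thr_005, ap_thr_000)
```

In this example the true positive with score 0.04 is discarded at score_thr=0.05, so the lower threshold gives the higher AP. A comparison between two detectors is only like-for-like when both use the same threshold.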

KeyError: 'TSDSharedFCBBoxHead is not in the head registry'

I encountered the error above while running the following code:

from mmdet.apis import inference_detector, init_detector, show_result_pyplot
config_file = '/md/tsd/faster_rcnn_x101_64x4d_fpn_TSD.py'
checkpoint_file = '/md/tsd/faser_rcnn_X101_64x4d_TSD.pth'
model = init_detector(config_file, checkpoint_file, device='cuda:0')

Environment

sys.platform: linux
Python: 3.6.9 (default, Oct 8 2020, 12:12:24) [GCC 8.4.0]
CUDA available: True
CUDA_HOME: /usr/local/cuda
NVCC: Cuda compilation tools, release 10.1, V10.1.243
GPU 0: Tesla T4
GCC: gcc (Ubuntu 7.5.0-3ubuntu1~18.04) 7.5.0
PyTorch: 1.3.1+cu100
PyTorch compiling details: PyTorch built with:

  • GCC 7.3
  • Intel(R) Math Kernel Library Version 2019.0.4 Product Build 20190411 for Intel(R) 64 architecture applications
  • Intel(R) MKL-DNN v0.20.5 (Git Hash 0125f28c61c1f822fd48570b4c1066f96fcb9b2e)
  • OpenMP 201511 (a.k.a. OpenMP 4.5)
  • NNPACK is enabled
  • CUDA Runtime 10.0
  • NVCC architecture flags: -gencode;arch=compute_37,code=sm_37;-gencode;arch=compute_50,code=sm_50;-gencode;arch=compute_60,code=sm_60;-gencode;arch=compute_61,code=sm_61;-gencode;arch=compute_70,code=sm_70;-gencode;arch=compute_75,code=sm_75;-gencode;arch=compute_37,code=compute_37
  • CuDNN 7.6.3
  • Magma 2.5.1
  • Build settings: BLAS=MKL, BUILD_NAMEDTENSOR=OFF, BUILD_TYPE=Release, CXX_FLAGS= -Wno-deprecated -fvisibility-inlines-hidden -fopenmp -DUSE_FBGEMM -DUSE_QNNPACK -DUSE_PYTORCH_QNNPACK -O2 -fPIC -Wno-narrowing -Wall -Wextra -Wno-missing-field-initializers -Wno-type-limits -Wno-array-bounds -Wno-unknown-pragmas -Wno-sign-compare -Wno-unused-parameter -Wno-unused-variable -Wno-unused-function -Wno-unused-result -Wno-strict-overflow -Wno-strict-aliasing -Wno-error=deprecated-declarations -Wno-stringop-overflow -Wno-error=pedantic -Wno-error=redundant-decls -Wno-error=old-style-cast -fdiagnostics-color=always -faligned-new -Wno-unused-but-set-variable -Wno-maybe-uninitialized -fno-math-errno -fno-trapping-math -Wno-stringop-overflow, DISABLE_NUMA=1, PERF_WITH_AVX=1, PERF_WITH_AVX2=1, PERF_WITH_AVX512=1, USE_CUDA=True, USE_EXCEPTION_PTR=1, USE_GFLAGS=OFF, USE_GLOG=OFF, USE_MKL=ON, USE_MKLDNN=ON, USE_MPI=OFF, USE_NCCL=ON, USE_NNPACK=ON, USE_OPENMP=ON, USE_STATIC_DISPATCH=OFF,

TorchVision: 0.4.2+cu100
OpenCV: 4.5.1
MMCV: 0.4.3
MMDetection: 1.1.0+unknown
MMDetection Compiler: GCC 7.5
MMDetection CUDA Compiler: 10.1

Error traceback


KeyError Traceback (most recent call last)
in ()
5 # Setup a checkpoint file to load
6 checkpoint_file = '/md/tsd/faser_rcnn_X101_64x4d_TSD.pth'# initialize the detector
----> 7 model = init_detector(config_file, checkpoint_file, device='cuda:0')

8 frames
/content/mmdetection-1.1.0/mmdet/apis/inference.py in init_detector(config, checkpoint, device)
32 'but got {}'.format(type(config)))
33 config.model.pretrained = None
---> 34 model = build_detector(config.model, test_cfg=config.test_cfg)
35 if checkpoint is not None:
36 checkpoint = load_checkpoint(model, checkpoint)

/content/mmdetection-1.1.0/mmdet/models/builder.py in build_detector(cfg, train_cfg, test_cfg)
41
42 def build_detector(cfg, train_cfg=None, test_cfg=None):
---> 43 return build(cfg, DETECTORS, dict(train_cfg=train_cfg, test_cfg=test_cfg))

/content/mmdetection-1.1.0/mmdet/models/builder.py in build(cfg, registry, default_args)
13 return nn.Sequential(*modules)
14 else:
---> 15 return build_from_cfg(cfg, registry, default_args)
16
17

/content/mmdetection-1.1.0/mmdet/utils/registry.py in build_from_cfg(cfg, registry, default_args)
77 for name, value in default_args.items():
78 args.setdefault(name, value)
---> 79 return obj_cls(**args)

/content/mmdetection-1.1.0/mmdet/models/detectors/faster_rcnn.py in __init__(self, backbone, rpn_head, bbox_roi_extractor, bbox_head, train_cfg, test_cfg, neck, shared_head, pretrained)
25 train_cfg=train_cfg,
26 test_cfg=test_cfg,
---> 27 pretrained=pretrained)

/content/mmdetection-1.1.0/mmdet/models/detectors/two_stage.py in __init__(self, backbone, neck, shared_head, rpn_head, bbox_roi_extractor, bbox_head, mask_roi_extractor, mask_head, train_cfg, test_cfg, pretrained)
45 self.bbox_roi_extractor = builder.build_roi_extractor(
46 bbox_roi_extractor)
---> 47 self.bbox_head = builder.build_head(bbox_head)
48
49 if mask_head is not None:

/content/mmdetection-1.1.0/mmdet/models/builder.py in build_head(cfg)
33
34 def build_head(cfg):
---> 35 return build(cfg, HEADS)
36
37

/content/mmdetection-1.1.0/mmdet/models/builder.py in build(cfg, registry, default_args)
13 return nn.Sequential(*modules)
14 else:
---> 15 return build_from_cfg(cfg, registry, default_args)
16
17

/content/mmdetection-1.1.0/mmdet/utils/registry.py in build_from_cfg(cfg, registry, default_args)
68 if obj_cls is None:
69 raise KeyError('{} is not in the {} registry'.format(
---> 70 obj_type, registry.name))
71 elif inspect.isclass(obj_type):
72 obj_cls = obj_type

KeyError: 'TSDSharedFCBBoxHead is not in the head registry'
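
The traceback shows the head being looked up by name in a registry and not found. The paths point to /content/mmdetection-1.1.0/, so one plausible reading is that the imported mmdet package does not define TSDSharedFCBBoxHead. The sketch below is a simplified stand-in for a name-based registry, not mmdet's actual code, and the class bodies are placeholders.

```python
class Registry:
    """Minimal name -> class registry (simplified stand-in, not mmdet's class)."""

    def __init__(self, name):
        self.name = name
        self._module_dict = {}

    def register_module(self, cls):
        self._module_dict[cls.__name__] = cls
        return cls

    def get(self, key):
        return self._module_dict.get(key)


def build_from_cfg(cfg, registry):
    """Instantiate the class named by cfg['type'] with the remaining keys as kwargs."""
    args = dict(cfg)
    obj_type = args.pop("type")
    obj_cls = registry.get(obj_type)
    if obj_cls is None:
        raise KeyError("{} is not in the {} registry".format(obj_type, registry.name))
    return obj_cls(**args)


HEADS = Registry("head")


@HEADS.register_module
class SharedFCBBoxHead:
    def __init__(self, num_classes=81):
        self.num_classes = num_classes


cfg = dict(type="TSDSharedFCBBoxHead", num_classes=81)
try:
    build_from_cfg(cfg, HEADS)
except KeyError as err:
    print(err)  # the class was never registered in this package


# Once a module that defines and registers the class is imported, the lookup succeeds.
@HEADS.register_module
class TSDSharedFCBBoxHead(SharedFCBBoxHead):
    pass


head = build_from_cfg(cfg, HEADS)
```

Under that reading, the config would need to be loaded with the mmdet package from the TSD repository rather than a stock mmdetection install.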

TypeError "forward() missing 2 required positional arguments: 'feats' and 'rois'" in pytorch2onnx

When I tried to convert the PyTorch model to ONNX by running tools/pytorch2onnx.py, I encountered this error.

Script

python tools/pytorch2onnx.py configs/OpenImages_configs/r50-FPN-1x_classsampling_TSD/r50-FPN-1x_classsampling_TSD.py checkpoints/r50-FPN-1x_classsampling_TSD.pth --out r50-FPN-1x_classsampling_TSD.onnx

Environment

sys.platform: linux
Python: 3.7.5 (default, Oct 25 2019, 15:51:11) [GCC 7.3.0]
CUDA available: True
CUDA_HOME: /usr/local/cuda-10.1
NVCC: Cuda compilation tools, release 10.1, V10.1.105
GPU 0: GeForce RTX 2080 Ti
GCC: gcc (Ubuntu 5.4.0-6ubuntu1~16.04.12) 5.4.0 20160609
PyTorch: 1.2.0
PyTorch compiling details: PyTorch built with:

  • GCC 7.3
  • Intel(R) Math Kernel Library Version 2019.0.4 Product Build 20190411 for Intel(R) 64 architecture applications
  • Intel(R) MKL-DNN v0.18.1
  • OpenMP 201511 (a.k.a. OpenMP 4.5)
  • NNPACK is enabled
  • CUDA Runtime 10.0
  • NVCC architecture flags: -gencode;arch=compute_35,code=sm_35;-gencode;arch=compute_50,code=sm_50;-gencode;arch=compute_60,code=sm_60;-gencode;arch=compute_61,code=sm_61;-gencode;arch=compute_70,code=sm_70;-gencode;arch=compute_75,code=sm_75;-gencode;arch=compute_50,code=compute_50
  • CuDNN 7.6.2
  • Magma 2.5.1
  • Build settings: BLAS=MKL, BUILD_NAMEDTENSOR=OFF, BUILD_TYPE=Release, CXX_FLAGS= -Wno-deprecated -fvisibility-inlines-hidden -fopenmp -DUSE_FBGEMM -DUSE_QNNPACK -O2 -fPIC -Wno-narrowing -Wall -Wextra -Wno-missing-field-initializers -Wno-type-limits -Wno-array-bounds -Wno-unknown-pragmas -Wno-sign-compare -Wno-unused-parameter -Wno-unused-variable -Wno-unused-function -Wno-unused-result -Wno-strict-overflow -Wno-strict-aliasing -Wno-error=deprecated-declarations -Wno-stringop-overflow -Wno-error=pedantic -Wno-error=redundant-decls -Wno-error=old-style-cast -fdiagnostics-color=always -faligned-new -Wno-unused-but-set-variable -Wno-maybe-uninitialized -fno-math-errno -fno-trapping-math -Wno-stringop-overflow, DISABLE_NUMA=1, PERF_WITH_AVX=1, PERF_WITH_AVX2=1, PERF_WITH_AVX512=1, USE_CUDA=True, USE_EXCEPTION_PTR=1, USE_GFLAGS=OFF, USE_GLOG=OFF, USE_MKL=ON, USE_MKLDNN=ON, USE_MPI=OFF, USE_NCCL=ON, USE_NNPACK=ON, USE_OPENMP=ON,

TorchVision: 0.4.0
OpenCV: 4.1.1
MMCV: 0.4.3
MMDetection: 1.1.0+fb1fdd7
MMDetection Compiler: GCC 5.4
MMDetection CUDA Compiler: 10.1

Error traceback

/content/anaconda3/envs/face/lib/python3.7/site-packages/mmdet-1.1.0+fb1fdd7-py3.7-linux-x86_64.egg/mmdet/core/bbox/transforms.py:163: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
  if bboxes.size(0) > 0:
/content/anaconda3/envs/face/lib/python3.7/site-packages/mmdet-1.1.0+fb1fdd7-py3.7-linux-x86_64.egg/mmdet/models/roi_extractors/single_level.py:99: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
  if inds.any():
/content/anaconda3/envs/face/lib/python3.7/site-packages/mmdet-1.1.0+fb1fdd7-py3.7-linux-x86_64.egg/mmdet/ops/roi_align/roi_align.py:149: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
  assert rois.dim() == 2 and rois.size(1) == 5
Traceback (most recent call last):
  File "tools/pytorch2onnx.py", line 127, in <module>
    main()
  File "tools/pytorch2onnx.py", line 119, in main
    onnx_model = export_onnx_model(model, (input_data,), args.passes)
  File "tools/pytorch2onnx.py", line 44, in export_onnx_model
    operator_export_type=OperatorExportTypes.ONNX_ATEN_FALLBACK,
  File "/content/anaconda3/envs/face/lib/python3.7/site-packages/torch/onnx/__init__.py", line 132, in export
    strip_doc_string, dynamic_axes)
  File "/content/anaconda3/envs/face/lib/python3.7/site-packages/torch/onnx/utils.py", line 64, in export
    example_outputs=example_outputs, strip_doc_string=strip_doc_string, dynamic_axes=dynamic_axes)
  File "/content/anaconda3/envs/face/lib/python3.7/site-packages/torch/onnx/utils.py", line 329, in _export
    _retain_param_name, do_constant_folding)
  File "/content/anaconda3/envs/face/lib/python3.7/site-packages/torch/onnx/utils.py", line 213, in _model_to_graph
    graph, torch_out = _trace_and_get_graph_from_model(model, args, training)
  File "/content/anaconda3/envs/face/lib/python3.7/site-packages/torch/onnx/utils.py", line 171, in _trace_and_get_graph_from_model
    trace, torch_out = torch.jit.get_trace_graph(model, args, _force_outplace=True)
  File "/content/anaconda3/envs/face/lib/python3.7/site-packages/torch/jit/__init__.py", line 256, in get_trace_graph
    return LegacyTracedModule(f, _force_outplace, return_inputs)(*args, **kwargs)
  File "/content/anaconda3/envs/face/lib/python3.7/site-packages/torch/nn/modules/module.py", line 547, in __call__
    result = self.forward(*input, **kwargs)
  File "/content/anaconda3/envs/face/lib/python3.7/site-packages/torch/jit/__init__.py", line 323, in forward
    out = self.inner(*trace_inputs)
  File "/content/anaconda3/envs/face/lib/python3.7/site-packages/torch/nn/modules/module.py", line 545, in __call__
    result = self._slow_forward(*input, **kwargs)
  File "/content/anaconda3/envs/face/lib/python3.7/site-packages/torch/nn/modules/module.py", line 531, in _slow_forward
    result = self.forward(*input, **kwargs)
  File "/content/anaconda3/envs/face/lib/python3.7/site-packages/mmdet-1.1.0+fb1fdd7-py3.7-linux-x86_64.egg/mmdet/models/detectors/two_stage.py", line 128, in forward_dummy
    cls_score, bbox_pred = self.bbox_head(bbox_feats)
  File "/content/anaconda3/envs/face/lib/python3.7/site-packages/torch/nn/modules/module.py", line 545, in __call__
    result = self._slow_forward(*input, **kwargs)
  File "/content/anaconda3/envs/face/lib/python3.7/site-packages/torch/nn/modules/module.py", line 531, in _slow_forward
    result = self.forward(*input, **kwargs)
  File "/content/anaconda3/envs/face/lib/python3.7/site-packages/mmdet-1.1.0+fb1fdd7-py3.7-linux-x86_64.egg/mmdet/core/fp16/decorators.py", line 130, in new_func
    return old_func(*args, **kwargs)
TypeError: forward() missing 2 required positional arguments: 'feats' and 'rois'

Thanks a lot!
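
The traceback shows forward_dummy calling self.bbox_head(bbox_feats) with a single argument, while the error message says the head's forward() also requires feats and rois. The sketch below reproduces that kind of signature mismatch with toy classes; `PlainBBoxHead`, `TSDLikeBBoxHead`, and `missing_positional_args` are hypothetical names, and the method bodies are placeholders rather than the repository's code.

```python
import inspect


class PlainBBoxHead:
    def forward(self, x):
        return "cls_score", "bbox_pred"


class TSDLikeBBoxHead:
    # Extra required inputs, mirroring the argument names in the error message.
    def forward(self, x, feats, rois):
        return "cls_score", "bbox_pred"


def missing_positional_args(func, num_given):
    """Names of required positional parameters not covered by num_given arguments."""
    required = [
        p for p in inspect.signature(func).parameters.values()
        if p.kind in (p.POSITIONAL_ONLY, p.POSITIONAL_OR_KEYWORD)
        and p.default is p.empty
    ]
    return [p.name for p in required[num_given:]]


# A caller that passes only the pooled RoI features satisfies the plain head...
print(missing_positional_args(PlainBBoxHead().forward, 1))
# ...but leaves two parameters of the TSD-like head unfilled.
print(missing_positional_args(TSDLikeBBoxHead().forward, 1))

try:
    TSDLikeBBoxHead().forward("bbox_feats")
except TypeError as err:
    message = str(err)
    print(message)
```

If this is indeed the cause, the dummy forward path used for export would have to pass the additional inputs the TSD head expects.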

Pretrained weights for OpenImages dataset (600 classes)?

Thanks for sharing your great work!

I wanted to know - have you shared the pretrained weights for your prize-winning OpenImages network with 600 classes? Per my understanding, the current pretrained models are for the COCO dataset.

If not shared currently, could you please consider doing so? It will be really helpful, as I am interested in using the 600 Class object-detector for one of my projects.

UnboundLocalError: local variable 'pc_cls_loss' referenced before assignment

2020-06-07 11:02:32,849 - mmdet - INFO - Epoch [1][15950/36314] lr: 0.04000, eta: 9 days, 19:08:30, time: 2.144, data_time: 0.781, memory: 16486, loss_rpn_cls: 0.0568, loss_rpn_bbox: 0.0406, loss_cls: 0.3449, acc: 90.2331, loss_TSD_cls: 0.3340, TSD_acc: 90.5631, loss_bbox: 0.1293, loss_TSD_bbox: 0.1197, loss_pc_cls: 0.0607, loss_pc_loc: 0.1979, loss: 1.2839
Traceback (most recent call last):
  File "./tools/train.py", line 151, in <module>
    main()
  File "./tools/train.py", line 147, in main
    meta=meta)
  File "/home/yckj0114/code/TSD/mmdet/apis/train.py", line 165, in train_detector
    runner.run(data_loaders, cfg.workflow, cfg.total_epochs)
  File "/home/yckj0114/whl/mmcv-0.4.4/mmcv/runner/runner.py", line 380, in run
    epoch_runner(data_loaders[i], **kwargs)
  File "/home/yckj0114/whl/mmcv-0.4.4/mmcv/runner/runner.py", line 278, in train
    self.model, data_batch, train_mode=True, **kwargs)
  File "/home/yckj0114/code/TSD/mmdet/apis/train.py", line 75, in batch_processor
    losses = model(**data)
  File "/home/yckj0114/anaconda3/envs/TSD/lib/python3.7/site-packages/torch/nn/modules/module.py", line 532, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/yckj0114/anaconda3/envs/TSD/lib/python3.7/site-packages/torch/nn/parallel/distributed.py", line 447, in forward
    output = self.module(*inputs[0], **kwargs[0])
  File "/home/yckj0114/anaconda3/envs/TSD/lib/python3.7/site-packages/torch/nn/modules/module.py", line 532, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/yckj0114/code/TSD/mmdet/core/fp16/decorators.py", line 49, in new_func
    return old_func(*args, **kwargs)
  File "/home/yckj0114/code/TSD/mmdet/models/detectors/base.py", line 147, in forward
    return self.forward_train(img, img_metas, **kwargs)
  File "/home/yckj0114/code/TSD/mmdet/models/detectors/two_stage.py", line 222, in forward_train
    self.train_cfg.rcnn, img_metas)
  File "/home/yckj0114/code/TSD/mmdet/core/fp16/decorators.py", line 127, in new_func
    return old_func(*args, **kwargs)
  File "/home/yckj0114/code/TSD/mmdet/models/bbox_heads/tsd_bbox_head.py", line 370, in get_target
    target_stds=self.target_stds)
  File "/home/yckj0114/code/TSD/mmdet/core/bbox/bbox_target.py", line 100, in bbox_target_tsd
    target_stds=target_stds)
  File "/home/yckj0114/code/TSD/mmdet/core/utils/misc.py", line 24, in multi_apply
    return tuple(map(list, zip(*map_results)))
  File "/home/yckj0114/code/TSD/mmdet/core/bbox/bbox_target.py", line 226, in bbox_target_single_tsd
    return labels, label_weights, bbox_targets, bbox_weights, TSD_labels, TSD_label_weights, TSD_bbox_targets, TSD_bbox_weights, pc_cls_loss, pc_loc_loss
UnboundLocalError: local variable 'pc_cls_loss' referenced before assignment
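
This kind of UnboundLocalError usually means a variable is assigned only inside a branch that was skipped for some input. The body of bbox_target_single_tsd is not shown here, so the sketch below only illustrates the general pattern. The num_pos condition, the placeholder values, and the 0.0 default are assumptions, not the repository's code.

```python
def targets_buggy(num_pos):
    """pc_cls_loss is assigned only when there is at least one positive sample."""
    if num_pos > 0:
        pc_cls_loss = 1.0  # placeholder for the real computation
    return pc_cls_loss  # raises UnboundLocalError when num_pos == 0


def targets_guarded(num_pos):
    """Same logic, with a default assigned before the branch."""
    pc_cls_loss = 0.0  # assumed neutral default; the right value depends on the loss
    if num_pos > 0:
        pc_cls_loss = 1.0
    return pc_cls_loss


try:
    targets_buggy(0)
except UnboundLocalError as err:
    print(err)

print(targets_guarded(0))
```

The error appearing partway through an epoch would be consistent with a rare batch taking the skipped branch, for example an image with no positive samples.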

Question about training with fewer GPUs

Thanks for sharing your great work! I have several questions:

  1. It seems that you trained on 16 GPUs. Would training on 4 GPUs degrade the performance? Have you run experiments on 4 GPUs?
  2. Your paper mentions that SyncBN is used, but it seems to be missing from your config files. Yet the performance reported in this repo is similar to the results in your paper.

BTW, I suggest merging all your TSD configs into a single directory :)
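
When changing the number of GPUs, a common heuristic is the linear scaling rule: keep the learning rate proportional to the total batch size. The sketch below assumes a reference setup of 16 GPUs with 2 images each and lr=0.04; the per-GPU batch size is an assumption, and the rule does not guarantee identical accuracy.

```python
def scaled_lr(base_lr, base_batch_size, num_gpus, imgs_per_gpu):
    """Linear scaling rule: learning rate proportional to the total batch size."""
    return base_lr * (num_gpus * imgs_per_gpu) / base_batch_size


# Assumed reference: 16 GPUs x 2 images per GPU = 32 images per step at lr = 0.04.
lr_4_gpus = scaled_lr(base_lr=0.04, base_batch_size=32, num_gpus=4, imgs_per_gpu=2)
print(lr_4_gpus)
```

The resulting value would replace lr in the optimizer dict of the config; the warmup settings may also need retuning for smaller batches.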

Welcome update to OpenMMLab 2.0

I am Vansin, the technical operator of OpenMMLab. In September of last year, we announced the release of OpenMMLab 2.0 at the World Artificial Intelligence Conference in Shanghai. We invite you to upgrade your algorithm library to OpenMMLab 2.0 using MMEngine, which can be used for both research and commercial purposes. If you have any questions, please feel free to join us on the OpenMMLab Discord at https://discord.gg/amFNsyUBvm or add me on WeChat (van-sin) and I will invite you to the OpenMMLab WeChat group.

Here are the OpenMMLab 2.0 repos branches:

Repo               OpenMMLab 1.0 branch    OpenMMLab 2.0 branch
MMEngine                                   0.x
MMCV               1.x                     2.x
MMDetection        0.x, 1.x, 2.x           3.x
MMAction2          0.x                     1.x
MMClassification   0.x                     1.x
MMSegmentation     0.x                     1.x
MMDetection3D      0.x                     1.x
MMEditing          0.x                     1.x
MMPose             0.x                     1.x
MMDeploy           0.x                     1.x
MMTracking         0.x                     1.x
MMOCR              0.x                     1.x
MMRazor            0.x                     1.x
MMSelfSup          0.x                     1.x
MMRotate           1.x                     1.x
MMYOLO                                     0.x

Attention: please create a new virtual environment for OpenMMLab 2.0.

InstallationError: nvcc fatal : Unsupported gpu architecture 'compute_75' ninja: build stopped: subcommand failed. Error compiling objects for extension

Thanks for your error report and we appreciate it a lot.

Checklist

  1. I have searched related issues but cannot get the expected help. YES
  2. The bug has not been fixed in the latest version. YES

Describe the bug
raise InstallationError(exc_msg)
pip._internal.exceptions.InstallationError: Command errored out with exit status 1: /home/raouf/anaconda3/envs/open-mmlab/bin/python -c 'import sys, setuptools, tokenize; sys.argv[0] = '"'"'/home/raouf/mmdetection/setup.py'"'"'; __file__='"'"'/home/raouf/mmdetection/setup.py'"'"';f=getattr(tokenize, '"'"'open'"'"', open)(__file__);code=f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, __file__, '"'"'exec'"'"'))' develop --no-deps Check the logs for full command output.
Removed build tracker: '/tmp/pip-req-tracker-4weaq6b4'
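
The "Unsupported gpu architecture 'compute_75'" message in the title typically means the nvcc being invoked predates that architecture: compute_75 (Turing) is supported from CUDA 10.0 onward. The nvcc version is not shown in this report, so the sketch below is only a way to check it. It parses the text printed by `nvcc --version`; the table lists minimum toolkit versions for a few architectures, and the function names are hypothetical.

```python
import re

# Minimum CUDA toolkit release that accepts each virtual architecture.
MIN_CUDA_FOR_ARCH = {
    "compute_60": (8, 0),
    "compute_61": (8, 0),
    "compute_70": (9, 0),
    "compute_75": (10, 0),
    "compute_80": (11, 0),
    "compute_86": (11, 1),
}


def parse_release(nvcc_version_output):
    """Extract (major, minor) from the 'release X.Y' part of `nvcc --version` output."""
    match = re.search(r"release (\d+)\.(\d+)", nvcc_version_output)
    if match is None:
        raise ValueError("could not find a 'release X.Y' string")
    return int(match.group(1)), int(match.group(2))


def nvcc_supports(nvcc_version_output, arch):
    """True if the toolkit release in the given output is new enough for arch."""
    return parse_release(nvcc_version_output) >= MIN_CUDA_FOR_ARCH[arch]


print(nvcc_supports("Cuda compilation tools, release 10.1, V10.1.243", "compute_75"))
print(nvcc_supports("Cuda compilation tools, release 9.1, V9.1.85", "compute_75"))
```

If the nvcc on the PATH turns out to be too old, pointing CUDA_HOME at a newer toolkit that matches the installed PyTorch build is the usual remedy.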

Reproduction

  1. What command or script did you run? MMDETECTRON INSTALLATION
     pip install -v -e .
  2. Did you make any modifications on the code or config? NO
     Did you understand what you have modified? .
  3. What dataset did you use?

Environment

  1. Please run python mmdet/utils/collect_env.py PROBLEM WITH MMDET INSTALLATION
  2. You may add additional information that may be helpful for locating the problem, such as
    • How you installed PyTorch [e.g., pip, conda, source]
    • Other environment variables that may be related (such as $PATH, $LD_LIBRARY_PATH, $PYTHONPATH, etc.)

Error traceback
If applicable, paste the error traceback here.

Non-user install because site-packages writeable
Created temporary directory: /tmp/pip-ephem-wheel-cache-cyi012tq
Created temporary directory: /tmp/pip-req-tracker-4weaq6b4
Initialized build tracking at /tmp/pip-req-tracker-4weaq6b4
Created build tracker: /tmp/pip-req-tracker-4weaq6b4
Entered build tracker: /tmp/pip-req-tracker-4weaq6b4
Created temporary directory: /tmp/pip-install-bds8rzv2
Obtaining file:///home/raouf/mmdetection
  Added file:///home/raouf/mmdetection to build tracker '/tmp/pip-req-tracker-4weaq6b4'
    Running setup.py (path:/home/raouf/mmdetection/setup.py) egg_info for package from file:///home/raouf/mmdetection
    Created temporary directory: /tmp/pip-pip-egg-info-48w9rucq
    Running command python setup.py egg_info
    running egg_info
    creating /tmp/pip-pip-egg-info-48w9rucq/mmdet.egg-info
    writing /tmp/pip-pip-egg-info-48w9rucq/mmdet.egg-info/PKG-INFO
    writing dependency_links to /tmp/pip-pip-egg-info-48w9rucq/mmdet.egg-info/dependency_links.txt
    writing requirements to /tmp/pip-pip-egg-info-48w9rucq/mmdet.egg-info/requires.txt
    writing top-level names to /tmp/pip-pip-egg-info-48w9rucq/mmdet.egg-info/top_level.txt
    writing manifest file '/tmp/pip-pip-egg-info-48w9rucq/mmdet.egg-info/SOURCES.txt'
    reading manifest file '/tmp/pip-pip-egg-info-48w9rucq/mmdet.egg-info/SOURCES.txt'
    writing manifest file '/tmp/pip-pip-egg-info-48w9rucq/mmdet.egg-info/SOURCES.txt'
  Source in /home/raouf/mmdetection has version 2.1.0+0d67223, which satisfies requirement mmdet==2.1.0+0d67223 from file:///home/raouf/mmdetection
  Removed mmdet==2.1.0+0d67223 from file:///home/raouf/mmdetection from build tracker '/tmp/pip-req-tracker-4weaq6b4'
Requirement already satisfied: matplotlib in /home/raouf/anaconda3/envs/open-mmlab/lib/python3.7/site-packages (from mmdet==2.1.0+0d67223) (3.2.2)
Requirement already satisfied: mmcv>=0.6.0 in /home/raouf/anaconda3/envs/open-mmlab/lib/python3.7/site-packages (from mmdet==2.1.0+0d67223) (0.6.1)
Requirement already satisfied: numpy in /home/raouf/anaconda3/envs/open-mmlab/lib/python3.7/site-packages (from mmdet==2.1.0+0d67223) (1.18.1)
Requirement already satisfied: Pillow<=6.2.2 in /home/raouf/anaconda3/envs/open-mmlab/lib/python3.7/site-packages (from mmdet==2.1.0+0d67223) (6.2.2)
Requirement already satisfied: six in /home/raouf/anaconda3/envs/open-mmlab/lib/python3.7/site-packages (from mmdet==2.1.0+0d67223) (1.15.0)
Requirement already satisfied: terminaltables in /home/raouf/anaconda3/envs/open-mmlab/lib/python3.7/site-packages (from mmdet==2.1.0+0d67223) (3.1.0)
Requirement already satisfied: torch>=1.3 in /home/raouf/anaconda3/envs/open-mmlab/lib/python3.7/site-packages (from mmdet==2.1.0+0d67223) (1.5.1)
Requirement already satisfied: torchvision in /home/raouf/anaconda3/envs/open-mmlab/lib/python3.7/site-packages (from mmdet==2.1.0+0d67223) (0.6.0a0+35d732a)
Requirement already satisfied: kiwisolver>=1.0.1 in /home/raouf/anaconda3/envs/open-mmlab/lib/python3.7/site-packages (from matplotlib->mmdet==2.1.0+0d67223) (1.2.0)
Requirement already satisfied: pyparsing!=2.0.4,!=2.1.2,!=2.1.6,>=2.0.1 in /home/raouf/anaconda3/envs/open-mmlab/lib/python3.7/site-packages (from matplotlib->mmdet==2.1.0+0d67223) (2.4.7)
Requirement already satisfied: python-dateutil>=2.1 in /home/raouf/anaconda3/envs/open-mmlab/lib/python3.7/site-packages (from matplotlib->mmdet==2.1.0+0d67223) (2.8.1)
Requirement already satisfied: cycler>=0.10 in /home/raouf/anaconda3/envs/open-mmlab/lib/python3.7/site-packages (from matplotlib->mmdet==2.1.0+0d67223) (0.10.0)
Requirement already satisfied: yapf in /home/raouf/anaconda3/envs/open-mmlab/lib/python3.7/site-packages (from mmcv>=0.6.0->mmdet==2.1.0+0d67223) (0.30.0)
Requirement already satisfied: addict in /home/raouf/anaconda3/envs/open-mmlab/lib/python3.7/site-packages (from mmcv>=0.6.0->mmdet==2.1.0+0d67223) (2.2.1)
Requirement already satisfied: pyyaml in /home/raouf/anaconda3/envs/open-mmlab/lib/python3.7/site-packages (from mmcv>=0.6.0->mmdet==2.1.0+0d67223) (5.3.1)
Requirement already satisfied: opencv-python>=3 in /home/raouf/anaconda3/envs/open-mmlab/lib/python3.7/site-packages (from mmcv>=0.6.0->mmdet==2.1.0+0d67223) (4.2.0.34)
Requirement already satisfied: future in /home/raouf/anaconda3/envs/open-mmlab/lib/python3.7/site-packages (from torch>=1.3->mmdet==2.1.0+0d67223) (0.18.2)
Installing collected packages: mmdet
  Running setup.py develop for mmdet
    Running command /home/raouf/anaconda3/envs/open-mmlab/bin/python -c 'import sys, setuptools, tokenize; sys.argv[0] = '"'"'/home/raouf/mmdetection/setup.py'"'"'; __file__='"'"'/home/raouf/mmdetection/setup.py'"'"';f=getattr(tokenize, '"'"'open'"'"', open)(__file__);code=f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, __file__, '"'"'exec'"'"'))' develop --no-deps
    running develop
    running egg_info
    writing mmdet.egg-info/PKG-INFO
    writing dependency_links to mmdet.egg-info/dependency_links.txt
    writing requirements to mmdet.egg-info/requires.txt
    writing top-level names to mmdet.egg-info/top_level.txt
    reading manifest file 'mmdet.egg-info/SOURCES.txt'
    writing manifest file 'mmdet.egg-info/SOURCES.txt'
    running build_ext
    building 'mmdet.ops.nms.nms_ext' extension
    Emitting ninja build file /home/raouf/mmdetection/build/temp.linux-x86_64-3.7/build.ninja...
    Compiling objects...
    Allowing ninja to set a default number of workers... (overridable by setting the environment variable MAX_JOBS=N)
    [1/1] /usr/bin/nvcc -DWITH_CUDA -I/home/raouf/anaconda3/envs/open-mmlab/lib/python3.7/site-packages/torch/include -I/home/raouf/anaconda3/envs/open-mmlab/lib/python3.7/site-packages/torch/include/torch/csrc/api/include -I/home/raouf/anaconda3/envs/open-mmlab/lib/python3.7/site-packages/torch/include/TH -I/home/raouf/anaconda3/envs/open-mmlab/lib/python3.7/site-packages/torch/include/THC -I/home/raouf/anaconda3/envs/open-mmlab/include/python3.7m -c -c /home/raouf/mmdetection/mmdet/ops/nms/src/cuda/nms_kernel.cu -o /home/raouf/mmdetection/build/temp.linux-x86_64-3.7/mmdet/ops/nms/src/cuda/nms_kernel.o -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr --compiler-options ''"'"'-fPIC'"'"'' -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ -DTORCH_API_INCLUDE_EXTENSION_H -DTORCH_EXTENSION_NAME=nms_ext -D_GLIBCXX_USE_CXX11_ABI=0 -gencode=arch=compute_75,code=sm_75 -std=c++14
    FAILED: /home/raouf/mmdetection/build/temp.linux-x86_64-3.7/mmdet/ops/nms/src/cuda/nms_kernel.o
    /usr/bin/nvcc -DWITH_CUDA -I/home/raouf/anaconda3/envs/open-mmlab/lib/python3.7/site-packages/torch/include -I/home/raouf/anaconda3/envs/open-mmlab/lib/python3.7/site-packages/torch/include/torch/csrc/api/include -I/home/raouf/anaconda3/envs/open-mmlab/lib/python3.7/site-packages/torch/include/TH -I/home/raouf/anaconda3/envs/open-mmlab/lib/python3.7/site-packages/torch/include/THC -I/home/raouf/anaconda3/envs/open-mmlab/include/python3.7m -c -c /home/raouf/mmdetection/mmdet/ops/nms/src/cuda/nms_kernel.cu -o /home/raouf/mmdetection/build/temp.linux-x86_64-3.7/mmdet/ops/nms/src/cuda/nms_kernel.o -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr --compiler-options ''"'"'-fPIC'"'"'' -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ -DTORCH_API_INCLUDE_EXTENSION_H -DTORCH_EXTENSION_NAME=nms_ext -D_GLIBCXX_USE_CXX11_ABI=0 -gencode=arch=compute_75,code=sm_75 -std=c++14
    nvcc fatal   : Unsupported gpu architecture 'compute_75'
    ninja: build stopped: subcommand failed.
    Traceback (most recent call last):
      File "/home/raouf/anaconda3/envs/open-mmlab/lib/python3.7/site-packages/torch/utils/cpp_extension.py", line 1423, in _run_ninja_build
        check=True)
      File "/home/raouf/anaconda3/envs/open-mmlab/lib/python3.7/subprocess.py", line 512, in run
        output=stdout, stderr=stderr)
    subprocess.CalledProcessError: Command '['ninja', '-v']' returned non-zero exit status 1.

    During handling of the above exception, another exception occurred:

    Traceback (most recent call last):
      File "<string>", line 1, in <module>
      File "/home/raouf/mmdetection/setup.py", line 304, in <module>
        zip_safe=False)
      File "/home/raouf/anaconda3/envs/open-mmlab/lib/python3.7/site-packages/setuptools/__init__.py", line 161, in setup
        return distutils.core.setup(**attrs)
      File "/home/raouf/anaconda3/envs/open-mmlab/lib/python3.7/distutils/core.py", line 148, in setup
        dist.run_commands()
      File "/home/raouf/anaconda3/envs/open-mmlab/lib/python3.7/distutils/dist.py", line 966, in run_commands
        self.run_command(cmd)
      File "/home/raouf/anaconda3/envs/open-mmlab/lib/python3.7/distutils/dist.py", line 985, in run_command
        cmd_obj.run()
      File "/home/raouf/anaconda3/envs/open-mmlab/lib/python3.7/site-packages/setuptools/command/develop.py", line 38, in run
        self.install_for_development()
      File "/home/raouf/anaconda3/envs/open-mmlab/lib/python3.7/site-packages/setuptools/command/develop.py", line 140, in install_for_development
        self.run_command('build_ext')
      File "/home/raouf/anaconda3/envs/open-mmlab/lib/python3.7/distutils/cmd.py", line 313, in run_command
        self.distribution.run_command(command)
      File "/home/raouf/anaconda3/envs/open-mmlab/lib/python3.7/distutils/dist.py", line 985, in run_command
        cmd_obj.run()
      File "/home/raouf/anaconda3/envs/open-mmlab/lib/python3.7/site-packages/setuptools/command/build_ext.py", line 87, in run
        _build_ext.run(self)
      File "/home/raouf/anaconda3/envs/open-mmlab/lib/python3.7/site-packages/Cython/Distutils/old_build_ext.py", line 186, in run
        _build_ext.build_ext.run(self)
      File "/home/raouf/anaconda3/envs/open-mmlab/lib/python3.7/distutils/command/build_ext.py", line 340, in run
        self.build_extensions()
      File "/home/raouf/anaconda3/envs/open-mmlab/lib/python3.7/site-packages/torch/utils/cpp_extension.py", line 603, in build_extensions
        build_ext.build_extensions(self)
      File "/home/raouf/anaconda3/envs/open-mmlab/lib/python3.7/site-packages/Cython/Distutils/old_build_ext.py", line 195, in build_extensions
        _build_ext.build_ext.build_extensions(self)
      File "/home/raouf/anaconda3/envs/open-mmlab/lib/python3.7/distutils/command/build_ext.py", line 449, in build_extensions
        self._build_extensions_serial()
      File "/home/raouf/anaconda3/envs/open-mmlab/lib/python3.7/distutils/command/build_ext.py", line 474, in _build_extensions_serial
        self.build_extension(ext)
      File "/home/raouf/anaconda3/envs/open-mmlab/lib/python3.7/site-packages/setuptools/command/build_ext.py", line 208, in build_extension
        _build_ext.build_extension(self, ext)
      File "/home/raouf/anaconda3/envs/open-mmlab/lib/python3.7/distutils/command/build_ext.py", line 534, in build_extension
        depends=ext.depends)
      File "/home/raouf/anaconda3/envs/open-mmlab/lib/python3.7/site-packages/torch/utils/cpp_extension.py", line 437, in unix_wrap_ninja_compile
        with_cuda=with_cuda)
      File "/home/raouf/anaconda3/envs/open-mmlab/lib/python3.7/site-packages/torch/utils/cpp_extension.py", line 1163, in _write_ninja_file_and_compile_objects
        error_prefix='Error compiling objects for extension')
      File "/home/raouf/anaconda3/envs/open-mmlab/lib/python3.7/site-packages/torch/utils/cpp_extension.py", line 1436, in _run_ninja_build
        raise RuntimeError(message)
    RuntimeError: Error compiling objects for extension
ERROR: Command errored out with exit status 1: /home/raouf/anaconda3/envs/open-mmlab/bin/python -c 'import sys, setuptools, tokenize; sys.argv[0] = '"'"'/home/raouf/mmdetection/setup.py'"'"'; __file__='"'"'/home/raouf/mmdetection/setup.py'"'"';f=getattr(tokenize, '"'"'open'"'"', open)(__file__);code=f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, __file__, '"'"'exec'"'"'))' develop --no-deps Check the logs for full command output.
Exception information:
Traceback (most recent call last):
  File "/home/raouf/anaconda3/envs/open-mmlab/lib/python3.7/site-packages/pip/_internal/cli/base_command.py", line 188, in _main
    status = self.run(options, args)
  File "/home/raouf/anaconda3/envs/open-mmlab/lib/python3.7/site-packages/pip/_internal/cli/req_command.py", line 185, in wrapper
    return func(self, options, args)
  File "/home/raouf/anaconda3/envs/open-mmlab/lib/python3.7/site-packages/pip/_internal/commands/install.py", line 407, in run
    use_user_site=options.use_user_site,
  File "/home/raouf/anaconda3/envs/open-mmlab/lib/python3.7/site-packages/pip/_internal/req/__init__.py", line 71, in install_given_reqs
    **kwargs
  File "/home/raouf/anaconda3/envs/open-mmlab/lib/python3.7/site-packages/pip/_internal/req/req_install.py", line 790, in install
    unpacked_source_directory=self.unpacked_source_directory,
  File "/home/raouf/anaconda3/envs/open-mmlab/lib/python3.7/site-packages/pip/_internal/operations/install/editable_legacy.py", line 51, in install_editable
    cwd=unpacked_source_directory,
  File "/home/raouf/anaconda3/envs/open-mmlab/lib/python3.7/site-packages/pip/_internal/utils/subprocess.py", line 241, in call_subprocess
    raise InstallationError(exc_msg)
pip._internal.exceptions.InstallationError: Command errored out with exit status 1: /home/raouf/anaconda3/envs/open-mmlab/bin/python -c 'import sys, setuptools, tokenize; sys.argv[0] = '"'"'/home/raouf/mmdetection/setup.py'"'"'; __file__='"'"'/home/raouf/mmdetection/setup.py'"'"';f=getattr(tokenize, '"'"'open'"'"', open)(__file__);code=f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, __file__, '"'"'exec'"'"'))' develop --no-deps Check the logs for full command output.
Removed build tracker: '/tmp/pip-req-tracker-4weaq6b4'
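
The `nvcc fatal : Unsupported gpu architecture 'compute_75'` line above suggests that the `nvcc` picked up at `/usr/bin/nvcc` is older than the first toolkit with Turing support. The following is a minimal sketch of that check, assuming the minimum toolkit versions commonly listed in NVIDIA's support matrix; the table and the helper names (`MIN_CUDA_FOR_ARCH`, `nvcc_supports`) are illustrative and not part of mmdetection or PyTorch.

```python
# Minimum CUDA toolkit release whose nvcc accepts each compute capability.
# Assumption: values follow NVIDIA's published support matrix; verify them
# against the release notes of the toolkit you actually have installed.
MIN_CUDA_FOR_ARCH = {
    "compute_60": (8, 0),   # Pascal
    "compute_61": (8, 0),   # Pascal (e.g. GTX 1080 Ti)
    "compute_70": (9, 0),   # Volta
    "compute_75": (10, 0),  # Turing (e.g. RTX 2080 Ti)
    "compute_80": (11, 0),  # Ampere (e.g. A100)
}


def parse_version(text):
    """Turn a string such as '10.1' into the tuple (10, 1)."""
    major, minor = text.split(".")[:2]
    return int(major), int(minor)


def nvcc_supports(arch, nvcc_version):
    """Return True if an nvcc of the given release can target `arch`."""
    return parse_version(nvcc_version) >= MIN_CUDA_FOR_ARCH[arch]


print(nvcc_supports("compute_75", "9.1"))
print(nvcc_supports("compute_75", "10.1"))
```

If `nvcc --version` reports a release older than 10.0, one likely remedy is to point `CUDA_HOME` and `PATH` at a newer toolkit before rebuilding, since the extension build uses whichever `nvcc` it finds first.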

Bug fix
If you have already identified the reason, you can provide the information here. If you are willing to create a PR to fix it, please also leave a comment here and that would be much appreciated!

Question about how to set finest_scale value?

finest_scale

I noticed that, before the feature extractor in the TSD branch, features are mapped to four levels according to their size. However, finest_scale is set to 56 * [2, 4, 8], which differs considerably from the anchor sizes 8 * [4, 8, 16, 32, 64]. As a result, rois_, delta_c, and delta_r may be assigned to one level according to the size of rois_, while the feature x2 used to compute delta_c and delta_r comes from a different feature level. Will this cause a mismatch in the features?
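
For reference, the size-to-level mapping discussed here can be sketched as follows. This assumes the rule used by mmdetection's `SingleRoIExtractor.map_roi_levels` (level = floor(log2(scale / finest_scale)), clamped to the available levels); whether the TSD head applies exactly the same rule is an assumption, and `map_roi_level` is a hypothetical standalone helper.

```python
import math


def map_roi_level(width, height, finest_scale=56, num_levels=4):
    """Map an RoI of the given size to a pyramid level index.

    Assumed rule: scale = sqrt(w * h); level = floor(log2(scale / finest_scale)),
    clamped to [0, num_levels - 1].
    """
    scale = math.sqrt(width * height)
    level = math.floor(math.log2(scale / finest_scale + 1e-6))
    return min(max(level, 0), num_levels - 1)


# Square RoIs of increasing side length and the level each one lands on.
for side in (32, 111, 112, 224, 448, 512):
    print(side, map_roi_level(side, side))
```

With finest_scale=56 the level boundaries fall at scales of 112, 224, and 448, which is where the 56 * [2, 4, 8] values in the question come from.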

rcnn nms score_thre?

Hello, I noticed that the NMS score_thr of the R-CNN head in TSD is 0.00, whereas the usual setting is 0.05. I could not find a section in the TSD paper that mentions this. Is this a fair comparison?
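
To make the effect of the threshold concrete, here is a small sketch of score filtering. It is a simplification under stated assumptions: real mmdetection post-processing applies the score filter before NMS and the per-image cap after it, and NMS itself is omitted here. The function name `keep_detections` and the scores are made up for illustration.

```python
def keep_detections(detections, score_thr, max_per_img=100):
    """Drop boxes at or below score_thr, then keep the top max_per_img by score."""
    kept = [d for d in detections if d["score"] > score_thr]
    kept.sort(key=lambda d: d["score"], reverse=True)
    return kept[:max_per_img]


# Hypothetical detections for one image (scores are made up).
dets = [{"label": "car", "score": s} for s in (0.92, 0.40, 0.04, 0.01)]

print(len(keep_detections(dets, score_thr=0.05)))
print(len(keep_detections(dets, score_thr=0.0)))
```

A lower threshold only adds low-confidence boxes to the output, so it can change AP-style metrics somewhat; comparing two detectors under the same score_thr removes that variable.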

Should not overwrite two_stage.py

I think you shouldn't simply overwrite two_stage.py; creating a new file such as tsd_two_stage.py would have been better.
Overwriting it is not user-friendly: I did not realize that the base structure differed from the typical Faster R-CNN until an error was thrown.

loss function

The paper uses a multi-term loss:
L = L_rpn + L_cls + L_loc + L^D_cls + L^D_loc + M_cls + M_loc
Is the quality the same as with the standard mmdetection losses, or are the results better with the margin loss and your special IoU loss?
Or are these losses implemented inside TSDSharedFCBBoxHead?
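
As a rough illustration of how such a multi-term loss ends up as one scalar, the sketch below sums every logged entry whose key contains "loss", which is how mmdetection's loss parsing builds the total. The key names follow the TSD training logs; the mapping of paper terms to keys (L^D to loss_TSD_*, M to loss_pc_*) is an assumption, and all numeric values are made up.

```python
def total_loss(log_vars):
    """Sum every entry whose key contains 'loss'; metrics such as 'acc' are skipped."""
    return sum(value for key, value in log_vars.items() if "loss" in key)


# Illustrative values only; they do not come from a real training run.
example = {
    "loss_rpn_cls": 0.01,   # part of L_rpn
    "loss_rpn_bbox": 0.005, # part of L_rpn
    "loss_cls": 0.2,        # L_cls (sibling head)
    "loss_bbox": 0.05,      # L_loc (sibling head)
    "loss_TSD_cls": 0.2,    # assumed to correspond to L^D_cls
    "loss_TSD_bbox": 0.05,  # assumed to correspond to L^D_loc
    "loss_pc_cls": 0.01,    # assumed to correspond to M_cls
    "loss_pc_loc": 0.2,     # assumed to correspond to M_loc
    "acc": 97.5,            # metric, not part of the loss
}

print(round(total_loss(example), 3))
```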

PS: Thank you for publishing this code, especially on top of mmdetection. I love it.

tsd

First of all, thanks to the author for generously sharing this work. I have read your paper, and the results in Table 5 show a clear improvement. I'd like to ask whether TSD also improves results when training on other datasets?

Open-Images datasets

Hi, I'm considering training the model on the OpenImages dataset. I was wondering where I could get the annotations ('challenge-2019-train-detection-bbox.txt' in the configs), since the download link you provided is no longer valid.

RuntimeError: cuda runtime error (11) : invalid argument at mmdet/ops/nms/src/nms_kernel.cu:111

Thanks for your error report and we appreciate it a lot.

Checklist

  1. I have searched related issues but cannot get the expected help.
  2. The bug has not been fixed in the latest version.

Describe the bug
A clear and concise description of what the bug is.

Reproduction

  1. What command or script did you run?
    tools/dist_test.sh configs/TSD_configs/faster_rcnn_r50_fpn_TSD_1x.py work_dirs/tsdfoodlogo/epoch_1.pth 2 --eval mAP

  2. Did you make any modifications on the code or config? Did you understand what you have modified?
    no

  3. What dataset did you use?
    yes

Environment

  4. Please run python mmdet/utils/collect_env.py to collect necessary environment information and paste it here.
    (TSD) root@f3c72eff4988:/workspace/hq/TSD-master# python mmdet/utils/collect_env.py
    sys.platform: linux
    Python: 3.7.10 (default, Feb 26 2021, 18:47:35) [GCC 7.3.0]
    CUDA available: True
    CUDA_HOME: /usr/local/cuda
    NVCC: Cuda compilation tools, release 10.1, V10.1.243
    GPU 0,1: GeForce GTX 1080 Ti
    GCC: gcc (Ubuntu 5.4.0-6ubuntu1~16.04.11) 5.4.0 20160609
    PyTorch: 1.3.1
    PyTorch compiling details: PyTorch built with:

  • GCC 7.3
  • Intel(R) Math Kernel Library Version 2020.0.2 Product Build 20200624 for Intel(R) 64 architecture applications
  • Intel(R) MKL-DNN v0.20.5 (Git Hash 0125f28c61c1f822fd48570b4c1066f96fcb9b2e)
  • OpenMP 201511 (a.k.a. OpenMP 4.5)
  • NNPACK is enabled
  • CUDA Runtime 9.2
  • NVCC architecture flags: -gencode;arch=compute_37,code=sm_37;-gencode;arch=compute_50,code=sm_50;-gencode;arch=compute_60,code=sm_60;-gencode;arch=compute_61,code=sm_61;-gencode;arch=compute_70,code=sm_70;-gencode;arch=compute_37,code=compute_37
  • CuDNN 7.6.3
  • Magma 2.5.1
  • Build settings: BLAS=MKL, BUILD_NAMEDTENSOR=OFF, BUILD_TYPE=Release, CXX_FLAGS= -Wno-deprecated -fvisibility-inlines-hidden -fopenmp -DUSE_FBGEMM -DUSE_QNNPACK -DUSE_PYTORCH_QNNPACK -O2 -fPIC -Wno-narrowing -Wall -Wextra -Wno-missing-field-initializers -Wno-type-limits -Wno-array-bounds -Wno-unknown-pragmas -Wno-sign-compare -Wno-unused-parameter -Wno-unused-variable -Wno-unused-function -Wno-unused-result -Wno-strict-overflow -Wno-strict-aliasing -Wno-error=deprecated-declarations -Wno-stringop-overflow -Wno-error=pedantic -Wno-error=redundant-decls -Wno-error=old-style-cast -fdiagnostics-color=always -faligned-new -Wno-unused-but-set-variable -Wno-maybe-uninitialized -fno-math-errno -fno-trapping-math -Wno-stringop-overflow, DISABLE_NUMA=1, PERF_WITH_AVX=1, PERF_WITH_AVX2=1, PERF_WITH_AVX512=1, USE_CUDA=True, USE_EXCEPTION_PTR=1, USE_GFLAGS=OFF, USE_GLOG=OFF, USE_MKL=ON, USE_MKLDNN=ON, USE_MPI=OFF, USE_NCCL=ON, USE_NNPACK=ON, USE_OPENMP=ON, USE_STATIC_DISPATCH=OFF,

TorchVision: 0.4.2
OpenCV: 4.5.1
MMCV: 0.4.3
MMDetection: 1.1.0+unknown
MMDetection Compiler: GCC 5.4
MMDetection CUDA Compiler: 10.1

  1. You may add additional information that may be helpful for locating the problem, such as
    I use Docker; PyTorch was downloaded from the Tsinghua mirror. For everything else, I followed the instructions in INSTALL above.

Error traceback
(TSD) root@f3c72eff4988:/workspace/hq/TSD-master# tools/dist_test.sh configs/TSD_configs/faster_rcnn_r50_fpn_TSD_1x.py work_dirs/tsdfoodlogo/epoch_1.pth 2 --eval mAP


Setting OMP_NUM_THREADS environment variable for each process to be 1 in default, to avoid your system being overloaded, please further tune the variable for optimal performance in your application as needed.


[ ] 0/29887, elapsed: 0s, ETA:THCudaCheck FAIL file=mmdet/ops/nms/src/nms_kernel.cu line=111 error=11 : invalid argument
THCudaCheck FAIL file=mmdet/ops/nms/src/nms_kernel.cu line=111 error=11 : invalid argument
Traceback (most recent call last):
File "tools/test.py", line 178, in
main()
File "tools/test.py", line 163, in main
outputs, _ = multi_gpu_test(model, data_loader, args.tmpdir, args.gpu_collect)
File "/workspace/hq/TSD-master/mmdet/apis/test.py", line 59, in multi_gpu_test
result = model(return_loss=False, rescale=True, **data)
File "/opt/conda/envs/TSD/lib/python3.7/site-packages/torch/nn/modules/module.py", line 541, in call
result = self.forward(*input, **kwargs)
File "/opt/conda/envs/TSD/lib/python3.7/site-packages/torch/nn/parallel/distributed.py", line 442, in forward
output = self.module(*inputs[0], **kwargs[0])
File "/opt/conda/envs/TSD/lib/python3.7/site-packages/torch/nn/modules/module.py", line 541, in call
result = self.forward(*input, **kwargs)
File "/workspace/hq/TSD-master/mmdet/core/fp16/decorators.py", line 49, in new_func
return old_func(*args, **kwargs)
File "/workspace/hq/TSD-master/mmdet/models/detectors/base.py", line 151, in forward
return self.forward_test(img, img_metas, **kwargs)
File "/workspace/hq/TSD-master/mmdet/models/detectors/base.py", line 132, in forward_test
return self.simple_test(imgs[0], img_metas[0], **kwargs)
File "/workspace/hq/TSD-master/mmdet/models/detectors/two_stage.py", line 378, in simple_test
x, img_metas, proposal_list, self.test_cfg.rcnn, rescale=rescale
File "/workspace/hq/TSD-master/mmdet/models/detectors/two_stage.py", line 362, in tsd_simple_test_bboxes
cfg=rcnn_test_cfg,
File "/workspace/hq/TSD-master/mmdet/core/fp16/decorators.py", line 130, in new_func
return old_func(*args, **kwargs)
File "/workspace/hq/TSD-master/mmdet/models/bbox_heads/bbox_head.py", line 184, in get_det_bboxes
bboxes, scores, cfg.score_thr, cfg.nms, cfg.max_per_img
File "/workspace/hq/TSD-master/mmdet/core/post_processing/bbox_nms.py", line 60, in multiclass_nms
dets, keep = nms_op(torch.cat([bboxes_for_nms, scores[:, None]], 1), **nms_cfg_)
File "/workspace/hq/TSD-master/mmdet/ops/nms/nms_wrapper.py", line 54, in nms
inds = nms_cuda.nms(dets_th, iou_thr)
RuntimeError: cuda runtime error (11) : invalid argument at mmdet/ops/nms/src/nms_kernel.cu:111
Traceback (most recent call last):
File "tools/test.py", line 178, in
main()
File "tools/test.py", line 163, in main
outputs, _ = multi_gpu_test(model, data_loader, args.tmpdir, args.gpu_collect)
File "/workspace/hq/TSD-master/mmdet/apis/test.py", line 59, in multi_gpu_test
result = model(return_loss=False, rescale=True, **data)
File "/opt/conda/envs/TSD/lib/python3.7/site-packages/torch/nn/modules/module.py", line 541, in call
result = self.forward(*input, **kwargs)
File "/opt/conda/envs/TSD/lib/python3.7/site-packages/torch/nn/parallel/distributed.py", line 442, in forward
output = self.module(*inputs[0], **kwargs[0])
File "/opt/conda/envs/TSD/lib/python3.7/site-packages/torch/nn/modules/module.py", line 541, in call
result = self.forward(*input, **kwargs)
File "/workspace/hq/TSD-master/mmdet/core/fp16/decorators.py", line 49, in new_func
return old_func(*args, **kwargs)
File "/workspace/hq/TSD-master/mmdet/models/detectors/base.py", line 151, in forward
return self.forward_test(img, img_metas, **kwargs)
File "/workspace/hq/TSD-master/mmdet/models/detectors/base.py", line 132, in forward_test
return self.simple_test(imgs[0], img_metas[0], **kwargs)
File "/workspace/hq/TSD-master/mmdet/models/detectors/two_stage.py", line 378, in simple_test
x, img_metas, proposal_list, self.test_cfg.rcnn, rescale=rescale
File "/workspace/hq/TSD-master/mmdet/models/detectors/two_stage.py", line 362, in tsd_simple_test_bboxes
cfg=rcnn_test_cfg,
File "/workspace/hq/TSD-master/mmdet/core/fp16/decorators.py", line 130, in new_func
return old_func(*args, **kwargs)
File "/workspace/hq/TSD-master/mmdet/models/bbox_heads/bbox_head.py", line 184, in get_det_bboxes
bboxes, scores, cfg.score_thr, cfg.nms, cfg.max_per_img
File "/workspace/hq/TSD-master/mmdet/core/post_processing/bbox_nms.py", line 60, in multiclass_nms
dets, keep = nms_op(torch.cat([bboxes_for_nms, scores[:, None]], 1), **nms_cfg_)
File "/workspace/hq/TSD-master/mmdet/ops/nms/nms_wrapper.py", line 54, in nms
inds = nms_cuda.nms(dets_th, iou_thr)
RuntimeError: cuda runtime error (11) : invalid argument at mmdet/ops/nms/src/nms_kernel.cu:111
terminate called after throwing an instance of 'std::runtime_error'
what(): NCCL error in: /opt/conda/conda-bld/pytorch_1573049387353/work/torch/lib/c10d/../c10d/NCCLUtils.hpp:84, unhandled cuda error
terminate called after throwing an instance of 'std::runtime_error'
what(): NCCL error in: /opt/conda/conda-bld/pytorch_1573049387353/work/torch/lib/c10d/../c10d/NCCLUtils.hpp:84, unhandled cuda error
Traceback (most recent call last):
File "/opt/conda/envs/TSD/lib/python3.7/runpy.py", line 193, in _run_module_as_main
"main", mod_spec)
File "/opt/conda/envs/TSD/lib/python3.7/runpy.py", line 85, in _run_code
exec(code, run_globals)
File "/opt/conda/envs/TSD/lib/python3.7/site-packages/torch/distributed/launch.py", line 253, in
main()
File "/opt/conda/envs/TSD/lib/python3.7/site-packages/torch/distributed/launch.py", line 249, in main
cmd=cmd)
subprocess.CalledProcessError: Command '['/opt/conda/envs/TSD/bin/python', '-u', 'tools/test.py', '--local_rank=1', 'configs/TSD_configs/faster_rcnn_r50_fpn_TSD_1x.py', 'work_dirs/tsdfoodlogo/epoch_1.pth', '--launcher', 'pytorch', '--eval', 'mAP']' died with <Signals.SIGABRT: 6>.

Does anyone know where the problem is? Thank you all!
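
One detail in the environment report above is that PyTorch was built with CUDA Runtime 9.2 while the extensions were compiled with NVCC 10.1. A mismatch like this is a plausible, but unconfirmed, cause of `invalid argument` errors in custom CUDA ops. The sketch below only parses the two version strings out of a `collect_env.py`-style dump; the helper name `cuda_versions` is hypothetical and the regular expressions assume the line formats shown in this report.

```python
import re


def cuda_versions(env_text):
    """Extract (PyTorch CUDA runtime, nvcc release) from an environment dump."""
    runtime = re.search(r"CUDA Runtime (\d+\.\d+)", env_text)
    nvcc = re.search(r"release (\d+\.\d+)", env_text)
    return (
        runtime.group(1) if runtime else None,
        nvcc.group(1) if nvcc else None,
    )


# Lines copied from the environment report in this issue.
ENV = """NVCC: Cuda compilation tools, release 10.1, V10.1.243
PyTorch: 1.3.1
  • CUDA Runtime 9.2
"""

runtime, nvcc = cuda_versions(ENV)
if runtime != nvcc:
    print(f"PyTorch CUDA runtime {runtime} differs from nvcc {nvcc}")
```

If the two differ, installing a PyTorch build that matches the local toolkit (or the reverse) and recompiling the mmdet ops is a reasonable first thing to try.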

Bug fix
If you have already identified the reason, you can provide the information here. If you are willing to create a PR to fix it, please also leave a comment here and that would be much appreciated!

DDP problem when migrating TSD to mmdet2.0

I am trying to migrate TSD to mmdet2.0. Everything works when training faster_rcnn_TSD on a single GPU.
When I run TSD with DDP, an error occurs. It is similar to the error in #2153.

I have tried setting find_unused_parameters=True in DDP. This removes the error, but the program then hangs.
Does anyone have any suggestions?

Traceback (most recent call last):
File "./tools/train.py", line 178, in
main()
File "./tools/train.py", line 167, in main
train_detector(
File "/home/zhaoxin/workspace/mmdetection/mmdet/apis/train.py", line 150, in train_detector
runner.run(data_loaders, cfg.workflow, cfg.total_epochs)
File "/home/zhaoxin/tools/miniconda3/envs/torch1.6/lib/python3.8/site-packages/mmcv/runner/epoch_based_runner.py", line 125, in run
epoch_runner(data_loaders[i], **kwargs)
File "/home/zhaoxin/tools/miniconda3/envs/torch1.6/lib/python3.8/site-packages/mmcv/runner/epoch_based_runner.py", line 50, in train
self.run_iter(data_batch, train_mode=True)
File "/home/zhaoxin/tools/miniconda3/envs/torch1.6/lib/python3.8/site-packages/mmcv/runner/epoch_based_runner.py", line 29, in run_iter
outputs = self.model.train_step(data_batch, self.optimizer,
File "/home/zhaoxin/tools/miniconda3/envs/torch1.6/lib/python3.8/site-packages/mmcv/parallel/distributed.py", line 49, in train_step
self.reducer.prepare_for_backward([])
RuntimeError: Expected to have finished reduction in the prior iteration before starting a new one. This error indicates that your module has parameters that were not used in producing loss. You can enable unused parameter detection by (1) passing the keyword argument find_unused_parameters=True to torch.nn.parallel.DistributedDataParallel; (2) making sure all forward function outputs participate in calculating loss. If you already have done the above two steps, then the distributed data parallel module wasn't able to locate the output tensors in the return value of your module's forward function. Please include the loss function and the structure of the return value of forward of your module when reporting this issue (e.g. list, dict, iterable).

Some questions about TSD_cls and TSD_bbox

There are two branches in the framework: TSD and the sibling head. On my own dataset, the TSD branch performs better, but the sibling head does not perform well. What are the possible reasons for this? In addition, delta_c should act on TSD_cls, but I could not find the corresponding code.

How to illustrate the heatmap thermal diagram?

I have read the paper "Revisiting the Sibling Head in Object Detector". It illustrates sensitivity heat maps for localization and classification to show the spatial misalignment problem. However, the paper does not explain how to obtain the sensitivity value at each location. Can anyone tell me the specific method for computing the sensitivity at each location?
Thanks.

inconsistency of mmcv or mmdetection?

I ran into this problem when running demo/inference_demo.ipynb. It seems to be caused by mismatched versions of mmcv and mmdetection. Everything works if I switch back to the mmdetection config file ('faster_rcnn/faster_rcnn_r50_fpn_1x_coco.py') with the weights ('faster_rcnn_r50_fpn_1x_coco_20200130-047c8118.pth').
mmdetection==2.2.0+977dacb and mmcv==1.0rc were compiled from source following the mmdetection homepage.
Which version of mmcv and mmdetection should I use?

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-7-8ef91ae9ee21> in <module>
      1 # build the model from a config file and a checkpoint file
----> 2 model = init_detector(str(config_file), str(checkpoint_file), device='cuda:0')

~/Softwares/mmdetection/mmdet/apis/inference.py in init_detector(config, checkpoint, device)
     31                         f'but got {type(config)}')
     32     config.model.pretrained = None
---> 33     model = build_detector(config.model, test_cfg=config.test_cfg)
     34     if checkpoint is not None:
     35         checkpoint = load_checkpoint(model, checkpoint)

~/Softwares/mmdetection/mmdet/models/builder.py in build_detector(cfg, train_cfg, test_cfg)
     65 def build_detector(cfg, train_cfg=None, test_cfg=None):
     66     """Build detector."""
---> 67     return build(cfg, DETECTORS, dict(train_cfg=train_cfg, test_cfg=test_cfg))

~/Softwares/mmdetection/mmdet/models/builder.py in build(cfg, registry, default_args)
     30         return nn.Sequential(*modules)
     31     else:
---> 32         return build_from_cfg(cfg, registry, default_args)
     33 
     34 

~/Softwares/mmcv/mmcv/utils/registry.py in build_from_cfg(cfg, registry, default_args)
    165         for name, value in default_args.items():
    166             args.setdefault(name, value)
--> 167     return obj_cls(**args)

TypeError: __init__() got an unexpected keyword argument 'bbox_roi_extractor'
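
The `unexpected keyword argument 'bbox_roi_extractor'` error is what one would expect when a config written for mmdetection 1.x is built by mmdetection 2.x, where the RoI extractor and bbox head are nested under a `roi_head` entry instead of sitting at the top level of the model config. The sketch below only illustrates that difference in key layout. It is not a working migration: TSD would still need its own RoI head implemented and registered for 2.x, `StandardRoIHead` is used purely as a placeholder type, and `to_roi_head_layout` is a hypothetical helper.

```python
def to_roi_head_layout(model_cfg):
    """Move 1.x-style top-level RoI keys under a 2.x-style 'roi_head' dict."""
    cfg = dict(model_cfg)  # shallow copy so the input is left untouched
    roi_head = {"type": "StandardRoIHead"}  # placeholder, not a TSD head
    for key in ("bbox_roi_extractor", "bbox_head",
                "mask_roi_extractor", "mask_head"):
        if key in cfg:
            roi_head[key] = cfg.pop(key)
    if len(roi_head) > 1:
        cfg["roi_head"] = roi_head
    return cfg


# Trimmed-down 1.x-style model config (most fields omitted for brevity).
v1_cfg = {
    "type": "FasterRCNN",
    "bbox_roi_extractor": {"type": "SingleRoIExtractor"},
    "bbox_head": {"type": "TSDSharedFCBBoxHead"},
}

v2_cfg = to_roi_head_layout(v1_cfg)
print(sorted(v2_cfg))
print(sorted(v2_cfg["roi_head"]))
```

In practice the simpler route is to install the mmdetection and mmcv versions that the TSD repository was written against, rather than running its configs under 2.x.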

How to use it with A100 GPUs?

I have tried four environments to use it with A100 GPUs, but all of them failed.

(1)
a100
torch1.4.0+cudatoolkit10.0
nccl2+cuda10.0
mmcv 0.4.4
gcc7
ERROR
(2)
a100
torch1.4.0+cudatoolkit10.1
nccl2+cuda10.1
mmcv 0.4.4
gcc7
ERROR
(3)
dockerfile that you provided
mmcv-0.4.4
(4)
py3.6 + torch1.7 + cuda11.0 + gcc7 + nccl2.7.3

I have spent a lot of time on the environment setup, but it seems that TSD is currently out of date:
A100 needs cuda11.0 and pytorch>=1.7
TSD needs mmcv0.4.4 + mmdet1.0 + pytorch<=1.4

Could anyone share a TSD + mmdet2.x implementation with me? Thanks a lot!

An error occurred while using fp16

Traceback (most recent call last):
File "tools/train.py", line 151, in
main()
File "tools/train.py", line 147, in main
meta=meta)
File "/cache/user-job-dir/codes/TSD-master/mmdet/apis/train.py", line 165, in train_detector
runner.run(data_loaders, cfg.workflow, cfg.total_epochs)
File "/home/work/anaconda/lib/python3.6/site-packages/mmcv/runner/runner.py", line 380, in run
epoch_runner(data_loaders[i], **kwargs)
File "/home/work/anaconda/lib/python3.6/site-packages/mmcv/runner/runner.py", line 278, in train
self.model, data_batch, train_mode=True, **kwargs)
File "/cache/user-job-dir/codes/TSD-master/mmdet/apis/train.py", line 75, in batch_processor
losses = model(**data)
File "/home/work/anaconda/lib/python3.6/site-packages/torch/nn/modules/module.py", line 493, in call
result = self.forward(*input, **kwargs)
File "/home/work/anaconda/lib/python3.6/site-packages/torch/nn/parallel/data_parallel.py", line 150, in forward
return self.module(*inputs[0], **kwargs[0])
File "/home/work/anaconda/lib/python3.6/site-packages/torch/nn/modules/module.py", line 493, in call
result = self.forward(*input, **kwargs)
File "/cache/user-job-dir/codes/TSD-master/mmdet/core/fp16/decorators.py", line 75, in new_func
output = old_func(*new_args, **new_kwargs)
File "/cache/user-job-dir/codes/TSD-master/mmdet/models/detectors/base.py", line 147, in forward
return self.forward_train(img, img_metas, **kwargs)
File "/cache/user-job-dir/codes/TSD-master/mmdet/models/detectors/two_stage.py", line 218, in forward_train
cls_score, bbox_pred, TSD_cls_score, TSD_bbox_pred, delta_c, delta_r = self.bbox_head(bbox_feats, x[:self.bbox_roi_extractor.num_inputs], rois)
File "/home/work/anaconda/lib/python3.6/site-packages/torch/nn/modules/module.py", line 493, in call
result = self.forward(*input, **kwargs)
File "/cache/user-job-dir/codes/TSD-master/mmdet/models/bbox_heads/tsd_bbox_head.py", line 245, in forward
tsd_feats_cls = self.align_pooling_pc[i](feats[i], rois_, delta_c_)
File "/home/work/anaconda/lib/python3.6/site-packages/torch/nn/modules/module.py", line 493, in call
result = self.forward(*input, **kwargs)
File "/cache/user-job-dir/codes/TSD-master/mmdet/ops/dcn/deform_pool.py", line 341, in forward
self.trans_std)
File "/cache/user-job-dir/codes/TSD-master/mmdet/ops/dcn/deform_pool.py", line 50, in forward
ctx.part_size, ctx.sample_per_part, ctx.trans_std)
RuntimeError: expected scalar type Half but found Float (data<c10::Half> at /home/work/anaconda/lib/python3.6/site-packages/torch/include/ATen/core/TensorMethods.h:1386)
frame #0: std::function<std::string ()>::operator()() const + 0x11 (0x7f532d3d4441 in /home/work/anaconda/lib/python3.6/site-packages/torch/lib/libc10.so)
frame #1: c10::Error::Error(c10::SourceLocation, std::string const&) + 0x2a (0x7f532d3d3d7a in /home/work/anaconda/lib/python3.6/site-packages/torch/lib/libc10.so)
frame #2: c10::Half* at::Tensor::data<c10::Half>() const + 0xcf (0x7f524f2b135f in /cache/user-job-dir/codes/TSD-master/mmdet/ops/dcn/deform_pool_cuda.cpython-36m-x86_64-linux-gnu.so)
frame #3: + 0x1aed4 (0x7f524f2aced4 in /cache/user-job-dir/codes/TSD-master/mmdet/ops/dcn/deform_pool_cuda.cpython-36m-x86_64-linux-gnu.so)
frame #4: + 0x1b681 (0x7f524f2ad681 in /cache/user-job-dir/codes/TSD-master/mmdet/ops/dcn/deform_pool_cuda.cpython-36m-x86_64-linux-gnu.so)
frame #5: DeformablePSROIPoolForward(at::Tensor, at::Tensor, at::Tensor, at::Tensor, at::Tensor, int, int, int, int, int, int, int, float, int, int, int, int, int, float) + 0x1aa (0x7f524f2ad944 in /cache/user-job-dir/codes/TSD-master/mmdet/ops/dcn/deform_pool_cuda.cpython-36m-x86_64-linux-gnu.so)
frame #6: deform_psroi_pooling_cuda_forward(at::Tensor, at::Tensor, at::Tensor, at::Tensor, at::Tensor, int, float, int, int, int, int, int, float) + 0x202 (0x7f524f29e892 in /cache/user-job-dir/codes/TSD-master/mmdet/ops/dcn/deform_pool_cuda.cpython-36m-x86_64-linux-gnu.so)
frame #7: + 0x1952d (0x7f524f2ab52d in /cache/user-job-dir/codes/TSD-master/mmdet/ops/dcn/deform_pool_cuda.cpython-36m-x86_64-linux-gnu.so)
frame #8: + 0x197ee (0x7f524f2ab7ee in /cache/user-job-dir/codes/TSD-master/mmdet/ops/dcn/deform_pool_cuda.cpython-36m-x86_64-linux-gnu.so)
frame #9: + 0x160a5 (0x7f524f2a80a5 in /cache/user-job-dir/codes/TSD-master/mmdet/ops/dcn/deform_pool_cuda.cpython-36m-x86_64-linux-gnu.so)

frame #16: THPFunction_apply(_object*, _object*) + 0x6b1 (0x7f532dbb2481 in /home/work/anaconda/lib/python3.6/site-packages/torch/lib/libtorch_python.so)

It seems that the mmcv version does not match.

Hello, we use the command python train.py ../configs/faster_rcnn_r152_fpn_TSD_1x.py --work_dir exp/TSD_r152/ --validate to train the network, and it throws TypeError: __init__() got an unexpected keyword argument 'bbox_roi_extractor'

python train.py ../configs/faster_rcnn_r152_fpn_TSD_1x.py --work_dir exp/TSD_r152/ --validate
2020-07-15 13:31:48,830 - mmdet - INFO - Environment info:

sys.platform: linux
Python: 3.7.5 (default, Oct 25 2019, 15:51:11) [GCC 7.3.0]
CUDA available: True
CUDA_HOME: /usr/local/cuda-10.1
NVCC: Cuda compilation tools, release 10.1, V10.1.243
GPU 0: GeForce RTX 2080 Ti
GCC: gcc (Ubuntu 7.5.0-3ubuntu1~18.04) 7.5.0
PyTorch: 1.5.0
PyTorch compiling details: PyTorch built with:

  • GCC 7.3
  • C++ Version: 201402
  • Intel(R) Math Kernel Library Version 2019.0.5 Product Build 20190808 for Intel(R) 64 architecture applications
  • Intel(R) MKL-DNN v0.21.1 (Git Hash 7d2fd500bc78936d1d648ca713b901012f470dbc)
  • OpenMP 201511 (a.k.a. OpenMP 4.5)
  • NNPACK is enabled
  • CPU capability usage: AVX2
  • CUDA Runtime 10.2
  • NVCC architecture flags: -gencode;arch=compute_37,code=sm_37;-gencode;arch=compute_50,code=sm_50;-gencode;arch=compute_60,code=sm_60;-gencode;arch=compute_61,code=sm_61;-gencode;arch=compute_70,code=sm_70;-gencode;arch=compute_75,code=sm_75;-gencode;arch=compute_37,code=compute_37
  • CuDNN 7.6.5
  • Magma 2.5.2
  • Build settings: BLAS=MKL, BUILD_TYPE=Release, CXX_FLAGS= -Wno-deprecated -fvisibility-inlines-hidden -fopenmp -DNDEBUG -DUSE_FBGEMM -DUSE_QNNPACK -DUSE_PYTORCH_QNNPACK -DUSE_XNNPACK -DUSE_INTERNAL_THREADPOOL_IMPL -O2 -fPIC -Wno-narrowing -Wall -Wextra -Werror=return-type -Wno-missing-field-initializers -Wno-type-limits -Wno-array-bounds -Wno-unknown-pragmas -Wno-sign-compare -Wno-unused-parameter -Wno-unused-variable -Wno-unused-function -Wno-unused-result -Wno-strict-overflow -Wno-strict-aliasing -Wno-error=deprecated-declarations -Wno-stringop-overflow -Wno-error=pedantic -Wno-error=redundant-decls -Wno-error=old-style-cast -fdiagnostics-color=always -faligned-new -Wno-unused-but-set-variable -Wno-maybe-uninitialized -fno-math-errno -fno-trapping-math -Werror=format -Wno-stringop-overflow, PERF_WITH_AVX=1, PERF_WITH_AVX2=1, PERF_WITH_AVX512=1, USE_CUDA=ON, USE_EXCEPTION_PTR=1, USE_GFLAGS=OFF, USE_GLOG=OFF, USE_MKL=ON, USE_MKLDNN=ON, USE_MPI=OFF, USE_NCCL=ON, USE_NNPACK=ON, USE_OPENMP=ON, USE_STATIC_DISPATCH=OFF,

TorchVision: 0.6.0
OpenCV: 4.2.0
MMCV: 1.0.2
MMDetection: 2.3.0rc0+unknown
MMDetection Compiler: GCC 7.3
MMDetection CUDA Compiler: 10.1

2020-07-15 13:31:48,831 - mmdet - INFO - Distributed training: False
2020-07-15 13:31:48,831 - mmdet - INFO - Config:
/home/along/lmf_workspace/mmdetection/TSD-master/configs/faster_rcnn_r152_fpn_TSD_1x.py

# model settings

model = dict(
type='FasterRCNN',
pretrained='torchvision://resnet152',
backbone=dict(
type='ResNet',
depth=152,
num_stages=4,
out_indices=(0, 1, 2, 3),
frozen_stages=1,
norm_cfg=dict(type='BN', requires_grad=True),
style='pytorch'),
neck=dict(
type='FPN',
in_channels=[256, 512, 1024, 2048],
out_channels=256,
num_outs=5),
rpn_head=dict(
type='RPNHead',
in_channels=256,
feat_channels=256,
anchor_scales=[8],
anchor_ratios=[0.5, 1.0, 2.0],
anchor_strides=[4, 8, 16, 32, 64],
target_means=[.0, .0, .0, .0],
target_stds=[1.0, 1.0, 1.0, 1.0],
loss_cls=dict(
type='CrossEntropyLoss', use_sigmoid=True, loss_weight=1.0),
loss_bbox=dict(type='SmoothL1Loss', beta=1.0 / 9.0, loss_weight=1.0)),
bbox_roi_extractor=dict(
type='SingleRoIExtractor',
roi_layer=dict(type='RoIAlign', out_size=7, sample_num=2),
out_channels=256,
featmap_strides=[4, 8, 16, 32]),
bbox_head=dict(
type='TSDSharedFCBBoxHead',
featmap_strides=[4, 8, 16, 32],
num_fcs=2,
in_channels=256,
fc_out_channels=1024,
roi_feat_size=7,
num_classes=81,
cls_pc_margin=0.3,
loc_pc_margin=0.3,
target_means=[0., 0., 0., 0.],
target_stds=[0.1, 0.1, 0.2, 0.2],
reg_class_agnostic=False,
loss_cls=dict(
type='CrossEntropyLoss', use_sigmoid=False, loss_weight=1.0),
loss_bbox=dict(type='SmoothL1Loss', beta=1.0, loss_weight=1.0)))

# model training and testing settings

train_cfg = dict(
rpn=dict(
assigner=dict(
type='MaxIoUAssigner',
pos_iou_thr=0.7,
neg_iou_thr=0.3,
min_pos_iou=0.3,
ignore_iof_thr=-1),
sampler=dict(
type='RandomSampler',
num=256,
pos_fraction=0.5,
neg_pos_ub=-1,
add_gt_as_proposals=False),
allowed_border=0,
pos_weight=-1,
debug=False),
rpn_proposal=dict(
nms_across_levels=False,
nms_pre=2000,
nms_post=2000,
max_num=2000,
nms_thr=0.7,
min_bbox_size=0),
rcnn=dict(
assigner=dict(
type='MaxIoUAssigner',
pos_iou_thr=0.5,
neg_iou_thr=0.5,
min_pos_iou=0.5,
ignore_iof_thr=-1),
sampler=dict(
type='RandomSampler',
num=512,
pos_fraction=0.25,
neg_pos_ub=-1,
add_gt_as_proposals=True),
pos_weight=-1,
debug=False))
test_cfg = dict(
rpn=dict(
nms_across_levels=False,
nms_pre=1000,
nms_post=1000,
max_num=1000,
nms_thr=0.7,
min_bbox_size=0),
rcnn=dict(
score_thr=0.00, nms=dict(type='nms', iou_thr=0.5), max_per_img=100)
# soft-nms is also supported for rcnn testing
# e.g., nms=dict(type='soft_nms', iou_thr=0.5, min_score=0.05)
)

# dataset settings

dataset_type = 'CocoDataset'
data_root = 'data/coco/'
img_norm_cfg = dict(
mean=[123.675, 116.28, 103.53], std=[58.395, 57.12, 57.375], to_rgb=True)
train_pipeline = [
dict(type='LoadImageFromFile'),
dict(type='LoadAnnotations', with_bbox=True),
dict(type='Resize', img_scale=(1333, 800), keep_ratio=True),
dict(type='RandomFlip', flip_ratio=0.5),
dict(type='Normalize', **img_norm_cfg),
dict(type='Pad', size_divisor=32),
dict(type='DefaultFormatBundle'),
dict(type='Collect', keys=['img', 'gt_bboxes', 'gt_labels']),
]
test_pipeline = [
dict(type='LoadImageFromFile'),
dict(
type='MultiScaleFlipAug',
img_scale=(1333, 800),
flip=False,
transforms=[
dict(type='Resize', keep_ratio=True),
dict(type='RandomFlip'),
dict(type='Normalize', **img_norm_cfg),
dict(type='Pad', size_divisor=32),
dict(type='ImageToTensor', keys=['img']),
dict(type='Collect', keys=['img']),
])
]
data = dict(
imgs_per_gpu=2,
workers_per_gpu=2,
train=dict(
type=dataset_type,
ann_file=data_root + 'annotations/instances_train2017.json',
img_prefix=data_root + 'train2017/',
pipeline=train_pipeline),
val=dict(
type=dataset_type,
ann_file=data_root + 'annotations/instances_val2017.json',
img_prefix=data_root + 'val2017/',
pipeline=test_pipeline),
test=dict(
type=dataset_type,
ann_file=data_root + 'annotations/instances_val2017.json',
img_prefix=data_root + 'val2017/',
pipeline=test_pipeline))
evaluation = dict(interval=1, metric='bbox')

# optimizer

optimizer = dict(type='SGD', lr=0.04, momentum=0.9, weight_decay=0.0001)
optimizer_config = dict(grad_clip=dict(max_norm=35, norm_type=2))

# learning policy

lr_config = dict(
policy='step',
warmup='linear',
warmup_iters=3600,
warmup_ratio=1.0 / 32,
step=[9, 12])
checkpoint_config = dict(interval=1)

# yapf:disable

log_config = dict(
interval=50,
hooks=[
dict(type='TextLoggerHook'),
# dict(type='TensorboardLoggerHook')
])

# yapf:enable

# runtime settings

total_epochs = 14
dist_params = dict(backend='nccl')
log_level = 'INFO'
work_dir = './work_dirs/faster_rcnn_r50_fpn_1x'
load_from = None
resume_from = None
workflow = [('train', 1)]

Traceback (most recent call last):
File "train.py", line 151, in <module>
main()
File "train.py", line 124, in main
cfg.model, train_cfg=cfg.train_cfg, test_cfg=cfg.test_cfg)
File "/home/along/lmf_workspace/mmdetection/mmdetection-master/mmdet/models/builder.py", line 67, in build_detector
return build(cfg, DETECTORS, dict(train_cfg=train_cfg, test_cfg=test_cfg))
File "/home/along/lmf_workspace/mmdetection/mmdetection-master/mmdet/models/builder.py", line 32, in build
return build_from_cfg(cfg, registry, default_args)
File "/home/along/anaconda3/envs/lmf/lib/python3.7/site-packages/mmcv/utils/registry.py", line 167, in build_from_cfg
return obj_cls(**args)
TypeError: __init__() got an unexpected keyword argument 'bbox_roi_extractor'

TypeError: FasterRCNN: __init__() got an unexpected keyword argument 'bbox_roi_extractor'

Hi,

I am trying to run TSD inside a Docker container using the following Python script (/lh is mapped to a directory on the host):

from mmdet.apis import init_detector, inference_detector
import mmcv
import sys

config_file = '/lh/r50-FPN-1x_classsampling_TSD.py'
checkpoint_file = '/lh/r50-FPN-1x_classsampling_TSD.pth'

# build the model from a config file and a checkpoint file
model = init_detector(config_file, checkpoint_file, device='cuda:0')

# test a video and show the results
video = mmcv.VideoReader(sys.argv[1])
for frame in video:
    result = inference_detector(model, frame)
    model.show_result(frame, result, wait_time=1, out_file='/lh/out.jpg')
    break #just a test

I get the following:

Traceback (most recent call last):
  File "/opt/conda/lib/python3.7/site-packages/mmcv/utils/registry.py", line 52, in build_from_cfg
    return obj_cls(**args)
TypeError: __init__() got an unexpected keyword argument 'bbox_roi_extractor'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/lh/inf.py", line 12, in <module>
    model = init_detector(config_file, checkpoint_file, device='cuda:0')
  File "/mmdetection/mmdet/apis/inference.py", line 40, in init_detector
    model = build_detector(config.model, test_cfg=config.get('test_cfg'))
  File "/mmdetection/mmdet/models/builder.py", line 59, in build_detector
    cfg, default_args=dict(train_cfg=train_cfg, test_cfg=test_cfg))
  File "/opt/conda/lib/python3.7/site-packages/mmcv/utils/registry.py", line 212, in build
    return self.build_func(*args, **kwargs, registry=self)
  File "/opt/conda/lib/python3.7/site-packages/mmcv/cnn/builder.py", line 27, in build_model_from_cfg
    return build_from_cfg(cfg, registry, default_args)
  File "/opt/conda/lib/python3.7/site-packages/mmcv/utils/registry.py", line 55, in build_from_cfg
    raise type(e)(f'{obj_cls.__name__}: {e}')
TypeError: FasterRCNN: __init__() got an unexpected keyword argument 'bbox_roi_extractor'

This is using MMDetection 2.20. The Dockerfile is:

ARG PYTORCH="1.6.0"
ARG CUDA="10.1"
ARG CUDNN="7"

FROM pytorch/pytorch:${PYTORCH}-cuda${CUDA}-cudnn${CUDNN}-devel

ENV TORCH_CUDA_ARCH_LIST="6.0 6.1 7.0+PTX"
ENV TORCH_NVCC_FLAGS="-Xfatbin -compress-all"
ENV CMAKE_PREFIX_PATH="$(dirname $(which conda))/../"

RUN apt-get update && apt-get install -y ffmpeg libsm6 libxext6 git ninja-build libglib2.0-0 libsm6 libxrender-dev libxext6 \
    && apt-get clean \
    && rm -rf /var/lib/apt/lists/*

# Install MMCV
RUN pip install mmcv-full==1.3.17 -f https://download.openmmlab.com/mmcv/dist/cu101/torch1.6.0/index.html

# Install MMDetection
RUN conda clean --all
RUN git clone https://github.com/open-mmlab/mmdetection.git /mmdetection
WORKDIR /mmdetection
ENV FORCE_CUDA="1"
RUN pip install -r requirements/build.txt
RUN pip install --no-cache-dir -e .

Any help will be greatly appreciated!
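
Both tracebacks above look like a 1.x-style config being passed to an MMDetection 2.x install. In 2.x, two-stage detectors such as `FasterRCNN` take `bbox_roi_extractor` and `bbox_head` nested under a `roi_head` entry rather than as top-level keys of `model`. `TSDSharedFCBBoxHead` also appears to live in the TSD repository's own copy of `mmdet`, so moving the keys alone would probably not be enough on an upstream install; building and running the code shipped in the TSD repository is likely the safer route. The sketch below only detects the mismatch. `LEGACY_TOP_LEVEL_KEYS` and `find_legacy_keys` are hypothetical names used for illustration, not part of MMDetection:

```python
# Keys that 1.x-style two-stage configs place directly under `model`.
# Assumption: MMDetection 2.x expects these nested under `roi_head`.
# Note: `bbox_head` is still a top-level key for single-stage detectors,
# so this check is meant for two-stage configs only.
LEGACY_TOP_LEVEL_KEYS = (
    "bbox_roi_extractor",
    "bbox_head",
    "mask_roi_extractor",
    "mask_head",
)


def find_legacy_keys(model_cfg):
    """Return the 1.x-style keys found at the top level of a model config."""
    return [key for key in LEGACY_TOP_LEVEL_KEYS if key in model_cfg]


if __name__ == "__main__":
    # Trimmed-down version of the config from the log above.
    model = dict(
        type="FasterRCNN",
        bbox_roi_extractor=dict(type="SingleRoIExtractor"),
        bbox_head=dict(type="TSDSharedFCBBoxHead"),
    )
    legacy = find_legacy_keys(model)
    if legacy:
        print("1.x-style keys at top level:", ", ".join(legacy))
```

If the check reports any keys, the config and the installed `mmdet` come from different major versions.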

About the confidence threshold

Hello, I have a question about the confidence threshold. I see that the threshold used to visualize the bboxes and masks is 0.3, as follows:
`# TODO: merge this method with the one in BaseDetector
def show_result(img,
                result,
                class_names,
                score_thr=0.3,
                wait_time=0,
                show=True,
                out_file=None):
    """Visualize the detection results on the image.

    Args:
        img (str or np.ndarray): Image filename or loaded image.
        result (tuple[list] or list): The detection result, can be either
            (bbox, segm) or just bbox.
        class_names (list[str] or tuple[str]): A list of class names.
        score_thr (float): The threshold to visualize the bboxes and masks.
        wait_time (int): Value of waitKey param.
        show (bool, optional): Whether to show the image with opencv or not.
        out_file (str, optional): If specified, the visualization result will
            be written to the out file instead of shown in a window.`

But during evaluation it is 0.00:

test_cfg = dict(
    rpn=dict(
        nms_across_levels=False,
        nms_pre=1000,
        nms_post=1000,
        max_num=1000,
        nms_thr=0.7,
        min_bbox_size=0),
    rcnn=dict(
        score_thr=0.00, nms=dict(type='nms', iou_thr=0.5), max_per_img=100)
    # soft-nms is also supported for rcnn testing
    # e.g., nms=dict(type='soft_nms', iou_thr=0.5, min_score=0.05)
)

The confidence threshold in some other models is different, so I want to ask: are there any rules for choosing the confidence threshold, or can we just set it manually ourselves?
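
The two thresholds serve different purposes. COCO-style AP is computed over the whole ranked list of detections, so evaluation configs usually keep `score_thr` at or near 0 and let the metric sweep over confidences, while `show_result` drops low-scoring boxes only to keep the picture readable. The value used for visualization or deployment can be chosen by hand for the application. A minimal sketch in plain Python, assuming each detection is an `(x1, y1, x2, y2, score)` tuple; `filter_by_score` is a hypothetical helper, not an MMDetection function:

```python
def filter_by_score(detections, score_thr):
    """Keep detections whose score (last element) is strictly above score_thr."""
    return [det for det in detections if det[-1] > score_thr]


if __name__ == "__main__":
    # Made-up detections in (x1, y1, x2, y2, score) form, for illustration only.
    detections = [
        (10, 20, 110, 220, 0.92),
        (15, 25, 100, 200, 0.35),
        (300, 40, 380, 90, 0.08),
    ]
    # Visualization-style threshold: only confident boxes are drawn.
    print(len(filter_by_score(detections, 0.3)))  # 2
    # Evaluation-style threshold: every box is kept for the AP computation.
    print(len(filter_by_score(detections, 0.0)))  # 3
```

Raising the threshold trades recall for precision, which is why a single value rarely suits both evaluation and display.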
