defeat.pytorch's Introduction

DeFeat.pytorch Code Base

Implementation of our CVPR2021 paper Distilling Object Detectors via Decoupled Features

Abstract

Knowledge distillation is a widely used paradigm for inheriting information from a complicated teacher network to a compact student network and maintaining the strong performance. Different from image classification, object detectors are much more sophisticated with multiple loss functions in which features that semantic information rely on are tangled. In this paper, we point out that the information of features derived from regions excluding objects are also essential for distilling the student detector, which is usually ignored in existing approaches. In addition, we elucidate that features from different regions should be assigned with different importance during distillation. To this end, we present a novel distillation algorithm via decoupled features (DeFeat) for learning a better student detector. Specifically, two levels of decoupled features will be processed for embedding useful information into the student, i.e., decoupled features from neck and decoupled proposals from classification head. Extensive experiments on various detectors with different backbones show that the proposed DeFeat is able to surpass the state-of-the-art distillation methods for object detection. For example, DeFeat improves ResNet50 based Faster R-CNN from 37.4% to 40.9% mAP, and improves ResNet50 based RetinaNet from 36.5% to 39.7% mAP on COCO benchmark.

Environments

Python 3.7
MMDetection 2.x
This repo uses: mmdet-v2.0 mmcv-0.5.6 cuda 10.1

VOC Results

Notes:

Faster RCNN based model
Batch: sample_per_gpu x gpu_num

Model	BN	Grad clip	Batch	Lr schd	box AP	Log
R101	bn	None	8x2	0.01	81.70
R101	bn	None	8x2	0.02	82.27	GoogleDrive
R101	syncbn	max=35	8x2	0.01	81.59
R101	syncbn	None	8x2	0.02	81.83
R50	bn	max=35	8x2	0.02	80.97
R50	syncbn	None	8x2	0.02	80.76
R50	syncbn	max=35	8x2	0.01	80.66
R50	bn	None	8x2	0.01	80.52	GoogleDrive
R101-50-FGFI-w1	bn	max=35	8x2	0.01	81.04	GoogleDrive
R101-50-FGFI-w2	bn	max=35	8x2	0.01	81.17	GoogleDrive
R101-50-FGFI-w2	bn	max=35	8x2	0.01	82.04	GoogleDrive

Acknowledgement

Our code is based on the open source project MMDetection.

defeat.pytorch's Issues

How to calculate ground-truth mask?

Thanks a lot for amazing repo.

I read your source code, but it is not clear to me how the ground-truth mask is calculated.

DeFeat.pytorch/mmdet/models/dense_heads/anchor_head.py

Line 168 in c46a793

    def _map_roi_levels(self, rois, num_levels):
        scale = torch.sqrt(
            (rois[:, 2] - rois[:, 0] + 1) * (rois[:, 3] - rois[:, 1] + 1))
        target_lvls = torch.floor(torch.log2(scale / 56 + 1e-6))
        target_lvls = target_lvls.clamp(min=0, max=num_levels - 1).long()
        return target_lvls

Here, rois are the ground-truth bboxes. As I understand Faster R-CNN, the neck has 5 levels and you want to map each bounding box to exactly one level. What is the magic number 56 here?
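To make the level mapping concrete, here is a minimal pure-Python sketch of the same formula for a single box. The function name `map_box_to_level` and the parameter name `finest_scale` are illustrative assumptions. The value 56 appears to match the default `finest_scale` used by the RoI level mapping in MMDetection's `SingleRoIExtractor`, but the authors have not confirmed that reading here.

```python
import math


def map_box_to_level(box, num_levels, finest_scale=56):
    """Pure-Python mirror of _map_roi_levels for one (x1, y1, x2, y2) box."""
    x1, y1, x2, y2 = box
    # Box scale = sqrt(area), with the same "+ 1" pixel convention as the quoted code.
    scale = math.sqrt((x2 - x1 + 1) * (y2 - y1 + 1))
    # Every doubling of the scale relative to finest_scale moves the box one level up.
    level = math.floor(math.log2(scale / finest_scale + 1e-6))
    # Clamp so very small boxes use level 0 and very large boxes use the last level.
    return min(max(level, 0), num_levels - 1)


for side in (10, 111, 112, 224, 448, 896, 2000):
    box = (0, 0, side - 1, side - 1)  # square box whose scale equals `side`
    print(side, map_box_to_level(box, num_levels=5))
```

Under this formula, boxes with scale below 112 go to level 0, scales in [112, 224) to level 1, [224, 448) to level 2, [448, 896) to level 3, and anything larger to level 4.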
And this

DeFeat.pytorch/mmdet/models/dense_heads/anchor_head.py

Line 262 in c46a793

    def get_gt_mask(self, cls_scores, img_metas, gt_bboxes):
        featmap_sizes = [featmap.size()[-2:] for featmap in cls_scores]
        featmap_strides = self.anchor_generator.strides
        imit_range = [0, 0, 0, 0, 0]
        with torch.no_grad():
            mask_batch = []

            for batch in range(len(gt_bboxes)):
                mask_level = []
                target_lvls = self._map_roi_levels(gt_bboxes[batch], len(featmap_sizes))
                for level in range(len(featmap_sizes)):
                    gt_level = gt_bboxes[batch][target_lvls==level]  # gt_bboxes: BatchsizexNpointx4coordinate
                    h, w = featmap_sizes[level][0], featmap_sizes[level][1]
                    mask_per_img = torch.zeros([h, w], dtype=torch.double).cuda()
                    for ins in range(gt_level.shape[0]):
                        gt_level_map = gt_level[ins] / featmap_strides[level]
                        lx = max(int(gt_level_map[0]) - imit_range[level], 0)
                        rx = min(int(gt_level_map[2]) + imit_range[level], w)
                        ly = max(int(gt_level_map[1]) - imit_range[level], 0)
                        ry = min(int(gt_level_map[3]) + imit_range[level], h)
                        if (lx == rx) or (ly == ry):
                            mask_per_img[ly, lx] += 1
                        else:
                            mask_per_img[ly:ry, lx:rx] += 1
                    mask_per_img = (mask_per_img > 0).double()
                    mask_level.append(mask_per_img)
                mask_batch.append(mask_level)
            
            mask_batch_level = []
            for level in range(len(mask_batch[0])):
                tmp = []
                for batch in range(len(mask_batch)):
                    tmp.append(mask_batch[batch][level])
                mask_batch_level.append(torch.stack(tmp, dim=0))
                
        return mask_batch_level

As I understand it, after assigning each ground-truth box to one level, the box location in the image is mapped to a location on that level's feature map (a simple scaling of the coordinates by the level's stride). Is that right?

Please correct me if I have misunderstood. Thanks.
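The per-box step of `get_gt_mask` can be sketched without PyTorch as follows. This is a simplified illustration for one box on one level, not the repository's implementation. The function name `box_to_mask` and the `margin` argument (standing in for `imit_range[level]`, which is 0 in the quoted code) are assumptions.

```python
def box_to_mask(box, stride, h, w, margin=0):
    """Build an h x w 0/1 mask for one (x1, y1, x2, y2) image-space box."""
    # Scale image coordinates down to feature-map coordinates.
    x1, y1, x2, y2 = (coord / stride for coord in box)
    lx = max(int(x1) - margin, 0)
    rx = min(int(x2) + margin, w)
    ly = max(int(y1) - margin, 0)
    ry = min(int(y2) + margin, h)
    mask = [[0] * w for _ in range(h)]
    if lx == rx or ly == ry:
        # A box smaller than one cell still marks the cell it falls into.
        mask[ly][lx] = 1
    else:
        for y in range(ly, ry):
            for x in range(lx, rx):
                mask[y][x] = 1
    return mask


mask = box_to_mask((32, 48, 96, 112), stride=16, h=8, w=8)
for row in mask:
    print(row)
```

In the quoted code, the masks of all boxes assigned to a level are accumulated and thresholded with `> 0`, and the per-image masks are then stacked along the batch dimension for each level.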

Could you provide faster-rcnn-v2-374.pth?

It is needed for KD training. Could you share it?

About the training log on the COCO dataset

Thank you for your great efforts. I ran into some problems when trying to reproduce your results on the COCO dataset. I suspect there are differences in the training settings or coefficients.
Would you mind releasing the training logs of Faster RCNN and RetinaNet on the COCO dataset?
It would be really helpful!

Training Steps

Thanks for your code. I have studied your code. The training process contains two stages. Firstly, the teacher and student networks are separately trained based on the same dataset. Then, the distillation is performed based on the pre-trained teacher and student networks.

Could you tell me whether this is right?

Thank you.

KeyError: 'metric mAP is not supported'

What causes this error? Is my mmdet version wrong, or am I missing some of the code you modified?

Where is the setting of temperature parameters for the consistency loss?

As titled, thanks.

How should classes be added when training on a custom COCO dataset?

This is the error raised while training the student network:
File "/kaggle/working/defeatpytorch/mmdet/datasets/coco.py", line 310, in format_results
result_files = self.results2json(results, jsonfile_prefix)
File "/kaggle/working/defeatpytorch/mmdet/datasets/coco.py", line 243, in results2json
json_results = self._det2json(results)
File "/kaggle/working/defeatpytorch/mmdet/datasets/coco.py", line 181, in _det2json
data['category_id'] = self.cat_ids[label]
IndexError: list index out of range
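Considering that the model also classifies the background, how should I add the categories for a custom COCO dataset?
My dataset currently contains these 3 classes: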
shsy5y
cort
astro
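One likely cause of this `IndexError` is that the detection head predicts more classes than the dataset declares, so a predicted label indexes past the end of `self.cat_ids`. The sketch below shows a hypothetical MMDetection 2.x-style config fragment with a small consistency check. The key layout (`roi_head`, `bbox_head`, a `classes` entry per dataset split) is an assumption about how this repository's configs are organized and should be adapted to the actual files. In MMDetection 2.x, `num_classes` counts foreground classes only, so no extra background class is added.

```python
# Hypothetical MMDetection 2.x-style fragment; key names may differ in this repository's configs.
classes = ('shsy5y', 'cort', 'astro')

model = dict(
    roi_head=dict(
        # Foreground classes only: MMDetection 2.x does not count background here.
        bbox_head=dict(num_classes=len(classes))))

data = dict(
    train=dict(classes=classes),
    val=dict(classes=classes),
    test=dict(classes=classes))


def head_matches_dataset(model_cfg, data_cfg):
    """Return True if the head predicts exactly as many classes as each split declares."""
    num_classes = model_cfg['roi_head']['bbox_head']['num_classes']
    return all(len(split['classes']) == num_classes for split in data_cfg.values())


print(head_matches_dataset(model, data))
```

If the student and teacher configs are separate files, the same check would need to hold for both.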

About student pretrain

Thanks for your nice work. I wonder whether, during distillation, the student is initialized from a fully pretrained detector (e.g. after training for 12 epochs) or only from a pretrained backbone?

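How to run multi-GPU training

I'm a beginner looking for guidance: how do I start training? Do I first need to download the dataset and the backbone weights?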

Is there a one-stage config?

As titled.
Is there a config file for RetinaNet? Also, have you tried this on YOLO?

Thanks~

RetinaNet config

Hi, thanks for your great work. Can you share the RetinaNet config?

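>After copying the runner_kd file into the runner folder, I replaced the mmcv in the author's project with the mmcv from GitHub, then ran python setup.py develop and installed mmdet 2.0. However, pip list still shows no mmcv in the environment, and running the project also reports that mmcv is missing

Hello, thank you very much for open-sourcing your code. After setting up the environment I get the error: ImportError: cannot import name 'Runner_kd' from 'mmcv.runner'. I saw you said that compiling mmcv from source solves this problem. What does compiling mmcv from source mean? Doesn't it simply mean running python setup.py develop? Any guidance would be appreciated, thanks!

When installing mmcv, do not use the official mmcv from mmlab; install the mmcv bundled by the author (after cloning, enter the mmcv directory and build it there). After installation, pip list should show the location of the mmcv library as xxxx/DeFeat.pytorch/mmcv. As long as the mmcv used by your current environment is the one bundled by the author, it should work.

Hello, I would like to ask how to build inside the MMCV directory. I'm still a beginner; how do I install mmcv the way the author does? Any guidance would be appreciated, thanks.

First find the matching mmcv version and git clone it from open-mmlab/mmcv, then copy the author's runner_kd.py into mmcv, and run the official build command to install mmcv: pip install -e . You can refer to the install-from-source tutorial for mmcv v0.5.6.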

Originally posted by @ruiningTang in #14 (comment)
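The advice in this thread comes down to making sure the interpreter imports the patched mmcv rather than another installed copy. A quick way to check is to print where a module is loaded from and whether it exposes the expected name. The helper `describe_module` below is a hypothetical name introduced for illustration; `mmcv.runner` and `Runner_kd` are taken from the error message quoted above.

```python
import importlib


def describe_module(name, attr=None):
    """Return (file path, has_attr) for a module, or (None, False) if it cannot be imported."""
    try:
        module = importlib.import_module(name)
    except ImportError:
        return None, False
    path = getattr(module, '__file__', None)
    return path, (attr is None or hasattr(module, attr))


path, ok = describe_module('mmcv.runner', 'Runner_kd')
print('mmcv.runner loaded from:', path)
print('Runner_kd available:', ok)
```

If the printed path points into site-packages rather than the source tree you built from, the environment is probably still using a different mmcv installation.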

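Training error

Hello, I am reproducing your code. It compiles and training starts, but running python tools/train.py --config configs/kd_faster_rcnn/voc_stu_faster_rcnn_r50_FGFI.py gives the error below.
Is some modified module missing from mmdet?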

2021-09-09 16:41:56,230 - mmdet - INFO - workflow: [('train', 1)], max: 4 epochs
Traceback (most recent call last):
File "tools/train.py", line 176, in
main()
File "tools/train.py", line 172, in main
meta=meta)
File "/home/dgfs/zl/DeFeat/mmdet/apis/train.py", line 476, in train_detector
runner.run(data_loaders, cfg.workflow, cfg.total_epochs)
File "/home/dgfs/anaconda3/envs/defeat/lib/python3.7/site-packages/mmcv/runner/epoch_based_runner.py", line 127, in run
epoch_runner(data_loaders[i], **kwargs)
File "/home/dgfs/anaconda3/envs/defeat/lib/python3.7/site-packages/mmcv/runner/epoch_based_runner.py", line 50, in train
self.run_iter(data_batch, train_mode=True, **kwargs)
File "/home/dgfs/anaconda3/envs/defeat/lib/python3.7/site-packages/mmcv/runner/epoch_based_runner.py", line 27, in run_iter
self.model, data_batch, train_mode=train_mode, **kwargs)
File "/home/dgfs/zl/DeFeat/mmdet/apis/train.py", line 79, in batch_processor
losses = model(**data)
File "/home/dgfs/anaconda3/envs/defeat/lib/python3.7/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
result = self.forward(*input, **kwargs)
File "/home/dgfs/anaconda3/envs/defeat/lib/python3.7/site-packages/mmcv/parallel/data_parallel.py", line 42, in forward
return super().forward(*inputs, **kwargs)
File "/home/dgfs/anaconda3/envs/defeat/lib/python3.7/site-packages/torch/nn/parallel/data_parallel.py", line 159, in forward
return self.module(*inputs[0], **kwargs[0])
File "/home/dgfs/anaconda3/envs/defeat/lib/python3.7/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
result = self.forward(*input, **kwargs)
File "/home/dgfs/zl/DeFeat/mmdet/core/fp16/decorators.py", line 49, in new_func
return old_func(*args, **kwargs)
File "/home/dgfs/zl/DeFeat/mmdet/models/detectors/base.py", line 148, in forward
return self.forward_train(img, img_metas, **kwargs)
File "/home/dgfs/zl/DeFeat/mmdet/models/detectors/two_stage_kd.py", line 185, in forward_train
if 'mask-neck-one' in kd_cfg.type:
AttributeError: 'NoneType' object has no attribute 'type'

About GT-mask?

@ggjy
Thanks a lot for your work. I have read your paper. As I understand it, the GT-mask is generated from the ground-truth boxes and the neck features. Is that right? Could you briefly explain the idea behind generating the GT-mask? I read the source code but it is still not clear to me. Thank you.

Releasing Date

Hi,

Would you mind sharing the date when you will release the code?

Thanks.

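A question for the authors: which part of the code implements the intermediate-feature distillation loss from the paper?

Hello, thank you for the code and the paper. I would like to ask about the intermediate-feature distillation loss with decoupled object and background regions described in the paper. Roughly where is it in the code? Could you point me to it?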

Installation error

Traceback (most recent call last):
File "tools/train.py", line 176, in
main()
File "tools/train.py", line 71, in main
cfg = Config.fromfile(args.config)
File "/home/micro/users/zjl/DeFeat.pytorch-main/mmcv-0.5.6/mmcv/utils/config.py", line 165, in fromfile
cfg_dict, cfg_text = Config._file2dict(filename)
File "/home/micro/users/zjl/DeFeat.pytorch-main/mmcv-0.5.6/mmcv/utils/config.py", line 84, in _file2dict
filename = osp.abspath(osp.expanduser(filename))
File "/home/micro/anaconda3/envs/ida/lib/python3.8/posixpath.py", line 231, in expanduser
path = os.fspath(path)
TypeError: expected str, bytes or os.PathLike object, not NoneType
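The traceback shows `Config.fromfile(args.config)` receiving `None`, which is what happens when the config path is never supplied through the `--config` flag that the commands elsewhere on this page use. The sketch below reproduces the failure with a hypothetical stand-in parser (the real `tools/train.py` parser has more options) and adds an explicit check; `build_parser` and `resolve_config` are illustrative names.

```python
import argparse
import os.path as osp


def build_parser():
    # Hypothetical stand-in for the parser in tools/train.py: the config is an optional flag.
    parser = argparse.ArgumentParser(description='Train a detector')
    parser.add_argument('--config', help='train config file path')
    return parser


def resolve_config(argv):
    """Return the absolute config path, failing early if --config was not given."""
    args = build_parser().parse_args(argv)
    if args.config is None:
        raise SystemExit('Please pass the config with --config <path>')
    return osp.abspath(osp.expanduser(args.config))


# Without --config, args.config is None and expanduser raises the TypeError seen above.
args = build_parser().parse_args([])
try:
    osp.expanduser(args.config)
except TypeError as err:
    print('TypeError:', err)

print(resolve_config(['--config', 'configs/kd_faster_rcnn/voc_stu_faster_rcnn_r50_FGFI.py']))
```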

libcudart.so.10.0: cannot open shared object file

Hello, I installed python3.6, pytorch1.5, mmdet2.0.0, mmcv0.5.6 and cuda10.1 as required. Why do I get the error libcudart.so.10.0: cannot open shared object file?

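Accuracy on Pascal VOC

Thank you for your work. I have recently been trying to reproduce this paper, and I have a question about the accuracy on the Pascal VOC dataset.
In the paper, the accuracies of the R101 teacher network and the R50 student network on Pascal VOC are:
Teacher R101-FPN 82.13
Student R50-FPN 80.53
However, when I ran the experiments with mmdetection, the gap between R101 and R50 was not this large. So I would like to ask whether the teacher and student networks are trained differently, for example with data augmentation for the teacher but not for the student.
I would appreciate a reply when you have time. Thanks.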

Question about the code

Hi @ggjy,
While reading the code for 'decoupled-cls', I ran into a question.

Here, 'KL-decouple' uses the labels to distinguish positive and negative samples. The labels come from the student's bbox_targets, which are generated from the student's proposals. However, 'cls_score_s' and 'cls_score_t' are obtained by forwarding the teacher's proposals (sampling_results) through the student model and the teacher model.

So I'm wondering whether there is a mismatch, because the cls_scores come from the teacher's proposals while the labels come from the student's proposals.

Looking forward to your reply, thanks~
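To make the concern concrete, here is a minimal pure-Python sketch of a KL loss that is split into positive and negative proposals by label. It is an illustration of the general idea only, not the repository's 'KL-decouple' code: the function names, the weights `w_pos` and `w_neg`, and the single shared `temperature` are all assumptions. It follows the MMDetection 2.x convention that a label equal to `num_classes` marks background. The loop pairs row i of the logits with label i, which is why logits and labels need to refer to the same proposals.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax over one row of logits."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def kl_div(p, q):
    """KL(p || q) for two discrete distributions with strictly positive q."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def decoupled_kl(student_logits, teacher_logits, labels, num_classes,
                 w_pos=1.0, w_neg=1.0, temperature=1.0):
    """Average KL(teacher || student) separately over foreground and background proposals."""
    pos, neg = [], []
    for s_row, t_row, label in zip(student_logits, teacher_logits, labels):
        loss = kl_div(softmax(t_row, temperature), softmax(s_row, temperature))
        # Assumed convention: label == num_classes means background.
        (neg if label == num_classes else pos).append(loss)
    pos_term = sum(pos) / len(pos) if pos else 0.0
    neg_term = sum(neg) / len(neg) if neg else 0.0
    return w_pos * pos_term + w_neg * neg_term


# Two proposals, two foreground classes plus one background logit each.
student = [[2.0, 0.5, 0.1], [0.1, 0.2, 3.0]]
teacher = [[3.0, 0.2, 0.1], [0.1, 0.1, 4.0]]
labels = [0, 2]  # first proposal is class 0, second is background
print(decoupled_kl(student, teacher, labels, num_classes=2, w_pos=1.0, w_neg=0.5))
```

If the labels were computed for a different set of proposals than the logits, a proposal could be weighted as foreground while its scores describe a background region, which is the mismatch the question asks about.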

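How did you get it installed? I have downloaded mmcv 0.5.6 from GitHub and copied the runner_kd file over. What should I do next?

Hello, thank you very much for open-sourcing your code. After setting up the environment I get the error: ImportError: cannot import name 'Runner_kd' from 'mmcv.runner'. I saw you said that compiling mmcv from source solves this problem. What does compiling mmcv from source mean? Doesn't it simply mean running python setup.py develop? Any guidance would be appreciated, thanks!

When installing mmcv, do not use the official mmcv from mmlab; install the mmcv bundled by the author (after cloning, enter the mmcv directory and build it there). After installation, pip list should show the location of the mmcv library as xxxx/DeFeat.pytorch/mmcv. As long as the mmcv used by your current environment is the one bundled by the author, it should work.

Hello, I would like to ask how to build inside the MMCV directory. I'm still a beginner; how do I install mmcv the way the author does? Any guidance would be appreciated, thanks.

First find the matching mmcv version and git clone it from open-mmlab/mmcv, then copy the author's runner_kd.py into mmcv, and run the official build command to install mmcv: pip install -e . You can refer to the install-from-source tutorial for mmcv v0.5.6.

Thanks for your reply. I installed it following your method, but now I get this error:
Traceback (most recent call last):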
File "tools/train.py", line 176, in
main()
File "tools/train.py", line 71, in main
cfg = Config.fromfile(args.config)
File "/home/Disk-2T/DeFeat.pytorch-main/mmcv/utils/config.py", line 165, in fromfile
cfg_dict, cfg_text = Config._file2dict(filename)
File "/home/Disk-2T/DeFeat.pytorch-main/mmcv/utils/config.py", line 84, in _file2dict
filename = osp.abspath(osp.expanduser(filename))
File "/home/caipeng/anaconda3/envs/ENV/lib/python3.7/posixpath.py", line 235, in expanduser
path = os.fspath(path)
TypeError: expected str, bytes or os.PathLike object, not NoneType

Originally posted by @shuizaola in #14 (comment)

Question about ‘head-cls’ type

Thank you for your great efforts. I noticed there is a head-cls distillation type in your source code.
I tried this type by setting 'head-cls, KL-decouple' in the config, but the accuracy is much lower (70% AP50) than that of the original student model.

Do you have any recommended settings and hyperparameters for the 'head-cls' type? Thanks.

Successfully load teacher network checkpoint.
Traceback (most recent call last):
File "tools/train_kd.py", line 204, in <module>
main()
File "tools/train_kd.py", line 200, in main
meta=meta)
File "/home/Disk-2T/DeFeat.pytorch-main/mmdet/apis/train.py", line 532, in train_detector_kd
meta=meta)
TypeError: 'module' object is not callable

Why does this error occur? The command I used was:
python tools/train_kd.py --validate --work-dir /home/Disk-2T/DeFeat.pytorch-main --config configs/kd_faster_rcnn/voc_stu_faster_rcnn_r50_decouple_neck.py --config-t /home/Disk-2T/DeFeat.pytorch-main/configs/kd_faster_rcnn/voc_tea_faster_rcnn_r101.py

Could someone take a look?

python setup.py error

Traceback (most recent call last):
File "setup.py", line 195, in
write_version_py()
File "setup.py", line 71, in write_version_py
sha = get_hash()
File "setup.py", line 56, in get_hash
raise ImportError('Unable to get git version')
ImportError: Unable to get git version

Can you share the pretrained models of Res101 and Res50?
And how is the setting of Res50(1/4)?

KeyError: 'proposal_list'

About VOC pretrained weight

First of all, thank you for your work.

I want to run experiments on the VOC dataset.
In your paper, Faster R-CNN with ResNet-101 reaches 82.13, but I can't obtain those weights.
So, could I get the pretrained weights to reproduce your work?

Sincerely,

Configuration problem

Thank you for contributing this code.
After running python setup.py develop I still get ModuleNotFoundError: No module named 'mmcv._ext'. How can I solve this?