ziweiwangthu / bidet

This is the official pytorch implementation for paper: BiDet: An Efficient Binarized Object Detector, which is accepted by CVPR2020.

License: MIT License

BiDet

This is the official PyTorch implementation of the paper "BiDet: An Efficient Binarized Object Detector", accepted at CVPR 2020. The code covers training and testing of two binarized object detectors, SSD300 and Faster R-CNN, using our BiDet method on two datasets, PASCAL VOC and Microsoft COCO 2014.

Update

2021.1: Our extended version of BiDet has been accepted by T-PAMI! We further improve the performance of binary detectors and extend our method to multiple model compression methods. Check it out here.
2021.4.19: We provide BiDet-SSD300 pretrained weights on the Pascal VOC dataset, which achieve 66.0% mAP as described in the paper. You can download them here.

Quick Start

Prerequisites

python 3.6+
pytorch 1.0+
other packages include numpy, cv2, matplotlib, pillow, cython, cffi, msgpack, easydict, pyyaml

Dataset Preparation

We conduct experiments on PASCAL VOC and Microsoft COCO 2014 datasets.

PASCAL VOC

We train our model on the VOC 0712 trainval sets and test it on the VOC 07 test set. For downloading, just run:

sh data/scripts/VOC2007.sh # <directory>
sh data/scripts/VOC2012.sh # <directory>

Pass a directory as the argument to choose where the data is downloaded; otherwise the default path ~/data/ is used.

COCO

We train our model on the COCO 2014 trainval35k subset and evaluate it on minival5k. For downloading, just run:

sh data/scripts/COCO2014.sh

Also, you can specify a path to save the data.

After downloading both datasets, edit line 24 of faster_rcnn/lib/datasets/factory.py and line 36 of faster_rcnn/lib/datasets/coco.py, replacing path/to/dataset with the paths to your VOC and COCO datasets respectively.

Pretrained Backbone

The backbones of our BiDet-SSD300 and BiDet-Faster R-CNN are VGG16 and ResNet-18 respectively. We pretrain them on the ImageNet dataset. You can download the pretrained weights here: VGG16 and ResNet18. After downloading them from Google Drive, please put them in ssd/pretrain and faster_rcnn/pretrain respectively.

Training and Testing

Once you have finished all the steps above, you can start using the code.

SSD

For training SSD, just run:

$ python ssd/train_bidet_ssd.py --dataset='VOC/COCO' --data_root='path/to/dataset' --basenet='path/to/pretrain_backbone'

For testing on VOC, just run:

$ python ssd/eval_voc.py --weight_path='path/to/weight' --voc_root='path/to/voc'

For testing on COCO, just run:

$ python ssd/eval_coco.py --weight_path='path/to/weight' --coco_root='path/to/coco'

Faster R-CNN

First you need to compile the CUDA implementations of RoIPooling, RoIAlign and NMS:

cd faster_rcnn/lib
python setup.py build develop

For training Faster R-CNN, just run:

$ python faster_rcnn/trainval_net.py --dataset='voc/coco' --data_root='path/to/dataset' --basenet='path/to/pretrain_backbone'

For testing, run:

$ python test_net.py --dataset='voc/coco' --checkpoint='path/to/weight'

Citation

Please cite our paper if you find it useful in your research:

@inproceedings{wang2020bidet,
  title={BiDet: An Efficient Binarized Object Detector},
  author={Wang, Ziwei and Wu, Ziyi and Lu, Jiwen and Zhou, Jie},
  booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
  pages={2049--2058},
  year={2020}
}

Frequently Asked Questions

What is the difference between BiDet and BiDet (SC)?

They are two different binary neural network (BNN) architectures. Because BiDet can be regarded as a training strategy that adds the IB (information bottleneck) and sparse object prior losses, we adopt popular BNN methods as our models. BiDet means applying our method to an XNOR-Net-like architecture, with the BN-->BinActiv-->BinConv layer order and scaling factors. BiDet (SC) means applying our method to a Bi-Real-Net-like architecture, with additional shortcuts. This repo provides implementations of BiDet (SC) for both SSD and Faster R-CNN.

Do you modify the structure of these detection networks?

For Faster R-CNN with ResNet-18 backbone, we do no modification. For SSD300 with VGG16 backbone, we restructure it to make it suitable for BNNs. Please refer to this issue for more details.

Are the BiDet detectors fully binarized?

Yes, both the backbone and detection heads of BiDet detectors are binarized. One of the main contributions of our work is that we show FULLY binarized object detectors can still get relatively good performance on large-scale datasets such as PASCAL VOC and COCO.

How do you calculate the model parameter size and FLOPs?

I use an open-source PyTorch library to do so.

Why is the saved model weight file much larger than the size reported in the paper? Why are the weight values not binarized? What about the inference speed?

PyTorch currently has no official support for binary operations such as XNOR and bitcount, so BNN research code typically stores normal float32 weights and approximates the binary operations by binarizing them at inference time. That is why the saved model is large and the weight values are not binarized. Inference speed is very important for BNNs, but because PyTorch does not accelerate these operations, inference in PyTorch will be slow. I recommend trying a BNN inference library such as daBNN. Please refer to this and this issue for more details.
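As a rough illustration of the size gap described in this answer, the sketch below compares storing each weight as float32 with packing one sign bit per weight. The weight count is a hypothetical round number, not a figure from the paper, and `pack_signs` is an illustrative helper, not code from this repo.

```python
def checkpoint_size_mb(num_weights, bits_per_weight):
    """Size in MB (1 MB = 2**20 bytes) of num_weights stored at bits_per_weight."""
    return num_weights * bits_per_weight / 8 / 2**20


def pack_signs(weights):
    """Pack the signs of a list of floats into bytes, one bit per weight.

    Bit i of the output is 1 when weights[i] >= 0 and 0 otherwise.
    """
    packed = bytearray((len(weights) + 7) // 8)
    for i, w in enumerate(weights):
        if w >= 0:
            packed[i // 8] |= 1 << (i % 8)
    return bytes(packed)


num_weights = 10_000_000  # hypothetical weight count, for illustration only
float_mb = checkpoint_size_mb(num_weights, 32)   # what a float32 checkpoint stores
binary_mb = checkpoint_size_mb(num_weights, 1)   # ideal 1-bit packing
print(f"float32: {float_mb:.1f} MB, 1-bit: {binary_mb:.1f} MB")
```

The 32x ratio holds only for the binarized weights themselves; layers or parameters kept in full precision (for example batch-norm statistics) would reduce the overall saving.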

The training is not stable.

Yes. The training of BNNs is known to be unstable and requires fine-tuning. Please refer to this issue for more detailed discussions.

Acknowledgement

We thank the authors of ssd.pytorch and faster-rcnn.pytorch for open-sourcing their wonderful work. We thank daquexian for providing his implementation of Bi-Real-Net.

License

BiDet is released under the MIT License. See the LICENSE file for more details.

Contact

If you have any questions about the code, please contact Ziyi Wu [email protected]

bidet's Issues

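About the model's performance on the COCO dataset

Hello, Doctor.
From your profile you appear to be Chinese, so I will state my question directly in Chinese; I hope you can answer it.

Can the models in BiDet match YOLOv3's results on the COCO dataset, that is, reach a similar mAP?

For example, reaching an mAP of around 30 on COCO.

In some issues of this repo you mention a ResNet-18-based model with an mAP of 14.4, and the paper reports an mAP of 15.7 on COCO. Are these the best values this repo's models can reach? Is there room for improvement, i.e. can they go higher?

Best wishes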

Cannot get correct result in eval_coco OR eval_voc

Hello, I encountered a problem.
Because my GPU memory is small, during training I changed batch_size in train_bidet_ssd.py to 16 and num_worker to 4; lr stayed at the default 1e-3.
During training, the loss becomes NaN at around 30,000 iterations, and when I evaluate the model parameters saved before the loss became NaN, the output AP and mAP are both around 0.0xx.
Moreover, when I reduce the learning rate to 1e-4, the loss no longer becomes NaN, but the evaluation results are still not correct.
Is there a configuration step I have missed, and how can I solve this problem?

Contrast experiment

When you run comparison experiments, for example using the Bi-Real method to binarize the network, is the SSD_vgg16 structure used with Bi-Real the same as the SSD_vgg16 used in BiDet (SC)? The structural changes include adding BN, clip maxpool, and so on.

runtimeError: Errors in loading state_dict for module list

Hi, I have trouble loading the pretrained VGG model and get the following error:

My environment: python 3.5.6, torch 0.4.0. Is something wrong with my configuration? Could you tell me your configuration or a corresponding solution? Looking forward to your reply!

some questions about the model size

Hi, could you tell me the size of your ssd or faster rcnn model? I found that my own trained faster rcnn model takes 142.14MB space! It is still too large.

How long does it take to inference an image?

How long does it take to run recognition on a single image?

Question about the corresponding code of loss function

Hi, Wang. Thanks for your great work.
Can you specify the corresponding source code for your loss function in paper:

Thank you.

Some convolutions in the ResNet-18 layers are not binarized

def _make_layer(self, block, planes, blocks, stride=1, **kwargs):
    downsample = None
    if stride != 1 or self.inplanes != planes * block.expansion:
        conv = nn.Conv2d
        ds_out_planes = planes * block.expansion
        downsample = nn.Sequential(
            nn.AvgPool2d(2, stride=stride, ceil_mode=True),
            conv(self.inplanes, ds_out_planes, kernel_size=1, stride=1, bias=False),
            nn.BatchNorm2d(ds_out_planes)
        )

    layers = []
    layers.append(block(self.inplanes, planes, stride, downsample, **kwargs))
    self.inplanes = planes * block.expansion
    for _ in range(1, blocks):
        layers.append(block(self.inplanes, planes, **kwargs))

    return nn.Sequential(*layers)

Hello Doctor, in the code above, the convolution on the fourth line is not binarized. Could you explain why?

About xnor-bitcount implementation

Hi, I really appreciate your excellent work.

Like many other open-source binary quantization repositories, you implement BinarizeConv2d on top of torch.nn.functional.conv2d. For training this is fine, but for inference, the absence of an xnor-bitcount based convolution keeps this work from realizing its full speed advantage.

Have you implemented this, or do you intend to? Thanks very much.
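For readers unfamiliar with the operation being asked about, here is a minimal pure-Python sketch of an xnor-bitcount dot product. It is not code from this repo or from any inference library; `encode` and `xnor_popcount_dot` are illustrative names. It only shows that, for vectors with entries in {-1, +1}, counting matching bits reproduces the ordinary dot product.

```python
def encode(signs):
    """Encode a list of +1/-1 values as an int bit mask (bit i = 1 means +1)."""
    mask = 0
    for i, s in enumerate(signs):
        if s == 1:
            mask |= 1 << i
    return mask


def xnor_popcount_dot(a_bits, b_bits, n):
    """Dot product of two length-n {-1, +1} vectors given their bit encodings."""
    agree = ~(a_bits ^ b_bits) & ((1 << n) - 1)  # XNOR, restricted to n bits
    matches = bin(agree).count("1")              # bitcount of agreeing positions
    return 2 * matches - n                       # matches minus mismatches


a = [1, -1, -1, 1, 1, -1, 1, 1]
b = [1, 1, -1, -1, 1, -1, -1, 1]
reference = sum(x * y for x, y in zip(a, b))              # ordinary dot product
fast = xnor_popcount_dot(encode(a), encode(b), len(a))    # bitwise equivalent
print(reference, fast)
```

A binary convolution applies this same identity to every sliding window; real libraries do it on packed machine words rather than Python integers.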

MC sampling replace IB(information bottleneck) in the code

Hi, Wang. Thanks for your great work. I have some questions; could you clarify them?

The IB loss in the paper is replaced by MC sampling in the code.
1. What does MC sampling mean? Is it Monte Carlo?
2. What is the relationship between MC sampling and the IB principle?
3. The feature maps are L2-normalized in the code. What is the relationship between L2 normalization and MC sampling?

Thank you

MC sampling

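Dataset split question

When training, you only use the training and test sets and no validation set (there is no evaluation during training). Is that right?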

The question about training stage in faster_rcnn

Hello, sorry to bother you. I ran into some problems when training BiDet's Faster R-CNN and would like to ask you about them:

During training, in the first few epochs, the loss becomes NaN on some iterations.

My training command is: python faster_rcnn/trainval_net.py --dataset='coco' --data_root='data/coco' --basenet='pretrain/resnet18.pth' --mGPUs=True
I set RPN_PRIOR_WEIGHT, RPN_REG_WEIGHT, HEAD_PRIOR_WEIGHT, and HEAD_PRIOR_WEIGHT to 0.2, 0.1, 0.2, 0.1 respectively from the beginning.

Have you ever encountered this problem in the training process?

1×1 conv & 3×3 conv

Hello! Could I discuss three questions with you? Thank you.

Did you find that BinaryConv for 1×1 conv performs poorly in the detection task?

Do binary 1×1 conv and binary 3×3 conv behave the same for the object detection task?

Is the BiDet method suitable for YOLOv3?

params

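Hello, I am studying your work and have a question.
When I use thop to compute the params and MFLOPs of SSD300, the FLOPs are roughly the same, but the params come out as 26.28M rather than the 100.28M stated in the paper?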

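Poor BiDet test results

Using the original BiDet SSD parameters with args.reg_weight = 0 and args.prior_weight = 0 and everything else unchanged, testing on the VOC dataset gives a mean AP of only 0.x. Did something go wrong during my training?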

Question on pretrained models.

I've sent an email to the authors roughly 2 weeks ago about the pretrained models but haven't got a response yet (maybe the email didn't go?) so I'm re-iterating the question here.

For obtaining the pretrained weights of ResNet18/VGG16, do you train the networks as a floating point network or do you binarize the networks like in XNOR-Net/Bi-Real Net and then train them on ImageNet to obtain the pretrained weights?

I'm trying to use different backbone networks and an answer to this question would help me in obtaining the pretrained weights for my networks.

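On the parameter reduction from binarization

Regarding parameter reduction: compared with VGG and ResNet-18, binarizing MobileNet reduces the parameter count very little. What causes this?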

Training Time

Dear authors,

How long does your model need to train?

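About FLOPs and parameter count

I noticed that you mention using the thop library to compute the parameter count and FLOPs.
My understanding is that this library does not implement the computation for custom modules. How did you compute it for your binarized modules, and is that part of the code open source? I would very much like to learn from it.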

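A question about binarizing the detection head

Hello Doctor!
Regarding the bbox_head of Faster R-CNN: I tried binarizing it in my own code, but the accuracy dropped to zero. After restoring this part to full precision, I confirmed that the binarization was the cause. In your code, did you binarize the last two shared_fcs layers of the bbox head? Specifically, the following structure: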
(shared_fcs): ModuleList( 13.896 M, 29.052% Params, 13.894 GFLOPs, 54.616% FLOPs, (0): Linear(12.846 M, 26.857% Params, 12.845 GFLOPs, 50.494% FLOPs, in_features=12544, out_features=1024, bias=True) (1): Linear(1.05 M, 2.194% Params, 1.049 GFLOPs, 4.122% FLOPs, in_features=1024, out_features=1024, bias=True) )

Is the backbone network binarized?

Hi, I really appreciate your valuable work.

I just wonder which part of the SSD network is binarized.
In your paper, SSD model consists of a backbone network and a detection network.
Are both parts are binarized?
If not, which layers have to be kept with full precision?

Test error

Hello, I have now trained the model successfully, but testing fails with the error: Network is not defined. My training command is

python faster_rcnn/trainval_net.py --dataset='coco' --data_root='/home/user/Desktop/wam/BiDet/data' --basenet='/home/user/Desktop/wam/BiDet/faster_rcnn/pretrain/resnet18.pth'

The test command is:

python test_net.py --dataset='coco' --checkpoint='./logs/coco/bidet18_IB/2021-07-06 20:46:25/model_50_loss_0.771_lr_1.0000000000000002e-06_rpn_cls_0.1611_rpn_bbox_0.0987_rcnn_cls_0.3137_rcnn_bbox_0.197_rpn_prior_0.0_rpn_reg_0.0005_head_prior_0.0_head_reg_0.0.pth'

Could you give me some advice? Thank you.

UnboundLocalError

(base) fsr@3090:~/BiDet-master$ python ssd/train_bidet_ssd.py --dataset='VOC/COCO' --data_root='path/to/dataset' --basenet='path/to/pretrain_backbone'
Traceback (most recent call last):
  File "ssd/train_bidet_ssd.py", line 414, in <module>
    train()
  File "ssd/train_bidet_ssd.py", line 123, in train
    ssd_net = build_bidet_ssd('train', cfg['min_dim'], cfg['num_classes'],
UnboundLocalError: local variable 'cfg' referenced before assignment
How can this be solved?
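One plausible reading of this traceback is that the README placeholder --dataset='VOC/COCO' was passed literally, so no branch that assigns cfg ever ran. The sketch below reproduces that failure mode under this assumption; `select_cfg`, `select_cfg_checked`, and the config dictionaries are hypothetical stand-ins, not the repo's actual code.

```python
def select_cfg(dataset):
    # Hypothetical stand-in for config selection via if/elif without an else.
    if dataset == "VOC":
        cfg = {"name": "VOC", "min_dim": 300}
    elif dataset == "COCO":
        cfg = {"name": "COCO", "min_dim": 300}
    return cfg  # UnboundLocalError here when neither branch ran


def select_cfg_checked(dataset):
    # Same selection, but an unknown name fails early with a clear message.
    configs = {
        "VOC": {"name": "VOC", "min_dim": 300},
        "COCO": {"name": "COCO", "min_dim": 300},
    }
    if dataset not in configs:
        raise ValueError(f"--dataset must be one of {sorted(configs)}, got {dataset!r}")
    return configs[dataset]


try:
    select_cfg("VOC/COCO")  # the literal placeholder from the README command
    error_name = None
except UnboundLocalError as exc:
    error_name = type(exc).__name__
print(error_name)
```

If this reading is right, passing a single dataset name (for example --dataset='VOC') together with real paths for --data_root and --basenet should avoid the error.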

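Wrong arguments to the learning-rate adjustment function when training the SSD model

Hello, while reproducing your SSD model on the VOC dataset I hit this error: adjust_learning_rate(optimizer, args.gamma, step_index)
TypeError: adjust_learning_rate() takes 2 positional arguments but 3 were given
Your definition, def adjust_learning_rate(optimizer, new_lr), takes only two arguments.
Is there something I have misunderstood?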
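The TypeError above comes from calling a two-argument function with three arguments. One possible workaround, assuming step decay is what the call site intends, is to compute the new rate first and pass it as the single value. This is a sketch only: `FakeOptimizer` is a hypothetical stand-in for a torch.optim optimizer, and the lr and gamma values are the defaults that appear in an argument dump elsewhere on this page.

```python
class FakeOptimizer:
    """Minimal stand-in exposing param_groups like a torch.optim optimizer."""

    def __init__(self, lr):
        self.param_groups = [{"lr": lr}]


def adjust_learning_rate(optimizer, new_lr):
    # Two-argument form, matching the signature quoted in the issue.
    for group in optimizer.param_groups:
        group["lr"] = new_lr


base_lr, gamma = 1e-3, 0.1   # assumed defaults (lr=0.001, gamma=0.1)
step_index = 2               # number of decay steps taken so far
optimizer = FakeOptimizer(base_lr)

# Assumed step-decay rule: lr = base_lr * gamma ** step_index.
adjust_learning_rate(optimizer, base_lr * gamma ** step_index)
print(optimizer.param_groups[0]["lr"])
```

Whether the authors intended this decay rule is not confirmed by the source; the point is only that the caller and the definition must agree on the argument list.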

Error from python setup.py build develop

Hello, when building with setup.py I got the error: torch/extension.h: No such file or directory. How can I solve this?

The test result of COCO by using faster-rcnn

Sorry to bother you. I use faster-rcnn to test COCO. The mAP is 12.7%, which is lower than the result in the paper (15.7%);
The parameters I used are the default parameters in the trainval_net.py. How many epochs do you choose for training? Thanks for your time. I am waiting for your reply.

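How should I prepare a dataset to train another model with your code?

I prepared my dataset in VOC format, but training failed with an error and I do not know why. If it is convenient, could you please help me?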

PS D:\project\vscode\BiDet> python .\ssd\train_bidet_ssd.py --data_root='D:\project\vscode\BiDet\data\ea89447d' --basenet='D:\project\vscode\BiDet\ssd\pretrain\vgg16.pth'
Loading base network...
Loading the dataset...
Training SSD on: VOC0712
Using the specified args:
Namespace(basenet='D:\project\vscode\BiDet\ssd\pretrain\vgg16.pth', batch_size=32, clip_grad=False, cuda=True, data_root='D:\project\vscode\BiDet\data\ea89447d', dataset='VOC', gamma=0.1, lr=0.001, momentum=0.9, nms_conf_threshold=0.03, num_workers=16, opt='Adam', prior_weight=0.0, reg_weight=0.0, resume=False, sigma=0.0, start_iter=0, weight_decay=0.0, weight_path=None)
C:\Users\dell\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.8_qbz5n2kfra8p0\LocalCache\local-packages\Python38\site-packages\torch\utils\data\dataloader.py:474: UserWarning: This DataLoader will
create 16 worker processes in total. Our suggested max number of worker in current system is 6 (cpuset is not taken into account), which is smaller than what this DataLoader is going to create. Please be
aware that excessive worker creation might get DataLoader running slow or even freeze, lower the worker number to avoid potential slowness/freeze if necessary.
warnings.warn(_create_warning_msg(
Traceback (most recent call last):
  File ".\ssd\train_bidet_ssd.py", line 418, in <module>
    train()
  File ".\ssd\train_bidet_ssd.py", line 207, in train
    images, targets = next(batch_iterator)
  File "C:\Users\dell\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.8_qbz5n2kfra8p0\LocalCache\local-packages\Python38\site-packages\torch\utils\data\dataloader.py", line 517, in __next__
    data = self._next_data()
  File "C:\Users\dell\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.8_qbz5n2kfra8p0\LocalCache\local-packages\Python38\site-packages\torch\utils\data\dataloader.py", line 1199, in _next_data
    return self._process_data(data)
  File "C:\Users\dell\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.8_qbz5n2kfra8p0\LocalCache\local-packages\Python38\site-packages\torch\utils\data\dataloader.py", line 1225, in _process_data
    data.reraise()
  File "C:\Users\dell\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.8_qbz5n2kfra8p0\LocalCache\local-packages\Python38\site-packages\torch\_utils.py", line 429, in reraise
    raise self.exc_type(msg)
KeyError: Caught KeyError in DataLoader worker process 0.
Original Traceback (most recent call last):
  File "C:\Users\dell\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.8_qbz5n2kfra8p0\LocalCache\local-packages\Python38\site-packages\torch\utils\data\_utils\worker.py", line 202, in _worker_loop
    data = fetcher.fetch(index)
  File "C:\Users\dell\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.8_qbz5n2kfra8p0\LocalCache\local-packages\Python38\site-packages\torch\utils\data\_utils\fetch.py", line 44, in fetch
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "C:\Users\dell\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.8_qbz5n2kfra8p0\LocalCache\local-packages\Python38\site-packages\torch\utils\data\_utils\fetch.py", line 44, in <listcomp>
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "D:\project\vscode\BiDet\ssd\data\voc0712.py", line 116, in __getitem__
    im, gt, h, w = self.pull_item(index)
  File "D:\project\vscode\BiDet\ssd\data\voc0712.py", line 131, in pull_item
    target = self.target_transform(target, width, height)
  File "D:\project\vscode\BiDet\ssd\data\voc0712.py", line 73, in __call__
    label_idx = self.class_to_ind[name]
KeyError: 'warning'
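The final KeyError indicates that an annotation contains an object named 'warning', which is not a key of the class-to-index mapping. Assuming that mapping is built from the 20 standard PASCAL VOC class names, a quick check like the sketch below can list the class names in a custom dataset that the loader would not recognize. `unknown_class_names` and the sample annotation are hypothetical, not part of this repo.

```python
import xml.etree.ElementTree as ET

# The 20 standard PASCAL VOC object classes.
VOC_CLASSES = (
    "aeroplane", "bicycle", "bird", "boat", "bottle", "bus", "car", "cat",
    "chair", "cow", "diningtable", "dog", "horse", "motorbike", "person",
    "pottedplant", "sheep", "sofa", "train", "tvmonitor",
)


def unknown_class_names(xml_text, known_classes=VOC_CLASSES):
    """Return sorted object names in a VOC-style annotation not in known_classes."""
    root = ET.fromstring(xml_text)
    found = set()
    for obj in root.iter("object"):
        found.add(obj.findtext("name", default="").strip().lower())
    return sorted(found - set(known_classes))


# Hypothetical annotation from a custom dataset converted to VOC format.
SAMPLE_ANNOTATION = """
<annotation>
  <object><name>warning</name></object>
  <object><name>person</name></object>
</annotation>
"""

missing = unknown_class_names(SAMPLE_ANNOTATION)
print(missing)
```

Any name reported here would need to be added to the class list used by the dataset loader (and the number of classes adjusted to match) before training on the custom data.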

The training configuration of the Faster RCNN on PASCAL VOC

Thanks for providing your source code.

My problem is that I use the default training configuration in Faster-RCNN/trainval_net.py, which trains the model for 50 epochs and decays the learning rate every 6 epochs. However, I only achieved 17.18% mAP on the test set.

How many epochs do you choose for training? Thanks for your time. I am waiting for your reply.

SSD300_VGG16 architecture

Hi! Have you restructured your network of SSD300_Vgg16?

About DoReFa-Net

How did you obtain the FLOPs of DoReFa-Net, 4694M? What is the calculation rule?

instances_valminusminival2014.json not found

Hi,
I am trying to train Faster R-CNN on the COCO dataset. I have downloaded the COCO dataset using the script provided in the repo. When I follow the instructions and start training the model, I get the following error:

Traceback (most recent call last):
  File "faster_rcnn/trainval_net.py", line 222, in <module>
    imdb, roidb, ratio_list, ratio_index = combined_roidb(args.imdb_name)
  File "/media/Rozhok/BiDet/faster_rcnn/lib/roi_data_layer/roidb.py", line 119, in combined_roidb
    roidbs = [get_roidb(s) for s in imdb_names.split('+')]
  File "/media/Rozhok/BiDet/faster_rcnn/lib/roi_data_layer/roidb.py", line 119, in <listcomp>
    roidbs = [get_roidb(s) for s in imdb_names.split('+')]
  File "/media/Rozhok/BiDet/faster_rcnn/lib/roi_data_layer/roidb.py", line 112, in get_roidb
    imdb = get_imdb(imdb_name)
  File "/media/Rozhok/BiDet/faster_rcnn/lib/datasets/factory.py", line 38, in get_imdb
    return __sets[name]()
  File "/media/Rozhok/BiDet/faster_rcnn/lib/datasets/factory.py", line 31, in <lambda>
    __sets[name] = (lambda split=split, year=year: coco(split, year))
  File "/media/Rozhok/BiDet/faster_rcnn/lib/datasets/coco.py", line 39, in __init__
    self._COCO = COCO(self._get_ann_file())
  File "/home/biometrics/.virtualenvs/retinanet/lib/python3.6/site-packages/pycocotools/coco.py", line 84, in __init__
    with open(annotation_file, 'r') as f:
FileNotFoundError: [Errno 2] No such file or directory: '/home/biometrics/data/coco/annotations/instances_valminusminival2014.json'

I don't have instances_valminusminival2014.json in my annotations folder. Where can I get it from?

Deploy in Xilinx FPGA

Dear,

I am trying to deploy these models on a Xilinx FPGA using the FINN framework. Are you aware of any incompatibilities? Currently, I am having many issues while converting to the ONNX format.

Thanks

The pretrained weight of vgg

When you pre-train VGG on ImageNet, do you add BN or shortcuts (SC)?

If you don't add them, will the parameters fail to match when you load the pretrained model into BiDet?

Is BN in train mode while binary SSD_vgg16 is training?

Dear Author,

I'm sorry to bother you.
I noticed that BN (Batch Norm) is in train mode during SSD training. The four parameters of BN (alpha, beta, mean, variance) are updated along with the binary neural network.
Do I understand correctly?

Thank you very much.

The code about information bottleneck

The code for the information bottleneck is as below:

Why is it not the same as formula J1?

Do I understand you correctly? Thank you very much!

Vgg arch in SSD implementation is different from original vgg

Hi guys, thank you for the awesome implementation. I have a question about bidet_vgg in the SSD implementation. Your bidet_vgg has extra layers, and you have replaced all the maxpool layers with downsampling conv layers. Why did you make these changes? Is the original VGG architecture too low in complexity to learn once it is binarized?

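On binarizing the detection head

Hello, I have the following questions about binarizing the detection head:
1. In RPN_head, I noticed that besides binarizing the first convolution layer, you also changed the proposal_layer, as in the code below: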

# define proposal layer
        self.RPN_proposal = _ProposalLayer_IB(self.feat_stride, self.anchor_scales,
                                              self.anchor_ratios, self.sample_sigma)

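What is the purpose of this change? Does it correspond to the IB principle mentioned in the paper?
2. Regarding binarization of the detection heads: from my reading of the code, of RPN_Head and Roi_Head, only the first layer of RPN_Head is binarized. Is this understanding correct?
Thank you very much!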

The code of Auto-Bidet

Do you have any plans to release the code of the Auto-Bidet?
I'm very interested in it!
Thank you very much!

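Loss is inf

Hello, I converted BDD100K to VOC format and trained on it. I lowered lr to a very small value, but the loss stays at inf throughout training. Could you help me analyze this?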

test error

Hello, I encountered an error when executing the test command after the training was completed. The error message was: PermissionError: [Errno 13] Permission denied:'/path'. Can you give a solution? thank you very much!

Training on my own dataset

It's nice to see your work. If I want to train on my own dataset, which parts should I modify?

Could you provide a trained SSD model with Bi-det strategy?

Hello! Could you provide a trained SSD model with Bi-det strategy? I need to use it for comparative analysis.

Although I use the tricks proposed in existing issues, NaN always appears during training.

Thank you very much!

Where can I find the BiDet-SC model in the code base for COCO?

Hi, I am working on reproducing your results but can't seem to find the model setting for BiDet (SC) that you report in the paper. Can you please point me to the file where I can see the skip connections built into the BiDet (SC) version of BiDet? Thank you.

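About training

Yesterday I tried training the binary SSD300 network on COCO 2017, and this morning I found the loss following this trend. Do you think I should cancel and restart training?

Why does the learning-rate update at 40K iterations cause such a large reaction?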

Questions on FLOPs

According to your paper https://arxiv.org/pdf/2003.03961.pdf, an input size of 600×1000 is used for the binarized Faster R-CNN. However, having read your code carefully, I see that in the BiDetResNet model the first conv layer is 'nn.Conv2d(3, first_inplanes, kernel_size=7, stride=2, padding=3, bias=False)'. With a 600×1000 input, its FLOPs are 1,411,200,000 = 1345.8M, as calculated by torchstat. The FLOPs of the first layer alone exceed the 781M claimed in your paper for the whole binarized Faster R-CNN. Please check this, thanks.
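The figure quoted in this issue can be reproduced by hand. The sketch below counts multiply-accumulate operations for a single convolution, assuming first_inplanes is 64 (the standard ResNet-18 width, which the issue does not state) and no bias. `conv2d_macs` is an illustrative helper, not a torchstat function.

```python
def conv2d_macs(in_h, in_w, in_ch, out_ch, kernel, stride, padding):
    """Multiply-accumulate count of one square-kernel conv layer (no bias)."""
    out_h = (in_h + 2 * padding - kernel) // stride + 1
    out_w = (in_w + 2 * padding - kernel) // stride + 1
    # Each output element needs in_ch * kernel * kernel multiply-accumulates.
    return out_h * out_w * out_ch * in_ch * kernel * kernel


# First layer from the issue: Conv2d(3, first_inplanes, kernel_size=7, stride=2, padding=3)
# first_inplanes = 64 is an assumption, not stated in the issue.
macs = conv2d_macs(in_h=600, in_w=1000, in_ch=3, out_ch=64, kernel=7, stride=2, padding=3)
print(macs, round(macs / 2**20, 1))
```

This only confirms the arithmetic in the issue (with M taken as 2^20). It says nothing about how the paper counts operations, which may follow a different convention for binary and full-precision layers.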

loading pretrained model

Hi,
I download your trained model BiDet-SSD300-VOC_66.0.pth and want to load it with the network in bidet.ssd.py. But I met an issue of mismatching size as follows:

RuntimeError: Error(s) in loading state_dict for BiDetSSD:
size mismatch for conf.0.weight: copying a param with shape torch.Size([84, 512, 3, 3]) from checkpoint, the shape in current model is torch.Size([324, 512, 3, 3]).
size mismatch for conf.0.bias: copying a param with shape torch.Size([84]) from checkpoint, the shape in current model is torch.Size([324]).
size mismatch for conf.1.weight: copying a param with shape torch.Size([126, 1024, 3, 3]) from checkpoint, the shape in current model is torch.Size([486, 1024, 3, 3]).
size mismatch for conf.1.bias: copying a param with shape torch.Size([126]) from checkpoint, the shape in current model is torch.Size([486]).
size mismatch for conf.2.weight: copying a param with shape torch.Size([126, 512, 3, 3]) from checkpoint, the shape in current model is torch.Size([486, 512, 3, 3]).
size mismatch for conf.2.bias: copying a param with shape torch.Size([126]) from checkpoint, the shape in current model is torch.Size([486]).
size mismatch for conf.3.weight: copying a param with shape torch.Size([126, 256, 3, 3]) from checkpoint, the shape in current model is torch.Size([486, 256, 3, 3]).
size mismatch for conf.3.bias: copying a param with shape torch.Size([126]) from checkpoint, the shape in current model is torch.Size([486]).
size mismatch for conf.4.weight: copying a param with shape torch.Size([84, 256, 3, 3]) from checkpoint, the shape in current model is torch.Size([324, 256, 3, 3]).
size mismatch for conf.4.bias: copying a param with shape torch.Size([84]) from checkpoint, the shape in current model is torch.Size([324]).
size mismatch for conf.5.weight: copying a param with shape torch.Size([84, 256, 3, 3]) from checkpoint, the shape in current model is torch.Size([324, 256, 3, 3]).
size mismatch for conf.5.bias: copying a param with shape torch.Size([84]) from checkpoint, the shape in current model is torch.Size([324]).
Can you advise how to fix it?

Thanks
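The shapes in this error message follow a pattern. In SSD, each confidence head outputs (anchors per location × number of classes) channels, so dividing the reported channel counts by the standard SSD300 anchor counts reveals the class count on each side. The sketch below assumes the usual 4/6/6/6/4/4 anchor layout; `infer_num_classes` is an illustrative helper.

```python
def infer_num_classes(conf_out_channels, anchors_per_location):
    """Recover the class count from an SSD confidence head's output channels."""
    if conf_out_channels % anchors_per_location != 0:
        raise ValueError("channel count is not a multiple of the anchor count")
    return conf_out_channels // anchors_per_location


anchors = [4, 6, 6, 6, 4, 4]                    # assumed standard SSD300 layout
checkpoint_out = [84, 126, 126, 126, 84, 84]    # shapes reported for the checkpoint
model_out = [324, 486, 486, 486, 324, 324]      # shapes reported for the current model

checkpoint_classes = {infer_num_classes(c, a) for c, a in zip(checkpoint_out, anchors)}
model_classes = {infer_num_classes(c, a) for c, a in zip(model_out, anchors)}
print(checkpoint_classes, model_classes)
```

Under that assumption the checkpoint was trained with 21 classes (20 VOC classes plus background) while the network was built with 81, which suggests it was constructed with a different class count than the VOC checkpoint expects. Building the network with the VOC class count before loading the weights is the likely fix, though where that count is set in this repo is not confirmed here.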

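About FPN in Faster R-CNN

I noticed that the original Faster R-CNN code includes an FPN layer between the backbone and the RPN head, but I could not find this structure in your code.
Did you use some strategy to replace FPN, or what modifications did you make? I would be grateful for an answer.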

Default value of \beta (args.reg_weight) and \gamma (args.prior_weight) wrong.

In the paper, \beta and \gamma are described as being set to 10 and 0.2. However, the given code uses 0 as the default value for both, which essentially means that the code is going to be run without the proposed additional loss terms. I would appreciate it if the authors could clarify how \beta and \gamma were actually used during training.

Model Binarized Problem

Hi, I am trying to reproduce the experiment in your paper and have run into some problems in the training part. I used the recommended command [python ssd/train_bidet_ssd.py --dataset="VOC" --data_root="D:\01DL\data\VOCdevkit" --basenet="D:\01DL\BiDet-master\ssd\pretrain\vgg16.pth"] to run the training and got the final model saved as "VOC_final.pth". However, "VOC_final.pth", which I take to be the parameters of the trained BiDet_SSD model, is around 127MB. According to the paper, the binary model should be around 20MB, since the model is built by "bidet_ssd" in bidet_ssd.py. I am not sure whether I ran your code incorrectly or missed a parameter setting.
In the test part, I got TypeError as follow:

Hoping for your response, thanks.

Loss goes to NaN at 150K Iterations

Through this issue, I've fixed the problem with the prior/reg loss weights as per the author's response (add 1e-6 to avoid divide by zero).
However, I noticed that my loc_loss and reg_loss became NaN.
I retried with clipping the gradients by setting the --clip_grad option as True.
My loc_loss and reg_losses still became NaN at 150k iteration and the training failed.
The exact command I ran was the following:
python ssd/train_bidet_ssd.py --dataset VOC --data_root ./data/VOCdevkit/ --basenet ./ssd/pretrain/vgg16.pth --clip_grad true
Any help would be appreciated.