spurslipu / yolov3v4-modelcompression-multidatasettraining-multibackbone

YOLO ModelCompression MultidatasetTraining

License: GNU General Public License v3.0

Python 99.29% Shell 0.71%
mobilenetv3 modelcompression multidataset object-detection pruning quantization-aware-training yolo

yolov3v4-modelcompression-multidatasettraining-multibackbone's People

Contributors

billamihom, domdal, geiright, scarcestar-xiecy, spurslipu

yolov3v4-modelcompression-multidatasettraining-multibackbone's Issues

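Does this generalize to other detection models such as SSD or YOLOv4?

@SpursLipu
Does this work with other detection models such as SSD or YOLOv4, or are there many details I would have to modify myself?
Is there any difference between compressing classification models and detection models? Published papers all validate on classification models; can the methods not be applied to both?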

kmeans anchors

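When I use anchor boxes clustered from my own data, why does mAP decrease? With the anchor sizes from the source code my best mAP is 0.95, but with the anchor sizes clustered from my own data the best mAP is only about 0.7.

Training on my own dataset

Hello, how do I prepare and train on my own dataset? My end-of-term project is object detection in X-ray images, and I would also like to learn about model compression. Thank you very much.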

Error loading yolov3.weights

Traceback (most recent call last):
File "F:/Code/YOLOv3-ModelCompression-MultidatasetTraining-Multibackbone-master/train.py", line 516, in <module>
train(hyp) # train normally
File "F:/Code/YOLOv3-ModelCompression-MultidatasetTraining-Multibackbone-master/train.py", line 151, in train
load_darknet_weights(model, weights, pt=opt.pt)
File "F:\Code\YOLOv3-ModelCompression-MultidatasetTraining-Multibackbone-master\models.py", line 566, in load_darknet_weights
assert ptr == len(weights)
AssertionError

prune

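Hello, could you add links to the reference papers for pruning to the README? I am new to model compression and do not know which papers the pruning code corresponds to. Also, when I ran sparsity training for pruning, I checked the code and it does not seem to generate a new cfg file. Should I keep using the original yolov3.cfg? Finally, what value do you usually use for the percent parameter? A reference value would help. Thank you very much.

Training yolov3-mobilenet

In the command for training yolov3-mobilenet, python3 train.py --data data/coco2017.data --batch-size 32 --accumulate 1 -pt --weights weights/yolov3-mobilenet.weights --cfg cfg/yolov3tiny-mobilenet/yolov3tiny-mobilenet-small-coco.cfg --img_size 608, why is the cfg yolov3tiny-mobilenet?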

'YOLOLayer' object has no attribute 'grid'

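Hello, I previously trained my project with your senior labmate's code, but when I load the best.pt saved after training to run prediction on a video, the frames are resized to 256x416: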

video 1/1 (1452/67673) /home/yhy/YOLOv3_pruning/ch03.mp4: 256x416 Done.

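Images are also resized to 256x416 during detection, and the detection quality is very poor, even though the mAP during training looked fine, around 70%.
Your senior labmate's project no longer seems to be maintained. Have you ever seen images being resized to 256x416 in that project?
I could not find the cause, so I switched to your project, but training fails with: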

Traceback (most recent call last):
File "train.py", line 497, in <module>
train(hyp) # train normally
File "train.py", line 387, in train
dataloader=testloader)
File "/home/yhy/pruning/test.py", line 74, in test
_ = model(torch.zeros((1, 3, img_size, img_size), device=device)) if device.type != 'cpu' else None # run once
File "/home/yhy/.local/lib/python3.6/site-packages/torch/nn/modules/module.py", line 532, in __call__
result = self.forward(*input, **kwargs)
File "/home/yhy/.local/lib/python3.6/site-packages/torch/nn/parallel/distributed.py", line 449, in forward
outputs = self.parallel_apply(self._module_copies[:len(inputs)], inputs, kwargs)
File "/home/yhy/.local/lib/python3.6/site-packages/torch/nn/parallel/distributed.py", line 474, in parallel_apply
return parallel_apply(replicas, inputs, kwargs, self.device_ids[:len(replicas)])
File "/home/yhy/.local/lib/python3.6/site-packages/torch/nn/parallel/parallel_apply.py", line 85, in parallel_apply
output.reraise()
File "/home/yhy/.local/lib/python3.6/site-packages/torch/_utils.py", line 394, in reraise
raise self.exc_type(msg)
AttributeError: Caught AttributeError in replica 0 on device 0.
Original Traceback (most recent call last):
File "/home/yhy/.local/lib/python3.6/site-packages/torch/nn/parallel/parallel_apply.py", line 60, in _worker
output = module(*input, **kwargs)
File "/home/yhy/.local/lib/python3.6/site-packages/torch/nn/modules/module.py", line 532, in __call__
result = self.forward(*input, **kwargs)
File "/home/yhy/pruning/models.py", line 306, in forward
return self.forward_once(x)
File "/home/yhy/pruning/models.py", line 358, in forward_once
yolo_out.append(module(x, out))
File "/home/yhy/.local/lib/python3.6/site-packages/torch/nn/modules/module.py", line 532, in __call__
result = self.forward(*input, **kwargs)
File "/home/yhy/pruning/models.py", line 279, in forward
io[..., :2] = torch.sigmoid(io[..., :2]) + self.grid # xy
File "/home/yhy/.local/lib/python3.6/site-packages/torch/nn/modules/module.py", line 576, in __getattr__
type(self).__name__, name))
AttributeError: 'YOLOLayer' object has no attribute 'grid'

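This error does not occur in your senior labmate's project or in ultralytics/yolov3.

Some questions about yolov3tiny-mobilenet-small-coco

1. In MobileNetV3 the original authors do not use ReLU6 and linear in the bneck blocks. ReLU6 and linear were used in V2; V3 uses ReLU and h-swish.
2. In yolov3tiny-mobilenet-small-coco.cfg there is an activation function after every depthwise and pointwise convolution. How was this derived? I did not see this operation in the paper.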

MobileNetV3 Detection

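In the detection part you use the full MobileNet-Large and MobileNet-Small, whereas in Section "6.3. Detection" of the paper the authors ran comparison experiments and defined two layers, C4 and C5. C4 corresponds to the 13th bneck in Large and the 9th bneck in Small, and C5 is the layer just before pooling in both. The operation is: “We additionally reduce the channel counts of all feature layers between C4 and C5 by 2”. The COCO results are as follows:
[Image: 微信图片_20200421135814]
Have you tried this operation? Does it really speed things up while keeping accuracy unchanged?

Hello, does this project support pruning YOLOv2? I added residual layers to the YOLOv2 backbone and attached YOLOv3's three-scale detection layers after it, but pruning fails with an error.

The error is as follows: chenjunsong@chenjunsong-GJ5CN64:~/U-YOLOv3$ python3 normal_prune.py --data data/obj.data --cfg cfg/yolov2/yolo-obj.cfg --weights weights/best.pt --percent 0.1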
Namespace(cfg='cfg/yolov2/yolo-obj.cfg', data='data/obj.data', img_size=608, percent=0.1, weights='weights/best.pt')
Model Summary: 120 layers, 4.07562e+07 parameters, 4.07562e+07 gradients
Caching labels (500 found, 0 missing, 0 empty, 0 duplicate, for 500 images): 100%|█████████████████████████████████████████████████████████████████████████████████████| 500/500 [00:00<00:00, 10877.12it/s]
Class Images Targets P R mAP@0.5 F1: 100%|█████████████████████████████████████████████████████████████████████████████████████| 32/32 [00:28<00:00, 1.12it/s]
all 500 2.09e+03 0.57 0.00759 0.135 0.015
Threshold should be less than 0.9868.
The corresponding prune ratio is 0.987.
Channels with Gamma value less than 0.0965 are pruned!
Number of channels has been reduced from 9888 to 8900
Prune ratio: 0.100
layer index: 0 total channel: 32 remaining channel: 30
layer index: 2 total channel: 64 remaining channel: 59
layer index: 5 total channel: 64 remaining channel: 57
layer index: 10 total channel: 128 remaining channel: 115
layer index: 15 total channel: 256 remaining channel: 224
layer index: 18 total channel: 256 remaining channel: 232
layer index: 23 total channel: 512 remaining channel: 470
layer index: 26 total channel: 512 remaining channel: 471
layer index: 29 total channel: 512 remaining channel: 461
layer index: 30 total channel: 1024 remaining channel: 910
layer index: 31 total channel: 512 remaining channel: 453
layer index: 32 total channel: 1024 remaining channel: 923
layer index: 33 total channel: 512 remaining channel: 461
layer index: 34 total channel: 1024 remaining channel: 933
layer index: 41 total channel: 256 remaining channel: 234
layer index: 42 total channel: 512 remaining channel: 457
layer index: 43 total channel: 256 remaining channel: 229
layer index: 44 total channel: 512 remaining channel: 462
layer index: 45 total channel: 256 remaining channel: 229
layer index: 46 total channel: 512 remaining channel: 459
layer index: 53 total channel: 128 remaining channel: 105
layer index: 54 total channel: 256 remaining channel: 238
layer index: 55 total channel: 128 remaining channel: 118
layer index: 56 total channel: 256 remaining channel: 232
layer index: 57 total channel: 128 remaining channel: 113
layer index: 58 total channel: 256 remaining channel: 225
Prune channels: 988 Prune ratio: 0.063
Caching labels (500 found, 0 missing, 0 empty, 0 duplicate, for 500 images): 100%|█████████████████████████████████████████████████████████████████████████████████████| 500/500 [00:00<00:00, 11243.52it/s]
Class Images Targets P R mAP@0.5 F1: 100%|█████████████████████████████████████████████████████████████████████████████████████| 32/32 [00:27<00:00, 1.15it/s]
all 500 2.09e+03 0.584 0.0067 0.134 0.0133
after prune_model_keep_size map is 0.13384984434263122
Model Summary: 120 layers, 3.55866e+07 parameters, 3.55866e+07 gradients
Traceback (most recent call last):
File "normal_prune.py", line 183, in <module>
init_weights_from_loose_model(compact_model, pruned_model, CBL_idx, Other_idx, CBLidx2mask)
File "/home/chenjunsong/U-YOLOv3/utils/prune_utils.py", line 194, in init_weights_from_loose_model
input_mask = get_input_mask(loose_model.module_defs, idx, CBLidx2mask)
File "/home/chenjunsong/U-YOLOv3/utils/prune_utils.py", line 154, in get_input_mask
return CBLidx2mask[idx - 2]
KeyError: 7
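
For readers following the log above: the lines "Threshold should be less than ..." and "Channels with Gamma value less than ... are pruned!" are consistent with a single global threshold taken over the sorted batch-norm scale factors. Below is a minimal sketch of that idea, assuming the threshold is the value at position `percent` in the sorted list and the upper bound is the smallest per-layer maximum. The function names are hypothetical and are not taken from prune_utils.py, and the sketch does not address the KeyError itself.

```python
def global_prune_threshold(gammas_per_layer, percent):
    """Return (threshold, upper_bound) for channel pruning by BN gamma.

    gammas_per_layer: list of lists holding the gamma values of each BN layer.
    percent: fraction of all channels to prune, in [0, 1).
    """
    all_gammas = sorted(abs(g) for layer in gammas_per_layer for g in layer)
    # Channels whose |gamma| is below this value are pruned.
    threshold = all_gammas[int(len(all_gammas) * percent)]
    # A threshold above the smallest per-layer maximum would empty that layer.
    upper_bound = min(max(abs(g) for g in layer) for layer in gammas_per_layer)
    return threshold, upper_bound


def remaining_channels(gammas_per_layer, threshold):
    """Count the channels kept in each layer for a given threshold."""
    return [sum(1 for g in layer if abs(g) >= threshold)
            for layer in gammas_per_layer]


# Toy gammas for two layers (illustrative values only).
layers = [[0.9, 0.05, 0.4, 0.7], [0.02, 0.6, 0.3, 0.8, 0.1, 0.5]]
thr, bound = global_prune_threshold(layers, percent=0.3)
kept = remaining_channels(layers, thr)
print(thr, bound, kept)  # 0.3 0.8 [3, 4]
```

With 10 toy channels and percent=0.3, the three smallest gammas fall below the threshold, which mirrors how the log reports 988 of 9888 channels pruned at --percent 0.1.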

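YOLOv4 training

When training YOLOv4 you use yolov4.weights. Why not train from the pretrained weights yolov4.conv.137?

Knowledge distillation: output dimensions of the two networks do not match

Hello, for knowledge distillation I use the best.pt from a trained yolov3 as the teacher model and the last.pt from a trained yolov3-tiny as the student model. An error occurs when the following statement is evaluated: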
loss_st = criterion_st(nn.functional.log_softmax(output_s / T, dim=1), nn.functional.softmax(output_t / T, dim=1)) * (T * T) / batch_size
Error message: RuntimeError: The size of tensor a (45486) must match the size of tensor b (10830) at non-singleton dimension 0
My guess is that this is because tiny has only two YOLO layers while yolov3 has three.
The distillation command is: python train.py --data cfg/xray.data --batch_size 2 --KDstr 1 --weights weights/last.pt --cfg cfg/yolov3tiny/yolov3-tiny.cfg --img_size 608 --epochs 80 --quantized 1 --qlayers 72 --t_cfg cfg/yolov3/yolov3.cfg --t_weights yolov3-result/yolov3/best.pt

How can I solve this problem? I look forward to your reply. Thank you very much!
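
The two sizes in the error message can be reproduced with simple arithmetic, which supports the guess about the number of YOLO layers. The sketch below assumes 3 anchors per scale, strides of 32/16/8 for yolov3 and 32/16 for yolov3-tiny, and the --img_size 608 and --batch_size 2 from the command above. `num_predictions` is a hypothetical helper, not a function from the repository.

```python
def num_predictions(img_size, strides, anchors_per_scale=3, batch_size=1):
    """Total number of predicted boxes over all YOLO layers and the batch."""
    total = 0
    for stride in strides:
        grid = img_size // stride          # feature-map side length
        total += grid * grid * anchors_per_scale
    return total * batch_size


teacher = num_predictions(608, strides=[32, 16, 8], batch_size=2)  # yolov3
student = num_predictions(608, strides=[32, 16], batch_size=2)     # yolov3-tiny
print(teacher, student)  # 45486 10830
```

Because the two outputs have different numbers of rows, an element-wise loss over the flattened predictions cannot be applied directly to this teacher and student pair.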

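About inference speedup

Thank you for open-sourcing this project. Regarding inference speedup: can the quantization and distillation in this project make inference faster?

If I want to add a YOLO detection layer with 4x downsampling to the detection part of the yolov3-mobilenet network, do I need to change stride = [32, 16, 8, 4] in models.py?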

The relevant code:

elif mdef['type'] == 'yolo':
    yolo_index += 1
    stride = [32, 16, 8, 4]  # P5, P4, P3 strides
    if 'panet' in cfg or 'yolov4' in cfg:  # stride order reversed
        stride = list(reversed(stride))
    layers = mdef['from'] if 'from' in mdef else []
    modules = YOLOLayer(anchors=mdef['anchors'][mdef['mask']],  # anchor list
                        nc=mdef['classes'],  # number of classes
                        img_size=img_size,  # (416, 416)
                        yolo_index=yolo_index,  # 0, 1, 2...
                        layers=layers,  # output layers
                        stride=stride[yolo_index])

Adding 4 to stride = [32, 16, 8, 4] should be enough, right? Is there anything else that needs to change?

image size

imgsz_min, imgsz_max, imgsz_test = opt.img_size # img sizes (min, max, test)
gs = 64
assert math.fmod(imgsz_min, gs) == 0, '--img-size %g must be a %g-multiple' % (imgsz_min, gs)
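Hello, when I ran the latest version of your program this afternoon, my img_size was 608, so imgsz_min was also 608 and the assertion failed. I do not understand why this assert statement is there. As I understand it, the grid size should be the size of the feature maps output by the three YOLO layers, so why is it set directly to 64 here?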
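
The assertion fails for 608 simply because 608 is not a multiple of 64. The sketch below shows the same check and the nearest sizes that would pass. `nearest_valid_sizes` is a hypothetical helper, and whether 576 or 640 is appropriate for a given model is not something the snippet above settles.

```python
import math


def nearest_valid_sizes(img_size, gs=64):
    """Return the closest multiples of gs at or below and at or above img_size."""
    lower = (img_size // gs) * gs
    upper = math.ceil(img_size / gs) * gs
    return lower, upper


remainder = math.fmod(608, 64)      # same test as the assert in train.py
lower, upper = nearest_valid_sizes(608)
print(remainder)     # 32.0, so the assertion fails
print(lower, upper)  # 576 640
```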

Training error

 0/499      6.3G      2.79     0.299     0.201      3.29         4  1.02e+03: 100%|█| 110/110 [02:37<00:00,  1.43s/it]

Traceback (most recent call last):
File "train.py", line 497, in <module>
train(hyp) # train normally
File "train.py", line 387, in train
dataloader=testloader)
File "D:\2020\prune0513\YOLOv3-ModelCompression-MultidatasetTraining-Multibackbone\test.py", line 74, in test
_ = model(torch.zeros((1, 3, img_size, img_size), device=device)) if device.type != 'cpu' else None # run once
File "C:\ProgramData\Anaconda3\lib\site-packages\torch\nn\modules\module.py", line 541, in __call__
result = self.forward(*input, **kwargs)
File "D:\2020\prune0513\YOLOv3-ModelCompression-MultidatasetTraining-Multibackbone\models.py", line 306, in forward
return self.forward_once(x)
File "D:\2020\prune0513\YOLOv3-ModelCompression-MultidatasetTraining-Multibackbone\models.py", line 358, in forward_once
yolo_out.append(module(x, out))
File "C:\ProgramData\Anaconda3\lib\site-packages\torch\nn\modules\module.py", line 541, in __call__
result = self.forward(*input, **kwargs)
File "D:\2020\prune0513\YOLOv3-ModelCompression-MultidatasetTraining-Multibackbone\models.py", line 279, in forward
io[..., :2] = torch.sigmoid(io[..., :2]) + self.grid # xy
RuntimeError: The size of tensor a (32) must match the size of tensor b (20) at non-singleton dimension 3

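I added one more YOLO detection layer to the detection part of the yolov3-mobilenet network, making four detection layers, and changed the anchors to 12 groups, but it fails with an error and cannot train.

I also tried adding another YOLO detection layer after yolov3 and got the same error.
The error is as follows: chenjunsong@chenjunsong-GJ5CN64:~/U-YOLOv3$ python3 train.py --data data/obj.data --batch-size 6 --cfg cfg/yolov3-mobilenet/yolov3-mobilenet1.cfg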
Apex recommended for faster mixed precision training: https://github.com/NVIDIA/apex
Namespace(KDstr=-1, accumulate=2, adam=False, batch_size=6, bucket='', cache_images=False, cfg='./cfg/yolov3-mobilenet/yolov3-mobilenet1.cfg', data='data/obj.data', device='', epochs=601, evolve=False, img_size=[320, 640], multi_scale=False, name='', nosave=False, notest=False, prune=-1, pt=False, qlayers=-1, quantized=-1, rect=False, resume=False, s=0.001, single_cls=False, sr=False, t_cfg='', t_weights='', weights='')
Using CUDA device0 _CudaDeviceProperties(name='GeForce GTX 1060', total_memory=6072MB)

Start Tensorboard with "tensorboard --logdir=runs", view at http://localhost:6006/
Traceback (most recent call last):
File "train.py", line 512, in <module>
train(hyp) # train normally
File "train.py", line 94, in train
model = Darknet(cfg, quantized=opt.quantized, qlayers=opt.qlayers).to(device)
File "/home/chenjunsong/U-YOLOv3/models.py", line 364, in __init__
qlayers=self.qlayers)
File "/home/chenjunsong/U-YOLOv3/models.py", line 233, in create_modules
stride=stride[yolo_index])
IndexError: list index out of range
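
An IndexError at stride[yolo_index] is what Python raises when the stride list has fewer entries than the cfg has yolo layers. The sketch below illustrates this under the assumption that the list in this checkout holds three strides (the traceback does not show its contents). Both helper functions are hypothetical and are not part of models.py.

```python
def strides_for_yolo_layers(num_yolo_layers, strides=(32, 16, 8)):
    """Pick one stride per YOLO layer, failing early with a clear message."""
    if num_yolo_layers > len(strides):
        raise ValueError(
            "cfg defines %d yolo layers but only %d strides are listed"
            % (num_yolo_layers, len(strides))
        )
    return list(strides[:num_yolo_layers])


def grid_sizes(img_size, strides):
    """Feature-map side length that each YOLO layer expects at img_size."""
    return [img_size // s for s in strides]


four = strides_for_yolo_layers(4, strides=(32, 16, 8, 4))
print(four)                   # [32, 16, 8, 4]
print(grid_sizes(640, four))  # [20, 40, 80, 160]
```

Extending the list only removes the indexing error. The feature map routed into each yolo layer in the cfg also needs the matching side length, otherwise the grid and the prediction tensor will not line up.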

AssertionError during knowledge distillation

I ran distillation with the following arguments: --data cfg/xray.data --batch_size 2 --KDstr 1 --weights weights/yolov3-mobilenet.weights --cfg cfg/yolov3-mobilenet/yolov3-mobilenet-coco.cfg --img_size 608 --epochs 80 --quantized 1 --qlayers 72 --t_cfg cfg/yolov3/yolov3.cfg --t_weights weights/best.pt

Here best.pt is the best weight file from my yolov3 training. Why does an AssertionError occur? It is raised at line 544 of models.py, at the statement assert ptr==len(weights).

When I run yolov3 there is no error. The yolov3 arguments are: --data data/xray.data --batch_size 2 --accumulate 1 -pt --weights weights/yolov3-608.weights --cfg cfg/yolov3/yolov3.cfg --img_size 608 --epochs 60

I look forward to your reply. Thank you!

Prune problem

When I run normal prune, an "IndexError: index 0 is out of range" is raised in the function prune_model_keep_size in prune_utils.py, at the statement next_conv = pruned_model.module_list[next_idx][0].
command:

  1. python train.py --data data/xray.data -pt --batch_size 2 --accumulate 1 --weights yolov3-result/yolov3/best.pt --cfg cfg/yolov3/yolov3.cfg -sr --s 0.001 --prune 0 --epochs 120
  2. python normal_prune.py --cfg cfg/yolov3/yolov3.cfg --data data/xray.data --weights yolov3-result/yolov3-prune/sr/best.pt --percent 0.5
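    Where does the problem come from? I look forward to your reply. Thank you very much!

Error during fine-tuning after normal pruning

After normal pruning with a prune ratio of 0.8, loading the pruned model for fine-tuning fails with a GPU out-of-memory error, as shown in the image below.
[Image]
@SpursLipu Could you please take a look? Is the training error caused by the network structure after pruning? Thanks.

DoReFa quantization question

Hello, I do not fully understand the DoReFa quantization part of your code, and I found the DoReFa paper confusing as well. Could you briefly explain the quantization logic of the following code?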

output = torch.tanh(input)
output = output / 2 / torch.max(torch.abs(output)) + 0.5 # normalize to [0, 1]
scale = float(2 ** self.w_bits - 1)
output = output * scale
output = self.round(output)
output = output / scale
output = 2 * output - 1

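After quantization-aware training with this method, I want to convert the model to int8 and deploy it on an FPGA, but I am not sure how to convert the saved fp32 model to int8.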
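
The snippet above can be read step by step: tanh squashes the weights, the division maps them into [0, 1], rounding to a grid of 2^w_bits - 1 steps picks an integer level, and the last line maps the result back to [-1, 1]. The pure-Python sketch below mirrors those steps and also returns the integer level itself, which is one plausible thing to store for a fixed-point deployment. `dorefa_weight_codes` is a hypothetical name, the sketch assumes at least one non-zero weight, and it covers weights only, not activations or any FPGA toolchain.

```python
import math


def dorefa_weight_codes(weights, w_bits):
    """Return (codes, dequantized) for a flat list of float weights."""
    squashed = [math.tanh(w) for w in weights]
    max_abs = max(abs(v) for v in squashed)   # assumed to be non-zero
    scale = float(2 ** w_bits - 1)
    codes, dequantized = [], []
    for v in squashed:
        normalized = v / 2 / max_abs + 0.5    # now in [0, 1]
        q = int(round(normalized * scale))    # integer level in [0, 2**w_bits - 1]
        codes.append(q)
        dequantized.append(2 * (q / scale) - 1)  # same value the snippet outputs
    return codes, dequantized


codes, dequantized = dorefa_weight_codes([-1.0, 0.5, 1.0], w_bits=2)
print(codes)  # [0, 2, 3]
```

Under this reading, each quantized weight equals 2 * q / scale - 1, so storing q as an unsigned integer together with scale is enough to reconstruct it.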

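Question about how fast mAP rises

Hello, after sparsity pruning of yolov3 (best.pt) with percent set to 0.5 I obtained prune_0.5_yolov3.weights. When I then ran distillation with best.pt and prune_0.5_yolov3.weights, mAP rose in a strange way:
From epoch 0 to epoch 4, mAP went 0.332 --> 0.528 --> 0.645 --> 0.706 --> 0.801, a very fast rise, but from epoch 5 to epoch 100 it stayed within (0.802, 0.83), as if distillation had finished within the first few epochs. Could you explain this behavior? I do not quite understand it. Thank you very much!

About the CBAM module

I added a CBAM module to the network and got the following error during training. Could you help me analyze the cause?

[Screenshot: 2020-06-23 18-44-35]

Loading the yolov3-mobilenet model

If I make a small change to the yolov3-mobilenet network structure and want to use your pretrained yolov3-mobilenet model, do I need to change the command line, or should I modify train.py?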

quantization

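Hello, I previously asked why the model size does not change after quantization, and you answered that it only shrinks in an actual deployment. How do I carry out an actual deployment? The deadline for my end-of-term course project is approaching. I look forward to your reply. Thank you very much!

Knowledge distillation

Why is last.pt so much smaller on disk than the other saved models? When using knowledge distillation, the teacher model is best.pt and the student model is last.pt. In my normal training with 60 epochs, the mAP of last.pt and best.pt is about the same, so why are the teacher and student models set up this way for distillation?

mAP drops sharply after pruning

After pruning, mAP dropped from 0.84 to 0.035. Why did it drop so much? The prune ratio is 0.2 and the --percent parameter is 0.2. I look forward to your reply. Thank you very much.

YoloV4

Does this repo support YOLOv4?

Quantization

Hello, I have already pruned my model down to 10% of its original size. Would quantization still have a significant effect?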

Assertion error

python3 train.py --data data/coco2017.data --batch-size 32 --accumulate 1 -pt --weights weights/yolov3-608.weights --cfg cfg/yolov3/yolov3.cfg --img_size 608
When I train yolov3 with the command above, I get an assertion error from the statement assert ptr == len(weights) at line 544 of models.py. While debugging, I found that weights are only assigned when mdef['type'] == 'convolutional' (line 456 of models.py). How can I solve this problem? Thanks!
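
An assertion of this form checks that the loader consumed exactly as many floats as the weight file contains, so it fails whenever the cfg and the weight file describe different layer shapes. The sketch below counts the floats a list of convolutional layers would consume, assuming the usual Darknet layout: four batch-norm vectors plus the kernel for layers with batch norm, or a bias vector plus the kernel otherwise. `expected_float_count` and the tuple format are hypothetical and are not taken from models.py.

```python
def expected_float_count(conv_layers):
    """Count the floats a Darknet-style loader would read for these layers.

    conv_layers: iterable of (in_channels, out_channels, kernel_size, has_bn).
    """
    total = 0
    for in_ch, out_ch, k, has_bn in conv_layers:
        if has_bn:
            total += 4 * out_ch   # BN bias, BN weight, running mean, running var
        else:
            total += out_ch       # plain convolution bias
        total += out_ch * in_ch * k * k  # convolution kernel
    return total


# Two illustrative layers: a 3x3 conv with BN and a 1x1 output conv without BN.
layers = [(3, 32, 3, True), (1024, 255, 1, False)]
count = expected_float_count(layers)
print(count)  # 262367
```

Comparing such a count for the layers in the cfg with the number of floats in the weight file is one way to see on which side the mismatch lies.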
