yatenglg / retinanet-pytorch


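RetinaNet object detection algorithm (simple, clear, easy to use, fully commented in Chinese, single-machine multi-GPU training, video detection) (based on PyTorch; simple, clear, multi-GPU)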

Python 100.00%
pytorch retinanet object-detection

retinanet-pytorch's Issues

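Do the final 3x3 convolutions in the FPN all use conv1?

Hi, I'm studying your code. In fpn.py, do the final 3x3 convolutions all use self.top_down_conv1? Doesn't that mean the weights are shared? If so, why are conv2 and conv3 also defined above? I'd appreciate an explanation.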

Error

IndexError: The shape of the mask [8, 1] at index 1 does not match the shape of the indexed tensor [8, 67995] at index 1
I changed batch_size to 8. Where do I need to make changes to fix this error?

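Number of iterations

I set the number of training iterations to 20000, but training does not stop automatically once that number is reached.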
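One common way to make an iteration-based loop honor its limit is to check the counter inside the batch loop and leave both the batch loop and the surrounding epoch loop at once. The sketch below is not the repository's trainer; train_for, batches, and step are hypothetical names used only for illustration.

```python
def train_for(max_iter, batches, step):
    """Run `step` on batches until exactly `max_iter` iterations have completed.

    `batches` is any re-iterable collection standing in for a DataLoader;
    `step` stands in for forward pass, loss, backward pass, and optimizer step.
    """
    iteration = 0
    while iteration < max_iter:           # start another pass when the data runs out
        seen_any = False
        for batch in batches:
            seen_any = True
            step(batch)
            iteration += 1
            if iteration >= max_iter:     # stop in the middle of a pass if needed
                return iteration
        if not seen_any:                  # guard against an empty dataset
            break
    return iteration


calls = []
done = train_for(max_iter=7, batches=[0, 1, 2], step=calls.append)
print(done, calls)  # 7 [0, 1, 2, 0, 1, 2, 0]
```

Returning from the function, rather than using a single break, is what prevents the outer loop from starting a new pass after the limit is reached.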

RuntimeError: Found dtype Double but expected Float

--- load weight finish ---
Setting up a new session...
Max_iter = 120000, Batch_size = 20
Model will train on cuda:[0]
--- Focal_loss alpha = 0.25, the background class will be down-weighted; use this in object detection tasks ---
--- Multiboxloss : α=0.25 γ=2 num_classes=21
Set optimizer : SGD (
Parameter Group 0
dampening: 0
initial_lr: 0.001
lr: 0.001
momentum: 0.9
nesterov: False
weight_decay: 0.0005
)
Set scheduler : <torch.optim.lr_scheduler.MultiStepLR object at 0x7f7d7f196e20>
Set lossfunc : multiboxloss(
(loc_loss_fn): SmoothL1Loss()
(cls_loss_fn): focal_loss()
)
Start Train......


/home/pdj/PycharmProjects/lyy/Retinanet-Pytorch/Data/Transfroms_utils.py:263: VisibleDeprecationWarning: Creating an ndarray from ragged nested sequences (which is a list-or-tuple of lists-or-tuples-or ndarrays with different lengths or shapes) is deprecated. If you meant to do this, you must specify 'dtype=object' when creating the ndarray
mode = random.choice(self.sample_options)
/home/pdj/PycharmProjects/lyy/Retinanet-Pytorch/Data/Transfroms_utils.py:263: VisibleDeprecationWarning: Creating an ndarray from ragged nested sequences (which is a list-or-tuple of lists-or-tuples-or ndarrays with different lengths or shapes) is deprecated. If you meant to do this, you must specify 'dtype=object' when creating the ndarray
mode = random.choice(self.sample_options)

Traceback (most recent call last):
File "/home/pdj/PycharmProjects/lyy/Retinanet-Pytorch/Demo_train.py", line 36, in <module>
trainer(net, train_dataset)
File "/home/pdj/PycharmProjects/lyy/Retinanet-Pytorch/Model/trainer.py", line 122, in __call__
loss.backward()
File "/home/pdj/anaconda3/envs/lyy/lib/python3.8/site-packages/torch/tensor.py", line 221, in backward
torch.autograd.backward(self, gradient, retain_graph, create_graph)
File "/home/pdj/anaconda3/envs/lyy/lib/python3.8/site-packages/torch/autograd/__init__.py", line 130, in backward
Variable._execution_engine.run_backward(
RuntimeError: Found dtype Double but expected Float

Process finished with exit code 1
Why does this happen? In Transforms.py I can clearly see ConvertFromInts(),
and in Transforms_utils.py I can clearly see return image.astype(np.float32), boxes, labels.
So why is RuntimeError: Found dtype Double but expected Float raised?
Could it be caused by the VisibleDeprecationWarning above?
Library versions:
python 3.8.5 h7579374_1
pytorch 1.7.0 py3.8_cuda10.1.243_cudnn7.6.3_0 pytorch
torchvision 0.8.1 py38_cu101 pytorch
numpy 1.19.4 pypi_0 pypi
opencv-python 4.4.0.46 pypi_0 pypi
yacs 0.1.8 pypi_0 pypi
visdom 0.1.8.9 pypi_0 pypi
vizer 0.1.5 pypi_0 pypi
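A frequent source of this error, offered here as a guess rather than a confirmed diagnosis, is that only the image is cast to float32 while the boxes keep NumPy's default float64. torch.from_numpy preserves the NumPy dtype, so float64 boxes become Double tensors and can reach the loss as regression targets. The sketch below uses made-up box values and a hypothetical helper, to_float32_targets, to show the cast.

```python
import numpy as np


def to_float32_targets(boxes, labels):
    """Cast targets before tensor conversion (hypothetical helper, not repository code)."""
    boxes = np.array(boxes, dtype=np.float32).reshape(-1, 4)
    labels = np.array(labels, dtype=np.int64)
    return boxes, labels


raw_boxes = np.array([[48.0, 240.0, 195.0, 371.0]])  # made-up values
print(raw_boxes.dtype)                                # float64, NumPy's default for floats

boxes, labels = to_float32_targets(raw_boxes, [11])
print(boxes.dtype, labels.dtype)                      # float32 int64
```

Printing the dtype of every tensor passed to the loss function is a quick way to confirm or rule out this explanation.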

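Too many false positives

Hello, and thank you very much for your code. I tried adding your focal loss to a RetinaFace model without changing any parameters, and I get a very large number of false detections. Do you know what the cause might be? Is it because I did not use hard negative mining, or because my parameters are not tuned correctly? (I tried changing the threshold, but it had very little effect.)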

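About the loss computation

As the title says: when computing the loc and cls losses, you sum the loss over both positive and negative samples, but the value returned is divided only by the number of positive samples. Could you explain why it is done this way, or point me to a relevant link?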
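For reference, the RetinaNet paper describes the same convention for the classification term: the focal loss is summed over all anchors and normalized by the number of anchors assigned to a ground-truth box, because the easy negatives that make up most anchors contribute very little loss. The sketch below uses the paper's binary form with made-up probabilities and the α=0.25, γ=2 values that appear in the training logs; it is not the repository's focal_loss.

```python
import math


def focal_loss(p, is_positive, alpha=0.25, gamma=2.0):
    """Binary focal loss for one anchor; p is the predicted foreground probability."""
    p_t = p if is_positive else 1.0 - p
    alpha_t = alpha if is_positive else 1.0 - alpha
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)


# Made-up predictions: 2 positive anchors and 1000 easy negatives.
positives = [0.6, 0.3]
negatives = [0.01] * 1000

negative_total = sum(focal_loss(p, False) for p in negatives)
total = sum(focal_loss(p, True) for p in positives) + negative_total

num_pos = max(len(positives), 1)   # avoid division by zero on images without objects
loss = total / num_pos

share_of_negatives = negative_total / total
print(loss, share_of_negatives)
```

With these numbers the thousand negatives account for well under one percent of the total, so dividing by the full anchor count would mostly just shrink the loss.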

Detection speed

The network's detection speed is surprisingly slow. Do you have any ideas for improving it?

ValueError: setting an array element with a sequence. The requested array has an inhomogeneous shape after 1 dimensions. The detected shape was (6,) + inhomogeneous part.

--- load weight finish ---
Setting up a new session...
Max_iter = 120000, Batch_size = 20
Model will train on cuda:[0]
--- Focal_loss alpha = 0.25, the background class will be down-weighted; use this in object detection tasks ---
--- Multiboxloss : α=0.25 γ=2 num_classes=21
Set optimizer : SGD (
Parameter Group 0
dampening: 0
initial_lr: 0.001
lr: 0.001
momentum: 0.9
nesterov: False
weight_decay: 0.0005
)
Set scheduler : <torch.optim.lr_scheduler.MultiStepLR object at 0x00000248040508B0>
Set lossfunc : multiboxloss(
(loc_loss_fn): SmoothL1Loss()
(cls_loss_fn): focal_loss()
)
Start Train......


Traceback (most recent call last):
File "D:\software\PyCharm\PyCharm Community Edition 2022.1.3\plugins\python-ce\helpers\pydev\pydevd.py", line 1491, in _exec
pydev_imports.execfile(file, globals, locals) # execute the script
File "D:\software\PyCharm\PyCharm Community Edition 2022.1.3\plugins\python-ce\helpers\pydev\_pydev_imps\_pydev_execfile.py", line 18, in execfile
exec(compile(contents+"\n", file, 'exec'), glob, loc)
File "D:/code/ai/Retinanet/Retinanet-Pytorch-master/Demo_train.py", line 36, in <module>
trainer(net, train_dataset)
File "D:\code\ai\Retinanet\Retinanet-Pytorch-master\Model\trainer.py", line 112, in __call__
for iteration, (images, boxes, labels, image_names) in enumerate(data_loader):
File "D:\software\supermap\idesktopX\support\MiniConda\conda\envs\retinanet\lib\site-packages\torch\utils\data\dataloader.py", line 435, in __next__
data = self._next_data()
File "D:\software\supermap\idesktopX\support\MiniConda\conda\envs\retinanet\lib\site-packages\torch\utils\data\dataloader.py", line 1085, in _next_data
return self._process_data(data)
File "D:\software\supermap\idesktopX\support\MiniConda\conda\envs\retinanet\lib\site-packages\torch\utils\data\dataloader.py", line 1111, in _process_data
data.reraise()
File "D:\software\supermap\idesktopX\support\MiniConda\conda\envs\retinanet\lib\site-packages\torch\_utils.py", line 428, in reraise
raise self.exc_type(msg)
ValueError: Caught ValueError in DataLoader worker process 0.
Original Traceback (most recent call last):
File "D:\software\supermap\idesktopX\support\MiniConda\conda\envs\retinanet\lib\site-packages\torch\utils\data\_utils\worker.py", line 198, in _worker_loop
data = fetcher.fetch(index)
File "D:\software\supermap\idesktopX\support\MiniConda\conda\envs\retinanet\lib\site-packages\torch\utils\data\_utils\fetch.py", line 44, in fetch
data = [self.dataset[idx] for idx in possibly_batched_index]
File "D:\software\supermap\idesktopX\support\MiniConda\conda\envs\retinanet\lib\site-packages\torch\utils\data\_utils\fetch.py", line 44, in <listcomp>
data = [self.dataset[idx] for idx in possibly_batched_index]
File "D:\code\ai\Retinanet\Retinanet-Pytorch-master\Data\Dataset_VOC.py", line 48, in __getitem__
image, boxes, labels = self.transform(image, boxes, labels)
File "D:\code\ai\Retinanet\Retinanet-Pytorch-master\Data\Transfroms.py", line 40, in __call__
img, boxes, labels = t(img, boxes, labels)
File "D:\code\ai\Retinanet\Retinanet-Pytorch-master\Data\Transfroms_utils.py", line 263, in __call__
mode = random.choice(self.sample_options)
File "mtrand.pyx", line 920, in numpy.random.mtrand.RandomState.choice
ValueError: setting an array element with a sequence. The requested array has an inhomogeneous shape after 1 dimensions. The detected shape was (6,) + inhomogeneous part.
What causes this?
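The traceback ends in numpy.random's choice, which first converts its argument to an array. If sample_options mixes None with tuples, as in SSD-style augmentation code (an assumption; the actual tuple is not shown in the issue), that conversion is ragged. Older NumPy releases only emit the VisibleDeprecationWarning seen in the earlier report, while newer releases raise this ValueError. Picking an index instead of an element avoids the conversion. A sketch with hypothetical option values and a hypothetical helper, pick_option:

```python
import random

# Hypothetical options mixing None and tuples, so they cannot form a regular array.
sample_options = (None, (0.1, None), (0.3, None), (0.7, None), (0.9, None), (None, None))


def pick_option(options, rng=random):
    """Choose one option by index so the options are never converted to an array."""
    index = rng.randrange(len(options))   # with NumPy: np.random.randint(len(options))
    return options[index]


mode = pick_option(sample_options)
print(mode)
```

Pinning NumPy to the version the code was written against is another option, at the cost of keeping the deprecated behavior.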

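Input size and prediction box sizes

Thanks for providing this model. Because I modified the FPN structure, training (on a GPU with 8 GB of memory) with an input size of 600 always fails with CUDA out of memory. I would like to reduce the input size to 300, in which case the 5 feature map levels should become 38, 19, 10, 5, and 3. How should the corresponding prediction box sizes be set?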
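The feature map sizes quoted in the question are what the usual FPN strides of 8, 16, 32, 64, and 128 (levels P3 to P7) give when every stride-2 step rounds up. The RetinaNet paper pairs those levels with base anchor sizes of 32 to 512 pixels. Because resizing an image to 300 also shrinks its objects, scaling the base sizes in proportion to the input size is one reasonable starting point, though the best values remain a tuning question. A sketch, assuming those strides and sizes still apply to the modified FPN; both function names are hypothetical:

```python
import math

STRIDES = (8, 16, 32, 64, 128)                 # P3..P7 strides, assumed unchanged
PAPER_ANCHOR_SIZES = (32, 64, 128, 256, 512)   # base anchor side per level in the paper


def feature_map_sizes(image_size, strides=STRIDES):
    """Side length of each square feature map, rounding up at every level."""
    return [math.ceil(image_size / stride) for stride in strides]


def scaled_anchor_sizes(image_size, reference_size=600, sizes=PAPER_ANCHOR_SIZES):
    """One possible choice: shrink the base sizes in proportion to the input size."""
    return [size * image_size / reference_size for size in sizes]


print(feature_map_sizes(300))    # [38, 19, 10, 5, 3]
print(feature_map_sizes(600))    # [75, 38, 19, 10, 5]
print(scaled_anchor_sizes(300))  # [16.0, 32.0, 64.0, 128.0, 256.0]
```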

Dataset download

Hello! Could you provide a download link for the dataset? I would also appreciate the specific training steps. Thank you!

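How can the width and height of the input IMAGE_SIZE be set separately?

Thanks for providing this code! The current setup resizes both the width and the height of the input image to the same IMAGE_SIZE (600 px). For datasets with an extreme aspect ratio such as KITTI, where the original images are roughly 1200*375, resizing to a square loses so many pixels of targets such as pedestrians and cyclists that they can no longer be recognized. I would therefore like to set the width and height of IMAGE_SIZE separately. In that case, how should the anchors, the feature map sizes, and the related internal parameters be modified?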
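In general, anchor generation only needs the feature map height and width of each level, so a rectangular input mainly means computing the two dimensions separately and iterating over a rows-by-columns grid instead of a square one. The sketch below assumes strides of 8 to 128 and uses a hypothetical input of height 384 and width 1216; the function names are illustrative and this is not the repository's prior-box code.

```python
import math

STRIDES = (8, 16, 32, 64, 128)   # assumed FPN strides


def feature_map_shapes(height, width, strides=STRIDES):
    """(rows, cols) of each feature map for a rectangular input."""
    return [(math.ceil(height / s), math.ceil(width / s)) for s in strides]


def anchor_centers(height, width, stride):
    """Anchor centers (cx, cy) in input pixels for one level, row by row."""
    rows, cols = math.ceil(height / stride), math.ceil(width / stride)
    return [((col + 0.5) * stride, (row + 0.5) * stride)
            for row in range(rows) for col in range(cols)]


shapes = feature_map_shapes(384, 1216)
print(shapes)                     # [(48, 152), (24, 76), (12, 38), (6, 19), (3, 10)]

centers = anchor_centers(384, 1216, stride=128)
print(len(centers), centers[0])   # 30 (64.0, 64.0)
```

If the code normalizes box coordinates to the 0..1 range, x values would be divided by the width and y values by the height rather than by a single IMAGE_SIZE.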

RuntimeError

Traceback (most recent call last):
File "F:/Retinanet-Pytorch-master/Demo_train.py", line 36, in <module>
trainer(net, train_dataset)
File "F:\Retinanet-Pytorch-master\Model\trainer.py", line 115, in __call__
reg_loss, cls_loss = self.loss_func(cls_logits, bbox_preds, labels, boxes)
File "D:\anaconda3\envs\pytorch\lib\site-packages\torch\nn\modules\module.py", line 547, in __call__
result = self.forward(*input, **kwargs)
File "F:\Retinanet-Pytorch-master\Model\struct\MultiBoxLoss.py", line 57, in forward
predicted_locations = predicted_locations[pos_mask, :].view(-1, 4)
RuntimeError: copy_if failed to synchronize: device-side assert triggered
I reduced batch_size to 1 in the config file and also changed the learning rate to 1e-4, but the error still occurs. What is the cause?
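A device-side assert usually surfaces at a later CUDA call than the one that triggered it, so the indexing line in the traceback may not be the real culprit. Running once on the CPU, or with the environment variable CUDA_LAUNCH_BLOCKING=1, generally gives a more precise location. One frequent cause, offered only as a possibility here, is a class label outside the range the loss expects. The check below is a hypothetical helper; num_classes=21 matches the value printed in the training logs of other issues and may differ in your config.

```python
def find_bad_labels(labels, num_classes):
    """Return (position, label) pairs that fall outside 0..num_classes-1."""
    return [(i, label) for i, label in enumerate(labels)
            if not 0 <= label < num_classes]


# Made-up labels for one image; with 21 classes the valid range is 0..20.
print(find_bad_labels([3, 21, 0, -1], num_classes=21))  # [(1, 21), (3, -1)]
```

Running such a check over the whole dataset before training is cheaper than debugging the assert on the GPU.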

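Negative samples in assign_priors

As the title says: in the original paper, anchors with 0.5 > IoU > 0.4 appear to be labeled -1 so that they are ignored in the final loss, and only anchors with IoU < 0.4 count as negatives. In your code, every anchor with IoU < 0.5 is treated as a negative. What is the basis for this, or is there another reference?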
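For comparison, the assignment rule in the RetinaNet paper is: an anchor whose best IoU with a ground-truth box is at least 0.5 is positive, one with IoU in [0, 0.4) is background, and anchors in between are ignored during training. The sketch below implements that rule, not this repository's assign_priors; the function name and the label convention (1 foreground, 0 background, -1 ignore) are assumptions for illustration.

```python
def assign_label(max_iou, pos_thresh=0.5, neg_thresh=0.4):
    """Label one anchor from its highest IoU with any ground-truth box."""
    if max_iou >= pos_thresh:
        return 1     # foreground
    if max_iou < neg_thresh:
        return 0     # background
    return -1        # ignored: excluded from the loss


print([assign_label(iou) for iou in (0.1, 0.39, 0.4, 0.45, 0.5, 0.8)])
# [0, 0, -1, -1, 1, 1]
```

Setting neg_thresh equal to pos_thresh removes the ignore band and reproduces the behavior described in the question.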

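BUG! When an image (.xml) contains no objects

When the training set contains an image with no objects, Transfroms_utils.py in the Data folder raises IndexError: too many indices for array at the line boxes[:, 0] /= width.
The cause is that the image has no ground truth: no bbox can be found in its xml file.
Such images are common in newer datasets. I hope you can fix this, thank you!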
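The IndexError appears because an empty box list becomes a one-dimensional array of shape (0,), which cannot be indexed with two indices. One possible workaround, sketched with a hypothetical helper rather than the repository's code, is to force the (N, 4) shape before any column arithmetic, so that an image without objects yields an array of shape (0, 4).

```python
import numpy as np


def normalize_boxes(boxes, width, height):
    """Scale [xmin, ymin, xmax, ymax] boxes to the 0..1 range; accepts an empty list."""
    boxes = np.array(boxes, dtype=np.float32).reshape(-1, 4)
    boxes[:, 0::2] /= width     # xmin, xmax
    boxes[:, 1::2] /= height    # ymin, ymax
    return boxes


print(normalize_boxes([], 500, 375).shape)               # (0, 4)
print(normalize_boxes([[50, 75, 250, 300]], 500, 375))   # approximately [[0.1 0.2 0.5 0.8]]
```

Later stages such as anchor matching and the loss would also need to tolerate zero boxes, for example by treating every anchor of that image as background.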
