
duankaiwen / lsnet

153 stars · 28 forks · 28.01 MB

Location-Sensitive Visual Recognition with Cross-IOU Loss

Python 57.67% Makefile 0.03% C++ 3.00% C 0.21% Jupyter Notebook 35.69% Dockerfile 0.02% Batchfile 0.03% Shell 0.04% Cuda 3.08% Cython 0.25%
instance-segmentation object-detection pose-estimation


lsnet's People

Contributors

198808xc, duankaiwen, johnson-magic


lsnet's Issues

Starting point of the 36 points in instance segmentation

For instance segmentation, which point should be selected as the starting point of the 36 contour points?
Also, if points are sampled equidistantly along the boundary, some extreme points, such as corner points, may be missed. How should that case be handled?
And with more points, say 72 or even 256, would the Cross-IOU Loss become hard to optimize?
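
For reference (not from the authors): one common way to obtain a fixed number of roughly equidistant boundary points is to parameterize the annotated polygon by arc length and interpolate. The sketch below assumes that scheme; the starting point is simply the first annotated vertex, and anchoring it at a canonical location (say, the topmost point) is one way to make it consistent across instances. The helper name resample_polygon is hypothetical.

import numpy as np

def resample_polygon(polygon, num_points=36):
    """Resample a closed polygon to num_points equidistant points along
    its perimeter. polygon is an (N, 2) array of x, y vertices."""
    pts = np.asarray(polygon, dtype=np.float64)
    closed = np.vstack([pts, pts[:1]])                   # close the contour
    seg_lens = np.linalg.norm(np.diff(closed, axis=0), axis=1)
    cum = np.concatenate([[0.0], np.cumsum(seg_lens)])   # cumulative arc length
    # Target arc-length positions, equally spaced over the perimeter.
    targets = np.linspace(0.0, cum[-1], num_points, endpoint=False)
    x = np.interp(targets, cum, closed[:, 0])            # interpolate x
    y = np.interp(targets, cum, closed[:, 1])            # interpolate y
    return np.stack([x, y], axis=1)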

bbox-pose training results

config: lsnet_pose_bbox_r50_fpn_1x_coco.py

2021-05-31 22:31:03,588 - mmdet - INFO - Evaluating bbox...
Loading and preparing results...
DONE (t=0.75s)
creating index...
index created!
Running per image evaluation...
Evaluate annotation type bbox
DONE (t=3.70s).
Accumulating evaluation results...
DONE (t=0.70s).
Average Precision (AP) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.448
Average Precision (AP) @[ IoU=0.50 | area= all | maxDets=100 ] = 0.625
Average Precision (AP) @[ IoU=0.75 | area= all | maxDets=100 ] = 0.499
Average Precision (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.151
Average Precision (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.607
Average Precision (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.734
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 1 ] = 0.186
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 10 ] = 0.482
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.509
Average Recall (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.150
Average Recall (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.700
Average Recall (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.816
2021-05-31 22:31:08,760 - mmdet - INFO - Evaluating keypoints...
Loading and preparing results...
DONE (t=1.40s)
creating index...
index created!
Running per image evaluation...
Evaluate annotation type keypoints
DONE (t=5.90s).
Accumulating evaluation results...
DONE (t=0.27s).
Average Precision (AP) @[ IoU=0.50:0.95 | area= all | maxDets= 20 ] = 0.401
Average Precision (AP) @[ IoU=0.50 | area= all | maxDets= 20 ] = 0.732
Average Precision (AP) @[ IoU=0.75 | area= all | maxDets= 20 ] = 0.392
Average Precision (AP) @[ IoU=0.50:0.95 | area=medium | maxDets= 20 ] = 0.389
Average Precision (AP) @[ IoU=0.50:0.95 | area= large | maxDets= 20 ] = 0.450
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 20 ] = 0.510
Average Recall (AR) @[ IoU=0.50 | area= all | maxDets= 20 ] = 0.818
Average Recall (AR) @[ IoU=0.75 | area= all | maxDets= 20 ] = 0.532
Average Recall (AR) @[ IoU=0.50:0.95 | area=medium | maxDets= 20 ] = 0.469
Average Recall (AR) @[ IoU=0.50:0.95 | area= large | maxDets= 20 ] = 0.567
2021-05-31 22:31:16,511 - mmdet - INFO - Saving checkpoint at 12 epochs
2021-05-31 22:31:16,811 - mmdet - INFO - Epoch(val) [12][16029] bbox_mAP: 0.4480, bbox_mAP_50: 0.6250, bbox_mAP_75: 0.4990, bbox_mAP_s: 0.1510, bbox_mAP_m: 0.6070, bbox_mAP_l: 0.7340, bbox_mAP_copypaste: 0.448 0.625 0.499 0.151 0.607 0.734, keypoints_mAP: 0.4010, keypoints_mAP_50: 0.7320, keypoints_mAP_75: 0.3920, keypoints_mAP_s: 0.3890, keypoints_mAP_m: 0.4500, keypoints_mAP_l: 0.5100, keypoints_mAP_copypaste: 0.401 0.732 0.392 0.389 0.450 0.510

Error with segmentation

Thanks for your great work!
When I tested the code on the segmentation task, I got the following bug report:

2021-05-14 21:13:25,373 - mmdet - INFO - workflow: [('train', 1)], max: 12 epochs
Traceback (most recent call last):
  File "tools/train.py", line 159, in <module>
    main()
  File "tools/train.py", line 155, in main
    meta=meta)
  File "/home/bit/ming7/LSNet/mmdet/apis/train.py", line 128, in train_detector
    runner.run(data_loaders, cfg.workflow, cfg.total_epochs)
  File "/home/bit/ming7/LSNet/mmcv/mmcv/runner/epoch_based_runner.py", line 122, in run
    epoch_runner(data_loaders[i], **kwargs)
  File "/home/bit/ming7/LSNet/mmcv/mmcv/runner/epoch_based_runner.py", line 27, in train
    for i, data_batch in enumerate(data_loader):
  File "/home/bit/anaconda2/envs/lsnet/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 345, in __next__
    data = self._next_data()
  File "/home/bit/anaconda2/envs/lsnet/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 838, in _next_data
    return self._process_data(data)
  File "/home/bit/anaconda2/envs/lsnet/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 881, in _process_data
    data.reraise()
  File "/home/bit/anaconda2/envs/lsnet/lib/python3.7/site-packages/torch/_utils.py", line 395, in reraise
    raise self.exc_type(msg)
TypeError: __init__() missing 4 required positional arguments: 'casting', 'from_', 'to', and 'i'

It seems there may be something wrong with my annotations (refer to this url), but I have no idea how to solve it.
Hope to get your reply, thanks!
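
For what it's worth, a TypeError raised from data.reraise() usually means the real exception happened inside a DataLoader worker process and was mangled while being re-raised, so the message above hides the actual cause. A common way to surface the original error is to load data in the main process; a minimal sketch of the config change (the samples_per_gpu value is a placeholder):

data = dict(
    samples_per_gpu=2,
    workers_per_gpu=0,  # temporarily 0: the real exception is raised in the main process
    # ... rest of the data config unchanged
)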

On the number of vectors predicted by the network

Thanks for open-sourcing this work.
I have a question about the number of vectors the network predicts: why is pose_out_dim = (self.num_vectors+1)*4? Why the +1 here, and what does the extra vector represent?

little bug

In LSNet/code/setup.py, line 13:
with open('README.md', encoding='utf-8') as f:

should be modified as follows:
with open('../README.md', encoding='utf-8') as f:

Pretrained model

@Duankaiwen
The config lsnet_pose_bbox_r50_fpn_1x_coco.py contains:
pretrained='../checkpoints/pretrained/resnet50-19c8e357.pth'
Could you provide resnet50-19c8e357.pth?

Question about instance GT

Hello, I'd like to ask a question. For the instance segmentation task, how should we generate the corresponding 36 points?
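
Not an official answer, but if the ground truth is a rasterized mask rather than a polygon, one plausible recipe is to trace the mask boundary with OpenCV and keep evenly spaced contour points. The sketch assumes OpenCV >= 4 and a non-empty mask; mask_to_points is a hypothetical helper, not the repo's GT pipeline.

import cv2
import numpy as np

def mask_to_points(binary_mask, num_points=36):
    """Trace the largest contour of a binary mask and keep num_points
    evenly spaced boundary points."""
    contours, _ = cv2.findContours(binary_mask.astype(np.uint8),
                                   cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_NONE)
    contour = max(contours, key=cv2.contourArea).squeeze(1)  # (N, 2) x, y
    # CHAIN_APPROX_NONE keeps every boundary pixel, so evenly spaced
    # indices are approximately equidistant along the contour.
    idx = np.linspace(0, len(contour), num_points, endpoint=False).astype(int)
    return contour[idx].astype(np.float32)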

config, training, run successfully

2021-05-25 06:37:44,269 - mmdet - INFO - Epoch [1][1750/29317] lr: 1.000e-02, eta: 3 days, 4:50:47, time: 0.766, data_time: 0.005, memory: 4482, loss_cls: 0.8888, loss_bbox_init: 0.4031, loss_bbox_refine: 0.8448, loss: 2.1367, grad_norm: 1.6092
2021-05-25 06:38:22,688 - mmdet - INFO - Epoch [1][1800/29317] lr: 1.000e-02, eta: 3 days, 4:46:34, time: 0.768, data_time: 0.005, memory: 4482, loss_cls: 0.8721, loss_bbox_init: 0.3939, loss_bbox_refine: 0.8263, loss: 2.0924, grad_norm: 1.6243
2021-05-25 06:39:01,561 - mmdet - INFO - Epoch [1][1850/29317] lr: 1.000e-02, eta: 3 days, 4:43:59, time: 0.777, data_time: 0.005, memory: 4482, loss_cls: 0.8509, loss_bbox_init: 0.3926, loss_bbox_refine: 0.8214, loss: 2.0648, grad_norm: 1.5601
2021-05-25 06:39:40,492 - mmdet - INFO - Epoch [1][1900/29317] lr: 1.000e-02, eta: 3 days, 4:41:41, time: 0.779, data_time: 0.005, memory: 4482, loss_cls: 0.8548, loss_bbox_init: 0.3930, loss_bbox_refine: 0.8219, loss: 2.0697, grad_norm: 1.7333
2021-05-25 06:40:19,645 - mmdet - INFO - Epoch [1][1950/29317] lr: 1.000e-02, eta: 3 days, 4:40:07, time: 0.783, data_time: 0.005, memory: 4482, loss_cls: 0.8701, loss_bbox_init: 0.3845, loss_bbox_refine: 0.8124, loss: 2.0669, grad_norm: 1.5948
2021-05-25 06:40:58,237 - mmdet - INFO - Epoch [1][2000/29317] lr: 1.000e-02, eta: 3 days, 4:36:58, time: 0.772, data_time: 0.005, memory: 4482, loss_cls: 0.8666, loss_bbox_init: 0.3861, loss_bbox_refine: 0.8200, loss: 2.0727, grad_norm: 1.6296

1. Training script:
./code/tools/dist_train.sh ./code/configs/lsnet/lsnet_bbox_r50_fpn_1x_coco.py 2 --work-dir work_dir/lsnet_bbox_r50_fpn_1x_coco, with the following files modified:
./code/configs/lsnet/lsnet_bbox_r50_fpn_1x_coco.py ### model
./code/configs/base/datasets/coco_lsvr.py ### dataset
./code/configs/base/schedules/schedule_1x.py ### training
2. The detection pretrained model can be downloaded from:
https://github.com/HikariTJU/LD/tree/main/configs/gfl
3. But training instructions for the various tasks are still needed.

Deletion bug

The work_dirs folder stores the trained models, and it also contains a latest.pth, probably a symlink. I deleted this latest.pth, and as a result everything in the project was gone except the work_dirs folder.
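
A note for context: mmcv's checkpoint hook typically writes latest.pth as a symlink to the newest epoch checkpoint, so removing it should only remove the link itself, not the target. A minimal sketch (the path is a placeholder) for inspecting and removing the link safely:

import os

ckpt = 'work_dirs/lsnet_bbox_r50_fpn_1x_coco/latest.pth'  # placeholder path
if os.path.islink(ckpt):
    print('symlink ->', os.readlink(ckpt))  # show the real target first
    os.remove(ckpt)  # removes only the link, never the target file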

errors on training custom instance segmentation dataset

I have prepared a custom instance segmentation dataset containing 5 classes (not counting background). It trains fine with the original MMDetection framework (e.g. DetectoRS, Mask R-CNN, HTC), but when I modified lsnet_segm_r50_fpn_1x_coco.py to train on this dataset, the system reported errors:

File "/data2/lixuan/workspace/LSNet/code/mmdet/models/dense_heads/lsnet_head.py", line 1299, in loss
gt_polygons, gt_bboxes = self.process_polygons(gt_masks, cls_scores)
File "/data2/lixuan/workspace/LSNet/code/mmdet/models/dense_heads/lsnet_head.py", line 1742, in process_polygons
gt_polygons_stack = torch.stack(gt_polygons)
RuntimeError: stack expects a non-empty TensorList

I checked lsnet_head.py and found that gt_masks is empty:

def forward_train(self,
                  x,
                  img_metas,
                  gt_bboxes,
                  gt_extremes = None,
                  gt_keypoints = None,
                  gt_masks = None,
                  gt_labels = None,
                  gt_bboxes_ignore=None,
                  proposal_cfg = None,
                  **kwargs):
    outs = self(x)
    print(gt_masks)  # debug: inspect the incoming ground-truth masks
    input()          # pause so the printed masks can be read

results:
[PolygonMasks(num_masks=0, height=800, width=1088)]

What causes this error, and how can I solve it?
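
A guess from the symptom: an image reached the head with zero instances, which standard mmdet training usually avoids via filter_empty_gt=True in the dataset config, so it is worth checking that option and that the custom annotations produce valid polygons. Independently, a minimal sketch of guarding the failing call (the 72 = 36 points x 2 coordinates shape is an assumption for illustration):

import torch

gt_polygons = []  # e.g. what an image with PolygonMasks(num_masks=0) yields
if gt_polygons:
    gt_polygons_stack = torch.stack(gt_polygons)
else:
    # torch.stack([]) raises "stack expects a non-empty TensorList",
    # so substitute an empty tensor with the expected trailing shape.
    gt_polygons_stack = torch.empty(0, 72)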

Error when testing the pose_bbox model

mmcv/visualization/image.py", line 345, in imshow_pose
plt.savefig(out_file)

matplotlib/colors.py", line 260, in _to_rgba_no_colorcycle
raise ValueError(f"Invalid RGBA argument: {orig_c!r}")

ValueError: Invalid RGBA argument: 'magneta'
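
For what it's worth, matplotlib has no color named 'magneta'; it is presumably a typo for 'magenta' in the color list used by imshow_pose, and correcting the spelling there should resolve the ValueError. A quick check:

from matplotlib.colors import to_rgba

print(to_rgba('magenta'))  # (1.0, 0.0, 1.0, 1.0)
to_rgba('magneta')         # raises ValueError: Invalid RGBA argument: 'magneta'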

RuntimeError: invalid argument 5: k not in range for dimension at /pytorch/aten/src/THC/generic/THCTensorTopK.cu:23

Met a problem during training. The environment is:

sys.platform: linux
Python: 3.6.2 |Continuum Analytics, Inc.| (default, Jul 20 2017, 13:51:32) [GCC 4.4.7 20120313 (Red Hat 4.4.7-1)]
CUDA available: True
CUDA_HOME: /usr/local/cuda
NVCC: Build cuda_11.0_bu.TC445_37.28540450_0
GPU 0,1: GeForce GTX 1080 Ti
GCC: gcc (Ubuntu 5.4.0-6ubuntu1~16.04.12) 5.4.0 20160609
PyTorch: 1.5.0
PyTorch compiling details: PyTorch built with:

  • GCC 7.3
  • C++ Version: 201402
  • Intel(R) Math Kernel Library Version 2019.0.5 Product Build 20190808 for Intel(R) 64 architecture applications
  • Intel(R) MKL-DNN v0.21.1 (Git Hash 7d2fd500bc78936d1d648ca713b901012f470dbc)
  • OpenMP 201511 (a.k.a. OpenMP 4.5)
  • NNPACK is enabled
  • CPU capability usage: AVX2
  • CUDA Runtime 10.2
  • NVCC architecture flags: -gencode;arch=compute_37,code=sm_37;-gencode;arch=compute_50,code=sm_50;-gencode;arch=compute_60,code=sm_60;-gencode;arch=compute_61,code=sm_61;-gencode;arch=compute_70,code=sm_70;-gencode;arch=compute_75,code=sm_75;-gencode;arch=compute_37,code=compute_37
  • CuDNN 7.6.5
  • Magma 2.5.2
  • Build settings: BLAS=MKL, BUILD_TYPE=Release, CXX_FLAGS= -Wno-deprecated -fvisibility-inlines-hidden -fopenmp -DNDEBUG -DUSE_FBGEMM -DUSE_QNNPACK -DUSE_PYTORCH_QNNPACK -DUSE_XNNPACK -DUSE_INTERNAL_THREADPOOL_IMPL -O2 -fPIC -Wno-narrowing -Wall -Wextra -Werror=return-type -Wno-missing-field-initializers -Wno-type-limits -Wno-array-bounds -Wno-unknown-pragmas -Wno-sign-compare -Wno-unused-parameter -Wno-unused-variable -Wno-unused-function -Wno-unused-result -Wno-strict-overflow -Wno-strict-aliasing -Wno-error=deprecated-declarations -Wno-stringop-overflow -Wno-error=pedantic -Wno-error=redundant-decls -Wno-error=old-style-cast -fdiagnostics-color=always -faligned-new -Wno-unused-but-set-variable -Wno-maybe-uninitialized -fno-math-errno -fno-trapping-math -Werror=format -Wno-stringop-overflow, PERF_WITH_AVX=1, PERF_WITH_AVX2=1, PERF_WITH_AVX512=1, USE_CUDA=ON, USE_EXCEPTION_PTR=1, USE_GFLAGS=OFF, USE_GLOG=OFF, USE_MKL=ON, USE_MKLDNN=ON, USE_MPI=OFF, USE_NCCL=ON, USE_NNPACK=ON, USE_OPENMP=ON, USE_STATIC_DISPATCH=OFF,

TorchVision: 0.6.0
OpenCV: 4.5.2
MMCV: 0.6.2
MMDetection: 2.2.0+unknown
MMDetection Compiler: GCC 5.4
MMDetection CUDA Compiler: 10.1

I followed the instructions in the README and the MMDetection documentation, but training always crashes with the following error:

2021-05-27 16:58:06,202 - mmdet - INFO - Epoch [1][1250/10687] lr: 1.000e-02, eta: 2 days, 6:48:27, time: 1.571, data_time: 0.010, memory: 3447, loss_cls: 0.4738, loss_bbox_init: 0.0343, loss_bbox_refine: 0.0738, loss_pose_init: 0.7736, loss_pose_refine: 1.4652, loss: 2.8207, grad_norm: 1.2611

Traceback (most recent call last):
  File "code/tools/train.py", line 159, in <module>
    main()
  File "code/tools/train.py", line 155, in main
    meta=meta)
  File "/home/gy/receive_client/yuanye/LSNet/code/mmdet/apis/train.py", line 128, in train_detector
    runner.run(data_loaders, cfg.workflow, cfg.total_epochs)
  File "/home/gy/receive_client/yuanye/LSNet/code/mmcv/mmcv/runner/epoch_based_runner.py", line 122, in run
    epoch_runner(data_loaders[i], **kwargs)
  File "/home/gy/receive_client/yuanye/LSNet/code/mmcv/mmcv/runner/epoch_based_runner.py", line 32, in train
    **kwargs)
  File "/home/gy/receive_client/yuanye/LSNet/code/mmcv/mmcv/parallel/data_parallel.py", line 31, in train_step
    return self.module.train_step(*inputs[0], **kwargs[0])
  File "/home/gy/receive_client/yuanye/LSNet/code/mmdet/models/detectors/base.py", line 237, in train_step
    losses = self(**data)
  File "/home/gy/anaconda3/envs/lsnet/lib/python3.6/site-packages/torch/nn/modules/module.py", line 550, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/gy/receive_client/yuanye/LSNet/code/mmdet/core/fp16/decorators.py", line 51, in new_func
    return old_func(*args, **kwargs)
  File "/home/gy/receive_client/yuanye/LSNet/code/mmdet/models/detectors/base.py", line 172, in forward
    return self.forward_train(img, img_metas, **kwargs)
  File "/home/gy/receive_client/yuanye/LSNet/code/mmdet/models/detectors/lsnet.py", line 55, in forward_train
    gt_masks, gt_labels, gt_bboxes_ignore)
  File "/home/gy/receive_client/yuanye/LSNet/code/mmdet/models/dense_heads/lsnet_head.py", line 472, in forward_train
    losses = self.loss(*loss_inputs, gt_bboxes_ignore=gt_bboxes_ignore)
  File "/home/gy/receive_client/yuanye/LSNet/code/mmdet/models/dense_heads/lsnet_head.py", line 1374, in loss
    label_channels=label_channels)
  File "/home/gy/receive_client/yuanye/LSNet/code/mmdet/models/dense_heads/lsnet_head.py", line 978, in get_targets
    unmap_outputs=unmap_outputs)
  File "/home/gy/receive_client/yuanye/LSNet/code/mmdet/core/utils/misc.py", line 54, in multi_apply
    return tuple(map(list, zip(*map_results)))
  File "/home/gy/receive_client/yuanye/LSNet/code/mmdet/models/dense_heads/lsnet_head.py", line 831, in _target_single
    gt_bboxes_ignore, gt_labels)
  File "/home/gy/receive_client/yuanye/LSNet/code/mmdet/core/bbox/assigners/atss_assigner.py", line 110, in assign
    self.topk, dim=0, largest=False)
RuntimeError: invalid argument 5: k not in range for dimension at /pytorch/aten/src/THC/generic/THCTensorTopK.cu:23

Many thanks to anyone who can help me :)
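
Not a confirmed diagnosis, but this error is commonly reported with ATSS-style assigners: they select the topk closest anchor candidates per feature level, and torch.topk fails when a level has fewer candidates than k (e.g. with very small input images). A minimal sketch of the usual workaround, clamping k to the number of available candidates (safe_topk is a hypothetical helper mirroring the failing call):

import torch

def safe_topk(distances, k):
    """Clamp k before calling topk, mirroring the failing call in
    atss_assigner.py (distances: (num_candidates, num_gt))."""
    k = min(k, distances.size(0))
    return distances.topk(k, dim=0, largest=False)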

Visualization testing

Thanks to the author for the excellent work!

For visualization testing, i.e. displaying bboxes, extreme points, pose keypoints, and segmentation polygons, I currently use mmdet/models/detectors/lsnet.py and mmcv/visualization/image.py, but not systematically; my modifications are rather messy. I hope the author can provide an example when time allows. Thanks in advance!
@Duankaiwen
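
Pending an official example, here is a minimal standalone OpenCV sketch, not the repo's visualizer, that overlays a box, keypoints, and a polygon; the argument layouts are assumptions.

import cv2
import numpy as np

def draw_results(img, bbox, keypoints, polygon=None):
    """Draw one bbox (x1, y1, x2, y2), a (K, 2) keypoint array, and an
    optional (N, 2) segmentation polygon onto a BGR image."""
    x1, y1, x2, y2 = map(int, bbox)
    cv2.rectangle(img, (x1, y1), (x2, y2), (0, 255, 0), 2)  # bbox in green
    for x, y in keypoints:
        cv2.circle(img, (int(x), int(y)), 3, (0, 0, 255), -1)  # keypoints in red
    if polygon is not None:
        cv2.polylines(img, [polygon.astype(np.int32)], True, (255, 0, 0), 2)  # polygon in blue
    return img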

Detail question on the cross-IoU code

bbox_gt=None, pos_inds=None, eps=1e-6, alpha=0.2, stride=9):

Hello, and thanks for open-sourcing this excellent work. While studying the code, I don't quite understand stride=9 in the case loss_type == 'segm'. For the segmentation task, what are the meaning and role of this stride? And why is it not needed when loss_type == 'bbox'?

Happy New Year and best wishes for your work!
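
For readers unfamiliar with the loss, and without answering the stride question: the paper's cross-IoU idea compares a predicted offset vector with its target by decomposing each into four non-negative half-axis components and taking a min/max ratio. A rough sketch of that idea, not the repository's implementation:

import torch

def cross_iou(pred, target, eps=1e-6):
    """Rough sketch of cross-IoU for (..., 2) tensors of (dx, dy) offsets:
    decompose each vector into non-negative (x+, x-, y+, y-) parts and
    compute sum(min) / sum(max), an IoU-like similarity in [0, 1]."""
    def to_cross(v):
        return torch.cat([v.clamp(min=0), (-v).clamp(min=0)], dim=-1)
    p, t = to_cross(pred), to_cross(target)
    inter = torch.min(p, t).sum(dim=-1)
    union = torch.max(p, t).sum(dim=-1) + eps
    return inter / union  # the loss would be 1 - cross_iou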

About inference speed

This is a novel method that pays attention to real-time speed. Compared with other methods, can the detection, instance segmentation, and human pose inference run in real time, e.g. >50 fps on video?

More weights

Thank you for your good work. Can you provide the model weights for instance segmentation (ResNet-50 and ResNet-101, trained for 12 epochs without multi-scale training)?

A problem when I run your code

When I run the code, I encounter a problem that I cannot solve.

The error is:
one of the variables needed for gradient computation has been modified by an inplace operation: [torch.cuda.FloatTensor [2, 9, 7, 5]], which is output 0 of AsStridedBackward, is at version 6; expected version 4 instead. Hint: enable anomaly detection to find the operation that failed to compute its gradient, with torch.autograd.set_detect_anomaly(True).

The mmdet version is 2.2.0 and the PyTorch version is 1.7.0.

Thanks a lot for your help.
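
General PyTorch advice rather than an LSNet-specific fix: following the hint in the message, enable anomaly detection to locate the forward-pass op whose output was later modified in place, then replace that in-place update with an out-of-place one. A minimal sketch:

import torch

torch.autograd.set_detect_anomaly(True)  # a second traceback points at the bad op

x = torch.randn(4, requires_grad=True)
y = x * 2
# y += 1       # in-place update: can invalidate tensors saved for backward
y = y + 1      # out-of-place update: safe
y.sum().backward()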
