Comments (9)
sys.platform: linux
Python: 3.7.11 (default, Jul 27 2021, 14:32:16) [GCC 7.5.0]
CUDA available: True
GPU 0,1,2,3,4,5,6: GeForce RTX 2080 Ti
CUDA_HOME: /usr/local/cuda
NVCC: Cuda compilation tools, release 10.1, V10.1.105
GCC: gcc (Ubuntu 5.4.0-6ubuntu1~16.04.11) 5.4.0 20160609
PyTorch: 1.7.1
PyTorch compiling details: PyTorch built with:
- GCC 7.3
- C++ Version: 201402
- Intel(R) oneAPI Math Kernel Library Version 2021.4-Product Build 20210904 for Intel(R) 64 architecture applications
- Intel(R) MKL-DNN v1.6.0 (Git Hash 5ef631a030a6f73131c77892041042805a06064f)
- OpenMP 201511 (a.k.a. OpenMP 4.5)
- NNPACK is enabled
- CPU capability usage: AVX2
- CUDA Runtime 10.1
- NVCC architecture flags: -gencode;arch=compute_37,code=sm_37;-gencode;arch=compute_50,code=sm_50;-gencode;arch=compute_60,code=sm_60;-gencode;arch=compute_61,code=sm_61;-gencode;arch=compute_70,code=sm_70;-gencode;arch=compute_75,code=sm_75;-gencode;arch=compute_37,code=compute_37
- CuDNN 7.6.3
- Magma 2.5.2
- Build settings: BLAS=MKL, BUILD_TYPE=Release, CXX_FLAGS= -Wno-deprecated -fvisibility-inlines-hidden -DUSE_PTHREADPOOL -fopenmp -DNDEBUG -DUSE_FBGEMM -DUSE_QNNPACK -DUSE_PYTORCH_QNNPACK -DUSE_XNNPACK -DUSE_VULKAN_WRAPPER -O2 -fPIC -Wno-narrowing -Wall -Wextra -Werror=return-type -Wno-missing-field-initializers -Wno-type-limits -Wno-array-bounds -Wno-unknown-pragmas -Wno-sign-compare -Wno-unused-parameter -Wno-unused-variable -Wno-unused-function -Wno-unused-result -Wno-unused-local-typedefs -Wno-strict-overflow -Wno-strict-aliasing -Wno-error=deprecated-declarations -Wno-stringop-overflow -Wno-psabi -Wno-error=pedantic -Wno-error=redundant-decls -Wno-error=old-style-cast -fdiagnostics-color=always -faligned-new -Wno-unused-but-set-variable -Wno-maybe-uninitialized -fno-math-errno -fno-trapping-math -Werror=format -Wno-stringop-overflow, PERF_WITH_AVX=1, PERF_WITH_AVX2=1, PERF_WITH_AVX512=1, USE_CUDA=ON, USE_EXCEPTION_PTR=1, USE_GFLAGS=OFF, USE_GLOG=OFF, USE_MKL=ON, USE_MKLDNN=ON, USE_MPI=OFF, USE_NCCL=ON, USE_NNPACK=ON, USE_OPENMP=ON,
TorchVision: 0.8.0a0
OpenCV: 4.5.5
MMCV: 1.4.5
MMCV Compiler: GCC 5.4
MMCV CUDA Compiler: 10.1
MMRotate: 0.1.0+6519a36
Above is the collected environment information.
from mmrotate.
Hi @wxggzz
Please run the following command and upload your error report again.
CUDA_LAUNCH_BLOCKING=1 python tools/train.py ./configs/oriented_rcnn/oriented_rcnn_r50_fpn_1x_dota_le90.py --gpu-ids 5 --work-dir /workspace/mmrotate/work_dirs/ms/oriented_rcnn/0221
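CUDA kernel launches are asynchronous by default, so an illegal memory access is often reported at a later, unrelated call. Setting CUDA_LAUNCH_BLOCKING=1 makes launches synchronous so the traceback points at the operation that actually failed. A minimal sketch of preparing the same launch from Python follows; the helper name `blocking_launch` is hypothetical and not part of mmrotate, and the command string is shortened for illustration.

```python
import os
import shlex


def blocking_launch(cmd, base_env=None):
    """Return (argv, env) for running `cmd` with synchronous CUDA kernel launches.

    `blocking_launch` is a hypothetical helper for illustration only.
    """
    # Copy the environment so the caller's os.environ is not modified.
    env = dict(os.environ if base_env is None else base_env)
    # The variable must be set before the CUDA context is created,
    # which is why it is passed to the child process at launch time.
    env["CUDA_LAUNCH_BLOCKING"] = "1"
    return shlex.split(cmd), env


argv, env = blocking_launch(
    "python tools/train.py "
    "./configs/oriented_rcnn/oriented_rcnn_r50_fpn_1x_dota_le90.py --gpu-ids 5"
)
# To actually start training (requires the mmrotate checkout):
#   import subprocess
#   subprocess.run(argv, env=env, check=True)
```

Synchronous launches slow training down considerably, so this setting is meant for debugging runs only.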
from mmrotate.
Hello, I ran that command and got the same result:
--------------------
2022-02-21 14:15:20,008 - mmrotate - INFO - workflow: [('train', 1)], max: 12 epochs
2022-02-21 14:15:20,008 - mmrotate - INFO - Checkpoints will be saved to /workspace/mmrotate/ /workspace/mmrotate/work_dirs/ms/oriented_rcnn/0221 by HardDiskBackend.
/root/anaconda3/envs/openmmlab/lib/python3.7/site-packages/mmdet/models/dense_heads/anchor_head.py:123: UserWarning: DeprecationWarning: anchor_generator is deprecated, please use "prior_generator" instead
warnings.warn('DeprecationWarning: anchor_generator is deprecated, '
Traceback (most recent call last):
File "tools/train.py", line 182, in <module>
main()
File "tools/train.py", line 178, in main
meta=meta)
File "/workspace/mmrotate/mmrotate/apis/train.py", line 156, in train_detector
runner.run(data_loaders, cfg.workflow)
File "/root/anaconda3/envs/openmmlab/lib/python3.7/site-packages/mmcv/runner/epoch_based_runner.py", line 127, in run
epoch_runner(data_loaders[i], **kwargs)
File "/root/anaconda3/envs/openmmlab/lib/python3.7/site-packages/mmcv/runner/epoch_based_runner.py", line 50, in train
self.run_iter(data_batch, train_mode=True, **kwargs)
File "/root/anaconda3/envs/openmmlab/lib/python3.7/site-packages/mmcv/runner/epoch_based_runner.py", line 30, in run_iter
**kwargs)
File "/root/anaconda3/envs/openmmlab/lib/python3.7/site-packages/mmcv/parallel/data_parallel.py", line 75, in train_step
return self.module.train_step(*inputs[0], **kwargs[0])
File "/root/anaconda3/envs/openmmlab/lib/python3.7/site-packages/mmdet/models/detectors/base.py", line 248, in train_step
losses = self(**data)
File "/root/anaconda3/envs/openmmlab/lib/python3.7/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
result = self.forward(*input, **kwargs)
File "/root/anaconda3/envs/openmmlab/lib/python3.7/site-packages/mmcv/runner/fp16_utils.py", line 109, in new_func
return old_func(*args, **kwargs)
File "/root/anaconda3/envs/openmmlab/lib/python3.7/site-packages/mmdet/models/detectors/base.py", line 172, in forward
return self.forward_train(img, img_metas, **kwargs)
File "/workspace/mmrotate/mmrotate/models/detectors/two_stage.py", line 150, in forward_train
**kwargs)
File "/workspace/mmrotate/mmrotate/models/roi_heads/oriented_standard_roi_head.py", line 74, in forward_train
img_metas)
File "/workspace/mmrotate/mmrotate/models/roi_heads/oriented_standard_roi_head.py", line 97, in _bbox_forward_train
bbox_results = self._bbox_forward(x, rois)
File "/workspace/mmrotate/mmrotate/models/roi_heads/rotate_standard_roi_head.py", line 170, in _bbox_forward
x[:self.bbox_roi_extractor.num_inputs], rois)
File "/root/anaconda3/envs/openmmlab/lib/python3.7/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
result = self.forward(*input, **kwargs)
File "/root/anaconda3/envs/openmmlab/lib/python3.7/site-packages/mmcv/runner/fp16_utils.py", line 197, in new_func
return old_func(*args, **kwargs)
File "/workspace/mmrotate/mmrotate/models/roi_heads/roi_extractors/rotate_single_level_roi_extractor.py", line 132, in forward
roi_feats_t = self.roi_layers[i](feats[i], rois_)
File "/root/anaconda3/envs/openmmlab/lib/python3.7/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
result = self.forward(*input, **kwargs)
File "/root/anaconda3/envs/openmmlab/lib/python3.7/site-packages/mmcv/ops/roi_align_rotated.py", line 177, in forward
self.clockwise)
File "/root/anaconda3/envs/openmmlab/lib/python3.7/site-packages/mmcv/ops/roi_align_rotated.py", line 78, in forward
clockwise=clockwise)
RuntimeError: CUDA error: an illegal memory access was encountered
terminate called after throwing an instance of 'c10::Error'
what(): CUDA error: an illegal memory access was encountered
Exception raised from create_event_internal at /opt/conda/conda-bld/pytorch_1607370141920/work/c10/cuda/CUDACachingAllocator.cpp:687 (most recent call first):
frame #0: c10::Error::Error(c10::SourceLocation, std::string) + 0x42 (0x7f55f20eb8b2 in /root/anaconda3/envs/openmmlab/lib/python3.7/site-packages/torch/lib/libc10.so)
frame #1: c10::cuda::CUDACachingAllocator::raw_delete(void*) + 0xad2 (0x7f55f233d982 in /root/anaconda3/envs/openmmlab/lib/python3.7/site-packages/torch/lib/libc10_cuda.so)
frame #2: c10::TensorImpl::release_resources() + 0x4d (0x7f55f20d6b7d in /root/anaconda3/envs/openmmlab/lib/python3.7/site-packages/torch/lib/libc10.so)
frame #3: <unknown function> + 0x5fea0a (0x7f562f428a0a in /root/anaconda3/envs/openmmlab/lib/python3.7/site-packages/torch/lib/libtorch_python.so)
frame #4: <unknown function> + 0x5feab6 (0x7f562f428ab6 in /root/anaconda3/envs/openmmlab/lib/python3.7/site-packages/torch/lib/libtorch_python.so)
<omitting python frames>
frame #23: __libc_start_main + 0xf0 (0x7f5652bf6830 in /lib/x86_64-linux-gnu/libc.so.6)
Below is the log file
20220221_064433.log
from mmrotate.
Try launching through dist_train.sh instead:
CUDA_VISIBLE_DEVICES=5 PORT=29808 ./tools/dist_train.sh ./configs/oriented_rcnn/oriented_rcnn_r50_fpn_1x_dota_le90.py --work-dir /workspace/mmrotate/work_dirs/ms/oriented_rcnn/0221 1
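One difference from the earlier `--gpu-ids 5` command is how the GPU is selected: with CUDA_VISIBLE_DEVICES=5 the process can only see that one card, and it appears inside the process as logical device 0. Whether this is what avoids the crash is not established in this thread. The sketch below only illustrates the index remapping; `visible_device_map` is a hypothetical helper, and it assumes the variable holds comma-separated integer indices (GPU UUIDs are not handled).

```python
def visible_device_map(cuda_visible_devices):
    """Map the logical index seen by the process to the physical GPU id.

    Hypothetical helper: assumes a comma-separated list of integer indices.
    """
    ids = [part.strip() for part in cuda_visible_devices.split(",") if part.strip()]
    # Logical indices are assigned in the order the ids are listed.
    return {logical: physical for logical, physical in enumerate(ids)}


single = visible_device_map("5")
several = visible_device_map("2, 5")
print(single)
print(several)
```

With the value "5", the only visible device is logical index 0, so code inside the process addresses it as cuda:0 rather than cuda:5.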
from mmrotate.
Training with this command works without problems. Now I will test it.
from mmrotate.
Testing hits the same problem; you have to use dist_test.sh for it as well.
from mmrotate.
CUDA_VISIBLE_DEVICES=5 PORT=29807 \
./tools/dist_test.sh oriented_rcnn/oriented_rcnn_r50_fpn_1x_dota_le90.py \
/workspace/mmrotate/work_dirs/ms/oriented_rcnn/0221/epoch_12.pth 1 --format-only \
--eval-options submission_dir=/workspace/mmrotate/work_dirs/ms/oriented_rcnn/0221/Task1_results
from mmrotate.
thanks
from mmrotate.
A solution that worked for me: set a smaller nms_pre in test_cfg.
test_cfg=dict(
    nms_pre=1000,
    min_bbox_size=0,
    score_thr=0.05,
    nms=dict(iou_thr=0.4),
    max_per_img=2000)
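The nms_pre setting above limits how many top-scoring boxes are kept before NMS, so lowering it reduces the number of boxes passed to the later stages. A minimal sketch of applying such a cap to a nested config without editing the file by hand is shown below. It works on plain Python dicts; `cap_nms_pre` is a hypothetical helper, and the starting value of 2000 in the example dict is an assumption for illustration, not a value taken from this thread.

```python
def cap_nms_pre(cfg, limit):
    """Lower every integer `nms_pre` entry in a nested dict/list to at most `limit`.

    Hypothetical helper; modifies `cfg` in place and returns how many
    entries were changed.
    """
    changed = 0
    if isinstance(cfg, dict):
        for key, value in cfg.items():
            if key == "nms_pre" and isinstance(value, int) and value > limit:
                cfg[key] = limit  # replacing a value while iterating is safe
                changed += 1
            else:
                changed += cap_nms_pre(value, limit)
    elif isinstance(cfg, (list, tuple)):
        for item in cfg:
            changed += cap_nms_pre(item, limit)
    return changed


# Stand-in for a test_cfg section; nms_pre=2000 is an assumed starting value.
test_cfg = dict(
    nms_pre=2000,
    min_bbox_size=0,
    score_thr=0.05,
    nms=dict(iou_thr=0.4),
    max_per_img=2000,
)
num_changed = cap_nms_pre(test_cfg, 1000)
print(num_changed, test_cfg["nms_pre"])
```

Because the helper walks the whole structure, it also reaches nms_pre entries nested under sub-sections of a larger config; only keys named exactly nms_pre are touched.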
from mmrotate.