Comments (9)

wxggzz commented on July 19, 2024
sys.platform: linux
Python: 3.7.11 (default, Jul 27 2021, 14:32:16) [GCC 7.5.0]
CUDA available: True
GPU 0,1,2,3,4,5,6: GeForce RTX 2080 Ti
CUDA_HOME: /usr/local/cuda
NVCC: Cuda compilation tools, release 10.1, V10.1.105
GCC: gcc (Ubuntu 5.4.0-6ubuntu1~16.04.11) 5.4.0 20160609
PyTorch: 1.7.1
PyTorch compiling details: PyTorch built with:
  - GCC 7.3
  - C++ Version: 201402
  - Intel(R) oneAPI Math Kernel Library Version 2021.4-Product Build 20210904 for Intel(R) 64 architecture applications
  - Intel(R) MKL-DNN v1.6.0 (Git Hash 5ef631a030a6f73131c77892041042805a06064f)
  - OpenMP 201511 (a.k.a. OpenMP 4.5)
  - NNPACK is enabled
  - CPU capability usage: AVX2
  - CUDA Runtime 10.1
  - NVCC architecture flags: -gencode;arch=compute_37,code=sm_37;-gencode;arch=compute_50,code=sm_50;-gencode;arch=compute_60,code=sm_60;-gencode;arch=compute_61,code=sm_61;-gencode;arch=compute_70,code=sm_70;-gencode;arch=compute_75,code=sm_75;-gencode;arch=compute_37,code=compute_37
  - CuDNN 7.6.3
  - Magma 2.5.2
  - Build settings: BLAS=MKL, BUILD_TYPE=Release, CXX_FLAGS= -Wno-deprecated -fvisibility-inlines-hidden -DUSE_PTHREADPOOL -fopenmp -DNDEBUG -DUSE_FBGEMM -DUSE_QNNPACK -DUSE_PYTORCH_QNNPACK -DUSE_XNNPACK -DUSE_VULKAN_WRAPPER -O2 -fPIC -Wno-narrowing -Wall -Wextra -Werror=return-type -Wno-missing-field-initializers -Wno-type-limits -Wno-array-bounds -Wno-unknown-pragmas -Wno-sign-compare -Wno-unused-parameter -Wno-unused-variable -Wno-unused-function -Wno-unused-result -Wno-unused-local-typedefs -Wno-strict-overflow -Wno-strict-aliasing -Wno-error=deprecated-declarations -Wno-stringop-overflow -Wno-psabi -Wno-error=pedantic -Wno-error=redundant-decls -Wno-error=old-style-cast -fdiagnostics-color=always -faligned-new -Wno-unused-but-set-variable -Wno-maybe-uninitialized -fno-math-errno -fno-trapping-math -Werror=format -Wno-stringop-overflow, PERF_WITH_AVX=1, PERF_WITH_AVX2=1, PERF_WITH_AVX512=1, USE_CUDA=ON, USE_EXCEPTION_PTR=1, USE_GFLAGS=OFF, USE_GLOG=OFF, USE_MKL=ON, USE_MKLDNN=ON, USE_MPI=OFF, USE_NCCL=ON, USE_NNPACK=ON, USE_OPENMP=ON, 

TorchVision: 0.8.0a0
OpenCV: 4.5.5
MMCV: 1.4.5
MMCV Compiler: GCC 5.4
MMCV CUDA Compiler: 10.1
MMRotate: 0.1.0+6519a36

The environment information collected is shown above.

from mmrotate.

zytx121 commented on July 19, 2024

Hi @wxggzz
Please run the following command and upload your error report again.

CUDA_LAUNCH_BLOCKING=1 python tools/train.py ./configs/oriented_rcnn/oriented_rcnn_r50_fpn_1x_dota_le90.py --gpu-ids 5  --work-dir /workspace/mmrotate/work_dirs/ms/oriented_rcnn/0221

wxggzz commented on July 19, 2024

Hello, I get the same result:

 --------------------
2022-02-21 14:15:20,008 - mmrotate - INFO - workflow: [('train', 1)], max: 12 epochs
2022-02-21 14:15:20,008 - mmrotate - INFO - Checkpoints will be saved to /workspace/mmrotate/ /workspace/mmrotate/work_dirs/ms/oriented_rcnn/0221 by HardDiskBackend.
/root/anaconda3/envs/openmmlab/lib/python3.7/site-packages/mmdet/models/dense_heads/anchor_head.py:123: UserWarning: DeprecationWarning: anchor_generator is deprecated, please use "prior_generator" instead
  warnings.warn('DeprecationWarning: anchor_generator is deprecated, '
Traceback (most recent call last):
  File "tools/train.py", line 182, in <module>
    main()
  File "tools/train.py", line 178, in main
    meta=meta)
  File "/workspace/mmrotate/mmrotate/apis/train.py", line 156, in train_detector
    runner.run(data_loaders, cfg.workflow)
  File "/root/anaconda3/envs/openmmlab/lib/python3.7/site-packages/mmcv/runner/epoch_based_runner.py", line 127, in run
    epoch_runner(data_loaders[i], **kwargs)
  File "/root/anaconda3/envs/openmmlab/lib/python3.7/site-packages/mmcv/runner/epoch_based_runner.py", line 50, in train
    self.run_iter(data_batch, train_mode=True, **kwargs)
  File "/root/anaconda3/envs/openmmlab/lib/python3.7/site-packages/mmcv/runner/epoch_based_runner.py", line 30, in run_iter
    **kwargs)
  File "/root/anaconda3/envs/openmmlab/lib/python3.7/site-packages/mmcv/parallel/data_parallel.py", line 75, in train_step
    return self.module.train_step(*inputs[0], **kwargs[0])
  File "/root/anaconda3/envs/openmmlab/lib/python3.7/site-packages/mmdet/models/detectors/base.py", line 248, in train_step
    losses = self(**data)
  File "/root/anaconda3/envs/openmmlab/lib/python3.7/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/root/anaconda3/envs/openmmlab/lib/python3.7/site-packages/mmcv/runner/fp16_utils.py", line 109, in new_func
    return old_func(*args, **kwargs)
  File "/root/anaconda3/envs/openmmlab/lib/python3.7/site-packages/mmdet/models/detectors/base.py", line 172, in forward
    return self.forward_train(img, img_metas, **kwargs)
  File "/workspace/mmrotate/mmrotate/models/detectors/two_stage.py", line 150, in forward_train
    **kwargs)
  File "/workspace/mmrotate/mmrotate/models/roi_heads/oriented_standard_roi_head.py", line 74, in forward_train
    img_metas)
  File "/workspace/mmrotate/mmrotate/models/roi_heads/oriented_standard_roi_head.py", line 97, in _bbox_forward_train
    bbox_results = self._bbox_forward(x, rois)
  File "/workspace/mmrotate/mmrotate/models/roi_heads/rotate_standard_roi_head.py", line 170, in _bbox_forward
    x[:self.bbox_roi_extractor.num_inputs], rois)
  File "/root/anaconda3/envs/openmmlab/lib/python3.7/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/root/anaconda3/envs/openmmlab/lib/python3.7/site-packages/mmcv/runner/fp16_utils.py", line 197, in new_func
    return old_func(*args, **kwargs)
  File "/workspace/mmrotate/mmrotate/models/roi_heads/roi_extractors/rotate_single_level_roi_extractor.py", line 132, in forward
    roi_feats_t = self.roi_layers[i](feats[i], rois_)
  File "/root/anaconda3/envs/openmmlab/lib/python3.7/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/root/anaconda3/envs/openmmlab/lib/python3.7/site-packages/mmcv/ops/roi_align_rotated.py", line 177, in forward
    self.clockwise)
  File "/root/anaconda3/envs/openmmlab/lib/python3.7/site-packages/mmcv/ops/roi_align_rotated.py", line 78, in forward
    clockwise=clockwise)
RuntimeError: CUDA error: an illegal memory access was encountered
terminate called after throwing an instance of 'c10::Error'
  what():  CUDA error: an illegal memory access was encountered
Exception raised from create_event_internal at /opt/conda/conda-bld/pytorch_1607370141920/work/c10/cuda/CUDACachingAllocator.cpp:687 (most recent call first):
frame #0: c10::Error::Error(c10::SourceLocation, std::string) + 0x42 (0x7f55f20eb8b2 in /root/anaconda3/envs/openmmlab/lib/python3.7/site-packages/torch/lib/libc10.so)
frame #1: c10::cuda::CUDACachingAllocator::raw_delete(void*) + 0xad2 (0x7f55f233d982 in /root/anaconda3/envs/openmmlab/lib/python3.7/site-packages/torch/lib/libc10_cuda.so)
frame #2: c10::TensorImpl::release_resources() + 0x4d (0x7f55f20d6b7d in /root/anaconda3/envs/openmmlab/lib/python3.7/site-packages/torch/lib/libc10.so)
frame #3: <unknown function> + 0x5fea0a (0x7f562f428a0a in /root/anaconda3/envs/openmmlab/lib/python3.7/site-packages/torch/lib/libtorch_python.so)
frame #4: <unknown function> + 0x5feab6 (0x7f562f428ab6 in /root/anaconda3/envs/openmmlab/lib/python3.7/site-packages/torch/lib/libtorch_python.so)
<omitting python frames>
frame #23: __libc_start_main + 0xf0 (0x7f5652bf6830 in /lib/x86_64-linux-gnu/libc.so.6)

Below is the log file
20220221_064433.log

yangxue0827 commented on July 19, 2024

Try this instead:

CUDA_VISIBLE_DEVICES=5 PORT=29808 ./tools/dist_train.sh ./configs/oriented_rcnn/oriented_rcnn_r50_fpn_1x_dota_le90.py --work-dir /workspace/mmrotate/work_dirs/ms/oriented_rcnn/0221 1
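
The thread does not say why this command avoids the crash. One relevant difference is that CUDA_VISIBLE_DEVICES=5 hides every other GPU from the process, so physical GPU 5 becomes logical device 0, whereas --gpu-ids 5 leaves all seven GPUs visible. A minimal sketch of that index remapping follows; logical_to_physical is a hypothetical helper written for illustration, and it assumes integer device ids rather than GPU UUIDs.

```python
def logical_to_physical(visible_devices: str) -> dict:
    """Map the logical CUDA indices a process sees to physical GPU ids.

    Hypothetical helper: assumes CUDA_VISIBLE_DEVICES holds a
    comma-separated list of integer ids (not GPU UUIDs).
    """
    ids = [part.strip() for part in visible_devices.split(",") if part.strip()]
    return {logical: int(physical) for logical, physical in enumerate(ids)}


# With CUDA_VISIBLE_DEVICES=5 the process sees a single GPU at index 0.
print(logical_to_physical("5"))     # {0: 5}
# With two ids, the listed order defines the logical order.
print(logical_to_physical("2, 5"))  # {0: 2, 1: 5}
```

Under this mapping, code that implicitly uses the default device (index 0) ends up on physical GPU 5.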

wxggzz commented on July 19, 2024

Training with this command works without problems. Now I will run the test.

wxggzz commented on July 19, 2024

Testing runs into the same problem, so it seems you have to use dist_test.sh there as well.

yangxue0827 commented on July 19, 2024
CUDA_VISIBLE_DEVICES=5 PORT=29807 \
       ./tools/dist_test.sh oriented_rcnn/oriented_rcnn_r50_fpn_1x_dota_le90.py \
        /workspace/mmrotate/work_dirs/ms/oriented_rcnn/0221/epoch_12.pth 1 --format-only \
        --eval-options submission_dir=/workspace/mmrotate/work_dirs/ms/oriented_rcnn/0221/Task1_results

wxggzz commented on July 19, 2024
Thanks.

yangxue0827 commented on July 19, 2024

A solution that worked: set a smaller nms_pre in test_cfg, for example:

test_cfg=dict(
        nms_pre=1000,
        min_bbox_size=0,
        score_thr=0.05,
        nms=dict(iou_thr=0.4),
        max_per_img=2000))
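
For context, nms_pre caps how many of the highest-scoring candidate boxes are passed on to NMS, so lowering it reduces the number of boxes the NMS step has to handle. The thread does not explain why that avoids the error. Below is a minimal pure-Python sketch of the top-k pre-filter only; keep_top_k is a hypothetical helper for illustration, not the mmdet or mmrotate implementation.

```python
def keep_top_k(scores, nms_pre):
    """Return indices of the nms_pre highest-scoring candidates.

    Hypothetical illustration of a pre-NMS top-k filter; if there are
    fewer candidates than nms_pre, all of them are kept.
    """
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return order[:nms_pre]


scores = [0.10, 0.90, 0.40, 0.75, 0.05]
# Only the three best-scoring candidates would go on to NMS.
print(keep_top_k(scores, nms_pre=3))   # [1, 3, 2]
# A cap larger than the candidate count keeps everything.
print(len(keep_top_k(scores, nms_pre=1000)))  # 5
```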
