Comments (13)
Hi,
You can refer to #12 to re-install cvpods.
BTW, we are working on a neat implementation in this PR (#13). It will be merged when it is ready.
from yolof.
Thanks for your reply, but I have another problem. Training fails with the following error:
ERROR [03/30 20:32:14 c2.engine.base_runner]: Exception during training:
Traceback (most recent call last):
  File "/DATA/xiexu/yf/YOLOF/cvpods/engine/base_runner.py", line 84, in train
    self.run_step()
  File "/DATA/xiexu/yf/YOLOF/cvpods/engine/base_runner.py", line 185, in run_step
    loss_dict = self.model(data)
  File "/home/xiexu/anaconda3/envs/yfb/python3.7/site-packages/torch/nn/modules/module.py", line 722, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/home/xiexu/anaconda3/envs/yfb/python3.7/site-packages/torch/nn/parallel/distributed.py", line 511, in forward
    output = self.module(*inputs[0], **kwargs[0])
  File "/home/xiexu/anaconda3/envs/yfb/python3.7/site-packages/torch/nn/modules/module.py", line 722, in _call_impl
    result = self.forward(*input, **kwargs)
  File "../yolof_base/yolof.py", line 131, in forward
    anchors, pred_anchor_deltas, gt_instances)
  File "/home/xiexu/anaconda3/envs/yfb/python3.7/site-packages/torch/autograd/grad_mode.py", line 15, in decorate_context
    return func(*args, **kwargs)
  File "../yolof_base/yolof.py", line 245, in get_ground_truth
    box_pred = self.box2box_transform.apply_deltas(box_delta, all_anchors)
  File "/DATA/xiexu/yf/YOLOF/cvpods/modeling/box_regression.py", line 93, in apply_deltas
    deltas).all().item(), "Box regression deltas become infinite or NaN!"
AssertionError: Box regression deltas become infinite or NaN!
from yolof.
Hello, when I run 'pods_train --num-gpus 1' I get the same problem: KeyError: "No object named 'RandomShift' found in 'transforms' registry!". I also followed #12, and the result is:
“.....Requirement already satisfied: certifi>=2020.06.20 in ./.local/lib/python3.6/site-packages (from matplotlib>=3.1.1->lvis) (2020.6.20)
Requirement already satisfied: pillow>=6.2.0 in ./.local/lib/python3.6/site-packages (from matplotlib>=3.1.1->lvis) (7.2.0)
Installing collected packages: lvis
Successfully installed lvis-0.5.3”
Everything installs successfully, but it still reports this error (No object named 'RandomShift' found in 'transforms' registry!) and the problem is not solved.
How did the poster above solve the problem? I hope someone can tell me. Thanks so much.
from yolof.
Hi, I think you need to re-install the environment. The steps are as follows:
pytorch=1.6 python==3.7
git clone https://github.com/thomasbrandon/mish-cuda
cd mish-cuda
python setup.py build install
cd ..
git clone git@github.com:megvii-model/YOLOF.git
cd YOLOF/
python setup.py develop
cd ./playground/detection/coco/yolof/yolof.res50.C5.1x
pods_train --num-gpus 2
from yolof.
@xiexu0210 Hi, have you modified any code in the repo? And does the bug occur every time you run YOLOF?
from yolof.
@walynlee Hi, maybe you should uninstall the previous cvpods first, then re-install YOLOF locally following the steps.
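One way to confirm that an older copy is the one being picked up is to ask Python where it would import the package from. This is only a minimal sketch using the standard library; the helper name `module_origin` is made up for illustration and is not part of cvpods or YOLOF.

```python
import importlib.util
from typing import Optional


def module_origin(name: str) -> Optional[str]:
    """Return the file a top-level module would be imported from, or None if not found."""
    spec = importlib.util.find_spec(name)
    if spec is None:
        return None
    return spec.origin


if __name__ == "__main__":
    # If this prints a path outside your YOLOF checkout, an earlier cvpods
    # install is probably still shadowing the local one.
    print("cvpods would be imported from:", module_origin("cvpods"))
```

If the printed path is not inside the YOLOF directory, uninstalling that copy before running `python setup.py develop` again is the likely next step.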
from yolof.
@chensnathan Hello, I haven't modified any code yet, and it already reports errors. This is my environment. Should I uninstall PyTorch 1.7, install PyTorch 1.6, and update my Python version?
Environment info:
sys.platform linux
Python 3.6.9 (default, Jan 26 2021, 15:33:00) [GCC 8.4.0]
numpy 1.19.3
cvpods 0.1 @/home/a303/cvpods/cvpods
cvpods compiler GCC 7.5
cvpods CUDA compiler 10.0
cvpods arch flags /home/a303/cvpods/cvpods/_C.cpython-36m-x86_64-linux-gnu.so; cannot find cuobjdump
cvpods_ENV_MODULE
PyTorch 1.7.0 @/home/a303/.local/lib/python3.6/site-packages/torch
PyTorch debug build True
CUDA available True
GPU 0 GeForce RTX 2080 Ti
CUDA_HOME :/usr/local/cuda-10.0
Pillow 7.2.0
torchvision 0.8.1 @/home/a303/.local/lib/python3.6/site-packages/torchvision
torchvision arch flags /home/a303/.local/lib/python3.6/site-packages/torchvision/_C.so; cannot find cuobjdump
cv2 4.4.0
PyTorch built with:
- GCC 7.3
- C++ Version: 201402
- Intel(R) Math Kernel Library Version 2020.0.0 Product Build 20191122 for Intel(R) 64 architecture applications
- Intel(R) MKL-DNN v1.6.0 (Git Hash 5ef631a030a6f73131c77892041042805a06064f)
- OpenMP 201511 (a.k.a. OpenMP 4.5)
- NNPACK is enabled
- CPU capability usage: AVX2
- CUDA Runtime 10.2
- NVCC architecture flags: -gencode;arch=compute_37,code=sm_37;-gencode;arch=compute_50,code=sm_50;-gencode;arch=compute_60,code=sm_60;-gencode;arch=compute_70,code=sm_70;-gencode;arch=compute_75,code=sm_75
- CuDNN 7.6.5
- Magma 2.5.2
from yolof.
Could you post your training log?
from yolof.
Hi, the error was reported only once.
I ran your code with the number of GPUs changed to 3. The result was very bad.
I have another question: how can I debug your code? So far I have only run it from the command line.
from yolof.
The model diverged during your training. When you use fewer GPUs, you should warm up for more iterations.
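To make the warm-up advice concrete, here is a rough sketch of a linear warm-up schedule plus one common heuristic for adjusting it when the total batch size shrinks. The function names, the default `warmup_factor`, and the inverse scaling of warm-up length are assumptions for illustration; they are not taken from the YOLOF or cvpods configs.

```python
def warmup_lr(base_lr: float, iteration: int, warmup_iters: int,
              warmup_factor: float = 0.001) -> float:
    """Linear warm-up: ramp from base_lr * warmup_factor up to base_lr."""
    if warmup_iters <= 0 or iteration >= warmup_iters:
        return base_lr
    alpha = iteration / warmup_iters  # fraction of warm-up completed, in [0, 1)
    return base_lr * (warmup_factor * (1 - alpha) + alpha)


def scale_for_batch(base_lr: float, warmup_iters: int,
                    ref_batch: int, new_batch: int):
    """Heuristic (an assumption, not the repo's rule): scale the learning rate
    linearly with batch size and lengthen warm-up by the inverse ratio."""
    ratio = new_batch / ref_batch
    return base_lr * ratio, int(round(warmup_iters / ratio))


if __name__ == "__main__":
    # Hypothetical numbers: halving the batch halves the LR and doubles warm-up.
    lr, iters = scale_for_batch(base_lr=0.1, warmup_iters=1000,
                                ref_batch=8, new_batch=4)
    print(lr, iters, warmup_lr(lr, 0, iters), warmup_lr(lr, iters, iters))
```

Whether these exact ratios suit YOLOF is untested here; the point is only that fewer GPUs usually means a smaller total batch, which tends to need a gentler start.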
from yolof.
Hi, I met the same problem after my model had been training for a little while: 'Box regression deltas become infinite or NaN!' suddenly occurs. How did you solve the problem?
from yolof.
Hi, if you use PyCharm to debug, in Run/Debug Configurations you can:
- set the working directory to the code path you want to run, e.g. YOLOF/playground/detection/coco/yolof/yolof.res50.C5.1x
- set the script path to YOLOF/tools/train_net.py
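For anyone not using PyCharm, the same two settings (working directory and script path) can be reproduced with a small launcher and then stepped through with `python -m pdb`. This is a sketch only: `run_script_in_dir` is a hypothetical helper, and passing `--num-gpus` to train_net.py is an assumption based on the `pods_train` commands earlier in the thread.

```python
import os
import runpy
import sys
from typing import List


def run_script_in_dir(script_path: str, working_dir: str, args: List[str]) -> None:
    """Run a Python script as __main__ from a given working directory."""
    script_path = os.path.abspath(script_path)  # resolve before changing directory
    old_cwd, old_argv = os.getcwd(), sys.argv
    os.chdir(working_dir)
    sys.argv = [script_path] + list(args)
    try:
        runpy.run_path(script_path, run_name="__main__")
    finally:
        # Restore the caller's state even if the script raises.
        os.chdir(old_cwd)
        sys.argv = old_argv


# Example call (paths from this thread; the flag is an assumption):
# run_script_in_dir(
#     "YOLOF/tools/train_net.py",
#     "YOLOF/playground/detection/coco/yolof/yolof.res50.C5.1x",
#     ["--num-gpus", "1"],
# )
```

Saving this as a file and starting it with `python -m pdb` gives an ordinary command-line debugger session; a single GPU is probably the easier case to step through.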
from yolof.
I set it up like this. Why does the problem still occur?
from yolof.