smallcorgi / faster-rcnn_tf
Faster-RCNN in Tensorflow
License: MIT License
Can this code run on multiple GPUs? Thanks.
When I run python ./tools/demo.py --model models/VGGnet_fast_rcnn_iter_70000.ckpt
I get this result:
usage: demo.py [-h] [--gpu GPU_ID] [--cpu] [--net {vgg16,zf}]
demo.py: error: unrecognized arguments: --model models/VGGnet_fast_rcnn_iter_70000.ckpt
How can I deal with it?
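The usage line in that error lists only -h, --gpu, --cpu and --net, so the demo.py being run does not declare a --model option. Below is a minimal sketch of how argparse behaves in that situation. build_parser and unrecognized are hypothetical helpers written for illustration, not the repository's actual parser.

```python
import argparse


def build_parser(with_model=False):
    """Hypothetical parser mirroring the usage line in the error above."""
    parser = argparse.ArgumentParser(prog="demo.py")
    parser.add_argument("--gpu", dest="gpu_id", type=int, default=0)
    parser.add_argument("--cpu", dest="cpu_mode", action="store_true")
    parser.add_argument("--net", dest="demo_net",
                        choices=["vgg16", "zf"], default="vgg16")
    if with_model:
        # Assumption: the checkpoint path is passed as a plain string.
        parser.add_argument("--model", dest="model", default=None)
    return parser


def unrecognized(argv, with_model=False):
    """Return the arguments the parser does not know about."""
    _, unknown = build_parser(with_model).parse_known_args(argv)
    return unknown


argv = ["--model", "models/VGGnet_fast_rcnn_iter_70000.ckpt"]
print(unrecognized(argv))                   # leftovers when --model is not declared
print(unrecognized(argv, with_model=True))  # leftovers once --model is declared
```

If the leftovers list is non-empty, it may be worth checking that the demo.py on disk really is the one from this repository and not one from another Faster R-CNN checkout.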
Can you provide training instructions?
I want to train on my own data with PASCAL_VOC format.
Thank you.
Environment: TF installed with pip: sudo pip install --upgrade https://storage.googleapis.com/tensorflow/linux/gpu/tensorflow-0.11.0rc0-cp27-none-linux_x86_64.whl
Building roi_pooling_layer with make.sh failed,
so I tested each step separately.
The first step is OK: nvcc -std=c++11 -c -o roi_pooling_op.cu.o roi_pooling_op_gpu.cu.cc -I $TF_INC -D GOOGLE_CUDA=1 -x cu -Xcompiler -fPIC -arch=sm_52
The next step is: g++ -std=c++11 -shared -o roi_pooling.so roi_pooling_op.cc roi_pooling_op.cu.o -I
but it reported many errors, as shown below:
In file included from /usr/local/lib/python2.7/dist-packages/tensorflow/include/tensorflow/core/lib/gtl/array_slice.h:101:0,
from /usr/local/lib/python2.7/dist-packages/tensorflow/include/tensorflow/core/lib/strings/str_util.h:23,
from /usr/local/lib/python2.7/dist-packages/tensorflow/include/tensorflow/core/framework/op.h:29,
from roi_pooling_op.cc:22:
/usr/local/lib/python2.7/dist-packages/tensorflow/include/tensorflow/core/lib/gtl/array_slice_internal.h:232:38: error: ‘tensorflow::gtl::array_slice_internal::ArraySliceImplBase::ArraySliceImplBase’ names constructor
/usr/local/lib/python2.7/dist-packages/tensorflow/include/tensorflow/core/lib/gtl/array_slice_internal.h:252:32: error: ‘tensorflow::gtl::array_slice_internal::ArraySliceImplBase::ArraySliceImplBase’ names constructor
In file included from /usr/local/lib/python2.7/dist-packages/tensorflow/include/tensorflow/core/lib/gtl/array_slice.h:102:0,
from /usr/local/lib/python2.7/dist-packages/tensorflow/include/tensorflow/core/lib/strings/str_util.h:23,
from /usr/local/lib/python2.7/dist-packages/tensorflow/include/tensorflow/core/framework/op.h:29,
from roi_pooling_op.cc:22:
/usr/local/lib/python2.7/dist-packages/tensorflow/include/tensorflow/core/lib/gtl/inlined_vector.h: In member function ‘void tensorflow::gtl::InlinedVector<T, N>::Destroy(T*, int)’:
/usr/local/lib/python2.7/dist-packages/tensorflow/include/tensorflow/core/lib/gtl/inlined_vector.h:394:10: error: ‘is_trivially_destructible’ is not a member of ‘std’
/usr/local/lib/python2.7/dist-packages/tensorflow/include/tensorflow/core/lib/gtl/inlined_vector.h:394:42: error: expected primary-expression before ‘>’ token
/usr/local/lib/python2.7/dist-packages/tensorflow/include/tensorflow/core/lib/gtl/inlined_vector.h:394:43: error: ‘::value’ has not been declared
In file included from roi_pooling_op.cc:23:0:
/usr/local/lib/python2.7/dist-packages/tensorflow/include/tensorflow/core/framework/op_kernel.h: At global scope:
/usr/local/lib/python2.7/dist-packages/tensorflow/include/tensorflow/core/framework/op_kernel.h:169:19: error: ‘tensorflow::OpKernel::OpKernel’ names constructor
In file included from /usr/local/lib/python2.7/dist-packages/tensorflow/include/tensorflow/core/lib/strings/str_util.h:23:0,
from /usr/local/lib/python2.7/dist-packages/tensorflow/include/tensorflow/core/framework/op.h:29,
from roi_pooling_op.cc:22:
/usr/local/lib/python2.7/dist-packages/tensorflow/include/tensorflow/core/lib/gtl/array_slice.h: In instantiation of ‘tensorflow::gtl::ArraySlice<T>::ArraySlice(const tensorflow::gtl::InlinedVector<T, N>&) [with int N = 4; T = tensorflow::DataType]’:
/usr/local/lib/python2.7/dist-packages/tensorflow/include/tensorflow/core/framework/types.h:86:36: required from here
/usr/local/lib/python2.7/dist-packages/tensorflow/include/tensorflow/core/lib/gtl/array_slice.h:140:33: error: no matching function for call to ‘tensorflow::gtl::array_slice_internal::ArraySliceImpl<tensorflow::DataType>::ArraySliceImpl(tensorflow::gtl::InlinedVector<tensorflow::DataType, 4>::const_pointer, std::size_t)’
...............................
usr/local/lib/python2.7/dist-packages/tensorflow/include/tensorflow/core/lib/gtl/array_slice_internal.h:230:7: note: constexpr tensorflow::gtl::array_slice_internal::ArraySliceImpl<std::pair<std::basic_string, tensorflow::FunctionDefHelper::AttrValueWrapper> >::ArraySliceImpl(const tensorflow::gtl::array_slice_internal::ArraySliceImpl<std::pair<std::basic_string, tensorflow::FunctionDefHelper::AttrValueWrapper> >&)
/usr/local/lib/python2.7/dist-packages/tensorflow/include/tensorflow/core/lib/gtl/array_slice_internal.h:230:7: note: candidate expects 1 argument, 2 provided
/usr/local/lib/python2.7/dist-packages/tensorflow/include/tensorflow/core/lib/gtl/array_slice_internal.h:230:7: note: constexpr tensorflow::gtl::array_slice_internal::ArraySliceImpl<std::pair<std::basic_string, tensorflow::FunctionDefHelper::AttrValueWrapper> >::ArraySliceImpl(tensorflow::gtl::array_slice_internal::ArraySliceImpl<std::pair<std::basic_string, tensorflow::FunctionDefHelper::AttrValueWrapper> >&&)
/usr/local/lib/python2.7/dist-packages/tensorflow/include/tensorflow/core/lib/gtl/array_slice_internal.h:230:7: note: candidate expects 1 argument, 2 provided
If anyone knows how to solve this problem, please help; I would appreciate it.
Note: CUDA version 7.5, cuDNN version v3.0
I followed the instructions and installed all dependencies. I ran it with GPU-enabled TensorFlow on Python 2.7 and got the following output:
Traceback (most recent call last):
File "./tools/demo.py", line 11, in <module>
from networks.factory import get_network
File "/home/misko/Workspace/Faster-RCNN_TF/tools/../lib/networks/__init__.py", line 8, in <module>
from .VGGnet_train import VGGnet_train
File "/home/misko/Workspace/Faster-RCNN_TF/tools/../lib/networks/VGGnet_train.py", line 2, in <module>
from networks.network import Network
File "/home/misko/Workspace/Faster-RCNN_TF/tools/../lib/networks/network.py", line 3, in <module>
import roi_pooling_layer.roi_pooling_op as roi_pool_op
File "/home/misko/Workspace/Faster-RCNN_TF/tools/../lib/roi_pooling_layer/roi_pooling_op.py", line 5, in <module>
_roi_pooling_module = tf.load_op_library(filename)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/load_library.py", line 64, in load_op_library
None, None, error_msg, error_code)
tensorflow.python.framework.errors_impl.NotFoundError: /home/misko/Workspace/Faster-RCNN_TF/tools/../lib/roi_pooling_layer/roi_pooling.so: undefined symbol: _ZN10tensorflow7strings6StrCatB5cxx11ERKNS0_8AlphaNumE
How can I make this run?
I have seen this issue in other threads, but with no solution. Has anyone figured it out?
Tried running demo.py and hit the following error:
raise errors._make_specific_exception(None, None, error_msg, error_code)
tensorflow.python.framework.errors.NotFoundError: ~/Faster-RCNN_TF/tools/../lib/roi_pooling_layer/roi_pooling.so: undefined symbol: _ZN10tensorflow7strings6StrCatB5cxx11ERKNS0_8AlphaNumE
My system had TensorFlow v0.11.0 installed, and I was using Python 2.7.
Thanks!
cloud
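The missing symbol in the two reports above contains the fragment B5cxx11, which is how GCC mangles its [abi:cxx11] tag. That suggests roi_pooling.so was compiled with GCC 5 or newer using the new libstdc++ ABI, while the installed TensorFlow binary exports the old-ABI StrCat. This is a likely cause, not a confirmed diagnosis. Below is a small sketch for spotting the tag; uses_cxx11_abi is a hypothetical helper name.

```python
def uses_cxx11_abi(mangled_symbol):
    """True if an Itanium-mangled C++ name carries GCC's [abi:cxx11] tag."""
    return "B5cxx11" in mangled_symbol


missing = "_ZN10tensorflow7strings6StrCatB5cxx11ERKNS0_8AlphaNumE"
if uses_cxx11_abi(missing):
    # Assumption: the TensorFlow wheel was built with the old libstdc++ ABI,
    # so the custom op must be compiled the same way.
    print("try rebuilding roi_pooling.so with -D_GLIBCXX_USE_CXX11_ABI=0")
```

If that assumption holds, adding -D_GLIBCXX_USE_CXX11_ABI=0 to the g++ line in lib/make.sh and rebuilding should make the symbol names match.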
Running inside the latest TensorFlow Docker image:
docker run -it -p 8888:8888 tensorflow/tensorflow
`
root@f54905c5bdaf:/notebooks/Faster-RCNN_TF# python ./tools/demo.py --model /VGGnet_fast_rcnn_iter_70000.ckpt
Traceback (most recent call last):
File "./tools/demo.py", line 11, in <module>
from networks.factory import get_network
File "/notebooks/Faster-RCNN_TF/tools/../lib/networks/__init__.py", line 8, in <module>
from .VGGnet_train import VGGnet_train
File "/notebooks/Faster-RCNN_TF/tools/../lib/networks/VGGnet_train.py", line 2, in <module>
from networks.network import Network
File "/notebooks/Faster-RCNN_TF/tools/../lib/networks/network.py", line 3, in <module>
import roi_pooling_layer.roi_pooling_op as roi_pool_op
File "/notebooks/Faster-RCNN_TF/tools/../lib/roi_pooling_layer/roi_pooling_op.py", line 5, in <module>
_roi_pooling_module = tf.load_op_library(filename)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/load_library.py", line 63, in load_op_library
raise errors._make_specific_exception(None, None, error_msg, error_code)
tensorflow.python.framework.errors.NotFoundError: /notebooks/Faster-RCNN_TF/tools/../lib/roi_pooling_layer/roi_pooling.so: undefined symbol: _Z22ROIPoolBackwardLaucherPKffiiiiiiiS0_PfPKiRKN5Eigen9GpuDeviceE
root@f54905c5bdaf:/notebooks/Faster-RCNN_TF# nm -gC lib/roi_pooling_layer/roi_pooling.so |grep GpuDevice
U ROIPoolForwardLaucher(float const*, float, int, int, int, int, int, int, float const*, float*, int*, Eigen::GpuDevice const&)
U ROIPoolBackwardLaucher(float const*, float, int, int, int, int, int, int, int, float const*, float*, int const*, Eigen::GpuDevice const&)
U Eigen::GpuDevice const& tensorflow::OpKernelContext::eigen_device<Eigen::GpuDevice>() const
`
cd $FRCN_ROOT/lib
make
After this step I am getting an error:
aquib@javed:~/Faster-RCNN_TF/lib$ make all
python setup.py build_ext --inplace
running build_ext
skipping 'utils/bbox.c' Cython extension (up-to-date)
skipping 'utils/nms.c' Cython extension (up-to-date)
skipping 'nms/cpu_nms.c' Cython extension (up-to-date)
skipping 'nms/gpu_nms.cpp' Cython extension (up-to-date)
rm -rf build
bash make.sh
make.sh: line 13: nvcc: command not found
g++: error: roi_pooling_op.cu.o: No such file or directory
g++: error: GOOGLE_CUDA=1: No such file or directory
What is the problem, and how can I fix it?
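The first failure in that log is "nvcc: command not found". The later g++ error about roi_pooling_op.cu.o follows from it, because nvcc never produced that object file. Below is a minimal sketch for checking whether nvcc is reachable. find_nvcc is a hypothetical helper, and /usr/local/cuda is only the conventional install prefix, so it is an assumption for any given machine.

```python
import os
import shutil


def find_nvcc(cuda_home="/usr/local/cuda"):
    """Return a path to nvcc, or None if it cannot be located."""
    on_path = shutil.which("nvcc")
    if on_path:
        return on_path
    # Fall back to the conventional toolkit location (an assumption).
    candidate = os.path.join(cuda_home, "bin", "nvcc")
    return candidate if os.path.isfile(candidate) else None


nvcc = find_nvcc()
print(nvcc or "nvcc not found: install the CUDA toolkit or add its bin directory to PATH")
```

If nvcc exists under the toolkit directory but is not on PATH, exporting that bin directory in the shell before running make is usually enough.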
I spent so much time debugging this issue that I am posting the answer here:
When running demo.py as described in the README, I got the error cudaCheckError() failed : invalid device function
with no traceback. It happened when this line was executed: https://github.com/smallcorgi/Faster-RCNN_TF/blob/master/lib/fast_rcnn/test.py#L169
I have never seen this error in any of my other TensorFlow projects.
The issue is similar to this one in the Python Faster R-CNN: rbgirshick/py-faster-rcnn#2
I solved it by updating the arch code in https://github.com/smallcorgi/Faster-RCNN_TF/blob/master/lib/make.sh#L9 and https://github.com/smallcorgi/Faster-RCNN_TF/blob/master/lib/setup.py#L137
I don't know how to find the arch code for an arbitrary GPU, but for the Tesla K80, sm_37 seems to work.
Could we change something so that it works for any GPU, or add a note to the README?
I hope this helps people who hit the same issue.
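On finding the arch code: it is the GPU's compute capability written without the dot. TensorFlow prints that capability at startup as "major: X minor: Y" for each device it finds. Below is a minimal sketch of the mapping; arch_flag is a hypothetical helper, not something in the repository.

```python
def arch_flag(major, minor):
    """Build the nvcc -arch flag from a CUDA compute capability."""
    return "-arch=sm_{}{}".format(major, minor)


print(arch_flag(3, 5))  # a device logged as "major: 3 minor: 5"
print(arch_flag(3, 7))  # Tesla K80 has compute capability 3.7
```

The resulting flag would replace the hard-coded value in both lib/make.sh and lib/setup.py before rebuilding.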
Hello!
I tried to run demo.py, but it failed because I do not have faster_rcnn_tf.model. Can you give a link to faster_rcnn_tf.model or the pre-trained VGG16 model?
Thank you very much.
The error reads like this:
make.sh: line 8: nvcc: command not found
g++: error: roi_pooling_op.cu.o: No such file or directory
Thanks very much! @smallcorgi
Whenever I run make as per the instructions, I get this error:
In file included from roi_pooling_op.cc:25:0:
work_sharder.h:21:49: fatal error: tensorflow/core/lib/core/threadpool.h: No such file or directory
#include "tensorflow/core/lib/core/threadpool.h"
Please advise what I am missing.
I am using CUDA 8 and cuDNN 5.1.5, Ubuntu 16.04.
When I build the code, I get the errors below:
python setup.py build_ext --inplace
running build_ext
skipping 'utils/bbox.c' Cython extension (up-to-date)
skipping 'utils/nms.c' Cython extension (up-to-date)
skipping 'nms/cpu_nms.c' Cython extension (up-to-date)
skipping 'nms/gpu_nms.cpp' Cython extension (up-to-date)
rm -rf build
bash make.sh
I tensorflow/stream_executor/dso_loader.cc:111] successfully opened CUDA library libcublas.so locally
I tensorflow/stream_executor/dso_loader.cc:111] successfully opened CUDA library libcudnn.so locally
I tensorflow/stream_executor/dso_loader.cc:111] successfully opened CUDA library libcufft.so locally
I tensorflow/stream_executor/dso_loader.cc:111] successfully opened CUDA library libcuda.so.1 locally
I tensorflow/stream_executor/dso_loader.cc:111] successfully opened CUDA library libcurand.so locally
/usr/lib/gcc/x86_64-linux-gnu/5/include/mwaitxintrin.h(36): error: identifier "__builtin_ia32_monitorx" is undefined
/usr/lib/gcc/x86_64-linux-gnu/5/include/mwaitxintrin.h(42): error: identifier "__builtin_ia32_mwaitx" is undefined
2 errors detected in the compilation of "/tmp/tmpxft_0000529b_00000000-7_roi_pooling_op_gpu.cu.cpp1.ii".
g++: error: roi_pooling_op.cu.o: No such file or directory
Hi all,
I am getting the following error while training. I have installed the latest TensorFlow version. Did the blobs dictionary change with the new TensorFlow?
Any help is appreciated
Thanks
assign pretrain model weights to conv1_2
assign pretrain model biases to conv1_2
assign pretrain model weights to conv2_2
assign pretrain model biases to conv2_2
assign pretrain model weights to conv2_1
assign pretrain model biases to conv2_1
/scratch3/skoppura/Faster-RCNN_TF/tools/../lib/roi_data_layer/minibatch.py:100: VisibleDeprecationWarning: using a non-integer number instead of an integer will result in an error in the future
fg_inds, size=fg_rois_per_this_image, replace=False)
/scratch3/skoppura/Faster-RCNN_TF/tools/../lib/roi_data_layer/minibatch.py:120: VisibleDeprecationWarning: using a non-integer number instead of an integer will result in an error in the future
labels[fg_rois_per_this_image:] = 0
/scratch3/skoppura/Faster-RCNN_TF/tools/../lib/roi_data_layer/minibatch.py:176: VisibleDeprecationWarning: using a non-integer number instead of an integer will result in an error in the future
bbox_targets[ind, start:end] = bbox_target_data[ind, 1:]
/scratch3/skoppura/Faster-RCNN_TF/tools/../lib/roi_data_layer/minibatch.py:177: VisibleDeprecationWarning: using a non-integer number instead of an integer will result in an error in the future
bbox_inside_weights[ind, start:end] = cfg.TRAIN.BBOX_INSIDE_WEIGHTS
Traceback (most recent call last):
File "./tools/train_net.py", line 96, in <module>
max_iters=args.max_iters)
File "/scratch3/skoppura/Faster-RCNN_TF/tools/../lib/fast_rcnn/train.py", line 222, in train_net
sw.train_model(sess, max_iters)
File "/scratch3/skoppura/Faster-RCNN_TF/tools/../lib/fast_rcnn/train.py", line 147, in train_model
feed_dict={self.net.data: blobs['data'], self.net.im_info: blobs['im_info'], self.net.keep_prob: 0.5, \
KeyError: 'im_info'
I got an error when I ran demo.py on the GPU.
Environment: ubuntu14.04+python2.7+cuda7.5+cudnn5.1
root@SSAP-G3-Guest:~/Workspace/faster-rcnn-master# python tools/demo.py --model models/VGGnet_fast_rcnn_iter_70000.ckpt
The error is as follows:
Loaded network models/VGGnet_fast_rcnn_iter_70000.ckpt
E tensorflow/stream_executor/cuda/cuda_dnn.cc:378] Loaded runtime CuDNN library: 5006
(compatibility version 5000) but source was compiled with 5103 (compatibility version 5100).
If using a binary install, upgrade your CuDNN library to match.
If building from sources, make sure the library loaded at runtime matches a compatible version specified during compile configuration.
F tensorflow/core/kernels/conv_ops.cc:532] Check failed: stream->parent()->GetConvolveAlgorithms(&algorithms)
Aborted (core dumped)
Thanks!
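The two numbers in that cuDNN message are packed version codes. For cuDNN releases of this era the code is major * 1000 + minor * 100 + patch, so the library found at runtime is older than the one TensorFlow was built against. Below is a minimal sketch of the decoding; decode_cudnn_version is a hypothetical helper.

```python
def decode_cudnn_version(code):
    """Split a packed cuDNN version code into (major, minor, patch)."""
    major, rest = divmod(code, 1000)
    minor, patch = divmod(rest, 100)
    return major, minor, patch


loaded = decode_cudnn_version(5006)    # library found at runtime
compiled = decode_cudnn_version(5103)  # version TensorFlow was built against
print(loaded, compiled)
if loaded[:2] != compiled[:2]:
    print("runtime cuDNN does not match the build-time major.minor version")
```

Following the message's own advice, upgrading the cuDNN library that is actually on the loader path to the 5.1 series should clear the check.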
Hi,
I am wondering whether this implementation includes an ROI Pooling layer.
My understanding is the ROI Pooling is not supported in generic TensorFlow. I could be wrong though...
Thanks!
cloud
As in Faster R-CNN, training allows only a single image per batch. In Caffe, the "iter_size" parameter can be adjusted to emulate multi-image batches, since weights are updated only after "iter_size" iterations, i.e., images. Can this be done in TF?
Thank you.
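The iter_size idea is gradient accumulation: sum the gradients from several single-image steps and apply one update. Below is a framework-free sketch of that logic on a toy problem. train_with_accumulation and grad_fn are hypothetical names, and averaging the accumulated gradient is an assumption about the desired normalisation.

```python
def train_with_accumulation(weights, grad_fn, samples, iter_size, lr):
    """SGD that averages gradients over iter_size samples before each update."""
    weights = list(weights)
    accum = [0.0] * len(weights)
    for step, sample in enumerate(samples, start=1):
        grads = grad_fn(weights, sample)
        accum = [a + g for a, g in zip(accum, grads)]
        if step % iter_size == 0:
            # One weight update per iter_size samples.
            weights = [w - lr * a / iter_size for w, a in zip(weights, accum)]
            accum = [0.0] * len(weights)
    return weights


def grad_fn(weights, x):
    """Gradient of the toy loss (w - x)^2 with respect to w."""
    return [2.0 * (weights[0] - x)]


print(train_with_accumulation([0.0], grad_fn, [1.0, 3.0], iter_size=2, lr=0.5))
```

In TensorFlow the same pattern could probably be built from the optimizer's compute_gradients and apply_gradients methods. The gradients would be accumulated in non-trainable variables across several sess.run calls, but that wiring is not shown here.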
When I built the cython modules, some errors came out:
#python setup.py build_ext --inplace
running build_ext
skipping 'utils/bbox.c' Cython extension (up-to-date)
skipping 'utils/nms.c' Cython extension (up-to-date)
skipping 'nms/cpu_nms.c' Cython extension (up-to-date)
skipping 'nms/gpu_nms.cpp' Cython extension (up-to-date)
rm -rf build
bash make.sh
I tensorflow/stream_executor/dso_loader.cc:105] successfully opened CUDA library libcublas.so locally
I tensorflow/stream_executor/dso_loader.cc:105] successfully opened CUDA library libcudnn.so locally
I tensorflow/stream_executor/dso_loader.cc:105] successfully opened CUDA library libcufft.so locally
I tensorflow/stream_executor/dso_loader.cc:105] successfully opened CUDA library libcuda.so.1 locally
I tensorflow/stream_executor/dso_loader.cc:105] successfully opened CUDA library libcurand.so locally
In file included from roi_pooling_op.cc:25:0:
work_sharder.h:21:49: fatal error: tensorflow/core/lib/core/threadpool.h: No such file or directory
#include "tensorflow/core/lib/core/threadpool.h"
^
compilation terminated.
So I went to the Python lib directory and found that the tensorflow/core/lib/core folder does not have threadpool.h. My Python is 2.7.8 and the OS is CentOS 6.8. I think my TensorFlow installation is fine, so how can I solve this problem? Has anyone met the same situation?
I was training on a new dataset which is based on the format of VOC2007, and got 5000 iterations into training when there was a crash. It looks like something happened while trying to take a snapshot of the weights of the neural net. Any ideas on how to fix this?
Here's the error:
Traceback (most recent call last):
File "./tools/train_net.py", line 95, in <module>
max_iters=args.max_iters)
File "Faster-RCNN_TF-master/tools/../lib/fast_rcnn/train.py", line 209, in train_net
sw.train_model(sess, max_iters)
File "Faster-RCNN_TF-master/tools/../lib/fast_rcnn/train.py", line 166, in train_model
self.snapshot(sess, iter)
File "Faster-RCNN_TF-master/tools/../lib/fast_rcnn/train.py", line 60, in snapshot
sess.run(weights.assign(orig_0 * np.tile(self.bbox_stds, (weights_shape[0],1))))
ValueError: operands could not be broadcast together with shapes (4096,84) (4096,32)
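Both widths in that ValueError are multiples of 4, and the box-regression layer has 4 targets per class. So one plausible reading is that the network was built for 21 classes (the 20 PASCAL VOC classes plus background) while bbox_stds was computed for a dataset with 8 classes. This is an inference from the shapes only. Below is a minimal sketch of the arithmetic; classes_from_bbox_width is a hypothetical helper.

```python
def classes_from_bbox_width(width):
    """Number of classes implied by a bbox_pred width (4 targets per class)."""
    if width % 4:
        raise ValueError("bbox_pred width must be a multiple of 4")
    return width // 4


net_classes = classes_from_bbox_width(84)   # from the weights shape (4096, 84)
data_classes = classes_from_bbox_width(32)  # from the tiled bbox_stds shape (4096, 32)
print(net_classes, data_classes)
```

If the two counts differ, as they do here, the class count used when the network is constructed probably needs to match the new dataset, background class included.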
def load2(self, data_path, session, ignore_missing=False):
    saver = tf.train.Saver()
    print 'model start restore'
    with tf.Session() as sess:
        saver.restore(sess, data_path)
    print 'Model Restored'
W tensorflow/core/framework/op_kernel.cc:940] Not found: Tensor name "Variable" not found in checkpoint files /home/jmy/Desktop/Faster-RCNN_TF-master/VGGnet_fast_rcnn_iter_70000.ckpt
[[Node: save_1/restore_slice = RestoreSlice[dt=DT_FLOAT, preferred_shard=-1, _device="/job:localhost/replica:0/task:0/cpu:0"](_recv_save_1/Const_0, save_1/restore_slice/tensor_name, save_1/restore_slice/shape_and_slice)]]
lr = tf.Variable(cfg.TRAIN.LEARNING_RATE, trainable=False) to
lr = tf.Variable(cfg.TRAIN.LEARNING_RATE, trainable=False,name='learning_rate'),
W tensorflow/core/framework/op_kernel.cc:940] Not found: Tensor name "rpn_conv/3x3/biases/Momentum" not found in checkpoint files /home/jmy/Desktop/Faster-RCNN_TF-master/Resnet_fast_rcnn_iter_1000.ckpt
[[Node: save_1/restore_slice_338 = RestoreSlice[dt=DT_FLOAT, preferred_shard=-1, _device="/job:localhost/replica:0/task:0/cpu:0"](_recv_save_1/Const_0, save_1/restore_slice_338/tensor_name, save_1/restore_slice_338/shape_and_slice)]]
momentum = tf.Variable(cfg.TRAIN.MOMENTUM,trainable=False,name='momentum')?
I trained this model on a machine that has a GTX 1080 and 16 GB of memory. It always ends with:
out of memory
invalid argument
an illegal memory access was encountered
E tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:662] failed to record completion event; therefore, failed to create inter-stream dependency
I tensorflow/stream_executor/stream.cc:3788] stream 0x50b3390 did not memcpy host-to-device; source: 0x7f1c216f6e60
E tensorflow/stream_executor/stream.cc:272] Error recording event in stream: error recording CUDA event on stream 0x50989c0: CUDA_ERROR_ILLEGAL_ADDRESS; not marking stream as bad, as the Event object may be at fault. Monitor for further errors.
E tensorflow/stream_executor/cuda/cuda_event.cc:49] Error polling for event status: failed to query event: CUDA_ERROR_ILLEGAL_ADDRESS
F tensorflow/core/common_runtime/gpu/gpu_event_mgr.cc:198] Unexpected Event status: 1
[1] 6900 abort (core dumped) python ./faster_rcnn/train_net.py --gpu 0 --weights --imdb voc_2007_trainval
and README.md says:
Requirements: hardware
For training the end-to-end version of Faster R-CNN with VGG16, 3G of GPU memory is sufficient (using CUDNN)
What's the problem?
The pre-trained model is VGGnet_fast_rcnn_iter_70000.ckpt
Hi,
Errors were encountered when building Faster-RCNN-TF in CPU only environment.
It still did not succeed even after all "cuda" references were removed from setup.py.
Any suggestions?
Thanks!
cloud
I'm not familiar with this. When I run make in /lib, I get a lot of warnings, and I'm not sure whether the build will work. Could anyone answer my question? I'm truly confused.
Here's my log; it doesn't look right:
irondroid@PC:~/Faster-RCNN_TF-master/lib$ make
python setup.py build_ext --inplace
running build_ext
cythoning utils/bbox.pyx to utils/bbox.c
building 'utils.cython_bbox' extension
creating build
creating build/temp.linux-x86_64-2.7
creating build/temp.linux-x86_64-2.7/utils
{'gcc': ['-Wno-cpp', '-Wno-unused-function']}
gcc -pthread -fno-strict-aliasing -g -O2 -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -fPIC -I/home/irondroid/anaconda2/lib/python2.7/site-packages/numpy/core/include -I/home/irondroid/anaconda2/include/python2.7 -c utils/bbox.c -o build/temp.linux-x86_64-2.7/utils/bbox.o -Wno-cpp -Wno-unused-function
gcc -pthread -shared -L/home/irondroid/anaconda2/lib -Wl,-rpath=/home/irondroid/anaconda2/lib,--no-as-needed build/temp.linux-x86_64-2.7/utils/bbox.o -L/home/irondroid/anaconda2/lib -lpython2.7 -o /home/irondroid/Faster-RCNN_TF-master/lib/utils/cython_bbox.so
cythoning utils/nms.pyx to utils/nms.c
building 'utils.cython_nms' extension
{'gcc': ['-Wno-cpp', '-Wno-unused-function']}
gcc -pthread -fno-strict-aliasing -g -O2 -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -fPIC -I/home/irondroid/anaconda2/lib/python2.7/site-packages/numpy/core/include -I/home/irondroid/anaconda2/include/python2.7 -c utils/nms.c -o build/temp.linux-x86_64-2.7/utils/nms.o -Wno-cpp -Wno-unused-function
gcc -pthread -shared -L/home/irondroid/anaconda2/lib -Wl,-rpath=/home/irondroid/anaconda2/lib,--no-as-needed build/temp.linux-x86_64-2.7/utils/nms.o -L/home/irondroid/anaconda2/lib -lpython2.7 -o /home/irondroid/Faster-RCNN_TF-master/lib/utils/cython_nms.so
cythoning nms/cpu_nms.pyx to nms/cpu_nms.c
building 'nms.cpu_nms' extension
creating build/temp.linux-x86_64-2.7/nms
{'gcc': ['-Wno-cpp', '-Wno-unused-function']}
gcc -pthread -fno-strict-aliasing -g -O2 -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -fPIC -I/home/irondroid/anaconda2/lib/python2.7/site-packages/numpy/core/include -I/home/irondroid/anaconda2/include/python2.7 -c nms/cpu_nms.c -o build/temp.linux-x86_64-2.7/nms/cpu_nms.o -Wno-cpp -Wno-unused-function
gcc -pthread -shared -L/home/irondroid/anaconda2/lib -Wl,-rpath=/home/irondroid/anaconda2/lib,--no-as-needed build/temp.linux-x86_64-2.7/nms/cpu_nms.o -L/home/irondroid/anaconda2/lib -lpython2.7 -o /home/irondroid/Faster-RCNN_TF-master/lib/nms/cpu_nms.so
cythoning nms/gpu_nms.pyx to nms/gpu_nms.cpp
building 'nms.gpu_nms' extension
{'gcc': ['-Wno-unused-function'], 'nvcc': ['-arch=sm_35', '--ptxas-options=-v', '-c', '--compiler-options', "'-fPIC'"]}
/usr/local/cuda/bin/nvcc -I/home/irondroid/anaconda2/lib/python2.7/site-packages/numpy/core/include -I/usr/local/cuda/include -I/home/irondroid/anaconda2/include/python2.7 -c nms/nms_kernel.cu -o build/temp.linux-x86_64-2.7/nms/nms_kernel.o -arch=sm_35 --ptxas-options=-v -c --compiler-options '-fPIC'
ptxas info : 0 bytes gmem
ptxas info : Compiling entry function '_Z10nms_kernelifPKfPy' for 'sm_35'
ptxas info : Function properties for _Z10nms_kernelifPKfPy
0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
ptxas info : Used 25 registers, 1280 bytes smem, 344 bytes cmem[0], 8 bytes cmem[2]
{'gcc': ['-Wno-unused-function'], 'nvcc': ['-arch=sm_35', '--ptxas-options=-v', '-c', '--compiler-options', "'-fPIC'"]}
gcc -pthread -fno-strict-aliasing -g -O2 -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -fPIC -I/home/irondroid/anaconda2/lib/python2.7/site-packages/numpy/core/include -I/usr/local/cuda/include -I/home/irondroid/anaconda2/include/python2.7 -c nms/gpu_nms.cpp -o build/temp.linux-x86_64-2.7/nms/gpu_nms.o -Wno-unused-function
cc1plus: warning: command line option ‘-Wstrict-prototypes’ is valid for C/ObjC but not for C++
In file included from /home/irondroid/anaconda2/lib/python2.7/site-packages/numpy/core/include/numpy/ndarraytypes.h:1777:0,
from /home/irondroid/anaconda2/lib/python2.7/site-packages/numpy/core/include/numpy/ndarrayobject.h:18,
from /home/irondroid/anaconda2/lib/python2.7/site-packages/numpy/core/include/numpy/arrayobject.h:4,
from nms/gpu_nms.cpp:283:
/home/irondroid/anaconda2/lib/python2.7/site-packages/numpy/core/include/numpy/npy_1_7_deprecated_api.h:15:2: warning: #warning "Using deprecated NumPy API, disable it by " "#defining NPY_NO_DEPRECATED_API NPY_1_7_API_VERSION" [-Wcpp]
#warning "Using deprecated NumPy API, disable it by "
^
g++ -pthread -shared -L/home/irondroid/anaconda2/lib -Wl,-rpath=/home/irondroid/anaconda2/lib,--no-as-needed build/temp.linux-x86_64-2.7/nms/nms_kernel.o build/temp.linux-x86_64-2.7/nms/gpu_nms.o -L/usr/local/cuda/lib64 -L/home/irondroid/anaconda2/lib -Wl,-R/usr/local/cuda/lib64 -lcudart -lpython2.7 -o /home/irondroid/Faster-RCNN_TF-master/lib/nms/gpu_nms.so
rm -rf build
sh make.sh
/home/irondroid/anaconda2/lib/python2.7/site-packages/tensorflow/include
/home/irondroid/anaconda2/lib/python2.7/site-packages/tensorflow/include/unsupported/Eigen/CXX11/../../../Eigen/src/Core/MathFunctions.h(1052): warning: calling a constexpr host function("real") from a host device function("abs") is not allowed. The experimental flag '--expt-relaxed-constexpr' can be used to allow this.
/home/irondroid/anaconda2/lib/python2.7/site-packages/tensorflow/include/unsupported/Eigen/CXX11/../../../Eigen/src/Core/MathFunctions.h(1052): warning: calling a constexpr host function("imag") from a host device function("abs") is not allowed. The experimental flag '--expt-relaxed-constexpr' can be used to allow this.
/home/irondroid/anaconda2/lib/python2.7/site-packages/tensorflow/include/unsupported/Eigen/CXX11/../../../Eigen/src/Core/MathFunctions.h(1052): warning: calling a constexpr host function from a host device function is not allowed. The experimental flag '--expt-relaxed-constexpr' can be used to allow this.
/home/irondroid/anaconda2/lib/python2.7/site-packages/tensorflow/include/unsupported/Eigen/CXX11/../../../Eigen/src/Core/MathFunctions.h(1052): warning: calling a constexpr host function from a host device function is not allowed. The experimental flag '--expt-relaxed-constexpr' can be used to allow this.
/home/irondroid/anaconda2/lib/python2.7/site-packages/tensorflow/include/unsupported/Eigen/CXX11/../../../Eigen/src/Core/MathFunctions.h(1057): warning: calling a constexpr host function("real") from a host device function("abs") is not allowed. The experimental flag '--expt-relaxed-constexpr' can be used to allow this.
/home/irondroid/anaconda2/lib/python2.7/site-packages/tensorflow/include/unsupported/Eigen/CXX11/../../../Eigen/src/Core/MathFunctions.h(1057): warning: calling a constexpr host function("imag") from a host device function("abs") is not allowed. The experimental flag '--expt-relaxed-constexpr' can be used to allow this.
/home/irondroid/anaconda2/lib/python2.7/site-packages/tensorflow/include/unsupported/Eigen/CXX11/../../../Eigen/src/Core/MathFunctions.h(1057): warning: calling a constexpr host function from a host device function is not allowed. The experimental flag '--expt-relaxed-constexpr' can be used to allow this.
/home/irondroid/anaconda2/lib/python2.7/site-packages/tensorflow/include/unsupported/Eigen/CXX11/../../../Eigen/src/Core/MathFunctions.h(1057): warning: calling a constexpr host function from a host device function is not allowed. The experimental flag '--expt-relaxed-constexpr' can be used to allow this.
Hi,
I try to run the demo like this:
./tools/demo.py --model ./VGGnet_fast_rcnn_iter_70000.ckpt
But the mouse pointer turns into a cross, and after I click several times, this is printed:
from: can't read /var/mail/fast_rcnn.config
from: can't read /var/mail/fast_rcnn.test
from: can't read /var/mail/fast_rcnn.nms_wrapper
from: can't read /var/mail/utils.timer
from: can't read /var/mail/networks.factory
./tools/demo.py: line 14: syntax error near unexpected token `('
./tools/demo.py: line 14: `CLASSES = ('__background__','
How to solve this problem?
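Those messages look like the script was executed by the shell, not by Python. With no interpreter line at the top of the file, the shell runs each "import" and "from" line as a command, which would explain both the cross-shaped cursor (ImageMagick's import tool) and the /var/mail errors (the from utility). Below is a minimal sketch for checking whether a script declares a Python interpreter; has_python_shebang is a hypothetical helper.

```python
import os
import tempfile


def has_python_shebang(path):
    """True if the file's first line is a #! line that mentions python."""
    with open(path, "rb") as handle:
        first_line = handle.readline()
    return first_line.startswith(b"#!") and b"python" in first_line


# Self-contained demonstration on a temporary script without a shebang.
with tempfile.NamedTemporaryFile("w", suffix=".py", delete=False) as tmp:
    tmp.write("import os\nfrom os import path\n")
print(has_python_shebang(tmp.name))
os.remove(tmp.name)
```

When the check fails, calling the interpreter explicitly (python ./tools/demo.py --model ...) sidesteps the problem.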
Hi,
I am running into a problem where NMS gives different results on CPU and GPU. I am not sure whether I am missing anything. If anyone knows the reason, please let me know. Thanks a lot!
The config I am toggling is __C.USE_GPU_NMS
in lib/fast_rcnn/config.py
CPU:
Demo for data/demo/000456.jpg
Detection took 5.475s for 300 object proposals
GPU:
Demo for data/demo/000456.jpg
Detection took 4.063s for 3 object proposals
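One way to narrow this down might be to feed the same boxes to both implementations and compare each against a small reference. Below is a pure-Python sketch of greedy NMS using the inclusive "+1" pixel convention. The function names and the sample boxes are illustrative and do not come from the repository.

```python
def iou(a, b):
    """IoU of two (x1, y1, x2, y2, ...) boxes with inclusive pixel coordinates."""
    iw = min(a[2], b[2]) - max(a[0], b[0]) + 1
    ih = min(a[3], b[3]) - max(a[1], b[1]) + 1
    if iw <= 0 or ih <= 0:
        return 0.0
    inter = iw * ih
    area_a = (a[2] - a[0] + 1) * (a[3] - a[1] + 1)
    area_b = (b[2] - b[0] + 1) * (b[3] - b[1] + 1)
    return inter / float(area_a + area_b - inter)


def nms(dets, thresh):
    """Greedy NMS over (x1, y1, x2, y2, score) rows; returns kept indices."""
    order = sorted(range(len(dets)), key=lambda i: dets[i][4], reverse=True)
    keep = []
    for i in order:
        # Keep a box only if it does not overlap any already-kept box too much.
        if all(iou(dets[i], dets[k]) <= thresh for k in keep):
            keep.append(i)
    return keep


dets = [
    (10, 10, 50, 50, 0.9),
    (12, 12, 52, 52, 0.8),      # overlaps the first box heavily
    (100, 100, 150, 150, 0.7),  # far from the others
]
print(nms(dets, thresh=0.5))
```

If the CPU path agrees with the reference and the GPU path does not, the GPU extension build (for example its -arch setting) would be the next thing to check.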
Hi!
I failed to run faster_rcnn_end2end.sh.
Can you explain how to train a new detector?
Thank you very much!
When I run python ./demo.py --model modelpath, I get a lot of identical warnings: libpng warning: Application built with libpng-1.6.22 but running with 1.5.12
It still shows some pictures, though. I don't know whether this is right. Could someone help me?
Hello,
When I train the model, each iteration gets slower over time: at the beginning the speed is about 0.4 s/iter, but after 10000 iterations it drops to about 1 s/iter. However, the time spent in the TensorFlow session call
rpn_loss_cls_value, rpn_loss_box_value,loss_cls_value, loss_box_value, _ = sess.run([rpn_cross_entropy, rpn_loss_box, cross_entropy, loss_box, train_op], feed_dict=feed_dict)
does not increase.
What's more, CPU time is much higher than at the beginning, and GPU usage is often 0%. I therefore suspect that something is wrong in roi_data_layer, which runs on the CPU.
I have checked the code but cannot find any bug. Has anyone met this problem, and how can it be solved?
Thank you.
hello,
When I run the demo with "./tools/demo.py --model model/VGGnet_fast_rcnn_iter_70000.ckpt",
I get an error: "cudaCheckError() failed : invalid device function"
Could you please tell me how to fix it? The log is:
I tensorflow/stream_executor/dso_loader.cc:108] successfully opened CUDA library libcublas.so.7.5 locally
I tensorflow/stream_executor/dso_loader.cc:108] successfully opened CUDA library libcudnn.so.4 locally
I tensorflow/stream_executor/dso_loader.cc:108] successfully opened CUDA library libcufft.so.7.5 locally
I tensorflow/stream_executor/dso_loader.cc:108] successfully opened CUDA library libcuda.so.1 locally
I tensorflow/stream_executor/dso_loader.cc:108] successfully opened CUDA library libcurand.so.7.5 locally
I tensorflow/core/common_runtime/gpu/gpu_init.cc:102] Found device 0 with properties:
name: Tesla K40m
major: 3 minor: 5 memoryClockRate (GHz) 0.745
pciBusID 0000:02:00.0
Total memory: 11.25GiB
Free memory: 11.12GiB
W tensorflow/stream_executor/cuda/cuda_driver.cc:572] creating context when one is currently active; existing: 0x2ac5280
I tensorflow/core/common_runtime/gpu/gpu_init.cc:102] Found device 1 with properties:
name: Tesla K40m
major: 3 minor: 5 memoryClockRate (GHz) 0.745
pciBusID 0000:03:00.0
Total memory: 11.25GiB
Free memory: 11.12GiB
I tensorflow/core/common_runtime/gpu/gpu_init.cc:126] DMA: 0 1
I tensorflow/core/common_runtime/gpu/gpu_init.cc:136] 0: Y Y
I tensorflow/core/common_runtime/gpu/gpu_init.cc:136] 1: Y Y
I tensorflow/core/common_runtime/gpu/gpu_device.cc:838] Creating TensorFlow device (/gpu:0) -> (device: 0, name: Tesla K40m, pci bus id: 0000:02:00.0)
I tensorflow/core/common_runtime/gpu/gpu_device.cc:838] Creating TensorFlow device (/gpu:1) -> (device: 1, name: Tesla K40m, pci bus id: 0000:03:00.0)
Tensor("Placeholder:0", shape=(?, ?, ?, 3), dtype=float32)
Tensor("conv5_3/conv5_3:0", shape=(?, ?, ?, 512), dtype=float32)
Tensor("rpn_conv/3x3/rpn_conv/3x3:0", shape=(?, ?, ?, 512), dtype=float32)
Tensor("rpn_cls_score/rpn_cls_score:0", shape=(?, ?, ?, 18), dtype=float32)
Tensor("rpn_cls_prob:0", shape=(?, ?, ?, ?), dtype=float32)
Tensor("rpn_cls_prob_reshape:0", shape=(?, ?, ?, 18), dtype=float32)
Tensor("rpn_bbox_pred/rpn_bbox_pred:0", shape=(?, ?, ?, 36), dtype=float32)
Tensor("Placeholder_1:0", shape=(?, 3), dtype=float32)
Tensor("conv5_3/conv5_3:0", shape=(?, ?, ?, 512), dtype=float32)
Tensor("rois:0", shape=(?, 5), dtype=float32)
[<tf.Tensor 'conv5_3/conv5_3:0' shape=(?, ?, ?, 512) dtype=float32>, <tf.Tensor 'rois:0' shape=(?, 5) dtype=float32>]
Tensor("fc7/fc7:0", shape=(?, 4096), dtype=float32)
Loaded network model/VGGnet_fast_rcnn_iter_70000.ckpt
cudaCheckError() failed : invalid device function
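This error commonly appears when a CUDA kernel was compiled for a different GPU architecture than the one it runs on. The log above reports `major: 3 minor: 5` for the Tesla K40m, which would correspond to `sm_35`. A minimal sketch, assuming the custom op is built with a hard-coded `-arch` flag passed to `nvcc`; the helper name `nvcc_arch_flag` is hypothetical:

```python
def nvcc_arch_flag(major, minor):
    """Build the nvcc -arch value for a GPU's compute capability."""
    return "-arch=sm_{}{}".format(major, minor)

# Compute capability reported in the log above for the Tesla K40m.
print(nvcc_arch_flag(3, 5))
```

If the build script uses a different value, rebuilding the op with the flag printed here would be the first thing to try.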
I noticed that ./tools/train_net.py has a variable device_name that reads the GPU id from the command line, but it is never actually used. Does that mean this model can only be trained on the default GPU?
I am not familiar with TensorFlow, so I wonder how I can specify which GPU to use, or even use the CPU.
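One option that does not depend on the training script is the `CUDA_VISIBLE_DEVICES` environment variable, which the CUDA runtime reads to decide which GPUs a process can see. A minimal sketch, assuming it runs before TensorFlow is imported; `restrict_devices` is a hypothetical helper, not part of this repo:

```python
import os

def restrict_devices(device, device_id=0):
    """Limit the GPUs visible to this process; call before importing tensorflow.

    Returns the TensorFlow device string to place ops on.
    """
    if device == "cpu":
        # An empty value hides every GPU, so TensorFlow falls back to the CPU.
        os.environ["CUDA_VISIBLE_DEVICES"] = ""
        return "/cpu:0"
    # Only the chosen physical GPU is exposed; inside the process it is renumbered to 0.
    os.environ["CUDA_VISIBLE_DEVICES"] = str(device_id)
    return "/gpu:0"

device_name = restrict_devices("gpu", 1)
print(device_name, os.environ["CUDA_VISIBLE_DEVICES"])
```

The returned string could then be passed to `tf.device(...)` around graph construction.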
I tried 'python demo.py --model model_path' (where model_path is where I put VGG16_imagenet.npy), but it failed.
Did I get the right model?
Please give me some help, thanks.
python setup.py build_ext --inplace
running build_ext
cythoning utils/bbox.pyx to utils/bbox.c
Error compiling Cython file:
------------------------------------------------------------
...
# Copyright (c) 2015 Microsoft
# Licensed under The MIT License [see LICENSE for details]
# Written by Sergey Karayev
# --------------------------------------------------------
cimport cython
^
------------------------------------------------------------
utils/bbox.pyx:8:8: Compiler crash in AnalyseDeclarationsTransform
File 'ModuleNode.py', line 103, in analyse_declarations: ModuleNode(bbox.pyx:1:0,
full_module_name = 'utils.cython_bbox')
File 'Nodes.py', line 425, in analyse_declarations: StatListNode(bbox.pyx:8:0)
File 'Nodes.py', line 425, in analyse_declarations: StatListNode(bbox.pyx:8:8)
File 'Nodes.py', line 7346, in analyse_declarations: CImportStatNode(bbox.pyx:8:8,
module_name = u'cython')
Compiler crash traceback from this point on:
File "/home/gt/anaconda2/lib/python2.7/site-packages/Cython/Compiler/Nodes.py", line 7346, in analyse_declarations
self.module_name, self.pos, relative_level=0 if self.is_absolute else -1)
File "/home/gt/anaconda2/lib/python2.7/site-packages/Cython/Compiler/Symtab.py", line 1159, in find_module
module_name, relative_to=relative_to, pos=pos, absolute_fallback=absolute_fallback)
File "/home/gt/anaconda2/lib/python2.7/site-packages/Cython/Compiler/Main.py", line 178, in find_module
pxd_pathname = self.find_pxd_file(qualified_name, pos)
File "/home/gt/anaconda2/lib/python2.7/site-packages/Cython/Compiler/Main.py", line 239, in find_pxd_file
pxd = self.search_include_directories(qualified_name, ".pxd", pos, sys_path=sys_path)
File "/home/gt/anaconda2/lib/python2.7/site-packages/Cython/Compiler/Main.py", line 280, in search_include_directories
tuple(self.include_directories), qualified_name, suffix, pos, include, sys_path)
File "/home/gt/anaconda2/lib/python2.7/site-packages/Cython/Utils.py", line 29, in wrapper
res = cache[args] = f(*args)
File "/home/gt/anaconda2/lib/python2.7/site-packages/Cython/Utils.py", line 119, in search_include_directories
path = os.path.join(dir, dotted_filename)
File "/home/gt/anaconda2/lib/python2.7/posixpath.py", line 73, in join
path += '/' + b
UnicodeDecodeError: 'ascii' codec can't decode byte 0xe4 in position 9: ordinal not in range(128)
building 'utils.cython_bbox' extension
creating build
creating build/temp.linux-x86_64-2.7
creating build/temp.linux-x86_64-2.7/utils
{'gcc': ['-Wno-cpp', '-Wno-unused-function']}
gcc -pthread -fno-strict-aliasing -g -O2 -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -fPIC -I/home/gt/anaconda2/lib/python2.7/site-packages/numpy/core/include -I/home/gt/anaconda2/include/python2.7 -c utils/bbox.c -o build/temp.linux-x86_64-2.7/utils/bbox.o -Wno-cpp -Wno-unused-function
utils/bbox.c:1:2: error: #error Do not use this file, it is the result of a failed Cython compilation.
#error Do not use this file, it is the result of a failed Cython compilation.
^
error: command 'gcc' failed with exit status 1
Makefile:2: recipe for target 'all' failed
make: *** [all] Error 1
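The `UnicodeDecodeError` is raised inside `posixpath.join` while Cython searches its include directories, which suggests that one of the directories on the search path contains a non-ASCII character (byte `0xe4`). A minimal sketch for locating such a path; the helper `non_ascii_paths` is hypothetical, and moving the checkout to an ASCII-only path is an assumed workaround, not something confirmed in this thread:

```python
import os
import sys

def non_ascii_paths(paths):
    """Return the entries that contain at least one non-ASCII character or byte."""
    flagged = []
    for p in paths:
        # Byte strings are viewed as integers; text strings are checked per character.
        data = bytearray(p) if isinstance(p, bytes) else p
        if any((c if isinstance(c, int) else ord(c)) > 127 for c in data):
            flagged.append(p)
    return flagged

# Check the working directory and the import search path of this interpreter.
candidates = [os.getcwd()] + list(sys.path)
for path in non_ascii_paths(candidates):
    print(repr(path))
```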
Hello
I'm interested in fine-tuning a pre-trained model. Do you have any recommendation on where to start, which layers I should train, and which I should keep fixed? Is there a good tutorial on that?
Thank you
While trying to run make, I get the error: roi_pooling_op.cu.o: No such file or directory
I tensorflow/stream_executor/dso_loader.cc:108] successfully opened CUDA library libcublas.so locally
I tensorflow/stream_executor/dso_loader.cc:108] successfully opened CUDA library libcudnn.so locally
I tensorflow/stream_executor/dso_loader.cc:108] successfully opened CUDA library libcufft.so locally
I tensorflow/stream_executor/dso_loader.cc:108] successfully opened CUDA library libcuda.so.1 locally
I tensorflow/stream_executor/dso_loader.cc:108] successfully opened CUDA library libcurand.so locally
make.sh: line 13: nvcc: command not found
g++: error: roi_pooling_op.cu.o: No such file or directory
Please suggest what I need to do
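The `g++` error follows from the first one: because `nvcc` was not found, `roi_pooling_op.cu.o` was never produced. A minimal sketch for checking whether `nvcc` is reachable, assuming Python 3 and that the CUDA toolkit, if installed, lives under a directory such as `/usr/local/cuda/bin` (an assumption; the actual location may differ):

```python
import os
import shutil

def find_nvcc(extra_dirs=("/usr/local/cuda/bin",)):
    """Return the path to nvcc, searching PATH first and then extra_dirs; None if absent."""
    found = shutil.which("nvcc")
    if found:
        return found
    for d in extra_dirs:
        candidate = os.path.join(d, "nvcc")
        if os.path.isfile(candidate) and os.access(candidate, os.X_OK):
            return candidate
    return None

nvcc = find_nvcc()
print(nvcc if nvcc else "nvcc not found: install the CUDA toolkit or add its bin directory to PATH")
```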
Hi,
I am trying to run this program on AWS and I am getting the error below:
Traceback (most recent call last):
File "./tools/demo.py", line 11, in <module>
from networks.factory import get_network
File "/home/ubuntu/Faster-RCNN_TF/tools/../lib/networks/__init__.py", line 8, in <module>
from .VGGnet_train import VGGnet_train
File "/home/ubuntu/Faster-RCNN_TF/tools/../lib/networks/VGGnet_train.py", line 2, in <module>
from networks.network import Network
File "/home/ubuntu/Faster-RCNN_TF/tools/../lib/networks/network.py", line 4, in <module>
import roi_pooling_layer.roi_pooling_op_grad
File "/home/ubuntu/Faster-RCNN_TF/tools/../lib/roi_pooling_layer/roi_pooling_op_grad.py", line 7, in <module>
@tf.RegisterShape("RoiPool")
AttributeError: 'module' object has no attribute 'RegisterShape'
Can anyone suggest a solution to this problem?
Thank you,
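`tf.RegisterShape` was removed from the Python API in TensorFlow 1.0, so this error suggests a newer TensorFlow than the code was written for. A minimal sketch of a version check, assuming the installed version string is available as `tf.__version__`; the helper `has_register_shape` is hypothetical and only compares version numbers:

```python
import re

def has_register_shape(tf_version):
    """Return True for TensorFlow versions before 1.0, which still expose tf.RegisterShape."""
    match = re.match(r"(\d+)\.(\d+)", tf_version)
    if match is None:
        raise ValueError("unrecognized version string: %r" % tf_version)
    major = int(match.group(1))
    return major < 1

for version in ("0.12.0-rc1", "1.0.0"):
    print(version, has_register_shape(version))
```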
I get a make error on Mac OS 10.11.5 when running make in the lib directory:
...
"typeinfo for tensorflow::OpKernel", referenced from:
typeinfo for RoiPoolOp<Eigen::ThreadPoolDevice, float> in roi_pooling_op-0e649f.o
typeinfo for RoiPoolGradOp<Eigen::ThreadPoolDevice, float> in roi_pooling_op-0e649f.o
typeinfo for RoiPoolOp<Eigen::GpuDevice, float> in roi_pooling_op-0e649f.o
typeinfo for RoiPoolGradOp<Eigen::GpuDevice, float> in roi_pooling_op-0e649f.o
ld: symbol(s) not found for architecture x86_64
clang: error: linker command failed with exit code 1 (use -v to see invocation)
Compiler version, from clang --version:
Apple LLVM version 7.3.0 (clang-703.0.31)
Target: x86_64-apple-darwin15.5.0
Thread model: posix
InstalledDir: /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin
W tensorflow/core/common_runtime/bfc_allocator.cc:275] Ran out of memory trying to allocate 392.00MiB. See logs for memory state.
W tensorflow/core/framework/op_kernel.cc:965] Internal: Dst tensor is not initialized.
E tensorflow/core/common_runtime/executor.cc:390] Executor failed to create kernel. Internal: Dst tensor is not initialized.
[[Node: zeros_24 = Const[dtype=DT_FLOAT, value=Tensor<type: float shape: [25088,4096] values: [0 0 0]...>, _device="/job:localhost/replica:0/task:0/gpu:0"]()]]
Traceback (most recent call last):
File "./tools/train_net.py", line 96, in <module>
max_iters=args.max_iters)
File "/home/deepinsight/Faster-RCNN_TF/tools/../lib/fast_rcnn/train.py", line 222, in train_net
sw.train_model(sess, max_iters)
File "/home/deepinsight/Faster-RCNN_TF/tools/../lib/fast_rcnn/train.py", line 134, in train_model
sess.run(tf.initialize_all_variables())
File "/home/deepinsight/anaconda2/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 766, in run
run_metadata_ptr)
File "/home/deepinsight/anaconda2/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 964, in _run
feed_dict_string, options, run_metadata)
File "/home/deepinsight/anaconda2/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 1014, in _do_run
target_list, options, run_metadata)
File "/home/deepinsight/anaconda2/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 1034, in _do_call
raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.InternalError: Dst tensor is not initialized.
[[Node: zeros_24 = Const[dtype=DT_FLOAT, value=Tensor<type: float shape: [25088,4096] values: [0 0 0]...>, _device="/job:localhost/replica:0/task:0/gpu:0"]()]]
Caused by op u'zeros_24', defined at:
File "./tools/train_net.py", line 96, in <module>
max_iters=args.max_iters)
File "/home/deepinsight/Faster-RCNN_TF/tools/../lib/fast_rcnn/train.py", line 222, in train_net
sw.train_model(sess, max_iters)
File "/home/deepinsight/Faster-RCNN_TF/tools/../lib/fast_rcnn/train.py", line 131, in train_model
train_op = tf.train.MomentumOptimizer(lr, momentum).minimize(loss, global_step=global_step)
File "/home/deepinsight/anaconda2/lib/python2.7/site-packages/tensorflow/python/training/optimizer.py", line 279, in minimize
name=name)
File "/home/deepinsight/anaconda2/lib/python2.7/site-packages/tensorflow/python/training/optimizer.py", line 393, in apply_gradients
self._create_slots(var_list)
File "/home/deepinsight/anaconda2/lib/python2.7/site-packages/tensorflow/python/training/momentum.py", line 51, in _create_slots
self._zeros_slot(v, "momentum", self._name)
File "/home/deepinsight/anaconda2/lib/python2.7/site-packages/tensorflow/python/training/optimizer.py", line 593, in _zeros_slot
named_slots[var] = slot_creator.create_zeros_slot(var, op_name)
File "/home/deepinsight/anaconda2/lib/python2.7/site-packages/tensorflow/python/training/slot_creator.py", line 106, in create_zeros_slot
val = array_ops.zeros(primary.get_shape().as_list(), dtype=dtype)
File "/home/deepinsight/anaconda2/lib/python2.7/site-packages/tensorflow/python/ops/array_ops.py", line 1362, in zeros
output = constant(zero, shape=shape, dtype=dtype, name=name)
File "/home/deepinsight/anaconda2/lib/python2.7/site-packages/tensorflow/python/framework/constant_op.py", line 169, in constant
attrs={"value": tensor_value, "dtype": dtype_value}, name=name).outputs[0]
File "/home/deepinsight/anaconda2/lib/python2.7/site-packages/tensorflow/python/framework/ops.py", line 2240, in create_op
original_op=self._default_original_op, op_def=op_def)
File "/home/deepinsight/anaconda2/lib/python2.7/site-packages/tensorflow/python/framework/ops.py", line 1128, in __init__
self._traceback = _extract_stack()
InternalError (see above for traceback): Dst tensor is not initialized.
[[Node: zeros_24 = Const[dtype=DT_FLOAT, value=Tensor<type: float shape: [25088,4096] values: [0 0 0]...>, _device="/job:localhost/replica:0/task:0/gpu:0"]()]]
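The 392.00MiB in the warning matches the size of a single float32 tensor of shape [25088, 4096], the shape shown for `zeros_24`, which the traceback attributes to a momentum slot created by `MomentumOptimizer`. A worked check of that arithmetic (the helper name `tensor_mib` is hypothetical):

```python
def tensor_mib(shape, bytes_per_element=4):
    """Size in MiB of a dense tensor; float32 uses 4 bytes per element."""
    count = 1
    for dim in shape:
        count *= dim
    return count * bytes_per_element / float(2 ** 20)

print(tensor_mib([25088, 4096]))  # 392.0
```

Since the allocation fails while variables are being initialized, the GPU probably did not have that much free memory left at that point; whether another process was holding memory is not visible from the log.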
Hi,
I tried to get the experiment working on an Amazon GPU cloud machine with a K520 graphics card and CUDA 8. I got quite a few warnings, but I think the problem is that some CUDA function does not work on this GPU. Here is some of the output:
assign pretrain model weights to conv2_1
assign pretrain model biases to conv2_1
Faster-RCNN_TF/tools/../lib/rpn_msr/proposal_target_layer_tf.py:89: VisibleDeprecationWarning: using a non-integer number instead of an integer will result in an error in the future
bbox_targets[ind, start:end] = bbox_target_data[ind, 1:]
Faster-RCNN_TF/tools/../lib/rpn_msr/proposal_target_layer_tf.py:90: VisibleDeprecationWarning: using a non-integer number instead of an integer will result in an error in the future
bbox_inside_weights[ind, start:end] = cfg.TRAIN.BBOX_INSIDE_WEIGHTS
cudaCheckError() failed : invalid device function
E tensorflow/stream_executor/stream.cc:272] Error recording event in stream: error recording CUDA event on stream 0x4cae120: CUDA_ERROR_DEINITIALIZED; not marking stream as bad, as the Event object may be at fault. Monitor for further errors.
E tensorflow/stream_executor/cuda/cuda_event.cc:49] Error polling for event status: failed to query event: CUDA_ERROR_DEINITIALIZED
F tensorflow/core/common_runtime/gpu/gpu_event_mgr.cc:198] Unexpected Event status: 1
E tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:671] failed to record completion event; therefore, failed to create inter-stream dependency
I tensorflow/stream_executor/stream.cc:3775] stream 0x4caea80 did not memcpy device-to-host; source: 0x723f3cf00
./experiments/scripts/faster_rcnn_end2end.sh: line 57: 10679 Aborted (core dumped) python ./tools/train_net.py --device ${DEV} --device_id ${DEV_ID} --weights data/pretrain_model/VGG_imagenet.npy --imdb ${TRAIN_IMDB} --iters ${ITERS} --cfg experiments/cfgs/faster_rcnn_end2end.yml --network VGGnet_train ${EXTRA_ARGS}
Can you give me a hint as to what the problem could be?
Thanks in advance.
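Separately from the CUDA error, the `VisibleDeprecationWarning` lines indicate that `start` and `end` in `proposal_target_layer_tf.py` are not integers when used as slice bounds. A minimal sketch of the usual remedy, casting to `int` before slicing. It assumes the bounds are derived from a floating-point class label, uses plain Python lists so it runs without NumPy, and the values are made up for illustration:

```python
def slice_bounds(cls, width=4):
    """Return integer slice bounds for the `width` columns belonging to class `cls`."""
    start = int(width * cls)  # cls may arrive as a float label such as 3.0
    end = start + width
    return start, end

row = list(range(20))
start, end = slice_bounds(3.0)
print(row[start:end])
```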
I tried to download VGG_imagenet.npy from Google Drive but ran into problems: after about 100 MB the download stopped, and I cannot get the file from Drive. Could anyone kindly send VGG_imagenet.npy to my e-mail? My e-mail is [email protected]. Thanks.
I know this is a bug/issue tracker but I have a general question and didn't know where to ask.
I was just wondering how come in the Faster-RCNN paper they are able to detect the person inside the bus but using the pre-trained model here I am not? I understand there are many parameters that might be causing that but can someone help me understand which exactly? Is it the training data used itself or some parameters like IoU etc?
Thanks,
Ahmed.
Hi everyone, I was wondering if the demo runs out of the box for you after installing the prerequisites mentioned in the readme. I've found that the problem is in lib/_init_paths.py, which tries to set some paths to caffe-fast-rcnn and mftracker, which do not exist in this repo as it is. My assumption is that either the readme is incomplete, as it does not mention in detail everything that we need to pre-install, or there is a bug in ./lib/_init_paths.py. Does anyone have any clue?
Here's my error output:
python demo.py --model ../weights/VGGnet_fast_rcnn_iter_70000.ckpt
Traceback (most recent call last):
File "demo.py", line 3, in <module>
from fast_rcnn.config import cfg
File "/home/user/Faster-RCNN_TF/tools/../lib/fast_rcnn/__init__.py", line 9, in <module>
from . import train
File "/home/user/Faster-RCNN_TF/tools/../lib/fast_rcnn/train.py", line 11, in <module>
import gt_data_layer.roidb as gdl_roidb
File "/home/user/Faster-RCNN_TF/tools/../lib/gt_data_layer/roidb.py", line 12, in <module>
from utils.cython_bbox import bbox_overlaps
ImportError: No module named cython_bbox
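`utils.cython_bbox` is a compiled Cython extension, so this `ImportError` usually means it has not been built yet (a build log elsewhere in these issues shows `python setup.py build_ext --inplace` building the `utils.cython_bbox` extension). A minimal sketch that checks for the compiled file, assuming it is placed in `lib/utils/` with a name starting with `cython_bbox` and a `.so` suffix; the helper name is hypothetical:

```python
import glob
import os

def built_extensions(lib_dir, name="cython_bbox"):
    """List compiled extension files for `name` under <lib_dir>/utils."""
    pattern = os.path.join(lib_dir, "utils", name + "*.so")
    return sorted(glob.glob(pattern))

if not built_extensions("lib"):
    print("cython_bbox is not built; run the build step in lib/ first")
```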
Hi, I tested the performance on the VOC2007 test set, and it is not as good as yours: my mean AP is only about 0.5851. Do you know what might be wrong?
Sorry to bother you. When I run demo.py, it shows:
Tensor("Placeholder:0", shape=(?, ?, ?, 3), dtype=float32)
Tensor("conv5_3/conv5_3:0", shape=(?, ?, ?, 512), dtype=float32)
Tensor("rpn_conv/3x3/rpn_conv/3x3:0", shape=(?, ?, ?, 512), dtype=float32)
Tensor("rpn_cls_score/rpn_cls_score:0", shape=(?, ?, ?, 18), dtype=float32)
Tensor("rpn_cls_prob:0", shape=(?, ?, ?, ?), dtype=float32)
Tensor("rpn_cls_prob_reshape:0", shape=(?, ?, ?, 18), dtype=float32)
Tensor("rpn_bbox_pred/rpn_bbox_pred:0", shape=(?, ?, ?, 36), dtype=float32)
Tensor("Placeholder_1:0", shape=(?, 3), dtype=float32)
Tensor("conv5_3/conv5_3:0", shape=(?, ?, ?, 512), dtype=float32)
Tensor("rois:0", shape=(?, 5), dtype=float32)
[<tf.Tensor 'conv5_3/conv5_3:0' shape=(?, ?, ?, 512) dtype=float32>, <tf.Tensor 'rois:0' shape=(?, 5) dtype=float32>]
Tensor("fc7/fc7:0", shape=(?, 4096), dtype=float32)
Loaded network /home/jmy/Desktop/Faster-RCNN_TF-master/VGGnet_fast_rcnn_iter_70000.ckpt
cudaCheckError() failed : invalid device function
However, I can run the TensorFlow r0.10.0 model/image/mnist/convolutional.py successfully, so I don't know where the error is.
Thank you
Hey, I'm a bit confused about what format of data I'd need to pass into this model in order to train on my own dataset. Could you give me an example of what I'd need? Thanks
We all want to verify how the TensorFlow Faster R-CNN works, but when we pull it into our own environments, many errors occur, and solving them takes a lot of time.
I think the main reason is that the versions of the related software do not match.
Could anyone who can run it successfully share the versions they use?
such as:
system version:ubuntu 14.04 or 16.04?
gcc version: 4.8 or 5.4?
cuda version: 7.5 or 8.0?
cudnn version: v3 or v5?
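A minimal sketch for collecting most of these versions in one place so they can be attached to a report. It assumes Python 3 and that `gcc` and `nvcc` accept `--version`, records `None` for any tool that cannot be run, and does not cover cuDNN; the function names are hypothetical:

```python
import platform
import subprocess

def tool_version(cmd):
    """Return the output of `<cmd> --version`, or None if the tool cannot be run."""
    try:
        out = subprocess.check_output([cmd, "--version"], stderr=subprocess.STDOUT)
    except (OSError, subprocess.CalledProcessError):
        return None
    return out.decode("utf-8", "replace").strip()

def environment_report():
    """Gather OS, Python, gcc and nvcc versions into a dict."""
    return {
        "system": platform.platform(),
        "python": platform.python_version(),
        "gcc": tool_version("gcc"),
        "nvcc": tool_version("nvcc"),
    }

for key, value in sorted(environment_report().items()):
    print("{}: {}".format(key, value))
```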
I'm trying to run the demo following the instructions in readme, however, when I run the command
python ./tools/demo.py --model ./lib/pretrained/VGGnet_fast_rcnn_iter_70000.ckpt
I get the error below:
➜ Faster-RCNN_TF git:(master) ✗ python ./tools/demo.py --model ./lib/pretrained/VGGnet_fast_rcnn_iter_70000.ckpt
I tensorflow/stream_executor/dso_loader.cc:128] successfully opened CUDA library libcublas.so locally
I tensorflow/stream_executor/dso_loader.cc:128] successfully opened CUDA library libcudnn.so locally
I tensorflow/stream_executor/dso_loader.cc:128] successfully opened CUDA library libcufft.so locally
I tensorflow/stream_executor/dso_loader.cc:128] successfully opened CUDA library libcuda.so.1 locally
I tensorflow/stream_executor/dso_loader.cc:128] successfully opened CUDA library libcurand.so locally
Traceback (most recent call last):
File "./tools/demo.py", line 11, in <module>
from networks.factory import get_network
File "/home/denis/WEB/DeepLearning/Faster-RCNN_TF/tools/../lib/networks/__init__.py", line 8, in <module>
from .VGGnet_train import VGGnet_train
File "/home/denis/WEB/DeepLearning/Faster-RCNN_TF/tools/../lib/networks/VGGnet_train.py", line 2, in <module>
from networks.network import Network
File "/home/denis/WEB/DeepLearning/Faster-RCNN_TF/tools/../lib/networks/network.py", line 3, in <module>
import roi_pooling_layer.roi_pooling_op as roi_pool_op
File "/home/denis/WEB/DeepLearning/Faster-RCNN_TF/tools/../lib/roi_pooling_layer/roi_pooling_op.py", line 5, in <module>
_roi_pooling_module = tf.load_op_library(filename)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/load_library.py", line 64, in load_op_library
None, None, error_msg, error_code)
tensorflow.python.framework.errors_impl.NotFoundError: /home/denis/WEB/DeepLearning/Faster-RCNN_TF/tools/../lib/roi_pooling_layer/roi_pooling.so: undefined symbol: _ZN10tensorflow7strings6StrCatB5cxx11ERKNS0_8AlphaNumE
CUDA: 8.0
CuDNN: 5
Python: 2.7.12
Tensorflow: 0.12.0-rc1
GPU: NVidia GeForce 750M (sm_30 architecture)
Due to my setup above, I modified CUDA_PATH in the make.sh file to be like this:
CUDA_PATH=/usr/local/cuda-8.0/
and the nvcc instruction to be like this:
nvcc -std=c++11 -c -o roi_pooling_op.cu.o roi_pooling_op_gpu.cu.cc \
-I $TF_INC -D GOOGLE_CUDA=1 -x cu -Xcompiler -fPIC $CXXFLAGS \
-arch=sm_30
Am I doing something wrong?
Could you please help me with running the demo properly?
The error message was:
File "/home/shang/Work/TF-Examples/tf-faster-rcnn/tools/../lib/roi_pooling_layer/roi_pooling_op.py", line 5, in <module>
_roi_pooling_module = tf.load_op_library(filename)
File "/home/shang/anaconda2/lib/python2.7/site-packages/tensorflow/python/framework/load_library.py", line 75, in load_op_library
raise errors._make_specific_exception(None, None, error_msg, error_code)
tensorflow.python.framework.errors.NotFoundError: /home/shang/Work/TF-Examples/tf-faster-rcnn/tools/../lib/roi_pooling_layer/roi_pooling.so: undefined symbol: _ZN10tensorflow8internal21CheckOpMessageBuilder9NewStringEv
I tried some workarounds, including switching between g++ 5 and 4.8, changing -arch to sm_50 (the platform is a GTX 980), and adding -D_GLIBCXX_USE_CXX11_ABI=0 to the g++ compile line. roi_pooling.so seemed to be generated correctly, but I still got the same error when running tf.load_op_library.
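Undefined-symbol errors like these often come from building the op with a different libstdc++ ABI than the installed TensorFlow binary. The `B5cxx11` tag inside a mangled name marks the newer GCC 5 ABI, so checking for it shows which ABI the missing symbol expects. A minimal sketch; the helper name is hypothetical, and the two symbols are copied from the error messages above:

```python
def expects_cxx11_abi(mangled_symbol):
    """True if the mangled C++ name carries the GCC 5 'cxx11' ABI tag."""
    return "B5cxx11" in mangled_symbol

symbols = [
    "_ZN10tensorflow7strings6StrCatB5cxx11ERKNS0_8AlphaNumE",
    "_ZN10tensorflow8internal21CheckOpMessageBuilder9NewStringEv",
]
for s in symbols:
    print(s, expects_cxx11_abi(s))
```

The first symbol carries the tag, which is consistent with an op compiled under the new ABI against a TensorFlow built with the old one; in that case `-D_GLIBCXX_USE_CXX11_ABI=0` is the usual adjustment. The second symbol has no tag, so that failure may have a different cause.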