tonghe90 / textspotter

CMake 1.20% Makefile 0.27% HTML 0.08% CSS 0.10% Jupyter Notebook 56.69% C++ 33.81% Shell 0.29% Python 4.51% Cuda 2.66% MATLAB 0.36% Dockerfile 0.03%

cvpr2018 end-to-end text-detection-recognition caffe textspotter

textspotter's Introduction

👋 Hi, I’m @tonghe90
👀 I’m interested in computer vision and machine learning
📫 How to reach me: feel free to drop me an email tonghe90 AT gmail DOT com

textspotter's Issues

'Python' layer unregistered in models/test_iou.pt

Hi there,

Sorry to bother you, but I've encountered an issue while running the test code.
The following message popped up: F0329 ... layer_factory.hpp:81] Check failed: registry.count(type) == 1 (0 vs. 1) Unknown layer type: Python.
Could you please check whether there is a problem with the .pt file?

Thanks and have a nice day : )

unknown layer type

Hello @tonghe90,
When I run python test.py --img=./imgs/img_105.jpg, I get the following:

I0902 18:58:18.874758 37749 net.cpp:129] Top shape: 1 1 128 128 (16384)
I0902 18:58:18.874763 37749 net.cpp:137] Memory required for data: 711786500
I0902 18:58:18.874770 37749 layer_factory.hpp:77] Creating layer iou_maps_angles
F0902 18:58:18.874795 37749 layer_factory.hpp:81] Check failed: registry.count(type) == 1 (0 vs. 1) Unknown layer type: Python (known types: AbsVal, Accuracy, AffineTransformer, ArgMax, AttLstm, BNLL, BatchNorm, BatchReindex, Bias, Concat, ContrastiveLoss, Convolution, CosinangleLoss, Crop, Data, Deconvolution, Dropout, DummyData, ELU, Eltwise, Embed, EuclideanLoss, Exp, Filter, Flatten, HDF5Data, HDF5Output, HingeLoss, Im2col, ImageData, InfogainLoss, InnerProduct, Input, LRN, LSTMNew, LSTMUnit, Log, Lstm, MVN, MemoryData, MultinomialLogisticLoss, PReLU, Parameter, PointBilinear, Pooling, Power, RNN, ROIPooling, ReLU, Reduction, Reshape, ReverseAxis, SPP, Scale, Sigmoid, SigmoidCrossEntropyLoss, Silence, Slice, SmoothL1Loss, Softmax, SoftmaxWithLoss, Split, Sum, TanH, Threshold, Tile, Transpose, UnitboxLoss, WindowData)
*** Check failure stack trace: ***
Aborted (core dumped)

Can you give me a suggestion about how to deal with it? Thank you very much!
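
The "known types" list in the message above contains the custom C++ layers but not Python. In stock BVLC Caffe this usually means the library was built without the Python layer (the WITH_PYTHON_LAYER := 1 option in Makefile.config); that this fork uses the same switch is an assumption. A minimal sketch for checking such a log line, where the helper name registered_layer_types is hypothetical:

```python
import re

def registered_layer_types(log_line):
    """Return the set of layer types listed in an 'Unknown layer type' message."""
    match = re.search(r"\(known types: ([^)]*)\)", log_line)
    if match is None:
        return set()
    return {name.strip() for name in match.group(1).split(",") if name.strip()}

# Shortened copy of the failure message quoted in the issue.
line = ("Check failed: registry.count(type) == 1 (0 vs. 1) Unknown layer type: "
        "Python (known types: AbsVal, Accuracy, PointBilinear, Transpose, WindowData)")
types = registered_layer_types(line)
print("Python" in types)         # False: the Python layer is not registered
print("PointBilinear" in types)  # True: the custom C++ layers are present
```

If Python is missing from the set, rebuilding Caffe with the Python layer enabled is the first thing to try.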

Training data format?

Hi, I am still not clear on the annotation format of the training dataset. Would you mind sharing it?

I can't open the URL to download the model.

I can't open the URL to download the caffemodel. Can you provide the caffemodel by other means? Or can someone who has already downloaded the caffemodel send me a copy? Thanks very very much.

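@tonghe90 Is there any restriction on the size of the input bbox?

I commented out loss_4s and the IoU loss layers, so now only the softmaxwithloss for text recognition remains (neither the mask loss nor the IoU loss takes part in training). I then wrote my own input data layer that outputs images containing text (640x640 in size), together with the coordinates of the four corner points of the GT bboxes and the text labels. During training, however, I hit a segmentation fault that reports an out-of-bounds memory access. Is there any restriction on the size of the bbox fed to the point bilinear layer? With 648 sampling points, are there any requirements on the input bbox size?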

Transpose layer breaks caffe installation

Hi,

Please advise:

 PROTOC src/caffe/proto/caffe.proto
CXX src/caffe/solvers/sgd_solver.cpp
CXX src/caffe/solvers/nesterov_solver.cpp
CXX src/caffe/solvers/rmsprop_solver.cpp
CXX src/caffe/solvers/adadelta_solver.cpp
CXX src/caffe/solvers/adam_solver.cpp
CXX src/caffe/parallel.cpp
CXX src/caffe/solvers/adagrad_solver.cpp
CXX src/caffe/internal_thread.cpp
CXX src/caffe/solver.cpp
CXX src/caffe/layers/accuracy_layer.cpp
CXX src/caffe/layers/recurrent_layer.cpp
CXX src/caffe/layers/cudnn_pooling_layer.cpp
CXX src/caffe/layers/euclidean_loss_layer.cpp
CXX src/caffe/layers/hdf5_output_layer.cpp
CXX src/caffe/layers/conv_layer.cpp
CXX src/caffe/layers/spp_layer.cpp
CXX src/caffe/layers/crop_layer.cpp
CXX src/caffe/layers/cudnn_lcn_layer.cpp
CXX src/caffe/layers/reduction_layer.cpp
CXX src/caffe/layers/lrn_layer.cpp
CXX src/caffe/layers/filter_layer.cpp
CXX src/caffe/layers/flatten_layer.cpp
CXX src/caffe/layers/tile_layer.cpp
CXX src/caffe/layers/cudnn_tanh_layer.cpp
CXX src/caffe/layers/unitbox_loss_layer.cpp
CXX src/caffe/layers/roi_pooling_layer.cpp
CXX src/caffe/layers/data_layer.cpp
CXX src/caffe/layers/deconv_layer.cpp
CXX src/caffe/layers/cosinangle_loss_layer.cpp
CXX src/caffe/layers/window_data_layer.cpp
CXX src/caffe/layers/split_layer.cpp
CXX src/caffe/layers/lstm_unit_layer.cpp
CXX src/caffe/layers/dropout_layer.cpp
CXX src/caffe/layers/cudnn_relu_layer.cpp
CXX src/caffe/layers/cudnn_sigmoid_layer.cpp
CXX src/caffe/layers/prelu_layer.cpp
CXX src/caffe/layers/batch_reindex_layer.cpp
CXX src/caffe/layers/pooling_layer.cpp
CXX src/caffe/layers/relu_layer.cpp
CXX src/caffe/layers/lstm_layer.cpp
CXX src/caffe/layers/rnn_layer.cpp
CXX src/caffe/layers/bias_layer.cpp
CXX src/caffe/layers/sigmoid_layer.cpp
CXX src/caffe/layers/eltwise_layer.cpp
CXX src/caffe/layers/neuron_layer.cpp
CXX src/caffe/layers/log_layer.cpp
CXX src/caffe/layers/embed_layer.cpp
CXX src/caffe/layers/slice_layer.cpp
CXX src/caffe/layers/hinge_loss_layer.cpp
CXX src/caffe/layers/infogain_loss_layer.cpp
CXX src/caffe/layers/at_layer.cpp
CXX src/caffe/layers/base_data_layer.cpp
CXX src/caffe/layers/point_bilinear_layer.cpp
CXX src/caffe/layers/concat_layer.cpp
CXX src/caffe/layers/tanh_layer.cpp
CXX src/caffe/layers/softmax_layer.cpp
CXX src/caffe/layers/memory_data_layer.cpp
CXX src/caffe/layers/reshape_layer.cpp
CXX src/caffe/layers/scale_layer.cpp
CXX src/caffe/layers/attention_lstm_layer.cpp
CXX src/caffe/layers/argmax_layer.cpp
CXX src/caffe/layers/lstm_new_layer.cpp
CXX src/caffe/layers/mvn_layer.cpp
CXX src/caffe/layers/sum_layer.cpp
CXX src/caffe/layers/inner_product_layer.cpp
CXX src/caffe/layers/transpose_layer.cpp
CXX src/caffe/layers/im2col_layer.cpp
CXX src/caffe/layers/elu_layer.cpp
CXX src/caffe/layers/base_conv_layer.cpp
CXX src/caffe/layers/silence_layer.cpp
CXX src/caffe/layers/reverse_axis_layer.cpp
CXX src/caffe/layers/threshold_layer.cpp
CXX src/caffe/layers/parameter_layer.cpp
CXX src/caffe/layers/exp_layer.cpp
CXX src/caffe/layers/dummy_data_layer.cpp
CXX src/caffe/layers/hdf5_data_layer.cpp
CXX src/caffe/layers/softmax_loss_layer.cpp
CXX src/caffe/layers/cudnn_lrn_layer.cpp
CXX src/caffe/layers/bnll_layer.cpp
CXX src/caffe/layers/sigmoid_cross_entropy_loss_layer.cpp
CXX src/caffe/layers/cudnn_conv_layer.cpp
CXX src/caffe/layers/multinomial_logistic_loss_layer.cpp
CXX src/caffe/layers/contrastive_loss_layer.cpp
CXX src/caffe/layers/smooth_L1_loss_layer.cpp
CXX src/caffe/layers/power_layer.cpp
CXX src/caffe/layers/loss_layer.cpp
CXX src/caffe/layers/absval_layer.cpp
CXX src/caffe/layers/input_layer.cpp
CXX src/caffe/layers/image_data_layer.cpp
CXX src/caffe/layers/cudnn_softmax_layer.cpp
CXX src/caffe/layers/batch_norm_layer.cpp
CXX src/caffe/data_transformer.cpp
CXX src/caffe/blob.cpp
CXX src/caffe/net.cpp
CXX src/caffe/layer_factory.cpp
CXX src/caffe/syncedmem.cpp
CXX src/caffe/util/io.cpp
CXX src/caffe/util/upgrade_proto.cpp
CXX src/caffe/util/math_functions.cpp
CXX src/caffe/util/hdf5.cpp
CXX src/caffe/util/blocking_queue.cpp
CXX src/caffe/util/db.cpp
CXX src/caffe/util/benchmark.cpp
CXX src/caffe/util/im2col.cpp
CXX src/caffe/util/cudnn.cpp
CXX src/caffe/util/db_leveldb.cpp
CXX src/caffe/util/db_lmdb.cpp
CXX src/caffe/util/signal_handler.cpp
CXX src/caffe/util/insert_splits.cpp
CXX src/caffe/common.cpp
CXX src/caffe/layer.cpp
CXX tools/copy_layers.cpp
CXX tools/convert_imageset.cpp
CXX tools/train_net.cpp
CXX tools/caffe.cpp
CXX tools/device_query.cpp
CXX tools/test_net.cpp
CXX tools/upgrade_net_proto_text.cpp
CXX tools/net_speed_benchmark.cpp
CXX tools/finetune_net.cpp
CXX tools/compute_image_mean.cpp
CXX tools/upgrade_net_proto_binary.cpp
CXX tools/extract_features.cpp
CXX tools/binary_to_text.cpp
CXX tools/upgrade_solver_proto_text.cpp
CXX tools/convert_model.cpp
CXX examples/mnist/convert_mnist_data.cpp
CXX examples/siamese/convert_mnist_siamese_data.cpp
CXX examples/cpp_classification/classification.cpp
CXX examples/cifar10/convert_cifar_data.cpp
CXX .build_release/src/caffe/proto/caffe.pb.cc
LD -o .build_release/lib/libcaffe.so.1.0.0
AR -o .build_release/lib/libcaffe.a
CXX/LD -o .build_release/tools/convert_imageset.bin
CXX/LD -o .build_release/tools/train_net.bin
CXX/LD -o .build_release/tools/caffe.bin
CXX/LD -o .build_release/tools/device_query.bin
CXX/LD -o .build_release/tools/test_net.bin
CXX/LD -o .build_release/tools/upgrade_net_proto_text.bin
CXX/LD -o .build_release/tools/net_speed_benchmark.bin
CXX/LD -o .build_release/tools/copy_layers.bin
CXX/LD -o .build_release/tools/upgrade_net_proto_binary.bin
CXX/LD -o .build_release/tools/finetune_net.bin
CXX/LD -o .build_release/tools/compute_image_mean.bin
CXX/LD -o .build_release/tools/extract_features.bin
.build_release/lib/libcaffe.so: undefined reference to `caffe::TransposeLayer<double>::Forward_gpu(std::vector<caffe::Blob<double>*, std::allocator<caffe::Blob<double>*> > const&, std::vector<caffe::Blob<double>*, std::allocator<caffe::Blob<double>*> > const&)'
.build_release/lib/libcaffe.so: undefined reference to `cv::imread(cv::String const&, int)'
.build_release/lib/libcaffe.so: undefined reference to `cv::imencode(cv::String const&, cv::_InputArray const&, std::vector<unsigned char, std::allocator<unsigned char> >&, std::vector<int, std::allocator<int> > const&)'
.build_release/lib/libcaffe.so: undefined reference to `caffe::TransposeLayer<float>::Forward_gpu(std::vector<caffe::Blob<float>*, std::allocator<caffe::Blob<float>*> > const&, std::vector<caffe::Blob<float>*, std::allocator<caffe::Blob<float>*> > const&)'
.build_release/lib/libcaffe.so: undefined reference to `caffe::TransposeLayer<float>::Backward_gpu(std::vector<caffe::Blob<float>*, std::allocator<caffe::Blob<float>*> > const&, std::vector<bool, std::allocator<bool> > const&, std::vector<caffe::Blob<float>*, std::allocator<caffe::Blob<float>*> > const&)'
.build_release/lib/libcaffe.so: undefined reference to `cv::imdecode(cv::_InputArray const&, int)'
.build_release/lib/libcaffe.so: undefined reference to `caffe::TransposeLayer<double>::Backward_gpu(std::vector<caffe::Blob<double>*, std::allocator<caffe::Blob<double>*> > const&, std::vector<bool, std::allocator<bool> > const&, std::vector<caffe::Blob<double>*, std::allocator<caffe::Blob<double>*> > const&)'
collect2: error: ld returned 1 exit status
Makefile:625: recipe for target '.build_release/tools/copy_layers.bin' failed
make: *** [.build_release/tools/copy_layers.bin] Error 1
make: *** Waiting for unfinished jobs....
.build_release/lib/libcaffe.so: undefined reference to `caffe::TransposeLayer<double>::Forward_gpu(std::vector<caffe::Blob<double>*, std::allocator<caffe::Blob<double>*> > const&, std::vector<caffe::Blob<double>*, std::allocator<caffe::Blob<double>*> > const&)'
.build_release/lib/libcaffe.so: undefined reference to `cv::imread(cv::String const&, int)'
.build_release/lib/libcaffe.so: undefined reference to `cv::imencode(cv::String const&, cv::_InputArray const&, std::vector<unsigned char, std::allocator<unsigned char> >&, std::vector<int, std::allocator<int> > const&)'
.build_release/lib/libcaffe.so: undefined reference to `caffe::TransposeLayer<float>::Forward_gpu(std::vector<caffe::Blob<float>*, std::allocator<caffe::Blob<float>*> > const&, std::vector<caffe::Blob<float>*, std::allocator<caffe::Blob<float>*> > const&)'
.build_release/lib/libcaffe.so: undefined reference to `caffe::TransposeLayer<float>::Backward_gpu(std::vector<caffe::Blob<float>*, std::allocator<caffe::Blob<float>*> > const&, std::vector<bool, std::allocator<bool> > const&, std::vector<caffe::Blob<float>*, std::allocator<caffe::Blob<float>*> > const&)'
.build_release/lib/libcaffe.so: undefined reference to `cv::imdecode(cv::_InputArray const&, int)'
.build_release/lib/libcaffe.so: undefined reference to `caffe::TransposeLayer<double>::Backward_gpu(std::vector<caffe::Blob<double>*, std::allocator<caffe::Blob<double>*> > const&, std::vector<bool, std::allocator<bool> > const&, std::vector<caffe::Blob<double>*, std::allocator<caffe::Blob<double>*> > const&)'
collect2: error: ld returned 1 exit status
Makefile:625: recipe for target '.build_release/tools/caffe.bin' failed
make: *** [.build_release/tools/caffe.bin] Error 1
.build_release/lib/libcaffe.so: undefined reference to `caffe::TransposeLayer<double>::Forward_gpu(std::vector<caffe::Blob<double>*, std::allocator<caffe::Blob<double>*> > const&, std::vector<caffe::Blob<double>*, std::allocator<caffe::Blob<double>*> > const&)'
.build_release/lib/libcaffe.so: undefined reference to `cv::imread(cv::String const&, int)'
.build_release/lib/libcaffe.so: undefined reference to `cv::imencode(cv::String const&, cv::_InputArray const&, std::vector<unsigned char, std::allocator<unsigned char> >&, std::vector<int, std::allocator<int> > const&)'
.build_release/lib/libcaffe.so: undefined reference to `caffe::TransposeLayer<float>::Forward_gpu(std::vector<caffe::Blob<float>*, std::allocator<caffe::Blob<float>*> > const&, std::vector<caffe::Blob<float>*, std::allocator<caffe::Blob<float>*> > const&)'
.build_release/lib/libcaffe.so: undefined reference to `caffe::TransposeLayer<float>::Backward_gpu(std::vector<caffe::Blob<float>*, std::allocator<caffe::Blob<float>*> > const&, std::vector<bool, std::allocator<bool> > const&, std::vector<caffe::Blob<float>*, std::allocator<caffe::Blob<float>*> > const&)'
.build_release/lib/libcaffe.so: undefined reference to `cv::imdecode(cv::_InputArray const&, int)'
.build_release/lib/libcaffe.so: undefined reference to `caffe::TransposeLayer<double>::Backward_gpu(std::vector<caffe::Blob<double>*, std::allocator<caffe::Blob<double>*> > const&, std::vector<bool, std::allocator<bool> > const&, std::vector<caffe::Blob<double>*, std::allocator<caffe::Blob<double>*> > const&)'
collect2: error: ld returned 1 exit status
Makefile:625: recipe for target '.build_release/tools/convert_imageset.bin' failed
make: *** [.build_release/tools/convert_imageset.bin] Error 1
.build_release/lib/libcaffe.so: undefined reference to `caffe::TransposeLayer<double>::Forward_gpu(std::vector<caffe::Blob<double>*, std::allocator<caffe::Blob<double>*> > const&, std::vector<caffe::Blob<double>*, std::allocator<caffe::Blob<double>*> > const&)'
.build_release/lib/libcaffe.so: undefined reference to `cv::imread(cv::String const&, int)'
.build_release/lib/libcaffe.so: undefined reference to `cv::imencode(cv::String const&, cv::_InputArray const&, std::vector<unsigned char, std::allocator<unsigned char> >&, std::vector<int, std::allocator<int> > const&)'
.build_release/lib/libcaffe.so: undefined reference to `caffe::TransposeLayer<float>::Forward_gpu(std::vector<caffe::Blob<float>*, std::allocator<caffe::Blob<float>*> > const&, std::vector<caffe::Blob<float>*, std::allocator<caffe::Blob<float>*> > const&)'
.build_release/lib/libcaffe.so: undefined reference to `caffe::TransposeLayer<float>::Backward_gpu(std::vector<caffe::Blob<float>*, std::allocator<caffe::Blob<float>*> > const&, std::vector<bool, std::allocator<bool> > const&, std::vector<caffe::Blob<float>*, std::allocator<caffe::Blob<float>*> > const&)'
.build_release/lib/libcaffe.so: undefined reference to `cv::imdecode(cv::_InputArray const&, int)'
.build_release/lib/libcaffe.so: undefined reference to `caffe::TransposeLayer<double>::Backward_gpu(std::vector<caffe::Blob<double>*, std::allocator<caffe::Blob<double>*> > const&, std::vector<bool, std::allocator<bool> > const&, std::vector<caffe::Blob<double>*, std::allocator<caffe::Blob<double>*> > const&)'
collect2: error: ld returned 1 exit status
.build_release/lib/libcaffe.so: undefined reference to `caffe::TransposeLayer<double>::Forward_gpu(std::vector<caffe::Blob<double>*, std::allocator<caffe::Blob<double>*> > const&, std::vector<caffe::Blob<double>*, std::allocator<caffe::Blob<double>*> > const&)'
.build_release/lib/libcaffe.so: undefined reference to `cv::imread(cv::String const&, int)'
.build_release/lib/libcaffe.so: undefined reference to `cv::imencode(cv::String const&, cv::_InputArray const&, std::vector<unsigned char, std::allocator<unsigned char> >&, std::vector<int, std::allocator<int> > const&)'
.build_release/lib/libcaffe.so: undefined reference to `caffe::TransposeLayer<float>::Forward_gpu(std::vector<caffe::Blob<float>*, std::allocator<caffe::Blob<float>*> > const&, std::vector<caffe::Blob<float>*, std::allocator<caffe::Blob<float>*> > const&)'
.build_release/lib/libcaffe.so: undefined reference to `caffe::TransposeLayer<float>::Backward_gpu(std::vector<caffe::Blob<float>*, std::allocator<caffe::Blob<float>*> > const&, std::vector<bool, std::allocator<bool> > const&, std::vector<caffe::Blob<float>*, std::allocator<caffe::Blob<float>*> > const&)'
.build_release/lib/libcaffe.so: undefined reference to `cv::imdecode(cv::_InputArray const&, int)'
.build_release/lib/libcaffe.so: undefined reference to `caffe::TransposeLayer<double>::Backward_gpu(std::vector<caffe::Blob<double>*, std::allocator<caffe::Blob<double>*> > const&, std::vector<bool, std::allocator<bool> > const&, std::vector<caffe::Blob<double>*, std::allocator<caffe::Blob<double>*> > const&)'
collect2: error: ld returned 1 exit status
Makefile:625: recipe for target '.build_release/tools/upgrade_net_proto_text.bin' failed
make: *** [.build_release/tools/upgrade_net_proto_text.bin] Error 1
Makefile:625: recipe for target '.build_release/tools/upgrade_net_proto_binary.bin' failed
make: *** [.build_release/tools/upgrade_net_proto_binary.bin] Error 1
.build_release/lib/libcaffe.so: undefined reference to `caffe::TransposeLayer<double>::Forward_gpu(std::vector<caffe::Blob<double>*, std::allocator<caffe::Blob<double>*> > const&, std::vector<caffe::Blob<double>*, std::allocator<caffe::Blob<double>*> > const&)'
.build_release/lib/libcaffe.so: undefined reference to `cv::imread(cv::String const&, int)'
.build_release/lib/libcaffe.so: undefined reference to `cv::imencode(cv::String const&, cv::_InputArray const&, std::vector<unsigned char, std::allocator<unsigned char> >&, std::vector<int, std::allocator<int> > const&)'
.build_release/lib/libcaffe.so: undefined reference to `caffe::TransposeLayer<float>::Forward_gpu(std::vector<caffe::Blob<float>*, std::allocator<caffe::Blob<float>*> > const&, std::vector<caffe::Blob<float>*, std::allocator<caffe::Blob<float>*> > const&)'
.build_release/lib/libcaffe.so: undefined reference to `caffe::TransposeLayer<float>::Backward_gpu(std::vector<caffe::Blob<float>*, std::allocator<caffe::Blob<float>*> > const&, std::vector<bool, std::allocator<bool> > const&, std::vector<caffe::Blob<float>*, std::allocator<caffe::Blob<float>*> > const&)'
.build_release/lib/libcaffe.so: undefined reference to `cv::imdecode(cv::_InputArray const&, int)'
.build_release/lib/libcaffe.so: undefined reference to `caffe::TransposeLayer<double>::Backward_gpu(std::vector<caffe::Blob<double>*, std::allocator<caffe::Blob<double>*> > const&, std::vector<bool, std::allocator<bool> > const&, std::vector<caffe::Blob<double>*, std::allocator<caffe::Blob<double>*> > const&)'
collect2: error: ld returned 1 exit status
Makefile:625: recipe for target '.build_release/tools/extract_features.bin' failed
make: *** [.build_release/tools/extract_features.bin] Error 1
.build_release/lib/libcaffe.so: undefined reference to `caffe::TransposeLayer<double>::Forward_gpu(std::vector<caffe::Blob<double>*, std::allocator<caffe::Blob<double>*> > const&, std::vector<caffe::Blob<double>*, std::allocator<caffe::Blob<double>*> > const&)'
.build_release/lib/libcaffe.so: undefined reference to `cv::imread(cv::String const&, int)'
.build_release/lib/libcaffe.so: undefined reference to `cv::imencode(cv::String const&, cv::_InputArray const&, std::vector<unsigned char, std::allocator<unsigned char> >&, std::vector<int, std::allocator<int> > const&)'
.build_release/lib/libcaffe.so: undefined reference to `caffe::TransposeLayer<float>::Forward_gpu(std::vector<caffe::Blob<float>*, std::allocator<caffe::Blob<float>*> > const&, std::vector<caffe::Blob<float>*, std::allocator<caffe::Blob<float>*> > const&)'
.build_release/lib/libcaffe.so: undefined reference to `caffe::TransposeLayer<float>::Backward_gpu(std::vector<caffe::Blob<float>*, std::allocator<caffe::Blob<float>*> > const&, std::vector<bool, std::allocator<bool> > const&, std::vector<caffe::Blob<float>*, std::allocator<caffe::Blob<float>*> > const&)'
.build_release/lib/libcaffe.so: undefined reference to `cv::imdecode(cv::_InputArray const&, int)'
.build_release/lib/libcaffe.so: undefined reference to `caffe::TransposeLayer<double>::Backward_gpu(std::vector<caffe::Blob<double>*, std::allocator<caffe::Blob<double>*> > const&, std::vector<bool, std::allocator<bool> > const&, std::vector<caffe::Blob<double>*, std::allocator<caffe::Blob<double>*> > const&)'
collect2: error: ld returned 1 exit status
Makefile:625: recipe for target '.build_release/tools/compute_image_mean.bin' failed
make: *** [.build_release/tools/compute_image_mean.bin] Error 1
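
The log above repeats the same undefined references once per tool binary, because parallel make jobs each fail at link time. They fall into two groups: the GPU methods of TransposeLayer, and the OpenCV image I/O functions. For stock Caffe, missing cv::imread / cv::imencode / cv::imdecode is commonly seen with OpenCV 3 when OPENCV_VERSION := 3 is not set in Makefile.config; whether that is the cause here is not confirmed in the thread. A small sketch for collapsing such a log to its unique symbols (the helper name undefined_symbols is hypothetical):

```python
import re
from collections import Counter

# Capture the symbol name after the opening backtick, up to its argument list.
UNDEFINED = re.compile(r"undefined reference to `([^(']+)")

def undefined_symbols(log_text):
    """Count each undefined symbol (without arguments) in linker output."""
    return Counter(UNDEFINED.findall(log_text))

log = """\
.build_release/lib/libcaffe.so: undefined reference to `cv::imread(cv::String const&, int)'
.build_release/lib/libcaffe.so: undefined reference to `cv::imdecode(cv::_InputArray const&, int)'
.build_release/lib/libcaffe.so: undefined reference to `cv::imread(cv::String const&, int)'
"""
counts = undefined_symbols(log)
for symbol, n in sorted(counts.items()):
    print(symbol, n)
```

Run on the full log, this leaves a short list that is easier to match against the libraries on the link line.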

Cannot make caffe version due to reverse_axis_layer error

Hello @tonghe90 ,
At the moment it is impossible to compile your caffe version due to the following error:

CXX src/caffe/layers/base_conv_layer.cpp
In file included from ./include/caffe/common.hpp:19:0,
from ./include/caffe/blob.hpp:8,
from ./include/caffe/layers/reverse_axis_layer.hpp:6,
from src/caffe/layers/reverse_axis_layer.cpp:1:
./include/caffe/util/device_alternate.hpp:14:15: error: expected initializer before ‘<’ token
void classname::Forward_gpu(const vector<Blob>& bottom,
^
src/caffe/layers/reverse_axis_layer.cpp:61:1: note: in expansion of macro ‘STUB_GPU’
STUB_GPU(ReverseLayer);
^
./include/caffe/util/device_alternate.hpp:17:15: error: expected initializer before ‘<’ token
void classname::Backward_gpu(const vector<Blob>& top,
^
src/caffe/layers/reverse_axis_layer.cpp:61:1: note: in expansion of macro ‘STUB_GPU’
STUB_GPU(ReverseLayer);
^
Makefile:581: recipe for target '.build_release/src/caffe/layers/reverse_axis_layer.o' failed
make: *** [.build_release/src/caffe/layers/reverse_axis_layer.o] Error 1
make: *** Waiting for unfinished jobs....

Do you have any idea why this is happening?
I am configuring Makefile.config to run on a CPU rather than a GPU.
Let me know any comments.
Thank you!

Out of memory error on test example

When I try to run the simple example given in the README

python test.py --img=./imgs/img_105.jpg

I get an out of memory error:

F0426 09:41:13.545714 20964 syncedmem.cpp:71] Check failed: error == cudaSuccess (2 vs. 0)  out of memory
*** Check failure stack trace: ***

I am trying to run this on a GTX 1080, which has 8120 MB of global memory (according to deviceQuery).

When I tabulate the "Memory required for data" lines from the caffe log output, it adds up to 381 GB, though perhaps this isn't all required simultaneously or it is otherwise a double-counting. The same failure occurs when I try a much smaller (140x180 px) crop of the same image.

Is that right? Do you expect the model to fit and run within roughly 8GB of GPU memory? If not, how much memory is required to run this model?

EDIT: Same error happens on another host with K40 and K80 GPUs (each with roughly 12GB of GPU memory)
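
On the 381 GB figure: in stock Caffe, each "Memory required for data" line is a running total over the layers set up so far, so adding the lines together counts the early layers many times. Assuming this fork keeps that behavior, the largest reported value is the one to read. It covers top blobs only, not weights, gradients or cuDNN workspace. A minimal sketch, in which the first log line is a hypothetical earlier entry and the second is quoted from an issue above:

```python
import re

MEMORY_LINE = re.compile(r"Memory required for data: (\d+)")

def memory_required_bytes(log_text):
    """Return the largest running total reported in a Caffe log, or None."""
    totals = [int(value) for value in MEMORY_LINE.findall(log_text)]
    return max(totals) if totals else None

log = """\
I0902 18:58:18.874700 37749 net.cpp:137] Memory required for data: 711720964
I0902 18:58:18.874763 37749 net.cpp:137] Memory required for data: 711786500
"""
total = memory_required_bytes(log)
print(total)                    # 711786500
print(round(total / 2**20, 1))  # the same figure in MiB
```

By this reading the blobs for that run need well under 1 GB, so an out-of-memory failure would point to something other than the per-layer totals.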

out of memory exception

Hi, I ran test.py with a new image and it failed with an out-of-memory error. The messages are below:

1 / 1:  /home/eng_imgs/web.jpeg
F0724 12:09:38.190629 40484 syncedmem.cpp:71] Check failed: error == cudaSuccess (2 vs. 0)  out of memory

My GPU has 11178 MiB of memory. How should I tune the code so that it runs on my GPU server?

How to generate binary masks?

In the paper you first say that binary masks are generated, then later say they are optional, depending on whether they are provided in the ground-truth data.
Which one is it?
Masks are available in the ICDAR '13 segmentation task (challenges 1 and 2), but the boxes there use horizontal bbox notation, and there are no segmentation masks for the incidental scene text in ICDAR '15.
Please provide details on which dataset you used for which type of loss experiment.

"gt_label" in tool_layers/gen_gts_layer

Hi @tonghe90:
Sorry to bother you again.
I understand almost all of your code, but I am really confused by the custom layer "gen_gts_layer", especially the bottom[0] blob "gt_bbox", whose shape is N * 1 * H * W. I don't know exactly what gt_bbox is or what the values in it mean.

textspotter/pylayer/tool_layers.py

Line 304 in 0166abd

for n in range(batch_size):
    gt_label = bottom[0].data[n, 0]  # gt_label is a matrix, shape = H*W
    tmp = np.sum(gt_label, axis=1)
    gt_num = len(np.where(tmp != 0)[0])
    if gt_num == 0:
        continue
    roi_n = gt_label[:gt_num, :8] * 4  # this is the part I can't understand
    roi_n = np.hstack((np.ones((gt_num, 1)) * n, roi_n))
    gt_boxes = np.vstack((gt_boxes, roi_n))
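
One way to read the excerpt is to run it on toy data. The sketch below makes several assumptions not confirmed in the thread: each row of the matrix holds one box, the first eight columns are the x/y coordinates of four corners, all-zero rows are padding packed at the bottom, and the factor 4 rescales the coordinates. The function wrapper, the scale argument and the empty (0, 9) initial array are hypothetical additions.

```python
import numpy as np

def collect_gt_boxes(gt_label_batch, scale=4):
    """Stack the non-empty rows of every image into rows of (image_index, 8 coords)."""
    gt_boxes = np.zeros((0, 9))
    for n in range(gt_label_batch.shape[0]):
        gt_label = gt_label_batch[n, 0]           # H x W matrix, one box per row
        row_sums = np.sum(gt_label, axis=1)
        gt_num = len(np.where(row_sums != 0)[0])  # rows that are not all-zero padding
        if gt_num == 0:
            continue
        roi_n = gt_label[:gt_num, :8] * scale     # first 8 columns, rescaled
        roi_n = np.hstack((np.ones((gt_num, 1)) * n, roi_n))
        gt_boxes = np.vstack((gt_boxes, roi_n))
    return gt_boxes

batch = np.zeros((2, 1, 3, 8))
batch[0, 0, 0] = [1, 1, 5, 1, 5, 3, 1, 3]  # one box in image 0, none in image 1
gt = collect_gt_boxes(batch)
print(gt.shape)  # (1, 9)
print(gt[0])     # image index 0 followed by the eight coordinates times 4
```

Note that slicing with [:gt_num] only works if the non-zero rows come first; a box whose row sums to zero would also be dropped.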

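@tonghe90 I don't have a 12 GB GPU. Can I use two 6 GB GPUs (e.g. GTX 1060) instead?

As the title says: what is the difference between training or testing with two 6 GB (or 8 GB) GPUs and with a single 12 GB GPU?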

Meaning of sample_gt_cont in test_iou.pt?

@tonghe90 Sorry to ask another question. I am still not clear about the function of sample_gt_cont in the decoder layer. Could you explain it?

Synthtext pre processing and table 2 accuracies

Hi,

Could you please describe the steps taken to pre-process the SynthText labels?

Your model uses a fixed maximum length of 25, but the SynthText dataset has boxes whose label length (number of characters per box in the ground truth) is >= 35.

Also, how did you get the accuracies reported in Table 2?
Are they measured after all training steps? It says accuracy on the ICDAR dataset, but also says ground truth was used.
Or are they measured after training on SynthText and then fine-tuning for 80k iterations on ICDAR, i.e. after step 2 of training?

protoc version

Hello, I have a problem with the protoc version when I build your Caffe code. Which version of protoc did you use? Thank you.

Is there a PyTorch or TensorFlow version?

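How do I perform end-to-end training?

In the later part of the second training step in the paper, the detection results need to be fed into the text-align layer. How exactly is this implemented? How is the recognition loss computed? Is the recognition GT obtained by computing the IoU between the detection results and the GT, to determine which annotated GT bbox each detection corresponds to? Thanks! @tonghe90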
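
The matching scheme proposed in the question above can be sketched as follows. This illustrates the question only and is not confirmed as the author's method. It uses axis-aligned boxes for simplicity, and the 0.5 threshold and the function names are hypothetical.

```python
def iou(box_a, box_b):
    """IoU of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

def assign_transcripts(detections, gt_boxes, gt_texts, threshold=0.5):
    """For each detection, return the text of the best-overlapping GT box, or None."""
    assigned = []
    for det in detections:
        scores = [iou(det, gt) for gt in gt_boxes]
        if scores and max(scores) >= threshold:
            assigned.append(gt_texts[scores.index(max(scores))])
        else:
            assigned.append(None)  # no GT box overlaps enough: no recognition target
    return assigned

gt_boxes = [(0, 0, 10, 10), (20, 0, 40, 10)]
gt_texts = ["HELLO", "WORLD"]
detections = [(1, 0, 10, 10), (50, 50, 60, 60)]
print(assign_transcripts(detections, gt_boxes, gt_texts))  # ['HELLO', None]
```

Detections that receive None would simply contribute no recognition loss under this scheme.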

how about the time cost?

I ran test.py on a Tesla P40. With the scale set to 1000, the detection part takes around 1 s.
With the scale set to 300, the detection part takes around 0.3 s. What do others see? Is this normal? And if I want to speed up the forward pass, are there any suggestions? Thanks.

Why does score_map have two channels?

Hi @tonghe90:
I have reviewed your train.pt file and found that the score_map generated by the 1*1 convolution layer ("score_4s" in the file) has two channels. Regarding your answer in this issue:
#16 (comment)
it confuses me: should I prepare a corresponding two-channel score map as supervision, or just one channel?

Can it recognize Chinese?

Hello, can this network recognize Chinese? Or could it recognize Chinese characters with relatively small changes?

Has anyone trained it successfully?

Has anyone trained it successfully? Any guidance would be appreciated.

There are many errors about "at_layer.cpp" and other layers

Hi He:
I cloned your code with git, and when I build it I encounter many errors. Is the released code incomplete, or is there another reason? Some of the errors are as follows:

src/caffe/layers/at_layer.cpp:18:20: error: request for member ‘output_h’ in ‘param’, which is of non-class type ‘const int’
output_H_ = param.output_h();
^
src/caffe/layers/at_layer.cpp:20:12: error: request for member ‘has_output_w’ in ‘param’, which is of non-class type ‘const int’
if (param.has_output_w()) {
^
src/caffe/layers/at_layer.cpp:21:21: error: request for member ‘output_w’ in ‘param’, which is of non-class type ‘const int’

/usr/include/c++/5/bits/stl_vector.h:303:7: note: candidate expects 3 arguments, 5 provided
/usr/include/c++/5/bits/stl_vector.h:264:7: note: candidate: std::vector<_Tp, _Alloc>::vector(const allocator_type&) [with _Tp = int; _Alloc = std::allocator; std::vector<_Tp, _Alloc>::allocator_type = std::allocator]
vector(const allocator_type& __a) _GLIBCXX_NOEXCEPT
^
/usr/include/c++/5/bits/stl_vector.h:264:7: note: candidate expects 1 argument, 5 provided
/usr/include/c++/5/bits/stl_vector.h:253:7: note: candidate: std::vector<_Tp, _Alloc>::vector() [with _Tp = int; _Alloc = std::allocator]

issues about gen_gts_layer

Q1: In train.pt, "gt_bbox" is annotated as "N * 8 ### grounding truth boxes for text (for computing loss)",
but in the class gen_gts_layer in tool_layers.py it is annotated as "bottom[0]: gt_label [N,1,sz,sz]".
What does gt_bbox mean?
Q2: Could you please provide an intuitive explanation of the following variables?
'sample_gt_cont'
'sample_gt_label_input'
'sample_gt_label_output'

sorry, pressed Enter accidentally

Is this model capable of recognizing Chinese, and if so, which Chinese characters?

thanks

Error loading parameters

Hello @tonghe90,

Congrats on the good project and paper. I am trying to test your code, but I am having problems loading the parameters. Do you have any idea why this is happening?

WARNING: Logging before InitGoogleLogging() is written to STDERR
W0514 12:58:54.210842 2459 _caffe.cpp:139] DEPRECATION WARNING - deprecated use of Python interface
W0514 12:58:54.210868 2459 _caffe.cpp:140] Use this instead (with the named "weights" parameter):
W0514 12:58:54.210873 2459 _caffe.cpp:142] Net('./models/test_iou.pt', 1, weights='./models/textspotter.caffemodel')
[libprotobuf ERROR google/protobuf/text_format.cc:288] Error parsing text-format caffe.NetParameter: 7067:24: Message type "caffe.LayerParameter" has no field named "point_bilinear_param".
F0514 12:58:54.243449 2459 upgrade_proto.cpp:88] Check failed: ReadProtoFromTextFile(param_file, param) Failed to parse NetParameter file: ./models/test_iou.pt
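
The message about point_bilinear_param says that the caffe.proto compiled into the loaded library does not define that field. One possible cause, not confirmed in the thread, is that Python imports a different Caffe build (for example a stock system install) instead of the modified one shipped with this repository. A minimal standard-library sketch for checking which file an import would resolve to; the helper name locate_module is hypothetical:

```python
import importlib.util

def locate_module(name):
    """Return the path Python would import `name` from, or None if it is not importable."""
    try:
        spec = importlib.util.find_spec(name)
    except (ImportError, ValueError):
        return None
    if spec is None:
        return None
    return spec.origin

# Prints the location of the pycaffe package that would be imported, if any.
print(locate_module("caffe"))
# A standard-library module, shown for comparison.
print(locate_module("json"))
```

If the printed caffe path is outside the repository's own build, adjusting PYTHONPATH so that the modified build comes first is worth trying.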

Will you release the training code?

Can't read test_iou.pt but test_lstm.pt works

I am trying to run this net for mobile using caffe-mobile-lib.

I have added all your new layers to the lib and built it, but when it tries to read test_iou.pt it fails with the error:

A/caffe_jni: F1204 16:01:12.593006 26632 upgrade_proto.cpp:79] Check failed: ReadProtoFromTextFile(param_file, param) Failed to parse NetParameter file: /storage/emulated/0/textRec/textRec/test_iou.pt
    terminating.

But it works with test_lstm.

Any idea why or if there is something else I have to do?

parameter "rf"

Hi tonghe:

Could you please explain the role of the parameter 'rf' in "det_nms_layer", and why "pre_bbox" needs to be multiplied by it?

textspotter/pylayer/tool_layers.py

Line 227 in 0166abd

pre_bbox *= self.rf

@tonghe90 What is ignore_bbox in train.pt used for?

Wouldn't it be enough to simply remove the ignore_bbox entries from gt_bbox?

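@tonghe90 How should the text/non-text mask data be prepared?

What I mean is: is the mask the bounding rectangle of each individual character, or do the characters have to be segmented along their stroke edges?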

strong, weak and generic lexicon

Hello, I don't quite understand the definition and use of the strong, weak and generic lexicons. What are they? How are they used for recognition? Why do the recognition results differ depending on whether a strong, weak or generic lexicon is used? I would really appreciate your help and look forward to your answer. Thank you so much!
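
For context, in the ICDAR 2015 end-to-end protocol the strong lexicon is a per-image list of about 100 words that includes every word in the image, the weak lexicon contains all words in the test set, and the generic lexicon is a vocabulary of roughly 90k words. A common way to use a lexicon, which is not necessarily what this repository does, is to replace each raw prediction with the closest lexicon word by edit distance. Smaller lexicons leave fewer wrong candidates, which is why the scores differ. A minimal sketch with hypothetical names and word list:

```python
def edit_distance(a, b):
    """Levenshtein distance between two strings (two-row dynamic programming)."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]

def correct_with_lexicon(prediction, lexicon):
    """Return the lexicon word closest to the raw prediction (case-insensitive)."""
    return min(lexicon, key=lambda word: edit_distance(prediction.lower(), word.lower()))

strong_lexicon = ["parking", "exit", "bank"]  # hypothetical per-image word list
print(correct_with_lexicon("parkinq", strong_lexicon))  # parking
```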

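@tonghe90 Question about preparing the training data

@tonghe90 Thanks for sharing the code. In the earlier issue #16 you mentioned that the training data includes (1) text/non-text region, (2) for every point in the text region, you need to calculate the distance between the current point to the four edges with an extra inclined angle. However, looking at train.pt, I found that the second kind of data (the distance from each point to the four edges of the bounding box) is generated in the iou_maps_angles layer. So when preparing the training data, is it enough to provide only (1) the text/non-text region and (2) gt_bbox (the coordinates of the four corner points of each text box)?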
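
The per-pixel target described in item (2) can be illustrated for the simplest case. The sketch below assumes an axis-aligned box, so the inclined angle is zero, and returns the distances in top, right, bottom, left order. Both the ordering and the function name are assumptions, and the actual iou_maps_angles layer may differ.

```python
def edge_distances(x, y, box):
    """Distances from pixel (x, y) to the top, right, bottom and left edges of an
    axis-aligned box (x1, y1, x2, y2); returns None for pixels outside the box."""
    x1, y1, x2, y2 = box
    if not (x1 <= x <= x2 and y1 <= y <= y2):
        return None
    return (y - y1, x2 - x, y2 - y, x - x1)

box = (10, 20, 50, 40)
print(edge_distances(15, 25, box))  # (5, 35, 15, 5)
print(edge_distances(0, 0, box))    # None
```

For a rotated box the same four distances would be measured perpendicular to each edge, with the box angle stored as a fifth value.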

make: *** [.build_release/src/caffe/layers/at_layer.o] Error 1

When I run make -j8, I get this error. I tried some solutions from the Internet, but they didn't work. Can you help me solve it?

Makefile:581: recipe for target '.build_release/src/caffe/layers/reverse_axis_layer.o' failed

When compiling the original caffe code, I have the same problem as #15.
Then I compiled your caffe code, but there is an error. I have no idea how to solve it. Can you help me? Thank you.

./include/caffe/util/device_alternate.hpp:14:15: error: expected initializer before ‘<’ token
void classname::Forward_gpu(const vector<Blob>& bottom,
^
src/caffe/layers/reverse_axis_layer.cpp:61:1: note: in expansion of macro ‘STUB_GPU’
STUB_GPU(ReverseLayer);
^~~~~~~~
./include/caffe/util/device_alternate.hpp:17:15: error: expected initializer before ‘<’ token
void classname::Backward_gpu(const vector<Blob>& top,
^
src/caffe/layers/reverse_axis_layer.cpp:61:1: note: in expansion of macro ‘STUB_GPU’
STUB_GPU(ReverseLayer);
^~~~~~~~
Makefile:581: recipe for target '.build_release/src/caffe/layers/reverse_axis_layer.o' failed
make: *** [.build_release/src/caffe/layers/reverse_axis_layer.o] Error 1
make: *** Waiting for unfinished jobs....

About the training code

Hello @tonghe90,
May I have your training code? Thanks.

Do I need to write the OHEM layer?

Hi @tonghe90:
I recently tried to implement your code and found a layer named "OHEM". I am not sure whether I need to write this layer myself or not.

Errors when compiling caffe.

Hello @tonghe90,
I have been running into a lot of errors when compiling; after a while, the errors below appear.
The following is the output of: make all -j 8

I have followed the steps outlined in: https://github.com/BVLC/caffe/wiki/Ubuntu-16.04-Installation-Guide applied to your own modified caffe version.

/usr/lib/gcc/x86_64-linux-gnu/5/include/avx512vlintrin.h(11280): error: argument of type "void *" is incompatible with parameter of type "long long *"

/usr/lib/gcc/x86_64-linux-gnu/5/include/avx512vlintrin.h(11289): error: argument of type "void *" is incompatible with parameter of type "long long *"

/usr/lib/gcc/x86_64-linux-gnu/5/include/avx512vlintrin.h(11300): error: argument of type "void *" is incompatible with parameter of type "long long *"

92 errors detected in the compilation of "/tmp/tmpxft_00001626_00000000-13_exp_layer.compute_52.cpp1.ii".
Makefile:594: recipe for target '.build_release/cuda/src/caffe/layers/exp_layer.o' failed
make: *** [.build_release/cuda/src/caffe/layers/exp_layer.o] Error 1

System specs:
gcc = 5.5
opencv = 3.3
python = 2.7
cudnn = 7.5
cuda = 8.0
O.S. = Ubuntu 16.04

Makefile.config

Makefile.config.txt

How to train?

@tonghe90 I would like to ask you some questions about training:
1) How do I build the train_val.prototxt file for training from the two test prototxt files you provided, test_iou.pt and test_lstm.pt? I am sorry that I have not used this kind of branched network before.
2) In the paper, you mention three training steps. How do I control whether the detection branch is fixed or trainable?
Because I am a newbie, I hope that you can give me some guidance; the more detailed the better. Thank you very much.

Is CPU mode supported?

I keep getting errors when building the project with CPU_ONLY.
I also found that the forward and backward passes in at_layer.cpp are not implemented for CPU.
Have you considered adding a CPU implementation?

Only the last scale is saved among scaled input results

Hi, in your test.py,

textspotter/test.py

Line 241 in e904571

for k in range(len(scales)):

it seems that new_boxes, words, and words_score are overwritten on every iteration of the loop, so only the result for scale 2080 is processed afterwards.
It looks like the per-scale results need to be merged; otherwise the loop over the other scales is redundant.
Would you please explain if I'm missing something?
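A minimal sketch of the kind of merging meant here, assuming each scale yields a (boxes, words, scores) triple: the per-scale outputs are collected and concatenated instead of being reassigned inside the loop. The names `merge_scale_results`, `run_all_scales` and `detect_at_scale` are hypothetical and do not come from test.py.

```python
def merge_scale_results(per_scale_results):
    """Concatenate (boxes, words, scores) triples from every scale into three flat lists."""
    all_boxes, all_words, all_scores = [], [], []
    for boxes, words, scores in per_scale_results:
        all_boxes.extend(boxes)
        all_words.extend(words)
        all_scores.extend(scores)
    return all_boxes, all_words, all_scores


def run_all_scales(scales, detect_at_scale):
    """Call the (hypothetical) per-scale detector and keep every scale's output."""
    results = [detect_at_scale(scale) for scale in scales]
    return merge_scale_results(results)


# Toy stand-in for a detector: one box, word and score per scale.
def fake_detector(scale):
    return [[0, 0, scale, scale]], ["word@%d" % scale], [0.9]


merged_boxes, merged_words, merged_scores = run_all_scales([1000, 2080], fake_detector)
```

After concatenation, overlapping detections of the same word from different scales would still have to be de-duplicated, for example with a non-maximum suppression pass over the merged boxes.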

Could you provide the training code?

@tonghe90 Hello! Could you provide the training code? The current project only supports testing.

@tonghe90, my card is a 1070

1 / 1: ./imgs/img_105.jpg
F0409 21:07:56.721240 21671 syncedmem.cpp:71] Check failed: error == cudaSuccess (2 vs. 0) out of memory
*** Check failure stack trace: ***

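@tonghe90 During training, where is the ground truth for text recognition (the text labels) fed in?

As the title says, I don't see any text-recognition ground truth in train.pt. Without text labels, how is the recognition part trained?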

'class caffe::LayerParameter' has no member named 'at_param'

Hello @tonghe90,

When trying to build your version of caffe, I am facing the following issue:

src/caffe/layers/at_layer.cpp: In instantiation of 'void caffe::AffineTransformerLayer<Dtype>::LayerSetUp(const std::vector<caffe::Blob<Dtype>*>&, const std::vector<caffe::Blob<Dtype>*>&) [with Dtype = float]':
src/caffe/layers/at_layer.cpp:107:1:   required from here
src/caffe/layers/at_layer.cpp:17:50: error: 'class caffe::LayerParameter' has no member named 'at_param'
  const auto &param = this->layer_param_.at_param();
                                                  ^
src/caffe/layers/at_layer.cpp: In instantiation of 'void caffe::AffineTransformerLayer<Dtype>::LayerSetUp(const std::vector<caffe::Blob<Dtype>*>&, const std::vector<caffe::Blob<Dtype>*>&) [with Dtype = double]':
src/caffe/layers/at_layer.cpp:107:1:   required from here
src/caffe/layers/at_layer.cpp:17:50: error: 'class caffe::LayerParameter' has no member named 'at_param'
Makefile:581: recipe for target '.build_release/src/caffe/layers/at_layer.o' failed
make: *** [.build_release/src/caffe/layers/at_layer.o] Error 1

To fix some other errors earlier, I did add the following line to the Makefile.config:
CUSTOM_CXX := g++ -std=c++11

Could you please help me fix this issue? Thanks!

test error

models/textspotter.caffemodel
WARNING: Logging before InitGoogleLogging() is written to STDERR
W0718 15:41:54.913141 23734 _caffe.cpp:140] DEPRECATION WARNING - deprecated use of Python interface
W0718 15:41:54.913169 23734 _caffe.cpp:141] Use this instead (with the named "weights" parameter):
W0718 15:41:54.913173 23734 _caffe.cpp:143] Net('models/test_iou.pt', 1, weights='models/textspotter.caffemodel')
[libprotobuf ERROR google/protobuf/text_format.cc:274] Error parsing text-format caffe.NetParameter: 7067:24: Message type "caffe.LayerParameter" has no field named "point_bilinear_param".
F0718 15:41:54.915925 23734 upgrade_proto.cpp:88] Check failed: ReadProtoFromTextFile(param_file, param) Failed to parse NetParameter file: models/test_iou.pt
*** Check failure stack trace: ***
Aborted (core dumped)

Can stamp-style (seal) fonts be recognized?

Hello, can stamp-style (seal) text be recognized?