
Scene Text Detection with Fully Convolutional Neural Networks



1. Introduction

This project contains the source code and trained model for scene text detection based on the word stroke region (WSR) and the text center block (TCB).

2. Installation

  • Clone the repo:
git clone git@github.com:zdlcaffe/WSRTCB.git
  • Then build Caffe and its Python interface:
cd ${WSRTCB_root}/Train_WSR_TCB/caffe
make -j
make pycaffe
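
To confirm that the build step produced the Python extension, a quick check can look for the compiled module. This is a minimal sketch, assuming the bundled Caffe follows the stock Caffe Makefile and writes the extension to python/caffe/_caffe.so under the Caffe root; the function name `pycaffe_extension_built` is hypothetical and not part of this repository.

```python
import glob
import os


def pycaffe_extension_built(caffe_root):
    """Return True if a compiled _caffe extension exists under caffe_root.

    Assumption: `make pycaffe` writes the extension to
    <caffe_root>/python/caffe/_caffe*.so, as the stock Caffe Makefile does.
    """
    pattern = os.path.join(caffe_root, "python", "caffe", "_caffe*.so")
    return bool(glob.glob(pattern))
```

If the check passes, adding the `python` directory under the Caffe root to `PYTHONPATH` is the usual way to make `import caffe` work from other scripts.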

3. Testing

3.1 Generate WSR/TCB score map

cd ${WSRTCB_root}/
mkdir snapshot
mkdir pre_model
  • Put TD_MKEI_Word.caffemodel in the folder ${WSRTCB_root}/Train_WSR_TCB/snapshot.

  • Assuming you have downloaded the MSRA-TD500 dataset, run the following commands to test the model on it:

cd ${WSRTCB_root}/Train_WSR_TCB/demo
python Demo.py

3.2 There are some samples:

3.3 Threshold WSR/TCB maps:

Run the following commands:

cd ${WSRTCB_root}/Text_Demo
python fuse_thred.py
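
For readers who want to see what thresholding a score map amounts to, here is a minimal sketch in plain Python. It is not taken from fuse_thred.py: the threshold value, the function names `threshold_map` and `fuse_masks`, and the choice of a logical AND as the fusion rule are all assumptions made for illustration.

```python
def threshold_map(score_map, threshold):
    """Binarize a 2-D score map: 1 where score >= threshold, else 0."""
    return [[1 if value >= threshold else 0 for value in row]
            for row in score_map]


def fuse_masks(wsr_mask, tcb_mask):
    """Combine two binary masks of equal size.

    Assumption: a pixel is kept only if both masks mark it (logical AND).
    The repository's script may use a different fusion rule.
    """
    return [[a & b for a, b in zip(wsr_row, tcb_row)]
            for wsr_row, tcb_row in zip(wsr_mask, tcb_mask)]


# Toy 2x2 score maps with values in [0, 1].
wsr_scores = [[0.9, 0.2], [0.6, 0.4]]
tcb_scores = [[0.8, 0.7], [0.1, 0.5]]
fused = fuse_masks(threshold_map(wsr_scores, 0.5),
                   threshold_map(tcb_scores, 0.5))
```

With real score maps, the same logic is usually written with array operations; the nested lists here only keep the sketch free of dependencies.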

3.4 Generate detection results

Run the following commands:

cd ${WSRTCB_root}/Text_Demo
python Demo_region_word.py
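
One common way to turn a binary text mask into detection boxes is to group connected pixels and take the bounding box of each group. The sketch below illustrates that idea only; Demo_region_word.py may work differently, and the function name `component_boxes` and the 4-connectivity choice are assumptions.

```python
from collections import deque


def component_boxes(mask):
    """Return (left, top, right, bottom) for each 4-connected region of 1s.

    Assumption: `mask` is a rectangular list of rows containing 0 or 1.
    Coordinates are inclusive pixel indices.
    """
    height = len(mask)
    width = len(mask[0]) if height else 0
    seen = [[False] * width for _ in range(height)]
    boxes = []
    for row in range(height):
        for col in range(width):
            if not mask[row][col] or seen[row][col]:
                continue
            # Breadth-first search over the region containing (row, col).
            seen[row][col] = True
            queue = deque([(row, col)])
            top, left, bottom, right = row, col, row, col
            while queue:
                y, x = queue.popleft()
                top, bottom = min(top, y), max(bottom, y)
                left, right = min(left, x), max(right, x)
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < height and 0 <= nx < width
                            and mask[ny][nx] and not seen[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            boxes.append((left, top, right, bottom))
    return boxes
```

Axis-aligned boxes are the simplest output; oriented text such as that in MSRA-TD500 would normally need a rotated rectangle fitted to each region instead.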

3.5 There are some samples:

4. Training

Download the pretrained model vgg16convs.caffemodel and put it in ${WSRTCB_root}/Train_WSR_TCB/pre_model.

4.1 Generate your map

Scripts for generating ground truth are provided in the label_generate directory. It is not hard to write a conversion script for your own dataset.
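
As a starting point for such a conversion script, the sketch below rasterizes one rotated text box into a binary label map. It assumes each annotation is a rotated rectangle given as top-left corner, size, and angle in radians about the box centre; check your dataset's documentation for its actual convention. It does not reproduce how the label_generate scripts define the WSR or TCB maps, and the function names are hypothetical.

```python
import math


def rotated_box_corners(x, y, w, h, theta):
    """Corners of a w-by-h box with top-left (x, y), rotated by theta
    radians about its centre (assumed annotation convention)."""
    cx, cy = x + w / 2.0, y + h / 2.0
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    offsets = ((-w / 2.0, -h / 2.0), (w / 2.0, -h / 2.0),
               (w / 2.0, h / 2.0), (-w / 2.0, h / 2.0))
    return [(cx + dx * cos_t - dy * sin_t, cy + dx * sin_t + dy * cos_t)
            for dx, dy in offsets]


def fill_convex_quad(corners, height, width):
    """Binary map with 1 where the pixel centre lies inside the quad."""
    mask = [[0] * width for _ in range(height)]
    for row in range(height):
        for col in range(width):
            px, py = col + 0.5, row + 0.5
            # Cross product of each edge with the vector to the pixel centre.
            signs = []
            for i in range(4):
                ax, ay = corners[i]
                bx, by = corners[(i + 1) % 4]
                signs.append((bx - ax) * (py - ay) - (by - ay) * (px - ax))
            # Inside a convex quad, all cross products share one sign.
            if all(s >= 0 for s in signs) or all(s <= 0 for s in signs):
                mask[row][col] = 1
    return mask
```

The per-pixel loop is slow for full-size images; it is meant to show the geometry, and a polygon-fill routine from an image library would be the practical choice.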

4.2 Train your own model

Modify ${WSRTCB_root}/Train_WSR_TCB/model/TD_MKEI_Word.py to set your dataset name and dataset path, for example:
......
data_params['root'] = 'data/MKEIWord'
data_params['source'] = "MKEI_Word.lst"
......
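
Before training, it can help to verify that every file named in the list file exists. The sketch below assumes the list file sits inside the `root` directory and that each line holds whitespace-separated paths relative to `root` (for example an image and its label map); neither detail is confirmed by this README, and the function name `missing_files` is hypothetical.

```python
import os


def missing_files(root, source):
    """Return the relative paths listed in <root>/<source> that do not exist.

    Assumption: each non-empty line contains whitespace-separated paths
    relative to `root`.
    """
    missing = []
    with open(os.path.join(root, source)) as handle:
        for line in handle:
            for rel_path in line.split():
                if not os.path.isfile(os.path.join(root, rel_path)):
                    missing.append(rel_path)
    return missing
```

Calling `missing_files(data_params['root'], data_params['source'])` with the values configured above would then return an empty list when the dataset is complete, under the stated assumptions.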

4.3 Start training

Run the following commands:

cd ${WSRTCB_root}/Train_WSR_TCB
sh ./train.sh

Citation

Use this BibTeX entry to cite this repository:

@article{liu2019scene,
  title={Scene text detection with fully convolutional neural networks},
  author={Liu, Zhandong and Zhou, Wengang and Li, Houqiang},
  journal={Multimedia Tools and Applications},
  pages={1--23},
  year={2019},
  publisher={Springer}
}

Acknowledgement
