This project provides the text detection source code and trained models for the word stroke region (WSR) and text center block (TCB).
- Clone the repo
git clone [email protected]:zdlcaffe/WSRTCB.git
- Then build Caffe and its Python interface:
cd ${WSRTCB_root/Train_WSR_TCB/caffe}
make -j
make pycaffe
- Download TD_MKEI_Word.caffemodel, trained on the KAIST dataset.
- Then create the snapshot and pre_model directories:
cd ${WSRTCB_root/}
mkdir snapshot
mkdir pre_model
- Put TD_MKEI_Word.caffemodel into ${WSRTCB_root/Train_WSR_TCB/snapshot}.
- Suppose you have downloaded the MSRA-TD500 dataset; run the following commands to test the model on it:
cd ${WSRTCB_root/Train_WSR_TCB/demo}
python Demo.py
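The exact output format of Demo.py is not shown here, but for working with MSRA-TD500 it helps to know that each ground-truth line stores `index difficulty x y w h theta`, where (x, y) is the top-left corner of the unrotated box and theta is the rotation angle in radians about the box centre. A minimal sketch for turning such a record into a 4-point polygon:

```python
import math

def msra_box_to_polygon(x, y, w, h, theta):
    """Convert an MSRA-TD500 record (top-left x, y, width, height,
    rotation angle in radians about the box centre) to 4 corner points."""
    cx, cy = x + w / 2.0, y + h / 2.0
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    corners = [(x, y), (x + w, y), (x + w, y + h), (x, y + h)]
    poly = []
    for px, py in corners:
        dx, dy = px - cx, py - cy
        poly.append((cx + dx * cos_t - dy * sin_t,
                     cy + dx * sin_t + dy * cos_t))
    return poly

# Example ground-truth line (values are illustrative):
line = "0 1 466 214 329 82 -0.008"
idx, difficult, x, y, w, h, theta = line.split()
polygon = msra_box_to_polygon(float(x), float(y),
                              float(w), float(h), float(theta))
```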
- Then run:
cd ${WSRTCB_root/Text_Demo}
python fuse_thred.py
- Finally run:
cd ${WSRTCB_root/Text_Demo}
python Demo_region_word.py
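The repo does not spell out the fusion rule used by fuse_thred.py; one plausible scheme (an assumption, not taken from the code) is to binarize and combine the word-stroke-region and text-center-block score maps with a shared threshold:

```python
def fuse_score_maps(wsr, tcb, thresh=0.5):
    """Fuse two same-sized score maps (nested lists of floats in [0, 1])
    into a binary mask: a pixel counts as text only if both the WSR and
    TCB scores exceed the threshold. The rule is a hypothetical sketch."""
    return [[1 if a > thresh and b > thresh else 0
             for a, b in zip(row_a, row_b)]
            for row_a, row_b in zip(wsr, tcb)]

wsr = [[0.9, 0.2], [0.7, 0.6]]
tcb = [[0.8, 0.9], [0.3, 0.7]]
mask = fuse_score_maps(wsr, tcb)  # [[1, 0], [0, 1]]
```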
Download the pretrained model vgg16convs.caffemodel and put it into ${WSRTCB_root/Train_WSR_TCB/pre_model}.
Scripts for generating ground truth are provided in the label_generate directory. It is not hard to write a conversion script for your own dataset.
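A minimal conversion sketch for a custom dataset, assuming the data layer reads one tab-separated "image label" pair per line (this list format and the .png label extension are assumptions, not taken from the repo):

```python
import os

def write_list_file(image_dir, label_dir, out_path):
    """Pair every image with its label map of the same stem and write one
    'image<TAB>label' line per sample (assumed list-file format)."""
    with open(out_path, "w") as f:
        for name in sorted(os.listdir(image_dir)):
            stem, _ = os.path.splitext(name)
            label = os.path.join(label_dir, stem + ".png")
            if os.path.exists(label):
                f.write("%s\t%s\n" % (os.path.join(image_dir, name), label))
```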
Modify ${WSRTCB_root/Train_WSR_TCB/model/TD_MKEI_Word.py} to configure your dataset name and dataset path, for example:
......
data_params['root'] = 'data/MKEIWord'
data_params['source'] = "MKEI_Word.lst"
......
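Before launching training, it can save time to sanity-check that every file referenced by the list resolves under the dataset root. A small checker, assuming paths in the list file are whitespace-separated and relative to data_params['root'] (an assumption about the data layer's behaviour):

```python
import os

def check_dataset(params):
    """Return the paths from the source list that do not exist on disk,
    resolving each entry relative to the dataset root."""
    list_path = os.path.join(params['root'], params['source'])
    missing = []
    with open(list_path) as f:
        for line in f:
            for path in line.split():
                full = os.path.join(params['root'], path)
                if not os.path.exists(full):
                    missing.append(full)
    return missing
```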
Then start training:
cd ${WSRTCB_root/Train_WSR_TCB}
sh ./train.sh
Use this BibTeX entry to cite this repository:
@article{liu2019scene,
title={Scene text detection with fully convolutional neural networks},
author={Liu, Zhandong and Zhou, Wengang and Li, Houqiang},
journal={Multimedia Tools and Applications},
pages={1--23},
year={2019},
publisher={Springer}
}