This project forked from lluisgomez/textproposals

Implementation of the method proposed in the papers "TextProposals: a Text-specific Selective Search Algorithm for Word Spotting in the Wild" and "Object Proposals for Text Extraction in the Wild" (Gomez & Karatzas), 2016 and 2015 respectively.

Home Page: https://github.com/lluisgomez/TextProposals

TextProposals

Implementation of the method proposed in the papers:

  • "TextProposals: a Text-specific Selective Search Algorithm for Word Spotting in the Wild" (Gomez and Karatzas), arXiv:1604.02619 2016.

  • "Object Proposals for Text Extraction in the Wild" (Gomez & Karatzas), International Conference on Document Analysis and Recognition, ICDAR2015.

This code reproduces the results published in the papers for the SVT, ICDAR2013, and ICDAR2015 datasets.

If you make use of this code, we appreciate it if you cite our papers:

@article{gomez2016,
  title     = {TextProposals: a Text-specific Selective Search Algorithm for Word Spotting in the Wild},
  author    = {Lluis Gomez and Dimosthenis Karatzas},
  journal   = {arXiv preprint arXiv:1604.02619},
  year      = {2016}
}
@inproceedings{GomezICDAR15object,
  title     = {Object Proposals for Text Extraction in the Wild},
  author    = {Lluis Gomez and Dimosthenis Karatzas},
  booktitle = {ICDAR},
  year      = {2015}
}

For any questions, please write to us ({lgomez,dimos}@cvc.uab.es). Thanks!

Includes the following third party code:

CNN models

The end-to-end evaluation requires the DictNet_VGG model to be placed in the project root directory. The DictNet_VGG Caffe model and prototxt are available at http://nicolaou.homouniversalis.org/assets/vgg_text/

Compilation

Requires: OpenCV (3.0.x), Caffe (tested with d21772c), tinyXML

cmake .
make

(NOTE: you may need to change the include and lib paths in the CMakeLists.txt file to match your Caffe and CUDA installations.)

Run

./img2hierarchy <img_filename>

Writes to stdout a list of proposals, one per line, in the format x,y,w,h,c, where x,y,w,h define a bounding box and c is a confidence value used to rank the proposals.

./img2hierarchy_cnn <img_filename>

Same as above, but for end-to-end recognition using the DictNet_VGG CNN model.
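
As a rough illustration of the x,y,w,h,c format written by img2hierarchy, the following Python sketch parses such output and sorts it by confidence. The names `Proposal` and `parse_proposals` and the `descending` flag are hypothetical and not part of this project. The sketch assumes the box coordinates are integer pixel values, and since it is not stated here whether lower or higher values of c rank first, the sort direction is left as a parameter.

```python
from typing import List, NamedTuple


class Proposal(NamedTuple):
    """One proposal line: bounding box (x, y, w, h) plus confidence c."""
    x: int
    y: int
    w: int
    h: int
    c: float


def parse_proposals(text: str, descending: bool = False) -> List[Proposal]:
    """Parse 'x,y,w,h,c' lines and sort them by the confidence value c.

    Assumption: coordinates are integer pixels. Values such as '12.0'
    are tolerated by converting through float first.
    """
    proposals = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        x, y, w, h, c = line.split(",")
        proposals.append(
            Proposal(int(float(x)), int(float(y)), int(float(w)), int(float(h)), float(c))
        )
    return sorted(proposals, key=lambda p: p.c, reverse=descending)


# Made-up sample input, only to show the call.
sample = "10,20,30,40,0.5\n5,5,8,8,0.1\n"
for proposal in parse_proposals(sample):
    print(proposal)
```

In practice the text would come from the captured stdout of ./img2hierarchy or from a file produced by redirecting it.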

End-to-end Evaluation

The following commands reproduce the end-to-end results in our paper:

./eval_IC03 data/ICDAR2003/SceneTrialTest/words.xml <LEX_SIZE>

./eval_SVT data/SVT/test.xml <LEX_SIZE>

./eval_IC15 <LEX_SIZE>

The value of the LEX_SIZE parameter indicates the size of the lexicon to be used: 0 (for small lexicons), 1 (for the full lexicon), or 2 (for no lexicon, i.e. the 90k word vocabulary of the DictNet model).

Ground truth data for each dataset must be downloaded and placed in its respective folder in the ./data/ directory.

In the case of ICDAR2015, since the test ground truth is not available, the program saves the results in the res/ directory. These result files can be uploaded to the ICDAR Robust Reading Competition site for evaluation.

Object Proposal Evaluation

The following command lines generate a txt file with proposals for each image in the SVT and ICDAR2013 datasets respectively.

for i in `cat /path/to/datasets/SVT/svt1/test.xml | grep imageName | cut -d '>' -f 2 | cut -d '<' -f 1 | cut -d '/' -f 2 | cut -d '.' -f 1 `; do echo $i; ./img2hierarchy /path/to/datasets/SVT/svt1/img/$i.jpg 13 > data/$i; done;

for i in `cat /path/to/datasets/ICDAR2013/test_locations.xml | grep imageName | cut -d '>' -f 2 | cut -d '<' -f 1 | cut -d '_' -f 2 | cut -d '.' -f 1`; do echo $i; ./img2hierarchy /path/to/datasets/ICDAR2013/test/img_$i.jpg 13 > data/$i; done
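
For readers who prefer not to chain grep and cut, the image-identifier extraction in the two loops above can be approximated in Python. This is only a sketch: the function name `image_ids` is hypothetical, and it assumes each image name appears as `<imageName>...</imageName>` and contains the separator character, which is what the cut commands above imply. Unlike cut, it raises an error if the separator is missing.

```python
import re
from typing import List


def image_ids(xml_text: str, sep: str) -> List[str]:
    """Extract image identifiers from <imageName> elements.

    Mirrors the shell pipeline: take the text inside <imageName>,
    keep the second field when split on `sep`, then drop everything
    from the first '.' onward.
    """
    ids = []
    for name in re.findall(r"<imageName>([^<]*)</imageName>", xml_text):
        second_field = name.split(sep)[1]        # like: cut -d SEP -f 2
        ids.append(second_field.split(".")[0])   # like: cut -d '.' -f 1
    return ids


# Made-up snippets in the shape the shell commands expect.
print(image_ids("<imageName>img/14_03.jpg</imageName>", "/"))
print(image_ids("<imageName>img_1.jpg</imageName>", "_"))
```

Use "/" as the separator for the SVT file and "_" for the ICDAR2013 file, matching the cut -d arguments in the corresponding loop.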

Once the files are generated, you may want to run the MATLAB code in the evaluation/ folder to get the IoU scores and plots.

Note that the MATLAB evaluation script performs deduplication of the bounding box proposals. Thus, if you use another evaluation framework, you must deduplicate the proposals in the same way.
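
For reference, the intersection over union (IoU) between a proposal and a ground-truth box can be sketched in Python as follows. This is a generic IoU computation, not a port of the scripts in evaluation/. The function name `iou` is hypothetical, and boxes are assumed to be axis-aligned (x, y, w, h) tuples with (x, y) as the top-left corner.

```python
from typing import Tuple

Box = Tuple[float, float, float, float]  # (x, y, w, h), top-left origin (assumed)


def iou(a: Box, b: Box) -> float:
    """Return the intersection over union of two axis-aligned boxes."""
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    # Width and height of the overlapping region (non-positive if disjoint).
    inter_w = min(ax + aw, bx + bw) - max(ax, bx)
    inter_h = min(ay + ah, by + bh) - max(ay, by)
    if inter_w <= 0 or inter_h <= 0:
        return 0.0
    inter = inter_w * inter_h
    union = aw * ah + bw * bh - inter
    return inter / union


# Two 10x10 boxes, the second shifted right by 5 pixels.
print(iou((0, 0, 10, 10), (5, 0, 10, 10)))
```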
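
One simple way to deduplicate is to keep only the first occurrence of each distinct box. The Python sketch below assumes that "duplicate" means an identical (x, y, w, h) tuple and that the input is already in ranked order. The exact criterion used by the MATLAB script is not described here, so check the code in evaluation/ before relying on this. The function name `deduplicate` is hypothetical.

```python
from typing import List, Tuple

Row = Tuple[int, int, int, int, float]  # (x, y, w, h, c)


def deduplicate(proposals: List[Row]) -> List[Row]:
    """Keep the first occurrence of each (x, y, w, h) box, preserving order.

    Assumption: two proposals are duplicates only when their boxes are
    exactly equal; the confidence value c is ignored in the comparison.
    """
    seen = set()
    unique = []
    for x, y, w, h, c in proposals:
        box = (x, y, w, h)
        if box in seen:
            continue  # an identical box already appeared earlier in the ranking
        seen.add(box)
        unique.append((x, y, w, h, c))
    return unique


# Made-up ranked list with one repeated box.
ranked = [(1, 2, 3, 4, 0.1), (1, 2, 3, 4, 0.2), (5, 6, 7, 8, 0.3)]
print(deduplicate(ranked))
```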
