Giter Site home page Giter Site logo

moran_v2's Introduction

MORAN: A Multi-Object Rectified Attention Network for Scene Text Recognition

MORAN is a network with rectification mechanism for general scene text recognition. The paper (accepted to appear in Pattern Recognition, 2019) in arXiv, online version is available now.

Here is a brief introduction in Chinese.

Recent Update

Improvements of MORAN v2:

  • More stable rectification network for one-stage training
  • Replace VGG backbone by ResNet
  • Use bidirectional decoder (a trick borrowed from ASTER)
Version IIIT5K SVT IC03 IC13 SVT-P CUTE80 IC15 (1811) IC15 (2077)
MORAN v1 (curriculum training)* 91.2 88.3 95.0 92.4 76.1 77.4 74.7 68.8
MORAN v2 (one-stage training) 93.4 88.3 94.2 93.2 79.7 81.9 77.8 73.9

*The results of v1 were reported in our paper. If this project is helpful for your research, please cite our Pattern Recognition paper.

Requirements

(Welcome to develop MORAN together.)

Use pip to install the libraries.

    pip install -r requirements.txt

Data Preparation

Please convert your own dataset to LMDB format by using the tool provided by @Baoguang Shi.

You can also download the training (NIPS 2014, CVPR 2016) and testing datasets prepared by us.

The raw pictures of testing datasets can be found here.

Training and Testing

Modify the path to dataset folder in train_MORAN.sh:

	--train_nips path_to_dataset \
	--train_cvpr path_to_dataset \
	--valroot path_to_dataset \

And start training: (manually decrease the learning rate for your task)

	sh train_MORAN.sh

Demo

Download the model parameter file demo.pth.

Put it into root folder. Then, execute the demo.py for more visualizations.

	python demo.py

Citation

@article{cluo2019moran,
  author  = {Canjie Luo, Lianwen Jin, Zenghui Sun},
  title   = {MORAN: A Multi-Object Rectified Attention Network for Scene Text Recognition},
  journal = {Pattern Recognition}, 
  volume  = {}, 
  number  = {}, 
  pages   = {},
  year    = {2019}, 
}

Acknowledgment

The repo is developed based on @Jieru Mei's crnn.pytorch and @marvis' ocr_attention. Thanks for your contribution.

Attention

The project is only free for academic research purposes.

moran_v2's People

Contributors

canjie-luo avatar cclauss avatar

Watchers

James Cloos avatar  avatar  avatar paper2code - bot avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.