omni-us / research-contentdistillation-htr

Source code for ICFHR20 "Distilling Content from Style for Handwritten Word Recognition"

License: MIT License

Python 99.82% Shell 0.18%
handwriting-recognition generative-adversarial-network document-analysis

research-contentdistillation-htr's Introduction


Distilling Content from Style for Handwritten Word Recognition

A novel method that disentangles the content and style of input word images by jointly optimizing a generative process and a handwritten word recognizer.

Architecture

Distilling Content from Style for Handwritten Word Recognition
Lei Kang, Pau Riba, Marçal Rusiñol, Alicia Fornés, and Mauricio Villegas
Accepted to ICFHR2020

Software environment

  • Ubuntu 16.04 x64
  • Python 3.7
  • PyTorch 1.0.1

Dataset preparation

We carry out our experiments on the widely used IAM handwriting dataset.

Note before running the code

  • Training takes a lot of GPU memory: in our case, 24 GB on an RTX 6000 GPU with batch size 8. Even with batch size 1, it still requires 16 GB of GPU memory.

How to train?

Once the dataset is prepared, set the correct dataset paths in load_data.py, and you are ready to go. To train from scratch:

./run_train_scratch.sh

Or to start with a saved checkpoint:

./run_train_pretrain.sh
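The exact contents of load_data.py are not reproduced here; the sketch below only illustrates the kind of path constants you would point at your local IAM copy. All names and locations are assumptions, not the file's real identifiers:

```python
import os

# Hypothetical path constants -- adapt the names and locations to what
# load_data.py actually defines; these are illustrative assumptions.
IAM_ROOT = "/path/to/IAM"                               # your local IAM copy
IMG_DIR = os.path.join(IAM_ROOT, "words")               # word image folder
GT_FILE = os.path.join(IAM_ROOT, "ascii", "words.txt")  # transcription file
```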

Note: The GPU to use and the epoch to start from can be set in this shell script. (The epoch ID corresponds to the weights you want to load from the save_weights folder.)
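The launcher scripts themselves are not shown in this README; below is a minimal dry-run sketch of the kind of wrapper they contain. The variable names, flag name, and script name are assumptions, not the repo's actual interface:

```shell
#!/usr/bin/env bash
# Hypothetical launcher sketch: GPU_ID selects the GPU, START_EPOCH selects
# which weights under save_weights/ to resume from. The --start-epoch flag
# and main.py script name are illustrative, not the repo's real CLI.
GPU_ID=0
START_EPOCH=0
# Echo instead of executing, as a dry run of the command being built.
echo "CUDA_VISIBLE_DEVICES=$GPU_ID python main.py --start-epoch $START_EPOCH"
```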

How to test?

./run_test.sh

Don't forget to change the epoch ID in this shell script so that the weights corresponding to that epoch are loaded.
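Picking the checkpoint for a given epoch amounts to a simple filename lookup. The naming pattern in this sketch is an assumption; check the actual filenames inside save_weights/ and adapt it:

```python
import os

def weights_path(epoch, folder="save_weights"):
    """Map an epoch ID to its checkpoint file.

    The 'contran-<epoch>.model' pattern is hypothetical -- replace it
    with whatever naming scheme the files in save_weights/ actually use.
    """
    return os.path.join(folder, f"contran-{epoch}.model")
```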

How to boost the HTR performance?

After the content distillation model has been trained properly (with early stopping), load the recognizer module ConTranModel.rec and fine-tune it on the IAM training set alone. The Seq2Seq HTR recognizer can be found here.
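One way to pull ConTranModel.rec out of a full-model checkpoint is to filter its state dict by the submodule prefix. The sketch below operates on plain dicts (a PyTorch state_dict behaves like one); the "rec." prefix follows the attribute name mentioned above, and everything else is an assumption:

```python
def extract_submodule(state_dict, prefix):
    """Keep only the entries under `prefix`, stripping the prefix so the
    result can be passed straight to submodule.load_state_dict()."""
    pre = prefix + "."
    return {k[len(pre):]: v for k, v in state_dict.items() if k.startswith(pre)}
```

The filtered dict could then be loaded into the recognizer with something like rec.load_state_dict(extract_submodule(checkpoint, "rec")) before fine-tuning on IAM.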

Citation

If you use the code for your research or application, please cite our paper:

To be filled.

research-contentdistillation-htr's People

Contributors

leitro

research-contentdistillation-htr's Issues

Pretrained model

Hi @leitro, thanks for sharing your amazing work.

Could you share your pre-trained model so that we can skip retraining and try it right away?

I will be grateful if you could provide your pre-trained model. Thanks in advance.
