RCLSTR

[ACMMM 2023] Relational Contrastive Learning for Scene Text Recognition


Introduction

This repository is an official implementation of RCLSTR.

Getting Started

1. Environment Setup

Base Environments

Python >= 3.8
CUDA == 11.0
PyTorch == 1.7.1

Step-by-step installation instructions

a. Create a conda virtual environment and activate it.

conda create -n RCLSTR python=3.8 -y
conda activate RCLSTR

b. Install PyTorch and torchvision following the official instructions.

pip install torch==1.7.1+cu110 torchvision==0.8.2+cu110 torchaudio==0.7.2 -f https://download.pytorch.org/whl/torch_stable.html

c. Install other packages.

pip install lmdb pillow nltk natsort fire tensorboard tqdm imgaug einops
pip install numpy==1.22.3
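
After installation, a quick sanity check (a minimal stdlib-only sketch, not part of the repository; the package list mirrors the pip commands above, noting that pillow imports as PIL) confirms everything resolves in the activated environment:

```python
import importlib.util

# Import names for the packages installed by the steps above.
REQUIRED = ["torch", "torchvision", "lmdb", "PIL", "nltk", "natsort",
            "fire", "tqdm", "imgaug", "einops", "numpy"]

def missing_packages(names):
    """Return the subset of module names that cannot be imported."""
    return [n for n in names if importlib.util.find_spec(n) is None]
```

An empty result from `missing_packages(REQUIRED)` means the environment is ready; any name it returns needs reinstalling.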

2. Data Preparation

Download datasets

We use the SynthText (ST) training dataset from STR-Fewer-Labels. Download the datasets from baiduyun (password: px16).

Data folder structure

data_CVPR2021
├── training
│   └── label
│       └── synth
├── validation
│   ├── 1.SVT
│   ├── 2.IIIT
│   ├── 3.IC13
│   ├── 4.IC15
│   ├── 5.COCO
│   ├── 6.RCTW17
│   ├── 7.Uber
│   ├── 8.ArT
│   ├── 9.LSVT
│   ├── 10.MLT19
│   └── 11.ReCTS
└── evaluation
    └── benchmark
        ├── CUTE80
        ├── IC03_867
        ├── IC13_1015
        ├── IC15_2077
        ├── IIIT5k_3000
        ├── SVT
        └── SVTP

Link the dataset into both the pretrain and evaluation directories:

cd pretrain
ln -s /path/to/data_CVPR2021 data_CVPR2021
cd ../evaluation
ln -s /path/to/data_CVPR2021 data_CVPR2021
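
Before launching training, a small helper (hypothetical, not part of the repository; the paths are taken from the folder layout above) can confirm the symlinked tree is complete:

```python
from pathlib import Path

# Key sub-directories from the data folder structure shown above.
EXPECTED = [
    "training/label/synth",
    "validation",
    "evaluation/benchmark",
]

def missing_dirs(root, expected=EXPECTED):
    """Return the expected sub-directories that are absent under root."""
    root = Path(root)
    return [d for d in expected if not (root / d).is_dir()]
```

`missing_dirs("data_CVPR2021")` returns an empty list when the tree matches the layout above.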

TPS model weights

For the TPS module, we use the pretrained TPS model weights from STR-Fewer-Labels. Please download the TPS model weights from baiduyun (password: px16) and put them in pretrain/TPS_model.
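
To catch a missing file before a failed run, a one-line check (a stdlib sketch; the path mirrors the --useTPS argument used by the pretraining commands) confirms the weights landed in the right place:

```python
from pathlib import Path

def tps_weights_present(repo_root="."):
    """True if the TPS checkpoint sits where the pretrain commands expect it."""
    path = Path(repo_root) / "pretrain" / "TPS_model" / "TRBA-Baseline-synth.pth"
    return path.is_file()
```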

3. Pretraining and Decoder Evaluation

Pretrain

The RCLSTR method comprises a regularization module (reg), a hierarchical module (hier), and a cross-hierarchy consistency module (con).

Pretrain the baseline SeqMoCo model:

cd pretrain
CUDA_VISIBLE_DEVICES=0,1,2,3 python main_moco.py   \
--model_name TRBA  \
--exp_name SeqMoCo   \
--lr 0.0015   \
--batch-size 32   \
--dist-url 'tcp://localhost:10002' \
--multiprocessing-distributed \
--world-size 1 \
--rank 0   \
--data data_CVPR2021/training/label/synth  \
--data-format lmdb  \
--light_aug   \
--instance_map window   \
--epochs 5   \
--useTPS ./TPS_model/TRBA-Baseline-synth.pth \
--loss_setting consistent \
--frame_weight 0 \
--frame_alpha 0 \
--word_weight 0 \
--word_alpha 0
Pretrain SeqMoCo with the reg module:

cd pretrain
CUDA_VISIBLE_DEVICES=0,1,2,3 python main_moco.py   \
--model_name TRBA  \
--exp_name SeqMoCo_reg   \
--lr 0.0015   \
--batch-size 32   \
--dist-url 'tcp://localhost:10002' \
--multiprocessing-distributed \
--world-size 1 \
--rank 0   \
--data data_CVPR2021/training/label/synth  \
--data-format lmdb  \
--light_aug   \
--instance_map window   \
--epochs 5   \
--useTPS ./TPS_model/TRBA-Baseline-synth.pth \
--loss_setting consistent \
--permutation \
--frame_weight 0 \
--frame_alpha 0 \
--word_weight 0 \
--word_alpha 0
Pretrain SeqMoCo with the reg and hier modules:

cd pretrain
CUDA_VISIBLE_DEVICES=0,1,2,3 python main_moco.py   \
--model_name TRBA  \
--exp_name SeqMoCo_reg_hier   \
--lr 0.0015   \
--batch-size 32   \
--dist-url 'tcp://localhost:10002' \
--multiprocessing-distributed \
--world-size 1 \
--rank 0   \
--data data_CVPR2021/training/label/synth  \
--data-format lmdb  \
--light_aug   \
--instance_map window   \
--epochs 5   \
--useTPS ./TPS_model/TRBA-Baseline-synth.pth \
--loss_setting consistent \
--permutation 
Pretrain SeqMoCo with the reg, hier, and con modules:

cd pretrain
CUDA_VISIBLE_DEVICES=0,1,2,3 python main_moco.py   \
--model_name TRBA  \
--exp_name SeqMoCo_reg_hier_con   \
--lr 0.0015   \
--batch-size 32   \
--dist-url 'tcp://localhost:10002' \
--multiprocessing-distributed \
--world-size 1 \
--rank 0   \
--data data_CVPR2021/training/label/synth  \
--data-format lmdb  \
--light_aug   \
--instance_map window   \
--epochs 5   \
--useTPS ./TPS_model/TRBA-Baseline-synth.pth \
--loss_setting consistent \
--permutation \
--multi_level_consistent global2local \
--multi_level_ins 0
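
All of the commands above optimize a MoCo-style contrastive objective over sequence features. As a rough NumPy illustration of the InfoNCE loss at the core of such pretraining (a generic sketch, not the repository's implementation; the variable names and the 0.07 temperature are assumptions):

```python
import numpy as np

def info_nce(q, k_pos, queue, temperature=0.07):
    """InfoNCE loss for one query against its positive key and a queue of negatives."""
    q = q / np.linalg.norm(q)
    k_pos = k_pos / np.linalg.norm(k_pos)
    queue = queue / np.linalg.norm(queue, axis=1, keepdims=True)
    # Logit 0 is the positive pair; the rest come from the negative queue.
    logits = np.concatenate(([q @ k_pos], queue @ q)) / temperature
    logits -= logits.max()  # numerical stability before the log-sum-exp
    return -logits[0] + np.log(np.exp(logits).sum())
```

The loss is small when the query matches its positive key and large otherwise; the reg, hier, and con modules add relational and cross-hierarchy consistency terms on top of this base objective.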

Feature representation evaluation

Train an attention-based decoder to evaluate the learned feature representations:

cd evaluation
CUDA_VISIBLE_DEVICES=0 python train_new.py \
--model_name TRA \
--exp_name TRA_reg_hier_con \
--saved_model ../pretrain/SeqMoCo_reg_hier_con/checkpoint_0004.pth.tar \
--select_data synth \
--batch_size 256 \
--Aug light
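
--saved_model points at a MoCo-style checkpoint, which stores the query encoder under a DistributedDataParallel prefix; decoder training typically strips that prefix before loading the weights. A stdlib sketch (the module.encoder_q. prefix follows the MoCo convention and is an assumption about this repository's checkpoints):

```python
def strip_moco_prefix(state_dict, prefix="module.encoder_q."):
    """Keep only query-encoder weights and drop the DDP/MoCo prefix from their keys."""
    return {k[len(prefix):]: v for k, v in state_dict.items() if k.startswith(prefix)}
```

For example, a key like module.encoder_q.conv.weight becomes conv.weight, while momentum-encoder and queue entries are discarded.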

Main Results


TODO

  • Support ViT

Acknowledgements

We thank the great works and open-source codebases that this project builds on, in particular STR-Fewer-Labels and MoCo.

Citation

If you find our method useful for your research, please cite:

@misc{zhang2023relational,
      title={Relational Contrastive Learning for Scene Text Recognition}, 
      author={Jinglei Zhang and Tiancheng Lin and Yi Xu and Kai Chen and Rui Zhang},
      year={2023},
      eprint={2308.00508},
      archivePrefix={arXiv},
      primaryClass={cs.CV}
}
