
abdur75648 / utrnet-high-resolution-urdu-text-recognition


UTRNet: High-Resolution Urdu Text Recognition In Printed Documents (ICDAR'23)

Home Page: https://abdur75648.github.io/UTRNet/

License: Other

Python 100.00%
document-analysis high-resolution hrnet icdar icdar2023 ocr scene-text-recognition text-detection text-recognition unet

utrnet-high-resolution-urdu-text-recognition's Issues

Error while loading the bigger model weights

When I try to replace the small model weights used in the HF demo with the larger model weights mentioned in the GitHub repo, it throws the following error:

size mismatch for SequenceModeling.0.rnn.weight_ih_l0: copying a param with shape torch.Size([1024, 32]) from checkpoint, the shape in current model is torch.Size([1024, 512]).
size mismatch for SequenceModeling.0.rnn.weight_ih_l0_reverse: copying a param with shape torch.Size([1024, 32]) from checkpoint, the shape in current model is torch.Size([1024, 512]).

Do I need to modify any code to make it work?
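
For anyone hitting the same message, the shapes in it can be decoded directly. PyTorch stores an LSTM layer's `weight_ih_l0` as `(4 * hidden_size, input_size)`, so the checkpoint expects 32 input features per step while the current model was built for 512; the hidden size (256) agrees. The sketch below only does this arithmetic on the shapes quoted above. The helper names `lstm_dims` and `shape_mismatches` are made up for illustration and are not part of the UTRNet code, and reading the mismatch as "the demo builds the model with a different feature-extraction configuration than the checkpoint was trained with" is an assumption, not something the repository confirms here.

```python
def lstm_dims(weight_ih_shape):
    """Return (input_size, hidden_size) implied by an LSTM weight_ih_l0 shape.

    PyTorch lays this tensor out as (4 * hidden_size, input_size).
    """
    rows, input_size = weight_ih_shape
    if rows % 4 != 0:
        raise ValueError("first dimension of an LSTM weight_ih must be divisible by 4")
    return input_size, rows // 4


def shape_mismatches(checkpoint_shapes, model_shapes):
    """List (name, checkpoint_shape, model_shape) for shared keys whose shapes differ.

    Both arguments are plain dicts mapping parameter names to shape tuples
    (hypothetical helper, not part of the repository).
    """
    return [
        (name, tuple(ckpt_shape), tuple(model_shapes[name]))
        for name, ckpt_shape in checkpoint_shapes.items()
        if name in model_shapes and tuple(ckpt_shape) != tuple(model_shapes[name])
    ]


# Shapes copied from the error message above.
checkpoint = {"SequenceModeling.0.rnn.weight_ih_l0": (1024, 32)}
current = {"SequenceModeling.0.rnn.weight_ih_l0": (1024, 512)}

for name, ckpt_shape, model_shape in shape_mismatches(checkpoint, current):
    print(name, "checkpoint:", lstm_dims(ckpt_shape), "model:", lstm_dims(model_shape))
```

With real weights, the two dicts could be built as `{k: tuple(v.shape) for k, v in state_dict.items()}`, assuming the checkpoint file holds a plain state dict.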

Pickle Error on LMDB Dataset

Hi,

I hope you are doing well. I created an LMDB dataset, but when I try to load it for training, I get the error attached to this message. Kindly help me. For the test data path, I provided the path of the directory where the test data was generated via create_lmdb_dataset, and for the validation data path, I provided the path of the directory containing the .mdb files generated from the same file.

Thanks and Regards

test

Code & Dataset

Hi,
Great work on Urdu OCR, @abdur75648!
It would be great if you could tell us when all the code and datasets mentioned in the UTRNet paper will be released.
Thanks in advance

Issue in read.py

Hello,
Thank you for the code. I tried running the command below to read an image and write the output to a .txt file, but I did not get any Urdu text output.

CUDA_VISIBLE_DEVICES=0 python3 read.py --image_path path/to/image.png --FeatureExtraction HRNet --SequenceModeling DBiLSTM --Prediction CTC --saved_model saved_models/UTRNet-Large/best_norm_ED.pth

This is the output I received:


CUDA_VISIBLE_DEVICES=0 python3 read.py --image_path images/page_2.png --FeatureExtraction HRNet --SequenceModeling DBiLSTM --Prediction CTC  --saved_model saved_models/best_norm_ED.pth
Device :  cuda
model input parameters 32 400 20 1 32 256 182 100 HRNet DBiLSTM CTC
Loaded pretrained model from saved_models/best_norm_ED.pth
5

The used Synthetic text dataset

Thanks for the excellent work and for sharing the code. How was the synthetic text dataset (the one you created) prepared? Is it possible to share the code for that, so that we can also create one for other languages?

Issue with pip install -r requirements.txt

I have tried many times, but this is where I get stuck.
I have checked git-lfs with !apt-get install git-lfs:
Reading package lists... Done
Building dependency tree... Done
Reading state information... Done
git-lfs is already the newest version (3.0.2-1ubuntu0.2).
0 upgraded, 0 newly installed, 0 to remove and 38 not upgraded.
ISSUE
Using cached opencv-contrib-python-4.5.1.48.tar.gz (148.8 MB)
error: subprocess-exited-with-error

× pip subprocess to install build dependencies did not run successfully.
│ exit code: 1
╰─> See above for output.

note: This error originates from a subprocess, and is likely not a problem with pip.
Installing build dependencies ... error
error: subprocess-exited-with-error

× pip subprocess to install build dependencies did not run successfully.
│ exit code: 1
╰─> See above for output.

note: This error originates from a subprocess, and is likely not a problem with pip.
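
The log shows pip fetching a source archive (`opencv-contrib-python-4.5.1.48.tar.gz`) and then failing while installing its build dependencies. One possible reading is that no prebuilt wheel of that version matches the interpreter in use, so pip falls back to compiling from source; this is an assumption, since the output above the error is not included. If so, a workaround some people try is to drop the version pin so that pip can pick a release with a matching wheel. The sketch below does that on requirements-style text. The function name `relax_pin` and the sample contents are hypothetical; the actual `requirements.txt` is not reproduced on this page, and a newer OpenCV release is not guaranteed to behave identically with the repository's code.

```python
import re


def relax_pin(requirements_text, package):
    """Remove the version specifier from one package in requirements-style text.

    Only lines that start with exactly `package` followed by a version
    operator are rewritten; every other line is returned unchanged.
    """
    pattern = re.compile(
        r"^\s*" + re.escape(package) + r"\s*(==|>=|<=|~=|!=|<|>)",
        re.IGNORECASE,
    )
    lines = []
    for line in requirements_text.splitlines():
        lines.append(package if pattern.match(line) else line)
    return "\n".join(lines)


# Hypothetical file contents: only the opencv pin is taken from the log above.
sample = "example-package==1.0\nopencv-contrib-python==4.5.1.48"
print(relax_pin(sample, "opencv-contrib-python"))
```

The rewritten text can then be saved to a new file and passed to `pip install -r` in place of the original.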
