simplify23 / cdistnet

Official PyTorch implementation of CDistNet: Perceiving Multi-Domain Character Distance for Robust Text Recognition (IJCV)

License: Apache License 2.0

Jupyter Notebook 53.39% Python 46.61%

cdistnet's Issues

Any way to train only on language and not images?

I have a relatively small dataset in a different format (license plates), and the model often gets the license plate format wrong.

I was wondering if there was a way to train the model on just a bunch of text string data without feeding any images at all in order to enforce the format.

Please let me know if it is possible to train the language/semantic model independently, by just feeding string text data of words, without corresponding images.

About inference

How should the input_char parameter be set, for example when predicting on a new image?

CDistNetv2

@simplify23, are you planning to release the CDistNetv2 code?
I am waiting for the lightweight and faster module.

Open source license?

Are you willing to specify an open-source license, such as the MIT License?
The GitHub repository has no license specified.

Attention Maps

Could you please release the code used to generate the attention maps published in the paper?

Missing transformer

When trying to run test.py I get the following error:

(CDistNet) C:\<path>\CDistNet>python test.py --i_path ..\examples\300_0.jpg 
configs/CDistNet_config.py
<class 'str'>
Traceback (most recent call last):
  File "test.py", line 175, in <module>
    main()
  File "test.py", line 168, in main
    test_one(cfg, args)
  File "test.py", line 126, in test_one
    en = get_parameter_number(model.transformer.encoder)
  File "C:\<path>\miniconda3\envs\CDistNet\lib\site-packages\torch\nn\modules\module.py", line 1178, in __getattr__ 
    type(self).__name__, name))
AttributeError: 'CDistNet' object has no attribute 'transformer'
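
The traceback shows that the model object has no `transformer` attribute, so the lookup `model.transformer.encoder` fails before any parameters are counted. A minimal sketch of a defensive lookup follows. It assumes only that a module exposes `parameters()` and that each parameter exposes `numel()` and `requires_grad`, as PyTorch modules and parameters do. The helper names `count_parameters` and `count_in_submodule` and the stand-in classes are hypothetical and are not taken from the repository; the real attribute names of the CDistNet model are not given in this issue.

```python
def count_parameters(module):
    """Return (total, trainable) parameter counts for a module-like object."""
    params = list(module.parameters())
    total = sum(p.numel() for p in params)
    trainable = sum(p.numel() for p in params if p.requires_grad)
    return total, trainable


def count_in_submodule(model, dotted_name):
    """Resolve a path such as 'transformer.encoder' one attribute at a time.

    Returns None instead of raising AttributeError when any part is missing.
    """
    obj = model
    for part in dotted_name.split("."):
        obj = getattr(obj, part, None)
        if obj is None:
            return None
    return count_parameters(obj)


# Stand-in classes so the sketch runs without PyTorch (hypothetical, demo only).
class FakeParam:
    def __init__(self, n, requires_grad=True):
        self._n = n
        self.requires_grad = requires_grad

    def numel(self):
        return self._n


class FakeModule:
    def __init__(self, params=(), **children):
        self._params = list(params)
        for name, child in children.items():
            setattr(self, name, child)

    def parameters(self):
        return iter(self._params)


# A model that has an 'encoder' child but no 'transformer' child.
model = FakeModule(
    encoder=FakeModule(params=[FakeParam(10), FakeParam(5, requires_grad=False)])
)

missing = count_in_submodule(model, "transformer.encoder")  # path does not exist
found = count_in_submodule(model, "encoder")                # path exists
print(missing, found)
```

With a real PyTorch model, iterating over `model.named_children()` lists the top-level submodule names, which shows what to pass instead of `transformer.encoder`.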

Train for another language

Hello, thanks for your paper and the released code.
I want to train the model on another language, but I see that lmdbdataset keeps only English alphanumeric characters and limits the label length to 30. Is that correct?
Should I change lines 245 and 246?

    def __len__(self):
        return self.length

    def get(self, idx):
        with self.env.begin(write=False) as txn:
            image_key, label_key = f'image-{idx+1:09d}', f'label-{idx+1:09d}'
            label = str(txn.get(label_key.encode()), 'utf-8')  # label
            label = re.sub('[^0-9a-zA-Z]+', '', label)
            label = label[:30]
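
The two label-cleaning steps in the quoted snippet can be sketched in isolation to show their effect: the regular expression deletes every character outside `0-9a-zA-Z`, and the slice keeps at most 30 characters. The functions `build_filter` and `clean_label` and the sample Cyrillic character set below are hypothetical illustrations and are not part of the repository. Changing the filter alone is likely not enough, since the model's character dictionary would presumably need to match the new character set as well.

```python
import re

# The filter used in the quoted snippet: drop anything that is not 0-9, a-z, A-Z.
ENGLISH_FILTER = re.compile("[^0-9a-zA-Z]+")


def build_filter(charset):
    """Compile a pattern matching runs of characters that are NOT in charset."""
    if not charset:
        raise ValueError("charset must contain at least one character")
    # re.escape keeps characters such as ']' or '-' literal inside the class.
    return re.compile("[^" + re.escape(charset) + "]+")


def clean_label(label, disallowed, max_len=30):
    """Remove disallowed characters, then truncate to max_len characters."""
    return disallowed.sub("", label)[:max_len]


# With the English filter, the accented letter, comma, space and '!' are dropped.
english = clean_label("Héllo, World!", ENGLISH_FILTER)

# A hypothetical filter for a small Cyrillic alphabet plus digits.
cyrillic_filter = build_filter("абвгд0123456789")
cyrillic = clean_label("абв-123 xyz", cyrillic_filter)

# Labels longer than max_len are cut, as in the quoted label[:30].
truncated = clean_label("a" * 40, ENGLISH_FILTER)

print(english, cyrillic, len(truncated))
```

This makes the behaviour for non-English labels visible: with the original filter, a label written entirely in another script becomes an empty string.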

Questions on Conv2d of Transformer Layers

Hello, while examining the code and comparing it with nn.Transformer, I noticed that most of the nn.Linear() operations are replaced with nn.Conv2d(kernel_size=(1,1)) operations.

Is there a benefit to this replacement?
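
For context on the question above: a convolution with a 1x1 kernel applies the same weight matrix to the channel vector at every spatial position, so numerically it matches a linear layer applied per position. The sketch below checks this in plain Python with a naive convolution. All function names are hypothetical and nothing here is taken from the repository. It only illustrates the equivalence and says nothing about why the authors chose this form or whether it affects speed.

```python
def linear(weight, bias, vec):
    """Apply y = W x + b to a single feature vector."""
    return [sum(w * x for w, x in zip(row, vec)) + b
            for row, b in zip(weight, bias)]


def conv2d(kernel, bias, fmap):
    """Naive 2D cross-correlation with stride 1 and no padding.

    kernel: [C_out][C_in][kH][kW], fmap: [C_in][H][W] -> [C_out][H'][W'].
    """
    k_h, k_w = len(kernel[0][0]), len(kernel[0][0][0])
    out_h = len(fmap[0]) - k_h + 1
    out_w = len(fmap[0][0]) - k_w + 1
    out = []
    for o, filt in enumerate(kernel):
        plane = []
        for i in range(out_h):
            row = []
            for j in range(out_w):
                acc = bias[o]
                for c, channel in enumerate(fmap):
                    for di in range(k_h):
                        for dj in range(k_w):
                            acc += filt[c][di][dj] * channel[i + di][j + dj]
                row.append(acc)
            plane.append(row)
        out.append(plane)
    return out


def linear_per_position(weight, bias, fmap):
    """Apply the same linear map to the channel vector at every position."""
    h, w = len(fmap[0]), len(fmap[0][0])
    out = [[[0] * w for _ in range(h)] for _ in weight]
    for i in range(h):
        for j in range(w):
            pixel = [channel[i][j] for channel in fmap]
            for o, value in enumerate(linear(weight, bias, pixel)):
                out[o][i][j] = value
    return out


W = [[1, 2], [3, 4], [5, 6]]                 # 3 output channels, 2 input channels
b = [0, 1, 2]
fmap = [[[1, 2], [3, 4]], [[5, 6], [7, 8]]]  # 2 channels, 2x2 spatial grid

# Reshape the weight matrix into 1x1 kernels: [C_out][C_in][1][1].
kernel_1x1 = [[[[w]] for w in row] for row in W]

conv_out = conv2d(kernel_1x1, b, fmap)
linear_out = linear_per_position(W, b, fmap)
print(conv_out == linear_out)
```

The practical difference is therefore the tensor layout each layer expects (channels before the spatial dimensions for a convolution, features last for a linear layer), not the computed values.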

Accuracy is lower than other models

I tried to train your model, but the accuracy I got is lower than that of other transformer models. Could you please let me know how I can get higher accuracy?

Inference Time

I tried your network and got good results, but I ran into a problem with inference speed. Could you please let me know how I can increase the recognition speed?

Can it work with phone numbers?

I trained the model on billboards, but at inference time it doesn't work well on sequences of digits such as phone numbers. Can you help me? Thanks very much.
