
deepinsight / insightocr


107.0 107.0 42.0 39.16 MB

MXNet OCR implementation. Including text recognition and detection.

License: MIT License

Python 100.00%

crnn mxnet ocr text-recognition


insightocr's Issues

lstm problem

There's a line in lstm.py:

last_states.append(LSTMState(c=mx.sym.Variable("l%d_init_c" % i), h=mx.sym.Variable("l%d_init_h" % i)))

I found that the variables c and h are never initialized or set. So why does the symbol still work?

Unable to train insightocr on VGG_Text data

Hi,
I am not able to train insightocr using the train_crnn.py file.
I have downloaded the Synthetic Word Dataset (10 GB) from VGG_Text.
Then I made the following changes:

In config.py,
default.dataset = 'vgg'
dataset.vgg.dataset_path = 'path of vgg dataset'
In train_crnn.py,
line 70: image_set='annotation_train'
line 71: image_set='annotation_val'

The annotation_train file contains data in the following format:

./2425/1/115_Lube_45484.jpg 45484
image_name index

These indices are mapped to labels in the lexicon.txt file. But nowhere in the script do we pass the path to lexicon.txt, so how is the model going to train?

Kindly suggest how to train it on the VGG_Text dataset.

Also, what is config.image_path in config.py?
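
For readers following this issue, here is a minimal sketch of how one annotation line could be turned into a label. It is not the repository's own loader. It assumes the file name follows the pattern serial_word_index.jpg, as the sample line above suggests, and that lexicon.txt holds one word per line so that the index is a 0-based line number. The helper names parse_annotation_line and lookup_word are hypothetical, and the two-word lexicon is toy data.

```python
import os


def parse_annotation_line(line):
    """Split 'relative/path.jpg index' into (path, lexicon_index, word_in_name)."""
    path, index_text = line.strip().rsplit(" ", 1)
    stem = os.path.splitext(os.path.basename(path))[0]  # e.g. '115_Lube_45484'
    # Assumed naming pattern: <serial>_<word>_<lexicon index>
    _serial, rest = stem.split("_", 1)
    word, _index_in_name = rest.rsplit("_", 1)
    return path, int(index_text), word


def lookup_word(lexicon_words, index):
    """Assumption: the word for index i is simply entry i (0-based) of lexicon.txt."""
    return lexicon_words[index].strip()


if __name__ == "__main__":
    path, index, word = parse_annotation_line("./2425/1/115_Lube_45484.jpg 45484")
    print(path, index, word)

    toy_lexicon = ["alpha\n", "lube\n"]  # toy stand-in for the lines of lexicon.txt
    print(lookup_word(toy_lexicon, 1))
```

If the naming assumption holds, the label can be read from the file name alone, which would explain a training script that never opens lexicon.txt. Whether train_crnn.py actually does this is not confirmed here.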

Have you tried MXNet's native CTC?

image data preprocess

Hi,

About the code in your crnn/data.py (lines 121-122):

_data -= 127.5
_data *= 0.0078125

Could you please explain what you mean by this? Thanks.
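
As a worked illustration of the arithmetic in those two lines (a sketch, assuming _data holds 8-bit pixel values in the range 0 to 255): subtracting 127.5 centers the values around zero, and 0.0078125 equals 1/128, so the result lands just inside the interval (-1, 1). The function name normalize_pixel is hypothetical and works on a single value rather than an array.

```python
def normalize_pixel(value):
    """Map an 8-bit pixel value (0..255) to just inside (-1, 1)."""
    centered = value - 127.5      # shift so that mid-gray sits at 0
    return centered * 0.0078125   # 0.0078125 == 1 / 128


if __name__ == "__main__":
    for v in (0, 127.5, 255):
        print(v, "->", normalize_pixel(v))
    # 0 -> -0.99609375
    # 127.5 -> 0.0
    # 255 -> 0.99609375
```

Scaling inputs to a small, zero-centered range like this is a common preprocessing step for neural networks. The issue does not state the author's own reason for these particular constants.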

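Does this library plan to support images from specific scenarios?

For example, OCR for ID documents.

acc is always 0

Hello, I retrained the model with your code. In my experiments, apart from the networks with LSTM, I tried the networks without LSTM, such as simplenet, resnet, and mobilenet, and acc stayed at 0 throughout training. Could you tell me what the cause is? Thanks.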

Is 4x1 pooling better than the original?

I found you have changed the CNN part of CRNN to use 4x1 pooling. Is it better?

Pretrained model, detection and inference code

This project is great.

When will everything be ready? Thanks.
