crnn.gluon's Introduction

I'm WenmuZhou, a deep learning alchemist 👨‍💻 working remotely since 2016 🚀

crnn.gluon's People

Contributors

wenmuzhou

crnn.gluon's Issues

Large gap between train-val vs test accuracies.

Hello,

I trained the model on the 8M synthetic dataset by Max Jaderberg et al. for 20 epochs on 4 GPUs in parallel (total batch size of 512). Towards the end of training, the training accuracy was around 95% and the validation accuracy (on the ICDAR'13 train set) was around 82%. However, when I test the model independently using 'predict.py' on the ICDAR'13 test set, I get only around 53% accuracy. Do you know why this could be happening? Could the batch size somehow be affecting the results at inference time?

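Recognizing vertical text: what adjustments are needed?

The paper says this model was designed for horizontal text. If we want to recognize vertical text, what adjustments should we make?

  1. Do nothing
  2. Modify the network architecture
  3. Generate vertical text strings, then rotate them by 90 degrees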
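
Option 3 can be sketched as follows. This is a minimal illustration under assumptions, not code from the repository: the image is taken to be a grayscale crop stored as a list of pixel rows, and `rotate_ccw` is a hypothetical helper. On a NumPy array, `numpy.rot90` performs the same counterclockwise rotation.

```python
def rotate_ccw(image):
    """Rotate a 2-D image (list of pixel rows) 90 degrees counterclockwise."""
    # Transposing turns columns into rows; reversing the row order
    # completes the counterclockwise turn.
    return [list(row) for row in zip(*image)][::-1]


# A tall 4x2 crop (vertical text) becomes a wide 2x4 crop.
vertical = [
    [1, 2],
    [3, 4],
    [5, 6],
    [7, 8],
]
horizontal = rotate_ccw(vertical)
```

With a counterclockwise rotation, the top of the vertical crop ends up on the left, so the top-to-bottom reading order becomes left-to-right. The characters themselves lie on their side afterwards, so the model would have to be trained on samples rotated the same way.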

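Training a Chinese model: roughly how many images are needed?

Hello, thank you for your code.
To train a recognizer that handles both Traditional and Simplified Chinese,
roughly how many samples are needed?

I prepared about 1.7 million Chinese sentences (no spaces): about 850,000 are Traditional and 850,000 are Simplified (converted from the Traditional ones),
10,000 number strings with lengths in [4, 40] (including the characters [-,]),
and about 300,000 English sentences, containing spaces, with lengths in [4, 40].
That is about two million samples in total.
Is that enough?

All training samples are generated from font files such as TTF (I collected 83 fonts); one font is chosen at random when each sample is generated.
The dictionary has 8,227 characters. I removed the space from the dictionary because another post mentioned that training results are poor when the space is included.
However, I did not remove the spaces from the labels. Should I remove them?

Edit: I checked your image generation code; neither the generated images nor the labels should contain spaces.
Edit2: My Chinese and English strings come from real text rather than being sampled at random from the dictionary. With random sampling, every character would appear with equal probability. Would random sampling be better?
Edit3: I set num_label of ImageDataset to 40 because the longest string has only 40 characters. How does this affect the training result?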
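
The label checks raised in this issue (no spaces, every character present in the dictionary, length within the num_label limit) can be sketched as below. This is an illustrative helper only: `clean_label`, its parameters, and the toy alphabet are assumptions, not part of the repository.

```python
def clean_label(label, alphabet, max_len=40):
    """Strip spaces, then validate a label against the alphabet and length limit."""
    text = label.replace(" ", "")
    # Any character outside the alphabet cannot be encoded for training.
    unknown = sorted(set(text) - set(alphabet))
    if unknown:
        raise ValueError(f"characters not in alphabet: {unknown}")
    if len(text) > max_len:
        raise ValueError(f"label has {len(text)} characters, limit is {max_len}")
    return text


# Toy alphabet for illustration; the issue's dictionary has 8,227 characters.
alphabet = "abcdefghijklmnopqrstuvwxyz0123456789"
cleaned = clean_label("hello world 42", alphabet)
```

Running a check like this over the whole label file before training surfaces mismatches between the labels and the dictionary early, instead of during encoding.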

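The code runs without errors, but the loss stays very high

Hello, I used your code as a reference and trained on an English dataset that I generated myself. The loss stays above 100 and does not decrease. Is this the code you ran with good results?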

Training data missing

Thanks for posting this example. It is most helpful.

Would you consider adding some sample training data and data files to this project, such as the ones referenced by the default paths /data/zhy/crnn/Chinese_character/test2.txt and /data/zhy/crnn/Chinese_character/train2.txt? This would make it much easier for someone like me to adapt your excellent project to their own needs.

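Training and inference with variable-length input

1. Does the model support training and inference with variable-length input?
2. How long does training take to converge? I reimplemented a version myself and found that it converges slowly, with acc stuck at 0. In another answer you mentioned that it converges in 30 epochs. Roughly what order of magnitude is the loss at convergence?
3. Also, the symbol version can converge in one epoch. Why is the gluon version so slow?
Thanks.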
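
For question 1, one common way to feed variable-length text lines to a CRNN-style model is to resize every image to a fixed height while keeping its aspect ratio, then pad each batch to the widest image. Whether this repository does so is not stated here, so the sketch below is an assumption; `scaled_width`, `pad_widths`, and the target height of 32 are illustrative choices.

```python
import math


def scaled_width(orig_w, orig_h, target_h=32):
    """Width that preserves the aspect ratio when height becomes target_h."""
    return max(1, math.ceil(orig_w * target_h / orig_h))


def pad_widths(sizes, target_h=32):
    """Return the scaled width of each image and the padded batch width."""
    widths = [scaled_width(w, h, target_h) for w, h in sizes]
    return widths, max(widths)


# (width, height) of three hypothetical crops
widths, batch_w = pad_widths([(100, 50), (320, 32), (45, 64)])
```

Each image would then be resized to its scaled width and padded on the right up to `batch_w` before being stacked into a batch.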

accuracy is zero even after 640 epochs

Hi,
I am trying to train an OCR model, but the accuracy is always zero.
The dataset is ICDAR2013.
It contains only letters, digits, and common symbols. I created a charset as shown below.

all_english = no + """abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ,.<>/?;:"[]{}!@#$%^&*()-=+|""" + BLANK_SYMBOL

The GT file looks like this:

./data/test/word_1.png|||PROPER
./data/test/word_2.png|||FOOD
./data/test/word_3.png|||PRONTO
./data/test/word_4.png|||professional
./data/test/word_5.png|||Java
./data/test/word_6.png|||Web
./data/test/word_7.png|||Services
./data/test/word_8.png|||go
./data/test/word_9.png|||SONY

I also modified line 28 of dataset.py:

line = line.strip('\n').split('|||')

The training log is attached.

train.zip
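
A quick sanity check for a setup like the one in this issue is to parse the ground-truth file with the same `|||` separator and list any label characters that the charset cannot encode. The sketch below is illustrative only: `parse_gt_line` and `uncovered_chars` are hypothetical helpers, and because `no` and `BLANK_SYMBOL` are not defined in the issue, a stand-in charset of digits and ASCII letters is used.

```python
def parse_gt_line(line, sep="|||"):
    """Split one ground-truth line into (image_path, label)."""
    path, label = line.rstrip("\n").split(sep, 1)
    return path, label


def uncovered_chars(lines, charset):
    """Collect label characters that are missing from the charset."""
    missing = set()
    for line in lines:
        _, label = parse_gt_line(line)
        missing |= set(label) - set(charset)
    return missing


# Stand-in charset (assumption): digits plus ASCII letters only.
charset = "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ"
gt_lines = [
    "./data/test/word_1.png|||PROPER\n",
    "./data/test/word_5.png|||Java\n",
    "./data/test/word_x.png|||don't\n",  # hypothetical line, not from the issue
]
missing = uncovered_chars(gt_lines, charset)
```

An empty result means every label can be encoded; anything else points to characters that need to be added to the charset or filtered out of the labels.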
