wenmuzhou / crnn.gluon
A gluon re-implementation of Convolutional recurrent network
License: Apache License 2.0
Hello,
I trained the model on the 8M synthetic dataset by Max Jaderberg et al. for 20 epochs on 4 GPUs in parallel (total batch size of 512). Towards the end of training, the training accuracy was around 95% and the validation accuracy (on the ICDAR'13 train set) was around 82%. However, when I test the model independently using 'predict.py' on the ICDAR'13 test set, I get around 53% accuracy. Do you know why this could be happening? Is the batch size somehow messing this up at inference time?
The paper says this model is designed for horizontal text. What adjustments should we make if we want to recognize vertical text?
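Hello, thanks for your code.
Roughly how many samples are needed to train a recognizer that handles both Traditional and Simplified Chinese?
I have prepared about 1.7 million Chinese sentences (no spaces): about 850,000 are Traditional and 850,000 are Simplified (converted from the Traditional ones),
10,000 number strings with lengths in [4,40] (including the characters [-,]),
and about 300,000 English sentences, containing spaces, with lengths in [4,40].
That is about two million samples in total.
Is this enough?
All training samples are rendered from font files such as ttf (I collected 83 of them); one font is chosen at random when each sample is generated.
The dictionary has 8227 characters. I removed the space from the dictionary because one reply mentioned that training works poorly when spaces are included.
However, I did not remove the spaces from the labels. Should I remove them?
Edit: I checked your image-generation code; neither the generated images nor the labels should contain spaces.
Edit2: My Chinese/English strings come from real text rather than being sampled at random from the dictionary. With random sampling, every character would appear with equal probability. Would random sampling be better?
Edit3: I set num_label of ImageDataset to 40 because the longest string has length 40. How does this affect the training results?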
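The mismatch described above (spaces removed from the dictionary but still present in the labels) and the label-length limit can both be checked before training. The following is a minimal sketch, not code from crnn.gluon: the helper names `clean_label`, `fits_num_label` and `min_ctc_steps` are hypothetical, and it assumes that `num_label` is the fixed length labels are padded to. The last helper relies on a general property of CTC: the output sequence needs at least one time step per character plus one blank between each pair of identical adjacent characters.

```python
def clean_label(label, alphabet):
    """Drop every character (spaces included) that is not in the alphabet."""
    known = set(alphabet)
    return "".join(ch for ch in label if ch in known)


def fits_num_label(label, num_label):
    """True if the label fits into num_label slots (assumed padding length)."""
    return len(label) <= num_label


def min_ctc_steps(label):
    """Minimum number of output time steps CTC needs to emit this label."""
    # Identical neighbours must be separated by a blank, costing one extra step.
    repeats = sum(1 for a, b in zip(label, label[1:]) if a == b)
    return len(label) + repeats


# Hypothetical tiny alphabet without the space character.
alphabet = "abcdefghijklmnopqrstuvwxyz"
label = clean_label("hello world", alphabet)
print(label)                      # helloworld
print(fits_num_label(label, 40))  # True
print(min_ctc_steps(label))       # 11 (10 characters + 1 for the double "l")
```

If the network produces fewer time steps than `min_ctc_steps` returns for some label, CTC cannot align that sample, so it is worth comparing this number with the width of the feature sequence for the longest strings.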
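Hello, I followed your code and trained on an English dataset that I generated myself. The loss stays above 100 and does not decrease. Is this code that you have actually run with good results?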
Why is the accuracy of my training always zero?
Thanks for posting this example. It is most helpful.
I wonder if you might consider also adding some sample training data and data files to this project, such as those in the default files /data/zhy/crnn/Chinese_character/test2.txt and /data/zhy/crnn/Chinese_character/train2.txt. This would make it much easier for someone like me who would like to adopt your excellent project to his or her own needs.
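1. Does the model support variable-length input for training and inference?
2. How long does training take to converge? I reproduced a version myself and ran into slow convergence, with acc staying at 0. In another answer you mentioned that it converges after 30 epochs; roughly what order of magnitude is the loss at convergence?
3. Also, the symbol version converges in one epoch, so why is the gluon version so slow?
Thanks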
Hi,
I was trying to train an OCR task, but the acc is always zero.
The dataset is ICDAR2013.
It contains only letters, digits and common characters. I created a charset as below.
all_english = no + """abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ,.<>/?;:"[]{}!@#$%^&*()-=+|""" + BLANK_SYMBOL
The GT file looks like this:
./data/test/word_1.png|||PROPER
./data/test/word_2.png|||FOOD
./data/test/word_3.png|||PRONTO
./data/test/word_4.png|||professional
./data/test/word_5.png|||Java
./data/test/word_6.png|||Web
./data/test/word_7.png|||Services
./data/test/word_8.png|||go
./data/test/word_9.png|||SONY
and I modified line 28 in dataset.py:
line = line.strip('\n').split('|||')
The attached file is the training log.
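One thing that can be checked independently of the model is whether every character in the GT labels is actually covered by the charset. Below is a minimal sketch of such a check, using the `|||` separator from the post. The function names are hypothetical, the digits are only an assumption for what `no` contains, the blank symbol is left out, and the third sample line is made up for illustration.

```python
def parse_gt_line(line, sep="|||"):
    """Split one ground-truth line into (image_path, label)."""
    path, label = line.rstrip("\n").split(sep, 1)
    return path, label


def find_unknown_chars(lines, alphabet, sep="|||"):
    """Return the set of label characters that are missing from the alphabet."""
    known = set(alphabet)
    unknown = set()
    for line in lines:
        if not line.strip():
            continue  # skip empty lines
        _, label = parse_gt_line(line, sep)
        unknown.update(ch for ch in label if ch not in known)
    return unknown


# Assumed stand-in for the charset: digits for `no`, then letters only.
ALPHABET = "0123456789" + "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ"

sample = [
    "./data/test/word_1.png|||PROPER\n",
    "./data/test/word_5.png|||Java\n",
    "./data/test/word_x.png|||it's\n",  # made-up line with an apostrophe
]
print(find_unknown_chars(sample, ALPHABET))  # {"'"}
```

Running the same check over the real GT file with the real charset would show whether any label contains a character the model has no class for.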