meijieru / crnn.pytorch
Convolutional recurrent network in pytorch
License: MIT License
I want some data about running on the CPU. How can I turn off CUDA?
Hello,
When I run convert_dataset.py, I get the following error:
python2 convert_dataset.py
Origin dataset has 23920 samples
Traceback (most recent call last):
File "convert_dataset.py", line 57, in <module>
convert('/home/ahmed/Downloads/sample/output', '/home/ahmed/Downloads/sample/o_train')
File "convert_dataset.py", line 25, in convert
label = originDataset.getLabel(i + 1)
AttributeError: 'lmdbDataset' object has no attribute 'getLabel'
Indeed, there is no getLabel in dataset.py.
How can I fix that?
Is the image name the label? Can you show me the training data format?
Is it open source? Which license?
Hi, when I train my own model on the CPU, it runs normally and prints the loss. However, if I use the GPU, the system just gets stuck: after printing the model configuration it does not print anything else. Do you know what the problem might be?
When I try to launch demo.py I get this error:
Traceback (most recent call last):
File "demo.py", line 30, in
preds = preds.squeeze(2)
File "/home/ubuntu-andrea/pytorchGPU3/lib/python3.5/site-packages/torch/autograd/variable.py", line 717, in squeeze
return Squeeze.apply(self, dim)
File "/home/ubuntu-andrea/pytorchGPU3/lib/python3.5/site-packages/torch/autograd/_functions/tensor.py", line 375, in forward
result = input.squeeze(dim)
RuntimeError: dimension out of range (expected to be in range of [-2, 1], but got 2)
What could be the problem? @meijieru
Hello,
I'm wondering whether the CRNN can also output the probability of each sequence.
For example:
--h-e--ll-oo- => 'hello' with a probability of 0.89
How can I get that?
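One way to attach a score to a decoded sequence is to multiply the per-frame probabilities along the greedy (best) path. The sketch below is a minimal illustration in plain Python, assuming the network output has already been converted to per-frame probabilities with a softmax and that index 0 is the CTC blank; the function name and the toy alphabet are hypothetical. This is the probability of a single alignment, which is only a lower bound on the full CTC probability of the string (the sum over all alignments).

```python
import math

def greedy_decode_with_score(frame_probs, alphabet, blank=0):
    """Greedy CTC decoding that also returns the best-path probability.

    frame_probs: list of per-frame probability lists (each sums to 1).
    alphabet: characters for class indices 1..len(alphabet); `blank` is the CTC blank.
    """
    log_score = 0.0
    chars = []
    prev = blank
    for probs in frame_probs:
        best = max(range(len(probs)), key=probs.__getitem__)
        log_score += math.log(probs[best])
        # Collapse repeated labels and drop blanks, as CTC decoding does.
        if best != blank and best != prev:
            chars.append(alphabet[best - 1])
        prev = best
    return "".join(chars), math.exp(log_score)

# Toy example: 3 classes (blank, 'h', 'i') over 4 frames.
frames = [
    [0.1, 0.8, 0.1],    # 'h'
    [0.1, 0.8, 0.1],    # 'h' again (repeat, collapsed)
    [0.9, 0.05, 0.05],  # blank
    [0.2, 0.1, 0.7],    # 'i'
]
text, score = greedy_decode_with_score(frames, "hi")
print(text, round(score, 4))
```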
When I run "python demo.py", it fails with this error:
loading pretrained model from ./data/crnn.pth
Traceback (most recent call last):
File "demo.py", line 30, in
preds = preds.squeeze(2)
File "/usr/local/lib/python2.7/dist-packages/torch/autograd/variable.py", line 704, in squeeze
return Squeeze.apply(self, dim)
File "/usr/local/lib/python2.7/dist-packages/torch/autograd/_functions/tensor.py", line 341, in forward
result = input.squeeze(dim)
RuntimeError: dimension out of range (expected to be in range of [-2, 1], but got 2)
Could you help me solve this problem?
In the main function, line 91 is text = torch.IntTensor(opt.batchSize * 5). Can you explain what this line means?
During training we use (128, 32). Why is (100, 32) used at test time?
Running crnn_main.py with --cuda raises an error.
Error content: loss_func = warp_ctc.gpu_ctc if is_cuda else warp_ctc.cpu_ctc
AttributeError: 'module' object has no attribute 'gpu_ctc'.
How can I deal with this?
Hello,
I'm trying to create my own dataset to train my model from scratch. I'm working with Python 3.5.2. While creating the dataset I encountered the following problems.
1)
with open(imagePath, 'r') as f:
imageBin = f.read()
returns the following error :
codecs.py", line 321, in decode
(result, consumed) = self._buffer_decode(data, self.errors, final)
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xff in position 0: invalid start byte
I fixed it as follows:
with open(imagePath, 'rb') as f:
imageBin = f.read()
2) For Python 3,
for i in xrange(nSamples):
should be replaced by
for i in range(nSamples):
and, in the function def writeCache(env, cache):
for k, v in cache.iteritems():
    txn.put(k, v)
should be replaced by
for k, v in cache.items():
    txn.put(k, v)
However, writeCache then fails with:
line 86, in createDataset
writeCache(env, cache)
File "/home/ahmed/Downloads/crnn.pytorch-master/data-processing.py", line 46, in writeCache
txn.put(k, v)
TypeError: Won't implicitly convert Unicode to bytes; use .encode()
Here is my whole code :
import os
import lmdb # install lmdb by "pip install lmdb"
import cv2
import numpy as np
import glob
real_path='/home/ahmed/Downloads/sample/train/'
path='/home/ahmed/Downloads/sample/'
path_train='train/'
path_output='/home/ahmed/Downloads/sample/output/'
os.chdir(path+path_train)
images_train = glob.glob("*.jpg")
left_train,labels_train,right_train = list(zip(*[os.path.splitext(x)[0].split('_')
                                                 for x in images_train]))
def checkImageIsValid(imageBin):
    if imageBin is None:
        return False
    imageBuf = np.fromstring(imageBin, dtype=np.uint8)
    img = cv2.imdecode(imageBuf, cv2.IMREAD_GRAYSCALE)
    imgH, imgW = img.shape[0], img.shape[1]
    if imgH * imgW == 0:
        return False
    return True
def writeCache(env, cache):
    with env.begin(write=True) as txn:
        for k, v in cache.items():
            txn.put(k, v)
def createDataset(outputPath,images_train, labels_train, lexiconList=None, checkValid=True):
    """
    Create LMDB dataset for CRNN training.
    ARGS:
        outputPath : LMDB output path
        imagePathList : list of image path
        labelList : list of corresponding groundtruth texts
        lexiconList : (optional) list of lexicon lists
        checkValid : if true, check the validity of every image
    """
    assert(len(images_train) == len(labels_train))
    nSamples = len(images_train)
    env = lmdb.open(path_output, map_size=1099511627776)
    cache = {}
    cnt = 1
    for i in range(nSamples):
        imagePath = images_train[i]
        label = labels_train[i]
        if not os.path.exists(real_path+imagePath):
            print('%s does not exist' % imagePath)
            continue
        with open(real_path+imagePath, 'rb') as f:
            imageBin = f.read()
        if checkValid:
            if not checkImageIsValid(imageBin):
                print('%s is not a valid image' % imagePath)
                continue
        imageKey = 'image-%09d' % cnt
        labelKey = 'label-%09d' % cnt
        cache[imageKey] = imageBin
        cache[labelKey] = label
        if lexiconList:
            lexiconKey = 'lexicon-%09d' % cnt
            cache[lexiconKey] = ' '.join(lexiconList[i])
        if cnt % 1000 == 0:
            writeCache(env, cache)
            cache = {}
            print('Written %d / %d' % (cnt, nSamples))
        cnt += 1
    nSamples = cnt-1
    cache['num-samples'] = str(nSamples)
    writeCache(env, cache)
    print('Created dataset with %d samples' % nSamples)
if __name__ == '__main__':
    createDataset(path_output,images_train, labels_train)
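The TypeError reported above arises because, under Python 3, lmdb expects bytes for both keys and values, while the cache holds str keys and str labels. One possible fix is to encode the str entries before calling txn.put. The following is a minimal sketch; the helper name encode_cache is hypothetical and UTF-8 is an assumed encoding.

```python
def encode_cache(cache, encoding="utf-8"):
    """Return a copy of `cache` whose str keys and values are encoded to bytes.

    Values that are already bytes (e.g. raw image data) are left untouched.
    """
    def to_bytes(item):
        return item.encode(encoding) if isinstance(item, str) else item

    return {to_bytes(k): to_bytes(v) for k, v in cache.items()}

cache = {
    "image-000000001": b"\xff\xd8\xff",  # raw JPEG bytes stay as they are
    "label-000000001": "hello",
    "num-samples": "1",
}
encoded = encode_cache(cache)
print(encoded)
```

Inside writeCache, the loop would then iterate over encode_cache(cache).items() before calling txn.put(k, v).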
I have looked at the values of variable preds after executing "preds = model(image)" in demo.py. The values are the following:
Variable containing:
(0 ,.,.) =
-106.3455 -115.3943 -114.5584 ... -115.6788 -110.2145 -112.2794
(1 ,.,.) =
-67.3953 -92.3248 -92.7227 ... -88.7459 -81.5368 -88.8212
(2 ,.,.) =
-56.8008 -89.8197 -92.8852 ... -85.0180 -77.4713 -85.3732
...
(35,.,.) =
-38.8606 -79.6700 -81.3100 ... -71.8229 -57.3992 -68.8093
(36,.,.) =
-39.6410 -75.7699 -75.7648 ... -70.3662 -55.7602 -68.3655
(37,.,.) =
-45.2289 -77.6819 -77.0921 ... -73.9425 -59.1527 -70.6211
[torch.cuda.FloatTensor of size 38x1x37 (GPU 0)]
As I understand it, these values should represent a sequence of probabilities over the classes, so I'm wondering why they are negative, like the ones reported here. @meijieru
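Values like these are typical of raw, unnormalised scores, which need not lie in [0, 1]; a softmax over the class dimension turns each frame's scores into probabilities. The sketch below is in plain Python with made-up numbers. That this model's output should be read as unnormalised scores is an assumption, not something confirmed in this thread.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of raw scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]  # subtract the max to avoid overflow
    total = sum(exps)
    return [e / total for e in exps]

frame_scores = [-38.9, -79.7, -81.3, -57.4]  # hypothetical scores for one frame
probs = softmax(frame_scores)
print(probs)
print(sum(probs))
```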
What should I do?
I run crnn_main.py with this command:
python crnn_main.py --trainroot="/home/wangjianbo_i/OCR/data/IIIT5K/traindatalmdb" --valroot="/home/wangjianbo_i/OCR/data/IIIT5K/testdatalmdb" --cuda --alphabet="0123456789abcdefghijklmnopkrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ,;:" --imgH=32
The error message is shown below. I tried running it on the company's server to rule out an environment problem, but got the same result. How can I solve this problem? :-D
Hi Jieru,
Thanks for your previous answer. I managed to create my own dataset. However, when I ran the "crnn_main.py" file with args specifying the training and testing dataset (with everything else the same as default value), it gave me errors as follows:
Traceback (most recent call last):
File "/home/su/work_space/pycharm-community-2017.1/helpers/pydev/pydevd.py", line 1578, in <module>
globals = debugger.run(setup['file'], None, None, is_module)
File "/home/su/work_space/pycharm-community-2017.1/helpers/pydev/pydevd.py", line 1015, in run
pydev_imports.execfile(file, globals, locals) # execute the script
File "/home/su/work_space/MOCR/crnn.pytorch/crnn_main.py", line 13, in <module>
import dataset
File "/home/su/work_space/MOCR/crnn.pytorch/dataset.py", line 10, in <module>
import sixlmdp_train_data_ordered
ImportError: No module named sixlmdp_train_data_ordered
Does it mean the python module "sixlmdp_train_data_ordered" is somehow missing? Is it a custom module defined by yourself? I searched a bit, but I can't find it anywhere online. Many thanks.
Cheers,
Su
I keep getting this error on running crnn_main.py.
Traceback (most recent call last):
File "crnn_main.py", line 200, in <module>
cost = trainBatch(crnn, criterion, optimizer)
File "crnn_main.py", line 183, in trainBatch
preds = crnn(image)
File "/home/ubuntu/anaconda/lib/python2.7/site-packages/torch/nn/modules/module.py", line 206, in __call__
result = self.forward(*input, **kwargs)
File "/home/ubuntu/crnn.pytorch/models/crnn.py", line 80, in forward
assert h == 1, "the height of conv must be 1"
AssertionError: the height of conv must be 1
Hello,
To run convert_dataset.py, I give it the path to the output of create_dataset.py, which consists of data.mdb and lock.mdb.
When I execute convert_dataset.py with
if __name__ == "__main__":
convert('/home/ahmed/Downloads/sample/data/training', '/home/ahmed/Downloads/sample/data/training-ordered')
I get the following error:
Traceback (most recent call last):
File "/home/ahmed/Downloads/crnn.pytorch-master/tool/convert_dataset.py", line 55, in <module>
convert('/home/ahmed/Downloads/sample/data/training', '/home/ahmed/Downloads/sample/data/data-ordered')
File "/home/ahmed/Downloads/crnn.pytorch-master/tool/convert_dataset.py", line 17, in convert
originDataset = dataset.lmdbDataset(originPath, 'abc', *args)
TypeError: __init__() takes from 1 to 4 positional arguments but 9 were given
I use the ICDAR dataset to train CRNN (PyTorch). Here is the problem:
Both training loss and test loss decrease normally, but the network cannot predict anything!
Has anybody encountered a similar problem?
Here is a screenshot of my result:
Start val
-------------------------- => , gt: Oleg
-------------------------- => , gt: P
-------------------------- => , gt: CASTLE
-------------------------- => , gt: HIGH
-------------------------- => , gt: GYM
-------------------------- => , gt: FOR
-------------------------- => , gt: COOLING
-------------------------- => , gt: only
-------------------------- => , gt: Refill
-------------------------- => , gt: GRADUATE
Test loss: 19.685625, accuray: 0.000000
[7/25][60/60] Loss: 19.398106
[8/25][20/60] Loss: 19.721890
[8/25][40/60] Loss: 19.135641
Start val
-------------------------- => , gt: Hypertension:
-------------------------- => , gt: ntl:
-------------------------- => , gt: ANCIENT
-------------------------- => , gt: PanaSyncE70
-------------------------- => , gt: Specialists
-------------------------- => , gt: 2N
-------------------------- => , gt: EU-funded
-------------------------- => , gt: Dez.
-------------------------- => , gt: Guidex
-------------------------- => , gt: TAXI
Test loss: 19.580940, accuray: 0.000000
[8/25][60/60] Loss: 20.216869
[9/25][20/60] Loss: 18.244850
[9/25][40/60] Loss: 19.222029
Start val
-------------------------- => , gt: Connells
-------------------------- => , gt: 2010
-------------------------- => , gt: &
-------------------------- => , gt: GENTLEMEN
-------------------------- => , gt: WELCOME
-------------------------- => , gt: STUDIO
-------------------------- => , gt: DOLLAR
-------------------------- => , gt: SPENCER
-------------------------- => , gt: Systems
-------------------------- => , gt: NIGHT
Test loss: 19.106692, accuray: 0.000000
[9/25][60/60] Loss: 18.840774
Hello,
I noticed that all the images are constrained to be the same size.
1- In demo.py: transformer = dataset.resizeNormalize((100, 32))
The image img_path = '/demo.png' of the sequence "available" has dimensions (184, 72); when we rescale it to (100, 32), the amount of lost information is negligible. But when images have variable sizes far from 100 and 32, this will be problematic. How did you solve the problem in this case?
2- The same question as in (1-), for transform=dataset.resizeNormalize((128, 32)) in crnn_main.py. Why should we resize all the images to (128, 32)?
I agree with height=32, because it makes it possible to tell apart, for example, 'i' and 'l', so the height is important for discriminating between characters. But I don't understand why the width should be 128.
3- Should the parameter transform=dataset.resizeNormalize((128, 32)) in crnn_main.py remain fixed at (128, 32) for the architecture to work, or can it be changed to dimensions that fit our data?
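One common alternative to a fixed (width, height) is to fix only the height and scale the width by the same factor, so that the aspect ratio is preserved. The arithmetic can be sketched as follows; the function name and the minimum-width floor are assumptions for illustration, not the repository's code.

```python
def keep_ratio_size(width, height, target_height=32, min_width=1):
    """Return (new_width, target_height) preserving the aspect ratio."""
    scale = target_height / float(height)
    new_width = max(min_width, int(round(width * scale)))
    return new_width, target_height

print(keep_ratio_size(184, 72))  # the demo image from the question
print(keep_ratio_size(400, 32))  # already 32 pixels high, width unchanged
```

Images in the same batch would still need a common width, for example by padding to the widest one.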
I am getting the following error while running demo.py for CRNN inference. Can you please help me?
loading pretrained model from ./data/crnn.pth
Traceback (most recent call last):
File "demo.py", line 27, in <module>
preds = model(image)
File "/home/anirban/python/local/lib/python2.7/site-packages/torch/nn/modules/module.py", line 206, in __call__
result = self.forward(*input, **kwargs)
File "/home/anirban/crnn_inference/models/crnn.py", line 86, in forward
conv = nn.parallel.data_parallel(self.cnn, input, gpu_ids)
File "/home/anirban/python/local/lib/python2.7/site-packages/torch/nn/parallel/data_parallel.py", line 96, in data_parallel
output_device = device_ids[0]
TypeError: 'NoneType' object has no attribute '__getitem__'
@meijieru
Thanks for this package. I want to train it on a different dataset, but when I use /tools/create_dataset.py, I get nothing. Please help me with this, or suggest another way to do it.
My OS is CentOS Linux release 7.2.1511 (Core), x86_64.
My server has four Tesla P40 GPU cards.
The command is:
python crnn_main.py --trainroot /home/xuliang/CRNN_org/crnn/tool/synth90k_train_sort --valroot /home/xuliang/CRNN_org/crnn/tool/synth90k_val_sort --cuda --ngpu 4 --adadelta --keep_ratio --random_sample
And the error message is:
Namespace(adadelta=True, adam=False, alphabet='0123456789abcdefghijklmnopqrstuvwxyz', batchSize=64, beta1=0.5, crnn='', cuda=True, displayInterval=500, experiment=None, imgH=32, imgW=100, keep_ratio=True, lr=0.01, n_test_disp=10, ngpu=4, nh=256, niter=25, random_sample=True, saveInterval=500, trainroot='/home/xuliang/CRNN_org/crnn/tool/synth90k_train_sort', valInterval=500, valroot='/home/xuliang/CRNN_org/crnn/tool/synth90k_val_sort', workers=2)
Random Seed: 7716
CRNN (
(cnn): Sequential (
(conv0): Conv2d(1, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(relu0): ReLU (inplace)
(pooling0): MaxPool2d (size=(2, 2), stride=(2, 2), dilation=(1, 1))
(conv1): Conv2d(64, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(relu1): ReLU (inplace)
(pooling1): MaxPool2d (size=(2, 2), stride=(2, 2), dilation=(1, 1))
(conv2): Conv2d(128, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(batchnorm2): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True)
(relu2): ReLU (inplace)
(conv3): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(relu3): ReLU (inplace)
(pooling2): MaxPool2d (size=(2, 2), stride=(2, 1), dilation=(1, 1))
(conv4): Conv2d(256, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(batchnorm4): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True)
(relu4): ReLU (inplace)
(conv5): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(relu5): ReLU (inplace)
(pooling3): MaxPool2d (size=(2, 2), stride=(2, 1), dilation=(1, 1))
(conv6): Conv2d(512, 512, kernel_size=(2, 2), stride=(1, 1))
(batchnorm6): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True)
(relu6): ReLU (inplace)
)
(rnn): Sequential (
(0): BidirectionalLSTM (
(rnn): LSTM(512, 256, bidirectional=True)
(embedding): Linear (512 -> 256)
)
(1): BidirectionalLSTM (
(rnn): LSTM(256, 256, bidirectional=True)
(embedding): Linear (512 -> 37)
)
)
)
[0/25][500/112885] Loss: 16.009935
Start val
Traceback (most recent call last):
File "crnn_main.py", line 207, in
val(crnn, test_dataset, criterion)
File "crnn_main.py", line 158, in val
sim_preds = converter.decode(preds.data, preds_size.data, raw=False)
File "/home/xuliang/CRNN_pytorch_v2/crnn.pytorch/utils.py", line 51, in decode
t[index:index + l], torch.IntTensor([l]), raw=raw))
ValueError: result of slicing is an empty tensor
Thanks for your help.
Hi Jieru,
Sorry for the previous question.
I cloned the code again, and this time I'm getting the "height of conv must be 1" assertion error in crnn.py.
def forward(self, input):
    # conv features
    conv = utils.data_parallel(self.cnn, input, self.ngpu)
    b, c, h, w = conv.size()
    assert h == 1, "the height of conv must be 1"
In the code, the input is of dimension 64x1x64x128 ([torch.cuda.FloatTensor of size 64x1x64x128 (GPU 0)]), and the conv size has become 64x512x3x33 (b=64, c=512, h=3, w=33). Could you please give me any hint on how to resolve it? My training data is a sub-sample of synth90k, so if I am correct, the images should be grayscale. Many thanks.
Cheers,
Su
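The reported height of 3 is consistent with the layer list: the height is halved by each of the four pooling layers and then reduced by one by the final 2x2 convolution without padding. The sketch below reproduces that arithmetic for the height dimension only, assuming the kernel sizes and strides in the model printout elsewhere on this page and no padding along the height in the pooling layers; the function name is hypothetical.

```python
def conv_output_height(img_h):
    """Height of the CNN feature map for an input image of height img_h."""
    h = img_h
    # The 3x3 convolutions use stride 1 and padding 1, so they keep the height.
    for _ in range(4):      # pooling0..pooling3: kernel 2, stride 2 along the height
        h = (h - 2) // 2 + 1
    h = h - 2 + 1           # conv6: 2x2 kernel, stride 1, no padding
    return h

print(conv_output_height(32))
print(conv_output_height(64))
```

With an input height of 32 the assertion h == 1 holds; with a 64-pixel-high input it does not, which matches the 64x512x3x33 size reported above.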
Hi! When I was trying your demo program, I got this error:
loading pretrained model from ./data/crnn.pth
Traceback (most recent call last):
File "demo.py", line 27, in
preds = model(image)
File "/home/rohit/.local/lib/python3.5/site-packages/torch/nn/modules/module.py", line 206, in call
result = self.forward(*input, **kwargs)
File "/home/rohit/Documents/gitprojects/crnn.pytorch/models/crnn.py", line 78, in forward
conv = utils.data_parallel(self.cnn, input, self.ngpu)
AttributeError: module 'utils' has no attribute 'data_parallel'
Could you please tell me how I can fix this?
I am installing SeanNaren/warp-ctc on my MacBook Pro and the build fails. How can I solve it?
➜ pytorch_binding git:(zxdev_pytorch) python setup.py install
CUDA_HOME not found in the environment so building without GPU support. To build with GPU support please define the CUDA_HOME environment variable. This should be a path which contains include/cuda.h
generating build/_warp_ctc.c
regenerated: 'build/_warp_ctc.c'
running install
running build
running build_py
creating build/lib.macosx-10.11-x86_64-2.7
creating build/lib.macosx-10.11-x86_64-2.7/warpctc_pytorch
copying warpctc_pytorch/__init__.py -> build/lib.macosx-10.11-x86_64-2.7/warpctc_pytorch
running build_ext
building 'warpctc_pytorch._warp_ctc' extension
creating build/temp.macosx-10.11-x86_64-2.7
creating build/temp.macosx-10.11-x86_64-2.7/build
creating build/temp.macosx-10.11-x86_64-2.7/Users
creating build/temp.macosx-10.11-x86_64-2.7/Users/zhangxin
creating build/temp.macosx-10.11-x86_64-2.7/Users/zhangxin/github
creating build/temp.macosx-10.11-x86_64-2.7/Users/zhangxin/github/warp-ctc
creating build/temp.macosx-10.11-x86_64-2.7/Users/zhangxin/github/warp-ctc/pytorch_binding
creating build/temp.macosx-10.11-x86_64-2.7/Users/zhangxin/github/warp-ctc/pytorch_binding/src
clang -fno-strict-aliasing -fno-common -dynamic -g -O2 -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -I/usr/local/lib/python2.7/site-packages/torch/utils/ffi/../../lib/include -I/usr/local/lib/python2.7/site-packages/torch/utils/ffi/../../lib/include/TH -I/Users/zhangxin/github/warp-ctc/include -I/usr/local/Cellar/python/2.7.13/Frameworks/Python.framework/Versions/2.7/include/python2.7 -c build/_warp_ctc.c -o build/temp.macosx-10.11-x86_64-2.7/build/_warp_ctc.o -std=c++11 -fPIC
error: invalid argument '-std=c++11' not allowed with 'C/ObjC'
error: command 'clang' failed with exit status 1
Hello,
I tested the pre-trained model (demo.py) on the IIIT5K test dataset mentioned in the paper, as follows.
I took 18 images and wanted to know how long each sequence image prediction takes.
import torch
from torch.autograd import Variable
import utils
import dataset
from PIL import Image
import models.crnn as crnn
from datetime import datetime
import glob
import os
test_path='/data/IIITS5K/'
# model_path was missing from the posted code; this path is taken from the log output below
model_path = '/home/ahmed/Downloads/crnn.pytorch-master/data/crnn.pth'
alphabet = '0123456789abcdefghijklmnopqrstuvwxyz'
model = crnn.CRNN(32, 1, 37, 256, 1).cuda()
print('loading pretrained model from %s' % model_path)
model.load_state_dict(torch.load(model_path))
converter = utils.strLabelConverter(alphabet)
transformer = dataset.resizeNormalize((100, 32))
os.chdir(test_path)
images_names=glob.glob("*.png")
i=0
p=0
for img in images_names:
    image = Image.open(img).convert('L')
    a = datetime.now()
    image = transformer(image).cuda()
    image = image.view(1, *image.size())
    image = Variable(image)
    model.eval()
    preds = model(image)
    _, preds = preds.max(2)
    preds = preds.squeeze(2)
    preds = preds.transpose(1, 0).contiguous().view(-1)
    preds_size = Variable(torch.IntTensor([preds.size(0)]))
    raw_pred = converter.decode(preds.data, preds_size.data, raw=True)
    sim_pred = converter.decode(preds.data, preds_size.data, raw=False)
    b = datetime.now()
    c = b - a
    print('%-20s => %-20s' % (raw_pred, sim_pred),img)
    # print(end - start)
    print(c.total_seconds())
    if (sim_pred+'.png') == img:
        p +=1
    i +=1
print(p, " sur ", i , " true positive")
I noticed that the first sequence image prediction always takes from 0.453489 to 0.503489 seconds, while all the remaining images take a negligible time, 0.008099 seconds for instance. This is a little strange.
I'm wondering why the first image takes considerably more time than the remaining ones, even if we change which image is processed first.
crnn.pytorch-master/demo.py
loading pretrained model from /home/ahmed/Downloads/crnn.pytorch-master/data/crnn.pth
8----8--3---9--6--6--0---- => 8839660 8839660.png
0.503489
9----------3------0------- => 930 930.png
0.009706
00-------8-----0----0----- => 0800 0800.png
0.008247
0-----1--9----9---0----1-- => 019901 019901.png
0.008314
g---------5------m-------- => g5m 95m.png
0.008099
1--------44-------9------- => 149 149.png
0.008319
0------11--9---2----2----- => 01922 01922.png
0.008478
7-----8---3-1--4--2--3---- => 7831423 7831423.png
0.008663
77---------4--------1----- => 741 741.png
0.008374
2-------0------1----0----- => 2010 2010.png
0.008169
22------------88---------- => 28 28.png
0.008131
t-------------o----------- => to to.png
0.008403
8---------------0--------- => 80 80.png
0.008148
2----2--4---3--5--0---2--- => 2243502 2243502.png
0.008258
22--------0----0----3----- => 2003 2003.png
0.008829
1--------8-----7----7----- => 1877 1877.png
0.008178
2----------------55------- => 25 25.png
0.0081
5------6----1---6----1---- => 56161 56161.png
0.009257
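A slow first call followed by fast ones is typical when one-time setup (for example GPU context creation or library initialisation) happens lazily on first use; whether that is the cause here is an assumption. A common way to measure steady-state latency is to exclude the first call or calls from the measurement. The following is a minimal sketch in plain Python, with a hypothetical stand-in for the model call.

```python
import time

def time_calls(fn, inputs, warmup=1):
    """Time fn on each input, leaving the first `warmup` calls out of the results."""
    timings = []
    for idx, item in enumerate(inputs):
        start = time.perf_counter()
        fn(item)
        elapsed = time.perf_counter() - start
        if idx >= warmup:
            timings.append(elapsed)
    return timings

def fake_predict(x):
    """Hypothetical stand-in for `model(image)`."""
    return x * 2

timings = time_calls(fake_predict, list(range(5)), warmup=1)
print(len(timings), sum(timings) / len(timings))
```

On a GPU, torch.cuda.synchronize() is usually called before reading the clock, because CUDA kernels run asynchronously.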
Hello,
After creating my own dataset with create_dataset.py, I got the following:
/train_directory which contains data-train.mdb & lock-train.mdb
/test_directory which contains data-test.mdb & lock-train.mdb
Then, when I run crnn_main.py, I get the following error:
python2 crnn_main.py [--param val]
Traceback (most recent call last):
File "crnn_main.py", line 18, in <module>
parser.add_argument('/home/ahmed/Downloads/sample/output/train', required=True, help='path to dataset')
File "/home/ahmed/anaconda3/envs/cv/lib/python2.7/argparse.py", line 1276, in add_argument
kwargs = self._get_positional_kwargs(*args, **kwargs)
File "/home/ahmed/anaconda3/envs/cv/lib/python2.7/argparse.py", line 1388, in _get_positional_kwargs
raise TypeError(msg)
TypeError: 'required' is an invalid argument for positionals
The error is related to:
parser = argparse.ArgumentParser()
parser.add_argument('/home/ahmed/Downloads/sample/output/train', required=True, help='path to dataset')
parser.add_argument('/home/ahmed/Downloads/sample/output/test', required=True, help='path to dataset')
What's wrong?
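The first argument of add_argument is the option name, not its value, so a path placed there is treated as the name of a positional argument, and positionals do not accept required. A minimal sketch of the intended usage follows, assuming the option names --trainroot and --valroot that appear in the commands elsewhere on this page; the paths are supplied on the command line (simulated here with a list).

```python
import argparse

parser = argparse.ArgumentParser()
parser.add_argument('--trainroot', required=True, help='path to training dataset')
parser.add_argument('--valroot', required=True, help='path to validation dataset')

# Equivalent to: python crnn_main.py --trainroot <path> --valroot <path>
opt = parser.parse_args([
    '--trainroot', '/home/ahmed/Downloads/sample/output/train',
    '--valroot', '/home/ahmed/Downloads/sample/output/test',
])
print(opt.trainroot)
print(opt.valroot)
```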
Hello ,
I tried to install the fast parallel CTC library from this link, https://github.com/SeanNaren/warp-ctc, as follows:
git clone https://github.com/baidu-research/warp-ctc.git
cd warp-ctc
mkdir build
cd build
cmake ../
-- The C compiler identification is GNU 4.8.5
-- The CXX compiler identification is GNU 4.8.5
-- Check for working C compiler: /root/anaconda3/bin/cc
-- Check for working C compiler: /root/anaconda3/bin/cc -- works
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Detecting C compile features
-- Detecting C compile features - done
-- Check for working CXX compiler: /root/anaconda3/bin/c++
-- Check for working CXX compiler: /root/anaconda3/bin/c++ -- works
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Detecting CXX compile features
-- Detecting CXX compile features - done
-- Looking for pthread.h
-- Looking for pthread.h - found
-- Looking for pthread_create
-- Looking for pthread_create - not found
-- Looking for pthread_create in pthreads
-- Looking for pthread_create in pthreads - not found
-- Looking for pthread_create in pthread
-- Looking for pthread_create in pthread - found
-- Found Threads: TRUE
CMake Warning at /usr/share/cmake-3.5/Modules/FindCUDA.cmake:771 (message):
Expecting to find librt for libcudart_static, but didn't find it.
Call Stack (most recent call first):
CMakeLists.txt:20 (FIND_PACKAGE)
-- Found CUDA: /usr/local/cuda (found suitable version "8.0", minimum required is "6.5")
-- cuda found TRUE
-- Found Torch7 in /home/ahmed/torch/install
-- Torch found /home/ahmed/torch/install/share/cmake/torch
-- Building shared library with GPU support
-- Building Torch Bindings with GPU support
-- Configuring done
-- Generating done
-- Build files have been written to: /home/ahmed/warp-ctc/build
After running make, I got the following error:
make
[ 10%] Building NVCC (Device) object CMakeFiles/warpctc.dir/src/warpctc_generated_reduce.cu.o
[ 20%] Building NVCC (Device) object CMakeFiles/warpctc.dir/src/warpctc_generated_ctc_entrypoint.cu.o
Scanning dependencies of target warpctc
[ 30%] Linking CXX shared library libwarpctc.so
[ 30%] Built target warpctc
Scanning dependencies of target test_cpu
[ 40%] Building CXX object CMakeFiles/test_cpu.dir/tests/test_cpu.cpp.o
[ 50%] Linking CXX executable test_cpu
/home/ahmed/torch/install/lib/libTHC.so: undefined reference to `std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::~basic_string()@GLIBCXX_3.4.21'
/home/ahmed/torch/install/lib/libTHC.so: undefined reference to `std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::_M_append(char const*, unsigned long)@GLIBCXX_3.4.21'
/home/ahmed/torch/install/lib/libTH.so.0: undefined reference to `GOMP_parallel@GOMP_4.0'
/home/ahmed/torch/install/lib/libTHC.so: undefined reference to `std::random_device::_M_init(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&)@GLIBCXX_3.4.21'
/home/ahmed/torch/install/lib/libTHC.so: undefined reference to `std::runtime_error::runtime_error(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&)@GLIBCXX_3.4.21'
/home/ahmed/torch/install/lib/libTHC.so: undefined reference to `std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::_M_replace(unsigned long, unsigned long, char const*, unsigned long)@GLIBCXX_3.4.21'
/home/ahmed/torch/install/lib/libTHC.so: undefined reference to `std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::_M_create(unsigned long&, unsigned long)@GLIBCXX_3.4.21'
collect2: error: ld returned 1 exit status
CMakeFiles/test_cpu.dir/build.make:99: recipe for target 'test_cpu' failed
make[2]: *** [test_cpu] Error 1
CMakeFiles/Makefile2:67: recipe for target 'CMakeFiles/test_cpu.dir/all' failed
make[1]: *** [CMakeFiles/test_cpu.dir/all] Error 2
Makefile:127: recipe for target 'all' failed
make: *** [all] Error 2
What's wrong? I don't understand the error.
Hi, can you achieve the same accuracy as the original CRNN? And what is your accuracy?
Thanks.
Hello,
I'm thinking about making a confusion matrix for CRNN. However, the output classes are variable-length sequences rather than standard output classes such as 0 1 2 ... or A B C.
Any idea how to make a confusion matrix for CRNN?
Thank you
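One possible approach is to align each predicted string with its ground truth and count character-level substitutions. The sketch below uses difflib.SequenceMatcher from the standard library for the alignment and only pairs characters within aligned spans of equal length; it is an illustration under those assumptions, not an established CRNN evaluation method, and the function name is hypothetical.

```python
from collections import Counter
from difflib import SequenceMatcher

def char_confusions(pairs):
    """Count (ground_truth_char, predicted_char) pairs over aligned strings."""
    counts = Counter()
    for gt, pred in pairs:
        matcher = SequenceMatcher(None, gt, pred, autojunk=False)
        for tag, i1, i2, j1, j2 in matcher.get_opcodes():
            # Only spans of equal length can be paired character by character.
            if tag in ("equal", "replace") and (i2 - i1) == (j2 - j1):
                for g, p in zip(gt[i1:i2], pred[j1:j2]):
                    counts[(g, p)] += 1
    return counts

# (ground truth, prediction) pairs, e.g. file name stem vs. decoded text
confusions = char_confusions([("95m", "g5m"), ("149", "149")])
print(confusions[("9", "g")])
print(confusions[("5", "5")])
```

Insertions and deletions are ignored in this sketch; they could be counted separately against a placeholder symbol.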
@meijieru
I am trying to convert the crnn_demo_model.t7 into an equivalent .pth model using the script convert_t7.py but I am getting the following error.
File "tool/convert_t7.py", line 167, in
torch_to_pytorch(py_model, args.model_file, args.output)
File "tool/convert_t7.py", line 130, in torch_to_pytorch
if layer_map[t7_name] != py_name:
KeyError: 'b'
Please help me to rectify this issue.
Hello,
I've created an LMDB file with Caffe, but I have some issues reading it with six.BytesIO(), so I can't get the names of my images. It tells me that my images are corrupted (but the name in the LMDB is right).
Are there any other methods to create an LMDB file (without Caffe, maybe), or other suggestions?
I'm working on Ubuntu 16.04 with python3.4
Thank you in advance.
Best regards
Hello,
When I run crnn_main.py with more than 15000 training examples, I get the following error. I also get it with fewer than 10000 examples when I run it 3 or more times:
THCudaCheck FAIL file=/py/conda-bld/pytorch_1493676237139/work/torch/lib/THC/generic/THCStorage.cu line=66 error=2 : out of memory
Traceback (most recent call last):
File "crnn_main.py", line 222, in <module>
cost = trainBatch(crnn, criterion, optimizer)
File "crnn_main.py", line 207, in trainBatch
cost.backward()
File "/home/ahmed/anaconda3/envs/cv/lib/python2.7/site-packages/torch/autograd/variable.py", line 146, in backward
self._execution_engine.run_backward((self,), (gradient,), retain_variables)
File "/home/ahmed/anaconda3/envs/cv/lib/python2.7/site-packages/torch/nn/_functions/thnn/auto.py", line 171, in backward
grad_input = input.new().resize_as_(input)
RuntimeError: cuda runtime error (2) : out of memory at /py/conda-bld/pytorch_1493676237139/work/torch/lib/THC/generic/THCStorage.cu:66
How can I work around that?
Thank you
Start val
qqqqqqqqqqqqqqqqqqqqqqqqqq => q , gt: accredits
qqqqqqqqqqqqqqqqqqqqqqqqqq => q , gt: cabral
qqqqqqqqqqqqqqqqqqqqqqqqqq => q , gt: esters
qqqqqqqqqqqqqqqqqqqqqqqqqq => q , gt: indulgent
qqqqqqqqqqqqqqqqqqqqqqqqqq => q , gt: typologies
qqqqqqqqqqqqqqqqqqqqqqqqqq => q , gt: bulwarks
qqqqqqqqqqqqqqqqqqqqqqqqqq => q , gt: insensitivity
qqqqqqqqqqqqqqqqqqqqqqqqqq => q , gt: fiche
qqqqqqqqqqqqqqqqqqqqqqqqqq => q , gt: elephantiasis
qqqqqqqqqqqqqqqqqqqqqqqqqq => q , gt: niamey
qqqqqqqqqqqqqqqqqqqqqqqqqq => q , gt: divorces
qqqqqqqqqqqqqqqqqqqqqqqqqq => q , gt: uncanny
qqqqqqqqqqqqqqqqqqqqqqqqqq => q , gt: nameable
qqqqqqqqqqqqqqqqqqqqqqqqqq => q , gt: colds
qqqqqqqqqqqqqqqqqqqqqqqqqq => q , gt: hydrofoils
qqqqqqqqqqqqqqqqqqqqqqqqqq => q , gt: eccentrically
qqqqqqqqqqqqqqqqqqqqqqqqqq => q , gt: strictly
qqqqqqqqqqqqqqqqqqqqqqqqqq => q , gt: misappropriate
qqqqqqqqqqqqqqqqqqqqqqqqqq => q , gt: televisions
qqqqqqqqqqqqqqqqqqqqqqqqqq => q , gt: turgid
qqqqqqqqqqqqqqqqqqqqqqqqqq => q , gt: undefined
qqqqqqqqqqqqqqqqqqqqqqqqqq => q , gt: margaritas
qqqqqqqqqqqqqqqqqqqqqqqqqq => q , gt: notifying
qqqqqqqqqqqqqqqqqqqqqqqqqq => q , gt: amniocentesis
qqqqqqqqqqqqqqqqqqqqqqqqqq => q , gt: lands
qqqqqqqqqqqqqqqqqqqqqqqqqq => q , gt: cunard
qqqqqqqqqqqqqqqqqqqqqqqqqq => q , gt: camshaft
qqqqqqqqqqqqqqqqqqqqqqqqqq => q , gt: taps
qqqqqqqqqqqqqqqqqqqqqqqqqq => q , gt: mediated
qqqqqqqqqqqqqqqqqqqqqqqqqq => q , gt: flexes
qqqqqqqqqqqqqqqqqqqqqqqqqq => q , gt: plumier
qqqqqqqqqqqqqqqqqqqqqqqqqq => q , gt: extradite
qqqqqqqqqqqqqqqqqqqqqqqqqq => q , gt: paramountcy
qqqqqqqqqqqqqqqqqqqqqqqqqq => q , gt: korans
qqqqqqqqqqqqqqqqqqqqqqqqqq => q , gt: compel
qqqqqqqqqqqqqqqqqqqqqqqqqq => q , gt: homburg
qqqqqqqqqqqqqqqqqqqqqqqqqq => q , gt: noncommunicable
qqqqqqqqqqqqqqqqqqqqqqqqqq => q , gt: fireguard
qqqqqqqqqqqqqqqqqqqqqqqqqq => q , gt: servants
qqqqqqqqqqqqqqqqqqqqqqqqqq => q , gt: dinnertime
qqqqqqqqqqqqqqqqqqqqqqqqqq => q , gt: elopes
qqqqqqqqqqqqqqqqqqqqqqqqqq => q , gt: innervate
qqqqqqqqqqqqqqqqqqqqqqqqqq => q , gt: whys
qqqqqqqqqqqqqqqqqqqqqqqqqq => q , gt: unhurt
qqqqqqqqqqqqqqqqqqqqqqqqqq => q , gt: screechier
qqqqqqqqqqqqqqqqqqqqqqqqqq => q , gt: chops
qqqqqqqqqqqqqqqqqqqqqqqqqq => q , gt: saprophytes
qqqqqqqqqqqqqqqqqqqqqqqqqq => q , gt: cannily
qqqqqqqqqqqqqqqqqqqqqqqqqq => q , gt: laborers
qqqqqqqqqqqqqqqqqqqqqqqqqq => q , gt: inns
qqqqqqqqqqqqqqqqqqqqqqqqqq => q , gt: loused
qqqqqqqqqqqqqqqqqqqqqqqqqq => q , gt: agrees
qqqqqqqqqqqqqqqqqqqqqqqqqq => q , gt: gusty
qqqqqqqqqqqqqqqqqqqqqqqqqq => q , gt: prizewinners
qqqqqqqqqqqqqqqqqqqqqqqqqq => q , gt: cremation
qqqqqqqqqqqqqqqqqqqqqqqqqq => q , gt: notoriously
qqqqqqqqqqqqqqqqqqqqqqqqqq => q , gt: awesome
qqqqqqqqqqqqqqqqqqqqqqqqqq => q , gt: musicals
qqqqqqqqqqqqqqqqqqqqqqqqqq => q , gt: cuddliest
qqqqqqqqqqqqqqqqqqqqqqqqqq => q , gt: unsupervised
qqqqqqqqqqqqqqqqqqqqqqqqqq => q , gt: villus
qqqqqqqqqqqqqqqqqqqqqqqqqq => q , gt: daters
qqqqqqqqqqqqqqqqqqqqqqqqqq => q , gt: acheson
qqqqqqqqqqqqqqqqqqqqqqqqqq => q , gt: unmentioned
Test loss: inf, accuray: 0.000000
I get this output. It doesn't change. The loss is infinite and the accuracy is zero. Is there some parameter I should tweak?
Hello,
I am stuck on the following lines of code in crnn_main.py.
In line 100:
image = torch.FloatTensor(opt.batchSize, 3, opt.imgH, opt.imgH)
Aren't we supposed to have imgH, imgW here?
Or do you rescale the image to be square, 32 * 32?
In line 188:
cost.backward()
How does it backpropagate the cost?
In line 101:
text = torch.IntTensor(opt.batchSize * 5)
Why is it batchSize * 5? Is 5 a fixed parameter, or can it be changed?
I created the lmdb data, but the model performs worse. I think the ground truth I set is wrong.
Hello,
I want to fine-tune the model on my data, so I added this argument to my command to load the pretrained model:
--crnn="/home/ahmed/Downloads/crnn.pytorch-master/data/crnn.pth"
as follows:
python2 crnn_main.py --trainroot="/home/ahmed/Downloads/mnt/data/train" --valroot="/home/ahmed/Downloads/mnt/data/valid" --imgH=32 --cuda --crnn="/home/ahmed/Downloads/crnn.pytorch-master/data/crnn.pth" --adadelta --experiment="/home/ahmed/Downloads/training_data/output"
However, I get the following error:
Namespace(Diters=5, adadelta=True, adam=False, alphabet='0123456789abcdefghijklmnopqrstuvwxyz', batchSize=64, beta1=0.5, crnn='/home/ahmed/Downloads/crnn.pytorch-master/data/crnn.pth', cuda=True, displayInterval=1, experiment='/home/ahmed/Downloads/training_data/output', imgH=32, keep_ratio=False, lr=1, n_test_disp=1, ngpu=1, nh=100, niter=25, random_sample=False, saveInterval=1, trainroot='/home/ahmed/Downloads/mnt/data/train', valInterval=1, valroot='/home/ahmed/Downloads/mnt/data/valid', workers=2)
Random Seed: 5051
loading pretrained model from /home/ahmed/Downloads/crnn.pytorch-master/data/crnn.pth
Traceback (most recent call last):
File "crnn_main.py", line 97, in <module>
crnn.load_state_dict(torch.load(opt.crnn))
File "/home/ahmed/anaconda3/envs/cv/lib/python2.7/site-packages/torch/nn/modules/module.py", line 335, in load_state_dict
own_state[name].copy_(param)
RuntimeError: inconsistent tensor size at /py/conda-bld/pytorch_1493676237139/work/torch/lib/TH/generic/THTensorCopy.c:51
What's wrong?
I don't understand the error:
RuntimeError: inconsistent tensor size at /py/conda-bld/pytorch_1493676237139/work/torch/lib/TH/generic/THTensorCopy.c:51
Any comment would be appreciated.
Thank you
Hi meijieru,
First of all, thanks so much for the package, it works nicely and is easy to install and test.
I am really interested in your package and am trying to go deeper by redoing the whole process, starting from training the model. I am quite a beginner in this field, so I have some questions:
If I understand correctly, to train a model I need to prepare a training dataset (a set of word images and their corresponding labels) and pass it to "create_dataset.py" from the original "crnn" package. Then I have to use the "convert_dataset.py" file in the "tool" folder to sort the training files by width. Finally, I use "crnn_main.py" to train the model. Is that correct?
If possible, is there anywhere I can download the training data to reproduce exactly the same (or a similar) result (the final trained model) as you got in the end?
I know these questions might sound silly to you, but your comments are greatly appreciated. I think they would also be helpful to others who are new to this field. Many thanks.
Su
I mean, the ICDAR2013 evaluation includes uppercase letters, right?
[dutongchun@cpu0 crnn.pytorch]$ python crnn_main.py --trainroot /home/dutongchun/songwendong/xxx/train --valroot /home/dutongchun/songwendong/xxx/validate/ --batchSize 16 --workers 1 --cuda
Namespace(adadelta=False, adam=False, alphabet='0123456789abcdefghijklmnopqrstuvwxyz', batchSize=16, beta1=0.5, crnn='', cuda=True, displayInterval=500, experiment=None, imgH=32, imgW=100, keep_ratio=False, lr=0.01, n_test_disp=10, ngpu=1, nh=256, niter=25, random_sample=False, saveInterval=500, trainroot='/home/dutongchun/songwendong/xxx/train', valInterval=500, valroot='/home/dutongchun/songwendong/xxx/validate/', workers=1)
Random Seed: 2008
CRNN (
(cnn): Sequential (
(conv0): Conv2d(1, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(relu0): ReLU (inplace)
(pooling0): MaxPool2d (size=(2, 2), stride=(2, 2), dilation=(1, 1))
(conv1): Conv2d(64, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(relu1): ReLU (inplace)
(pooling1): MaxPool2d (size=(2, 2), stride=(2, 2), dilation=(1, 1))
(conv2): Conv2d(128, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(batchnorm2): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True)
(relu2): ReLU (inplace)
(conv3): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(relu3): ReLU (inplace)
(pooling2): MaxPool2d (size=(2, 2), stride=(2, 1), dilation=(1, 1))
(conv4): Conv2d(256, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(batchnorm4): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True)
(relu4): ReLU (inplace)
(conv5): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(relu5): ReLU (inplace)
(pooling3): MaxPool2d (size=(2, 2), stride=(2, 1), dilation=(1, 1))
(conv6): Conv2d(512, 512, kernel_size=(2, 2), stride=(1, 1))
(batchnorm6): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True)
(relu6): ReLU (inplace)
)
(rnn): Sequential (
(0): BidirectionalLSTM (
(rnn): LSTM(512, 256, bidirectional=True)
(embedding): Linear (512 -> 256)
)
(1): BidirectionalLSTM (
(rnn): LSTM(256, 256, bidirectional=True)
(embedding): Linear (512 -> 37)
)
)
)
/home/dutongchun/songwendong/crnn.pytorch/dataset.py:95: UserWarning: torch.range is deprecated in favor of torch.arange and will be removed in 0.3. Note that arange generates values in [start; end), not [start; end].
batch_index = random_start + torch.range(0, self.batch_size - 1)
[dutongchun@cpu0 crnn.pytorch]$
Hello,
In crnn_main.py, line 125, the function def val(net, dataset, criterion, max_iter=100): takes a parameter max_iter, which is set to 100 by default. However, this does not work for every dataset (in the article, the maximum number of iterations is between 200k and 500k). For a huge dataset max_iter=100 works, but for a small one you have to reduce the value manually.
For instance, I tried small datasets: the first worked with max_iter=49, another one with max_iter=22, and another one with max_iter=5.
Otherwise I get the following error:
[0/35][1/13] Loss: 65.981506
Start val
Traceback (most recent call last):
File "crnn_main.py", line 216, in <module>
val(crnn, test_dataset, criterion)
File "crnn_main.py", line 144, in val
data = val_iter.next()
File "/home/ahmed/anaconda3/envs/cv/lib/python2.7/site-packages/torch/utils/data/dataloader.py", line 202, in __next__
raise StopIteration
StopIteration
I found the right value of max_iter manually as follows: I put print(i) at line 141,
for i in range(max_iter):
    print(i)
    data = val_iter.next()
to see after which iteration StopIteration is raised.
How can I solve that?
Thank you
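One possible way around the StopIteration above is to cap the number of validation iterations at the number of batches the loader actually holds, instead of tuning max_iter by hand. The following is only a sketch under assumptions: the helper names capped_iterations and run_validation are hypothetical and not part of crnn_main.py, and a plain list stands in for the data loader (anything that supports len() and iter() would behave the same way here).

```python
def capped_iterations(loader, max_iter=100):
    """Return how many batches can be read without exhausting `loader`."""
    return min(max_iter, len(loader))


def run_validation(loader, max_iter=100):
    """Read at most `max_iter` batches and return how many were consumed."""
    val_iter = iter(loader)
    consumed = 0
    for _ in range(capped_iterations(loader, max_iter)):
        batch = next(val_iter)  # next(it) works on both Python 2 and 3
        # A real validation loop would run the model on `batch` here.
        consumed += 1
    return consumed


# Stand-in for a small validation loader holding only three batches.
small_loader = [[1, 2], [3, 4], [5, 6]]
print(run_validation(small_loader))  # 3, even though max_iter defaults to 100
```

With this cap, a small validation set is simply read once to the end, and a large one is still limited to max_iter batches.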
Hello,
I am stuck with fine-tuning.
1) First of all, to fine-tune the model you have to set --nh="256"; otherwise it will not work and you'll get this error:
(
loading pretrained model from /home/ahmed/Downloads/crnn.pytorch-master/data/crnn.pth
Traceback (most recent call last):
File "crnn_main.py", line 98, in
crnn.load_state_dict(torch.load(opt.crnn))
File "/home/ahmed/anaconda3/envs/cv/lib/python2.7/site-packages/torch/nn/modules/module.py", line 335, in load_state_dict
own_state[name].copy_(param)
RuntimeError: inconsistent tensor size at /py/conda-bld/pytorch_1493676237139/work/torch/lib/TH/generic/THTensorCopy.c:51
)
because the pretrained model was trained with --nh="256" and not 100, which is the default value. But when fine-tuning we should obviously be able to change this parameter, so I find it strange that it doesn't work.
I tried the following:
A) Add one letter, say Z, or another character such as , . /
'0123456789abcdefghijklmnopqrstuvwxyzZ'
I got the same error:
RuntimeError: inconsistent tensor size at /py/conda-bld/pytorch_1493676237139/work/torch/lib/TH/generic/THTensorCopy.c:51
B) I removed one character and added another (removed z and added /): '0123456789abcdefghijklmnopqrstuvwxy/'
I get the same error:
RuntimeError: inconsistent tensor size at /py/conda-bld/pytorch_1493676237139/work/torch/lib/TH/generic/THTensorCopy.c:51
C) I set the alphabet to digits only: '0123456789'
The same error:
RuntimeError: inconsistent tensor size at /py/conda-bld/pytorch_1493676237139/work/torch/lib/TH/generic/THTensorCopy.c:51
Do you have any idea how to make fine-tuning work with a different alphabet length and a different architecture?
Thanks a lot
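A common workaround for this kind of size mismatch is to copy only the pretrained weights whose name and shape match the new model, and to leave the remaining layers (for example the final layer, whose size depends on the alphabet) at their fresh initialisation. The sketch below only illustrates the filtering step and makes several assumptions: merge_matching and FakeTensor are hypothetical names, the parameter names and shapes are illustrative, and a namedtuple stands in for real tensors so the snippet runs without PyTorch.

```python
from collections import namedtuple

# Stand-in for a tensor: only the `.shape` attribute matters in this sketch.
FakeTensor = namedtuple("FakeTensor", ["shape"])


def merge_matching(pretrained, target):
    """Copy entries of `pretrained` into a copy of `target` when name and shape agree.

    Returns (merged, skipped), where `skipped` lists the pretrained keys left out.
    """
    merged = dict(target)
    skipped = []
    for name, value in pretrained.items():
        if name in target and tuple(value.shape) == tuple(target[name].shape):
            merged[name] = value
        else:
            skipped.append(name)
    return merged, sorted(skipped)


# Illustrative shapes: the output layer has one row per character plus one blank.
pretrained = {
    "cnn.conv0.weight": FakeTensor((64, 1, 3, 3)),
    "rnn.1.embedding.weight": FakeTensor((37, 512)),  # 36 characters + blank
}
target = {
    "cnn.conv0.weight": FakeTensor((64, 1, 3, 3)),
    "rnn.1.embedding.weight": FakeTensor((11, 512)),  # 10 digits + blank
}
merged, skipped = merge_matching(pretrained, target)
print(skipped)  # ['rnn.1.embedding.weight']
```

With PyTorch, pretrained would presumably be the result of torch.load(opt.crnn) and target would be crnn.state_dict(), followed by crnn.load_state_dict(merged). The skipped layers then start from random weights and have to be trained on the new data. Changing nh alters the shapes of the recurrent layers as well, so far fewer weights would be reused in that case.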
Hello,
I'm getting this error when running python demo.py.
What's wrong with my code? It seems that crnn.py doesn't import the utils module located in crnn.pytorch/models/ and instead imports the one located in crnn.pytorch/.
python3.5 demo.py
loading pretrained model from ./data/crnn.pth
Traceback (most recent call last):
File "demo.py", line 27, in <module>
preds = model(image)
File "/home/ahmed/anaconda3/envs/my_env/lib/python3.5/site-packages/torch/nn/modules/module.py", line 206, in __call__
result = self.forward(*input, **kwargs)
File "/home/ahmed/Downloads/crnn.pytorch-master/models/crnn.py", line 78, in forward
conv = utils.data_parallel(self.cnn, input, self.ngpu)
AttributeError: module 'utils' has no attribute 'data_parallel'
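To confirm which file a bare import such as "import utils" actually resolves to, one option is to inspect the imported module's __file__ attribute and check whether the expected function exists. This is only a diagnostic sketch: describe_module is a hypothetical helper, and it is demonstrated on the standard-library json module so that it runs anywhere.

```python
import importlib


def describe_module(module_name, attribute):
    """Return (file path, whether `attribute` exists) for an importable module."""
    module = importlib.import_module(module_name)
    path = getattr(module, "__file__", None)
    return path, hasattr(module, attribute)


# Demonstrated on a standard-library module so the snippet runs anywhere.
path, found = describe_module("json", "dumps")
print(path, found)
```

Run from the same directory as demo.py, describe_module("utils", "data_parallel") would show whether the top-level utils.py or the one under models/ is being picked up.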