Hi, How can I use the tbpp(Densenet) with my own data? Specifically, I see inside

Testing on my own data. about ssd_detectors HOT 8 CLOSED

mvoelk commented on July 30, 2024

Testing on my own data.

from ssd_detectors.

Comments (8)

mvoelk commented on July 30, 2024

Okay, I was curious and spent some time figuring it out...

The reshape operation

np_gray_img = np.reshape(gray_img, (1,256,32,1))

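does not work in your case: it regroups the (32, 256) pixel buffer into (256, 32) without swapping the axes, so the image content gets scrambled. Try something like the following instead: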

np_gray_img = gray_img.T[None,:,:,None]


mvoelk commented on July 30, 2024

The code in the data_*.py files is dataset-specific and derives a GTUtility class. Objects of the GTUtility class are pickled only to avoid the long preprocessing time of the datasets. The gt.mat file is specific to the SynthText dataset.

The attributes image_names, data and text of the GTUtility class are lists with as many elements as there are samples in the dataset.
image_names contains the image file names as strings.
data contains NumPy arrays in which each row corresponds to one text instance. A row holds the vertices (x1, y1, x2, y2, x3, y3, x4, y4) of the oriented bounding box, normalized by the image size, followed by a one-hot encoding of the class, which in the text case is always (0, 1).
text contains lists of the text strings associated with the text instances and is used as ground truth for the recognition stage.
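To make the layout described above concrete, here is a minimal sketch of the three lists for a made-up dataset with two images. It does not use the repository's GTUtility class; the file names, pixel coordinates, image sizes and the helper make_row are all hypothetical and only illustrate the row format (8 normalized vertex coordinates followed by the one-hot class (0, 1)).

```python
import numpy as np

def make_row(vertices_px, img_w, img_h):
    """Build one row: 8 normalized vertex coordinates + one-hot class (0, 1).

    make_row is a hypothetical helper for illustration, not part of the repo.
    """
    v = np.asarray(vertices_px, dtype=np.float32).reshape(4, 2)
    v = v / np.array([img_w, img_h], dtype=np.float32)  # x / width, y / height
    one_hot = np.array([0.0, 1.0], dtype=np.float32)    # text class
    return np.concatenate([v.ravel(), one_hot])

# One entry per sample (image) in each of the three lists.
image_names = ["img_001.jpg", "img_002.jpg"]  # hypothetical file names

data = [
    # Image 1 (assumed 640x480) with two text instances.
    np.array([
        make_row([64, 48, 320, 48, 320, 96, 64, 96], 640, 480),
        make_row([64, 240, 320, 240, 320, 288, 64, 288], 640, 480),
    ]),
    # Image 2 (assumed 800x600) with one text instance.
    np.array([
        make_row([80, 60, 400, 60, 400, 120, 80, 120], 800, 600),
    ]),
]

# Text strings in the same order as the rows of the matching data array.
text = [["HELLO", "WORLD"], ["EXIT"]]

for name, boxes, words in zip(image_names, data, text):
    print(name, boxes.shape, words)
```

Each data array has shape (number of text instances, 10), and the number of rows matches the length of the corresponding entry in text.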

If you only want to do prediction, you can proceed in the same way as with the real-world images in SL_predict.ipynb.


krish240574 commented on July 30, 2024

Thank you for the detailed explanation. I'll try the predictions and get back to you if I run into any issues.
Cheers,
Krishna


krish240574 commented on July 30, 2024

Another question: how do I test the CRNN part with my own data? I see that the pre-trained CRNN models rely on the .pkl code again. Is there any code for testing with my own data, similar to what you have for cropping the predicted boxes? (I understand that the CRNN takes the bounding boxes cropped after detection by the first tbpp network.) I've been looking through all the files in the repo and haven't found any such code yet.

Specifically, do I need to dump all the bounding-box detection results into a .pkl file for the CRNN to run predictions on?
Thanks.
Krishna


mvoelk commented on July 30, 2024

CRNN_train.ipynb, SL_end2end_predict.ipynb and sl_videotest.py may be relevant for you.

In general, the input of the CRNN model is a batch of 32x256 grayscale images... The rest is up to you ;)
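As a rough sketch of what such a batch could look like: the script elsewhere in this thread builds the model with input_shape = (input_width, input_height, 1) = (256, 32, 1), while a grayscale crop loaded with OpenCV has shape (32, 256), so each crop has to be transposed before stacking. The helper to_crnn_batch below is hypothetical, not a function from the repository, and it deliberately leaves out any pixel scaling or normalization, since the thread does not say what the pre-trained weights expect.

```python
import numpy as np

def to_crnn_batch(gray_crops):
    """Stack (32, 256) grayscale crops into one (N, 256, 32, 1) batch.

    Hypothetical helper; assumes every crop was already resized to
    height 32 and width 256.
    """
    batch = []
    for crop in gray_crops:
        if crop.shape != (32, 256):
            raise ValueError("expected shape (32, 256), got %s" % (crop.shape,))
        # Transpose to (width, height) and add a trailing channel axis.
        batch.append(crop.T[:, :, None])
    return np.stack(batch, axis=0)

# Two dummy crops standing in for real word images.
crops = [
    np.zeros((32, 256), dtype=np.uint8),
    np.full((32, 256), 255, dtype=np.uint8),
]
batch = to_crnn_batch(crops)
print(batch.shape)
```

The resulting array could then be passed to model_pred.predict in place of a single reshaped image.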


sniper0110 commented on July 30, 2024

Hello,

I am trying to use your pretrained CRNN model (with LSTM or GRU) to recognize text in my images. I am using images from the ICDAR2015 scene text dataset. For this I am using the following short script:

import numpy as np
import matplotlib.pyplot as plt
import os
import editdistance
import pickle
import time

from keras.optimizers import SGD, Adam
from keras.callbacks import ModelCheckpoint

from crnn_model import CRNN
from crnn_data import InputGenerator
from crnn_utils import decode
from ssd_training import Logger, ModelSnapshot
import cv2
from crnn_utils import alphabet87 as alphabet


##Model
input_width = 256
input_height = 32
batch_size = 128
input_shape = (input_width, input_height, 1)

model, model_pred = CRNN(input_shape, len(alphabet), gru=False)
experiment = 'crnn_lstm_synthtext'
path_to_weights = './checkpoints/201806162129_crnn_lstm_synthtext/weights.300000.h5'
#path_to_weights = './checkpoints/201806190711_crnn_gru_synthtext/weights.300000.h5'
model_pred.load_weights(path_to_weights)


path_to_cropped_text = "" # path to my cropped text

my_img = cv2.imread(path_to_cropped_text)
resized_img = cv2.resize(my_img, (256,32))
gray_img = cv2.cvtColor(resized_img, cv2.COLOR_BGR2GRAY)
np_gray_img = np.reshape(gray_img, (1,256,32,1))

prediction = model_pred.predict(np_gray_img)


##Decode predictions
chars = [alphabet[c] for c in np.argmax(prediction[0], axis=1)]
res_str = decode(chars)

Unfortunately, I almost always get "N" as the result, as if my text were the single letter N. I don't know why this is happening; maybe I am making a mistake in how I use your code.

My original image had shape (69, 256, 3). I resized it to be compatible with the input shape and converted it to grayscale. I checked the image after this transformation and the text is still clearly legible (no distortions), so I am wondering what I am doing wrong.

Any help is greatly appreciated!


sniper0110 commented on July 30, 2024

Thanks a lot mate, that was indeed the problem. I am very curious why your operation (gray_img.T[None,:,:,None]) behaves differently from mine (np.reshape(gray_img, (1,256,32,1))). In the end, both give arrays of the same shape, (1, 256, 32, 1). Could you elaborate on how they differ?


mvoelk commented on July 30, 2024

OpenCV is not always as intuitive as it could be: cv2.resize takes the target size as (width, height), but the array it returns has shape (height, width), here (32, 256). The model expects (256, 32), so you need the transpose.

For more details, please see the NumPy help.
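The difference is easiest to see on a tiny array. The sketch below uses a made-up 2x3 "image" in place of the real (32, 256) one: np.reshape keeps the elements in their flat row-major order and only regroups them, whereas .T swaps the axes so that rows become columns. Both results have the same shape but different contents.

```python
import numpy as np

# Stand-in for a grayscale image with 2 rows (height) and 3 columns (width).
gray_img = np.array([[1, 2, 3],
                     [4, 5, 6]])

# reshape: same flat order 1,2,3,4,5,6, just cut into rows of length 2.
reshaped = np.reshape(gray_img, (3, 2))
print(reshaped)    # [[1 2] [3 4] [5 6]]

# transpose: axes swapped, so each original column becomes a row.
transposed = gray_img.T
print(transposed)  # [[1 4] [2 5] [3 6]]

same_shape = reshaped.shape == transposed.shape
same_content = np.array_equal(reshaped, transposed)
print(same_shape, same_content)  # True False

# The same indexing trick at full size: pixel (row y, column x) of the
# (32, 256) image ends up at batch[0, x, y, 0].
full = np.arange(32 * 256).reshape(32, 256)
batch = full.T[None, :, :, None]
print(batch.shape)  # (1, 256, 32, 1)
```

With reshape, each row of the result mixes pixels from neighbouring image rows, which is why the network saw garbage even though the shape was accepted.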
