mvoelk / ssd_detectors
This project forked from rykov8/ssd_keras
SSD-based object and text detection with Keras, SSD, DSOD, TextBoxes, SegLink, TextBoxes++, CRNN
License: MIT License
ssd_detectors-master\ssd_data.py in preprocess(img, size)
628 img = img.astype(np.float32)
629 mean = np.array([104,117,123])
--> 630 img -= mean[np.newaxis, np.newaxis, :]
631 return img
632
ValueError: operands could not be broadcast together with shapes (512,512) (1,1,3) (512,512)
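The shapes in the error suggest that the image reaching preprocess has a single channel, (512, 512), while the mean has three. A minimal sketch of one possible workaround, assuming the input may be grayscale: the helper ensure_three_channels is hypothetical and not part of the repository, and the size argument of the original preprocess is left out for brevity.

```python
import numpy as np

def ensure_three_channels(img):
    """Return an (H, W, 3) array; grayscale input is replicated across channels."""
    if img.ndim == 2:
        img = np.stack([img] * 3, axis=-1)
    elif img.ndim == 3 and img.shape[2] == 1:
        img = np.repeat(img, 3, axis=2)
    return img

def preprocess(img):
    # Same mean subtraction as in the traceback, with a channel check first.
    img = ensure_three_channels(img).astype(np.float32)
    mean = np.array([104, 117, 123], dtype=np.float32)
    img -= mean[np.newaxis, np.newaxis, :]
    return img

gray = np.zeros((512, 512), dtype=np.uint8)  # stand-in for a grayscale image
out = preprocess(gray)
print(out.shape)
```

Whether replicating the gray channel is appropriate depends on the dataset; loading the image as a color image in the first place may be the simpler fix.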
When I execute the following code:
gen_train = InputGenerator(gt_util_train, prior_util, batch_size, model.image_size, augmentation=False)
gen_val = InputGenerator(gt_util_val, prior_util, batch_size, model.image_size, augmentation=False)
tmp_inputs, tmp_targets = next(gen_train.generate())
I get the following RuntimeWarning
ssd_detectors/tbpp_utils.py:83: RuntimeWarning: divide by zero encountered in log
offsets_rboxs[prior_mask,4] = np.log(gt_rboxes[:,4] / priors_wh[:,1]) / variances_wh[:,1]
I get the same warning when I try to train the TBPP512 or TBPP512_dense model. Also, during training my precision and recall metrics are 0, and conf_loss and loc_loss are NaN. Is this due to the above warning? If not, how can I debug it? Here are the metrics for the first epoch:
Epoch 1/100
5/5 [==============================] - 55s 11s/step - loss: nan - conf_loss: nan - loc_loss: nan - precision: 0.0000e+00 - recall: 0.0000e+00 - accuracy: 0.8000 - fmeasure: 0.0000e+00 - num_pos: 43.2000 - num_neg: 611588.8000 - val_loss: nan - val_conf_loss: nan - val_loc_loss: nan - val_precision: 0.0000e+00 - val_recall: 0.0000e+00 - val_accuracy: 1.0000 - val_fmeasure: 0.0000e+00 - val_num_pos: 80.5000 - val_num_neg: 611551.5000
I have used a custom dataset with only one class and modified it to match the format required by GTUtility. I've also verified the values in gt_util_train and gt_util_val, and they seem to be correct.
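A divide by zero inside np.log at that line would occur if some ground truth box has zero height, and NaN losses could follow from it. This is not confirmed as the cause here. A minimal sketch for spotting degenerate polygons, assuming the ground truth is an (N, 8) array of normalized corner coordinates; polygon_areas is a hypothetical helper, not a repository function:

```python
import numpy as np

def polygon_areas(polys):
    """Shoelace area for an (N, 8) array of quadrilaterals x1,y1,...,x4,y4."""
    pts = polys.reshape(-1, 4, 2)
    x, y = pts[..., 0], pts[..., 1]
    x_next, y_next = np.roll(x, -1, axis=1), np.roll(y, -1, axis=1)
    return 0.5 * np.abs(np.sum(x * y_next - x_next * y, axis=1))

polys = np.array([
    [0.1, 0.1, 0.4, 0.1, 0.4, 0.2, 0.1, 0.2],  # regular box
    [0.5, 0.5, 0.8, 0.5, 0.8, 0.5, 0.5, 0.5],  # zero height, would give log(0)
])
areas = polygon_areas(polys)
keep = areas > 1e-8  # threshold is an arbitrary choice for this sketch
print(keep)
```

Rows where keep is False could be dropped or inspected before the data is handed to the generator.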
loss: 17.0426 - conf_loss: 0.0283 - loc_loss: 16.7599 - precision: 4.2830e-04 - recall: 0.0217 - accuracy: 3.8194e-04 - fmeasure: 7.5787e-04 - num_pos: 5828.8542 - num_neg: 3663963.1458
Hi mvoelk,
I recently found your code and am unable to understand where to start.
I am unable to find a main function.
Can you please help?
Hi, I'm a university student in Korea.
Because I wanted to make the training process faster, I changed the parameters of fit_generator to
workers=12,
use_multiprocessing=True,
Then I got the problem shown below.
As you can see, the last batch of the first epoch does not work and gives a warning message. Is there any solution for this?
Hi - as of now the .pkl files have to be generated from the datasets to run the end2end files. Requesting the owner to please upload these to the repo to allow running the code using weights files without having to download huge datasets.
Hi,
I was going through your code and wanted to ask: do we have to train SegLink and CRNN separately on our dataset, or is there code that does both simultaneously?
I also wanted to ask whether, for ICDAR2015FST, I have to specify the bounding boxes as:
xmin, ymin, xmax, ymax
and for ICDAR2015IST the format is:
x1, y1, x2, y2, x3, y3, x4, y4 ( The 4 points of the box in the anti-clockwise direction ).
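If both annotation styles need to be handled, converting between them is straightforward. A minimal sketch, assuming a corner order of top-left, top-right, bottom-right, bottom-left; the order the repository's loaders actually expect is not confirmed here and should be checked against the data classes. Both function names are made up for this example.

```python
import numpy as np

def box_to_polygon(xmin, ymin, xmax, ymax):
    """Corners as x1,y1,...,x4,y4 in the assumed order tl, tr, br, bl."""
    return np.array([xmin, ymin, xmax, ymin, xmax, ymax, xmin, ymax], dtype=np.float32)

def polygon_to_box(poly):
    """Axis-aligned bounding box xmin, ymin, xmax, ymax of a polygon."""
    pts = np.asarray(poly, dtype=np.float32).reshape(-1, 2)
    xmin, ymin = pts.min(axis=0)
    xmax, ymax = pts.max(axis=0)
    return np.array([xmin, ymin, xmax, ymax], dtype=np.float32)

poly = box_to_polygon(10, 20, 110, 60)
box = polygon_to_box(poly)
print(poly, box)
```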
Hello, Mr. Volk
Thank you very much for your nice code!
I have one question for you.
I'm new to deep learning, have only a basic understanding of Keras code, and am currently trying to run your DSOD_train.py.
The problem is that I keep getting OOM errors while executing the "Train" section of the code (error message below).
I tried using only one of my two GPUs and the 'allow_growth' option in TensorFlow, but neither worked.
I believe I need to reduce the minibatch size (I guess your code uses batch size 128, am I right?), but I have no idea where to make this change. (Just changing batch_size = 26 to a lower number didn't solve the problem, so I searched your .py files and ended up with no clue.)
I'd really appreciate your help with my problem.
By the way, I'm using Ubuntu 16.04 and the latest tensorflow-keras.
------------------------------------------------error message
ResourceExhaustedError Traceback (most recent call last)
in
49 workers=1,
50 #use_multiprocessing=False,
---> 51 initial_epoch=initial_epoch)
/usr/local/lib/python3.6/dist-packages/keras/legacy/interfaces.py in wrapper(*args, **kwargs)
89 warnings.warn('Update your `' + object_name + '` call to the ' +
90 'Keras 2 API: ' + signature, stacklevel=2)
---> 91 return func(*args, **kwargs)
92 wrapper._original_function = func
93 return wrapper
/usr/local/lib/python3.6/dist-packages/keras/engine/training.py in fit_generator(self, generator, steps_per_epoch, epochs, verbose, callbacks, validation_data, validation_steps, class_weight, max_queue_size, workers, use_multiprocessing, shuffle, initial_epoch)
1416 use_multiprocessing=use_multiprocessing,
1417 shuffle=shuffle,
-> 1418 initial_epoch=initial_epoch)
1419
1420 @interfaces.legacy_generator_methods_support
/usr/local/lib/python3.6/dist-packages/keras/engine/training_generator.py in fit_generator(model, generator, steps_per_epoch, epochs, verbose, callbacks, validation_data, validation_steps, class_weight, max_queue_size, workers, use_multiprocessing, shuffle, initial_epoch)
215 outs = model.train_on_batch(x, y,
216 sample_weight=sample_weight,
--> 217 class_weight=class_weight)
218
219 outs = to_list(outs)
/usr/local/lib/python3.6/dist-packages/keras/engine/training.py in train_on_batch(self, x, y, sample_weight, class_weight)
1215 ins = x + y + sample_weights
1216 self._make_train_function()
-> 1217 outputs = self.train_function(ins)
1218 return unpack_singleton(outputs)
1219
/usr/local/lib/python3.6/dist-packages/keras/backend/tensorflow_backend.py in __call__(self, inputs)
2713 return self._legacy_call(inputs)
2714
-> 2715 return self._call(inputs)
2716 else:
2717 if py_any(is_tensor(x) for x in inputs):
/usr/local/lib/python3.6/dist-packages/keras/backend/tensorflow_backend.py in _call(self, inputs)
2673 fetched = self._callable_fn(*array_vals, run_metadata=self.run_metadata)
2674 else:
-> 2675 fetched = self._callable_fn(*array_vals)
2676 return fetched[:len(self.outputs)]
2677
/usr/local/lib/python3.6/dist-packages/tensorflow/python/client/session.py in __call__(self, *args, **kwargs)
1456 ret = tf_session.TF_SessionRunCallable(self._session._session,
1457 self._handle, args,
-> 1458 run_metadata_ptr)
1459 if run_metadata:
1460 proto_data = tf_session.TF_GetBuffer(run_metadata_ptr)
ResourceExhaustedError: 2 root error(s) found.
(0) Resource exhausted: OOM when allocating tensor with shape[6,1376,32,32] and type float on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc
[[{{node batch_normalization_302/FusedBatchNorm}}]]
Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info.
[[loss_5/mul/_21899]]
Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info.
(1) Resource exhausted: OOM when allocating tensor with shape[6,1376,32,32] and type float on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc
[[{{node batch_normalization_302/FusedBatchNorm}}]]
Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info.
0 successful operations.
0 derived errors ignored.
Can I evaluate on arbitrarily sized input? Is there a way?
The model seems to return an error when I change the Input in the TBPP model to a None-sized array.
Hello,
Can you please upload the pickled dataset that you used for training SegLink?
It would be great if we can just run the code first and then try to understand the pipeline. I am asking this because I am finding it difficult to understand how the data is prepared for training SegLink. I have trained object detectors before but I think I am missing a step when it comes to training text detectors. So checking your data and playing with it would definitely help me understand better the pipeline.
Thanks in advance.
I am wondering why the average height is computed as h = (norm(np.cross(dt, tl-br)) + norm(np.cross(dt, tr-bl))) / (2*(norm(dt)+eps)) in the polygon_to_rbox3 function. Why not just compute it as (norm(tr - br) + norm(tl - bl))/2?
Original code reference: https://github.com/mvoelk/ssd_detectors/blob/master/utils/bboxes.py#L60
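The two expressions only agree when the sides are perpendicular to dt. A small numeric sketch on a parallelogram illustrates the difference. It assumes that dt is the direction of the top edge (tr - tl), which is a guess about the referenced code, and uses a hand-written 2D cross product where the original uses norm(np.cross(...)):

```python
import numpy as np

def cross2d(a, b):
    # z-component of the cross product of two 2D vectors
    return a[0] * b[1] - a[1] * b[0]

# Parallelogram: horizontal top and bottom edges, slanted sides.
tl, tr = np.array([0.0, 0.0]), np.array([4.0, 0.0])
br, bl = np.array([5.0, 2.0]), np.array([1.0, 2.0])

dt = tr - tl  # assumed meaning of dt
eps = 1e-6

h_cross = (abs(cross2d(dt, tl - br)) + abs(cross2d(dt, tr - bl))) / (2 * (np.linalg.norm(dt) + eps))
h_sides = (np.linalg.norm(tr - br) + np.linalg.norm(tl - bl)) / 2

print(h_cross)  # close to 2.0: extent measured perpendicular to dt
print(h_sides)  # close to 2.236: length of the slanted sides
```

Under that assumption, the cross product form measures the height perpendicular to the box direction, while the side-length form overestimates it for skewed quadrilaterals.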
Hello!
Thank you so much for your code and for helping everyone.
I tried to run your *_train code, but because learning is so slow, I don't think I can ever complete the training.
Your *_evaluate code uses trained weights (.h5), which I don't have, so I have trouble running it as well.
Would you mind providing the checkpoints?
It would be a great help if you could give me a hand.
Thank you!
For me, I want to train a very lightweight/fast object detection model for recognizing a single solid object e.g. a play station joystick. I tried transfer learning on tensorflow object detection API with SSDLiteMobileNetV2 but it's not fast enough because it was made to be big so that it can predict multiple classes. But I want to predict only one class which is a rigid object that won't deform or change shape at all.
That's why I'm thinking of defining MobileNetV2 to be a bit smaller and training SSD from scratch (as I think it's not possible to reuse the weights from the bigger model) so that I could achieve faster inference on a mobile phone. And maybe later I will convert the model to TF Lite.
For example, I want my model to run fast like this paper: https://arxiv.org/abs/1907.05047
Hi @mvoelk, good job reimplementing a bunch of SSD variations with advanced and SoTA techniques. May I ask about something? You mention in https://github.com/mvoelk/ssd_detectors/blob/master/sl_utils.py#L290 that there is a vertical issue where it becomes nan/inf. Does this have something to do with the way you wrap the dataset from the .pkl file, or is it an issue with the decode function itself? Do you have some direction on how to address this issue?
Thank you
Not necessarily an issue, but the mAP I got from DSOD512 training on VOC 07+12 and testing on 07 was quite low, approximately 0.13.
Only thing I really changed was using Adam instead of AdamAccumulate because it throws an error on tf 2.0. I also used softmax.
Also, metrics don't show during training other than the loss itself.
def trainMultiGPU():
    # set up data sets
    gt_util_voc = GTUtility("data/VOC2012train/")
    gt_util_voc7 = GTUtility("data/VOC2007train/")
    gt_util_voc_val = GTUtility("data/VOC2012val/", validation=True)
    gt_util_voc7_val = GTUtility("data/VOC2007val/", validation=True)
    gt_util_train = GTUtility.merge(gt_util_voc, gt_util_voc7)
    gt_util_val = GTUtility.merge(gt_util_voc_val, gt_util_voc7_val)
    experiment = 'dsod300_voc12_7'
    batch_size = 16
    # class_weights = prior_util.compute_class_weights(gt_util_train)
    class_weights = np.array(
        [0.00007169, 1.20864663, 1.23607288, 0.81087541, 1.32018959, 1.65339534, 1.47852761, 0.45099343, 0.84154551,
         0.33765636, 1.41315118, 1.32907548, 0.63492811, 1.15680594, 1.18978997, 0.07548318, 0.91531396, 1.21262288,
         1.15910985, 1.49269817, 1.08304682])
    # DSOD paper
    # batch size 128
    # 320k iterations
    # initial learning rate 0.1
    epochs = 1000
    initial_epoch = 0
    with tf.device("/cpu:0"):
        # set up DSOD 512
        model = DSOD512(num_classes=gt_util_train.num_classes, softmax=True)
    prior_util = PriorUtil(model)
    gen_train = InputGenerator(gt_util_train, prior_util, batch_size, model.image_size, augmentation=True)
    gen_val = InputGenerator(gt_util_val, prior_util, batch_size, model.image_size, augmentation=True)
    # weight decay
    regularizer = keras.regularizers.l2(5e-4) # None if disabled
    for l in model.layers:
        if l.__class__.__name__.startswith('Conv'):
            l.kernel_regularizer = regularizer
    checkdir = './checkpoints/' + time.strftime('%Y%m%d%H%M') + '_' + experiment
    if not os.path.exists(checkdir):
        os.makedirs(checkdir)
    optim = keras.optimizers.Adam(lr=1e-3)
    # loss = SSDLoss(alpha=1.0, neg_pos_ratio=3.0)
    loss = SSDFocalLoss(lambda_conf=1.0, class_weights=class_weights)
    model = multi_gpu_model(model, gpus=2)
    model.compile(optimizer=optim, loss=loss.compute, metrics=loss.metrics)
    # add some callbacks
    reduce_lr = keras.callbacks.ReduceLROnPlateau(monitor='val_loss', factor=0.1, patience=3, verbose=1)
    early_stopping = keras.callbacks.EarlyStopping(monitor='val_loss', min_delta=0, patience=10, verbose=1)
    history = model.fit(
        gen_train.generate(),
        steps_per_epoch=gen_train.num_batches,
        epochs=epochs,
        verbose=1,
        callbacks=[
            keras.callbacks.ModelCheckpoint(checkdir + '/weights.{epoch:03d}.h5', verbose=1, save_weights_only=True,
                                            save_best_only=True, period=3),
            Logger(checkdir),
            reduce_lr,
            early_stopping
        ],
        validation_data=gen_val.generate(),
        validation_steps=gen_val.num_batches,
        class_weight=None,
        workers=1,
        use_multiprocessing=False,
        initial_epoch=initial_epoch)
Hi,
Can you please provide the .mat file of the dataset?
Hello! First of all, thank you very much for your implementation of TBPP in Keras. It is well written and simple to use!
I'm having trouble understanding just one thing: the output of an inference in TBPP (at least for me) is a tensor with the shape (batches, 76454, 18). I understand the eighteen values to be the predicted positions, boxes, and classes. But where does the 76454 come from?
I'm trying to hack your model to use a tfRecord dataset input, because I have a big problem with my GPU being idle while my CPU is stressed doing data loading.
Thank you again for all the help you have provided to my master's research by implementing these models!
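In SSD-style detectors the second output dimension is usually the total number of prior boxes, summed over all feature maps. A short arithmetic sketch that reproduces 76454, assuming seven square feature maps of side 64 down to 1 for a 512x512 input and 14 priors per location; both figures are assumptions chosen because they match the number, not values read from the repository:

```python
# Assumed feature map side lengths for a 512x512 input.
map_sizes = [64, 32, 16, 8, 4, 2, 1]

# One set of priors is placed at every cell of every feature map.
locations = sum(s * s for s in map_sizes)

# Assumption: 7 aspect ratios times 2 vertical offsets per location.
priors_per_location = 14

num_priors = locations * priors_per_location
print(locations, num_priors)  # 5461 76454
```

The intermediate value 5461 also matches the second dimension reported for the SegLink model, which would be consistent with one segment per location.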
Hi - thank you for the well written implementation of SSD Text detection using Keras. I have been using the code from SL_end2end_predict.ipynb to get back detected characters along with their bounding boxes.
The Detection model has an output dimension of (batches, 5461, 31). Here, how may I retrieve the coordinates of a prediction box?
Thank you.
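The raw output holds scores and offsets relative to the prior boxes rather than coordinates, so it has to be decoded; elsewhere on this page that is done with prior_util.decode(results[0], segment_threshold, link_threshold). For inspecting the tensor by hand, the following sketch splits the 31 channels. The layout (2 segment scores, 5 offsets, 16 within-layer link scores, 8 cross-layer link scores) is an assumption based on the SegLink formulation and on 2 + 5 + 16 + 8 = 31; it is not verified against sl_utils.py.

```python
import numpy as np

# Random stand-in for det_model.predict(images); shape taken from the question.
preds = np.random.rand(1, 5461, 31).astype(np.float32)
y = preds[0]

segment_labels = y[:, 0:2]        # background / segment scores (assumed)
segment_offsets = y[:, 2:7]       # x, y, w, h, theta offsets to the prior (assumed)
within_layer_links = y[:, 7:23]   # 8 neighbours x 2 scores (assumed)
cross_layer_links = y[:, 23:31]   # 4 neighbours x 2 scores (assumed)

confs = segment_labels[:, 1]
candidates = np.flatnonzero(confs > 0.6)  # indices of confident segments
print(segment_offsets.shape, len(candidates))
```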
Hi thank you once again. As mentioned in #49 I am trying to get back the predicted words (list of characters) + the coordinates of their bounding box using the code from SL_end2end_predict.ipynb on my custom images.
Could you please list how to retrieve this from the prediction returned by the rec_model?
I initialise my CRNN with
rec_model = CRNN((input_width, input_height, 1), len(alphabet), gru=True, prediction_only=True)
Thank you.
Hello again,
I just wanted to know where you got the ICDAR2013 dataset to train your CRNN model. I found other datasets here: http://rrc.cvc.uab.es but I can't find ICDAR2013.
In fact, I just want to run your code for training CRNN to see the workflow and then switch the dataset to my own custom one.
Thanks in advance!
Hello,
What is the sample_random_batch() function inside TBPP_train.ipynb? Is it part of the dict returned by pickle.load()? I'm sorry if I sound ignorant, I can't seem to find it anywhere, please advise.
Thanks,
Kumar
Hi,
How can I use the tbpp (Densenet) with my own data? Specifically, I see inside tbpp_evaluate.ipynb that I need to call sample_random_batch() on gt_util_val, which is created by split() from the pickle file gt_util_synthtext_seglink.pkl. I understand that I need to create a pickle file with my test data in order to fit into this pipeline. However, in your code I see that the pkl is generated from a .mat file (I understand that is the Matlab format).
So here is where I am stuck: do I need to create my files in .mat format so I can sneak them into the pipeline for evaluation/testing? Can I create a pkl file directly, bypassing the .mat creation?
I do believe it would be quite useful if there were a small writeup about the format that the input data needs to be in. I've looked hard at Ankush Gupta's SynthText repo (https://github.com/ankush-me/SynthText) and am still wrapping my head around the format used. I'm also reading Ankush's paper as you view this issue.
Please clarify,
Thanks,
Krishna
Hey mvoelk,
your work is great!
I try to train DSOD512 with VOC2012.
from data_voc import GTUtility
gt_util_voc = GTUtility('data/VOC2012/')
gt_util_train, gt_util_val = GTUtility.split(gt_util_voc)
.
.
.
with open(checkdir+'/source.py','wb') as f:
    source = ''.join(['# In[%i]\n%s\n\n' % (i, In[i]) for i in range(len(In))])
    f.write(source.encode())
and I get an error because 'In' is unknown.
I don't think the problem is the splitting; I got the same issue before I changed the code. Is In the training list?
Could you help me, please?
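In is the input history list that IPython and Jupyter define automatically; it does not exist when the same code runs as a plain Python script, which would explain the error. A minimal sketch of a guarded version; dump_notebook_source is a hypothetical helper name:

```python
def dump_notebook_source(path, history):
    """Write the notebook input cells to path; do nothing if there is no history."""
    if not history:
        return False
    source = ''.join('# In[%i]\n%s\n\n' % (i, cell) for i, cell in enumerate(history))
    with open(path, 'wb') as f:
        f.write(source.encode())
    return True

# In a notebook this finds IPython's history; in a plain script it is None.
history = globals().get('In')
```

Calling dump_notebook_source(checkdir + '/source.py', history) then skips the dump outside a notebook. Removing the three lines from the script has the same effect, since they only archive the notebook source.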
Hi,
I am trying to duplicate the tbpp512fl_synthtext experiment. After the 9th epoch completes, I have: val_precision: 0.8418 - val_recall: 0.6407 - val_accuracy: 0.5731 - val_fmeasure: 0.7189.
It is far from "TextBoxes++ with DenseNet and Focal Loss" performance that you have reported on the ReadMe (0.901 precision, 0.931 recall and 0.916 fmeasure). It also looks to me that I will never reach 90+% performance.
I understand that I have not used DenseNet yet, and you also do not report "TextBoxes++ with Focal Loss" (without DenseNet). Since it will take me days to duplicate the experiment with "TextBoxes++ with DenseNet and Focal Loss", I want to understand whether the result that I obtained above for tbpp512fl_synthtext experiment is reasonable.
It is really surprising to me that using-or-not-using DenseNet would cause such large differences. So, please confirm if you got a chance. Many thanks indeed.
Best Regards
Jie
Hi, I have ported the SL_end2end_predict.ipynb to a .py file that loads only user images and gets predictions from them.
I am getting an output tensor shape of (1, 5461,18) for a single image. These are the summarized values:
[[ 0.9778447 0.02215527 0.05647869 ... 0.39081636 0.2786811 0.7213189 ] [ 0.9876583 0.01234164 0.17384888 ... 0.3960406 0.4638915 0.53610843] [ 0.9857997 0.01420039 0.1859522 ... 0.35515028 0.5405665 0.4594335 ] ... [ 0.99148715 0.00851282 0.8232149 ... 0.02318294 0.9777579 0.02224212] [ 0.99127495 0.00872501 -0.4267429 ... 0.0168435 0.98193043 0.01806954] [ 0.98890346 0.01109659 -0.55322266 ... 0.02168869 0.97793293 0.02206702]] [[0.9778447 0.02215527] [0.9876583 0.01234164] [0.9857997 0.01420039] ... [0.99148715 0.00851282] [0.99127495 0.00872501] [0.98890346 0.01109659]] [[ 0.05647869 0.25386062 -0.7133617 0.8498816 0.20088424] [ 0.17384888 0.65766203 -0.7351028 0.9332367 0.14610018] [ 0.1859522 1.1206706 -0.6635226 0.9163737 0.17239925] ... [ 0.8232149 -1.1342822 -3.839357 -1.5365617 0.0083948 ] [-0.4267429 -0.5418013 -3.4509156 -1.672849 -0.05473372] [-0.55322266 -0.27144217 -0.29643777 -0.03384027 0.6844612 ]] [[0.9789397 0.02106032 0.9673921 ... 0.03583498 0.9666423 0.03335765] [0.98277885 0.01722118 0.9819172 ... 0.02573826 0.97589934 0.02410063] [0.97954214 0.0204578 0.97924906 ... 0.02678417 0.975162 0.02483795] ... [0.97945476 0.02054522 0.9785146 ... 0.01981203 0.97710925 0.02289074] [0.9828745 0.01712552 0.9792094 ... 0.01742494 0.97953975 0.02046019] [0.9779886 0.02201145 0.9780286 ... 0.02184003 0.9782518 0.0217482 ]] [[0.33596185 0.6640381 0.30375123 ... 0.39081636 0.2786811 0.7213189 ] [0.6212505 0.37874946 0.12344692 ... 0.3960406 0.4638915 0.53610843] [0.53112626 0.46887374 0.17611167 ... 0.35515028 0.5405665 0.4594335 ] ... [0.9766356 0.02336443 0.9767829 ... 0.02318294 0.9777579 0.02224212] [0.98007303 0.019927 0.97990566 ... 0.0168435 0.98193043 0.01806954] [0.9779488 0.02205116 0.9778461 ... 0.02168869 0.97793293 0.02206702]]
The issue is that sl_utils.py:304, confs = segment_labels[:,1], extracts
[0.02215527 0.01234164 0.01420039 ... 0.00851282 0.00872501 0.01109659], which do not look like confidence values. Is my model output incorrect because of the input image?
My input is:
`for img_path in glob.glob('./examples_images/*'):
    img = cv2.imread(img_path)
    images_orig.append(np.copy(img))
    h, w = image_size
    # pass interpolation by keyword; the fourth positional argument of cv2.resize is fx
    resized_img = cv2.resize(img, (w, h), interpolation=cv2.INTER_LINEAR)
    resized_img = resized_img[:, :, (2,1,0)] / 255 # BGR to RGB
    images.append(resized_img)
images = np.asarray(images)
preds = det_model.predict(images, batch_size=1, verbose=1)
`
Attached my python file.
sl_crnn.py.txt
Thank you once again for a very helpful repo. Would appreciate your kind help on this.
Hi, I'm a university student in South Korea.
I'm at a beginner level in deep learning.
Is there any way to visualize prior boxes on a SynthText image?
Also, I want to know how to use the plot_boxes function in PriorMaps and the plot functions in PriorUtils.
Hi, I'm kind of a newbie to git and CV.
How can I test TextBoxes++ with my own images? I can see SSD_predict.ipynb and SL_predict.ipynb but no TBPP_predict.ipynb in your repository. Can you help me?
Thanks in advance
Hi, I was trying to use your CRNN code and got the error shown below. I have to admit that I am new to RNNs, but I think all my inputs were labeled correctly for your code.
I have been stuck on this issue for a while. It blocks me from producing my first CRNN model :(
Traceback (most recent call last):
File "crnn_train.py", line 97, in <module>
initial_epoch=0)
File "D:\experimental\random_test\testvenv\lib\site-packages\keras\legacy\interfaces.py", line 91, in wrapper
return func(*args, **kwargs)
File "D:\experimental\random_test\testvenv\lib\site-packages\keras\engine\training.py", line 1658, in fit_generator
initial_epoch=initial_epoch)
File "D:\experimental\random_test\testvenv\lib\site-packages\keras\engine\training_generator.py", line 215, in fit_generator
class_weight=class_weight)
File "D:\experimental\random_test\testvenv\lib\site-packages\keras\engine\training.py", line 1449, in train_on_batch
outputs = self.train_function(ins)
File "D:\experimental\random_test\testvenv\lib\site-packages\keras\backend\tensorflow_backend.py", line 2979, in __call__
return self._call(inputs)
File "D:\experimental\random_test\testvenv\lib\site-packages\keras\backend\tensorflow_backend.py", line 2937, in _call
fetched = self._callable_fn(*array_vals)
File "D:\experimental\random_test\testvenv\lib\site-packages\tensorflow\python\client\session.py", line 1458, in __call__
run_metadata_ptr)
tensorflow.python.framework.errors_impl.InvalidArgumentError: 2 root error(s) found.
(0) Invalid argument: Labels length is zero in batch 2
[[{{node ctc/CTCLoss}}]]
(1) Invalid argument: Labels length is zero in batch 2
[[{{node ctc/CTCLoss}}]]
[[training/SGD/gradients/ctc/CTCLoss_grad/mul/_315]]
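The message "Labels length is zero" indicates that at least one sample in the batch has an empty label sequence after encoding, for example an empty string or text made up only of characters outside the alphabet. A minimal sketch for filtering such samples before training. The sample list, the alphabet, and the lowercasing step are all assumptions for illustration, not taken from the repository:

```python
# Hypothetical (image path, transcription) pairs.
samples = [
    ('img_001.png', 'HELLO'),
    ('img_002.png', ''),       # empty transcription
    ('img_003.png', '#!'),     # no character is in the alphabet
    ('img_004.png', 'World'),
]
alphabet = 'abcdefghijklmnopqrstuvwxyz'  # assumed alphabet

def encodable_length(text, alphabet):
    """Number of characters that survive encoding with the given alphabet."""
    return sum(1 for ch in text.lower() if ch in alphabet)

kept = [(path, text) for path, text in samples if encodable_length(text, alphabet) > 0]
print([path for path, _ in kept])
```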
Hello Sir. @mvoelk
Thank you for your great work
What is the meaning of the following log entries?
{"neg_seg_conf_loss": 0.3863193988800049, "link_fmeasure": 0.0, "num_neg_seg": 5007.0, "seg_recall": 0.0, "pos_seg_conf_loss": 1.3579927682876587, "link_recall": 0.0, "link_conf_loss": 0.7612245082855225, "seg_fmeasure": 0.0, "pos_link_conf_loss": 1.7857060432434082, "epoch": 0, "lr": 0.0010000000474974513, "seg_accuracy": 0.304583340883255, "seg_precision": 0.0, "seg_conf_loss": 0.629237711429596, "num_pos_seg": 1669.0, "link_precision": 0.0, "link_accuracy": 0.0, "seg_loc_loss": 3.570629596710205, "loss": 4.961091995239258, "iteration": 33, "batch": 33, "time": 213.97210311889648, "neg_link_conf_loss": 0.41973066329956055}
{"neg_seg_conf_loss": 0.24363349378108978, "link_fmeasure": 0.5489721298217773, "num_neg_seg": 5835.0, "seg_recall": 0.35218510031700134, "pos_seg_conf_loss": 1.1648257970809937, "link_recall": 0.38869863748550415, "link_conf_loss": 0.5120393633842468, "seg_fmeasure": 0.48598790168762207, "pos_link_conf_loss": 1.263663411140442, "epoch": 3, "lr": 0.0010000000474974513, "seg_accuracy": 0.39625000953674316, "seg_precision": 0.7837528586387634, "seg_conf_loss": 0.47393155097961426, "num_pos_seg": 1945.0, "link_precision": 0.9341563582420349, "link_accuracy": 0.37833333015441895, "seg_loc_loss": 2.074434757232666, "loss": 3.060405731201172, "iteration": 99318, "batch": 2709, "time": 275985.530739069, "neg_link_conf_loss": 0.26149797439575195}
How can I tell whether my model has good accuracy or has converged?
Thank you
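Each log line is a JSON record, so the entries can be parsed and tracked over time. A minimal sketch that reads two abbreviated records with values copied from the log above and recomputes the f-measure as the harmonic mean of precision and recall. The eps term is an assumption to avoid division by zero; the repository's exact formula is not confirmed.

```python
import json

# Abbreviated copies of the two log records shown above.
log_lines = [
    '{"epoch": 0, "seg_precision": 0.0, "seg_recall": 0.0, "loss": 4.961091995239258}',
    '{"epoch": 3, "seg_precision": 0.7837528586387634, "seg_recall": 0.35218510031700134, "loss": 3.060405731201172}',
]

def fmeasure(precision, recall, eps=1e-7):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall + eps)

for line in log_lines:
    rec = json.loads(line)
    f = fmeasure(rec['seg_precision'], rec['seg_recall'])
    print(rec['epoch'], round(rec['loss'], 3), round(f, 3))
```

For the second record this gives roughly 0.486, in line with the logged seg_fmeasure. A falling loss together with rising precision, recall, and f-measure on validation data, rather than on individual training batches, is the usual sign that training is progressing.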
environment set1: (use tf2, follow the environment.ipynb)
OS debian stretch/sid
Python 3.7.4
NumPy 1.17.2
Pandas 1.0.4
Matplotlib 3.2.1
OpenCV 3.4.3
TensorFlow 2.0.0-beta1
Keras 2.2.4-tf
tqdm 4.46.1
imageio 2.6.1
environment set2:
OS debian stretch/sid
Python 3.7.5
NumPy 1.18.0
Pandas 0.25.3
Matplotlib 3.2.1
OpenCV 3.4.3
TensorFlow 1.15.0
Keras 2.2.4-tf
tqdm 4.41.1
imageio 2.8.0
When using set1,
PriorUtil fails:
Traceback (most recent call last):
File "/mnt/downloads/github_src/ssd_detectors/SSD_predict.py", line 40, in <module>
prior_util = PriorUtil(model)
File "/mnt/downloads/github_src/ssd_detectors/ssd_utils.py", line 353, in __init__
self.update_priors()
File "/mnt/downloads/github_src/ssd_detectors/ssd_utils.py", line 375, in update_priors
m.compute_priors()
File "/mnt/downloads/github_src/ssd_detectors/ssd_utils.py", line 193, in compute_priors
linx = np.array([(0.5 + i) for i in range(map_w)]) * step_x
TypeError: 'NoneType' object cannot be interpreted as an integer
Traceback (most recent call last):
File "/mnt/downloads/github_src/ssd_detectors/SL_end2end_predict.py", line 41, in <module>
prior_util = PriorUtil(model)
File "/mnt/downloads/github_src/ssd_detectors/sl_utils.py", line 45, in __init__
if i > 0 and np.all(np.array(previous_map_size) != np.array(map_size)*2):
TypeError: unsupported operand type(s) for *: 'NoneType' and 'int'
When using set2,
SSD_predict and SL_predict
work well,
but they also print:
layer missing zero_padding2d_5
file []
What is wrong?
When using set2, I also get:
layer missing reshape_1
file []
something went wrong bidirectional_1
model [[512, 1024], [256, 1024], [1024], [512, 1024], [256, 1024], [1024]]
file [(512, 768), (256, 768), (768,), (512, 768), (256, 768), (768,)]
Layer weight shape (512, 1024) not compatible with provided weight shape (512, 768)
layer missing bidirectional_2
file [(512, 768), (256, 768), (768,), (512, 768), (256, 768), (768,)]
layer missing label_input
file []
layer missing input_length
file []
layer missing label_length
file []
layer missing ctc
file []
Traceback (most recent call last):
File "/mnt/downloads/github_src/ssd_detectors/SL_end2end_predict.py", line 152, in <module>
res_crnn = crnn_model.predict(words)
File "/home/hyj/anaconda3/envs/py37tf15/lib/python3.7/site-packages/tensorflow_core/python/keras/engine/training.py", line 908, in predict
use_multiprocessing=use_multiprocessing)
File "/home/hyj/anaconda3/envs/py37tf15/lib/python3.7/site-packages/tensorflow_core/python/keras/engine/training_arrays.py", line 723, in predict
callbacks=callbacks)
File "/home/hyj/anaconda3/envs/py37tf15/lib/python3.7/site-packages/tensorflow_core/python/keras/engine/training_arrays.py", line 394, in model_iteration
batch_outs = f(ins_batch)
File "/home/hyj/anaconda3/envs/py37tf15/lib/python3.7/site-packages/tensorflow_core/python/keras/backend.py", line 3476, in __call__
run_metadata=self.run_metadata)
File "/home/hyj/anaconda3/envs/py37tf15/lib/python3.7/site-packages/tensorflow_core/python/client/session.py", line 1472, in __call__
run_metadata_ptr)
tensorflow.python.framework.errors_impl.InternalError: 2 root error(s) found.
(0) Internal: Blas GEMM launch failed : a.shape=(3, 512), b.shape=(512, 256), m=3, n=256, k=512
[[{{node bidirectional/forward_lstm_1/while/MatMul}}]]
[[softmax/truediv/_209]]
(1) Internal: Blas GEMM launch failed : a.shape=(3, 512), b.shape=(512, 256), m=3, n=256, k=512
[[{{node bidirectional/forward_lstm_1/while/MatMul}}]]
0 successful operations.
0 derived errors ignored.
thank you !
I am trying to display the output from the tbpp model using real-world images. I have a set of results from the model, and am using the following code:
`from sl_utils import PriorUtil
prior_util = PriorUtil(tbpp_model)
segment_threshold = 0.6; link_threshold = 0.25
res = prior_util.decode(results[0], segment_threshold, link_threshold)
prior_util.plot_results(res)`
I get the following error:
ValueError: operands could not be broadcast together with shapes (76454,6) (76454,8)
The shape of 'results' is (1, 76454, 19)
Hi,
I am trying to experiment with tbpp training program. However, I am a little confused about gt_util_synthtext_seglink.pkl file. How exactly should I generate the file?
I have tried running data_synthtext.py, which generates gt_util_synthtext.pkl. I then simply rename it to gt_util_synthtext_seglink.pkl. However, with this file, when running tbpp training program, I receive exceptions. I am attaching the detailed message below, but I think the problem happens at:
File "Z:\Users\jie\projects\ssd_detectors\tbpp_utils.py", line 21, in <listcomp>
gt_rboxes = np.array([polygon_to_rbox3(np.reshape(p, (-1,2))) for p in gt_data[:,:8]])
p has size 5, so of course it cannot be reshaped to (2).
I suspect that this is because the pickle file is not in the expected format. Could anyone explain to me how exactly the pickle file is generated? Many thanks indeed.
Best Regards
Jie
C:\Python36\python.exe Z:/Users/jie/projects/ssd_detectors/txbb_train.py
Using TensorFlow backend.
layer missing input_2
2018-08-25 09:26:20.701900: I T:\src\github\tensorflow\tensorflow\core\platform\cpu_feature_guard.cc:141] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2
2018-08-25 09:26:21.011315: I T:\src\github\tensorflow\tensorflow\core\common_runtime\gpu\gpu_device.cc:1404] Found device 0 with properties:
name: GeForce GTX 1080 Ti major: 6 minor: 1 memoryClockRate(GHz): 1.683
pciBusID: 0000:01:00.0
totalMemory: 11.00GiB freeMemory: 9.10GiB
2018-08-25 09:26:21.011663: I T:\src\github\tensorflow\tensorflow\core\common_runtime\gpu\gpu_device.cc:1483] Adding visible gpu devices: 0
2018-08-25 09:26:21.703413: I T:\src\github\tensorflow\tensorflow\core\common_runtime\gpu\gpu_device.cc:964] Device interconnect StreamExecutor with strength 1 edge matrix:
2018-08-25 09:26:21.703617: I T:\src\github\tensorflow\tensorflow\core\common_runtime\gpu\gpu_device.cc:970] 0
2018-08-25 09:26:21.703747: I T:\src\github\tensorflow\tensorflow\core\common_runtime\gpu\gpu_device.cc:983] 0: N
2018-08-25 09:26:21.703968: I T:\src\github\tensorflow\tensorflow\core\common_runtime\gpu\gpu_device.cc:1096] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 8795 MB memory) -> physical GPU (device: 0, name: GeForce GTX 1080 Ti, pci bus id: 0000:01:00.0, compute capability: 6.1)
2018-08-25 09:26:21.704888: E T:\src\github\tensorflow\tensorflow\core\common_runtime\gpu\gpu_device.cc:228] Illegal GPUOptions.experimental.num_dev_to_dev_copy_streams=0 set to 1 instead.
layer missing zero_padding2d_2
something went wrong conv4_3_norm_mbox_conf
model [[3, 5, 512, 28], [28]]
file [(3, 3, 512, 84), (84,)]
Layer weight shape (3, 5, 512, 28) not compatible with provided weight shape (3, 3, 512, 84)
layer missing fc7_mbox_conf
layer missing conv6_2_mbox_conf
layer missing conv7_2_mbox_conf
layer missing conv8_2_mbox_conf
layer missing conv9_2_mbox_conf
layer missing conv10_2_mbox_conf
something went wrong conv4_3_norm_mbox_loc
model [[3, 5, 512, 56], [56]]
file [(3, 3, 512, 16), (16,)]
Layer weight shape (3, 5, 512, 56) not compatible with provided weight shape (3, 3, 512, 16)
layer missing fc7_mbox_loc
layer missing conv6_2_mbox_loc
layer missing conv7_2_mbox_loc
layer missing conv8_2_mbox_loc
layer missing conv9_2_mbox_loc
layer missing conv10_2_mbox_loc
layer missing fc7_mbox_conf_flat
layer missing conv6_2_mbox_conf_flat
layer missing conv7_2_mbox_conf_flat
layer missing conv8_2_mbox_conf_flat
layer missing conv9_2_mbox_conf_flat
layer missing conv10_2_mbox_conf_flat
layer missing fc7_mbox_loc_flat
layer missing conv6_2_mbox_loc_flat
layer missing conv7_2_mbox_loc_flat
layer missing conv8_2_mbox_loc_flat
layer missing conv9_2_mbox_loc_flat
layer missing conv10_2_mbox_loc_flat
layer missing conv4_3_norm_mbox_priorbox
layer missing fc7_mbox_priorbox
layer missing conv6_2_mbox_priorbox
layer missing conv7_2_mbox_priorbox
layer missing conv8_2_mbox_priorbox
layer missing conv9_2_mbox_priorbox
layer missing conv10_2_mbox_priorbox
layer missing mbox_priorbox
Epoch 1/100
Traceback (most recent call last):
File "Z:/Users/jie/projects/ssd_detectors/txbb_train.py", line 89, in
initial_epoch=initial_epoch,
File "C:\Python36\lib\site-packages\keras\legacy\interfaces.py", line 91, in wrapper
return func(*args, **kwargs)
File "C:\Python36\lib\site-packages\keras\engine\training.py", line 1426, in fit_generator
initial_epoch=initial_epoch)
File "C:\Python36\lib\site-packages\keras\engine\training_generator.py", line 155, in fit_generator
generator_output = next(output_generator)
File "C:\Python36\lib\site-packages\keras\utils\data_utils.py", line 793, in get
six.reraise(value.__class__, value, value.__traceback__)
File "C:\Python36\lib\site-packages\six.py", line 693, in reraise
raise value
File "C:\Python36\lib\site-packages\keras\utils\data_utils.py", line 658, in _data_generator_task
generator_output = next(self._generator)
File "Z:\Users\jie\projects\ssd_detectors\ssd_data.py", line 530, in generate
y = self.prior_util.encode(y)
File "Z:\Users\jie\projects\ssd_detectors\tbpp_utils.py", line 21, in encode
gt_rboxes = np.array([polygon_to_rbox3(np.reshape(p, (-1,2))) for p in gt_data[:,:8]])
File "Z:\Users\jie\projects\ssd_detectors\tbpp_utils.py", line 21, in
gt_rboxes = np.array([polygon_to_rbox3(np.reshape(p, (-1,2))) for p in gt_data[:,:8]])
File "C:\Python36\lib\site-packages\numpy\core\fromnumeric.py", line 279, in reshape
return _wrapfunc(a, 'reshape', newshape, order=order)
File "C:\Python36\lib\site-packages\numpy\core\fromnumeric.py", line 51, in _wrapfunc
return getattr(obj, method)(*args, **kwds)
ValueError: cannot reshape array of size 5 into shape (2)
Process finished with exit code 1
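The failing call reshapes each ground-truth row's first 8 values into (x, y) pairs, so "array of size 5" suggests the rows hold only 5 values. A minimal sketch, assuming (not confirmed in the thread) that each row is `[xmin, ymin, xmax, ymax, class]` and that an axis-aligned rectangle is an acceptable polygon; `boxes_to_polygons` is a hypothetical helper, not part of the repository, and the corner order is also an assumption:

```python
def boxes_to_polygons(rows):
    """Expand [xmin, ymin, xmax, ymax, *rest] rows into 8 corner values + rest.

    Corner order used here (an assumption): top-left, top-right,
    bottom-right, bottom-left.
    """
    out = []
    for row in rows:
        xmin, ymin, xmax, ymax = (float(v) for v in row[:4])
        rest = [float(v) for v in row[4:]]  # e.g. the class label
        polygon = [xmin, ymin, xmax, ymin, xmax, ymax, xmin, ymax]
        out.append(polygon + rest)
    return out


rows = [[0.1, 0.2, 0.5, 0.4, 1.0]]  # one box row with 5 values
converted = boxes_to_polygons(rows)
print(len(converted[0]))  # each row now has 8 polygon values plus the class
```

The result is a plain list of lists; it could be wrapped in `np.array(...)` before being handed to the generator.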
Is "Hard Negative Mining" implemented in the TBPP model?
I loaded the dataset using the code below:
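For readers unfamiliar with the term: hard negative mining, as commonly described for SSD-style detectors, keeps only the negatives with the highest confidence loss so that negatives outnumber positives by at most a fixed ratio. The sketch below only illustrates that selection step in plain Python; it is not taken from this repository, and `select_hard_negatives` is a hypothetical name. Whether and how the TBPP loss does this is not shown on this page.

```python
def select_hard_negatives(losses, is_positive, neg_pos_ratio=3.0):
    """Return indices of the negatives kept for the loss.

    losses:      per-prior confidence loss values
    is_positive: per-prior flags, True where a prior matched ground truth
    """
    num_pos = sum(1 for p in is_positive if p)
    num_neg = int(neg_pos_ratio * num_pos)
    negatives = [i for i, p in enumerate(is_positive) if not p]
    # hardest negatives first: largest loss means most confidently wrong
    negatives.sort(key=lambda i: losses[i], reverse=True)
    return sorted(negatives[:num_neg])


losses = [0.9, 0.1, 0.8, 0.05, 0.7, 0.3]
is_positive = [True, False, False, False, False, False]
kept = select_hard_negatives(losses, is_positive)
print(kept)
```

With one positive and a ratio of 3, only the three highest-loss negatives contribute; the remaining easy negatives are ignored.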
from data_svt import GTUtility
gt_util = GTUtility('./data/svt/')
gt_util_train, gt_util_val = gt_util.split(0.7)
Then I ran the following code:
# SegLink + DenseNet
model = DSODSL512()
#model = DSODSL512(activation='leaky_relu')
weights_path = None
batch_size = 6
experiment = 'dsodsl512_synthtext'
if weights_path is not None:
    if weights_path.find('ssd512') > -1:
        layer_list = [
            'conv1_1', 'conv1_2',
            'conv2_1', 'conv2_2',
            'conv3_1', 'conv3_2', 'conv3_3',
            'conv4_1', 'conv4_2', 'conv4_3',
            'conv5_1', 'conv5_2', 'conv5_3',
            'fc6', 'fc7',
            'conv6_1', 'conv6_2',
            'conv7_1', 'conv7_2',
            'conv8_1', 'conv8_2',
            'conv9_1', 'conv9_2',
        ]
        freeze = [
            'conv1_1', 'conv1_2',
            'conv2_1', 'conv2_2',
            'conv3_1', 'conv3_2', 'conv3_3',
            #'conv4_1', 'conv4_2', 'conv4_3',
            #'conv5_1', 'conv5_2', 'conv5_3',
        ]
        load_weights(model, weights_path, layer_list)
        for layer in model.layers:
            layer.trainable = layer.name not in freeze
    else:
        load_weights(model, weights_path)
prior_util = PriorUtil(model)
And finally:
epochs = 100
initial_epoch = 0
gen_train = InputGenerator(gt_util_train, prior_util, batch_size, model.image_size, augmentation=False)
gen_val = InputGenerator(gt_util_val, prior_util, batch_size, model.image_size, augmentation=False)
checkdir = './checkpoints/' + time.strftime('%Y%m%d%H%M') + '_' + experiment
if not os.path.exists(checkdir):
    os.makedirs(checkdir)
with open(checkdir+'/source.py','wb') as f:
    source = ''.join(['# In[%i]\n%s\n\n' % (i, In[i]) for i in range(len(In))])
    f.write(source.encode())
#optim = keras.optimizers.SGD(lr=1e-3, momentum=0.9, decay=0, nesterov=True)
optim = keras.optimizers.Adam(lr=1e-3, beta_1=0.9, beta_2=0.999, epsilon=0.001, decay=0.0)
# weight decay
regularizer = keras.regularizers.l2(5e-4) # None if disabled
#regularizer = None
for l in model.layers:
    if l.__class__.__name__.startswith('Conv'):
        l.kernel_regularizer = regularizer
loss = SegLinkLoss(lambda_offsets=1.0, lambda_links=1.0, neg_pos_ratio=3.0)
#loss = SegLinkFocalLoss()
#loss = SegLinkFocalLoss(lambda_segments=1.0, lambda_offsets=1.0, lambda_links=1.0)
#loss = SegLinkFocalLoss(gamma_segments=3, gamma_links=3)
model.compile(optimizer=optim, loss=loss.compute, metrics=loss.metrics)
history = model.fit_generator(
    gen_train.generate(),
    steps_per_epoch=gen_train.num_batches,
    epochs=epochs,
    verbose=1,
    callbacks=[
        keras.callbacks.ModelCheckpoint(checkdir+'/weights.{epoch:03d}.h5', verbose=1, save_weights_only=True),
        Logger(checkdir),
        #LearningRateDecay()
    ],
    validation_data=gen_val.generate(),
    validation_steps=gen_val.num_batches,
    class_weight=None,
    max_queue_size=1,
    workers=1,
    #use_multiprocessing=False,
    initial_epoch=initial_epoch,
    #pickle_safe=False, # will use threading instead of multiprocessing, which is lighter on memory use but slower
)
But I get this error:
ValueError Traceback (most recent call last)
in ()
44 class_weight=None,
45 max_queue_size=1,
---> 46 workers=1,
47 #use_multiprocessing=False,
48 #initial_epoch=initial_epoch,
/usr/local/lib/python3.6/dist-packages/keras/legacy/interfaces.py in wrapper(*args, **kwargs)
89 warnings.warn('Update your `' + object_name + '` call to the ' +
90 'Keras 2 API: ' + signature, stacklevel=2)
---> 91 return func(*args, **kwargs)
92 wrapper._original_function = func
93 return wrapper
/usr/local/lib/python3.6/dist-packages/keras/engine/training.py in fit_generator(self, generator, steps_per_epoch, epochs, verbose, callbacks, validation_data, validation_steps, class_weight, max_queue_size, workers, use_multiprocessing, shuffle, initial_epoch)
1416 use_multiprocessing=use_multiprocessing,
1417 shuffle=shuffle,
-> 1418 initial_epoch=initial_epoch)
1419
1420 @interfaces.legacy_generator_methods_support
/usr/local/lib/python3.6/dist-packages/keras/engine/training_generator.py in fit_generator(model, generator, steps_per_epoch, epochs, verbose, callbacks, validation_data, validation_steps, class_weight, max_queue_size, workers, use_multiprocessing, shuffle, initial_epoch)
179 batch_index = 0
180 while steps_done < steps_per_epoch:
--> 181 generator_output = next(output_generator)
182
183 if not hasattr(generator_output, '__len__'):
/usr/local/lib/python3.6/dist-packages/keras/utils/data_utils.py in get(self)
707 "`use_multiprocessing=False, workers > 1`."
708 "For more information see issue #1638.")
--> 709 six.reraise(*sys.exc_info())
/usr/local/lib/python3.6/dist-packages/six.py in reraise(tp, value, tb)
691 if value.__traceback__ is not tb:
692 raise value.with_traceback(tb)
--> 693 raise value
694 finally:
695 value = None
/usr/local/lib/python3.6/dist-packages/keras/utils/data_utils.py in get(self)
683 try:
684 while self.is_running():
--> 685 inputs = self.queue.get(block=True).get()
686 self.queue.task_done()
687 if inputs is not None:
/usr/lib/python3.6/multiprocessing/pool.py in get(self, timeout)
668 return self._value
669 else:
--> 670 raise self._value
671
672 def _set(self, i, obj):
/usr/lib/python3.6/multiprocessing/pool.py in worker(inqueue, outqueue, initializer, initargs, maxtasks, wrap_exception)
117 job, i, func, args, kwds = task
118 try:
--> 119 result = (True, func(*args, **kwds))
120 except Exception as e:
121 if wrap_exception and func is not _helper_reraises_exception:
/usr/local/lib/python3.6/dist-packages/keras/utils/data_utils.py in next_sample(uid)
624 The next value of generator `uid`.
625 """
--> 626 return six.next(_SHARED_SEQUENCES[uid])
627
628
/content/drive/My Drive/ssd_detectors_master/ssd_data.py in generate(self, debug, encode, seed)
565 if len(targets) == batch_size:
566 if encode:
--> 567 targets = [self.prior_util.encode(y) for y in targets]
568 targets = np.array(targets, dtype=np.float32)
569 tmp_inputs = np.array(inputs, dtype=np.float32)
/content/drive/My Drive/ssd_detectors_master/ssd_data.py in <listcomp>(.0)
565 if len(targets) == batch_size:
566 if encode:
--> 567 targets = [self.prior_util.encode(y) for y in targets]
568 targets = np.array(targets, dtype=np.float32)
569 tmp_inputs = np.array(inputs, dtype=np.float32)
/content/drive/My Drive/ssd_detectors_master/sl_utils.py in encode(self, gt_data, debug)
138 polygons = []
139 for word in gt_data:
--> 140 xy = np.reshape(word[:8], (-1, 2))
141 xy = np.copy(xy) * (self.image_w, self.image_h)
142 polygons.append(xy)
/usr/local/lib/python3.6/dist-packages/numpy/core/fromnumeric.py in reshape(a, newshape, order)
290 [5, 6]])
291 """
--> 292 return _wrapfunc(a, 'reshape', newshape, order=order)
293
294
/usr/local/lib/python3.6/dist-packages/numpy/core/fromnumeric.py in _wrapfunc(obj, method, *args, **kwds)
54 def _wrapfunc(obj, method, *args, **kwds):
55 try:
---> 56 return getattr(obj, method)(*args, **kwds)
57
58 # An AttributeError occurs if the object does not have
ValueError: cannot reshape array of size 5 into shape (2)
Also, how can I modify this model for a custom text or object detection dataset? I used labelImg to create a custom dataset, so how can I use this dataset with your model?
Thank you.
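On the labelImg question: labelImg can save annotations as Pascal VOC XML, one file per image. A minimal sketch of reading such a file into normalized box rows with the standard library is shown here. The row layout `[xmin, ymin, xmax, ymax, class_index]` is an assumption about what a custom `GTUtility` would store, `parse_voc_xml` is a hypothetical helper, and the text models in the tracebacks above index 8 polygon coordinates per row, so boxes alone may not be enough for them.

```python
import xml.etree.ElementTree as ET


def parse_voc_xml(xml_text, class_names):
    """Parse one Pascal VOC annotation into normalized box rows."""
    root = ET.fromstring(xml_text)
    width = float(root.findtext('size/width'))
    height = float(root.findtext('size/height'))
    rows = []
    for obj in root.iter('object'):
        box = obj.find('bndbox')
        rows.append([
            float(box.findtext('xmin')) / width,
            float(box.findtext('ymin')) / height,
            float(box.findtext('xmax')) / width,
            float(box.findtext('ymax')) / height,
            float(class_names.index(obj.findtext('name'))),
        ])
    return rows


# made-up annotation used only to exercise the parser
sample_xml = """<annotation>
  <size><width>200</width><height>100</height><depth>3</depth></size>
  <object>
    <name>text</name>
    <bndbox><xmin>20</xmin><ymin>10</ymin><xmax>100</xmax><ymax>50</ymax></bndbox>
  </object>
</annotation>"""

rows = parse_voc_xml(sample_xml, ['text'])
print(rows)
```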
I'm trying to train my own textboxes++ using the ipython notebook you provided and I noticed an issue:
model = TBPP512_dense(softmax=False)
The notebook appears to use sigmoid (`softmax=False`) for the background and foreground class confidence instead of softmax. However, the original SSD training used softmax, which makes sense since it is not a multi-label problem. Is there any reason for using sigmoid?
Could you give me the Keras code for TextBoxes++? I can't train and test the TextBoxes++ model. Thanks.
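On the sigmoid-versus-softmax question above, one general fact may help: for exactly two classes, the softmax probability of the foreground class equals a sigmoid applied to the difference of the two logits. The sketch below checks this numerically. It says nothing about how `softmax=False` is implemented in the repository, and two independent sigmoid outputs are not constrained to sum to one.

```python
import math


def softmax_foreground(bg_logit, fg_logit):
    """Foreground probability from a 2-class softmax (numerically stabilized)."""
    m = max(bg_logit, fg_logit)
    e_bg = math.exp(bg_logit - m)
    e_fg = math.exp(fg_logit - m)
    return e_fg / (e_bg + e_fg)


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


bg, fg = 0.3, 1.7  # arbitrary example logits
p_softmax = softmax_foreground(bg, fg)
p_sigmoid = sigmoid(fg - bg)
print(p_softmax, p_sigmoid)
```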
Hi @mvoelk !
First of all, thanks a lot for your work.
I found a problem using the code from SL_end2end_predict.ipynb. In my specific use case I want to read some long words (actually sequences of numbers). The detector has no problems and correctly extracts the bounding box (verified by plotting the content of boxes). The issue is that this long word is truncated by the function crop_words, so the output of the CRNN model is wrong.
It doesn't seem to me that cropping a long word is a good way to handle the situation. How do you think I can fix this?
Thanks.
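One way to reason about the truncation described above is to compute how wide a crop would have to be to keep the whole word at a fixed height. The sketch below is only an illustration: `crop_width_for_box` is a hypothetical helper, the height of 32 pixels is an assumed value, and the actual parameters of `crop_words` are not shown on this page.

```python
def crop_width_for_box(box_w, box_h, target_h=32, max_width=None):
    """Width needed to keep the box's aspect ratio at height target_h.

    If max_width is given, the result is capped, which is where a long
    word would get truncated or would have to be squeezed instead.
    """
    width = int(round(box_w / box_h * target_h))
    width = max(width, 1)
    if max_width is not None:
        width = min(width, max_width)
    return width


full = crop_width_for_box(400, 40)                   # long word, no cap
capped = crop_width_for_box(400, 40, max_width=256)  # same word with a cap
print(full, capped)
```

If the uncapped width exceeds what the recognizer was configured for, the choice is between a wider crop, rescaling the word horizontally, or splitting it into several crops.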
Hello, I'm trying to convert your model in TBPP_end2end_predict_GPUonly.ipynb to TFLite format for a mobile application.
This is my environment
Python 3.6.9
Notebook 5.3.1
NumPy 1.18.5
Pandas 1.0.5
Matplotlib 3.2.2
OpenCV 4.1.2
TensorFlow 2.3.0
Keras 2.4.0
tqdm 4.41.1
imageio 2.4.1
Here's my code, run after concatenating the end2end model:
import tensorflow as tf
converter = tf.lite.TFLiteConverter.from_keras_model(model)
tflite_model = converter.convert()
open("convert.tflite", "wb").write(tflite_model)
And I got this error:
Exception Traceback (most recent call last)
/usr/local/lib/python3.6/dist-packages/tensorflow/lite/python/convert.py in toco_convert_protos(model_flags_str, toco_flags_str, input_data_str, debug_info_str, enable_mlir_converter)
198 debug_info_str,
--> 199 enable_mlir_converter)
200 return model_str
5 frames
/usr/local/lib/python3.6/dist-packages/tensorflow/lite/python/wrap_toco.py in wrapped_toco_convert(model_flags_str, toco_flags_str, input_data_str, debug_info_str, enable_mlir_converter)
37 debug_info_str,
---> 38 enable_mlir_converter)
39
Exception: /usr/local/lib/python3.6/dist-packages/tensorflow/python/util/deprecation.py:507:0: error: requires element_shape to be 1D tensor during TF Lite transformation pass
/usr/local/lib/python3.6/dist-packages/tensorflow/python/util/deprecation.py:574:0: note: called from
/content/drive/.shortcut-targets-by-id/1U-eNJ9b4Rq8kRh8RQ-t88VQTH1HUMEmA/ssd_detectors/tbpp_layers.py:209:0: note: called from
/usr/local/lib/python3.6/dist-packages/tensorflow/python/autograph/impl/api.py:302:0: note: called from
/usr/local/lib/python3.6/dist-packages/tensorflow/python/keras/engine/base_layer.py:985:0: note: called from
/usr/local/lib/python3.6/dist-packages/tensorflow/python/keras/engine/functional.py:508:0: note: called from
/usr/local/lib/python3.6/dist-packages/tensorflow/python/keras/engine/functional.py:386:0: note: called from
/usr/local/lib/python3.6/dist-packages/tensorflow/python/keras/engine/base_layer.py:985:0: note: called from
/usr/local/lib/python3.6/dist-packages/tensorflow/python/keras/saving/saving_utils.py:134:0: note: called from
/usr/local/lib/python3.6/dist-packages/tensorflow/python/eager/def_function.py:600:0: note: called from
/usr/local/lib/python3.6/dist-packages/tensorflow/python/util/deprecation.py:507:0: note: see current operation: %1382 = "tf.TensorListReserve"(%249, %1381) {device = ""} : (tensor<i32>, tensor<i32>) -> tensor<!tf.variant<tensor<*xf32>>>
/usr/local/lib/python3.6/dist-packages/tensorflow/python/util/deprecation.py:507:0: error: failed to legalize operation 'tf.TensorListReserve' that was explicitly marked illegal
/usr/local/lib/python3.6/dist-packages/tensorflow/python/util/deprecation.py:574:0: note: called from
/content/drive/.shortcut-targets-by-id/1U-eNJ9b4Rq8kRh8RQ-t88VQTH1HUMEmA/ssd_detectors/tbpp_layers.py:209:0: note: called from
/usr/local/lib/python3.6/dist-packages/tensorflow/python/autograph/impl/api.py:302:0: note: called from
/usr/local/lib/python3.6/dist-packages/tensorflow/python/keras/engine/base_layer.py:985:0: note: called from
/usr/local/lib/python3.6/dist-packages/tensorflow/python/keras/engine/functional.py:508:0: note: called from
/usr/local/lib/python3.6/dist-packages/tensorflow/python/keras/engine/functional.py:386:0: note: called from
/usr/local/lib/python3.6/dist-packages/tensorflow/python/keras/engine/base_layer.py:985:0: note: called from
/usr/local/lib/python3.6/dist-packages/tensorflow/python/keras/saving/saving_utils.py:134:0: note: called from
/usr/local/lib/python3.6/dist-packages/tensorflow/python/eager/def_function.py:600:0: note: called from
/usr/local/lib/python3.6/dist-packages/tensorflow/python/util/deprecation.py:507:0: note: see current operation: %1382 = "tf.TensorListReserve"(%249, %1381) {device = ""} : (tensor<i32>, tensor<i32>) -> tensor<!tf.variant<tensor<*xf32>>>
During handling of the above exception, another exception occurred:
ConverterError Traceback (most recent call last)
<ipython-input-9-ba886ac57a78> in <module>()
4 # converter.experimental_new_converter = True
5 # converter.target_spec.supported_ops =[tf.lite.OpsSet.TFLITE_BUILTINS, tf.lite.OpsSet.SELECT_TF_OPS]
----> 6 tflite_model = converter.convert()
7
8 open("convert.tflite", "wb").write(tflite_model)
/usr/local/lib/python3.6/dist-packages/tensorflow/lite/python/lite.py in convert(self)
829
830 return super(TFLiteKerasModelConverterV2,
--> 831 self).convert(graph_def, input_tensors, output_tensors)
832
833
/usr/local/lib/python3.6/dist-packages/tensorflow/lite/python/lite.py in convert(self, graph_def, input_tensors, output_tensors)
631 input_tensors=input_tensors,
632 output_tensors=output_tensors,
--> 633 **converter_kwargs)
634
635 calibrate_and_quantize, flags = quant_mode.quantizer_flags(
/usr/local/lib/python3.6/dist-packages/tensorflow/lite/python/convert.py in toco_convert_impl(input_data, input_tensors, output_tensors, enable_mlir_converter, *args, **kwargs)
572 input_data.SerializeToString(),
573 debug_info_str=debug_info_str,
--> 574 enable_mlir_converter=enable_mlir_converter)
575 return data
576
/usr/local/lib/python3.6/dist-packages/tensorflow/lite/python/convert.py in toco_convert_protos(model_flags_str, toco_flags_str, input_data_str, debug_info_str, enable_mlir_converter)
200 return model_str
201 except Exception as e:
--> 202 raise ConverterError(str(e))
203
204 if distutils.spawn.find_executable(_toco_from_proto_bin) is None:
ConverterError: /usr/local/lib/python3.6/dist-packages/tensorflow/python/util/deprecation.py:507:0: error: requires element_shape to be 1D tensor during TF Lite transformation pass
/usr/local/lib/python3.6/dist-packages/tensorflow/python/util/deprecation.py:574:0: note: called from
/content/drive/.shortcut-targets-by-id/1U-eNJ9b4Rq8kRh8RQ-t88VQTH1HUMEmA/ssd_detectors/tbpp_layers.py:209:0: note: called from
/usr/local/lib/python3.6/dist-packages/tensorflow/python/autograph/impl/api.py:302:0: note: called from
/usr/local/lib/python3.6/dist-packages/tensorflow/python/keras/engine/base_layer.py:985:0: note: called from
/usr/local/lib/python3.6/dist-packages/tensorflow/python/keras/engine/functional.py:508:0: note: called from
/usr/local/lib/python3.6/dist-packages/tensorflow/python/keras/engine/functional.py:386:0: note: called from
/usr/local/lib/python3.6/dist-packages/tensorflow/python/keras/engine/base_layer.py:985:0: note: called from
/usr/local/lib/python3.6/dist-packages/tensorflow/python/keras/saving/saving_utils.py:134:0: note: called from
/usr/local/lib/python3.6/dist-packages/tensorflow/python/eager/def_function.py:600:0: note: called from
/usr/local/lib/python3.6/dist-packages/tensorflow/python/util/deprecation.py:507:0: note: see current operation: %1382 = "tf.TensorListReserve"(%249, %1381) {device = ""} : (tensor<i32>, tensor<i32>) -> tensor<!tf.variant<tensor<*xf32>>>
/usr/local/lib/python3.6/dist-packages/tensorflow/python/util/deprecation.py:507:0: error: failed to legalize operation 'tf.TensorListReserve' that was explicitly marked illegal
/usr/local/lib/python3.6/dist-packages/tensorflow/python/util/deprecation.py:574:0: note: called from
/content/drive/.shortcut-targets-by-id/1U-eNJ9b4Rq8kRh8RQ-t88VQTH1HUMEmA/ssd_detectors/tbpp_layers.py:209:0: note: called from
/usr/local/lib/python3.6/dist-packages/tensorflow/python/autograph/impl/api.py:302:0: note: called from
/usr/local/lib/python3.6/dist-packages/tensorflow/python/keras/engine/base_layer.py:985:0: note: called from
/usr/local/lib/python3.6/dist-packages/tensorflow/python/keras/engine/functional.py:508:0: note: called from
/usr/local/lib/python3.6/dist-packages/tensorflow/python/keras/engine/functional.py:386:0: note: called from
/usr/local/lib/python3.6/dist-packages/tensorflow/python/keras/engine/base_layer.py:985:0: note: called from
/usr/local/lib/python3.6/dist-packages/tensorflow/python/keras/saving/saving_utils.py:134:0: note: called from
/usr/local/lib/python3.6/dist-packages/tensorflow/python/eager/def_function.py:600:0: note: called from
/usr/local/lib/python3.6/dist-packages/tensorflow/python/util/deprecation.py:507:0: note: see current operation: %1382 = "tf.TensorListReserve"(%249, %1381) {device = ""} : (tensor<i32>, tensor<i32>) -> tensor<!tf.variant<tensor<*xf32>>>
Hope you can help me solve this problem, thanks a lot!
In notebook SL_end2end_predict.ipynb:
Model = SL512
weights_path = './models/201809231008_sl512_synthtext/weights.002.h5'
segment_threshold = 0.6; link_threshold = 0.25
plot_name = 'sl512_crnn_sythtext'
Model = DSODSL512
weights_path = './models/201806021007_dsodsl512_synthtext/weights.012.h5'
segment_threshold = 0.55; link_threshold = 0.45
plot_name = 'dsodsl512_crnn_sythtext'
sl_graph = tf.Graph()
with sl_graph.as_default():
    sl_session = tf.compat.v1.Session()
    with sl_session.as_default():
        model = Model()
        prior_util = PriorUtil(model)
        load_weights(model, weights_path)
        image_size = model.image_size
Gives output:
WARNING:tensorflow:From /usr/local/lib/python3.7/site-packages/tensorflow_core/python/ops/resource_variable_ops.py:1635: calling BaseResourceVariable.__init__ (from tensorflow.python.ops.resource_variable_ops) with constraint is deprecated and will be removed in a future version. Instructions for updating: If using Keras pass *_constraint arguments to layers. something went wrong conv2d_1 model [[3, 3, 64, 64], [64]] file [(3, 3, 3, 64), (64,)] Layer weight shape (3, 3, 64, 64) not compatible with provided weight shape (3, 3, 3, 64) something went wrong conv2d_2 model [[3, 3, 64, 128], [128]] file [(3, 3, 64, 64), (64,)] Layer weight shape (3, 3, 64, 128) not compatible with provided weight shape (3, 3, 64, 64) something went wrong batch_normalization_2 model [[128], [128], [128], [128]] file [(64,), (64,), (64,), (64,)] Layer weight shape (128,) not compatible with provided weight shape (64,) something went wrong conv2d_3 model [[1, 1, 128, 192], [192]] file [(3, 3, 64, 128), (128,)] Layer weight shape (1, 1, 128, 192) not compatible with provided weight shape (3, 3, 64, 128) something went wrong batch_normalization_4 model [[192], [192], [192], [192]] file [(128,), (128,), (128,), (128,)] Layer weight shape (192,) not compatible with provided weight shape (128,) something went wrong conv2d_4 model [[3, 3, 192, 48], [48]] file [(1, 1, 128, 192), (192,)] Layer weight shape (3, 3, 192, 48) not compatible with provided weight shape (1, 1, 128, 192) something went wrong batch_normalization_5 model [[176], [176], [176], [176]] file [(192,), (192,), (192,), (192,)] Layer weight shape (176,) not compatible with provided weight shape (192,) something went wrong conv2d_5 model [[1, 1, 176, 192], [192]] file [(3, 3, 192, 48), (48,)] Layer weight shape (1, 1, 176, 192) not compatible with provided weight shape (3, 3, 192, 48) something went wrong batch_normalization_6 model [[192], [192], [192], [192]] file [(176,), (176,), (176,), (176,)] Layer weight shape (192,) not 
compatible with provided weight shape (176,) something went wrong conv2d_6 model [[3, 3, 192, 48], [48]] file [(1, 1, 176, 192), (192,)] Layer weight shape (3, 3, 192, 48) not compatible with provided weight shape (1, 1, 176, 192) something went wrong batch_normalization_7 model [[224], [224], [224], [224]] file [(192,), (192,), (192,), (192,)] Layer weight shape (224,) not compatible with provided weight shape (192,) something went wrong conv2d_7 model [[1, 1, 224, 192], [192]] file [(3, 3, 192, 48), (48,)] Layer weight shape (1, 1, 224, 192) not compatible with provided weight shape (3, 3, 192, 48) something went wrong batch_normalization_8 model [[192], [192], [192], [192]] file [(224,), (224,), (224,), (224,)] Layer weight shape (192,) not compatible with provided weight shape (224,) something went wrong conv2d_8 model [[3, 3, 192, 48], [48]] file [(1, 1, 224, 192), (192,)] Layer weight shape (3, 3, 192, 48) not compatible with provided weight shape (1, 1, 224, 192) something went wrong batch_normalization_9 model [[272], [272], [272], [272]] file [(192,), (192,), (192,), (192,)] Layer weight shape (272,) not compatible with provided weight shape (192,) something went wrong conv2d_9 model [[1, 1, 272, 192], [192]] file [(3, 3, 192, 48), (48,)] Layer weight shape (1, 1, 272, 192) not compatible with provided weight shape (3, 3, 192, 48) something went wrong batch_normalization_10 model [[192], [192], [192], [192]] file [(272,), (272,), (272,), (272,)] Layer weight shape (192,) not compatible with provided weight shape (272,) something went wrong conv2d_10 model [[3, 3, 192, 48], [48]] file [(1, 1, 272, 192), (192,)] Layer weight shape (3, 3, 192, 48) not compatible with provided weight shape (1, 1, 272, 192) something went wrong batch_normalization_11 model [[320], [320], [320], [320]] file [(192,), (192,), (192,), (192,)] Layer weight shape (320,) not compatible with provided weight shape (192,) something went wrong conv2d_11 model [[1, 1, 320, 192], [192]] 
file [(3, 3, 192, 48), (48,)] Layer weight shape (1, 1, 320, 192) not compatible with provided weight shape (3, 3, 192, 48) something went wrong batch_normalization_12 model [[192], [192], [192], [192]] file [(320,), (320,), (320,), (320,)] Layer weight shape (192,) not compatible with provided weight shape (320,) something went wrong conv2d_12 model [[3, 3, 192, 48], [48]] file [(1, 1, 320, 192), (192,)] Layer weight shape (3, 3, 192, 48) not compatible with provided weight shape (1, 1, 320, 192) something went wrong batch_normalization_13 model [[368], [368], [368], [368]] file [(192,), (192,), (192,), (192,)] Layer weight shape (368,) not compatible with provided weight shape (192,) something went wrong conv2d_13 model [[1, 1, 368, 192], [192]] file [(3, 3, 192, 48), (48,)] Layer weight shape (1, 1, 368, 192) not compatible with provided weight shape (3, 3, 192, 48) something went wrong batch_normalization_14 model [[192], [192], [192], [192]] file [(368,), (368,), (368,), (368,)] Layer weight shape (192,) not compatible with provided weight shape (368,) something went wrong conv2d_14 model [[3, 3, 192, 48], [48]] file [(1, 1, 368, 192), (192,)] Layer weight shape (3, 3, 192, 48) not compatible with provided weight shape (1, 1, 368, 192) something went wrong batch_normalization_15 model [[416], [416], [416], [416]] file [(192,), (192,), (192,), (192,)] Layer weight shape (416,) not compatible with provided weight shape (192,) something went wrong conv2d_15 model [[1, 1, 416, 416], [416]] file [(3, 3, 192, 48), (48,)] Layer weight shape (1, 1, 416, 416) not compatible with provided weight shape (3, 3, 192, 48) something went wrong conv2d_16 model [[1, 1, 416, 192], [192]] file [(1, 1, 416, 416), (416,)] Layer weight shape (1, 1, 416, 192) not compatible with provided weight shape (1, 1, 416, 416) something went wrong batch_normalization_17 model [[192], [192], [192], [192]] file [(416,), (416,), (416,), (416,)] Layer weight shape (192,) not compatible with 
provided weight shape (416,) something went wrong conv2d_17 model [[3, 3, 192, 48], [48]] file [(1, 1, 416, 192), (192,)] Layer weight shape (3, 3, 192, 48) not compatible with provided weight shape (1, 1, 416, 192) something went wrong batch_normalization_18 model [[464], [464], [464], [464]] file [(192,), (192,), (192,), (192,)] Layer weight shape (464,) not compatible with provided weight shape (192,) something went wrong conv2d_18 model [[1, 1, 464, 192], [192]] file [(3, 3, 192, 48), (48,)] Layer weight shape (1, 1, 464, 192) not compatible with provided weight shape (3, 3, 192, 48) something went wrong batch_normalization_19 model [[192], [192], [192], [192]] file [(464,), (464,), (464,), (464,)] Layer weight shape (192,) not compatible with provided weight shape (464,) something went wrong conv2d_19 model [[3, 3, 192, 48], [48]] file [(1, 1, 464, 192), (192,)] Layer weight shape (3, 3, 192, 48) not compatible with provided weight shape (1, 1, 464, 192) something went wrong batch_normalization_20 model [[512], [512], [512], [512]] file [(192,), (192,), (192,), (192,)] Layer weight shape (512,) not compatible with provided weight shape (192,) something went wrong conv2d_20 model [[1, 1, 512, 192], [192]] file [(3, 3, 192, 48), (48,)] Layer weight shape (1, 1, 512, 192) not compatible with provided weight shape (3, 3, 192, 48) something went wrong batch_normalization_21 model [[192], [192], [192], [192]] file [(512,), (512,), (512,), (512,)] Layer weight shape (192,) not compatible with provided weight shape (512,) something went wrong conv2d_21 model [[3, 3, 192, 48], [48]] file [(1, 1, 512, 192), (192,)] Layer weight shape (3, 3, 192, 48) not compatible with provided weight shape (1, 1, 512, 192) something went wrong batch_normalization_22 model [[560], [560], [560], [560]] file [(192,), (192,), (192,), (192,)] Layer weight shape (560,) not compatible with provided weight shape (192,) something went wrong conv2d_22 model [[1, 1, 560, 192], [192]] file [(3, 3, 
192, 48), (48,)] Layer weight shape (1, 1, 560, 192) not compatible with provided weight shape (3, 3, 192, 48) something went wrong batch_normalization_23 model [[192], [192], [192], [192]] file [(560,), (560,), (560,), (560,)] Layer weight shape (192,) not compatible with provided weight shape (560,) something went wrong conv2d_23 model [[3, 3, 192, 48], [48]] file [(1, 1, 560, 192), (192,)] Layer weight shape (3, 3, 192, 48) not compatible with provided weight shape (1, 1, 560, 192) something went wrong batch_normalization_24 model [[608], [608], [608], [608]] file [(192,), (192,), (192,), (192,)].....
and so on.
Environment: tensorflow 2.2.0
Please let me know why this is happening and what I could do to solve it.
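Looking at the first entries of the log above, the shape the model expects for `conv2d_1` is the shape the file stores under `conv2d_2`, and likewise for `conv2d_2` and `conv2d_3`. That pattern would be consistent with auto-generated layer names being shifted by one between the environment that saved the weights and the one loading them; this is a hypothesis, not something confirmed in the thread. The sketch below tests for such a shift from two name-to-shape dictionaries. `find_index_offset` is a hypothetical helper, and the shapes are copied from the log.

```python
def find_index_offset(model_shapes, file_shapes, prefix, max_offset=3):
    """Return d such that model '<prefix>_<k>' matches file '<prefix>_<k+d>'.

    Returns None if no single offset makes all overlapping layers agree.
    """
    def indexed(shapes):
        result = {}
        for name, shape in shapes.items():
            head, _, tail = name.rpartition('_')
            if head == prefix and tail.isdigit():
                result[int(tail)] = tuple(shape)
        return result

    model_idx, file_idx = indexed(model_shapes), indexed(file_shapes)
    for d in range(-max_offset, max_offset + 1):
        pairs = [(k, k + d) for k in model_idx if k + d in file_idx]
        if pairs and all(model_idx[a] == file_idx[b] for a, b in pairs):
            return d
    return None


# kernel shapes taken from the first lines of the log
model_shapes = {
    'conv2d_1': (3, 3, 64, 64),
    'conv2d_2': (3, 3, 64, 128),
    'conv2d_3': (1, 1, 128, 192),
}
file_shapes = {
    'conv2d_1': (3, 3, 3, 64),
    'conv2d_2': (3, 3, 64, 64),
    'conv2d_3': (3, 3, 64, 128),
    'conv2d_4': (1, 1, 128, 192),
}
offset = find_index_offset(model_shapes, file_shapes, 'conv2d')
print(offset)
```

A non-zero offset would point at a naming mismatch rather than at a wrong architecture or a corrupted weight file.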
I was trying to train the DSODSL512 model using my own data, which is in ICDAR-FST2015 data format.
So, when I tried to train the other models (TB, DSOD) using the same GTUtility and InputGenerators, it worked perfectly, but when I tried doing the same on SegLink models, this error was raised:
ValueError: cannot reshape array of size 5 into shape (2)
The full trace of the error is given below. Please look into it.
P.S. I had changed the GTUtility for the ICDAR2015 to reflect my folder structure.
ValueError Traceback (most recent call last)
<ipython-input-6-a02fda50a65c> in <module>
38 workers=1,
39 #use_multiprocessing=False,
---> 40 initial_epoch=initial_epoch,
41 #pickle_safe=False, # will use threading instead of multiprocessing, which is lighter on memory use but slower
42 )
e:\installed_programs\anaconda3\envs\keras\lib\site-packages\keras\legacy\interfaces.py in wrapper(*args, **kwargs)
89 warnings.warn('Update your `' + object_name + '` call to the ' +
90 'Keras 2 API: ' + signature, stacklevel=2)
---> 91 return func(*args, **kwargs)
92 wrapper._original_function = func
93 return wrapper
e:\installed_programs\anaconda3\envs\keras\lib\site-packages\keras\engine\training.py in fit_generator(self, generator, steps_per_epoch, epochs, verbose, callbacks, validation_data, validation_steps, class_weight, max_queue_size, workers, use_multiprocessing, shuffle, initial_epoch)
1416 use_multiprocessing=use_multiprocessing,
1417 shuffle=shuffle,
-> 1418 initial_epoch=initial_epoch)
1419
1420 @interfaces.legacy_generator_methods_support
e:\installed_programs\anaconda3\envs\keras\lib\site-packages\keras\engine\training_generator.py in fit_generator(model, generator, steps_per_epoch, epochs, verbose, callbacks, validation_data, validation_steps, class_weight, max_queue_size, workers, use_multiprocessing, shuffle, initial_epoch)
179 batch_index = 0
180 while steps_done < steps_per_epoch:
--> 181 generator_output = next(output_generator)
182
183 if not hasattr(generator_output, '__len__'):
e:\installed_programs\anaconda3\envs\keras\lib\site-packages\keras\utils\data_utils.py in get(self)
707 "`use_multiprocessing=False, workers > 1`."
708 "For more information see issue #1638.")
--> 709 six.reraise(*sys.exc_info())
e:\installed_programs\anaconda3\envs\keras\lib\site-packages\six.py in reraise(tp, value, tb)
691 if value.__traceback__ is not tb:
692 raise value.with_traceback(tb)
--> 693 raise value
694 finally:
695 value = None
e:\installed_programs\anaconda3\envs\keras\lib\site-packages\keras\utils\data_utils.py in get(self)
683 try:
684 while self.is_running():
--> 685 inputs = self.queue.get(block=True).get()
686 self.queue.task_done()
687 if inputs is not None:
e:\installed_programs\anaconda3\envs\keras\lib\multiprocessing\pool.py in get(self, timeout)
642 return self._value
643 else:
--> 644 raise self._value
645
646 def _set(self, i, obj):
e:\installed_programs\anaconda3\envs\keras\lib\multiprocessing\pool.py in worker(inqueue, outqueue, initializer, initargs, maxtasks, wrap_exception)
117 job, i, func, args, kwds = task
118 try:
--> 119 result = (True, func(*args, **kwds))
120 except Exception as e:
121 if wrap_exception and func is not _helper_reraises_exception:
e:\installed_programs\anaconda3\envs\keras\lib\site-packages\keras\utils\data_utils.py in next_sample(uid)
624 The next value of generator `uid`.
625 """
--> 626 return six.next(_SHARED_SEQUENCES[uid])
627
628
F:\SSD Detectors\ssd_detectors\ssd_data.py in generate(self, debug, encode, seed)
563 if len(targets) == batch_size:
564 if encode:
--> 565 targets = [self.prior_util.encode(y) for y in targets]
566 targets = np.array(targets, dtype=np.float32)
567 tmp_inputs = np.array(inputs, dtype=np.float32)
F:\SSD Detectors\ssd_detectors\ssd_data.py in <listcomp>(.0)
563 if len(targets) == batch_size:
564 if encode:
--> 565 targets = [self.prior_util.encode(y) for y in targets]
566 targets = np.array(targets, dtype=np.float32)
567 tmp_inputs = np.array(inputs, dtype=np.float32)
F:\SSD Detectors\ssd_detectors\sl_utils.py in encode(self, gt_data, debug)
139 polygons = []
140 for word in gt_data:
--> 141 xy = np.reshape(word[:8], (-1, 2))
142 xy = np.copy(xy) * (self.image_w, self.image_h)
143 polygons.append(xy)
e:\installed_programs\anaconda3\envs\keras\lib\site-packages\numpy\core\fromnumeric.py in reshape(a, newshape, order)
290 [5, 6]])
291 """
--> 292 return _wrapfunc(a, 'reshape', newshape, order=order)
293
294
e:\installed_programs\anaconda3\envs\keras\lib\site-packages\numpy\core\fromnumeric.py in _wrapfunc(obj, method, *args, **kwds)
54 def _wrapfunc(obj, method, *args, **kwds):
55 try:
---> 56 return getattr(obj, method)(*args, **kwds)
57
58 # An AttributeError occurs if the object does not have
ValueError: cannot reshape array of size 5 into shape (2)
The script used is:
#!/usr/bin/env python
# coding: utf-8
# In[1]:
import numpy as np
import matplotlib.pyplot as plt
import keras
import os
import time
import pickle
from sl_model import SL512, DSODSL512
from ssd_data import InputGenerator
from sl_utils import PriorUtil
from sl_training import SegLinkLoss, SegLinkFocalLoss
from utils.training import Logger, LearningRateDecay
from utils.model import load_weights, calc_memory_usage
# ### Data
# In[2]:
from data_icdar2015fst import GTUtility
gt_util_train = GTUtility('data/ICDAR_Camera_photos')
gt_util_val = GTUtility('data/ICDAR_Camera_photos', test=True)
# ### Model
# In[3]:
# SegLink + DenseNet
model = DSODSL512()
#model = DSODSL512(activation='leaky_relu')
weights_path = None
batch_size = 6
experiment = 'dsodsl512_synthtext'
# In[4]:
if weights_path is not None:
    if weights_path.find('ssd512') > -1:
        layer_list = [
            'conv1_1', 'conv1_2',
            'conv2_1', 'conv2_2',
            'conv3_1', 'conv3_2', 'conv3_3',
            'conv4_1', 'conv4_2', 'conv4_3',
            'conv5_1', 'conv5_2', 'conv5_3',
            'fc6', 'fc7',
            'conv6_1', 'conv6_2',
            'conv7_1', 'conv7_2',
            'conv8_1', 'conv8_2',
            'conv9_1', 'conv9_2',
        ]
        freeze = [
            'conv1_1', 'conv1_2',
            'conv2_1', 'conv2_2',
            'conv3_1', 'conv3_2', 'conv3_3',
            #'conv4_1', 'conv4_2', 'conv4_3',
            #'conv5_1', 'conv5_2', 'conv5_3',
        ]
        load_weights(model, weights_path, layer_list)
        for layer in model.layers:
            layer.trainable = layer.name not in freeze
    else:
        load_weights(model, weights_path)
prior_util = PriorUtil(model)
# ### Training
# In[5]:
epochs = 10
initial_epoch = 0
gen_train = InputGenerator(gt_util_train, prior_util, batch_size, model.image_size, augmentation=True)
gen_val = InputGenerator(gt_util_val, prior_util, batch_size, model.image_size, augmentation=True)
# In[6]:
checkdir = './checkpoints/' + time.strftime('%Y%m%d%H%M') + '_' + experiment
if not os.path.exists(checkdir):
    os.makedirs(checkdir)
with open(checkdir+'/source.py','wb') as f:
    source = ''.join(['# In[%i]\n%s\n\n' % (i, In[i]) for i in range(len(In))])
    f.write(source.encode())
#optim = keras.optimizers.SGD(lr=1e-3, momentum=0.9, decay=0, nesterov=True)
optim = keras.optimizers.Adam(lr=1e-3, beta_1=0.9, beta_2=0.999, epsilon=0.001, decay=0.0)
# weight decay
regularizer = keras.regularizers.l2(5e-4) # None if disabled
#regularizer = None
for l in model.layers:
    if l.__class__.__name__.startswith('Conv'):
        l.kernel_regularizer = regularizer
#loss = SegLinkLoss(lambda_offsets=1.0, lambda_links=1.0, neg_pos_ratio=3.0)
loss = SegLinkFocalLoss(lambda_segments=100.0, lambda_offsets=1.0, lambda_links=100.0, gamma_segments=2, gamma_links=2)
model.compile(optimizer=optim, loss=loss.compute, metrics=loss.metrics)
history = model.fit_generator(
    gen_train.generate(),
    steps_per_epoch=gen_train.num_batches,
    epochs=epochs,
    verbose=1,
    callbacks=[
        keras.callbacks.ModelCheckpoint(checkdir+'/weights.{epoch:03d}.h5', verbose=1, save_weights_only=True),
        Logger(checkdir),
        #LearningRateDecay()
    ],
    validation_data=gen_val.generate(),
    validation_steps=gen_val.num_batches,
    class_weight=None,
    max_queue_size=1,
    workers=1,
    #use_multiprocessing=False,
    initial_epoch=initial_epoch,
    #pickle_safe=False, # will use threading instead of multiprocessing, which is lighter on memory use but slower
)
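A note on the traceback above: `np.reshape(word[:8], (-1, 2))` receives only 5 values, which suggests that each ground-truth row holds an axis-aligned box plus a class label rather than the four corner points (8 values) that the SegLink encoder slices out. The following is a minimal sketch of a conversion, assuming the row layout is `[xmin, ymin, xmax, ymax, class]`. The helper `box_to_polygon` and the clockwise corner order starting at the top-left are assumptions for illustration, not part of the repository.

```python
def box_to_polygon(row):
    """Expand [xmin, ymin, xmax, ymax, *rest] into
    [x1, y1, x2, y2, x3, y3, x4, y4, *rest].

    Hypothetical helper: corners are listed top-left, top-right,
    bottom-right, bottom-left (assumed order, image coordinates).
    """
    xmin, ymin, xmax, ymax = row[:4]
    rest = list(row[4:])  # e.g. the class label, kept unchanged
    return [xmin, ymin, xmax, ymin, xmax, ymax, xmin, ymax] + rest


# Example row with normalized coordinates and one class label.
row = [0.1, 0.2, 0.5, 0.4, 1.0]
poly_row = box_to_polygon(row)
```

After such a conversion, the first eight values of every row can be reshaped to `(-1, 2)` as the encoder expects. Whether the rest of the pipeline accepts this layout should be checked against the GTUtility class in use.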
It's such a great implementation. But when I visualized the ground truth after training SegLink for 1 epoch, I saw that some ground-truth bounding boxes didn't fit the text, although I had checked them all before training.
Visualization method: starting from the normalized bounding-box coordinates, I first encoded them and then decoded them (using sl_utils). After that, I drew them on the images.
The problem doesn't appear in all images, only in some. So I wonder whether there is a bug in either the encode or the decode part. Could you spend some time looking into it?
If you need more details, just let me know.
Hello, sir, thanks for this excellent work. I am a newcomer to text detection and recognition. I followed the instructions in the "Usage" section, but I cannot run any of the scripts successfully. It seems a lot of datasets need to be installed beforehand.
Can you provide more specific explanations of how to use this code? A simple example that lets users reproduce the process would also be very helpful. Thanks for your efforts.
Thanks for sharing your code. I have a comment regarding the location of the padding operation in ssd_data.py (line 538 in commit 3329c2e).
I think it should be done before resizing the input image, not after, as we would like to preserve the aspect ratio of the original image. So, I think this version of the code has no effect on preserving the aspect ratio.
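To illustrate the suggestion, here is a minimal sketch of how the padding could be computed before the resize so that the padded image already has the target aspect ratio. The function `padding_for_aspect` is hypothetical and only returns the padding amounts; applying them to the image, shifting the ground-truth boxes accordingly, and the final resize are left out.

```python
def padding_for_aspect(img_h, img_w, target_h, target_w):
    """Return (pad_top, pad_bottom, pad_left, pad_right) such that the
    padded image has the same aspect ratio as the target size.

    Hypothetical helper, not taken from ssd_data.py.
    """
    target_ratio = target_w / target_h
    if img_w / img_h < target_ratio:
        # Image is too narrow: add columns on the left and right.
        new_w = int(round(img_h * target_ratio))
        pad = new_w - img_w
        return 0, 0, pad // 2, pad - pad // 2
    # Image is too wide (or already matches): add rows on top and bottom.
    new_h = int(round(img_w / target_ratio))
    pad = new_h - img_h
    return pad // 2, pad - pad // 2, 0, 0


# A 300x600 (height x width) image going into a 512x512 network input.
pads = padding_for_aspect(300, 600, 512, 512)
```

Resizing the padded image to the target size then scales both axes by the same factor, which is what keeps the original aspect ratio intact.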
Hi @mvoelk
I am trying to use focal loss in the SSD model, as per the code in DSOD_train.ipynb, lines
I would like to understand how the values of the "lamba_conf" and "class_weights" parameters were calculated. Please let me know when you get a chance. Many thanks indeed
Best Regards
Vaibhav
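For readers unfamiliar with the loss being asked about, the following sketch shows the general focal loss for a single prediction, FL(p) = -alpha * (1 - p)^gamma * log(p), where p is the predicted probability of the true class. It does not reproduce the repository's loss class and says nothing about how the values in the notebook were chosen; `alpha` here simply stands in for a per-class weight, and all names are illustrative.

```python
import math


def focal_loss(p_true, gamma=2.0, alpha=1.0, eps=1e-7):
    """Focal loss for one sample.

    p_true: predicted probability of the correct class.
    gamma:  focusing parameter; 0 gives weighted cross-entropy.
    alpha:  weight of the correct class (illustrative stand-in for
            a class weight).
    """
    p = min(max(p_true, eps), 1.0 - eps)  # avoid log(0)
    return -alpha * (1.0 - p) ** gamma * math.log(p)


easy = focal_loss(0.9)             # well-classified sample
hard = focal_loss(0.1)             # badly classified sample
plain = focal_loss(0.9, gamma=0.0) # plain cross-entropy for comparison
```

With `gamma > 0`, well-classified samples are down-weighted relative to cross-entropy, so the many easy background priors contribute less to the total loss.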
The weight files ssd300_coco_weights_fixed, ssd300_voc_weights_fixed, ssd512_coco_weights_fixed and ssd512_voc_weights_fixed.hdf5 have two layers labeled conv4_3_1:b and conv4_3_1:W, but none labeled conv4_3_1.
This triggers an error. Does someone know how to fix the issue?
Hi, Markus,
Cool repo for detection models.
I'm interested in lighter architectures for object detection and tried to replace VGG16 (SSD512_body and SSD300_body) with mobile_net or mobile_net_v2.
But I can't get adequate quality with them on the VOC dataset.
Metrics at epoch 30:
loss 4.34212 conf_loss 0.02317 loc_loss 2.02536
precision 0.54101 recall 0.17049 fmeasure 0.25575 accuracy 0.86921
val_loss 4.28855 val_conf_loss 0.02258 val_loc_loss 2.03045
val_precision 0.49608 val_recall 0.16673 val_fmeasure 0.24578 val_accuracy 0.87617
It is probably due to the new feature maps, an incompatibility between the backbone and the extra blocks, or something in the training pipeline.
Did you try to do the same? If yes, could you share your contact details so we can discuss how to use these architectures in your repo?
Best Wishes,
Valery
I want to train the model on my own data for a specific use case. My dataset is in the ICDAR-FST2015 format, but the InputGenerator in crnn_data, which is used for training the CRNN model, seems to enter an infinite loop, and the training (model.fit_generator()) doesn't show any progress even after hours. Is this normal behavior? Should I convert my dataset to another format (like PASCAL-VOC) and try again?
I have been stuck on this for days now, and any suggestions or help would be highly appreciated.
Thanks.
The training steps I followed:
from data_icdar2015fst import GTUtility
gt_util_train = GTUtility('path')
gt_util_val = GTUtility('path', test=True)
from crnn_utils import alphabet87 as alphabet
input_width = 600
input_height = 800
batch_size = 8
input_shape = (input_width, input_height, 1)
# model, model_pred = CRNN(input_shape, len(alphabet), gru=False)
# experiment = 'crnn_lstm_synthtext'
model, model_pred = CRNN(input_shape, len(alphabet), gru=True)
experiment = 'crnn_gru_synthtext'
max_string_len = model_pred.output_shape[1]
gen_train = InputGenerator(gt_util_train, batch_size, alphabet, input_shape[:2],
                           grayscale=True, max_string_len=max_string_len)
gen_val = InputGenerator(gt_util_val, batch_size, alphabet, input_shape[:2],
                         grayscale=True, max_string_len=max_string_len)
checkdir = './checkpoints/' + time.strftime('%Y%m%d%H%M') + '_' + experiment
if not os.path.exists(checkdir):
    os.makedirs(checkdir)
with open(checkdir+'/source.py','wb') as f:
    source = ''.join(['# In[%i]\n%s\n\n' % (i, In[i]) for i in range(len(In))])
    f.write(source.encode())
optimizer = SGD(lr=0.01, decay=1e-6, momentum=0.9, nesterov=True, clipnorm=5)
#optimizer = Adam(lr=0.02, epsilon=0.001, clipnorm=1.)
# dummy loss, loss is computed in lambda layer
model.compile(loss={'ctc': lambda y_true, y_pred: y_pred}, optimizer=optimizer)
#model.summary()
model.fit_generator(generator=gen_train.generate(), # batch_size here?
                    steps_per_epoch=gt_util_train.num_objects // batch_size,
                    epochs=10,
                    validation_data=gen_val.generate(), # batch_size here?
                    validation_steps=gt_util_val.num_objects // batch_size,
                    verbose=2,
                    callbacks=[
                        ModelCheckpoint(checkdir+'/weights.{epoch:03d}.h5', verbose=1, save_weights_only=True),
                        ModelSnapshot(checkdir, 10000),
                        Logger(checkdir)
                    ],
                    initial_epoch=0)
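One way to check whether the generator itself stalls, independently of Keras, is to request a single batch with a time limit. The sketch below uses only the standard library; `next_with_timeout` is a hypothetical helper, and the `dummy` generator stands in for `gen_train.generate()`.

```python
import threading


def next_with_timeout(gen, seconds):
    """Try to fetch one item from `gen` within `seconds`.

    Returns (True, item) on success and (False, None) if the generator
    has not yielded in time. Exceptions raised by the generator are
    re-raised in the caller.
    """
    result = {}

    def worker():
        try:
            result['batch'] = next(gen)
        except Exception as exc:  # forward generator errors to the caller
            result['error'] = exc

    thread = threading.Thread(target=worker, daemon=True)
    thread.start()
    thread.join(seconds)
    if thread.is_alive():
        return False, None
    if 'error' in result:
        raise result['error']
    return True, result['batch']


def dummy():
    # Stand-in for a real input generator.
    while True:
        yield [1, 2, 3]


ok, batch = next_with_timeout(dummy(), 1.0)
```

If the call returns `(False, None)` for the real generator, the stall happens while assembling a batch, for example because no sample passes the generator's internal filters, and not inside `fit_generator` itself.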
Hi,
I am using SL_train.ipynb to train on my own VOC-format dataset on Windows 10. I used LabelImg to label the ground-truth annotations and data_voc.py to generate the pickle file.
I've only used 5 images (3 for training, 1 for val, 1 for test). I set the batch size to 1. But the training process kept raising the following InvalidArgumentError after passing through the first image.
Can you help? Thanks.
1/4 [======>.......................] - ETA: 1:29 - loss: 20.9251 - seg_conf_loss: 3.7705 - seg_loc_loss: 10.3475 - link_conf_loss: 6.8071 - num_pos_seg: 28.0000 - num_neg_seg: 84.0000 - pos_seg_conf_loss: 3.3756 - neg_seg_conf_loss: 3.9021 - pos_link_conf_loss: 2.0223 - neg_link_conf_loss: 8.4020 - seg_precision: 0.0000e+00 - seg_recall: 0.0000e+00 - seg_accuracy: 0.0000e+00 - seg_fmeasure: 0.0000e+00 - link_precision: 0.0000e+00 - link_recall: 0.0000e+00 - link_accuracy: 0.0000e+00 - link_fmeasure: 0.0000e+00
---------------------------------------------------------------------------
InvalidArgumentError Traceback (most recent call last)
<ipython-input-14-29316809ab04> in <module>()
46 workers=1,
47 #use_multiprocessing=False,
---> 48 initial_epoch=initial_epoch,
49 #pickle_safe=False, # will use threading instead of multiprocessing, which is lighter on memory use but slower
50 )
d:\Anaconda3\envs\text_detection\lib\site-packages\keras\legacy\interfaces.py in wrapper(*args, **kwargs)
89 warnings.warn('Update your `' + object_name +
90 '` call to the Keras 2 API: ' + signature, stacklevel=2)
---> 91 return func(*args, **kwargs)
92 wrapper._original_function = func
93 return wrapper
d:\Anaconda3\envs\text_detection\lib\site-packages\keras\engine\training.py in fit_generator(self, generator, steps_per_epoch, epochs, verbose, callbacks, validation_data, validation_steps, class_weight, max_queue_size, workers, use_multiprocessing, shuffle, initial_epoch)
1413 use_multiprocessing=use_multiprocessing,
1414 shuffle=shuffle,
-> 1415 initial_epoch=initial_epoch)
1416
1417 @interfaces.legacy_generator_methods_support
d:\Anaconda3\envs\text_detection\lib\site-packages\keras\engine\training_generator.py in fit_generator(model, generator, steps_per_epoch, epochs, verbose, callbacks, validation_data, validation_steps, class_weight, max_queue_size, workers, use_multiprocessing, shuffle, initial_epoch)
211 outs = model.train_on_batch(x, y,
212 sample_weight=sample_weight,
--> 213 class_weight=class_weight)
214
215 outs = to_list(outs)
d:\Anaconda3\envs\text_detection\lib\site-packages\keras\engine\training.py in train_on_batch(self, x, y, sample_weight, class_weight)
1213 ins = x + y + sample_weights
1214 self._make_train_function()
-> 1215 outputs = self.train_function(ins)
1216 return unpack_singleton(outputs)
1217
d:\Anaconda3\envs\text_detection\lib\site-packages\keras\backend\tensorflow_backend.py in __call__(self, inputs)
2664 return self._legacy_call(inputs)
2665
-> 2666 return self._call(inputs)
2667 else:
2668 if py_any(is_tensor(x) for x in inputs):
d:\Anaconda3\envs\text_detection\lib\site-packages\keras\backend\tensorflow_backend.py in _call(self, inputs)
2634 symbol_vals,
2635 session)
-> 2636 fetched = self._callable_fn(*array_vals)
2637 return fetched[:len(self.outputs)]
2638
d:\Anaconda3\envs\text_detection\lib\site-packages\tensorflow\python\client\session.py in __call__(self, *args)
1452 else:
1453 return tf_session.TF_DeprecatedSessionRunCallable(
-> 1454 self._session._session, self._handle, args, status, None)
1455
1456 def __del__(self):
d:\Anaconda3\envs\text_detection\lib\site-packages\tensorflow\python\framework\errors_impl.py in __exit__(self, type_arg, value_arg, traceback_arg)
517 None, None,
518 compat.as_text(c_api.TF_Message(self.status.status)),
--> 519 c_api.TF_GetCode(self.status.status))
520 # Delete the underlying status object from memory otherwise it stays alive
521 # as there is a reference to status from this from the traceback due to
InvalidArgumentError: Reshape cannot infer the missing input size for an empty tensor unless all specified input sizes are non-zero
[[Node: training_1/Adam/gradients/loss_1/predictions_loss/TopKV2_grad/Reshape = Reshape[T=DT_INT32, Tshape=DT_INT32, _class=["loc:@train...rseToDense"], _device="/job:localhost/replica:0/task:0/device:CPU:0"](loss_1/predictions_loss/TopKV2:1, training_1/Adam/gradients/loss_1/predictions_loss/TopKV2_grad/stack)]]
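The failing node is the gradient of a top-k operation, and the error message mentions an empty tensor. One possible cause, stated here as an assumption and not a confirmed diagnosis, is a batch in which no positive matches exist, so that hard-negative mining asks for the top 0 negatives. The pure-Python sketch below shows the idea of clamping the number of selected negatives; `select_hard_negatives` and `min_neg` are hypothetical names and this is not the repository's loss code.

```python
def select_hard_negatives(neg_losses, num_pos, neg_pos_ratio=3.0, min_neg=1):
    """Return the indices of the negatives with the highest loss.

    The count is neg_pos_ratio * num_pos, but at least `min_neg`
    (assumed guard against an empty selection) and at most the number
    of available negatives.
    """
    k = int(neg_pos_ratio * num_pos)
    k = max(k, min_neg)          # never ask for zero negatives
    k = min(k, len(neg_losses))  # cannot take more than exist
    order = sorted(range(len(neg_losses)),
                   key=lambda i: neg_losses[i], reverse=True)
    return order[:k]


losses = [0.1, 0.9, 0.5, 0.3]
with_pos = select_hard_negatives(losses, num_pos=1)
without_pos = select_hard_negatives(losses, num_pos=0)
```

With only a handful of training images and a batch size of 1, a single image without matched priors is enough to produce such a batch, so checking the encoded targets for each image is a reasonable first step.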
Hi, I'm a university student in Korea.
First of all, thank you for your help with my previous issue (visualizing).
This time, I wanted the TBPP model's input shape to be arbitrary, so I set the TBPP model's input shape to image.shape.
But when I adapted PriorUtil, I got an assertion error because of the assert in ssd_utils.py, line 193.
To avoid this error, I simply removed the assert and got the result below.
Even without the assert, I have not seen any problem so far.
Could you let me know why you added the assertion, and whether removing it from ssd_utils.py line 193 causes any problems?