borda / keras-yolo3

This project forked from qqwweee/keras-yolo3

A Keras implementation of YOLOv3 (TensorFlow backend), a successor of qqwweee/keras-yolo3

License: MIT License

Python 100.00%

cnn-keras detection classification object-detection tensorflow yolo yolov3 yolov3-tiny

keras-yolo3's Introduction

Keras: YOLO v3

Introduction

A Keras implementation of YOLOv3 (TensorFlow backend) inspired by allanzelener/YAD2K. This fork is a continuation of qqwweee/keras-yolo3 with some CI and bug fixes, since its parent became inactive...

To install the package, use one of the following commands:

pip install git+https://github.com/Borda/keras-yolo3.git
pip install https://github.com/Borda/keras-yolo3/archive/master.zip

or clone/download the repository locally and run python setup.py install

Quick Start

For more models and configurations, please see the YOLO website and the darknet repository.

Download YOLOv3 weights from YOLO website.

wget -O ./model_data/yolo.weights  \
   https://pjreddie.com/media/files/yolov3.weights  \
   --progress=bar:force:noscroll

Alternatively, you can download the lighter version, yolov3-tiny.weights.

Convert the Darknet YOLO model to a Keras model.

python3 scripts/convert_weights.py \
    --config_path ./model_data/yolo.cfg \
    --weights_path ./model_data/yolo.weights \
    --output_path ./model_data/yolo.h5

Run YOLO detection.

python3 scripts/detection.py \
   --path_weights ./model_data/yolo.h5 \
   --path_anchors ./model_data/yolo_anchors.csv \
   --path_classes ./model_data/coco_classes.txt \
   --path_output ./results \
   --path_image ./model_data/bike-car-dog.jpg \
   --path_video person.mp4

For the full YOLOv3, proceed in the same way; just specify the model path and anchor path with --path_weights <model_file> and --path_anchors <anchor_file>.

MultiGPU usage: use --nb_gpu N to use N GPUs. It is passed to the Keras multi_gpu_model().

Training

For training you can use the VOC dataset, the COCO dataset, or your own...

Generate your own annotation file and class names file.
- One row per image.
- Row format: image_file_path box1 box2 ... boxN
- Box format: x_min,y_min,x_max,y_max,class_id (no spaces).
- Run one of the following scripts for dataset conversion:
  - scripts/annotation_voc.py
  - scripts/annotation_coco.py
  - scripts/annotation_csv.py

Here is an example:
```
path/to/img1.jpg 50,100,150,200,0 30,50,200,120,3
path/to/img2.jpg 120,300,250,600,2
...
```
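The row format above can be read back with a few lines of Python. This is a minimal sketch, not the repository's own loader: the function name parse_annotation_line is hypothetical, and it assumes that image paths contain no spaces.

```python
def parse_annotation_line(line):
    """Split one annotation row into (image_path, boxes).

    Each box is a tuple (x_min, y_min, x_max, y_max, class_id).
    Assumes the image path contains no spaces.
    """
    parts = line.strip().split()
    image_path, raw_boxes = parts[0], parts[1:]
    boxes = []
    for raw in raw_boxes:
        values = [int(v) for v in raw.split(",")]
        if len(values) != 5:
            raise ValueError("expected 5 comma-separated values, got %r" % raw)
        boxes.append(tuple(values))
    return image_path, boxes


path, boxes = parse_annotation_line(
    "path/to/img1.jpg 50,100,150,200,0 30,50,200,120,3")
print(path, boxes)
```
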
Make sure you have run python scripts/convert_weights.py <...>. The file model_data/yolo_weights.h5 is used to load the pre-trained weights.
Modify training.py and start training with python training.py. To use your trained weights or checkpoint weights with yolo_interactive.py, pass the command-line option --model model_file. Remember to adjust the class path and anchor path with --classes class_file and --anchors anchor_file.

If you want to use the original pre-trained weights for YOLOv3:

wget https://pjreddie.com/media/files/darknet53.conv.74
Rename it to darknet53.weights.
python convert.py -w darknet53.cfg darknet53.weights model_data/darknet53_weights.h5
Use model_data/darknet53_weights.h5 in training.py.

Some issues to know

The test environment is Python 3.x, Keras 2.2.0, and TensorFlow 1.14.0.
Default anchors are used. If you use your own anchors, some changes are probably needed.
The inference result is not exactly the same as Darknet's, but the difference is small.
Always load pretrained weights and freeze layers in the first stage of training, or try Darknet training. A mismatch warning is OK.
The training strategy is for reference only. Adjust it according to your dataset and your goal, and add further strategies if needed.
To speed up training with frozen layers, train_bottleneck.py can be used. It computes the bottleneck features of the frozen model first and then trains only the last layers. This makes training on a CPU possible in a reasonable time. See this post for more information on bottleneck features.
If multi-GPU training fails, consider porting to TF 2.0.

keras-yolo3's Issues

inconsistency of exported trained models

When training a model from completely random weights and exporting just the weights

keras-yolo3/scripts/train.py

Line 112 in b46e258

model.save_weights(path_weights)

and the complete model

keras-yolo3/scripts/train.py

Line 121 in b46e258

yolo_model.save(path_model)

the two exported models do not predict the same detections.

For both predictions, I used the same predict.py script.
Overall, the model loaded from weights after a short training on the VOC dataset is able to detect some people in a sample volleyball video, whereas the fully exported model predicts nothing...

random box in utils.py?

I am troubled: box_data becomes all zeros after random preprocessing for real-time data augmentation. Will this affect the training? Below is the output of the function get_random_data() in utils.py.

box:  [[344 424  71 210   1]
 [209 335  25 124   0]] len box:  2
not random box_data:  [[173. 330.  20. 155.   0.]
 [286. 404.  59. 226.   1.]
 [  0.   0.   0.   0.   0.]
 [  0.   0.   0.   0.   0.]
 [  0.   0.   0.   0.   0.]
 [  0.   0.   0.   0.   0.]
 [  0.   0.   0.   0.   0.]
 [  0.   0.   0.   0.   0.]
 [  0.   0.   0.   0.   0.]
 [  0.   0.   0.   0.   0.]
 [  0.   0.   0.   0.   0.]
 [  0.   0.   0.   0.   0.]
 [  0.   0.   0.   0.   0.]
 [  0.   0.   0.   0.   0.]
 [  0.   0.   0.   0.   0.]
 [  0.   0.   0.   0.   0.]
 [  0.   0.   0.   0.   0.]
 [  0.   0.   0.   0.   0.]
 [  0.   0.   0.   0.   0.]
 [  0.   0.   0.   0.   0.]] len box:  20
random box_data:  [[0. 0. 0. 0. 0.]
 [0. 0. 0. 0. 0.]
 [0. 0. 0. 0. 0.]
 [0. 0. 0. 0. 0.]
 [0. 0. 0. 0. 0.]
 [0. 0. 0. 0. 0.]
 [0. 0. 0. 0. 0.]
 [0. 0. 0. 0. 0.]
 [0. 0. 0. 0. 0.]
 [0. 0. 0. 0. 0.]
 [0. 0. 0. 0. 0.]
 [0. 0. 0. 0. 0.]
 [0. 0. 0. 0. 0.]
 [0. 0. 0. 0. 0.]
 [0. 0. 0. 0. 0.]
 [0. 0. 0. 0. 0.]
 [0. 0. 0. 0. 0.]
 [0. 0. 0. 0. 0.]
 [0. 0. 0. 0. 0.]
 [0. 0. 0. 0. 0.]] len box:  20
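
One thing worth checking here is the column order of the annotations. The sketch below mimics the kind of filtering the upstream get_random_data() applies on the random (augmented) path, where boxes are clipped to the image and dropped when their width or height is not larger than 1 pixel. The exact rule in this fork is an assumption, filter_boxes is a hypothetical name, and the 640x480 image size is made up for the demonstration. Read in the README order x_min,y_min,x_max,y_max, both boxes printed above have x_max < x_min, so they would be discarded, which would explain the all-zero box_data.

```python
def filter_boxes(boxes, image_w, image_h, min_size=1):
    """Clip boxes to the image and keep only those larger than min_size."""
    kept = []
    for x_min, y_min, x_max, y_max, class_id in boxes:
        x_min, y_min = max(x_min, 0), max(y_min, 0)
        x_max, y_max = min(x_max, image_w), min(y_max, image_h)
        if x_max - x_min > min_size and y_max - y_min > min_size:
            kept.append((x_min, y_min, x_max, y_max, class_id))
    return kept


# Boxes from the report, read as x_min, y_min, x_max, y_max, class_id.
as_reported = [(344, 424, 71, 210, 1), (209, 335, 25, 124, 0)]
# The same numbers read as x_min, x_max, y_min, y_max, class_id.
reordered = [(x0, y0, x1, y1, c) for x0, x1, y0, y1, c in as_reported]

print(filter_boxes(as_reported, 640, 480))  # inverted boxes are dropped
print(filter_boxes(reordered, 640, 480))    # reordered boxes survive
```

If the annotation file was written as x_min,x_max,y_min,y_max, regenerating it in the documented order may be enough.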

detection video from webcam

I can't open the webcam although I set --path_video 0. How can I fix this problem? Thanks!
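
A possible cause, stated as an assumption because the script's argument handling is not shown here: OpenCV's cv2.VideoCapture treats an integer as a camera index but a string as a file path or URL, so a command-line value of "0" may need converting first. The helper names below are hypothetical.

```python
def parse_video_source(value):
    """Return an int camera index for digit strings, else the path unchanged."""
    text = str(value).strip()
    return int(text) if text.isdigit() else text


def open_capture(value):
    """Open a webcam or video file; requires OpenCV (imported lazily)."""
    import cv2  # third-party dependency, only needed when actually capturing
    capture = cv2.VideoCapture(parse_video_source(value))
    if not capture.isOpened():
        raise IOError("cannot open video source: %r" % (value,))
    return capture


print(parse_video_source("0"), parse_video_source("person.mp4"))
```
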

missing 1 required positional argument: 'model_image_size'

Thank you for opening a fork, as the upstream seems to be largely broken: its calls and shebangs use python, which for me invokes python2 instead of python3; the argument parser in yolo_video.py is also completely broken, and so is the FPS counter, ...

So, I tried to follow the steps in your README.md and converted the weights and now I'm trying:

python3 scripts/detection.py --path_weights ./model_data/yolov3.h5 --path_anchors ./model_data/yolo_anchors.csv --path_classes ./model_data/coco_classes.txt --path_video video.avi

But I get missing 1 required positional argument: 'model_image_size'. This is not mentioned in the README. Maybe it could be mentioned, or it could even be inferred by default from the input image or video, or from the network model, since it is included in yolov3.cfg. For YOLOv3, it would be --model_image_size 608 608 by default.
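
The inference from the .cfg suggested here could look roughly like this. Darknet .cfg files carry width and height keys in their [net] section. The function below is a hypothetical helper, not part of this repository; it returns (width, height), and the order the detection script expects is left to the caller.

```python
def read_net_size(cfg_text):
    """Return (width, height) from the [net] section of a Darknet cfg string."""
    section = None
    values = {}
    for raw in cfg_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line:
            continue
        if line.startswith("[") and line.endswith("]"):
            section = line[1:-1].strip().lower()
            continue
        if section == "net" and "=" in line:
            key, value = (part.strip() for part in line.split("=", 1))
            values[key] = value
    return int(values["width"]), int(values["height"])


sample = """
[net]
# Testing
batch=1
width=608
height=608

[convolutional]
filters=32
"""
print(read_net_size(sample))
```
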

Also, the progress bar looks like it doesn't work, it only shows videos: 0%| | 0/1 [00:00<?, ?it/s], so it looks like the progress is in number of videos instead of number of frames and thereby rather useless. Also it would be nice to have the option to view it directly by using cv2.imshow instead of exporting it like it was possible in the original repo.

IndexError: bytearray index out of range

This is what I ran

python scripts/detection.py --path_weights ./model_data/yolo_weights_full.h5 --path_anchors ./model_data/yolo_anchors.csv --path_classes ./model_data/coco_classes.txt --path_output ./results --path_image C:\Users\HPO2KOR\Desktop\Work\venv\Patent\Labelled_Dataset\Dataset\train\text\FFDDAPMDD1.png

But I am getting an error

Traceback (most recent call last):
  File "scripts/detection.py", line 198, in <module>
    _main(**arg_params)
  File "scripts/detection.py", line 185, in _main
    predict_image(yolo, path_img, path_output)
  File "scripts/detection.py", line 94, in predict_image
    image_pred, pred_items = yolo.detect_image(image)
  File "C:\Users\HPO2KOR\Desktop\Work\venv\Patent\Dataset\Dataset\keras-yolo3-master\keras_yolo3\yolo.py", line 202, in detect_image
    out_scores[i], self.colors[c], thickness)
  File "C:\Users\HPO2KOR\Desktop\Work\venv\Patent\Dataset\Dataset\keras-yolo3-master\keras_yolo3\visual.py", line 56, in draw_bounding_box
    outline=color)
  File "C:\Users\HPO2KOR\Desktop\Work\venv\virtualenv\lib\site-packages\PIL\ImageDraw.py", line 246, in rectangle
    ink, fill = self._getink(outline, fill)
  File "C:\Users\HPO2KOR\Desktop\Work\venv\virtualenv\lib\site-packages\PIL\ImageDraw.py", line 112, in _getink
    ink = self.palette.getcolor(ink)
  File "C:\Users\HPO2KOR\Desktop\Work\venv\virtualenv\lib\site-packages\PIL\ImagePalette.py", line 109, in getcolor
    self.palette[index + 256] = color[1]
IndexError: bytearray index out of range

How do I resolve this?
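
The traceback ends in ImagePalette.getcolor, which Pillow only reaches when drawing on a palette-mode ("P") image, so one likely fix is to convert the input to RGB before detection. The helper below is a sketch with a hypothetical name; it relies only on the mode attribute and convert() method that PIL images provide.

```python
def ensure_rgb(image):
    """Return the image in RGB mode, converting only when necessary.

    Works with any object exposing PIL's `mode` and `convert()` interface.
    """
    if image.mode != "RGB":
        image = image.convert("RGB")
    return image


# Typical use with Pillow (assumed installed):
#   from PIL import Image
#   image = ensure_rgb(Image.open("FFDDAPMDD1.png"))
```
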

Data generator problems

When using an annotation file formatted as for the original qqwweee implementation (where it works), I got this error.

W0829 14:36:56.934175 23120 training_generator.py:251] Your dataset iterator ran out of data; interrupting training. Make sure that your iterator can generate at least `steps_per_epoch * epochs` batches (in this case, 115800 batches). You may need touse the repeat() function when building your dataset.
Traceback (most recent call last):
  File "training.py", line 214, in <module>
    _main(**arg_params)
  File "training.py", line 205, in _main
    callbacks=[tb_logging, checkpoint, reduce_lr, early_stopping])
  File "C:\Users\micha\AppData\Local\Programs\Python\Python36\lib\site-packages\tensorflow\python\keras\engine\training.py", line 1433, in fit_generator
    steps_name='steps_per_epoch')
  File "C:\Users\micha\AppData\Local\Programs\Python\Python36\lib\site-packages\tensorflow\python\keras\engine\training_generator.py", line 300, in model_iteration
    aggregator.finalize()
  File "C:\Users\micha\AppData\Local\Programs\Python\Python36\lib\site-packages\tensorflow\python\keras\engine\training_utils.py", line 111, in finalize
    raise ValueError('Empty training data.')
ValueError: Empty training data.
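
Keras' fit_generator expects a generator that never terminates. A sketch of such a loop follows, with hypothetical names and plain strings standing in for the real image and box arrays. Whether an exhausted generator is the actual cause of the error above is an assumption; an empty or wrongly parsed annotation file would produce the same message.

```python
import random


def endless_batches(samples, batch_size, seed=0):
    """Yield batches forever, reshuffling after each pass over the samples."""
    if not samples:
        raise ValueError("no training samples were loaded")
    rng = random.Random(seed)
    order = list(samples)
    index = 0
    while True:
        batch = []
        for _ in range(batch_size):
            if index == 0:
                rng.shuffle(order)  # new order at the start of every pass
            batch.append(order[index])
            index = (index + 1) % len(order)
        yield batch


gen = endless_batches(["a.jpg", "b.jpg", "c.jpg"], batch_size=2)
batches = [next(gen) for _ in range(5)]  # more batches than one pass contains
print(batches)
```
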

how to use only partial GPU memory

During prediction, the model (even the Tiny version) allocates the entire GPU memory...
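
With the TensorFlow 1.x and Keras 2.2 setup listed in the README, the usual way to cap memory is a session config. This is a sketch assuming that environment (tf.ConfigProto and keras.backend.set_session); the function name is hypothetical, and TF 2.x uses a different API.

```python
def limit_gpu_memory(fraction=0.4, allow_growth=True):
    """Restrict how much GPU memory the TF 1.x session may claim."""
    if not 0.0 < fraction <= 1.0:
        raise ValueError("fraction must be in (0, 1], got %r" % (fraction,))
    # Imported lazily so the check above works without TensorFlow installed.
    import tensorflow as tf
    from keras import backend as K

    config = tf.ConfigProto()
    config.gpu_options.per_process_gpu_memory_fraction = fraction
    config.gpu_options.allow_growth = allow_growth  # allocate on demand
    K.set_session(tf.Session(config=config))


# Call once, before the model is created:
#   limit_gpu_memory(fraction=0.4)
```
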

Custom Model and Custom Classes

This is the error I am facing while testing it on different weights and different anchors and different classes. How to resolve this issue?

Q: IOU by kmeans so small?

NICE job. But why is iou by kmeans so small (4%)?

porting to TF 2.0

Since this Keras implementation is based on TensorFlow, it makes sense to port it to tf.keras in TensorFlow 2.0 or higher. This move may solve the issue with multi-GPU training #20

https://www.pyimagesearch.com/2019/10/21/keras-vs-tf-keras-whats-the-difference-in-tensorflow-2-0/

how to generate new anchor box size?

Hello,
How can I generate new anchor box sizes, like those in "tiny-yolo_anchors.csv", for my own dataset?
Thanks
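
Anchors are usually obtained by clustering the (width, height) pairs of all training boxes, using IoU as the similarity measure. The repository ships scripts/kmeans.py for this; the code below is an independent pure-Python sketch of the idea with hypothetical names, not that script, and the box sizes are made up.

```python
import random


def iou_wh(box, anchor):
    """IoU of two boxes given as (w, h), both anchored at the origin."""
    inter = min(box[0], anchor[0]) * min(box[1], anchor[1])
    union = box[0] * box[1] + anchor[0] * anchor[1] - inter
    return inter / union


def kmeans_anchors(sizes, k, iterations=100, seed=0):
    """Cluster (w, h) pairs into k anchors using IoU as similarity."""
    rng = random.Random(seed)
    anchors = rng.sample(sizes, k)
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for size in sizes:
            best = max(range(k), key=lambda i: iou_wh(size, anchors[i]))
            clusters[best].append(size)
        updated = []
        for anchor, members in zip(anchors, clusters):
            if not members:  # keep an anchor that attracted no boxes
                updated.append(anchor)
                continue
            updated.append((sum(m[0] for m in members) / len(members),
                            sum(m[1] for m in members) / len(members)))
        if updated == anchors:
            break
        anchors = updated
    return sorted(anchors, key=lambda a: a[0] * a[1])


def mean_best_iou(sizes, anchors):
    """Average IoU between each box and its closest anchor."""
    return sum(max(iou_wh(s, a) for a in anchors) for s in sizes) / len(sizes)


# Box sizes taken from x_max - x_min and y_max - y_min of each annotation.
sizes = [(10, 12), (12, 10), (11, 11), (100, 90), (95, 105), (105, 100)]
anchors = kmeans_anchors(sizes, k=2)
print(anchors, round(mean_best_iou(sizes, anchors), 3))
```

The mean best IoU is a quick quality check: a very low value suggests the anchors do not match the box sizes in the dataset, or that widths and heights are on a different scale than the anchors.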

loss: NaN while training with my own dataset

Hello,

I'm trying to migrate from qqwweee's YOLO implementation, but training with my own dataset results in loss: NaN (with qqwweee I can train just fine on the same dataset).
I have already implemented basic null checks, and my dataset goes through automated checks before training (every image is present, every box is checked by its coordinates, every class is present in the classes file).

I have already tried different batch sizes, with the same effect.
The time it takes for the loss to become NaN seems random (almost always in the 1st epoch).

I think the problem is somehow related to the data generator...
Maybe you can suggest a proper way to debug it?

image-size: [416, 416]
batch-size:
  bottlenecks: 8
  head: 48
  # the unfreeze model takes more memory
  full: 8
epochs:
  bottlenecks: 25
  head: 50
  full: 30
CB_learning-rate:
  factor: 0.01
  patience: 3
CB_stopping:
  min_delta: 0
  patience: 25
valid-split: 0.1
generator:
  augment: true
  resize_img: true
  nb_threads: 0.9
recompute-bottlenecks: false

python scripts/training.py --path_dataset train_annotations.txt --path_weights model_data\yolo_weights.h5 --path_anchors model_data/yolo_anchors.csv --path_classes model_data/custom_classes.txt --path_output logs/003 --path_config model_data/train_yolo.yaml

predict not working

Thanks a lot for this refactor, it's a million times better than the base repo. Having said that, if I train my model with non 416x416 images, predict is later unable to load the model.

I've tried this with and without the following changes in config_train.json

+    "image-size": [1600, 192],
+    "batch-size": 8,

and train.py

+    'image-size': (1600, 192),
+    'batch-size': 8,

Either way, training works fine, with xval_loss as low as 25. But when using predict, whatever fixes I attempt in yolo3/yolo.py around self.yolo_model.load_weights(self.weights_path), I always get:

Traceback (most recent call last):
  File "/xx/yolo3/yolo.py", line 73, in generate
    self.yolo_model = load_model(self.weights_path, compile=False)
  File "/zz/keras/engine/saving.py", line 419, in load_model
    model = _deserialize_model(f, custom_objects, compile)
  File "/zz/keras/engine/saving.py", line 221, in _deserialize_model
    model_config = f['model_config']
  File "/zz/keras/utils/io_utils.py", line 302, in __getitem__
    raise ValueError('Cannot create group in read only mode.')
ValueError: Cannot create group in read only mode.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/zz/tensorflow/python/framework/ops.py", line 1659, in _create_c_op
    c_op = c_api.TF_FinishOperation(op_desc)
tensorflow.python.framework.errors_impl.InvalidArgumentError: Dimension 0 in both shapes must be equal, but are 1 and 42. Shapes are [1,1,1024,255] and [42,1024,1,1]. for 'Assign_360' (op: 'Assign') with input shapes: [1,1,1024,255], [42,1024,1,1].

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "scripts/predict.py", line 176, in <module>
    _main(**arg_params)
  File "scripts/predict.py", line 153, in _main
    classes_path=path_classes, gpu_num=gpu_num)
  File "/xx/yolo3/yolo.py", line 59, in __init__
    self.boxes, self.scores, self.classes = self.generate()
  File "/xx/yolo3/yolo.py", line 86, in generate
    self.yolo_model.load_weights(self.weights_path)
  File "/zz/keras/engine/network.py", line 1166, in load_weights
    f, self.layers, reshape=reshape)
  File "/zz/keras/engine/saving.py", line 1058, in load_weights_from_hdf5_group
    K.batch_set_value(weight_value_tuples)
  File "/zz/keras/backend/tensorflow_backend.py", line 2465, in batch_set_value
    assign_op = x.assign(assign_placeholder)
  File "/zz/tensorflow/python/ops/variables.py", line 1762, in assign
    name=name)
  File "/zz/tensorflow/python/ops/state_ops.py", line 223, in assign
    validate_shape=validate_shape)
  File "/zz/tensorflow/python/ops/gen_state_ops.py", line 64, in assign
    use_locking=use_locking, name=name)
  File "/zz/tensorflow/python/framework/op_def_library.py", line 788, in _apply_op_helper
    op_def=op_def)
  File "/zz/tensorflow/python/util/deprecation.py", line 507, in new_func
    return func(*args, **kwargs)
  File "/zz/tensorflow/python/framework/ops.py", line 3300, in create_op
    op_def=op_def)
  File "/zz/tensorflow/python/framework/ops.py", line 1823, in __init__
    control_input_ops)
  File "/zz/tensorflow/python/framework/ops.py", line 1662, in _create_c_op
    raise ValueError(str(e))
ValueError: Dimension 0 in both shapes must be equal, but are 1 and 42. Shapes are [1,1,1024,255] and [42,1024,1,1]. for 'Assign_360' (op: 'Assign') with input shapes: [1,1,1024,255], [42,1024,1,1].

Are you sure that predict works as you expect?

The only time it works for me is if it loads the default yolo3.h5 model.
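
The channel count of each YOLOv3 output convolution equals anchors_per_scale * (num_classes + 5), so the two shapes in the error can be decoded into class counts. The sketch below assumes 3 anchors per scale, as in the standard YOLOv3 head; the function names are hypothetical. Under that assumption, 255 corresponds to 80 classes (the COCO default) and 42 to 9 classes, which suggests the network was built with a different class list than the one the weights were trained on.

```python
def head_filters(num_classes, anchors_per_scale=3):
    """Number of output channels of a YOLOv3 detection layer."""
    return anchors_per_scale * (num_classes + 5)  # 4 box coords + objectness


def classes_from_filters(filters, anchors_per_scale=3):
    """Invert head_filters(); raises if the count does not fit the formula."""
    per_anchor, remainder = divmod(filters, anchors_per_scale)
    if remainder or per_anchor <= 5:
        raise ValueError("%d is not a valid YOLOv3 head size" % filters)
    return per_anchor - 5


print(classes_from_filters(255), classes_from_filters(42))
```

If that reading is right, passing the classes file used for training to the prediction script should make the shapes agree.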

Path for parameters is missing

There are many params arguments in training, detection and yolo.py. Although I set default values for the params, a path is still missing in training.py.

Turn off data augmentation for validation set

IMO the data augmentation should be turned off for the validation set.
This was also discussed here

Training with my own dataset

Hi @Borda, thanks for the great work.
Your support would be highly appreciated.
I want to train 2 new classes, flat rack container and general container, and I have some questions.

How many labeled images are enough (minimum)?
Do I have to change the anchors file?

Currently, I have trained 5 images for each image, and got a result like this.

This is my training information

#train.txt 
training_set/general_container/images/df3b509a.jpg 1,1,268,198,20
training_set/general_container/images/image9.jpg 15,31,115,108,20
training_set/general_container/images/image10.jpg 20,29,474,288,20
training_set/general_container/images/image28.jpg 42,17,450,347,20
training_set/general_container/images/image29.jpg 34,67,465,311,20
training_set/general_container/images/image32.jpg 48,4,432,351,20
training_set/flat_rack_container/images/1.jpg 15,172,702,447,21
training_set/flat_rack_container/images/10.jpg 6,30,771,499,21
training_set/flat_rack_container/images/10FR40.jpg 104,323,1980,13232,21
training_set/flat_rack_container/images/11.jpg 18,140,464,429,21
training_set/flat_rack_container/images/20-feet-flat-rack-shipping-container-500x500.jpg 25,82,491,423,21
training_set/flat_rack_container/images/20-Flat-Rack.jpg 12,14,434,350,21 438,24,596,300,21

#classes
general container
flat rack container

Do you know the reason?
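
Two things in the listing above may be worth checking, assuming class IDs are zero-based indices into the class list (as in the README example): the IDs 20 and 21 exceed a two-entry class file, and the box for 10FR40.jpg has a y_max of 13232. The validator below is a sketch with hypothetical names; image sizes are passed in by the caller rather than read from disk.

```python
def check_annotation(line, num_classes, image_size=None):
    """Return a list of problems found in one annotation row.

    image_size is an optional (width, height) used for bounds checks.
    """
    parts = line.split()
    problems = []
    for raw in parts[1:]:
        x_min, y_min, x_max, y_max, class_id = (int(v) for v in raw.split(","))
        if not 0 <= class_id < num_classes:
            problems.append(
                "class_id %d outside 0..%d" % (class_id, num_classes - 1))
        if x_min >= x_max or y_min >= y_max:
            problems.append("empty or inverted box %s" % raw)
        if image_size and (x_max > image_size[0] or y_max > image_size[1]):
            problems.append(
                "box %s exceeds image %dx%d" % ((raw,) + tuple(image_size)))
    return problems


row = "training_set/flat_rack_container/images/1.jpg 15,172,702,447,21"
print(check_annotation(row, num_classes=2))
print(check_annotation(row.replace(",21", ",1"), num_classes=2))
```
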

Q: Have you run the scripts/kmeans.py by yourself?

Have you run the scripts/kmeans.py by yourself? How about your result?

incompatible shapes with large grayscale images

After training on images with shape (1632,1088), I thought it was time for detection. Unfortunately, while loading the weights into a model with the same input parameters as the training model, an error occurred:

Traceback (most recent call last):
  File "detect_interactive.py", line 76, in <module>
    _main(**arg_params)
  File "detect_interactive.py", line 60, in _main
    nb_gpu=nb_gpu)
  File "D:\Publikacja\repaired_yolo\yolo3\yolo.py", line 89, in __init__
    self.boxes, self.scores, self.classes = self._create_model()
  File "D:\Publikacja\repaired_yolo\yolo3\yolo.py", line 129, in _create_model
    self.yolo_model.load_weights(self.weights_path)
  File "C:\Users\micha\AppData\Local\Programs\Python\Python36\lib\site-packages\tensorflow\python\keras\engine\training.py", line 162, in load_weights
    return super(Model, self).load_weights(filepath, by_name)
  File "C:\Users\micha\AppData\Local\Programs\Python\Python36\lib\site-packages\tensorflow\python\keras\engine\network.py", line 1424, in load_weights
    saving.load_weights_from_hdf5_group(f, self.layers)
  File "C:\Users\micha\AppData\Local\Programs\Python\Python36\lib\site-packages\tensorflow\python\keras\saving\hdf5_format.py", line 759, in load_weights_from_hdf5_group
    K.batch_set_value(weight_value_tuples)
  File "C:\Users\micha\AppData\Local\Programs\Python\Python36\lib\site-packages\tensorflow\python\keras\backend.py", line 3066, in batch_set_value
    assign_op = x.assign(assign_placeholder)
  File "C:\Users\micha\AppData\Local\Programs\Python\Python36\lib\site-packages\tensorflow\python\ops\resource_variable_ops.py", line 1141, in assign
    self._shape.assert_is_compatible_with(value_tensor.shape)
  File "C:\Users\micha\AppData\Local\Programs\Python\Python36\lib\site-packages\tensorflow\python\framework\tensor_shape.py", line 1103, in assert_is_compatible_with
    raise ValueError("Shapes %s and %s are incompatible" % (self, other))
ValueError: Shapes (1, 1, 512, 24) and (21, 512, 1, 1) are incompatible

Disclaimer - I'm not using theano

Custom model to EdgeTPU

Hi, great work with this project, thanks! I was able to train a model with my own dataset and your training.py. Inspired by this repo, I'm trying to convert the model to TF Lite, but it is slightly different from the original YOLOv3:

In the model I produced using training.py I observe a section around the UpSampling2D op which does not seem to be in the coco_tiny_v3 model displayed on the right. This UpSampling2D operation cannot be used in quantized models so I'm wondering if there is a way to get rid of it (as it does not seem to be part of the original model either)?

It would be awesome if I could convert my yolov3 tiny model to Coral/EdgeTPU!

buffer is too small while converting Yolo weights

Downloading the original model https://pjreddie.com/media/files/yolo.weights and running the conversion via

python convert_weights.py --config_path ../model_data/yolo.cfg --weights_path ../model_data/yolo.weights --output_path ../model_data/yolo.h5

crashes with the following error message:

 73%|██████████████████████████▎                    | 79/108 [02:16<01:38,  3.41s/it]
Traceback (most recent call last):
  File "convert_weights.py", line 297, in <module>
    _main(**arg_params)
  File "convert_weights.py", line 267, in _main
    weights_file, count, weight_decay, out_index)
  File "convert_weights.py", line 168, in parse_section
    weight_decay)
  File "convert_weights.py", line 116, in parse_convolutional
    buffer=weights_file.read(weights_size * 4))
TypeError: buffer is too small for requested array
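
This error means the file ran out of bytes before all layers described by the .cfg were filled, which typically happens when the weights file and the config describe different networks (note that the URL above points to yolo.weights, while the README downloads yolov3.weights). A quick sanity check is to read the file header and compare versions. The sketch below assumes the usual Darknet header layout, three int32 version fields followed by a "seen" counter that is int64 from version 0.2 on; the function name is hypothetical and the header bytes are synthetic.

```python
import io
import struct


def read_darknet_header(stream):
    """Read the version fields and 'seen' counter from a Darknet weights file."""
    major, minor, revision = struct.unpack("<3i", stream.read(12))
    if major * 10 + minor >= 2 and major < 1000 and minor < 1000:
        (seen,) = struct.unpack("<q", stream.read(8))  # 64-bit counter
    else:
        (seen,) = struct.unpack("<i", stream.read(4))  # 32-bit counter
    return {"major": major, "minor": minor, "revision": revision, "seen": seen}


# Synthetic header for demonstration; with a real file use open(path, "rb").
fake = io.BytesIO(struct.pack("<3i", 0, 2, 0) + struct.pack("<q", 32013312))
print(read_darknet_header(fake))
```
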

yolo_correct_boxes - explanation

Hi all,
Can someone explain the function yolo_correct_boxes in model.py?

Thanks in advance
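
As far as it can be summarised from the upstream implementation, yolo_correct_boxes undoes the letterbox resize: the network sees the image scaled to fit the model input with padding around it, so predicted boxes (relative to the padded input) are shifted by the padding offset, rescaled, and multiplied by the original image size. The plain-Python version below handles a single box and is a paraphrase under that assumption, not the repository's tensor code; the function name is hypothetical.

```python
def correct_box(box_xy, box_wh, input_shape, image_shape):
    """Map one box from letterboxed-input coordinates to image pixels.

    box_xy, box_wh: centre and size relative to the model input (0..1).
    input_shape, image_shape: (height, width) of model input and image.
    Returns (y_min, x_min, y_max, x_max) in original image pixels.
    """
    scale = min(input_shape[0] / image_shape[0],
                input_shape[1] / image_shape[1])
    new_shape = [round(dim * scale) for dim in image_shape]  # resized image
    result_min, result_max = [], []
    # Work in (y, x) order, so index 0 is height and index 1 is width.
    for axis, (centre, size) in enumerate(zip(box_xy[::-1], box_wh[::-1])):
        offset = (input_shape[axis] - new_shape[axis]) / 2.0 / input_shape[axis]
        stretch = input_shape[axis] / new_shape[axis]
        centre = (centre - offset) * stretch  # remove padding, then rescale
        size = size * stretch
        result_min.append((centre - size / 2.0) * image_shape[axis])
        result_max.append((centre + size / 2.0) * image_shape[axis])
    return tuple(result_min + result_max)


# A 400x200 (width x height) image letterboxed into a 416x416 input.
print(correct_box(box_xy=(0.5, 0.5), box_wh=(0.5, 0.25),
                  input_shape=(416, 416), image_shape=(200, 400)))
```

In this example the image fills the full input width but only half its height, so the vertical padding is a quarter of the input on each side; the box that covers a quarter of the input height therefore covers half of the original image height.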

multi-GPU training fails

Crashes with a similar error even when training only the head...

INFO:root:Train on 14626 samples, val on 1625 samples, with batch size 16.
Epoch 1/150
2019-10-11 23:42:30.041545: I tensorflow/stream_executor/dso_loader.cc:152] successfully opened CUDA library libcublas.so.10.0 locally
913/914 [============================>.] - ETA: 1s - loss: 27.99742019-10-12 00:00:47.026746: W tensorflow/core/framework/op_kernel.cc:1401] OP_REQUIRES failed at tensor_array_ops.cc:661 : Invalid argument: TensorArray replica_0/model_3/yolo_loss/TensorArray_5484: Could not read from TensorArray index 0. Furthermore, the element shape is not fully defined: [?,?,3]. It is possible you are working with a resizeable TensorArray and stop_gradients is not allowing the gradients to be written. If you set the full element_shape property on the forward TensorArray, the proper all-zeros tensor will be returned instead of incurring this error.
2019-10-12 00:00:47.027158: W tensorflow/core/framework/op_kernel.cc:1401] OP_REQUIRES failed at tensor_array_ops.cc:661 : Invalid argument: TensorArray replica_0/model_3/yolo_loss/TensorArray_1_5485: Could not read from TensorArray index 0. Furthermore, the element shape is not fully defined: [?,?,3]. It is possible you are working with a resizeable TensorArray and stop_gradients is not allowing the gradients to be written. If you set the full element_shape property on the forward TensorArray, the proper all-zeros tensor will be returned instead of incurring this error.
2019-10-12 00:00:47.027194: W tensorflow/core/framework/op_kernel.cc:1401] OP_REQUIRES failed at tensor_array_ops.cc:661 : Invalid argument: TensorArray replica_0/model_3/yolo_loss/TensorArray_2_5486: Could not read from TensorArray index 0. Furthermore, the element shape is not fully defined: [?,?,3]. It is possible you are working with a resizeable TensorArray and stop_gradients is not allowing the gradients to be written. If you set the full element_shape property on the forward TensorArray, the proper all-zeros tensor will be returned instead of incurring this error.
Traceback (most recent call last):
  File "scripts/training.py", line 211, in <module>
    _main(**arg_params)
  File "scripts/training.py", line 182, in _main
    callbacks=[tb_logging, checkpoint, reduce_lr, early_stopping])
  File "/home/j.borovec/.local/lib/python3.6/site-packages/keras/legacy/interfaces.py", line 91, in wrapper
    return func(*args, **kwargs)
  File "/home/j.borovec/.local/lib/python3.6/site-packages/keras/engine/training.py", line 1418, in fit_generator
    initial_epoch=initial_epoch)
  File "/home/j.borovec/.local/lib/python3.6/site-packages/keras/engine/training_generator.py", line 234, in fit_generator
    workers=0)
  File "/home/j.borovec/.local/lib/python3.6/site-packages/keras/legacy/interfaces.py", line 91, in wrapper
    return func(*args, **kwargs)
  File "/home/j.borovec/.local/lib/python3.6/site-packages/keras/engine/training.py", line 1472, in evaluate_generator
    verbose=verbose)
  File "/home/j.borovec/.local/lib/python3.6/site-packages/keras/engine/training_generator.py", line 346, in evaluate_generator
    outs = model.test_on_batch(x, y, sample_weight=sample_weight)
  File "/home/j.borovec/.local/lib/python3.6/site-packages/keras/engine/training.py", line 1256, in test_on_batch
    outputs = self.test_function(ins)
  File "/home/j.borovec/.local/lib/python3.6/site-packages/keras/backend/tensorflow_backend.py", line 2715, in __call__
    return self._call(inputs)
  File "/home/j.borovec/.local/lib/python3.6/site-packages/keras/backend/tensorflow_backend.py", line 2675, in _call
    fetched = self._callable_fn(*array_vals)
  File "/home/j.borovec/.local/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1439, in __call__
    run_metadata_ptr)
  File "/home/j.borovec/.local/lib/python3.6/site-packages/tensorflow/python/framework/errors_impl.py", line 528, in __exit__
    c_api.TF_GetCode(self.status.status))
tensorflow.python.framework.errors_impl.InvalidArgumentError: TensorArray replica_0/model_3/yolo_loss/TensorArray_5484: Could not read from TensorArray index 0. Furthermore, the element shape is not fully defined: [?,?,3]. It is possible you are working with a resizeable TensorArray and stop_gradients is not allowing the gradients to be written. If you set the full element_shape property on the forward TensorArray, the proper all-zeros tensor will be returned instead of incurring this error.
         [[{{node replica_0/model_3/yolo_loss/TensorArrayStack/TensorArrayGatherV3}}]]
         [[{{node replica_1/model_3/yolo_loss/ExpandDims_3}}]]

see qqwweee#204, qqwweee#497

How to change IOU?

Hello Borda,
I want to change the IoU.
How can I change this parameter, and do I need to train again after changing it?

Thank you in advance for your response
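
Assuming the question is about the IoU threshold used by non-maximum suppression at detection time (the iou_threshold argument of yolo_eval in the upstream code), it only affects post-processing, so no retraining should be needed. The sketch below is a generic greedy NMS in plain Python, not the repository's implementation, showing how the threshold changes which boxes survive; the boxes and scores are made up.

```python
def iou(a, b):
    """Intersection over union of two (x_min, y_min, x_max, y_max) boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = ((a[2] - a[0]) * (a[3] - a[1])
             + (b[2] - b[0]) * (b[3] - b[1]) - inter)
    return inter / union if union > 0 else 0.0


def nms(boxes, scores, iou_threshold):
    """Greedy NMS: keep the best box, drop others overlapping it too much."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        if all(iou(boxes[i], boxes[j]) <= iou_threshold for j in keep):
            keep.append(i)
    return keep


boxes = [(0, 0, 100, 100), (10, 0, 110, 100), (200, 200, 300, 300)]
scores = [0.9, 0.8, 0.7]
print(nms(boxes, scores, iou_threshold=0.45))  # overlapping box suppressed
print(nms(boxes, scores, iou_threshold=0.90))  # overlapping box kept
```

A lower threshold suppresses more overlapping detections; a higher one keeps more of them.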

too many filtered objects while training

During training on my own dataset, there are too many messages like

DEBUG:root:Warning: 3 of 3 (100%) generated boxes was filtered out
DEBUG:root:Warning: 1 of 2 (50%) generated boxes was filtered out
DEBUG:root:Warning: 3 of 3 (100%) generated boxes was filtered out
DEBUG:root:Warning: 2 of 5 (40%) generated boxes was filtered out
DEBUG:root:Warning: 6 of 6 (100%) generated boxes was filtered out
DEBUG:root:Warning: 2 of 2 (100%) generated boxes was filtered out
DEBUG:root:Warning: 2 of 2 (100%) generated boxes was filtered out
DEBUG:root:Warning: 1 of 2 (50%) generated boxes was filtered out
DEBUG:root:Warning: 2 of 3 (66%) generated boxes was filtered out
DEBUG:root:Warning: 1 of 2 (50%) generated boxes was filtered out
DEBUG:root:Warning: 1 of 2 (50%) generated boxes was filtered out
DEBUG:root:Warning: 4 of 5 (80%) generated boxes was filtered out
DEBUG:root:Warning: 2 of 4 (50%) generated boxes was filtered out
DEBUG:root:Warning: 4 of 4 (100%) generated boxes was filtered out

which is quite suspicious...

keras-yolo3/yolo3/utils.py

Line 329 in 817e948

logging.debug('Warning: some generated boxes was filtered out')

number of mask dimensions should be specified

ValueError                                Traceback (most recent call last)
<ipython-input-12-628fd8754cb7> in <module>
     38 input_image_shape = K.placeholder(shape=(2, ))
     39 boxes, scores, classes = yolo_eval(yolo_model.output, anchors, len(class_names), input_image_shape,
---> 40                                     score_threshold=0.3, iou_threshold=0.45)
     41 
     42 print("YOLO model ready!")

~/library/Mod04/01-Yolo/yolo_keras/model.py in yolo_eval(yolo_outputs, anchors, num_classes, image_shape, max_boxes, score_threshold, iou_threshold)
    213     for c in range(num_classes):
    214         # TODO: use keras backend instead of tf.
--> 215         class_boxes = tf.boolean_mask(boxes, mask[:, c])
    216         class_box_scores = tf.boolean_mask(box_scores[:, c], mask[:, c])
    217         nms_index = tf.image.non_max_suppression(

~/anaconda3_501/lib/python3.6/site-packages/tensorflow/python/util/dispatch.py in wrapper(*args, **kwargs)

~/anaconda3_501/lib/python3.6/site-packages/tensorflow/python/ops/array_ops.py in boolean_mask_v2(tensor, mask, axis, name)
   1423                                    #  [[7, 10],
   1424                                    #   [8, 11],
-> 1425                                    #   [9, 12]]]
   1426   ```
   1427 

~/anaconda3_501/lib/python3.6/site-packages/tensorflow/python/ops/array_ops.py in boolean_mask(tensor, mask, name, axis)
   1352     name: A name for the operation (optional).
   1353 
-> 1354   Returns:
   1355     if `num_or_size_splits` is a scalar returns `num_or_size_splits` `Tensor`
   1356     objects; if `num_or_size_splits` is a 1-D Tensor returns

ValueError: Number of mask dimensions must be specified, even if some dimensions are None.  E.g. shape=[None] is ok, but shape=None is not.

training model on second GPU, too low memory

I am running training on a GPU machine with two physical graphics cards. The first one (index 0) is 99% used by another process, so I set training to use the second one, but somehow this is ignored in the process and it still requests the default GPU card 0.

export CUDA_VISIBLE_DEVICES=1
python3 scripts/training.py --path_dataset ~/Cache/Project_Video/DATASETS/ppl-detect-v2_temp/dataset.txt --path_weights ./model_data/tiny-yolo.h5 --path_anchors ./model_data/tiny-yolo_anchors.csv --path_output ./model_data --path_config ./model_data/train_tiny-yolo_ppl.yaml

Failure message:

2019-08-21 00:58:58.957373: W tensorflow/core/common_runtime/bfc_allocator.cc:319] *************************************************************************************************___
2019-08-21 00:58:58.957417: W tensorflow/core/framework/op_kernel.cc:1502] OP_REQUIRES failed at conv_ops.cc:486 : Resource exhausted: OOM when allocating tensor with shape[128,104,104,64] and type float on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc
Traceback (most recent call last):
  File "scripts/training.py", line 208, in <module>
    _main(**arg_params)
  File "scripts/training.py", line 200, in _main
    callbacks=[tb_logging, checkpoint, reduce_lr, early_stopping])
  File "/home/jb/.local/lib/python3.6/site-packages/keras/legacy/interfaces.py", line 91, in wrapper
    return func(*args, **kwargs)
  File "/home/jb/.local/lib/python3.6/site-packages/keras/engine/training.py", line 1418, in fit_generator
    initial_epoch=initial_epoch)
  File "/home/jb/.local/lib/python3.6/site-packages/keras/engine/training_generator.py", line 217, in fit_generator
    class_weight=class_weight)
  File "/home/jb/.local/lib/python3.6/site-packages/keras/engine/training.py", line 1217, in train_on_batch
    outputs = self.train_function(ins)
  File "/home/jb/.local/lib/python3.6/site-packages/keras/backend/tensorflow_backend.py", line 2715, in __call__
    return self._call(inputs)
  File "/home/jb/.local/lib/python3.6/site-packages/keras/backend/tensorflow_backend.py", line 2675, in _call
    fetched = self._callable_fn(*array_vals)
  File "/home/jb/.local/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1458, in __call__
    run_metadata_ptr)
tensorflow.python.framework.errors_impl.ResourceExhaustedError: 2 root error(s) found.
  (0) Resource exhausted: OOM when allocating tensor with shape[128,104,104,64] and type float on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc
         [[{{node conv2d_3/convolution}}]]
Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info.

         [[loss_1/add_12/_1041]]
Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info.

  (1) Resource exhausted: OOM when allocating tensor with shape[128,104,104,64] and type float on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc
         [[{{node conv2d_3/convolution}}]]
Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info.

0 successful operations.
0 derived errors ignored.

with available GPUs:

+-----------------------------------------------------------------------------+
| NVIDIA-SMI 410.48                 Driver Version: 410.48                    |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce RTX 208...  Off  | 00000000:09:00.0 Off |                  N/A |
| 51%   57C    P8    39W / 260W |  10773MiB / 10989MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
|   1  GeForce RTX 208...  Off  | 00000000:41:00.0 Off |                  N/A |
| 86%   78C    P2   124W / 260W |  10912MiB / 10986MiB |     45%      Default |
+-------------------------------+----------------------+----------------------+
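
Two observations may help here. With CUDA_VISIBLE_DEVICES=1, TensorFlow only sees the second card and renumbers it, so "GPU:0" in the log does not by itself mean the variable was ignored. Also, in the nvidia-smi snapshot above both cards show almost all of their memory in use. If the variable is set from Python instead of the shell, it has to happen before TensorFlow is imported. The helper below is a sketch with a hypothetical name.

```python
import os


def select_gpus(*indices):
    """Expose only the given physical GPUs to CUDA-based libraries.

    Must run before TensorFlow (or any CUDA library) is imported;
    the visible devices are then renumbered starting from 0.
    """
    os.environ["CUDA_DEVICE_ORDER"] = "PCI_BUS_ID"  # match nvidia-smi ordering
    os.environ["CUDA_VISIBLE_DEVICES"] = ",".join(str(i) for i in indices)
    return os.environ["CUDA_VISIBLE_DEVICES"]


print(select_gpus(1))
# import tensorflow as tf  # import only after the call above
```
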