borda / keras-yolo3

This project forked from qqwweee/keras-yolo3

A Keras implementation of YOLOv3 (TensorFlow backend), a successor of qqwweee/keras-yolo3

License: MIT License

Python 100.00%
cnn-keras detection classification object-detection tensorflow yolo yolov3 yolov3-tiny

keras-yolo3's Introduction

Keras: YOLO v3

Introduction

A Keras implementation of YOLOv3 (TensorFlow backend) inspired by allanzelener/YAD2K. This fork continues qqwweee/keras-yolo3 with added CI and bug fixes, since its parent has become inactive.

To install the package, use one of the following commands:

pip install git+https://github.com/Borda/keras-yolo3.git
pip install https://github.com/Borda/keras-yolo3/archive/master.zip

or clone/download the repository locally and run python setup.py install.


Quick Start

For more models and configurations, please see the YOLO website and the Darknet repository.

  1. Download YOLOv3 weights from YOLO website.
    wget -O ./model_data/yolo.weights  \
       https://pjreddie.com/media/files/yolov3.weights  \
       --progress=bar:force:noscroll
    Alternatively, you can download the lighter yolov3-tiny.weights.
  2. Convert the Darknet YOLO model to a Keras model.
    python3 scripts/convert_weights.py \
        --config_path ./model_data/yolo.cfg \
        --weights_path ./model_data/yolo.weights \
        --output_path ./model_data/yolo.h5
  3. Run YOLO detection.
    python3 scripts/detection.py \
       --path_weights ./model_data/yolo.h5 \
       --path_anchors ./model_data/yolo_anchors.csv \
       --path_classes ./model_data/coco_classes.txt \
       --path_output ./results \
       --path_image ./model_data/bike-car-dog.jpg \
       --path_video person.mp4
    For the full YOLOv3 model, proceed the same way; just specify the model path and anchor path with --path_weights <model_file> and --path_anchors <anchor_file>.
  4. Multi-GPU usage: pass --nb_gpu N to use N GPUs. The value is passed to the Keras multi_gpu_model().

Training

For training you can use the VOC dataset, the COCO dataset, or your own.

  1. Generate your own annotation file and class names file.
    • One row for one image;
    • Row format: image_file_path box1 box2 ... boxN;
    • Box format: x_min,y_min,x_max,y_max,class_id (no space).
    • Run one of the following scripts for dataset conversion:
      • scripts/annotation_voc.py
      • scripts/annotation_coco.py
      • scripts/annotation_csv.py
    Here is an example:
    path/to/img1.jpg 50,100,150,200,0 30,50,200,120,3
    path/to/img2.jpg 120,300,250,600,2
    ...
    
  2. Make sure you have run python scripts/convert_weights.py <...>. The file model_data/yolo_weights.h5 is used to load the pre-trained weights.
  3. Modify training.py and start training with python training.py. To use your trained weights or checkpoint weights with yolo_interactive.py, pass the command line option --model model_file. Remember to set the class path and anchor path with --classes class_file and --anchors anchor_file.
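
The row format from step 1 can be checked with a few lines of Python before training. This is a minimal sketch, not the project's own loader: the function name parse_annotation_row is hypothetical, and it assumes image paths contain no spaces.

```python
from typing import List, Tuple

# One box: x_min, y_min, x_max, y_max, class_id
Box = Tuple[int, int, int, int, int]


def parse_annotation_row(row: str) -> Tuple[str, List[Box]]:
    """Split one annotation row into the image path and its boxes."""
    parts = row.strip().split()
    if not parts:
        raise ValueError("empty annotation row")
    image_path, raw_boxes = parts[0], parts[1:]
    boxes = []
    for raw in raw_boxes:
        # Each box is five comma-separated integers with no spaces.
        values = [int(v) for v in raw.split(",")]
        if len(values) != 5:
            raise ValueError(f"expected 5 values per box, got {raw!r}")
        boxes.append(tuple(values))
    return image_path, boxes


path, boxes = parse_annotation_row(
    "path/to/img1.jpg 50,100,150,200,0 30,50,200,120,3"
)
print(path, boxes)
```

Running it over every row of the annotation file catches malformed boxes early, before the data generator sees them.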

If you want to use original pre-trained weights for YOLOv3:

  1. wget https://pjreddie.com/media/files/darknet53.conv.74
  2. rename it as darknet53.weights
  3. python convert.py -w darknet53.cfg darknet53.weights model_data/darknet53_weights.h5
  4. use model_data/darknet53_weights.h5 in training.py

Some issues to know

  1. The test environment is Python 3.x ; Keras 2.2.0 ; tensorflow 1.14.0
  2. Default anchors are used. If you use your own anchors, probably some changes are needed.
  3. The inference result is not totally the same as Darknet but the difference is small.
  4. Always load pretrained weights and freeze layers in the first stage of training. Or try Darknet training. It's OK if there is a mismatch warning.
  5. The training strategy is for reference only. Adjust it according to your dataset and your goal, and add further strategies if needed.
  6. To speed up training with frozen layers, train_bottleneck.py can be used. It computes the bottleneck features of the frozen model first and then trains only the last layers. This makes training on a CPU possible in a reasonable time. See this post for more information on bottleneck features.
  7. If multi-GPU training fails, consider porting to TF 2.0.

keras-yolo3's People

Contributors

6293, b02902131, borda, dleam, jiaowoboshao, johanmodin, lgyee, philtrade, qqwweee, rulerof, stefanbo92, tanakataiki

keras-yolo3's Issues

inconsistency of exported trained models

When training a model from completely random weights and then exporting just the weights

model.save_weights(path_weights)
and the complete model
yolo_model.save(path_model)
the two exports do not predict the same detections.

For both predictions, I used the same predict.py script.
Overall, after a short training on the VOC dataset, the model loaded from the weights is able to detect some people in a sample volleyball video, whereas the fully exported model predicts nothing.

random box in utils.py?

I am troubled: box_data becomes all zeros after random preprocessing for real-time data augmentation. Will this affect the training? Below is the output of the function get_random_data() in utils.py.

box:  [[344 424  71 210   1]
 [209 335  25 124   0]] len box:  2
not random box_data:  [[173. 330.  20. 155.   0.]
 [286. 404.  59. 226.   1.]
 [  0.   0.   0.   0.   0.]
 [  0.   0.   0.   0.   0.]
 [  0.   0.   0.   0.   0.]
 [  0.   0.   0.   0.   0.]
 [  0.   0.   0.   0.   0.]
 [  0.   0.   0.   0.   0.]
 [  0.   0.   0.   0.   0.]
 [  0.   0.   0.   0.   0.]
 [  0.   0.   0.   0.   0.]
 [  0.   0.   0.   0.   0.]
 [  0.   0.   0.   0.   0.]
 [  0.   0.   0.   0.   0.]
 [  0.   0.   0.   0.   0.]
 [  0.   0.   0.   0.   0.]
 [  0.   0.   0.   0.   0.]
 [  0.   0.   0.   0.   0.]
 [  0.   0.   0.   0.   0.]
 [  0.   0.   0.   0.   0.]] len box:  20
random box_data:  [[0. 0. 0. 0. 0.]
 [0. 0. 0. 0. 0.]
 [0. 0. 0. 0. 0.]
 [0. 0. 0. 0. 0.]
 [0. 0. 0. 0. 0.]
 [0. 0. 0. 0. 0.]
 [0. 0. 0. 0. 0.]
 [0. 0. 0. 0. 0.]
 [0. 0. 0. 0. 0.]
 [0. 0. 0. 0. 0.]
 [0. 0. 0. 0. 0.]
 [0. 0. 0. 0. 0.]
 [0. 0. 0. 0. 0.]
 [0. 0. 0. 0. 0.]
 [0. 0. 0. 0. 0.]
 [0. 0. 0. 0. 0.]
 [0. 0. 0. 0. 0.]
 [0. 0. 0. 0. 0.]
 [0. 0. 0. 0. 0.]
 [0. 0. 0. 0. 0.]] len box:  20
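
In the printed boxes the first value is larger than the third (344 > 71 and 209 > 25). Read in the x_min,y_min,x_max,y_max order from the README, that gives a negative width. If the random augmentation path drops boxes with non-positive width or height (an assumption about get_random_data(), not verified here), every such box would be discarded and box_data would stay at zero. A minimal check, with the hypothetical helper is_degenerate:

```python
def is_degenerate(box):
    """Return True if the box has non-positive width or height.

    Assumes the order x_min, y_min, x_max, y_max, class_id.
    """
    x_min, y_min, x_max, y_max = box[:4]
    return (x_max - x_min) <= 0 or (y_max - y_min) <= 0


# Boxes exactly as printed in the output above.
printed_boxes = [[344, 424, 71, 210, 1], [209, 335, 25, 124, 0]]
flags = [is_degenerate(box) for box in printed_boxes]
print(flags)
```

If both flags are set, the annotation file may be using a different coordinate order (for example x_min,x_max,y_min,y_max) than the one the training code expects.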

missing 1 required positional argument: 'model_image_size'

Thank you for opening a fork as the upstream seems to be largely broken with calls and shebangs to python, which for me call python2 instead of python3, and the argument parser in yolo_video.py is also completely broken, same for the FPS counter, ...

So, I tried to follow the steps in your README.md and converted the weights and now I'm trying:

python3 scripts/detection.py --path_weights ./model_data/yolov3.h5 --path_anchors ./model_data/yolo_anchors.csv --path_classes ./model_data/coco_classes.txt --path_video video.avi

But I get missing 1 required positional argument: 'model_image_size'. This is not mentioned in the Readme. Maybe it can be mentioned or maybe it even can be inferred from the input image or video or from the network model by default as it is included in the yolov3.cfg. For YOLOv3, it would be --model_image_size 608 608 by default.

Also, the progress bar looks like it doesn't work, it only shows videos: 0%| | 0/1 [00:00<?, ?it/s], so it looks like the progress is in number of videos instead of number of frames and thereby rather useless. Also it would be nice to have the option to view it directly by using cv2.imshow instead of exporting it like it was possible in the original repo.
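
One way the argument could be made optional is to give it a default in the argument parser. The sketch below is a hypothetical stand-alone parser, not the one in scripts/detection.py; the 608 608 default is taken from the suggestion above.

```python
import argparse


def build_parser():
    """Hypothetical parser showing an optional image size with a default."""
    parser = argparse.ArgumentParser(description="detection CLI sketch")
    parser.add_argument("--path_video", type=str, default=None)
    # Two integers; falls back to 608 608 when the option is omitted.
    parser.add_argument("--model_image_size", type=int, nargs=2,
                        default=(608, 608))
    return parser


args = build_parser().parse_args(["--path_video", "video.avi"])
print(args.model_image_size)
```

With nargs=2 an explicit value such as --model_image_size 416 416 arrives as a list of two integers, while the default stays the tuple given above.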

IndexError: bytearray index out of range

This is what I ran

python scripts/detection.py --path_weights ./model_data/yolo_weights_full.h5 --path_anchors ./model_data/yolo_anchors.csv --path_classes ./model_data/coco_classes.txt --path_output ./results --path_image C:\Users\HPO2KOR\Desktop\Work\venv\Patent\Labelled_Dataset\Dataset\train\text\FFDDAPMDD1.png

But I am getting an error

Traceback (most recent call last):
  File "scripts/detection.py", line 198, in <module>
    _main(**arg_params)
  File "scripts/detection.py", line 185, in _main
    predict_image(yolo, path_img, path_output)
  File "scripts/detection.py", line 94, in predict_image
    image_pred, pred_items = yolo.detect_image(image)
  File "C:\Users\HPO2KOR\Desktop\Work\venv\Patent\Dataset\Dataset\keras-yolo3-master\keras_yolo3\yolo.py", line 202, in detect_image
    out_scores[i], self.colors[c], thickness)
  File "C:\Users\HPO2KOR\Desktop\Work\venv\Patent\Dataset\Dataset\keras-yolo3-master\keras_yolo3\visual.py", line 56, in draw_bounding_box
    outline=color)
  File "C:\Users\HPO2KOR\Desktop\Work\venv\virtualenv\lib\site-packages\PIL\ImageDraw.py", line 246, in rectangle
    ink, fill = self._getink(outline, fill)
  File "C:\Users\HPO2KOR\Desktop\Work\venv\virtualenv\lib\site-packages\PIL\ImageDraw.py", line 112, in _getink
    ink = self.palette.getcolor(ink)
  File "C:\Users\HPO2KOR\Desktop\Work\venv\virtualenv\lib\site-packages\PIL\ImagePalette.py", line 109, in getcolor
    self.palette[index + 256] = color[1]
IndexError: bytearray index out of range

How do I resolve this?
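
The traceback passes through self.palette.getcolor(ink), which Pillow only reaches for palette-based images (mode "P"), as many PNG files are. A possible workaround, sketched below with a hypothetical helper ensure_rgb, is to convert the image to RGB before drawing; where exactly to call it in the detection code is left open.

```python
from PIL import Image, ImageDraw


def ensure_rgb(image):
    """Return the image in RGB mode so that RGB tuples can be drawn on it."""
    return image if image.mode == "RGB" else image.convert("RGB")


# Stand-in for a palette-based PNG loaded from disk.
palette_image = Image.new("P", (64, 64))
rgb_image = ensure_rgb(palette_image)

# Drawing a coloured outline now writes RGB pixels directly.
ImageDraw.Draw(rgb_image).rectangle([8, 8, 40, 40], outline=(255, 0, 0))
print(rgb_image.mode)
```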

Data generator problems

While using a file formatted as for the original qqwweee implementation (on which it works), I got this error.

W0829 14:36:56.934175 23120 training_generator.py:251] Your dataset iterator ran out of data; interrupting training. Make sure that your iterator can generate at least `steps_per_epoch * epochs` batches (in this case, 115800 batches). You may need touse the repeat() function when building your dataset.
Traceback (most recent call last):
  File "training.py", line 214, in <module>
    _main(**arg_params)
  File "training.py", line 205, in _main
    callbacks=[tb_logging, checkpoint, reduce_lr, early_stopping])
  File "C:\Users\micha\AppData\Local\Programs\Python\Python36\lib\site-packages\tensorflow\python\keras\engine\training.py", line 1433, in fit_generator
    steps_name='steps_per_epoch')
  File "C:\Users\micha\AppData\Local\Programs\Python\Python36\lib\site-packages\tensorflow\python\keras\engine\training_generator.py", line 300, in model_iteration
    aggregator.finalize()
  File "C:\Users\micha\AppData\Local\Programs\Python\Python36\lib\site-packages\tensorflow\python\keras\engine\training_utils.py", line 111, in finalize
    raise ValueError('Empty training data.')
ValueError: Empty training data.
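
The warning says the iterator must yield at least steps_per_epoch * epochs batches, so a generator passed to fit_generator is normally written to loop forever. The sketch below shows that pattern on plain annotation rows; cycle_batches is a hypothetical name and this is not the project's data generator. It also fails early when no rows were loaded, which is one way to end up with "Empty training data".

```python
import itertools
import random


def cycle_batches(lines, batch_size, shuffle=False, seed=None):
    """Yield batches of annotation rows forever, restarting after each pass."""
    lines = list(lines)
    if not lines:
        raise ValueError("annotation list is empty; check the dataset path")
    rng = random.Random(seed)

    def _generate():
        while True:
            if shuffle:
                rng.shuffle(lines)
            for start in range(0, len(lines), batch_size):
                yield lines[start:start + batch_size]

    return _generate()


rows = ["a.jpg 1,2,3,4,0", "b.jpg 1,2,3,4,0", "c.jpg 1,2,3,4,0"]
first_four = list(itertools.islice(cycle_batches(rows, batch_size=2), 4))
print(len(first_four))
```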

Custom Model and Custom Classes

This is the error I am facing while testing it on different weights and different anchors and different classes. How to resolve this issue?
Screenshot from 2019-11-18 10-54-33

loss: NaN while training with my own dataset

Hello,

I'm trying to migrate from qqwweee's YOLO implementation, but training with my own dataset results in loss: NaN (with qqwweee I can train just fine on the same dataset).
I have already implemented basic null checks, and my dataset goes through automated checks before training (every image is present, every box is checked by its coordinates, and every class is present in the classes file).

I have already tried different batch sizes, with the same effect.
The time at which the loss becomes NaN seems random (almost always in the 1st epoch).

I think the problem is somehow related to the data generator.
Maybe you can suggest a proper way to debug it?

image-size: [416, 416]
batch-size:
  bottlenecks: 8
  head: 48
  # the unfreeze model takes more memory
  full: 8
epochs:
  bottlenecks: 25
  head: 50
  full: 30
CB_learning-rate:
  factor: 0.01
  patience: 3
CB_stopping:
  min_delta: 0
  patience: 25
valid-split: 0.1
generator:
  augment: true
  resize_img: true
  nb_threads: 0.9
recompute-bottlenecks: false

python scripts/training.py --path_dataset train_annotations.txt --path_weights model_data\yolo_weights.h5 --path_anchors model_data/yolo_anchors.csv --path_classes model_data/custom_classes.txt --path_output logs/003 --path_config model_data/train_yolo.yaml

predict not working

Thanks a lot for this refactor, it's a million times better than the base repo. Having said that, if I train my model with non 416x416 images, predict is later unable to load the model.

I've tried this with and without the following changes in config_train.json

+    "image-size": [1600, 192],
+    "batch-size": 8,

and train.py

+    'image-size': (1600, 192),
+    'batch-size': 8,

Either way training works fine, with xval_loss as low as 25. But when using predict I always get the following error, no matter what fixes I attempt in yolo3/yolo.py around self.yolo_model.load_weights(self.weights_path):

Traceback (most recent call last):
  File "/xx/yolo3/yolo.py", line 73, in generate
    self.yolo_model = load_model(self.weights_path, compile=False)
  File "/zz/keras/engine/saving.py", line 419, in load_model
    model = _deserialize_model(f, custom_objects, compile)
  File "/zz/keras/engine/saving.py", line 221, in _deserialize_model
    model_config = f['model_config']
  File "/zz/keras/utils/io_utils.py", line 302, in __getitem__
    raise ValueError('Cannot create group in read only mode.')
ValueError: Cannot create group in read only mode.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/zz/tensorflow/python/framework/ops.py", line 1659, in _create_c_op
    c_op = c_api.TF_FinishOperation(op_desc)
tensorflow.python.framework.errors_impl.InvalidArgumentError: Dimension 0 in both shapes must be equal, but are 1 and 42. Shapes are [1,1,1024,255] and [42,1024,1,1]. for 'Assign_360' (op: 'Assign') with input shapes: [1,1,1024,255], [42,1024,1,1].

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "scripts/predict.py", line 176, in <module>
    _main(**arg_params)
  File "scripts/predict.py", line 153, in _main
    classes_path=path_classes, gpu_num=gpu_num)
  File "/xx/yolo3/yolo.py", line 59, in __init__
    self.boxes, self.scores, self.classes = self.generate()
  File "/xx/yolo3/yolo.py", line 86, in generate
    self.yolo_model.load_weights(self.weights_path)
  File "/zz/keras/engine/network.py", line 1166, in load_weights
    f, self.layers, reshape=reshape)
  File "/zz/keras/engine/saving.py", line 1058, in load_weights_from_hdf5_group
    K.batch_set_value(weight_value_tuples)
  File "/zz/keras/backend/tensorflow_backend.py", line 2465, in batch_set_value
    assign_op = x.assign(assign_placeholder)
  File "/zz/tensorflow/python/ops/variables.py", line 1762, in assign
    name=name)
  File "/zz/tensorflow/python/ops/state_ops.py", line 223, in assign
    validate_shape=validate_shape)
  File "/zz/tensorflow/python/ops/gen_state_ops.py", line 64, in assign
    use_locking=use_locking, name=name)
  File "/zz/tensorflow/python/framework/op_def_library.py", line 788, in _apply_op_helper
    op_def=op_def)
  File "/zz/tensorflow/python/util/deprecation.py", line 507, in new_func
    return func(*args, **kwargs)
  File "/zz/tensorflow/python/framework/ops.py", line 3300, in create_op
    op_def=op_def)
  File "/zz/tensorflow/python/framework/ops.py", line 1823, in __init__
    control_input_ops)
  File "/zz/tensorflow/python/framework/ops.py", line 1662, in _create_c_op
    raise ValueError(str(e))
ValueError: Dimension 0 in both shapes must be equal, but are 1 and 42. Shapes are [1,1,1024,255] and [42,1024,1,1]. for 'Assign_360' (op: 'Assign') with input shapes: [1,1,1024,255], [42,1024,1,1].

Are you sure that predict works as you expect?

The only time it works for me is when it loads the default yolo3.h5 model.
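
The last dimension of a YOLOv3 output convolution is anchors_per_scale * (num_classes + 5), with 3 anchors per scale and 5 values for the box and objectness. The two shapes in the error can be read with that formula, as in this small sketch (the helper names are hypothetical):

```python
ANCHORS_PER_SCALE = 3  # YOLOv3 predicts 3 anchor boxes per output scale


def output_filters(num_classes, anchors_per_scale=ANCHORS_PER_SCALE):
    """Number of filters in a YOLOv3 output layer for num_classes classes."""
    return anchors_per_scale * (num_classes + 5)


def classes_from_filters(filters, anchors_per_scale=ANCHORS_PER_SCALE):
    """Invert output_filters; raise if the count does not fit the formula."""
    per_anchor, remainder = divmod(filters, anchors_per_scale)
    if remainder or per_anchor < 5:
        raise ValueError(f"{filters} filters do not match the YOLOv3 layout")
    return per_anchor - 5


print(classes_from_filters(255))  # layer expected by the model
print(classes_from_filters(42))   # layer stored in the weights file
```

This gives 80 classes for the 255-filter layer and 9 classes for the 42-filter layer, which suggests the model was built with the 80-class COCO list while the weights were trained for 9 classes. Passing the class file used for training would be the first thing to check.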

Path for parameters is missing

There are many params arguments in training, detection and yolo.py. Although I set default values for the params, a path is still missing in training.py.

Training with my own dataset

Hi @Borda, thanks for the great work.
I would highly appreciate your support.
I want to train 2 new classes, flat rack container and general container, and I have some questions.

  • How many labeled images are enough (minimum)?
  • Do I have to change the anchors file?

Currently, I have trained on 5 images for each class and got a result like this.
output_test
This is my training information:

#train.txt 
training_set/general_container/images/df3b509a.jpg 1,1,268,198,20
training_set/general_container/images/image9.jpg 15,31,115,108,20
training_set/general_container/images/image10.jpg 20,29,474,288,20
training_set/general_container/images/image28.jpg 42,17,450,347,20
training_set/general_container/images/image29.jpg 34,67,465,311,20
training_set/general_container/images/image32.jpg 48,4,432,351,20
training_set/flat_rack_container/images/1.jpg 15,172,702,447,21
training_set/flat_rack_container/images/10.jpg 6,30,771,499,21
training_set/flat_rack_container/images/10FR40.jpg 104,323,1980,13232,21
training_set/flat_rack_container/images/11.jpg 18,140,464,429,21
training_set/flat_rack_container/images/20-feet-flat-rack-shipping-container-500x500.jpg 25,82,491,423,21
training_set/flat_rack_container/images/20-Flat-Rack.jpg 12,14,434,350,21 438,24,596,300,21

Screen Shot 2019-09-04 at 2 52 30 PM

#classes
general container
flat rack container

Do you know the reason?
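
One thing that can be checked directly from the listing: the rows use class IDs 20 and 21, while the classes file has only two entries. If class IDs are zero-based indices into the classes file, as the README example with class_id 0 suggests, the valid IDs here would be 0 and 1. A minimal sketch of such a check, with the hypothetical helper out_of_range_class_ids:

```python
def out_of_range_class_ids(annotation_rows, num_classes):
    """Return (row_number, class_id) pairs whose ID is not in [0, num_classes)."""
    bad = []
    for row_number, row in enumerate(annotation_rows, start=1):
        # Skip the image path; every other field is one box.
        for raw_box in row.split()[1:]:
            class_id = int(raw_box.split(",")[4])
            if not 0 <= class_id < num_classes:
                bad.append((row_number, class_id))
    return bad


rows = [
    "training_set/general_container/images/image9.jpg 15,31,115,108,20",
    "training_set/flat_rack_container/images/1.jpg 15,172,702,447,21",
]
print(out_of_range_class_ids(rows, num_classes=2))
```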

incompatible shapes with large grayscale images

After training on images with shape (1632,1088), I wanted to run detection. Unfortunately, while loading the weights into a model with the same input parameters as the training model, an error arose:

Traceback (most recent call last):
  File "detect_interactive.py", line 76, in <module>
    _main(**arg_params)
  File "detect_interactive.py", line 60, in _main
    nb_gpu=nb_gpu)
  File "D:\Publikacja\repaired_yolo\yolo3\yolo.py", line 89, in __init__
    self.boxes, self.scores, self.classes = self._create_model()
  File "D:\Publikacja\repaired_yolo\yolo3\yolo.py", line 129, in _create_model
    self.yolo_model.load_weights(self.weights_path)
  File "C:\Users\micha\AppData\Local\Programs\Python\Python36\lib\site-packages\tensorflow\python\keras\engine\training.py", line 162, in load_weights
    return super(Model, self).load_weights(filepath, by_name)
  File "C:\Users\micha\AppData\Local\Programs\Python\Python36\lib\site-packages\tensorflow\python\keras\engine\network.py", line 1424, in load_weights
    saving.load_weights_from_hdf5_group(f, self.layers)
  File "C:\Users\micha\AppData\Local\Programs\Python\Python36\lib\site-packages\tensorflow\python\keras\saving\hdf5_format.py", line 759, in load_weights_from_hdf5_group
    K.batch_set_value(weight_value_tuples)
  File "C:\Users\micha\AppData\Local\Programs\Python\Python36\lib\site-packages\tensorflow\python\keras\backend.py", line 3066, in batch_set_value
    assign_op = x.assign(assign_placeholder)
  File "C:\Users\micha\AppData\Local\Programs\Python\Python36\lib\site-packages\tensorflow\python\ops\resource_variable_ops.py", line 1141, in assign
    self._shape.assert_is_compatible_with(value_tensor.shape)
  File "C:\Users\micha\AppData\Local\Programs\Python\Python36\lib\site-packages\tensorflow\python\framework\tensor_shape.py", line 1103, in assert_is_compatible_with
    raise ValueError("Shapes %s and %s are incompatible" % (self, other))
ValueError: Shapes (1, 1, 512, 24) and (21, 512, 1, 1) are incompatible

Disclaimer - I'm not using theano

Custom model to EdgeTPU

Hi, great work with this project, thanks! I was able to train a model with my own dataset and your training.py. Inspired by this repo, I'm trying to convert the model to TF Lite, but it is slightly different from the original YOLOv3:
image
In the model I produced using training.py I observe a section around the UpSampling2D op which does not seem to be in the coco_tiny_v3 model displayed on the right. This UpSampling2D operation cannot be used in quantized models so I'm wondering if there is a way to get rid of it (as it does not seem to be part of the original model either)?

It would be awesome if I could convert my yolov3 tiny model to Coral/EdgeTPU!

buffer is too small while converting Yolo weights

Download the original model https://pjreddie.com/media/files/yolo.weights and run the conversion via

python convert_weights.py --config_path ../model_data/yolo.cfg --weights_path ../model_data/yolo.weights --output_path ../model_data/yolo.h5

This crashes with the following error message:

 73%|██████████████████████████▎                    | 79/108 [02:16<01:38,  3.41s/it]
Traceback (most recent call last):
  File "convert_weights.py", line 297, in <module>
    _main(**arg_params)
  File "convert_weights.py", line 267, in _main
    weights_file, count, weight_decay, out_index)
  File "convert_weights.py", line 168, in parse_section
    weight_decay)
  File "convert_weights.py", line 116, in parse_convolutional
    buffer=weights_file.read(weights_size * 4))
TypeError: buffer is too small for requested array
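
This error appears when the weights file ends before the number of values requested for a layer has been read. One plausible cause, not confirmed here, is a weights file that does not match the config. A reader that checks the byte count gives a clearer message; the sketch below is a hypothetical helper, not the code in convert_weights.py, and assumes little-endian float32 values.

```python
import io
import struct


def read_floats(stream, count):
    """Read exactly `count` little-endian float32 values or raise EOFError."""
    expected = count * 4  # 4 bytes per float32
    data = stream.read(expected)
    if len(data) != expected:
        raise EOFError(
            f"wanted {expected} bytes for {count} float32 values, "
            f"got {len(data)}; do the weights match the config?"
        )
    return list(struct.unpack(f"<{count}f", data))


stream = io.BytesIO(struct.pack("<3f", 1.0, 2.0, 3.0))
print(read_floats(stream, 2))
```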

multi-GPU training fails

crashes with a similar error even on training head...

INFO:root:Train on 14626 samples, val on 1625 samples, with batch size 16.
Epoch 1/150
2019-10-11 23:42:30.041545: I tensorflow/stream_executor/dso_loader.cc:152] successfully opened CUDA library libcublas.so.10.0 locally
913/914 [============================>.] - ETA: 1s - loss: 27.9974
2019-10-12 00:00:47.026746: W tensorflow/core/framework/op_kernel.cc:1401] OP_REQUIRES failed at tensor_array_ops.cc:661 : Invalid argument: TensorArray replica_0/model_3/yolo_loss/TensorArray_5484: Could not read from TensorArray index 0. Furthermore, the element shape is not fully defined: [?,?,3]. It is possible you are working with a resizeable TensorArray and stop_gradients is not allowing the gradients to be written. If you set the full element_shape property on the forward TensorArray, the proper all-zeros tensor will be returned instead of incurring this error.
2019-10-12 00:00:47.027158: W tensorflow/core/framework/op_kernel.cc:1401] OP_REQUIRES failed at tensor_array_ops.cc:661 : Invalid argument: TensorArray replica_0/model_3/yolo_loss/TensorArray_1_5485: Could not read from TensorArray index 0. Furthermore, the element shape is not fully defined: [?,?,3]. It is possible you are working with a resizeable TensorArray and stop_gradients is not allowing the gradients to be written. If you set the full element_shape property on the forward TensorArray, the proper all-zeros tensor will be returned instead of incurring this error.
2019-10-12 00:00:47.027194: W tensorflow/core/framework/op_kernel.cc:1401] OP_REQUIRES failed at tensor_array_ops.cc:661 : Invalid argument: TensorArray replica_0/model_3/yolo_loss/TensorArray_2_5486: Could not read from TensorArray index 0. Furthermore, the element shape is not fully defined: [?,?,3]. It is possible you are working with a resizeable TensorArray and stop_gradients is not allowing the gradients to be written. If you set the full element_shape property on the forward TensorArray, the proper all-zeros tensor will be returned instead of incurring this error.
Traceback (most recent call last):
  File "scripts/training.py", line 211, in <module>
    _main(**arg_params)
  File "scripts/training.py", line 182, in _main
    callbacks=[tb_logging, checkpoint, reduce_lr, early_stopping])
  File "/home/j.borovec/.local/lib/python3.6/site-packages/keras/legacy/interfaces.py", line 91, in wrapper
    return func(*args, **kwargs)
  File "/home/j.borovec/.local/lib/python3.6/site-packages/keras/engine/training.py", line 1418, in fit_generator
    initial_epoch=initial_epoch)
  File "/home/j.borovec/.local/lib/python3.6/site-packages/keras/engine/training_generator.py", line 234, in fit_generator
    workers=0)
  File "/home/j.borovec/.local/lib/python3.6/site-packages/keras/legacy/interfaces.py", line 91, in wrapper
    return func(*args, **kwargs)
  File "/home/j.borovec/.local/lib/python3.6/site-packages/keras/engine/training.py", line 1472, in evaluate_generator
    verbose=verbose)
  File "/home/j.borovec/.local/lib/python3.6/site-packages/keras/engine/training_generator.py", line 346, in evaluate_generator
    outs = model.test_on_batch(x, y, sample_weight=sample_weight)
  File "/home/j.borovec/.local/lib/python3.6/site-packages/keras/engine/training.py", line 1256, in test_on_batch
    outputs = self.test_function(ins)
  File "/home/j.borovec/.local/lib/python3.6/site-packages/keras/backend/tensorflow_backend.py", line 2715, in __call__
    return self._call(inputs)
  File "/home/j.borovec/.local/lib/python3.6/site-packages/keras/backend/tensorflow_backend.py", line 2675, in _call
    fetched = self._callable_fn(*array_vals)
  File "/home/j.borovec/.local/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1439, in __call__
    run_metadata_ptr)
  File "/home/j.borovec/.local/lib/python3.6/site-packages/tensorflow/python/framework/errors_impl.py", line 528, in __exit__
    c_api.TF_GetCode(self.status.status))
tensorflow.python.framework.errors_impl.InvalidArgumentError: TensorArray replica_0/model_3/yolo_loss/TensorArray_5484: Could not read from TensorArray index 0. Furthermore, the element shape is not fully defined: [?,?,3]. It is possible you are working with a resizeable TensorArray and stop_gradients is not allowing the gradients to be written. If you set the full element_shape property on the forward TensorArray, the proper all-zeros tensor will be returned instead of incurring this error.
         [[{{node replica_0/model_3/yolo_loss/TensorArrayStack/TensorArrayGatherV3}}]]
         [[{{node replica_1/model_3/yolo_loss/ExpandDims_3}}]]

see qqwweee#204, qqwweee#497

How to change IOU?

Hello Borda,
I want to change the IoU.
How can I change this parameter, and do I need to train again after changing it?

Looking forward to your response.
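
For reference, IoU (intersection over union) measures how much two boxes overlap. If the question is about the IoU threshold used for non-maximum suppression at detection time, that value only affects post-processing of the predictions, so retraining should not be needed; this is an assumption about which IoU is meant. A minimal sketch of the measure itself, for boxes given as x_min, y_min, x_max, y_max:

```python
def iou(box_a, box_b):
    """Intersection over union of two boxes (x_min, y_min, x_max, y_max)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Width and height of the overlap, clipped at zero for disjoint boxes.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0


print(iou((0, 0, 10, 10), (5, 0, 15, 10)))
```

A lower threshold suppresses more overlapping detections; a higher one keeps more of them.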

too many filtered objects while training

During training on my own dataset, there are too many messages like

DEBUG:root:Warning: 3 of 3 (100%) generated boxes was filtered out
DEBUG:root:Warning: 1 of 2 (50%) generated boxes was filtered out
DEBUG:root:Warning: 3 of 3 (100%) generated boxes was filtered out
DEBUG:root:Warning: 2 of 5 (40%) generated boxes was filtered out
DEBUG:root:Warning: 6 of 6 (100%) generated boxes was filtered out
DEBUG:root:Warning: 2 of 2 (100%) generated boxes was filtered out
DEBUG:root:Warning: 2 of 2 (100%) generated boxes was filtered out
DEBUG:root:Warning: 1 of 2 (50%) generated boxes was filtered out
DEBUG:root:Warning: 2 of 3 (66%) generated boxes was filtered out
DEBUG:root:Warning: 1 of 2 (50%) generated boxes was filtered out
DEBUG:root:Warning: 1 of 2 (50%) generated boxes was filtered out
DEBUG:root:Warning: 4 of 5 (80%) generated boxes was filtered out
DEBUG:root:Warning: 2 of 4 (50%) generated boxes was filtered out
DEBUG:root:Warning: 4 of 4 (100%) generated boxes was filtered out

which is quite suspicious...

logging.debug('Warning: some generated boxes was filtered out')

number of mask dimensions should be specified

ValueError                                Traceback (most recent call last)
<ipython-input-12-628fd8754cb7> in <module>
     38 input_image_shape = K.placeholder(shape=(2, ))
     39 boxes, scores, classes = yolo_eval(yolo_model.output, anchors, len(class_names), input_image_shape,
---> 40                                     score_threshold=0.3, iou_threshold=0.45)
     41 
     42 print("YOLO model ready!")

~/library/Mod04/01-Yolo/yolo_keras/model.py in yolo_eval(yolo_outputs, anchors, num_classes, image_shape, max_boxes, score_threshold, iou_threshold)
    213     for c in range(num_classes):
    214         # TODO: use keras backend instead of tf.
--> 215         class_boxes = tf.boolean_mask(boxes, mask[:, c])
    216         class_box_scores = tf.boolean_mask(box_scores[:, c], mask[:, c])
    217         nms_index = tf.image.non_max_suppression(

~/anaconda3_501/lib/python3.6/site-packages/tensorflow/python/util/dispatch.py in wrapper(*args, **kwargs)

~/anaconda3_501/lib/python3.6/site-packages/tensorflow/python/ops/array_ops.py in boolean_mask_v2(tensor, mask, axis, name)
   1423                                    #  [[7, 10],
   1424                                    #   [8, 11],
-> 1425                                    #   [9, 12]]]
   1426   ```
   1427 

~/anaconda3_501/lib/python3.6/site-packages/tensorflow/python/ops/array_ops.py in boolean_mask(tensor, mask, name, axis)
   1352     name: A name for the operation (optional).
   1353 
-> 1354   Returns:
   1355     if `num_or_size_splits` is a scalar returns `num_or_size_splits` `Tensor`
   1356     objects; if `num_or_size_splits` is a 1-D Tensor returns

ValueError: Number of mask dimensions must be specified, even if some dimensions are None.  E.g. shape=[None] is ok, but shape=None is not.

training model on second GPU, too low memory

I am running training on a GPU machine with two physical graphics cards. The first one (index 0) is 99% in use by another process, so I set training to use the second one, but somewhere in the process this is ignored and the default GPU card 0 is still requested.

export CUDA_VISIBLE_DEVICES=1
python3 scripts/training.py --path_dataset ~/Cache/Project_Video/DATASETS/ppl-detect-v2_temp/dataset.txt --path_weights ./model_data/tiny-yolo.h5 --path_anchors ./model_data/tiny-yolo_anchors.csv --path_output ./model_data --path_config ./model_data/train_tiny-yolo_ppl.yaml

failing message

2019-08-21 00:58:58.957373: W tensorflow/core/common_runtime/bfc_allocator.cc:319] *************************************************************************************************___
2019-08-21 00:58:58.957417: W tensorflow/core/framework/op_kernel.cc:1502] OP_REQUIRES failed at conv_ops.cc:486 : Resource exhausted: OOM when allocating tensor with shape[128,104,104,64] and type float on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc
Traceback (most recent call last):
  File "scripts/training.py", line 208, in <module>
    _main(**arg_params)
  File "scripts/training.py", line 200, in _main
    callbacks=[tb_logging, checkpoint, reduce_lr, early_stopping])
  File "/home/jb/.local/lib/python3.6/site-packages/keras/legacy/interfaces.py", line 91, in wrapper
    return func(*args, **kwargs)
  File "/home/jb/.local/lib/python3.6/site-packages/keras/engine/training.py", line 1418, in fit_generator
    initial_epoch=initial_epoch)
  File "/home/jb/.local/lib/python3.6/site-packages/keras/engine/training_generator.py", line 217, in fit_generator
    class_weight=class_weight)
  File "/home/jb/.local/lib/python3.6/site-packages/keras/engine/training.py", line 1217, in train_on_batch
    outputs = self.train_function(ins)
  File "/home/jb/.local/lib/python3.6/site-packages/keras/backend/tensorflow_backend.py", line 2715, in __call__
    return self._call(inputs)
  File "/home/jb/.local/lib/python3.6/site-packages/keras/backend/tensorflow_backend.py", line 2675, in _call
    fetched = self._callable_fn(*array_vals)
  File "/home/jb/.local/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1458, in __call__
    run_metadata_ptr)
tensorflow.python.framework.errors_impl.ResourceExhaustedError: 2 root error(s) found.
  (0) Resource exhausted: OOM when allocating tensor with shape[128,104,104,64] and type float on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc
         [[{{node conv2d_3/convolution}}]]
Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info.

         [[loss_1/add_12/_1041]]
Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info.

  (1) Resource exhausted: OOM when allocating tensor with shape[128,104,104,64] and type float on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc
         [[{{node conv2d_3/convolution}}]]
Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info.

0 successful operations.
0 derived errors ignored.

with available GPUs:

+-----------------------------------------------------------------------------+
| NVIDIA-SMI 410.48                 Driver Version: 410.48                    |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce RTX 208...  Off  | 00000000:09:00.0 Off |                  N/A |
| 51%   57C    P8    39W / 260W |  10773MiB / 10989MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
|   1  GeForce RTX 208...  Off  | 00000000:41:00.0 Off |                  N/A |
| 86%   78C    P2   124W / 260W |  10912MiB / 10986MiB |     45%      Default |
+-------------------------------+----------------------+----------------------+
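
Two details may help when reading the log. When CUDA_VISIBLE_DEVICES hides all but one card, the framework renumbers the visible devices from 0, so device:GPU:0 in the message does not necessarily mean physical card 0. The nvidia-smi table also shows both cards with almost all memory in use. The sketch below sets the variables from Python before the framework is imported; select_gpu is a hypothetical helper, and CUDA_DEVICE_ORDER=PCI_BUS_ID is added so that the index matches the nvidia-smi numbering.

```python
import os


def select_gpu(physical_index):
    """Expose a single physical GPU to CUDA; call before importing TensorFlow."""
    # Make CUDA enumerate devices in the same order as nvidia-smi.
    os.environ["CUDA_DEVICE_ORDER"] = "PCI_BUS_ID"
    os.environ["CUDA_VISIBLE_DEVICES"] = str(physical_index)


select_gpu(1)
# Import tensorflow / keras only after this point; the selected card
# will then appear to the framework as GPU:0.
print(os.environ["CUDA_VISIBLE_DEVICES"])
```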
