yunyang1994 / tensorflow2.0-examples

🙄 Difficult algorithm, Simple code.

License: MIT License

Jupyter Notebook 85.82% Python 14.18%

tensorflow2 tensorflow-examples deep-learning deep-neural-networks machine-learning gan linear-regression resnet reinforcement-learning image-classification

tensorflow2.0-examples's Issues

AssertionError: failed to read all data

Hello, thank you for this YOLO implementation, it's quite impressive. As I'm working through it I ran into a problem. I downloaded the weights as instructed, they are in the directory where image_demo.py is, but when executing the image_demo.py I get the following error from utils.load_weights :

assert len(wf.read()) == 0, 'failed to read all data'
AssertionError: failed to read all data

Can you please give me a hint how to resolve it? It seems like the weights are not read correctly. I am on windows 10 if that matters.
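The assertion compares what the loader consumed against what the file contains, so a first step is to check how many values the file on disk actually holds. A minimal sketch, assuming the Darknet layout of a 20-byte header followed by float32 values; `describe_weights_file` is a hypothetical helper and not part of this repository.

```python
import os

HEADER_BYTES = 20  # assumption: 5 x int32 header, as commonly read by YOLOv3 loaders


def describe_weights_file(path):
    """Return (file_size_in_bytes, number_of_float32_values_after_header)."""
    size = os.path.getsize(path)
    payload = size - HEADER_BYTES
    if payload < 0 or payload % 4 != 0:
        raise ValueError(f"{path}: size {size} is not a header plus whole float32 values")
    return size, payload // 4
```

If the reported count differs from the number of parameters the model expects, the download may be incomplete or the file may belong to a different network variant.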

upload image

Yolov3 slow?

With video_demo.py I get about 20% of the speed of your 1.0 repo. But thanks very much for sharing!

RPN training results

Here are some result images from training; processing a single image takes 3 s.

tensorflow.python.framework.errors_impl.InvalidArgumentError: RPN train.py

Traceback (most recent call last):
  File "/home/z840/PycharmProjects/TensorFlow2.0-Examples/4-Object_Detection/RPN/train.py", line 136, in <module>
    score_loss, boxes_loss = compute_loss(target_scores, target_bboxes, target_masks, pred_scores, pred_bboxes)
  File "/home/z840/PycharmProjects/TensorFlow2.0-Examples/4-Object_Detection/RPN/train.py", line 110, in compute_loss
    boxes_loss = 0.5 * tf.pow(boxes_loss, 2) * tf.cast(boxes_loss<1, tf.float32) + (boxes_loss - 0.5) * tf.cast(boxes_loss >=1, tf.float32)
  File "/home/z840/anaconda3/envs/tf2.0/lib/python3.6/site-packages/tensorflow/python/ops/math_ops.py", line 884, in binary_op_wrapper
    return func(x, y, name=name)
  File "/home/z840/anaconda3/envs/tf2.0/lib/python3.6/site-packages/tensorflow/python/ops/math_ops.py", line 1180, in _mul_dispatch
    return gen_math_ops.mul(x, y, name=name)
  File "/home/z840/anaconda3/envs/tf2.0/lib/python3.6/site-packages/tensorflow/python/ops/gen_math_ops.py", line 6487, in mul
    _six.raise_from(_core._status_to_exception(e.code, message), None)
  File "<string>", line 3, in raise_from
tensorflow.python.framework.errors_impl.InvalidArgumentError: cannot compute Mul as input #1(zero-based) was expected to be a double tensor but is a float tensor [Op:Mul] name: mul/

This error occurred during training. Could you please take a look?
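The failing line is a smooth L1 loss. The message says the first operand of the multiplication is float64 while the mask produced by tf.cast is float32, so casting boxes_loss to tf.float32 before that line is one likely fix; where the float64 originates is an assumption, since the traceback does not show it. For reference, the same piecewise formula in plain Python, taking an absolute difference as input:

```python
def smooth_l1(abs_diff):
    """Smooth L1 on a non-negative absolute difference, mirroring the formula on line 110."""
    if abs_diff < 1:
        return 0.5 * abs_diff ** 2   # quadratic near zero
    return abs_diff - 0.5            # linear for large errors
```

The two branches meet at abs_diff = 1, where both give 0.5.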

How to save the model file in yolov3 project.

I just added one line:
model.save("./yolov3")
in
for epoch in range(cfg.TRAIN.EPOCHS):
    for image_data, target in trainset:
        train_step(image_data, target, epoch)
    for image_data, target in valset:
        validate_step(image_data, target, epoch)
    model.save_weights("./yolov3")
    model.save("./yolov3")

Is that enough, without any sess.run or saver.save?

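tensorflow 2.0 - yolov3 - How to build my own training set for training

I want to do object detection on things like fruit and vegetables. Following your data format, how do I build my own dataset? Do I need to annotate with labelImg to produce XML labels and then convert them to your format? Is there a more convenient way?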
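One way to go from labelImg output to a text annotation file can be sketched as follows. It assumes labelImg was used in its Pascal VOC XML mode and that the target format is one line per image of the form `image_path x_min,y_min,x_max,y_max,class_id ...`; that format and the helper name `voc_xml_to_line` are assumptions for illustration, so check them against the repository's own annotation files.

```python
import xml.etree.ElementTree as ET


def voc_xml_to_line(xml_text, image_path, class_names):
    """Convert one Pascal VOC XML annotation to one text line.

    Assumed output format: image_path x_min,y_min,x_max,y_max,class_id ...
    """
    root = ET.fromstring(xml_text)
    parts = [image_path]
    for obj in root.iter("object"):
        class_id = class_names.index(obj.findtext("name"))
        box = obj.find("bndbox")
        coords = [int(float(box.findtext(tag))) for tag in ("xmin", "ymin", "xmax", "ymax")]
        parts.append(",".join(str(v) for v in coords + [class_id]))
    return " ".join(parts)
```

Calling this once per XML file and writing the returned lines to a text file gives one annotation line per image.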

How to train a model with our own data?

Hi, thanks for sharing your amazing work!

I want to know a few things about your implementation of Yolo V3 on TF2:

How do we train the model if we want another input size, like 608 or 1056? Is changing __C.TRAIN.INPUT_SIZE in the config enough, or should we recalculate anything else?
Can we use transfer learning from another pre-trained model, or is training always from scratch?

I tried to train it, but the loss became nan after ~4000 steps:

=> STEP 4051   lr: 0.000979   giou_loss: 3.13   conf_loss: 5.88   prob_loss: 0.88   total_loss: 9.89
=> STEP 4052   lr: 0.000978   giou_loss:  nan   conf_loss: 8.29   prob_loss: 1.40   total_loss:  nan
=> STEP 4053   lr: 0.000978   giou_loss:  nan   conf_loss:  nan   prob_loss:  nan   total_loss:  nan

Also, when I tried to test my model I got this error:

conv_weights = conv_weights.reshape(conv_shape).transpose([2, 3, 1, 0])
ValueError: cannot reshape array of size 4814 into shape (64,32,3,3)

Should I take some action before testing my model? Should I load the weights from the .index file?

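Why is depth estimation gone?

Please don't remove it, I really need that code. I had just based my thesis proposal on it and now it has disappeared...

Is the tensorflow2.0 yolov3 based on keras or pure tensorflow?

The comments disagree about the speed of tf2. In actual use, which is faster, tf1 or tf2?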

Process is getting Killed during training on yymnist

Hi, @YunYang1994.
Thanks for the awesome project!
Problem: During training the process is getting Killed.

I'm training at night, so no other process should be competing for the CPU. This is the 4th time in a row that training has been stopped, and I can't resume from where it stopped; it starts from the beginning. Any ideas, guys?

Update 1:
Changed batch_size to 2

Why do no anchor boxes appear on the predicted image?

TensorRT

Has anyone tried running YOLOv3 inference with TensorRT and TF2, by any chance? I can't get trt.TrtGraphConverter() to convert, for whatever reason.

Difference between bbox_iou and bbox_giou?

Can you explain the difference between the functionality of bbox_iou and bbox_giou?
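The difference is easiest to see on two boxes that do not overlap: IoU is 0 no matter how far apart they are, while GIoU subtracts a penalty based on the smallest box enclosing both, so it keeps changing with distance. The sketch below follows the standard definitions; whether the repository's bbox_iou and bbox_giou match it line by line is an assumption. It does not guard against zero-area boxes, which would cause a division by zero.

```python
def box_area(b):
    """Area of a box given as (x_min, y_min, x_max, y_max); 0 if it is empty."""
    return max(0.0, b[2] - b[0]) * max(0.0, b[3] - b[1])


def iou_and_giou(a, b):
    """Return (IoU, GIoU) for two boxes in (x_min, y_min, x_max, y_max) form."""
    inter = box_area((max(a[0], b[0]), max(a[1], b[1]), min(a[2], b[2]), min(a[3], b[3])))
    union = box_area(a) + box_area(b) - inter
    iou = inter / union
    # Smallest axis-aligned box that contains both inputs.
    enclose = box_area((min(a[0], b[0]), min(a[1], b[1]), max(a[2], b[2]), max(a[3], b[3])))
    giou = iou - (enclose - union) / enclose
    return iou, giou
```

For identical boxes both values are 1; for disjoint boxes IoU is 0 and GIoU is negative, approaching -1 as the boxes move apart.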

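yolov3.data-00000-of-00001: the trained model is in tf format, how do I test images?

image_demo loads yolo3.weights. If I use a model trained with train.py, how do I test it? My test images are unlabeled; I just want to see the actual detection results. Thanks!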

Tensorflow-2

Hello,
I'm one of the fans of Tensorflow-2 and have created a repository to collect the best of resources available which use this version.
Recently I saw your fantastic repository and notebooks about TF v2.0.
So I added it to my repo. I would appreciate it if you could help me collect other material you think might be useful.
Thanks.

Train custom dataset with pre-trained weights

When I try to use utils.load_weights in train.py it fails. Do you have a tf format version of pre-trained weights from darknet or some other way to initialise the network with it?

FCN

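A question for the OP: I don't quite understand why, in FCN, POOL4 and POOL3 are scaled before they are combined.

A question about STRIDE

In the decode part of yolov3.py, why are the STRIDEs for [conv_sbbox, conv_mbbox, conv_lbbox] equal to [8, 16, 32]? Shouldn't the small conv_sbbox be multiplied by the larger factor of 32 to map back to the input?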
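A likely source of the confusion is the naming: the "s" in conv_sbbox most plausibly refers to small objects, not to a small feature map. Small objects are predicted on the highest-resolution map, where each cell covers the fewest input pixels. The arithmetic below assumes an input size of 416, which is not stated in the question.

```python
INPUT_SIZE = 416            # assumption: a common YOLOv3 input size
STRIDES = [8, 16, 32]       # small-, medium-, large-object branches
ANCHORS_PER_SCALE = 3

# Each branch's grid size is the input size divided by its stride.
grid_sizes = [INPUT_SIZE // s for s in STRIDES]
total_boxes = sum(ANCHORS_PER_SCALE * g * g for g in grid_sizes)
print(grid_sizes, total_boxes)  # [52, 26, 13] 10647
```

The total of 10647 matches the first dimension of the (10647, 85) prediction shape reported in other issues on this page.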

I ran image_demo and video and got the following error: ValueError: could not broadcast input array from shape (10647,85) into shape (1,85)

A question about a code detail

iou = bbox_iou(pred_xywh[:, :, :, :, np.newaxis, :], bboxes[:, np.newaxis, np.newaxis, np.newaxis, :, :])
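How are the dimensions expanded when computing the IoU between the predicted boxes and the ground-truth boxes?
#bboxes shape (batch_size,max_bbox_per_scale,4)-->(batch_size,1,1,1,max_bbox_per_scale,4)
#pred_xywh shape (batch_size,out_size,out_size,3,4)-->(batch_size,out_size,out_size,3,1,4)
Are the shapes after expansion as shown above? How does the computation proceed when the shapes of the two tensors do not match? For example, how is left_up = tf.maximum(boxes1[..., :2], boxes2[..., :2]) computed? Thanks to the author.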
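The mismatch is resolved by broadcasting: shapes are compared axis by axis from the right, and an axis of size 1 is stretched to match the other operand. The sketch below implements only that shape rule in plain Python so the result can be checked by hand; the sizes 2, 52 and 150 are example values, not taken from the question.

```python
def broadcast_shape(a, b):
    """Result shape of broadcasting two shapes under the NumPy/TensorFlow rule."""
    a, b = list(a), list(b)
    # Left-pad the shorter shape with 1s so both have the same rank.
    while len(a) < len(b):
        a.insert(0, 1)
    while len(b) < len(a):
        b.insert(0, 1)
    out = []
    for x, y in zip(a, b):
        if x != y and 1 not in (x, y):
            raise ValueError(f"cannot broadcast {x} with {y}")
        out.append(y if x == 1 else x)
    return tuple(out)


# boxes1[..., :2] and boxes2[..., :2] keep every axis except the last, which becomes 2.
pred = (2, 52, 52, 3, 1, 2)      # (batch, out_size, out_size, 3, 1, 2)
truth = (2, 1, 1, 1, 150, 2)     # (batch, 1, 1, 1, max_bbox_per_scale, 2)
print(broadcast_shape(pred, truth))  # (2, 52, 52, 3, 150, 2)
```

So tf.maximum compares every predicted box at every cell and anchor with every ground-truth box of the same image, element by element.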

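Where can I download the image data?

In the RPN example, is there supposed to be more data in synthetic_dataset? Where can I download it?

loss: nan appears after running more than 40000 epochs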

=> STEP 46130 lr: 0.000270 giou_loss: 0.39 conf_loss: 0.02 prob_loss: 0.00 total_loss: 0.41
=> STEP 46131 lr: 0.000270 giou_loss: 0.37 conf_loss: 0.01 prob_loss: 0.00 total_loss: 0.38
=> STEP 46132 lr: 0.000270 giou_loss: nan conf_loss: 0.88 prob_loss: 0.00 total_loss: nan
=> STEP 46133 lr: 0.000270 giou_loss: nan conf_loss: nan prob_loss: nan total_loss: nan
=> STEP 46134 lr: 0.000270 giou_loss: nan conf_loss: nan prob_loss: nan total_loss: nan

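Does this also count as exploding gradients?
The data I'm using is the CCPD2019 license plate recognition image set.
After several hours of training, the loss had been decreasing nicely from 1800 down to 0.38.
It is at 46134 epochs (the STEP counter in the log); it takes about 80,000 to complete one pass over the dataset.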
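In the log, nan first appears only in giou_loss and reaches every term one step later, which is consistent with a single non-finite value being applied to the weights. Whether the cause is exploding gradients or a degenerate box (for example a zero-area box leading to a division by zero) cannot be read from the log alone. A minimal sketch of a guard that skips such an update and reports which term failed; `apply_update` is a hypothetical callable standing in for the optimizer step.

```python
import math


def guarded_step(losses, apply_update):
    """Apply the update only if every loss term is finite.

    losses: dict mapping a loss name to a float.
    apply_update: hypothetical zero-argument callable that applies the gradients.
    Returns the names of the non-finite terms (an empty list means the update ran).
    """
    bad = [name for name, value in losses.items() if not math.isfinite(value)]
    if not bad:
        apply_update()
    return bad
```

Logging the batch that triggered a non-empty result makes it possible to inspect the offending annotations.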

total_loss: nan?

=> STEP  748   lr: 0.000598   giou_loss: 2.10   conf_loss: 6.18   prob_loss: 0.03   total_loss: 8.31
=> STEP  749   lr: 0.000599   giou_loss: 2.54   conf_loss: 6.51   prob_loss: 0.02   total_loss: 9.07
=> STEP  750   lr: 0.000600   giou_loss:  nan   conf_loss: 10.89   prob_loss: 0.06   total_loss:  nan
=> STEP  751   lr: 0.000601   giou_loss:  nan   conf_loss:  nan   prob_loss:  nan   total_loss:  nan
=> STEP  752   lr: 0.000602   giou_loss:  nan   conf_loss:  nan   prob_loss:  nan   total_loss:  nan

Found a problem

dataset.py
bbox_xywh = np.concatenate([(bbox_coor[2:] + bbox_coor[:2]) * 0.5, bbox_coor[2:] - bbox_coor[:2]], axis=-1)
bbox_xywh_scaled = 1.0 * bbox_xywh[np.newaxis, :] / self.strides[:, np.newaxis]

        iou = []
        exist_positive = False
        for i in range(3):
            anchors_xywh = np.zeros((self.anchor_per_scale, 4))
            anchors_xywh[:, 0:2] = np.floor(bbox_xywh_scaled[i, 0:2]).astype(np.int32) + 0.5
            anchors_xywh[:, 2:4] = self.anchors[i]

            iou_scale = self.bbox_iou(bbox_xywh_scaled[i][np.newaxis, :], anchors_xywh)

In this part, the width and height in bbox_xywh_scaled have been divided by the stride, so anchors_xywh needs to be divided by the stride as well.

Load Weights in demo.py going wrong - YOLOV3 OBJECT DETECTION

Hello,

While testing for the first time, I downloaded the YOLO weights and tried image_demo.py, but it fails in load_weights() at conv_layer_name = conv2d_9

model.Summary():
conv2d_9 (Conv2D) (None, 52, 52, 256) 294912 zero_padding2d_2[0][0]

traceback

Traceback (most recent call last):
  File "image_demo.py", line 41, in <module>
    utils.load_weights(model, "./yolov3.weights")
  File "/PATH/TensorFlow2.0-Examples/4-Object_Detection/YOLOV3/core/utils.py", line 51, in load_weights
    conv_weights = conv_weights.reshape(conv_shape).transpose([2, 3, 1, 0])
ValueError: cannot reshape array of size 95566 into shape (256,128,3,3)

Any clue what is going on? I don't really want to change the load_weights function.

Kind regards

Following this repo. Keep it up!

I am a newbie to YOLOv3. Recently I have been working on training my dataset with YOLOv3 and want to run it on my TX2, so I was fortunate to find this repository right at the start.

Hope that I can support your work one day.

My question: can someone share a YOLO learning group?

image_demo.py results in nan values in bboxes_pred

Hi,
Thanks for this amazing project @YunYang1994 .

When using the image_demo.py, I'm getting the following warnings:

/yolov3-tf2-master/YunYang1994_TF2_YoloV3/TensorFlow2.0-Examples/4-Object_Detection/YOLOV3/core/utils.py:221: RuntimeWarning: invalid value encountered in maximum
  pred_coor = np.concatenate([np.maximum(pred_coor[:, :2], [0, 0]),
/yolov3-tf2-master/YunYang1994_TF2_YoloV3/TensorFlow2.0-Examples/4-Object_Detection/YOLOV3/core/utils.py:222: RuntimeWarning: invalid value encountered in minimum
  np.minimum(pred_coor[:, 2:], [org_w - 1, org_h - 1])], axis=-1)
/yolov3-tf2-master/YunYang1994_TF2_YoloV3/TensorFlow2.0-Examples/4-Object_Detection/YOLOV3/core/utils.py:228: RuntimeWarning: invalid value encountered in greater
  scale_mask = np.logical_and((valid_scale[0] < bboxes_scale), (bboxes_scale < valid_scale[1]))
/yolov3-tf2-master/YunYang1994_TF2_YoloV3/TensorFlow2.0-Examples/4-Object_Detection/YOLOV3/core/utils.py:228: RuntimeWarning: invalid value encountered in less
  scale_mask = np.logical_and((valid_scale[0] < bboxes_scale), (bboxes_scale < valid_scale[1]))
/yolov3-tf2-master/YunYang1994_TF2_YoloV3

And afterwards, I'm printing the pred_bbox and bboxes.
The predicted bboxes are almost all nans (and the value of bboxes doesn't matter because it uses pred_bbox):

tf.Tensor(
[[          nan           nan           nan ...           nan
            nan           nan]
 [          nan           nan           nan ...           nan
            nan           nan]
 [          nan           nan           nan ...           nan
            nan           nan]
 ...
 [3.8400000e+02 3.9251508e+02 2.5982959e+02 ... 1.7753243e-04
  5.0899386e-04 7.8248978e-04]
 [3.8401508e+02 3.8405893e+02 3.9810013e-02 ... 6.5720081e-04
  1.8835366e-03 2.6790202e-03]
 [3.8415375e+02 3.8413541e+02 7.9363394e-01 ... 1.9850866e-03
  2.3146670e-03 3.1724779e-03]], shape=(10647, 15), dtype=float32)

The output image has no bboxes.
I'm using the weights that are downloaded from the link in your Readme file.

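Loss suddenly becomes all nan after warmup_steps

Trained on a two-class dataset.

How can I run inference in C++ with the trained model? Thanks!

How can I run inference in C++ with the trained model? Please give me some hints, thanks!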

Custom Object Detection with TF & YOLOV3

Hey, thanks for the repo.
Is it possible for you to provide a tutorial or instruction sheet for custom object detection? I would like to use TF2.x and YOLOV3.

I'm a new learner and want to train with my custom dataset. Can you provide some insight also about real-time object detection with CPU or using a dedicated embedded system like Pi or Jetson Nano?

Thanks..

Google Coral EdgeTPU

Hey @YunYang1994, do you think it's possible to convert the YOLOv3 model into a tflite model that can run on a Coral Edge TPU?

From docs:

You need to convert your model to TensorFlow Lite and it must be quantized using either quantization-aware training.
https://coral.ai/docs/edgetpu/

Thanks for your answer...

About training time!

When training from scratch, roughly how long does it take, and what loss can generally be reached?

image_demo error on Tensorflow==2.0.0

Hello,
First, I'd like to thank you for your great work; your examples helped me a lot!
However, after upgrading from tensorflow version 2.0.0b0 to 2.0.0, something seems to be broken in the yolov3 structure. When running image_demo.py, the code fails on this line:

pred_bbox = model.predict(image_data)

The error is as follows:

  File "/tensorflow-2.0.0/python3.6/tensorflow_core/python/keras/engine/training.py", line 909, in predict
    use_multiprocessing=use_multiprocessing)
  File "/tensorflow-2.0.0/python3.6/tensorflow_core/python/keras/engine/training_arrays.py", line 722, in predict
    callbacks=callbacks)
  File "/tensorflow-2.0.0/python3.6/tensorflow_core/python/keras/engine/training_arrays.py", line 400, in model_iteration
    aggregator.aggregate(batch_outs, batch_start, batch_end)
  File "/tensorflow-2.0.0/python3.6/tensorflow_core/python/keras/engine/training_utils.py", line 343, in aggregate
    result.aggregate(batch_element, batch_start, batch_end)
  File "/tensorflow-2.0.0/python3.6/tensorflow_core/python/keras/engine/training_utils.py", line 267, in aggregate
    batch_element.shape, self.results.shape))
ValueError: Mismatch between expected batch size and model output batch size. Output shape = (10647, 85), expected output shape = shape (1, 85)
Kind regards

Test set for FCN-8

The FCN-8 example uses the VOC2007 and VOC2012 datasets for training and the VOC2007 dataset for evaluation. However the VOC2012 train dataset contains the VOC2007 train and test images.

So basically you use a part of the train dataset to evaluate the network! No surprise that the results are nearly perfect!

With other images the results are really mediocre. Here's an example (the cyclist is segmented as bicycle(green) or dog (dark blue), the bike as person (pink) or car (cyan), the regions are quite random...)

Still have to check the performance of FCN-8 with other implementations to see if it's an implementation problem, or this is the normal performance of the FCN-8 network.

about SSIM loss of monodepth

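Hello, my English isn't good so I'll write in Chinese. I'd like to ask whether the SSIM loss in your monodepth should be changed to 1 - SSIM loss. I tried running it and found that the image becomes less and less like the original. Also, does using the zip() function mean it can no longer run on the GPU?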
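SSIM is a similarity score that equals 1 for identical images, so minimizing it directly would drive the images apart, which fits the behavior described. A common formulation turns it into a dissimilarity, (1 - SSIM) / 2 clipped to [0, 1]. The sketch below uses global statistics over two flat lists to keep it short; real implementations compute the statistics over local windows, and whether the repository's code needs this change is not confirmed here.

```python
def ssim_global(x, y, c1=0.01 ** 2, c2=0.03 ** 2):
    """SSIM from global statistics of two equal-length lists of values in [0, 1].

    Simplification: real implementations use local windows, not whole-image statistics.
    """
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    vx = sum((v - mx) ** 2 for v in x) / n
    vy = sum((v - my) ** 2 for v in y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y)) / n
    return ((2 * mx * my + c1) * (2 * cov + c2)) / ((mx ** 2 + my ** 2 + c1) * (vx + vy + c2))


def ssim_loss(x, y):
    """Dissimilarity in [0, 1]: 0 for identical inputs, larger for less similar ones."""
    return min(1.0, max(0.0, (1.0 - ssim_global(x, y)) / 2.0))
```

With this form, a perfect reconstruction gives a loss of 0 and the optimizer is pushed toward similarity.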

How to do validation after some train steps automatically?

Hi, I want to add validation to your train.py in yolov3, so that I will know when the network is overfitting. Is it enough to add a valid_step function and call it after a certain number of train steps?
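That approach can be sketched as a plain loop. Here `train_step` and `valid_step` are hypothetical callables, and `valid_step` is assumed to compute the loss without applying gradients; `val_batches` must be re-iterable (for example a list), since it is traversed at every validation round.

```python
def train_with_validation(train_batches, val_batches, train_step, valid_step, val_every=100):
    """Run valid_step over val_batches after every `val_every` training steps.

    train_step and valid_step are hypothetical callables; valid_step returns a float loss.
    Returns a list of (global_step, mean_validation_loss).
    """
    history = []
    for step, batch in enumerate(train_batches, start=1):
        train_step(batch)
        if step % val_every == 0:
            losses = [valid_step(b) for b in val_batches]
            history.append((step, sum(losses) / len(losses)))
    return history
```

A validation loss that rises while the training loss keeps falling is the usual sign of overfitting.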

raise TypeError("Using a `tf.Tensor` as a Python `bool`

If anyone faces the following error message:
TypeError("Using a tf.Tensor as a Python bool )

because of the BatchNormalization() function, just replace that function call with the built-in keras layer.

bn: conv = tf.keras.layers.BatchNormalization()(conv)

Extending YOLOV3 to non-square images

Hello again,

I have now tested your YOLOV3 code and it works perfectly! Currently I am trying to extend the solution to accepting non-square images (trying to train on my own data).

I already did the necessary rescaling of the inputs (taking into account that the convolutions have to "add up") but am currently stuck on adjusting the decode function.

Here is excerpt from your code:

output_size      = conv_shape[1]
conv_output = tf.reshape(conv_output, (batch_size, output_size, output_size, anchor_per_scale, 5 + self.num_class))

...
y = tf.tile(tf.range(output_size, dtype=tf.int32)[:, tf.newaxis], [1, output_size])
x = tf.tile(tf.range(output_size, dtype=tf.int32)[tf.newaxis, :], [output_size, 1])

xy_grid = tf.concat([x[:, :, tf.newaxis], y[:, :, tf.newaxis]], axis=-1)
xy_grid = tf.tile(xy_grid[tf.newaxis, :, :, tf.newaxis, :], [batch_size, 1, 1, anchor_per_scale, 1])
xy_grid = tf.cast(xy_grid, tf.float32)

This is where I am struggling. I wrote the following:

    output_size      = (conv_shape[1], conv_shape[2])
    conv_output = tf.reshape(conv_output, (batch_size, output_size[0], output_size[1], 3, 5 + NUM_CLASS))

But have no idea how to handle this:

y = tf.tile(tf.range(output_size, dtype=tf.int32)[:, tf.newaxis], [1, output_size])
x = tf.tile(tf.range(output_size, dtype=tf.int32)[tf.newaxis, :], [output_size, 1])
xy_grid = tf.concat([x[:, :, tf.newaxis], y[:, :, tf.newaxis]], axis=-1)
xy_grid = tf.tile(xy_grid[tf.newaxis, :, :, tf.newaxis, :], [batch_size, 1, 1, 3, 1])

Could you give me a hint what this part of the code does and how would I extend it to a non-square format? My final feature maps are (batch_size, 38,17, 3xNUM_CLASS) while yours are (batch_size, 13,13, 3xNUM_CLASS)

Thank you!
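The quoted lines build, for every feature-map cell, its own (x, y) index; the decoded box centers are then expressed relative to that cell. For a non-square map the x index runs over the width and the y index over the height. The plain-Python sketch below shows the intended content of the grid, assuming the (batch_size, 38, 17, ...) layout means height 38 and width 17; `make_xy_grid` is a hypothetical name.

```python
def make_xy_grid(height, width):
    """grid[y][x] == (x, y): the column and row index of each feature-map cell."""
    return [[(x, y) for x in range(width)] for y in range(height)]


grid = make_xy_grid(38, 17)
print(len(grid), len(grid[0]), grid[5][3])  # 38 17 (3, 5)
```

In the TensorFlow version this would correspond to tiling a range over the height along the columns and a range over the width along the rows, instead of using one output_size for both.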

How to save weights automatically?

I am asking how to save weights automatically,
because an epoch is too long and training takes me a lot of time.

upsample function in common.py yolov3

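Why did you define the upsample function using tf.image.resize instead of tf.keras.layers.UpSampling2D? Is there any specific reason?

Can anyone tell me how the target in dataset is obtained?

Help: regarding tensorboard, is there any difference between events files saved by tf2.0 and tf1.0?

First, thanks to the author. The training and testing code all run, and the results on my own dataset are decent, but the saved log files always show "No scalar data was found." in tensorboard. I have tried many approaches without success, so I'd like to ask how the author does it. Thanks! My tensorboard is 1.14.0, and copying the logs to another computer gives the same problem.

yolov3-tensorflow2.0 release version: cv2 cannot read video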

env:
tensorflow-gpu== 2.0.0
opencv-python== 4.1.1.26
numpy==1.16.4
Pillow==6.1.0
scipy==1.2.1
wget==3.2
seaborn==0.9.0
easydict==1.9

cv2 itself reads the video without problems; I have tested this.
But in the tensorflow2.0 video demo, reading the video fails with raise ValueError("No image!")
Please check where the problem is.

IndexError: index 52 is out of bounds for axis 1 with size 52!

`
IndexError                                Traceback (most recent call last)
/content/train.py in <module>()
     73
     74 for epoch in range(cfg.TRAIN.EPOCHS):
---> 75     for image_data, target in trainset:
     76         train_step(image_data, target)
     77     model.save_weights("./yoface")

1 frames
/content/core/dataset.py in preprocess_true_boxes(self, bboxes)
    230 xind, yind = np.floor(bbox_xywh_scaled[best_detect, 0:2]).astype(np.int32)
    231
--> 232 label[best_detect][yind, xind, best_anchor, :] = 0
    233 label[best_detect][yind, xind, best_anchor, 0:4] = bbox_xywh
    234 label[best_detect][yind, xind, best_anchor, 4:5] = 1.0

IndexError: index 52 is out of bounds for axis 1 with size 52
`

As shown above, I am using the WIDER FACE dataset to train my model, but I get an IndexError. There seems to be some problem with my code; please help me solve it.
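An index of 52 on a 52-cell axis means the scaled box center is at or beyond the right or bottom edge of the image, for example a center of 416 with stride 8. A minimal sketch of clamping the cell index into the valid range; `cell_index` is a hypothetical helper, and clamping only hides the symptom, so annotations that extend to or past the image border are still worth checking.

```python
import math


def cell_index(center, stride, grid_size):
    """Grid cell containing `center`, clamped to the valid range [0, grid_size - 1]."""
    return min(max(int(math.floor(center / stride)), 0), grid_size - 1)
```

Clipping the box coordinates to the image bounds before computing the center is an alternative that fixes the same case earlier in the pipeline.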

About TRAIN.DATA_AUG

I don't understand this part of the code. Are the generated images not used?

        if self.data_aug:
            image, bboxes = self.random_horizontal_flip(np.copy(image), np.copy(bboxes))
            image, bboxes = self.random_crop(np.copy(image), np.copy(bboxes))
            image, bboxes = self.random_translate(np.copy(image), np.copy(bboxes))

So I modified ./core/yolov3.py and replaced all uses of tf.newaxis with tf.expand_dims.

For example:

# original code
# y = tf.tile(tf.range(output_size, dtype=tf.int32)[:, tf.newaxis], [1, output_size])
# x = tf.tile(tf.range(output_size, dtype=tf.int32)[tf.newaxis, :], [output_size, 1])

# my version
y = tf.range(output_size, dtype=tf.int32)
y = tf.expand_dims(y, -1)
y = tf.tile(y, [1, output_size])
x = tf.range(output_size,dtype=tf.int32)
x = tf.expand_dims(x, 0)
x = tf.tile(x, [output_size, 1])

Script to convert COCO to image list format?

Are there any scripts that convert COCO to the image list format?
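A conversion along these lines can be sketched in a few lines of Python. It assumes the image list format is one line per image of the form `image_path x_min,y_min,x_max,y_max,class_id ...`, which should be checked against the repository's own annotation files; the function names are hypothetical.

```python
import json


def coco_to_lines(coco, image_dir=""):
    """Convert a loaded COCO annotation dict to text lines.

    Assumed output format: image_path x_min,y_min,x_max,y_max,class_id ...
    """
    # COCO category ids are not contiguous, so map them to 0..N-1.
    ordered = sorted(coco["categories"], key=lambda c: c["id"])
    cat_index = {c["id"]: i for i, c in enumerate(ordered)}
    boxes = {}
    for ann in coco["annotations"]:
        x, y, w, h = ann["bbox"]  # COCO stores (x_min, y_min, width, height)
        values = (x, y, x + w, y + h, cat_index[ann["category_id"]])
        boxes.setdefault(ann["image_id"], []).append(",".join(str(int(v)) for v in values))
    lines = []
    for img in coco["images"]:
        if img["id"] in boxes:  # images without annotations are skipped
            lines.append(" ".join([image_dir + img["file_name"]] + boxes[img["id"]]))
    return lines


def coco_file_to_lines(json_path, image_dir=""):
    """Load a COCO annotation file and convert it."""
    with open(json_path) as f:
        return coco_to_lines(json.load(f), image_dir)
```

Coordinates are truncated to integers here; keep them as floats if the training code accepts that.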
