yunyang1994 / tensorflow-yolov3
🔥 TensorFlow Code for technical report: "YOLOv3: An Incremental Improvement"
Home Page: https://yunyang1994.gitee.io/2018/12/28/YOLOv3-算法的一点理解/
License: MIT License
Hey guys, I implemented a new version of YOLOv3. It supports training, and the results on my own dataset are fantastic. The repo is here: https://github.com/wizyoung/YOLOv3_TensorFlow
Hello, in your code I couldn't find where moving_variance and moving_mean of the batch norm layers are updated during the training phase. Is this a mistake, or did you do it on purpose?
Hi! I'm facing an issue when I run video_demo.py. I'm looking forward to your reply. Thanks.
VIDEOIO ERROR: V4L: can't open camera by index 0
Traceback (most recent call last):
File "video_demo.py", line 37, in
raise ValueError("No image!")
ValueError: No image!
There is a problem when I run train.py:
$ python train.py
File "train.py", line 40
images,*y_true = example
^
SyntaxError: invalid syntax
After a couple of epochs, I want to save the graph to a .pb file and use nms_demo.py to try out the network.
I tried using utils.freeze_graph, but I don't know which tensors I should give it to save, or which tensors to use to get the output.
In convert_weight.py you are using utils.freeze_graph(sess, './checkpoint/yolov3_gpu_nms.pb', ["concat_10", "concat_11", "concat_12"]), but those tensors do not exist in the new graph.
And nms_demo.py needs to know what the input and output tensors are:
input_tensor, output_tensors = utils.read_pb_return_tensors(gpu_nms_graph, "./checkpoint/yolov3_gpu_nms.pb", ["Placeholder:0", "concat_10:0", "concat_11:0", "concat_12:0"])
What do you suggest?
With the current setup, we cannot use batches bigger than 1. The error says the examples in the tfrecord have different shapes. I resized the images, corrected the bboxes for the new image size, and saved everything into the tfrecords file. But then the error said that the bboxes have different sizes (some pictures have more boxes than others).
So I saved y_true (for all three sizes: 13x13, 26x26, 52x52) in the tfrecords file. This increased the size of the tfrecords file significantly, and my system runs out of memory if I choose batches bigger than 4.
What do you suggest I use to make all the input samples uniform, so that we can use bigger batches?
Do you have any plans to change the batch size?
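One common way to make per-image box lists uniform is to pad them to a fixed row count before batching. The following is only a minimal sketch under stated assumptions, not the repository's method: the row layout [x1, y1, x2, y2, class_id], the cap MAX_BOXES, and the helper pad_boxes are all hypothetical.

```python
# Hypothetical cap on boxes per image; not a value taken from the repository.
MAX_BOXES = 4


def pad_boxes(boxes, max_boxes=MAX_BOXES):
    """Pad or truncate a list of [x1, y1, x2, y2, class_id] rows to max_boxes rows."""
    padded = [list(box) for box in boxes[:max_boxes]]
    while len(padded) < max_boxes:
        # A class id of -1 marks a padding row (an assumed convention).
        padded.append([0.0, 0.0, 0.0, 0.0, -1.0])
    return padded


# Two images with different numbers of boxes now share one shape: (4, 5).
batch = [pad_boxes(boxes) for boxes in (
    [[10, 20, 50, 80, 0]],
    [[5, 5, 30, 30, 1], [40, 40, 90, 90, 2]],
)]
```

With this approach the padding rows would have to be skipped when building y_true, for example by ignoring rows whose class id is -1.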
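I trained on my images with darknet and tested with darknet, and the results were quite good. But after converting the .weights file to .pb and testing with nms_demo.py, the results are worse than darknet's. I don't know what the problem is o.o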
tensorflow 1.12
3.1 quick train
python quick_train.py
Traceback (most recent call last):
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/client/session.py", line 1334, in _do_call
return fn(*args)
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/client/session.py", line 1319, in _run_fn
options, feed_dict, fetch_list, target_list, run_metadata)
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/client/session.py", line 1407, in _call_tf_sessionrun
run_metadata)
tensorflow.python.framework.errors_impl.OutOfRangeError: End of sequence
[[{{node IteratorGetNext}} = IteratorGetNextoutput_shapes=[[?,416,416,3], , , ], output_types=[DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT], _device="/job:localhost/replica:0/task:0/device:CPU:0"]]
[[{{node IteratorGetNext/_735}} = _Recvclient_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device_incarnation=1, tensor_name="edge_382_IteratorGetNext", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:GPU:0"]]
Calling utils.load_weights(...., ) makes the loss NaN. Do you have this issue?
Hi, I think there is a typo in script 'quick_test.py':
img = PIL.Image.open(path)
img.resize(size = (IMAGE_H, IMAGE_W))
which should be:
img.resize(size = (IMAGE_W, IMAGE_H))
Hi, thanks for your wonderful work!
I saw this code in loss_layer():
# caculate iou between true boxes and pred boxes
intersect_xy1 = tf.maximum(true_box_xy - true_box_wh / 2.0,
pred_box_xy - pred_box_xy / 2.0)
intersect_xy2 = tf.maximum(true_box_xy + true_box_wh / 2.0,
pred_box_xy + pred_box_wh / 2.0)
intersect_wh = tf.maximum(intersect_xy2 - intersect_xy1, 0.)
Is there something wrong? I think it should be "intersect_xy2 = tf.minimum(true_box_xy + true_box_wh / 2.0,
pred_box_xy + pred_box_wh / 2.0)".
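For reference, the intersection of two axis-aligned boxes takes the maximum of the two top-left corners and the minimum of the two bottom-right corners. The quoted snippet also appears to subtract pred_box_xy / 2.0 where pred_box_wh / 2.0 would be expected. Below is a minimal plain-Python sketch for a single pair of boxes, assuming the (center_x, center_y, width, height) format; box_iou is a hypothetical helper, not a function from the repository.

```python
def box_iou(box_a, box_b):
    """IoU of two boxes given as (center_x, center_y, width, height)."""
    ax, ay, aw, ah = box_a
    bx, by, bw, bh = box_b
    # Top-left corner of the intersection: the larger of the two top-left corners.
    ix1 = max(ax - aw / 2.0, bx - bw / 2.0)
    iy1 = max(ay - ah / 2.0, by - bh / 2.0)
    # Bottom-right corner of the intersection: the smaller of the two bottom-right corners.
    ix2 = min(ax + aw / 2.0, bx + bw / 2.0)
    iy2 = min(ay + ah / 2.0, by + bh / 2.0)
    # Clamp at zero so disjoint boxes get an empty intersection.
    inter = max(ix2 - ix1, 0.0) * max(iy2 - iy1, 0.0)
    union = aw * ah + bw * bh - inter
    return inter / union if union > 0 else 0.0
```

With maximum on both corners, as in the quoted code, the "intersection" would instead span to the farther bottom-right corner and overestimate the overlap.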
The FPS is higher than other implementations; I hope it can support multi-GPU.
Hi, I ran the demo code and it seems that the CPU graph runs faster than the GPU graph.
Is that normal? @YunYang1994
File ".../tensorflow-yolov3/core/utils.py", line 379, in preprocess_true_boxes
y_true[l][i, j, k, 5+c] = 1
IndexError: index 58 is out of bounds for axis 3 with size 25
Hi,
I am trying to train, and RAM usage gradually increases until the process is killed for running out of memory. I tested with TF 1.12 and 1.10. Thanks.
Hi,
thanks for sharing your code, which helps a lot.
But there is a problem: when we run train.py, three ckpt files are saved. How can we run these model files to test performance?
I have tried two ways. One is to use test.py, but it says
Traceback (most recent call last):
File "test.py", line 28, in
saver = tf.train.Saver()
File "/home/suhuiqiao/anaconda3/lib/python3.5/site-packages/tensorflow/python/training/saver.py", line 1293, in __init__
self.build()
File "/home/suhuiqiao/anaconda3/lib/python3.5/site-packages/tensorflow/python/training/saver.py", line 1302, in build
self._build(self._filename, build_save=True, build_restore=True)
File "/home/suhuiqiao/anaconda3/lib/python3.5/site-packages/tensorflow/python/training/saver.py", line 1327, in _build
raise ValueError("No variables to save")
ValueError: No variables to save
The other way I tried was convert_weight.py --ckpt_file file --freeze, but it says
Traceback (most recent call last):
File "/home/suhuiqiao/anaconda3/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 1361, in _do_call
return fn(*args)
File "/home/suhuiqiao/anaconda3/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 1340, in _run_fn
target_list, status, run_metadata)
File "/home/suhuiqiao/anaconda3/lib/python3.5/site-packages/tensorflow/python/framework/errors_impl.py", line 516, in __exit__
c_api.TF_GetCode(self.status.status))
tensorflow.python.framework.errors_impl.InvalidArgumentError: Assign requires shapes of both tensors to match. lhs shape= [255] rhs shape= [33]
[[Node: save/Assign_349 = Assign[T=DT_FLOAT, _class=["loc:@yolov3/yolo-v3/Conv_6/biases"], use_locking=true, validate_shape=true, _device="/job:localhost/replica:0/task:0/device:GPU:0"](yolov3/yolo-v3/Conv_6/biases, save/RestoreV2/_149)]]
Could you help me with this problem? Thanks a lot!
Hi,
First, thanks a lot for making this project.
I spotted a potential issue in how ground-truth boxes are assigned in this code (lines 99 and 100), rewritten as:
i = np.floor(gt_boxes[t,0]/self.image_w*grid_sizes[l][1]).astype('int32')
j = np.floor(gt_boxes[t,1]/self.image_h*grid_sizes[l][0]).astype('int32')
With the code above, a label can be assigned to the wrong cell, as illustrated in the attached picture: the label should be in grid cell (1,1), but the code puts it in cell (0,0) because of the floor operation. So I suggest using a round operation instead. I know the network can still learn from the training data, but I think rounding would be more consistent in this case and would make it easier for the network to learn. What do you think?
Thanks.
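To compare the two operations concretely, here is a small plain-Python sketch that maps a box-center coordinate to a cell index both ways. It assumes a 416-pixel image and a 13-cell grid; cell_index is a hypothetical helper, not code from the repository.

```python
import math


def cell_index(center, image_size, grid_size, op=math.floor):
    """Map a box-center coordinate in pixels to a grid cell index."""
    return int(op(center / image_size * grid_size))


# Print floor-based and round-based indices side by side for two centers.
for center in (24.0, 415.0):
    floor_idx = cell_index(center, 416, 13)
    round_idx = cell_index(center, 416, 13, op=round)
    print(center, floor_idx, round_idx)
```

One thing to watch with rounding: a center close to the right or bottom edge (415.0 here) rounds to index 13, which is outside a 13-cell grid, whereas floor keeps it at 12.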
I think there should be an __init__.py in the core dir. Before adding it, I was getting an import error for yolov3.py and utils.py when running convert_weight.py.
I'm on Windows.
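Actually, I have a question: where does the k-means anchor selection come into play? quick_train.py doesn't seem to involve it. I'm a beginner and have many doubts; I hope you can answer them.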
In utils.py, in the function resize_image_correct_bbox, you are getting the image size with
image_size = tf.to_float(tf.shape(image)[1:3])[::-1]
but I think it should be [0:2] instead of [1:3]
image_size = tf.to_float(tf.shape(image)[0:2])[::-1]
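Which slice is right depends on whether the tensor carries a batch dimension. The sketch below uses plain Python tuples in place of tf.shape and assumes the usual HWC (single image) and NHWC (batch) layouts; width_height is a hypothetical helper for illustration only.

```python
def width_height(shape, batched):
    """Return (width, height) from an HWC or NHWC shape tuple."""
    hw = shape[1:3] if batched else shape[0:2]
    # Reverse (height, width) into (width, height), like [::-1] in the quoted code.
    return tuple(hw[::-1])


single = width_height((416, 608, 3), batched=False)
batch = width_height((8, 416, 608, 3), batched=True)
```

Applying [1:3] to the unbatched shape (416, 608, 3) would instead pick (608, 3), which reverses to (3, 608).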
When computing loss_class:
loss_class = tf.nn.sigmoid_cross_entropy_with_logits(labels=y_true_i[..., 5:], logits=pred_boxes_class)
pred_boxes_class has already been passed through the sigmoid function, so I think tf.nn.sigmoid_cross_entropy_with_logits can't be used here.
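To illustrate the concern, the sketch below reimplements the numerically stable formula documented for tf.nn.sigmoid_cross_entropy_with_logits, max(x, 0) - x * z + log(1 + exp(-|x|)), in plain Python for a single scalar. It then compares feeding a raw logit with feeding an already-squashed probability. The sample logit 4.0 is an arbitrary assumption.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def sigmoid_cross_entropy_with_logits(label, logit):
    """Scalar version of the stable form: max(x, 0) - x * z + log(1 + exp(-|x|))."""
    return max(logit, 0.0) - logit * label + math.log1p(math.exp(-abs(logit)))


logit, label = 4.0, 1.0
# Correct use: the function applies the sigmoid internally.
loss_from_logit = sigmoid_cross_entropy_with_logits(label, logit)
# Passing a probability as the "logit" applies the sigmoid twice.
loss_from_prob = sigmoid_cross_entropy_with_logits(label, sigmoid(logit))
print(loss_from_logit, loss_from_prob)
```

In this example the double-sigmoid loss comes out larger than the loss from the raw logit even though the prediction is confident and correct, because the squashed input is confined to (0, 1) and can never act as a strongly positive or negative logit.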
File "train.py", line 69, in
weights_file="../yolo_weghts/darknet53.conv.74")
File "/media/cjw/data/AI_allin/model/computerVision/tensorflow-yolov3/core/utils.py", line 258, in load_weights
with open(weights_file, "rb") as fp:
FileNotFoundError: [Errno 2] No such file or directory: '../yolo_weghts/darknet53.conv.74'
Please start with a larger learning rate and then decrease it gradually, to speed up training. Thanks.
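The schedule requested above could look like exponential decay. This is only a sketch in plain Python: the function name and every default value (initial_lr, decay_steps, decay_rate) are hypothetical and are not taken from the repository's configuration.

```python
def decayed_learning_rate(step, initial_lr=1e-3, decay_steps=100, decay_rate=0.5):
    """Exponential decay: initial_lr * decay_rate ** (step / decay_steps)."""
    return initial_lr * decay_rate ** (step / decay_steps)


# Print the rate at a few steps; with these defaults it halves every 100 steps.
for step in (0, 100, 200):
    print(step, decayed_learning_rate(step))
```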
Hi @YunYang1994
I am curious as to why we cannot easily change the network input size of YOLOv3 in a TensorFlow implementation. In the original implementation by pjreddie and alexey, the network input size can be changed in an instant to trade accuracy against speed. Is this because TensorFlow is graph-based? Thank you!
Hi,
I ran train.py from the latest package to train YOLOv3, but it shows
loss_class += result[3]
IndexError: tuple index out of range
quick_train.py runs successfully for me and shows the results 1.0 and 0.0.
Could you help check the issue in train.py?
Thanks a lot for your help! =)
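Hello:
I'd like to ask about the pb files generated under the checkpoint folder. There are three versions (cpu, gpu, and feature), and you use different versions in different places. What is the difference between them? Also, what does nms mean?
Thank you very much.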
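Hello everyone, I am the author of this repository. Since I ran into so many pitfalls while reproducing tensorflow-yolov3, I'm opening this thread to help everyone avoid detours. If you have questions, leave a comment below.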
The code in loss_layer() is hard to understand without the loss function written out.
Thanks a lot ! @YunYang1994
Have you solved it? I have the same problem.
When I trained on pascal_voc2012, I got this error in dataset.py:
line 107, in preprocess_true_boxes
y_true[l][j, i, k, 5+c] = 1.
IndexError: index 425 is out of bounds for axis 3 with size 25
Why am I getting this error?
If anyone has a solution, please tell me.
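An index of 5+c beyond an axis of size 25 means the class id c read from the annotation is not below the 20 classes the label tensor was built for (25 = 5 + 20). In the traceback above, index 425 corresponds to c = 420, which looks more like a pixel coordinate than a class id, so the annotation columns may be parsed in the wrong order. That is only a guess. The check below is a hedged sketch for scanning annotations before training; find_bad_class_ids and the row layout [x1, y1, x2, y2, class_id] are assumptions.

```python
def find_bad_class_ids(boxes, num_classes):
    """Return (row_index, class_id) for rows whose class id cannot index 5 + num_classes channels."""
    bad = []
    for row, box in enumerate(boxes):
        class_id = int(box[4])  # assumed: the class id is the fifth column
        if not 0 <= class_id < num_classes:
            bad.append((row, class_id))
    return bad


# The second row carries a class id that a 20-class label tensor cannot hold.
print(find_bad_class_ids([[0, 0, 1, 1, 3], [0, 0, 1, 1, 420]], num_classes=20))
```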
"Nan" always appears during training.
=> EPOCH: 132 loss_xy: 1.5719 loss_wh: 2.2539 loss_conf:189.6436 loss_class: 3.0253 total_loss:196.4947 rec_50:0.00 prec_50:0.00
=> EPOCH: 133 loss_xy: 1.6151 loss_wh: 2.1801 loss_conf:194.6695 loss_class: 2.7355 total_loss:201.2003 rec_50:0.00 prec_50:0.00
=> EPOCH: 134 loss_xy: 1.0573 loss_wh: 1.9318 loss_conf:184.6295 loss_class: 1.5075 total_loss:189.1261 rec_50:0.00 prec_50:0.00
=> EPOCH: 135 loss_xy: 1.2824 loss_wh: 1.9386 loss_conf:178.7914 loss_class: 1.0579 total_loss:183.0704 rec_50:0.00 prec_50:0.00
=> EPOCH: 136 loss_xy: nan loss_wh: nan loss_conf: nan loss_class: nan total_loss: nan rec_50:0.00 prec_50:0.00
=> EPOCH: 137 loss_xy: nan loss_wh: nan loss_conf: nan loss_class: nan total_loss: nan rec_50:0.00 prec_50:0.00
=> EPOCH: 138 loss_xy: nan loss_wh: nan loss_conf: nan loss_class: nan total_loss: nan rec_50:0.00 prec_50:0.00
=> EPOCH: 139 loss_xy: nan loss_wh: nan loss_conf: nan loss_class: nan total_loss: nan rec_50:0.00 prec_50:0.00
=> EPOCH: 140 loss_xy: nan loss_wh: nan loss_conf: nan loss_class: nan total_loss: nan rec_50:0.00 prec_50:0.00
=> EPOCH: 141 loss_xy: nan loss_wh: nan loss_conf: nan loss_class: nan total_loss: nan rec_50:0.00 prec_50:0.00
=> EPOCH: 142 loss_xy: nan loss_wh: nan loss_conf: nan loss_class: nan total_loss: nan rec_50:0.00 prec_50:0.0
================================================================
Below is my configuration information:
IMAGE_H, IMAGE_W = 416, 416
BATCH_SIZE = 8
EPOCHS = 2000*1000
LR = 0.0005 #0.0005
SHUFFLE_SIZE = 1000
Hi! I am facing an error when I run python nms_demo.py. I look forward to your reply, thank you.
File "E:/tensorflow-yolov3-master/nms_demo.py", line 48, in
image = utils.draw_boxes(img, boxes, scores, labels, classes, SIZE,show=True)
File "E:\tensorflow-yolov3-master\core\utils.py", line 190, in draw_boxes
draw.rectangle(bbox, outline=colors[labels[i]], width=3)
TypeError: rectangle() got an unexpected keyword argument 'width'
![qq 20181214122643](https://user-images.githubusercontent.com/37647733/49983334-1316a680-ff9d-11e8-81c8-bc8b3bcd12a9.png)
OutOfRangeError: End of sequence
[[{{node cond/IteratorGetNext_1}} = IteratorGetNextoutput_shapes=[[?,416,416,3], [?,?,5]], output_types=[DT_FLOAT, DT_FLOAT], _device="/job:localhost/replica:0/task:0/device:CPU:0"]]
Caused by op 'cond/IteratorGetNext_1', defined at:
............................................
When I run show_input_image.py, the problem above occurs. Could you please analyze it?
Hi, do you have a plan to support tiny YOLOv3 soon?
I think it could be really helpful.
Hi, thanks for this repository. I'm facing an issue when I run $ python convert_weight.py --convert --freeze
I'm getting this error:
File "/home/ahsan/YOLOtf/core/yolov3.py", line 125, in get_boxes_confs_scores
box_centers = box_centers * stride
TypeError: can't multiply sequence by non-int of type 'Tensor'
I'm using Tensorflow-gpu 1.5.0
It would be more helpful if you translate your comments to English. This way I can also help with adding more features, such as training.
=> EPOCH: 341 loss_xy: 1.6257 loss_wh: 4.0940 loss_conf:4570.6030 loss_class:14.2002
=> EPOCH: 342 loss_xy: 1.5741 loss_wh: 2.6803 loss_conf:4656.8652 loss_class:11.8622
=> EPOCH: 343 loss_xy: 1.3638 loss_wh: 2.3212 loss_conf:4657.7090 loss_class:11.4552
2019-01-22 16:34:48.172989: W tensorflow/core/framework/op_kernel.cc:1261] Unknown: IndexError: index 13 is out of bounds for axis 0 with size 13
Traceback (most recent call last):
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/ops/script_ops.py", line 206, in __call__
ret = func(*args)
File "........../tensorflow-yolov3/core/dataset.py", line 105, in preprocess_true_boxes
y_true[l][j, i, k, 0:4] = gt_boxes[t, 0:4]
IndexError: index 13 is out of bounds for axis 0 with size 13
What's wrong?
Hi,
I have a question: when I run quick_train_data.py, nothing happens.
Further, when I run train.py, no tfrecord file is saved by dataset.py. It is not in ./model_data, although the processing output is shown:
print('Processed {} of {} images'.format(index + 1, len(image_data)))
When I do have the tfrecord, the following message is shown:
OutOfRangeError (see above for traceback): End of sequence
[[{{node PyFunc}} = PyFuncTin=[DT_FLOAT], Tout=[DT_FLOAT, DT_FLOAT, DT_FLOAT], token="pyfunc_0"]]
[[node IteratorGetNext (defined at train.py:27) = IteratorGetNextoutput_shapes=[[?,?,?,3], [?,?,5], , , ], output_types=[DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT], _device="/job:localhost/replica:0/task:0/device:CPU:0"]]
Could you please tell me how to train your model? Thank you.
When I trained on my own data following your steps, the program failed as follows:
File "/home/lgl/PycharmProjects/tensorflow-yolov3-New/quick_train.py", line 76, in
run_items = sess.run([train_op, write_op, y_pred, y_true] + loss, feed_dict={is_training:True})
File "/home/lgl/.conda/envs/tensorflow-gpu/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 889, in run
run_metadata_ptr)
File "/home/lgl/.conda/envs/tensorflow-gpu/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1120, in _run
feed_dict_tensor, options, run_metadata)
File "/home/lgl/.conda/envs/tensorflow-gpu/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1317, in _do_run
options, run_metadata)
File "/home/lgl/.conda/envs/tensorflow-gpu/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1336, in _do_call
raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.UnknownError: IndexError: index 57 is out of bounds for axis 1 with size 52.
After I changed the anchors, the error changed into UnknownError: IndexError: index ** is out of bounds for axis 1 with size **.
File "train.py", line 44, in
loss = model.compute_loss(y_pred, y_true)
File ".../tensorflow-yolov3/core/yolov3.py", line 269, in compute_loss
loss_class += result[3]
IndexError: tuple index out of range
Has anyone exported the output graph and used it with the C++ API?
Hi,
This line
tensorflow-yolov3/video_demo.py
Line 27 in 60c39ee
images, *y_true = example
^
SyntaxError: invalid syntax
Is there a way to fix this without using Python 3?
I'm currently using Python 2 and I don't want to install TensorFlow again.
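Starred unpacking is Python 3 syntax, but the same split can be written with indexing and slicing, which works in both Python 2 and Python 3. This is a minimal sketch assuming example is a tuple or list; the string values are placeholders standing in for the real tensors.

```python
# Placeholder values standing in for the image batch and the three label tensors.
example = ("images", "y_true_13", "y_true_26", "y_true_52")

# Python 3 only:  images, *y_true = example
# Equivalent form that also parses under Python 2:
images, y_true = example[0], list(example[1:])
```

This only addresses the syntax error; other parts of the code base may still rely on Python 3 features.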