zhen8838 / k210_yolo_framework

A YOLO v3 framework based on TensorFlow that supports multiple models, multiple datasets, any number of output layers, any number of anchors, model pruning, and porting models to the K210.

License: MIT License

k210_yolo_framework's Introduction

K210 YOLO V3 framework

This is a clear, extensible YOLO v3 framework:

Real-time display of recall and precision
Easy to use with other datasets
Supports multiple model backbones, with room to add more
Supports n output layers and m anchors
Supports model weight pruning
Models can be ported to the Kendryte K210 chip

Training on Voc

Set Environment

Tested on Ubuntu 18.04 with Python 3.7.1; the other dependencies are listed in requirements.txt.

Prepare dataset

First, use the YOLO scripts to download and label the data:

wget https://pjreddie.com/media/files/VOCtrainval_11-May-2012.tar
wget https://pjreddie.com/media/files/VOCtrainval_06-Nov-2007.tar
wget https://pjreddie.com/media/files/VOCtest_06-Nov-2007.tar
tar xf VOCtrainval_11-May-2012.tar
tar xf VOCtrainval_06-Nov-2007.tar
tar xf VOCtest_06-Nov-2007.tar
wget https://pjreddie.com/media/files/voc_label.py
python3 voc_label.py
cat 2007_train.txt 2007_val.txt 2012_*.txt > train.txt

Now you have train.txt. Next, merge the image paths and annotations into one npy file:

python3 make_voc_list.py xxxx/train.txt data/voc_img_ann.npy
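
As a rough illustration of what is being merged in this step, the sketch below maps an image path from train.txt to its label file and parses the label lines. The directory convention (`JPEGImages` replaced by `labels`, `.jpg` by `.txt`) follows what voc_label.py writes; the helper names are hypothetical, and the actual layout of the .npy file produced by make_voc_list.py is not shown here.

```python
from pathlib import PurePosixPath


def label_path_for(image_path: str) -> str:
    """Map an image path to its label path (JPEGImages/x.jpg -> labels/x.txt)."""
    p = PurePosixPath(image_path)
    parts = ["labels" if part == "JPEGImages" else part for part in p.parts]
    return str(PurePosixPath(*parts).with_suffix(".txt"))


def parse_label_lines(lines):
    """Parse 'class x y w h' lines into (class_id, x, y, w, h) tuples."""
    boxes = []
    for line in lines:
        fields = line.split()
        if len(fields) != 5:
            continue  # skip blank or malformed lines
        boxes.append((int(fields[0]), *map(float, fields[1:])))
    return boxes


print(label_path_for("VOCdevkit/VOC2007/JPEGImages/000001.jpg"))
print(parse_label_lines(["0 0.5 0.25 0.1 0.2", ""]))
```

Keeping each image path next to its parsed boxes in a single file avoids re-reading thousands of small label files on every training run.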

Make anchors

Load the annotations and generate the anchors (LOW and HIGH depend on the distribution of the dataset):

make anchors DATASET=voc ANCNUM=3 LOW='.0 .0' HIGH='1. 1.'

On success, a figure showing the generated anchors is displayed.

NOTE: The k-means result is random. If you get an error, just rerun it.
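
To make the randomness concrete, here is a minimal sketch of anchor clustering in the style commonly used for YOLO: k-means over box (width, height) pairs with 1 - IoU as the distance. This is not necessarily how make_anchor_list.py is implemented; the function names, the `seed` parameter, and the sample boxes are assumptions for illustration only.

```python
import random


def wh_iou(box, anchor):
    """IoU of two boxes that share the same centre, each given as (w, h)."""
    inter = min(box[0], anchor[0]) * min(box[1], anchor[1])
    union = box[0] * box[1] + anchor[0] * anchor[1] - inter
    return inter / union


def kmeans_anchors(boxes, k, iters=100, seed=0):
    """Cluster (w, h) pairs into k anchors; the result depends on the seed."""
    rng = random.Random(seed)
    anchors = rng.sample(boxes, k)  # random initial anchors
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for box in boxes:
            best = max(range(k), key=lambda i: wh_iou(box, anchors[i]))
            clusters[best].append(box)
        new_anchors = []
        for i, members in enumerate(clusters):
            if not members:
                new_anchors.append(anchors[i])  # keep an anchor that lost all boxes
                continue
            new_anchors.append((sum(w for w, _ in members) / len(members),
                                sum(h for _, h in members) / len(members)))
        if new_anchors == anchors:
            break  # converged
        anchors = new_anchors
    return sorted(anchors, key=lambda a: a[0] * a[1])


# Hypothetical normalised (w, h) pairs standing in for a real dataset.
boxes = [(0.1, 0.1), (0.11, 0.09), (0.5, 0.5), (0.52, 0.48), (0.9, 0.9), (0.88, 0.92)]
print(kmeans_anchors(boxes, 3))
```

Because the initial anchors are sampled at random, a different seed can converge to a different set of anchors, which is why rerunning can help.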

If you want to use a custom dataset, write a script that generates data/{dataset_name}_img_ann.npy, then run make anchors DATASET=dataset_name. For more options, see python3 ./make_anchor_list.py -h

If you want to change the number of output layers, modify OUTSIZE in the Makefile.

Download pre-trained model

You must download the weights of the model you want to train, because the pre-trained weights are loaded by default. Put the files into the K210_Yolo_framework/data directory.

My demo uses yolo_mobilev1 with DEPTHMUL=0.75.

| `MODEL` | `DEPTHMUL` | Url (Google Drive) | Url (Weiyun) |
|---|---|---|---|
| yolo_mobilev1 | 0.5 | google drive | weiyun |
| yolo_mobilev1 | 0.75 | google drive | weiyun |
| yolo_mobilev1 | 1.0 | google drive | weiyun |
| yolo_mobilev2 | 0.5 | google drive | weiyun |
| yolo_mobilev2 | 0.75 | google drive | weiyun |
| yolo_mobilev2 | 1.0 | google drive | weiyun |
| tiny_yolo | | google drive | weiyun |
| yolo | | google drive | weiyun |

NOTE: The MobileNet is not the original one; I have modified it to fit the K210.

Train

When you use a MobileNet backbone, you need to specify the DEPTHMUL parameter. You do not need to set DEPTHMUL for tiny yolo or yolo.

Set MODEL and DEPTHMUL to start training:
```
make train MODEL=xxxx DEPTHMUL=xx MAXEP=10 ILR=0.001 DATASET=voc CLSNUM=20 IAA=False BATCH=16
```
You can press Ctrl+C to stop training; the weights and model are saved automatically in the log directory.

Set CKPT to continue training:

make train MODEL=xxxx DEPTHMUL=xx MAXEP=10 ILR=0.0005 DATASET=voc CLSNUM=20 IAA=False BATCH=16 CKPT=log/xxxxxxxxx/yolo_model.h5

Set IAA=True to enable data augmentation:

make train MODEL=xxxx DEPTHMUL=xx MAXEP=10 ILR=0.0001 DATASET=voc CLSNUM=20 IAA=True BATCH=16 CKPT=log/xxxxxxxxx/yolo_model.h5

Use tensorboard:
```
tensorboard --logdir log
```

NOTE: For more options, see python3 ./keras_train.py -h

Inference

make inference MODEL=xxxx DEPTHMUL=xx CLSNUM=xx CKPT=log/xxxxxx/yolo_model.h5 IMG=data/people.jpg

You can try it with my model:

make inference MODEL=yolo_mobilev1 DEPTHMUL=0.75 CKPT=asset/yolo_model.h5 IMG=data/people.jpg

make inference MODEL=yolo_mobilev1 DEPTHMUL=0.75 CKPT=asset/yolo_model.h5 IMG=data/dog.jpg

NOTE: Since the anchors are randomly generated, your results will differ from the image above. You just need to load this model and continue training for a while.

For more options, see python3 ./keras_inference.py -h

Prune Model

make train MODEL=xxxx MAXEP=1 ILR=0.0003 DATASET=voc CLSNUM=20 BATCH=16 PRUNE=True CKPT=log/xxxxxx/yolo_model.h5 END_EPOCH=1

When training finishes, the model is saved as log/xxxxxx/yolo_prune_model.h5.
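
The README does not say how sparsity grows during pruning. As a sketch only, assuming a polynomial-decay schedule of the kind offered by the TensorFlow Model Optimization toolkit, the target sparsity at a given step could be computed as below. The function name, the `power=3` exponent, and the step counts are assumptions; the 0.5 and 0.9 values mirror the `--prune_initial_sparsity` and `--prune_final_sparsity` defaults that appear in the training logs quoted in the issues.

```python
def polynomial_sparsity(step, initial=0.5, final=0.9,
                        begin_step=0, end_step=1000, power=3):
    """Target sparsity at `step`, rising from `initial` to `final`."""
    if step <= begin_step:
        return initial
    if step >= end_step:
        return final
    progress = (step - begin_step) / (end_step - begin_step)
    # Sparsity rises quickly at first and flattens out near end_step.
    return final + (initial - final) * (1 - progress) ** power


for step in (0, 250, 500, 1000):
    print(step, round(polynomial_sparsity(step), 4))
```

Under this assumption most of the weights are removed early, leaving the later steps for the network to recover accuracy at nearly the final sparsity.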

Freeze

toco --output_file mobile_yolo.tflite --keras_model_file log/xxxxxx/yolo_model.h5

Now you have mobile_yolo.tflite

Convert Kmodel

Please refer to the nncase v0.1.0-RC5 example.

Demo

Use kendryte-standalone-sdk v0.5.6

KD233

Use Kflash.py

kflash yolo3_frame_test_public/kfpkg/kpu_yolov3.kfpkg -B kd233 -p /dev/ttyUSB0 -b 2000000 -t

MAIXPY GO

Use Kflash.py

kflash yolo3_frame_test_public_maixpy/kfpkg/kpu_yolov3.kfpkg -B goE -p /dev/ttyUSB1 -b 2000000 -t

NOTE: I only use the Kendryte yolov2 demo code to prove the validity of the model.

If you need standard yolov3 region layer code, you can purchase it from me.

Caution

The default parameters are defined in the Makefile
OBJWEIGHT, NOOBJWEIGHT, and WHWEIGHT are used to balance precision and recall
By default there are two output layers; if you want more, modify OUTSIZE
If you want to use the full YOLO, change IMGSIZE and OUTSIZE in the Makefile to the original YOLO parameters
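
The relationship between IMGSIZE and OUTSIZE can be sketched as a division by the downsampling stride of each output layer. The strides 32 and 16 are inferred from the defaults visible in the training logs (`--image_size 224 320` with `--output_size 7 10 14 20`); the function name is hypothetical.

```python
def output_sizes(image_hw, strides=(32, 16)):
    """Return the (height, width) of each output grid for a given input size."""
    height, width = image_hw
    sizes = []
    for stride in strides:
        if height % stride or width % stride:
            raise ValueError(f"{image_hw} is not divisible by stride {stride}")
        sizes.append((height // stride, width // stride))
    return sizes


print(output_sizes((224, 320)))               # default two-layer setup
print(output_sizes((416, 416), (32, 16, 8)))  # original YOLO v3 input with three layers
```

Checking divisibility up front catches an IMGSIZE that cannot produce whole-numbered grids before training starts.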

k210_yolo_framework's Issues

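Hello, I ran into this problem when training with a custom dataset: ValueError: Empty training data. All the steps before training work, and I previously got training and deployment working with the original VOC dataset. I am doing a small project with 4 classes. The dataset is in VOC format and the earlier steps all work, but as soon as I reach the training step it fails: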

(tf115) F:\1_work\K210\k210-for-yolo\yolo-for-trash_detection-k210>make train MODEL=yolo_mobilev1 DEPTHMUL=0.75 MAXEP=10 ILR=0.001 DATASET=voc CLSNUM=4 IAA=False BATCH=8
python ./keras_train.py
--train_set voc
--class_num 4
--pre_ckpt ""
--model_def yolo_mobilev1
--depth_multiplier 0.75
--augmenter False
--image_size 224 320
--output_size 7 10 14 20
--batch_size 8
--rand_seed 3
--max_nrof_epochs 10
--init_learning_rate 0.001
--learning_rate_decay_factor 0
--obj_weight 1
--noobj_weight 1
--wh_weight 1
--obj_thresh 0.7
--iou_thresh 0.5
--vaildation_split 0.05
--log_dir log
--is_prune False
--prune_initial_sparsity 0.5
--prune_final_sparsity 0.9
--prune_end_epoch 5
--prune_frequency 100
2020-08-13 16:31:51.021329: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cudart64_100.dll
WARNING:tensorflow:
The TensorFlow contrib module will not be included in TensorFlow 2.0.
For more information, please see:

https://github.com/tensorflow/community/blob/master/rfcs/20180907-contrib-sunset.md
https://github.com/tensorflow/addons
https://github.com/tensorflow/io (for I/O related ops)
If you depend on functionality not listed there, please file an issue.

2020-08-13 16:31:54.680652: I tensorflow/core/platform/cpu_feature_guard.cc:142] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX AVX2
2020-08-13 16:31:54.693250: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library nvcuda.dll
2020-08-13 16:31:54.730848: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1618] Found device 0 with properties:
name: GeForce RTX 2060 major: 7 minor: 5 memoryClockRate(GHz): 1.2
pciBusID: 0000:01:00.0
2020-08-13 16:31:54.736948: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cudart64_100.dll
2020-08-13 16:31:54.746169: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cublas64_100.dll
2020-08-13 16:31:54.753784: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cufft64_100.dll
2020-08-13 16:31:54.760818: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library curand64_100.dll
2020-08-13 16:31:54.767979: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cusolver64_100.dll
2020-08-13 16:31:54.774382: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cusparse64_100.dll
2020-08-13 16:31:54.787883: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cudnn64_7.dll
2020-08-13 16:31:54.792935: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1746] Adding visible gpu devices: 0
2020-08-13 16:31:55.371745: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1159] Device interconnect StreamExecutor with strength 1 edge matrix:
2020-08-13 16:31:55.376201: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1165] 0
2020-08-13 16:31:55.379023: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1178] 0: N
2020-08-13 16:31:55.382793: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1304] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 4608 MB memory) -> physical GPU (device: 0, name: GeForce RTX 2060, pci bus id: 0000:01:00.0, compute capability: 7.5)
[ INFO ] data augment is False
WARNING:tensorflow:From F:\software\Anaconda3\envs\tf115\lib\site-packages\tensorflow_core\python\data\util\random_seed.py:58: where (from tensorflow.python.ops.array_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use tf.where in 2.0, which has the same broadcast rule as np.where
2020-08-13 16:31:55.534760: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1618] Found device 0 with properties:
name: GeForce RTX 2060 major: 7 minor: 5 memoryClockRate(GHz): 1.2
pciBusID: 0000:01:00.0
2020-08-13 16:31:55.541031: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cudart64_100.dll
2020-08-13 16:31:55.544708: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cublas64_100.dll
2020-08-13 16:31:55.547742: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cufft64_100.dll
2020-08-13 16:31:55.551895: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library curand64_100.dll
2020-08-13 16:31:55.555008: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cusolver64_100.dll
2020-08-13 16:31:55.559177: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cusparse64_100.dll
2020-08-13 16:31:55.562276: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cudnn64_7.dll
2020-08-13 16:31:55.565958: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1746] Adding visible gpu devices: 0
2020-08-13 16:31:55.569220: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1618] Found device 0 with properties:
name: GeForce RTX 2060 major: 7 minor: 5 memoryClockRate(GHz): 1.2
pciBusID: 0000:01:00.0
2020-08-13 16:31:55.574059: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cudart64_100.dll
2020-08-13 16:31:55.578232: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cublas64_100.dll
2020-08-13 16:31:55.581370: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cufft64_100.dll
2020-08-13 16:31:55.585411: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library curand64_100.dll
2020-08-13 16:31:55.588457: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cusolver64_100.dll
2020-08-13 16:31:55.592139: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cusparse64_100.dll
2020-08-13 16:31:55.595837: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cudnn64_7.dll
2020-08-13 16:31:55.599397: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1746] Adding visible gpu devices: 0
2020-08-13 16:31:55.602502: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1159] Device interconnect StreamExecutor with strength 1 edge matrix:
2020-08-13 16:31:55.606356: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1165] 0
2020-08-13 16:31:55.608364: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1178] 0: N
2020-08-13 16:31:55.610806: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1304] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 4608 MB memory) -> physical GPU (device: 0, name: GeForce RTX 2060, pci bus id: 0000:01:00.0, compute capability: 7.5)
[ INFO ] data augment is False
WARNING:tensorflow:From F:\software\Anaconda3\envs\tf115\lib\site-packages\tensorflow_core\python\ops\resource_variable_ops.py:1630: calling BaseResourceVariable.init (from tensorflow.python.ops.resource_variable_ops) with constraint is deprecated and will be removed in a future version.
Instructions for updating:
If using Keras pass *_constraint arguments to layers.
Train on 21 steps
Epoch 1/10
2020-08-13 16:32:22.746351: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cudnn64_7.dll
2020-08-13 16:32:24.054899: W tensorflow/stream_executor/cuda/redzone_allocator.cc:312] Internal: Invoking ptxas not supported on Windows
Relying on driver to perform ptx compilation. This message will be only logged once.
2020-08-13 16:32:24.128920: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cublas64_100.dll
1/21 [>.............................] - ETA: 2:31 - loss: 792.4858 - l1_loss: 131.6574 - l2_loss: 660.4179 - l1_p: 0.0000e+00 - l1_r: 0.0000e+00 - l2_p: 5.8514e-04 - l2_r: 0.2000
2020-08-13 16:32:26.777512: I tensorflow/core/profiler/lib/profiler_session.cc:205] Profiler session started.
2020-08-13 16:32:26.787342: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cupti64_100.dll
WARNING:tensorflow:Method (on_train_batch_end) is slow compared to the batch update (0.280863). Check your callbacks.
2/21 [=>............................] - ETA: 1:17 - loss: 817.4302 - l1_loss: 172.4517 - l2_loss: 644.5675 - l1_p: 0.0012 - l1_r: 0.2500 - l2_p: 0.0025 - l2_r: 0.5000
2020-08-13 16:32:27.322285: I tensorflow/core/platform/default/device_tracer.cc:588] Collecting 3504 kernel records, 316 memcpy records.
WARNING:tensorflow:Method (on_train_batch_end) is slow compared to the batch update (0.444144). Check your callbacks.
3/21 [===>..........................] - ETA: 52s - loss: 745.5250 - l1_loss: 159.0245 - l2_loss: 586.0892 - l1_p: 8.5985e-04 - l1_r: 0.1429 - l2_p: 0.0023 - l2_r: 0.3636
WARNING:tensorflow:Method (on_train_batch_end) is slow compared to the batch update (0.280863). Check your callbacks.
4/21 [====>.........................] - ETA: 37s - loss: 695.2993 - l1_loss: 148.6290 - l2_loss: 546.2584 - l1_p: 0.0015 - l1_r: 0.1538 - l2_p: 0.0022 - l2_r: 0.3200
WARNING:tensorflow:Method (on_train_batch_end) is slow compared to the batch update (0.117583). Check your callbacks.
20/21 [===========================>..] - ETA: 0s - loss: 364.4713 - l1_loss: 68.8265 - l2_loss: 295.2221 - l1_p: 0.0012 - l1_r: 0.0385 - l2_p: 0.0020 - l2_r: 0.0563
Traceback (most recent call last):
File "./keras_train.py", line 155, in <module>
args.prune_frequency)
File "./keras_train.py", line 99, in main
validation_data=h.test_dataset, validation_steps=int(h.test_epoch_step * h.validation_split))
File "F:\software\Anaconda3\envs\tf115\lib\site-packages\tensorflow_core\python\keras\engine\training.py", line 727, in fit
use_multiprocessing=use_multiprocessing)
File "F:\software\Anaconda3\envs\tf115\lib\site-packages\tensorflow_core\python\keras\engine\training_arrays.py", line 675, in fit
steps_name='steps_per_epoch')
File "F:\software\Anaconda3\envs\tf115\lib\site-packages\tensorflow_core\python\keras\engine\training_arrays.py", line 440, in model_iteration
steps_name='validation_steps')
File "F:\software\Anaconda3\envs\tf115\lib\site-packages\tensorflow_core\python\keras\engine\training_arrays.py", line 411, in model_iteration
aggregator.finalize()
File "F:\software\Anaconda3\envs\tf115\lib\site-packages\tensorflow_core\python\keras\engine\training_utils.py", line 138, in finalize
raise ValueError('Empty training data.')
ValueError: Empty training data.
Makefile:35: recipe for target 'train' failed
make: *** [train] Error 1
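
One possible reading of this traceback, offered as a sketch rather than a confirmed diagnosis: keras_train.py passes `validation_steps=int(h.test_epoch_step * h.validation_split)`, and with a small dataset and the default split of 0.05 that product can truncate to zero, so the validation pass sees no batches. The function names below are hypothetical, and the `max(1, ...)` guard is only an illustrative workaround, not the project's code.

```python
def validation_steps(test_epoch_step, validation_split):
    """Mirror of the expression in the traceback: the product is truncated."""
    return int(test_epoch_step * validation_split)


def guarded_validation_steps(test_epoch_step, validation_split):
    """Hypothetical workaround: always run at least one validation step."""
    return max(1, validation_steps(test_epoch_step, validation_split))


print(validation_steps(10, 0.05))          # 0.5 truncates to 0 steps
print(guarded_validation_steps(10, 0.05))  # at least 1 step
```

If this is the cause, a larger dataset, a smaller BATCH, or a larger validation split would each keep the step count above zero.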

val_loss doesn't drop

Hello!
During training, loss slowly decreases with each epoch, but val_loss does not decrease; it fluctuates around some value and can even increase slightly.
What is the problem? Is training actually improving the model, as the loss value suggests, or is it not improving, as val_loss suggests?

pretrained backbone

Hi, can I use a pretrained MobileNet or Darknet backbone as the feature extractor, freeze those layers, and train only the detector layers (i.e. transfer learning)?

Sigmoid activation

In this line of code:

activate_array(rl, index, 2 * rl->layer_width * rl->layer_height);

Why is 2 used? Shouldn't it be 4, since there are 4 coordinate items?

Right now the sigmoid is applied only to 7x10x2, i.e. the first 140 elements; then the index shifts to 280 and the sigmoid is applied to 70 elements.

Shouldn't it be 7x10x4 to cover all the coordinates? What is the logic behind this?
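
For reference, the index arithmetic described above can be sketched as follows, assuming the usual YOLO v2 region-layer layout in which each anchor stores its values as consecutive planes of `width * height` elements in the order x, y, w, h, objectness. In that layout only x and y pass through the sigmoid, while w and h are decoded with an exponential later, which would explain the factor 2. The function name is hypothetical.

```python
def sigmoid_ranges(layer_height, layer_width, coords=4):
    """Return the [start, end) element ranges that receive a sigmoid
    for one anchor: the x/y planes and the objectness plane."""
    cells = layer_height * layer_width
    xy_range = (0, 2 * cells)            # x plane followed by y plane
    obj_start = coords * cells           # skip the x, y, w, h planes
    obj_range = (obj_start, obj_start + cells)
    return xy_range, obj_range


print(sigmoid_ranges(7, 10))
```

For a 7x10 layer this gives elements 0 to 139 for x and y and elements 280 to 349 for objectness, matching the 140, 280, and 70 figures in the question.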

Custom dataset: anchor.npy and img_ann.npy are exported successfully, but the following error occurs during training

IndexError: index 10 is out of bounds for axis 1 with size 10

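Hello,
I have already trained a 20-class model on the VOC dataset and got it running on the K210. Now I want to train a model on the WIDER Face dataset (already converted to the VOC format).
I made the following changes in voc_label.py: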

# file list - train.txt, test.txt, val.txt
sets = [('2007', 'train'), ('2007', 'val')]

# class name
classes = ["face"]

After running voc_label.py, the label files were generated successfully (this is one of the .txt files in the labels directory):

0 0.498046875 0.292057761732852 0.119140625 0.1075812274368231

This is the .xml file corresponding to that .txt file:

<annotation>
    <folder>VOC2007</folder>
    <filename>000001.jpg</filename>
    <source>
        <database>My Database</database>
        <annotation>PASCAL VOC2007</annotation>
        <image>flickr</image>
        <flickrid>NULL</flickrid>
    </source>
    <owner>
        <flickrid>NULL</flickrid>
        <name>company</name>
    </owner>
    <size>
        <width>1024</width>
        <height>1385</height>
        <depth>3</depth>
    </size>
    <segmented>0</segmented>
    <object>
        <name>face</name>
        <pose>Unspecified</pose>
        <truncated>0</truncated>
        <difficult>0</difficult>
        <bndbox>
            <xmin>449</xmin>
            <ymin>330</ymin>
            <xmax>571</xmax>
            <ymax>479</ymax>
        </bndbox>
    </object>
</annotation>
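
The label line above can be reproduced from the XML's size and bounding box with the usual VOC-to-YOLO conversion: box centre and box size, each divided by the image dimensions. The sketch below is a minimal version of that arithmetic; the function name is hypothetical and it is not copied from voc_label.py.

```python
def voc_box_to_yolo(size, box):
    """Convert (xmin, ymin, xmax, ymax) in pixels to normalised (x, y, w, h)."""
    img_w, img_h = size
    xmin, ymin, xmax, ymax = box
    x = (xmin + xmax) / 2.0 / img_w   # box centre, as a fraction of image width
    y = (ymin + ymax) / 2.0 / img_h   # box centre, as a fraction of image height
    w = (xmax - xmin) / img_w
    h = (ymax - ymin) / img_h
    return x, y, w, h


# Values taken from the XML annotation above.
print(voc_box_to_yolo((1024, 1385), (449, 330, 571, 479)))
```

The result agrees with the label line to floating-point precision, so the annotation itself converts to values inside the 0 to 1 range for this image.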

The following steps were done according to README.md, and face_img_ann.npy and face_anchor.npy were generated successfully.

Finally, when starting training:

make train MODEL=yolo_mobilev1 DEPTHMUL=0.75 MAXEP=10 ILR=0.001 DATASET=face CLSNUM=1 IAA=False BATCH=16

Main error message:

  File "E:\Anaconda3\envs\yolo\lib\site-packages\tensorflow_core\python\ops\script_ops.py", line 233, in __call__
    return func(device, token, args)

  File "E:\Anaconda3\envs\yolo\lib\site-packages\tensorflow_core\python\ops\script_ops.py", line 122, in __call__
    ret = self._func(*args)

  File "E:\yolo\Yolo-for-k210\tools\utils.py", line 423, in _dataset_parser
    labels = self.box_to_label(true_box)

  File "E:\yolo\Yolo-for-k210\tools\utils.py", line 226, in box_to_label
    labels[l][idy, idx, n, 0:4] = np.clip(box[1:5], 1e-8, 1.)

IndexError: index 10 is out of bounds for axis 1 with size 10
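
A plausible cause, stated as an assumption because the index computation in tools/utils.py is not shown: if the grid cell is derived as `floor(center * grid_size)`, a box whose normalised centre is exactly 1.0 (or larger) maps to index 10 on a grid of width 10, one past the last valid cell. The helper names below are hypothetical.

```python
import math


def grid_index(center, grid_size):
    """Cell index for a normalised centre coordinate (assumed formula)."""
    return int(math.floor(center * grid_size))


def safe_grid_index(center, grid_size):
    """Same as grid_index, but clamped into the valid range [0, grid_size - 1]."""
    return min(max(grid_index(center, grid_size), 0), grid_size - 1)


print(grid_index(0.498046875, 10))  # an ordinary box
print(grid_index(1.0, 10))          # a centre on the image edge overflows
print(safe_grid_index(1.0, 10))     # clamped to the last cell
```

If this is what happens, checking the dataset for boxes that touch or exceed the image border, or clamping the index as above, would avoid the error.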


Init node conv1/kernel/Assign doesn't exist in graph when freezing a model

Hey,
I'm getting this error:
Init node conv1/kernel/Assign doesn't exist in graph

When trying to freeze a model using your library and guide.
This happens even on the asset/yolo-model...

Any idea what might be the issue?
I'm using tensorflow with GPU..

What is the precision on the VOC and COCO dataset?

Hi zhen8838!
Thanks for your great work. I see that you combine MobileNet with YOLO. I would like to know the mAP on the VOC or COCO dataset and the inference time, or how the mAP and inference time compare with yolov3_tiny.

Can I run K210_Yolo_framework on colab to get kmodel for my k210

Can you guide me how to run this instruction on colab.

make: *** [Makefile:79: anchors] Error 2

usage: make_anchor_list.py [-h] [--max_iters MAX_ITERS]
[--is_random {True,False}] [--is_plot {True,False}]
[--in_hw IN_HW [IN_HW ...]]
[--out_hw OUT_HW [OUT_HW ...]]
[--low LOW [LOW ...]] [--high HIGH [HIGH ...]]
[--anchor_num ANCHOR_NUM]
train_set
make_anchor_list.py: error: argument --low: invalid float value: '.0 \t\t--high 1.'
make: *** [Makefile:79: anchors] Error 2
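
The error message suggests that `--low` received a single fused token, `'.0 \t\t--high 1.'`, instead of separate numbers. The sketch below reproduces that behaviour with a cut-down parser whose `--low` and `--high` options take one or more floats, as in the usage text above. The parser here is an illustration, not the real make_anchor_list.py, and the default values are assumptions.

```python
import argparse


def build_parser():
    """A reduced stand-in for the argument parser shown in the usage text."""
    parser = argparse.ArgumentParser(prog="make_anchor_list.py")
    parser.add_argument("--low", type=float, nargs="+", default=[0.0, 0.0])
    parser.add_argument("--high", type=float, nargs="+", default=[1.0, 1.0])
    return parser


parser = build_parser()

# Each number arrives as its own token: parsing succeeds.
ok = parser.parse_args(["--low", ".0", ".0", "--high", "1.", "1."])
print(ok.low, ok.high)

# Everything after --low arrives as one token: float() cannot parse it.
try:
    parser.parse_args(["--low", ".0 \t\t--high 1."])
except SystemExit:
    print("fused token rejected")
```

This points at how the shell or make passed the quoted LOW and HIGH values rather than at the script itself, so checking the quoting of the make command line is a reasonable first step.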

Train Error

I succeeded with the "Training on Voc" steps and got the img.

Then I downloaded the yolo_mobilev1 0.75 model and put it in /data.

When I try make train MODEL=yolo_mobilev1 DEPTHMUL=0.75 MAXEP=10 ILR=0.0001 DATASET=voc CLSNUM=20 IAA=True BATCH=16 CKPT=log/voc/yolo_model.h5

it returns

(k210_yolo_tf1) [chaoers@fanfan-Arch K210_Yolo_framework]$ make train MODEL=yolo_mobilev1      DEPTHMUL=0.75 MAXEP=10 ILR=0.0001 DATASET=voc CLSNUM=20 IAA=True BATCH=16 CKPT=log/voc/yolo_model.h5
python3 ./keras_train.py \
                --train_set voc \
                --class_num 20 \
                --pre_ckpt log/voc/yolo_model.h5 \
                --model_def yolo_mobilev1 \
                --depth_multiplier 0.75 \
                --augmenter True \
                --image_size 224 320 \
                --output_size 7 10 14 20 \
                --batch_size 16 \
                --rand_seed 3 \
                --max_nrof_epochs 10 \
                --init_learning_rate 0.0001 \
                --learning_rate_decay_factor 0 \
                --obj_weight 1 \
                --noobj_weight 1 \
                --wh_weight 1 \
                --obj_thresh 0.7 \
                --iou_thresh 0.5 \
                --vaildation_split 0.05 \
                --log_dir log \
                --is_prune False \
                --prune_initial_sparsity 0.5 \
                --prune_final_sparsity 0.9 \
                --prune_end_epoch 5 \
                --prune_frequency 100 
2021-03-05 08:53:25.698336: I tensorflow/core/platform/cpu_feature_guard.cc:142] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
2021-03-05 08:53:25.706801: I tensorflow/core/platform/profile_utils/cpu_utils.cc:94] CPU Frequency: 3400410000 Hz
2021-03-05 08:53:25.707261: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x55740eac0640 executing computations on platform Host. Devices:
2021-03-05 08:53:25.707304: I tensorflow/compiler/xla/service/service.cc:175]   StreamExecutor device (0): <undefined>, <undefined>
2021-03-05 08:53:25.708469: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcuda.so.1
2021-03-05 08:53:25.929544: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1005] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-03-05 08:53:25.929817: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1640] Found device 0 with properties: 
name: GeForce GTX 1060 3GB major: 6 minor: 1 memoryClockRate(GHz): 1.7085
pciBusID: 0000:05:00.0
2021-03-05 08:53:25.929953: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Could not dlopen library 'libcudart.so.10.0'; dlerror: libcudart.so.10.0: cannot open shared object file: No such file or directory
2021-03-05 08:53:25.930057: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Could not dlopen library 'libcublas.so.10.0'; dlerror: libcublas.so.10.0: cannot open shared object file: No such file or directory
2021-03-05 08:53:25.930160: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Could not dlopen library 'libcufft.so.10.0'; dlerror: libcufft.so.10.0: cannot open shared object file: No such file or directory
2021-03-05 08:53:25.930253: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Could not dlopen library 'libcurand.so.10.0'; dlerror: libcurand.so.10.0: cannot open shared object file: No such file or directory
2021-03-05 08:53:25.930344: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Could not dlopen library 'libcusolver.so.10.0'; dlerror: libcusolver.so.10.0: cannot open shared object file: No such file or directory
2021-03-05 08:53:25.930433: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Could not dlopen library 'libcusparse.so.10.0'; dlerror: libcusparse.so.10.0: cannot open shared object file: No such file or directory
2021-03-05 08:53:25.930533: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Could not dlopen library 'libcudnn.so.7'; dlerror: libcudnn.so.7: cannot open shared object file: No such file or directory
2021-03-05 08:53:25.930558: W tensorflow/core/common_runtime/gpu/gpu_device.cc:1663] Cannot dlopen some GPU libraries. Skipping registering GPU devices...
2021-03-05 08:53:26.001765: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1181] Device interconnect StreamExecutor with strength 1 edge matrix:
2021-03-05 08:53:26.001828: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1187]      0 
2021-03-05 08:53:26.001837: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1200] 0:   N 
2021-03-05 08:53:26.003390: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1005] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-03-05 08:53:26.003801: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x55740f50c890 executing computations on platform CUDA. Devices:
2021-03-05 08:53:26.003840: I tensorflow/compiler/xla/service/service.cc:175]   StreamExecutor device (0): GeForce GTX 1060 3GB, Compute Capability 6.1
[ INFO  ] data augment is  True
WARNING:tensorflow:From /home/chaoers/anaconda3/envs/k210_yolo_tf1/lib/python3.7/site-packages/tensorflow/python/data/ops/dataset_ops.py:494: py_func (from tensorflow.python.ops.script_ops) is deprecated and will be removed in a future version.
Instructions for updating:
tf.py_func is deprecated in TF V2. Instead, there are two
    options available in V2.
    - tf.py_function takes a python function which manipulates tf eager
    tensors instead of numpy arrays. It's easy to convert a tf eager tensor to
    an ndarray (just call tensor.numpy()) but having access to eager tensors
    means `tf.py_function`s can use accelerators such as GPUs as well as
    being differentiable using a gradient tape.
    - tf.numpy_function maintains the semantics of the deprecated tf.py_func
    (it is not differentiable, and manipulates numpy arrays). It drops the
    stateful argument making all functions stateful.
    
WARNING:tensorflow:From /home/chaoers/anaconda3/envs/k210_yolo_tf1/lib/python3.7/site-packages/tensorflow/python/data/util/random_seed.py:58: add_dispatch_support.<locals>.wrapper (from tensorflow.python.ops.array_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use tf.where in 2.0, which has the same broadcast rule as np.where
2021-03-05 08:53:26.197747: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1005] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-03-05 08:53:26.198099: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1640] Found device 0 with properties: 
name: GeForce GTX 1060 3GB major: 6 minor: 1 memoryClockRate(GHz): 1.7085
pciBusID: 0000:05:00.0
2021-03-05 08:53:26.198326: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Could not dlopen library 'libcudart.so.10.0'; dlerror: libcudart.so.10.0: cannot open shared object file: No such file or directory
2021-03-05 08:53:26.198567: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Could not dlopen library 'libcublas.so.10.0'; dlerror: libcublas.so.10.0: cannot open shared object file: No such file or directory
2021-03-05 08:53:26.198740: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Could not dlopen library 'libcufft.so.10.0'; dlerror: libcufft.so.10.0: cannot open shared object file: No such file or directory
2021-03-05 08:53:26.198915: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Could not dlopen library 'libcurand.so.10.0'; dlerror: libcurand.so.10.0: cannot open shared object file: No such file or directory
2021-03-05 08:53:26.199223: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Could not dlopen library 'libcusolver.so.10.0'; dlerror: libcusolver.so.10.0: cannot open shared object file: No such file or directory
2021-03-05 08:53:26.199594: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Could not dlopen library 'libcusparse.so.10.0'; dlerror: libcusparse.so.10.0: cannot open shared object file: No such file or directory
2021-03-05 08:53:26.199729: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Could not dlopen library 'libcudnn.so.7'; dlerror: libcudnn.so.7: cannot open shared object file: No such file or directory
2021-03-05 08:53:26.199764: W tensorflow/core/common_runtime/gpu/gpu_device.cc:1663] Cannot dlopen some GPU libraries. Skipping registering GPU devices...
2021-03-05 08:53:26.200140: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1181] Device interconnect StreamExecutor with strength 1 edge matrix:
2021-03-05 08:53:26.200185: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1187]      
WARNING:tensorflow:Entity <function Helper._create_dataset.<locals>._parser_wrapper at 0x7f1def3c0c80> could not be transformed and will be executed as-is. Please report this to the AutgoGraph team. When filing the bug, set the verbosity to 10 (on Linux, `export AUTOGRAPH_VERBOSITY=10`) and attach the full output. Cause: converting <function Helper._create_dataset.<locals>._parser_wrapper at 0x7f1def3c0c80>: AssertionError: Bad argument number for Name: 3, expecting 4
[ INFO  ] data augment is  False
WARNING:tensorflow:Entity <function Helper._create_dataset.<locals>._parser_wrapper at 0x7f1def3c0d90> could not be transformed and will be executed as-is. Please report this to the AutgoGraph team. When filing the bug, set the verbosity to 10 (on Linux, `export AUTOGRAPH_VERBOSITY=10`) and attach the full output. Cause: converting <function Helper._create_dataset.<locals>._parser_wrapper at 0x7f1def3c0d90>: AssertionError: Bad argument number for Name: 3, expecting 4
WARNING:tensorflow:From /home/chaoers/anaconda3/envs/k210_yolo_tf1/lib/python3.7/site-packages/tensorflow/python/ops/init_ops.py:1251: calling VarianceScaling.__init__ (from tensorflow.python.ops.init_ops) with dtype is deprecated and will be removed in a future version.
Instructions for updating:
Call initializer instance with the dtype argument instead of passing it to the constructor
Traceback (most recent call last):
  File "./keras_train.py", line 154, in <module>
    args.prune_frequency)
  File "./keras_train.py", line 50, in main
    yolo_model, yolo_model_warpper = network([image_size[0], image_size[1], 3], len(h.anchors[0]), class_num, alpha=depth_multiplier)
  File "/home/chaoers/workspace/K210_Yolo_framework/models/yolonet.py", line 19, in yolo_mobilev1
    base_model.load_weights('data/mobilenet_v1_base_7.h5')
  File "/home/chaoers/anaconda3/envs/k210_yolo_tf1/lib/python3.7/site-packages/tensorflow/python/keras/engine/training.py", line 162, in load_weights
    return super(Model, self).load_weights(filepath, by_name)
  File "/home/chaoers/anaconda3/envs/k210_yolo_tf1/lib/python3.7/site-packages/tensorflow/python/keras/engine/network.py", line 1424, in load_weights
    saving.load_weights_from_hdf5_group(f, self.layers)
  File "/home/chaoers/anaconda3/envs/k210_yolo_tf1/lib/python3.7/site-packages/tensorflow/python/keras/saving/hdf5_format.py", line 711, in load_weights_from_hdf5_group
    original_keras_version = f.attrs['keras_version'].decode('utf8')
AttributeError: 'str' object has no attribute 'decode'
make: *** [Makefile:35: train] Error 1
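
Note on the traceback above: this `AttributeError` is commonly reported when h5py 3.x is installed next to a TensorFlow 1.x release. h5py 3.0 started returning string attributes as `str` instead of `bytes`, so the `.decode('utf8')` call in `hdf5_format.py` fails. Pinning h5py below 3.0 in the environment is the usual workaround. The sketch below only illustrates the mismatch; `as_text` is a hypothetical helper and not part of TensorFlow or this repository.

```python
def as_text(value):
    """Return an HDF5 string attribute as str, whether it arrives as bytes or str."""
    # Older h5py releases typically yield bytes here; h5py >= 3.0 yields str.
    if isinstance(value, bytes):
        return value.decode("utf8")
    return value


# Both forms normalise to the same text.
print(as_text(b"2.2.4-tf"))
print(as_text("2.2.4-tf"))
```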

How to improve detection accuracy

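Hello, and thank you for the open-source code. I recently used it to train my own model, a detector with 5 classes that runs real-time detection on the camera of a Maix Go board. However, during training the loss stops decreasing after a certain point. The final accuracy is poor; the main problem is that detected objects are frequently assigned the wrong class.
How can I improve the model or the dataset to raise the accuracy?
The documentation contains the following hint:
OBJWEIGHT,NOOBJWEIGHT,WHWEIGHT used to balance precision and recall
My understanding is that these parameters may affect accuracy, but I am not sure how to adjust them. Which other settings affect accuracy?
Looking forward to your reply.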

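Which operating system does this run on?

I would really like to know whether this runs on Windows 10 or Linux, and what I need to do to train on my own dataset. I am a beginner and would appreciate any help.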

Pre-trained models do not work. Need a simple explanation of how the code works and simple, straightforward code for transfer learning

Hello
The pre-trained models don't work. I need a simple explanation of how the code works and simple, straightforward code for transfer learning.
I converted one of the pre-trained models, loaded it onto a Maix board, and ran it, but it does not detect anything.
It runs on the Maix board without errors, but x and y are always 0 and it reports only one detected class all the time, no matter what the camera captures.
What is pre-trained in these models if they do not detect anything?
I haven't tried loading weights yet, because I expected a pre-trained model to already contain trained weights.

I also can't tell whether these pre-trained models include the top layer or not, and what to do about it.
I want to use transfer learning to train a model on my small and simple image dataset, so I need a simple explanation of how to write code that compiles and fits a model for transfer learning.
I tried to trace what the code is doing, but it is too complex, and there are no comments explaining what it does and why.

I want a yoloV2 version of this code because maixpy doesn't support yolov3!

I want a yoloV2 version of this code because maixpy doesn't support yolov3, and I can't even test the effectiveness of this model.

ulffd_landmark.tflite

Hello, I found this file at https://github.com/kendryte/nncase/blob/master/examples/facedetect_landmark/model/ulffd_landmark.tflite, but I don't know how it was produced. Could you tell me where the source code for this model is? Thank you. 🤝

Can other models be deployed to the K210 with this program?

My model trained with yolo-faster has outputs [1,45,7,10] and [1,45,14,20]. Can I use your program to deploy it to the K210?
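
A note on reading those output shapes: in a YOLO v3 style head the channel count is usually anchors × (5 + classes), where 5 covers x, y, w, h and objectness. The sketch below lists which (anchors, classes) pairs are consistent with 45 channels. It assumes that convention applies to the model in question, which the post does not confirm, and `head_layouts` is a hypothetical helper.

```python
def head_layouts(channels, max_anchors=9):
    """List (anchors, classes) pairs with channels == anchors * (5 + classes)."""
    layouts = []
    for anchors in range(1, max_anchors + 1):
        if channels % anchors == 0:
            classes = channels // anchors - 5
            if classes >= 1:
                layouts.append((anchors, classes))
    return layouts


print(head_layouts(45))
```

With 3 anchors per layer this gives 10 classes. Note also that [1,45,7,10] is a channels-first layout, while the prediction tensors in the error logs in this list (for example [?,13,13,3,2]) are channels-last, so a layout conversion would probably be needed as well.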

How to infer a folder of images?

As the title
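
One possible approach, sketched under assumptions: loop over the folder and call `keras_inference.py` once per image, reusing the arguments that `make inference` prints in the "Out of bound problem" post below. The checkpoint path, the model options, and the helper names `build_commands` and `run_folder` are illustrative and need adjusting to your setup.

```python
import subprocess
import sys
from pathlib import Path

IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png", ".bmp"}


def build_commands(image_dir, ckpt="asset/yolo_model.h5"):
    """Build one keras_inference.py command per image found in image_dir."""
    commands = []
    for path in sorted(Path(image_dir).iterdir()):
        if path.suffix.lower() not in IMAGE_SUFFIXES:
            continue  # skip non-image files
        commands.append([
            sys.executable, "./keras_inference.py", ckpt, str(path),
            "--train_set", "voc",
            "--class_num", "20",
            "--model_def", "yolo_mobilev1",
            "--depth_multiplier", "0.75",
            "--obj_thresh", "0.7",
            "--iou_thresh", "0.5",
            "--image_size", "224", "320",
            "--output_size", "7", "10", "14", "20",
        ])
    return commands


def run_folder(image_dir):
    """Run inference image by image; assumes the working directory is the repo root."""
    for cmd in build_commands(image_dir):
        subprocess.run(cmd, check=True)
```

This reloads the model for every image, so it is slow; loading the model once and looping inside a modified copy of the script would be faster, at the cost of editing the repository code.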

Explain Mobilenet modification

Could you explain how you modified the Mobilenet v1 to fit K210?

[Solved] Is the model frozen in the README not the one produced by Prune Model?

# Prune Model

make train MODEL=xxxx MAXEP=1 ILR=0.0003 DATASET=voc CLSNUM=20 BATCH=16 PRUNE=True CKPT=log/xxxxxx/yolo_model.h5 END_EPOCH=1

# When training finish, will save model as log/xxxxxx/yolo_prune_model.h5.
# Freeze
toco --output_file mobile_yolo.tflite --keras_model_file log/xxxxxx/yolo_model.h5

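After pruning, the result is log/xxxxxx/yolo_prune_model.h5. Shouldn't the pruned model be the one that gets frozen? Why does the command still use the model from before pruning (log/xxxxxx/yolo_model.h5)?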

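pre-train model & prune model: model compression and conversion

hello zheng:
After training on the VOC training set provided by the link, the generated .h5 model is 15.8 MB by default. I then pruned it with the prune command: make train MODEL=xxxx MAXEP=1 ILR=0.0003 DATASET=voc CLSNUM=20 BATCH=16 PRUNE=True CKPT=log/xxxxxx/yolo_model.h5 END_EPOCH=1. The pruned model is the same size as before, and converting it with nncase fails (the model exceeds the K210 memory). My goal is to run the model on the K210. How can I achieve that? Do I need to modify the pre-train model, or can pruning compress the model?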
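
A possible explanation for the unchanged file size, assuming the pruning here is weight sparsification (the training command exposes `--prune_initial_sparsity` and `--prune_final_sparsity`): pruned weights are set to zero but are still stored as dense arrays, so the .h5 file stays the same size until it is compressed. The sketch below illustrates this with NumPy and zlib on random data; it does not use the repository's code.

```python
import zlib

import numpy as np

rng = np.random.default_rng(0)
weights = rng.standard_normal(100_000).astype(np.float32)

# Magnitude pruning to roughly 90% sparsity: zero the smallest-magnitude weights.
threshold = np.quantile(np.abs(weights), 0.9)
pruned = np.where(np.abs(weights) < threshold, 0.0, weights).astype(np.float32)

dense_bytes = weights.nbytes
pruned_bytes = pruned.nbytes  # unchanged: zeros are still stored
dense_zip = len(zlib.compress(weights.tobytes()))
pruned_zip = len(zlib.compress(pruned.tobytes()))  # smaller once compressed

print(dense_bytes, pruned_bytes, dense_zip, pruned_zip)
```

If the converter also stores weights densely, sparsity alone would not reduce on-chip memory; a smaller architecture (for example a lower depth multiplier or a smaller input size) is then the more likely lever.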

Project needs simple and natural optimizations

This project needs the following simple and basic optimizations:

There is no sense in everyone building train.txt and creating voc_img_ann.npy again. Also, there is no directory xxxx as specified in the example command: python3 make_voc_list.py xxxx/train.txt data/voc_img_ann.npy
There is no sense in everyone training a model with over 3.2 million parameters, so real pre-trained models with optimal accuracy must be supplied.
An option for transfer learning must be added in order to train custom classes.
The code must be updated to TensorFlow 2 and the latest versions of all other required components.

About copyright

I used your code in my project https://github.com/SEASKY-Master/Yolo-for-k210

I added an acknowledgement in the README.

running from DVP camera

Hello,

We tried to get the C code running on K210 maix bit, using the latest version of the standalone SDK.
I see your main.c contains code for DVP, but it's commented out, so we tried to bring it to life, so far without success.

I have some questions:

We get 0.2 FPS on mobile V1 with dm=0.75. Is this realistic? The animated GIF seemed much faster. We had to hack the latest standalone SDK to get it working (removing an assert in dvp.c: !is_memory_cache).
We don't get meaningful results (always the same rectangle). This might be a bug in our trials.

i guess you used this repo mainly for creating a model rather than using it as a base for inference on K210. what would be the best practises for doing inference based on the models here ?

/cc @GillesKap

Greetings,
Frank

Can't produce model with only 1 output layer because code raises error and exits

K210\K210_Yolo_framework\tools\utils.py", line 545, in tf_xywh_to_all
all_pred_xy = (tf.sigmoid(grid_pred_xy[..., :]) + h.xy_offset[layer]) / h.out_hw[layer][::-1]
IndexError: index 1 is out of bounds for axis 0 with size 1

ValueError: Dimensions must be equal, ... when training is started using YOLO

Hi,

I tested all models and they work very well except for YOLO.
Below is error message when I tried to train.

tf-docker /app/K210_Yolo_framework > make train MODEL=yolo MAXEP=1 ILR=0.001 DATASET=voc CLSNUM=20 IAA=False BATCH=16
python3 ./keras_train.py \
		--train_set voc \
		--class_num 20 \
		--pre_ckpt "" \
		--model_def yolo \
		--depth_multiplier 0.75 \
		--augmenter False \
		--image_size 224 320 \
		--output_size 7 10 14 20 \
		--batch_size 16 \
		--rand_seed 3 \
		--max_nrof_epochs 1 \
		--init_learning_rate 0.001 \
		--learning_rate_decay_factor 0 \
		--obj_weight 1 \
		--noobj_weight 1 \
		--wh_weight 1 \
		--obj_thresh 0.7 \
		--iou_thresh 0.5 \
		--vaildation_split 0.05 \
		--log_dir log \
		--is_prune False \
		--prune_initial_sparsity 0.5 \
		--prune_final_sparsity 0.9 \
		--prune_end_epoch 5 \
		--prune_frequency 100 
2020-06-11 08:58:42.428160: I tensorflow/core/platform/cpu_feature_guard.cc:142] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
2020-06-11 08:58:42.432710: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcuda.so.1
2020-06-11 08:58:42.573288: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1005] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-06-11 08:58:42.573645: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x57b2a90 executing computations on platform CUDA. Devices:
2020-06-11 08:58:42.573665: I tensorflow/compiler/xla/service/service.cc:175]   StreamExecutor device (0): GeForce GTX 1650, Compute Capability 7.5
2020-06-11 08:58:42.593525: I tensorflow/core/platform/profile_utils/cpu_utils.cc:94] CPU Frequency: 2599990000 Hz
2020-06-11 08:58:42.593991: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x56b8550 executing computations on platform Host. Devices:
2020-06-11 08:58:42.594015: I tensorflow/compiler/xla/service/service.cc:175]   StreamExecutor device (0): <undefined>, <undefined>
2020-06-11 08:58:42.594243: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1005] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-06-11 08:58:42.594556: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1640] Found device 0 with properties: 
name: GeForce GTX 1650 major: 7 minor: 5 memoryClockRate(GHz): 1.56
pciBusID: 0000:01:00.0
2020-06-11 08:58:42.594836: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcudart.so.10.0
2020-06-11 08:58:42.595891: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcublas.so.10.0
2020-06-11 08:58:42.596735: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcufft.so.10.0
2020-06-11 08:58:42.596998: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcurand.so.10.0
2020-06-11 08:58:42.598197: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcusolver.so.10.0
2020-06-11 08:58:42.599063: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcusparse.so.10.0
2020-06-11 08:58:42.601488: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcudnn.so.7
2020-06-11 08:58:42.601588: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1005] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-06-11 08:58:42.601883: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1005] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-06-11 08:58:42.602095: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1763] Adding visible gpu devices: 0
2020-06-11 08:58:42.602130: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcudart.so.10.0
2020-06-11 08:58:42.602803: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1181] Device interconnect StreamExecutor with strength 1 edge matrix:
2020-06-11 08:58:42.602815: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1187]      0 
2020-06-11 08:58:42.602821: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1200] 0:   N 
2020-06-11 08:58:42.602904: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1005] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-06-11 08:58:42.603170: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1005] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-06-11 08:58:42.603412: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1326] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 2946 MB memory) -> physical GPU (device: 0, name: GeForce GTX 1650, pci bus id: 0000:01:00.0, compute capability: 7.5)
[ INFO  ] data augment is  False
WARNING: Logging before flag parsing goes to stderr.
W0611 08:58:42.764019 139703720445760 deprecation.py:323] From /usr/local/lib/python3.6/dist-packages/tensorflow/python/data/ops/dataset_ops.py:494: py_func (from tensorflow.python.ops.script_ops) is deprecated and will be removed in a future version.
Instructions for updating:
tf.py_func is deprecated in TF V2. Instead, there are two
    options available in V2.
    - tf.py_function takes a python function which manipulates tf eager
    tensors instead of numpy arrays. It's easy to convert a tf eager tensor to
    an ndarray (just call tensor.numpy()) but having access to eager tensors
    means `tf.py_function`s can use accelerators such as GPUs as well as
    being differentiable using a gradient tape.
    - tf.numpy_function maintains the semantics of the deprecated tf.py_func
    (it is not differentiable, and manipulates numpy arrays). It drops the
    stateful argument making all functions stateful.
    
W0611 08:58:42.782631 139703720445760 deprecation.py:323] From /usr/local/lib/python3.6/dist-packages/tensorflow/python/data/util/random_seed.py:58: add_dispatch_support.<locals>.wrapper (from tensorflow.python.ops.array_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use tf.where in 2.0, which has the same broadcast rule as np.where
2020-06-11 08:58:42.786745: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1005] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-06-11 08:58:42.787202: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1640] Found device 0 with properties: 
name: GeForce GTX 1650 major: 7 minor: 5 memoryClockRate(GHz): 1.56
pciBusID: 0000:01:00.0
2020-06-11 08:58:42.787283: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcudart.so.10.0
2020-06-11 08:58:42.787312: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcublas.so.10.0
2020-06-11 08:58:42.787349: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcufft.so.10.0
2020-06-11 08:58:42.787398: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcurand.so.10.0
2020-06-11 08:58:42.787451: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcusolver.so.10.0
2020-06-11 08:58:42.787504: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcusparse.so.10.0
2020-06-11 08:58:42.787537: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcudnn.so.7
2020-06-11 08:58:42.787690: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1005] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-06-11 08:58:42.788133: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1005] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-06-11 08:58:42.788488: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1763] Adding visible gpu devices: 0
2020-06-11 08:58:42.788899: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1005] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-06-11 08:58:42.789267: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1640] Found device 0 with properties: 
name: GeForce GTX 1650 major: 7 minor: 5 memoryClockRate(GHz): 1.56
pciBusID: 0000:01:00.0
2020-06-11 08:58:42.789312: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcudart.so.10.0
2020-06-11 08:58:42.789348: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcublas.so.10.0
2020-06-11 08:58:42.789362: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcufft.so.10.0
2020-06-11 08:58:42.789388: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcurand.so.10.0
2020-06-11 08:58:42.789429: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcusolver.so.10.0
2020-06-11 08:58:42.789455: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcusparse.so.10.0
2020-06-11 08:58:42.789468: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcudnn.so.7
2020-06-11 08:58:42.789565: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1005] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-06-11 08:58:42.789932: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1005] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-06-11 08:58:42.790293: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1763] Adding visible gpu devices: 0
2020-06-11 08:58:42.790313: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1181] Device interconnect StreamExecutor with strength 1 edge matrix:
2020-06-11 08:58:42.790318: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1187]      0 
2020-06-11 08:58:42.790322: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1200] 0:   N 
2020-06-11 08:58:42.790438: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1005] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-06-11 08:58:42.790889: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1005] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-06-11 08:58:42.791276: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1326] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 2946 MB memory) -> physical GPU (device: 0, name: GeForce GTX 1650, pci bus id: 0000:01:00.0, compute capability: 7.5)
[ INFO  ] data augment is  False
W0611 08:58:42.951060 139703720445760 deprecation.py:506] From /usr/local/lib/python3.6/dist-packages/tensorflow/python/ops/init_ops.py:1251: calling VarianceScaling.__init__ (from tensorflow.python.ops.init_ops) with dtype is deprecated and will be removed in a future version.
Instructions for updating:
Call initializer instance with the dtype argument instead of passing it to the constructor
W0611 08:58:46.494499 139703720445760 deprecation.py:506] From /usr/local/lib/python3.6/dist-packages/tensorflow/python/ops/init_ops.py:97: calling GlorotUniform.__init__ (from tensorflow.python.ops.init_ops) with dtype is deprecated and will be removed in a future version.
Instructions for updating:
Call initializer instance with the dtype argument instead of passing it to the constructor
W0611 08:58:46.494857 139703720445760 deprecation.py:506] From /usr/local/lib/python3.6/dist-packages/tensorflow/python/ops/init_ops.py:97: calling Zeros.__init__ (from tensorflow.python.ops.init_ops) with dtype is deprecated and will be removed in a future version.
Instructions for updating:
Call initializer instance with the dtype argument instead of passing it to the constructor
W0611 08:58:46.495954 139703720445760 deprecation.py:506] From /usr/local/lib/python3.6/dist-packages/tensorflow/python/ops/init_ops.py:97: calling Ones.__init__ (from tensorflow.python.ops.init_ops) with dtype is deprecated and will be removed in a future version.
Instructions for updating:
Call initializer instance with the dtype argument instead of passing it to the constructor
W0611 08:58:54.212770 139703720445760 hdf5_format.py:221] No training configuration found in save file: the model was *not* compiled. Compile it manually.
Traceback (most recent call last):
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/framework/ops.py", line 1864, in _create_c_op
    c_op = c_api.TF_FinishOperation(op_desc)
tensorflow.python.framework.errors_impl.InvalidArgumentError: Dimensions must be equal, but are 13 and 7 for 'loss/l1_loss/calc_mask_0/xywh_to_all_0/add' (op: 'Add') with input shapes: [?,13,13,3,2], [7,10,1,2].

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "./keras_train.py", line 163, in <module>
    args.prune_frequency)
  File "./keras_train.py", line 88, in main
    metrics=[Yolo_Precision(obj_thresh, name='p'), Yolo_Recall(obj_thresh, name='r')])
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/training/tracking/base.py", line 457, in _method_wrapper
    result = method(self, *args, **kwargs)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/keras/engine/training.py", line 337, in compile
    self._compile_weights_loss_and_weighted_metrics()
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/training/tracking/base.py", line 457, in _method_wrapper
    result = method(self, *args, **kwargs)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/keras/engine/training.py", line 1710, in _compile_weights_loss_and_weighted_metrics
    self.total_loss = self._prepare_total_loss(masks)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/keras/engine/training.py", line 1770, in _prepare_total_loss
    per_sample_losses = loss_fn.call(y_true, y_pred)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/keras/losses.py", line 215, in call
    return self.fn(y_true, y_pred, **self._fn_kwargs)
  File "/app/K210_Yolo_framework/tools/utils.py", line 760, in loss_fn
    iou_thresh, layer, h)
  File "/app/K210_Yolo_framework/tools/utils.py", line 689, in calc_ignore_mask
    pred_xy, pred_wh = tf_xywh_to_all(p_xy, p_wh, layer, helper)
  File "/app/K210_Yolo_framework/tools/utils.py", line 545, in tf_xywh_to_all
    all_pred_xy = (tf.sigmoid(grid_pred_xy[..., :]) + h.xy_offset[layer]) / h.out_hw[layer][::-1]
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/ops/math_ops.py", line 897, in binary_op_wrapper
    return func(x, y, name=name)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/ops/gen_math_ops.py", line 387, in add
    "Add", x=x, y=y, name=name)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/framework/op_def_library.py", line 788, in _apply_op_helper
    op_def=op_def)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/util/deprecation.py", line 507, in new_func
    return func(*args, **kwargs)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/framework/ops.py", line 3616, in create_op
    op_def=op_def)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/framework/ops.py", line 2027, in __init__
    control_input_ops)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/framework/ops.py", line 1867, in _create_c_op
    raise ValueError(str(e))
ValueError: Dimensions must be equal, but are 13 and 7 for 'loss/l1_loss/calc_mask_0/xywh_to_all_0/add' (op: 'Add') with input shapes: [?,13,13,3,2], [7,10,1,2].
Makefile:35: recipe for target 'train' failed
make: *** [train] Error 1

I'm using TensorFlow docker with GPU support. But I think it's not problem because training with all other models works very well in the same environment.
I downloaded pretrained weights yolo_weights.h5 from your Google Drive.

Please, help me.
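
A note on the shapes in the error: the failing Add combines [?,13,13,3,2] with [7,10,1,2], so the model's first head is a 13x13 grid while `--output_size 7 10 14 20` describes 7x10 and 14x20 grids. A 13x13 grid at stride 32 corresponds to a 416x416 input, so one plausible reading is that the `yolo` model definition is built for a different input size than `--image_size 224 320`. That is an assumption and is not confirmed in this thread; the strides 32 and 16 are inferred from 224/7 and 224/14. The hypothetical helper below shows the arithmetic.

```python
def expected_grids(image_hw, strides=(32, 16)):
    """Grid (height, width) per output layer for a given input size and strides."""
    height, width = image_hw
    return [(height // s, width // s) for s in strides]


print(expected_grids((224, 320)))  # grids implied by --image_size 224 320
print(expected_grids((416, 416)))  # grids implied by a 416x416 input
```

If this reading is right, `--image_size` and `--output_size` would both have to match whatever input size the `yolo` model definition actually uses.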

How is the yolo3_frame_test_public_maixpy project provided in this repository deployed to and called from MaixPy? Does MaixPy support C programs?

Result have NaN value please Rerun!

When I run $ make anchors DATASET=voc ANCNUM=3 LOW="0.0 0.0" HIGH="1.0 1.0",
the PC shows:
python ./make_anchor_list.py
voc
--max_iters 10
--is_random True
--in_hw 224 320
--out_hw 7 10 14 20
--anchor_num 3
--low 0.0 0.0
--high 1.0 1.0
WARNING:tensorflow:From ./make_anchor_list.py:135: The name tf.placeholder is deprecated. Please use tf.compat.v1.placeholder instead.

WARNING:tensorflow:From ./make_anchor_list.py:161: The name tf.ConfigProto is deprecated. Please use tf.compat.v1.ConfigProto instead.

WARNING:tensorflow:From ./make_anchor_list.py:163: The name tf.Session is deprecated. Please use tf.compat.v1.Session instead.

D:\Users\L\miniconda3\envs\K210\lib\site-packages\numpy\core\fromnumeric.py:3118: RuntimeWarning: Mean of empty slice.
out=out, **kwargs)
D:\Users\L\miniconda3\envs\K210\lib\site-packages\numpy\core\_methods.py:78: RuntimeWarning: invalid value encountered in true_divide
ret, rcount, out=ret, casting='unsafe', subok=False)
[ ERROR ] Result have NaN value please Rerun!
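
A note on the warnings above: "Mean of empty slice" followed by NaN is what NumPy reports when a mean is taken over zero elements, which in k-means typically happens when a cluster ends up with no assigned points. The README notes that the k-means result is random and suggests rerunning. The sketch below shows one common alternative, re-seeding empty clusters from a random data point. It is a generic illustration and not the logic in `make_anchor_list.py`; `update_centers` is a hypothetical helper.

```python
import numpy as np


def update_centers(points, labels, centers, rng):
    """Recompute k-means centers, re-seeding any cluster that received no points."""
    new_centers = centers.copy()
    for k in range(len(centers)):
        members = points[labels == k]
        if len(members) == 0:
            # An empty cluster would give a NaN mean; pick a random point instead.
            new_centers[k] = points[rng.integers(len(points))]
        else:
            new_centers[k] = members.mean(axis=0)
    return new_centers


rng = np.random.default_rng(0)
points = rng.random((50, 2))        # normalized (w, h) pairs
labels = np.zeros(50, dtype=int)    # worst case: every point lands in cluster 0
centers = update_centers(points, labels, rng.random((3, 2)), rng)
print(np.isnan(centers).any())
```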

mobilenetv2 training

A model trained with yolo_mobilev2 DEPTHMUL=0.5 fails during nncase quantization with "KPU allocator cannot allocate more memory". How can this be solved? Do I need to change the MobileNet v2 network structure?

Question

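Hello, I trained on my own dataset with this project and obtained h5, tflite, and kmodel models, and I used Netron to get a rough understanding of the h5 and tflite network architectures.
I would like to reimplement this project in PyTorch by redefining the network structure according to the h5 and tflite models, but then I cannot use the pre-trained models.

Could the author share an outline of how this project was implemented, so I can use it as a reference?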

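Custom dataset

img_ann.npy is exported normally, but I have never managed to export anchor.npy; only VOC exports correctly.

Training on my own VOC dataset fails with ValueError: operands could not be broadcast together with shapes (0,0) (2,)

Could you release a Windows version?

There are some questions I want to ask you. May I have your contact information?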

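The h5 model works, but the K210 detects a pile of garbage boxes...

Hello, zhen!
I trained on my own dataset with the model mobile_yolo_v1 (0.75). The resulting h5 model tests well, but after deploying it to the K210, the screen shows a pile of garbage boxes.
With the same code and workflow, flashing the official example, the mobile_yolo kmodel that detects 20 object classes, runs correctly on the K210 and the display is normal.
What are the possible causes of these garbage boxes?

Does this support Maix Dock?

Hello, does this code support Maix Dock? If not, how would I need to modify it, and what is the general approach?

Hello, how can I fix this error: [ ERROR ] Result have NaN value please Rerun!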

Out of bound problem

Hello, I am having 'out of bound' problem when running "make train" & "make inference".

When I execute the command "make inference MODEL=yolo_mobilev1 DEPTHMUL=0.75 CKPT=asset/yolo_model.h5 IMG=data/people.jpg" in ubuntu, the following error appears.

ubuntu@ubuntu:~/user/project/darkflow/K210_Yolo_framework$ make inference MODEL=yolo_mobilev1 DEPTHMUL=0.75 CKPT=asset/yolo_model.h5 IMG=data/people.jpg
python3 ./keras_inference.py \
                asset/yolo_model.h5 \
                data/people.jpg \
                --train_set voc \
                --class_num 20 \
                --model_def yolo_mobilev1 \
                --depth_multiplier 0.75 \
                --obj_thresh 0.7 \
                --iou_thresh 0.5 \
                --image_size 224 320 \
                --output_size 7 10 14 20
2020-07-01 16:55:37.626185: I tensorflow/core/platform/cpu_feature_guard.cc:142] Your CPU supports instructions that this TensorFlow binary was not compiled to use: SSE4.1 SSE4.2 AVX AVX2 AVX512F FMA
2020-07-01 16:55:37.633894: I tensorflow/core/platform/profile_utils/cpu_utils.cc:94] CPU Frequency: 2200000000 Hz
2020-07-01 16:55:37.636912: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x560326e935b0 executing computations on platform Host. Devices:
2020-07-01 16:55:37.636975: I tensorflow/compiler/xla/service/service.cc:175]   StreamExecutor device (0): <undefined>, <undefined>
2020-07-01 16:55:37.638829: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcuda.so.1
2020-07-01 16:55:38.277273: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1640] Found device 0 with properties:
name: GeForce RTX 2080 Ti major: 7 minor: 5 memoryClockRate(GHz): 1.545
pciBusID: 0000:3b:00.0
2020-07-01 16:55:38.278141: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1640] Found device 1 with properties:
name: GeForce RTX 2080 Ti major: 7 minor: 5 memoryClockRate(GHz): 1.545
pciBusID: 0000:5e:00.0
2020-07-01 16:55:38.278926: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1640] Found device 2 with properties:
name: GeForce RTX 2080 Ti major: 7 minor: 5 memoryClockRate(GHz): 1.545
pciBusID: 0000:b1:00.0
2020-07-01 16:55:38.279710: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1640] Found device 3 with properties:
name: GeForce RTX 2080 Ti major: 7 minor: 5 memoryClockRate(GHz): 1.545
pciBusID: 0000:d9:00.0
2020-07-01 16:55:38.280005: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcudart.so.10.1
2020-07-01 16:55:38.281750: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcublas.so.10
2020-07-01 16:55:38.283322: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcufft.so.10
2020-07-01 16:55:38.283650: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcurand.so.10
2020-07-01 16:55:38.285382: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcusolver.so.10
2020-07-01 16:55:38.286139: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcusparse.so.10
2020-07-01 16:55:38.289717: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcudnn.so.7
2020-07-01 16:55:38.295807: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1763] Adding visible gpu devices: 0, 1, 2, 3
2020-07-01 16:55:38.295879: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcudart.so.10.1
2020-07-01 16:55:38.299607: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1181] Device interconnect StreamExecutor with strength 1 edge matrix:
2020-07-01 16:55:38.299625: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1187]      0 1 2 3
2020-07-01 16:55:38.299630: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1200] 0:   N N N N
2020-07-01 16:55:38.299634: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1200] 1:   N N N N
2020-07-01 16:55:38.299638: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1200] 2:   N N N N
2020-07-01 16:55:38.299641: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1200] 3:   N N N N
2020-07-01 16:55:38.304238: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1326] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 10312 MB memory) -> physical GPU (device: 0, name: GeForce RTX 2080 Ti, pci bus id:0000:3b:00.0, compute capability: 7.5)
2020-07-01 16:55:38.306208: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1326] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:1 with 10312 MB memory) -> physical GPU (device: 1, name: GeForce RTX 2080 Ti, pci bus id:0000:5e:00.0, compute capability: 7.5)
2020-07-01 16:55:38.308043: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1326] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:2 with 10312 MB memory) -> physical GPU (device: 2, name: GeForce RTX 2080 Ti, pci bus id:0000:b1:00.0, compute capability: 7.5)
2020-07-01 16:55:38.309859: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1326] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:3 with 10312 MB memory) -> physical GPU (device: 3, name: GeForce RTX 2080 Ti, pci bus id:0000:d9:00.0, compute capability: 7.5)
2020-07-01 16:55:38.312105: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x56032968bf20 executing computations on platform CUDA. Devices:
2020-07-01 16:55:38.312125: I tensorflow/compiler/xla/service/service.cc:175]   StreamExecutor device (0): GeForce RTX 2080 Ti, Compute Capability 7.5
2020-07-01 16:55:38.312134: I tensorflow/compiler/xla/service/service.cc:175]   StreamExecutor device (1): GeForce RTX 2080 Ti, Compute Capability 7.5
2020-07-01 16:55:38.312141: I tensorflow/compiler/xla/service/service.cc:175]   StreamExecutor device (2): GeForce RTX 2080 Ti, Compute Capability 7.5
2020-07-01 16:55:38.312147: I tensorflow/compiler/xla/service/service.cc:175]   StreamExecutor device (3): GeForce RTX 2080 Ti, Compute Capability 7.5
Traceback (most recent call last):
  File "./keras_inference.py", line 201, in <module>
    args.test_image)
  File "./keras_inference.py", line 76, in main
    h = Helper(None, class_num, f'data/{train_set}_anchor.npy', np.reshape(np.array(image_size), (-1, 2)), np.reshape(np.array(output_size), (-1, 2)))
  File "/home/ubuntu/user/project/darkflow/K210_Yolo_framework/tools/utils.py", line 77, in __init__
    self.xy_offset = Helper._coordinate_offset(self.anchors, self.out_hw)  # type:np.ndarray
  File "/home/ubuntu/user/project/darkflow/K210_Yolo_framework/tools/utils.py", line 250, in _coordinate_offset
    grid_y = np.tile(np.reshape(np.arange(0, stop=out_hw[l][0]), [-1, 1, 1, 1]), [1, out_hw[l][1], 1, 1])
IndexError: index 2 is out of bounds for axis 0 with size 2
Makefile:66: recipe for target 'inference' failed
make: *** [inference] Error 1

I created the train.txt and voc_img_ann.npy by following the command below.

wget https://pjreddie.com/media/files/VOCtrainval_11-May-2012.tar
wget https://pjreddie.com/media/files/VOCtrainval_06-Nov-2007.tar
wget https://pjreddie.com/media/files/VOCtest_06-Nov-2007.tar
tar xf VOCtrainval_11-May-2012.tar
tar xf VOCtrainval_06-Nov-2007.tar
tar xf VOCtest_06-Nov-2007.tar
wget https://pjreddie.com/media/files/voc_label.py
python3 voc_label.py
cat 2007_train.txt 2007_val.txt 2012_*.txt > train.txt

python3 make_voc_list.py ../train.txt data/voc_img_ann.npy

How can I solve the problem?

Thanks.
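One possible reading of the traceback: `out_hw` has only two rows, while the loop index `l` reaches 2, which would happen if the anchor file describes three output layers. A minimal sketch for checking this before running inference, assuming the anchor array is laid out as (layers, anchors per layer, 2); the helper name `check_layer_counts` and the zero-filled array are hypothetical and not part of the repository:

```python
import numpy as np

def check_layer_counts(anchors, output_size):
    """Compare the number of anchor layers with the number of (h, w) output pairs."""
    # Same reshape as in the traceback: a flat list becomes rows of (h, w).
    out_hw = np.reshape(np.array(output_size), (-1, 2))
    n_anchor_layers = anchors.shape[0]
    n_output_layers = out_hw.shape[0]
    return n_anchor_layers, n_output_layers, n_anchor_layers == n_output_layers

# Hypothetical stand-in for np.load('data/voc_anchor.npy'): 3 layers, 3 anchors each.
anchors = np.zeros((3, 3, 2))
print(check_layer_counts(anchors, [7, 10, 14, 20]))
```

If the two counts differ, either regenerating the anchors for the number of output layers in use or passing a matching list of output sizes should make them agree.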

Anchor generation

I am trying to understand the Anchor generation algorithm.

I have created a custom model and I am using yolo v3 style detection.

I looked at some of the Keras YOLO v3 implementations, and they all use integer anchors.

Some of those anchors are larger than my input image, since I am using a 224x224 input:

https://github.com/qqwweee/keras-yolo3/blob/master/model_data/yolo_anchors.txt

However, I read somewhere in the AlexeyAB GitHub repository that anchors have to be generated with the model input size taken into account.

Can you please explain the logic of your anchor generation code, given that it produces floating-point anchors?

My training annotations are in the format below: a single file with one line per image. If I want to adapt your k-means code to this format, what would I need to modify to get correct anchors?

racoon_dataset/images/images/raccoon-115.jpg 51,130,351,556,1
racoon_dataset/images/images/raccoon-116.jpg 51,130,351,556,1

My other question: YOLO v3 has 3 anchors per detection level, but your voc_anchor.npy has only 6 anchor pairs. Are you using only 2 detection levels?
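For reference, here is a minimal sketch of how the annotation format above could be turned into box sizes expressed as fractions of the image size, which is one common way to end up with floating-point anchors. It assumes each box is `x1,y1,x2,y2,class` in pixels; the function names and the image size `(600, 852)` are made up for illustration, and the sketch ignores any letterbox scaling or padding the training pipeline may apply.

```python
import numpy as np

def parse_annotation_line(line):
    """Split 'path x1,y1,x2,y2,cls [more boxes...]' into a path and an (n, 5) array."""
    path, *box_strs = line.split()
    boxes = np.array([[float(v) for v in b.split(',')] for b in box_strs])
    return path, boxes

def normalized_wh(boxes, img_wh):
    """Return each box's (width, height) as a fraction of the image size."""
    wh = boxes[:, 2:4] - boxes[:, 0:2]
    return wh / np.asarray(img_wh, dtype=float)

line = 'racoon_dataset/images/images/raccoon-115.jpg 51,130,351,556,1'
path, boxes = parse_annotation_line(line)

# Hypothetical image size (width, height); read the real one from the image file.
wh = normalized_wh(boxes, img_wh=(600, 852))

# Multiplying by the network input size gives pixel-style anchors for comparison.
pixel_wh = wh * np.array([224, 224])
print(wh, pixel_wh)
```

Normalized sizes like `wh` are what a k-means step would cluster; integer anchors in other repositories are typically the same quantity scaled to a specific input resolution.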

Train hanging at the end of Epoch 1/10

Following the instructions in the README.md in section Train at Point 1:

make train MODEL=yolo_mobilev1 DEPTHMUL=0.75 MAXEP=10 ILR=0.001 DATASET=voc CLSNUM=20 IAA=False BATCH=16

It starts Epoch 1/10 and runs for about 2 hours with the ETA getting close to 0s, but stops/hangs at 6s.
You can't Ctrl-C it and 'top' doesn't show any processor load.
The log dir only has an args.txt and train directory.

Example output:
979/982 [============================>.] - ETA: 20s - loss: 39.1206 - l1_loss: 11.0472 - l2_loss: 27.5336 - l1_p: 0.1742 - l1_r: 0.0855 - l2_p: 0.0486 - l2_r:
980/982 [============================>.] - ETA: 13s - loss: 39.1038 - l1_loss: 11.0427 - l2_loss: 27.5213 - l1_p: 0.1744 - l1_r: 0.0855 - l2_p: 0.0487 - l2_r:
981/982 [============================>.] - ETA: 6s - loss: 39.0847 - l1_loss: 11.0408 - l2_loss: 27.5041 - l1_p: 0.1747 - l1_r: 0.0856 - l2_p: 0.0487 - l2_r: 0.0118

If I set MAXEP=1, it completes after 2 hours and I get yolo_model.h5. I tried "make inference" with this model and it didn't seem to detect anything. I also tried the pre-built yolo_model.h5 in the asset directory, and that works well. The instructions say to use MAXEP=10, so perhaps this is why my model doesn't work? Why does it hang at the end of Epoch 1/10?

MODEL = yolo_mobilev1, DEPTHMUL = 0.5 causes an inference issue!

Did you try running MODEL = yolo_mobilev1, DEPTHMUL = 0.5 on the K210? I modified the code so that it runs detection on image data captured from the camera. It works quite well with MODEL = yolo_mobilev1, DEPTHMUL = 0.75, but at a low fps. I simply changed DEPTHMUL to 0.5, retrained, and redeployed on the K210, but when running this smaller model, inference seems to affect the display memory: the display jerks and no boxes are output. I don't know what is wrong, but I believe it is related to the model, since I only changed the DEPTHMUL parameter and everything else remained the same.

Error when training on my own dataset: ValueError: operands could not be broadcast together with shapes (0,0) (2,)

python3 ./make_anchor_list.py voc --max_iters 10 --is_random True --in_hw 224 320 --out_hw 7 10 14 20 --anchor_num 3 --low 0.0 0.0 --high 1.0 1.0
Traceback (most recent call last):
File "./make_anchor_list.py", line 241, in <module>
main(args.train_set, args.max_iters, args.in_hw, args.out_hw, args.anchor_num, args.is_random, args.is_plot, args.low, args.high)
File "./make_anchor_list.py", line 198, in main
X[i, 1][:, 1:3] = (X[i, 1][:, 1:3] * img_wh * scale + translation) / in_wh
ValueError: operands could not be broadcast together with shapes (0,0) (2,)
Makefile:79: recipe for target 'anchors' failed
make: *** [anchors] Error 1
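The shape `(0,0)` in the message suggests that at least one image in the `.npy` file has an empty annotation array, which cannot be broadcast against a 2-element size vector. A minimal sketch for locating such entries, assuming each row is a `(image path, annotation array)` pair as implied by `X[i, 1]` in the traceback; the helper name and the sample data are hypothetical:

```python
import numpy as np

def find_empty_annotations(img_ann):
    """Return the indices of entries whose annotation array has no boxes."""
    empty = []
    for i, (path, ann) in enumerate(img_ann):
        if np.asarray(ann).size == 0:
            empty.append(i)
    return empty

# Hypothetical stand-in for np.load('data/<dataset>_img_ann.npy', allow_pickle=True).
img_ann = [
    ('a.jpg', np.array([[0.0, 0.5, 0.5, 0.2, 0.3]])),  # one box
    ('b.jpg', np.zeros((0, 0))),                       # no boxes
]
print(find_empty_annotations(img_ann))
```

Dropping the reported entries, or fixing the script that generated them, before running `make anchors` may be enough to avoid the broadcast error.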

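How do I evaluate Recall, Precision, and mAP on the test set with a trained model?

Input size

Can the input size only be 224x320? If the input is 224x224, how should I modify the main.c file?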
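On the Python side, the output grid sizes seem to follow from the input size: the anchor command quoted earlier pairs `--in_hw 224 320` with `--out_hw 7 10 14 20`, which matches strides of 32 and 16. A small sketch under that assumption (the function name and the default strides are inferred for illustration, not taken from the repository, and this does not cover the changes needed in main.c):

```python
def output_sizes(in_hw, strides=(32, 16)):
    """Compute the (h, w) grid size of each output layer for a given input size."""
    return [(in_hw[0] // s, in_hw[1] // s) for s in strides]

print(output_sizes((224, 320)))  # [(7, 10), (14, 20)]
print(output_sizes((224, 224)))  # [(7, 7), (14, 14)]
```

Under this assumption, a 224x224 input would use output sizes 7 7 14 14, and the anchors would need to be regenerated with the matching `--in_hw` and `--out_hw` values.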