yolo-for-k210's Introduction

Yolo-for-k210

vseasky/yolo-for-k210

Tutorial

riscv-k210

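Environment setup

  1. Standard setup
  • windows
  • python3.7
  • tensorflow-gpu1.15
  • cuda10.0
  • cudnn7.4.2
  • Install the remaining dependencies with pip3 install -r requirements.txt.
  2. 30-series GPUs
  • windows-wsl-ubuntu20.04
  • python3.8.16
  • tensorflow-gpu1.15.5
    • nvidia-pyindex
    • nvidia-tensorflow[horovod]
    • nvidia-tensorboard==1.15
  • cuda11.1
  • cudnn8.6
  • Install the remaining dependencies with pip3 install -r requirements.txt.
  • To stay compatible with the latest GPU drivers, the NVIDIA build drops support for TensorFlow Lite. To use the toco tool, try installing a TF2 release, which provides the model conversion tools.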

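Prepare the dataset

  1. VoTT is recommended for annotating the dataset; export the result in Pascal VOC format.

  2. The example VOC dataset is stored in /train-image/VOCdevkit; for a custom dataset, change the path to /train-image/your_img.

    Dataset structure

readme.png

  3. Run the make-train.py script. It splits the data into training, validation, and test sets at a ratio of 7:2:1 (writing pscalvoc.txt, train.txt, val.txt, and test.txt) and automatically detects and deletes unpaired extra files.
  • On Windows, download and merge the datasets manually.
  • On Linux, download the datasets: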
cd ./train-image
wget https://pjreddie.com/media/files/VOCtrainval_11-May-2012.tar
wget https://pjreddie.com/media/files/VOCtrainval_06-Nov-2007.tar
wget https://pjreddie.com/media/files/VOCtest_06-Nov-2007.tar
# Extract the archives
tar xf VOCtrainval_11-May-2012.tar
tar xf VOCtrainval_06-Nov-2007.tar
tar xf VOCtest_06-Nov-2007.tar
# Merge the datasets
cd VOCdevkit/
mv ./VOC2007/* ./
cp -r ./VOC2012/* ./
rm -rf VOC2007
rm -rf VOC2012
  • Split the dataset:
# Use the VOC dataset
python make-train.py ./VOCdevkit
# Use a custom dataset
python make-train.py ./your_img
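
The 7:2:1 split and the removal of unpaired files described above can be sketched as follows. This is only an illustration of the idea, not the code of make-train.py, which is not shown here; the function names `paired_stems` and `split_dataset` and the sample names are hypothetical.

```python
import random


def paired_stems(image_stems, annotation_stems):
    """Keep only samples that have both an image and an annotation."""
    return sorted(set(image_stems) & set(annotation_stems))


def split_dataset(stems, ratios=(7, 2, 1), seed=0):
    """Shuffle sample names and split them into train/val/test by ratio."""
    stems = sorted(stems)                 # sort first so the shuffle is reproducible
    random.Random(seed).shuffle(stems)
    total = sum(ratios)
    n_train = len(stems) * ratios[0] // total
    n_val = len(stems) * ratios[1] // total
    train = stems[:n_train]
    val = stems[n_train:n_train + n_val]
    test = stems[n_train + n_val:]        # the remainder goes to the test set
    return train, val, test


# Hypothetical sample names: ten complete pairs plus two unpaired files.
images = [f"img{i:03d}" for i in range(10)] + ["orphan_image"]
annotations = [f"img{i:03d}" for i in range(10)] + ["orphan_annotation"]

stems = paired_stems(images, annotations)
train, val, test = split_dataset(stems)
print(len(train), len(val), len(test))    # prints: 7 2 1
```

Fixing the seed keeps the split stable between runs, which makes later experiments easier to compare.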
  4. Preprocess the dataset with voc_label.py
  • Edit voc_label.py

image.png

# Use the VOC dataset
python voc_label.py
cat VOCdevkit_train.txt VOCdevkit_val.txt > train.txt     # use this command on Linux
# type VOCdevkit_train.txt VOCdevkit_val.txt > train.txt  # use this command on Windows
# Use a custom dataset
python voc_label.py
cat your_img_train.txt your_img_val.txt > train.txt       # use this command on Linux
# type your_img_train.txt your_img_val.txt > train.txt    # use this command on Windows
  5. Check that the txt files are correct; each file should contain image paths.

  6. Merge the JPEGImages paths and the Annotations into a single NPY file.

python make_voc_list.py train.txt data/voc_img_ann.npy

image.png
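
The check of the txt files described above can be automated with a few lines of Python. The sketch below assumes one image path per line and treats .jpg, .jpeg, and .png as valid suffixes; both the helper name `find_bad_lines` and the suffix set are assumptions for illustration, not part of the repository.

```python
import os
import tempfile

IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png"}   # assumed set of image suffixes


def find_bad_lines(list_file):
    """Return (line_number, text) pairs that are not existing image paths."""
    bad = []
    with open(list_file, encoding="utf-8") as f:
        for number, line in enumerate(f, start=1):
            path = line.strip()
            if not path:
                continue                        # ignore blank lines
            suffix = os.path.splitext(path)[1].lower()
            if suffix not in IMAGE_EXTENSIONS or not os.path.isfile(path):
                bad.append((number, path))
    return bad


# Small self-contained demonstration with a temporary list file.
with tempfile.TemporaryDirectory() as folder:
    existing = os.path.join(folder, "000001.jpg")
    open(existing, "wb").close()                # create an empty placeholder image
    missing = os.path.join(folder, "000002.jpg")
    list_path = os.path.join(folder, "train.txt")
    with open(list_path, "w", encoding="utf-8") as f:
        f.write(existing + "\n" + missing + "\n")
    for number, path in find_bad_lines(list_path):
        print(f"line {number}: not an existing image: {path}")
```

In practice you would call `find_bad_lines("train.txt")` before running make_voc_list.py and fix any reported lines.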

Edit the configuration

You can edit the default configuration directly in the Makefile, or pass parameters on the make command line.

image.png

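Generate anchors

Load the annotations and generate anchors (LOW and HIGH depend on the distribution of your dataset):

# make anchors # use the default parameters
make anchors DATASET=voc ANCNUM=3 LOW="0.0 0.0" HIGH="1.0 1.0" # override with custom parameters

On success, you will see output like the following: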

Figure_2.png

image.png

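Note: the result is random. If you get an error, just run it again.

To use a custom dataset, modify the script to generate data/{your_img}_img_ann.npy, then run make anchors DATASET=your_img. For more options, see python ./make_anchor_list.py -h

To change the number of output layers, modify OUTSIZE in the Makefile.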
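
To give a feel for why the anchor result varies between runs, here is a minimal sketch of anchor clustering. It assumes k-means over normalized (width, height) pairs with 1 - IoU as the distance, which is the common approach for YOLO anchors; this text does not show how make_anchor_list.py actually works, and the sketch does not model the LOW and HIGH parameters. The names `iou_wh` and `kmeans_anchors` and the sample boxes are hypothetical.

```python
import random


def iou_wh(box, anchor):
    """IoU of two boxes aligned at the same corner, given as (width, height)."""
    inter = min(box[0], anchor[0]) * min(box[1], anchor[1])
    union = box[0] * box[1] + anchor[0] * anchor[1] - inter
    return inter / union


def kmeans_anchors(boxes, k, seed=None, max_iter=100):
    """Cluster (w, h) pairs with k-means, assigning each box to the highest-IoU centre."""
    rng = random.Random(seed)
    anchors = rng.sample(boxes, k)              # random initial centres
    assignment = None
    for _ in range(max_iter):
        new_assignment = [
            max(range(k), key=lambda j: iou_wh(box, anchors[j])) for box in boxes
        ]
        if new_assignment == assignment:        # converged: no box changed cluster
            break
        assignment = new_assignment
        for j in range(k):
            members = [b for b, a in zip(boxes, assignment) if a == j]
            if members:                         # keep the old centre if a cluster is empty
                anchors[j] = (
                    sum(w for w, _ in members) / len(members),
                    sum(h for _, h in members) / len(members),
                )
    return sorted(anchors, key=lambda a: a[0] * a[1])


# Hypothetical normalized box sizes (fractions of image width and height).
boxes = [
    (0.10, 0.12), (0.12, 0.10), (0.11, 0.11),
    (0.40, 0.50), (0.42, 0.48), (0.38, 0.52),
    (0.80, 0.90), (0.82, 0.88), (0.78, 0.92),
]
for w, h in kmeans_anchors(boxes, k=3, seed=0):
    print(f"{w:.3f} {h:.3f}")
```

Because the initial centres are drawn at random, different seeds can converge to different anchors, which matches the note above about rerunning when the result is unsatisfactory.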

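Download pre-trained models

You must download the weights for the model you want to train, because pre-trained weights are loaded by default. Put the files in the ./data directory.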

MODEL          DEPTHMUL  Url           Url
yolo_mobilev1  0.5       google drive  weiyun
yolo_mobilev1  0.75      google drive  weiyun
yolo_mobilev1  1.0       google drive  weiyun
yolo_mobilev2  0.5       google drive  weiyun
yolo_mobilev2  0.75      google drive  weiyun
yolo_mobilev2  1.0       google drive  weiyun
tiny_yolo      -         google drive  weiyun
yolo           -         google drive  weiyun

Training

When using MobileNet, you must specify the DEPTHMUL parameter. With tiny yolo or yolo, you do not need to set DEPTHMUL.

  1. Set MODEL and DEPTHMUL and start training:
make train MODEL=yolo_mobilev1 DEPTHMUL=0.75 MAXEP=10 ILR=0.001 DATASET=voc CLSNUM=20 IAA=False BATCH=16

image.png

Press Ctrl+C to stop training; the weights and model are saved automatically in the log directory.
  2. Set CKPT to resume training:
make train MODEL=yolo_mobilev1 DEPTHMUL=0.75 MAXEP=10 ILR=0.0005 DATASET=voc CLSNUM=20 IAA=False BATCH=16 CKPT=log/xxxxxxxxx/yolo_model.h5
  3. Set IAA to enable data augmentation:
make train MODEL=xxxx DEPTHMUL=xx MAXEP=10 ILR=0.0001 DATASET=voc CLSNUM=20 IAA=True BATCH=16 CKPT=log/xxxxxxxxx/yolo_model.h5
  4. Use tensorboard:
tensorboard --logdir log

Note: for more options, see python ./keras_train.py -h
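
The three training commands above differ only in ILR, IAA, and CKPT, so they can be generated from one shared set of parameters. The sketch below only builds and prints the command lines; it does not run make. The helper `make_train_command` is hypothetical, the model and depth multiplier are filled in with the values from the first command, and the checkpoint path is the same placeholder used above.

```python
import shlex


def make_train_command(**params):
    """Build a `make train` command line from NAME=value pairs."""
    args = ["make", "train"] + [f"{name}={value}" for name, value in params.items()]
    return " ".join(shlex.quote(arg) for arg in args)


common = dict(MODEL="yolo_mobilev1", DEPTHMUL=0.75, MAXEP=10,
              DATASET="voc", CLSNUM=20, BATCH=16)
checkpoint = "log/xxxxxxxxx/yolo_model.h5"     # placeholder path, replace with your own

stages = [
    dict(common, ILR=0.001, IAA=False),                     # initial training
    dict(common, ILR=0.0005, IAA=False, CKPT=checkpoint),   # resume from a checkpoint
    dict(common, ILR=0.0001, IAA=True, CKPT=checkpoint),    # resume with augmentation
]
for stage in stages:
    print(make_train_command(**stage))
```

Keeping the shared parameters in one place makes it harder to resume with a MODEL or DEPTHMUL that does not match the checkpoint.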

Inference

  1. Use a model you trained yourself:
make inference MODEL=yolo_mobilev1 DEPTHMUL=0.75 CLSNUM=20 CKPT=log/xxxxxx/yolo_model.h5 IMG=data/input/people.jpg
  2. You can try my model:
make inference MODEL=yolo_mobilev1 DEPTHMUL=0.75 CKPT=asset/yolo_model.h5 IMG=data/input/people.jpg

people.png

make inference MODEL=yolo_mobilev1 DEPTHMUL=0.75 CKPT=asset/yolo_model.h5 IMG=data/input/dog.jpg

dog_res.jpg

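Note: because the anchors are generated randomly, your results may differ from the images above. In that case, load this model and continue training for a while.

For more options, see python ./keras_inference.py -h

Prune the model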

make train MODEL=xxxx MAXEP=1 ILR=0.001 DATASET=voc CLSNUM=20 BATCH=16 PRUNE=True CKPT=log/xxxxxx/yolo_model.h5 END_EPOCH=1

When training finishes, the model is saved as log/xxxxxx/yolo_prune_model.h5.

Freeze

toco --output_file data/tflite/mobile_yolo.tflite --keras_model_file log/xxxxxx/yolo_model.h5

You now have mobile_yolo.tflite.

Convert to Kmodel

See the nncase v0.1.0-RC5 example.

  1. NNCase Converter v0.1.0 RC5
./nncase/0.1.0/ncc --version
./nncase/0.1.0/ncc data/tflite/mobile_yolo.tflite mobile_yolo_v3.kmodel -i tflite -o k210model --dataset nncase-images

Deploy the Kmodel to the K210

This is a complete solution; for the underlying hardware and software deployment, see vseasky/riscv-k210.

IMG_8355.JPG

IMG_8355.JPG

IMG_8351.JPG

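FAQ

  • The default parameters are defined in the Makefile.
  • OBJWEIGHT, NOOBJWEIGHT, and WHWEIGHT are used to balance precision and recall.
  • By default there are two output layers; if you need more, modify OUTSIZE.
  • To use the full yolo, change IMGSIZE and OUTSIZE in the Makefile to the original yolo parameters.

References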

zhen8838/K210_Yolo_framework

2020/7/5 21:04:35

yolo-for-k210's Issues

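Input image normalization issue

Hello, I noticed that on the K210 the input image values are integers in the range 0-255, whereas the input pixel values are normalized when the model is trained. Is this a potential problem? Should this line of normalization code be commented out during training? Thanks!

Training fails at startup with AttributeError: 'str' object has no attribute 'decode' (solution found)

模型训练-v1.pdf

Set MODEL and DEPTHMUL and start training: make train MODEL=yolo_mobilev1 DEPTHMUL=0.75 MAXEP=10
ILR=0.001 DATASET=voc CLSNUM=20 IAA=False BATCH=8

Error output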

$ make train MODEL=yolo_mobilev1 DEPTHMUL=0.75 MAXEP=10
ILR=0.001 DATASET=voc CLSNUM=20 IAA=False BATCH=8
python ./keras_train.py \
                --train_set voc \
                --class_num 20 \
                --pre_ckpt "" \
                --model_def yolo_mobilev1 \
                --depth_multiplier 0.75 \
                --augmenter False \
                --image_size 224 320 \
                --output_size 7 10 14 20 \
                --batch_size 32 \
                --rand_seed 3 \
                --max_nrof_epochs 10 \
                --init_learning_rate 0.0005 \
                --learning_rate_decay_factor 0 \
                --obj_weight 1 \
                --noobj_weight 1 \
                --wh_weight 1 \
                --obj_thresh 0.7 \
                --iou_thresh 0.5 \
                --vaildation_split 0.05 \
                --log_dir log \
                --is_prune False \
                --prune_initial_sparsity 0.5 \
                --prune_final_sparsity 0.9 \
                --prune_end_epoch 5 \
                --prune_frequency 100 
WARNING:tensorflow:
The TensorFlow contrib module will not be included in TensorFlow 2.0.
For more information, please see:
  * https://github.com/tensorflow/community/blob/master/rfcs/20180907-contrib-sunset.md
  * https://github.com/tensorflow/addons
  * https://github.com/tensorflow/io (for I/O related ops)
If you depend on functionality not listed there, please file an issue.

2022-03-15 16:21:52.145631: I tensorflow/core/platform/cpu_feature_guard.cc:142] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
2022-03-15 16:21:52.167418: I tensorflow/core/platform/profile_utils/cpu_utils.cc:94] CPU Frequency: 2096875000 Hz
2022-03-15 16:21:52.168281: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x5643bd94b6a0 initialized for platform Host (this does not guarantee that XLA will be used). Devices:
2022-03-15 16:21:52.168352: I tensorflow/compiler/xla/service/service.cc:176]   StreamExecutor device (0): Host, Default Version
2022-03-15 16:21:52.169885: W tensorflow/stream_executor/platform/default/dso_loader.cc:55] Could not load dynamic library 'libcuda.so.1'; dlerror: libcuda.so.1: cannot open shared object file: No such file or directory
2022-03-15 16:21:52.169909: E tensorflow/stream_executor/cuda/cuda_driver.cc:318] failed call to cuInit: UNKNOWN ERROR (303)
2022-03-15 16:21:52.169922: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:156] kernel driver does not appear to be running on this host (xx): /proc/driver/nvidia/version does not exist
[ INFO  ] data augment is  False
WARNING:tensorflow:From /home/kearney/.conda/envs/k210/lib/python3.7/site-packages/tensorflow_core/python/data/util/random_seed.py:58: where (from tensorflow.python.ops.array_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use tf.where in 2.0, which has the same broadcast rule as np.where
[ INFO  ] data augment is  False
WARNING:tensorflow:From /home/kearney/.conda/envs/k210/lib/python3.7/site-packages/tensorflow_core/python/ops/resource_variable_ops.py:1630: calling BaseResourceVariable.__init__ (from tensorflow.python.ops.resource_variable_ops) with constraint is deprecated and will be removed in a future version.
Instructions for updating:
If using Keras pass *_constraint arguments to layers.
Traceback (most recent call last):
  File "./keras_train.py", line 155, in <module>
    args.prune_frequency)
  File "./keras_train.py", line 51, in main
    yolo_model, yolo_model_warpper = network([image_size[0], image_size[1], 3], len(h.anchors[0]), class_num, alpha=depth_multiplier)
  File "/home/kearney/Downloads/k210/Yolo-for-k210/Yolo-for-k210/models/yolonet.py", line 19, in yolo_mobilev1
    base_model.load_weights('data/mobilenet_v1_base_7.h5')
  File "/home/kearney/.conda/envs/k210/lib/python3.7/site-packages/tensorflow_core/python/keras/engine/training.py", line 182, in load_weights
    return super(Model, self).load_weights(filepath, by_name)
  File "/home/kearney/.conda/envs/k210/lib/python3.7/site-packages/tensorflow_core/python/keras/engine/network.py", line 1373, in load_weights
    saving.load_weights_from_hdf5_group(f, self.layers)
  File "/home/kearney/.conda/envs/k210/lib/python3.7/site-packages/tensorflow_core/python/keras/saving/hdf5_format.py", line 645, in load_weights_from_hdf5_group
    original_keras_version = f.attrs['keras_version'].decode('utf8')
AttributeError: 'str' object has no attribute 'decode'
make: *** [Makefile:35:train] 错误 1

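The fix comes from the post "Successfully resolving AttributeError: 'str' object has no attribute 'decode'":

pip install 'h5py<3.0.0' -i https://pypi.tuna.tsinghua.edu.cn/simple

My guess is that the author removed many entries when running pip freeze; my pip list shows many more packages, as follows: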

$ pip list
Package                       Version
----------------------------- -----------
absl-py                       1.0.0
astor                         0.8.1
cached-property               1.5.2
certifi                       2021.10.8
cycler                        0.11.0
gast                          0.2.2
google-pasta                  0.2.0
grpcio                        1.44.0
h5py                          3.6.0
imageio                       2.9.0
imgaug                        0.2.9
importlib-metadata            4.11.3
Keras-Applications            1.0.8
Keras-Preprocessing           1.1.2
kiwisolver                    1.4.0
Markdown                      3.3.6
matplotlib                    3.0.3
networkx                      2.6.3
numpy                         1.16.2
opencv-python                 4.0.0.21
opt-einsum                    3.3.0
Pillow                        6.2.0
pip                           21.2.2
protobuf                      3.19.4
pyparsing                     3.0.7
python-dateutil               2.8.2
PyWavelets                    1.1.1
scikit-image                  0.15.0
scipy                         1.2.1
setuptools                    58.0.4
Shapely                       1.8.1.post1
six                           1.16.0
tensorboard                   1.15.0
tensorflow                    1.15.0
tensorflow-estimator          1.15.1
tensorflow-model-optimization 0.1.1
termcolor                     1.1.0
typing_extensions             4.1.1
Werkzeug                      2.0.3
wheel                         0.37.1
wrapt                         1.14.0
zipp                          3.7.0
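
The list above shows h5py 3.6.0, while the fix pins h5py below 3.0.0. From h5py 3.0 on, string attributes are returned as str rather than bytes, so the `.decode('utf8')` call in the traceback fails. The sketch below illustrates both the version check and the difference in attribute types; the helper names are hypothetical and this is not a patch to TensorFlow.

```python
def h5py_needs_downgrade(version):
    """Return True when the h5py version string is 3.0.0 or newer."""
    major = int(version.split(".")[0])
    return major >= 3


def decode_attr(value):
    """Accept both bytes (h5py 2.x style) and str (h5py 3.x style) attribute values."""
    return value.decode("utf8") if isinstance(value, bytes) else value


print(h5py_needs_downgrade("3.6.0"))    # version from the pip list above: True
print(h5py_needs_downgrade("2.10.0"))   # satisfies h5py<3.0.0: False

print(decode_attr(b"2.2.4-tf"))         # bytes value is decoded
print(decode_attr("2.2.4-tf"))          # str value is returned unchanged

try:
    "2.2.4-tf".decode("utf8")           # what the failing line effectively does with h5py 3.x
except AttributeError as error:
    print(error)                        # 'str' object has no attribute 'decode'
```

To check the installed version in your own environment, run pip show h5py or print `h5py.__version__`.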

[Solved] cmake fails when building the source on Linux

Version information

  • cmake:3.22.3-1
  • tensorflow : 1.15.0 cpu
$ riscv64-unknown-elf-gcc --version
riscv64-unknown-elf-gcc (GCC) 8.2.0
Copyright (C) 2018 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

Error messages

$  cmake .. –DPROJ=hello_world –G "MinGW Makefiles"
CMake Error: The source directory "/home/kearney/Downloads/k210/Yolo-for-k210/seasky_yolo/build/MinGW Makefiles" does not exist.
Specify --help for usage, or press the help button on the CMake GUI.

$  cmake .. –DPROJ=hello_world
CMake Error: The source directory "/home/kearney/Downloads/k210/Yolo-for-k210/seasky_yolo/build/–DPROJ=hello_world" does not exist.
Specify --help for usage, or press the help button on the CMake GUI.
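
In both commands above, the character before DPROJ and G is an en dash (U+2013), not an ASCII hyphen, which is consistent with cmake treating those arguments as a source directory instead of as options. The thread does not record which fix was actually applied, so the following is only a sketch of how such look-alike characters can be detected in a pasted command; the helper names are hypothetical.

```python
# Unicode characters that look like "-" but are not the ASCII hyphen-minus.
DASH_LOOK_ALIKES = {"\u2010", "\u2011", "\u2012", "\u2013", "\u2014", "\u2212"}


def find_non_ascii_dashes(command):
    """Return (index, character name) pairs for dash look-alikes in a command line."""
    return [(i, f"U+{ord(ch):04X}") for i, ch in enumerate(command)
            if ch in DASH_LOOK_ALIKES]


def replace_dash_look_alikes(command):
    """Replace every dash look-alike with an ASCII hyphen."""
    return "".join("-" if ch in DASH_LOOK_ALIKES else ch for ch in command)


pasted = "cmake .. \u2013DPROJ=hello_world"
print(find_non_ascii_dashes(pasted))       # [(9, 'U+2013')]
print(replace_dash_look_alikes(pasted))    # cmake .. -DPROJ=hello_world
```

This kind of substitution often happens when a command is copied from a PDF or a word processor that converts hyphens automatically.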

ValueError: Empty training data.

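Hello, with a custom dataset (401 images) I get the error ValueError: Empty training data. I changed the shuffle parameter in the dataset and still get this error. Could you give me some pointers?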
1/190 [..............................] - ETA: 21:51 - loss: 897.8262 - l1_loss: 169.9690 - l2_loss: 727.4587 - l1_p: 0.0769 - l1_r: 0.6667 - l2_p: 0.0000e+00 - l2_r: 0.0000e+002020-09-16 12:46:54.344528: I tensorflow/core/profiler/lib/profiler_session.cc:205] Profiler session started.
2020-09-16 12:46:54.349900: W tensorflow/stream_executor/platform/default/dso_loader.cc:55] Could not load dynamic library 'cupti64_100.dll'; dlerror: cupti64_100.dll not found
2020-09-16 12:46:54.354959: W tensorflow/core/profiler/lib/profiler_session.cc:213] Encountered error while starting profiler: Unavailable: CUPTI error: CUPTI could not be loaded or symbol could not be found.
2/190 [..............................] - ETA: 11:03 - loss: 961.8444 - l1_loss: 242.9070 - l2_loss: 718.5386 - l1_p: 0.0663 - l1_r: 0.4286 - l2_p: 0.0000e+00 - l2_r: 0.0000e+002020-09-16 12:46:54.576991: I tensorflow/core/platform/default/device_tracer.cc:588] Collecting 0 kernel records, 0 memcpy records.
2020-09-16 12:46:54.583706: E tensorflow/core/platform/default/device_tracer.cc:70] CUPTI error: CUPTI could not be loaded or symbol could not be found.
189/190 [============================>.] - ETA: 0s - loss: 157.4149 - l1_loss: 80.9354 - l2_loss: 76.0476 - l1_p: 0.0444 - l1_r: 0.0053 - l2_p: 0.0000e+00 - l2_r: 0.0000e+00Traceback (most recent call last):
File "./keras_train.py", line 155, in
args.prune_frequency)
File "./keras_train.py", line 99, in main
validation_data=h.test_dataset, validation_steps=int(h.test_epoch_step * h.validation_split))
File "D:\Anaconda\envs\seayolok210\lib\site-packages\tensorflow_core\python\keras\engine\training.py", line 727, in fit
use_multiprocessing=use_multiprocessing)
File "D:\Anaconda\envs\seayolok210\lib\site-packages\tensorflow_core\python\keras\engine\training_arrays.py", line 675, in fit
steps_name='steps_per_epoch')
File "D:\Anaconda\envs\seayolok210\lib\site-packages\tensorflow_core\python\keras\engine\training_arrays.py", line 440, in model_iteration
steps_name='validation_steps')
File "D:\Anaconda\envs\seayolok210\lib\site-packages\tensorflow_core\python\keras\engine\training_arrays.py", line 411, in model_iteration
aggregator.finalize()
File "D:\Anaconda\envs\seayolok210\lib\site-packages\tensorflow_core\python\keras\engine\training_utils.py", line 138, in finalize
raise ValueError('Empty training data.')
ValueError: Empty training data.
mingw32-make: *** [Makefile:35: train] Error 1
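
One possible cause, not confirmed in this thread, is visible in the traceback itself: validation_steps is computed as int(h.test_epoch_step * h.validation_split), and int() truncates toward zero. With a small dataset the product can fall below 1, so zero validation steps are requested. The sketch below works through the arithmetic; the step counts are hypothetical and 0.05 is assumed as the split value.

```python
def validation_steps(test_epoch_step, validation_split):
    """Mirror the expression in the traceback: int() truncates toward zero."""
    return int(test_epoch_step * validation_split)


def safe_validation_steps(test_epoch_step, validation_split):
    """Hypothetical variant that always requests at least one validation step."""
    return max(1, validation_steps(test_epoch_step, validation_split))


# Hypothetical step counts; 0.05 is an assumed validation split.
print(validation_steps(10, 0.05))        # 0: no validation batches are requested
print(validation_steps(40, 0.05))        # 2
print(safe_validation_steps(10, 0.05))   # 1
```

If this is the cause in your setup, a larger validation split, a smaller batch size, or more test images would each raise the product above 1.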
