raytroop / yolov3_tf

Try to implement YOLOv3 with TensorFlow

Python 97.23% C 2.77%

yolov3_tf's Introduction

YOLOv3_tf

Try to implement YOLOv3 with TensorFlow

Convert Darknet weights to npz: save_npz.py
Darknet-53 feature extractor (52 conv layers): darknet53_trainable.py
YOLO layer (loss and prediction): yolo_layer.py
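
As a rough illustration of the first step, the sketch below parses the raw bytes of a Darknet weight file into a flat list of floats. It assumes the commonly described Darknet layout (three int32 version fields, a `seen` counter stored as 64-bit when major*10 + minor >= 2, then raw float32 values) and little-endian byte order. It is not taken from save_npz.py, and `parse_darknet_weights` is a hypothetical name.

```python
import struct


def parse_darknet_weights(data):
    """Return (version, seen, weights) from the bytes of a Darknet weight file.

    Assumed layout: int32 major, minor, revision; then `seen`
    (uint64 if major*10 + minor >= 2, else int32); then float32 values.
    """
    major, minor, revision = struct.unpack_from("<3i", data, 0)
    offset = 12
    if major * 10 + minor >= 2:
        (seen,) = struct.unpack_from("<Q", data, offset)
        offset += 8
    else:
        (seen,) = struct.unpack_from("<i", data, offset)
        offset += 4
    count = (len(data) - offset) // 4  # number of complete float32 values
    weights = list(struct.unpack_from("<%df" % count, data, offset))
    return (major, minor, revision), seen, weights


# Tiny synthetic file: version 0.2.0, seen = 1234, three weights.
blob = struct.pack("<3i", 0, 2, 0) + struct.pack("<Q", 1234) + struct.pack("<3f", 0.5, -1.25, 2.0)
version, seen, weights = parse_darknet_weights(blob)
print(version, seen, weights)
```

For a real file, the bytes would come from `open(path, "rb").read()`. Splitting the flat list into per-layer arrays and writing them out (for example with `numpy.savez`) is the part a converter such as save_npz.py would have to handle.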

train (train_net.py)

1. For now, only the detector layers are trained (50 epochs, batch size 8).
2. The feature extractor ('darknet53.conv.74') is frozen.
3. TODO: fine-tune end to end.

Training data: VOC2007 trainval and VOC2012 trainval
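
Freezing the feature extractor amounts to giving the optimizer only the detector variables. The sketch below shows just that selection step, on plain name strings rather than TensorFlow variables. The `conv_head_` prefix matches the scope names in the yolo_layer.py listing quoted in one of the issues; the `darknet53/` names and the helper `split_by_prefix` are hypothetical. In TensorFlow 1.x the selected variables could then be passed as `var_list` to the optimizer's `minimize` call; whether train_net.py does it this way is an assumption.

```python
def split_by_prefix(names, prefix="conv_head_"):
    """Partition variable names into (trainable, frozen) by scope prefix."""
    trainable = [n for n in names if n.startswith(prefix)]
    frozen = [n for n in names if not n.startswith(prefix)]
    return trainable, frozen


# Hypothetical variable names, for illustration only.
names = [
    "darknet53/conv_0/kernel",
    "darknet53/conv_51/kernel",
    "conv_head_52/conv2d/kernel",
    "conv_head_58/conv2d/bias",
]
trainable, frozen = split_by_prefix(names)
print("train:", trainable)
print("freeze:", frozen)
```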

sample prediction (predict_net.py)

trained weights

pan.baidu key: lbd4
Dropbox

Acknowledgements & Reference

🔥 Darknet 🔥
🔥 YAD2K 🔥

Requirements

1. TensorFlow 1.*
2. easydict
3. PIL
4. numpy
5. matplotlib

yolov3_tf's Issues

Can't find file

I can't find the file "darknet53.conv.74". Can you tell me how to get it? Thanks.

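Definition of the loss function

Does anyone know how the YOLOv3 loss function is defined? Is it the same as in YOLOv1? I read the author's code but did not quite understand it.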
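
For reference, the YOLOv3 paper differs from YOLOv1 mainly in two places: boxes are predicted as offsets relative to anchor priors, and class scores use independent logistic classifiers trained with binary cross-entropy instead of a squared-error term. The sketch below illustrates those two pieces for a single prediction. It follows the paper's equations, not this repository's yolo_layer.py, which may combine or weight the terms differently; the function names are hypothetical.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def decode_box(tx, ty, tw, th, cx, cy, pw, ph):
    """Turn raw outputs into a box.

    bx = sigmoid(tx) + cx, by = sigmoid(ty) + cy   (cx, cy: grid cell offset)
    bw = pw * exp(tw),     bh = ph * exp(th)       (pw, ph: anchor prior size)
    """
    return (sigmoid(tx) + cx, sigmoid(ty) + cy, pw * math.exp(tw), ph * math.exp(th))


def binary_cross_entropy(target, prob, eps=1e-7):
    """Per-class (or objectness) loss for one logistic output."""
    prob = min(max(prob, eps), 1.0 - eps)  # clamp to avoid log(0)
    return -(target * math.log(prob) + (1.0 - target) * math.log(1.0 - prob))


# All-zero raw outputs in cell (3, 4) with a 10x13 anchor.
print(decode_box(0.0, 0.0, 0.0, 0.0, cx=3, cy=4, pw=10, ph=13))  # (3.5, 4.5, 10.0, 13.0)
print(binary_cross_entropy(1.0, 0.5))
```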

Hey, I was playing with your code and it's great! This isn't really an issue so much as a suggestion: if you use cv2 for operations like resizing, they run about 10x faster than with scipy. Just thought I would drop a note in case you want to speed up training/inference.

Is end-to-end fine-tuning difficult?

How to run

Hello, when will the complete test code be released?

Cannot change the number of dataset classes

I tried to train on my own dataset, but after changing the number of classes to match it I got errors, which suggests this code can only handle the 20 VOC classes.

Which parameters should I edit to train with a different number of classes? (Editing cfg.classes alone does not work.)
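
In YOLOv3 the last convolution of each detection head outputs `anchors_per_layer * (num_classes + 5)` channels: 4 box offsets, 1 objectness score and one score per class for every anchor. Changing the class count in the config alone therefore cannot be enough if the filter counts are hard-coded. A minimal sketch of that arithmetic; `head_filters` is a hypothetical helper, not part of this repository:

```python
def head_filters(num_classes, anchors_per_layer=3):
    """Output channels of the final conv layer in each YOLOv3 detection head."""
    return anchors_per_layer * (num_classes + 5)


print(head_filters(20))  # 75 for the 20 VOC classes
print(head_filters(30))  # 105 for 30 classes
```

In the yolo_layer.py listing quoted in the next issue, the corresponding values appear as the output channel counts of conv_head_58, conv_head_66 and conv_head_74, so those would presumably need to change together with cfg.classes.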

Loss function returns NaN after the number of output labels is set to 30

Hi,
I've been trying to apply this code to one of my current projects (since darkflow does not support YOLOv3); however, the output looks like:

While debugging, I found that objectness_loss is NaN from the very beginning. I'm posting my modifications to your code here; basically only the number of labels changed (and the final filter size, of course):
config.py

from easydict import EasyDict as edict
import numpy as np


def getLabels():
    with open('labels.txt', 'r') as f:
        a = list()
        labs = [l.strip() for l in f.readlines()]
        for lab in labs:
            if lab == '----': break
            a += [lab]
    return a
__C = edict()
# Consumers can get config by:
#   from config import cfg
cfg = __C

__C.anchors = np.array([[10, 13], [16, 30], [33, 23], [30, 61], [62, 45], [59, 119], [116, 90], [156, 198], [373, 326]])
__C.classes = 30  # if this changes, also update yolo_head in yolo_layer.py so the last layer of each detector has 3*(classes + 5) filters (YOLOv3 only)
__C.num = 9
__C.num_anchors_per_layer = 3
__C.batch_size = 8
__C.scratch = False
__C.names = getLabels()

create_tfrecords.py

import xml.etree.ElementTree as ET
import numpy as np
import os
import tensorflow as tf
from PIL import Image
from config import getLabels
import glob
# sets = [('2007', 'trainval'), ('2012', 'trainval')]


classes = getLabels()

def convert(size, box):
    dw = 1./size[0]
    dh = 1./size[1]
    x = (box[0] + box[1])/2.0
    y = (box[2] + box[3])/2.0
    w = box[1] - box[0]
    h = box[3] - box[2]
    x = x*dw
    w = w*dw
    y = y*dh
    h = h*dh
    return [x, y, w, h]


def convert_annotation(image_id):
    in_file = open('C:\\Users\\P900\\Desktop\\myWork\\dmg_inspect_YOLOv3\\CID_project_dataset\\annotation\\%s.xml'%(image_id))

    tree = ET.parse(in_file)
    root = tree.getroot()
    size = root.find('size')
    w = int(size.find('width').text)
    h = int(size.find('height').text)
    bboxes = []
    for i, obj in enumerate(root.iter('object')):
        if i > 29:
            break
        difficult = obj.find('difficult').text
        cls = obj.find('name').text
        if cls not in classes or int(difficult) == 1:
            continue
        cls_id = classes.index(cls)
        xmlbox = obj.find('bndbox')
        b = (float(xmlbox.find('xmin').text), float(xmlbox.find('xmax').text), float(xmlbox.find('ymin').text), float(xmlbox.find('ymax').text))
        bb = convert((w, h), b) + [cls_id]
        bboxes.extend(bb)
    if len(bboxes) < 30*5:
        bboxes = bboxes + [0, 0, 0, 0, 0]*(30-int(len(bboxes)/5))

    return np.array(bboxes, dtype=np.float32).flatten().tolist()

def convert_img(image_id):
    image = Image.open('C:\\Users\\P900\\Desktop\\myWork\\dmg_inspect_YOLOv3\\CID_project_dataset\\CID_Photo\\%s.jpg'%(image_id))
    resized_image = image.resize((418, 418), Image.BICUBIC)
    image_data = np.array(resized_image, dtype='float32')/255
    img_raw = image_data.tobytes()
    return img_raw

filename = os.path.join('trainval'+'0712'+'.tfrecords')
writer = tf.python_io.TFRecordWriter(filename)

os.chdir('C:\\Users\\P900\\Desktop\\myWork\\dmg_inspect_YOLOv3\\CID_project_dataset\\annotation')
image_ids = glob.glob('*.xml')  # annotation files in the current directory
# print(filename)
for image_id in image_ids:
    image_id = image_id.split('.')[0]
    xywhc = convert_annotation(image_id)
    img_raw = convert_img(image_id)

    example = tf.train.Example(features=tf.train.Features(feature={
        'xywhc':
                tf.train.Feature(float_list=tf.train.FloatList(value=xywhc)),
        'img':
                tf.train.Feature(bytes_list=tf.train.BytesList(value=[img_raw])),
        }))
    writer.write(example.SerializeToString())
writer.close()
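
As a quick check of the label encoding in the listing above, the sketch below re-implements `convert` and the zero-padding step on one hand-made box. Note that `convert` expects the box as (xmin, xmax, ymin, ymax), not the more common (xmin, ymin, xmax, ymax). The image size, box values and class id are made up, and `pad_boxes` is a hypothetical name for the padding that `convert_annotation` does inline.

```python
def convert(size, box):
    """Map (xmin, xmax, ymin, ymax) in pixels to normalized [x_center, y_center, w, h]."""
    img_w, img_h = size
    xmin, xmax, ymin, ymax = box
    return [
        (xmin + xmax) / 2.0 / img_w,
        (ymin + ymax) / 2.0 / img_h,
        (xmax - xmin) / img_w,
        (ymax - ymin) / img_h,
    ]


def pad_boxes(flat, max_boxes=30, values_per_box=5):
    """Zero-pad a flat [x, y, w, h, class, ...] list up to max_boxes entries."""
    missing = max_boxes - len(flat) // values_per_box
    return flat + [0.0] * (values_per_box * max(missing, 0))


# One box in a 500x375 image; class id 7 is arbitrary.
box = convert((500, 375), (100.0, 300.0, 50.0, 200.0)) + [7]
labels = pad_boxes(box)
print(box)
print(len(labels))  # 150 values = 30 boxes * 5
```

One consequence of this scheme is that a padded row looks like a zero-sized box of class 0, so the loss code presumably has to mask such rows out, for example by checking for zero width and height.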

yolo_layer.py

import tensorflow as tf
from config import cfg


class yolo_head:

    def __init__(self, istraining):
        self.istraining = istraining

    def conv_layer(self, bottom, size, stride, in_channels, out_channels, use_bn, name):
        with tf.variable_scope(name):
            conv = tf.layers.conv2d(bottom, out_channels, size, stride, padding="SAME",
                                    use_bias=not use_bn, activation=None)
            if use_bn:
                conv_bn = tf.layers.batch_normalization(conv, training=self.istraining)
                act = tf.nn.leaky_relu(conv_bn, 0.1)
            else:
                act = conv
        return act
    def build(self, feat_ex, res18, res10):
        self.conv52 = self.conv_layer(feat_ex, 1, 1, 1024, 512, True, 'conv_head_52')  		# 13x512
        self.conv53 = self.conv_layer(self.conv52, 3, 1, 512, 1024, True, 'conv_head_53')   # 13x1024
        self.conv54 = self.conv_layer(self.conv53, 1, 1, 1024, 512, True, 'conv_head_54')   # 13x512
        self.conv55 = self.conv_layer(self.conv54, 3, 1, 512, 1024, True, 'conv_head_55')   # 13x1024
        self.conv56 = self.conv_layer(self.conv55, 1, 1, 1024, 512, True, 'conv_head_56')   # 13x512
        self.conv57 = self.conv_layer(self.conv56, 3, 1, 512, 1024, True, 'conv_head_57')   # 13x1024
        self.conv58 = self.conv_layer(self.conv57, 1, 1, 1024, 105, False, 'conv_head_58')   # 13x105
        # follow yolo layer mask = 6,7,8
        self.conv59 = self.conv_layer(self.conv56, 1, 1, 512, 256, True, 'conv_head_59')    # 13x256
        size = tf.shape(self.conv59)[1]
        self.upsample0 = tf.image.resize_nearest_neighbor(self.conv59, [2*size, 2*size],
                                                          name='upsample_0')                # 26x256
        self.route0 = tf.concat([self.upsample0, res18], axis=-1, name='route_0')           # 26x768
        self.conv60 = self.conv_layer(self.route0, 1, 1, 768, 256, True, 'conv_head_60')    # 26x256
        self.conv61 = self.conv_layer(self.conv60, 3, 1, 256, 512, True, 'conv_head_61')    # 26x512
        self.conv62 = self.conv_layer(self.conv61, 1, 1, 512, 256, True, 'conv_head_62')    # 26x256
        self.conv63 = self.conv_layer(self.conv62, 3, 1, 256, 512, True, 'conv_head_63')    # 26x512
        self.conv64 = self.conv_layer(self.conv63, 1, 1, 512, 256, True, 'conv_head_64')    # 26x256
        self.conv65 = self.conv_layer(self.conv64, 3, 1, 256, 512, True, 'conv_head_65')    # 26x512
        self.conv66 = self.conv_layer(self.conv65, 1, 1, 512, 105, False, 'conv_head_66')    # 26x105
        # follow yolo layer mask = 3,4,5
        self.conv67 = self.conv_layer(self.conv64, 1, 1, 256, 128, True, 'conv_head_67')    # 26x128
        size = tf.shape(self.conv67)[1]
        self.upsample1 = tf.image.resize_nearest_neighbor(self.conv67, [2 * size, 2 * size],
                                                          name='upsample_1')                # 52x128
        self.route1 = tf.concat([self.upsample1, res10], axis=-1, name='route_1')           # 52x384
        self.conv68 = self.conv_layer(self.route1, 1, 1, 384, 128, True, 'conv_head_68')    # 52x128
        self.conv69 = self.conv_layer(self.conv68, 3, 1, 128, 256, True, 'conv_head_69')    # 52x256
        self.conv70 = self.conv_layer(self.conv69, 1, 1, 256, 128, True, 'conv_head_70')    # 52x128
        self.conv71 = self.conv_layer(self.conv70, 3, 1, 128, 256, True, 'conv_head_71')    # 52x256
        self.conv72 = self.conv_layer(self.conv71, 1, 1, 256, 128, True, 'conv_head_72')    # 52x128
        self.conv73 = self.conv_layer(self.conv72, 3, 1, 128, 256, True, 'conv_head_73')    # 52x256
        self.conv74 = self.conv_layer(self.conv73, 1, 1, 256, 105, False, 'conv_head_74')    # 52x105
        # follow yolo layer mask = 0,1,2

        return self.conv74, self.conv66, self.conv58

Please help, thanks.
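
For orientation, the three outputs returned by `build` sit at strides 32, 16 and 8 relative to the input, which is why the comments in the listing mention 13, 26 and 52. The sketch below computes the grid sizes and output shapes for a given input size and class count. It assumes a square input; `head_shapes` is a hypothetical helper, not code from this repository.

```python
def head_shapes(input_size, num_classes, anchors_per_layer=3, strides=(32, 16, 8)):
    """Return (grid, grid, channels) for each detection head, coarsest first."""
    if input_size % max(strides) != 0:
        raise ValueError(
            "input size %d is not a multiple of %d" % (input_size, max(strides))
        )
    channels = anchors_per_layer * (num_classes + 5)
    return [(input_size // s, input_size // s, channels) for s in strides]


print(head_shapes(416, 30))  # [(13, 13, 105), (26, 26, 105), (52, 52, 105)]
```

With 418, the size used in `convert_img` above, this check raises because 418 is not a multiple of 32. Whether that matters here depends on how the training pipeline decodes and resizes the records.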

mAP

What's the mAP on VOC dataset?

Hi, I think there's something wrong with your code!

I have some questions about your code in the preprocess_true_boxes part of the YOLO layer, and I would like to get in touch with you. It seems that you are Chinese; if so, we can talk through QQ. My QQ is 815301859. If not, we can use email. My email address is [email protected]. I hope to hear from you soon. You are pretty good. I have already completed MTCNN and Faster RCNN, so I think we could exchange more ideas. Thank you.
