
yolov3_tf's Introduction

YOLOv3_tf

An attempt to implement YOLOv3 with TensorFlow.

  1. Convert Darknet weights to npz: save_npz.py
  2. Darknet-53 feature extractor (52 conv layers): darknet53_trainable.py
  3. YOLO layer (loss & prediction): yolo_layer.py
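
As a rough illustration of step 1: a Darknet weights file is a small integer header followed by a flat run of float32 values. The sketch below only reads that header and the raw values and writes them to an npz file. The function names `read_darknet_weights` and `save_flat_npz` and the key `flat` are hypothetical and are not taken from save_npz.py, which also has to split the values layer by layer.

```python
import numpy as np


def read_darknet_weights(path):
    """Return (header dict, flat float32 array) from a Darknet weights file."""
    with open(path, "rb") as f:
        major, minor, revision = np.fromfile(f, dtype=np.int32, count=3)
        # Files with version >= 0.2 store the "seen" counter as 64-bit,
        # older ones as 32-bit.
        seen_dtype = np.int64 if major * 10 + minor >= 2 else np.int32
        seen = np.fromfile(f, dtype=seen_dtype, count=1)[0]
        # Everything after the header is float32 parameters.
        flat = np.fromfile(f, dtype=np.float32)
    header = {
        "major": int(major),
        "minor": int(minor),
        "revision": int(revision),
        "seen": int(seen),
    }
    return header, flat


def save_flat_npz(weights_path, npz_path):
    """Hypothetical helper: dump the header fields and raw values to npz."""
    header, flat = read_darknet_weights(weights_path)
    np.savez(npz_path, flat=flat, **{k: np.array(v) for k, v in header.items()})
    return header, flat
```

Splitting `flat` into per-layer arrays requires walking the network definition in order, which this sketch deliberately leaves out.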


train (train_net.py)

1. For now, only the detector layers are trained (50 epochs, batch size 8).
2. The feature extractor ('darknet53.conv.74') is frozen.
3. TODO: fine-tune end to end.

Training data: VOC2007 trainval & VOC2012 trainval

sample prediction (predict_net.py)




trained weights

Acknowledgements & Reference

Requirements
1. TensorFlow 1.*
2. easydict
3. PIL
4. numpy
5. matplotlib

yolov3_tf's People

Contributors

raytroop


yolov3_tf's Issues

can't find file

I can't find the file "darknet53.conv.74". Can you tell me how to get it? Thanks.

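Definition of the loss function

Does anyone know how the YOLOv3 loss function is defined? Is it the same as in YOLOv1? I read the author's code but did not quite understand it.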
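
For reference, the YOLOv3 paper describes a loss that differs from YOLOv1: box coordinates use a sum of squared errors, while objectness and class predictions use logistic (binary cross-entropy) terms. The sketch below is a minimal NumPy illustration for a single predicted box under those assumptions. The function names, the choice to compare x/y after the sigmoid, and the unweighted terms are all hypothetical simplifications; it does not reproduce yolo_layer.py and it omits the ignore threshold for overlapping predictions.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def bce(target, logit, eps=1e-7):
    """Binary cross-entropy on a logit, clipped for numerical safety."""
    p = np.clip(sigmoid(logit), eps, 1.0 - eps)
    return -(target * np.log(p) + (1.0 - target) * np.log(1.0 - p))


def single_box_loss(raw, target_box, target_obj, target_cls):
    """Loss terms for one anchor in one grid cell.

    raw:        [tx, ty, tw, th, obj_logit, class_logits...]
    target_box: [x offset in cell, y offset in cell, log(w/anchor_w), log(h/anchor_h)]
    target_obj: 1.0 if this anchor is responsible for an object, else 0.0
    target_cls: one value per class (multi-label, so no softmax)
    """
    raw = np.asarray(raw, dtype=np.float64)
    box = np.asarray(target_box, dtype=np.float64)
    cls_target = np.asarray(target_cls, dtype=np.float64)

    xy_pred = sigmoid(raw[0:2])  # cell-relative center
    wh_pred = raw[2:4]           # log-space size relative to the anchor

    # Coordinate and class terms only count where an object is assigned.
    coord = target_obj * (np.sum((xy_pred - box[0:2]) ** 2)
                          + np.sum((wh_pred - box[2:4]) ** 2))
    obj = bce(target_obj, raw[4])
    cls = target_obj * np.sum(bce(cls_target, raw[5:]))
    return {"coord": float(coord), "obj": float(obj), "cls": float(cls)}
```

In YOLOv1, by contrast, all terms (including class probabilities) are squared errors, so the main difference lies in the objectness and class terms.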

cv2 vs scipy

Hey, I was playing with your code and it's great! This isn't really an issue so much as a suggestion: if you use cv2 for things like resizes, they run about 10x faster than with scipy. Just thought I would drop a note in case you wanted to speed up training/inference.

Cannot change the number of dataset classes

I tried to train on my own dataset and changed the number of classes to match it, but I get errors, which suggests this code can only handle the 20 VOC classes.

Which parameters should I edit if I want to train on a different number of classes? (Editing cfg.classes did not work.)
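
A comment in the config.py posted in another issue on this page notes that the last layer of each detector needs 3*(classes + 5) filters, and the yolo_layer.py posted there passes that count as a literal number. So changing cfg.classes alone would probably not resize those layers. The arithmetic can be sketched as follows; `detector_filters` is a hypothetical helper, not a function from the repository.

```python
def detector_filters(num_classes, anchors_per_layer=3):
    """Output channels of the final conv layer in each YOLOv3 detector.

    Each anchor predicts 4 box values, 1 objectness score and one score
    per class.
    """
    return anchors_per_layer * (num_classes + 5)


print(detector_filters(20))  # 75 for the 20 VOC classes
print(detector_filters(30))  # 105 for a 30-class dataset
```

Under this assumption, the out_channels argument of the three final conv layers would need to change together with cfg.classes.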

Loss function returns NaN after the number of output labels is set to 30

Hi,
I've been trying to use this code in one of my current projects (since darkflow does not support YOLOv3), but the output looks like this:
[screenshot: tim 20180612165828]
While debugging I found that objectness_loss is NaN from the very beginning. I've posted my modifications to your code here; basically just the number of labels (and the final filter size, of course):
config.py

from easydict import EasyDict as edict
import numpy as np


def getLabels():
    with open('labels.txt', 'r') as f:
        a = list()
        labs = [l.strip() for l in f.readlines()]
        for lab in labs:
            if lab == '----': break
            a += [lab]
    return a
__C = edict()
# Consumers can get config by:
#   from config import cfg
cfg = __C

__C.anchors = np.array([[10, 13], [16, 30], [33, 23], [30, 61], [62, 45], [59, 119], [116, 90], [156, 198], [373, 326]])
__C.classes = 30 # any change made to this must also change yolo_head.py(yolo_head()) and make sure last layer have 3*(class + 5) filters for each detector (only true for Yolov3)
__C.num = 9
__C.num_anchors_per_layer = 3
__C.batch_size = 8
__C.scratch = False
__C.names = getLabels()

create_tfrecords.py

import xml.etree.ElementTree as ET
import numpy as np
import os
import tensorflow as tf
from PIL import Image
from config import getLabels
import glob
# sets = [('2007', 'trainval'), ('2012', 'trainval')]


classes = getLabels()

def convert(size, box):
    dw = 1./size[0]
    dh = 1./size[1]
    x = (box[0] + box[1])/2.0
    y = (box[2] + box[3])/2.0
    w = box[1] - box[0]
    h = box[3] - box[2]
    x = x*dw
    w = w*dw
    y = y*dh
    h = h*dh
    return [x, y, w, h]


def convert_annotation(image_id):
    in_file = open('C:\\Users\\P900\\Desktop\\myWork\\dmg_inspect_YOLOv3\\CID_project_dataset\\annotation\\%s.xml'%(image_id))

    tree = ET.parse(in_file)
    root = tree.getroot()
    size = root.find('size')
    w = int(size.find('width').text)
    h = int(size.find('height').text)
    bboxes = []
    for i, obj in enumerate(root.iter('object')):
        if i > 29:
            break
        difficult = obj.find('difficult').text
        cls = obj.find('name').text
        if cls not in classes or int(difficult) == 1:
            continue
        cls_id = classes.index(cls)
        xmlbox = obj.find('bndbox')
        b = (float(xmlbox.find('xmin').text), float(xmlbox.find('xmax').text), float(xmlbox.find('ymin').text), float(xmlbox.find('ymax').text))
        bb = convert((w, h), b) + [cls_id]
        bboxes.extend(bb)
    if len(bboxes) < 30*5:
        bboxes = bboxes + [0, 0, 0, 0, 0]*(30-int(len(bboxes)/5))

    return np.array(bboxes, dtype=np.float32).flatten().tolist()

def convert_img(image_id):
    image = Image.open('C:\\Users\\P900\\Desktop\\myWork\\dmg_inspect_YOLOv3\\CID_project_dataset\\CID_Photo\\%s.jpg'%(image_id))
    resized_image = image.resize((418, 418), Image.BICUBIC)
    image_data = np.array(resized_image, dtype='float32')/255
    img_raw = image_data.tobytes()
    return img_raw

filename = os.path.join('trainval'+'0712'+'.tfrecords')
writer = tf.python_io.TFRecordWriter(filename)

os.chdir('C:\\Users\\P900\\Desktop\\myWork\\dmg_inspect_YOLOv3\\CID_project_dataset\\annotation')
# list the annotation files in the current directory
image_ids = glob.glob('*.xml')
# print(filename)
for image_id in image_ids:
    image_id = image_id.split('.')[0]
    xywhc = convert_annotation(image_id)
    img_raw = convert_img(image_id)

    example = tf.train.Example(features=tf.train.Features(feature={
        'xywhc':
                tf.train.Feature(float_list=tf.train.FloatList(value=xywhc)),
        'img':
                tf.train.Feature(bytes_list=tf.train.BytesList(value=[img_raw])),
        }))
    writer.write(example.SerializeToString())
writer.close()

yolo_layer.py

import tensorflow as tf
from config import cfg


class yolo_head:

    def __init__(self, istraining):
        self.istraining = istraining

    def conv_layer(self, bottom, size, stride, in_channels, out_channels, use_bn, name):
        with tf.variable_scope(name):
            conv = tf.layers.conv2d(bottom, out_channels, size, stride, padding="SAME",
                                    use_bias=not use_bn, activation=None)
            if use_bn:
                conv_bn = tf.layers.batch_normalization(conv, training=self.istraining)
                act = tf.nn.leaky_relu(conv_bn, 0.1)
            else:
                act = conv
        return act
    def build(self, feat_ex, res18, res10):
        self.conv52 = self.conv_layer(feat_ex, 1, 1, 1024, 512, True, 'conv_head_52')  		# 13x512
        self.conv53 = self.conv_layer(self.conv52, 3, 1, 512, 1024, True, 'conv_head_53')   # 13x1024
        self.conv54 = self.conv_layer(self.conv53, 1, 1, 1024, 512, True, 'conv_head_54')   # 13x512
        self.conv55 = self.conv_layer(self.conv54, 3, 1, 512, 1024, True, 'conv_head_55')   # 13x1024
        self.conv56 = self.conv_layer(self.conv55, 1, 1, 1024, 512, True, 'conv_head_56')   # 13x512
        self.conv57 = self.conv_layer(self.conv56, 3, 1, 512, 1024, True, 'conv_head_57')   # 13x1024
        self.conv58 = self.conv_layer(self.conv57, 1, 1, 1024, 105, False, 'conv_head_58')   # 13x105
        # follow yolo layer mask = 6,7,8
        self.conv59 = self.conv_layer(self.conv56, 1, 1, 512, 256, True, 'conv_head_59')    # 13x256
        size = tf.shape(self.conv59)[1]
        self.upsample0 = tf.image.resize_nearest_neighbor(self.conv59, [2*size, 2*size],
                                                          name='upsample_0')                # 26x256
        self.route0 = tf.concat([self.upsample0, res18], axis=-1, name='route_0')           # 26x768
        self.conv60 = self.conv_layer(self.route0, 1, 1, 768, 256, True, 'conv_head_60')    # 26x256
        self.conv61 = self.conv_layer(self.conv60, 3, 1, 256, 512, True, 'conv_head_61')    # 26x512
        self.conv62 = self.conv_layer(self.conv61, 1, 1, 512, 256, True, 'conv_head_62')    # 26x256
        self.conv63 = self.conv_layer(self.conv62, 3, 1, 256, 512, True, 'conv_head_63')    # 26x512
        self.conv64 = self.conv_layer(self.conv63, 1, 1, 512, 256, True, 'conv_head_64')    # 26x256
        self.conv65 = self.conv_layer(self.conv64, 3, 1, 256, 512, True, 'conv_head_65')    # 26x512
        self.conv66 = self.conv_layer(self.conv65, 1, 1, 512, 105, False, 'conv_head_66')    # 26x105
        # follow yolo layer mask = 3,4,5
        self.conv67 = self.conv_layer(self.conv64, 1, 1, 256, 128, True, 'conv_head_67')    # 26x128
        size = tf.shape(self.conv67)[1]
        self.upsample1 = tf.image.resize_nearest_neighbor(self.conv67, [2 * size, 2 * size],
                                                          name='upsample_1')                # 52x128
        self.route1 = tf.concat([self.upsample1, res10], axis=-1, name='route_1')           # 52x384
        self.conv68 = self.conv_layer(self.route1, 1, 1, 384, 128, True, 'conv_head_68')    # 52x128
        self.conv69 = self.conv_layer(self.conv68, 3, 1, 128, 256, True, 'conv_head_69')    # 52x256
        self.conv70 = self.conv_layer(self.conv69, 1, 1, 256, 128, True, 'conv_head_70')    # 52x128
        self.conv71 = self.conv_layer(self.conv70, 3, 1, 128, 256, True, 'conv_head_71')    # 52x256
        self.conv72 = self.conv_layer(self.conv71, 1, 1, 256, 128, True, 'conv_head_72')    # 52x128
        self.conv73 = self.conv_layer(self.conv72, 3, 1, 128, 256, True, 'conv_head_73')    # 52x256
        self.conv74 = self.conv_layer(self.conv73, 1, 1, 256, 105, False, 'conv_head_74')    # 52x105
        # follow yolo layer mask = 0,1,2

        return self.conv74, self.conv66, self.conv58

Please help, thanks.
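
When chasing a NaN like this, it may help to rule out the inputs first. The sketch below is a hypothetical pair of helpers (not from the repository) that checks a flattened xywhc vector of the kind built by convert_annotation above, and checks whether the input size is a multiple of 32, the total downsampling factor of YOLOv3. Note that the posted convert_img resizes to 418x418, which is not a multiple of 32. Whether either check explains the NaN here is only an assumption; a zero width or height is one plausible source if the loss takes a logarithm of the box size.

```python
import numpy as np


def check_labels(xywhc, num_classes, max_boxes=30):
    """Return a list of problems found in a flattened [x, y, w, h, c] * max_boxes vector."""
    flat = np.asarray(xywhc, dtype=np.float64)
    if flat.size != max_boxes * 5:
        return ["expected %d values, got %d" % (max_boxes * 5, flat.size)]

    problems = []
    if not np.all(np.isfinite(flat)):
        problems.append("non-finite value in labels")

    boxes = flat.reshape(max_boxes, 5)
    # All-zero box rows are padding; only rows with some box value are checked.
    real = boxes[np.any(boxes[:, :4] != 0, axis=1)]

    if np.any(real[:, :2] < 0) or np.any(real[:, :2] > 1):
        problems.append("box center outside [0, 1]")
    if np.any(real[:, 2:4] <= 0) or np.any(real[:, 2:4] > 1):
        problems.append("box width/height outside (0, 1]")

    cls = real[:, 4]
    if np.any(cls != np.round(cls)) or np.any(cls < 0) or np.any(cls >= num_classes):
        problems.append("class id outside [0, num_classes)")
    return problems


def check_input_size(size, stride=32):
    """True if the input side length divides evenly by the network stride."""
    return size % stride == 0
```

For example, `check_labels(xywhc, 30)` could be called on each record before it is written to the tfrecords file, and `check_input_size(418)` returns False while `check_input_size(416)` returns True.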

mAP

What's the mAP on VOC dataset?

Hi, I think there's something wrong with your code!

I have some questions about your code in the preprocess_true_boxes part of the YOLO layer, and I would like to get in touch with you. It seems that you are Chinese; if so, we can talk through QQ. My QQ is 815301859. If not, we can use email. My email address is [email protected]. I hope to hear from you soon. Your work is very good. I have already implemented MTCNN and Faster RCNN, and I think we could exchange more ideas. Thank you.
