lddl / yolo-go

Port of Darknet YOLO for the Go ecosystem

License: Apache License 2.0

Go 99.78% Shell 0.22%
hacktoberfest yolov3 yolov3-tiny yolo-darknet darknet machine-learning computer-vision neural-network object-detection data-science

yolo-go's Introduction

WIP. PRs are welcome

Port of Darknet YOLO to Go, built on Gorgonia. Both YOLOv3 and tiny-YOLOv3 are implemented.

Usage

Navigate to the example/yolo-v3 folder and run main.go.

Available flags

go run main.go -h
  -cfg string
        Path to net configuration file (default "../../test_network_data/yolov3-tiny.cfg")
  -image string
        Path to image file for 'detector' mode (default "../../test_network_data/dog_416x416.jpg")
  -mode string
        Choose the mode: detector/training (default "detector")
  -train string
        Path to folder with labeled data (default "../../test_yolo_op_data")
  -weights string
        Path to weights file (default "../../test_network_data/yolov3-tiny.weights")
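
The help text above has the shape produced by Go's standard `flag` package. Below is a minimal sketch of how such flags could be declared. The `options` struct, the `parseOptions` function, and the mode validation are hypothetical illustrations and are not taken from the repository's main.go; only the flag names and defaults come from the listing above.

```go
package main

import (
	"flag"
	"fmt"
)

// options is a hypothetical container for the flags listed in the help output.
type options struct {
	mode, cfg, weights, image, train string
}

// parseOptions parses args (without the program name) using the flag names
// and defaults shown in the help output. The mode check is an assumption.
func parseOptions(args []string) (options, error) {
	var o options
	fs := flag.NewFlagSet("yolo-v3", flag.ContinueOnError)
	fs.StringVar(&o.mode, "mode", "detector", "Choose the mode: detector/training")
	fs.StringVar(&o.cfg, "cfg", "../../test_network_data/yolov3-tiny.cfg", "Path to net configuration file")
	fs.StringVar(&o.weights, "weights", "../../test_network_data/yolov3-tiny.weights", "Path to weights file")
	fs.StringVar(&o.image, "image", "../../test_network_data/dog_416x416.jpg", "Path to image file for 'detector' mode")
	fs.StringVar(&o.train, "train", "../../test_yolo_op_data", "Path to folder with labeled data")
	if err := fs.Parse(args); err != nil {
		return o, err
	}
	if o.mode != "detector" && o.mode != "training" {
		return o, fmt.Errorf("unknown mode %q", o.mode)
	}
	return o, nil
}

func main() {
	// The flag package accepts both -mode and --mode spellings.
	o, err := parseOptions([]string{"--mode", "detector", "--cfg", "../../test_network_data/yolov3.cfg"})
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(o.mode, o.cfg, o.weights)
}
```

Flags that are not passed keep the defaults from the listing, which is why the commands below only need to override what differs.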

For testing tiny-yolov3:

go run main.go --mode detector --cfg ../../test_network_data/yolov3-tiny.cfg --weights ../../test_network_data/yolov3-tiny.weights --image ../../test_network_data/dog_416x416.jpg

For testing yolov3:

go run main.go --mode detector --cfg ../../test_network_data/yolov3.cfg --weights ../../test_network_data/yolov3.weights --image ../../test_network_data/dog_416x416.jpg

For training (WIP, PRs are welcome):

go run main.go --mode training --cfg ../../test_network_data/yolov3-tiny.cfg --weights ../../test_network_data/yolov3-tiny.weights --image ../../test_network_data/dog_416x416.jpg --train ../../test_yolo_op_data

Weights and configuration

Weights can be downloaded with the curl scripts download_weights_yolo_v3.sh and download_weights_yolo_tiny_v3.sh. Configuration files: yolov3-tiny.cfg and yolov3.cfg.

Network Architecture

Tiny-YOLOv3 Architecture is:

Convolution layer: Filters->16 Padding->1 Kernel->3x3 Stride->1 Activation->leaky Batch->1 Bias->false
Maxpooling layer: Size->2 Stride->2
Convolution layer: Filters->32 Padding->1 Kernel->3x3 Stride->1 Activation->leaky Batch->1 Bias->false
Maxpooling layer: Size->2 Stride->2
Convolution layer: Filters->64 Padding->1 Kernel->3x3 Stride->1 Activation->leaky Batch->1 Bias->false
Maxpooling layer: Size->2 Stride->2
Convolution layer: Filters->128 Padding->1 Kernel->3x3 Stride->1 Activation->leaky Batch->1 Bias->false
Maxpooling layer: Size->2 Stride->2
Convolution layer: Filters->256 Padding->1 Kernel->3x3 Stride->1 Activation->leaky Batch->1 Bias->false
Maxpooling layer: Size->2 Stride->2
Convolution layer: Filters->512 Padding->1 Kernel->3x3 Stride->1 Activation->leaky Batch->1 Bias->false
Maxpooling layer: Size->2 Stride->1
Convolution layer: Filters->1024 Padding->1 Kernel->3x3 Stride->1 Activation->leaky Batch->1 Bias->false
Convolution layer: Filters->256 Padding->0 Kernel->1x1 Stride->1 Activation->leaky Batch->1 Bias->false
Convolution layer: Filters->512 Padding->1 Kernel->3x3 Stride->1 Activation->leaky Batch->1 Bias->false
Convolution layer: Filters->255 Padding->0 Kernel->1x1 Stride->1 Activation->linear Batch->0 Bias->true
YOLO layer: Mask->3 Anchors->[81, 82]   |       Mask->4 Anchors->[135, 169]     |       Mask->5 Anchors->[344, 319]
Route layer: Start->13
Convolution layer: Filters->128 Padding->0 Kernel->1x1 Stride->1 Activation->leaky Batch->1 Bias->false
Upsample layer: Scale->2
Route layer: Start->19 End->8
Convolution layer: Filters->256 Padding->1 Kernel->3x3 Stride->1 Activation->leaky Batch->1 Bias->false
Convolution layer: Filters->255 Padding->0 Kernel->1x1 Stride->1 Activation->linear Batch->0 Bias->true
YOLO layer: Mask->0 Anchors->[10, 14]   |       Mask->1 Anchors->[23, 27]       |       Mask->2 Anchors->[37, 58] 
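
Every convolution that feeds a YOLO layer in the listing above has 255 filters. In YOLOv3 that number is the anchors per YOLO layer multiplied by (4 box coordinates + 1 objectness score + number of classes). The small check below assumes 80 classes, as in the COCO-trained configuration. The class count is an assumption and does not appear in the listing, and `yoloFilters` is a hypothetical helper name.

```go
package main

import "fmt"

// yoloFilters returns the filter count of the convolution preceding a YOLO
// layer: each anchor predicts 4 box coordinates, 1 objectness score, and one
// score per class.
func yoloFilters(anchorsPerLayer, classes int) int {
	return anchorsPerLayer * (4 + 1 + classes)
}

func main() {
	// 3 masks per YOLO layer (from the listing), 80 classes (assumed, COCO).
	fmt.Println(yoloFilters(3, 80)) // prints 255
}
```

When training on a custom dataset with a different class count, the filter count of these convolutions would need to change accordingly.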

YOLOv3 Architecture is:

Convolution layer: Filters->32 Padding->1 Kernel->3x3 Stride->1 Activation->leaky Batch->1 Bias->false
Convolution layer: Filters->64 Padding->1 Kernel->3x3 Stride->2 Activation->leaky Batch->1 Bias->false
Convolution layer: Filters->32 Padding->0 Kernel->1x1 Stride->1 Activation->leaky Batch->1 Bias->false
Convolution layer: Filters->64 Padding->1 Kernel->3x3 Stride->1 Activation->leaky Batch->1 Bias->false
Shortcut layer: index from->1 | index to->3
Convolution layer: Filters->128 Padding->1 Kernel->3x3 Stride->2 Activation->leaky Batch->1 Bias->false
Convolution layer: Filters->64 Padding->0 Kernel->1x1 Stride->1 Activation->leaky Batch->1 Bias->false
Convolution layer: Filters->128 Padding->1 Kernel->3x3 Stride->1 Activation->leaky Batch->1 Bias->false
Shortcut layer: index from->5 | index to->7
Convolution layer: Filters->64 Padding->0 Kernel->1x1 Stride->1 Activation->leaky Batch->1 Bias->false
Convolution layer: Filters->128 Padding->1 Kernel->3x3 Stride->1 Activation->leaky Batch->1 Bias->false
Shortcut layer: index from->8 | index to->10
Convolution layer: Filters->256 Padding->1 Kernel->3x3 Stride->2 Activation->leaky Batch->1 Bias->false
Convolution layer: Filters->128 Padding->0 Kernel->1x1 Stride->1 Activation->leaky Batch->1 Bias->false
Convolution layer: Filters->256 Padding->1 Kernel->3x3 Stride->1 Activation->leaky Batch->1 Bias->false
Shortcut layer: index from->12 | index to->14
Convolution layer: Filters->128 Padding->0 Kernel->1x1 Stride->1 Activation->leaky Batch->1 Bias->false
Convolution layer: Filters->256 Padding->1 Kernel->3x3 Stride->1 Activation->leaky Batch->1 Bias->false
Shortcut layer: index from->15 | index to->17
Convolution layer: Filters->128 Padding->0 Kernel->1x1 Stride->1 Activation->leaky Batch->1 Bias->false
Convolution layer: Filters->256 Padding->1 Kernel->3x3 Stride->1 Activation->leaky Batch->1 Bias->false
Shortcut layer: index from->18 | index to->20
Convolution layer: Filters->128 Padding->0 Kernel->1x1 Stride->1 Activation->leaky Batch->1 Bias->false
Convolution layer: Filters->256 Padding->1 Kernel->3x3 Stride->1 Activation->leaky Batch->1 Bias->false
Shortcut layer: index from->21 | index to->23
Convolution layer: Filters->128 Padding->0 Kernel->1x1 Stride->1 Activation->leaky Batch->1 Bias->false
Convolution layer: Filters->256 Padding->1 Kernel->3x3 Stride->1 Activation->leaky Batch->1 Bias->false
Shortcut layer: index from->24 | index to->26
Convolution layer: Filters->128 Padding->0 Kernel->1x1 Stride->1 Activation->leaky Batch->1 Bias->false
Convolution layer: Filters->256 Padding->1 Kernel->3x3 Stride->1 Activation->leaky Batch->1 Bias->false
Shortcut layer: index from->27 | index to->29
Convolution layer: Filters->128 Padding->0 Kernel->1x1 Stride->1 Activation->leaky Batch->1 Bias->false
Convolution layer: Filters->256 Padding->1 Kernel->3x3 Stride->1 Activation->leaky Batch->1 Bias->false
Shortcut layer: index from->30 | index to->32
Convolution layer: Filters->128 Padding->0 Kernel->1x1 Stride->1 Activation->leaky Batch->1 Bias->false
Convolution layer: Filters->256 Padding->1 Kernel->3x3 Stride->1 Activation->leaky Batch->1 Bias->false
Shortcut layer: index from->33 | index to->35
Convolution layer: Filters->512 Padding->1 Kernel->3x3 Stride->2 Activation->leaky Batch->1 Bias->false
Convolution layer: Filters->256 Padding->0 Kernel->1x1 Stride->1 Activation->leaky Batch->1 Bias->false
Convolution layer: Filters->512 Padding->1 Kernel->3x3 Stride->1 Activation->leaky Batch->1 Bias->false
Shortcut layer: index from->37 | index to->39
Convolution layer: Filters->256 Padding->0 Kernel->1x1 Stride->1 Activation->leaky Batch->1 Bias->false
Convolution layer: Filters->512 Padding->1 Kernel->3x3 Stride->1 Activation->leaky Batch->1 Bias->false
Shortcut layer: index from->40 | index to->42
Convolution layer: Filters->256 Padding->0 Kernel->1x1 Stride->1 Activation->leaky Batch->1 Bias->false
Convolution layer: Filters->512 Padding->1 Kernel->3x3 Stride->1 Activation->leaky Batch->1 Bias->false
Shortcut layer: index from->43 | index to->45
Convolution layer: Filters->256 Padding->0 Kernel->1x1 Stride->1 Activation->leaky Batch->1 Bias->false
Convolution layer: Filters->512 Padding->1 Kernel->3x3 Stride->1 Activation->leaky Batch->1 Bias->false
Shortcut layer: index from->46 | index to->48
Convolution layer: Filters->256 Padding->0 Kernel->1x1 Stride->1 Activation->leaky Batch->1 Bias->false
Convolution layer: Filters->512 Padding->1 Kernel->3x3 Stride->1 Activation->leaky Batch->1 Bias->false
Shortcut layer: index from->49 | index to->51
Convolution layer: Filters->256 Padding->0 Kernel->1x1 Stride->1 Activation->leaky Batch->1 Bias->false
Convolution layer: Filters->512 Padding->1 Kernel->3x3 Stride->1 Activation->leaky Batch->1 Bias->false
Shortcut layer: index from->52 | index to->54
Convolution layer: Filters->256 Padding->0 Kernel->1x1 Stride->1 Activation->leaky Batch->1 Bias->false
Convolution layer: Filters->512 Padding->1 Kernel->3x3 Stride->1 Activation->leaky Batch->1 Bias->false
Shortcut layer: index from->55 | index to->57
Convolution layer: Filters->256 Padding->0 Kernel->1x1 Stride->1 Activation->leaky Batch->1 Bias->false
Convolution layer: Filters->512 Padding->1 Kernel->3x3 Stride->1 Activation->leaky Batch->1 Bias->false
Shortcut layer: index from->58 | index to->60
Convolution layer: Filters->1024 Padding->1 Kernel->3x3 Stride->2 Activation->leaky Batch->1 Bias->false
Convolution layer: Filters->512 Padding->0 Kernel->1x1 Stride->1 Activation->leaky Batch->1 Bias->false
Convolution layer: Filters->1024 Padding->1 Kernel->3x3 Stride->1 Activation->leaky Batch->1 Bias->false
Shortcut layer: index from->62 | index to->64
Convolution layer: Filters->512 Padding->0 Kernel->1x1 Stride->1 Activation->leaky Batch->1 Bias->false
Convolution layer: Filters->1024 Padding->1 Kernel->3x3 Stride->1 Activation->leaky Batch->1 Bias->false
Shortcut layer: index from->65 | index to->67
Convolution layer: Filters->512 Padding->0 Kernel->1x1 Stride->1 Activation->leaky Batch->1 Bias->false
Convolution layer: Filters->1024 Padding->1 Kernel->3x3 Stride->1 Activation->leaky Batch->1 Bias->false
Shortcut layer: index from->68 | index to->70
Convolution layer: Filters->512 Padding->0 Kernel->1x1 Stride->1 Activation->leaky Batch->1 Bias->false
Convolution layer: Filters->1024 Padding->1 Kernel->3x3 Stride->1 Activation->leaky Batch->1 Bias->false
Shortcut layer: index from->71 | index to->73
Convolution layer: Filters->512 Padding->0 Kernel->1x1 Stride->1 Activation->leaky Batch->1 Bias->false
Convolution layer: Filters->1024 Padding->1 Kernel->3x3 Stride->1 Activation->leaky Batch->1 Bias->false
Convolution layer: Filters->512 Padding->0 Kernel->1x1 Stride->1 Activation->leaky Batch->1 Bias->false
Convolution layer: Filters->1024 Padding->1 Kernel->3x3 Stride->1 Activation->leaky Batch->1 Bias->false
Convolution layer: Filters->512 Padding->0 Kernel->1x1 Stride->1 Activation->leaky Batch->1 Bias->false
Convolution layer: Filters->1024 Padding->1 Kernel->3x3 Stride->1 Activation->leaky Batch->1 Bias->false
Convolution layer: Filters->255 Padding->0 Kernel->1x1 Stride->1 Activation->linear Batch->0 Bias->true
YOLO layer: Mask->6 Anchors->[116, 90]  |       Mask->7 Anchors->[156, 198]     |       Mask->8 Anchors->[373, 326]
Route layer: Start->79
Convolution layer: Filters->256 Padding->0 Kernel->1x1 Stride->1 Activation->leaky Batch->1 Bias->false
Upsample layer: Scale->2
Route layer: Start->85 End->61
Convolution layer: Filters->256 Padding->0 Kernel->1x1 Stride->1 Activation->leaky Batch->1 Bias->false
Convolution layer: Filters->512 Padding->1 Kernel->3x3 Stride->1 Activation->leaky Batch->1 Bias->false
Convolution layer: Filters->256 Padding->0 Kernel->1x1 Stride->1 Activation->leaky Batch->1 Bias->false
Convolution layer: Filters->512 Padding->1 Kernel->3x3 Stride->1 Activation->leaky Batch->1 Bias->false
Convolution layer: Filters->256 Padding->0 Kernel->1x1 Stride->1 Activation->leaky Batch->1 Bias->false
Convolution layer: Filters->512 Padding->1 Kernel->3x3 Stride->1 Activation->leaky Batch->1 Bias->false
Convolution layer: Filters->255 Padding->0 Kernel->1x1 Stride->1 Activation->linear Batch->0 Bias->true
YOLO layer: Mask->3 Anchors->[30, 61]   |       Mask->4 Anchors->[62, 45]       |       Mask->5 Anchors->[59, 119]
Route layer: Start->91
Convolution layer: Filters->128 Padding->0 Kernel->1x1 Stride->1 Activation->leaky Batch->1 Bias->false
Upsample layer: Scale->2
Route layer: Start->97 End->36
Convolution layer: Filters->128 Padding->0 Kernel->1x1 Stride->1 Activation->leaky Batch->1 Bias->false
Convolution layer: Filters->256 Padding->1 Kernel->3x3 Stride->1 Activation->leaky Batch->1 Bias->false
Convolution layer: Filters->128 Padding->0 Kernel->1x1 Stride->1 Activation->leaky Batch->1 Bias->false
Convolution layer: Filters->256 Padding->1 Kernel->3x3 Stride->1 Activation->leaky Batch->1 Bias->false
Convolution layer: Filters->128 Padding->0 Kernel->1x1 Stride->1 Activation->leaky Batch->1 Bias->false
Convolution layer: Filters->256 Padding->1 Kernel->3x3 Stride->1 Activation->leaky Batch->1 Bias->false
Convolution layer: Filters->255 Padding->0 Kernel->1x1 Stride->1 Activation->linear Batch->0 Bias->true
YOLO layer: Mask->0 Anchors->[10, 13]   |       Mask->1 Anchors->[16, 30]       |       Mask->2 Anchors->[33, 23]
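
The YOLOv3 listing above contains five 3x3 convolutions with Stride->2, and these are the only layers that shrink the feature map. The sketch below traces the spatial size with the standard convolution output formula. It assumes a 416x416 input (suggested by the test image name dog_416x416.jpg) and that Padding->1 means one pixel of zero padding on each side; `convOut` is a hypothetical helper name.

```go
package main

import "fmt"

// convOut returns the output width (or height) of a convolution using the
// standard formula (in + 2*pad - kernel)/stride + 1 with integer division.
func convOut(in, kernel, pad, stride int) int {
	return (in+2*pad-kernel)/stride + 1
}

func main() {
	size := 416 // assumed network input size
	// Five 3x3, padding 1, stride 2 convolutions precede the first YOLO layer.
	for i := 1; i <= 5; i++ {
		size = convOut(size, 3, 1, 2)
		fmt.Printf("after stride-2 convolution %d: %dx%d\n", i, size, size)
	}
	// 3x3/padding 1 and 1x1/padding 0 convolutions with stride 1 keep the size.
	fmt.Println(convOut(size, 3, 1, 1), convOut(size, 1, 0, 1))
}
```

Under these assumptions the first YOLO layer works on a 13x13 grid, and each Upsample layer with Scale->2 doubles the grid for the following YOLO layer (26x26, then 52x52).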

yolo-go's Issues

[FEATURE REQUEST] Loss function

Is your feature request related to a problem? Please describe.
Current trainer - https://github.com/LdDl/yolo-go/blob/master/yolo_trainer.go#L11
Current loss function for YOLO bounding boxes - https://github.com/LdDl/yolo-go/blob/master/yolo_trainer.go#L100
I doubt that this code (or even the approach) is correct. It needs to be implemented step by step and compared with the classic C-based implementation (maybe taking inspiration from some TensorFlow/PyTorch implementation).

Describe the solution you'd like and provide pseudocode examples if you can
According to the YOLO source code:
Box deltas and IoU - https://github.com/AlexeyAB/darknet/blob/master/src/yolo_layer.c#L166:

float delta_yolo_box(box truth, float *x, float *biases, int n, int index, int i, int j, int lw, int lh, int w, int h, float *delta, float scale, int stride)
{
    box pred = get_yolo_box(x, biases, n, index, i, j, lw, lh, w, h, stride);
    float iou = box_iou(pred, truth);

    float tx = (truth.x*lw - i);
    float ty = (truth.y*lh - j);
    float tw = log(truth.w*w / biases[2*n]);
    float th = log(truth.h*h / biases[2*n + 1]);

    delta[index + 0*stride] = scale * (tx - x[index + 0*stride]);
    delta[index + 1*stride] = scale * (ty - x[index + 1*stride]);
    delta[index + 2*stride] = scale * (tw - x[index + 2*stride]);
    delta[index + 3*stride] = scale * (th - x[index + 3*stride]);
    return iou;
}
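
As a starting point for the step-by-step comparison, the target encoding in the C function above could be sketched in Go as follows. This is a rough translation under stated assumptions, not the code in yolo_trainer.go: the names `box`, `iou`, and `boxDeltas` are hypothetical, float64 replaces float, the four layer outputs are passed as a small array instead of being read from `x` at `index + k*stride`, and `iou` is a generic center-format intersection over union written for this example rather than a port of Darknet's `box_iou`.

```go
package main

import (
	"fmt"
	"math"
)

// box is a center-format bounding box (hypothetical Go counterpart of the C struct).
type box struct{ x, y, w, h float64 }

// overlap returns the 1-D overlap of two segments given by center and length.
func overlap(c1, l1, c2, l2 float64) float64 {
	left := math.Max(c1-l1/2, c2-l2/2)
	right := math.Min(c1+l1/2, c2+l2/2)
	return right - left
}

// iou returns the intersection over union of two center-format boxes.
func iou(a, b box) float64 {
	w := overlap(a.x, a.w, b.x, b.w)
	h := overlap(a.y, a.h, b.y, b.h)
	if w <= 0 || h <= 0 {
		return 0
	}
	inter := w * h
	return inter / (a.w*a.h + b.w*b.h - inter)
}

// boxDeltas mirrors the target encoding of delta_yolo_box. truth is normalized
// to [0,1], (i, j) is the grid cell, lw x lh the grid size, w x h the network
// input size, anchorW/anchorH stand for biases[2*n] and biases[2*n+1], and out
// holds the four layer outputs for this cell and anchor.
func boxDeltas(truth box, out [4]float64, anchorW, anchorH float64, i, j, lw, lh, w, h int, scale float64) [4]float64 {
	tx := truth.x*float64(lw) - float64(i)
	ty := truth.y*float64(lh) - float64(j)
	tw := math.Log(truth.w * float64(w) / anchorW)
	th := math.Log(truth.h * float64(h) / anchorH)
	return [4]float64{
		scale * (tx - out[0]),
		scale * (ty - out[1]),
		scale * (tw - out[2]),
		scale * (th - out[3]),
	}
}

func main() {
	truth := box{x: 0.5, y: 0.5, w: 0.25, h: 0.25}
	// Illustrative values: 13x13 grid, 416x416 input, anchor 116x90, cell (6, 6).
	d := boxDeltas(truth, [4]float64{}, 116, 90, 6, 6, 13, 13, 416, 416, 1)
	fmt.Println(d, iou(truth, truth))
}
```

Keeping the encoding in a pure function like this makes it easy to feed the same inputs to the C implementation and compare the four deltas number by number.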

classes - https://github.com/AlexeyAB/darknet/blob/master/src/yolo_layer.c#L276:

void delta_yolo_class(float *output, float *delta, int index, int class, int classes, int stride, float *avg_cat)
{
    int n;
    if (delta[index]){
        delta[index + stride*class] = 1 - output[index + stride*class];
        if(avg_cat) *avg_cat += output[index + stride*class];
        return;
    }
    for(n = 0; n < classes; ++n){
        delta[index + stride*n] = ((n == class)?1 : 0) - output[index + stride*n];
        if(n == class && avg_cat) *avg_cat += output[index + stride*n];
    }
}
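
The class delta logic above could be sketched in Go along the same lines. This is a simplified illustration, not the repository's code: `classDeltas` is a hypothetical name, the slices cover the class scores of a single cell and anchor (so `index` is 0 and `stride` is 1), and the score of the true class is returned instead of being accumulated through the optional `avg_cat` pointer.

```go
package main

import "fmt"

// classDeltas mirrors delta_yolo_class for one cell/anchor. output holds the
// per-class scores and delta the per-class gradients, which are updated in
// place. It returns the score of the true class (what the C code adds to avg_cat).
func classDeltas(output, delta []float64, class int) float64 {
	// A non-zero first delta means another truth already wrote this cell:
	// only the true-class entry is refreshed, as in the C code.
	if delta[0] != 0 {
		delta[class] = 1 - output[class]
		return output[class]
	}
	for n := range output {
		target := 0.0
		if n == class {
			target = 1
		}
		delta[n] = target - output[n]
	}
	return output[class]
}

func main() {
	output := []float64{0.2, 0.7, 0.1}
	delta := make([]float64, len(output))
	score := classDeltas(output, delta, 1)
	fmt.Println(delta, score)
}
```

The early-return branch is the part most worth comparing against the C implementation, since it changes the result when two ground-truth boxes land in the same cell and anchor.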

Describe alternatives you've considered and provide pseudocode examples if you can
nope

Additional context
nope
