
zqpei / dssd

56 stars · 3 watchers · 20 forks · 1.22 MB

Pytorch implementation of DSSD (Deconvolutional Single Shot Detector)

License: MIT License

Python 93.15% C++ 2.40% C 0.17% Cuda 3.27% Shell 1.02%
dssd object-detection one-stage one-shot-object-detection

dssd's Introduction

Hi there 👋, this is ZQPei's GitHub.


dssd's People

Contributors

alexey-gruzdev, beibinli, huaizhengzhang, lufficc, tkhe, zqpei


dssd's Issues

Questions about `decoder`

In the forward pass in DSSD/dssd/modeling/decoder/decoder.py, `features[-2-level]` is overwritten on every loop iteration. Does this work better than using the ORIGINAL feature that comes from ResNet?
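As I read it, the question contrasts two wirings of the deconvolution path. A minimal sketch of both (names are illustrative, not the actual decoder.py; `deconv` stands for an elementwise-fusion module as in the DSSD paper):

```python
def decode_in_place(features, deconv_modules):
    # What the repo appears to do: features[-2 - level] is overwritten each
    # iteration, so every fusion consumes the previously refined map.
    features = list(features)
    for level, deconv in enumerate(deconv_modules):
        features[-2 - level] = deconv(features[-1 - level], features[-2 - level])
    return features

def decode_from_original(features, deconv_modules):
    # The alternative the question asks about: every level deconvolves the
    # ORIGINAL backbone feature instead of the refined one.
    out = list(features)
    for level, deconv in enumerate(deconv_modules):
        out[-2 - level] = deconv(features[-1 - level], features[-2 - level])
    return out
```

For what it's worth, the chained (in-place) form matches the DSSD paper, where each deconvolution module consumes the previous module's output.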

Another question: is this repo adapted from mask-rcnn? The code architecture looks very familiar.

I really appreciate your work. Thank you very much.

prior_box

Thank you for sharing the code. I would like to ask a question: do you use the prior box during training?
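For context: in SSD-family code the priors are indeed used during training. Each prior is matched to a ground-truth box, and the classification and regression targets are built per prior. A minimal sketch of the usual matching step (illustrative, not the repo's exact implementation):

```python
import torch
from torchvision.ops import box_iou

def assign_priors(gt_boxes, gt_labels, priors, iou_threshold=0.5):
    """Match each prior to a ground-truth box (corner-form tensors).

    gt_boxes: (N, 4), gt_labels: (N,), priors: (P, 4).
    Returns per-prior regression targets and class labels (0 = background).
    """
    ious = box_iou(gt_boxes, priors)            # (N, P) pairwise IoU
    best_gt_iou, best_gt_idx = ious.max(dim=0)  # best ground truth per prior
    labels = gt_labels[best_gt_idx].clone()
    labels[best_gt_iou < iou_threshold] = 0     # low-overlap priors -> background
    return gt_boxes[best_gt_idx], labels
```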

How can I add a new backbone network to DSSD

I tried to add another backbone network and modified the code accordingly, but it does not work.
Please help me check how to add a new backbone to this DSSD network. My attempts and the errors they produce follow.

```python
from torch import nn

from dssd.modeling import registry
from dssd.utils.model_zoo import load_state_dict_from_url
from torchsummary import summary

model_urls = {
    'mobilenet_v2': 'https://download.pytorch.org/models/mobilenet_v2-b0353104.pth',
}


class ConvBNReLU(nn.Sequential):
    def __init__(self, in_planes, out_planes, kernel_size=3, stride=1, groups=1):
        padding = (kernel_size - 1) // 2
        super(ConvBNReLU, self).__init__(
            nn.Conv2d(in_planes, out_planes, kernel_size, stride, padding, groups=groups, bias=False),
            nn.BatchNorm2d(out_planes),
            nn.ReLU6(inplace=True)
        )


class InvertedResidual(nn.Module):
    def __init__(self, inp, oup, stride, expand_ratio):
        super(InvertedResidual, self).__init__()
        self.stride = stride
        assert stride in [1, 2]

        hidden_dim = int(round(inp * expand_ratio))
        self.use_res_connect = self.stride == 1 and inp == oup

        layers = []
        if expand_ratio != 1:
            # pw
            layers.append(ConvBNReLU(inp, hidden_dim, kernel_size=1))
        layers.extend([
            # dw
            ConvBNReLU(hidden_dim, hidden_dim, stride=stride, groups=hidden_dim),
            # pw-linear
            nn.Conv2d(hidden_dim, oup, 1, 1, 0, bias=False),
            nn.BatchNorm2d(oup),
        ])
        self.conv = nn.Sequential(*layers)

    def forward(self, x):
        if self.use_res_connect:
            return x + self.conv(x)
        else:
            return self.conv(x)


class MobileNetV2(nn.Module):
    def __init__(self, width_mult=1.0, inverted_residual_setting=None):
        super(MobileNetV2, self).__init__()
        block = InvertedResidual
        input_channel = 32
        last_channel = 1280

        if inverted_residual_setting is None:
            inverted_residual_setting = [
                # t, c, n, s
                [1, 16, 1, 1],
                [6, 24, 2, 2],
                [6, 32, 3, 2],
                [6, 64, 4, 2],
                [6, 96, 3, 1],
                [6, 160, 3, 2],
                [6, 320, 1, 1],
            ]

        # only check the first element, assuming user knows t,c,n,s are required
        if len(inverted_residual_setting) == 0 or len(inverted_residual_setting[0]) != 4:
            raise ValueError("inverted_residual_setting should be non-empty "
                             "or a 4-element list, got {}".format(inverted_residual_setting))

        # building first layer
        input_channel = int(input_channel * width_mult)
        self.last_channel = int(last_channel * max(1.0, width_mult))
        features = [ConvBNReLU(3, input_channel, stride=2)]
        # building inverted residual blocks
        for t, c, n, s in inverted_residual_setting:
            output_channel = int(c * width_mult)
            for i in range(n):
                stride = s if i == 0 else 1
                features.append(block(input_channel, output_channel, stride, expand_ratio=t))
                input_channel = output_channel
        # building last several layers
        features.append(ConvBNReLU(input_channel, self.last_channel, kernel_size=1))
        # make it nn.Sequential
        self.features = nn.Sequential(*features)
        self.extras = nn.ModuleList([
            InvertedResidual(1280, 512, 2, 0.2),
            InvertedResidual(512, 256, 2, 0.25),
            InvertedResidual(256, 256, 2, 0.5),
            InvertedResidual(256, 64, 2, 0.25)
        ])

        self.reset_parameters()

    def reset_parameters(self):
        # weight initialization
        for m in self.modules():
            if isinstance(m, nn.Conv2d):
                nn.init.kaiming_normal_(m.weight, mode='fan_out')
                if m.bias is not None:
                    nn.init.zeros_(m.bias)
            elif isinstance(m, nn.BatchNorm2d):
                nn.init.ones_(m.weight)
                nn.init.zeros_(m.bias)
            elif isinstance(m, nn.Linear):
                nn.init.normal_(m.weight, 0, 0.01)
                nn.init.zeros_(m.bias)

    def forward(self, x):
        features = []
        for i in range(14):
            x = self.features[i](x)
        features.append(x)

        for i in range(14, len(self.features)):
            x = self.features[i](x)
        features.append(x)

        for i in range(len(self.extras)):
            x = self.extras[i](x)
            features.append(x)

        return tuple(features)


@registry.BACKBONES.register('mobilenet_v2')
def mobilenet_v2(cfg, pretrained=True):
    model = MobileNetV2()
    if pretrained:
        model.load_state_dict(load_state_dict_from_url(model_urls['mobilenet_v2']), strict=False)
    return model


if __name__ == '__main__':
    darknet = MobileNetV2().cuda()
    summary(darknet, (3, 320, 320))
```

RuntimeError: The size of tensor a (3) must match the size of tensor b (2) at non-singleton dimension 3
```python
import torch
import torch.nn as nn
import torch.nn.functional as F

from torchsummary import summary

from dssd.layers import L2Norm
from dssd.modeling import registry
from dssd.utils.model_zoo import load_state_dict_from_url

model_urls = {
    'vgg': 'https://s3.amazonaws.com/amdegroot-models/vgg16_reducedfc.pth',
}


# borrowed from https://github.com/amdegroot/ssd.pytorch/blob/master/ssd.py
def add_vgg(cfg, batch_norm=False):
    layers = []
    in_channels = 3
    for v in cfg:
        if v == 'M':
            layers += [nn.MaxPool2d(kernel_size=2, stride=2)]
        elif v == 'C':
            layers += [nn.MaxPool2d(kernel_size=2, stride=2, ceil_mode=True)]
        else:
            conv2d = nn.Conv2d(in_channels, v, kernel_size=3, padding=1)
            if batch_norm:
                layers += [conv2d, nn.BatchNorm2d(v), nn.ReLU(inplace=True)]
            else:
                layers += [conv2d, nn.ReLU(inplace=True)]
            in_channels = v
    pool5 = nn.MaxPool2d(kernel_size=3, stride=1, padding=1)
    conv6 = nn.Conv2d(512, 1024, kernel_size=3, padding=6, dilation=6)
    conv7 = nn.Conv2d(1024, 1024, kernel_size=1)
    layers += [pool5, conv6,
               nn.ReLU(inplace=True), conv7, nn.ReLU(inplace=True)]
    return layers


def add_extras(cfg, i, size=300):
    # Extra layers added to VGG for feature scaling
    layers = []
    in_channels = i
    flag = False
    for k, v in enumerate(cfg):
        if in_channels != 'S':
            if v == 'S':
                layers += [nn.Conv2d(in_channels, cfg[k + 1], kernel_size=(1, 3)[flag], stride=2, padding=1)]
            else:
                layers += [nn.Conv2d(in_channels, v, kernel_size=(1, 3)[flag])]
            flag = not flag
        in_channels = v
    if size == 512:
        layers.append(nn.Conv2d(in_channels, 128, kernel_size=1, stride=1))
        layers.append(nn.Conv2d(128, 256, kernel_size=4, stride=1, padding=1))
    return layers


vgg_base = {
    '300': [64, 64, 'M', 128, 128, 'M', 256, 256, 256, 'C', 512, 512, 512, 'M',
            512, 512, 512],
    '512': [64, 64, 'M', 128, 128, 'M', 256, 256, 256, 'C', 512, 512, 512, 'M',
            512, 512, 512],
}
extras_base = {
    '300': [256, 'S', 512, 128, 'S', 256, 128, 256, 128, 256],
    '512': [256, 'S', 512, 128, 'S', 256, 128, 'S', 256, 128, 'S', 256],
}


class VGG(nn.Module):
    def __init__(self, cfg):
        super(VGG, self).__init__()
        size = cfg.INPUT.IMAGE_SIZE
        vgg_config = vgg_base[str(size)]
        extras_config = extras_base[str(size)]

        self.vgg = nn.ModuleList(add_vgg(vgg_config))
        self.extras = nn.ModuleList(add_extras(extras_config, i=1024, size=size))
        self.l2_norm = L2Norm(512, scale=20)
        self.reset_parameters()

    def reset_parameters(self):
        for m in self.extras.modules():
            if isinstance(m, nn.Conv2d):
                nn.init.xavier_uniform_(m.weight)
                nn.init.zeros_(m.bias)

    def init_from_pretrain(self, state_dict):
        self.vgg.load_state_dict(state_dict)

    def forward(self, x):
        features = []
        for i in range(23):
            x = self.vgg[i](x)
        s = self.l2_norm(x)  # Conv4_3 L2 normalization
        features.append(s)

        # apply vgg up to fc7
        for i in range(23, len(self.vgg)):
            x = self.vgg[i](x)
        features.append(x)

        for k, v in enumerate(self.extras):
            x = F.relu(v(x), inplace=True)
            if k % 2 == 1:
                features.append(x)

        return tuple(features)


@registry.BACKBONES.register('vgg')
def vgg(cfg, pretrained=True):
    model = VGG(cfg)
    if pretrained:
        model.init_from_pretrain(load_state_dict_from_url(model_urls['vgg']))
    return model


if __name__ == '__main__':
    darknet = VGG(300).cuda()
    summary(darknet, (3, 320, 320))
```
RuntimeError: The size of tensor a (20) must match the size of tensor b (19) at non-singleton dimension 3
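Both RuntimeErrors above look like spatial-size mismatches inside the deconvolution/fusion modules rather than problems with the backbones themselves: an exact ×2 deconvolution can only be fused elementwise with a lateral map whose side is exactly half its output, and both custom backbones produce odd intermediate sizes. A quick sanity check with plain convolution arithmetic (a sketch, assuming the kernel-3, stride-2, padding-1 downsampling used in the extra blocks above):

```python
# Conv output size: out = floor((in + 2*pad - kernel) / stride) + 1
def down(size, kernel=3, stride=2, pad=1):
    return (size + 2 * pad - kernel) // stride + 1

sizes = [10]                    # 320 / 32: last MobileNetV2 trunk map
for _ in range(4):              # the four stride-2 extra blocks above
    sizes.append(down(sizes[-1]))
print(sizes)                    # [10, 5, 3, 2, 1]: 5 -> 3 -> 2 breaks exact x2 fusion
```

Which pair collides first depends on the deconvolution's kernel/stride/padding, but the remedy is the same: make the new backbone emit the same pyramid of spatial sizes that the decoder's deconvolution modules were configured for (for the stock 320 setup, the sizes the ResNet backbone produces), or adjust the deconvolution parameters to match your pyramid.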

The deconvolution module's forward seems to have a bug

In deconv_module.py, when `self.elementwise_type == "prod"`, shouldn't it return `self.relu(y_deconv * y_conv)` instead?

```python
def forward(self, x_deconv, x_conv):
    y_deconv = self.deconv_layer(x_deconv)
    y_conv = self.conv_layer(x_conv)
    if self.elementwise_type == "sum":
        return self.relu(y_deconv + y_conv)
    elif self.elementwise_type == "prod":
        return self.relu(y_deconv + y_conv)  # bug: identical to the "sum" branch
```
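For reference, the reported fix is a one-character change in the `prod` branch:

```python
def forward(self, x_deconv, x_conv):
    y_deconv = self.deconv_layer(x_deconv)
    y_conv = self.conv_layer(x_conv)
    if self.elementwise_type == "sum":
        return self.relu(y_deconv + y_conv)
    elif self.elementwise_type == "prod":
        return self.relu(y_deconv * y_conv)  # elementwise product, as intended
```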

How to run DSSD520

Your GitHub repo only has resnet101_dssd320_voc0712.yaml; there is no 520 config.
How can I run DSSD520?
Thanks

Model Center Variance and Size Variance

I'm trying to run this DSSD implementation on the BDD dataset, which has 720x1280 images.

I started with input size 320 because many of the model parameters are defined for it. I'm trying to understand the center variance and size variance used in defaults.py. I can see them being used in box_utils.py, but can you please help me understand them?

  1. Are they model parameters, or do they depend on the choice of input? Given my dataset and input size 320, do I need to change them?

  2. Also, if I provide my ground-truth box coordinates (x1, y1, x2, y2) relative to the actual input image in the dataset (720x1280), they (gt_boxes) will be normalised to the model input size (320), right?
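On question 1: in SSD-family implementations these two numbers are hyperparameters of the box encoding (they rescale the regression targets), not functions of the input resolution, so the usual 0.1/0.2 defaults are normally kept when only the input size or dataset changes. A sketch of the standard encode/decode they enter (illustrative, not necessarily the repo's exact box_utils.py):

```python
import torch

def encode(gt, priors, center_variance=0.1, size_variance=0.2):
    # gt, priors: (..., 4) in center form (cx, cy, w, h)
    d_cxcy = (gt[..., :2] - priors[..., :2]) / priors[..., 2:] / center_variance
    d_wh = torch.log(gt[..., 2:] / priors[..., 2:]) / size_variance
    return torch.cat([d_cxcy, d_wh], dim=-1)   # regression target per prior

def decode(loc, priors, center_variance=0.1, size_variance=0.2):
    # inverse transform applied to the network's raw box regressions
    cxcy = loc[..., :2] * center_variance * priors[..., 2:] + priors[..., :2]
    wh = torch.exp(loc[..., 2:] * size_variance) * priors[..., 2:]
    return torch.cat([cxcy, wh], dim=-1)
```

On question 2: in SSD-style pipelines the ground-truth corners are typically transformed together with the image resize, so 720x1280 annotations end up expressed in the resized/normalized coordinate frame the 320x320 model sees.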
