zqpei / dssd
PyTorch implementation of DSSD (Deconvolutional Single Shot Detector)
License: MIT License
In the forward pass in DSSD/dssd/modeling/decoder/decoder.py, features[-2-level] is updated in each iteration of the loop. Does this work better than using the original features that come from ResNet?
Another question: is this repo modified from mask-rcnn? The code architecture looks very familiar.
I really appreciate your work. Thank you very much.
Hi,
Could you please provide the model size?
That would make it easy for us to compare it with DSSD.
Thanks,
Chris
Thank you for sharing the code. I would like to ask a question: do you use the prior boxes during training?
DSSD/dssd/data/transforms/transforms.py
Line 129 in ac3e775
Why are the bounding boxes not changed by the resize?
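One possible explanation, assuming (this is not confirmed from the source at that commit) that an earlier transform has already converted the boxes to fractions of the image width and height: fractional coordinates stay the same when the image is resized. The helper names below are hypothetical and only illustrate the idea.

```python
def to_percent_coords(boxes, width, height):
    """Convert absolute (x1, y1, x2, y2) pixel boxes to fractions of the image size."""
    return [(x1 / width, y1 / height, x2 / width, y2 / height)
            for x1, y1, x2, y2 in boxes]


def to_absolute_coords(boxes, width, height):
    """Convert fractional (x1, y1, x2, y2) boxes back to pixels for a given image size."""
    return [(x1 * width, y1 * height, x2 * width, y2 * height)
            for x1, y1, x2, y2 in boxes]


# A box on a 1280x720 (width x height) image.
boxes = [(320.0, 180.0, 960.0, 540.0)]
percent = to_percent_coords(boxes, width=1280, height=720)
print(percent)   # [(0.25, 0.25, 0.75, 0.75)]

# Resizing the image to 320x320 does not touch the fractional boxes;
# pixel coordinates are only recovered when needed.
resized = to_absolute_coords(percent, width=320, height=320)
print(resized)   # [(80.0, 80.0, 240.0, 240.0)]
```

If the boxes are still in pixels at the point where the image is resized, they would need to be scaled by the same factors as the image.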
I tried to add another network as the backbone and modified the network, but it does not work.
Please help me check how I can add it to this DSSD network.
```python
from torch import nn
from dssd.modeling import registry
from dssd.utils.model_zoo import load_state_dict_from_url
from torchsummary import summary

model_urls = {
    'mobilenet_v2': 'https://download.pytorch.org/models/mobilenet_v2-b0353104.pth',
}


class ConvBNReLU(nn.Sequential):
    def __init__(self, in_planes, out_planes, kernel_size=3, stride=1, groups=1):
        padding = (kernel_size - 1) // 2
        super(ConvBNReLU, self).__init__(
            nn.Conv2d(in_planes, out_planes, kernel_size, stride, padding, groups=groups, bias=False),
            nn.BatchNorm2d(out_planes),
            nn.ReLU6(inplace=True)
        )


class InvertedResidual(nn.Module):
    def __init__(self, inp, oup, stride, expand_ratio):
        super(InvertedResidual, self).__init__()
        self.stride = stride
        assert stride in [1, 2]
        hidden_dim = int(round(inp * expand_ratio))
        self.use_res_connect = self.stride == 1 and inp == oup
        layers = []
        if expand_ratio != 1:
            # pw
            layers.append(ConvBNReLU(inp, hidden_dim, kernel_size=1))
        layers.extend([
            # dw
            ConvBNReLU(hidden_dim, hidden_dim, stride=stride, groups=hidden_dim),
            # pw-linear
            nn.Conv2d(hidden_dim, oup, 1, 1, 0, bias=False),
            nn.BatchNorm2d(oup),
        ])
        self.conv = nn.Sequential(*layers)

    def forward(self, x):
        if self.use_res_connect:
            return x + self.conv(x)
        else:
            return self.conv(x)


class MobileNetV2(nn.Module):
    def __init__(self, width_mult=1.0, inverted_residual_setting=None):
        super(MobileNetV2, self).__init__()
        block = InvertedResidual
        input_channel = 32
        last_channel = 1280
        if inverted_residual_setting is None:
            inverted_residual_setting = [
                # t, c, n, s
                [1, 16, 1, 1],
                [6, 24, 2, 2],
                [6, 32, 3, 2],
                [6, 64, 4, 2],
                [6, 96, 3, 1],
                [6, 160, 3, 2],
                [6, 320, 1, 1],
            ]
        # only check the first element, assuming user knows t,c,n,s are required
        if len(inverted_residual_setting) == 0 or len(inverted_residual_setting[0]) != 4:
            raise ValueError("inverted_residual_setting should be non-empty "
                             "or a 4-element list, got {}".format(inverted_residual_setting))
        # building first layer
        input_channel = int(input_channel * width_mult)
        self.last_channel = int(last_channel * max(1.0, width_mult))
        features = [ConvBNReLU(3, input_channel, stride=2)]
        # building inverted residual blocks
        for t, c, n, s in inverted_residual_setting:
            output_channel = int(c * width_mult)
            for i in range(n):
                stride = s if i == 0 else 1
                features.append(block(input_channel, output_channel, stride, expand_ratio=t))
                input_channel = output_channel
        # building last several layers
        features.append(ConvBNReLU(input_channel, self.last_channel, kernel_size=1))
        # make it nn.Sequential
        self.features = nn.Sequential(*features)
        self.extras = nn.ModuleList([
            InvertedResidual(1280, 512, 2, 0.2),
            InvertedResidual(512, 256, 2, 0.25),
            InvertedResidual(256, 256, 2, 0.5),
            InvertedResidual(256, 64, 2, 0.25)
        ])
        self.reset_parameters()

    def reset_parameters(self):
        # weight initialization
        for m in self.modules():
            if isinstance(m, nn.Conv2d):
                nn.init.kaiming_normal_(m.weight, mode='fan_out')
                if m.bias is not None:
                    nn.init.zeros_(m.bias)
            elif isinstance(m, nn.BatchNorm2d):
                nn.init.ones_(m.weight)
                nn.init.zeros_(m.bias)
            elif isinstance(m, nn.Linear):
                nn.init.normal_(m.weight, 0, 0.01)
                nn.init.zeros_(m.bias)

    def forward(self, x):
        features = []
        for i in range(14):
            x = self.features[i](x)
        features.append(x)
        for i in range(14, len(self.features)):
            x = self.features[i](x)
        features.append(x)
        for i in range(len(self.extras)):
            x = self.extras[i](x)
            features.append(x)
        return tuple(features)


@registry.BACKBONES.register('mobilenet_v2')
def mobilenet_v2(cfg, pretrained=True):
    model = MobileNetV2()
    if pretrained:
        model.load_state_dict(load_state_dict_from_url(model_urls['mobilenet_v2']), strict=False)
    return model


if __name__ == '__main__':
    darknet = MobileNetV2().cuda()
    summary(darknet, (3, 320, 320))
```
RuntimeError: The size of tensor a (3) must match the size of tensor b (2) at non-singleton dimension 3
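One way to investigate this kind of mismatch is to compute the spatial size of every feature map the backbone returns and compare it with what the decoder expects. The sketch below uses the standard convolution output-size formula and mirrors the stride-2 steps in the posted MobileNetV2 code (3x3 kernels with padding 1). The function names are hypothetical, and the decoder's own settings are not shown in the post, so they are not modeled here.

```python
def conv_out(size, kernel=3, stride=1, padding=0):
    """Output size of a convolution along one spatial dimension (floor rounding)."""
    return (size + 2 * padding - kernel) // stride + 1


def mobilenet_feature_sizes(input_size, num_extras=4):
    """Spatial sizes of the maps returned by the posted forward(), in order."""
    size = input_size
    # features[0:14] contains four stride-2 steps: the stem conv and the
    # first blocks of the 24-, 32- and 64-channel stages (overall stride 16).
    for _ in range(4):
        size = conv_out(size, kernel=3, stride=2, padding=1)
    sizes = [size]
    # features[14:] contains one more stride-2 block (the 160-channel stage).
    size = conv_out(size, kernel=3, stride=2, padding=1)
    sizes.append(size)
    # Each extra InvertedResidual uses a stride-2 depthwise 3x3 conv.
    for _ in range(num_extras):
        size = conv_out(size, kernel=3, stride=2, padding=1)
        sizes.append(size)
    return sizes


print(mobilenet_feature_sizes(320))  # [20, 10, 5, 3, 2, 1]
print(mobilenet_feature_sizes(300))  # [19, 10, 5, 3, 2, 1]
```

The tail of this chain is 3, 2, 1. If the decoder's deconvolution layers were configured for a backbone whose maps shrink differently (an assumption, since that configuration is not part of the post), a "3 vs 2" mismatch like the one reported would be the expected symptom.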
```python
import torch
import torch.nn as nn
import torch.nn.functional as F
from torchsummary import summary
from dssd.layers import L2Norm
from dssd.modeling import registry
from dssd.utils.model_zoo import load_state_dict_from_url

model_urls = {
    'vgg': 'https://s3.amazonaws.com/amdegroot-models/vgg16_reducedfc.pth',
}


def add_vgg(cfg, batch_norm=False):
    layers = []
    in_channels = 3
    for v in cfg:
        if v == 'M':
            layers += [nn.MaxPool2d(kernel_size=2, stride=2)]
        elif v == 'C':
            layers += [nn.MaxPool2d(kernel_size=2, stride=2, ceil_mode=True)]
        else:
            conv2d = nn.Conv2d(in_channels, v, kernel_size=3, padding=1)
            if batch_norm:
                layers += [conv2d, nn.BatchNorm2d(v), nn.ReLU(inplace=True)]
            else:
                layers += [conv2d, nn.ReLU(inplace=True)]
            in_channels = v
    pool5 = nn.MaxPool2d(kernel_size=3, stride=1, padding=1)
    conv6 = nn.Conv2d(512, 1024, kernel_size=3, padding=6, dilation=6)
    conv7 = nn.Conv2d(1024, 1024, kernel_size=1)
    layers += [pool5, conv6,
               nn.ReLU(inplace=True), conv7, nn.ReLU(inplace=True)]
    return layers


def add_extras(cfg, i, size=300):
    # Extra layers added to VGG for feature scaling
    layers = []
    in_channels = i
    flag = False
    for k, v in enumerate(cfg):
        if in_channels != 'S':
            if v == 'S':
                layers += [nn.Conv2d(in_channels, cfg[k + 1], kernel_size=(1, 3)[flag], stride=2, padding=1)]
            else:
                layers += [nn.Conv2d(in_channels, v, kernel_size=(1, 3)[flag])]
            flag = not flag
        in_channels = v
    if size == 512:
        layers.append(nn.Conv2d(in_channels, 128, kernel_size=1, stride=1))
        layers.append(nn.Conv2d(128, 256, kernel_size=4, stride=1, padding=1))
    return layers


vgg_base = {
    '300': [64, 64, 'M', 128, 128, 'M', 256, 256, 256, 'C', 512, 512, 512, 'M',
            512, 512, 512],
    '512': [64, 64, 'M', 128, 128, 'M', 256, 256, 256, 'C', 512, 512, 512, 'M',
            512, 512, 512],
}
extras_base = {
    '300': [256, 'S', 512, 128, 'S', 256, 128, 256, 128, 256],
    '512': [256, 'S', 512, 128, 'S', 256, 128, 'S', 256, 128, 'S', 256],
}


class VGG(nn.Module):
    def __init__(self, cfg):
        super(VGG, self).__init__()
        size = cfg.INPUT.IMAGE_SIZE
        vgg_config = vgg_base[str(size)]
        extras_config = extras_base[str(size)]
        self.vgg = nn.ModuleList(add_vgg(vgg_config))
        self.extras = nn.ModuleList(add_extras(extras_config, i=1024, size=size))
        self.l2_norm = L2Norm(512, scale=20)
        self.reset_parameters()

    def reset_parameters(self):
        for m in self.extras.modules():
            if isinstance(m, nn.Conv2d):
                nn.init.xavier_uniform_(m.weight)
                nn.init.zeros_(m.bias)

    def init_from_pretrain(self, state_dict):
        self.vgg.load_state_dict(state_dict)

    def forward(self, x):
        features = []
        for i in range(23):
            x = self.vgg[i](x)
        s = self.l2_norm(x)  # Conv4_3 L2 normalization
        features.append(s)
        # apply vgg up to fc7
        for i in range(23, len(self.vgg)):
            x = self.vgg[i](x)
        features.append(x)
        for k, v in enumerate(self.extras):
            x = F.relu(v(x), inplace=True)
            if k % 2 == 1:
                features.append(x)
        return tuple(features)


@registry.BACKBONES.register('vgg')
def vgg(cfg, pretrained=True):
    model = VGG(cfg)
    if pretrained:
        model.init_from_pretrain(load_state_dict_from_url(model_urls['vgg']))
    return model


if __name__ == '__main__':
    darknet = VGG(300).cuda()
    summary(darknet, (3, 320, 320))
```
RuntimeError: The size of tensor a (20) must match the size of tensor b (19) at non-singleton dimension 3
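The same kind of size bookkeeping can be applied to the VGG backbone above. The sketch below follows the '300' layer configuration from the posted code, using the usual output-size formulas for pooling and convolution; the helper names are hypothetical, and the assumption that the decoder upsamples by exactly 2x is not confirmed by the post.

```python
import math


def pool_out(size, kernel=2, stride=2, padding=0, ceil_mode=False):
    """Output size of a max-pool layer along one dimension.

    Simplified: ignores the framework's corner-case adjustment for windows
    that would start outside the input, which these sizes do not trigger.
    """
    rounding = math.ceil if ceil_mode else math.floor
    return int(rounding((size + 2 * padding - kernel) / stride)) + 1


def conv_out(size, kernel, stride=1, padding=0):
    """Output size of a convolution along one dimension (floor rounding)."""
    return (size + 2 * padding - kernel) // stride + 1


def vgg300_feature_sizes(input_size):
    """Spatial sizes of the six maps returned by forward() for the '300' config."""
    size = pool_out(input_size)              # first 'M'
    size = pool_out(size)                    # second 'M'
    size = pool_out(size, ceil_mode=True)    # 'C'
    sizes = [size]                           # conv4_3 (after L2 norm)
    size = pool_out(size)                    # third 'M'
    # pool5 (3x3, stride 1, pad 1), conv6 (dilation 6, pad 6) and conv7 (1x1)
    # all preserve the spatial size.
    sizes.append(size)
    # Extras: the 1x1 convs preserve size; the 3x3 convs are two stride-2
    # convs with padding 1 followed by two unpadded stride-1 convs.
    for stride, padding in [(2, 1), (2, 1), (1, 0), (1, 0)]:
        size = conv_out(size, kernel=3, stride=stride, padding=padding)
        sizes.append(size)
    return sizes


print(vgg300_feature_sizes(300))  # [38, 19, 10, 5, 3, 1]
print(vgg300_feature_sizes(320))  # [40, 20, 10, 5, 3, 1]
```

With a 300x300 input, doubling the 10x10 map gives 20x20, which cannot be combined elementwise with the 19x19 map; that is consistent with the "20 vs 19" in the error message. With a 320x320 input the 10 to 20 step lines up, but the smaller maps (1, 3, 5) still do not double into each other.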
Thank you very much for your hard work. Do you have a VGG version?
In deconv_module.py, when self.elementwise_type == "prod", shouldn't it return self.relu(y_deconv * y_conv)?
    def forward(self, x_deconv, x_conv):
        y_deconv = self.deconv_layer(x_deconv)
        y_conv = self.conv_layer(x_conv)
        if self.elementwise_type == "sum":
            return self.relu(y_deconv + y_conv)
        elif self.elementwise_type == "prod":
            return self.relu(y_deconv + y_conv)
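The difference between the two branches can be illustrated with a framework-free sketch on plain Python lists. The `combine` and `relu` helpers are hypothetical stand-ins for the module's tensor operations, not code from the repository.

```python
def relu(values):
    """Elementwise ReLU on a list of floats."""
    return [max(0.0, v) for v in values]


def combine(y_deconv, y_conv, elementwise_type):
    """Merge two equally sized feature vectors by sum or by product, then apply ReLU."""
    if elementwise_type == "sum":
        merged = [a + b for a, b in zip(y_deconv, y_conv)]
    elif elementwise_type == "prod":
        merged = [a * b for a, b in zip(y_deconv, y_conv)]
    else:
        raise ValueError("unknown elementwise_type: {}".format(elementwise_type))
    return relu(merged)


y_deconv = [1.0, -2.0, 3.0]
y_conv = [2.0, 2.0, -1.0]
print(combine(y_deconv, y_conv, "sum"))   # [3.0, 0.0, 2.0]
print(combine(y_deconv, y_conv, "prod"))  # [2.0, 0.0, 0.0]
```

In the module itself, the corresponding change suggested in the question is to return `self.relu(y_deconv * y_conv)` in the "prod" branch.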
Your GitHub repository only has resnet101_dssd320_voc0712.yaml; there is no config for 520.
How do I run DSSD520?
Thanks
In deconv_module.py, in the forward function, the branch for self.elementwise_type == "prod" should multiply instead of sum.
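I am currently working on my graduation thesis, which is about lung nodule detection. My initial results are mediocre, and I hope to modify the model or add something to it to improve its performance. Thanks.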
I'm trying to run this DSSD implementation on the BDD dataset, which has images of 720x1280.
I started with input size 320 because many of the model parameters are defined for it. I'm trying to understand the center variance and size variance used in defaults.py. I can see that they are used in box_utils.py, but could you please help me understand them?
Are they model parameters, or do they depend on the choice of input? Given my dataset and an input size of 320, do I need to change them?
Also, if I provide my ground-truth box coordinates (x1, y1, x2, y2) relative to the actual input image in the dataset (720x1280), will they (gt_boxes) be normalised according to the model input size (320)?
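In the common SSD formulation, the center and size variances are fixed scaling constants applied to the box regression targets; they are not learned and do not depend on the input resolution. The sketch below assumes box_utils.py follows that convention. The default values 0.1 and 0.2 are the ones typically used with SSD, not values read from this repository, and the function names are hypothetical.

```python
import math


def encode_box(gt, prior, center_variance=0.1, size_variance=0.2):
    """Turn a ground-truth box into regression targets relative to a prior box.

    Both boxes are (cx, cy, w, h) in normalized [0, 1] image coordinates.
    """
    gcx, gcy, gw, gh = gt
    pcx, pcy, pw, ph = prior
    return (
        (gcx - pcx) / pw / center_variance,
        (gcy - pcy) / ph / center_variance,
        math.log(gw / pw) / size_variance,
        math.log(gh / ph) / size_variance,
    )


def decode_box(loc, prior, center_variance=0.1, size_variance=0.2):
    """Invert encode_box: recover (cx, cy, w, h) from regression outputs."""
    lx, ly, lw, lh = loc
    pcx, pcy, pw, ph = prior
    return (
        lx * center_variance * pw + pcx,
        ly * center_variance * ph + pcy,
        math.exp(lw * size_variance) * pw,
        math.exp(lh * size_variance) * ph,
    )


prior = (0.45, 0.55, 0.2, 0.2)
gt = (0.5, 0.5, 0.4, 0.2)
targets = encode_box(gt, prior)
recovered = decode_box(targets, prior)
print(targets)
print(recovered)
```

Because both the priors and the ground-truth boxes are expressed as fractions of the image, the same variances can be kept when the input size or dataset changes, as long as the ground-truth pixel coordinates are divided by the original image width and height (1280 and 720 here) rather than by the model input size.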