
DenseASPP for Semantic Segmentation in Street Scenes [pdf]

Introduction

Semantic image segmentation is a basic street-scene understanding task in autonomous driving, where each pixel in a high-resolution image is categorized into a set of semantic labels. Unlike other scenarios, objects in autonomous driving scenes exhibit very large scale changes, which poses great challenges for high-level feature representation in the sense that multi-scale information must be correctly encoded.

To remedy this problem, atrous convolution [2, 3] was introduced to generate features with larger receptive fields without sacrificing spatial resolution. Built upon atrous convolution, Atrous Spatial Pyramid Pooling (ASPP) [3] was proposed to concatenate multiple atrous-convolved features with different dilation rates into a final feature representation. Although ASPP is able to generate multi-scale features, we argue that the feature resolution along the scale axis is not dense enough for the autonomous driving scenario. To this end, we propose Densely connected Atrous Spatial Pyramid Pooling (DenseASPP), which connects a set of atrous convolutional layers in a dense way, such that it generates multi-scale features that not only cover a larger scale range, but also cover that scale range densely, without significantly increasing the model size. We evaluate DenseASPP on the street-scene benchmark Cityscapes [4] and achieve state-of-the-art performance.
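The dense connectivity idea can be sketched in a few lines of PyTorch. This is a minimal illustration, not the repository's exact implementation; the channel widths and dilation rates below are assumptions chosen for clarity. Each atrous layer consumes the concatenation of the backbone feature and all previous atrous outputs, so receptive-field scales compose densely.

```python
# Minimal sketch of a DenseASPP block (illustrative; widths/rates are assumptions).
import torch
import torch.nn as nn

class DenseASPPBlock(nn.Module):
    def __init__(self, in_ch, mid_ch, out_ch, rates=(3, 6, 12, 18, 24)):
        super().__init__()
        self.layers = nn.ModuleList()
        ch = in_ch
        for r in rates:
            self.layers.append(nn.Sequential(
                nn.Conv2d(ch, mid_ch, 1, bias=False),  # 1x1 bottleneck
                nn.BatchNorm2d(mid_ch),
                nn.ReLU(inplace=True),
                # padding == dilation keeps spatial size for a 3x3 kernel
                nn.Conv2d(mid_ch, out_ch, 3, padding=r, dilation=r, bias=False),
            ))
            ch += out_ch  # dense connectivity: each output is concatenated

    def forward(self, x):
        feats = [x]
        for layer in self.layers:
            feats.append(layer(torch.cat(feats, dim=1)))
        return torch.cat(feats, dim=1)

block = DenseASPPBlock(in_ch=256, mid_ch=64, out_ch=32)
out = block(torch.randn(1, 256, 32, 32))
print(out.shape)  # channels = 256 + 5 * 32 = 416, spatial size unchanged
```

Because later layers see the outputs of all earlier (smaller-rate) layers, the composed dilation rates cover many intermediate scales that a plain ASPP with the same handful of rates would miss.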

Usage

1. Clone the repository:

git clone https://github.com/DeepMotionAIResearch/DenseASPP.git

2. Download pretrained model:

Put the model in the `weights` folder. We provide some checkpoints to run the code:

DenseNet161-based model: GoogleDrive

MobileNet v2-based model: Coming soon.

Performance of these checkpoints:

| Checkpoint name    | Multi-scale inference | Cityscapes mIOU (val) | Cityscapes mIOU (test) | File size |
|--------------------|-----------------------|-----------------------|------------------------|-----------|
| DenseASPP161       | False                 | 79.9%                 | -                      | 142.7 MB  |
| DenseASPP161       | True                  | 80.6%                 | 79.5%                  | 142.7 MB  |
| MobileNetDenseASPP | False                 | 74.5%                 | -                      | 10.2 MB   |
| MobileNetDenseASPP | True                  | 75.0%                 | -                      | 10.2 MB   |

Please note that the performance of these checkpoints can be further improved by fine-tuning. Also note that these models were trained with PyTorch 0.3.1.

3. Inference

First cd to your code root, then run:

python demo.py --model_name DenseASPP161 --model_path <your checkpoint path> --img_dir <your img directory>

4. Evaluate the results

Please cd to ./utils, then run:

 python transfer.py

Then evaluate the results with the official Cityscapes evaluation code, which can be found here.

References

  1. DenseASPP for Semantic Segmentation in Street Scenes.
     Maoke Yang, Kun Yu, Chi Zhang, Zhiwei Li, Kuiyuan Yang.
     In CVPR, 2018.

  2. Semantic Image Segmentation with Deep Convolutional Nets and Fully Connected CRFs.
     Liang-Chieh Chen+, George Papandreou+, Iasonas Kokkinos, Kevin Murphy, Alan L. Yuille (+ equal contribution).
     In ICLR, 2015.

  3. DeepLab: Semantic Image Segmentation with Deep Convolutional Nets, Atrous Convolution, and Fully Connected CRFs.
     Liang-Chieh Chen+, George Papandreou+, Iasonas Kokkinos, Kevin Murphy, Alan L. Yuille (+ equal contribution).
     In TPAMI, 2017.

  4. The Cityscapes Dataset for Semantic Urban Scene Understanding.
     Marius Cordts, Mohamed Omran, Sebastian Ramos, Timo Rehfeld, Markus Enzweiler, Rodrigo Benenson, Uwe Franke, Stefan Roth, Bernt Schiele.
     In CVPR, 2016.


Issues

Pretrained model

Hi,

Thanks for your open source code! Nice work!

Could you please share the pre-trained DenseNet121 model with me? I would really appreciate it. Thanks for your help!

Something about BatchNorm2d

In your script, the `momentum` parameter of BatchNorm2d is set to 0.0003 (the default is 0.1). Do you use this value during training, or only for inference?
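For reference, PyTorch's `momentum` semantics can be checked directly. In PyTorch, the running statistics are updated as `running = (1 - momentum) * running + momentum * batch_stat`, so a momentum of 0.0003 makes the running estimates move extremely slowly; it only matters while training, since inference uses the stored statistics. A quick sketch (the layer shape and input are arbitrary for illustration):

```python
# Demonstrates how BatchNorm2d's `momentum` scales running-stat updates.
import torch
import torch.nn as nn

bn = nn.BatchNorm2d(1, momentum=0.0003)
bn.train()                            # running stats update only in train mode
x = torch.ones(8, 1, 4, 4) * 5.0      # batch mean is exactly 5.0
bn(x)
# running_mean moved only 0.0003 of the way from its initial 0.0 toward 5.0
print(bn.running_mean)                # ~0.0015
```

With the default momentum of 0.1, the same single batch would instead move the running mean to 0.5.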

Checkpoint of the MobileNet v2 based model

Hi,
Are you planning to release the checkpoint of the MobileNet v2 based model?

Without ImageNet pretraining, I can only get 54.8% on the Cityscapes val set.

Hoping for your pretrained model.

Thank you so much!

About the test performance on Cityscapes

Hi, I have read the paper. It reports 80.6% mIOU on the test set. Why does the released DenseASPP161 model achieve only 79.0% mIoU, as shown in the chart?
Thank you!

KeyError: 'module name can't contain "."'

Traceback (most recent call last):
  File "/home/oliver/PycharmProjects/semantic-segmentation-pytorch/DenseASPP/demo.py", line 18, in <module>
    infer = Inference(args.model_name, args.model_path)
  File "/home/oliver/PycharmProjects/semantic-segmentation-pytorch/DenseASPP/inference.py", line 28, in __init__
    self.seg_model = self.__init_model(model_name, model_path, is_local=False)
  File "/home/oliver/PycharmProjects/semantic-segmentation-pytorch/DenseASPP/inference.py", line 50, in __init_model
    seg_model = DenseASPP(Model_CFG, n_class=N_CLASS, output_stride=8)
  File "/home/oliver/PycharmProjects/semantic-segmentation-pytorch/DenseASPP/models/DenseASPP.py", line 40, in __init__
    bn_size=bn_size, growth_rate=growth_rate, drop_rate=drop_rate)
  File "/home/oliver/PycharmProjects/semantic-segmentation-pytorch/DenseASPP/models/DenseASPP.py", line 188, in __init__
    bn_size, drop_rate, dilation_rate=dilation_rate)
  File "/home/oliver/PycharmProjects/semantic-segmentation-pytorch/DenseASPP/models/DenseASPP.py", line 166, in __init__
    self.add_module('norm.1', bn(num_input_features)),
  File "/home/oliver/anaconda3/envs/pytorch/lib/python3.7/site-packages/torch/nn/modules/module.py", line 180, in add_module
    raise KeyError("module name can't contain \".\"")
KeyError: 'module name can't contain "."'

Environment:

  • pytorch 1.0.1
  • Cuda 10.1

After reading this link: taey16/pix2pixBEGAN.pytorch#7, I'm wondering whether replacing the dotted module names is the only way to solve this problem?

Can you give me some advice?

Thank you so much!
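For readers hitting the same error: newer PyTorch versions reject dots in submodule names, and the usual workaround (an assumption, not the authors' official patch) is to rename dotted names like `norm.1` to `norm_1` and remap the matching keys when loading a checkpoint saved with the old names. A sketch:

```python
# Workaround sketch for the "module name can't contain '.'" KeyError
# (illustrative; the state_dict key below is a made-up example).
import torch
import torch.nn as nn

layer = nn.Sequential()
layer.add_module('norm_1', nn.BatchNorm2d(64))  # 'norm.1' would raise KeyError

# Remap keys of a checkpoint saved with the old dotted names:
old_state = {'features.norm.1.weight': torch.ones(64)}
new_state = {k.replace('norm.1', 'norm_1'): v for k, v in old_state.items()}
print(list(new_state))  # ['features.norm_1.weight']
```

The same rename must be applied consistently in both the model definition and the checkpoint keys, otherwise `load_state_dict` will report missing/unexpected keys.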

RuntimeError: Need input.size[1] == 3 but got 4 instead.

When I run:

python demo.py --model_name DenseASPP161 --model_path ./weights/denseASPP161.pkl --img_dir ./img

I got this:

loading pre-trained weight
1.png
Traceback (most recent call last):
  File "demo.py", line 19, in <module>
    infer.folder_inference(args.img_dir, is_multiscale=False)
  File "/home/dl/code/DenseASPP/inference.py", line 71, in folder_inference
    pre = self.single_inference(img)
  File "/home/dl/code/DenseASPP/inference.py", line 94, in single_inference
    pre = self.seg_model.forward(image)
  File "/home/dl/code/DenseASPP/models/DenseASPP.py", line 114, in forward
    feature = self.features(_input)
  File "/home/dl/anaconda2/lib/python2.7/site-packages/torch/nn/modules/module.py", line 224, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/dl/anaconda2/lib/python2.7/site-packages/torch/nn/modules/container.py", line 67, in forward
    input = module(input)
  File "/home/dl/anaconda2/lib/python2.7/site-packages/torch/nn/modules/module.py", line 224, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/dl/anaconda2/lib/python2.7/site-packages/torch/nn/modules/conv.py", line 254, in forward
    self.padding, self.dilation, self.groups)
  File "/home/dl/anaconda2/lib/python2.7/site-packages/torch/nn/functional.py", line 52, in conv2d
    return f(input, weight, bias)
RuntimeError: Need input.size[1] == 3 but got 4 instead.

Do you know what is the problem? Thank you!
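A likely cause (an assumption based on the error, not a confirmed repo bug): the input PNG has an alpha channel, so the image tensor is 4-channel RGBA while the first convolution expects 3-channel RGB. Converting to RGB before inference is a common workaround:

```python
# Converting a 4-channel RGBA image to 3-channel RGB with Pillow
# (illustrative fix; the synthetic image below stands in for "1.png").
from PIL import Image
import numpy as np

img = Image.fromarray(np.zeros((8, 8, 4), dtype=np.uint8), mode='RGBA')
print(len(img.getbands()))   # 4 channels -> triggers the size[1] == 3 error
rgb = img.convert('RGB')
print(len(rgb.getbands()))   # 3 channels -> matches the conv's expectation
```

In practice this means calling `.convert('RGB')` on the PIL image right after loading it, before it is turned into a tensor.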

Bug: channel count in the first layer

I got an input of shape [1, 4, 1024, 2048] at the first layer; the 4 should be 3, but I haven't found the code that converts it.
