
DenseASPP for Semantic Segmentation in Street Scenes [pdf]

Introduction

Semantic image segmentation is a basic street-scene understanding task in autonomous driving, where each pixel in a high-resolution image is categorized into a set of semantic labels. Unlike other scenarios, objects in autonomous driving scenes exhibit very large scale changes, which poses great challenges for high-level feature representation in the sense that multi-scale information must be correctly encoded.

To remedy this problem, atrous convolution [2, 3] was introduced to generate features with larger receptive fields without sacrificing spatial resolution. Built upon atrous convolution, Atrous Spatial Pyramid Pooling (ASPP) [3] was proposed to concatenate multiple atrous-convolved features with different dilation rates into a final feature representation. Although ASPP is able to generate multi-scale features, we argue that the feature resolution along the scale axis is not dense enough for the autonomous driving scenario. To this end, we propose Densely connected Atrous Spatial Pyramid Pooling (DenseASPP), which connects a set of atrous convolutional layers in a dense way, such that it generates multi-scale features that not only cover a larger scale range, but also cover that scale range densely, without significantly increasing the model size. We evaluate DenseASPP on the street-scene benchmark Cityscapes [4] and achieve state-of-the-art performance.
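The dense connectivity idea can be sketched in a few lines of PyTorch. This is a minimal illustration, not the repository's exact implementation; the channel widths and dilation rates below are assumptions chosen for clarity. Each atrous layer consumes the concatenation of the backbone feature and all previous atrous outputs, so receptive-field scales compose densely.

```python
# Minimal sketch of a DenseASPP block (illustrative; widths/rates are assumptions).
import torch
import torch.nn as nn

class DenseASPPBlock(nn.Module):
    def __init__(self, in_ch, mid_ch, out_ch, rates=(3, 6, 12, 18, 24)):
        super().__init__()
        self.layers = nn.ModuleList()
        ch = in_ch
        for r in rates:
            self.layers.append(nn.Sequential(
                nn.Conv2d(ch, mid_ch, 1, bias=False),  # 1x1 bottleneck
                nn.BatchNorm2d(mid_ch),
                nn.ReLU(inplace=True),
                # padding == dilation keeps spatial size for a 3x3 kernel
                nn.Conv2d(mid_ch, out_ch, 3, padding=r, dilation=r, bias=False),
            ))
            ch += out_ch  # dense connectivity: each output is concatenated

    def forward(self, x):
        feats = [x]
        for layer in self.layers:
            feats.append(layer(torch.cat(feats, dim=1)))
        return torch.cat(feats, dim=1)

block = DenseASPPBlock(in_ch=256, mid_ch=64, out_ch=32)
out = block(torch.randn(1, 256, 32, 32))
print(out.shape)  # channels = 256 + 5 * 32 = 416, spatial size unchanged
```

Because later layers see the outputs of all earlier (smaller-rate) layers, the composed dilation rates cover many intermediate scales that a plain ASPP with the same handful of rates would miss.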

Usage

1. Clone the repository:

git clone https://github.com/DeepMotionAIResearch/DenseASPP.git

2. Download pretrained model:

Put the model in the `weights` folder. We provide some checkpoints to run the code:

DenseNet161-based model: GoogleDrive

MobileNet v2-based model: Coming soon.

Performance of these checkpoints:

| Checkpoint name    | Multi-scale inference | Cityscapes mIOU (val) | Cityscapes mIOU (test) | File size |
|--------------------|-----------------------|-----------------------|------------------------|-----------|
| DenseASPP161       | False                 | 79.9%                 | -                      | 142.7 MB  |
| DenseASPP161       | True                  | 80.6%                 | 79.5%                  | 142.7 MB  |
| MobileNetDenseASPP | False                 | 74.5%                 | -                      | 10.2 MB   |
| MobileNetDenseASPP | True                  | 75.0%                 | -                      | 10.2 MB   |

Please note that the performance of these checkpoints can be further improved by fine-tuning. Also note that these models were trained with PyTorch 0.3.1.

3. Inference

First cd to your code root, then run:

python demo.py --model_name DenseASPP161 --model_path <your checkpoint path> --img_dir <your img directory>

4. Evaluate the results

Please cd to ./utils, then run:

 python transfer.py

Then evaluate the results with the official Cityscapes evaluation code, which can be found here.

References

  1. DenseASPP for Semantic Segmentation in Street Scenes.
     Maoke Yang, Kun Yu, Chi Zhang, Zhiwei Li, Kuiyuan Yang.
     In CVPR, 2018.

  2. Semantic Image Segmentation with Deep Convolutional Nets and Fully Connected CRFs.
     Liang-Chieh Chen+, George Papandreou+, Iasonas Kokkinos, Kevin Murphy, Alan L. Yuille (+ equal contribution).
     In ICLR, 2015.

  3. DeepLab: Semantic Image Segmentation with Deep Convolutional Nets, Atrous Convolution, and Fully Connected CRFs.
     Liang-Chieh Chen+, George Papandreou+, Iasonas Kokkinos, Kevin Murphy, Alan L. Yuille (+ equal contribution).
     In TPAMI, 2017.

  4. The Cityscapes Dataset for Semantic Urban Scene Understanding.
     Marius Cordts, Mohamed Omran, Sebastian Ramos, Timo Rehfeld, Markus Enzweiler, Rodrigo Benenson, Uwe Franke, Stefan Roth, Bernt Schiele.
     In CVPR, 2016.


Issues

Pretrained model

Hi,

Thanks for your open source code! Nice work!

Could you please share the pre-trained DenseNet121 model with me? I would really appreciate it. Thanks for your help!

Something about BatchNorm2d

In your script, the `momentum` parameter of BatchNorm2d is set to 0.0003 (the default is 0.1). Do you use this value during training, or only for inference?
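For reference, PyTorch's `momentum` semantics can be checked directly. In PyTorch, the running statistics are updated as `running = (1 - momentum) * running + momentum * batch_stat`, so a momentum of 0.0003 makes the running estimates move extremely slowly; it only matters while training, since inference uses the stored statistics. A quick sketch (the layer shape and input are arbitrary for illustration):

```python
# Demonstrates how BatchNorm2d's `momentum` scales running-stat updates.
import torch
import torch.nn as nn

bn = nn.BatchNorm2d(1, momentum=0.0003)
bn.train()                            # running stats update only in train mode
x = torch.ones(8, 1, 4, 4) * 5.0      # batch mean is exactly 5.0
bn(x)
# running_mean moved only 0.0003 of the way from its initial 0.0 toward 5.0
print(bn.running_mean)                # ~0.0015
```

With the default momentum of 0.1, the same single batch would instead move the running mean to 0.5.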

Checkpoint of the MobileNet v2 based model

Hi,
Are you planning to release the checkpoint of the MobileNet v2 based model?

Without ImageNet pretraining, I can only get 54.8% on the Cityscapes val set.

Hoping for your pretrained model.

Thank you so much!

About the test performance on Cityscapes

Hi, I have read the paper. It reports 80.6% mIOU on the test set. Why does the released DenseASPP161 model achieve only 79.0% mIoU, as shown in the chart?
Thank you!

KeyError: 'module name can't contain "."'

Traceback (most recent call last):
  File "/home/oliver/PycharmProjects/semantic-segmentation-pytorch/DenseASPP/demo.py", line 18, in <module>
    infer = Inference(args.model_name, args.model_path)
  File "/home/oliver/PycharmProjects/semantic-segmentation-pytorch/DenseASPP/inference.py", line 28, in __init__
    self.seg_model = self.__init_model(model_name, model_path, is_local=False)
  File "/home/oliver/PycharmProjects/semantic-segmentation-pytorch/DenseASPP/inference.py", line 50, in __init_model
    seg_model = DenseASPP(Model_CFG, n_class=N_CLASS, output_stride=8)
  File "/home/oliver/PycharmProjects/semantic-segmentation-pytorch/DenseASPP/models/DenseASPP.py", line 40, in __init__
    bn_size=bn_size, growth_rate=growth_rate, drop_rate=drop_rate)
  File "/home/oliver/PycharmProjects/semantic-segmentation-pytorch/DenseASPP/models/DenseASPP.py", line 188, in __init__
    bn_size, drop_rate, dilation_rate=dilation_rate)
  File "/home/oliver/PycharmProjects/semantic-segmentation-pytorch/DenseASPP/models/DenseASPP.py", line 166, in __init__
    self.add_module('norm.1', bn(num_input_features)),
  File "/home/oliver/anaconda3/envs/pytorch/lib/python3.7/site-packages/torch/nn/modules/module.py", line 180, in add_module
    raise KeyError("module name can't contain \".\"")
KeyError: 'module name can't contain "."'

Environment:

  • pytorch 1.0.1
  • Cuda 10.1

After reading this link: taey16/pix2pixBEGAN.pytorch#7, I'm wondering whether replacing the dotted module names is the only way to solve this problem?

Can you give me some advice?

Thank you so much!
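For readers hitting the same error: newer PyTorch versions reject dots in submodule names, and the usual workaround (an assumption, not the authors' official patch) is to rename dotted names like `norm.1` to `norm_1` and remap the matching keys when loading a checkpoint saved with the old names. A sketch:

```python
# Workaround sketch for the "module name can't contain '.'" KeyError
# (illustrative; the state_dict key below is a made-up example).
import torch
import torch.nn as nn

layer = nn.Sequential()
layer.add_module('norm_1', nn.BatchNorm2d(64))  # 'norm.1' would raise KeyError

# Remap keys of a checkpoint saved with the old dotted names:
old_state = {'features.norm.1.weight': torch.ones(64)}
new_state = {k.replace('norm.1', 'norm_1'): v for k, v in old_state.items()}
print(list(new_state))  # ['features.norm_1.weight']
```

The same rename must be applied consistently in both the model definition and the checkpoint keys, otherwise `load_state_dict` will report missing/unexpected keys.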

RuntimeError: Need input.size[1] == 3 but got 4 instead.

When I run:

python demo.py --model_name DenseASPP161 --model_path ./weights/denseASPP161.pkl --img_dir ./img

I got this:

loading pre-trained weight
1.png
Traceback (most recent call last):
  File "demo.py", line 19, in <module>
    infer.folder_inference(args.img_dir, is_multiscale=False)
  File "/home/dl/code/DenseASPP/inference.py", line 71, in folder_inference
    pre = self.single_inference(img)
  File "/home/dl/code/DenseASPP/inference.py", line 94, in single_inference
    pre = self.seg_model.forward(image)
  File "/home/dl/code/DenseASPP/models/DenseASPP.py", line 114, in forward
    feature = self.features(_input)
  File "/home/dl/anaconda2/lib/python2.7/site-packages/torch/nn/modules/module.py", line 224, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/dl/anaconda2/lib/python2.7/site-packages/torch/nn/modules/container.py", line 67, in forward
    input = module(input)
  File "/home/dl/anaconda2/lib/python2.7/site-packages/torch/nn/modules/module.py", line 224, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/dl/anaconda2/lib/python2.7/site-packages/torch/nn/modules/conv.py", line 254, in forward
    self.padding, self.dilation, self.groups)
  File "/home/dl/anaconda2/lib/python2.7/site-packages/torch/nn/functional.py", line 52, in conv2d
    return f(input, weight, bias)
RuntimeError: Need input.size[1] == 3 but got 4 instead.

Do you know what is the problem? Thank you!
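A likely cause (an assumption based on the error, not a confirmed repo bug): the input PNG has an alpha channel, so the image tensor is 4-channel RGBA while the first convolution expects 3-channel RGB. Converting to RGB before inference is a common workaround:

```python
# Converting a 4-channel RGBA image to 3-channel RGB with Pillow
# (illustrative fix; the synthetic image below stands in for "1.png").
from PIL import Image
import numpy as np

img = Image.fromarray(np.zeros((8, 8, 4), dtype=np.uint8), mode='RGBA')
print(len(img.getbands()))   # 4 channels -> triggers the size[1] == 3 error
rgb = img.convert('RGB')
print(len(rgb.getbands()))   # 3 channels -> matches the conv's expectation
```

In practice this means calling `.convert('RGB')` on the PIL image right after loading it, before it is turned into a tensor.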

Bug: channel count in the first layer

I got an input of shape [1, 4, 1024, 2048] at the first layer; the 4 should be 3, but I haven't found the code that converts it.
