
danet's Introduction

Jun Fu, Jing Liu, Haijie Tian, Yong Li, Yongjun Bao, Zhiwei Fang, and Hanqing Lu

Introduction

We propose a Dual Attention Network (DANet) to adaptively integrate local features with their global dependencies based on the self-attention mechanism. We achieve new state-of-the-art segmentation performance on three challenging scene segmentation datasets, i.e., Cityscapes, PASCAL Context, and COCO Stuff-10k.


Cityscapes testing set result

We train our DANet-101 with only fine annotated data and submit our test results to the official evaluation server.


Updates

2020/9: Renewed the code, which now supports PyTorch 1.4.0 or later!

2020/8: The new TNNLS version DRANet achieves 82.9% on the Cityscapes test set (result submitted in August 2019), a new state-of-the-art performance using only the fine annotated dataset and ResNet-101. The code will be released in DRANet.

2020/7: DANet is supported in MMSegmentation, where it achieves 80.47% with single-scale testing and 82.02% with multi-scale testing on the Cityscapes val set.

2018/9: DANet released. The trained model with ResNet-101 achieves 81.5% on the Cityscapes test set.

Usage

  1. Install PyTorch

    • The code is tested with Python 3.6 and PyTorch 1.4.0.
    • The code is modified from PyTorch-Encoding.
  2. Clone the repository

    git clone https://github.com/junfu1115/DANet.git 
    cd DANet 
    python setup.py install
  3. Dataset

    • Download the Cityscapes dataset and convert the dataset to 19 categories (see the conversion sketch below).
    • Please put the dataset in the folder ./datasets
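
    A minimal, hedged sketch of such a conversion, assuming the official cityscapesscripts package is installed; the exact file layout the dataloader expects may differ, so treat this as a starting point rather than the repo's own script:

    import numpy as np
    from PIL import Image
    from cityscapesscripts.helpers.labels import labels  # official label table

    # Build a lookup from the 34 raw label IDs to the 19 train IDs;
    # everything else maps to the ignore index 255.
    id_to_trainid = np.full(256, 255, dtype=np.uint8)
    for label in labels:
        if 0 <= label.trainId < 19:
            id_to_trainid[label.id] = label.trainId

    def convert_label(src_path, dst_path):
        mask = np.array(Image.open(src_path))  # raw *_labelIds.png mask
        Image.fromarray(id_to_trainid[mask]).save(dst_path)
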
  4. Evaluation for DANet

    • Download the trained model DANet101 and put it in the folder ./experiments/segmentation/models/

    • cd ./experiments/segmentation/

    • For single scale testing, please run:

    • CUDA_VISIBLE_DEVICES=0,1,2,3 python test.py --dataset citys --model danet --backbone resnet101 --resume  models/DANet101.pth.tar --eval --base-size 2048 --crop-size 768 --workers 1 --multi-grid --multi-dilation 4 8 16 --os 8 --aux --no-deepstem
    • Evaluation Result

      The expected scores are as follows: DANet101 on the Cityscapes val set (mIoU/pAcc): 79.93/95.97 (ss)

  5. Evaluation for DRANet

    • Download the trained model DRANet101 and put it in the folder ./experiments/segmentation/models/

    • The evaluation code is in the folder ./experiments/segmentation/

    • cd ./experiments/segmentation/

    • For single scale testing, please run:

    • CUDA_VISIBLE_DEVICES=0,1,2,3 python test.py --dataset citys --model dran --backbone resnet101 --resume  models/dran101.pth.tar --eval --base-size 2048 --crop-size 768 --workers 1 --multi-grid --multi-dilation 4 8 16 --os 8 --aux
    • Evaluation Result

      The expected scores are as follows: DRANet101 on the Cityscapes val set (mIoU/pAcc): 81.63/96.62 (ss)

Citation

If you find DANet or DRANet useful in your research, please consider citing:

@article{fu2020scene,
  title={Scene Segmentation With Dual Relation-Aware Attention Network},
  author={Fu, Jun and Liu, Jing and Jiang, Jie and Li, Yong and Bao, Yongjun and Lu, Hanqing},
  journal={IEEE Transactions on Neural Networks and Learning Systems},
  year={2020},
  publisher={IEEE}
}
@inproceedings{fu2019dual,
  title={Dual attention network for scene segmentation},
  author={Fu, Jun and Liu, Jing and Tian, Haijie and Li, Yong and Bao, Yongjun and Fang, Zhiwei and Lu, Hanqing},
  booktitle={Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition},
  pages={3146--3154},
  year={2019}
}

Acknowledgement

Thanks to PyTorch-Encoding, especially for the Synchronized BN!

danet's People

Contributors

junfu1115, matthewpurri, mhamedlmarbouh, serend1p1ty, stacyyang, wydwww, zhanghang1989


danet's Issues

The value of self.gamma

Hi, when gamma is initialized to 0, how does it take on other values? I know it's a parameter that can change through learning, but I don't understand how it changes. I am confused, because I think its value is always 0.
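
For intuition, here is a minimal standalone sketch (not code from this repo) showing that a zero-initialized nn.Parameter still receives a gradient, so the optimizer moves it away from zero after the first update:

import torch
import torch.nn as nn

# gamma starts at zero, but the loss depends on it through gamma * out + x,
# so d(loss)/d(gamma) is generally non-zero and gamma changes after one step.
gamma = nn.Parameter(torch.zeros(1))
out = torch.randn(4)   # stands in for the attention output
x = torch.randn(4)     # stands in for the residual input
loss = ((gamma * out + x) ** 2).sum()
loss.backward()
print(gamma.grad)      # generally non-zero: sum(2 * (gamma * out + x) * out)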

Reproduced Performance Discussion

Very interesting!

I find that you employ three independent loss functions on the predictions of the "PAM branch" and "CAM branch".

I am wondering about the influence of the loss functions on these two branches, and it would be great if you could share the related experiments.

Besides, I want to share my reproduced performance with your PAM and CAM based on OCNet (specifically, we only employ one loss function and use lr=0.01 and wd=0.0005, following the default settings of OCNet).

We report the mIoU on the training set / validation set here: 82.0/74.4 (single scale).

Could anyone help us check where the problem is?

Here we also provide the network configuration,

class ResNet(nn.Module):
    def __init__(self, block, layers, num_classes):
        self.inplanes = 128
        super(ResNet, self).__init__()
        self.conv1 = conv3x3(3, 64, stride=2)
        self.bn1 = BatchNorm2d(64)
        self.relu1 = nn.ReLU(inplace=False)
        self.conv2 = conv3x3(64, 64)
        self.bn2 = BatchNorm2d(64)
        self.relu2 = nn.ReLU(inplace=False)
        self.conv3 = conv3x3(64, 128)
        self.bn3 = BatchNorm2d(128)
        self.relu3 = nn.ReLU(inplace=False)
        self.maxpool = nn.MaxPool2d(kernel_size=3, stride=2, padding=1)
        self.relu = nn.ReLU(inplace=False)
        # note: this second assignment overrides the maxpool defined above
        self.maxpool = nn.MaxPool2d(kernel_size=3, stride=2, padding=1, ceil_mode=True)  # change
        self.layer1 = self._make_layer(block, 64, layers[0])
        self.layer2 = self._make_layer(block, 128, layers[1], stride=2)
        self.layer3 = self._make_layer(block, 256, layers[2], stride=1, dilation=2)
        self.layer4 = self._make_layer(block, 512, layers[3], stride=1, dilation=4, multi_grid=(1,1,1))

        # extra added layers
        self.head = DANetHead(2048, num_classes)
        self.dsn = nn.Sequential(
            nn.Conv2d(1024, 512, kernel_size=3, stride=1, padding=1),
            InPlaceABNSync(512),
            nn.Dropout2d(0.1),
            nn.Conv2d(512, num_classes, kernel_size=1, stride=1, padding=0, bias=True)
            )

Here we provide the configuration of the DAHead,

class DANetHead(nn.Module):
    def __init__(self, in_channels, out_channels, norm_layer=InPlaceABNSync):
        super(DANetHead, self).__init__()
        inter_channels = in_channels // 4
        self.conv5a = nn.Sequential(nn.Conv2d(in_channels, inter_channels, 3, padding=1, bias=False),
                                   norm_layer(inter_channels),
                                   )    
        self.conv5c = nn.Sequential(nn.Conv2d(in_channels, inter_channels, 3, padding=1, bias=False),
                                   norm_layer(inter_channels),
                                   )
        self.sa = PAM_Module(inter_channels)
        self.sc = CAM_Module(inter_channels)
        self.conv51 = nn.Sequential(nn.Conv2d(inter_channels, inter_channels, 3, padding=1, bias=False),
                                   norm_layer(inter_channels),
                                   )
        self.conv52 = nn.Sequential(nn.Conv2d(inter_channels, inter_channels, 3, padding=1, bias=False),
                                   norm_layer(inter_channels),
                                   )
        self.conv8 = nn.Sequential(nn.Dropout2d(0.1, False), nn.Conv2d(512, out_channels, 1))

I guess the main difference is only the choice of SyncBN; we employ InPlaceABNSync (which contains PReLU) following OCNet.

Who encounters the problem: No module named 'encoding'

import encoding.utils as utils
from encoding.nn import SegmentationLosses, BatchNorm2d
from encoding.nn import SegmentationMultiLosses

ModuleNotFoundError: No module named 'encoding'

This occurs during training (train.py).

How to test for custom Images

I'm trying to run inference by modifying test.py to run on custom images. I got the following error with respect to the model checkpoint. I have downloaded the pre-trained model and added it to the danet/cityscapes/model path. Here is my modified code; it's referencing the correct path of the checkpoint directory.

args = Options().parse()
model = get_segmentation_model(args.model, dataset=args.dataset,backbone=args.backbone, aux=args.aux,se_loss=args.se_loss, norm_layer=BatchNorm2d, base_size=args.base_size, crop_size=args.crop_size,multi_grid=args.multi_grid, multi_dilation=args.multi_dilation)

if args.resume_dir is None or not os.path.isdir(args.resume_dir):
        raise RuntimeError("=> no checkpoint found at '{}'".format(args.resume_dir))
for resume_file in os.listdir(args.resume_dir):
    if os.path.splitext(resume_file)[1] == '.tar':
        args.resume = os.path.join(args.resume_dir, resume_file)
        assert os.path.exists(args.resume)
        print(args.resume)

checkpoint = torch.load(args.resume) # strict=False, so that it is compatible with old pytorch saved models
model.load_state_dict(checkpoint['state_dict'], strict=False)
print(model)

I'm running this from my terminal

CUDA_VISIBLE_DEVICES=0 python test_on_custom.py --model danet --resume-dir cityscapes/model/ --base-size 2048 --crop-size 768 --workers 1 --backbone resnet101 --multi-grid --multi-dilation 4 8 16 --eval

This is the error:

Traceback (most recent call last):
  File "test_on_custom.py", line 34, in <module>
    model.load_state_dict(checkpoint['state_dict'], strict=False)
  File "/datadrive/virtualenvs/torch3.5/lib/python3.5/site-packages/torch/nn/modules/module.py", line 719, in load_state_dict
    self.__class__.__name__, "\n\t".join(error_msgs)))
RuntimeError: Error(s) in loading state_dict for DANet:
	size mismatch for head.conv6.1.weight: copying a param of torch.Size([150, 512, 1, 1]) from checkpoint, where the shape is torch.Size([19, 512, 1, 1]) in current model.
	size mismatch for head.conv6.1.bias: copying a param of torch.Size([150]) from checkpoint, where the shape is torch.Size([19]) in current model.
	size mismatch for head.conv7.1.weight: copying a param of torch.Size([150, 512, 1, 1]) from checkpoint, where the shape is torch.Size([19, 512, 1, 1]) in current model.
	size mismatch for head.conv7.1.bias: copying a param of torch.Size([150]) from checkpoint, where the shape is torch.Size([19]) in current model.
	size mismatch for head.conv8.1.weight: copying a param of torch.Size([150, 512, 1, 1]) from checkpoint, where the shape is torch.Size([19, 512, 1, 1]) in current model.
	size mismatch for head.conv8.1.bias: copying a param of torch.Size([150]) from checkpoint, where the shape is torch.Size([19]) in current model.

Line 34 is model.load_state_dict(checkpoint['state_dict'], strict=False)
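
If the goal is only to reuse the compatible weights (e.g., the backbone) despite the classifier shape mismatch, one hedged workaround, assuming model and args.resume are set up as in the snippet above, is to drop the mismatched entries before loading:

import torch

# Keep only checkpoint entries whose shapes match the current model,
# then load the merged state dict; the mismatched classifier layers
# keep their freshly initialized weights.
checkpoint = torch.load(args.resume)
model_state = model.state_dict()
filtered = {k: v for k, v in checkpoint['state_dict'].items()
            if k in model_state and v.shape == model_state[k].shape}
model_state.update(filtered)
model.load_state_dict(model_state)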

UnboundLocalError: local variable 'pixAcc' referenced before assignment, at line 118 of test.py

When I ran CUDA_VISIBLE_DEVICES=0,1,2,3 python test.py --dataset cityscapes --model danet --resume-dir cityscapes/model --base-size 2048 --crop-size 768 --workers 1 --backbone resnet101 --multi-grid --multi-dilation 4 8 16, I got the error above. Then I added "pixAcc = None" before the for loop at line 101 of test.py and ran it again, but I got another error: "RuntimeError: CUDA error: out of memory". So I want to know how much GPU memory the code needs.

About the results on PASCAL Context

Question 1:
DANet performs well on the PASCAL Context dataset, outperforming EncNet by 1%. Does it use any of the improvement strategies mentioned in the subsection 'Ablation Study for Improvement Strategies'?
As far as I know, EncNet achieves 51.7 mIoU on the same dataset without using the multi-scale strategy.
Question 2:
Before applying the softmax operation to x, we usually subtract the maximum value of x. But in CAM_Module, I see energy_new = torch.max(energy, -1, keepdim=True)[0].expand_as(energy) - energy (https://github.com/junfu1115/DANet/blob/master/encoding/nn/attention.py#L75) rather than energy_new = energy - torch.max(energy, -1, keepdim=True)[0].expand_as(energy) followed by the softmax op. The channel weights are therefore reversed relative to the usual max-subtraction before softmax.
Can you share why the former is used?

About the Multi-grid Parts

Interesting work. We also have a concurrent work OCNet.

In fact, we tried the multi-grid method but found that it brings no performance gains.

Thus I am wondering whether you use multi-grid only for testing, or for both training and testing?

Channel attention module

Hi! The paper says the channel attention map X is N x N. Is that wrong? I think the channel attention map X should be C x C.
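
As a quick shape check, here is a standalone sketch paraphrasing the CAM computation (not copied from the repo): the channel attention is a Gram matrix over flattened spatial positions, so the map is indeed C x C:

import torch

B, C, H, W = 2, 512, 48, 48
x = torch.randn(B, C, H, W)
proj_query = x.view(B, C, -1)                  # B x C x (H*W)
proj_key = x.view(B, C, -1).permute(0, 2, 1)   # B x (H*W) x C
energy = torch.bmm(proj_query, proj_key)       # B x C x C
print(energy.shape)                            # torch.Size([2, 512, 512])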

Undefined names: Can raise NameError at runtime

flake8 testing of https://github.com/junfu1115/DANet on Python 3.7.0

$ flake8 . --count --select=E901,E999,F821,F822,F823 --show-source --statistics

./experiments/segmentation/get_weight.py:67:12: F821 undefined name 'self'
        }, self.args, is_best, 'DANet101_reduce.pth.tar')
           ^
./experiments/segmentation/get_weight.py:67:23: F821 undefined name 'is_best'
        }, self.args, is_best, 'DANet101_reduce.pth.tar')
                      ^
./encoding/models/base.py:101:34: F821 undefined name 'target_gpus'
        kwargs = scatter(kwargs, target_gpus, dim) if kwargs else []
                                 ^
./encoding/models/base.py:101:47: F821 undefined name 'dim'
        kwargs = scatter(kwargs, target_gpus, dim) if kwargs else []
                                              ^
./encoding/datasets/base.py:112:44: F821 undefined name 'batch'
    raise TypeError((error_msg.format(type(batch[0]))))
                                           ^
./encoding/datasets/pascal_voc.py:65:41: F821 undefined name 'mask'
            mask = self._mask_transform(mask)
                                        ^
./encoding/utils/presets.py:7:1: F822 undefined name 'subtract_imagenet_mean_batch' in __all__
__all__ = ['load_image', 'subtract_imagenet_mean_batch']
^
./encoding/nn/encoding.py:299:21: F821 undefined name 'math'
        stdv = 1. / math.sqrt(n)
                    ^
7     F821 undefined name 'batch'
1     F822 undefined name 'subtract_imagenet_mean_batch' in __all__
8

Meeting some problems when I try to eval

Traceback (most recent call last):
File "/home/xieke/anaconda3/envs/pt_source/lib/python3.6/site-packages/torch/utils/cpp_extension.py", line 946, in _build_extension_module
check=True)
File "/home/xieke/anaconda3/envs/pt_source/lib/python3.6/subprocess.py", line 438, in run
output=stdout, stderr=stderr)
subprocess.CalledProcessError: Command '['ninja', '-v']' returned non-zero exit status 1.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "test.py", line 16, in
import encoding.utils as utils
File "/home/xieke/anaconda3/envs/pt_source/lib/python3.6/site-packages/encoding/init.py", line 13, in
from . import nn, functions, dilated, parallel, utils, models, datasets
File "/home/xieke/anaconda3/envs/pt_source/lib/python3.6/site-packages/encoding/nn/init.py", line 12, in
from .encoding import *
File "/home/xieke/anaconda3/envs/pt_source/lib/python3.6/site-packages/encoding/nn/encoding.py", line 19, in
from ..functions import scaledL2, aggregate, pairwise_cosine
File "/home/xieke/anaconda3/envs/pt_source/lib/python3.6/site-packages/encoding/functions/init.py", line 2, in
from .encoding import *
File "/home/xieke/anaconda3/envs/pt_source/lib/python3.6/site-packages/encoding/functions/encoding.py", line 14, in
from .. import lib
File "/home/xieke/anaconda3/envs/pt_source/lib/python3.6/site-packages/encoding/lib/init.py", line 12, in
], build_directory=cpu_path, verbose=False)
File "/home/xieke/anaconda3/envs/pt_source/lib/python3.6/site-packages/torch/utils/cpp_extension.py", line 645, in load
is_python_module)
File "/home/xieke/anaconda3/envs/pt_source/lib/python3.6/site-packages/torch/utils/cpp_extension.py", line 814, in _jit_compile
with_cuda=with_cuda)
File "/home/xieke/anaconda3/envs/pt_source/lib/python3.6/site-packages/torch/utils/cpp_extension.py", line 863, in _write_ninja_file_and_build
_build_extension_module(name, build_directory, verbose)
File "/home/xieke/anaconda3/envs/pt_source/lib/python3.6/site-packages/torch/utils/cpp_extension.py", line 959, in build_extension_module
raise RuntimeError(message)
RuntimeError: Error building extension 'enclib_cpu': b'[1/3] c++ -MMD -MF roi_align_cpu.o.d -DTORCH_EXTENSION_NAME=enclib_cpu -DTORCH_API_INCLUDE_EXTENSION_H -isystem /home/xieke/anaconda3/envs/pt_source/lib/python3.6/site-packages/torch/lib/include -isystem /home/xieke/anaconda3/envs/pt_source/lib/python3.6/site-packages/torch/lib/include/torch/csrc/api/include -isystem /home/xieke/anaconda3/envs/pt_source/lib/python3.6/site-packages/torch/lib/include/TH -isystem /home/xieke/anaconda3/envs/pt_source/lib/python3.6/site-packages/torch/lib/include/THC -isystem /home/xieke/anaconda3/envs/pt_source/include/python3.6m -fPIC -std=c++11 -c /home/xieke/anaconda3/envs/pt_source/lib/python3.6/site-packages/encoding/lib/cpu/roi_align_cpu.cpp -o roi_align_cpu.o\nFAILED: roi_align_cpu.o \nc++ -MMD -MF roi_align_cpu.o.d -DTORCH_EXTENSION_NAME=enclib_cpu -DTORCH_API_INCLUDE_EXTENSION_H -isystem /home/xieke/anaconda3/envs/pt_source/lib/python3.6/site-packages/torch/lib/include -isystem /home/xieke/anaconda3/envs/pt_source/lib/python3.6/site-packages/torch/lib/include/torch/csrc/api/include -isystem /home/xieke/anaconda3/envs/pt_source/lib/python3.6/site-packages/torch/lib/include/TH -isystem /home/xieke/anaconda3/envs/pt_source/lib/python3.6/site-packages/torch/lib/include/THC -isystem /home/xieke/anaconda3/envs/pt_source/include/python3.6m -fPIC -std=c++11 -c /home/xieke/anaconda3/envs/pt_source/lib/python3.6/site-packages/encoding/lib/cpu/roi_align_cpu.cpp -o roi_align_cpu.o\n/home/xieke/anaconda3/envs/pt_source/lib/python3.6/site-packages/encoding/lib/cpu/roi_align_cpu.cpp: In function 'at::Tensor ROIAlignForwardCPU(const at::Tensor&, const at::Tensor&, int64_t, int64_t, double, int64_t)':\n/home/xieke/anaconda3/envs/pt_source/lib/python3.6/site-packages/encoding/lib/cpu/roi_align_cpu.cpp:407:30: error: 'struct at::Type' has no member named 'tensor'\n auto output = input.type().tensor({num_rois, channels, pooled_height, pooled_width});\n ^\nIn file included from /home/xieke/anaconda3/envs/pt_source/lib/python3.6/site-packages/torch/lib/include/ATen/ATen.h:9:0,\n from /home/xieke/anaconda3/envs/pt_source/lib/python3.6/site-packages/encoding/lib/cpu/roi_align_cpu.cpp:1:\n/home/xieke/anaconda3/envs/pt_source/lib/python3.6/site-packages/encoding/lib/cpu/roi_align_cpu.cpp: In lambda function:\n/home/xieke/anaconda3/envs/pt_source/lib/python3.6/site-packages/encoding/lib/cpu/roi_align_cpu.cpp:425:27: error: expected primary-expression before '>' token\n output.data<scalar_t>());\n ^\n/home/xieke/anaconda3/envs/pt_source/lib/python3.6/site-packages/encoding/lib/cpu/roi_align_cpu.cpp:425:29: error: expected primary-expression before ')' token\n output.data<scalar_t>());\n ^\n/home/xieke/anaconda3/envs/pt_source/lib/python3.6/site-packages/encoding/lib/cpu/roi_align_cpu.cpp: In lambda function:\n/home/xieke/anaconda3/envs/pt_source/lib/python3.6/site-packages/encoding/lib/cpu/roi_align_cpu.cpp:425:27: error: expected primary-expression before '>' token\n output.data<scalar_t>());\n ^\n/home/xieke/anaconda3/envs/pt_source/lib/python3.6/site-packages/encoding/lib/cpu/roi_align_cpu.cpp:425:29: error: expected primary-expression before ')' token\n output.data<scalar_t>());\n ^\n/home/xieke/anaconda3/envs/pt_source/lib/python3.6/site-packages/encoding/lib/cpu/roi_align_cpu.cpp: In function 'at::Tensor ROIAlignBackwardCPU(const at::Tensor&, const at::Tensor&, int64_t, int64_t, int64_t, int64_t, int64_t, int64_t, double, 
int64_t)':\n/home/xieke/anaconda3/envs/pt_source/lib/python3.6/site-packages/encoding/lib/cpu/roi_align_cpu.cpp:454:37: error: 'struct at::Type' has no member named 'tensor'\n auto grad_in = bottom_rois.type().tensor({b_size, channels, height, width}).zero
(); \n ^\nIn file included from /home/xieke/anaconda3/envs/pt_source/lib/python3.6/site-packages/torch/lib/include/ATen/ATen.h:9:0,\n from /home/xieke/anaconda3/envs/pt_source/lib/python3.6/site-packages/encoding/lib/cpu/roi_align_cpu.cpp:1:\n/home/xieke/anaconda3/envs/pt_source/lib/python3.6/site-packages/encoding/lib/cpu/roi_align_cpu.cpp: In lambda function:\n/home/xieke/anaconda3/envs/pt_source/lib/python3.6/site-packages/encoding/lib/cpu/roi_align_cpu.cpp:470:28: error: expected primary-expression before '>' token\n grad_in.data<scalar_t>(),\n ^\n/home/xieke/anaconda3/envs/pt_source/lib/python3.6/site-packages/encoding/lib/cpu/roi_align_cpu.cpp:470:30: error: expected primary-expression before ')' token\n grad_in.data<scalar_t>(),\n ^\n/home/xieke/anaconda3/envs/pt_source/lib/python3.6/site-packages/encoding/lib/cpu/roi_align_cpu.cpp: In lambda function:\n/home/xieke/anaconda3/envs/pt_source/lib/python3.6/site-packages/encoding/lib/cpu/roi_align_cpu.cpp:470:28: error: expected primary-expression before '>' token\n grad_in.data<scalar_t>(),\n ^\n/home/xieke/anaconda3/envs/pt_source/lib/python3.6/site-packages/encoding/lib/cpu/roi_align_cpu.cpp:470:30: error: expected primary-expression before ')' token\n grad_in.data<scalar_t>(),\n ^\n[2/3] c++ -MMD -MF roi_align.o.d -DTORCH_EXTENSION_NAME=enclib_cpu -DTORCH_API_INCLUDE_EXTENSION_H -isystem /home/xieke/anaconda3/envs/pt_source/lib/python3.6/site-packages/torch/lib/include -isystem /home/xieke/anaconda3/envs/pt_source/lib/python3.6/site-packages/torch/lib/include/torch/csrc/api/include -isystem /home/xieke/anaconda3/envs/pt_source/lib/python3.6/site-packages/torch/lib/include/TH -isystem /home/xieke/anaconda3/envs/pt_source/lib/python3.6/site-packages/torch/lib/include/THC -isystem /home/xieke/anaconda3/envs/pt_source/include/python3.6m -fPIC -std=c++11 -c /home/xieke/anaconda3/envs/pt_source/lib/python3.6/site-packages/encoding/lib/cpu/roi_align.cpp -o roi_align.o\nIn file included from /home/xieke/anaconda3/envs/pt_source/lib/python3.6/site-packages/encoding/lib/cpu/roi_align.cpp:1:0:\n/home/xieke/anaconda3/envs/pt_source/lib/python3.6/site-packages/torch/lib/include/torch/csrc/api/include/torch/torch.h:7:2: warning: #warning "Including torch/torch.h for C++ extensions is deprecated. Please include torch/extension.h" [-Wcpp]\n #warning \\n ^\nninja: build stopped: subcommand failed.\n'
(pt_source) xieke@ubuntu:~/wuyong/codes/DANet/danet$ CUDA_VISIBLE_DEVICES=0,1,2,3 python test.py --dataset cityscapes --model danet --resume-dir cityscapes/model --base-size 2048 --crop-size 1024 --workers 1 --backbone resnet101 --multi-grid --multi-dilation 4 8 16 --eval --multi-scales
Traceback (most recent call last):
File "/home/xieke/anaconda3/envs/pt_source/lib/python3.6/site-packages/torch/utils/cpp_extension.py", line 946, in _build_extension_module
check=True)
File "/home/xieke/anaconda3/envs/pt_source/lib/python3.6/subprocess.py", line 438, in run
output=stdout, stderr=stderr)
subprocess.CalledProcessError: Command '['ninja', '-v']' returned non-zero exit status 1.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "test.py", line 16, in
import encoding.utils as utils
File "/home/xieke/anaconda3/envs/pt_source/lib/python3.6/site-packages/encoding/init.py", line 13, in
from . import nn, functions, dilated, parallel, utils, models, datasets
File "/home/xieke/anaconda3/envs/pt_source/lib/python3.6/site-packages/encoding/nn/init.py", line 12, in
from .encoding import *
File "/home/xieke/anaconda3/envs/pt_source/lib/python3.6/site-packages/encoding/nn/encoding.py", line 19, in
from ..functions import scaledL2, aggregate, pairwise_cosine
File "/home/xieke/anaconda3/envs/pt_source/lib/python3.6/site-packages/encoding/functions/init.py", line 2, in
from .encoding import *
File "/home/xieke/anaconda3/envs/pt_source/lib/python3.6/site-packages/encoding/functions/encoding.py", line 14, in
from .. import lib
File "/home/xieke/anaconda3/envs/pt_source/lib/python3.6/site-packages/encoding/lib/init.py", line 12, in
], build_directory=cpu_path, verbose=False)
File "/home/xieke/anaconda3/envs/pt_source/lib/python3.6/site-packages/torch/utils/cpp_extension.py", line 645, in load
is_python_module)
File "/home/xieke/anaconda3/envs/pt_source/lib/python3.6/site-packages/torch/utils/cpp_extension.py", line 814, in _jit_compile
with_cuda=with_cuda)
File "/home/xieke/anaconda3/envs/pt_source/lib/python3.6/site-packages/torch/utils/cpp_extension.py", line 863, in _write_ninja_file_and_build
_build_extension_module(name, build_directory, verbose)
File "/home/xieke/anaconda3/envs/pt_source/lib/python3.6/site-packages/torch/utils/cpp_extension.py", line 959, in build_extension_module
raise RuntimeError(message)
RuntimeError: Error building extension 'enclib_cpu': b"[1/2] c++ -MMD -MF roi_align_cpu.o.d -DTORCH_EXTENSION_NAME=enclib_cpu -DTORCH_API_INCLUDE_EXTENSION_H -isystem /home/xieke/anaconda3/envs/pt_source/lib/python3.6/site-packages/torch/lib/include -isystem /home/xieke/anaconda3/envs/pt_source/lib/python3.6/site-packages/torch/lib/include/torch/csrc/api/include -isystem /home/xieke/anaconda3/envs/pt_source/lib/python3.6/site-packages/torch/lib/include/TH -isystem /home/xieke/anaconda3/envs/pt_source/lib/python3.6/site-packages/torch/lib/include/THC -isystem /home/xieke/anaconda3/envs/pt_source/include/python3.6m -fPIC -std=c++11 -c /home/xieke/anaconda3/envs/pt_source/lib/python3.6/site-packages/encoding/lib/cpu/roi_align_cpu.cpp -o roi_align_cpu.o\nFAILED: roi_align_cpu.o \nc++ -MMD -MF roi_align_cpu.o.d -DTORCH_EXTENSION_NAME=enclib_cpu -DTORCH_API_INCLUDE_EXTENSION_H -isystem /home/xieke/anaconda3/envs/pt_source/lib/python3.6/site-packages/torch/lib/include -isystem /home/xieke/anaconda3/envs/pt_source/lib/python3.6/site-packages/torch/lib/include/torch/csrc/api/include -isystem /home/xieke/anaconda3/envs/pt_source/lib/python3.6/site-packages/torch/lib/include/TH -isystem /home/xieke/anaconda3/envs/pt_source/lib/python3.6/site-packages/torch/lib/include/THC -isystem /home/xieke/anaconda3/envs/pt_source/include/python3.6m -fPIC -std=c++11 -c /home/xieke/anaconda3/envs/pt_source/lib/python3.6/site-packages/encoding/lib/cpu/roi_align_cpu.cpp -o roi_align_cpu.o\n/home/xieke/anaconda3/envs/pt_source/lib/python3.6/site-packages/encoding/lib/cpu/roi_align_cpu.cpp: In function 'at::Tensor ROIAlignForwardCPU(const at::Tensor&, const at::Tensor&, int64_t, int64_t, double, int64_t)':\n/home/xieke/anaconda3/envs/pt_source/lib/python3.6/site-packages/encoding/lib/cpu/roi_align_cpu.cpp:407:30: error: 'struct at::Type' has no member named 'tensor'\n auto output = input.type().tensor({num_rois, channels, pooled_height, pooled_width});\n ^\nIn file included from /home/xieke/anaconda3/envs/pt_source/lib/python3.6/site-packages/torch/lib/include/ATen/ATen.h:9:0,\n from /home/xieke/anaconda3/envs/pt_source/lib/python3.6/site-packages/encoding/lib/cpu/roi_align_cpu.cpp:1:\n/home/xieke/anaconda3/envs/pt_source/lib/python3.6/site-packages/encoding/lib/cpu/roi_align_cpu.cpp: In lambda function:\n/home/xieke/anaconda3/envs/pt_source/lib/python3.6/site-packages/encoding/lib/cpu/roi_align_cpu.cpp:425:27: error: expected primary-expression before '>' token\n output.data<scalar_t>());\n ^\n/home/xieke/anaconda3/envs/pt_source/lib/python3.6/site-packages/encoding/lib/cpu/roi_align_cpu.cpp:425:29: error: expected primary-expression before ')' token\n output.data<scalar_t>());\n ^\n/home/xieke/anaconda3/envs/pt_source/lib/python3.6/site-packages/encoding/lib/cpu/roi_align_cpu.cpp: In lambda function:\n/home/xieke/anaconda3/envs/pt_source/lib/python3.6/site-packages/encoding/lib/cpu/roi_align_cpu.cpp:425:27: error: expected primary-expression before '>' token\n output.data<scalar_t>());\n ^\n/home/xieke/anaconda3/envs/pt_source/lib/python3.6/site-packages/encoding/lib/cpu/roi_align_cpu.cpp:425:29: error: expected primary-expression before ')' token\n output.data<scalar_t>());\n ^\n/home/xieke/anaconda3/envs/pt_source/lib/python3.6/site-packages/encoding/lib/cpu/roi_align_cpu.cpp: In function 'at::Tensor ROIAlignBackwardCPU(const at::Tensor&, const at::Tensor&, int64_t, int64_t, int64_t, int64_t, int64_t, int64_t, double, 
int64_t)':\n/home/xieke/anaconda3/envs/pt_source/lib/python3.6/site-packages/encoding/lib/cpu/roi_align_cpu.cpp:454:37: error: 'struct at::Type' has no member named 'tensor'\n auto grad_in = bottom_rois.type().tensor({b_size, channels, height, width}).zero
(); \n ^\nIn file included from /home/xieke/anaconda3/envs/pt_source/lib/python3.6/site-packages/torch/lib/include/ATen/ATen.h:9:0,\n from /home/xieke/anaconda3/envs/pt_source/lib/python3.6/site-packages/encoding/lib/cpu/roi_align_cpu.cpp:1:\n/home/xieke/anaconda3/envs/pt_source/lib/python3.6/site-packages/encoding/lib/cpu/roi_align_cpu.cpp: In lambda function:\n/home/xieke/anaconda3/envs/pt_source/lib/python3.6/site-packages/encoding/lib/cpu/roi_align_cpu.cpp:470:28: error: expected primary-expression before '>' token\n grad_in.data<scalar_t>(),\n ^\n/home/xieke/anaconda3/envs/pt_source/lib/python3.6/site-packages/encoding/lib/cpu/roi_align_cpu.cpp:470:30: error: expected primary-expression before ')' token\n grad_in.data<scalar_t>(),\n ^\n/home/xieke/anaconda3/envs/pt_source/lib/python3.6/site-packages/encoding/lib/cpu/roi_align_cpu.cpp: In lambda function:\n/home/xieke/anaconda3/envs/pt_source/lib/python3.6/site-packages/encoding/lib/cpu/roi_align_cpu.cpp:470:28: error: expected primary-expression before '>' token\n grad_in.data<scalar_t>(),\n ^\n/home/xieke/anaconda3/envs/pt_source/lib/python3.6/site-packages/encoding/lib/cpu/roi_align_cpu.cpp:470:30: error: expected primary-expression before ')' token\n grad_in.data<scalar_t>(),\n ^\nninja: build stopped: subcommand failed.\n"

change dataset to train

Hi, I was wondering if you could provide the steps for changing the dataset: apart from changing the dataset name typed in the terminal, what else do we need to do?

convert Cityscapes dataset to 19 categories.

It's wonderful work.
In the Dataset section of Usage in the readme file, you mention that

  1. Download the Cityscapes dataset and convert the dataset to 19 categories.

Here, why do we first convert the Cityscapes dataset to 19 categories rather than directly using the original one?
The question may be too simple, but I'm still looking forward to your reply if you have enough time.

Training setting for cocostuff?

Hi! Thank you for your great work! Could you please tell me the settings for COCO Stuff-10k needed to reproduce your results (like base-size, crop-size, epochs, batch-size, lr, multi-dilation, etc.)?

Thanks again!

I CANNOT get the same result

I trained the model using the pretrained ResNet-101 with the Cityscapes dataset, but I only got 70.07 mIoU. How can I get the same result as the paper? Please, thanks!
I set batch_size to 15 (5 GPUs) and epochs to 120.

Install from PyTorch@commit fd25a2a

Hi,
I followed the tutorial to install pytorch@commit fd25a2a from source. But when I ran the following command to install the third-party submodules:
git submodule update --init --recursive

Then the following error message appeared:
remote: Repository not found.
fatal: repository 'https://github.com/NervanaSystems/nervanagpu.git/' not found
fatal: clone of 'https://github.com/NervanaSystems/nervanagpu.git' into submodule path 'third_party/nervanagpu' failed

That means the repository is gone, and I can't install PyTorch from source.
Do you know how to solve this?

The difference between the paper and the code

"Given a local feature A, we first feed it into a convolution layer with batch normalization and ReLU layers to generate two new feature maps B and C." (in the paper)
But there is no BN or ReLU in the code: self.query_conv = Conv2d(in_channels=in_dim, out_channels=in_dim//8, kernel_size=1) and self.key_conv = Conv2d(in_channels=in_dim, out_channels=in_dim//8, kernel_size=1) in PAM_Module.
Which is right?
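
For concreteness, the two variants in question look like this; a sketch only, with the 1x1 kernel size assumed for the paper-style branch since the text does not pin it down:

import torch.nn as nn

in_dim = 512
# As described in the paper's text: convolution + BN + ReLU
paper_style = nn.Sequential(
    nn.Conv2d(in_dim, in_dim // 8, kernel_size=1, bias=False),
    nn.BatchNorm2d(in_dim // 8),
    nn.ReLU(inplace=True),
)
# As written in PAM_Module: a bare 1x1 convolution
code_style = nn.Conv2d(in_dim, in_dim // 8, kernel_size=1)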

finetuning

Hi @junfu1115 ,

I was wondering how to fine-tune the model on a custom dataset.
I have noticed that there is an --ft argument which can be used, but I am not sure about the correct procedure.

Reproducing Cityscapes

Running the command supplied by the repo,

CUDA_VISIBLE_DEVICES=0,1,2,3 python train.py --dataset cityscapes --model danet --backbone resnet101 --checkname danet101 --base-size 1024 --crop-size 768 --epochs 240 --batch-size 8 --lr 0.003 --workers 2 --multi-grid --multi-dilation 4 8 16

returned an mIoU of 0.735 at the end of 240 epochs. There is probably some randomness in running through the dataset, but I'm surprised it was that much lower than anticipated. Any advice on how to get closer to the reported scores?

Trained DANet model is damaged

After the download of the compressed package completes, it cannot be decompressed; the file is damaged. Could it be re-uploaded, please?

About 'self.gamma'

Hi! I do not understand why 'self.gamma' is set to zero; in that case, 'out' equals 'x', right? Thanks a lot!

"class PAM_Module(nn.Module):
"""Position attention module"""

# Ref from Sagan
def __init__(self, in_dim):
    super(PAM_Module, self).__init__()
    # self.channel_in = in_dim

    self.query_conv = nn.Conv2d(in_channels=in_dim, out_channels=in_dim // 8, kernel_size=1)
    self.key_conv = nn.Conv2d(in_channels=in_dim, out_channels=in_dim // 8, kernel_size=1)
    self.value_conv = nn.Conv2d(in_channels=in_dim, out_channels=in_dim, kernel_size=1)
    self.gamma = nn.Parameter(torch.zeros(1))

    self.softmax = torch.nn.Softmax(dim=-1)

def forward(self, x):
    """
        inputs :
            x : input feature maps( B X C X H X W)
        returns :
            out : attention value + input feature
            attention: B X (HxW) X (HxW)
    """
    m_batchsize, C, height, width = x.size()
    proj_query = self.query_conv(x).view(m_batchsize, -1, width * height).permute(0, 2, 1)  # (H*W) * C//8
    proj_key = self.key_conv(x).view(m_batchsize, -1, width * height)  # C//8 * (H*W)
    energy = torch.bmm(proj_query, proj_key)  # (H*W) * (H*W)
    attention = self.softmax(energy)
    proj_value = self.value_conv(x).view(m_batchsize, -1, width * height)  # C * (H*W)

    out = torch.bmm(proj_value, attention.permute(0, 2, 1))  #
    out = out.view(m_batchsize, C, height, width)

    out = self.gamma * out + x
    return out

"

How large GPU memory you used?

Hi, @junfu1115 thanks for your work!
I am curious about the GPU memory you used: the attention introduces a lot of overhead, and the image size of Cityscapes is very large (2048x1024). I found that you resize the image to a base size of 608; is 12GB of memory enough for that?

Unable to get repr for <class 'torch.Tensor'>

Hi, when I train with my own data and compute the loss, I get the following error:

Unable to get repr for <class 'torch.Tensor'>

Have you ever encountered this problem? How should I solve it?

`t >= 0 && t < n_classes`

I use my own dataset, which has 81 categories. When training the model, I get an error like:
/pytorch/aten/src/THCUNN/SpatialClassNLLCriterion.cu:99: void cunn_SpatialClassNLLCriterion_updateOutput_kernel(T *, T *, T *, long *, T *, int, int, int, int, int, long) [with T = float, AccumT = float]: block: [1,0,0], thread: [246,0,0] Assertion t >= 0 && t < n_classes failed.
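
This assertion fires when a target label falls outside [0, n_classes). A hedged debugging sketch, assuming masks are stored as PNG files and the loss is configured with an ignore_index (adjust NUM_CLASS and the ignore value to your setup):

import numpy as np
from PIL import Image

NUM_CLASS = 81  # from the report above

def check_mask(path, ignore_index=255):
    """Print any label values that would trip the CUDA assertion."""
    mask = np.array(Image.open(path)).astype(np.int64)
    values = np.unique(mask)
    bad = values[(values != ignore_index) & ((values < 0) | (values >= NUM_CLASS))]
    if bad.size:
        print(f"{path}: labels out of range [0, {NUM_CLASS}): {bad}")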

baseline FCN

  The baseline FCN achieves 70.03% mIoU in your paper; now I want to reproduce that result based on your paper, but my result is not good.
  I noticed the paper used LR = 0.001, but the repo uses LR = 0.003.
  Can you tell me more details about the training process? Thank you.

About the `Softmax` in the `CAM_Module`

energy_new = torch.max(energy, -1, keepdim=True)[0].expand_as(energy)-energy


Is your code trying to improve numerical stability? Maybe it should be in this form:

    energy_new = torch.max(energy, -1, keepdim=True)
    energy_new = energy_new[0].expand_as(energy)
    energy_new = energy - energy_new
    attention = self.softmax(energy_new)
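
For reference, softmax is invariant to subtracting a per-row constant, so the suggested form changes nothing except numerical stability, whereas the repo's max - energy variant also flips the sign of the logits. A quick standalone check:

import torch

energy = torch.randn(3, 5)
a = torch.softmax(energy, dim=-1)
b = torch.softmax(energy - energy.max(dim=-1, keepdim=True)[0], dim=-1)
print(torch.allclose(a, b))  # True: shifting by the row max is a no-op in value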

Install pytorch commit fd25a2a

I was running evaluation on Cityscapes and did not find args.test_scale anywhere.
It's not mentioned in options.py either.

Error:

if len(args.test_scale) == 1:
AttributeError: 'Namespace' object has no attribute 'test_scale'

Running this from terminal:
CUDA_VISIBLE_DEVICES=0 python3.6 test.py --dataset cityscapes --model danet --resume-dir cityscapes/model --crop-size 768 --workers 1 --backbone resnet101 --multi-grid --multi-dilation 4 8 16 --eval

Bug in encoding/parallel.py

I was wondering if targets[0] is missing in line 182: _worker(0, modules[0], inputs[0], kwargs_tup[0], devices[0])

Performance issue

I cloned your code and ran the training script.
But I only get 73.5% mIoU on the val set (Cityscapes).
Are there any training tricks?

training with single GPU

Hi,

So while trying to train the network I encountered this error. I can't figure out what the mistake is. I'm using the proper PyTorch commit and have not made any modifications to the code.

From my terminal:
CUDA_VISIBLE_DEVICES=0 python train.py --dataset cityscapes --model danet --backbone resnet101 --checkname danet101 --base-size 1024 --crop-size 768 --epochs 240 --batch-size 8 --lr 0.003 --workers 2 --multi-grid --multi-dilation 4 8 16

Error:

Traceback (most recent call last):
  File "train.py", line 201, in <module>
    trainer.training(epoch)
  File "train.py", line 125, in training
    outputs = self.model(image)
  File "/datadrive/virtualenvs/torchDA/lib/python3.6/site-packages/torch/nn/modules/module.py", line 477, in __call__
    result = self.forward(*input, **kwargs)
  File "/datadrive/virtualenvs/torchDA/lib/python3.6/site-packages/torch/nn/parallel/data_parallel.py", line 121, in forward
    return self.module(*inputs[0], **kwargs[0])
  File "/datadrive/virtualenvs/torchDA/lib/python3.6/site-packages/torch/nn/modules/module.py", line 477, in __call__
    result = self.forward(*input, **kwargs)
  File "/datadrive/virtualenvs/torchDA/lib/python3.6/site-packages/encoding/models/danet.py", line 45, in forward
    _, _, c3, c4 = self.base_forward(x)
  File "/datadrive/virtualenvs/torchDA/lib/python3.6/site-packages/encoding/models/base.py", line 58, in base_forward
    x = self.pretrained.bn1(x)
  File "/datadrive/virtualenvs/torchDA/lib/python3.6/site-packages/torch/nn/modules/module.py", line 477, in __call__
    result = self.forward(*input, **kwargs)
  File "/datadrive/virtualenvs/torchDA/lib/python3.6/site-packages/encoding/nn/syncbn.py", line 57, in forward
    mean, inv_std = self._slave_pipe.run_slave(_ChildMessage(xsum, xsqsum, N))
AttributeError: 'NoneType' object has no attribute 'run_slave'

What does NUM_CLASS mean?

When I use custom images to train the model, what does NUM_CLASS mean? Does it mean the number of classes in one image?
