
danet's Introduction

Jun Fu, Jing Liu, Haijie Tian, Yong Li, Yongjun Bao, Zhiwei Fang, and Hanqing Lu

Introduction

We propose a Dual Attention Network (DANet) to adaptively integrate local features with their global dependencies based on the self-attention mechanism. We achieve new state-of-the-art segmentation performance on three challenging scene segmentation datasets, i.e., Cityscapes, PASCAL Context, and COCO Stuff-10k.


Cityscapes testing set result

We train our DANet-101 with only fine annotated data and submit our test results to the official evaluation server.


Updates

2020/9: Renewed the code, which now supports PyTorch 1.4.0 or later!

2020/8: The new TNNLS version DRANet achieves 82.9% on the Cityscapes test set (result submitted in August 2019), a new state-of-the-art performance using only the fine annotated dataset and ResNet-101. The code will be released in DRANet.

2020/7: DANet is supported in MMSegmentation, where it achieves 80.47% with single-scale testing and 82.02% with multi-scale testing on the Cityscapes val set.

2018/9: DANet released. The trained model with ResNet-101 achieves 81.5% on the Cityscapes test set.

Usage

  1. Install PyTorch

    • The code is tested with Python 3.6 and PyTorch 1.4.0.
    • The code is modified from PyTorch-Encoding.
  2. Clone the repository

    git clone https://github.com/junfu1115/DANet.git 
    cd DANet 
    python setup.py install
  3. Dataset

    • Download the Cityscapes dataset and convert the dataset to 19 categories (see the conversion sketch below).
    • Please put the dataset in the folder ./datasets
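
    A minimal, hedged sketch of such a conversion, assuming the official cityscapesscripts package is installed; the exact file layout the dataloader expects may differ, so treat this as a starting point rather than the repo's own script:

    import numpy as np
    from PIL import Image
    from cityscapesscripts.helpers.labels import labels  # official label table

    # Build a lookup from the 34 raw label IDs to the 19 train IDs;
    # everything else maps to the ignore index 255.
    id_to_trainid = np.full(256, 255, dtype=np.uint8)
    for label in labels:
        if 0 <= label.trainId < 19:
            id_to_trainid[label.id] = label.trainId

    def convert_label(src_path, dst_path):
        mask = np.array(Image.open(src_path))  # raw *_labelIds.png mask
        Image.fromarray(id_to_trainid[mask]).save(dst_path)
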
  4. Evaluation for DANet

    • Download the trained model DANet101 and put it in the folder ./experiments/segmentation/models/

    • cd ./experiments/segmentation/

    • For single scale testing, please run:

    • CUDA_VISIBLE_DEVICES=0,1,2,3 python test.py --dataset citys --model danet --backbone resnet101 --resume  models/DANet101.pth.tar --eval --base-size 2048 --crop-size 768 --workers 1 --multi-grid --multi-dilation 4 8 16 --os 8 --aux --no-deepstem
    • Evaluation Result

      The expected scores are as follows: DANet101 on the Cityscapes val set (mIoU/pAcc): 79.93/95.97 (ss)

  5. Evaluation for DRANet

    • Download the trained model DRANet101 and put it in the folder ./experiments/segmentation/models/

    • The evaluation code is in the folder ./experiments/segmentation/

    • cd ./experiments/segmentation/

    • For single scale testing, please run:

    • CUDA_VISIBLE_DEVICES=0,1,2,3 python test.py --dataset citys --model dran --backbone resnet101 --resume  models/dran101.pth.tar --eval --base-size 2048 --crop-size 768 --workers 1 --multi-grid --multi-dilation 4 8 16 --os 8 --aux
    • Evaluation Result

      The expected scores are as follows: DRANet101 on the Cityscapes val set (mIoU/pAcc): 81.63/96.62 (ss)

Citation

If you find DANet or DRANet useful in your research, please consider citing:

@article{fu2020scene,
  title={Scene Segmentation With Dual Relation-Aware Attention Network},
  author={Fu, Jun and Liu, Jing and Jiang, Jie and Li, Yong and Bao, Yongjun and Lu, Hanqing},
  journal={IEEE Transactions on Neural Networks and Learning Systems},
  year={2020},
  publisher={IEEE}
}
@inproceedings{fu2019dual,
  title={Dual attention network for scene segmentation},
  author={Fu, Jun and Liu, Jing and Tian, Haijie and Li, Yong and Bao, Yongjun and Fang, Zhiwei and Lu, Hanqing},
  booktitle={Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition},
  pages={3146--3154},
  year={2019}
}

Acknowledgement

Thanks to PyTorch-Encoding, especially for the Synchronized BN!

danet's People

Contributors

junfu1115, matthewpurri, mhamedlmarbouh, serend1p1ty, stacyyang, wydwww, zhanghang1989


danet's Issues

The value of self.gamma

Hi, when gamma is initialized to 0, how does it take on other values? I know it's a parameter that can change through learning, but I don't understand how it changes. I am confused, because I think its value is always 0.
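
For intuition, here is a minimal standalone sketch (not code from this repo) showing that a zero-initialized nn.Parameter still receives a gradient, so the optimizer moves it away from zero after the first update:

import torch
import torch.nn as nn

# gamma starts at zero, but the loss depends on it through gamma * out + x,
# so d(loss)/d(gamma) is generally non-zero and gamma changes after one step.
gamma = nn.Parameter(torch.zeros(1))
out = torch.randn(4)   # stands in for the attention output
x = torch.randn(4)     # stands in for the residual input
loss = ((gamma * out + x) ** 2).sum()
loss.backward()
print(gamma.grad)      # generally non-zero: sum(2 * (gamma * out + x) * out)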

Reproduced Performance Discussion

Very interesting!

I find that you employ three independent loss functions on the predictions of the "PAM branch" and "CAM branch".

I am wondering about the influence of the loss functions on these two branches, and it would be great if you could share the related experiments.

Besides, I want to share my reproduced performance with your PAM and CAM based on OCNet (specifically, we only employ one loss function and use lr=0.01 and wd=0.0005, following the default settings of OCNet).

We report the mIoU on the training set / validation set here: 82.0/74.4 (single scale).

Could anyone help us check where the problem is?

Here we also provide the network configuration,

class ResNet(nn.Module):
    def __init__(self, block, layers, num_classes):
        self.inplanes = 128
        super(ResNet, self).__init__()
        self.conv1 = conv3x3(3, 64, stride=2)
        self.bn1 = BatchNorm2d(64)
        self.relu1 = nn.ReLU(inplace=False)
        self.conv2 = conv3x3(64, 64)
        self.bn2 = BatchNorm2d(64)
        self.relu2 = nn.ReLU(inplace=False)
        self.conv3 = conv3x3(64, 128)
        self.bn3 = BatchNorm2d(128)
        self.relu3 = nn.ReLU(inplace=False)
        self.maxpool = nn.MaxPool2d(kernel_size=3, stride=2, padding=1)
        self.relu = nn.ReLU(inplace=False)
        # note: this second assignment overrides the maxpool defined above
        self.maxpool = nn.MaxPool2d(kernel_size=3, stride=2, padding=1, ceil_mode=True)  # change
        self.layer1 = self._make_layer(block, 64, layers[0])
        self.layer2 = self._make_layer(block, 128, layers[1], stride=2)
        self.layer3 = self._make_layer(block, 256, layers[2], stride=1, dilation=2)
        self.layer4 = self._make_layer(block, 512, layers[3], stride=1, dilation=4, multi_grid=(1,1,1))

        # extra added layers
        self.head = DANetHead(2048, num_classes)
        self.dsn = nn.Sequential(
            nn.Conv2d(1024, 512, kernel_size=3, stride=1, padding=1),
            InPlaceABNSync(512),
            nn.Dropout2d(0.1),
            nn.Conv2d(512, num_classes, kernel_size=1, stride=1, padding=0, bias=True)
            )

Here we provide the configuration of the DAHead,

class DANetHead(nn.Module):
    def __init__(self, in_channels, out_channels, norm_layer=InPlaceABNSync):
        super(DANetHead, self).__init__()
        inter_channels = in_channels // 4
        self.conv5a = nn.Sequential(nn.Conv2d(in_channels, inter_channels, 3, padding=1, bias=False),
                                   norm_layer(inter_channels),
                                   )    
        self.conv5c = nn.Sequential(nn.Conv2d(in_channels, inter_channels, 3, padding=1, bias=False),
                                   norm_layer(inter_channels),
                                   )
        self.sa = PAM_Module(inter_channels)
        self.sc = CAM_Module(inter_channels)
        self.conv51 = nn.Sequential(nn.Conv2d(inter_channels, inter_channels, 3, padding=1, bias=False),
                                   norm_layer(inter_channels),
                                   )
        self.conv52 = nn.Sequential(nn.Conv2d(inter_channels, inter_channels, 3, padding=1, bias=False),
                                   norm_layer(inter_channels),
                                   )
        self.conv8 = nn.Sequential(nn.Dropout2d(0.1, False), nn.Conv2d(512, out_channels, 1))

I guess the main difference is only the choice of SyncBN; we employ InPlaceABNSync (which contains PReLU) following OCNet.

Who encounters the problem: No module named 'encoding'

import encoding.utils as utils
from encoding.nn import SegmentationLosses, BatchNorm2d
from encoding.nn import SegmentationMultiLosses

ModuleNotFoundError: No module named 'encoding'

This occurs during training (train.py).

How to test for custom Images

I'm trying to run inference by modifying test.py to run on custom images. I got the following error with respect to the model checkpoint. I have downloaded the pre-trained model and added it to the danet/cityscapes/model path. Here is my modified code; it's referencing the correct path of the checkpoint directory.

args = Options().parse()
model = get_segmentation_model(args.model, dataset=args.dataset,backbone=args.backbone, aux=args.aux,se_loss=args.se_loss, norm_layer=BatchNorm2d, base_size=args.base_size, crop_size=args.crop_size,multi_grid=args.multi_grid, multi_dilation=args.multi_dilation)

if args.resume_dir is None or not os.path.isdir(args.resume_dir):
        raise RuntimeError("=> no checkpoint found at '{}'".format(args.resume_dir))
for resume_file in os.listdir(args.resume_dir):
    if os.path.splitext(resume_file)[1] == '.tar':
        args.resume = os.path.join(args.resume_dir, resume_file)
        assert os.path.exists(args.resume)
        print(args.resume)

checkpoint = torch.load(args.resume) # strict=False, so that it is compatible with old pytorch saved models
model.load_state_dict(checkpoint['state_dict'], strict=False)
print(model)

I'm running this from my terminal

CUDA_VISIBLE_DEVICES=0 python test_on_custom.py --model danet --resume-dir cityscapes/model/ --base-size 2048 --crop-size 768 --workers 1 --backbone resnet101 --multi-grid --multi-dilation 4 8 16 --eval

This is the error:

Traceback (most recent call last):
  File "test_on_custom.py", line 34, in <module>
    model.load_state_dict(checkpoint['state_dict'], strict=False)
  File "/datadrive/virtualenvs/torch3.5/lib/python3.5/site-packages/torch/nn/modules/module.py", line 719, in load_state_dict
    self.__class__.__name__, "\n\t".join(error_msgs)))
RuntimeError: Error(s) in loading state_dict for DANet:
	size mismatch for head.conv6.1.weight: copying a param of torch.Size([150, 512, 1, 1]) from checkpoint, where the shape is torch.Size([19, 512, 1, 1]) in current model.
	size mismatch for head.conv6.1.bias: copying a param of torch.Size([150]) from checkpoint, where the shape is torch.Size([19]) in current model.
	size mismatch for head.conv7.1.weight: copying a param of torch.Size([150, 512, 1, 1]) from checkpoint, where the shape is torch.Size([19, 512, 1, 1]) in current model.
	size mismatch for head.conv7.1.bias: copying a param of torch.Size([150]) from checkpoint, where the shape is torch.Size([19]) in current model.
	size mismatch for head.conv8.1.weight: copying a param of torch.Size([150, 512, 1, 1]) from checkpoint, where the shape is torch.Size([19, 512, 1, 1]) in current model.
	size mismatch for head.conv8.1.bias: copying a param of torch.Size([150]) from checkpoint, where the shape is torch.Size([19]) in current model.

Line 34 is model.load_state_dict(checkpoint['state_dict'], strict=False)
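
If the goal is only to reuse the compatible weights (e.g., the backbone) despite the classifier shape mismatch, one hedged workaround, assuming model and args.resume are set up as in the snippet above, is to drop the mismatched entries before loading:

import torch

# Keep only checkpoint entries whose shapes match the current model,
# then load the merged state dict; the mismatched classifier layers
# keep their freshly initialized weights.
checkpoint = torch.load(args.resume)
model_state = model.state_dict()
filtered = {k: v for k, v in checkpoint['state_dict'].items()
            if k in model_state and v.shape == model_state[k].shape}
model_state.update(filtered)
model.load_state_dict(model_state)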

UnboundLocalError: local variable 'pixAcc' referenced before assignment, at line 118 of test.py

When I ran CUDA_VISIBLE_DEVICES=0,1,2,3 python test.py --dataset cityscapes --model danet --resume-dir cityscapes/model --base-size 2048 --crop-size 768 --workers 1 --backbone resnet101 --multi-grid --multi-dilation 4 8 16, I got the error above. Then I added "pixAcc = None" before the for loop at line 101 of test.py and ran it again, but I got another error: "RuntimeError: CUDA error: out of memory". So I want to know how much GPU memory the code needs.

About the results on PASCAL Context

Question 1:
DANet performs well on the PASCAL Context dataset, outperforming EncNet by 1%. Does it use any of the improvement strategies mentioned in the subsection 'Ablation Study for Improvement Strategies'?
As far as I know, EncNet achieves 51.7 mIoU on the same dataset without using the multi-scale strategy.
Question 2:
Before applying the softmax operation to x, we usually subtract the maximum value of x. But in CAM_Module, I see energy_new = torch.max(energy, -1, keepdim=True)[0].expand_as(energy) - energy (https://github.com/junfu1115/DANet/blob/master/encoding/nn/attention.py#L75) rather than energy_new = energy - torch.max(energy, -1, keepdim=True)[0].expand_as(energy) followed by the softmax op. The channel weights are therefore reversed relative to the usual max-subtraction before softmax.
Can you share why the former is used?

About the Multi-grid Parts

Interesting work. We also have a concurrent work OCNet.

In fact, we tried the multi-grid method but found that it brings no performance gains.

Thus I am wondering whether you use multi-grid only for testing, or for both training and testing?

Channel attention module

Hi! The paper says the channel attention map X is N x N. Is that wrong? I think the channel attention map X should be C x C.
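
As a quick shape check, here is a standalone sketch paraphrasing the CAM computation (not copied from the repo): the channel attention is a Gram matrix over flattened spatial positions, so the map is indeed C x C:

import torch

B, C, H, W = 2, 512, 48, 48
x = torch.randn(B, C, H, W)
proj_query = x.view(B, C, -1)                  # B x C x (H*W)
proj_key = x.view(B, C, -1).permute(0, 2, 1)   # B x (H*W) x C
energy = torch.bmm(proj_query, proj_key)       # B x C x C
print(energy.shape)                            # torch.Size([2, 512, 512])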

Undefined names: Can raise NameError at runtime

flake8 testing of https://github.com/junfu1115/DANet on Python 3.7.0

$ flake8 . --count --select=E901,E999,F821,F822,F823 --show-source --statistics

./experiments/segmentation/get_weight.py:67:12: F821 undefined name 'self'
        }, self.args, is_best, 'DANet101_reduce.pth.tar')
           ^
./experiments/segmentation/get_weight.py:67:23: F821 undefined name 'is_best'
        }, self.args, is_best, 'DANet101_reduce.pth.tar')
                      ^
./encoding/models/base.py:101:34: F821 undefined name 'target_gpus'
        kwargs = scatter(kwargs, target_gpus, dim) if kwargs else []
                                 ^
./encoding/models/base.py:101:47: F821 undefined name 'dim'
        kwargs = scatter(kwargs, target_gpus, dim) if kwargs else []
                                              ^
./encoding/datasets/base.py:112:44: F821 undefined name 'batch'
    raise TypeError((error_msg.format(type(batch[0]))))
                                           ^
./encoding/datasets/pascal_voc.py:65:41: F821 undefined name 'mask'
            mask = self._mask_transform(mask)
                                        ^
./encoding/utils/presets.py:7:1: F822 undefined name 'subtract_imagenet_mean_batch' in __all__
__all__ = ['load_image', 'subtract_imagenet_mean_batch']
^
./encoding/nn/encoding.py:299:21: F821 undefined name 'math'
        stdv = 1. / math.sqrt(n)
                    ^
7     F821 undefined name 'batch'
1     F822 undefined name 'subtract_imagenet_mean_batch' in __all__
8

Meeting some problems when I try to eval

Traceback (most recent call last):
File "/home/xieke/anaconda3/envs/pt_source/lib/python3.6/site-packages/torch/utils/cpp_extension.py", line 946, in _build_extension_module
check=True)
File "/home/xieke/anaconda3/envs/pt_source/lib/python3.6/subprocess.py", line 438, in run
output=stdout, stderr=stderr)
subprocess.CalledProcessError: Command '['ninja', '-v']' returned non-zero exit status 1.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "test.py", line 16, in
import encoding.utils as utils
File "/home/xieke/anaconda3/envs/pt_source/lib/python3.6/site-packages/encoding/init.py", line 13, in
from . import nn, functions, dilated, parallel, utils, models, datasets
File "/home/xieke/anaconda3/envs/pt_source/lib/python3.6/site-packages/encoding/nn/init.py", line 12, in
from .encoding import *
File "/home/xieke/anaconda3/envs/pt_source/lib/python3.6/site-packages/encoding/nn/encoding.py", line 19, in
from ..functions import scaledL2, aggregate, pairwise_cosine
File "/home/xieke/anaconda3/envs/pt_source/lib/python3.6/site-packages/encoding/functions/init.py", line 2, in
from .encoding import *
File "/home/xieke/anaconda3/envs/pt_source/lib/python3.6/site-packages/encoding/functions/encoding.py", line 14, in
from .. import lib
File "/home/xieke/anaconda3/envs/pt_source/lib/python3.6/site-packages/encoding/lib/init.py", line 12, in
], build_directory=cpu_path, verbose=False)
File "/home/xieke/anaconda3/envs/pt_source/lib/python3.6/site-packages/torch/utils/cpp_extension.py", line 645, in load
is_python_module)
File "/home/xieke/anaconda3/envs/pt_source/lib/python3.6/site-packages/torch/utils/cpp_extension.py", line 814, in _jit_compile
with_cuda=with_cuda)
File "/home/xieke/anaconda3/envs/pt_source/lib/python3.6/site-packages/torch/utils/cpp_extension.py", line 863, in _write_ninja_file_and_build
_build_extension_module(name, build_directory, verbose)
File "/home/xieke/anaconda3/envs/pt_source/lib/python3.6/site-packages/torch/utils/cpp_extension.py", line 959, in build_extension_module
raise RuntimeError(message)
RuntimeError: Error building extension 'enclib_cpu': b'[1/3] c++ -MMD -MF roi_align_cpu.o.d -DTORCH_EXTENSION_NAME=enclib_cpu -DTORCH_API_INCLUDE_EXTENSION_H -isystem /home/xieke/anaconda3/envs/pt_source/lib/python3.6/site-packages/torch/lib/include -isystem /home/xieke/anaconda3/envs/pt_source/lib/python3.6/site-packages/torch/lib/include/torch/csrc/api/include -isystem /home/xieke/anaconda3/envs/pt_source/lib/python3.6/site-packages/torch/lib/include/TH -isystem /home/xieke/anaconda3/envs/pt_source/lib/python3.6/site-packages/torch/lib/include/THC -isystem /home/xieke/anaconda3/envs/pt_source/include/python3.6m -fPIC -std=c++11 -c /home/xieke/anaconda3/envs/pt_source/lib/python3.6/site-packages/encoding/lib/cpu/roi_align_cpu.cpp -o roi_align_cpu.o\nFAILED: roi_align_cpu.o \nc++ -MMD -MF roi_align_cpu.o.d -DTORCH_EXTENSION_NAME=enclib_cpu -DTORCH_API_INCLUDE_EXTENSION_H -isystem /home/xieke/anaconda3/envs/pt_source/lib/python3.6/site-packages/torch/lib/include -isystem /home/xieke/anaconda3/envs/pt_source/lib/python3.6/site-packages/torch/lib/include/torch/csrc/api/include -isystem /home/xieke/anaconda3/envs/pt_source/lib/python3.6/site-packages/torch/lib/include/TH -isystem /home/xieke/anaconda3/envs/pt_source/lib/python3.6/site-packages/torch/lib/include/THC -isystem /home/xieke/anaconda3/envs/pt_source/include/python3.6m -fPIC -std=c++11 -c /home/xieke/anaconda3/envs/pt_source/lib/python3.6/site-packages/encoding/lib/cpu/roi_align_cpu.cpp -o roi_align_cpu.o\n/home/xieke/anaconda3/envs/pt_source/lib/python3.6/site-packages/encoding/lib/cpu/roi_align_cpu.cpp: In function 'at::Tensor ROIAlignForwardCPU(const at::Tensor&, const at::Tensor&, int64_t, int64_t, double, int64_t)':\n/home/xieke/anaconda3/envs/pt_source/lib/python3.6/site-packages/encoding/lib/cpu/roi_align_cpu.cpp:407:30: error: 'struct at::Type' has no member named 'tensor'\n auto output = input.type().tensor({num_rois, channels, pooled_height, pooled_width});\n ^\nIn file included from /home/xieke/anaconda3/envs/pt_source/lib/python3.6/site-packages/torch/lib/include/ATen/ATen.h:9:0,\n from /home/xieke/anaconda3/envs/pt_source/lib/python3.6/site-packages/encoding/lib/cpu/roi_align_cpu.cpp:1:\n/home/xieke/anaconda3/envs/pt_source/lib/python3.6/site-packages/encoding/lib/cpu/roi_align_cpu.cpp: In lambda function:\n/home/xieke/anaconda3/envs/pt_source/lib/python3.6/site-packages/encoding/lib/cpu/roi_align_cpu.cpp:425:27: error: expected primary-expression before '>' token\n output.data<scalar_t>());\n ^\n/home/xieke/anaconda3/envs/pt_source/lib/python3.6/site-packages/encoding/lib/cpu/roi_align_cpu.cpp:425:29: error: expected primary-expression before ')' token\n output.data<scalar_t>());\n ^\n/home/xieke/anaconda3/envs/pt_source/lib/python3.6/site-packages/encoding/lib/cpu/roi_align_cpu.cpp: In lambda function:\n/home/xieke/anaconda3/envs/pt_source/lib/python3.6/site-packages/encoding/lib/cpu/roi_align_cpu.cpp:425:27: error: expected primary-expression before '>' token\n output.data<scalar_t>());\n ^\n/home/xieke/anaconda3/envs/pt_source/lib/python3.6/site-packages/encoding/lib/cpu/roi_align_cpu.cpp:425:29: error: expected primary-expression before ')' token\n output.data<scalar_t>());\n ^\n/home/xieke/anaconda3/envs/pt_source/lib/python3.6/site-packages/encoding/lib/cpu/roi_align_cpu.cpp: In function 'at::Tensor ROIAlignBackwardCPU(const at::Tensor&, const at::Tensor&, int64_t, int64_t, int64_t, int64_t, int64_t, int64_t, double, 
int64_t)':\n/home/xieke/anaconda3/envs/pt_source/lib/python3.6/site-packages/encoding/lib/cpu/roi_align_cpu.cpp:454:37: error: 'struct at::Type' has no member named 'tensor'\n auto grad_in = bottom_rois.type().tensor({b_size, channels, height, width}).zero
(); \n ^\nIn file included from /home/xieke/anaconda3/envs/pt_source/lib/python3.6/site-packages/torch/lib/include/ATen/ATen.h:9:0,\n from /home/xieke/anaconda3/envs/pt_source/lib/python3.6/site-packages/encoding/lib/cpu/roi_align_cpu.cpp:1:\n/home/xieke/anaconda3/envs/pt_source/lib/python3.6/site-packages/encoding/lib/cpu/roi_align_cpu.cpp: In lambda function:\n/home/xieke/anaconda3/envs/pt_source/lib/python3.6/site-packages/encoding/lib/cpu/roi_align_cpu.cpp:470:28: error: expected primary-expression before '>' token\n grad_in.data<scalar_t>(),\n ^\n/home/xieke/anaconda3/envs/pt_source/lib/python3.6/site-packages/encoding/lib/cpu/roi_align_cpu.cpp:470:30: error: expected primary-expression before ')' token\n grad_in.data<scalar_t>(),\n ^\n/home/xieke/anaconda3/envs/pt_source/lib/python3.6/site-packages/encoding/lib/cpu/roi_align_cpu.cpp: In lambda function:\n/home/xieke/anaconda3/envs/pt_source/lib/python3.6/site-packages/encoding/lib/cpu/roi_align_cpu.cpp:470:28: error: expected primary-expression before '>' token\n grad_in.data<scalar_t>(),\n ^\n/home/xieke/anaconda3/envs/pt_source/lib/python3.6/site-packages/encoding/lib/cpu/roi_align_cpu.cpp:470:30: error: expected primary-expression before ')' token\n grad_in.data<scalar_t>(),\n ^\n[2/3] c++ -MMD -MF roi_align.o.d -DTORCH_EXTENSION_NAME=enclib_cpu -DTORCH_API_INCLUDE_EXTENSION_H -isystem /home/xieke/anaconda3/envs/pt_source/lib/python3.6/site-packages/torch/lib/include -isystem /home/xieke/anaconda3/envs/pt_source/lib/python3.6/site-packages/torch/lib/include/torch/csrc/api/include -isystem /home/xieke/anaconda3/envs/pt_source/lib/python3.6/site-packages/torch/lib/include/TH -isystem /home/xieke/anaconda3/envs/pt_source/lib/python3.6/site-packages/torch/lib/include/THC -isystem /home/xieke/anaconda3/envs/pt_source/include/python3.6m -fPIC -std=c++11 -c /home/xieke/anaconda3/envs/pt_source/lib/python3.6/site-packages/encoding/lib/cpu/roi_align.cpp -o roi_align.o\nIn file included from /home/xieke/anaconda3/envs/pt_source/lib/python3.6/site-packages/encoding/lib/cpu/roi_align.cpp:1:0:\n/home/xieke/anaconda3/envs/pt_source/lib/python3.6/site-packages/torch/lib/include/torch/csrc/api/include/torch/torch.h:7:2: warning: #warning "Including torch/torch.h for C++ extensions is deprecated. Please include torch/extension.h" [-Wcpp]\n #warning \\n ^\nninja: build stopped: subcommand failed.\n'
(pt_source) xieke@ubuntu:~/wuyong/codes/DANet/danet$ CUDA_VISIBLE_DEVICES=0,1,2,3 python test.py --dataset cityscapes --model danet --resume-dir cityscapes/model --base-size 2048 --crop-size 1024 --workers 1 --backbone resnet101 --multi-grid --multi-dilation 4 8 16 --eval --multi-scales
Traceback (most recent call last):
File "/home/xieke/anaconda3/envs/pt_source/lib/python3.6/site-packages/torch/utils/cpp_extension.py", line 946, in _build_extension_module
check=True)
File "/home/xieke/anaconda3/envs/pt_source/lib/python3.6/subprocess.py", line 438, in run
output=stdout, stderr=stderr)
subprocess.CalledProcessError: Command '['ninja', '-v']' returned non-zero exit status 1.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "test.py", line 16, in
import encoding.utils as utils
File "/home/xieke/anaconda3/envs/pt_source/lib/python3.6/site-packages/encoding/init.py", line 13, in
from . import nn, functions, dilated, parallel, utils, models, datasets
File "/home/xieke/anaconda3/envs/pt_source/lib/python3.6/site-packages/encoding/nn/init.py", line 12, in
from .encoding import *
File "/home/xieke/anaconda3/envs/pt_source/lib/python3.6/site-packages/encoding/nn/encoding.py", line 19, in
from ..functions import scaledL2, aggregate, pairwise_cosine
File "/home/xieke/anaconda3/envs/pt_source/lib/python3.6/site-packages/encoding/functions/init.py", line 2, in
from .encoding import *
File "/home/xieke/anaconda3/envs/pt_source/lib/python3.6/site-packages/encoding/functions/encoding.py", line 14, in
from .. import lib
File "/home/xieke/anaconda3/envs/pt_source/lib/python3.6/site-packages/encoding/lib/init.py", line 12, in
], build_directory=cpu_path, verbose=False)
File "/home/xieke/anaconda3/envs/pt_source/lib/python3.6/site-packages/torch/utils/cpp_extension.py", line 645, in load
is_python_module)
File "/home/xieke/anaconda3/envs/pt_source/lib/python3.6/site-packages/torch/utils/cpp_extension.py", line 814, in _jit_compile
with_cuda=with_cuda)
File "/home/xieke/anaconda3/envs/pt_source/lib/python3.6/site-packages/torch/utils/cpp_extension.py", line 863, in _write_ninja_file_and_build
_build_extension_module(name, build_directory, verbose)
File "/home/xieke/anaconda3/envs/pt_source/lib/python3.6/site-packages/torch/utils/cpp_extension.py", line 959, in build_extension_module
raise RuntimeError(message)
RuntimeError: Error building extension 'enclib_cpu': b"[1/2] c++ -MMD -MF roi_align_cpu.o.d -DTORCH_EXTENSION_NAME=enclib_cpu -DTORCH_API_INCLUDE_EXTENSION_H -isystem /home/xieke/anaconda3/envs/pt_source/lib/python3.6/site-packages/torch/lib/include -isystem /home/xieke/anaconda3/envs/pt_source/lib/python3.6/site-packages/torch/lib/include/torch/csrc/api/include -isystem /home/xieke/anaconda3/envs/pt_source/lib/python3.6/site-packages/torch/lib/include/TH -isystem /home/xieke/anaconda3/envs/pt_source/lib/python3.6/site-packages/torch/lib/include/THC -isystem /home/xieke/anaconda3/envs/pt_source/include/python3.6m -fPIC -std=c++11 -c /home/xieke/anaconda3/envs/pt_source/lib/python3.6/site-packages/encoding/lib/cpu/roi_align_cpu.cpp -o roi_align_cpu.o\nFAILED: roi_align_cpu.o \nc++ -MMD -MF roi_align_cpu.o.d -DTORCH_EXTENSION_NAME=enclib_cpu -DTORCH_API_INCLUDE_EXTENSION_H -isystem /home/xieke/anaconda3/envs/pt_source/lib/python3.6/site-packages/torch/lib/include -isystem /home/xieke/anaconda3/envs/pt_source/lib/python3.6/site-packages/torch/lib/include/torch/csrc/api/include -isystem /home/xieke/anaconda3/envs/pt_source/lib/python3.6/site-packages/torch/lib/include/TH -isystem /home/xieke/anaconda3/envs/pt_source/lib/python3.6/site-packages/torch/lib/include/THC -isystem /home/xieke/anaconda3/envs/pt_source/include/python3.6m -fPIC -std=c++11 -c /home/xieke/anaconda3/envs/pt_source/lib/python3.6/site-packages/encoding/lib/cpu/roi_align_cpu.cpp -o roi_align_cpu.o\n/home/xieke/anaconda3/envs/pt_source/lib/python3.6/site-packages/encoding/lib/cpu/roi_align_cpu.cpp: In function 'at::Tensor ROIAlignForwardCPU(const at::Tensor&, const at::Tensor&, int64_t, int64_t, double, int64_t)':\n/home/xieke/anaconda3/envs/pt_source/lib/python3.6/site-packages/encoding/lib/cpu/roi_align_cpu.cpp:407:30: error: 'struct at::Type' has no member named 'tensor'\n auto output = input.type().tensor({num_rois, channels, pooled_height, pooled_width});\n ^\nIn file included from /home/xieke/anaconda3/envs/pt_source/lib/python3.6/site-packages/torch/lib/include/ATen/ATen.h:9:0,\n from /home/xieke/anaconda3/envs/pt_source/lib/python3.6/site-packages/encoding/lib/cpu/roi_align_cpu.cpp:1:\n/home/xieke/anaconda3/envs/pt_source/lib/python3.6/site-packages/encoding/lib/cpu/roi_align_cpu.cpp: In lambda function:\n/home/xieke/anaconda3/envs/pt_source/lib/python3.6/site-packages/encoding/lib/cpu/roi_align_cpu.cpp:425:27: error: expected primary-expression before '>' token\n output.data<scalar_t>());\n ^\n/home/xieke/anaconda3/envs/pt_source/lib/python3.6/site-packages/encoding/lib/cpu/roi_align_cpu.cpp:425:29: error: expected primary-expression before ')' token\n output.data<scalar_t>());\n ^\n/home/xieke/anaconda3/envs/pt_source/lib/python3.6/site-packages/encoding/lib/cpu/roi_align_cpu.cpp: In lambda function:\n/home/xieke/anaconda3/envs/pt_source/lib/python3.6/site-packages/encoding/lib/cpu/roi_align_cpu.cpp:425:27: error: expected primary-expression before '>' token\n output.data<scalar_t>());\n ^\n/home/xieke/anaconda3/envs/pt_source/lib/python3.6/site-packages/encoding/lib/cpu/roi_align_cpu.cpp:425:29: error: expected primary-expression before ')' token\n output.data<scalar_t>());\n ^\n/home/xieke/anaconda3/envs/pt_source/lib/python3.6/site-packages/encoding/lib/cpu/roi_align_cpu.cpp: In function 'at::Tensor ROIAlignBackwardCPU(const at::Tensor&, const at::Tensor&, int64_t, int64_t, int64_t, int64_t, int64_t, int64_t, double, 
int64_t)':\n/home/xieke/anaconda3/envs/pt_source/lib/python3.6/site-packages/encoding/lib/cpu/roi_align_cpu.cpp:454:37: error: 'struct at::Type' has no member named 'tensor'\n auto grad_in = bottom_rois.type().tensor({b_size, channels, height, width}).zero
(); \n ^\nIn file included from /home/xieke/anaconda3/envs/pt_source/lib/python3.6/site-packages/torch/lib/include/ATen/ATen.h:9:0,\n from /home/xieke/anaconda3/envs/pt_source/lib/python3.6/site-packages/encoding/lib/cpu/roi_align_cpu.cpp:1:\n/home/xieke/anaconda3/envs/pt_source/lib/python3.6/site-packages/encoding/lib/cpu/roi_align_cpu.cpp: In lambda function:\n/home/xieke/anaconda3/envs/pt_source/lib/python3.6/site-packages/encoding/lib/cpu/roi_align_cpu.cpp:470:28: error: expected primary-expression before '>' token\n grad_in.data<scalar_t>(),\n ^\n/home/xieke/anaconda3/envs/pt_source/lib/python3.6/site-packages/encoding/lib/cpu/roi_align_cpu.cpp:470:30: error: expected primary-expression before ')' token\n grad_in.data<scalar_t>(),\n ^\n/home/xieke/anaconda3/envs/pt_source/lib/python3.6/site-packages/encoding/lib/cpu/roi_align_cpu.cpp: In lambda function:\n/home/xieke/anaconda3/envs/pt_source/lib/python3.6/site-packages/encoding/lib/cpu/roi_align_cpu.cpp:470:28: error: expected primary-expression before '>' token\n grad_in.data<scalar_t>(),\n ^\n/home/xieke/anaconda3/envs/pt_source/lib/python3.6/site-packages/encoding/lib/cpu/roi_align_cpu.cpp:470:30: error: expected primary-expression before ')' token\n grad_in.data<scalar_t>(),\n ^\nninja: build stopped: subcommand failed.\n"

change dataset to train

Hi, I was wondering if you could provide the steps for changing the dataset: apart from changing the dataset name typed in the terminal, what else do we need to do?

convert Cityscapes dataset to 19 categories.

It's wonderful work.
In the Dataset section of Usage in the readme file, you mention that

  1. Download the Cityscapes dataset and convert the dataset to 19 categories.

Here, why do we first convert the Cityscapes dataset to 19 categories rather than directly using the original one?
The question may be too simple, but I'm still looking forward to your reply if you have enough time.

Training setting for cocostuff?

Hi! Thank you for your great work! Could you please tell me the settings for COCO Stuff-10k needed to reproduce your results (like base-size, crop-size, epochs, batch-size, lr, multi-dilation, etc.)?

Thanks again!

I CANNOT get the same result

I trained the model using the pretrained ResNet-101 with the Cityscapes dataset, but I only got 70.07 mIoU. How can I get the same result as the paper? Please, thanks!
I set batch_size to 15 (5 GPUs) and epochs to 120.

Install from PyTorch@commit fd25a2a

Hi,
I followed the tutorial to install pytorch@commit fd25a2a from source. But when I ran the following command to install the third-party submodules:
git submodule update --init --recursive

Then the following error message appeared:
remote: Repository not found.
fatal: repository 'https://github.com/NervanaSystems/nervanagpu.git/' not found
fatal: clone of 'https://github.com/NervanaSystems/nervanagpu.git' into submodule path 'third_party/nervanagpu' failed

That means the repository is gone, and I can't install PyTorch from source.
Do you know how to solve this?

The difference between the paper and the code

"Given a local feature A, we first feed it into a convolution layer with batch normalization and ReLU layers to generate two new feature maps B and C." (in the paper)
But there is no BN or ReLU in the code: self.query_conv = Conv2d(in_channels=in_dim, out_channels=in_dim//8, kernel_size=1) and self.key_conv = Conv2d(in_channels=in_dim, out_channels=in_dim//8, kernel_size=1) in PAM_Module.
Which is right?
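
For concreteness, the two variants in question look like this; a sketch only, with the 1x1 kernel size assumed for the paper-style branch since the text does not pin it down:

import torch.nn as nn

in_dim = 512
# As described in the paper's text: convolution + BN + ReLU
paper_style = nn.Sequential(
    nn.Conv2d(in_dim, in_dim // 8, kernel_size=1, bias=False),
    nn.BatchNorm2d(in_dim // 8),
    nn.ReLU(inplace=True),
)
# As written in PAM_Module: a bare 1x1 convolution
code_style = nn.Conv2d(in_dim, in_dim // 8, kernel_size=1)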

finetuning

Hi @junfu1115 ,

I was wondering how to fine-tune the model on a custom dataset.
I have noticed that there is an --ft argument which can be used, but I am not sure about the correct procedure.

Reproducing Cityscapes

Running the command supplied by the repo,

CUDA_VISIBLE_DEVICES=0,1,2,3 python train.py --dataset cityscapes --model danet --backbone resnet101 --checkname danet101 --base-size 1024 --crop-size 768 --epochs 240 --batch-size 8 --lr 0.003 --workers 2 --multi-grid --multi-dilation 4 8 16

returned an mIoU of 0.735 at the end of 240 epochs. There is probably some randomness in running through the dataset, but I'm surprised it was that much lower than anticipated. Any advice on how to get closer to the reported scores?

Trained DANet model is damaged

After the download of the compressed package completes, it cannot be decompressed; the file is damaged. Could it be re-uploaded, please?

About 'self.gamma'

Hi! I do not understand why 'self.gamma' is set to zero; in that case, 'out' equals 'x', right? Thanks a lot!

"class PAM_Module(nn.Module):
"""Position attention module"""

# Ref from Sagan
def __init__(self, in_dim):
    super(PAM_Module, self).__init__()
    # self.channel_in = in_dim

    self.query_conv = nn.Conv2d(in_channels=in_dim, out_channels=in_dim // 8, kernel_size=1)
    self.key_conv = nn.Conv2d(in_channels=in_dim, out_channels=in_dim // 8, kernel_size=1)
    self.value_conv = nn.Conv2d(in_channels=in_dim, out_channels=in_dim, kernel_size=1)
    self.gamma = nn.Parameter(torch.zeros(1))

    self.softmax = torch.nn.Softmax(dim=-1)

def forward(self, x):
    """
        inputs :
            x : input feature maps( B X C X H X W)
        returns :
            out : attention value + input feature
            attention: B X (HxW) X (HxW)
    """
    m_batchsize, C, height, width = x.size()
    proj_query = self.query_conv(x).view(m_batchsize, -1, width * height).permute(0, 2, 1)  # (H*W) * C//8
    proj_key = self.key_conv(x).view(m_batchsize, -1, width * height)  # C//8 * (H*W)
    energy = torch.bmm(proj_query, proj_key)  # (H*W) * (H*W)
    attention = self.softmax(energy)
    proj_value = self.value_conv(x).view(m_batchsize, -1, width * height)  # C * (H*W)

    out = torch.bmm(proj_value, attention.permute(0, 2, 1))  #
    out = out.view(m_batchsize, C, height, width)

    out = self.gamma * out + x
    return out

"

How large GPU memory you used?

Hi, @junfu1115 thanks for your work!
I am curious about the GPU memory you used: the attention introduces a lot of overhead, and the image size of Cityscapes is very large (2048x1024). I found that you resize the image to a base size of 608; is 12GB of memory enough for that?

Unable to get repr for <class 'torch.Tensor'>

Hi, when I train with my own data and compute the loss, I get the following error:

Unable to get repr for <class 'torch.Tensor'>

Have you ever encountered this problem? How should I solve it?

`t >= 0 && t < n_classes`

I use my own dataset, which has 81 categories. When training the model, I get an error like:
/pytorch/aten/src/THCUNN/SpatialClassNLLCriterion.cu:99: void cunn_SpatialClassNLLCriterion_updateOutput_kernel(T *, T *, T *, long *, T *, int, int, int, int, int, long) [with T = float, AccumT = float]: block: [1,0,0], thread: [246,0,0] Assertion t >= 0 && t < n_classes failed.
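
This assertion fires when a target label falls outside [0, n_classes). A hedged debugging sketch, assuming masks are stored as PNG files and the loss is configured with an ignore_index (adjust NUM_CLASS and the ignore value to your setup):

import numpy as np
from PIL import Image

NUM_CLASS = 81  # from the report above

def check_mask(path, ignore_index=255):
    """Print any label values that would trip the CUDA assertion."""
    mask = np.array(Image.open(path)).astype(np.int64)
    values = np.unique(mask)
    bad = values[(values != ignore_index) & ((values < 0) | (values >= NUM_CLASS))]
    if bad.size:
        print(f"{path}: labels out of range [0, {NUM_CLASS}): {bad}")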

baseline FCN

  The baseline FCN achieves 70.03% mIoU in your paper; now I want to reproduce that result based on your paper, but my result is not good.
  I noticed the paper used LR = 0.001, but the repo uses LR = 0.003.
  Can you tell me more details about the training process? Thank you.

About the `Softmax` in the `CAM_Module`

energy_new = torch.max(energy, -1, keepdim=True)[0].expand_as(energy)-energy


Is your code trying to improve numerical stability? Maybe it should be in this form:

    energy_new = torch.max(energy, -1, keepdim=True)
    energy_new = energy_new[0].expand_as(energy)
    energy_new = energy - energy_new
    attention = self.softmax(energy_new)
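
For reference, softmax is invariant to subtracting a per-row constant, so the suggested form changes nothing except numerical stability, whereas the repo's max - energy variant also flips the sign of the logits. A quick standalone check:

import torch

energy = torch.randn(3, 5)
a = torch.softmax(energy, dim=-1)
b = torch.softmax(energy - energy.max(dim=-1, keepdim=True)[0], dim=-1)
print(torch.allclose(a, b))  # True: shifting by the row max is a no-op in value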

Install pytorch commit fd25a2a

I was running evaluation on Cityscapes and did not find args.test_scale anywhere.
It's not mentioned in options.py either.

Error:

if len(args.test_scale) == 1:
AttributeError: 'Namespace' object has no attribute 'test_scale'

Running this from terminal:
CUDA_VISIBLE_DEVICES=0 python3.6 test.py --dataset cityscapes --model danet --resume-dir cityscapes/model --crop-size 768 --workers 1 --backbone resnet101 --multi-grid --multi-dilation 4 8 16 --eval

Bug in encoding/parallel.py

I was wondering if targets[0] is missing in line 182: _worker(0, modules[0], inputs[0], kwargs_tup[0], devices[0])

Performance issue

I cloned your code and ran the training script.
But I only get 73.5% mIoU on the val set (Cityscapes).
Are there any training tricks?

training with single GPU

Hi,

So while trying to train the network I encountered this error. I can't figure out what the mistake is. I'm using the proper PyTorch commit and have not made any modifications to the code.

From my terminal:
CUDA_VISIBLE_DEVICES=0 python train.py --dataset cityscapes --model danet --backbone resnet101 --checkname danet101 --base-size 1024 --crop-size 768 --epochs 240 --batch-size 8 --lr 0.003 --workers 2 --multi-grid --multi-dilation 4 8 16

Error:

Traceback (most recent call last):
  File "train.py", line 201, in <module>
    trainer.training(epoch)
  File "train.py", line 125, in training
    outputs = self.model(image)
  File "/datadrive/virtualenvs/torchDA/lib/python3.6/site-packages/torch/nn/modules/module.py", line 477, in __call__
    result = self.forward(*input, **kwargs)
  File "/datadrive/virtualenvs/torchDA/lib/python3.6/site-packages/torch/nn/parallel/data_parallel.py", line 121, in forward
    return self.module(*inputs[0], **kwargs[0])
  File "/datadrive/virtualenvs/torchDA/lib/python3.6/site-packages/torch/nn/modules/module.py", line 477, in __call__
    result = self.forward(*input, **kwargs)
  File "/datadrive/virtualenvs/torchDA/lib/python3.6/site-packages/encoding/models/danet.py", line 45, in forward
    _, _, c3, c4 = self.base_forward(x)
  File "/datadrive/virtualenvs/torchDA/lib/python3.6/site-packages/encoding/models/base.py", line 58, in base_forward
    x = self.pretrained.bn1(x)
  File "/datadrive/virtualenvs/torchDA/lib/python3.6/site-packages/torch/nn/modules/module.py", line 477, in __call__
    result = self.forward(*input, **kwargs)
  File "/datadrive/virtualenvs/torchDA/lib/python3.6/site-packages/encoding/nn/syncbn.py", line 57, in forward
    mean, inv_std = self._slave_pipe.run_slave(_ChildMessage(xsum, xsqsum, N))
AttributeError: 'NoneType' object has no attribute 'run_slave'

What does NUM_CLASS mean?

When I use custom images to train the model, what does NUM_CLASS mean? Does it mean the number of classes in one image?
