open-aff's Introduction

Attentional Feature Fusion

MXNet/Gluon code for "Attentional Feature Fusion" https://arxiv.org/abs/2009.14082

What's in this repo so far:

  • Code, trained models, and training logs for ImageNet

PS:

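  • If you are a reviewer of our submitted paper, please note that the accuracy of the current implementation is slightly higher than the accuracy reported in the paper, because it is a new implementation with a bag of tricks.
  • If you are a reviewer of my dissertation and find that the numbers in the thesis differ from those in this repo, it is because I re-implemented the code after submitting the thesis and added tricks such as AutoAugment and Label Smoothing. The classification accuracy in this repo is therefore somewhat higher than the numbers in the thesis. Thank you for your understanding.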

Change Logs:

  • 2020-10-08: Re-implement the image classification code with a bag of tricks
  • 2020-09-29: Upload the image classification code and trained models for the submitted paper

To Do:

  • Running AFF-ResNeXt-50 and AFF-ResNet-50 on ImageNet
  • Update Grad-CAM results on newly trained models
  • Re-implement the segmentation code
  • Convert to PyTorch

In Progress:

  • Running iAFF-ResNeXt-50 on ImageNet

Done:

  • Re-implement the image classification code with a bag of tricks

Requirements

Install MXNet and Gluon-CV:

pip install --upgrade mxnet-cu101 gluoncv

If you are going to use autoaugment:

python3 -m pip install --upgrade "mxnet_cu101<2.0.0"
python3 -m pip install autogluon

Experiments

All trained model params and training logs are in ./params

The training commands / shell scripts are in cmd_scripts.txt

ImageNet

Architecture                                 Params   Top-1 err. (%)
ResNet-101 [1]                               42.5M    23.2
Efficient-Channel-Attention-Net-101 [2]      42.5M    21.4
Attention-Augmented-ResNet-101 [3]           45.4M    21.3
SENet-101 [4]                                49.4M    20.9
Gather-Excite-$\theta^{+}$-ResNet-101 [5]    58.4M    20.7
Local-Importance-Pooling-ResNet-101 [6]      42.9M    20.7
AFF-ResNet-50 (ours)                         30.3M    20.3
iAFF-ResNet-50 (ours)                        35.1M    20.2
iAFF-ResNeXt-50-32x4d (ours)                 34.7M    19.78

PyTorch Version

@bobo0810 has contributed the PyTorch version. Please check the aff_pytorch directory for details.

Many thanks to @bobo0810 for his contribution.

Citation

Please cite our paper in your publications if our work helps your research. BibTeX reference is as follows.

@inproceedings{dai21aff,
  title   =  {Attentional Feature Fusion},
  author  =  {Yimian Dai and Fabian Gieseke and Stefan Oehmcke and Yiquan Wu and Kobus Barnard},
  booktitle =  {{IEEE} Winter Conference on Applications of Computer Vision, {WACV} 2021},
  year    =  {2021}
}

References

[1] Kaiming He, Xiangyu Zhang, Shaoqing Ren, Jian Sun: Deep Residual Learning for Image Recognition. CVPR 2016: 770-778

[2] Qilong Wang, Banggu Wu, Pengfei Zhu, Peihua Li, Wangmeng Zuo, Qinghua Hu: ECA-Net: Efficient Channel Attention for Deep Convolutional Neural Networks. CVPR 2020: 11531-11539

[3] Irwan Bello, Barret Zoph, Quoc Le, Ashish Vaswani, Jonathon Shlens: Attention Augmented Convolutional Networks. ICCV 2019: 3285-3294

[4] Jie Hu, Li Shen, Gang Sun: Squeeze-and-Excitation Networks. CVPR 2018: 7132-7141

[5] Jie Hu, Li Shen, Samuel Albanie, Gang Sun, Andrea Vedaldi: Gather-Excite: Exploiting Feature Context in Convolutional Neural Networks. NeurIPS 2018: 9423-9433

[6] Ziteng Gao, Limin Wang, Gangshan Wu: LIP: Local Importance-Based Pooling. ICCV 2019: 3354-3363

[7] Xiang Li, Wenhai Wang, Xiaolin Hu, Jian Yang: Selective Kernel Networks. CVPR 2019: 510-519

[8] Dongyoon Han, Jiwhan Kim, Junmo Kim: Deep Pyramidal Residual Networks. CVPR 2017: 6307-6315

[9] Zhichao Lu, Gautam Sreekumar, Erik D. Goodman, Wolfgang Banzhaf, Kalyanmoy Deb, Vishnu Naresh Boddeti: Neural Architecture Transfer. CoRR abs/2005.05859 (2020)

[10] Ekin D. Cubuk, Barret Zoph, Dandelion Mane, Vijay Vasudevan, Quoc V. Le: AutoAugment: Learning Augmentation Strategies From Data. CVPR 2019: 113-123

open-aff's Issues

ASKCResNetFPN

from __future__ import division
import os
from mxnet.gluon.block import HybridBlock
from mxnet.gluon import nn
from mxnet.gluon.nn import BatchNorm
from gluoncv.model_zoo.fcn import _FCNHead
from mxnet import nd

from .askc import LCNASKCFuse

from model.atac.backbone import ATACBlockV1, conv1ATAC, DynamicCell
from model.atac.convolution import LearnedCell, ChaDyReFCell, SeqDyReFCell, SK_ChaDyReFCell, \
    SK_1x1DepthDyReFCell, SK_MSSpaDyReFCell, SK_SpaDyReFCell, Direct_AddCell, SKCell, \
    SK_SeqDyReFCell, Sub_MSSpaDyReFCell, SK_MSSeqDyReFCell, iAAMSSpaDyReFCell
from model.atac.convolution import \
    LearnedConv, ChaDyReFConv, SeqDyReFConv, SK_ChaDyReFConv, \
    SK_1x1DepthDyReFConv, SK_MSSpaDyReFConv, SK_SpaDyReFConv, Direct_AddConv, SKConv, \
    SK_SeqDyReFConv
    # , SK_MSSeqDyReFConv
from .activation import xUnit, SpaATAC, ChaATAC, SeqATAC, MSSeqATAC, MSSeqATACAdd, \
    MSSeqATACConcat, MSSeqAttentionMap, xUnitAttentionMap
from model.atac.fusion import Direct_AddFuse_Reduce, SK_MSSpaFuse, SKFuse_Reduce, LocalChaFuse, \
    GlobalChaFuse, \
    LocalGlobalChaFuse_Reduce, LocalLocalChaFuse_Reduce, GlobalGlobalChaFuse_Reduce, \
    AYforXplusYChaFuse_Reduce, XplusAYforYChaFuse_Reduce, IASKCChaFuse_Reduce,\
    GAUChaFuse_Reduce, SpaFuse_Reduce, ConcatFuse_Reduce, AXYforXplusYChaFuse_Reduce,\
    BiLocalChaFuse_Reduce, BiGlobalChaFuse_Reduce, LocalGAUChaFuse_Reduce, GlobalSpaFuse,\
    AsymBiLocalChaFuse_Reduce, BiSpaChaFuse_Reduce, AsymBiSpaChaFuse_Reduce, LocalSpaFuse, \
    BiGlobalLocalChaFuse_Reduce

# from gluoncv.model_zoo.resnetv1b import BasicBlockV1b
from gluoncv.model_zoo.cifarresnet import CIFARBasicBlockV1


class ASKCResNetFPN(HybridBlock):
    def __init__(self, layers, channels, fuse_mode, act_dilation, classes=1, tinyFlag=False,
                 norm_layer=BatchNorm, norm_kwargs=None, **kwargs):
        super(ASKCResNetFPN, self).__init__(**kwargs)

        self.layer_num = len(layers)
        self.tinyFlag = tinyFlag
        with self.name_scope():

            stem_width = int(channels[0])
            self.stem = nn.HybridSequential(prefix='stem')
            self.stem.add(norm_layer(scale=False, center=False,
                                     **({} if norm_kwargs is None else norm_kwargs)))
            if tinyFlag:
                self.stem.add(nn.Conv2D(channels=stem_width*2, kernel_size=3, strides=1,
                                         padding=1, use_bias=False))
                self.stem.add(norm_layer(in_channels=stem_width*2))
                self.stem.add(nn.Activation('relu'))
            else:
                self.stem.add(nn.Conv2D(channels=stem_width, kernel_size=3, strides=2,
                                         padding=1, use_bias=False))
                self.stem.add(norm_layer(in_channels=stem_width))
                self.stem.add(nn.Activation('relu'))
                self.stem.add(nn.Conv2D(channels=stem_width, kernel_size=3, strides=1,
                                         padding=1, use_bias=False))
                self.stem.add(norm_layer(in_channels=stem_width))
                self.stem.add(nn.Activation('relu'))
                self.stem.add(nn.Conv2D(channels=stem_width*2, kernel_size=3, strides=1,
                                         padding=1, use_bias=False))
                self.stem.add(norm_layer(in_channels=stem_width*2))
                self.stem.add(nn.Activation('relu'))
                self.stem.add(nn.MaxPool2D(pool_size=3, strides=2, padding=1))

            # self.head1 = _FCNHead(in_channels=channels[1], channels=classes)
            # self.head2 = _FCNHead(in_channels=channels[2], channels=classes)
            # self.head3 = _FCNHead(in_channels=channels[3], channels=classes)
            # self.head4 = _FCNHead(in_channels=channels[4], channels=classes)

            self.head = _FCNHead(in_channels=channels[1], channels=classes)

            self.layer1 = self._make_layer(block=CIFARBasicBlockV1, layers=layers[0],
                                           channels=channels[1], stride=1, stage_index=1,
                                           in_channels=channels[1])

            self.layer2 = self._make_layer(block=CIFARBasicBlockV1, layers=layers[1],
                                           channels=channels[2], stride=2, stage_index=2,
                                           in_channels=channels[1])

            self.layer3 = self._make_layer(block=CIFARBasicBlockV1, layers=layers[2],
                                           channels=channels[3], stride=2, stage_index=3,
                                           in_channels=channels[2])

            if self.layer_num == 4:
                self.layer4 = self._make_layer(block=CIFARBasicBlockV1, layers=layers[3],
                                               channels=channels[4], stride=2, stage_index=4,
                                               in_channels=channels[3])

            if self.layer_num == 4:
                self.fuse34 = self._fuse_layer(fuse_mode, channels=channels[3],
                                               act_dilation=act_dilation)  # channels[4]

            self.fuse23 = self._fuse_layer(fuse_mode, channels=channels[2],
                                           act_dilation=act_dilation)  # 64
            self.fuse12 = self._fuse_layer(fuse_mode, channels=channels[1],
                                           act_dilation=act_dilation)  # 32

            # if fuse_order == 'reverse':
            #     self.fuse12 = self._fuse_layer(fuse_mode, channels=channels[2])  # channels[2]
            #     self.fuse23 = self._fuse_layer(fuse_mode, channels=channels[3])  # channels[3]
            #     self.fuse34 = self._fuse_layer(fuse_mode, channels=channels[4])  # channels[4]
            # elif fuse_order == 'normal':
            #     self.fuse34 = self._fuse_layer(fuse_mode, channels=channels[4])  # channels[4]
            #     self.fuse23 = self._fuse_layer(fuse_mode, channels=channels[4])  # channels[4]
            #     self.fuse12 = self._fuse_layer(fuse_mode, channels=channels[4])  # channels[4]

    def _make_layer(self, block, layers, channels, stride, stage_index, in_channels=0,
                    norm_layer=BatchNorm, norm_kwargs=None):
        layer = nn.HybridSequential(prefix='stage%d_'%stage_index)
        with layer.name_scope():
            downsample = (channels != in_channels) or (stride != 1)
            layer.add(block(channels, stride, downsample, in_channels=in_channels,
                            prefix='', norm_layer=norm_layer, norm_kwargs=norm_kwargs))
            for _ in range(layers-1):
                layer.add(block(channels, 1, False, in_channels=channels, prefix='',
                                norm_layer=norm_layer, norm_kwargs=norm_kwargs))
        return layer

    def _fuse_layer(self, fuse_mode, channels, act_dilation):
        if fuse_mode == 'Direct_Add':
            fuse_layer = Direct_AddFuse_Reduce(channels=channels)
        elif fuse_mode == 'Concat':
            fuse_layer = ConcatFuse_Reduce(channels=channels)
        elif fuse_mode == 'SK':
            fuse_layer = SKFuse_Reduce(channels=channels)
        # elif fuse_mode == 'LocalCha':
        #     fuse_layer = LocalChaFuse(channels=channels)
        # elif fuse_mode == 'GlobalCha':
        #     fuse_layer = GlobalChaFuse(channels=channels)
        elif fuse_mode == 'LocalGlobalCha':
            fuse_layer = LocalGlobalChaFuse_Reduce(channels=channels)
        elif fuse_mode == 'LocalLocalCha':
            fuse_layer = LocalLocalChaFuse_Reduce(channels=channels)
        elif fuse_mode == 'GlobalGlobalCha':
            fuse_layer = GlobalGlobalChaFuse_Reduce(channels=channels)
        elif fuse_mode == 'IASKCChaFuse':
            fuse_layer = IASKCChaFuse_Reduce(channels=channels)
        elif fuse_mode == 'AYforXplusY':
            fuse_layer = AYforXplusYChaFuse_Reduce(channels=channels)
        elif fuse_mode == 'AXYforXplusY':
            fuse_layer = AXYforXplusYChaFuse_Reduce(channels=channels)
        elif fuse_mode == 'XplusAYforY':
            fuse_layer = XplusAYforYChaFuse_Reduce(channels=channels)
        elif fuse_mode == 'GAU':
            fuse_layer = GAUChaFuse_Reduce(channels=channels)
        elif fuse_mode == 'LocalGAU':
            fuse_layer = LocalGAUChaFuse_Reduce(channels=channels)
        elif fuse_mode == 'SpaFuse':
            fuse_layer = SpaFuse_Reduce(channels=channels, act_dialtion=act_dilation)
        elif fuse_mode == 'BiLocalCha':
            fuse_layer = BiLocalChaFuse_Reduce(channels=channels)
        elif fuse_mode == 'BiGlobalLocalCha':
            fuse_layer = BiGlobalLocalChaFuse_Reduce(channels=channels)
        elif fuse_mode == 'AsymBiLocalCha':
            fuse_layer = AsymBiLocalChaFuse_Reduce(channels=channels)
        elif fuse_mode == 'BiGlobalCha':
            fuse_layer = BiGlobalChaFuse_Reduce(channels=channels)
        elif fuse_mode == 'BiSpaCha':
            fuse_layer = BiSpaChaFuse_Reduce(channels=channels)
        elif fuse_mode == 'AsymBiSpaCha':
            fuse_layer = AsymBiSpaChaFuse_Reduce(channels=channels)
        # elif fuse_mode == 'LocalSpa':
        #     fuse_layer = LocalSpaFuse(channels=channels, act_dilation=act_dilation)
        # elif fuse_mode == 'GlobalSpa':
        #     fuse_layer = GlobalSpaFuse(channels=channels, act_dilation=act_dilation)
        # elif fuse_mode == 'SK_MSSpa':
        #     # fuse_layer.add(SK_MSSpaFuse(channels=channels, act_dilation=act_dilation))
        #     fuse_layer = SK_MSSpaFuse(channels=channels, act_dilation=act_dilation)
        else:
            raise ValueError('Unknown fuse_mode')

        return fuse_layer

    def hybrid_forward(self, F, x):

        _, _, hei, wid = x.shape

        x = self.stem(x)      # down 4, 32
        c1 = self.layer1(x)   # down 4, 32
        c2 = self.layer2(c1)  # down 8, 64
        out = self.layer3(c2)  # down 16, 128
        if self.layer_num == 4:
            c4 = self.layer4(out)  # down 32
            if self.tinyFlag:
                c4 = F.contrib.BilinearResize2D(c4, height=hei//4, width=wid//4)  # down 4
            else:
                c4 = F.contrib.BilinearResize2D(c4, height=hei//16, width=wid//16)  # down 16
            out = self.fuse34(c4, out)
        if self.tinyFlag:
            out = F.contrib.BilinearResize2D(out, height=hei//2, width=wid//2)  # down 2, 128
        else:
            out = F.contrib.BilinearResize2D(out, height=hei//8, width=wid//8)  # down 8, 128
        out = self.fuse23(out, c2)
        if self.tinyFlag:
            out = F.contrib.BilinearResize2D(out, height=hei, width=wid)  # down 1
        else:
            out = F.contrib.BilinearResize2D(out, height=hei//4, width=wid//4)  # down 8
        out = self.fuse12(out, c1)

        pred = self.head(out)
        if self.tinyFlag:
            out = pred
        else:
            out = F.contrib.BilinearResize2D(pred, height=hei, width=wid)  # down 4

        ######### reverse order ##########
        # up_c2 = F.contrib.BilinearResize2D(c2, height=hei//4, width=wid//4)  # down 4
        # fuse2 = self.fuse12(up_c2, c1)  # down 4, channels[2]
        #
        # up_c3 = F.contrib.BilinearResize2D(c3, height=hei//4, width=wid//4)  # down 4
        # fuse3 = self.fuse23(up_c3, fuse2)  # down 4, channels[3]
        #
        # up_c4 = F.contrib.BilinearResize2D(c4, height=hei//4, width=wid//4)  # down 4
        # fuse4 = self.fuse34(up_c4, fuse3)  # down 4, channels[4]
        #

        ######### normal order ##########
        # out = F.contrib.BilinearResize2D(c4, height=hei//16, width=wid//16)
        # out = self.fuse34(out, c3)
        # out = F.contrib.BilinearResize2D(out, height=hei//8, width=wid//8)
        # out = self.fuse23(out, c2)
        # out = F.contrib.BilinearResize2D(out, height=hei//4, width=wid//4)
        # out = self.fuse12(out, c1)
        # out = self.head(out)
        # out = F.contrib.BilinearResize2D(out, height=hei, width=wid)


        return out

    def evaluate(self, x):
        """evaluating network with inputs and targets"""
        return self.forward(x)

Question

Hello, I would like to try adding the MS-CAM module to other network models. How well would it work if it were added at the end of the backbone?

Can you share inference time data for different AFF/iAFF models?

Hi Yimian, I came here from your WACV 2021 presentation. This work looks pretty impressive.

As we discussed during your presentation, could you share the inference time data for different configurations (ResNet-50, -101, ResNeXt-50, etc.) as well as the numbers for baseline models and the hardware details? Thank you!

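Is it possible to fuse three feature maps?

Hello, after reading your paper I have a question. Both AFF and iAFF appear to fuse two feature maps. How should they be modified to fuse three feature maps?

About applying AFF to FPN

Hello, how can AFF or iAFF be applied to an FPN? Taking AFF as an example, X and Y are feature maps from different stages, and the paper says Y is the one with higher-level semantics. For the ResNet layer downsampled by 32 (Y) and the layer downsampled by 16 (X), how can X + Y be computed? The two feature maps differ in spatial size and also in channel dimension.

Sorry to bother you. I look forward to your reply. Thank you!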

ValueError: Expected more than 1 value per channel when training, got input size torch.Size([1, 256, 1, 1])

I am using your method for feature fusion:
nn.AdaptiveAvgPool2d(1),
nn.Conv2d(channels, inter_channels, kernel_size=1, stride=1, padding=0),
nn.BatchNorm2d(inter_channels),
nn.ReLU(inplace=True),
nn.Conv2d(inter_channels, channels, kernel_size=1, stride=1, padding=0),
nn.BatchNorm2d(channels),
When the output size of the adaptive pooling in the first line is set to 1, as in the paper, it raises ValueError: Expected more than 1 value per channel when training, got input size torch.Size([1, 256, 1, 1]). Is there a way to fix this?

Training from scratch - log report

Hi,

First, I think your paper is very interesting, excellent work!

I was wondering if you have a training-from-scratch report available?
All the available reports are based on pretrained models that already have high accuracy (specifically, I am referring to the CIFAR100 experiment).

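About the BN layer

When I apply the iAFF module, I run into the following situation: (1) the batch size of F is 1; (2) global average pooling reduces the feature map to 1*1. As a result, the BN layer raises ValueError: Expected more than 1 value per channel when training, got input size torch.Size([1, 256, 1, 1]). Is there a way to handle this?

Is there a pretrained ResNet-18 model with the attention mechanism?

Hello, is there a pretrained ResNet-18 model with this attention mechanism? Also, can this attention mechanism better capture low-level detail, like the operators used in traditional image processing? Finally, can it be used plug-and-play, without increasing the number of network parameters? Thank you, I look forward to your answer.

Some questions

1. Which part of the code applies AFF in the FPN described in your paper? I could not find it.
2. For the Global + Local variant, on which branch did you add the global pooling, or does either of the two branches work?
I would appreciate your help. Thank you.

Error when running the module

Hello, after adding the module you provided to my network, I get the following error. How can I fix it?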
RuntimeError: Input type (torch.cuda.FloatTensor) and weight type (torch.FloatTensor) should be the same

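About the Scale of Channel Attention

First:
Why does multi-scale information also exist across channels?
Second:
What does the computed attention map H mean, and what does H<i,j,C> represent? Does it represent the dependency between channels?

Reply:

Thank you for your message.

More precisely, the paper argues that channel attention should also have the notion/property of scale. What SENet / SKNet use is only the extreme case: channel attention at the largest scale, the Global Scale. The other branch used in the AFF paper is the opposite case: channel attention at the smallest, most local scale. AFF uses the simplest form of multi-scale, Local + Global, to aggregate multi-scale information.

The feature map has size C x H x W. Because we use Local Channel Attention, the computed attention map also has size C x H x W, which gives a local / element-wise refinement. In contrast, in SENet a channel's weight is applied to the whole H x W plane, so every element of the H x W feature map receives the same weight. Dependencies between channels are captured as usual: SENet uses Fully Connected layers, and AFF does the same with Point-wise Conv. In the Global branch, Point-wise Conv and Fully Connected are exactly the same.

In short, the paper's hypothesis is that channel attention should also have a scale, and the variable that controls the scale is the pooling size. This is the same idea as classical methods such as SIFT, which build different scale spaces by controlling the size of the Gaussian filter, except that AFF uses average pooling. Once you accept that channel attention should also have a notion of scale, the rest follows.

Finally, one request: it would be best to hold these discussions in the repository's Issues at https://github.com/YimianDai/open-aff/issues , so that everyone can see them.

Wishing you good health and success in your work.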
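
The shapes described in the reply can be checked with a small NumPy sketch. It is only an illustration under simplifying assumptions: a single hypothetical weight matrix stands in for the bottleneck convolutions, and batch normalization is left out, so it is not the repository's implementation.

```python
import numpy as np

def pointwise_conv(x, weight):
    """1x1 convolution: x is (C_in, H, W), weight is (C_out, C_in)."""
    return np.einsum("oc,chw->ohw", weight, x)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(0)
C, H, W = 8, 4, 4
x = rng.standard_normal((C, H, W))
weight = rng.standard_normal((C, C))  # hypothetical weights (assumption)

# Local branch: no pooling, so every position gets its own channel weights.
local_logits = pointwise_conv(x, weight)          # shape (C, H, W)

# Global branch: average pooling over H x W first, as in SENet.
pooled = x.mean(axis=(1, 2), keepdims=True)       # shape (C, 1, 1)
global_logits = pointwise_conv(pooled, weight)    # shape (C, 1, 1)

# On a 1x1 map the point-wise conv equals a fully connected layer.
fc_logits = weight @ pooled.reshape(C)            # shape (C,)

# Local + Global: broadcasting adds the global term at every position.
attention = sigmoid(local_logits + global_logits) # shape (C, H, W)
```

Changing the pooling size between these two extremes would give intermediate scales, which is the variable the reply refers to.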

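About network visualization

Hello, and thank you very much for your work! I have a simple question: how did you produce the network visualization?
Did you extract the final feature map, render it as a color map according to the values, adjust the transparency, and overlay it on the original image?
Or did you use a different approach?

AFF-FPN

Hello, following the description in the paper, building AFF-FPN gives a very large number of output channels at the end. See the figure, taking ResNet-18 as an example.
[image: 微信图片_20210601111510]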

__init__() takes from 1 to 3 positional arguments but 6 were given

Hi, I was trying to modify the class AFF() code to support a new version of Keras, but I am struggling with this error.
The modified AFF class

class AFF(tf.keras.layers.Layer):
    '''
    Multi-feature fusion AFF
    '''

    def __init__(self, channels=64, r=4):
        super().__init__()
        inter_channels = int(channels // r)

        self.local_att = tf.keras.Sequential(
            Conv2D(filters=64, kernel_size=(3,3), strides=1, padding='same'),
            tf.keras.layers.BatchNormalization(inter_channels),
            tf.keras.layers.ReLU(),
            Conv2D(filters=64, kernel_size=(3,3), strides=1, padding='same'),
            tf.keras.layers.BatchNormalization(channels),
        )

        self.global_att = tf.keras.Sequential(
            tf.keras.layers.AveragePooling2D(1),
            Conv2D(filters=64, kernel_size=(3,3), strides=1, padding='same'),
            tf.keras.layers.BatchNormalization(inter_channels),
            tf.keras.layers.ReLU(),
            Conv2D(filters=64, kernel_size=(3,3), strides=1, padding='same'),
            tf.keras.layers.BatchNormalization(channels),
        )

        self.sigmoid = nn.Sigmoid()

    def forward(self, x, residual):
        xa = x + residual
        xl = self.local_att(xa)
        xg = self.global_att(xa)
        xlg = xl + xg
        wei = self.sigmoid(xlg)

        xo = 2 * x * wei + 2 * residual * (1 - wei)
        return xo

The error
[image not preserved]

The create model function

def create_model():
    tf.keras.backend.clear_session()

    input = Input(shape=(256,256,3), name="input_layer")
    print("Input =",input.shape)

    conv_block = Convolutional_block()(input)
    print("Conv block =",conv_block.shape)
    ca_block = Channel_attention()(conv_block)
    sa_block = SpatialGate()(conv_block)
    # AFF block instead of concatenate
    ca_block = AFF()(ca_block)

    model = Model(inputs=[input], outputs=[ca_block])
    return model

model = create_model()
model.summary()

Input is an image of size 256,256,3
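
One likely reading of the message in the title: the tf.keras.Sequential constructor accepts a single list of layers (plus an optional name), while the class above passes five layers as separate positional arguments, which together with self makes six. The sketch below reproduces the same TypeError with a plain-Python stand-in class (an assumption for illustration, not Keras itself) and shows the list form that avoids it.

```python
class Sequential:
    """Stand-in with the same constructor shape as tf.keras.Sequential."""

    def __init__(self, layers=None, name=None):
        self.layers = list(layers or [])
        self.name = name

# Placeholders for the five layer objects built in local_att.
layers = ["conv", "bn", "relu", "conv", "bn"]

try:
    Sequential(*layers)      # five positional arguments, as in the issue
except TypeError as err:
    message = str(err)
    print(message)

model = Sequential(layers)   # one list argument: accepted
print(len(model.layers))
```

Other lines in the posted class also look carried over from PyTorch and would probably need attention afterwards: nn.Sigmoid is not a Keras layer, Keras layers implement call rather than forward, the first positional argument of BatchNormalization is the axis rather than a channel count, and AFF()(ca_block) passes only one of the two inputs the fusion expects.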
        

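A question about network visualization

Hello, and thank you very much for your work! I have a simple question: how did you produce the network visualization?
Did you extract the final feature map, render it as a color map according to the values, adjust the transparency, and overlay it on the original image?
Or did you use a different approach? Thank you.

ResNet-50 pretrained model

Hello! I have read your paper and code and learned a great deal. Thank you. The ResNet-50 pretrained model you provide is saved in MXNet's .params format. How can it be converted for use in PyTorch? Alternatively, could you provide a PyTorch version of the pretrained model? I look forward to your answer. Many thanks!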

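Batch norm problem

Hello, I have recently been reproducing this in PyTorch, but I ran into a problem.
ValueError: Expected more than 1 value per channel when training, got input size torch.Size([1, 64, 1, 1])

The cause is that after GlobalAvgPooling the feature map has size C x 1 x 1, and BN raises this error on a 1 x 1 feature map.
You can reproduce the problem with the following code: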
import torch
a = torch.randn(1, 64, 1, 1)
bn = torch.nn.BatchNorm2d(64)
bn(a)
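
Why the error appears can be sketched without PyTorch. In training mode, batch norm estimates a mean and variance per channel from the N*H*W values in the batch; with batch size 1 and a 1 x 1 map there is exactly one value per channel, so there is no spread to estimate. The helper below is a hypothetical plain-Python illustration of that counting argument, not PyTorch's code.

```python
def batch_norm_train_stats(batch):
    """Per-channel (mean, variance) over the N*H*W values of an (N, C, H, W) nested list."""
    num_channels = len(batch[0])
    stats = []
    for ch in range(num_channels):
        # Gather every value of this channel across samples and positions.
        values = [v for sample in batch for row in sample[ch] for v in row]
        if len(values) <= 1:
            # Same condition the PyTorch message describes.
            raise ValueError("Expected more than 1 value per channel when training")
        mean = sum(values) / len(values)
        var = sum((v - mean) ** 2 for v in values) / len(values)
        stats.append((mean, var))
    return stats

single = [[[[0.5]], [[1.5]]]]     # shape (1, 2, 1, 1): one value per channel
pair = [[[[1.0]]], [[[3.0]]]]     # shape (2, 1, 1, 1): two values per channel

print(batch_norm_train_stats(pair))   # [(2.0, 1.0)]
```

Commonly used ways around this, none of them confirmed by the author in this thread, are to keep training batches larger than one (for example by dropping a trailing batch of size 1) or to switch the module to evaluation mode for inference, where BN uses its running statistics.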

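Using AFF in FPN or U-Net structures

Hello, in my experiments I wanted to use AFF in a decoder structure. I found that AFF helps at the deepest scale, but adding it at other scales, or at all scales, makes the results worse. What might be the problem, and which parts need to be adjusted?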

CIFAR-100 dataset in train_cifar.py has only 20 classes.

I am interested in AFF-ResNets, and I evaluated the performance of AFF-ResNets with both this implementation and my own implementation.
Through the experiments, I found a bug.

In train_cifar.py, CIFAR-100 dataset is loaded as follows:

train_data = gluon.data.DataLoader(
   gluon.data.vision.CIFAR100(train=True).transform_first(transform_train),
   batch_size=batch_size, shuffle=True, last_batch='discard', num_workers=num_workers)

However, according to the Gluon API reference, CIFAR-100 with default settings has only 20 classes.

https://mxnet.apache.org/versions/1.7/api/python/docs/api/gluon/data/vision/datasets/index.html#mxnet.gluon.data.vision.datasets.CIFAR100

Option fine_label=True is required to compare the performance with other models on CIFAR-100 classification task.
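
A quick way to confirm which label set a dataset object exposes is to count its distinct labels: CIFAR-100 has 100 fine labels and 20 coarse superclasses. The helper below is a hypothetical sketch (count_classes is not part of this repo or of Gluon) and is shown on a toy list rather than the real dataset.

```python
def count_classes(dataset):
    """Number of distinct labels in an iterable of (sample, label) pairs."""
    return len({int(label) for _, label in dataset})

# Toy stand-in for a dataset. With Gluon one would pass, for example,
# gluon.data.vision.CIFAR100(train=True, fine_label=True) instead,
# where fine_label=True is the option named in this issue.
toy = [(None, 0), (None, 1), (None, 1), (None, 19)]
num_classes = count_classes(toy)
print(num_classes)
```

On the real training set, a result of 20 would indicate the coarse labels and 100 the fine labels.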

Typo?

What does z^' in the Eq. (2) mean?

TypeError: forward() missing 1 required positional argument: 'residual'

Hi, thanks for your outstanding work. I am trying to add your AFF module to my model's decoder and unfortunately I get this error. I don't know which value I should pass as residual, or how to solve this issue.
Hint:

    self.AFFBlock4 = AFF(512)
    self.AFFBlock3 = AFF(256)
    self.AFFBlock2 = AFF(128)
    self.AFFBlock1 = AFF(64) 

Used in the decoder:

    d4 = self.decoder4(e4) + e3
    d4 = self.AFFBlock4(d4)
    d3 = self.decoder3(d4) + e2
    d3 = self.AFFBlock3(d3)
    d2 = e1 + F.upsample(self.decoder2(d3), (e1.size(2), e1.size(3)), mode='bilinear')
    #d2 = self.decoder2(d3) + e1
    d2 = self.AFFBlock2(d2)
    d1 = self.decoder1(d2) + x
    d1 = self.AFFBlock1(d1)
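
The error message says that AFF's forward takes two inputs, x and residual, as in the forward(self, x, residual) definition quoted in an earlier issue on this page. The snippet above adds the decoder output and the skip connection first and then passes only the sum. A plausible change, offered as an assumption rather than the author's answer, is to pass the two tensors separately, for example self.AFFBlock4(self.decoder4(e4), e3), provided both have the channel count given to AFF. The sketch below shows the fusion formula from that earlier issue on plain numbers, with a hypothetical attention_logits callable standing in for the attention branches.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def aff_fuse(x, residual, attention_logits):
    """Weighted fusion of two inputs: xo = 2*x*wei + 2*residual*(1 - wei)."""
    wei = sigmoid(attention_logits(x + residual))
    return 2 * x * wei + 2 * residual * (1 - wei)

def zero_logits(summed):
    # Hypothetical attention branch that always returns 0, so wei = 0.5.
    return 0.0

out = aff_fuse(3.0, 5.0, zero_logits)
print(out)   # 8.0
```

With a weight of 0.5 the fusion reduces to plain addition, which is what the original decoder lines compute; the attention branches are what move the result toward one input or the other.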
