yimiandai / open-aff

code and trained models for "Attentional Feature Fusion"

open-aff's Introduction

Attentional Feature Fusion

MXNet/Gluon code for "Attentional Feature Fusion" https://arxiv.org/abs/2009.14082

What's in this repo so far:

Code, trained models, and training logs for ImageNet

PS:

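If you are a reviewer of our submitted paper, please note that the accuracy of the current implementation is slightly higher than the accuracy reported in the paper, because it is a new implementation with a bag of tricks.
If you are a reviewer of my dissertation and notice that its numbers differ from those in this repo: after submitting the dissertation I re-implemented the code and added tricks such as AutoAugment and Label Smoothing, so the classification accuracy in this repo is currently somewhat higher than the numbers in the dissertation. Thank you for your understanding.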

Change Logs:

2020-10-08: Re-implement the image classification code with a bag of tricks
2020-09-29: Upload the image classification code and trained models for the submitted paper

To Do:

Running AFF-ResNeXt-50 and AFF-ResNet-50 on ImageNet
Update Grad-CAM results on newly trained models
Re-implement the segmentation code
Convert to PyTorch

In Progress:

Running iAFF-ResNeXt-50 on ImageNet

Done:

Re-implement the image classification code with a bag of tricks

Requirements

Install MXNet and Gluon-CV:

pip install --upgrade mxnet-cu101 gluoncv

If you are going to use autoaugment:

python3 -m pip install --upgrade "mxnet_cu101<2.0.0"
python3 -m pip install autogluon

Experiments

All trained model params and training logs are in ./params

The training commands / shell scripts are in cmd_scripts.txt

ImageNet

Architecture	Params	top-1 err. (%)
ResNet-101 [1]	42.5M	23.2
Efficient-Channel-Attention-Net-101 [2]	42.5M	21.4
Attention-Augmented-ResNet-101 [3]	45.4M	21.3
SENet-101 [4]	49.4M	20.9
Gather-Excite-$\theta^{+}$-ResNet-101 [5]	58.4M	20.7
Local-Importance-Pooling-ResNet-101 [6]	42.9M	20.7
AFF-ResNet-50 (ours)	30.3M	20.3
iAFF-ResNet-50 (ours)	35.1M	20.2
iAFF-ResNeXt-50-32x4d (ours)	34.7M	19.78

PyTorch Version

@bobo0810 has contributed the PyTorch version. Please check the aff_pytorch directory for details.

Many thanks to @bobo0810 for his contribution.

Citation

Please cite our paper in your publications if our work helps your research. BibTeX reference is as follows.

@inproceedings{dai21aff,
  title   =  {Attentional Feature Fusion},
  author  =  {Yimian Dai and Fabian Gieseke and Stefan Oehmcke and Yiquan Wu and Kobus Barnard},
  booktitle =  {{IEEE} Winter Conference on Applications of Computer Vision, {WACV} 2021},
  year    =  {2021}
}

References

[1] Kaiming He, Xiangyu Zhang, Shaoqing Ren, Jian Sun: Deep Residual Learning for Image Recognition. CVPR 2016: 770-778

[2] Qilong Wang, Banggu Wu, Pengfei Zhu, Peihua Li, Wangmeng Zuo, Qinghua Hu: ECA-Net: Efficient Channel Attention for Deep Convolutional Neural Networks. CVPR 2020: 11531-11539

[3] Irwan Bello, Barret Zoph, Quoc Le, Ashish Vaswani, Jonathon Shlens: Attention Augmented Convolutional Networks. ICCV 2019: 3285-3294

[4] Jie Hu, Li Shen, Gang Sun: Squeeze-and-Excitation Networks. CVPR 2018: 7132-7141

[5] Jie Hu, Li Shen, Samuel Albanie, Gang Sun, Andrea Vedaldi: Gather-Excite: Exploiting Feature Context in Convolutional Neural Networks. NeurIPS 2018: 9423-9433

[6] Ziteng Gao, Limin Wang, Gangshan Wu: LIP: Local Importance-Based Pooling. ICCV 2019: 3354-3363

[7] Xiang Li, Wenhai Wang, Xiaolin Hu, Jian Yang: Selective Kernel Networks. CVPR 2019: 510-519

[8] Dongyoon Han, Jiwhan Kim, Junmo Kim: Deep Pyramidal Residual Networks. CVPR 2017: 6307-6315

[9] Zhichao Lu, Gautam Sreekumar, Erik D. Goodman, Wolfgang Banzhaf, Kalyanmoy Deb, Vishnu Naresh Boddeti: Neural Architecture Transfer. CoRR abs/2005.05859 (2020)

[10] Ekin D. Cubuk, Barret Zoph, Dandelion Mane, Vijay Vasudevan, Quoc V. Le: AutoAugment: Learning Augmentation Strategies From Data. CVPR 2019: 113-123

open-aff's Issues

ASKCResNetFPN

from __future__ import division
import os
from mxnet.gluon.block import HybridBlock
from mxnet.gluon import nn
from mxnet.gluon.nn import BatchNorm
from gluoncv.model_zoo.fcn import _FCNHead
from mxnet import nd

from .askc import LCNASKCFuse

from model.atac.backbone import ATACBlockV1, conv1ATAC, DynamicCell
from model.atac.convolution import LearnedCell, ChaDyReFCell, SeqDyReFCell, SK_ChaDyReFCell, \
    SK_1x1DepthDyReFCell, SK_MSSpaDyReFCell, SK_SpaDyReFCell, Direct_AddCell, SKCell, \
    SK_SeqDyReFCell, Sub_MSSpaDyReFCell, SK_MSSeqDyReFCell, iAAMSSpaDyReFCell
from model.atac.convolution import \
    LearnedConv, ChaDyReFConv, SeqDyReFConv, SK_ChaDyReFConv, \
    SK_1x1DepthDyReFConv, SK_MSSpaDyReFConv, SK_SpaDyReFConv, Direct_AddConv, SKConv, \
    SK_SeqDyReFConv
    # , SK_MSSeqDyReFConv
from .activation import xUnit, SpaATAC, ChaATAC, SeqATAC, MSSeqATAC, MSSeqATACAdd, \
    MSSeqATACConcat, MSSeqAttentionMap, xUnitAttentionMap
from model.atac.fusion import Direct_AddFuse_Reduce, SK_MSSpaFuse, SKFuse_Reduce, LocalChaFuse, \
    GlobalChaFuse, \
    LocalGlobalChaFuse_Reduce, LocalLocalChaFuse_Reduce, GlobalGlobalChaFuse_Reduce, \
    AYforXplusYChaFuse_Reduce, XplusAYforYChaFuse_Reduce, IASKCChaFuse_Reduce,\
    GAUChaFuse_Reduce, SpaFuse_Reduce, ConcatFuse_Reduce, AXYforXplusYChaFuse_Reduce,\
    BiLocalChaFuse_Reduce, BiGlobalChaFuse_Reduce, LocalGAUChaFuse_Reduce, GlobalSpaFuse,\
    AsymBiLocalChaFuse_Reduce, BiSpaChaFuse_Reduce, AsymBiSpaChaFuse_Reduce, LocalSpaFuse, \
    BiGlobalLocalChaFuse_Reduce

# from gluoncv.model_zoo.resnetv1b import BasicBlockV1b
from gluoncv.model_zoo.cifarresnet import CIFARBasicBlockV1


class ASKCResNetFPN(HybridBlock):
    def __init__(self, layers, channels, fuse_mode, act_dilation, classes=1, tinyFlag=False,
                 norm_layer=BatchNorm, norm_kwargs=None, **kwargs):
        super(ASKCResNetFPN, self).__init__(**kwargs)

        self.layer_num = len(layers)
        self.tinyFlag = tinyFlag
        with self.name_scope():

            stem_width = int(channels[0])
            self.stem = nn.HybridSequential(prefix='stem')
            self.stem.add(norm_layer(scale=False, center=False,
                                     **({} if norm_kwargs is None else norm_kwargs)))
            if tinyFlag:
                self.stem.add(nn.Conv2D(channels=stem_width*2, kernel_size=3, strides=1,
                                         padding=1, use_bias=False))
                self.stem.add(norm_layer(in_channels=stem_width*2))
                self.stem.add(nn.Activation('relu'))
            else:
                self.stem.add(nn.Conv2D(channels=stem_width, kernel_size=3, strides=2,
                                         padding=1, use_bias=False))
                self.stem.add(norm_layer(in_channels=stem_width))
                self.stem.add(nn.Activation('relu'))
                self.stem.add(nn.Conv2D(channels=stem_width, kernel_size=3, strides=1,
                                         padding=1, use_bias=False))
                self.stem.add(norm_layer(in_channels=stem_width))
                self.stem.add(nn.Activation('relu'))
                self.stem.add(nn.Conv2D(channels=stem_width*2, kernel_size=3, strides=1,
                                         padding=1, use_bias=False))
                self.stem.add(norm_layer(in_channels=stem_width*2))
                self.stem.add(nn.Activation('relu'))
                self.stem.add(nn.MaxPool2D(pool_size=3, strides=2, padding=1))

            # self.head1 = _FCNHead(in_channels=channels[1], channels=classes)
            # self.head2 = _FCNHead(in_channels=channels[2], channels=classes)
            # self.head3 = _FCNHead(in_channels=channels[3], channels=classes)
            # self.head4 = _FCNHead(in_channels=channels[4], channels=classes)

            self.head = _FCNHead(in_channels=channels[1], channels=classes)

            self.layer1 = self._make_layer(block=CIFARBasicBlockV1, layers=layers[0],
                                           channels=channels[1], stride=1, stage_index=1,
                                           in_channels=channels[1])

            self.layer2 = self._make_layer(block=CIFARBasicBlockV1, layers=layers[1],
                                           channels=channels[2], stride=2, stage_index=2,
                                           in_channels=channels[1])

            self.layer3 = self._make_layer(block=CIFARBasicBlockV1, layers=layers[2],
                                           channels=channels[3], stride=2, stage_index=3,
                                           in_channels=channels[2])

            if self.layer_num == 4:
                self.layer4 = self._make_layer(block=CIFARBasicBlockV1, layers=layers[3],
                                               channels=channels[4], stride=2, stage_index=4,
                                               in_channels=channels[3])

            if self.layer_num == 4:
                self.fuse34 = self._fuse_layer(fuse_mode, channels=channels[3],
                                               act_dilation=act_dilation)  # channels[4]

            self.fuse23 = self._fuse_layer(fuse_mode, channels=channels[2],
                                           act_dilation=act_dilation)  # 64
            self.fuse12 = self._fuse_layer(fuse_mode, channels=channels[1],
                                           act_dilation=act_dilation)  # 32

            # if fuse_order == 'reverse':
            #     self.fuse12 = self._fuse_layer(fuse_mode, channels=channels[2])  # channels[2]
            #     self.fuse23 = self._fuse_layer(fuse_mode, channels=channels[3])  # channels[3]
            #     self.fuse34 = self._fuse_layer(fuse_mode, channels=channels[4])  # channels[4]
            # elif fuse_order == 'normal':
            #     self.fuse34 = self._fuse_layer(fuse_mode, channels=channels[4])  # channels[4]
            #     self.fuse23 = self._fuse_layer(fuse_mode, channels=channels[4])  # channels[4]
            #     self.fuse12 = self._fuse_layer(fuse_mode, channels=channels[4])  # channels[4]

    def _make_layer(self, block, layers, channels, stride, stage_index, in_channels=0,
                    norm_layer=BatchNorm, norm_kwargs=None):
        layer = nn.HybridSequential(prefix='stage%d_'%stage_index)
        with layer.name_scope():
            downsample = (channels != in_channels) or (stride != 1)
            layer.add(block(channels, stride, downsample, in_channels=in_channels,
                            prefix='', norm_layer=norm_layer, norm_kwargs=norm_kwargs))
            for _ in range(layers-1):
                layer.add(block(channels, 1, False, in_channels=channels, prefix='',
                                norm_layer=norm_layer, norm_kwargs=norm_kwargs))
        return layer

    def _fuse_layer(self, fuse_mode, channels, act_dilation):
        if fuse_mode == 'Direct_Add':
            fuse_layer = Direct_AddFuse_Reduce(channels=channels)
        elif fuse_mode == 'Concat':
            fuse_layer = ConcatFuse_Reduce(channels=channels)
        elif fuse_mode == 'SK':
            fuse_layer = SKFuse_Reduce(channels=channels)
        # elif fuse_mode == 'LocalCha':
        #     fuse_layer = LocalChaFuse(channels=channels)
        # elif fuse_mode == 'GlobalCha':
        #     fuse_layer = GlobalChaFuse(channels=channels)
        elif fuse_mode == 'LocalGlobalCha':
            fuse_layer = LocalGlobalChaFuse_Reduce(channels=channels)
        elif fuse_mode == 'LocalLocalCha':
            fuse_layer = LocalLocalChaFuse_Reduce(channels=channels)
        elif fuse_mode == 'GlobalGlobalCha':
            fuse_layer = GlobalGlobalChaFuse_Reduce(channels=channels)
        elif fuse_mode == 'IASKCChaFuse':
            fuse_layer = IASKCChaFuse_Reduce(channels=channels)
        elif fuse_mode == 'AYforXplusY':
            fuse_layer = AYforXplusYChaFuse_Reduce(channels=channels)
        elif fuse_mode == 'AXYforXplusY':
            fuse_layer = AXYforXplusYChaFuse_Reduce(channels=channels)
        elif fuse_mode == 'XplusAYforY':
            fuse_layer = XplusAYforYChaFuse_Reduce(channels=channels)
        elif fuse_mode == 'GAU':
            fuse_layer = GAUChaFuse_Reduce(channels=channels)
        elif fuse_mode == 'LocalGAU':
            fuse_layer = LocalGAUChaFuse_Reduce(channels=channels)
        elif fuse_mode == 'SpaFuse':
            fuse_layer = SpaFuse_Reduce(channels=channels, act_dialtion=act_dilation)
        elif fuse_mode == 'BiLocalCha':
            fuse_layer = BiLocalChaFuse_Reduce(channels=channels)
        elif fuse_mode == 'BiGlobalLocalCha':
            fuse_layer = BiGlobalLocalChaFuse_Reduce(channels=channels)
        elif fuse_mode == 'AsymBiLocalCha':
            fuse_layer = AsymBiLocalChaFuse_Reduce(channels=channels)
        elif fuse_mode == 'BiGlobalCha':
            fuse_layer = BiGlobalChaFuse_Reduce(channels=channels)
        elif fuse_mode == 'BiSpaCha':
            fuse_layer = BiSpaChaFuse_Reduce(channels=channels)
        elif fuse_mode == 'AsymBiSpaCha':
            fuse_layer = AsymBiSpaChaFuse_Reduce(channels=channels)
        # elif fuse_mode == 'LocalSpa':
        #     fuse_layer = LocalSpaFuse(channels=channels, act_dilation=act_dilation)
        # elif fuse_mode == 'GlobalSpa':
        #     fuse_layer = GlobalSpaFuse(channels=channels, act_dilation=act_dilation)
        # elif fuse_mode == 'SK_MSSpa':
        #     # fuse_layer.add(SK_MSSpaFuse(channels=channels, act_dilation=act_dilation))
        #     fuse_layer = SK_MSSpaFuse(channels=channels, act_dilation=act_dilation)
        else:
            raise ValueError('Unknown fuse_mode')

        return fuse_layer

    def hybrid_forward(self, F, x):

        _, _, hei, wid = x.shape

        x = self.stem(x)      # down 4, 32
        c1 = self.layer1(x)   # down 4, 32
        c2 = self.layer2(c1)  # down 8, 64
        out = self.layer3(c2)  # down 16, 128
        if self.layer_num == 4:
            c4 = self.layer4(out)  # down 32
            if self.tinyFlag:
                c4 = F.contrib.BilinearResize2D(c4, height=hei//4, width=wid//4)  # down 4
            else:
                c4 = F.contrib.BilinearResize2D(c4, height=hei//16, width=wid//16)  # down 16
            out = self.fuse34(c4, out)
        if self.tinyFlag:
            out = F.contrib.BilinearResize2D(out, height=hei//2, width=wid//2)  # down 2, 128
        else:
            out = F.contrib.BilinearResize2D(out, height=hei//8, width=wid//8)  # down 8, 128
        out = self.fuse23(out, c2)
        if self.tinyFlag:
            out = F.contrib.BilinearResize2D(out, height=hei, width=wid)  # down 1
        else:
            out = F.contrib.BilinearResize2D(out, height=hei//4, width=wid//4)  # down 4
        out = self.fuse12(out, c1)

        pred = self.head(out)
        if self.tinyFlag:
            out = pred
        else:
            out = F.contrib.BilinearResize2D(pred, height=hei, width=wid)  # down 1 (full resolution)

        ######### reverse order ##########
        # up_c2 = F.contrib.BilinearResize2D(c2, height=hei//4, width=wid//4)  # down 4
        # fuse2 = self.fuse12(up_c2, c1)  # down 4, channels[2]
        #
        # up_c3 = F.contrib.BilinearResize2D(c3, height=hei//4, width=wid//4)  # down 4
        # fuse3 = self.fuse23(up_c3, fuse2)  # down 4, channels[3]
        #
        # up_c4 = F.contrib.BilinearResize2D(c4, height=hei//4, width=wid//4)  # down 4
        # fuse4 = self.fuse34(up_c4, fuse3)  # down 4, channels[4]
        #

        ######### normal order ##########
        # out = F.contrib.BilinearResize2D(c4, height=hei//16, width=wid//16)
        # out = self.fuse34(out, c3)
        # out = F.contrib.BilinearResize2D(out, height=hei//8, width=wid//8)
        # out = self.fuse23(out, c2)
        # out = F.contrib.BilinearResize2D(out, height=hei//4, width=wid//4)
        # out = self.fuse12(out, c1)
        # out = self.head(out)
        # out = F.contrib.BilinearResize2D(out, height=hei, width=wid)


        return out

    def evaluate(self, x):
        """evaluating network with inputs and targets"""
        return self.forward(x)
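
For readers tracing the spatial sizes in hybrid_forward above, the resize targets can be tabulated with a small helper. This is only a sketch derived from the height/width arguments passed to BilinearResize2D in the code; the function name fpn_resize_schedule and the stage labels are made up for illustration and are not part of the repo.

```python
def fpn_resize_schedule(hei, wid, tiny_flag=False, layer_num=4):
    """List the (label, height, width) targets of each resize, in call order."""
    schedule = []
    if layer_num == 4:
        # c4 is resized to the resolution of the layer3 output before fuse34
        div = 4 if tiny_flag else 16
        schedule.append(("fuse34", hei // div, wid // div))
    # the fused map is resized to the resolution of c2 before fuse23
    div = 2 if tiny_flag else 8
    schedule.append(("fuse23", hei // div, wid // div))
    # ... and to the resolution of c1 before fuse12
    div = 1 if tiny_flag else 4
    schedule.append(("fuse12", hei // div, wid // div))
    if not tiny_flag:
        # only the non-tiny variant upsamples the prediction to the input size
        schedule.append(("output", hei, wid))
    return schedule


for step in fpn_resize_schedule(512, 512):
    print(step)
```

With tiny_flag=True the stem does not downsample, so the last fusion already runs at the input resolution and no final resize is needed.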

Concat and fusion instead of add in iAFF and AFF

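Hi author, have you ever considered using concatenation followed by fusion, e.g. $conv1x1(cat([x1, x2]))$, instead of $+$ in iAFF and AFF?

Hello, why is the output multiplied by 2 in the last step of the AFF module? What is the purpose? I look forward to your reply.

Question

Hello, I would like to add the MS-CAM module to other network models. How well would it work if it were added at the end of the backbone?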

Do you have the code of Pytorch version?

Or will you consider publishing the pytorch version of the code in the future?

Can you share inference time data for different AFF/iAFF models?

Hi Yimian, I came here from your WACV 2021 presentation. This work looks pretty impressive.

As we discussed during your presentation, could you share the inference time data for different configurations (ResNet-50, -101, ResNeXt-50, etc.) as well as the numbers for baseline models and the hardware details? Thank you!

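Can the fusion of three feature maps be implemented?

Hello, after reading your paper I would like to ask: both AFF and iAFF appear to fuse two feature maps. How should they be modified to fuse three feature maps?

About localization and small objects

Hello, could you explain in detail why MS-CAM and AFF affect localization and small objects in detection?

About applying AFF to FPN

Hello, how can AFF or iAFF be applied to FPN? Taking AFF as an example, X and Y are feature maps from different stages, and the paper says Y is the one with high-level semantics. For the ResNet layer downsampled by 32 (Y) and the layer downsampled by 16 (X), how can X + Y be computed? The two feature maps differ first in spatial size and second in the number of channels.

Sorry to bother you. I look forward to your reply, thank you!

Why is 2× used in the AFF module in the code?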

xo = 2 * x * wei + 2 * residual * (1 - wei)
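
One property of this line can be checked directly, without guessing at the author's intent: when the attention weight is 0.5 everywhere, the expression reduces to the plain sum x + residual, so the factor 2 keeps the output on the same scale as an ordinary residual addition. A minimal sketch with scalars standing in for tensors (aff_blend and its scale parameter are hypothetical names used only here):

```python
def aff_blend(x, residual, wei, scale=2.0):
    """Soft selection between two inputs, mirroring the quoted expression."""
    return scale * x * wei + scale * residual * (1.0 - wei)


x, residual = 3.0, 5.0
neutral = aff_blend(x, residual, wei=0.5)             # same as x + residual
unscaled = aff_blend(x, residual, wei=0.5, scale=1.0)  # plain average of the two
print(neutral, unscaled)  # 8.0 4.0
```

Whether this is the reason the factor was introduced is not stated in this thread; the sketch only shows what the factor does numerically.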

When will the website be ready?

As the title. Thanks!

ValueError: Expected more than 1 value per channel when training, got input size torch.Size([1, 256, 1, 1])

I am using your method for feature fusion:
nn.AdaptiveAvgPool2d(1),
nn.Conv2d(channels, inter_channels, kernel_size=1, stride=1, padding=0),
nn.BatchNorm2d(inter_channels),
nn.ReLU(inplace=True),
nn.Conv2d(inter_channels, channels, kernel_size=1, stride=1, padding=0),
nn.BatchNorm2d(channels),
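When the output size of the adaptive pooling in the first line is set to 1, as in the paper, it raises ValueError: Expected more than 1 value per channel when training, got input size torch.Size([1, 256, 1, 1]). Is there a way to solve this?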

Training from scratch - log report

Hi,

First, I think your paper is very interesting, excellent work!

I was wondering whether you have a training-from-scratch report available.
All the available reports are based on pretrained models that already have high accuracy (specifically, I am referring to the CIFAR-100 experiment).

code implementation: multiplication of 2 in the final layer output

Hi @YimianDai , thanks for sharing your work and code.
I just want to quickly check why you multiply by 2 at the end of the module block.
Does it help train the model, or is it a normalization parameter?

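About the BN layer

When I apply the iAFF module, the following conditions hold: (1) the batch size of F is 1; (2) global average pooling reduces the feature map to 1×1. As a result, the BN layer raises ValueError: Expected more than 1 value per channel when training, got input size torch.Size([1, 256, 1, 1]). Is there a way to handle this?

Why does adding iAFF to the first layer of ResNet raise the following error: TypeError: forward() missing 1 required positional argument: 'residual'

Is there a pretrained ResNet-18 model with this attention mechanism?

Hello, is there a pretrained ResNet-18 model for this attention mechanism? Also, can this attention mechanism better capture low-level detail, as traditional image-processing operators do? Finally, can it be used plug-and-play, without increasing the number of network parameters? Thank you, I look forward to your answer.

A few questions

1. Which part of the code applies AFF in FPN as described in your paper? I could not find it.
2. For the Global + Local variant, on which branch did you add the global pooling, or can it be either of the two branches?
I would appreciate your help with these questions, thank you.

Error when running the module

Hello, after adding the module you provided to my network, I get the following error. How can I solve it?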
RuntimeError: Input type (torch.cuda.FloatTensor) and weight type (torch.FloatTensor) should be the same

Excellent work, looking forward to your PyTorch version!

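About the scale of channel attention

First:
Why does multi-scale information also exist across channels?
Second:
For the computed attention map H, what does H<i,j,C> represent? Does it represent the dependency between channels?

Reply:

Thank you for your message.

More precisely, in the paper we argue that channel attention should also have the notion/property of scale. What SENet / SKNet currently use is only the extreme case: channel attention at the largest scale, the global scale. The other branch used in the AFF paper is the opposite case: channel attention at the smallest, most local scale. AFF uses the simplest multi-scale scheme, Local + Global, to aggregate multi-scale information.

The feature map has size C x H x W. Because we use local channel attention, the computed attention map also has size C x H x W, which gives a local / element-wise refinement. In SENet, by contrast, one channel weight is applied to the whole H x W plane, so every element of the H x W feature map receives the same weight. Dependencies between channels are captured as usual: SENet uses fully connected layers, and AFF does the same thing with point-wise convolutions. In the global branch, a point-wise convolution is exactly the same as a fully connected layer.

In short, the paper's hypothesis is that channel attention should also have a scale, and the variable that controls the scale is the pooling size. This is the same idea as classical methods such as SIFT, which build scale spaces by controlling Gaussian filters of different sizes, except that AFF uses average pooling. Once you accept that channel attention should also have a notion of scale, the rest follows.

Finally, one request: it would be best to discuss this together in the repo's Issues https://github.com/YimianDai/open-aff/issues , so that everyone can see it.

Wishing you good health and success in your work.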
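
The Local + Global idea in the reply above can be sketched in a few lines of plain Python. This is an illustration only: the point-wise convolutions and BatchNorm of the actual module are replaced by the identity so that just the two pooling scales remain visible, and the names global_context and ms_attention are hypothetical.

```python
import math


def global_context(feat):
    """Global scale: average each channel over H x W, giving one value per channel."""
    return [sum(sum(row) for row in ch) / (len(ch) * len(ch[0])) for ch in feat]


def ms_attention(feat):
    """Local + global attention weights with the same C x H x W shape as feat."""
    g = global_context(feat)  # C values, shared by every position of a channel
    return [
        # local term (the element itself) plus the broadcast global term
        [[1.0 / (1.0 + math.exp(-(v + g[c]))) for v in row] for row in ch]
        for c, ch in enumerate(feat)
    ]


feat = [
    [[0.0, 2.0], [2.0, 4.0]],      # channel 0, mean 2.0
    [[-1.0, -1.0], [-1.0, -1.0]],  # channel 1, mean -1.0
]
att = ms_attention(feat)
print(len(att), len(att[0]), len(att[0][0]))  # 2 2 2
```

Unlike an SENet-style weight, the values in att differ from position to position within channel 0, which is the element-wise refinement the reply describes.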

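Hello, I would like to ask about the scale of channel attention

In the forward function of the iAFF class, is the line xg2 = self.global_att(xi) a typo?

In the forward function, xg2 = self.global_att(xi). This way self.global_att2 does not seem to be used. Was this written by mistake?

About network visualization

Hello, thank you very much for your work! I have a simple question: how did you produce the network visualizations?
Do you extract the final feature map, draw a color map according to the values, adjust its transparency, and overlay it on the original image?
Or is it a different approach?

Can this feature fusion idea be applied to NLP?

Features in NLP mostly have the same dimensions. Would this feature fusion scheme work better there?

Can it be applied to 3D object detection?

Hello, your AFF feature fusion seems to target only 2D detection. How can it be applied to 3D detection?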

module -> self.global_att2

open-aff/aff_pytorch/aff_net/fusion.py

Line 74 in 0bcfd8a

xg2 = self.global_att(xi)

I found that the module "self.global_att2" is not used in your iAFF.
I wonder whether "xg2 = self.global_att(xi)" should be changed to "xg2 = self.global_att2(xi)".
Thanks for your contributions, Dai
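
To make the proposed change concrete, here is a minimal sketch of a two-pass forward in which the second pass calls its own global branch. The overall structure is an assumption modeled on the AFF forward quoted elsewhere in these issues (without the factor 2, as another issue notes for iAFF); iaff_forward is a hypothetical free function, and the four attention branches are passed in as plain callables rather than being the repo's modules.

```python
import math


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def iaff_forward(x, residual, local_att, global_att, local_att2, global_att2):
    # First pass: attention computed on the plain sum of the two inputs.
    xa = x + residual
    wei = sigmoid(local_att(xa) + global_att(xa))
    xi = x * wei + residual * (1 - wei)
    # Second pass: uses the second local AND the second global branch,
    # i.e. global_att2(xi) instead of global_att(xi).
    wei2 = sigmoid(local_att2(xi) + global_att2(xi))
    return x * wei2 + residual * (1 - wei2)


calls = []


def make_branch(name):
    def branch(v):
        calls.append(name)  # record which branch was used
        return 0.0
    return branch


out = iaff_forward(1.0, 3.0, make_branch("local1"), make_branch("global1"),
                   make_branch("local2"), make_branch("global2"))
print(calls, out)
```

With the original line, "global1" would appear twice in calls and the parameters of the second global branch would never receive gradients.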

AFF-FPN

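Hello, following the description in the paper, building AFF-FPN results in a very large number of output channels. See the figure, taking ResNet-18 as an example.

How can AFF be used directly in PyTorch?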

__init__() takes from 1 to 3 positional arguments but 6 were given

Hi, I was trying to modify the class AFF() code to support a newer version of Keras, but I am struggling with this error.
The modified AFF class
class AFF(tf.keras.layers.Layer):
    '''
    Multi-feature fusion (AFF)
    '''

    def __init__(self, channels=64, r=4):
        super().__init__()
        inter_channels = int(channels // r)

        self.local_att = tf.keras.Sequential(
            Conv2D(filters=64, kernel_size=(3,3), strides=1, padding='same'),
            tf.keras.layers.BatchNormalization(inter_channels),
            tf.keras.layers.ReLU(),
            Conv2D(filters=64, kernel_size=(3,3), strides=1, padding='same'),
            tf.keras.layers.BatchNormalization(channels),
        )

        self.global_att = tf.keras.Sequential(
            tf.keras.layers.AveragePooling2D(1),
            Conv2D(filters=64, kernel_size=(3,3), strides=1, padding='same'),
            tf.keras.layers.BatchNormalization(inter_channels),
            tf.keras.layers.ReLU(),
            Conv2D(filters=64, kernel_size=(3,3), strides=1, padding='same'),
            tf.keras.layers.BatchNormalization(channels),
        )

        self.sigmoid = nn.Sigmoid()

    def forward(self, x, residual):
        xa = x + residual
        xl = self.local_att(xa)
        xg = self.global_att(xa)
        xlg = xl + xg
        wei = self.sigmoid(xlg)

        xo = 2 * x * wei + 2 * residual * (1 - wei)
        return xo

The error

The create model function

def create_model():
    tf.keras.backend.clear_session()

    input = Input(shape=(256,256,3), name="input_layer")
    print("Input =",input.shape)

    conv_block = Convolutional_block()(input)
    print("Conv block =",conv_block.shape)
    ca_block = Channel_attention()(conv_block)
    sa_block = SpatialGate()(conv_block)
    # AFF block instead of concatenate
    ca_block = AFF()(ca_block)

    model = Model(inputs=[input], outputs=[ca_block])
    return model

model = create_model()
model.summary()

Input is an image of size 256,256,3

mxnet parameters to pytorch

Thanks for your work. Is there any way to extract the parameters and save them as a PyTorch .pth file?

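Question about network visualization

Hello, thank you very much for your work! I have a simple question: how did you produce the network visualizations?
Do you extract the final feature map, draw a color map according to the values, adjust its transparency, and overlay it on the original image?
Or is it a different approach? Thank you.

Hello, could you share the weights of iAFF-ResNeXt-50-32x4d trained on ImageNet?

Pretrained ResNet-50 model

Hello! I have read your paper and code and learned a great deal, thank you. The pretrained ResNet-50 model you provide is saved with MXNet's .params suffix. How can it be converted for use in PyTorch? Alternatively, could you provide a PyTorch version of the pretrained model? I look forward to your answer, many thanks!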

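Batch norm problem

Hello, I have recently been reproducing this in PyTorch, but I ran into a problem.
ValueError: Expected more than 1 value per channel when training, got input size torch.Size([1, 64, 1, 1])

The cause is that after GlobalAvgPooling the feature map has shape C×1×1, and BN raises this error on a 1×1 feature map.
You can reproduce my problem with the following code: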
import torch
a = torch.randn(1, 64, 1, 1)
bn = torch.nn.BatchNorm2d(64)
bn(a)
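
Why training-mode batch normalization cannot work on this input can be seen from the statistics alone: with batch size 1 and a 1×1 map, each channel contributes exactly one value, so its batch variance is zero and the normalized output no longer depends on the input. The sketch below is a plain-Python stand-in for the per-channel computation (batch_norm_train is a hypothetical helper, not a PyTorch function, and it omits the learnable scale and shift).

```python
def batch_norm_train(values, eps=1e-5):
    """Normalize the values of one channel with their own batch statistics."""
    n = len(values)
    mean = sum(values) / n
    var = sum((v - mean) ** 2 for v in values) / n  # biased batch variance
    return [(v - mean) / (var + eps) ** 0.5 for v in values]


print(batch_norm_train([0.7]))     # [0.0]
print(batch_norm_train([-12.3]))   # [0.0]
print(batch_norm_train([1.0, 3.0]))
```

PyTorch raises the ValueError instead of silently returning zeros. Common ways around it are to train with a batch size larger than 1, or to call model.eval() for inference so that BatchNorm uses its running statistics rather than batch statistics.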

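Using AFF in FPN or U-Net structures

Hello, in my experiments I wanted to use AFF in the decoder. I found that AFF helps at the deepest scale, but adding it at other scales, or at all scales, makes the results worse. What could be the cause, and which parts need to be adjusted?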

CIFAR-100 dataset in train_cifar.py has only 20 classes.

I am interested in AFF-ResNets, and I evaluated the performance of AFF-ResNets with both this implementation and my own implementation.
Through the experiments, I found a bug.

In train_cifar.py, CIFAR-100 dataset is loaded as follows:

train_data = gluon.data.DataLoader(
   gluon.data.vision.CIFAR100(train=True).transform_first(transform_train),
   batch_size=batch_size, shuffle=True, last_batch='discard', num_workers=num_workers)

However, according to the Gluon API reference, CIFAR-100 with default settings has only 20 classes.

https://mxnet.apache.org/versions/1.7/api/python/docs/api/gluon/data/vision/datasets/index.html#mxnet.gluon.data.vision.datasets.CIFAR100

Option fine_label=True is required to compare the performance with other models on CIFAR-100 classification task.
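
A quick way to catch this kind of mismatch is to count the distinct labels before training. The helper below is a generic sketch, not code from the repo: count_classes is a hypothetical name, and the two lists are synthetic stand-ins for a dataset that yields (data, label) pairs.

```python
def count_classes(samples):
    """Count distinct labels in an iterable of (data, label) pairs."""
    return len({int(label) for _, label in samples})


# Synthetic stand-ins: 20 coarse labels versus 100 fine labels.
coarse = [(None, i % 20) for i in range(200)]
fine = [(None, i % 100) for i in range(200)]
print(count_classes(coarse), count_classes(fine))  # 20 100
```

Applied to the dataset object from train_cifar.py, one would expect 20 with the default settings and 100 once fine_label=True is passed, as described above.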

Typo?

What does z^' in the Eq. (2) mean?

TypeError: forward() missing 1 required positional argument: 'residual'

Hi, thanks for your outstanding work. I am trying to add your AFF module to my model's decoder and unfortunately get this error. I don't know which value I should pass as residual, or how to solve this issue.
Hint....

    self.AFFBlock4 = AFF(512)
    self.AFFBlock3 = AFF(256)
    self.AFFBlock2 = AFF(128)
    self.AFFBlock1 = AFF(64)

add with decoder....

    d4 = self.decoder4(e4) + e3
    d4 = self.AFFBlock4(d4)
    d3 = self.decoder3(d4) + e2
    d3 = self.AFFBlock3(d3)
    d2 = e1 + F.upsample(self.decoder2(d3), (e1.size(2), e1.size(3)), mode='bilinear')
    #d2 = self.decoder2(d3) + e1
    d2 = self.AFFBlock2(d2)
    d1 = self.decoder1(d2) + x
    d1 = self.AFFBlock1(d1)
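
The error message follows from the signature forward(x, residual): the module expects the two feature maps as separate arguments and performs the fusion itself, whereas the snippet above adds them first and passes a single tensor. A minimal sketch with a stand-in class and scalars in place of tensors (AFFStub and its fixed weight are hypothetical; the real module predicts the weight from x + residual):

```python
class AFFStub:
    """Stand-in with the same call signature as AFF.forward(x, residual)."""

    def __call__(self, x, residual):
        wei = 0.5  # placeholder for the learned attention weight
        return 2 * x * wei + 2 * residual * (1 - wei)


aff = AFFStub()
decoder_out, skip = 1.0, 4.0  # stand-ins for decoder4(e4) and e3

try:
    aff(decoder_out + skip)  # one argument, as in the snippet above
except TypeError as err:
    print("TypeError:", err)

fused = aff(decoder_out, skip)  # pass both inputs and let AFF fuse them
print(fused)  # 5.0
```

Under that reading, a call such as self.AFFBlock4(self.decoder4(e4), e3) would replace the manual addition, provided both tensors have the same shape.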

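About 2 * F.broadcast_mul(x, wei) + 2 * F.broadcast_mul(residual, 1-wei) in the AFF code

Why multiply by 2? Is it to keep the values from becoming too small and the gradients from vanishing? And why is the factor 2 not used in iAFF?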