res2net / res2net-pretrainedmodels

(ImageNet pretrained models) The official PyTorch implementation of the TPAMI paper "Res2Net: A New Multi-scale Backbone Architecture"

Home Page: https://mmcheng.net/res2net/

res2net backbone pytorch multi-scale jittor

res2net-pretrainedmodels's Introduction

Res2Net

The official PyTorch implementation of the paper "Res2Net: A New Multi-scale Backbone Architecture"

Our paper is accepted by IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI).

Update

Introduction

We propose a novel building block for CNNs, namely Res2Net, by constructing hierarchical residual-like connections within a single residual block. Res2Net represents multi-scale features at a granular level and increases the range of receptive fields for each network layer. The proposed Res2Net block can be plugged into state-of-the-art backbone CNN models, e.g., ResNet, ResNeXt, BigLittleNet, and DLA. We evaluate the Res2Net block on all these models and demonstrate consistent performance gains over baseline models.

Sample

Res2Net module (figure)

Usage

Requirement

PyTorch>=0.4.1

Examples

git clone https://github.com/gasvn/Res2Net.git

from res2net import res2net50
model = res2net50(pretrained=True)

Input image should be normalized as follows:

normalize = transforms.Normalize(mean=[0.485, 0.456, 0.406],
                                  std=[0.229, 0.224, 0.225])

(By default, the model will be downloaded automatically. If the default download link is not available, please refer to the Download Link listed on Pretrained models.)
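
For convenience, here is a complete usage sketch combining the steps above (the image path is hypothetical; res2net50 and the normalization constants are the ones given in this README):

import torch
from PIL import Image
from torchvision import transforms
from res2net import res2net50

# Standard ImageNet preprocessing with the normalization from this README.
preprocess = transforms.Compose([
    transforms.Resize(256),
    transforms.CenterCrop(224),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],
                         std=[0.229, 0.224, 0.225]),
])

model = res2net50(pretrained=True).eval()        # eval() freezes BatchNorm statistics
img = Image.open('example.jpg').convert('RGB')   # hypothetical image path
x = preprocess(img).unsqueeze(0)                 # add the batch dimension
with torch.no_grad():
    logits = model(x)
print(logits.argmax(dim=1))                      # predicted ImageNet class index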

Pretrained models

model | #Params | MACCs (G) | top-1 err. (%) | top-5 err. (%) | Link
Res2Net-50-48w-2s | 25.29M | 4.2 | 22.68 | 6.47 | OneDrive
Res2Net-50-26w-4s | 25.70M | 4.2 | 22.01 | 6.15 | OneDrive
Res2Net-50-14w-8s | 25.06M | 4.2 | 21.86 | 6.14 | OneDrive
Res2Net-50-26w-6s | 37.05M | 6.3 | 21.42 | 5.87 | OneDrive
Res2Net-50-26w-8s | 48.40M | 8.3 | 20.80 | 5.63 | OneDrive
Res2Net-101-26w-4s | 45.21M | 8.1 | 20.81 | 5.57 | OneDrive
Res2NeXt-50 | 24.67M | 4.2 | 21.76 | 6.09 | OneDrive
Res2Net-DLA-60 | 21.15M | 4.2 | 21.53 | 5.80 | OneDrive
Res2NeXt-DLA-60 | 17.33M | 3.6 | 21.55 | 5.86 | OneDrive
Res2Net-v1b-50 | 25.72M | 4.5 | 19.73 | 4.96 | Link
Res2Net-v1b-101 | 45.23M | 8.3 | 18.77 | 4.64 | Link
Res2Net-v1d-200-SSLD | 76.21M | 15.7 | 14.87 | 2.58 | PaddlePaddle Link

News

  • Res2Net_v1b is now available.
  • You can load the pretrained model by using pretrained = True.

The download link from Baidu Disk is now available. (Baidu Disk password: vbix)

Applications

Other applications such as classification, instance segmentation, object detection, semantic segmentation, salient object detection, class activation mapping, and tumor segmentation on CT scans can be found at https://mmcheng.net/res2net/ .

Citation

If you find this work or code helpful in your research, please cite:

@article{gao2019res2net,
  title={Res2Net: A New Multi-scale Backbone Architecture},
  author={Gao, Shang-Hua and Cheng, Ming-Ming and Zhao, Kai and Zhang, Xin-Yu and Yang, Ming-Hsuan and Torr, Philip},
  journal={IEEE TPAMI},
  year={2021},
  doi={10.1109/TPAMI.2019.2938758}, 
}

Contact

If you have any questions, feel free to e-mail me at shgao(at)live.com

License

The code is released under the Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International Public License for non-commercial use only. Any commercial use requires formal permission first.

res2net-pretrainedmodels's People

Contributors

gasvn, mgrankin, mingmingcheng


res2net-pretrainedmodels's Issues

stype=stage with AveragePooling - mentioned in paper?

According to the code, there is a parameter stype. If stype=='stage', there will be no additive connections. Moreover, an average pooling layer is applied to the last "slice", see

out = torch.cat((out, self.pool(spx[self.nums])),1)

In #11 you mention that stype=='stage' refers to the three downsampling blocks in the ResNet architecture.

Is this parameter and the removal of the additive connections mentioned anywhere in the paper? Has it always been there or does it represent a new variation?
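
For context, here is a minimal sketch of the Bottle2neck forward logic around stype (paraphrased from the repository's res2net.py; variable names follow that file, but details may differ from the released code):

spx = torch.split(x, self.width, 1)              # `scale` splits of `width` channels each
for i in range(self.nums):                       # self.nums = scale - 1 convolutions
    if i == 0 or self.stype == 'stage':
        sp = spx[i]                              # 'stage' blocks: no additive connection
    else:
        sp = sp + spx[i]                         # 'normal' blocks: hierarchical addition
    sp = self.relu(self.bns[i](self.convs[i](sp)))
    out = sp if i == 0 else torch.cat((out, sp), 1)
if self.scale != 1 and self.stype == 'normal':
    out = torch.cat((out, spx[self.nums]), 1)               # last split reused as-is
elif self.scale != 1 and self.stype == 'stage':
    out = torch.cat((out, self.pool(spx[self.nums])), 1)    # last split average-pooled

The average pooling in the 'stage' branch keeps the spatial size of the last split consistent with the convolved splits, whose convolutions use stride 2 in downsampling blocks.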

1. res2net152_v1b_26w_4s has no pre-trained model 2.

Thanks for your great job and contribution!

  1. I noticed that in res2net_v1.py, the res2net152_v1b_26w_4s function does not have a pre-trained model, like this:

--> 212 model.load_state_dict(model_zoo.load_url(model_urls['res2net152_v1b_26w_4s']))
213 return model
214

KeyError: 'res2net152_v1b_26w_4s'

  2. In README.md it says the input image should be normalized, but res2net_v1.py does not normalize the input image, as follows:

normalize = transforms.Normalize(mean=[0.485, 0.456, 0.406],
std=[0.229, 0.224, 0.225])

images = torch.rand(1, 3, 224, 224).cuda(0)
model = res2net50_v1b_26w_4s(pretrained=True)
model = model.cuda(0)
print(model(images).size())

Thanks for your help

Input dimension doesn't match

Hi

I got stuck here when using the code from
https://github.com/Res2Net/Res2Net-ImageNet-Training
I got error as:

`
Epoch: [0][210/225] Time 0.167 (0.280) Data 0.037 (0.042) Loss 2.6880 (4.3539) Prec@1 17.969 (21.686) Prec@5 85.938 (83.897)
Epoch: [0][220/225] Time 0.170 (0.275) Data 0.030 (0.042) Loss 2.6971 (4.2777) Prec@1 28.906 (21.826) Prec@5 88.281 (83.993)
Traceback (most recent call last):
File "D:/Github_code/Res2Net_ImageNet/res2net_pami/main.py", line 378, in
main()
File "D:/Github_code/Res2Net_ImageNet/res2net_pami/main.py", line 215, in main
prec1, prec5 = validate(PublicTestloader, model.cuda(), criterion, epoch)
File "D:/Github_code/Res2Net_ImageNet/res2net_pami/main.py", line 301, in validate
output = model(input)
File "C:\Users\zhy34\Anaconda3\lib\site-packages\torch\nn\modules\module.py", line 550, in call
result = self.forward(*input, **kwargs)
File "D:\Github_code\Res2Net_ImageNet\res2net_pami\res2net.py", line 143, in forward
x = self.conv1(x)
File "C:\Users\zhy34\Anaconda3\lib\site-packages\torch\nn\modules\module.py", line 550, in call
result = self.forward(*input, **kwargs)
File "C:\Users\zhy34\Anaconda3\lib\site-packages\torch\nn\modules\conv.py", line 353, in forward
return self._conv_forward(input, self.weight)
File "C:\Users\zhy34\Anaconda3\lib\site-packages\torch\nn\modules\conv.py", line 350, in _conv_forward
self.padding, self.dilation, self.groups)
RuntimeError: Expected 4-dimensional input for 4-dimensional weight [64, 3, 7, 7], but got 5-dimensional input of size [128, 10, 3, 44, 44] instead

`
The dataset I'm using is FER2013; the biggest change I have made is to "train_loader" and "val_loader".
Can author or any friends help me? Any comments are appreciated.
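
For reference, the error says the conv stem received a 5-D batch of size [128, 10, 3, 44, 44], i.e. the validation loader is yielding an extra crop dimension (e.g. ten-crop). A minimal sketch of flattening it before the forward pass (a hedged suggestion based only on the traceback above; variable names are assumptions):

# Hedged sketch: collapse the crop dimension so the network sees 4-D input.
b, n_crops, c, h, w = input.size()               # [128, 10, 3, 44, 44] per the traceback
input = input.view(-1, c, h, w)                  # -> [128 * 10, 3, 44, 44]
output = model(input)
output = output.view(b, n_crops, -1).mean(1)     # average the predictions over crops

Note also that 44x44 is much smaller than the 224x224 input the ImageNet-pretrained stem expects, so the transforms may need a Resize as well.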

difference between res2net and res2net_v1b

Thanks for your amazing work.
Compared with the original version of Res2Net, res2net_v1b makes two changes:
1. it replaces the 7x7 stem conv with this
2. it replaces the conv(stride=2) in the downsample path with AvgPool2d, see this

After these changes, Res2Net_v1b gets more than a 2% top-1 accuracy improvement on ImageNet compared with the original Res2Net.
Am I right? Thanks for your reply.
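
For context, a minimal sketch of the two v1b-style changes described above (an illustrative reconstruction, not the repository's exact code):

import torch.nn as nn

# 1) Deep stem: the single 7x7 stride-2 conv is replaced by three stacked 3x3 convs.
deep_stem = nn.Sequential(
    nn.Conv2d(3, 32, kernel_size=3, stride=2, padding=1, bias=False),
    nn.BatchNorm2d(32), nn.ReLU(inplace=True),
    nn.Conv2d(32, 32, kernel_size=3, stride=1, padding=1, bias=False),
    nn.BatchNorm2d(32), nn.ReLU(inplace=True),
    nn.Conv2d(32, 64, kernel_size=3, stride=1, padding=1, bias=False),
)

# 2) Downsample path: a stride-2 AvgPool2d followed by a 1x1 conv replaces the
#    single 1x1 conv with stride 2, so the shortcut no longer skips feature values.
downsample = nn.Sequential(
    nn.AvgPool2d(kernel_size=2, stride=2, ceil_mode=True, count_include_pad=False),
    nn.Conv2d(64, 256, kernel_size=1, stride=1, bias=False),
    nn.BatchNorm2d(256),
)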

KeyError: u'content-length'

model.load_state_dict(model_zoo.load_url(model_urls['res2net50_26w_4s']))
File "/usr/local/lib/python2.7/dist-packages/torch/utils/model_zoo.py", line 66, in load_url
_download_url_to_file(url, cached_file, hash_prefix, progress=progress)
File "/usr/local/lib/python2.7/dist-packages/torch/utils/model_zoo.py", line 73, in _download_url_to_file
file_size = int(u.headers["Content-Length"])
File "/usr/local/lib/python2.7/dist-packages/requests/structures.py", line 52, in getitem
return self._store[key.lower()][1]
KeyError: u'content-length'

I got the KeyError; is there something wrong?
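
If the server does not send a Content-Length header, a hedged workaround is to download the checkpoint manually (e.g. from the OneDrive link) and load it from disk instead of going through model_zoo.load_url:

import torch
from res2net import res2net50_26w_4s   # assumes the repository's res2net.py is on the path

model = res2net50_26w_4s(pretrained=False)
state_dict = torch.load('res2net50_26w_4s-06e79181.pth', map_location='cpu')  # local file
model.load_state_dict(state_dict)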

Accuracy issue

After 410,000 training iterations, I evaluated the network with test_net.py:
MODEL:
META_ARCHITECTURE: "GeneralizedRCNN"
WEIGHT: "/root/data/Res2Net-maskrcnn/output/model_0100000.pth"
BACKBONE:
CONV_BODY: "R2-50-FPN"

WEIGHT: "/root/data/Res2Net-maskrcnn/output/model_0410000.pth"的评价结果一模一样,这是不合理的

load pretrain model issue

RuntimeError: Error(s) in loading state_dict for Res2Net:
size mismatch for layer1.0.downsample.1.weight: copying a param of torch.Size([256]) from checkpoint, where the shape is torch.Size([256, 64, 1, 1]) in current model.
size mismatch for layer2.0.downsample.1.weight: copying a param of torch.Size([512]) from checkpoint, where the shape is torch.Size([512, 256, 1, 1]) in current model.
size mismatch for layer3.0.downsample.1.weight: copying a param of torch.Size([1024]) from checkpoint, where the shape is torch.Size([1024, 512, 1, 1]) in current model.

When I load res2net50_v1b.pth I run into this problem, but I can load the res2net50_26w_4s.pth pretrained model to train my network just fine.
I guess it might be a problem with the PyTorch version?

pytorch: 0.4.1;
torchvision: 0.2.1
python: 3.6
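
The shape mismatch above suggests the checkpoint and the model definition use different downsample layouts: in one, downsample.1 is a BatchNorm (weight of shape [256]); in the other it is the 1x1 conv ([256, 64, 1, 1]) because an AvgPool2d sits at index 0. A minimal sketch of loading the v1b weights with a matching v1b definition (a hedged suggestion; the module and function names are assumptions based on the repository layout):

import torch
from res2net_v1b import res2net50_v1b_26w_4s   # assumption: the v1b model file

model = res2net50_v1b_26w_4s(pretrained=False)
state_dict = torch.load('res2net50_v1b.pth', map_location='cpu')
model.load_state_dict(state_dict)              # should match key-for-key if the definitions agree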

Link 504 Gateway Time-out

Hi, when I try to download the pretrained models I get a 504 Gateway Time-out. Also, the OneDrive link can't be opened.

Thank you.

About the pretrained model

Hello, I used the Res2Net you provide for my own classification task, but I observed a strange phenomenon with the pretrained model parameters you provide: at the beginning of training the loss is large and the classification accuracy is very low, about 2%. I have only seen this before on networks that do not use ImageNet pretraining. I also tried pretrained = False, and the result is the same as with pretrained = True. My classification run has not finished yet, but judging from the current trend of the accuracy curve the potential is limited. (I will reply with the final result when the run is finished.) So I'd like to ask: does the same phenomenon occur in your own classification tasks?
Thank you

[state_dict] HTTP Error 403: Forbidden

Hey there,

Thanks a lot for your implementation!
I tweaked it a bit for my personal need but it definitely helped me a lot.

Initially, when I checked, none of the links provided for res2net or res2next were working. So I used links to OneDrive you provided in the Readme.

Unfortunately, OneDrive seems to regenerate download links regularly, meaning that I cannot use any static link for wget-like instructions. So I checked again and the links in your python files were working again.

Today I checked:

wget http://mc.nankai.edu.cn/projects/res2net/pretrainmodels/res2net50_26w_4s-06e79181.pth

and I received again the following error:

--2019-09-29 12:50:09--  http://mc.nankai.edu.cn/projects/res2net/pretrainmodels/res2net50_26w_4s-06e79181.pth
Resolving mc.nankai.edu.cn (mc.nankai.edu.cn)... 222.30.45.190
Connecting to mc.nankai.edu.cn (mc.nankai.edu.cn)|222.30.45.190|:80... connected.
HTTP request sent, awaiting response... 403 Forbidden
2019-09-29 12:50:11 ERROR 403: Forbidden.

Am I doing anything wrong?
Could you let us know about the best way / fixed URLs to get the weights please?

Cheers

The structure does not match the paper

Hi, your code uses the stype='stage' connection for the entire network structure. This kind of connection does not reflect the progressive, hierarchical connections described in the paper at all; it looks much more like a group convolution, except that one split goes through a pooling layer instead of a conv layer. I wonder what the reasoning behind this was.

v1d

In the original mmdetection repository I am reading something about res2net_v1d (e.g. here). I cannot find any information about a v1d in your repo. Is this just a typo or is it a different version from this repo?

Runtime Performance

Hello:

May I ask how you measured the runtime (Table 3) of your model, and on which platform?

Thank you very much!

Code question

Regarding self.nums = scale - 1: the paper splits the features into 4 groups, doesn't it?
Why does the code only use three groups of convolutions?
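
As a short illustration of this point (hedged; the numbers assume the default 26w-4s setting): with scale = 4 the features are indeed split into 4 groups, but only scale - 1 = 3 of them are given a 3x3 convolution, and the remaining split is concatenated back unchanged (or average-pooled in 'stage' blocks), which is the parameter saving described in the paper.

scale = 4        # number of feature splits
nums = scale - 1 # = 3: number of 3x3 convs actually created in the block
# splits 0..2 -> conv3x3 (with hierarchical addition in 'normal' blocks)
# split  3    -> concatenated directly, or avg-pooled when stype == 'stage'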

convert to tensorRT model

Assertion failed: *tensor = importer_ctx->network()->addInput( input.name().c_str(), trt_dtype, trt_dims)

Res2Net-v1b-50

What improvements does Res2Net-v1b-50 make, and why is its performance so much better than before?

load model issue

When I load the model in distributed mode, I run into the following issue: the first node/GPU ends up consuming the memory of all the processes. But when I load the ResNet models in the same way, the issue disappears. Could you tell me the reason? Thanks a lot.
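
A common cause (a hedged guess, not confirmed by the author) is that every distributed worker deserializes a checkpoint whose tensors were saved on cuda:0, so all copies end up on the first GPU. Mapping the weights to CPU while loading usually avoids this:

# Hedged sketch: keep checkpoint tensors on CPU while loading, then move the
# model to this worker's own device.
import torch
from res2net import res2net50   # assumption: the repository's constructor

local_rank = 0   # assumption: this worker's GPU index (e.g. from args.local_rank)
model = res2net50(pretrained=False)
state_dict = torch.load('res2net50_26w_4s-06e79181.pth', map_location='cpu')
model.load_state_dict(state_dict)
model = model.to(torch.device('cuda', local_rank))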

Input image should be normalized as follows:

Input image should be normalized as follows:

normalize = transforms.Normalize(mean=[0.485, 0.456, 0.406],
std=[0.229, 0.224, 0.225])

Can you tell me how to do this in detail and give an example?
Thanks

Loading State Dict

Hello, I am trying to load in the state dict provided in the OneDrive link, but ran into issues due to there being differences between the expected state_dict and the given one. Specifically, the res2net101 checkpoint is failing for me.

training scripts

Hi,

Thanks for your work. Can you share the training scripts to reproduce those numbers (top-1 and top-5 accuracy)?

Thanks a lot

Top-1 and Top-5 Accuracy

Hi,
Thanks for releasing pretrained models. I am trying to reproduce the result of Res2Net (top-1 and top-5 accuracy). I am using the official PyTorch training code.

  1. Is the reported result evaluated on the validation set? or test set?
  2. If the result is evaluated on the test set, where could I submit my results?
    Thank you!

dimension not match error in res2net_dla60

File "C:/Users/Administrator/PycharmProjects/PSENet/models/res2net_dla.py", line 261, in forward
    x1 = self.tree1(x, residual)
  File "D:\anaconda3\lib\site-packages\torch\nn\modules\module.py", line 547, in __call__
    result = self.forward(*input, **kwargs)
  File "C:/Users/Administrator/PycharmProjects/PSENet/models/res2net_dla.py", line 261, in forward
    x1 = self.tree1(x, residual)
  File "D:\anaconda3\lib\site-packages\torch\nn\modules\module.py", line 547, in __call__
    result = self.forward(*input, **kwargs)
  File "C:/Users/Administrator/PycharmProjects/PSENet/models/res2net_dla.py", line 103, in forward
    out += residual
RuntimeError: The size of tensor a (38) must match the size of tensor b (37) at non-singleton dimension 3

The dimension is changed in line 89; is there anything wrong with self.convs?

Porting res2net50 into Faster R-CNN gives poor detection results

Hi, thanks a lot for your code. I replaced the feature extraction network of pytorch-faster-rcnn with res2net50_26w_4s, trained and tested on VOC2007, and the final test result is mAP = 71.05%, not the mAP = 74.4% reported in your paper. What could be the reason? I set block1 of res2net50_26w_4s to not participate in training, and set stype = 'normal' in block4 so that stride = 1 inside block4. Is there anything else I need to pay attention to? The Faster R-CNN source code I use is pytorch-faster-rcnn. Looking forward to your reply.

I cannot achieve the result reported in paper with Res2NeXt-29

I tried to implement Res2NeXt-29, 8c×25w×4s in PyTorch but could only get a classification accuracy of 82.32% on CIFAR-100 instead of the 83.07% reported in the paper. Could this gap be caused by the random initialization of the network parameters?

The detail of the architecture is below. Is it correct?

CifarRes2NeXt(
(conv_1_3x3): Conv2d(3, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(bn_1): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(stage_1): Sequential(
(stage_1_bottleneck_0): ResNeXtBottleneck(
(conv_reduce): Conv2d(64, 800, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn_reduce): BatchNorm2d(800, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(convs): ModuleList(
(0): Conv2d(200, 200, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=8, bias=False)
(1): Conv2d(200, 200, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=8, bias=False)
(2): Conv2d(200, 200, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=8, bias=False)
)
(bns): ModuleList(
(0): BatchNorm2d(200, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(1): BatchNorm2d(200, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(2): BatchNorm2d(200, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
)
(conv_expand): Conv2d(800, 64, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn_expand): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(shortcut): Sequential()
)
(stage_1_bottleneck_1): ResNeXtBottleneck(
(conv_reduce): Conv2d(64, 800, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn_reduce): BatchNorm2d(800, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(convs): ModuleList(
(0): Conv2d(200, 200, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=8, bias=False)
(1): Conv2d(200, 200, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=8, bias=False)
(2): Conv2d(200, 200, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=8, bias=False)
)
(bns): ModuleList(
(0): BatchNorm2d(200, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(1): BatchNorm2d(200, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(2): BatchNorm2d(200, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
)
(conv_expand): Conv2d(800, 64, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn_expand): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(shortcut): Sequential()
)
(stage_1_bottleneck_2): ResNeXtBottleneck(
(conv_reduce): Conv2d(64, 800, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn_reduce): BatchNorm2d(800, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(convs): ModuleList(
(0): Conv2d(200, 200, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=8, bias=False)
(1): Conv2d(200, 200, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=8, bias=False)
(2): Conv2d(200, 200, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=8, bias=False)
)
(bns): ModuleList(
(0): BatchNorm2d(200, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(1): BatchNorm2d(200, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(2): BatchNorm2d(200, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
)
(conv_expand): Conv2d(800, 64, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn_expand): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(shortcut): Sequential()
)
)
(stage_2): Sequential(
(stage_2_bottleneck_0): ResNeXtBottleneck(
(conv_reduce): Conv2d(64, 1600, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn_reduce): BatchNorm2d(1600, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(pool): AvgPool2d(kernel_size=3, stride=2, padding=1)
(convs): ModuleList(
(0): Conv2d(400, 400, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), groups=8, bias=False)
(1): Conv2d(400, 400, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), groups=8, bias=False)
(2): Conv2d(400, 400, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), groups=8, bias=False)
)
(bns): ModuleList(
(0): BatchNorm2d(400, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(1): BatchNorm2d(400, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(2): BatchNorm2d(400, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
)
(conv_expand): Conv2d(1600, 128, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn_expand): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(shortcut): Sequential(
(shortcut_conv): Conv2d(64, 128, kernel_size=(1, 1), stride=(2, 2), bias=False)
(shortcut_bn): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
)
)
(stage_2_bottleneck_1): ResNeXtBottleneck(
(conv_reduce): Conv2d(128, 1600, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn_reduce): BatchNorm2d(1600, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(convs): ModuleList(
(0): Conv2d(400, 400, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=8, bias=False)
(1): Conv2d(400, 400, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=8, bias=False)
(2): Conv2d(400, 400, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=8, bias=False)
)
(bns): ModuleList(
(0): BatchNorm2d(400, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(1): BatchNorm2d(400, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(2): BatchNorm2d(400, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
)
(conv_expand): Conv2d(1600, 128, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn_expand): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(shortcut): Sequential()
)
(stage_2_bottleneck_2): ResNeXtBottleneck(
(conv_reduce): Conv2d(128, 1600, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn_reduce): BatchNorm2d(1600, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(convs): ModuleList(
(0): Conv2d(400, 400, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=8, bias=False)
(1): Conv2d(400, 400, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=8, bias=False)
(2): Conv2d(400, 400, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=8, bias=False)
)
(bns): ModuleList(
(0): BatchNorm2d(400, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(1): BatchNorm2d(400, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(2): BatchNorm2d(400, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
)
(conv_expand): Conv2d(1600, 128, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn_expand): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(shortcut): Sequential()
)
)
(stage_3): Sequential(
(stage_3_bottleneck_0): ResNeXtBottleneck(
(conv_reduce): Conv2d(128, 3200, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn_reduce): BatchNorm2d(3200, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(pool): AvgPool2d(kernel_size=3, stride=2, padding=1)
(convs): ModuleList(
(0): Conv2d(800, 800, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), groups=8, bias=False)
(1): Conv2d(800, 800, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), groups=8, bias=False)
(2): Conv2d(800, 800, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), groups=8, bias=False)
)
(bns): ModuleList(
(0): BatchNorm2d(800, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(1): BatchNorm2d(800, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(2): BatchNorm2d(800, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
)
(conv_expand): Conv2d(3200, 256, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn_expand): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(shortcut): Sequential(
(shortcut_conv): Conv2d(128, 256, kernel_size=(1, 1), stride=(2, 2), bias=False)
(shortcut_bn): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
)
)
(stage_3_bottleneck_1): ResNeXtBottleneck(
(conv_reduce): Conv2d(256, 3200, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn_reduce): BatchNorm2d(3200, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(convs): ModuleList(
(0): Conv2d(800, 800, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=8, bias=False)
(1): Conv2d(800, 800, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=8, bias=False)
(2): Conv2d(800, 800, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=8, bias=False)
)
(bns): ModuleList(
(0): BatchNorm2d(800, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(1): BatchNorm2d(800, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(2): BatchNorm2d(800, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
)
(conv_expand): Conv2d(3200, 256, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn_expand): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(shortcut): Sequential()
)
(stage_3_bottleneck_2): ResNeXtBottleneck(
(conv_reduce): Conv2d(256, 3200, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn_reduce): BatchNorm2d(3200, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(convs): ModuleList(
(0): Conv2d(800, 800, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=8, bias=False)
(1): Conv2d(800, 800, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=8, bias=False)
(2): Conv2d(800, 800, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=8, bias=False)
)
(bns): ModuleList(
(0): BatchNorm2d(800, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(1): BatchNorm2d(800, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(2): BatchNorm2d(800, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
)
(conv_expand): Conv2d(3200, 256, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn_expand): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(shortcut): Sequential()
)
)
(classifier): Linear(in_features=256, out_features=100, bias=True)
)

In mmdetection, applying DCNv2 in the Res2Net code causes errors, while DCN does not?

File "/data/xmy/mmdet_new/mmdetection/mmdet/models/backbones/res2net.py", line 211, in forward
out = _inner_forward(x)
File "/data/xmy/mmdet_new/mmdetection/mmdet/models/backbones/res2net.py", line 181, in _inner_forward
main()
sp = self.convs[i](sp)
File "/home/l547/anaconda3/envs/MMDET_xmy1/lib/python3.7/site-packages/torch/nn/modules/module.py", line 493, in call

File "./tools/train.py", line 138, in main
meta=meta)
File "/data/xmy/mmdet_new/mmdetection/mmdet/apis/train.py", line 102, in train_detector
result = self.forward(*input, **kwargs)
File "/data/xmy/mmdet_new/mmdetection/mmdet/ops/dcn/deform_conv.py", line 403, in forward
meta=meta)
File "/data/xmy/mmdet_new/mmdetection/mmdet/apis/train.py", line 177, in _dist_train
self.groups, self.deformable_groups)
File "/data/xmy/mmdet_new/mmdetection/mmdet/ops/dcn/deform_conv.py", line 148, in forward
ctx.groups, ctx.deformable_groups, ctx.with_bias)

runner.run(data_loaders, cfg.workflow, cfg.total_epochs)
RuntimeError:

input tensor has to be contiguous (modulated_deform_conv_cuda_forward at mmdet/ops/dcn/src/deform_conv_cuda.cpp:497)
frame #0: c10::Error::Error(c10::SourceLocation, std::__cxx11::basic_string<char, std::char_traits, std::allocator > const&) + 0x6d (0x7ff852af333d in /home/l547/anaconda3/envs/MMDET_xmy1/lib/python3.7/site-packages/torch/lib/libc10.so)
frame #1: modulated_deform_conv_cuda_forward(at::Tensor, at::Tensor, at::Tensor, at::Tensor, at::Tensor, at::Tensor, at::Tensor, at::Tensor, int, int, int, int, int, int, int, int, int, int, bool) + 0x1285 (0x7ff81a84ceb5 in /data/xmy/mmdet_new/mmdetection/mmdet/ops/dcn/deform_conv_cuda.cpython-37m-x86_64-linux-gnu.so)
frame #2: + 0x26fed (0x7ff81a85cfed in /data/xmy/mmdet_new/mmdetection/mmdet/ops/dcn/deform_conv_cuda.cpython-37m-x86_64-linux-gnu.so)
frame #3: + 0x2722e (0x7ff81a85d22e in /data/xmy/mmdet_new/mmdetection/mmdet/ops/dcn/deform_conv_cuda.cpython-37m-x86_64-linux-gnu.so)
frame #4: + 0x22f9c (0x7ff81a858f9c in /data/xmy/mmdet_new/mmdetection/mmdet/ops/dcn/deform_conv_cuda.cpython-37m-x86_64-linux-gnu.so)
.......
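
The key message is "input tensor has to be contiguous": the per-split tensors inside the Res2Net bottleneck come from torch.split and are views, and the modulated deformable conv kernel rejects non-contiguous input. A hedged sketch of the usual workaround, making the split contiguous right before the deformable conv call (the exact file and line in mmdet's res2net.py are not confirmed):

# Inside the bottleneck's _inner_forward, around the per-split 3x3 conv:
sp = sp.contiguous()      # torch.split returns views; DCNv2 requires contiguous memory
sp = self.convs[i](sp)    # self.convs[i] is the (deformable) conv for split i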

Confused on tensor splitting

In this line (here) you split the input tensor by width, but I expected it to be split by scale. Am I missing something? Also, you process the 1st split/group with conv and bn, while your paper says "To reduce the number of parameters, we omit the convolution for the first split, which can also be regarded as a form of feature reuse.". Could you enlighten me?

EDIT: for the first question, I get it now (I was confused about "the number of splits" vs "the size of each split" in torch.split)
EDIT2: got it, turns out the conv is not applied to the last split

Thanks!
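
To make the torch.split point above concrete (a small self-contained example; the channel numbers assume the default 26w-4s setting):

import torch

width, scale = 26, 4
x = torch.randn(2, width * scale, 56, 56)   # bottleneck feature map with 104 channels
spx = torch.split(x, width, dim=1)          # the second argument is the SIZE of each
                                            # split (26 channels), not the number of splits
print(len(spx), spx[0].shape)               # 4 splits, each of shape (2, 26, 56, 56)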

load model issue

Thank you for your excellent work.
Just like issue 5, I ran into the same problem and added "strict = False" to the solution you gave, but the problem still hasn't been solved and the same error still occurs. I tried to load res2net-50-26w-4s.
Look forward to your reply.
cuda:8.0
python:3.6
pytorch:0.4.0
torchvision:0.1.18

License

Would you consider specifying a license for the repo?

Why basewidth is divided by constant value of 64?

It's regarding the following implementation in Bottle2neck:

width = int(math.floor(planes * (baseWidth/64.0)))
In the paper you mention n = w*s but say nothing about baseWidth?
The other approach is implemented here, where the plane size is always a multiple of w and s. But that's not the official version. I am wondering which one is correct or gives better results?
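
A worked example of that scaling (a hedged interpretation: 64 is the first-stage width of a standard ResNet bottleneck, so baseWidth/64 rescales the per-split width relative to that baseline):

import math

# Worked numbers for the first stage of Res2Net-50-26w-4s:
planes, baseWidth, scale = 64, 26, 4
width = int(math.floor(planes * (baseWidth / 64.0)))   # = 26 channels per split
total = width * scale                                   # = 104 channels entering the 3x3 stage
# At later stages planes doubles (128, 256, 512), so width becomes 52, 104, 208
# while the number of splits stays fixed at `scale`.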

About ImageNet data classification test

@gasvn
Hello, I used the model you provide, and a version of it converted to Gluon, to test the classification results on ImageNet. The provided PyTorch model gives a very low accuracy of about 20%, but the converted Gluon model reaches about 71% (the data pipelines of the two are consistent). So I would like to ask: does anything special need to be done to the data before it is fed in?
The following is my data processing for the two. Could you please help me see what the problem is? Thank you very much.
PyTorch data processing

   class TestDataset(torch.utils.data.Dataset):
    def __init__(self, file, root, **kwargs):
        # mean and std
        self._root = os.path.expanduser(root)
        self.labelfile = file
        self._exts = ['.jpg', '.jpeg', '.png']
        self.normalize = transforms.Normalize(
            mean=[0.485, 0.456, 0.406],
            std=[0.229, 0.224, 0.225])
        self.size = (224,224)
        self.parse_input_list(self._root,self.labelfile)
        self.num_sample = len(self.list_sample)
        self.tranform = transforms.Compose([
            transforms.Resize(256),
            transforms.CenterCrop(224),
            transforms.ToTensor(),
            self.normalize,
        ])

    def parse_input_list(self, path,labelfile):
        self.list_sample = []
        labellist = {}
        with open(labelfile, "r") as f:  # open the label file
            data = f.read().split('\n')[:-1]
            for i in range(len(data)): labellist[data[i].split(' ')[0]] = data[i].split(' ')[1]
            for filename in sorted(os.listdir(path)):
                if filename in labellist.keys():
                    label = labellist[filename]
                filename = os.path.join(path, filename)
                ext = os.path.splitext(filename)[1]
                if ext.lower() not in self._exts: print(
                    'Ignoring %s of type %s. Only support %s' % (filename, ext, ', '.join(self._exts)))
                self.list_sample.append((filename, label))
    def img_transform(self,img):
        img = self.tranform(img)
        return img.cuda(0)

    def __getitem__(self, index):
        from PIL import Image
        image_path,label = self.list_sample[index]
        img = Image.open(image_path).convert('RGB')
        if self.img_transform:
            img = self.img_transform(img)
        return img,label

    def __len__(self):
        return self.num_sample

Gluon data processing

class TestDataset(gdata.Dataset):
   def __init__(self, root, labelfile, flag=1, transform=None):
       self._root = os.path.expanduser(root)
       self._flag = flag
       self._transform = transform
       self._exts = ['.jpg', '.jpeg', '.png']
       self.labelfile = labelfile
       self._list_images(self._root,self.labelfile)
   def _list_images(self, path,labelfile):
       self.items = []
       labellist = {}
       with open(labelfile, "r") as f:  # open the label file
           data = f.read().split('\n')[:-1]
           for i in range(len(data)):labellist[data[i].split(' ')[0]] = data[i].split(' ')[1]
           for filename in sorted(os.listdir(path)):
               if filename in labellist.keys():
                   label = labellist[filename]
               filename = os.path.join(path, filename)
               ext = os.path.splitext(filename)[1]
               if ext.lower() not in self._exts: print('Ignoring %s of type %s. Only support %s' % (filename, ext, ', '.join(self._exts)))
               self.items.append((filename, label))

   def __getitem__(self, idx):

       img = image.imread(self.items[idx][0], self._flag)
       label = self.items[idx][1]
       if self._transform is not None:
           return self._transform(img, label)
       return img, label

   def __len__(self):
       return len(self.items)
def transform_train(data, label):
   im1 = image.imresize(data.astype('float32') / 255, 224, 224)
   auglist1 = image.CreateAugmenter(data_shape=(3, 224, 224), resize=0,
                       rand_crop=False, rand_resize=False, rand_mirror=False,
                       mean=numpy.array([0.485, 0.456, 0.406]), std=numpy.array([0.229, 0.224, 0.225]),
                       brightness=0, contrast=0,
                       saturation=0, hue=0,
                       pca_noise=0, rand_gray=0, inter_method=2)

   for aug in auglist1:
       im1 = aug(im1)
   # change the data layout from height*width*channel to channel*height*width
   im1 = nd.transpose(im1, (2,0,1))
   # im1 = nd.expand_dims(data=im1, axis=0)
   return (im1, nd.array([label],ctx=mx.gpu()).asscalar().astype('float32'))
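
One common cause of an accuracy around 20% with a correctly converted checkpoint (a hedged guess based on the symptoms above, not a confirmed diagnosis) is evaluating with BatchNorm still in training mode. A minimal evaluation sketch that calls model.eval() and disables gradients (the random batch is a placeholder; in practice feed the normalized loader output):

import torch
from res2net import res2net50_26w_4s   # assumption: repository layout

model = res2net50_26w_4s(pretrained=True).cuda(0)
model.eval()                                   # critical: freeze BatchNorm running statistics
images = torch.rand(8, 3, 224, 224).cuda(0)    # placeholder batch; replace with real, normalized images
with torch.no_grad():
    logits = model(images)
preds = logits.argmax(dim=1)                   # predicted ImageNet class indices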

res2net50 mmdetection v2

I'd love to use Res2Net in mmdetection.
How can I convert a pretrained Res2Net from mmdetection v1 to v2?
I want to train res2net50 and convert some of my experiments from v1.1.

Res2NeXt on Cifar100

Hi @gasvn ,

Thanks for the brilliant work!

I have a couple of simple questions regarding Res2NeXt on Cifar100.

  1. The implementation for ImageNet used the block without hierarchical addition for downsampling, but the code you mentioned in other issue threads (https://gist.github.com/gasvn/cd7653ef93fb147be05f1ae4abad6589) used group convolutions as the first block at each stage for downsampling instead. I wonder which one is the correct one?
  2. Did you use batch size 256 or 128 for the training? I saw your init LR was set to 0.05, which was used by ResNeXt for batch size 256.

Best wishes,

Qiang

some other question

Thank you very much for your great work. I am a novice. Generally, when judging whether a network architecture is effective, do you train on only part of the data? Training on the full ImageNet is too expensive. Looking forward to your reply!
