tonylins / pytorch-mobilenet-v2

A PyTorch implementation of MobileNet V2 architecture and pretrained model.

License: Apache License 2.0

pytorch-mobilenet-v2's Introduction

A PyTorch implementation of MobileNetV2

This is a PyTorch implementation of the MobileNetV2 architecture as described in the paper Inverted Residuals and Linear Bottlenecks: Mobile Networks for Classification, Detection and Segmentation.

[NEW] Add the code to automatically download the pre-trained weights.

Training Recipe

Recently I have figured out a good training setting:

number of epochs: 150
learning rate schedule: cosine learning rate, initial lr=0.05
weight decay: 4e-5
remove dropout

You should get >72% top-1 accuracy with this training recipe!
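As a rough illustration of the schedule listed above, the cosine learning rate can be sketched in plain Python. This is a minimal sketch, not the author's training script: the function name `cosine_lr` is hypothetical, and it assumes the rate is updated once per epoch with no warmup and decays to zero. PyTorch's `torch.optim.lr_scheduler.CosineAnnealingLR` follows the same curve when `eta_min=0`.

```python
import math


def cosine_lr(epoch, total_epochs=150, base_lr=0.05):
    """Cosine-annealed learning rate for a 0-based epoch index.

    Assumption: per-epoch updates, no warmup, final rate of zero.
    """
    return 0.5 * base_lr * (1.0 + math.cos(math.pi * epoch / total_epochs))


# Print the rate at the start, midpoint and end of training.
for epoch in (0, 75, 150):
    print(epoch, cosine_lr(epoch))
```

The rate starts at the initial value of 0.05, is halved at the midpoint, and reaches zero at epoch 150.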

Accuracy & Statistics

Here is a comparison of statistics against the official TensorFlow implementation.

Implementation	FLOPs	Parameters	Top-1 acc	Pretrained Model
Official TF	300 M	3.47 M	71.8%	-
Ours	300.775 M	3.471 M	71.8%	[google drive]

Usage

To use the pretrained model, run

from MobileNetV2 import mobilenet_v2

net = mobilenet_v2(pretrained=True)

Data Pre-processing

I used the following code for data pre-processing on ImageNet:

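import torch
from torchvision import datasets, transforms

# traindir, valdir, batch_size and n_worker are assumed to be defined by the caller.
normalize = transforms.Normalize(mean=[0.485, 0.456, 0.406],
                                 std=[0.229, 0.224, 0.225])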

input_size = 224
train_dataset = datasets.ImageFolder(
    traindir,
    transforms.Compose([
        transforms.RandomResizedCrop(input_size, scale=(0.2, 1.0)), 
        transforms.RandomHorizontalFlip(),
        transforms.ToTensor(),
        normalize,
    ]))

train_loader = torch.utils.data.DataLoader(
    train_dataset, batch_size=batch_size, shuffle=True,
    num_workers=n_worker, pin_memory=True)

val_loader = torch.utils.data.DataLoader(
    datasets.ImageFolder(valdir, transforms.Compose([
        transforms.Resize(int(input_size/0.875)),
        transforms.CenterCrop(input_size),
        transforms.ToTensor(),
        normalize,
    ])),
    batch_size=batch_size, shuffle=False,
    num_workers=n_worker, pin_memory=True)

pytorch-mobilenet-v2's Issues

The pre-trained model file is corrupted?

I have downloaded the pre-trained model file several times and tried to extract the tar file. But it always shows "An error occurred while loading the archive"

File broken?

Hi, I downloaded the file and tried to unzip or untar it, but did not succeed. I tried on both Ubuntu and Mac with no luck. Can you check that the file is still good?

Thanks in advance

Question regarding inverted residual block

Quick question regarding inverted residual block.

    if expand_ratio == 1:
        self.conv = nn.Sequential(
            # dw
            nn.Conv2d(hidden_dim, hidden_dim, 3, stride, 1, groups=hidden_dim, bias=False),
            nn.BatchNorm2d(hidden_dim),
            nn.ReLU6(inplace=True),
            # pw-linear
            nn.Conv2d(hidden_dim, oup, 1, 1, 0, bias=False),
            nn.BatchNorm2d(oup),
        )
    else:
        self.conv = nn.Sequential(
            # pw
            nn.Conv2d(inp, hidden_dim, 1, 1, 0, bias=False),
            nn.BatchNorm2d(hidden_dim),
            nn.ReLU6(inplace=True),
            # dw
            nn.Conv2d(hidden_dim, hidden_dim, 3, stride, 1, groups=hidden_dim, bias=False),
            nn.BatchNorm2d(hidden_dim),
            nn.ReLU6(inplace=True),
            # pw-linear
            nn.Conv2d(hidden_dim, oup, 1, 1, 0, bias=False),
            nn.BatchNorm2d(oup),
        )

def forward(self, x):
    if self.use_res_connect:
        return x + self.conv(x)
    else:
        return self.conv(x)

Why is there no activation at the end?
Shouldn't it return either
nn.ReLU(x + self.conv(x)) or nn.ReLU(self.conv(x))?

Thank you

the problem with RMSprop

something seems wrong with the loss of RMSprop. Any advice?

how to train?

Can you please tell us how to train the model from scratch, and also how to test it?

num of Input channels are more than 3

Hello! Thanks for the code!

I am trying to transfer MobileNet to my own dataset, which contains image and saliency data. When I concatenate them, I get an error saying the number of channels should be 3 rather than 4.

Would you please give me some advice? Thanks a lot.

depthwise "separable" convolution not implemented correctly?

First of all, great work :)

I'm just a little bit concerned about your implementation of the depthwise separable convolution. AFAIK, it's supposed to be a depthwise convolution followed by a pointwise convolution.

What I am thinking of is something like this:

cin, cout = channels_in, channels_out

depthwise = nn.Conv2d(cin, cin, kernel_size=3, padding=(1,1), groups=cin)
pointwise = nn.Conv2d(cin, cout, kernel_size=1)

depthwise_separable_convolution = nn.Sequential(
    depthwise,
    pointwise
)

But in your implementation, you implemented it as a single convolution, which I believe is a depthwise convolution, but not a depthwise "separable" convolution.
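For reference, the difference between a standard 3x3 convolution and the depthwise plus pointwise pair can be checked by counting weights. The sketch below is only illustrative: `conv_params` is a hypothetical helper and the channel sizes are arbitrary. In the block quoted in the inverted residual question above, the `# dw` convolution is followed by a `# pw-linear` 1x1 convolution, which together form such a pair.

```python
def conv_params(cin, cout, k, groups=1):
    """Weight count of a bias-free 2D convolution with a k x k kernel."""
    return (cin // groups) * cout * k * k


cin, cout = 32, 64  # arbitrary example channel sizes

standard = conv_params(cin, cout, 3)              # ordinary 3x3 conv
depthwise = conv_params(cin, cin, 3, groups=cin)  # one 3x3 filter per channel
pointwise = conv_params(cin, cout, 1)             # 1x1 conv mixing channels

print(standard, depthwise + pointwise)
```

With these sizes the separable pair needs 2,336 weights against 18,432 for the standard convolution.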

Dataset

Hi
Thank you for sharing the pre-trained model. I wanted to know which dataset you have used for training the model?

about the Flops and params

I've measured this implementation using this script:
https://github.com/ShichenLiu/CondenseNet/blob/master/utils.py

and got the following result:
FLOPs: 320.19M, Params: 3.51M

which is slightly higher than the paper.
I am confused about this.

what the purpose of having `assert input_size % 32 == 0` in the code?

I don't get the meaning of assert input_size % 32 == 0 in the code.
Why do we want the input size to be divisible by 32?
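One likely reason is that the architecture in the paper reduces spatial resolution five times by a factor of 2 (the first convolution plus four stride-2 bottleneck stages), for a total stride of 32. The sketch below traces the feature-map size through those layers. It is an assumption-laden illustration, not the repository's code: the helper names are hypothetical, and each stride-2 layer is modelled as a 3x3 convolution with padding 1.

```python
def conv_out_size(size, kernel=3, stride=2, padding=1):
    """Spatial output size of a convolution (standard floor formula)."""
    return (size + 2 * padding - kernel) // stride + 1


def feature_map_sizes(input_size, num_stride2_layers=5):
    """Sizes after each of the stride-2 layers, starting from the input."""
    sizes = [input_size]
    for _ in range(num_stride2_layers):
        sizes.append(conv_out_size(sizes[-1]))
    return sizes


print(feature_map_sizes(224))  # [224, 112, 56, 28, 14, 7]
print(feature_map_sizes(200))  # [200, 100, 50, 25, 13, 7]
```

For 224 the final map is exactly 224 / 32 = 7. For 200 the final map is 7 while 200 // 32 is 6, so any layer sized from `input_size / 32` would no longer match the actual feature map.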

type error when loading model

Hi,

I cloned the repo and made my own file main.py, which has two lines:
from MobileNetV2 import MobileNetV2

net = MobileNetV2(n_class=1000)

However, when I run main, it gives me the following error:

File "main.py", line 3, in <module>
net = MobileNetV2(n_class=1000)
File "/home/liuyanqi/pytorch-mobilenet-v2/MobileNetV2.py", line 89, in __init__
self.features.append(block(input_channel, output_channel, s, expand_ratio=t))
File "/home/liuyanqi/pytorch-mobilenet-v2/MobileNetV2.py", line 33, in __init__
nn.Conv2d(hidden_dim, hidden_dim, 3, stride, 1, groups=hidden_dim, bias=False),
File "/usr/local/lib/python2.7/dist-packages/torch/nn/modules/conv.py", line 297, in __init__
False, _pair(0), groups, bias)
File "/usr/local/lib/python2.7/dist-packages/torch/nn/modules/conv.py", line 33, in __init__
out_channels, in_channels // groups, *kernel_size))
TypeError: new() received an invalid combination of arguments - got (float, float, int, int), but expected one of:

(torch.device device)
(torch.Storage storage)
(Tensor other)
(tuple of ints size, torch.device device)
(object data, torch.device device)

I'm using pytorch 0.4.0 and python 2.7. Any help would be appreciated.

Thanks in advance!
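The traceback shows float values reaching nn.Conv2d as channel counts. A plausible cause, stated here as an assumption, is that the hidden channel count is computed with round(), which returns a float in Python 2 but an int in Python 3. Casting explicitly gives the same result on both. The helper name `hidden_channels` below is hypothetical.

```python
def hidden_channels(inp, expand_ratio):
    """Expanded channel count as an int on both Python 2 and Python 3.

    round() alone returns a float on Python 2, which Conv2d rejects.
    """
    return int(round(inp * expand_ratio))


print(hidden_channels(32, 6))     # 192
print(hidden_channels(16, 1.5))   # 24
```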

Model architecture

Why doesn't your implementation include an avgpool layer? As the paper shows, MobileNetV2 should have one pooling layer at the end.

About the invert res-block module

Hi, I have a question about the inverted residual block module in your code. When I implemented this model, I was confused about how to build the inverted residual block.
I found that in your code, you use this:
self.use_res_connect = self.stride == 1 and inp == oup
to ensure that the input channels match the output channels. However, the input channels and output channels seem to always be different. It therefore seems that the skip connection is never used, because (inp == oup) is always false.
I hope you can reply to this issue. Thanks very much.
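Whether the condition is ever true can be checked by walking the block configuration. The sketch below assumes the (t, c, n, s) settings from the paper (expansion factor, output channels, repeats, stride of the first repeat) and a 32-channel stem; the names `SETTINGS` and `residual_flags` are hypothetical and this is not the repository's code.

```python
# (t, c, n, s): expansion, output channels, repeats, stride of first repeat.
# Assumed from the MobileNetV2 paper, width multiplier 1.0.
SETTINGS = [
    (1, 16, 1, 1),
    (6, 24, 2, 2),
    (6, 32, 3, 2),
    (6, 64, 4, 2),
    (6, 96, 3, 1),
    (6, 160, 3, 2),
    (6, 320, 1, 1),
]


def residual_flags(settings, input_channel=32):
    """Return one bool per block: True where stride == 1 and inp == oup."""
    flags = []
    for _t, c, n, s in settings:
        for i in range(n):
            stride = s if i == 0 else 1  # only the first repeat downsamples
            flags.append(stride == 1 and input_channel == c)
            input_channel = c
    return flags


flags = residual_flags(SETTINGS)
print(sum(flags), "of", len(flags), "blocks use the skip connection")
```

Under these assumptions, 10 of the 17 blocks satisfy the condition: within a stage, every repeat after the first keeps both the channel count and the resolution.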

will you release width_mult = 0.5 pretrained model?

Did you use weight decay on the depthwise layers? Can you post the training log?

Training Details

Hi,

Thanks for sharing this awesome work. Could you please also provide the batch size, the hyper-parameters for the optimizer and the decay steps of the cosine learning rate?

the dropout rate should be 0.2 in original paper?

https://github.com/tensorflow/models/blob/master/research/slim/nets/mobilenet/mobilenet.py

# of FLOPs

Hi. How did you compute the FLOPs for your model? I appreciate it if you share your code.
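The author's counting script is not shown here. As a hedged sketch of one common convention, the multiply-accumulate count of a convolution can be computed from its shape, ignoring batch norm and activations. The helper `conv_macs` is hypothetical, and the layer shapes assume a 224x224 input with a stride-2 first convolution.

```python
def conv_macs(cin, cout, k, out_h, out_w, groups=1):
    """Multiply-accumulates of a bias-free k x k convolution."""
    return (cin // groups) * cout * k * k * out_h * out_w


# First layer: 3x3 conv, 3 -> 32 channels, 112x112 output.
first_conv = conv_macs(3, 32, 3, 112, 112)

# A 3x3 depthwise conv on 32 channels at the same resolution.
depthwise = conv_macs(32, 32, 3, 112, 112, groups=32)

print(first_conv, depthwise)
```

Summing this quantity over all convolution and linear layers gives a total that can be compared with the figures in the table above; different tools disagree slightly depending on whether they count batch norm, bias and activation operations.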

calculate flops

Are BN layers not included when calculating FLOPs?

last two layers gone

Dude, looks like you deleted the avgpool and the fc following at the end

linux extract mobilenet_v2.pth.tar and it is empty?

tar -xvf mobilenet_v2.pth.tar
tar: This does not look like a tar archive
tar: Skipping to next header
tar: Exiting with failure status due to previous errors
ll mobilenet_v2.pth.tar
-rwxrwxrwx 1 *** users 14205652 Jul 7 20:52 mobilenet_v2.pth.tar*

how the input image is normalized? what is the mean and the std?

Pre-processing

Thanks for this repo and the provided model.

Would you please say how you pre-process the images for evaluation (which mean and std are being used)?
Cheers

The pretrained model file is broken, could you provide a working one?

I downloaded the file and tried to extract it:

tar xvf mobilenet_v2.pth.tar

It turns out:

tar: This does not look like a tar archive
tar: Skipping to next header
tar: Exiting with failure status due to previous errors

It seems there is something wrong with the model file. Could you provide a new one?

How do you preprocess your data?

my preprocess is:

(1) firstly, resize to 256x256;
(2) secondly, Subtract the imagenet mean, then divided by 256;
(3) finally, center crop to 224x224;

then feed these validation data to your mobilenet_v2, but I didn't reach your accuracy.
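The steps listed in this question differ from the validation transform in the README in two ways: torchvision's Resize with a single integer resizes the shorter side while keeping the aspect ratio, and ToTensor scales pixels to [0, 1] before the per-channel mean is subtracted and the result divided by the per-channel std. The per-pixel arithmetic can be sketched in plain Python; `normalize_pixel` is a hypothetical helper written only to show the formula.

```python
MEAN = (0.485, 0.456, 0.406)  # per-channel mean from the README
STD = (0.229, 0.224, 0.225)   # per-channel std from the README


def normalize_pixel(rgb):
    """Normalize one 8-bit RGB pixel: scale to [0, 1], then (x - mean) / std."""
    return tuple((v / 255.0 - m) / s for v, m, s in zip(rgb, MEAN, STD))


input_size = 224
resize_to = int(input_size / 0.875)  # shorter-side resize before center crop

print(resize_to)
print(normalize_pixel((255, 255, 255)))
```

Subtracting a mean expressed in 0 to 255 units and dividing by 256 yields a different value range, which may account for part of the accuracy gap.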

The InvertedResidual module. No matter expansion is 1 or 6. There should be a bottle neck. pw+dw+pw(linear)

The InvertedResidual module. No matter expansion is 1 or 6. There should be a bottle neck. pw+dw+pw(linear).
But in the code, if the expansion is 1, there is only dw + pw-linear:
nn.Conv2d(in_channels, inner_channels, kernel_size=3, stride=stride, padding=1, groups=inner_channels, bias=False),
nn.BatchNorm2d(inner_channels),
nn.ReLU6(inplace=True),
# pw-linear
nn.Conv2d(inner_channels, out_channels, 1, 1, 0, bias=False),
nn.BatchNorm2d(out_channels),

It is slower than pytorch-mobilenet-v1

I ran the model on a Mac, but it is slower than pytorch-mobilenet-v1: ('MobileNetV2 forward time:', 0.29175710678100586), ('MobileNet forward time:', 0.2466120719909668). According to the paper, the architecture should be faster. Is the difference caused by the implementation?

about warmup epochs

How many warmup epochs did you use? I used the same parameters as described in README.md, but I can only get 71.35%.

input_size seems not used

Hi, I wonder how input_size is used? The only place I can see it is the check that the input size is 32*k.

trained models

Hello, have you trained mobilenet_v2_x1.5 on ImageNet?

how to train the net on my own dataset

Pretty good job, Tony! Thanks for the brilliant work you've done!
I am a novice in DL and am just wondering how to train the network on my own dataset. Where is the "input data" snippet? :)

Incompatible keys when loading the model

Hi!
I'm having trouble loading the model, it seems that there are some incompatible keys.
I've saved the pretrained model and python script on a folder named mobilenet, this is my code:

import torch
from mobilenet.MobileNetV2 import MobileNetV2
net = MobileNetV2(n_class=1000)
state_dict = torch.load('mobilenet/mobilenet_v2.pth.tar')
net.load_state_dict(state_dict)

After running it, I get this error on the last line:

IncompatibleKeys(missing_keys=[], unexpected_keys=[])

I am using Pytorch 1.1.0.
Any help is appreciated!

Extra 1k parameters compared to TF

The README mentions that number of parameters in this model is higher than those from official TF version by 1k (3.471 M vs 3.47 M). Why is that so?

problem about the last version of MobileNetV2

Hi, can you provide the pretrained model for the latest version of MobileNetV2? Thank you.

Regarding Hyperparameters

Thanks for open-sourcing the code so quickly.
Can you let me know the hyperparameters you used? I want to train my own model. Did you set the batch size to 96 only?
Also, when I run
net = models.MobileNetV2(n_class=1000)
I get the error "name 'models' is not defined".

Did you understand Theorem 1?

regarding number of epochs

Do you mind sharing the number of epochs you used to generate your results?

temporal shift module error

Hello

First of all thank you for your selfless open source, very good work!!!

Actually, I wanted to ask this in the TSM project but could not find where to open an issue there, so I am asking here. The TSM project does not provide the split .txt files for the ucf101 and hmdb51 datasets. The TSN project has related code, but it requires installing many extra libraries, so I wrote my own code to generate the split .txt files and ran training with the command-line arguments you provided. The final result is very poor (only 1.3%). I suspect there is an error in the split .txt files I generated, so could I see the internal data format of the split .txt files you generated?
My split .txt data format is:
/home/dpc/3D-ResNets-PyTorch-master/data/test_jpg/ApplyEyeMakeup/v_ApplyEyeMakeup_g08_c01.avi 121 0
/home/dpc/3D-ResNets-PyTorch-master/data/test_jpg/ApplyEyeMakeup/v_ApplyEyeMakeup_g08_c02.avi 118 0
/home/dpc/3D-ResNets-PyTorch-master/data/test_jpg/ApplyEyeMakeup/v_ApplyEyeMakeup_g08_c03.avi 147 0
......

My RGB frame directory format is:
ApplyEyeMakeup/v_ApplyEyeMakeup_g01_c01.avi/img_00001.jpg
ApplyEyeMakeup/v_ApplyEyeMakeup_g01_c01.avi/img_00002.jpg
ApplyEyeMakeup/v_ApplyEyeMakeup_g01_c01.avi/img_00003.jpg
......

In fact, I'm still not sure what caused the low accuracy.

I would be very grateful to receive your reply! :)