
pytorch-mobilenet-v2's Introduction

A PyTorch implementation of MobileNetV2

This is a PyTorch implementation of the MobileNetV2 architecture, as described in the paper Inverted Residuals and Linear Bottlenecks: Mobile Networks for Classification, Detection and Segmentation.

[NEW] Added code to automatically download the pre-trained weights.

Training Recipe

Recently I have figured out a good training setting:

  1. number of epochs: 150
  2. learning rate schedule: cosine learning rate, initial lr=0.05
  3. weight decay: 4e-5
  4. remove dropout

You should get >72% top-1 accuracy with this training recipe!
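
The cosine schedule in item 2 can be sketched as follows. This is a minimal illustration, assuming the learning rate is annealed once per epoch from 0.05 down to 0 with no warmup; the recipe above does not state the step granularity, the minimum learning rate, or any warmup, and `cosine_lr` is a hypothetical helper name.

```python
import math

INITIAL_LR = 0.05   # initial lr from the recipe above
NUM_EPOCHS = 150    # number of epochs from the recipe above


def cosine_lr(epoch, initial_lr=INITIAL_LR, num_epochs=NUM_EPOCHS):
    """Cosine-annealed learning rate for one epoch.

    Assumption: per-epoch annealing with a floor of 0 and no warmup.
    """
    return 0.5 * initial_lr * (1.0 + math.cos(math.pi * epoch / num_epochs))


if __name__ == "__main__":
    # Start, midpoint and end of training.
    for epoch in (0, 75, 150):
        print(epoch, round(cosine_lr(epoch), 6))
```

In a PyTorch training loop, the value would typically be written into each `param_group["lr"]` of the optimizer at the start of every epoch.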

Accuracy & Statistics

Here is a comparison of statistics against the official TensorFlow implementation.

              FLOPs       Parameters  Top-1 acc.  Pretrained model
Official TF   300 M       3.47 M      71.8%       -
Ours          300.775 M   3.471 M     71.8%       [google drive]

Usage

To use the pretrained model, run

from MobileNetV2 import mobilenet_v2

net = mobilenet_v2(pretrained=True)

Data Pre-processing

I used the following code for data pre-processing on ImageNet:

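import torch
from torchvision import datasets, transforms

# traindir, valdir, batch_size and n_worker are assumed to be defined by the caller.
normalize = transforms.Normalize(mean=[0.485, 0.456, 0.406],
                                 std=[0.229, 0.224, 0.225])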

input_size = 224
train_dataset = datasets.ImageFolder(
    traindir,
    transforms.Compose([
        transforms.RandomResizedCrop(input_size, scale=(0.2, 1.0)), 
        transforms.RandomHorizontalFlip(),
        transforms.ToTensor(),
        normalize,
    ]))

train_loader = torch.utils.data.DataLoader(
    train_dataset, batch_size=batch_size, shuffle=True,
    num_workers=n_worker, pin_memory=True)

val_loader = torch.utils.data.DataLoader(
    datasets.ImageFolder(valdir, transforms.Compose([
        transforms.Resize(int(input_size/0.875)),
        transforms.CenterCrop(input_size),
        transforms.ToTensor(),
        normalize,
    ])),
    batch_size=batch_size, shuffle=False,
    num_workers=n_worker, pin_memory=True)

pytorch-mobilenet-v2's People

Contributors

tonylins

pytorch-mobilenet-v2's Issues

The pre-trained model file is corrupted?

I have downloaded the pre-trained model file several times and tried to extract the tar file. But it always shows "An error occurred while loading the archive"

file broken ?

Hi, I downloaded the file and tried to unzip and untar it, but neither worked. I tried on both Ubuntu and Mac, with no luck. Can you check that the file is still good?

Thanks in advance

Question regarding inverted residual block

Quick question regarding inverted residual block.

    if expand_ratio == 1:
        self.conv = nn.Sequential(
            # dw
            nn.Conv2d(hidden_dim, hidden_dim, 3, stride, 1, groups=hidden_dim, bias=False),
            nn.BatchNorm2d(hidden_dim),
            nn.ReLU6(inplace=True),
            # pw-linear
            nn.Conv2d(hidden_dim, oup, 1, 1, 0, bias=False),
            nn.BatchNorm2d(oup),
        )
    else:
        self.conv = nn.Sequential(
            # pw
            nn.Conv2d(inp, hidden_dim, 1, 1, 0, bias=False),
            nn.BatchNorm2d(hidden_dim),
            nn.ReLU6(inplace=True),
            # dw
            nn.Conv2d(hidden_dim, hidden_dim, 3, stride, 1, groups=hidden_dim, bias=False),
            nn.BatchNorm2d(hidden_dim),
            nn.ReLU6(inplace=True),
            # pw-linear
            nn.Conv2d(hidden_dim, oup, 1, 1, 0, bias=False),
            nn.BatchNorm2d(oup),
        )

def forward(self, x):
    if self.use_res_connect:
        return x + self.conv(x)
    else:
        return self.conv(x)

Why is there no activation at the end?
Shouldn't it return either
nn.ReLU(x + self.conv(x)) or nn.ReLU(self.conv(x))?

Thank you

how to train?

Can you please tell us how to train the model from scratch, and also how to test it?

num of Input channels are more than 3

Hello! Thanks for the code!

I am trying to transfer MobileNet to my own dataset, which contains image and saliency data. When I concatenate them, I get an error saying the number of channels should be 3 rather than 4.

Would you please give me some advice? Thanks a lot.

depthwise "separable" convolution not implemented correctly?

First of all, great work :)

I'm just a little bit concerned about your implementation of the depthwise separable convolution. AFAIK, it's supposed to be a depthwise convolution followed by a pointwise convolution.

What I am thinking of is something like this:

cin, cout = channels_in, channels_out

depthwise = nn.Conv2d(cin, cin, kernel_size=3, padding=(1,1), groups=cin)
pointwise = nn.Conv2d(cin, cout, kernel_size=1)

depthwise_separable_convolution = nn.Sequential(
    depthwise,
    pointwise
)

But in your implementation, you implemented it as a single convolution, which I believe is a depthwise convolution, but not a depthwise "separable" convolution.
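
For reference, the saving from factorizing a standard convolution into a depthwise step followed by a pointwise step can be checked with simple arithmetic. This is a sketch that ignores biases and batch norm; the function names and the 32/64 channel sizes are illustrative and not taken from the repository.

```python
def standard_conv_params(cin, cout, k=3):
    """Weights in a dense k x k convolution (no bias)."""
    return k * k * cin * cout


def depthwise_separable_params(cin, cout, k=3):
    """Weights in a k x k depthwise conv (groups == cin) plus a 1 x 1 pointwise conv."""
    depthwise = k * k * cin
    pointwise = cin * cout
    return depthwise + pointwise


if __name__ == "__main__":
    cin, cout = 32, 64  # illustrative sizes only
    print("standard:", standard_conv_params(cin, cout))
    print("depthwise separable:", depthwise_separable_params(cin, cout))
```

Note that in the InvertedResidual code quoted in an earlier issue on this page, the grouped 3x3 convolution marked `# dw` is followed by a 1x1 convolution marked `# pw-linear`, so the two layers together appear to play the depthwise and pointwise roles.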

Dataset

Hi
Thank you for sharing the pre-trained model. Which dataset did you use to train the model?

type error when loading model

Hi,

I cloned the repo and made my own file, main.py, which has two lines:
from MobileNetV2 import MobileNetV2

net = MobileNetV2(n_class=1000)

However, when I run main, it gives me the following error:

  File "main.py", line 3, in <module>
    net = MobileNetV2(n_class=1000)
  File "/home/liuyanqi/pytorch-mobilenet-v2/MobileNetV2.py", line 89, in __init__
    self.features.append(block(input_channel, output_channel, s, expand_ratio=t))
  File "/home/liuyanqi/pytorch-mobilenet-v2/MobileNetV2.py", line 33, in __init__
    nn.Conv2d(hidden_dim, hidden_dim, 3, stride, 1, groups=hidden_dim, bias=False),
  File "/usr/local/lib/python2.7/dist-packages/torch/nn/modules/conv.py", line 297, in __init__
    False, _pair(0), groups, bias)
  File "/usr/local/lib/python2.7/dist-packages/torch/nn/modules/conv.py", line 33, in __init__
    out_channels, in_channels // groups, *kernel_size))
TypeError: new() received an invalid combination of arguments - got (float, float, int, int), but expected one of:

  • (torch.device device)
  • (torch.Storage storage)
  • (Tensor other)
  • (tuple of ints size, torch.device device)
  • (object data, torch.device device)

I'm using pytorch 0.4.0 and python 2.7. Any help would be appreciated.

Thanks in advance!
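
One possible cause, offered as an assumption because the relevant line of MobileNetV2.py is not quoted in this issue: if `hidden_dim` is computed with something like `round(inp * expand_ratio)`, Python 2.7 returns a float from `round`, whereas Python 3 returns an int when `round` is called with a single argument. `nn.Conv2d` would then receive float channel counts, which is consistent with the `(float, float, int, int)` in the message. An explicit cast removes the difference; `hidden_channels` is a hypothetical helper name.

```python
def hidden_channels(inp, expand_ratio):
    """Expanded channel count as a true int on both Python 2 and Python 3."""
    # round() returns a float on Python 2, so cast explicitly.
    return int(round(inp * expand_ratio))


if __name__ == "__main__":
    print(type(hidden_channels(32, 1)))
    print(hidden_channels(16, 6))
```

Running under Python 3 is another way to sidestep the problem, if that is an option.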

Model architecture

Why doesn't your implementation include an avgpool layer? As the paper shows, MobileNetV2 should have a pooling layer at the end!

About the invert res-block module

Hi, I have a question about the inverted residual block in your code. When I implemented this model, I was confused about how to build the inverted residual block.
I found that your code uses this:
self.use_res_connect = self.stride == 1 and inp == oup
to ensure that the input channels match the output channels. However, the input and output channels seem to always be different, so the skip connection appears to be unused because (inp == oup) is always false.
I hope you can reply to this issue. Thanks very much.
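
A quick enumeration suggests the condition is not always false. The sketch below assumes a width multiplier of 1.0, a 32-channel stem, and a builder that repeats each stage n times with the stride applied only to the first block of the stage. The (t, c, n, s) rows follow the architecture table in the MobileNetV2 paper; `list_blocks` is a hypothetical helper, not the repository's code.

```python
# (t, c, n, s): expansion ratio, output channels, repeats, stride of the first block.
SETTINGS = [
    (1, 16, 1, 1),
    (6, 24, 2, 2),
    (6, 32, 3, 2),
    (6, 64, 4, 2),
    (6, 96, 3, 1),
    (6, 160, 3, 2),
    (6, 320, 1, 1),
]


def list_blocks(settings=SETTINGS, first_channels=32):
    """Return (inp, oup, stride, t, use_res_connect) for every block."""
    blocks = []
    inp = first_channels
    for t, c, n, s in settings:
        for i in range(n):
            stride = s if i == 0 else 1          # only the first block downsamples
            use_res = stride == 1 and inp == c   # same test as in the issue above
            blocks.append((inp, c, stride, t, use_res))
            inp = c                              # later repeats keep the channel count
    return blocks


if __name__ == "__main__":
    blocks = list_blocks()
    with_skip = sum(1 for b in blocks if b[4])
    print(with_skip, "of", len(blocks), "blocks use the skip connection")
    # prints: 10 of 17 blocks use the skip connection
```

Under these assumptions, only the first block of each stage changes the channel count; the repeated blocks that follow it have matching input and output channels and stride 1.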

Training Details

Hi,

Thanks for sharing this awesome work. Could you please also provide the batch size, the hyper-parameters for the optimizer and the decay steps of the cosine learning rate?

# of FLOPs

Hi. How did you compute the FLOPs for your model? I would appreciate it if you could share your code.
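
The README does not say how the 300.775 M figure was produced. As a hedged illustration, assuming "FLOPs" here means multiply-accumulate operations of the convolution layers (a common convention for MobileNet-style models), the per-layer count can be sketched like this. `conv2d_macs` is a hypothetical helper, and the layer shapes in the example are illustrative.

```python
def conv2d_macs(h_in, w_in, cin, cout, k, stride=1, padding=0, groups=1):
    """Multiply-accumulates of one 2D convolution (bias and batch norm ignored)."""
    h_out = (h_in + 2 * padding - k) // stride + 1
    w_out = (w_in + 2 * padding - k) // stride + 1
    # Each output element needs k * k * (cin / groups) multiply-adds.
    return h_out * w_out * cout * (cin // groups) * k * k


if __name__ == "__main__":
    # A 3x3, stride-2 stem convolution on a 224x224 RGB input, 32 output channels.
    print(conv2d_macs(224, 224, 3, 32, k=3, stride=2, padding=1))
    # A 3x3 depthwise convolution on a 112x112x32 feature map.
    print(conv2d_macs(112, 112, 32, 32, k=3, stride=1, padding=1, groups=32))
```

Summing this quantity over every convolution, plus the final linear layer, gives a model-level total; whether it reproduces the exact number in the table depends on which layers the author included.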

linux extract mobilenet_v2.pth.tar and it is empty?

tar -xvf mobilenet_v2.pth.tar
tar: This does not look like a tar archive
tar: Skipping to next header
tar: Exiting with failure status due to previous errors
ll mobilenet_v2.pth.tar
-rwxrwxrwx 1 *** users 14205652 Jul 7 20:52 mobilenet_v2.pth.tar*
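
A likely explanation, stated as an assumption rather than something the author has confirmed here: PyTorch checkpoints are often saved with a `.pth.tar` name even though the file is not a tar archive, which would match the `tar` message above. The stdlib-only sketch below guesses the on-disk format; `describe_checkpoint` is a hypothetical helper.

```python
import tarfile
import zipfile


def describe_checkpoint(path):
    """Return a rough guess at the on-disk format of a checkpoint file."""
    if tarfile.is_tarfile(path):
        return "tar archive"
    if zipfile.is_zipfile(path):
        return "zip container (written by newer versions of torch.save)"
    with open(path, "rb") as f:
        # Pickle protocol 2 and later start with the byte 0x80.
        if f.read(1) == b"\x80":
            return "raw pickle stream (written by older versions of torch.save)"
    return "unknown"
```

If the result is anything other than "tar archive", it is worth skipping extraction and passing the path straight to `torch.load(path, map_location="cpu")`, which is how another issue on this page loads the same file.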

Pre-processing

Thanks for this repo and the provided model.

Would you please say how you pre-process the images for evaluation (which mean and std are being used)?
Cheers

The pretrained model file is broken, could you provide a working one?

I downloaded the file and tried to extract it:

tar xvf mobilenet_v2.pth.tar

It turns out:

tar: This does not look like a tar archive
tar: Skipping to next header
tar: Exiting with failure status due to previous errors

There seems to be something wrong with the model file. Could you provide a new one?

How do you preprocess your data?

My preprocessing is:

(1) first, resize to 256x256;
(2) second, subtract the ImageNet mean, then divide by 256;
(3) finally, center crop to 224x224.

I then fed the validation data to your mobilenet_v2, but I did not reach your accuracy.
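
For comparison, the validation pipeline in the README's Data Pre-processing section differs from these steps in a few ways. `transforms.Resize` with a single int resizes the shorter side (keeping the aspect ratio) rather than forcing a square, `ToTensor` scales pixels to [0, 1] by dividing by 255, and `Normalize` then subtracts the mean and divides by the per-channel std. A stdlib-only sketch of the arithmetic, with hypothetical helper names:

```python
MEAN = (0.485, 0.456, 0.406)  # from the README snippet
STD = (0.229, 0.224, 0.225)   # from the README snippet
INPUT_SIZE = 224


def resize_target(input_size=INPUT_SIZE):
    """Shorter-side length used before the center crop, as in the README."""
    return int(input_size / 0.875)


def normalize_pixel(rgb_uint8):
    """Mimic ToTensor (divide by 255) followed by Normalize for one RGB pixel."""
    return tuple((v / 255.0 - m) / s for v, m, s in zip(rgb_uint8, MEAN, STD))


if __name__ == "__main__":
    print(resize_target())
    print(normalize_pixel((255, 255, 255)))
```

Whether any of these differences accounts for the accuracy gap in this report is not something the page confirms.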

InvertedResidual module: there should be a bottleneck (pw + dw + pw-linear) whether the expansion is 1 or 6

Whether the expansion ratio is 1 or 6, the InvertedResidual module should be a bottleneck: pw + dw + pw (linear).
But in the code, when the expansion ratio is 1, there is only dw + pw-linear:

nn.Conv2d(in_channels, inner_channels, kernel_size=3, stride=stride, padding=1, groups=inner_channels, bias=False),
nn.BatchNorm2d(inner_channels),
nn.ReLU6(inplace=True),
# pw-linear
nn.Conv2d(inner_channels, out_channels, 1, 1, 0, bias=False),
nn.BatchNorm2d(out_channels),

It is slower than pytorch-mobilenet-v1

I ran the model on a Mac, but it is slower than pytorch-mobilenet-v1: ('MobileNetV2 forward time:', 0.29175710678100586) versus ('MobileNet forward time:', 0.2466120719909668). According to the paper, this architecture should be faster. Could the difference come from the implementation?

about warmup epochs

How many warmup epochs did you use? I used the same parameters as described in README.md, but I can only get 71.35%.

input_size seems not used

Hi, I wonder how input_size is used? The only place I can see it is the check that the input size is 32*k.

trained models

Hello, have you trained mobilenet_v2_x1.5 on ImageNet?

how to train the net on my own dataset

Pretty good job, Tony! Thanks for the brilliant work you've done!
I am a novice in DL and am just wondering how to train the network on my own dataset. Where is the "input data" snippet? :)

Incompatible keys when loading the model

Hi!
I'm having trouble loading the model; it seems that there are some incompatible keys.
I've saved the pretrained model and the Python script in a folder named mobilenet. This is my code:

import torch
from mobilenet.MobileNetV2 import MobileNetV2
net = MobileNetV2(n_class=1000)
state_dict = torch.load('mobilenet/mobilenet_v2.pth.tar')
net.load_state_dict(state_dict)

After running it, I get this error on the last line:

IncompatibleKeys(missing_keys=[], unexpected_keys=[])

I am using Pytorch 1.1.0.
Any help is appreciated!

Extra 1k parameters compared to TF

The README mentions that the number of parameters in this model is higher than that of the official TF version by 1k (3.471 M vs 3.47 M). Why is that so?

Regarding Hyperparameters

Thanks for open-sourcing the code so quickly.
Can you let me know the hyperparameters you used? I want to train my own model. Did you only use a batch size of 96?
Also, with
net = models.MobileNetV2(n_class=1000)
I am getting the error "name 'models' is not defined".

Did you understand Theorem 1?

temporal shift module error

Hello

First of all, thank you for generously open-sourcing this. Very good work!!!

Actually, I wanted to ask a question about the TSM project but could not find where to post it, so I am asking here. The TSM project does not provide the split .txt files for the ucf101 and hmdb51 datasets. The TSN project has related code, but it requires installing a lot of extra libraries, so I wrote my own code to generate the split .txt file and ran training with the command-line arguments you provided. In the end the result is very poor (only 1.3%). I think there is an error in the split .txt file I generated, so could I see the internal data format of the split .txt file you generated?
My split .txt data format is:
/home/dpc/3D-ResNets-PyTorch-master/data/test_jpg/ApplyEyeMakeup/v_ApplyEyeMakeup_g08_c01.avi 121 0
/home/dpc/3D-ResNets-PyTorch-master/data/test_jpg/ApplyEyeMakeup/v_ApplyEyeMakeup_g08_c02.avi 118 0
/home/dpc/3D-ResNets-PyTorch-master/data/test_jpg/ApplyEyeMakeup/v_ApplyEyeMakeup_g08_c03.avi 147 0
......

My RGB frame directory format is:
ApplyEyeMakeup/v_ApplyEyeMakeup_g01_c01.avi/img_00001.jpg
ApplyEyeMakeup/v_ApplyEyeMakeup_g01_c01.avi/img_00002.jpg
ApplyEyeMakeup/v_ApplyEyeMakeup_g01_c01.avi/img_00003.jpg
......

In fact, I'm still not sure what caused the low accuracy......

I would be very grateful if I could receive your reply! :)
