cuong3004 / mobilevit-gsoc

This project forked from sayannath/mobilevit-gsoc

Implementation of MobileViT in TensorFlow and Keras

Home Page: https://arxiv.org/abs/2110.02178

License: Apache License 2.0

Shell 0.03% Python 16.05% Jupyter Notebook 83.92%

mobilevit-gsoc's Introduction

MobileViT GSoC 2022

Description

  • Year: 2022
  • Organisation: TensorFlow
  • Project Title: Publish fine-tuned MobileViT in TensorFlow Hub

    TensorFlow Hub is the main TensorFlow model repository, with thousands of pre-trained models that come with documentation and sample code and are ready to use or fine-tune. The idea behind the project is to implement new state-of-the-art models such as MobileViT and publish versions pre-trained on the ImageNet-1k dataset on TensorFlow Hub.

    MobileViT is a light-weight, general-purpose vision transformer for mobile devices. It presents a different perspective for the global processing of information with transformers, i.e., transformers as convolutions. The MobileViT paper [1] reports that it significantly outperforms CNN- and ViT-based networks across different tasks and datasets. On the ImageNet-1k dataset, MobileViT achieves a top-1 accuracy of 78.4% with about 6 million parameters, which is 3.2% and 6.2% more accurate than MobileNetv3 (CNN-based) and DeIT (ViT-based), respectively, for a similar number of parameters. On the MS-COCO object detection task, MobileViT is 5.7% more accurate than MobileNetv3 for a similar number of parameters.
  • Mentors: Luis Gustavo Martins & Sayak Paul

Project Report

This repository provides TensorFlow / Keras implementations of different MobileViT [1] variants. It also provides TensorFlow / Keras models populated with the original MobileViT pre-trained weights available from [2]. These models are not black-box SavedModels, i.e., they can be fully expanded into tf.keras.Model objects, and all the usual utility functions (for example, .summary()) can be called on them.

As of today, all the TensorFlow / Keras variants of the models listed here are available in this repository. This list includes the ImageNet-1k models.

Refer to the "Using the models" section to get started.

Conversion

TensorFlow / Keras implementations are available in mobilevit/models/mobilevit.py. Conversion utilities are in convert.py.

Models

The converted models will be available on TF-Hub.

There should be three models in total, each with two variants: classifier and feature extractor. You can load any model and get started like so:

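import tensorflow as tf

# Replace 'model_path' with the path to a downloaded SavedModel directory.
model = tf.keras.models.load_model('model_path')
model.summary()  # summary() prints directly and returns None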

The model names are interpreted as follows:

  • mobilevit_xxs_1k_256: Means that the model was pre-trained on the ImageNet-1k dataset with a resolution of 256x256.
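The naming convention above can be unpacked programmatically. The following is a minimal sketch, assuming every name follows the four-part pattern mobilevit_<variant>_<dataset>_<resolution> seen in the example; `parse_model_name` is a hypothetical helper written for illustration and is not part of this repository.

```python
def parse_model_name(name: str) -> dict:
    """Split a name such as 'mobilevit_xxs_1k_256' into its parts.

    Hypothetical helper: assumes the four-part pattern
    mobilevit_<variant>_<dataset>_<resolution>.
    """
    parts = name.split("_")
    if len(parts) != 4 or parts[0] != "mobilevit":
        raise ValueError(f"unexpected model name: {name!r}")
    _, variant, dataset, resolution = parts
    size = int(resolution)  # square input, e.g. 256 means 256x256
    return {
        "variant": variant,              # xxs, xs, or s
        "dataset": dataset,              # '1k' stands for ImageNet-1k
        "image_shape": (size, size, 3),  # height, width, RGB channels
    }


info = parse_model_name("mobilevit_xxs_1k_256")
print(info)
```

The returned `image_shape` tuple has the same layout as the `image_shape` argument used when building a randomly initialized model.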

Results

Results are on the ImageNet-1k validation set (top-1 accuracy, in %).

name            original acc@1   keras acc@1
MobileViT_XXS   69.0             68.59
MobileViT_XS    74.7             74.67
MobileViT_S     78.3             78.36

The small differences in the results are primarily due to differences between the library implementations, especially in how image resizing is implemented in PyTorch and TensorFlow. Results can be verified with the code in imagenet_1k_eval. Logs are available at this URL.
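To make the size of these differences concrete, the short sketch below computes the gap between the original and the Keras top-1 accuracy for each model. The numbers are copied from the results table; the `accuracy_gaps` helper is illustrative only and not part of this repository.

```python
# Top-1 accuracies (in %) copied from the results table:
# (original, Keras port).
results = {
    "MobileViT_XXS": (69.0, 68.59),
    "MobileViT_XS": (74.7, 74.67),
    "MobileViT_S": (78.3, 78.36),
}


def accuracy_gaps(table):
    """Return original minus Keras accuracy, in percentage points."""
    return {
        name: round(original - keras, 2)
        for name, (original, keras) in table.items()
    }


gaps = accuracy_gaps(results)
for name, gap in gaps.items():
    print(f"{name}: {gap:+.2f}")
```

With these numbers, the largest gap is 0.41 percentage points (MobileViT_XXS), and the Keras port of MobileViT_S scores 0.06 points higher than the original.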

Using the models

Pre-trained models:

Randomly initialized models:

from mobilevit.models.mobilevit import get_mobilevit_model

model = get_mobilevit_model(
    model_name='mobilevit_xxs',  # one of: mobilevit_xxs, mobilevit_xs, mobilevit_s
    image_shape=(256, 256, 3),
    num_classes=1000,
)

model.summary()  # summary() prints directly and returns None

To view different model configurations, refer here.

Upcoming Contributions

  • Allow the models to accept more input shapes (useful for downstream tasks)
  • Convert the saved_models to TFLite.
  • Fine-tuning notebook
  • Off-the-shelf-classification notebook
  • Publish models on TF-Hub

References

[1] MobileViT Paper: https://arxiv.org/abs/2110.02178

[2] Official MobileViT weights: https://github.com/apple/ml-cvnets

[3] Hugging Face MobileViT: MobileViT-HF
