rishikksh20 / crossvit-pytorch


Implementation of CrossViT: Cross-Attention Multi-Scale Vision Transformer for Image Classification

License: MIT License

Python 100.00%
classifier computer-vision image-classification pytorch transformers vision-transformers

crossvit-pytorch's Introduction

Hi there, I'm Rishikesh, a Speech and Computer Vision Researcher 👋

Hi friends, I'm Rishikesh, Co-founder and CTO of Dubpro.ai (formerly known as DeepSync Technologies). I graduated from NIT Silchar and, immediately after graduation, joined my first organisation, Nucleus Software, as a Full Stack Developer. I have a keen interest in machine learning and deep learning research, especially in the fields of speech synthesis and computer vision.

  • 🔭 I’m currently working on Speech Synthesis and End to End Text to Speech (TTS) engines.
  • 🌱 I love to code and contribute to Open Source.
  • 💬 Ask me anything regarding my work, code and research here (please tag me @rishikksh20 in your comment).
  • 📫 How to reach me: [email protected]

Connect with me:

ai_rishikesh | Twitter


Languages and Tools:

Python

PyTorch

Github

Visual Studio Code

AWS

Azure

crossvit-pytorch's People

Contributors

rishikksh20

crossvit-pytorch's Issues

About the questions in Table 7 of the article

Hello, I am very interested in your work, but there is one thing I don't understand. Comparing Table 7 of the paper with the CrossViT-S configuration, I would expect the first row to be K=3, N=1, M=4 and L=1. I don't quite understand the setting in the first row of Table 7, and it seems inconsistent with what the article describes. Maybe I have misunderstood your meaning, and I hope to receive your reply. Thank you!

Multilabel Image classification

@rishikksh20 thanks for sharing the code base. I have a few queries:
1. Can we train CrossViT for a multilabel classification problem? If so, what is the procedure?
2. I have a custom dataset of 10.5k samples with 25 class labels, where each instance's label is a vector of 0s and 1s.
3. Can we remove the pre-trained classifier head and add our own custom classifier?

Thanks in advance
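
For readers with the same question, here is a minimal sketch of the loss usually paired with 0/1 label vectors like the ones described above: one independent sigmoid per class with binary cross-entropy, instead of a single softmax over classes. It is plain Python and does not use this repository's code; the helper names (`multilabel_bce`, `predict_labels`), the 0.5 threshold and the example logits are assumptions made for illustration. Only the 25-class figure comes from the question. In PyTorch the same loss is provided by `torch.nn.BCEWithLogitsLoss`, applied to a head that outputs one logit per class.

```python
import math
from typing import List, Sequence

NUM_CLASSES = 25  # taken from the question; all other values are illustrative


def sigmoid(z: float) -> float:
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def multilabel_bce(logits: Sequence[float], targets: Sequence[int]) -> float:
    """Mean binary cross-entropy over classes, computed from raw logits."""
    if len(logits) != len(targets):
        raise ValueError("logits and targets must have the same length")
    total = 0.0
    for z, t in zip(logits, targets):
        # Stable form of -[t*log(sigmoid(z)) + (1 - t)*log(1 - sigmoid(z))]
        total += max(z, 0.0) - z * t + math.log1p(math.exp(-abs(z)))
    return total / len(logits)


def predict_labels(logits: Sequence[float], threshold: float = 0.5) -> List[int]:
    """Turn per-class logits into a 0/1 label vector (hypothetical threshold)."""
    return [1 if sigmoid(z) >= threshold else 0 for z in logits]


# Hypothetical sample whose active labels are classes 2 and 7.
targets = [0] * NUM_CLASSES
targets[2] = 1
targets[7] = 1

# Hypothetical model outputs: confident and correct for every class.
logits = [4.0 if t == 1 else -4.0 for t in targets]

loss = multilabel_bce(logits, targets)
predicted = predict_labels(logits)
```

Because each class is scored independently, a sample can carry any number of active labels, which is what distinguishes this setup from single-label classification with a softmax output.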
