staghado / vit.cpp

177.0 6.0 16.0 2.23 MB

Inference Vision Transformer (ViT) in plain C/C++ with ggml

License: MIT License

Python 12.77% C++ 82.20% CMake 2.63% Shell 2.40%
cpu ggml vision-transformer whisper-cpp edge-computing llamacpp ai computer-vision image-classification c

vit.cpp's People

Contributors

ggerganov, mehdi-elion, staghado

vit.cpp's Issues

Implementation of vision models

Hi @staghado, thank you for publishing this amazing implementation of ViT using ggml.

I was thinking of doing a similar implementation for another transformer-based model by following your codebase. However, I could not find good documentation for ggml that describes its existing functionality and how to use it. Also, certain concepts (e.g., https://github.com/staghado/vit.cpp/blob/main/main.cpp#L82-L91) do not appear in a Python-based inference script.

Can you please share your approach when you implemented this ggml-based vit code? What were the resources that helped you to build this project? I appreciate any help you can provide.

Batch forward

Hey @staghado, thanks for the great work!

I'm wondering whether you have any ideas about implementing a batched forward pass over images, i.e. with batch size > 1.

I notice that batch processing in llama.cpp seems pretty different: no batch dimension is involved in the forward pass. I have also opened a discussion there (ggerganov/llama.cpp#4371). Do you have any thoughts on this? Thanks in advance!
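
To make the question about a batch dimension concrete, here is a minimal sketch of one possible memory layout. It assumes each image is a contiguous float buffer in channel-major (C x H x W) order and that the batch index is the outermost, slowest-varying dimension. The names `Batch`, `pack_batch`, `index`, and `at` are hypothetical; they are not part of vit.cpp or ggml, and the sketch covers only the packing step, not the forward pass.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical container: n images of shape c x h x w in one contiguous buffer.
struct Batch {
    std::size_t n, c, h, w;
    std::vector<float> data;  // holds n * c * h * w values

    // Flat offset of element (b, ch, y, x); the batch index varies slowest.
    std::size_t index(std::size_t b, std::size_t ch,
                      std::size_t y, std::size_t x) const {
        return ((b * c + ch) * h + y) * w + x;
    }

    float at(std::size_t b, std::size_t ch,
             std::size_t y, std::size_t x) const {
        return data[index(b, ch, y, x)];
    }
};

// Concatenate same-sized images into a single batch buffer.
// Throws if any image does not contain exactly c * h * w values.
Batch pack_batch(const std::vector<std::vector<float>>& images,
                 std::size_t c, std::size_t h, std::size_t w) {
    const std::size_t per_image = c * h * w;
    Batch batch{images.size(), c, h, w, {}};
    batch.data.reserve(images.size() * per_image);
    for (const auto& img : images) {
        if (img.size() != per_image) {
            throw std::invalid_argument("image size does not match c * h * w");
        }
        batch.data.insert(batch.data.end(), img.begin(), img.end());
    }
    return batch;
}
```

With this layout, image `b` starts at offset `b * c * h * w`, so per-image code can be reused by passing a pointer into the batch buffer. Whether the rest of the model can consume such a buffer in a single pass depends on the operators involved, which this sketch does not address.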

Is it possible to build it for Android

It would be nice if we could try this out on Android. I followed the instructions from https://github.com/ggerganov/ggml

I tried it, and it seems that we need to remove -march=native from CMakeLists.txt:

-set(CMAKE_C_FLAGS "${CMAKE_C_FLAGS} -O3 -march=native")
-set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -O3 -march=native")

However, after that I got an error:

CANNOT LINK EXECUTABLE "/data/local/tmp/bin/vit": cannot locate symbol "__emutls_get_address" referenced by "/data/local/tmp/bin/vit"...
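
Rather than deleting the flag outright, the change to CMakeLists.txt described above could be made conditional, so that native builds keep -march=native. This is only a sketch: it assumes the Android build is configured through a cross-compiling toolchain file (in which case CMake sets CMAKE_CROSSCOMPILING), and it does not address the `__emutls_get_address` link error.

```cmake
# Sketch: add -march=native only for native (non-cross) builds.
# CMAKE_CROSSCOMPILING is set by CMake when a toolchain file targets another system.
if(CMAKE_CROSSCOMPILING)
  set(CMAKE_C_FLAGS "${CMAKE_C_FLAGS} -O3")
  set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -O3")
else()
  set(CMAKE_C_FLAGS "${CMAKE_C_FLAGS} -O3 -march=native")
  set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -O3 -march=native")
endif()
```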
