
Comments (4)

solrex avatar solrex commented on May 28, 2024

The example code takes 5 seconds to recognize a 28x28 picture of a digit on my iPhone 6s.

That is unexpected. There must be something wrong with your phone or build environment. On my iPhone 5S, the first run of the demo app CaffeSimple outputs:
Loading caffe model...129.169006ms
Caffe infering...194.81996ms
And the second run outputs:
Caffe infering...74.63800ms

I really want to see how I can help use the GPU to speed it up.
There is an old repo, DeepLearningKit, that integrates the Metal framework, but it uses Swift, and I am afraid rewriting it in Objective-C/C++ may take a pretty long time.

There is some work on utilizing the GPU through the Metal framework, RenderScript, or OpenCL, but none of it can be integrated directly with Caffe. DeepLearningKit can import Caffe models and weights, but it is a separate deep learning framework, not Caffe.

If you want Metal support in Caffe, you must rewrite the forward_gpu() function in Objective-C for EVERY layer used, and add a corresponding *_layer.mm file for each. Similar work has been done for CUDA in the *_layer.cu source files.

Let me know if you are working on it or how I can help.

The GPUs in mobile devices are not as powerful as the Nvidia GPUs in desktop/server computers, so the speedup from GPU acceleration is much smaller than expected. More importantly, there is no CUDA-like solution for the mobile GPU families (PowerVR, Mali, Adreno). Currently I'm watching the trends but not working on it.

from caffe-mobile.

weixingsun avatar weixingsun commented on May 28, 2024

Thanks for your reply. I'm surprised to hear you get 200ms from a cold start.
I built my lib using the same script you described, except that I changed "make -j 4" to "make -j 2" (but that shouldn't be a problem).
Could you please check whether my models look the same as yours, or share your prototxt / models?

model.zip
log.txt

I'll check this first, and then move on to the GPU. Thanks.


solrex avatar solrex commented on May 28, 2024

A quick glance at your net.prototxt shows that you chose an improper batch size for the testing phase:

2,7c2,6
< input: "data"
< input_shape {
<   dim: 10000
<   dim: 1
<   dim: 28
<   dim: 28
---
> layer {
>   name: "data"
>   type: "Input"
>   top: "data"
>   input_param { shape: { dim: 64 dim: 1 dim: 28 dim: 28 } }

The first dim should be 1 instead of 10000. The 64 in my/net.prototxt is also improper. Testing shows that the second-run time drops to 6.047ms when the first dim is set to 1. Thank you for making me aware of it. I'll add a batch size checking step to README.md.
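
A batch size check like the one mentioned above could be sketched roughly as follows. This is only an illustration, not the check that ends up in README.md: the function name input_batch_size and the regex-based parsing are assumptions, and it simply reads the first dim of the first shape block it finds, which covers both input forms shown in the diff but is not a full prototxt parser.

```python
import re


def input_batch_size(prototxt_text):
    """Return the first `dim` of the first input shape block, or None.

    Hypothetical helper: handles the legacy `input_shape { ... }` form and
    the `input_param { shape: { ... } }` form of an Input layer. It assumes
    the input shape is the first shape block in the file.
    """
    block = re.search(r"(?:input_shape|shape)\s*:?\s*\{([^}]*)\}", prototxt_text)
    if block is None:
        return None
    dims = re.findall(r"dim\s*:\s*(\d+)", block.group(1))
    return int(dims[0]) if dims else None


# The two input declarations from the diff above.
LEGACY = """
input: "data"
input_shape {
  dim: 10000
  dim: 1
  dim: 28
  dim: 28
}
"""

LAYER = """
layer {
  name: "data"
  type: "Input"
  top: "data"
  input_param { shape: { dim: 64 dim: 1 dim: 28 dim: 28 } }
}
"""

for name, text in (("legacy", LEGACY), ("layer", LAYER)):
    batch = input_batch_size(text)
    if batch != 1:
        print(f"{name}: batch size is {batch}; set the first dim to 1 for single-image inference")
```

Both sample inputs trigger the warning here, since neither 10000 nor 64 is a suitable batch size for classifying one image at a time.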


weixingsun avatar weixingsun commented on May 28, 2024

@solrex Yes, you are right, I copied it from my server. After changing it to 1, the job is done in 14ms, and in 5ms on the second run (that's what we should get from Caffe). Thanks for pointing it out.
As for the GPU, I saw that a lot of the code is CUDA-related and not easy to migrate. Let me think about how to make a Metal patch for iOS, and other patches for Android.

