
Comments (9)

samjabrahams commented on September 13, 2024

If the goal is to use the Raspberry Pi to classify the flowers, I would suggest training the model on your desktop computer, saving/exporting it, and then loading that trained model onto the RPi. Check out the official how-to here.

Maybe I'll make some sort of baby TensorFlow Serving server for running pre-trained models on RPi at some point.

from tensorflow-on-raspberry-pi.

ArnoXf commented on September 13, 2024

Oh no, actually I already trained the model on my desktop machine using a GTX 750 Ti. I saved the model there using TFLearn's model.save("my_model"), then transferred the saved weights file to the Pi, built the net architecture there (as described in my first post), and loaded the weights using model.load("my_model"). I don't want to train on the Pi, just load the model and predict single images (which already works on my desktop machine).

from tensorflow-on-raspberry-pi.

samjabrahams commented on September 13, 2024

Great- glad to hear you're already doing that! Next question: when you use model.save() and model.load(), are you including the last line of your code?

network = regression(fc3, optimizer='momentum', loss='categorical_crossentropy',
                     learning_rate=0.01)

Even if you pre-trained your weights, that line is going to cause your model to continue training when you run it on your Pi and not just feed values forward.
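If you only need inference, one thing to try might be rebuilding the net on the Pi without the regression layer. An untested sketch (toy architecture, not yours; I'm assuming TFLearn's DNN wrapper accepts a plain layer and that predict() works without a regression op attached):

```python
import tflearn

# Build the same architecture as at training time, but stop BEFORE
# regression(): that call attaches the optimizer and loss, which are
# only needed for training.
net = tflearn.input_data(shape=[None, 32, 32, 3])
net = tflearn.fully_connected(net, 17, activation='softmax')

model = tflearn.DNN(net)            # no optimizer/loss in the graph
model.load("my_model")              # weights saved on the desktop machine
prediction = model.predict([img])   # forward pass only; `img` is your preprocessed image
```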

Apologies if you've already tried the things I'm suggesting: I don't know what you've previously attempted, so I'm trying to get a better understanding of where we stand.


samjabrahams commented on September 13, 2024

No problem! I believe that the Inception model resizes images automatically, which is why tiny images have the same compute time as huge ones. Getting the model to run faster is something that a fair number of people are currently working on. Here's a short list of things that may be causing the slowdown on the RPi compared to other computers running Inception on CPU:

  • Raspbian is a 32-bit operating system, which means 64-bit computation suffers a slowdown (I'm pretty sure the Inception V3 model has 64-bit numbers in there)
  • Slow read speed, both from the SD card and from RAM
  • Less powerful CPU
  • Compiler not optimizing everything perfectly
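On the first bullet, a quick sanity check in plain Python shows the width difference (this only illustrates the size gap, not the actual slowdown):

```python
import struct

# A 64-bit double is twice the width of a 32-bit float; on a 32-bit
# ARM core each double occupies two machine words, so arithmetic and
# memory traffic on 64-bit values roughly double.
print(struct.calcsize('d'))  # size of a C double: 8 bytes
print(struct.calcsize('f'))  # size of a C float: 4 bytes
```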

I'm probably forgetting several important factors, but that's something to start from. Here are some ways that one might try to alleviate these issues:

  • Build and train your own smaller model from scratch! It's going to be more difficult, but it may not be possible to achieve high-speed evaluation with the Inception model on the RPi, no matter how efficient we are able to get it. Especially if you are reading from a camera at the same time on the same RPi!
  • Try to find the "magic" compiler options. Until today, we were using the -mfpu=neon flag with the GCC compiler, which added a not-insignificant percentage improvement in speed. Unfortunately, the new 16-bit floating point types in TensorFlow don't play nice with NEON. But there may be other, even better options that could improve the speed of TensorFlow on its own!
  • Use 8-bit quantization to reduce overhead from the model. This is relatively new, and some of the kinks are still being worked out, but TensorFlow has quantization tools available in the repository. The main issue is that you have to jump through a few hoops to get it to work properly- I'm hoping to finish testing it on the Inception model tomorrow. There's no guarantee that it will reduce the processing time (it depends on which exact methods are used, as only some Operations have quantized versions ready), but we may as well try! Follow this comment chain to cover some "gotchas" if you want to tackle it on your own. I'm going to set up a Pi to compile the quantized ops overnight and I'll try to get back to you within a few days.


samjabrahams commented on September 13, 2024

Thanks for posting this! I'll try to take a look at these memory issues sometime over the weekend. I believe that the issue is that you're actually training the AlexNet model on the Raspberry Pi, whereas the Inception model is pretrained.

When you train a model, the machine has to store values of each node along the way in order to compute the gradient, which means training takes a much larger amount of memory.
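A toy illustration of that in plain Python (no TensorFlow, the "layers" just double their input): inference keeps only the current value, while training has to cache one activation per layer so the backward pass can use them.

```python
def forward_inference(x, n_layers):
    # Inference: only the current activation is kept;
    # memory stays O(1) in the number of layers.
    for _ in range(n_layers):
        x = 2 * x
    return x

def forward_for_training(x, n_layers):
    # Training: every intermediate activation is cached so the
    # gradient can be computed on the backward pass; memory is O(n).
    cache = [x]
    for _ in range(n_layers):
        x = 2 * x
        cache.append(x)
    return x, cache

y = forward_inference(1, 10)
y_train, cache = forward_for_training(1, 10)
assert y == y_train == 1024
assert len(cache) == 11  # one stored activation per layer, plus the input
```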


mrubashkin-svds commented on September 13, 2024

Hey Sam, thanks for answering @ArnoXf 's questions. Have you been able to successfully train any partial or whole model on the Pi 3? If not, do you know of any other places where things like "learning_rate" need to be turned off to avoid model-building errors?


samjabrahams commented on September 13, 2024

Hi @mrubashkin-svds - are you referring to AlexNet, or any model in general? I've done toy training on the RPi to make sure that the TensorFlow binaries work properly, but I haven't done any significant training on large models with the Raspberry Pi.

I don't have much experience with TFLearn, so I'm not sure how it runs Sessions, but the main thing is to not pass in any sort of Optimizer Operations in Session.run(), otherwise it'll have to store a huge amount of data in memory.
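In raw TensorFlow terms, the distinction looks roughly like this (a sketch; `output`, `train_op`, `loss`, `x`, and `y` are placeholder names for the prediction tensor, the optimizer's op, the loss, and the input/label placeholders):

```python
# Inference: fetch only the prediction tensor -- forward pass, no
# gradients, no cached activations beyond the current layer.
result = sess.run(output, feed_dict={x: image})

# Training: fetching the optimizer op forces TensorFlow to keep every
# intermediate activation around for the backward pass.
_, loss_val = sess.run([train_op, loss], feed_dict={x: image, y: label})
```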


mrubashkin-svds commented on September 13, 2024

Hey @samjabrahams thank you for the input! I've been working with Inception V3 specifically (no luck with AlexNet) over the past few days, and while I was unable to build any model on the Pi, I was able to persist an 85 MB model in memory and evaluate single images against it in near-real-time (~7 s processing time).

One more question if you have the time, do you have any suggestions for speeding up the processing time? The time seems to be independent of the picture size (i.e. same amount of time for a 24x24 or 240x240 image). Thanks again @samjabrahams !!


danbri commented on September 13, 2024

@samjabrahams on that last point, did you have any success with quantization?

