Comments (11)
Plus, GPU/NNAPI delegates were not created properly
INFO: Initialized TensorFlow Lite runtime.
INFO: Created TensorFlow Lite delegate for GPU.
GPU delegate created.
native : lite/tools/evaluation/stages/tflite_inference_stage.cc:156 Failed to apply delegate 0
Aborted
I used the main branch of this repo and ran the converted model on a Galaxy S22
from keras_cv_attention_models.
I don't have a testing environment on hand and haven't run tflite myself for a long time. How about skipping all model_surgery functions? It seems raw gelu is supported in tflite, and the other 2 surgeries have no effect on EfficientFormerL1.
import tensorflow as tf
from keras_cv_attention_models import efficientformer

mm = efficientformer.EfficientFormerL1()
converter = tf.lite.TFLiteConverter.from_keras_model(mm)
open(mm.name + ".tflite", "wb").write(converter.convert())
Or can a basic resnet50 work?
import tensorflow as tf
from tensorflow import keras
from keras_cv_attention_models.imagenet import eval_func

mm = keras.applications.ResNet50()
converter = tf.lite.TFLiteConverter.from_keras_model(mm)
open(mm.name + ".tflite", "wb").write(converter.convert())
print(eval_func.TFLiteModelInterf(mm.name + '.tflite')(tf.ones([1, 224, 224, 3])).shape)
# (1, 1000)
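For reference, a converted .tflite file can also be checked without the repo's eval_func helper, using the plain tf.lite.Interpreter. This is a hedged sketch, not from the thread; a tiny pooling model stands in for ResNet50 so it stays self-contained, but the same steps apply to any converted model.

```python
import numpy as np
import tensorflow as tf

# Tiny stand-in model; the same steps apply to ResNet50 or the
# converted attention models discussed in this thread.
inputs = tf.keras.layers.Input([224, 224, 3])
outputs = tf.keras.layers.GlobalAveragePooling2D()(inputs)
mm = tf.keras.models.Model(inputs, outputs, name="tiny_test")

# Convert to tflite, then run it back through the standard interpreter.
converter = tf.lite.TFLiteConverter.from_keras_model(mm)
open(mm.name + ".tflite", "wb").write(converter.convert())

interpreter = tf.lite.Interpreter(model_path=mm.name + ".tflite")
interpreter.allocate_tensors()
inp = interpreter.get_input_details()[0]
out = interpreter.get_output_details()[0]
interpreter.set_tensor(inp["index"], np.ones([1, 224, 224, 3], dtype=np.float32))
interpreter.invoke()
result = interpreter.get_tensor(out["index"])
print(result.shape)  # (1, 3) for this tiny stand-in model
```

If conversion succeeded, the output shape should match the Keras model's output.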
Uh, there is a related and still unsolved one: tflite conversion - GPU/XNNPACK fails #89.
Thank you for your reply
I tried what you said.
- I removed the "surgeries", but it still didn't work, same as before. Maybe surgery is not the problem.
- As you suggested, I tried resnet50, and it is working. "Working" means there is no abort, and it goes on with the image classification work (like regnetz_d32.tflite in your guideline).
So, in my opinion, "transformer" models are not converted properly into tflite models.
Could you investigate this?
- It's actually because even a single Dense layer with non-2D inputs is not supported using xnnpack:
inputs = keras.layers.Input([32, 32, 3])
nn = keras.layers.Dense(12, use_bias=True)(inputs)
mm = keras.models.Model(inputs, nn, name='test_dense')

! adb push test_dense.tflite /data/local/tmp/
! adb shell /data/local/tmp/android_aarch64_benchmark_model --graph=/data/local/tmp/test_dense.tflite --use_xnnpack=true
# WARNING: Attempting to use a delegate that only supports static-sized tensors with a graph that has dynamic-sized tensors (tensor#14 is a dynamic-sized tensor).
# ERROR: Failed to apply XNNPACK delegate.
# ERROR: Benchmarking failed.
- Just added a model_surgery function convert_dense_to_conv that converts all Dense layers with 3D / 4D inputs to Conv1D / Conv2D. MobileViT_S and EfficientFormerL1 should work now.

from keras_cv_attention_models import beit, model_surgery, efficientformer, mobilevit

# mm = mobilevit.MobileViT_S()
mm = efficientformer.EfficientFormerL1()
mm = model_surgery.convert_dense_to_conv(mm)  # Convert all Dense layers
converter = tf.lite.TFLiteConverter.from_keras_model(mm)
open(mm.name + ".tflite", "wb").write(converter.convert())
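A hedged sketch of the idea behind such a Dense-to-Conv conversion (standalone illustration, not the repo's actual implementation): a Dense layer applied to a 3D input only mixes the last axis, which is exactly what a pointwise Conv1D does, so copying the Dense kernel and bias into a kernel-size-1 Conv1D reproduces the output.

```python
import numpy as np
import tensorflow as tf

# Dense on a [batch, tokens, channels] input acts per-token on the
# channel axis, i.e. it is a pointwise (kernel_size=1) Conv1D.
dense = tf.keras.layers.Dense(12, use_bias=True)
conv = tf.keras.layers.Conv1D(12, kernel_size=1, use_bias=True)

x = tf.random.normal([2, 49, 32])  # [batch, tokens, channels]
dense_out = dense(x)               # builds the Dense layer

# Dense kernel is [32, 12]; Conv1D expects [kernel_size, in, out],
# so add a leading axis, and reuse the bias as-is.
conv.build(x.shape)
conv.set_weights([dense.get_weights()[0][None], dense.get_weights()[1]])
conv_out = conv(x)

print(np.allclose(dense_out, conv_out, atol=1e-5))  # True
```

Since the two layers are numerically equivalent, the swap changes only which tflite op is emitted, which is why the delegate stops complaining.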
- Tested results:

  Model             | Using Dense, use_xnnpack=false | Using Conv, use_xnnpack=false | Using Conv, use_xnnpack=true
  MobileViT_S       | Inference (avg) 215371 us      | Inference (avg) 163836 us     | Inference (avg) 163817 us
  EfficientFormerL1 | Inference (avg) 126829 us      | Inference (avg) 107053 us     | Inference (avg) 107132 us

- I think xnnpack is now enabled as much as possible even with --use_xnnpack=false, as we discussed in How to speed up inference on a quantized model #44.
- Readme is updated: keras_cv_attention_models#tflite-conversion.
Thanks for your support! EfficientFormer and MobileViT are working now with XNNPACK delegate!
But unfortunately, when I use "delegate=gpu", the messages below come out.
INFO: Initialized TensorFlow Lite runtime.
INFO: Created TensorFlow Lite delegate for GPU.
GPU delegate created.
ERROR: Following operations are not supported by GPU delegate:
EXPAND_DIMS: Operation is not supported.
GELU: Operation is not supported.
16 operations will run on the GPU, and the remaining 166 operations will run on the CPU.
VERBOSE: Replacing 16 node(s) with delegate (TfLiteGpuDelegateV2) node, yielding 2 partitions.
INFO: Initialized OpenCL-based API.
INFO: Created 1 GPU delegate kernels.
And the latency is not better than XNNPACK delegate.
GELU seems to be unsupported for GPU delegate.
If I replaced GELU with RELU, more operations were delegated to the GPU (and it was faster), but accuracy was totally broken.
Of course, that seems obvious :)
So, in my opinion, GELU is not supported by the GPU delegate, and the surgery for approximate GELU didn't work.
Here are my questions:
- Could GELU be supported for tflite with GPU delegate?
- If not, how can EfficientFormer with ReLU be trained with this framework?
I always appreciate your support. Thank you.
Never have I tried GPU delegate. Ya, it seems only some limited ops are supported by GPU delegates for TensorFlow Lite, and gelu and the tanh used by gelu/app are not in the list.
- Just added a gelu/quick using sigmoid, from the ViT CLIP text model, and exp for sigmoid is in the supported ops list. May try if model = model_surgery.convert_gelu_to_approximate(model, target_activation="gelu/quick") works.
- If using train_script.py for training, just give --additional_model_kwargs '{"activation": "relu"}'. May also give --pretrained default for loading pretrained weights. Or build the model with model = efficientformer.EfficientFormerL1(..., activation="relu") in your own training script; I think you already know this.
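For context, the sigmoid-based "quick" GELU approximation is commonly written as gelu(x) ≈ x * sigmoid(1.702 * x), which only needs exp (inside sigmoid). A hedged sketch comparing it against exact gelu (standalone code, not the repo's convert_gelu_to_approximate implementation):

```python
import numpy as np
import tensorflow as tf

# "Quick" GELU approximation, as popularized by CLIP's text model:
# only sigmoid (i.e. exp) is required, no erf or tanh.
def gelu_quick(x):
    return x * tf.sigmoid(1.702 * x)

x = tf.linspace(-4.0, 4.0, 9)
exact = tf.nn.gelu(x)   # exact erf-based gelu
approx = gelu_quick(x)
err = np.max(np.abs(exact - approx))
print(err)  # maximum absolute deviation, small (on the order of 0.02)
```

The deviation is small enough that pretrained weights usually still work, unlike swapping GELU for ReLU, which changes the function drastically and breaks accuracy without retraining.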
Thank you for your support! It was really helpful!
So is it the gelu/quick or your retraining that works?
Unfortunately, I only tried gelu/quick, and it didn't work. I am now retraining instead.
I just appreciated your support :)
Lol, alright. :)
Closing, may re-open or raise a new one if still an issue.