
Comments (10)

hollance commented on July 19, 2024

The Keras model only needs to have that loss for training, not for making predictions.


hollance commented on July 19, 2024

You should convert base_model instead of model. That’s all.
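
For reference, a minimal sketch of that change, using the names from the script posted later in the thread (the .mlmodel filename is just an example):

import coremltools

# base_model has a single image input and the softmax output, so one
# input name and one output name are enough. The training model's extra
# labels/input_length/label_length inputs are what trigger the
# "Input name length mismatch" IndexError, and since base_model contains
# no Lambda layer, the custom-layer arguments are not needed either.
coreml_model = coremltools.converters.keras.convert(
    base_model,
    input_names="image",
    image_input_names="image",
    output_names="output",
)
coreml_model.save("CRNN.mlmodel")  # example filename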


hollance commented on July 19, 2024

Your Core ML model should not include the loss function as that is used only for training, which is not possible with Core ML.


un-lock-me commented on July 19, 2024

Thank you so much for replying back :)
You mean that we cannot convert any model which has a custom loss to Core ML?
If so, do you have any idea how I can convert the OCR code to some other model understandable by iOS?
Most of the OCR Keras code I explored has a Lambda layer for its loss.

Thanks again for taking the time


un-lock-me commented on July 19, 2024

this is my code:


import coremltools
from coremltools.proto import NeuralNetwork_pb2  # needed for CustomLayerParams below

def convert_lambda(layer):
    # Only convert this Lambda layer if it is our CTC loss function.
    if layer.function == ctc_lambda_func:
        params = NeuralNetwork_pb2.CustomLayerParams()

        # The name of the Swift or Obj-C class that implements this layer.
        params.className = "x"

        # The description is shown in Xcode's mlmodel viewer.
        params.description = "A fancy new loss"

        return params
    else:
        return None


print("\nConverting the model:")

# Convert the model to Core ML.
coreml_model = coremltools.converters.keras.convert(
    model,
    # 'weightswithoutstnlrchangedbackend.best.hdf5',
    input_names="image",
    image_input_names="image",
    output_names="output",
    add_custom_layers=True,
    custom_conversion_functions={"Lambda": convert_lambda},
    )

and this is the error:

Converting the model:
Traceback (most recent call last):
  File "/home/sgnbx/Downloads/projects/CRNN-with-STN-master/CRNN_with_STN.py", line 201, in <module>
    custom_conversion_functions={"Lambda": convert_lambda},
  File "/home/sgnbx/anaconda3/envs/tf_gpu/lib/python3.6/site-packages/coremltools/converters/keras/_keras_converter.py", line 760, in convert
    custom_conversion_functions=custom_conversion_functions)
  File "/home/sgnbx/anaconda3/envs/tf_gpu/lib/python3.6/site-packages/coremltools/converters/keras/_keras_converter.py", line 556, in convertToSpec
    custom_objects=custom_objects)
  File "/home/sgnbx/anaconda3/envs/tf_gpu/lib/python3.6/site-packages/coremltools/converters/keras/_keras2_converter.py", line 255, in _convert
    if input_names[idx] in input_name_shape_dict:
IndexError: list index out of range
Input name length mismatch


hollance commented on July 19, 2024

The easiest thing to do is to remove this lambda layer from the Keras model first.


un-lock-me commented on July 19, 2024

Isn't it obligatory to have the Lambda layer here, as the model needs to use a custom version of the loss rather than the loss already built into Keras?
I have looked at a couple of codebases with the CRNN architecture, and all had the same custom loss.


un-lock-me commented on July 19, 2024

Can you please link me to a blog, a source, whatever?
I honestly can't understand what you're saying, as I just use the custom loss in training, then save the model, then use the saved model to convert to Core ML.
Which part am I missing? Sorry for the many questions and for taking your time.


hollance commented on July 19, 2024

After training, remove the lambda layer from the Keras model and save this model. Then use coremltools to convert that Keras model (without the lambda layer) to Core ML.
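
A minimal sketch of that step, assuming the script below (the softmax layer name 'dense_2' is a guess; check model.summary() for the actual name):

from keras.models import Model

# The image tensor is the training model's first input, and the softmax
# Dense layer is what feeds the CTC Lambda. Cutting the graph there
# yields a model without the loss layer. In the script below this
# lambda-free model already exists as base_model, so saving base_model
# after training accomplishes the same thing.
pred_model = Model(inputs=model.inputs[0],
                   outputs=model.get_layer('dense_2').output)
pred_model.save('prediction_model.h5')  # then convert this file to Core ML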


un-lock-me commented on July 19, 2024

So I cannot get your approach.
Do you mind having a quick look at the code and saying which part I have to change? (My confusion is: I can save the trained model, but how can I save a model without training it? What will be saved in that model, in terms of weights or other parameters?)
I am really sorry for taking your time; much appreciated.
(If you know a link about your approach, I will be happy to explore it and do it myself, but unfortunately I cannot follow what you're saying.)

import numpy as np

from coremltools.proto import NeuralNetwork_pb2
from keras import backend ,optimizers
from keras.callbacks import *
from keras.layers import *
from keras.models import *
from keras.optimizers import SGD
from keras.utils import *

from keras.callbacks import ModelCheckpoint
from keras.callbacks import TensorBoard

# from STN.spatial_transformer import SpatialTransformer

from Batch_Generator import img_gen, img_gen_val

from config import learning_rate, load_model_path, width, height, characters, label_len, label_classes, \
    cp_save_path, base_model_path, tb_log_dir

    
# functions
class Evaluate(Callback):

    def on_epoch_end(self, epoch, logs=None):
        acc = evaluate(base_model)
        print('')
        print('acc:'+str(acc)+"%")

evaluator = Evaluate()


def evaluate(input_model):
    correct_prediction = 0
    generator = img_gen_val()

    x_test, y_test = next(generator)
    # print(" ")
    y_pred = input_model.predict(x_test) 
    shape = y_pred[:, 2:, :].shape 
    ctc_decode = backend.ctc_decode(y_pred[:, 2:, :], input_length=np.ones(shape[0])*shape[1])[0][0]
    out = backend.get_value(ctc_decode)[:, :label_len]

    for m in range(1000):
        result_str = ''.join([characters[k] for k in out[m]])
        result_str = result_str.replace('-', '')
        if result_str == y_test[m]:
            correct_prediction += 1
            # print(m)
        else:
            print(result_str, y_test[m])

    return correct_prediction * 1.0 / 10  # percent correct over the 1000 validation samples

def ctc_lambda_func(args):
    iy_pred, ilabels, iinput_length, ilabel_length = args
    # the 2 is critical here since the first couple outputs of the RNN
    # tend to be garbage:
    iy_pred = iy_pred[:, 2:, :]  # no such influence
    return backend.ctc_batch_cost(ilabels, iy_pred, iinput_length, ilabel_length)


# localization network for the STN; the final Dense layer's bias is initialized to the identity transform
def loc_net(input_shape):
    b = np.zeros((2, 3), dtype='float32')
    b[0, 0] = 1
    b[1, 1] = 1
    w = np.zeros((64, 6), dtype='float32')
    weights = [w, b.flatten()]

    loc_input = Input(input_shape)

    loc_conv_1 = Conv2D(16, (5, 5), padding='same', activation='relu')(loc_input)
    loc_conv_2 = Conv2D(32, (5, 5), padding='same', activation='relu')(loc_conv_1)
    loc_fla = Flatten()(loc_conv_2)
    loc_fc_1 = Dense(64, activation='relu')(loc_fla)
    loc_fc_2 = Dense(6, weights=weights)(loc_fc_1)

    output = Model(inputs=loc_input, outputs=loc_fc_2)

    return output


# build model
inputShape = Input((width, height, 3))  # channels-last, based on the TensorFlow backend
conv_1 = Conv2D(64, (3, 3), activation='relu', padding='same')(inputShape)
batchnorm_1 = BatchNormalization()(conv_1)

conv_2 = Conv2D(128, (3, 3), activation='relu', padding='same')(conv_1)  # note: batchnorm_1 above is defined but never used
conv_3 = Conv2D(256, (3, 3), activation='relu', padding='same')(conv_2)
batchnorm_3 = BatchNormalization()(conv_3)
pool_3 = MaxPooling2D(pool_size=(2, 2))(batchnorm_3)

conv_4 = Conv2D(256, (3, 3), activation='relu', padding='same')(pool_3)
conv_5 = Conv2D(512, (3, 3), activation='relu', padding='same')(conv_4)
batchnorm_5 = BatchNormalization()(conv_5)
pool_5 = MaxPooling2D(pool_size=(2, 2))(batchnorm_5)

conv_6 = Conv2D(512, (3, 3), activation='relu', padding='same')(pool_5)
conv_7 = Conv2D(512, (3, 3), activation='relu', padding='same')(conv_6)
batchnorm_7 = BatchNormalization()(conv_7)

bn_shape = batchnorm_7.get_shape()  # (?, 50, 12, 256)

'''----------------------STN-------------------------'''
# You can run the model without this STN part by leaving the STN lines commented out and
# connecting batchnorm_7 to x_reshape, which may give you higher accuracy.
# stn_input_shape = batchnorm_7.get_shape()
# loc_input_shape = (stn_input_shape[1].value, stn_input_shape[2].value, stn_input_shape[3].value)
# stn = SpatialTransformer(localization_net=loc_net(loc_input_shape),
#                          output_size=(loc_input_shape[0], loc_input_shape[1]))(batchnorm_7)
'''----------------------STN-------------------------'''

print(bn_shape)  # (?, 50, 7, 512)

# reshape to (batch_size, width, height*dim)
# x_reshape = Reshape(target_shape=(int(bn_shape[1]), int(bn_shape[2] * bn_shape[3])))(stn_7)
x_reshape = Reshape(target_shape=(int(bn_shape[1]), int(bn_shape[2] * bn_shape[3])))(batchnorm_7)

fc_1 = Dense(128, activation='relu')(x_reshape)  # (?, 50, 128)

print(x_reshape.get_shape())  # (?, 50, 3584)
print(fc_1.get_shape())  # (?, 50, 128)

rnn_1 = LSTM(128, kernel_initializer="he_normal", return_sequences=True)(fc_1)
rnn_1b = LSTM(128, kernel_initializer="he_normal", go_backwards=True, return_sequences=True)(fc_1)
rnn1_merged = add([rnn_1, rnn_1b])

rnn_2 = LSTM(128, kernel_initializer="he_normal", return_sequences=True)(rnn1_merged)
rnn_2b = LSTM(128, kernel_initializer="he_normal", go_backwards=True, return_sequences=True)(rnn1_merged)
rnn2_merged = concatenate([rnn_2, rnn_2b])

drop_1 = Dropout(0.25)(rnn2_merged)

fc_2 = Dense(label_classes, kernel_initializer='he_normal', activation='softmax')(drop_1)

# model setting
base_model = Model(inputs=inputShape, outputs=fc_2)  # the model for predicting

labels = Input(name='the_labels', shape=[label_len], dtype='float32')
input_length = Input(name='input_length', shape=[1], dtype='int64')
label_length = Input(name='label_length', shape=[1], dtype='int64')

loss_out = Lambda(ctc_lambda_func, output_shape=(1,), name='ctc')([fc_2, labels, input_length, label_length])

model = Model(inputs=[inputShape, labels, input_length, label_length], outputs=[loss_out])  # the model for training

# clipnorm seems to speed up convergence
sgd = SGD(lr=learning_rate, decay=1e-6, momentum=0.9, nesterov=True, clipnorm=5)
adam = optimizers.Adam()

model.compile(loss={'ctc': lambda y_true, y_pred: y_pred}, optimizer=sgd)  # the Lambda layer already computes the CTC loss, so this dummy loss just passes it through

model.summary()  # print a summary representation of your model.
# plot_model(model, to_file='CRNN_with_STN.png', show_shapes=True)  # save an image of the model architecture
cp_save_path = "weightswithoutstnlrchanged.best.hdf5"
checkpoint = ModelCheckpoint(cp_save_path, monitor='loss', verbose=1, save_best_only=True, mode='min')

# If you want to load trained model weights, just fill in load_model_path in config.py; they will
# automatically be used in the new training run. If you want to train a new model, just set load_model_path = ''.


# if len(load_model_path) > 5:
#     savedmodel = load_model(load_model_path, custom_objects={"bknd": backend,'loss':loss_out})


# try your own fit_generator() settings, you may get a better result
model.fit_generator(img_gen(input_shape=bn_shape), steps_per_epoch=100, epochs=1, verbose=1,
                    callbacks=[evaluator,
                               checkpoint,
                               TensorBoard(log_dir=tb_log_dir)])

base_model.save(base_model_path)

import coremltools

def convert_lambda(layer):
    # Only convert this Lambda layer if it is our CTC loss function.
    if layer.function == ctc_lambda_func:
        params = NeuralNetwork_pb2.CustomLayerParams()

        # The name of the Swift or Obj-C class that implements this layer.
        params.className = "x"

        # The description is shown in Xcode's mlmodel viewer.
        params.description = "A fancy new loss"

        return params
    else:
        return None


print("\nConverting the model:")

# Convert the model to Core ML.
coreml_model = coremltools.converters.keras.convert(
    model,
    # 'weightswithoutstnlrchangedbackend.best.hdf5',
    input_names="image",
    image_input_names="image",
    output_names="output",
    add_custom_layers=True,
    custom_conversion_functions={"Lambda": convert_lambda},
)

