tuner.get_best_models() fails with Sequential models, with the following exception

Sequential models may cause issues with get_best_models() (keras-team/keras-tuner)

Comments (14)

jamlong commented on May 12, 2024

So, basically, the rundown here is that in tuner.py:get_best_models(), we:

clone the hyperparameters from the trial
build a new model via hypermodel.build()
_compile_model
model.load_weights

At this point it crashes, because the model doesn't know what its input shape is yet; that information typically only arrives when you start applying the model to your training data.

We could add a manual model.build(shape) call here - but where to get the shape?
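The deferred-build behavior described above can be seen directly. This is a minimal sketch, assuming TensorFlow 2.x; the layer size and the input width of 10 are arbitrary illustration values, not anything taken from KerasTuner.

```python
import tensorflow as tf

# A Sequential model whose first layer declares no input shape.
model = tf.keras.Sequential([tf.keras.layers.Dense(4)])
built_before = model.built  # no variables have been created yet

# Building with an explicit shape creates the kernel and bias.
model.build((None, 10))  # (batch, features); 10 is an assumed width
built_after = model.built
kernel_shape = tuple(model.layers[0].kernel.shape)

print(built_before, built_after, kernel_shape)
```

The open question in the comment remains: a manual build only helps if the tuner has some way to know the shape to pass.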

Overall options seem to be:

1. Require users of Sequential models with KerasTuner to set input_dim on their first layer. While this seems a little restrictive, given that they will be passing the same data to all instances of the hypermodel, this may be suitable.

2. Read the model configuration in tuner.get_best_models:

            best_checkpoint = executions[0].best_checkpoint + '-weights.h5'
            best_checkpoint_config = executions[0].best_checkpoint + '-config.json'
            model_config = json.load(open(best_checkpoint_config, "rt"))
            build_input_shape = model_config['config']['build_input_shape']
            model.build(build_input_shape)
            model.load_weights(....)

   This seems a bit redundant, given that we're otherwise recreating the model without said config file. However, we may NEED to create the model from code.

3. Store the input shape in the trial. This seems like extraneous information 99% of the time, unless we're adding it in terms of "model training data metadata" (e.g. input shape, training size, test size, etc.)
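The config-reading option can be sketched without TensorFlow. It assumes, as the snippet in that option does, that the saved config JSON carries build_input_shape under config; the helper name read_build_input_shape and the sample file are hypothetical and only stand in for the real checkpoint config.

```python
import json
import os
import tempfile


def read_build_input_shape(config_path):
    """Return the build_input_shape stored in a model config JSON, or None."""
    with open(config_path, "rt") as f:
        model_config = json.load(f)
    shape = model_config.get("config", {}).get("build_input_shape")
    # JSON stores the shape as a list; convert it to a tuple for build().
    return tuple(shape) if shape is not None else None


# Hypothetical config file standing in for '<best_checkpoint>-config.json'.
sample = {"class_name": "Sequential", "config": {"build_input_shape": [None, 10]}}
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "sample-config.json")
    with open(path, "wt") as f:
        json.dump(sample, f)
    shape = read_build_input_shape(path)

print(shape)
```

The returned shape could then be passed to model.build(shape) before model.load_weights(...). A None result means the key was absent from the file, in which case this approach has nothing to offer.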

@fchollet - can you comment on the logic behind building a new model instance from the hypermodel, vs. loading the model config we already have? Was that to avoid issues with custom layers, etc. making it hard to reload? Feels like 1 or 2 are the way to go, depending on whether you feel that the "delayed build Sequential model in KerasTuner" is something that should be supported.

from keras-tuner.

omalleyt12 commented on May 12, 2024

Thanks for the issue! I think we'll have to switch to model.save(<tf_format_file>) and tf.keras.models.load_model to get around this. This will save and load a SavedModel, which works for all Keras model types.

> Read the model configuration in tuner.get_best_models

The problem is that subclassed Models aren't required to implement get_config.


rcmagic1 commented on May 12, 2024

Following as I have same issue and am using workaround #1 described by @jamlong above.
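For reference, workaround #1 might look roughly like the following. This is a sketch assuming TensorFlow 2.x; NUM_FEATURES and the plain units argument are illustration values that stand in for a real feature count and for hyperparameters sampled from hp.

```python
import tensorflow as tf

NUM_FEATURES = 10  # assumed feature count, for illustration only


def build_model(units=8):
    model = tf.keras.Sequential()
    # input_dim on the first layer fixes the input shape up front, so the
    # weights exist as soon as the model is constructed.
    model.add(tf.keras.layers.Dense(units, activation="relu",
                                    input_dim=NUM_FEATURES))
    model.add(tf.keras.layers.Dense(1))
    model.compile(optimizer="sgd", loss="mse")
    return model


model = build_model()
kernel_shapes = [tuple(layer.kernel.shape) for layer in model.layers]
print(kernel_shapes)
```

Because every kernel already has a concrete shape here, a later load_weights call has variables to restore into rather than deferring until the first call on data.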


omalleyt12 commented on May 12, 2024

@rcmagic1 What version of TF are you using? I'm trying to repro this, but it works for me in TF 2.0. There's no great way around this that works for all models, so if it's only an issue in TF 1.x we may have to keep the current solution.


omalleyt12 commented on May 12, 2024

Note that in TF2.0 a model can have its weights restored before they are even created. The checkpoint will use restore-on-create semantics for those weights:

import numpy as np
import tensorflow as tf

model = tf.keras.Sequential([tf.keras.layers.Dense(1)])
model.compile('sgd', 'mse')  # Weights don't exist yet
model.fit(np.ones((10, 10)), np.ones((10, 1)))  # Weights created here
model.save_weights('tmp')  # Write a checkpoint that load_weights can read

new_model = tf.keras.Sequential([tf.keras.layers.Dense(1)])
new_model.compile('sgd', 'mse')
checkpoint_status = new_model.load_weights('tmp')  # Weights aren't created yet.
new_model.predict(np.ones((10, 10)))
checkpoint_status.assert_consumed()  # Weights are now the same as `model`.


rcmagic1 commented on May 12, 2024

@omalleyt12 I’m using TF2 beta


omalleyt12 commented on May 12, 2024

@rcmagic1 Hm, strange. I'm not sure what could be going on then, as that should be working. Could you please share a minimal repro of this issue as well? Thanks!


rcmagic1 commented on May 12, 2024

Hi @omalleyt12
Here's a simple NN that fails for me when get_best_models is called.
I pasted the error output at the bottom.
If I comment out the "model.add( Input( shape=(self.num_features,) ) )" line and uncomment the next two lines, then there is no error.
I noticed this particular error may be related to the following issue:
keras-team/keras#10417

            # construct model
            model = Sequential()

            #Note:  Keras bug with InputLayer, so use separate Dense layer with input_dim for now: https://github.com/keras-team/keras/issues/10417
            model.add( Input( shape=(self.num_features,) ) )
#            model.add( Dense( input_dim=self.num_features, units=hp.Int('num_units',min_value=num_units_min,max_value=num_units_max+1,step=num_units_step), activation='relu', kernel_initializer='he_normal' ) )
#            model.add( Dropout( hp.Float('dropout',min_value=dropout_min,max_value=dropout_max,step=dropout_step) ) )
            for i in range( hp.Int('num_fc_layers',min_value=num_fc_layers_min-1,max_value=num_fc_layers_max) ):
                model.add( Dense( units=hp.Int('num_units_'+str(i+1),min_value=num_units_min,max_value=num_units_max+1,step=num_units_step), activation='relu', kernel_initializer='he_normal' ) )
                model.add( Dropout( hp.Float('dropout_'+str(i+1),min_value=dropout_min,max_value=dropout_max,step=dropout_step) ) )
            model.add( Dense(units=self.num_classes, activation='softmax', kernel_initializer='he_normal' ) )
    
            # compile model
            model.compile( optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'] )

Error output:

  File "<blank>", line 1658, in trainNNModel
    model = tuner.get_best_models(num_models=1)[0]
  File "<blank>.local/lib/python3.6/site-packages/kerastuner/engine/tuner.py", line 179, in get_best_models
    return super(Tuner, self).get_best_models(num_models)
  File "<blank>.local/lib/python3.6/site-packages/kerastuner/engine/base_tuner.py", line 188, in get_best_models
    models = [self.load_model(trial) for trial in best_trials]
  File "<blank>.local/lib/python3.6/site-packages/kerastuner/engine/base_tuner.py", line 188, in <listcomp>
    models = [self.load_model(trial) for trial in best_trials]
  File "<blank>.local/lib/python3.6/site-packages/kerastuner/engine/tuner.py", line 142, in load_model
    trial.trial_id, best_epoch))
  File "<blank>.conda/envs/virtenv368/lib/python3.6/site-packages/tensorflow/python/keras/engine/training.py", line 162, in load_weights
    return super(Model, self).load_weights(filepath, by_name)
  File "<blank>.conda/envs/virtenv368/lib/python3.6/site-packages/tensorflow/python/keras/engine/network.py", line 1387, in load_weights
    status = self._trackable_saver.restore(filepath)
  File "<blank>.conda/envs/virtenv368/lib/python3.6/site-packages/tensorflow/python/training/tracking/util.py", line 1221, in restore
    checkpoint=checkpoint, proto_id=0).restore(self._graph_view.root)
  File "<blank>.conda/envs/virtenv368/lib/python3.6/site-packages/tensorflow/python/training/tracking/base.py", line 210, in restore
    restore_ops = trackable._restore_from_checkpoint_position(self)  # pylint: disable=protected-access
  File "<blank>.conda/envs/virtenv368/lib/python3.6/site-packages/tensorflow/python/training/tracking/base.py", line 867, in _restore_from_checkpoint_position
    visit_queue=visit_queue)))
  File "<blank>.conda/envs/virtenv368/lib/python3.6/site-packages/tensorflow/python/training/tracking/base.py", line 879, in _single_restoration_from_checkpoint_position
    restore_ops = checkpoint_position.restore_ops()
  File "<blank>.conda/envs/virtenv368/lib/python3.6/site-packages/tensorflow/python/training/tracking/base.py", line 396, in restore_ops
    self._checkpoint.restore_saveables(tensor_saveables, python_saveables))
  File "<blank>.conda/envs/virtenv368/lib/python3.6/site-packages/tensorflow/python/training/tracking/util.py", line 223, in restore_saveables
    validated_saveables).restore(self.save_path_tensor)
  File "<blank>.conda/envs/virtenv368/lib/python3.6/site-packages/tensorflow/python/training/saving/functional_saver.py", line 255, in restore
    restore_ops.update(saver.restore(file_prefix))
  File "<blank>.conda/envs/virtenv368/lib/python3.6/site-packages/tensorflow/python/training/saving/functional_saver.py", line 102, in restore
    restored_tensors, restored_shapes=None)
  File "<blank>.conda/envs/virtenv368/lib/python3.6/site-packages/tensorflow/python/training/saving/saveable_object_util.py", line 114, in restore
    self.handle_op, self._var_shape, restored_tensor)
  File "<blank>.conda/envs/virtenv368/lib/python3.6/site-packages/tensorflow/python/ops/resource_variable_ops.py", line 286, in shape_safe_assign_variable_handle
    shape.assert_is_compatible_with(value_tensor.shape)
  File "<blank>.conda/envs/virtenv368/lib/python3.6/site-packages/tensorflow/python/framework/tensor_shape.py", line 1110, in assert_is_compatible_with
    raise ValueError("Shapes %s and %s are incompatible" % (self, other))
ValueError: Shapes (100000, 9) and (100000, 70) are incompatible


omalleyt12 commented on May 12, 2024

@rcmagic1 Thanks for the repro! Are you using keras-team/keras rather than tf.keras?

If you're using import keras rather than from tensorflow import keras, the Keras team encourages you to switch to from tensorflow import keras, as this is better supported and has a superset of keras-team/keras features. This repo is only targeting full support for tf.keras with TensorFlow 2.0.

This code was able to run OK for me using tf.keras:

import numpy as np
import tensorflow as tf
import kerastuner as kt

# Match the dummy data created below: 10 features, 3 classes.
num_features, num_classes = 10, 3

def build_model(hp):
    model = tf.keras.Sequential()
    model.add(tf.keras.Input(shape=(num_features,)))
    for i in range( hp.Int('num_fc_layers', 1, 3)):
        model.add( tf.keras.layers.Dense(
            units=hp.Int('num_units_'+str(i+1), 10, 20, step=2),
            activation='relu',
            kernel_initializer='he_normal' ) )
        model.add( tf.keras.layers.Dropout( hp.Float('dropout_'+str(i+1), 0.1, 0.5, step=0.1)))
    model.add( tf.keras.layers.Dense(units=num_classes, activation='softmax', kernel_initializer='he_normal' ) )
    # compile model
    model.compile( optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])
    return model

tuner = kt.RandomSearch(
    hypermodel=build_model,
    objective='val_accuracy',
    max_trials=3,
    overwrite=True,
    directory='tom')

x, y = np.ones((10, 10)), np.ones((10, 3))
tuner.search(x, y, epochs=2, validation_data=(x, y))

best_model = tuner.get_best_models()[0]

reloaded_acc = best_model.evaluate(x, y)[1]
best_acc = tuner.oracle.get_best_trials()[0].score

assert reloaded_acc == best_acc

Please let me know if you're still seeing issues


rcmagic1 commented on May 12, 2024

I have the following:

import tensorflow as tf
from tensorflow.python import keras as keras
from tensorflow.python.keras import backend as K

Tensorflow version = 2.0.0-beta1
using tensorflow-gpu


omalleyt12 commented on May 12, 2024

@rcmagic1 The imports should be as follows (don't use the tensorflow.python module; it's meant to be private):

import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import backend as K

Could you try that, upgrade to the TF 2.0 release, and see whether you're still seeing issues? Does the code I pasted work for you?


omalleyt12 commented on May 12, 2024

@rcmagic1 Closing this out for now as I can't repro. Please feel free to reopen if you're still seeing the issue with the TF 2.0 release.


johnmcgowan1 commented on May 12, 2024

Hi,

I am running into this issue as well with Keras tuner using tf 2.1.0 but with the functional API rather than the sequential API. My imports are:

import numpy as np
import tensorflow as tf
import kerastuner as kt

And here is my code that builds the model and calls the search

def make_classif(hp):
    num_layers = hp.Int('dense_layers',3,5,default=3)
    learn_rate = hp.Float('learning_rate',1e-5,1e-2,sampling='log')
    num_nodes = hp.Int('num_nodes',50,300,step=50)
    activation = hp.Choice('activation',values=['selu','tanh','relu','leakyrelu'],default='tanh')
    opt = hp.Choice('optimizer',values=['rmsprop','adam','sgd'],default='sgd')
    inputs = tf.keras.layers.Input(dtype=tf.float64,shape=(train_features.shape[1],))
    print('===HP VALUES ===')
    print(opt)
    print(activation)

    Dx = inputs
    for i in range(num_layers):
        if activation == 'tanh':
            Dx = tf.keras.layers.Dense(num_nodes,activation=activation)(Dx)
            print('tanh')
        elif activation == 'selu':
            Dx = tf.keras.layers.Dense(num_nodes,kernel_initializer='lecun_normal',activation=activation)(Dx)
            print('selu')
        elif activation == 'relu':
            Dx = tf.keras.layers.Dense(num_nodes,kernel_initializer='he_normal',activation=activation)(Dx)
            print('relu')
        else:
            Dx = tf.keras.layers.Dense(num_nodes,kernel_initializer='he_normal')(Dx)
            Dx = tf.keras.layers.LeakyReLU(alpha=0.2)(Dx)
            print('leaky relu')
        Dx = tf.keras.layers.BatchNormalization()(Dx)
    Dx = tf.keras.layers.Dense(1,activation="sigmoid")(Dx)

    model = tf.keras.models.Model([inputs],[Dx])
    if opt == 'rmsprop':
        print('rms prop')
        optimizer = tf.keras.optimizers.RMSprop(learning_rate=learn_rate,rho=0.9)
    elif opt == 'sgd':
        print('sgd')
        optimizer = tf.keras.optimizers.SGD(learning_rate=learn_rate,momentum=0.9,nesterov=True)
    else:
        print('adam')
        optimizer = tf.keras.optimizers.Adam(learning_rate=learn_rate)

    model.compile(loss = tf.keras.losses.binary_crossentropy, optimizer = optimizer,metrics=["accuracy"])
    return model

callback_es = tf.keras.callbacks.EarlyStopping(monitor='val_loss',patience = 10)
callback_tb = tf.keras.callbacks.TensorBoard(get_run_logdir())

tuner = kt.RandomSearch(
    make_classif,
    objective='val_loss',
    max_trials=10,
    executions_per_trial=1,
    directory = 'test_dir_r')

tuner.search(x=train_features,
             y=train_labels,
             sample_weight=train_weights,
             class_weight=classweight,
             callbacks=[callback_tb,callback_es],
             epochs=200,
             verbose=2,
             validation_data=(valid_features,valid_labels,valid_weights))

The error I get is similar to the error mentioned earlier:

Traceback (most recent call last):
File "./classif_hyperparams_r.py", line 131, in <module>
f = tuner.get_best_models(1)[0]
File "/ws/tensorflow/lib/python3.6/site-packages/kerastuner/engine/tuner.py", line 231, in get_best_models
return super(Tuner, self).get_best_models(num_models)
File "/ws/tensorflow/lib/python3.6/site-packages/kerastuner/engine/base_tuner.py", line 238, in get_best_models
models = [self.load_model(trial) for trial in best_trials]
File "//ws/tensorflow/lib/python3.6/site-packages/kerastuner/engine/base_tuner.py", line 238, in <listcomp>
models = [self.load_model(trial) for trial in best_trials]
File "/ws/tensorflow/lib/python3.6/site-packages/kerastuner/engine/tuner.py", line 157, in load_model
trial.trial_id, best_epoch))
File "/ws/tensorflow/lib/python3.6/site-packages/tensorflow_core/python/keras/engine/training.py", line 234, in load_weights
return super(Model, self).load_weights(filepath, by_name, skip_mismatch)
File "/ws/tensorflow/lib/python3.6/site-packages/tensorflow_core/python/keras/engine/network.py", line 1193, in load_weights
status = self._trackable_saver.restore(filepath)
File "/ws/tensorflow/lib/python3.6/site-packages/tensorflow_core/python/training/tracking/util.py", line 1283, in restore
checkpoint=checkpoint, proto_id=0).restore(self._graph_view.root)
File "/ws/tensorflow/lib/python3.6/site-packages/tensorflow_core/python/training/tracking/base.py", line 209, in restore
restore_ops = trackable._restore_from_checkpoint_position(self) # pylint: disable=protected-access
File "/ws/tensorflow/lib/python3.6/site-packages/tensorflow_core/python/training/tracking/base.py", line 908, in _restore_from_checkpoint_position
tensor_saveables, python_saveables))
File "/ws/tensorflow/lib/python3.6/site-packages/tensorflow_core/python/training/tracking/util.py", line 289, in restore_saveables
validated_saveables).restore(self.save_path_tensor)
File "/ws/tensorflow/lib/python3.6/site-packages/tensorflow_core/python/training/saving/functional_saver.py", line 255, in restore
restore_ops.update(saver.restore(file_prefix))
File "/ws/tensorflow/lib/python3.6/site-packages/tensorflow_core/python/training/saving/functional_saver.py", line 102, in restore
restored_tensors, restored_shapes=None)
File "/ws/tensorflow/lib/python3.6/site-packages/tensorflow_core/python/training/saving/saveable_object_util.py", line 116, in restore
self.handle_op, self._var_shape, restored_tensor)
File "/ws/tensorflow/lib/python3.6/site-packages/tensorflow_core/python/ops/resource_variable_ops.py", line 297, in shape_safe_assign_variable_handle
shape.assert_is_compatible_with(value_tensor.shape)
File "/ws/tensorflow/lib/python3.6/site-packages/tensorflow_core/python/framework/tensor_shape.py", line 1110, in assert_is_compatible_with
raise ValueError("Shapes %s and %s are incompatible" % (self, other))
ValueError: Shapes (13, 300) and (15, 300) are incompatible

I am now testing option 1 listed above, but it seems like get_best_models should take the input layer into account when rebuilding the model. Any help or advice is much appreciated.
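One way to read the final ValueError, assuming the two shapes refer to the kernel of the first Dense layer, which Keras stores as (input_features, units): both sides agree on 300 units but disagree on the number of input features, 13 in the rebuilt model's variable versus 15 in the checkpoint. The helpers below are a hypothetical diagnostic in plain Python, not part of KerasTuner or TensorFlow, and the three-layer depth is an assumed value.

```python
def dense_kernel_shapes(input_features, units_per_layer):
    """Kernel shapes (inputs, units) for a stack of Dense layers."""
    shapes = []
    fan_in = input_features
    for units in units_per_layer:
        shapes.append((fan_in, units))
        fan_in = units  # the next layer consumes this layer's outputs
    return shapes


def first_mismatch(model_shapes, checkpoint_shapes):
    """Index and shapes of the first differing kernel, or None if all match."""
    for i, (m, c) in enumerate(zip(model_shapes, checkpoint_shapes)):
        if m != c:
            return i, m, c
    return None


# 13, 15 and 300 come from the error message; the depth of 3 is assumed.
rebuilt = dense_kernel_shapes(13, [300, 300, 300])
saved = dense_kernel_shapes(15, [300, 300, 300])
mismatch = first_mismatch(rebuilt, saved)
print(mismatch)
```

A mismatch at index 0 with equal unit counts would suggest that the number of input columns differed between the search run and the reload, rather than that the hyperparameters were restored incorrectly.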


microprediction commented on May 12, 2024

I ran into this problem. Although it is against the grain of Sequential, maybe force input_shape to be provided on the first layer?

   model.add(keras.layers.Dense(
        hp.Choice('units', [8, 16, 24, 32, 64]),hp.Choice('activation',['linear','relu']), input_shape=(1, n_input) ))

