Hello there! I hope you are doing well. I created an issue earlier which was about get

Hyperparameter tuning to avoid overfitting about tf-levenberg-marquardt HOT 1 CLOSED

Tasfia266 commented on July 20, 2024

Hyperparameter tuning to avoid overfitting


Comments (1)

fabiodimarco commented on July 20, 2024

Hi,
If you see overfitting, the training algorithm itself is working fine. The problem is that the model has learned the training data "too well", and the resulting function is not smooth.
There are many ways to address overfitting. Dropout is one of them, but you probably want to reserve it for more complex models.
Two other simple options are to use a smaller model, which has less capacity to overfit the data, and to apply early stopping by monitoring validation data during training.
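To make the early stopping idea concrete, here is a minimal pure-Python sketch of patience-based stopping. The function name `early_stopping_epoch` and the loss values are hypothetical and only for illustration. It is not the Keras implementation, just the same basic rule: stop once the validation loss has not improved for `patience` consecutive epochs.

```python
def early_stopping_epoch(val_losses, patience):
    """Return (stop_epoch, best_epoch) for a sequence of validation losses.

    Training stops at the first epoch where the validation loss has failed
    to improve on the best value seen so far for `patience` epochs in a row.
    If that never happens, stop_epoch is the last epoch.
    """
    best_loss = float("inf")
    best_epoch = -1
    wait = 0  # epochs since the last improvement
    for epoch, loss in enumerate(val_losses):
        if loss < best_loss:
            best_loss = loss
            best_epoch = epoch
            wait = 0
        else:
            wait += 1
            if wait >= patience:
                return epoch, best_epoch
    return len(val_losses) - 1, best_epoch


# Hypothetical validation curve: improves up to epoch 2, then degrades.
losses = [1.0, 0.8, 0.7, 0.75, 0.72, 0.9, 0.95]
stop_epoch, best_epoch = early_stopping_epoch(losses, patience=3)
print("stopped at epoch", stop_epoch, "best epoch was", best_epoch)
```

A larger `patience` tolerates noisy validation curves but lets the model train longer past its best point, so it is worth tuning together with the model size.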
Here is some example code that I adapted from this resource:
https://www.kaggle.com/arunkumarramanan/tensorflow-tutorial-and-housing-price-prediction

import tensorflow as tf
import time
import levenberg_marquardt as lm

# load dataset
(x_train, y_train), (x_test, y_test) = \
    tf.keras.datasets.boston_housing.load_data()

x_train = tf.cast(x_train, tf.float32)
x_test = tf.cast(x_test, tf.float32)

y_train = tf.cast(y_train, tf.float32)
y_test = tf.cast(y_test, tf.float32)

# normalize input
x_train_mean = tf.math.reduce_mean(x_train, axis=0)
x_train_std = tf.math.reduce_std(x_train, axis=0)
x_train = (x_train - x_train_mean) / x_train_std
x_test = (x_test - x_train_mean) / x_train_std

# normalize output
y_train_mean = tf.math.reduce_mean(y_train, axis=0)
y_train_std = tf.math.reduce_std(y_train, axis=0)
y_train = (y_train - y_train_mean) / y_train_std
y_test = (y_test - y_train_mean) / y_train_std

# create the model
model = tf.keras.Sequential([
    tf.keras.layers.Dense(20, activation=tf.nn.elu,
                          input_shape=[x_train.shape[1]]),
    tf.keras.layers.Dense(1)
])

model.compile(optimizer=tf.keras.optimizers.Adam(), loss='mse')

model.summary()

model_wrapper = lm.ModelWrapper(tf.keras.models.clone_model(model))

model_wrapper.compile(
    optimizer=tf.keras.optimizers.SGD(learning_rate=0.01),
    loss=lm.MeanSquaredError(),
    damping_algorithm=lm.DampingAlgorithm(min_value=1e-10),
    solve_method='solve')

# train the model
early_stop = tf.keras.callbacks.EarlyStopping(monitor='val_loss', patience=100)

print("Train using Adam")
t1_start = time.perf_counter()
model.fit(x_train, y_train, batch_size=100, epochs=2000,
          validation_split=0.1, callbacks=[early_stop])
t1_stop = time.perf_counter()
print("Elapsed time: ", t1_stop - t1_start)

print("\n_________________________________________________________________")
print("Train using Levenberg-Marquardt")
t2_start = time.perf_counter()
model_wrapper.fit(x_train, y_train, batch_size=100, epochs=2000,
                  validation_split=0.1, callbacks=[early_stop])
t2_stop = time.perf_counter()
print("Elapsed time: ", t2_stop - t2_start)

print("\n_________________________________________________________________")
print("Test set results")

test_loss = model.evaluate(x=x_test, y=y_test, verbose=0)
print("adam - test_loss: %f" % test_loss)

test_loss = model_wrapper.evaluate(x=x_test, y=y_test, verbose=0)
print("lm - test_loss: %f" % test_loss)

However, experimentally I have noticed that using a small learning_rate helps to obtain a model that overfits less. You could also try regularization and see whether it improves the result.
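As a rough illustration of what regularization does, here is a small pure-Python sketch that fits a one-weight linear model by gradient descent, with and without an L2 penalty. The function `fit_linear` and the toy data are hypothetical, and this does not use the tf-levenberg-marquardt API. In Keras, a penalty like this is usually attached through a layer's `kernel_regularizer` argument; whether `lm.ModelWrapper` includes such penalties in its loss is not stated in this thread, so treat that part as something to verify.

```python
def fit_linear(xs, ys, l2=0.0, lr=0.1, steps=500):
    """Fit y ~ w * x by gradient descent on MSE + l2 * w**2."""
    w = 0.0
    n = len(xs)
    for _ in range(steps):
        # Gradient of the mean squared error with respect to w.
        grad_mse = sum(2.0 * (w * x - y) * x for x, y in zip(xs, ys)) / n
        # Gradient of the L2 penalty term l2 * w**2.
        grad_penalty = 2.0 * l2 * w
        w -= lr * (grad_mse + grad_penalty)
    return w


# Toy data generated by y = 2 * x (assumed, for illustration only).
xs = [1.0, 2.0, 3.0]
ys = [2.0, 4.0, 6.0]

w_plain = fit_linear(xs, ys, l2=0.0)
w_ridge = fit_linear(xs, ys, l2=1.0)
print("no penalty:", w_plain)
print("l2 = 1.0  :", w_ridge)
```

The penalized weight ends up smaller in magnitude than the unpenalized one. In a larger network the same shrinkage tends to give a smoother function, at the cost of a somewhat worse fit to the training data.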
