keras_lr_finder's People

Contributors

acere, amlarraz, astlaan, astupidbear, girishponkiya, jonnoftw, nsarang, qwertpi, surmenok, tarasivashchuk

keras_lr_finder's Issues

LRFinder doesn't work with Multi-Input data

The first problem here is that num_batches is computed using x_train.shape. For multi-input models, x_train is a list of np.array objects, so it has no shape attribute.

To solve this, you could add an option to pass num_batches explicitly, so the user can compute it for multi-input data.
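One way to keep the current call signature working, sketched here under the assumption that all inputs of a multi-input model share the same sample dimension (infer_num_batches is a hypothetical helper, not part of the library):

def infer_num_batches(x_train, batch_size, epochs):
    # Multi-input Keras models take a list of arrays whose first (sample)
    # dimension is shared, so the first input's length is sufficient.
    if isinstance(x_train, (list, tuple)):
        n_samples = x_train[0].shape[0]
    else:
        n_samples = x_train.shape[0]
    return epochs * n_samples / batch_size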

Error Logs


---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-18-57b811cf4161> in <module>()
     16 
     17 lr_finder = LRFinder(model)
---> 18 lr_finder.find(x_train, y_train, 0.0001, 1, 512, 5)

~/miniconda3/lib/python3.6/site-packages/keras_lr_finder/lr_finder.py in find(self, x_train, y_train, start_lr, end_lr, batch_size, epochs)
     39 
     40     def find(self, x_train, y_train, start_lr, end_lr, batch_size=64, epochs=1):
---> 41         num_batches = epochs * x_train.shape[0] / batch_size
     42         self.lr_mult = (end_lr / start_lr) ** (1 / num_batches)
     43 

AttributeError: 'list' object has no attribute 'shape'

Unsafe save_weights/load_weights method

In LRFinder.find() and LRFinder.find_generator(), there are calls to the following functions:

self.model.save_weights('tmp.h5')
self.model.load_weights('tmp.h5')

This is unsafe: if several Python processes run LRFinder in parallel, they will all access the same file, mixing weights between processes...

A unique file name should instead be generated every time, after checking that the name is not already in use. Note that different processes must also be prevented from generating the same name when they start at the same time (a problem if the random generator is seeded from the current time).
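A minimal sketch of a safer approach, assuming the standard-library tempfile module is acceptable here (mkstemp generates collision-free names even across concurrent processes; stashed_weights is a hypothetical helper):

import os
import tempfile
from contextlib import contextmanager

@contextmanager
def stashed_weights(model):
    # mkstemp creates and opens a uniquely named file; close our handle and
    # let Keras reopen it by path.
    fd, path = tempfile.mkstemp(suffix='.h5')
    os.close(fd)
    try:
        model.save_weights(path)
        yield  # run the learning-rate sweep here
        model.load_weights(path)
    finally:
        os.remove(path)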

AttributeError: 'Adam' object has no attribute 'learning_rate'

Thank you for this repo!

I saw that you renamed "lr" to "learning_rate", but now a new problem appears.

This is my code:

model.compile(loss=scaled_loss,
              optimizer='adam')

lr_finder = LRFinder(model)
lr_finder.find(X_train, Y_train, 1e-6, 1e-2, 128, 5)

And the exception is:

---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-14-a360a4ec21ae> in <module>
     17 
     18 lr_finder = LRFinder(model)
---> 19 lr_finder.find(X_train, Y_train, 1e-6, 1e-2, 128, 5)

~/anaconda3/envs/myenv/lib/python3.7/site-packages/keras_lr_finder-0.1-py3.7.egg/keras_lr_finder/lr_finder.py in find(self, x_train, y_train, start_lr, end_lr, batch_size, epochs)
     52 
     53         # Remember the original learning rate
---> 54         original_lr = K.get_value(self.model.optimizer.learning_rate)
     55 
     56         # Set the initial learning rate

AttributeError: 'Adam' object has no attribute 'learning_rate'

In my case, the learning-rate attribute is 'lr'. However, when I look at the Keras source, it is written as

class Adam(Optimizer):

    def __init__(self, learning_rate=0.001, beta_1=0.9, beta_2=0.999,
                 amsgrad=False, **kwargs):
        self.initial_decay = kwargs.pop('decay', 0.0)
        self.epsilon = kwargs.pop('epsilon', K.epsilon())
        learning_rate = kwargs.pop('lr', learning_rate)
        super(Adam, self).__init__(**kwargs)
        with K.name_scope(self.__class__.__name__):
            self.iterations = K.variable(0, dtype='int64', name='iterations')
            self.learning_rate = K.variable(learning_rate, name='learning_rate')
            self.beta_1 = K.variable(beta_1, name='beta_1')
            self.beta_2 = K.variable(beta_2, name='beta_2')
            self.decay = K.variable(self.initial_decay, name='decay')
        self.amsgrad = amsgrad

It is self.learning_rate = K.variable(learning_rate, name='learning_rate').

I don't know why this is happening, and I'm not even sure whether this issue belongs in this repo.

But anyway, I think it would be better if both 'lr' and 'learning_rate' were handled.
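A sketch of how both names could be handled (an assumed fix, not the repo's code; get_lr_variable is a hypothetical helper):

def get_lr_variable(optimizer):
    # Newer Keras optimizers expose `learning_rate`; older ones expose `lr`.
    if hasattr(optimizer, 'learning_rate'):
        return optimizer.learning_rate
    return optimizer.lr

# e.g. original_lr = K.get_value(get_lr_variable(model.optimizer))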

ValueError: You are trying to load a weight file containing 0 layers into a model with 2 layers.

Code to replicate the error:

# 1. Import callbacks
from keras.models import Sequential
from keras.layers import Flatten, Dense, Conv2D, MaxPooling2D
from keras.callbacks import ModelCheckpoint
from keras.datasets import mnist
!pip install keras_lr_finder
from keras_lr_finder import LRFinder

# 2. Input Data
(X_train, y_train), (X_test, y_test) = mnist.load_data()

mean, std = X_train.mean(), X_train.std()
X_train, X_test = (X_train-mean)/std, (X_test-mean)/std

# 3. Define Model
model = Sequential([Flatten(),
                    Dense(512, activation='relu'),
                    Dense(10, activation='softmax')])


# 5. Train model
model.compile(loss='sparse_categorical_crossentropy', \
              metrics=['accuracy'], optimizer='adam')

lr_finder = LRFinder(model)

lr_finder.find(X_train, y_train, start_lr=0.0001, end_lr=1, batch_size=512, epochs=5)

Colab: https://colab.research.google.com/drive/1YDVTxHutTIeVKz7l7yUuZ7MtiLvTA4AJ
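A likely cause, though this is an assumption not confirmed in the thread: since no input shape is given, the Sequential model is not yet built when LRFinder saves its weights, so save_weights writes a file with 0 layers and the later load fails. Building the model up front would avoid this:

# Hypothetical workaround: give the first layer an input_shape so the model
# is built (and has weights to save/restore) before lr_finder.find() runs.
model = Sequential([Flatten(input_shape=(28, 28)),
                    Dense(512, activation='relu'),
                    Dense(10, activation='softmax')])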

Problem with the model or LRFinder?

Hi!

I'm having trouble figuring out whether my model is not working correctly or I'm using the wrong parameters for LRFinder.

Here's my model summary. It's a fully convolutional network for multi-class text classification. I use a Keras generator to feed batches during training.

[screenshot: model summary]

And here are the results of LRFinder:

[screenshot: LRFinder loss plot]

I'm relatively new to Keras and ML and I would appreciate some clarification :)
Thanks in advance!

AttributeError occurs in LambdaCallback

I am using the latest versions of TensorFlow and Keras, and while running the LRFinder class I got the error shown below.


---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-...> in <module>
      1 from keras_lr_finder.lr_finder import*
      2 lr_finder = lf.LRFinder(model_seq)
----> 3 lr_finder.find(input_data, labels, start_lr=0.0001, end_lr=1, epochs=5)

D:\Deep_learning_projects\aryan\keras_lr_finder\lr_finder.py in find(self, x_train, y_train, start_lr, end_lr, batch_size, epochs, **kw_fit)
     62                        batch_size=batch_size, epochs=epochs,
     63                        callbacks=[callback],
---> 64                        **kw_fit)
     65
     66         # Restore the weights to the state before model fitting

c:\programdata\anaconda3\envs\deeplearning\lib\site-packages\tensorflow\python\keras\engine\training.py in _method_wrapper(self, *args, **kwargs)
    106   def _method_wrapper(self, *args, **kwargs):
    107     if not self._in_multi_worker_mode():  # pylint: disable=protected-access
--> 108       return method(self, *args, **kwargs)
    109
    110   # Running inside `run_distribute_coordinator` already.

c:\programdata\anaconda3\envs\deeplearning\lib\site-packages\tensorflow\python\keras\engine\training.py in fit(self, x, y, batch_size, epochs, verbose, callbacks, validation_split, validation_data, shuffle, class_weight, sample_weight, initial_epoch, steps_per_epoch, validation_steps, validation_batch_size, validation_freq, max_queue_size, workers, use_multiprocessing)
   1072               verbose=verbose,
   1073               epochs=epochs,
-> 1074               steps=data_handler.inferred_steps)
   1075
   1076       self.stop_training = False

c:\programdata\anaconda3\envs\deeplearning\lib\site-packages\tensorflow\python\keras\callbacks.py in __init__(self, callbacks, add_history, add_progbar, model, **params)
    233     # pylint: disable=protected-access
    234     self._should_call_train_batch_hooks = any(
--> 235         cb._implements_train_batch_hooks() for cb in self.callbacks)
    236     self._should_call_test_batch_hooks = any(
    237         cb._implements_test_batch_hooks() for cb in self.callbacks)

c:\programdata\anaconda3\envs\deeplearning\lib\site-packages\tensorflow\python\keras\callbacks.py in <genexpr>(.0)
    233     # pylint: disable=protected-access
    234     self._should_call_train_batch_hooks = any(
--> 235         cb._implements_train_batch_hooks() for cb in self.callbacks)
    236     self._should_call_test_batch_hooks = any(
    237         cb._implements_test_batch_hooks() for cb in self.callbacks)

AttributeError: 'LambdaCallback' object has no attribute '_implements_train_batch_hooks'
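A plausible cause, judging from the traceback rather than the repo's code: the finder builds its callback with standalone Keras's LambdaCallback, while the model trains through tf.keras, whose callback machinery checks for hooks that only tf.keras callbacks implement. A sketch of the assumed fix:

# Assumed fix (not verified against the repo): take LambdaCallback from
# tf.keras so it implements _implements_train_batch_hooks and friends.
from tensorflow.keras.callbacks import LambdaCallback

losses = []
callback = LambdaCallback(
    on_batch_end=lambda batch, logs: losses.append(logs['loss']))
# model.fit(x, y, callbacks=[callback])  # hypothetical usage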

Add Exponential Smoothing Plot

After reading this blog post: https://sgugger.github.io/how-do-you-find-a-good-learning-rate.html

It seems that you can get better smoothing by using an exponential weighting. Could this potentially provide a better learning rate?

I'll make a pull request but my code currently looks like this:

    def plot_exp_loss(self, beta=0.98, n_skip_beginning=10, n_skip_end=5):
        exp_loss = self.exp_weighted_losses(beta)[n_skip_beginning:-n_skip_end]
        plt.plot(self.lrs[n_skip_beginning:-n_skip_end], exp_loss, label="Loss")
        plt.ylabel("Exponentially Weighted Loss")
        plt.xlabel("Learning Rate (log scale)")
        plt.xscale('log')

    def plot_exp_loss_change(self, beta=0.98, n_skip_beginning=10, n_skip_end=5):
        exp_der = self.exp_weighted_derivatives(beta)[n_skip_beginning:-n_skip_end]
        plt.plot(self.lrs[n_skip_beginning:-n_skip_end], exp_der, label=r"exp weighted loss change")
        plt.ylabel(r"Exponentially Weighted Loss Change $\frac{dl}{dlr}$")
        plt.xlabel("Learning Rate (log scale)")
        plt.xscale('log')

    def get_best_lr_exp_weighted(self, beta=0.98, n_skip_beginning=10, n_skip_end=5):
        derivatives = self.exp_weighted_derivatives(beta)
        return min(zip(derivatives[n_skip_beginning:-n_skip_end], self.lrs[n_skip_beginning:-n_skip_end]))[1]

    def exp_weighted_losses(self, beta=0.98):
        losses = []
        avg_loss = 0.
        # Count batches from 1 so the bias-correction term (1 - beta ** n)
        # is never zero on the first batch.
        for batch_num, loss in enumerate(self.losses, start=1):
            avg_loss = beta * avg_loss + (1 - beta) * loss
            smoothed_loss = avg_loss / (1 - beta ** batch_num)
            losses.append(smoothed_loss)
        return losses

    def exp_weighted_derivatives(self, beta=0.98):
        derivatives = [0]
        losses = self.exp_weighted_losses(beta)
        for i in range(1, len(losses)):
            # Finite difference between consecutive smoothed losses (unit step)
            derivatives.append(losses[i] - losses[i - 1])
        return derivatives

Does this only work for feed-forward models?

Hi,

Thank you for providing this awesome library. I tried it on my CNN model and it did not work well. Does it only work for feed-forward models?

Please advise

Fixed Issue when using python 2.7

Hi! Congrats on your work! I've found an issue when using Python 2.7 (I don't know if the Python version is the problem). In lr_finder.py, line 42, I've replaced:

self.lr_mult = (end_lr / start_lr) ** (1 / num_batches)

with

self.lr_mult = (float(end_lr) / float(start_lr)) ** (float(1) / float(num_batches))

because with the first line the result was always 1 and the learning rate never updated. (Under Python 2, / between integers is floor division, so 1 / num_batches evaluates to 0 and the exponent collapses to x ** 0 == 1.)
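An alternative to the explicit float() casts, offered only as a suggestion, is to opt into true division for the whole module:

from __future__ import division  # makes / true division under Python 2

start_lr, end_lr, num_batches = 0.0001, 1.0, 500  # dummy values
lr_mult = (end_lr / start_lr) ** (1 / num_batches)  # no float() casts needed
print(lr_mult)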

What is kw_fit?

I'm trying to make lr_finder work with a keras.Sequence, but I'm unable to do so. In my attempts, I realized I have no idea where kw_fit comes from. What is this variable?
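Judging from the find() signature visible in the traceback earlier on this page, kw_fit looks like a **kwargs catch-all that find() forwards to model.fit. A sketch of that pattern (not the library's exact code):

# **kw_fit collects any extra keyword arguments and passes them through to
# model.fit() unchanged.
def find(model, x_train, y_train, start_lr, end_lr,
         batch_size=64, epochs=1, **kw_fit):
    model.fit(x_train, y_train,
              batch_size=batch_size, epochs=epochs,
              **kw_fit)

# e.g. find(model, x, y, 1e-6, 1e-2, verbose=0, shuffle=False)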

Automatic Select Best LR

How do I pick the best learning rate from the LRFinder object?

If there were a function for this in the library, the LRFinder object could be used with LearningRateScheduler to automatically re-pick the learning rate every n epochs using the lr-finder method.

Code could look something like:

import numpy as np
def get_derivatives(self, sma):
    assert sma >= 1
    derivatives = [0] * sma
    for i in range(sma, len(self.lrs)):
        derivatives.append((self.losses[i] - self.losses[i - sma]) / sma)
    return derivatives

def get_best_lr(self, sma, n_skip_beginning=10, n_skip_end=5):
    derivatives = self.get_derivatives(sma)
    best_der_idx = np.argmax(derivatives[n_skip_beginning:-n_skip_end])
    return self.lrs[n_skip_beginning:-n_skip_end][best_der_idx]      

Alternatively, you can just pick the learning rate associated with the best loss, i.e. new_lr = self.lrs[np.argmin(self.losses)].

Is it only good with Python 3?

When I use this function with Python 2 in a virtual environment, it couldn't plot the figures correctly; the figure was just a vertical line at a learning rate of 0.0001. Thanks.

I tried your example with Python 3, outside a virtual environment, and it worked well.

How to choose the parameters of lr_finder.find

How do I set the values of start_lr, end_lr, batch_size, and epochs? With my model, I couldn't get figures like those in your example.

I used
lr_finder.find(X_train, y_train, start_lr=0.0001, end_lr=1, batch_size=512, epochs=5)
lr_finder.plot_loss(n_skip_beginning=20, n_skip_end=5)

Then I got a figure like this:

[plot: loss vs. learning rate]

lr_finder.plot_loss_change(sma=20, n_skip_beginning=20, n_skip_end=5, y_lim=(-0.1, 0.1))

[plot: loss change vs. learning rate]

Could you give me some advice on choosing parameters to find a good lr? Many thanks.

How to use the get_best_lr method?

How do I use the get_best_lr method from the LRFinder class?
I tried

lr_finder.get_best_lr(sma=20)

I got the following error.

---------------------------------------------------------------------------
IndexError                                Traceback (most recent call last)
<ipython-input-28-407ca9d54a8d> in <module>
----> 1 lr_finder.get_best_lr(sma=20)

/kaggle/usr/lib/keras_lr_finder/keras_lr_finder.py in get_best_lr(self, sma, n_skip_beginning, n_skip_end)
    164     def get_best_lr(self, sma, n_skip_beginning=10, n_skip_end=5):
    165         derivatives = self.get_derivatives(sma)
--> 166         best_der_idx = np.argmax(derivatives[n_skip_beginning:-n_skip_end])[0]
    167         return self.lrs[n_skip_beginning:-n_skip_end][best_der_idx]

IndexError: invalid index to scalar variable.
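The traceback points at the trailing [0]: np.argmax over a 1-D sequence returns a plain integer, which cannot be indexed. Dropping the [0] is the assumed fix, matching the snippet proposed in the "Automatic Select Best LR" issue above:

import numpy as np

derivatives = [0.0, -0.5, -1.2, -0.8, 0.3]  # dummy data
n_skip_beginning, n_skip_end = 1, 1

# np.argmax returns a scalar here; indexing it with [0] raises
# "IndexError: invalid index to scalar variable".
best_der_idx = np.argmax(derivatives[n_skip_beginning:-n_skip_end])
print(best_der_idx)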

Why does this lr_finder use training loss instead of validation loss?

I have looked at the post "Estimating an Optimal Learning Rate For a Deep Neural Network", which suggests using the training loss to determine the best learning rate (or range of learning rates) to use. However, in the paper "Cyclical Learning Rates for Training Neural Networks", the author used validation accuracy to find the learning-rate range. So, in my humble opinion, lr_finder should evaluate val_loss after each batch, record it, and then plot validation loss against learning rate.

get_best_lr - argmax

Why do you use the argmax method within get_best_lr? Since we are interested in the fastest decrease in the loss, shouldn't we use argmin?

Incorrect lr_mult value when using find_generator

When using find_generator:

self.lr_mult = (float(end_lr) / float(start_lr)) ** (float(1) / float(steps_per_epoch))

lr_mult causes the learning rate to converge to end_lr after only 1 epoch. If you use 2 or more epochs, the learning rate will exceed end_lr and you end up training at a learning rate far higher than intended.

The fix here is (when end_lr=1, start_lr=0.001, steps_per_epoch=1000, epochs=4):

lr_mult = ((float(end_lr) / float(start_lr)) ** (1. / float(steps_per_epoch*epochs)))

Here's the output of a sample script showing how the two schedules diverge while cur_lr <= end_lr:

[Figure 1: learning-rate curves for the original and the fixed lr_mult]
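A quick numeric check of the same point, using the values from this issue:

# With the original formula the learning rate overshoots end_lr after the
# first epoch; with the fix it lands on end_lr at the end of training.
start_lr, end_lr = 0.001, 1.0
steps_per_epoch, epochs = 1000, 4

buggy_mult = (end_lr / start_lr) ** (1.0 / steps_per_epoch)
fixed_mult = (end_lr / start_lr) ** (1.0 / (steps_per_epoch * epochs))

total_steps = steps_per_epoch * epochs
print(start_lr * buggy_mult ** total_steps)  # 1e9 -- far beyond end_lr
print(start_lr * fixed_mult ** total_steps)  # ~1.0 -- exactly end_lr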

May I use this with sequential (keras.utils.Sequence) data?

I'm using third-party working code that fits a TCN-based model with

model.fit(train_seq, steps_per_epoch=len(train_seq), epochs=20)

where train_seq is a keras.utils.Sequence implemented by the following code:

import numpy as np
from tensorflow.keras.utils import Sequence  # base class used below

# FPS is a constant defined elsewhere in the original code

def cnn_pad(data, pad_frames):
    """Pad the data by repeating the first and last frame N times."""
    pad_start = np.repeat(data[:1], pad_frames, axis=0)
    pad_stop = np.repeat(data[-1:], pad_frames, axis=0)
    return np.concatenate((pad_start, data, pad_stop))
    
class DataSequence(Sequence):
    def __init__(self, x, y, fps=FPS, pad_frames=None):
        self.x = x
        y_proc = np.zeros(np.shape(x)[1])
        np.add.at(y_proc, y[0]*FPS, 1)
        self.y = [y_proc]
        self.pad_frames = pad_frames

    def __len__(self):
        return len(self.x)

    def __getitem__(self, idx):
        x = np.array(cnn_pad(self.x[idx], self.pad_frames))[np.newaxis, ..., np.newaxis]
        y = self.y[idx][np.newaxis, ..., np.newaxis]
        return x, y

x = [np.random.randn(1949, 81)]  # to emulate audio
y = [np.asarray([1, 2, 3, 4, 5])]  # to emulate annotations
train_seq = DataSequence(x[:1], y[:1], pad_frames=2)

How may I use LRFinder for this type of data? That is, is it possible to "retrieve" valid x and y from the Sequence? The dummy code above reproduces the issue almost in its entirety (everything except the Keras h5 model).
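One hedged option, since the library also exposes find_generator (mentioned in the save_weights issue above), is to pass the Sequence there instead of extracting x and y. A sketch; the parameter names are assumptions, not verified against the repo:

from keras_lr_finder import LRFinder

# Assumed usage: find_generator forwards the generator to model.fit, and
# tf.keras's model.fit accepts a keras.utils.Sequence directly.
lr_finder = LRFinder(model)  # `model` is the TCN model from the issue
lr_finder.find_generator(train_seq,
                         start_lr=1e-6, end_lr=1e-2,
                         epochs=1,
                         steps_per_epoch=len(train_seq))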
