keras_lr_finder's People

Contributors

acere, amlarraz, astlaan, astupidbear, girishponkiya, jonnoftw, nsarang, qwertpi, surmenok, tarasivashchuk

keras_lr_finder's Issues

LRFinder doesn't work with Multi-Input data

The first problem here is that num_batches is computed using x_train.shape. For multi-input models, x_train is a list of np.array objects, so it has no shape attribute.

To solve this, you could add an option to pass num_batches explicitly, so the user can compute it for multi-input data.
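One way to keep the current call signature working, sketched here under the assumption that all inputs of a multi-input model share the same sample dimension (infer_num_batches is a hypothetical helper, not part of the library):

def infer_num_batches(x_train, batch_size, epochs):
    # Multi-input Keras models take a list of arrays whose first (sample)
    # dimension is shared, so the first input's length is sufficient.
    if isinstance(x_train, (list, tuple)):
        n_samples = x_train[0].shape[0]
    else:
        n_samples = x_train.shape[0]
    return epochs * n_samples / batch_size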

Error Logs


---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-18-57b811cf4161> in <module>()
     16 
     17 lr_finder = LRFinder(model)
---> 18 lr_finder.find(x_train, y_train, 0.0001, 1, 512, 5)

~/miniconda3/lib/python3.6/site-packages/keras_lr_finder/lr_finder.py in find(self, x_train, y_train, start_lr, end_lr, batch_size, epochs)
     39 
     40     def find(self, x_train, y_train, start_lr, end_lr, batch_size=64, epochs=1):
---> 41         num_batches = epochs * x_train.shape[0] / batch_size
     42         self.lr_mult = (end_lr / start_lr) ** (1 / num_batches)
     43 

AttributeError: 'list' object has no attribute 'shape'

Unsafe save_weights/load_weights method

In LRFinder.find() and LRFinder.find_generator(), there are calls to the following functions:

self.model.save_weights('tmp.h5')
self.model.load_weights('tmp.h5')

This is unsafe: if several Python processes run LRFinder in parallel, they will all access the same file, mixing weights between processes...

A unique file name should instead be generated every time, after checking that the name is not already in use. Note that different processes must also be prevented from generating the same name when they start at the same time (a problem if the random generator is seeded from the current time).
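A minimal sketch of a safer approach, assuming the standard-library tempfile module is acceptable here (mkstemp generates collision-free names even across concurrent processes; stashed_weights is a hypothetical helper):

import os
import tempfile
from contextlib import contextmanager

@contextmanager
def stashed_weights(model):
    # mkstemp creates and opens a uniquely named file; close our handle and
    # let Keras reopen it by path.
    fd, path = tempfile.mkstemp(suffix='.h5')
    os.close(fd)
    try:
        model.save_weights(path)
        yield  # run the learning-rate sweep here
        model.load_weights(path)
    finally:
        os.remove(path)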

AttributeError: 'Adam' object has no attribute 'learning_rate'

Thank you for this repo!

I saw that you renamed "lr" to "learning_rate", but now a new problem appears.

This is my code:

model.compile(loss=scaled_loss,
              optimizer='adam')

lr_finder = LRFinder(model)
lr_finder.find(X_train, Y_train, 1e-6, 1e-2, 128, 5)

And the exception is:

---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-14-a360a4ec21ae> in <module>
     17 
     18 lr_finder = LRFinder(model)
---> 19 lr_finder.find(X_train, Y_train, 1e-6, 1e-2, 128, 5)

~/anaconda3/envs/myenv/lib/python3.7/site-packages/keras_lr_finder-0.1-py3.7.egg/keras_lr_finder/lr_finder.py in find(self, x_train, y_train, start_lr, end_lr, batch_size, epochs)
     52 
     53         # Remember the original learning rate
---> 54         original_lr = K.get_value(self.model.optimizer.learning_rate)
     55 
     56         # Set the initial learning rate

AttributeError: 'Adam' object has no attribute 'learning_rate'

In my case, the learning-rate attribute is 'lr'. However, when I look at the Keras source, it is written as

class Adam(Optimizer):

    def __init__(self, learning_rate=0.001, beta_1=0.9, beta_2=0.999,
                 amsgrad=False, **kwargs):
        self.initial_decay = kwargs.pop('decay', 0.0)
        self.epsilon = kwargs.pop('epsilon', K.epsilon())
        learning_rate = kwargs.pop('lr', learning_rate)
        super(Adam, self).__init__(**kwargs)
        with K.name_scope(self.__class__.__name__):
            self.iterations = K.variable(0, dtype='int64', name='iterations')
            self.learning_rate = K.variable(learning_rate, name='learning_rate')
            self.beta_1 = K.variable(beta_1, name='beta_1')
            self.beta_2 = K.variable(beta_2, name='beta_2')
            self.decay = K.variable(self.initial_decay, name='decay')
        self.amsgrad = amsgrad

It is self.learning_rate = K.variable(learning_rate, name='learning_rate').

I don't know why this is happening, and I'm not even sure whether this issue belongs in this repo.

But anyway, I think it would be better if both 'lr' and 'learning_rate' were handled.
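A sketch of how both names could be handled (an assumed fix, not the repo's code; get_lr_variable is a hypothetical helper):

def get_lr_variable(optimizer):
    # Newer Keras optimizers expose `learning_rate`; older ones expose `lr`.
    if hasattr(optimizer, 'learning_rate'):
        return optimizer.learning_rate
    return optimizer.lr

# e.g. original_lr = K.get_value(get_lr_variable(model.optimizer))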

ValueError: You are trying to load a weight file containing 0 layers into a model with 2 layers.

Code to replicate the error:

# 1. Import callbacks
from keras.models import Sequential
from keras.layers import Flatten, Dense, Conv2D, MaxPooling2D
from keras.callbacks import ModelCheckpoint
from keras.datasets import mnist
!pip install keras_lr_finder
from keras_lr_finder import LRFinder

# 2. Input Data
(X_train, y_train), (X_test, y_test) = mnist.load_data()

mean, std = X_train.mean(), X_train.std()
X_train, X_test = (X_train-mean)/std, (X_test-mean)/std

# 3. Define Model
model = Sequential([Flatten(),
                    Dense(512, activation='relu'),
                    Dense(10, activation='softmax')])


# 5. Train model
model.compile(loss='sparse_categorical_crossentropy', \
              metrics=['accuracy'], optimizer='adam')

lr_finder = LRFinder(model)

lr_finder.find(X_train, y_train, start_lr=0.0001, end_lr=1, batch_size=512, epochs=5)

Colab: https://colab.research.google.com/drive/1YDVTxHutTIeVKz7l7yUuZ7MtiLvTA4AJ
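A likely cause, though this is an assumption not confirmed in the thread: since no input shape is given, the Sequential model is not yet built when LRFinder saves its weights, so save_weights writes a file with 0 layers and the later load fails. Building the model up front would avoid this:

# Hypothetical workaround: give the first layer an input_shape so the model
# is built (and has weights to save/restore) before lr_finder.find() runs.
model = Sequential([Flatten(input_shape=(28, 28)),
                    Dense(512, activation='relu'),
                    Dense(10, activation='softmax')])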

Problem with the model or LRFinder?

Hi!

I'm having trouble figuring out whether my model is not working correctly or I'm using the wrong parameters for LRFinder.

Here's my model summary. It's a fully convolutional network for multi-class text classification. I use a Keras generator to feed batches during training.

[screenshot: model summary]

And here are the results of LRFinder:

[screenshot: LRFinder loss plot]

I'm relatively new to Keras and ML and I would appreciate some clarification :)
Thanks in advance!

AttributeError occurs in LambdaCallback

I am using the latest versions of TensorFlow and Keras, and while running the LRFinder class I got the error shown below.


---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-...> in <module>
      1 from keras_lr_finder.lr_finder import*
      2 lr_finder = lf.LRFinder(model_seq)
----> 3 lr_finder.find(input_data, labels, start_lr=0.0001, end_lr=1, epochs=5)

D:\Deep_learning_projects\aryan\keras_lr_finder\lr_finder.py in find(self, x_train, y_train, start_lr, end_lr, batch_size, epochs, **kw_fit)
     62                        batch_size=batch_size, epochs=epochs,
     63                        callbacks=[callback],
---> 64                        **kw_fit)
     65
     66         # Restore the weights to the state before model fitting

c:\programdata\anaconda3\envs\deeplearning\lib\site-packages\tensorflow\python\keras\engine\training.py in _method_wrapper(self, *args, **kwargs)
    106   def _method_wrapper(self, *args, **kwargs):
    107     if not self._in_multi_worker_mode():  # pylint: disable=protected-access
--> 108       return method(self, *args, **kwargs)
    109
    110   # Running inside `run_distribute_coordinator` already.

c:\programdata\anaconda3\envs\deeplearning\lib\site-packages\tensorflow\python\keras\engine\training.py in fit(self, x, y, batch_size, epochs, verbose, callbacks, validation_split, validation_data, shuffle, class_weight, sample_weight, initial_epoch, steps_per_epoch, validation_steps, validation_batch_size, validation_freq, max_queue_size, workers, use_multiprocessing)
   1072               verbose=verbose,
   1073               epochs=epochs,
-> 1074               steps=data_handler.inferred_steps)
   1075
   1076       self.stop_training = False

c:\programdata\anaconda3\envs\deeplearning\lib\site-packages\tensorflow\python\keras\callbacks.py in __init__(self, callbacks, add_history, add_progbar, model, **params)
    233     # pylint: disable=protected-access
    234     self._should_call_train_batch_hooks = any(
--> 235         cb._implements_train_batch_hooks() for cb in self.callbacks)
    236     self._should_call_test_batch_hooks = any(
    237         cb._implements_test_batch_hooks() for cb in self.callbacks)

c:\programdata\anaconda3\envs\deeplearning\lib\site-packages\tensorflow\python\keras\callbacks.py in <genexpr>(.0)
    233     # pylint: disable=protected-access
    234     self._should_call_train_batch_hooks = any(
--> 235         cb._implements_train_batch_hooks() for cb in self.callbacks)
    236     self._should_call_test_batch_hooks = any(
    237         cb._implements_test_batch_hooks() for cb in self.callbacks)

AttributeError: 'LambdaCallback' object has no attribute '_implements_train_batch_hooks'
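A plausible cause, judging from the traceback rather than the repo's code: the finder builds its callback with standalone Keras's LambdaCallback, while the model trains through tf.keras, whose callback machinery checks for hooks that only tf.keras callbacks implement. A sketch of the assumed fix:

# Assumed fix (not verified against the repo): take LambdaCallback from
# tf.keras so it implements _implements_train_batch_hooks and friends.
from tensorflow.keras.callbacks import LambdaCallback

losses = []
callback = LambdaCallback(
    on_batch_end=lambda batch, logs: losses.append(logs['loss']))
# model.fit(x, y, callbacks=[callback])  # hypothetical usage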

Add Exponential Smoothing Plot

After reading this blog post: https://sgugger.github.io/how-do-you-find-a-good-learning-rate.html

It seems that you can get better smoothing by using an exponential weighting. Could this potentially provide a better learning rate?

I'll make a pull request but my code currently looks like this:

    def plot_exp_loss(self, beta=0.98, n_skip_beginning=10, n_skip_end=5):
        exp_loss = self.exp_weighted_losses(beta)[n_skip_beginning:-n_skip_end]
        plt.plot(self.lrs[n_skip_beginning:-n_skip_end], exp_loss, label="Loss")
        plt.ylabel("Exponentially Weighted Loss")
        plt.xlabel("Learning Rate (log scale)")
        plt.xscale('log')

    def plot_exp_loss_change(self, beta=0.98, n_skip_beginning=10, n_skip_end=5):
        exp_der = self.exp_weighted_derivatives(beta)[n_skip_beginning:-n_skip_end]
        plt.plot(self.lrs[n_skip_beginning:-n_skip_end], exp_der, label=r"exp weighted loss change")
        plt.ylabel(r"Exponentially Weighted Loss Change $\frac{dl}{dlr}$")
        plt.xlabel("Learning Rate (log scale)")
        plt.xscale('log')

    def get_best_lr_exp_weighted(self, beta=0.98, n_skip_beginning=10, n_skip_end=5):
        derivatives = self.exp_weighted_derivatives(beta)
        return min(zip(derivatives[n_skip_beginning:-n_skip_end], self.lrs[n_skip_beginning:-n_skip_end]))[1]

    def exp_weighted_losses(self, beta=0.98):
        losses = []
        avg_loss = 0.
        # Count batches from 1 so the bias-correction term (1 - beta ** n)
        # is never zero on the first batch.
        for batch_num, loss in enumerate(self.losses, start=1):
            avg_loss = beta * avg_loss + (1 - beta) * loss
            smoothed_loss = avg_loss / (1 - beta ** batch_num)
            losses.append(smoothed_loss)
        return losses

    def exp_weighted_derivatives(self, beta=0.98):
        derivatives = [0]
        losses = self.exp_weighted_losses(beta)
        for i in range(1, len(losses)):
            # Finite difference between consecutive smoothed losses (unit step)
            derivatives.append(losses[i] - losses[i - 1])
        return derivatives

Does this only work for feed-forward models?

Hi,

Thank you for providing this awesome library. I tried it on my CNN model and it did not work well. Does it only work for feed-forward models?

Please advise

Fixed Issue when using python 2.7

Hi! Congrats on your work! I've found an issue when using Python 2.7 (I don't know if the Python version is the problem). In lr_finder.py, line 42, I've replaced:

self.lr_mult = (end_lr / start_lr) ** (1 / num_batches)

with

self.lr_mult = (float(end_lr) / float(start_lr)) ** (float(1) / float(num_batches))

because with the first line the result was always 1 and the learning rate never updated. (Under Python 2, / between integers is floor division, so 1 / num_batches evaluates to 0 and the exponent collapses to x ** 0 == 1.)
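An alternative to the explicit float() casts, offered only as a suggestion, is to opt into true division for the whole module:

from __future__ import division  # makes / true division under Python 2

start_lr, end_lr, num_batches = 0.0001, 1.0, 500  # dummy values
lr_mult = (end_lr / start_lr) ** (1 / num_batches)  # no float() casts needed
print(lr_mult)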

What is kw_fit?

I'm trying to make lr_finder work with a keras.Sequence, but I'm unable to do so. In my attempts, I realized I have no idea where kw_fit comes from. What is this variable?
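Judging from the find() signature visible in the traceback earlier on this page, kw_fit looks like a **kwargs catch-all that find() forwards to model.fit. A sketch of that pattern (not the library's exact code):

# **kw_fit collects any extra keyword arguments and passes them through to
# model.fit() unchanged.
def find(model, x_train, y_train, start_lr, end_lr,
         batch_size=64, epochs=1, **kw_fit):
    model.fit(x_train, y_train,
              batch_size=batch_size, epochs=epochs,
              **kw_fit)

# e.g. find(model, x, y, 1e-6, 1e-2, verbose=0, shuffle=False)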

Automatic Select Best LR

How do I pick the best learning rate from the LRFinder object?

If there were a function for this in the library, the LRFinder object could be used with LearningRateScheduler to automatically re-pick the learning rate every n epochs using the lr-finder method.

Code could look something like:

import numpy as np
def get_derivatives(self, sma):
    assert sma >= 1
    derivatives = [0] * sma
    for i in range(sma, len(self.lrs)):
        derivatives.append((self.losses[i] - self.losses[i - sma]) / sma)
    return derivatives

def get_best_lr(self, sma, n_skip_beginning=10, n_skip_end=5):
    derivatives = self.get_derivatives(sma)
    best_der_idx = np.argmax(derivatives[n_skip_beginning:-n_skip_end])
    return self.lrs[n_skip_beginning:-n_skip_end][best_der_idx]      

Alternatively, you can just pick the learning rate associated with the best loss, i.e. new_lr = self.lrs[np.argmin(self.losses)].

Is it only good with Python 3?

When I use this function with Python 2 in a virtual environment, it couldn't plot the figures correctly; the figure was just a vertical line at a learning rate of 0.0001. Thanks.

I tried your example with Python 3, outside a virtual environment, and it worked well.

How to choose the parameters of lr_finder.find

How do I set the values of start_lr, end_lr, batch_size, and epochs? With my model, I couldn't get figures like those in your example.

I used
lr_finder.find(X_train, y_train, start_lr=0.0001, end_lr=1, batch_size=512, epochs=5)
lr_finder.plot_loss(n_skip_beginning=20, n_skip_end=5)

Then I got a figure like this:

[plot: loss vs. learning rate]

lr_finder.plot_loss_change(sma=20, n_skip_beginning=20, n_skip_end=5, y_lim=(-0.1, 0.1))

[plot: loss change vs. learning rate]

Could you give me some advice on choosing parameters to find a good lr? Many thanks.

How to use the get_best_lr method?

How do I use the get_best_lr method from the LRFinder class?
I tried

lr_finder.get_best_lr(sma=20)

I got the following error.

---------------------------------------------------------------------------
IndexError                                Traceback (most recent call last)
<ipython-input-28-407ca9d54a8d> in <module>
----> 1 lr_finder.get_best_lr(sma=20)

/kaggle/usr/lib/keras_lr_finder/keras_lr_finder.py in get_best_lr(self, sma, n_skip_beginning, n_skip_end)
    164     def get_best_lr(self, sma, n_skip_beginning=10, n_skip_end=5):
    165         derivatives = self.get_derivatives(sma)
--> 166         best_der_idx = np.argmax(derivatives[n_skip_beginning:-n_skip_end])[0]
    167         return self.lrs[n_skip_beginning:-n_skip_end][best_der_idx]

IndexError: invalid index to scalar variable.
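The traceback points at the trailing [0]: np.argmax over a 1-D sequence returns a plain integer, which cannot be indexed. Dropping the [0] is the assumed fix, matching the snippet proposed in the "Automatic Select Best LR" issue above:

import numpy as np

derivatives = [0.0, -0.5, -1.2, -0.8, 0.3]  # dummy data
n_skip_beginning, n_skip_end = 1, 1

# np.argmax returns a scalar here; indexing it with [0] raises
# "IndexError: invalid index to scalar variable".
best_der_idx = np.argmax(derivatives[n_skip_beginning:-n_skip_end])
print(best_der_idx)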

Why does this lr_finder use training loss instead of validation loss?

I have looked at the post "Estimating an Optimal Learning Rate For a Deep Neural Network", which suggests using the training loss to determine the best learning rate (or range of learning rates) to use. However, in the paper "Cyclical Learning Rates for Training Neural Networks", the author used validation accuracy to find the learning-rate range. So, in my humble opinion, lr_finder should evaluate val_loss after each batch, record it, and then plot validation loss against learning rate.

get_best_lr - argmax

Why do you use the argmax method within get_best_lr? Since we are interested in the fastest decrease in the loss, shouldn't we use argmin?

Incorrect lr_mult value when using find_generator

When using find_generator:

self.lr_mult = (float(end_lr) / float(start_lr)) ** (float(1) / float(steps_per_epoch))

lr_mult causes the learning rate to converge to end_lr after only 1 epoch. If you use 2 or more epochs, the learning rate will exceed end_lr and you end up training at a learning rate far higher than intended.

The fix here is (when end_lr=1, start_lr=0.001, steps_per_epoch=1000, epochs=4):

lr_mult = ((float(end_lr) / float(start_lr)) ** (1. / float(steps_per_epoch*epochs)))

Here's the output of a sample script showing how the two schedules diverge while cur_lr <= end_lr:

[Figure 1: learning-rate curves for the original and the fixed lr_mult]
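A quick numeric check of the same point, using the values from this issue:

# With the original formula the learning rate overshoots end_lr after the
# first epoch; with the fix it lands on end_lr at the end of training.
start_lr, end_lr = 0.001, 1.0
steps_per_epoch, epochs = 1000, 4

buggy_mult = (end_lr / start_lr) ** (1.0 / steps_per_epoch)
fixed_mult = (end_lr / start_lr) ** (1.0 / (steps_per_epoch * epochs))

total_steps = steps_per_epoch * epochs
print(start_lr * buggy_mult ** total_steps)  # 1e9 -- far beyond end_lr
print(start_lr * fixed_mult ** total_steps)  # ~1.0 -- exactly end_lr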

May I use this with sequential (keras.utils.Sequence) data?

I'm using third-party working code that fits a TCN-based model with

model.fit(train_seq, steps_per_epoch=len(train_seq), epochs=20)

where train_seq is a keras.utils.Sequence implemented by the following code:

import numpy as np
from tensorflow.keras.utils import Sequence  # base class used below

# FPS is a constant defined elsewhere in the original code

def cnn_pad(data, pad_frames):
    """Pad the data by repeating the first and last frame N times."""
    pad_start = np.repeat(data[:1], pad_frames, axis=0)
    pad_stop = np.repeat(data[-1:], pad_frames, axis=0)
    return np.concatenate((pad_start, data, pad_stop))
    
class DataSequence(Sequence):
    def __init__(self, x, y, fps=FPS, pad_frames=None):
        self.x = x
        y_proc = np.zeros(np.shape(x)[1])
        np.add.at(y_proc, y[0]*FPS, 1)
        self.y = [y_proc]
        self.pad_frames = pad_frames

    def __len__(self):
        return len(self.x)

    def __getitem__(self, idx):
        x = np.array(cnn_pad(self.x[idx], self.pad_frames))[np.newaxis, ..., np.newaxis]
        y = self.y[idx][np.newaxis, ..., np.newaxis]
        return x, y

x = [np.random.randn(1949, 81)]  # to emulate audio
y = [np.asarray([1, 2, 3, 4, 5])]  # to emulate annotations
train_seq = DataSequence(x[:1], y[:1], pad_frames=2)

How may I use LRFinder for this type of data? That is, is it possible to "retrieve" valid x and y from the Sequence? The dummy code above reproduces the issue almost in its entirety (everything except the Keras h5 model).
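One hedged option, since the library also exposes find_generator (mentioned in the save_weights issue above), is to pass the Sequence there instead of extracting x and y. A sketch; the parameter names are assumptions, not verified against the repo:

from keras_lr_finder import LRFinder

# Assumed usage: find_generator forwards the generator to model.fit, and
# tf.keras's model.fit accepts a keras.utils.Sequence directly.
lr_finder = LRFinder(model)  # `model` is the TCN model from the issue
lr_finder.find_generator(train_seq,
                         start_lr=1e-6, end_lr=1e-2,
                         epochs=1,
                         steps_per_epoch=len(train_seq))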
