
adamw_keras's Introduction

Implementation of the AdamW optimizer (Ilya Loshchilov, Frank Hutter) for Keras.

Tested on the following system

  • python 3.6
  • Keras 2.1.6
  • tensorflow(-gpu) 1.8.0

Usage

In addition to the usual Keras setup for building neural nets (see Keras for details), import and instantiate the optimizer:

from AdamW import AdamW

adamw = AdamW(lr=0.001, beta_1=0.9, beta_2=0.999, epsilon=None, decay=0., weight_decay=0.025, batch_size=1, samples_per_epoch=1, epochs=1)

Nothing else changes compared to the usual use of an optimizer in Keras. After defining the model's architecture, pass the optimizer to compile:

from keras import metrics
from keras.models import Sequential

model = Sequential()
<definition of the model_architecture>
model.compile(loss="mse", optimizer=adamw, metrics=[metrics.mse], ...)

Note that the batch size (batch_size), the number of training samples per epoch (samples_per_epoch), and the number of epochs (epochs) are required to normalize the weight decay (see Section 4 of the Loshchilov and Hutter paper).
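As a rough illustration of why these three values matter, the sketch below computes a normalized weight decay following the formula from the paper, w = w_norm * sqrt(b / (B * T)), where b is the batch size, B the number of training samples per epoch, and T the number of epochs. The function name normalized_weight_decay is hypothetical and not part of this package; it is assumed, not verified, that the optimizer applies the same scaling internally.

```python
import math


def normalized_weight_decay(weight_decay, batch_size, samples_per_epoch, epochs):
    """Scale a normalized weight decay by sqrt(b / (B * T)).

    Hypothetical helper for illustration only; the argument names mirror
    the AdamW constructor arguments shown above.
    """
    return weight_decay * math.sqrt(batch_size / (samples_per_epoch * epochs))


# With the default values from the usage example, nothing is rescaled.
default_decay = normalized_weight_decay(0.025, batch_size=1, samples_per_epoch=1, epochs=1)

# A longer run with more data gives a smaller effective decay per update.
scaled_decay = normalized_weight_decay(0.025, batch_size=32, samples_per_epoch=3200, epochs=100)
```

Under this formula, leaving batch_size, samples_per_epoch and epochs at 1 means weight_decay is used unchanged, so it is worth passing the real training values.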

Done

  • Weight decay added to the parameters optimization
  • Normalized weight decay added
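For readers unfamiliar with decoupled weight decay, here is a minimal scalar sketch of one update step as described in the Loshchilov and Hutter paper, with the schedule multiplier fixed to 1. It is not the code of this package: the function adamw_step and its epsilon default are assumptions made for illustration, and the actual Keras implementation may combine the terms differently.

```python
import math


def adamw_step(param, grad, m, v, t, lr=0.001, beta_1=0.9, beta_2=0.999,
               epsilon=1e-8, weight_decay=0.0):
    """One Adam step with decoupled weight decay on a single scalar parameter.

    Hypothetical helper: t is the 1-based step count, m and v are the
    running first and second moment estimates.
    """
    # Standard Adam moment updates.
    m = beta_1 * m + (1 - beta_1) * grad
    v = beta_2 * v + (1 - beta_2) * grad * grad
    # Bias correction for the zero-initialized moments.
    m_hat = m / (1 - beta_1 ** t)
    v_hat = v / (1 - beta_2 ** t)
    # The decay term acts on the parameter directly instead of being
    # added to the gradient, which is what "decoupled" refers to.
    new_param = param - lr * m_hat / (math.sqrt(v_hat) + epsilon) - weight_decay * param
    return new_param, m, v


# With a zero gradient, only the weight decay shrinks the parameter.
decayed, _, _ = adamw_step(1.0, 0.0, m=0.0, v=0.0, t=1, weight_decay=0.1)

# With no weight decay, the first step moves by roughly lr.
stepped, _, _ = adamw_step(1.0, 1.0, m=0.0, v=0.0, t=1, weight_decay=0.0)
```

The point of the design is that the decay does not pass through the adaptive scaling by sqrt(v_hat), unlike L2 regularization added to the loss.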

To be done (eventually - help is welcome)

  • Cosine annealing
  • Warm restarts

Source

ADAM: A METHOD FOR STOCHASTIC OPTIMIZATION, D.P. Kingma, J. Lei Ba

Fixing Weight Decay Regularization in Adam, I. Loshchilov, F. Hutter

adamw_keras's People

Contributors

glambard, ogrisel
