This project optimizes hyperparameters using Reinforcement Learning. Term Project for the Optimization course at Izmir University of Economics.
The agent makes decisions by choosing actions that are expected to maximize the cumulative reward over time. The agent is a neural network model that takes the state of the environment as input and outputs the action to be taken.
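As a minimal sketch of such an agent, the network below maps a state vector to one Q-value per action and acts greedily. The dimensions, the single hidden layer, and the untrained random weights are all illustrative assumptions, not details from the project:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical dimensions: a 4-dim state (e.g. current hyperparameters
# plus the latest val_loss) and 3 discrete actions.
STATE_DIM, HIDDEN, N_ACTIONS = 4, 16, 3

# One hidden layer with ReLU; in practice these weights would be trained.
W1 = rng.normal(0.0, 0.1, (STATE_DIM, HIDDEN))
b1 = np.zeros(HIDDEN)
W2 = rng.normal(0.0, 0.1, (HIDDEN, N_ACTIONS))
b2 = np.zeros(N_ACTIONS)

def q_values(state):
    """Forward pass: state -> one Q-value per action."""
    h = np.maximum(0.0, state @ W1 + b1)  # ReLU hidden layer
    return h @ W2 + b2

def act(state):
    """Greedy policy: choose the action with the largest Q-value."""
    return int(np.argmax(q_values(state)))

action = act(np.array([0.1, -0.2, 0.05, 0.3]))  # an index in {0, 1, 2}
```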
The objective function is to minimize the validation loss (val_loss), measured as the Mean Squared Error (MSE). Given a set of n samples, where for each sample i the predicted value is ŷᵢ and the actual value is yᵢ, the MSE is calculated as:

MSE = (1/n) · Σᵢ₌₁ⁿ (yᵢ − ŷᵢ)²
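The MSE above can be computed directly; the sample values below are made up for illustration:

```python
import numpy as np

def mse(y_true, y_pred):
    """Mean Squared Error: the average of the squared residuals."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.mean((y_true - y_pred) ** 2))

# Residuals are 0.5, 0.0, -1.0, so MSE = (0.25 + 0 + 1) / 3
loss = mse([3.0, 2.0, 4.0], [2.5, 2.0, 5.0])
```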
In the context of reinforcement learning, the Q-values that must be maximized represent the expected future reward for taking a certain action in a certain state. They satisfy the Bellman optimality equation:

Q(s, a) = r + γ · max_{a′} Q(s′, a′)

where:
● s is the current state,
● a is the action taken,
● r is the immediate reward received after taking action a in state s,
● s′ is the new state after taking action a,
● a′ ranges over the actions available in state s′,
● γ is the discount factor, which determines the present value of future rewards.
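The relation above can be used as the target of a tabular Q-learning update. The toy state/action counts and the learning rate α (a standard Q-learning ingredient not listed in the text) are assumptions for illustration:

```python
import numpy as np

# Hypothetical toy setting: 3 states, 2 actions.
n_states, n_actions = 3, 2
gamma = 0.9   # discount factor
alpha = 0.1   # learning rate (standard in Q-learning; not in the text)

Q = np.zeros((n_states, n_actions))

def q_update(s, a, r, s_next):
    """One tabular Q-learning step: move Q(s, a) toward the
    Bellman target r + gamma * max_a' Q(s', a')."""
    target = r + gamma * np.max(Q[s_next])
    Q[s, a] += alpha * (target - Q[s, a])

# Starting from Q = 0, a reward of 1.0 moves Q(0, 1) to 0.1 * 1.0 = 0.1.
q_update(s=0, a=1, r=1.0, s_next=2)
```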
Let f: ℝⁿ → ℝ be the fitness or cost function to be minimized. Let x ∈ ℝⁿ denote a position or candidate solution in the search space. The basic Random Search algorithm can then be described as:
- Initialize x with a random position in the search space.
- Until a termination criterion is met (e.g. a maximum number of iterations performed, or adequate fitness reached), repeat the following:
  - Sample a new position y on the hypersphere of a given radius surrounding the current position x (see, e.g., Marsaglia's technique for sampling a hypersphere).
  - If f(y) < f(x), move to the new position by setting x = y.
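The steps above can be sketched as follows. The sphere cost function, radius, and iteration count are illustrative choices, not values from the project:

```python
import numpy as np

rng = np.random.default_rng(42)

def sample_on_sphere(center, radius):
    """Marsaglia-style sampling: normalize a Gaussian vector to land
    on the sphere of the given radius around `center`."""
    v = rng.normal(size=center.shape)
    return center + radius * v / np.linalg.norm(v)

def random_search(f, x0, radius=0.5, iters=1000):
    """Basic Random Search: accept a sampled point only if it improves f."""
    x = np.asarray(x0, dtype=float)
    fx = f(x)
    for _ in range(iters):
        y = sample_on_sphere(x, radius)
        fy = f(y)
        if fy < fx:          # move only on improvement
            x, fx = y, fy
    return x, fx

# Hypothetical cost: the sphere function, minimized at the origin.
sphere = lambda x: float(np.sum(x ** 2))
x_best, f_best = random_search(sphere, [2.0, -1.5])
# f_best is far below the starting cost of 6.25.
```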
You can access the tutorial and all inference results below.