gerel's Introduction

GeReL

GeReL is a simple library for genetic algorithms applied to reinforcement learning.

NOTE: GeReL is in development.

Example:

The following uses REINFORCE-ES to solve the OpenAI Gym CartPole environment:

from gerel.genome.factories import dense
from gerel.algorithms.RES.population import RESPopulation
from gerel.algorithms.RES.mutator import RESMutator
from gerel.model.model import Model
import gym
import numpy as np
from gerel.populations.genome_seeders import curry_genome_seeder
from string import Template


def compute_fitness(genome):
    model = Model(genome)
    env = gym.make("CartPole-v0")
    state = env.reset()
    fitness = 0
    action_map = lambda a: 0 if a[0] <= 0 else 1  # noqa
    for _ in range(1000):
        action = model(state)
        action = action_map(action)
        state, reward, done, _ = env.step(action)
        fitness += reward
        if done:
            break

    return fitness


if __name__ == '__main__':
    genome = dense(
        input_size=4,
        output_size=1,
        layer_dims=[2, 2, 2]
    )

    weights_len = len(genome.edges) + len(genome.nodes)
    init_mu = np.random.uniform(-1, 1, weights_len)

    mutator = RESMutator(
        initial_mu=init_mu,
        std_dev=0.1,
        alpha=0.05
    )

    seeder = curry_genome_seeder(
        mutator=mutator,
        seed_genomes=[genome]
    )

    population = RESPopulation(
        population_size=50,
        genome_seeder=seeder
    )

    report_temp = Template('generation: $generation, mean: $mean, best: $best')
    for generation in range(100):
        for genome in population.genomes:
            genome.fitness = compute_fitness(genome.to_reduced_repr)
        population.speciate()
        data = population.to_dict()
        mutator(population)
        report = report_temp.substitute(
            generation=generation,
            mean=data['mean_fitness'],
            best=data['best_fitness'])
        print(report)

Tests:

To run all unit tests:

nosetests

gerel's People

Contributors

dependabot[bot], mauicv

gerel's Issues

Add examples

Replace the integration tests with an examples folder, and include documentation for each case.

Memory issues

Running populations of around 300 with hidden layer sizes of around [100, 100] results in significant slowdowns. These sizes should not be an issue. Figure out what causes this. Is it just memory?

get_addmissable_edges returns empty list?

This may be an error that occurs. It is very intermittent, so this issue is mostly a placeholder for notes in case it crops up again. Do not pursue, as it may not exist!

Project aims

Currently pyg successfully runs NEAT, even with some minor errors (#5). The aim is to build pyg so that:

  1. NEAT is easy.
  2. REINFORCE-ES is implementable.
  3. It is simple enough to serve as a backend to a real-time mixed human selection system.
    • The idea is that, using a very simple baseline reward function in a given environment, a human selects the best strategies. We use the human data as a fitness function and also train a critic network that adjusts the baseline reward function.
    • It would also be useful to be able to use policy gradient methods to fine-tune evolved solutions. To do this we need to be able to convert between our Model and a model from some other ML framework, such as Keras.

add genome_seeder_from_ds

Add a function that creates a genome_seeder or population from a DataStore and a specified generation.

Make SIMPLEMutator

The simple mutator just selects the top n genomes and mutates each some number of times to refill the population.
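
A minimal sketch of that top-n refill step, written against plain Python objects rather than GeReL's own classes. The function name `refill_from_top_n` and the `fitness_of` and `mutate` callables are hypothetical, and the sketch assumes the next population consists only of mutated children (the issue does not say whether the parents themselves survive).

```python
import random
from itertools import cycle, islice


def refill_from_top_n(genomes, fitness_of, mutate, top_n, population_size):
    """Build the next population from mutated copies of the top_n genomes.

    Assumption: parents are not carried over unchanged; every member of the
    new population is a mutated child.
    """
    if not 1 <= top_n <= len(genomes):
        raise ValueError("top_n must be between 1 and len(genomes)")
    # Truncation selection: keep only the fittest top_n genomes as parents.
    parents = sorted(genomes, key=fitness_of, reverse=True)[:top_n]
    # Cycle through the parents so each is mutated a roughly equal number of times.
    return [mutate(parent) for parent in islice(cycle(parents), population_size)]


# Toy usage: genomes are plain weight lists, fitness prefers small weights.
rng = random.Random(0)
toy_population = [[rng.uniform(-1, 1) for _ in range(3)] for _ in range(10)]


def toy_fitness(genome):
    return -sum(w * w for w in genome)


def toy_mutate(genome):
    # Returns a new list, so the parent is left untouched.
    return [w + rng.gauss(0, 0.1) for w in genome]


next_population = refill_from_top_n(
    toy_population, toy_fitness, toy_mutate, top_n=3, population_size=10)
```

In the library itself the same logic would presumably live in a mutator's call on a population; the standalone function keeps the selection rule visible.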

Add Interface for Mutator Class

Mutator Class should a) be an interface

from src import Population
from src import Mutator
from src import generate_neat_metric
from src import Model

class CustomMutator(Mutator):
    def mutate(self, genome):
        # do something to genome
        pass

    def crossover(self, genome):
        # crossover genome
        pass

custom_mutator = CustomMutator()
# ... build and evaluate a population here ...
custom_mutator(population)

Questions:

  1. How do we consolidate the NEAT method of selecting weights and the REINFORCE-ES method?

Improve from_genes implementation

Currently the datastore saves the genome structure as all nodes and edges, including input and output nodes, but the from_genes hydrator factory expects num_inputs and num_outputs, and expects the node and edge lists passed to it to contain only hidden nodes. Hence, to use from_genes with the datastore we have to do:

generation = ds.load(generation_in)
nodes, edges = generation['best_genome']
input_num = len([n for n in nodes if n[4] == 'input'])
output_num = len([n for n in nodes if n[4] == 'output'])
nodes = [n for n in nodes if n[4] == 'hidden']
genome = from_genes(
    nodes, edges,
    input_size=input_num,
    output_size=output_num,
    weight_low=-2,
    weight_high=2,
    depth=len(LAYER_DIMS))

Similarly, computing the number of layers is not trivial; it should just be a value stored in some metadata, either on the generation or on the DataStore class itself.
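
Until from_genes handles this itself, the filtering in the snippet above could be wrapped in a small helper. This is only a sketch: `split_saved_nodes` is a hypothetical name, and it assumes, as the snippet does, that each saved node is a sequence whose element at index 4 is the node type. The other fields in the toy records are placeholders, not the real DataStore layout.

```python
def split_saved_nodes(nodes, type_index=4):
    """Split saved node records into (input_size, output_size, hidden_nodes).

    Assumption: the node type ('input', 'hidden' or 'output') is stored at
    position `type_index` of each record, as in the snippet above.
    """
    input_size = sum(1 for n in nodes if n[type_index] == 'input')
    output_size = sum(1 for n in nodes if n[type_index] == 'output')
    hidden_nodes = [n for n in nodes if n[type_index] == 'hidden']
    return input_size, output_size, hidden_nodes


# Toy records: only position 4 matters here, the rest are placeholders.
toy_nodes = [
    (0, 0, 0, 0.0, 'input'),
    (0, 1, 1, 0.0, 'input'),
    (1, 0, 2, 0.5, 'hidden'),
    (2, 0, 3, 0.0, 'output'),
]
input_size, output_size, hidden_nodes = split_saved_nodes(toy_nodes)
```

The three return values line up with the input_size, output_size and nodes arguments passed to from_genes above; the depth still has to come from elsewhere.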

Add Adaptive REINFORCE-ES

Same as REINFORCE-ES #19, but we select the highest-performing solution from a population that lies along the derivative vector rather than taking a fixed step size each time.
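
One way to read this idea, sketched in plain Python rather than against GeReL's RESMutator: estimate the search gradient from Gaussian perturbations, then score a handful of candidates placed along that direction and keep the best. The function `adaptive_es_step`, its parameters, and the choice of step sizes are all hypothetical. The sketch also assumes the current mean is kept as a candidate, so a step never lowers fitness when fitness is deterministic.

```python
import random


def adaptive_es_step(mu, fitness, std_dev=0.1, population_size=50,
                     step_sizes=(0.01, 0.05, 0.1, 0.5, 1.0), rng=None):
    """Return the best candidate along the estimated gradient direction."""
    rng = rng or random.Random()
    dim = len(mu)
    # Sample Gaussian perturbations and score each perturbed parameter vector.
    noises = [[rng.gauss(0, 1) for _ in range(dim)]
              for _ in range(population_size)]
    scores = [fitness([m + std_dev * e for m, e in zip(mu, eps)])
              for eps in noises]
    mean_score = sum(scores) / population_size
    # Search-gradient estimate: noise weighted by mean-centred fitness.
    grad = [
        sum((s - mean_score) * eps[i] for s, eps in zip(scores, noises))
        / (population_size * std_dev)
        for i in range(dim)
    ]
    # Candidates along the gradient; the current mu acts as a zero-length step.
    candidates = [list(mu)] + [
        [m + alpha * g for m, g in zip(mu, grad)] for alpha in step_sizes
    ]
    return max(candidates, key=fitness)


# Toy usage: maximise a concave function whose optimum is at (1, 1).
def toy_fitness(params):
    return -sum((x - 1.0) ** 2 for x in params)


mu = [0.0, 0.0]
new_mu = adaptive_es_step(mu, toy_fitness, rng=random.Random(0))
```

Each candidate costs one extra fitness evaluation, so in an RL setting the number of step sizes trades rollout time against how finely the direction is searched.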

Mutator class should act on Population objects

Mutator Class should act on populations. This would look like:

from src import Population
from src.Neat import NEATMutator
from src import generate_neat_metric
from src import Model

def compute_fitness(genome):
    model = Model(genome)
    # compute fitness of model here
    return fitness

mutator = NEATMutator()
metric = generate_neat_metric()
population = Population(metric=metric)

for i in range(10):
    for genome in population.genomes:
        genome.fitness = compute_fitness(genome.to_reduced_repr)
    population.speciate()
    mutator(population)

Pretrain Network

Add some functionality to pretrain a network to fit certain initial conditions. This is to solve the problem where all the networks within a given population end up exhibiting similar behaviours.

Add species object

Currently we're using a dict to store the data on each species. This should be abstracted into a class because, at the very least, it will enable code completion.
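
A sketch of what such a class might look like, using a dataclass. The field names (`key`, `representative`, `genomes`) are guesses for illustration, not the keys of the existing dict, and genomes are only assumed to expose a `fitness` attribute.

```python
from dataclasses import dataclass, field
from types import SimpleNamespace
from typing import Any, List


@dataclass
class Species:
    """Hypothetical container replacing the per-species dict."""
    key: int
    representative: Any
    genomes: List[Any] = field(default_factory=list)

    def add(self, genome: Any) -> None:
        self.genomes.append(genome)

    @property
    def size(self) -> int:
        return len(self.genomes)

    @property
    def mean_fitness(self) -> float:
        # An empty species reports 0.0 rather than raising ZeroDivisionError.
        if not self.genomes:
            return 0.0
        return sum(g.fitness for g in self.genomes) / len(self.genomes)


# Toy usage with stand-in genome objects that only carry a fitness.
first = SimpleNamespace(fitness=1.0)
second = SimpleNamespace(fitness=3.0)
species = Species(key=0, representative=first)
species.add(first)
species.add(second)
```

Typed fields give editors something to complete against, and derived values such as the mean fitness get a single definition instead of being recomputed wherever the dict is read.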

Integrate Multiprocessing docs and tool

The aspect of evolutionary algorithms applied to RL that benefits from multiprocessing is really the environment simulation step, which has to be performed for each genome in the population. This is outside the scope of what this library tries to do, but...

  • Should add examples in the docs.
  • Should have a simple tool that makes it easy out of the box.
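
As a starting point for such an example, the evaluation loop can be handed to a standard-library executor. This is a sketch under assumptions: `evaluate_population` is a hypothetical helper, not part of GeReL, and it assumes both the fitness function and whatever is passed to it (for instance a genome's reduced representation) can be pickled. The toy usage substitutes the builtin `sum` for a real environment rollout.

```python
from concurrent.futures import Executor, ProcessPoolExecutor


def evaluate_population(genomes, compute_fitness, executor: Executor):
    """Return one fitness value per genome, computed through `executor`."""
    # Executor.map preserves input order, so results line up with `genomes`.
    return list(executor.map(compute_fitness, genomes))


if __name__ == "__main__":
    # Stand-in genomes and fitness; a real run would pass an environment
    # rollout function defined at module level so that it can be pickled.
    toy_genomes = [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]]
    with ProcessPoolExecutor(max_workers=2) as pool:
        fitnesses = evaluate_population(toy_genomes, sum, pool)
    print(fitnesses)
```

Taking the executor as an argument keeps the helper independent of how workers are created, so a ThreadPoolExecutor can be swapped in when debugging.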

Make factory function for different genome classes

Genome.default() builds a very simple Genome with one hidden node. The assumption is that we use it in the NEAT algorithm.

It should really be called Genome.default_neat(), and at some point we should implement other factories...

Rename from_genes

from_genes should probably be renamed to genome_from_genes and added to the Genome class as a static method. The same is true of the other factory methods.

Make Population Class Interface

The Population class should be an interface that leaves the speciate method to be implemented by subclasses.

This looks like:

class CustomPopulation(Population):
    def speciate(self):
        for genome in self.genomes:
            # do something here
            pass

p = CustomPopulation(metric=NEAT_metric())
