
vocdevshy / q-learning_frozen_lake


Built with the Gymnasium package from the Farama Foundation, this project is a hyper-detailed implementation of Q-Learning reinforcement learning on the Frozen Lake game.

Python 99.45% Nix 0.55%
agent ai artificial-intelligence detailed frozenlake gym gymnasium machine-learning python q-learning

q-learning_frozen_lake's Introduction


This project was made for study purposes, so the code may contain some errors.
(If you would like to help the project, there is a list in the "Bug List" file in the doc folder.)

This project was built with Gymnasium from the Farama Foundation, a Python library for reinforcement learning, which includes Q-Learning.
(To see what Gymnasium is, visit its GitHub page.)

For more information about Q-Learning and the Frozen Lake game, read the Medium article "Q-Learning for Beginners" by Maxime Labonne; it helped me a lot to understand how Q-Learning works.

Welcome to one of the most ultra-detailed versions of the
Frozen-Lake Q-Learning project
Ver. 2.1.0


About

As its name suggests, the project is an ultra-detailed version of the Frozen-Lake Q-Learning project.
This program trains an agent on the Frozen-Lake game for a number of episodes that the user enters when the program starts. Training uses the Exploration X Exploitation method: the agent explores the environment, but it also uses the Q-Table as it is being updated, so that the final Q-Table is better.
After training, the program lets the user test the resulting Q-Table.
During both training and testing, a lot of data is printed in detail to the console.
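The balance between exploration and exploitation described above is often implemented as an epsilon-greedy rule. The sketch below assumes that approach; the function name choose_action, the epsilon parameter, and the sample Q-values are illustrative and not taken from the project's code.

```python
import random

def choose_action(q_row, epsilon, rng=random):
    """Pick an action index from one row of the Q-table (epsilon-greedy)."""
    if rng.random() < epsilon:
        # Explore: pick any action uniformly at random.
        return rng.randrange(len(q_row))
    # Exploit: pick the action with the highest current Q-value.
    return max(range(len(q_row)), key=lambda a: q_row[a])

# Hypothetical Q-values for LEFT, DOWN, RIGHT, UP in one state.
q_row = [0.0, 0.2, 0.9, 0.1]

greedy_action = choose_action(q_row, epsilon=0.0)  # never explores
random_action = choose_action(q_row, epsilon=1.0)  # always explores
```

With epsilon at 0 the agent only exploits what it has learned; with epsilon at 1 it only explores. Training loops commonly start high and lower epsilon over the episodes.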

Packages

To run the project correctly, you need the following packages:

  1. gymnasium (toy-text): pip install "gymnasium[toy-text]"
  2. matplotlib.pyplot: pip install matplotlib
  3. numpy: pip install numpy
  4. pygame: pip install pygame
  5. time: part of the Python standard library, no installation needed
  6. warnings: part of the Python standard library, no installation needed (optional, only used to hide an error)
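One way to check that the third-party packages are available before launching the program is sketched below. It only looks up the import names and is not part of the project; the names in the required list are the import names assumed for the packages above.

```python
import importlib.util

# Import names of the third-party packages listed above (assumed).
required = ["gymnasium", "matplotlib", "numpy", "pygame"]

# find_spec returns None when a top-level module cannot be found.
missing = [name for name in required if importlib.util.find_spec(name) is None]

if missing:
    print("Missing packages:", ", ".join(missing))
else:
    print("All required packages are installed.")
```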

Obtainable data

  • nb_success: Used in the formula nb_success / episodes * 100 to calculate the success rate of the training and of the test
  • best_sequence: List of states in the best (shortest) episode that reaches the goal
  • longest_best_sequence: List of states in the longest episode that reaches the goal
  • longest_sequence: List of states in the longest episode that doesn't reach the goal
  • shortest_sequence: List of states in the shortest episode that doesn't reach the goal
    (All sequences are shown in both the input format (0, 1, 2, 3) and the word format (LEFT, DOWN, RIGHT, UP))
  • reward_counter: Number of times the agent obtained the reward
  • reward_episode: List of the episodes in which the agent obtained the reward
  • reward_sequence: List of the states in the episodes in which the agent obtained the reward
  • recurent_sequence: Number of episodes in which the agent reached the goal using the same sequence as the best sequence
  • total_actions: Total number of actions in the episodes where the agent reached the goal
  • action_counts[action_words[action]]: Number of actions per action type (LEFT, DOWN, RIGHT, UP)
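As an illustration of how a few of these values could be derived, here is a minimal sketch. It assumes each episode is logged as a list of action codes plus a flag saying whether the goal was reached, and that action_counts covers successful episodes only; the episodes list is made-up sample data, not output from the project.

```python
from collections import Counter

ACTION_WORDS = ["LEFT", "DOWN", "RIGHT", "UP"]  # index = action code 0..3

# Hypothetical log: (list of action codes, whether the goal was reached).
episodes = [
    ([1, 1, 2, 2, 1, 2], True),
    ([2, 2, 1], False),
    ([1, 1, 2, 1, 2, 2, 0, 2], True),
    ([0], False),
]

nb_success = sum(1 for _, reached in episodes if reached)
success_rate = nb_success / len(episodes) * 100

# Keep only the episodes that reached the goal.
successful = [actions for actions, reached in episodes if reached]
best_sequence = min(successful, key=len)
longest_best_sequence = max(successful, key=len)
total_actions = sum(len(actions) for actions in successful)

# Count actions per type over the successful episodes.
action_counts = Counter(ACTION_WORDS[a] for actions in successful for a in actions)

# The same sequence in word format.
best_sequence_words = [ACTION_WORDS[a] for a in best_sequence]
```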

Tools

Maps

  • 2x2 map
  • 4x4 map
  • 8x8 map
  • 16x16 map
    (The predefined maps and the randomly generated ones are listed in the map.txt file in the tools folder.)
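For reference, Frozen Lake maps are usually written as a list of strings, one per row, and states are numbered row by row (state = row * number_of_columns + column). The sketch below uses the default 4x4 layout shipped with Gymnasium; the helper names state_to_cell and tile_at are illustrative, and the project's own maps in map.txt may differ.

```python
# Default Gymnasium 4x4 layout: S = start, F = frozen, H = hole, G = goal.
MAP_4X4 = ["SFFF", "FHFH", "FFFH", "HFFG"]

def state_to_cell(state, n_cols):
    """Convert a state number into (row, column) coordinates."""
    return divmod(state, n_cols)

def tile_at(desc, state):
    """Return the tile letter found at the given state number."""
    row, col = state_to_cell(state, len(desc[0]))
    return desc[row][col]

start_tile = tile_at(MAP_4X4, 0)   # top-left cell
hole_tile = tile_at(MAP_4X4, 5)    # row 1, column 1
goal_tile = tile_at(MAP_4X4, 15)   # bottom-right cell
```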

Q-Injection

Q-Injection is a feature for testing Q-Tables such as:

  • Randomized Q-Table
  • Trained Q-Table (obtained from a training run done by our team)
  • The start of a trained Q-Table (Three Value)

It can also train them further to obtain better results using the Exploration X Exploitation method.
(For more information about Q-Injection, read the injection.md file in the tools folder.)
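The general idea of starting from a prepared Q-Table instead of an empty one can be sketched as follows. This is not the project's Q-Injection code (see injection.md for that); make_qtable, its mode parameter, and greedy_policy are hypothetical names, and plain lists are used instead of NumPy arrays so the example has no dependencies.

```python
import random

def make_qtable(n_states, n_actions, mode="zeros", seed=None):
    """Build a starting Q-table: all zeros, or random values in [0, 1)."""
    rng = random.Random(seed)
    if mode == "zeros":
        return [[0.0] * n_actions for _ in range(n_states)]
    if mode == "random":
        return [[rng.random() for _ in range(n_actions)] for _ in range(n_states)]
    raise ValueError(f"unknown mode: {mode!r}")

def greedy_policy(qtable):
    """For each state, return the action with the highest Q-value."""
    return [max(range(len(row)), key=lambda a: row[a]) for row in qtable]

# A 4x4 map has 16 states and 4 actions per state.
injected = make_qtable(16, 4, mode="random", seed=0)
policy = greedy_policy(injected)
```

The injected table can then be tested directly with its greedy policy, or passed to the training loop as the starting point.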

For those interested in how the Q-Table is calculated, here is an explanation:
(I hope it helps you understand Q-Learning.)

qtable[state, action] = qtable[state, action] + alpha * (reward + gamma * np.max(qtable[next_state, :]) - qtable[state, action])
  • qtable[state, action]: The current value of the action (0, 1, 2, 3, meaning LEFT, DOWN, RIGHT, UP) in the state (the number of the tile) in the Q-table. This is the value we will update.
  • alpha: The learning rate. It controls how strongly new information is integrated into the old values of the Q-table. A high value means that new information has a greater impact on existing values, while a low value means it has a lesser impact.
  • reward: The immediate reward obtained after taking the action in the state. In Frozen Lake it is 1.0 when the agent reaches the goal and 0 otherwise.
  • gamma: The discount factor. It represents the importance of future rewards compared to immediate rewards. A gamma close to 1 gives great importance to future rewards, while a gamma close to 0 makes the agent consider almost only the immediate reward.
  • np.max(qtable[next_state, :]): The maximum value among all possible actions in the next state (next_state). It is the best estimate of the future value that the agent can obtain from the next state.
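A small worked example of this update is sketched below. It uses plain Python lists instead of a NumPy array so it runs without dependencies; the values alpha=0.5 and gamma=0.9 and the chosen states are illustrative, not the project's settings.

```python
def q_update(qtable, state, action, reward, next_state, alpha, gamma):
    """Apply one Q-learning update in place and return the new value."""
    best_next = max(qtable[next_state])       # np.max(qtable[next_state, :])
    target = reward + gamma * best_next       # reward plus discounted future value
    qtable[state][action] += alpha * (target - qtable[state][action])
    return qtable[state][action]

# 16 states x 4 actions, all zeros (a 4x4 map before training).
qtable = [[0.0] * 4 for _ in range(16)]

# Moving RIGHT (2) from state 14 reaches the goal (state 15): reward 1.0.
v1 = q_update(qtable, 14, 2, 1.0, 15, alpha=0.5, gamma=0.9)

# Moving RIGHT (2) from state 13 to state 14 gives no reward,
# but state 14 now has a non-zero value that is propagated back.
v2 = q_update(qtable, 13, 2, 0.0, 14, alpha=0.5, gamma=0.9)

print(v1, v2)
```

The first update gives 0 + 0.5 * (1.0 + 0.9 * 0 - 0) = 0.5. The second gives 0 + 0.5 * (0 + 0.9 * 0.5 - 0) = 0.225, which shows how the reward at the goal spreads backwards to earlier states.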


q-learning_frozen_lake's Issues

Ideas

Hi, I'm VOCDevShy. As my name suggests, I'm a dev from VOC.inc (Virtual Online Chatbot).

I have been working on the Q-Learning Frozen Lake project for 3 months.

Today we need you to tell us which features you would like to see in the future of the project.
Our dev team and I have finished working on version 2.0.0 of the project, which adds a new feature (Q-Injection), so the team is now looking for new ideas to implement in the code. We need to know what the community wants for the future of the project.

If you have ideas for the project, please follow these rules:

  • Be clear about what your ideas are.
  • Don't propose data or features that already exist.
  • Obtainable data that closely resembles existing data is not accepted.
  • Features need to be relevant to the project (proposals like "Add a dancing cat" or "add a webview" are not accepted)

We hope you will help our dev team add more features or obtainable data to the project in the future!

VOC.inc
