Video Demo: CS50P. Windy Grid
Description: The project is a turn-based game on a 7x10 grid with an element of stochasticity, in which one player is an AI agent and the other is a human.
- find an optimal path in a grid environment with an element of stochasticity
- implementation of the Expected SARSA algorithm, inspired by the paper "Theoretical and Empirical Analysis of Expected Sarsa"
- for educational purposes, the Q-value function is approximated by a neural network with no hidden layers instead of being stored in a table
- ability to train your own agent by adjusting parameters such as the learning rate, the number of episodes, and the ratio between exploration and exploitation
- modification of the Windy Gridworld environment that allows a human to play with the trained agent
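The Expected SARSA update with a network that has no hidden layers can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: it assumes one-hot state features, a linear output (one weight row per action), an epsilon-greedy policy, and four actions. All names (`features`, `q_values`, `policy_probs`, `expected_sarsa_update`) and the default parameter values are hypothetical.

```python
# Assumed sizes: 7x10 grid, four actions (hypothetical action set).
N_ROWS, N_COLS, N_ACTIONS = 7, 10, 4


def features(state):
    """One-hot encoding of a (row, col) cell (an assumed feature choice)."""
    row, col = state
    x = [0.0] * (N_ROWS * N_COLS)
    x[row * N_COLS + col] = 1.0
    return x


def q_values(W, x):
    """Linear Q-function: one weight row per action, Q(s, a) = W[a] . x(s)."""
    return [sum(w * xi for w, xi in zip(W[a], x)) for a in range(N_ACTIONS)]


def policy_probs(q, epsilon):
    """Action probabilities of an epsilon-greedy policy over the values q."""
    n = len(q)
    best = max(range(n), key=lambda a: q[a])
    probs = [epsilon / n] * n
    probs[best] += 1.0 - epsilon
    return probs


def expected_sarsa_update(W, state, action, reward, next_state, done,
                          alpha=0.5, gamma=1.0, epsilon=0.1):
    """Apply one Expected SARSA step to W in place and return the TD error."""
    x = features(state)
    q_sa = q_values(W, x)[action]
    if done:
        target = reward
    else:
        q_next = q_values(W, features(next_state))
        probs = policy_probs(q_next, epsilon)
        # Expectation over next actions instead of a single sampled action.
        target = reward + gamma * sum(p * q for p, q in zip(probs, q_next))
    td_error = target - q_sa
    # For a linear model the gradient of Q(s, a) w.r.t. W[action] is x.
    for i, xi in enumerate(x):
        W[action][i] += alpha * td_error * xi
    return td_error
```

The only difference from plain SARSA is the target: the next-state value is averaged over the policy's action probabilities rather than taken from the action that was actually sampled.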
📦project
┣ 📜README.md
┣ 📜W.pickle
┣ 📜main_menu.png
┣ 📜players.py
┣ 📜project.py
┣ 📜requirements.txt
┣ 📜test_project.py
┗ 📜windy.py
project.py contains the main menu, which lets you navigate between the following options:
- Start a new game
- Train an agent. Keep in mind that training a new agent from scratch might take some time, but you will be able to track the progress.
- Game rules
- Exit
Moreover, project.py provides functions to store the pre-trained agent in, and load it from, the W.pickle file, so that the agent can play with a human player at a good enough level. Every time you start a game, the trained agent is loaded from this file.
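Storing and loading the weights could look roughly like this. It is a sketch that assumes the weights are a plain Python object that `pickle` can serialize; the function names `save_weights` and `load_weights` are hypothetical and may differ from the ones in project.py.

```python
import pickle


def save_weights(W, path="W.pickle"):
    """Serialize the trained weights to disk."""
    with open(path, "wb") as f:
        pickle.dump(W, f)


def load_weights(path="W.pickle"):
    """Read previously trained weights back from disk."""
    with open(path, "rb") as f:
        return pickle.load(f)
```

Since unpickling can execute arbitrary code, only load a W.pickle file that you created yourself or that comes from a source you trust.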
players.py defines the base class Player and its subclasses Human and Agent. They implement the logic of picking actions, updating the state on the grid, and updating the weights of the neural network approximation of the Q-value function according to the Expected SARSA algorithm.
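One possible shape for this class hierarchy is sketched below. Only the class names Player, Human and Agent come from the project; the method names, constructor arguments, the `ACTIONS` mapping and the injected `read` and `q_function` callables are assumptions made for illustration.

```python
import random

# Hypothetical action set: name -> (row delta, column delta).
ACTIONS = {"up": (-1, 0), "down": (1, 0), "left": (0, -1), "right": (0, 1)}


class Player:
    """Common state shared by both kinds of player."""

    def __init__(self, name, position):
        self.name = name
        self.position = position

    def pick_action(self):
        raise NotImplementedError

    def move_to(self, position):
        self.position = position


class Human(Player):
    """Asks for a move until a valid action name is entered."""

    def __init__(self, name, position, read=input):
        super().__init__(name, position)
        self._read = read  # injectable so the class can be tested

    def pick_action(self):
        while True:
            choice = self._read("Move (up/down/left/right): ").strip().lower()
            if choice in ACTIONS:
                return choice


class Agent(Player):
    """Picks actions epsilon-greedily from an estimated Q-function."""

    def __init__(self, name, position, q_function, epsilon=0.0, rng=None):
        super().__init__(name, position)
        self.q_function = q_function  # maps position -> {action: value}
        self.epsilon = epsilon
        self.rng = rng or random.Random()

    def pick_action(self):
        if self.rng.random() < self.epsilon:
            return self.rng.choice(list(ACTIONS))  # explore
        q = self.q_function(self.position)
        return max(q, key=q.get)  # exploit
```

With `epsilon=0.0` the agent always plays its greedy action, which is the natural setting when it plays a human after training.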
windy.py introduces the gridworld environment and exposes a render function to display the grid and the player positions.
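A rough sketch of such an environment is shown below. It assumes the wind strengths, start-free layout and goal cell of the textbook Windy Gridworld (Sutton and Barto), and it assumes that the stochasticity is a random change of -1, 0 or +1 in the wind of windy columns. The project's own modification may differ, and the names `step` and `render` with these signatures are hypothetical.

```python
import random

N_ROWS, N_COLS = 7, 10
# Upward wind strength per column (textbook values, assumed here).
WIND = [0, 0, 0, 1, 1, 1, 2, 2, 1, 0]
MOVES = {"up": (-1, 0), "down": (1, 0), "left": (0, -1), "right": (0, 1)}


def step(position, action, rng=random, stochastic=True):
    """Return the new (row, col) after moving and being pushed by the wind."""
    row, col = position
    d_row, d_col = MOVES[action]
    wind = WIND[col]
    if stochastic and wind > 0:
        wind += rng.choice([-1, 0, 1])  # assumed noise model
    # Row 0 is the top of the grid, so wind decreases the row index.
    new_row = min(max(row + d_row - wind, 0), N_ROWS - 1)
    new_col = min(max(col + d_col, 0), N_COLS - 1)
    return (new_row, new_col)


def render(players, goal=(3, 7)):
    """Build a text picture of the grid; players maps a symbol to a cell."""
    lines = []
    for r in range(N_ROWS):
        cells = []
        for c in range(N_COLS):
            symbol = "G" if (r, c) == goal else "."
            for mark, pos in players.items():
                if pos == (r, c):
                    symbol = mark
            cells.append(symbol)
        lines.append(" ".join(cells))
    lines.append(" ".join(str(w) for w in WIND))  # wind strengths as footer
    return "\n".join(lines)
```

For example, `print(render({"A": (3, 0), "H": (0, 0)}))` draws the agent and the human on the grid, with the goal marked `G` and the wind strength of each column on the last line.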