A reinforcement learning program that uses policy iteration to determine the optimal policy for any Markov decision problem that can be represented as a 2D matrix.
The user can specify any value function (represented as a 2D matrix), any reinforcement values (which can differ between states), any terminal state locations, any gamma value (the discount factor, which sets how heavily future rewards are weighted), and the number of policy evaluation iterations to run. The program uses these values to apply policy iteration to the value-function table and prints the table at each iteration.
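As a rough illustration of what one round of policy evaluation on such a table can look like, here is a minimal sketch. It is not the code in policyEval.py. It assumes an equiprobable random policy over four moves (up, down, left, right), a per-state reward received on leaving a state, and that moving off the grid leaves the agent where it is. The function name `evaluate_policy` and all of its parameters are hypothetical.

```python
def evaluate_policy(values, rewards, terminals, gamma, iterations):
    """Iterative policy evaluation for an equiprobable random policy.

    Assumptions (not taken from policyEval.py): four moves, a reward
    received on leaving each state, and walls that bounce the agent back.
    """
    rows, cols = len(values), len(values[0])
    moves = [(-1, 0), (1, 0), (0, -1), (0, 1)]  # up, down, left, right
    for _ in range(iterations):
        # Synchronous sweep: read from the old table, write to a copy.
        new_values = [row[:] for row in values]
        for r in range(rows):
            for c in range(cols):
                if (r, c) in terminals:
                    continue  # terminal states keep their value
                total = 0.0
                for dr, dc in moves:
                    nr, nc = r + dr, c + dc
                    if not (0 <= nr < rows and 0 <= nc < cols):
                        nr, nc = r, c  # off-grid move: stay in place
                    total += 0.25 * (rewards[r][c] + gamma * values[nr][nc])
                new_values[r][c] = total
        values = new_values
    return values


# Example: a 4x4 grid, reward -1 everywhere, terminals in two corners.
grid = [[0.0] * 4 for _ in range(4)]
rewards = [[-1.0] * 4 for _ in range(4)]
result = evaluate_policy(grid, rewards, {(0, 0), (3, 3)}, gamma=1.0, iterations=2)
for row in result:
    print(row)
```

Full policy iteration would alternate sweeps like this with a policy improvement step that makes the policy greedy with respect to the updated table. The sketch covers only the evaluation half.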
Run policyEval.py. You will be prompted to enter the information needed to create the initial value-function table. If you enter a value incorrectly, you will have the option to restart.
*Initially created for my CSC-261: Artificial Intelligence course.