An introduction to Model-Based (RMax) and Model-Free (Q-Learning and SARSA) Reinforcement Learning techniques
Imagine you and your friends are throwing a frisbee on a cold January afternoon when someone throws it just a little too hard and it lands on the frozen lake nearby! Your parents will be so angry if you lose that frisbee, so you HAVE to get it.
You walk up to the lake and step onto it. You realize that it's slippery everywhere. You can choose to go forward, left, or right. But because it's slippery, you only perform the action you meant to do about 33% of the time! If you try to go forward, you have a one-in-three chance of going forward, a one-in-three chance of going left, and a one-in-three chance of going right. You want to find the best way to get to the frisbee without falling into any holes in the ice.
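To make the slipping concrete, here is a minimal Python sketch of how a single step on the ice could be sampled. The names `ACTIONS`, `slip_distribution`, and `slip` are made up for illustration, and the sketch assumes the same one-in-three split applies when you try to go left or right (the story only spells out the forward case).

```python
import random
from fractions import Fraction

# The three moves available on the ice (names are illustrative).
ACTIONS = ("forward", "left", "right")


def slip_distribution(intended):
    """Return the probability of each move actually happening.

    Assumption: whichever move is intended, each of the three moves
    happens with probability 1/3, as in the forward example.
    """
    if intended not in ACTIONS:
        raise ValueError(f"unknown action: {intended!r}")
    return {action: Fraction(1, 3) for action in ACTIONS}


def slip(intended, rng):
    """Sample the move that actually happens when `intended` is attempted."""
    dist = slip_distribution(intended)
    outcomes = list(dist)
    weights = [float(p) for p in dist.values()]
    return rng.choices(outcomes, weights=weights, k=1)[0]


if __name__ == "__main__":
    rng = random.Random(0)
    # Try to go forward five times and see where the ice sends you.
    print([slip("forward", rng) for _ in range(5)])
```

Running it a few times with different seeds shows why planning is hard here: the move you pick tells you very little about where you end up.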
We can solve this problem with Reinforcement Learning techniques by modeling it as a Markov Decision Process (MDP).
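As a rough sketch of what that modeling step involves, an MDP is commonly written as a tuple of states, actions, transition probabilities, rewards, and a discount factor. The symbols below are standard textbook notation rather than anything defined in this article. Taking the states to be positions on the lake is an assumption, and the reward $R$ and discount $\gamma$ are left unspecified because the story does not pin them down.

```latex
\[
\mathcal{M} = (S, A, P, R, \gamma), \qquad
A = \{\text{forward}, \text{left}, \text{right}\}
\]
\[
P(s' \mid s, a) = \Pr(s_{t+1} = s' \mid s_t = s,\ a_t = a), \qquad
\sum_{s' \in S} P(s' \mid s, a) = 1
\]
```

The "Markov" part means the next position depends only on where you are standing now and the move you attempt, not on how you got there.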
RMax
Q-Learning
SARSA