This is the code for this video on YouTube by Siraj Raval on Sensor Networks. We can imagine the sensors as a grid, then use the Bellman equation to compute the optimal policy for getting from Router A to Router B as efficiently as possible. The repository contains implementations of MDP value iteration, MDP policy iteration, and Q-Learning in a toy grid-world setting.
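
To make the idea concrete, here is a minimal sketch of value iteration on a small grid. It is not taken from RL.py. The 4x4 grid size, the positions of Router A and Router B, the reward of -1 per hop, the discount factor of 0.9, and all function names are assumptions chosen for illustration. Each sweep applies the Bellman optimality update V(s) = max over a of [r + gamma * V(s')] until the values stop changing, then a greedy policy is read off the result.

```python
# Illustrative sketch only; every name and constant here is an assumption,
# not something defined in RL.py.
ROWS, COLS = 4, 4
START = (0, 0)        # assumed position of "Router A"
GOAL = (3, 3)         # assumed position of "Router B"
GAMMA = 0.9           # discount factor (assumed)
STEP_REWARD = -1.0    # cost of one hop between neighbouring sensors (assumed)
ACTIONS = {"up": (-1, 0), "down": (1, 0), "left": (0, -1), "right": (0, 1)}


def step(state, action):
    """Deterministic move; stepping off the grid leaves the state unchanged."""
    dr, dc = ACTIONS[action]
    r, c = state[0] + dr, state[1] + dc
    if 0 <= r < ROWS and 0 <= c < COLS:
        return (r, c)
    return state


def value_iteration(tol=1e-6):
    """Sweep the Bellman optimality update until the largest change is below tol."""
    values = {(r, c): 0.0 for r in range(ROWS) for c in range(COLS)}
    while True:
        delta = 0.0
        for s in values:
            if s == GOAL:
                continue  # terminal state keeps value 0
            best = max(STEP_REWARD + GAMMA * values[step(s, a)] for a in ACTIONS)
            delta = max(delta, abs(best - values[s]))
            values[s] = best
        if delta < tol:
            return values


def greedy_policy(values):
    """Pick, in each non-terminal state, the action with the best one-step lookahead."""
    return {
        s: max(ACTIONS, key=lambda a: STEP_REWARD + GAMMA * values[step(s, a)])
        for s in values
        if s != GOAL
    }


def rollout(policy, start=START, max_steps=50):
    """Follow the policy from start and return the list of visited cells."""
    path = [start]
    while path[-1] != GOAL and len(path) <= max_steps:
        path.append(step(path[-1], policy[path[-1]]))
    return path


values = value_iteration()
policy = greedy_policy(values)
path = rollout(policy)
print(path)
```

Under these assumptions the printed path starts at Router A, ends at Router B, and takes six hops, which is the shortest route on a 4x4 grid without obstacles. Policy iteration and Q-Learning arrive at the same kind of policy by different means: the former alternates policy evaluation with greedy improvement, the latter learns action values from sampled transitions.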
- matplotlib
- OpenCV
- numpy
Install any missing dependencies using pip (the OpenCV package on PyPI is named opencv-python).
Run 'python RL.py' in a terminal to run the code.
Credits for this code go to kevlar1818. I've merely created a wrapper to get people started.