Objective: To find an optimal path between two points on a map.
An employee lives in Azadi (point 1 on the map) and wants to travel to the office in Tajrish (point 11 on the map). Because of traffic and other restrictions, the number of routes he can choose from is limited; the available routes are shown in the link. The fuel consumed and the time spent on a route depend on several factors, including the length of the path, the slope of the route, and the traffic volume.
The employee wants to know which path takes the least time and which path consumes the least fuel. In addition, the employee wants to minimize the cost function shown below:
C = Fuel + Time^2
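As a minimal sketch of how this cost could be evaluated for one candidate route, assuming Fuel and Time are totals summed over the edges of the route: the toy map, its edge values, and the name path_cost below are hypothetical and are not taken from the project or from the map in the link.

```python
# Hypothetical toy map: (from_node, to_node) -> (fuel, time).
# The numbers are made up for illustration only.
EDGES = {
    (1, 2): (2.0, 3.0),
    (2, 4): (1.0, 2.0),
    (1, 3): (4.0, 1.0),
    (3, 4): (2.0, 1.0),
}


def path_cost(path, edges=EDGES):
    """Return C = Fuel + Time^2 for a route given as a list of nodes.

    Assumption: Fuel and Time are the totals over all edges of the route.
    """
    steps = list(zip(path, path[1:]))
    fuel = sum(edges[step][0] for step in steps)
    time = sum(edges[step][1] for step in steps)
    return fuel + time ** 2


print(path_cost([1, 2, 4]))  # fuel 3, time 5
print(path_cost([1, 3, 4]))  # fuel 6, time 2
```

Under this reading the cost is not a simple sum of per-edge costs, because the time is squared after it is totalled. A route that uses more fuel can therefore still have the lower C if it is fast enough.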
Using the SARSA and Q-learning algorithms, the employee can find the best path in terms of fuel consumption and time. This project uses the networkx library and includes a MapBuilder class for building the map. An epsilon-greedy policy is not used in this work because it has difficulty converging to a single route by the end of the episode, whereas a Boltzmann policy with a high temperature converges to the best path.
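The following is a minimal sketch of tabular Q-learning with a Boltzmann (softmax) policy, not the project's implementation. It uses a hypothetical toy map stored in a plain dictionary instead of networkx and MapBuilder. All names (EDGES, train, greedy_path) and all parameter values (learning rate, temperature, number of episodes) are assumptions chosen for illustration. Because C squares the total time, the sketch assumes the state is the pair (node, elapsed time) and rewards each step with the negative increase in C, so that the rewards along a route sum to -C.

```python
import math
import random
from collections import defaultdict

# Hypothetical toy map: (from_node, to_node) -> (fuel, time). Made-up values.
EDGES = {
    (1, 2): (2.0, 3.0),
    (1, 3): (4.0, 1.0),
    (2, 3): (1.0, 1.0),
    (2, 4): (1.0, 2.0),
    (3, 4): (2.0, 1.0),
}
START, GOAL = 1, 4


def neighbors(node):
    """Nodes reachable from `node` in one step."""
    return [v for (u, v) in EDGES if u == node]


def boltzmann_choice(q_values, temperature, rng):
    """Sample an index with probability proportional to exp(Q / temperature)."""
    top = max(q_values)  # subtract the max for numerical stability
    weights = [math.exp((q - top) / temperature) for q in q_values]
    return rng.choices(range(len(q_values)), weights=weights, k=1)[0]


def train(episodes=2000, alpha=0.5, gamma=1.0, temperature=5.0, seed=0):
    """Tabular Q-learning; the state is (node, elapsed_time)."""
    rng = random.Random(seed)
    q = defaultdict(float)  # Q[(state, next_node)], initialised to 0
    for _ in range(episodes):
        node, elapsed = START, 0.0
        while node != GOAL:
            state = (node, elapsed)
            actions = neighbors(node)
            idx = boltzmann_choice([q[(state, a)] for a in actions],
                                   temperature, rng)
            nxt = actions[idx]
            fuel, dt = EDGES[(node, nxt)]
            # Negative increase in C = Fuel + Time^2 caused by this step.
            reward = -(fuel + (elapsed + dt) ** 2 - elapsed ** 2)
            next_state = (nxt, elapsed + dt)
            # Q-learning target: best action value in the next state
            # (0 at the goal, which has no outgoing edges).
            best_next = max((q[(next_state, a)] for a in neighbors(nxt)),
                            default=0.0)
            q[(state, nxt)] += alpha * (reward + gamma * best_next
                                        - q[(state, nxt)])
            node, elapsed = nxt, elapsed + dt
    return q


def greedy_path(q):
    """Follow the highest-valued action from START to GOAL."""
    node, elapsed, path = START, 0.0, [START]
    while node != GOAL:
        state = (node, elapsed)
        nxt = max(neighbors(node), key=lambda a: q[(state, a)])
        elapsed += EDGES[(node, nxt)][1]
        node = nxt
        path.append(node)
    return path


print(greedy_path(train()))
```

A SARSA variant would differ only in the update target: instead of the maximum over the next state's actions, it would use the Q-value of the action that the Boltzmann policy actually samples in the next state.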