This repository contains my assignment submissions for CS747: Foundations of Intelligent and Learning Agents, which I took in Autumn 2019 at the Indian Institute of Technology (IIT) Bombay, India.
- Assignment 1: Implementation of round-robin sampling, epsilon-greedy exploration, UCB, KL-UCB, and Thompson Sampling
- Assignment 2: Implementation of Linear Programming and Howard's Policy Iteration for finding an optimal policy for a given MDP
- Assignment 3: Estimation of the value function of a policy for a given MDP from a trajectory of the form state, action, reward, state, action, reward, …
- Assignment 4: Implementation of the Windy Gridworld task given as Example 6.5 by Sutton and Barto (2018)
- Project: Video Captioning using Policy Gradient Optimization, the final project for the course. More details can be found in the Project directory.
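
As a rough illustration of the Windy Gridworld task from Assignment 4, the environment dynamics described in Example 6.5 of Sutton and Barto (2018) could be sketched as follows. This is a minimal sketch based on the textbook's description, not the code in this repository: the names `WIND`, `START`, `GOAL`, `ACTIONS`, and `step` are hypothetical, and the sketch assumes the four standard moves, with the wind of the column the agent leaves applied to each move.

```python
# Minimal sketch of the Windy Gridworld dynamics (Sutton and Barto, Example 6.5).
# Illustrative only: all names here are assumptions, not this repository's code.

ROWS, COLS = 7, 10
WIND = [0, 0, 0, 1, 1, 1, 2, 2, 1, 0]  # upward wind strength for each column
START, GOAL = (3, 0), (3, 7)           # (row, col), with row 0 as the top row

# The four standard moves as (row delta, column delta).
ACTIONS = {"up": (-1, 0), "down": (1, 0), "left": (0, -1), "right": (0, 1)}


def step(state, action):
    """Apply one move plus the wind; return (next_state, reward, done)."""
    row, col = state
    d_row, d_col = ACTIONS[action]
    # The wind shifts the agent upward (toward row 0) by the strength of the
    # column it is leaving; the result is clamped to stay inside the grid.
    new_row = min(max(row + d_row - WIND[col], 0), ROWS - 1)
    new_col = min(max(col + d_col, 0), COLS - 1)
    next_state = (new_row, new_col)
    # Every step costs -1, so shorter paths to the goal earn a higher return.
    return next_state, -1, next_state == GOAL
```

Under these assumptions, `step((3, 3), "right")` returns `((2, 4), -1, False)`: the agent moves one column to the right and is pushed up one row by the wind in column 3.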
- Vamsi Krishna Reddy Satti - vamsi3
- Vighnesh Reddy Konda - scopegeneral
- Yaswanth Kumar Orru - yas777
I'm thankful to the course instructor, Prof. Shivaram Kalyanakrishnan, for the well-structured assignments.
This project is licensed under the MIT License - please see the LICENSE file for details.