reinforcement-learning-notes's Introduction

A constantly evolving list of Reinforcement Learning papers, notes, books etc.

Glossary:

  • 🚀 - state-of-the-art method in its domain at the time of the paper's publication.
  • ⭐ - valuable paper.

Domain Tags:

  • atari - Atari game (Atari).
  • doom - Doom game (Doom).
  • sc - Starcraft game (Starcraft).
  • nn - Neural Networks & Optimizers (NN).
  • go - Go game (Go).
  • table - Table games (Table).
  • robot - Real-robot applications (Robot).
  • loco - Real/Simulated robotic locomotion (MuJoCo, Roboschool etc).
  • maze - Mazes and Labyrinths (Maze).
  • Multi - Multi-agent learning.
  • Continuous - Methods with continuous action space support.
  • Planning - Complex planning problems.
  • Transfer - Transfer learning.
  • RTS - Real-Time Strategy video game.
  • FPS - First-Person Shooter video game.

Deep Reinforcement Learning

Year 2018

🚀 IMPALA: Scalable Distributed Deep-RL with Importance Weighted Actor-Learner Architectures

  • [arXiv], [pdf]
  • Espeholt et al.; DeepMind
  • atari maze Atari, Maze

⭐ One-Shot Imitation from Observing Humans via Domain-Adaptive Meta-Learning

  • [arXiv], [pdf]
  • Finn et al.; UC Berkeley
  • robot Robot, Meta-Learning

🚀 Regularized Evolution for Image Classifier Architecture Search

  • [arXiv], [pdf]
  • Real et al.; Google Brain
  • nn NN

Year 2017

🚀 Deep Neuroevolution: Genetic Algorithms Are a Competitive Alternative for Training Deep Neural Networks for Reinforcement Learning

  • [arXiv], [pdf]
  • Such et al.; Uber AI Labs
  • atari loco Atari, Locomotion, Continuous

Mastering Chess and Shogi by Self-Play with a General Reinforcement Learning Algorithm

  • [arXiv], [pdf]
  • Silver et al.; DeepMind
  • table Table

🚀 Rainbow: Combining Improvements in Deep Reinforcement Learning (DQN improvements combined)

  • [arXiv], [pdf]
  • Hessel et al.; DeepMind
  • atari Atari

⭐ Meta Learning Shared Hierarchies

One-Shot Visual Imitation Learning via Meta-Learning

  • [arXiv], [pdf]
  • Finn et al.; UC Berkeley, OpenAI
  • robot Robot, Continuous, Meta-Learning

⭐ Learning with Opponent-Learning Awareness (LOLA)

  • [arXiv], [pdf], [official blog post]
  • Foerster et al.; OpenAI, University of Oxford, UC Berkeley, Carnegie Mellon University
  • Multi

🚀 Scalable trust-region method for deep reinforcement learning using Kronecker-factored approximation (ACKTR, A2C)

  • [arXiv], [pdf]
  • Wu et al.; University of Toronto, New York University
  • atari loco Atari, Locomotion, Continuous

Neural Network Dynamics for Model-Based Deep Reinforcement Learning with Model-Free Fine-Tuning

🚀 Proximal Policy Optimization Algorithms (PPO)

🚀 Learning Transferable Architectures for Scalable Image Recognition

  • [arXiv], [pdf]
  • Zoph et al.; Google Brain
  • nn NN

⭐ Hybrid Reward Architecture for Reinforcement Learning (HRA)

  • [arXiv], [pdf]
  • van Seijen et al.; Microsoft Maluuba, McGill University
  • atari Atari

Parameter Space Noise for Exploration

  • [arXiv], [pdf]
  • Plappert et al.; OpenAI, Karlsruhe Institute of Technology
  • atari loco Atari, Locomotion, Continuous

🚀 Mastering the Game of Go without Human Knowledge (AlphaGo Zero)

Neural Optimizer Search with Reinforcement Learning

  • [pdf]
  • Bello et al.; Google Brain
  • nn NN

Asymmetric Actor Critic for Image-Based Robot Learning

Sim-to-Real Transfer of Robotic Control with Dynamics Randomization

A Deep Reinforcement Learning Chatbot

Learning model-based planning from scratch

⭐ Imagination-Augmented Agents for Deep Reinforcement Learning (I2As)

Distral: Robust Multitask Reinforcement Learning

  • [arXiv], [pdf]
  • Teh et al.; DeepMind
  • maze Maze, Transfer

Emergence of Locomotion Behaviours in Rich Environments

Programmable Agents

  • [arXiv], [pdf]
  • Denil et al.; DeepMind
  • loco Locomotion, Continuous

⭐ Evolution Strategies as a Scalable Alternative to Reinforcement Learning

  • [arXiv], [pdf]
  • Salimans et al.; OpenAI
  • atari Atari

Neural Episodic Control

  • [arXiv], [pdf]
  • Pritzel et al.; DeepMind
  • Brief Summary. The NEC agent is extremely data-efficient: the performance it reaches at 5 million frames is matched by DQN with Prioritized Replay only after 40 million frames. However, its final performance is still worse than that of other state-of-the-art agents.
  • atari Atari

Year 2016

The Predictron: End-To-End Learning and Planning

  • [arXiv], [pdf]
  • Silver et al.; DeepMind
  • maze Maze, Planning

RL2: Fast Reinforcement Learning via Slow Reinforcement Learning

  • [arXiv], [pdf]
  • Duan et al.; Berkeley, OpenAI
  • maze Maze, Meta-Learning

Neural Architecture Search with Reinforcement Learning

  • [arXiv], [pdf]
  • B. Zoph and Quoc V. Le; Google Brain; ICLR.
  • nn NN

Reinforcement Learning with unsupervised auxiliary tasks (UNREAL)

  • [arXiv], [pdf]
  • Jaderberg et al.; Google DeepMind
  • 📝 Notes
  • atari maze loco Atari, Maze, Locomotion, Continuous

🚀 Learning to act by predicting the future (VizDoom 2016 Full DM Winner)

  • [arXiv], [pdf]
  • Dosovitskiy, Koltun; Intel Labs
  • doom maze Doom, Maze, FPS

Multiagent Bidirectionally-Coordinated Nets for Learning to Play StarCraft Combat Games

  • [arXiv], [pdf]
  • Peng et al.; Alibaba Group, University College London
  • sc Starcraft, Multi

Playing FPS Games with Deep Reinforcement Learning (VizDoom 2016 Limited DM 2nd place)

  • [arXiv], [pdf]
  • Lample, Chaplot; Carnegie Mellon University
  • doom maze Doom, Maze, FPS

[RTS:SC] Episodic Exploration for Deep Deterministic Policies: An Application to StarCraft Micromanagement Tasks

  • [arXiv], [pdf]
  • Usunier et al.; Facebook AI Research
  • sc Starcraft

🚀 Asynchronous Methods for Deep Reinforcement Learning (A3C)

  • [arXiv], [pdf]
  • Mnih et al.; DeepMind
  • 📝 Notes
  • atari maze loco Atari, Maze, Locomotion, Continuous

Year 2015

⭐ Dueling Network Architectures for Deep Reinforcement Learning (Dueling DQN)

  • [arXiv], [pdf]
  • Wang et al.; DeepMind
  • atari Atari

Prioritized Experience Replay

⭐ Deep Reinforcement Learning with Double Q-learning (Double DQN)

  • [arXiv], [pdf]
  • van Hasselt et al.; DeepMind
  • atari Atari

High-dimensional continuous control using generalized advantage estimation

  • [arXiv], [pdf]
  • Schulman et al.; Berkeley
  • loco Locomotion, Continuous

⭐ Trust Region Policy Optimization (TRPO)

  • [arXiv], [pdf]
  • Schulman et al.; UC Berkeley
  • atari maze loco Atari, Maze, Locomotion, Continuous

🚀 Human-level control through deep reinforcement learning (DQN)

Mastering the game of Go with deep neural networks and tree search (AlphaGo)

  • [Nature], [reddit]
  • Silver et al.; DeepMind, Google
  • go table Go, Table

Year 2013

🚀 Playing Atari with Deep Reinforcement Learning (DQN)

  • [arXiv], [pdf]
  • Mnih et al.; DeepMind Technologies
  • atari Atari

Evolving Large-Scale Neural Networks for Vision-Based Reinforcement Learning

  • [pdf]
  • Koutnik et al.; IDSIA, USI-SUPSI

2012 and earlier

Horde: A Scalable Real-time Architecture for Learning Knowledge from Unsupervised Sensorimotor Interaction

  • [pdf]
  • Sutton et al. (2011); University of Alberta, McGill University
  • robot loco Robot, Locomotion

⭐ Policy Gradient Reinforcement Learning for Fast Quadrupedal Locomotion

  • [pdf]
  • Kohl and Stone (2004); The University of Texas at Austin
  • robot loco Robot, Locomotion

⭐ Autonomous helicopter flight via reinforcement learning

  • [pdf]
  • Ng et al. (2004); Stanford, Berkeley
  • robot Robot

⭐ Actor-Critic Algorithms

  • [pdf]
  • Konda and Tsitsiklis (2003)

⭐ Temporal Difference Learning and TD-Gammon

  • [pdf]
  • Gerald Tesauro (1995)
  • table Table

⭐ Simple Statistical Gradient-Following Algorithms for Connectionist Reinforcement Learning (REINFORCE)

  • [pdf]
  • Ronald J. Williams (1992); Northeastern University

Books

⭐ Reinforcement Learning: An Introduction (Complete Draft)

  • [pdf]
  • Richard S. Sutton and Andrew G. Barto (2018)

Miscellaneous

How to Read a Paper

  • [pdf]
  • S. Keshav (2007); University of Waterloo
