
learning-in-games

Introduction

A game is a collection of n costs (or rewards) that agents try to minimize (or maximize). Each cost maps the joint action space of all agents to a scalar value.

We design and analyze learning algorithms that compute equilibria of these costs. The algorithms are based on the local first- and second-order gradients of the costs.

There are several equilibrium concepts that are significant in games. One is a point where every agent is individually at a local minimum (or maximum). Another is a point where an agent is at a local optimum given that the other agent performs optimally. These equilibria can be verified with the second-order derivatives of the costs and the Schur complement of the game Jacobian.
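As a minimal sketch of these checks, consider a hypothetical two-player quadratic game with scalar actions (the coefficient names a, b, c, d are illustrative, not from the package): f1(x, y) = 0.5*a*x^2 + b*x*y and f2(x, y) = 0.5*d*y^2 + c*x*y, whose joint critical point is the origin.

```python
# Equilibrium checks for an illustrative two-player quadratic game:
#   f1(x, y) = 0.5*a*x^2 + b*x*y,   f2(x, y) = 0.5*d*y^2 + c*x*y
# (coefficients are made up for the example)
a, b, c, d = 2.0, 1.0, 1.0, 3.0

# Second-order blocks of the game Jacobian J = [[D11f1, D12f1],
#                                               [D21f2, D22f2]]
D11f1, D12f1 = a, b
D21f2, D22f2 = c, d

# Differential Nash check at the critical point (0, 0):
# each player sits at an individual local minimum.
is_nash = D11f1 > 0 and D22f2 > 0

# Check for player 1 optimal given player 2 responds optimally:
# Schur complement of D22f2 in J must be positive.
schur = D11f1 - D12f1 * (1.0 / D22f2) * D21f2
is_leader_optimal = schur > 0 and D22f2 > 0
print(is_nash, is_leader_optimal)
```

For these coefficients both conditions hold; flipping the sign of a would break the first check while the interaction terms b, c control the Schur correction.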

Installation

Make sure to run the package in its own environment for testing. Install using poetry:

cd learning-in-games
poetry install

Run Jupyter notebooks:

poetry run jupyter notebook

Run tests:

poetry run python tests.py

Formulation

Concretely, an n-player game is denoted by G = (f1, f2, ..., fn), where

  1. costs are fi: X -> R, i = 1, 2, ..., n
  2. actions are xi in Xi, a Euclidean action space
  3. the action profile is x = (x1, x2, ..., xn) in X = X1 x X2 x ... x Xn

The learning algorithms are primarily gradient-based, so they can take advantage of automatic differentiation (autodiff) software.

The following derivatives are useful:

  1. game form g = (D1f1, D2f2, ..., Dnfn)
  2. game Jacobian J = [[D11f1, D12f1], [D21f2, D22f2]] (shown for n = 2)
  3. interaction terms (D2f1, D1f2)
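These objects can be written down in closed form for a quadratic game. The sketch below computes them analytically for a made-up two-player example (the coupling coefficients b and c are illustrative, not from the package):

```python
import numpy as np

# Illustrative quadratic game with scalar actions:
#   f1 = 0.5*x1^2 + b*x1*x2,   f2 = 0.5*x2^2 + c*x1*x2
b, c = 0.5, -0.25
x1, x2 = 1.0, -1.0

# game form g = (D1f1, D2f2): each player's own partial gradient
g = np.array([x1 + b * x2, x2 + c * x1])

# game Jacobian J = [[D11f1, D12f1], [D21f2, D22f2]]
J = np.array([[1.0, b], [c, 1.0]])

# interaction terms (D2f1, D1f2): how each cost varies
# with the *other* player's action
interaction = np.array([b * x1, c * x2])
print(g, J, interaction)
```

In practice an autodiff library would produce the same quantities from the cost functions directly; the analytic forms above just make the operators concrete.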

These operators are applied at the action profile x(t) at time t, typically following x(t+1) = x(t) - lr * g(x(t)) or some other history-dependent first-order method such as acceleration.
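A minimal sketch of this update rule, using the same illustrative quadratic game as above (coefficients invented for the example), iterates simultaneous gradient play until the profile reaches the game's unique critical point at the origin:

```python
import numpy as np

# game form for the illustrative costs
#   f1 = 0.5*x1^2 + 0.5*x1*x2,   f2 = 0.5*x2^2 - 0.25*x1*x2
def g(x):
    x1, x2 = x
    return np.array([x1 + 0.5 * x2,    # D1f1
                     x2 - 0.25 * x1])  # D2f2

lr = 0.1
x = np.array([1.0, -1.0])  # initial action profile x(0)
for t in range(500):
    x = x - lr * g(x)       # x(t+1) = x(t) - lr * g(x(t))
print(x)  # converges toward the Nash equilibrium at [0, 0]
```

Convergence here depends on the spectrum of the game Jacobian; for strongly coupled games the same loop can cycle or diverge, which is exactly what the second-order analysis above diagnoses.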

The joint gradient g = (g1, g2, ..., gn) is defined by some combination of the gradients of the costs (f1, f2, ..., fn).

For example, gradient descent-ascent on a zero-sum game with f1 = -f2 = f uses g = (D1f, -D2f), whereas potential games use g = (D1p, D2p) for a potential function p: X -> R.
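To make the descent-ascent case concrete, here is a sketch on an invented zero-sum objective f(x, y) = 0.5*x^2 - 0.5*y^2 + x*y, which has a saddle point at the origin; player 1 descends f in x while player 2 ascends f in y:

```python
import numpy as np

# gradient descent-ascent vector field g = (D1f, -D2f) for
#   f(x, y) = 0.5*x^2 - 0.5*y^2 + x*y  (illustrative zero-sum game)
def gda(x, y):
    return np.array([x + y,        # D1f: descend in x
                     -(-y + x)])   # -D2f: ascend in y

lr = 0.1
z = np.array([1.0, 1.0])  # initial profile (x, y)
for t in range(500):
    z = z - lr * gda(*z)
print(z)  # spirals in toward the saddle point [0, 0]
```

The interaction term x*y makes the Jacobian's eigenvalues complex, so the iterates rotate as they contract; with a purely bilinear f the same loop would orbit without converging, which is a standard failure mode of gradient descent-ascent.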

Implementations

Two-player games

We start with n = 2.

  • Quadratic costs
  • Linear dynamics, quadratic costs
  • Bimatrix games
  • Zero-sum games
  • Potential games
  • General-sum games

n-player games

  • time-varying bimatrix games
  • grid world games

Single player games

As a benchmark, we always try to include a single-player scenario (optimization problems, LQR, classification, regression, etc.).

learning-in-games's People

Contributors

  • bchasnov