Arena: A Scalable and Configurable Benchmark for Policy Learning

Arena is a scalable and configurable benchmark for policy learning. It is an object-based, game-like environment whose logic is reminiscent of classic games such as Pac-Man and Bomberman. An instance of the Arena benchmark starts with an arbitrarily sized region (the arena) containing a controllable agent and an arbitrary number of destructible obstacles, enemies, and collectable coins. The agent can move in four directions, fire projectiles, and place bombs. The goal is to control the agent so that it collects as many coins as possible in the shortest amount of time, potentially killing enemies and destroying obstacles with projectiles and bombs along the way.

Installation

The core of Arena requires only the following dependencies:

numpy
pygame

Clone the repo and install with pip.

git clone https://github.com/Sirui-Xu/Arena.git
cd Arena/
pip install -e .

How to play these games yourself

cd examples/
python play.py

Use w, s, a, d to move, space to place bombs, and j to fire projectiles.

Getting started

Here is an example of importing Arena and constructing a game instance:

from arena import Arena

game = Arena(width=1280,
             height=720,
             object_size=32,
             obstacle_size=40,
             num_coins=50,
             num_enemies=50,
             num_bombs=3,
             explosion_max_step=100,
             explosion_radius=128,
             num_projectiles=3,
             num_obstacles=200,
             agent_speed=8,
             enemy_speed=8,
             p_change_direction=0.01,
             projectile_speed=32,
             visualize=True,
             reward_decay=0.99)

To test scalability, vary the map size (width, height) and the number of objects (num_coins, num_enemies, num_obstacles).
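One way to vary these settings systematically is to scale the object counts with the map area, so that object density stays roughly constant. The sketch below is only an illustration: the helper `scaled_config` and its `scale` parameter are hypothetical and not part of Arena. The keyword names and baseline values are copied from the constructor call above.

```python
# Hypothetical helper (not part of Arena): build constructor keyword
# arguments for a map whose sides are `scale` times the baseline.
BASE_CONFIG = {
    "width": 1280,
    "height": 720,
    "num_coins": 50,
    "num_enemies": 50,
    "num_obstacles": 200,
}


def scaled_config(scale):
    """Return a config dict with sides scaled by `scale` and object
    counts scaled by the area ratio (scale ** 2), keeping density fixed."""
    area_ratio = scale ** 2
    config = {}
    for key, value in BASE_CONFIG.items():
        factor = scale if key in ("width", "height") else area_ratio
        # Round to whole pixels / whole objects, with at least one of each.
        config[key] = max(1, round(value * factor))
    return config


if __name__ == "__main__":
    for scale in (0.5, 1, 2):
        print(scale, scaled_config(scale))
```

The resulting dictionary could be merged with the remaining constructor arguments (object_size, agent_speed, and so on) before calling Arena.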

Next we configure and initialize Wrapper:

from arena import Wrapper

p = Wrapper(game)
p.init()

You are free to use any agent with the Wrapper. Below we create a fictional agent and grab the valid actions:

myAgent = MyAgent(p.getActionSet())
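MyAgent is not provided by Arena. As a minimal sketch, a uniformly random agent could look like the following. The class name `RandomAgent` and the `seed` parameter are hypothetical, and the `pickAction(reward, state)` signature simply mirrors the call used in the interaction loop in this section.

```python
import random


class RandomAgent:
    """Hypothetical stand-in for MyAgent: ignores its inputs and picks a
    uniformly random action from the set it was constructed with."""

    def __init__(self, actions, seed=None):
        # Copy into a list so random.choice can index it.
        self.actions = list(actions)
        self.rng = random.Random(seed)

    def pickAction(self, reward, state):
        # A learning agent would use reward and state here; this one does not.
        return self.rng.choice(self.actions)
```

Such an agent would be constructed the same way as above, for example `RandomAgent(p.getActionSet())`.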

We can now have our agent, with the help of Wrapper, interact with the game over a certain number of frames:

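nb_frames = 1000
reward = 0.0
state = p.reset()  # get an initial state before the first action

for f in range(nb_frames):
    # the agent chooses an action from the last reward and the current state
    action = myAgent.pickAction(reward, state)
    state, reward, game_over, info = p.step(action)
    if game_over:  # start a new episode when the game is over
        state = p.reset()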

Just like that, our agent is interacting with the game environment. For a complete example, see example/test.py.

Test heuristic policy

cd example
python algorithm.py --algorithm ${algorithm_name} --store_data

Replace ${algorithm_name} with the name of a heuristic policy, for example random.

Train GNN policy

cd example
python train.py --dataset ${data_path} --checkpoints_path ${checkpoints_path} --model ${model_name}

Test GNN policy

cd example
python test.py --checkpoints_path ${checkpoints_path}

Train DQN agent (AX0)

cd examples/rl_dqgnn
python train_dqgnn.py --train --model_path ${path to save model} --num_episode 5000 --num_rewards 5

Visualizing DQN policy

cd examples/rl_dqgnn
python eval_dqgnn.py --train --model_path ${path to load model}

Acknowledgements

We referred to the PyGame Learning Environment for some of the implementations.
