This project contains the code for Cribbage as a reinforcement learning problem.
The agents are: DeepPeg.py, LinearB.py, Myrmidon.py, Monty.py, Monty2.py, NonLinearB.py, and PlayerRandom.py
Each agent can be run directly as a Python script:
>> python3 <agentname>.py
This will run the agent in the Arena against Myrmidon and produce learning-curve performance graphs.
Supporting files are: TrainHand.py, TrainPegging.py, and TrainingScript.py
These files are configured to be run directly as scripts; TrainHand.py and TrainPegging.py are both interactive scripts.
The files in this folder fall into four groups: the core game engine, player agents, training and evaluation tools, and utilities.
- Cribbage.py: Main file for playing a game of cribbage. Plays through hands one at a time, scoring for nobs, pegging and hands as it goes. Winner is declared as soon as one player reaches 121 points.
- Deck.py: Classes representing Suits, Ranks, Cards and Decks.
- Scoring.py: Scores cards according to the rules of cribbage.
- Player.py: An abstract class defining what methods a player class must have in order to play well with Cribbage.py.
- PlayerRandom.py: A simple instantiation of a Player. Makes decisions randomly.
- Myrmidon.py: A Player that makes use of one-step rollouts and heuristics.
- LinearB.py: A Player that represents hands using a linear combination of features. These features are then used for episodic semi-gradient one-step Sarsa during the throwing cards phase and for true online Sarsa during the pegging phase.
- NonLinearB.py: A Player that represents hands using a non-linear combination of features. These features are then used for episodic semi-gradient one-step Sarsa during the throwing-cards phase and for true online Sarsa during the pegging phase.
- DeepPeg.py: A Player that uses two multilayer perceptron regressors to encode Q values: one for pegging and one for throwing cards.
- Monty.py: A player that uses first visit Monte Carlo to learn the Q values for different states. A minor modification of QLearner.
- Monty2.py: A second player that uses first visit Monte Carlo to learn the Q values for different states. A minor modification of QLearner.
- Arena.py: Records performance data for a player over a number of hands. Can be used to produce training curve data or to measure final performance levels.
- CriticSessions.py: Allows you to use one agent to critique the decisions of another agent during play.
- TrainPegging.py: Trains players on pegging phase of Cribbage.
- TrainHand.py: Trains players on a single hand of Cribbage.
- TrainingScript.py: Automates the processes of training agents against each other, providing critiques by other agents, and of running round robin tournaments.
- Utilities.py: Useful functions that are used throughout the project.
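As an illustration of the kind of rule Scoring.py implements, the "fifteens" rule can be scored by checking every subset of card values. This is only a sketch using plain integer counting values (face cards count 10); the function name and representation are assumptions, not the actual Scoring.py API:

```python
from itertools import combinations

def count_fifteens(values):
    """Score the fifteens rule: 2 points for every distinct subset
    of card counting values that sums to exactly 15."""
    points = 0
    for size in range(2, len(values) + 1):
        for combo in combinations(values, size):
            if sum(combo) == 15:
                points += 2
    return points

# A 5-5-5-J hand with a 10 starter: one 5+5+5 and six 5+10 pairs
# give seven fifteens, worth 14 points.
print(count_fifteens([5, 5, 5, 10, 10]))  # -> 14
```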
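The abstract Player class can be pictured as an abc.ABC that each agent subclasses. The method names below are illustrative assumptions rather than the actual Player.py interface:

```python
from abc import ABC, abstractmethod

class Player(ABC):
    """Sketch of an abstract cribbage player (method names assumed)."""

    @abstractmethod
    def throw_cards(self, hand):
        """Choose two cards from a six-card hand to put in the crib."""

    @abstractmethod
    def play_card(self, hand, pegging_pile):
        """Choose a card to play during the pegging phase."""

class RandomPlayer(Player):
    """A random instantiation in the spirit of PlayerRandom.py."""
    def __init__(self, rng):
        self.rng = rng

    def throw_cards(self, hand):
        picks = self.rng.sample(range(len(hand)), 2)
        return [hand[i] for i in picks]

    def play_card(self, hand, pegging_pile):
        return self.rng.choice(hand)
```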
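The episodic semi-gradient one-step Sarsa used by LinearB boils down to a simple weight update when Q values are linear in the features. This sketch assumes the feature vectors are supplied externally; it is not LinearB's actual code:

```python
import numpy as np

def sarsa_update(w, x, reward, x_next, alpha=0.1, gamma=1.0):
    """Semi-gradient one-step Sarsa with a linear approximator
    q(s, a) = w . x(s, a).  `x_next` is the feature vector of the
    next state-action pair, or None at the end of an episode."""
    q = w @ x
    q_next = 0.0 if x_next is None else w @ x_next
    td_error = reward + gamma * q_next - q
    return w + alpha * td_error * x  # the gradient of w.x is x

# Repeated terminal updates pull the estimate toward the reward.
w = np.zeros(3)
x = np.array([1.0, 0.5, 0.0])
for _ in range(200):
    w = sarsa_update(w, x, reward=2.0, x_next=None)
print(w @ x)  # approaches 2.0
```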
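Monty and Monty2 learn Q values by first-visit Monte Carlo: after each finished episode, the return following the first visit to each state-action pair is folded into a running mean. A tabular sketch (the actual Monty.py representation may differ):

```python
from collections import defaultdict

def first_visit_mc_update(Q, counts, episode, gamma=1.0):
    """Update tabular Q values from one finished episode.
    `episode` is a list of (state, action, reward) tuples; Q and
    counts map (state, action) to a running mean and a visit count."""
    # Compute the return following each time step, back to front.
    G = 0.0
    returns = []
    for state, action, reward in reversed(episode):
        G = gamma * G + reward
        returns.append((state, action, G))
    returns.reverse()
    seen = set()
    for state, action, G in returns:
        if (state, action) in seen:
            continue  # only the FIRST visit contributes
        seen.add((state, action))
        counts[(state, action)] += 1
        n = counts[(state, action)]
        Q[(state, action)] += (G - Q[(state, action)]) / n

Q = defaultdict(float)
counts = defaultdict(int)
episode = [("s0", "a", 0.0), ("s1", "b", 1.0), ("s0", "a", 2.0)]
first_visit_mc_update(Q, counts, episode)
print(Q[("s0", "a")])  # -> 3.0, the return from the first visit (0 + 1 + 2)
```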
A number of learning agents store their parameters in files:
- throwWeights.npy and pegWeights.npy: LinearB
- NLBthrowWeights.npy and NLBpegWeights.npy: NonLinearB
- Brain files in the directory 'BrainsInJars': QLearner, Monty, and Monty2
These files depend on the following Python libraries (abc, enum, math, os, itertools, and warnings ship with Python; numpy, sklearn, joblib, and matplotlib must be installed separately):
- abc
- enum
- math
- numpy
- sklearn
- os
- joblib
- warnings
- itertools
- matplotlib