
prisonersdilemmatournament's Introduction

PrisonersDilemmaTournament

This is my CS 269i class project.

Watch This Place's awesome videos about iterated Prisoner's Dilemma for context!

https://www.youtube.com/watch?v=t9Lo2fgxWHw

https://www.youtube.com/watch?v=BOvAbjfJ0x0

Nicky Case's "The Evolution of Trust" is also super fascinating, but it's not necessary to understand this project: https://ncase.me/trust/

How this works: When you run code/prisonersDilemma.py, it will search through all the Python strategy files in code/exampleStrats. Then, it will simulate "Iterated Prisoner's Dilemma" with every possible pairing. (There are (n choose 2) pairings.) After all simulations are done, it will average each strategy's overall score. It will then produce a leaderboard of strategies based on their average performance, and save it to results.txt.

If you'd like to add your own strategy, all you have to do is create a new .py file in the code/exampleStrats folder that follows the same format as the others. Then, when you run code/prisonersDilemma.py, it should automatically include your strategy in the tournament!

Details

Payout Chart           Player A cooperates    Player A defects
Player B cooperates    A: +3, B: +3           A: +5, B: +0
Player B defects       A: +0, B: +5           A: +1, B: +1

In this code, 0 = 'D' = defecting, and 1 = 'C' = cooperating.


Strategy functions take in two parameters, 'history' and 'memory', and output two values: 'moveChoice' and 'memory'. 'history' is a 2×n numpy array, where n is the number of turns so far. The first axis distinguishes "this player" (row 0) from the "opponent player" (row 1), and the second axis corresponds to the turn number. For example, it might have the values

 [[0 0 1]       a.k.a.    D D C
  [1 1 1]]      a.k.a.    C C C

In this case, there have been 3 turns, we have defected twice then cooperated once, and our opponent has cooperated all three times.

'memory' is a very open-ended parameter that can hold any information that should be retained from turn to turn. Strategies that don't need memory, like Tit-for-tat, can simply return None for this variable. If you want to keep track of being 'wronged', like grimTrigger.py does, you can set memory to True or False. If you have an extremely complicated strategy, you can make 'memory' store a list of arbitrarily many variables!

For the outputs: the first value is just the move your strategy chooses to make, with 0 being defect and 1 being cooperate. The second value is any memory you'd like to retain into the next iteration. This can be 'None'.
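
For concreteness, here is a minimal sketch of a strategy file in this format: a Tit-for-tat that cooperates on the first turn and then mirrors the opponent's previous move. This is just an illustration of the format, not the repo's own implementation.

def strategy(history, memory):
    # history[0] holds our own past moves, history[1] the opponent's.
    if history.shape[1] == 0:
        return 1, None                 # cooperate on the very first turn
    return int(history[1, -1]), None   # otherwise mirror the opponent's last move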


Each pairing simulation runs for this many turns:

200-40*np.log(random.random())

This means each game is guaranteed to be at least 200 turns long. After the 200th turn, every turn has an equal (and very low) probability of being the last, so there is no way to reliably defect on the final turn consequence-free.
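
As a quick sanity check (not part of the tournament code), you can sample this formula yourself; since E[-ln(U)] = 1 for U uniform on (0, 1), the expected length is about 240 turns:

import random
import numpy as np

# Sample the game-length formula many times and check its behavior.
lengths = [int(200 - 40 * np.log(random.random())) for _ in range(100000)]
print(min(lengths))                 # never below 200
print(sum(lengths) / len(lengths))  # roughly 240 on average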

prisonersdilemmatournament's People

Contributors

carykh, l4vr0v, whitetxt


prisonersdilemmatournament's Issues

Risk of kingmaker strategies poisoning the pool

So first of all, to define a kingmaker strategy: if a group of users collaborates and submits agents that are able to recognize each other, it could be very detrimental, as it could sabotage the tournament.

A very simple version of this would be to submit multiple agents, most of which are kingmakers, as well as one king.

The kingmakers are able to recognize the king and the king is able to recognize the kingmakers, for example by using a sequence of arbitrary-seeming cooperations/defections.

If a kingmaker doesn't recognize the king, it refuses to cooperate, sabotaging the other strategies.

If a kingmaker does recognize the king, it always cooperates and lets the king take advantage of it to maximize the king's score.

The individual kingmakers wouldn't gain anything, but the king would wildly outperform the other strategies, given that enough kingmakers are in the pool, which in my opinion would defeat the point of the experiment.

Just some thoughts

What if the real tournament starts after the deadline?

So what do I mean by that? Let's imagine he writes to each participant: "You are in 3rd place, so you won $100.
But I will not pay you all at once. I will put your strategy to the test once more:
send me $10 and I will send you $30. We will continue until you have received all of your $100."

Would you send the first $10? ;)

Suggestion: add chance of miscommunication

Preface

Right now, a more lenient version of grimTrigger combined with titForTat beats all of the example strategies, even after adding several other strategies (tested with both 44 and 24 different strategies). Because of the existence and success of grim strategies, defecting at all can make your strategy unsuccessful. As a result, most rounds are just mutual cooperation, which reduces the role of strategy: always-cooperate performs nearly as well, and you can't defect to probe for always-cooperate without risking triggering a grim opponent. This further increases the effectiveness of grim strategies, thereby increasing their prevalence.

Why grim is so good

Grim is good because, unlike tit for tat, it is able to punish random strategies, while being just as effective as tit for tat against always-defect, always-cooperate, itself, and tit for tat.
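
For context, a grim-trigger along the lines of the repo's grimTrigger.py can be sketched using the boolean memory described in the README. This is an illustration, not the repo's exact code.

def strategy(history, memory):
    wronged = bool(memory)  # memory is None at first, i.e. not yet wronged
    if history.shape[1] > 0 and history[1, -1] == 0:
        wronged = True      # the opponent defected: never forgive
    if wronged:
        return 0, True      # defect forever
    return 1, False         # otherwise keep cooperating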

Solution

To address this problem, I suggest adding a small chance for your action to be flipped to the opposite of what you intended, so defections can be passed off as miscommunications, adding strategic depth.
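
A minimal sketch of what that could look like, assuming it were applied inside the match loop. The names and the 5% figure are made up for illustration.

import random

NOISE_PROB = 0.05  # hypothetical miscommunication chance

def apply_noise(intended_move):
    # With a small probability, flip the move (0 <-> 1), as if it were miscommunicated.
    if random.random() < NOISE_PROB:
        return 1 - intended_move
    return intended_move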

Are there any limitations? Just checking.

Hi, I am really excited about this. To make sure I don't get disqualified and ruin the fun for myself, I want to ask a few questions that I couldn't find the answers to:
1) Are there any length limitations? For example, is there a limit on how many lines of code you can submit (mine came out to 290 0_0)?
2) Are there any time limits (for example, your code can only run for up to 0.05 seconds)?
3) Should the submission only contain the strategy function? My code has more functions before it to make the coding easier (no external libraries, of course), and it ran just fine, but I want to make sure.
4) Are there any other limitations that I couldn't think of?
Thank you in advance, and I wish you the best of luck. (I'm NOT going to win, unfortunately.)

Agents can have a priori knowledge of the expected game length

Emphasis on expected.

Since the game length is defined as int(200-40*np.log(random.random())), which has an expected value of 240, agents can make use of this in undesirable (?) ways, e.g. using a probabilistic approach to try to deceive the opponent near the end of the game with little consequence.

I think this is an unintended issue since

  • The whole point of the random game length was to prevent this behavior
  • It is not in the "spirit" of the iterated prisoner's dilemma

Traceback when running prisonersDilemma.py

Hello! I'm nearly positive this is something I'm doing wrong, but I'm getting a traceback when attempting to run prisonersDilemma.py using the code in your gemstone branch.

See below:

[Screenshot: traceback from running prisonersDilemma.py]

Can't Edit File When Editing Form Response

I haven't seen anyone here posting about this, though some of the YouTube comments also pointed it out. When trying to edit your Google Form response, you can't change the file you already uploaded. I had to work around it by using a different account to re-upload my entry, then editing my previous entry to say 'please disregard'. Maybe we could send files to an email instead, with 'tracking codes'? For example, the form could have a field for what we would put in the email subject. Or we could even use a different uploading service?

I quit. I have no chance.

I just saw @l4vr0v's fork of the project with a bunch of other strategies and I was excited because I could test better, but it was sad to see my consistently winning strat fall down to 37th place. So I don't really think I have much of a chance. I wish the best of luck to all of you guys.

How many guaranteed turns will there be?

In /code/prisonersDilemma.py, line 44, the comment says it guarantees 50 turns:

The games are a minimum of 50 turns long. The np.log here guarantees that every turn after the 50th has an equal (low) chance of being the final turn.

But on readme.md it says it is guaranteed to be at least 200 turns:

This means each game is guaranteed to be at least 200 turns long.

Which one is correct? I assume 200, but the two statements are inconsistent.

Using time.time() to limit approximate duration of strategy call

There are several ways agents might want to use as much time as possible to improve their performance (training neural networks/other machine learning, Monte Carlo tree search, ...). I read that a reasonable goal is to finish your turn in 10 ms. Can we import the standard time library so we can just approximately limit the time taken? E.g. as follows:

from time import time

def strategy(history, memory):
    start = time()

    # Do some setup work ...

    while time() - start < 0.01:
        # The more work we do here, the better ...
        pass

    # Decide on a move based on the work done above ...
    choice = 1  # placeholder

    return choice, memory

An important typo on the tournament's website

This includes a spoiler for the whole experiment

Spoiler warning

In the BROCCOLI version of the description it says

5 free-years for ratting her out is much better than 3 free-years after interrogation.

This is supposed to be a male version of the experiment so it might be quite important for the outcome.

stealing tactics from forks

Guys, if you fork this repo, don't add your tactic! Otherwise everyone could inspect your tactic, which could then be easily exploited (by crafting countermeasures for each tactic, either manually or by training an AI).

Minimum length of game is 50 or 200?

The code contradicts the comment:

LENGTH_OF_GAME = int(200-40*np.log(random.random())) # The games are a minimum of 50 turns long. The np.log here guarantees that every turn after the 50th has an equal (low) chance of being the final turn.

BTW, a nit: random.random() can return 0 (although it's very unlikely) and crash the program. Consider using 1 - random.random(), as that will never be zero (see https://docs.python.org/3/library/random.html#random.random).

In [9]: int(200-40*np.log(0))
C:\Users\cai_l\Anaconda3\Scripts\ipython3:1: RuntimeWarning: divide by zero encountered in log
---------------------------------------------------------------------------
OverflowError                             Traceback (most recent call last)
<ipython-input-9-abafa11aae74> in <module>
----> 1 int(200-40*np.log(0))

OverflowError: cannot convert float infinity to integer

May we import typing for static type checking?

I realise that, as it currently stands, no imports other than numpy and random are allowed. I would like to ask @carykh to consider permitting us to use static type checking through the typing standard library module, which provides, for instance, the helpful typing.Optional[sometype].

To the best of my knowledge, nothing in typing does anything to circumvent the rules of the competition.

If this can't be granted, then I suppose I will remove type signatures dependent on typing and delete the import before submission.

Limitations on loading files?

Hi! I'm currently thinking of a strategy that requires loading a saved numpy array from a file. Are we allowed to retrieve data from a file? If so, what's the maximum file size? And how would you go about implementing this so that it works everywhere?

In my case, the file would be small, no more than a hundred kilobytes. But while implementing this I found an issue: I'm saving and loading the file in the working directory, and the working directory is different when I run my strategy script directly versus when it's run through the "prisonersDilemma.py" script. This could be solved using absolute paths, but I'm not sure if that's allowed.
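
If file loading were permitted, one way around the working-directory mismatch (shown only as a sketch; the filename is hypothetical) would be to resolve the path relative to the strategy file itself rather than the current working directory:

import os
import numpy as np

# __file__ points at this strategy file, regardless of where the tournament is run from.
_DATA_PATH = os.path.join(os.path.dirname(os.path.abspath(__file__)), "myWeights.npy")
SAVED_ARRAY = np.load(_DATA_PATH)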

Has anyone actually found a counter to grimTrigger?

I realize that when all the submitted strategies are put together in the end, grimTrigger may not be the best, but among the example strats, has anyone found a strategy that can consistently counter grimTrigger? If so, what was your strategy?

What happens if we think we know the "behind-the-scenes" of this tournament?

Apologies beforehand if this is not where we should be asking questions about the tournament; I couldn't find any place specified in the video or on the site.

Being the braindead person I am, I immediately started snooping around after seeing anything interesting, which has led me to some potential insight into the so-called "behind-the-scenes". Would this disqualify me for not being a fully blind participant?

A strategy for amusement

SECRET = ("01001110 01100101 01110110 01100101 01110010 00100000 01100111 01101111 01101110 01101110 01100001 00100000 01100111 01101001 01110110 01100101 00100000 01111001 01101111 01110101 00100000 01110101 01110000 00001010 01001110 01100101 01110110 01100101 01110010 00100000 01100111 01101111 01101110 01101110 01100001 00100000 01101100 01100101 01110100 00100000 01111001 01101111 01110101 00100000 01100100 01101111 01110111 01101110 00001010 01001110 01100101 01110110 01100101 01110010 00100000 01100111 01101111 01101110 01101110 01100001 00100000 01110010 01110101 01101110 00100000 01100001 01110010 01101111 01110101 01101110 01100100 00100000 01100001 01101110 01100100 00100000 01100100 01100101 01110011 01100101 01110010 01110100 00100000 01111001 01101111 01110101 00001010 01001110 01100101 01110110 01100101 01110010 00100000 01100111 01101111 01101110 01101110 01100001 00100000 01101101 01100001 01101011 01100101 00100000 01111001 01101111 01110101 00100000 01100011 01110010 01111001 00001010 01001110 01100101 01110110 01100101 01110010 00100000 01100111 01101111 01101110 01101110 01100001 00100000 01110011 01100001 01111001 00100000 01100111 01101111 01101111 01100100 01100010 01111001 01100101 00001010 01001110 01100101 01110110 01100101 01110010 00100000 01100111 01101111 01101110 01101110 01100001 00100000 01110100 01100101 01101100 01101100 00100000 01100001 00100000 01101100 01101001 01100101 00100000 01100001 01101110 01100100 00100000 01101000 01110101 01110010 01110100 00100000 01111001 01101111 01110101 "*2).replace(" ","")

def strategy(history, memory):
    return int(SECRET[history.shape[1]]), None

How can I include my machine-learning model in the python file?

I want to incorporate some machine learning (no libraries, of course), but since everything has to be in one file, I am having trouble saving my model. I tried pickle, but that does not work when the file is called from the tournament code (and I don't think pickle is allowed anyway).

Has anyone already figured this out? I would imagine there are multiple machine-learning approaches.
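
One possible workaround, sketched here purely as an assumption (the weights and their shape are made up), is to export the trained parameters beforehand and paste them into the strategy file as a plain Python literal, so no file loading or pickling is needed:

import numpy as np

# Hypothetical weights, e.g. produced offline with weights.tolist() and pasted in.
_WEIGHTS = np.array([
    [0.12, -0.53, 0.07],
    [0.91,  0.33, -0.45],
])

def strategy(history, memory):
    # Use _WEIGHTS however the model needs; here we just cooperate as a placeholder.
    return 1, memory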

Saving values between games

Because the memory variable gets wiped between games and is only retained turn to turn, I imagine it is not intended that information about previous games can be saved.

However, I used a class with simple boolean attributes, and the values I set were retained from game to game.
Example:

def strategy(history, memory):
    round = history.shape[1] + 1
    if round == 1:
        # If memory were truly wiped between games, this would always print 0,
        # but from the second game onward it prints 1.
        print(MyMemory.test)
        return 1, MyMemory
    mem = memory
    mem.test = 1  # mutating the class attribute persists across games
    return 1, mem


class MyMemory:
    test = 0

Discussion: What’s the lowest scoring alg?

I got to thinking, and it's really hard to get a low-scoring alg. Even playing always-cooperate will still net you a pretty high score in the grand scheme of things, because playing against TFT and always-cooperate will raise your average.

Is it possible to counter random?

Assume random cooperates half the time. The best response is to always defect, because that gives you an expected payoff of 3 per turn. I think some people will just submit a random strategy as a joke, and my strategies are getting destroyed against random. Is there a way to detect a random strategy, as opposed to just a complicated one?
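
For reference, using the payout chart from the README, the per-turn expected values against a 50/50 random opponent work out like this:

p_coop = 0.5                                  # opponent cooperates half the time
ev_defect = p_coop * 5 + (1 - p_coop) * 1     # 3.0 points per turn
ev_cooperate = p_coop * 3 + (1 - p_coop) * 0  # 1.5 points per turn
print(ev_defect, ev_cooperate)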

Should my solution return 'tell truth'/'stay silent' or 0/1?

The contest page links to the "policestation" branch, where strategies return strings ('tell truth'/'stay silent'), but I just happened to notice that the "main" branch strategies return 0/1. Which one is right? Note that I already submitted my string-based version.

Is Math library allowed?

I know I'm only supposed to use the numpy and random libraries, but is it okay to use math.floor() from the math library?

Submitting more than one strategy

Hi! I've been working on some strategy algorithms and ended up with 3 good (and somewhat unique) versions. Is it allowed to submit all of them? I'm worried because, if this were vaguely allowed, people would be able to upload a lot of bots/minions that could benefit their own main strategy/master (I think there's a post in here about this specific topic).

Hi

I'm a bit confused about what exactly we are meant to do. Are we meant to pit all the strats against each other, make a new strat, or something different altogether?

Error when running the code

Hey, I'm relatively new to Python and I didn't find anything on this when googling. When I tried running the code I got this error message:

Starting tournament, reading files from exampleStrats
Traceback (most recent call last):
  File "C:/Users/bblom/PycharmProjects/PrisonersDilemmaTournament/code/prisonersDilemma.py", line 120, in <module>
    runFullPairingTournament(STRATEGY_FOLDER, RESULTS_FILE)
  File "C:/Users/bblom/PycharmProjects/PrisonersDilemmaTournament/code/prisonersDilemma.py", line 97, in runFullPairingTournament
    roundHistory = runRound(pair)
  File "C:/Users/bblom/PycharmProjects/PrisonersDilemmaTournament/code/prisonersDilemma.py", line 39, in runRound
    moduleA = importlib.import_module(STRATEGY_FOLDER+"."+pair[0])
  File "C:\Python27\lib\importlib\__init__.py", line 37, in import_module
    __import__(name)
ImportError: No module named exampleStrats.alwaysCooperate

It doesn't seem to be related to any of my code so I was wondering what caused it and how I would be able to fix it.
I'm using Python 3.9 on Windows with PyCharm.

Strategy discussion

Hello.
I wanted to create this issue for people to share their ideas with each other.
For example, in one of my earlier attempts I wanted to avoid some of the flaws of ftft by doing 2 in total and not 2 in a row, and it worked. But I am going with a different strategy now, so I don't need this.
I know it's not the best strategy to help your enemies, but I did it anyway.
If you have any strategy ideas that you don't need anymore but someone else could use, post them here.

Can you output memory as a dictionary?

I'm trying to output memory as a dictionary, but memory is staying a None value. Does this mean that memory can't be a dictionary, and how can I fix it?
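
Memory can be any Python object, including a dict; the catch is that it appears to arrive as None before your strategy has returned anything, so it has to be initialized on the first turn. A minimal sketch, under that assumption:

def strategy(history, memory):
    # On the first call, memory is None, so create the dict here.
    if memory is None:
        memory = {"opponentDefections": 0}
    # Update the dict based on the opponent's last move (row 1 of history).
    if history.shape[1] > 0 and history[1, -1] == 0:
        memory["opponentDefections"] += 1
    return 1, memory  # always cooperate, just to keep the example short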

Will multiple rounds be run to get the average?

Will multiple rounds be run to get the average? I just want to know how fair it is and how much luck is needed. If there are multiple rounds, how many? (I'm not talking about the 200+ turns within a game.)

First time GitHub user

Hi, I'm new to GitHub and while I don't have high hopes for the tournament I at least want to join along in the fun and see what I can code.

However, I'm really just not familiar with GitHub and how all the repositories work, and Google can only get you so far. I've been able to download the entire repository into GitHub Desktop, but I'm unable to clone or edit any of the pre-built strategies to create my own.

If anyone could help me out it'd be much appreciated!
(Also, I use a Mac, if that matters for how I need to set it up.)

How the heck is this actually gonna be run

I assume at least a few of us here are going the NN route, where some computationally expensive matrix multiplication might be needed to evaluate the results. I think carykh said it would repeat the tournament something like 1000 times to help eliminate randomness, and that's on top of every one of our agents playing every other agent. The sum of all of this is that it's going to take a lot of computational power, maybe too much to be practical?

ML & AI should not be allowed!!!

ML and AI should not be allowed! The thing is, ML and AI are not strategies. As far as I know (think), we are supposed to submit strategies that anyone (or anything) can adapt, change, and update. But ML (and AI) are like creatures (in this case, the network) that have mastered the thing (the game), which is not the same... But anyway, tell me if I am wrong!!!

Are handshake strategies allowed?

Handshake strategies have one central entry and multiple non-central ones. When a central and a non-central entry meet, they perform a "handshake" (play a certain pattern of moves) and check whether the opponent's moves match the handshake as well. If they do, the non-central entry starts giving up points for the benefit of the central entry. If they don't, it just falls back to some good strategy like TFT, or anything that won't hurt much. With enough non-central entries, the central entry can easily win.
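
A rough sketch of the recognition being described, shown for the non-central role (everything here, including the pattern, is made up for illustration):

HANDSHAKE = [1, 0, 0, 1, 1]  # an arbitrary-looking opening pattern known to all colluding entries

def strategy(history, memory):
    turn = history.shape[1]
    if turn < len(HANDSHAKE):
        # Still playing out our side of the handshake.
        return HANDSHAKE[turn], None
    opponent_opening = [int(m) for m in history[1, :len(HANDSHAKE)]]
    if opponent_opening == HANDSHAKE:
        # Recognized an ally: a non-central entry would now give up points to it.
        return 1, None
    # No match: fall back to something harmless like tit-for-tat.
    return int(history[1, -1]), None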

This was asked by Ottomated and a lot of other people on our server. Do you have any official word on this?

Personally, I do not think it's fair, as you could easily submit with many Gmail alts and win even against the best strategies.
