Comments (15)
That's right, I modified the runFullPairingTournament function for those who are interested in this kind of technique.
```python
import itertools
import os

import numpy as np
from tqdm import tqdm

def runFullPairingTournamentN(inFolder=STRATEGY_FOLDER, n=1000):
    finalScoreKeeper = {}
    STRATEGY_LIST = []
    for file in os.listdir(inFolder):
        if file.endswith(".py"):
            STRATEGY_LIST.append(file[:-3])
    for strategy in STRATEGY_LIST:
        finalScoreKeeper[strategy] = 0

    # Run the full round-robin tournament n times and accumulate the scores.
    for _ in tqdm(range(n)):
        scoreKeeper = {}
        for strategy in STRATEGY_LIST:
            scoreKeeper[strategy] = 0
        for pair in itertools.combinations(STRATEGY_LIST, r=2):
            roundHistory = runRound(pair)
            scoresA, scoresB = tallyRoundScores(roundHistory)
            scoreKeeper[pair[0]] += scoresA
            scoreKeeper[pair[1]] += scoresB
        for strategy in STRATEGY_LIST:
            finalScoreKeeper[strategy] += scoreKeeper[strategy]

    # Rank strategies by their total score across all n tournaments.
    scoresNumpy = np.zeros(len(finalScoreKeeper))
    for i in range(len(STRATEGY_LIST)):
        scoresNumpy[i] = finalScoreKeeper[STRATEGY_LIST[i]]
    rankings = np.argsort(scoresNumpy)

    for rank in range(len(STRATEGY_LIST)):
        i = rankings[-1 - rank]
        score = scoresNumpy[i]
        scoreTour = score / n
        scorePer = scoreTour / (len(STRATEGY_LIST) - 1)
        print(f'#{pad(str(rank+1)+":", 3)} {pad(STRATEGY_LIST[i]+":", 18)} {score:f} ({scoreTour:f} avg per tournament) ({scorePer:f} avg per match)')
```
from prisonersdilemmatournament.
I’ve been using a modified version of that function myself for testing, but this one looks much cleaner. Thanks!
Agreed; my submission consistently ranks 1st when running a few times and averaging the results, but sometimes ranks as low as 3rd when running only once.
Thanks for pointing this out! Yeah, quite a few people noticed how much randomness had an effect on the leaderboard's ordering. I will probably do something like @peopleOfPalay's edit where I run the tournament over and over and take the average of all of them. I'll try to do 1,000 iterations, but I hope the code doesn't take too long to run! (Since it's O(n²) in the number of submissions.)
You could take advantage of multiprocessing. I can create a PR for that if you want!
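Since every pairing is independent, the pair loop parallelizes naturally. A minimal sketch with `multiprocessing.Pool`, using a hypothetical `play_pair` stand-in for the repo's `runRound`/`tallyRoundScores`:

```python
import itertools
from multiprocessing import Pool

# Hypothetical stand-in for the repo's runRound + tallyRoundScores:
# scores a pair deterministically so the example is reproducible.
def play_pair(pair):
    a, b = pair
    return pair, (len(a) * 3, len(b) * 3)

def run_tournament_parallel(strategies, workers=4):
    scores = {s: 0 for s in strategies}
    pairs = list(itertools.combinations(strategies, 2))
    with Pool(workers) as pool:
        # Each match is independent, so pool.map can fan them out to workers.
        for (a, b), (sa, sb) in pool.map(play_pair, pairs):
            scores[a] += sa
            scores[b] += sb
    return scores

if __name__ == "__main__":
    print(run_tournament_parallel(["titForTat", "grudger", "joss"], workers=2))
```

Note that the real match function would need to be defined at module level so worker processes can pickle it.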
Well, as the number of matches each algorithm plays increases, the actual effect of randomness goes down. So the more participants there are, the fewer iterations are needed.
But since we're already talking performance, you could accept my Pull Request that improves it significantly.
For matches between two strategies that don't use random, repeated matches will always have the same outcome, so you could try, for example, running matches 10 times, and if the first 200 moves are the same for both players all 10 times, you can take the average from just those 10; otherwise you know RNG played a part and you can then repeat 990 more times.
This would significantly reduce the time it takes to run without significantly changing the results compared to running it 1000 times.
There are a few exceptions where strategies using RNG could falsely be flagged as not using RNG by the above: for example, strategies that only use RNG after the 200th move, or strategies where RNG has only a small chance of changing anything (say, 0.05% of the time they defect on the first move or something).
Another method you could use is checking if the source code includes the word "random", since they're probably not using RNG if it doesn't appear. This would have false positives instead of false negatives, which might be preferred. It's still probably a good idea to run matches not using RNG a few times, since the match length is random and that can affect the score.
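A sketch of that probe-then-decide loop, with a hypothetical `run_match` in place of the repo's round runner (strategies here are simplified to `move(turn)` functions):

```python
# Hypothetical match runner: plays `length` turns and tallies standard
# prisoner's-dilemma payoffs. Strategies are simplified to move(turn) -> "C"/"D".
def run_match(strat_a, strat_b, length=200):
    history = [(strat_a(t), strat_b(t)) for t in range(length)]
    payoff = {("C", "C"): (3, 3), ("C", "D"): (0, 5),
              ("D", "C"): (5, 0), ("D", "D"): (1, 1)}
    return (history,
            sum(payoff[m][0] for m in history),
            sum(payoff[m][1] for m in history))

def average_scores(strat_a, strat_b, probe=10, total=1000):
    results = [run_match(strat_a, strat_b) for _ in range(probe)]
    histories = [h for h, _, _ in results]
    if all(h == histories[0] for h in histories):
        n = probe  # identical histories: treat as deterministic, stop early
    else:
        # RNG played a part: run out the remaining iterations.
        results += [run_match(strat_a, strat_b) for _ in range(total - probe)]
        n = total
    return (sum(a for _, a, _ in results) / n,
            sum(b for _, _, b in results) / n)
```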
> For matches between two strategies that don't use random, repeated matches will always have the same outcome, so you could try, for example, running matches 10 times, and if the first 200 moves are the same for both players all 10 times, you can take the average from just those 10 […]
I think it would be best to check whether the random module was imported in the strategy. While this could lead to false negatives if somebody uses randomness not from the random module (such as numpy.random), you could potentially fix this by adding an `import random` to any strategies that use randomness without one, since all the files have to be checked anyway. You could also remove random imports from strategies that don't use them, to decrease the false-positive rate.
After removing unnecessary imports, I found that a basic cache with a dictionary gave a 3-4x speedup for multiple iterations on my test set with a few dozen strategies.
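That import check can be done robustly with the `ast` module rather than string matching (a sketch; it only catches the stdlib `random` module, not e.g. `numpy.random`):

```python
import ast

def imports_random(source):
    """True if the strategy's source imports the stdlib random module."""
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            # Covers `import random` and `import random as r`.
            if any(alias.name.split(".")[0] == "random" for alias in node.names):
                return True
        elif isinstance(node, ast.ImportFrom):
            # Covers `from random import choice`.
            if node.module and node.module.split(".")[0] == "random":
                return True
    return False
```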
> I think it would be best to check for whether the random module was imported in the strategy. […] You could also remove random imports if a strategy doesn't use it to decrease false positive rates.
Some strategies might import random but not actually use it though
> Some strategies might import random but not actually use it though
You could also remove random imports if a strategy doesn't use it to decrease false positive rates.
After removing unnecessary imports, I found that a basic cache with a dictionary gave a 3-4x speedup
Guys, don't forget that the game length is random. That can have an effect even on a match between two non-random strategies. You can't do that optimization.
> Guys, don't forget that the game length is random. That can have an effect even on a match between two non-random strategies. You can't do that optimization.

I didn't forget; I mentioned it at the end:

> It's still probably a good idea to run matches not using RNG a few times since the match length is random, and that can affect the score.
The difference that makes, compared to other forms of RNG, is probably far less noticeable. Historically, a lot of strategies will cooperate until defected against, and in that situation the score is 3 per turn for both sides no matter how long the match is. If you want to be as cautious as possible about that type of RNG, you could check whether the score for both players stays within a small range (+/- 3% or so) across a few rounds, and if it does, it's pretty safe to just average from there and move on.
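That ±3% check is easy to express (a sketch; `tolerance` is relative to the mean across the probe rounds):

```python
def scores_stable(scores, tolerance=0.03):
    """True if every probe-round score is within ±tolerance of the mean."""
    mean = sum(scores) / len(scores)
    if mean == 0:
        # Degenerate case: all-zero scores count as stable.
        return all(s == 0 for s in scores)
    return all(abs(s - mean) <= tolerance * abs(mean) for s in scores)
```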
If you want to be completely sure that game-length randomness is handled, what you could do is cache the entire game history and memories, then generate more of the round when the game length is longer than the history stored in the cache, and just return the cached prefix when it isn't. This should produce exactly the same results as outright running the strategies the full number of times, and would be quicker than rerunning even just a few times.
This could also run into memory issues on a larger set, but if that becomes a problem you could store the data so that each value uses only 1 bit instead of 32.
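A sketch of that grow-on-demand cache. Strategies are again simplified to `move(turn)` functions; the repo's real history-dependent strategies would also need their memory snapshotted alongside the history:

```python
# Grow-on-demand cache for one deterministic pairing. Longer games extend
# the cached history; shorter games just reuse a prefix of it.
class MatchCache:
    def __init__(self, strat_a, strat_b):
        self.strat_a, self.strat_b = strat_a, strat_b
        self.history = []

    def get(self, length):
        # Only generate the turns we haven't cached yet.
        while len(self.history) < length:
            t = len(self.history)
            self.history.append((self.strat_a(t), self.strat_b(t)))
        return self.history[:length]
```

For the 1-bit storage idea, one option would be packing 0/1 move arrays with `numpy.packbits`.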
If Cary plans to run the code multiple times, yes, building a cache is ideal, but otherwise, it's a waste of time and resources.
In the fork I'm working on, we're implementing a cache, but I see no need for Cary to use one if he only plans to run things once.
> If Cary plans to run the code multiple times, yes, building a cache is ideal, but otherwise, it's a waste of time and resources. […]

They won't just be run once, though:

> I will probably do something like peopleOfPalay's edit where I run the tournament over and over and take the average of all of them.
Also, as per the idea I mentioned before, continuing a game from cached history could be problematic for consistency, so it would probably be better to precompute the cache to 500 turns or so once, and then rerun the match from the start in the unlikely case that a game runs longer than that.