
alphagozero's People

Contributors

dependabot[bot], narsil


alphagozero's Issues

Generate more training data

I'm interested in how to generate more training data from existing .sgf files, especially the policy_target and value_target attributes. Is there code that already implements this feature?
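
For illustration, here is a minimal sketch of what such a converter could look like. This is not code from this repo: it assumes the sgfmill library, a 19x19 board, a flat 362-entry policy vector (361 points plus pass), and the usual AGZ convention that value_target is +1 when the side to move eventually won.

```python
import numpy as np
from sgfmill import sgf  # pip install sgfmill

SIZE = 19
PASS = SIZE * SIZE  # index 361 reserved for the pass move

def targets_from_sgf(sgf_bytes):
    """Yield (colour, policy_target, value_target) for each move in the game."""
    game = sgf.Sgf_game.from_bytes(sgf_bytes)
    winner = game.get_winner()  # 'b', 'w', or None
    for node in game.get_main_sequence():
        colour, move = node.get_move()
        if colour is None:  # root / setup node, no move played
            continue
        # One-hot policy over the 362 actions.
        policy_target = np.zeros(SIZE * SIZE + 1, dtype=np.float32)
        if move is None:  # pass
            policy_target[PASS] = 1.0
        else:
            row, col = move
            policy_target[row * SIZE + col] = 1.0
        # +1 if the side to move eventually won, -1 if it lost, 0 if unknown.
        value_target = 0.0 if winner is None else (1.0 if winner == colour else -1.0)
        yield colour, policy_target, value_target
```

Note that expert games only give a one-hot policy per position, whereas self-play records a full visit-count distribution, so supervised targets are noisier per position.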

Huge number of files created

It ran for a couple of days and found several new best models. However, it also created numerous files (502,586 items, totalling 5.6 GB). The models directory is large, and the games directory holds most of the files. Perhaps zipping them would be worthwhile. In any case, I'm happy to restart it after you have had a chance to make more improvements. Thanks again for sharing.
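
If it helps in the meantime, compressing finished games is cheap with the standard library. A sketch, assuming the games sit under a games/ directory as plain .sgf files (which may not match the repo's exact layout):

```python
import gzip
import shutil
from pathlib import Path

# Compress every finished game in place and drop the original.
for sgf_path in Path("games").rglob("*.sgf"):
    with open(sgf_path, "rb") as src, gzip.open(str(sgf_path) + ".gz", "wb") as dst:
        shutil.copyfileobj(src, dst)
    sgf_path.unlink()
```

Per-file gzip shrinks the 5.6 GB (SGF is plain text and compresses well) but not the 502,586-item count; bundling whole generations into single tar archives would address both.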

No Model Progress

Did some more runs with SHOW_END_GAME = True.
No model progress at all does not seem right after 63 models.
Before the OOM fix (#3), it was finding better models (until hitting OOM).

I have a log which I could email you, unless you would rather I post it here.
I think my email is public.
Thanks

Sampled evaluation games

In the original paper, only positions from self-play games are sampled. Those games use temperature=1 for part of the game, meaning more exploration. Won't adding all the evaluation games to the pool we sample from heavily reduce exploration? Of course we could stop recording evaluation games if we parallelize everything, but since recording them saves time, I was wondering whether you know if it has any noticeable negative impact.
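
For context, the difference in exploration comes from how moves are drawn from the visit counts. A minimal sketch (hypothetical names, not this repo's API):

```python
import numpy as np

def sample_move(visit_counts, temperature=1.0):
    """Draw a move index from MCTS visit counts N(a).

    temperature=1 samples proportionally to N(a), as in the exploratory
    part of self-play; temperature -> 0 degenerates to argmax, which is
    how evaluation games are typically played.
    """
    counts = np.asarray(visit_counts, dtype=np.float64)
    if temperature == 0:
        probs = np.zeros_like(counts)
        probs[counts.argmax()] = 1.0
    else:
        scaled = counts ** (1.0 / temperature)
        probs = scaled / scaled.sum()
    return np.random.choice(len(counts), p=probs)
```

Since evaluation games are played at (or near) temperature 0, recording them would skew the sampled positions toward greedy lines, which is the worry raised above.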

Set size changed during iteration -- is this a problem?

Ubuntu 16.04 LTS
Thanks,
BrianR (author of Tinker chess engine)

brianr@Tinker-Ubuntu:~/alphagozero$ python3 main.py
Using TensorFlow backend.
2017-11-06 16:53:14.674061: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:892] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2017-11-06 16:53:14.674512: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1030] Found device 0 with properties:
name: GeForce GTX 770 major: 3 minor: 0 memoryClockRate(GHz): 1.163
pciBusID: 0000:01:00.0
totalMemory: 1.94GiB freeMemory: 1.67GiB
2017-11-06 16:53:14.674554: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1120] Creating TensorFlow device (/device:GPU:0) -> (device: 0, name: GeForce GTX 770, pci bus id: 0000:01:00.0, compute capability: 3.0)
Exception in thread Thread-2:
Traceback (most recent call last):
  File "/usr/lib/python3.5/threading.py", line 914, in _bootstrap_inner
    self.run()
  File "/home/brianr/.local/lib/python3.5/site-packages/tqdm/_tqdm.py", line 144, in run
    for instance in self.tqdm_cls._instances:
  File "/usr/lib/python3.5/_weakrefset.py", line 60, in __iter__
    for itemref in self.data:
RuntimeError: Set changed size during iteration

BTW it is running and says:
Evaluation model_2 vs model_1 (winrate:100%): 100%|██████████████████████████████████████████████████████| 10/10 [19:11<00:00, 115.13s/it]
We found a new best model : model_2!
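
This looks like the known race in tqdm's monitor thread, which iterates the _instances WeakSet while other threads create and close bars. If I remember correctly, the commonly suggested workaround for older tqdm versions (besides simply upgrading tqdm) was to disable the monitor thread:

```python
from tqdm import tqdm

# Disable tqdm's monitor thread so it never iterates the _instances
# WeakSet while another thread is mutating it.
tqdm.monitor_interval = 0
```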

Potential problems

Edit: turned it into a general thread instead

  1. The AGZ spreadsheet mentions only one filter for the value head. In this implementation, two filters are used. Is there a reason for that? I don't think it will have a big impact, but I'm just putting it out there.

  2. The target policies that are created during simulated games are taken from the prior probabilities p, which are calculated by the neural net. From the AGZ cheatsheet, I believe the target policies should instead be the search probabilities, which are derived from each move's visit count and the temperature parameter (see the formula below).
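
For reference, the AGZ paper defines the training target at the root state s_0 as the visit-count distribution, not the network prior:

```latex
\pi(a \mid s_0) = \frac{N(s_0, a)^{1/\tau}}{\sum_b N(s_0, b)^{1/\tau}}
```

where N counts visits from the root and \tau is the temperature; with \tau = 1 this is simply the normalized visit counts.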

Some notes:

  1. During MCTS search there are lots of zero Q-values, and patches of Q-values close to 1 often appear. (This might just be due to a bad network.)

  2. The MCTS batched search yields more Q-values, but the search depth is considerably lower. Chosen moves are at most depth 4 from the current position, and usually 2 or 3. Running 64 simulations with batch size 1 can give chosen moves up to depth 66 from the current position, but it is of course slower. I'm unsure what a good balance is; it's hard to tune. (A virtual-loss sketch follows below.)
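
One common way to keep batched search deep is virtual loss: while a leaf sits in the evaluation queue, the path to it is temporarily penalized so the next selection in the same batch is steered elsewhere. A minimal sketch with a hypothetical Node (w = total value, n = visits, q = w / n, u = exploration bonus); this is not this repo's tree code, and the per-player sign flip during backup is elided:

```python
VIRTUAL_LOSS = 1.0

def select_leaf(root):
    """Descend by PUCT, applying a virtual loss along the path."""
    node, path = root, []
    while node.children:
        node = max(node.children, key=lambda c: c.q + c.u)  # PUCT score
        node.n += 1              # count the pending visit now
        node.w -= VIRTUAL_LOSS   # pretend this line is losing for a moment
        node.q = node.w / node.n
        path.append(node)
    return node, path

def backup(path, value):
    """Swap the virtual loss for the real network value after the batched
    evaluation returns, leaving a net update of n += 1, w += value."""
    for node in path:
        node.w += VIRTUAL_LOSS + value
        node.q = node.w / node.n
```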

Easy way to import existing sgf files?

Is there an easy way to train the network on a large set of professional-level sgf files? I have a database of several tens of thousands of games I'd like to use to set the initial weights, but I'm not sure how to convert them into the format AGZ needs.

Thanks!
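
Not a documented entry point of this repo, but here is a sketch of the conversion step, assuming sgfmill and a bare 19x19 encoding (1 = black, -1 = white). The real project presumably stacks history planes on top, and a faithful converter also needs a rules engine to remove captured stones:

```python
import numpy as np
from sgfmill import sgf  # pip install sgfmill

SIZE = 19

def replay(sgf_bytes):
    """Yield (board_before_move, flat_move_index, value) per move."""
    game = sgf.Sgf_game.from_bytes(sgf_bytes)
    winner = game.get_winner()  # 'b', 'w', or None
    board = np.zeros((SIZE, SIZE), dtype=np.int8)
    for node in game.get_main_sequence():
        colour, move = node.get_move()
        if colour is None or move is None:  # root node or pass
            continue
        row, col = move
        # +1 if the side to move eventually won, -1 if it lost.
        value = 0.0 if winner is None else (1.0 if winner == colour else -1.0)
        yield board.copy(), row * SIZE + col, value
        board[row, col] = 1 if colour == 'b' else -1
        # NB: captures are not removed here; use a real rules engine
        # (e.g. sgfmill.boards.Board) for correct positions.
```

Pretraining would then be an ordinary supervised fit of the network's policy and value heads over these triples, as in the original AlphaGo's supervised stage.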
