Comments (6)
It sounds interesting, but there is a high cost. Take Suggestion 1 as an example. You talked about only doing it for random positions, but in the introduction you said the main point is to combine everything into a single NN with multiple outputs. I think that means you need to calculate what the net should output in those cases for all positions, not just some. So self-play games would need a full 800-node search for both sides at every position, and you would need to double the number of outputs.
Combining the value and policy heads was essentially free because we were already computing both. So the question is whether it is worth generating games at half the rate and having slower NN evals. I can't even guess. :)
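A rough way to put numbers on that trade-off is sketched below. It assumes self-play cost is dominated by NN evaluations and that each searched node costs one evaluation; `self_play_cost` and its parameters are hypothetical names for illustration, not part of lc0.

```python
def self_play_cost(nodes_per_search=800, searches_per_position=2,
                   eval_slowdown=1.0):
    """Estimate the cost of searching every position for both sides.

    Assumptions (not from lc0 itself): one NN evaluation per searched node,
    and a baseline of a single search per position with eval_slowdown == 1.

    Returns (evals_per_position, relative_game_rate), where the rate is
    relative to the baseline.
    """
    if nodes_per_search <= 0 or searches_per_position <= 0 or eval_slowdown <= 0:
        raise ValueError("all arguments must be positive")
    evals_per_position = nodes_per_search * searches_per_position
    # Each evaluation may also be slower if the net has more outputs.
    relative_game_rate = 1.0 / (searches_per_position * eval_slowdown)
    return evals_per_position, relative_game_rate


evals, rate = self_play_cost()
print(f"{evals} evals per position, {rate:.0%} of the baseline game rate")
```

With the defaults this gives 1600 evaluations per position and half the baseline game rate; any slowdown from the extra outputs multiplies on top of that.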
from lczero.org.
I agree, killerducky: getting policy for both sides should involve twice as much calculation, or half as many games, not to mention the added structural cost of a net with a bigger input and output.
With regard to games, I think the eventual ceiling of the net is the important thing, as things seem to be heading now.
With regard to the cost/benefit of having better but slower NN evals, I believe that aiming for better is always good. It is what drives the improvement and makes Leela viable at all. The hope is that adding more information/structure will lead to a quantum leap in strength (as opposed to now, where we simply make the net bigger until it's not worth it anymore because it becomes too slow).
There are two types of suggestions here:
- Adding policy for the other side.
- Trying to get the net to think more in terms of moves, the logic of the position, deeper comprehension.
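For the first suggestion, a policy output for the other side, one possible training objective is sketched below. It assumes an AlphaZero-style loss (squared error on value, cross-entropy on policy) and simply adds a second cross-entropy term for the opponent's policy; `combined_loss` and `opp_weight` are hypothetical names, not lc0 code.

```python
import math


def cross_entropy(target, predicted, eps=1e-12):
    """Cross-entropy between two probability vectors of equal length."""
    return -sum(t * math.log(p + eps) for t, p in zip(target, predicted))


def combined_loss(value_target, value_pred,
                  own_policy_target, own_policy_pred,
                  opp_policy_target, opp_policy_pred,
                  opp_weight=1.0):
    """Value loss plus policy losses for the side to move and the opponent.

    opp_weight is an assumed knob for how much the extra head contributes;
    setting it to 0 recovers the usual two-term loss.
    """
    value_loss = (value_target - value_pred) ** 2
    own_loss = cross_entropy(own_policy_target, own_policy_pred)
    opp_loss = cross_entropy(opp_policy_target, opp_policy_pred)
    return value_loss + own_loss + opp_weight * opp_loss


# Toy example with two legal moves per side.
loss = combined_loss(1.0, 0.5, [1.0, 0.0], [0.5, 0.5], [0.0, 1.0], [0.5, 0.5])
print(round(loss, 4))
```

The opponent-policy targets would still have to come from a search for the other side, which is where the extra self-play cost comes from.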
After thinking more about it, the second part seems to be more involved and not readily supported by the current Leela structure. It is not clear that Leela is thinking much at all in terms of tactical logic or move logic (if I move this piece, what will then be possible in the position), as evidenced by the problem with discovered checks. Instead, to put the difference in simple words, it may be relying on very fine-tuned and balanced but shallow pattern recognition.
The idea of making more information flow into the net by predicting the next k moves has been explored in the game of Go by Tian et al. (Facebook, Darkforest bot) in a supervised learning setting, achieving better prediction accuracy:
https://arxiv.org/abs/1511.06410. It could work for RL as well, and in principle it is readily applicable to chess.
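As a minimal sketch of what "predicting the next k moves" means for the training data, the function below builds, for each position in a game record, the list of the k moves that followed it. The name `next_k_move_targets`, the default `k=3`, and the padding convention are assumptions for illustration and are not taken from the paper.

```python
def next_k_move_targets(moves, k=3, pad=None):
    """For each position i (the position before moves[i] is played),
    return the k moves that were actually played from there.

    Positions near the end of the game are padded with `pad` so that
    every target has exactly k entries.
    """
    if k <= 0:
        raise ValueError("k must be positive")
    moves = list(moves)
    targets = []
    for i in range(len(moves)):
        window = moves[i:i + k]
        window += [pad] * (k - len(window))
        targets.append(window)
    return targets


game = ["e4", "e5", "Nf3", "Nc6"]
for position_index, target in enumerate(next_k_move_targets(game)):
    print(position_index, target)
```

Each of the k entries would then be the label for its own policy output, so the net is asked about the reply and the follow-up as well as the move itself.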
Another paper, Multi-Labelled Value Networks for Computer Go (Wu et al., https://arxiv.org/abs/1705.10701), also reports training a value head to output the position value for different komi (compensation points for the second side to move), which de facto amounts to injecting more information into the network. An additional board value (BV) head, sharing the same network front end with the value head, is trained to output the status of stones/intersections at the end of the game, which is yet another implementation of the same idea. These last two (the multi-komi value head and the BV head) are specific to the game of Go, though.
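To illustrate how a single finished game can supply a label for every komi at once, here is a small sketch. It assumes the final score margin is measured as Black's points minus White's points before komi, and that Black wins when the margin exceeds the komi; `multi_komi_labels` is a hypothetical helper, not the paper's code.

```python
def multi_komi_labels(score_margin, komis):
    """Return one win/loss label per komi value for a finished game.

    score_margin: Black's points minus White's points, before komi
                  (an assumed convention for this sketch).
    komis:        iterable of komi values to label.
    A label of 1 means Black would have won under that komi, 0 otherwise.
    """
    return [1 if score_margin > komi else 0 for komi in komis]


# A game Black won by 7 points on the board.
print(multi_komi_labels(7, [5.5, 6.5, 7.5, 8.5]))
```

One game result thus yields a whole vector of value targets instead of a single win/loss bit, which is the sense in which more information is injected into the network.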
While I don't have a better suggestion for where to post ideas like this (dev forum?), it would be nice for GitHub issues to be more actionable and task-oriented, so that we can mark them "done" sometimes.
Keeping this open for now, but we probably need a better place for non-actionable ideas.
I think a good place for writeups like these would be a section on our lczero.org website (even if the idea is already implemented or no longer relevant).
So I'm moving the issue there so that we don't forget to migrate it; afterwards it can be closed.