Comments (6)

killerducky commented on July 30, 2024

It sounds interesting, but there is a high cost. I will use Suggestion 1 as an example. You talked about only doing it for random positions, but in the introduction you mentioned the main point is to combine everything into a single NN with multiple outputs. I think that means you need to calculate what the net should output for those cases for all positions, not just some. So you need self-play games to do a full 800 node search for both sides for every position. And you need to double the number of outputs.

Combining value and policy heads is essentially free because we were already computing them. So the question is will it be worth it to generate games at half the rate, and have your NN evals be slower? I can't even guess. :)
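To put rough numbers on that trade-off, here is a back-of-the-envelope sketch. Only the 800-node figure comes from the comment above; the game length, evaluation rate, and slowdown factor are made-up placeholders, and `games_per_hour` is a hypothetical helper, not anything in the Leela codebase.

```python
def games_per_hour(nodes_per_search, searches_per_position, plies_per_game,
                   evals_per_second, eval_slowdown=1.0):
    """Estimate self-play throughput, assuming one NN eval per searched node."""
    evals_per_game = nodes_per_search * searches_per_position * plies_per_game
    effective_rate = evals_per_second / eval_slowdown
    return 3600.0 * effective_rate / evals_per_game


# Placeholder assumptions, not measurements.
PLIES_PER_GAME = 120
EVALS_PER_SECOND = 10_000

# Current scheme: one 800-node search per position.
baseline = games_per_hour(800, 1, PLIES_PER_GAME, EVALS_PER_SECOND)
# Suggestion 1: a second 800-node search for the other side at every position.
both_sides = games_per_hour(800, 2, PLIES_PER_GAME, EVALS_PER_SECOND)
# Same, plus a hypothetical 10% slower net because of the extra outputs.
both_sides_slower_net = games_per_hour(
    800, 2, PLIES_PER_GAME, EVALS_PER_SECOND, eval_slowdown=1.1
)

print(f"baseline:              {baseline:.1f} games/hour")
print(f"both sides:            {both_sides:.1f} games/hour")
print(f"both sides, slower NN: {both_sides_slower_net:.1f} games/hour")
```

Whatever placeholder values are plugged in, searching for both sides halves the game rate, and any slowdown from the larger net comes on top of that.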

from lczero.org.

DaghN commented on July 30, 2024

I agree, killerducky: getting policy for both sides should involve twice as much calculation, or half as many games, not to mention the added net structure cost of a bigger input and output.

With regard to games, I think the eventual ceiling of the net is the important thing, as things seem to be heading now.

With regard to the cost/benefit of having better but slower NN evals, I believe that aiming for better is always good; it is what drives the improvement and makes Leela viable at all. The hope is that adding more information/structure will lead to a quantum leap in strength. (As opposed to now, where we simply make the net bigger until it's not worth it anymore because it becomes too slow.)

DaghN commented on July 30, 2024

There are two types of suggestions here:

  1. Adding policy for the other side.

  2. Trying to get the net to think more in terms of moves, the logic of the position, deeper comprehension.

After thinking more about it, the second part seems to be more involved and not readily facilitated by the current Leela structure. It is not clear that Leela is thinking much at all in terms of tactical logic or move logic (if I move this piece, what will then be possible in the position), as evidenced by the problem with discovered checks. Instead, to put the difference in simple words, maybe it is relying more on very fine-tuned/balanced but shallow pattern recognition.

Ishinoshita commented on July 30, 2024

The idea of feeding more information into the net by predicting the next k moves has been explored for Go by Tian et al. (Facebook's Darkforest bot) in a supervised-learning setting, where it improved prediction accuracy: https://arxiv.org/abs/1511.06410. It could work for RL as well, and in principle it is readily applicable to chess.
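As a rough illustration of what "predicting the next k moves" means for training data, here is a minimal sketch of how such labels could be built from a game record. It is not the paper's pipeline; `next_k_move_targets`, the move strings, and the `None` padding are all assumptions made for the example.

```python
def next_k_move_targets(moves, k):
    """For each position i (before moves[i] is played), return the next k moves.

    Positions near the end of the game have fewer than k moves left,
    so their targets are padded with None.
    """
    targets = []
    for i in range(len(moves)):
        window = moves[i:i + k]
        window = window + [None] * (k - len(window))
        targets.append(window)
    return targets


# A short made-up chess opening, moves alternating between the two sides.
game = ["e2e4", "e7e5", "g1f3", "b8c6", "f1b5"]
targets = next_k_move_targets(game, k=3)

for ply, target in enumerate(targets):
    print(ply, target)
```

With k = 1 this reduces to the usual single-move policy target; larger k makes the net also predict the opponent's reply and the follow-up.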
Another paper, Multi-Labelled Value Networks for Computer Go (Wu et al., https://arxiv.org/abs/1705.10701), reports training a value head to output the position value for different komi (compensation points for the side that moves second), which de facto amounts to injecting more information into the network. An additional board value (BV) head, sharing the same network front end as the value head, is trained to output the status of stones/intersections at the end of the game, which is yet another implementation of the same idea. These last two (the multi-komi value head and the BV head) are specific to Go, though.
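To make the multi-komi idea concrete, here is a small sketch of how one finished game could yield a value label per komi. Deriving the labels from the final score margin is an assumption of this example, as are the function name and the komi values; the paper's actual labelling scheme may differ.

```python
def multi_komi_value_labels(black_margin, komis):
    """Map a final board score to a game result for Black at each komi.

    black_margin: Black's points minus White's points, before komi.
    Returns {komi: +1 if Black wins, -1 if White wins, 0 if tied}.
    """
    labels = {}
    for komi in komis:
        diff = black_margin - komi
        labels[komi] = (diff > 0) - (diff < 0)  # sign of diff as an int
    return labels


# Hypothetical game in which Black finished 5 points ahead on the board.
labels = multi_komi_value_labels(black_margin=5, komis=[3.5, 4.5, 5.5, 6.5, 7.5])

for komi, result in labels.items():
    print(f"komi {komi}: {result:+d}")
```

A single game thus supplies several value targets instead of one, which is the sense in which more information flows into the network.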

mooskagh commented on July 30, 2024

While I don't have a better suggestion for where to post ideas like this (a dev forum?), it would be nice for GitHub issues to be more actionable and task-oriented, so that we can mark them "done" sometimes.
Keeping this open for now, but we probably need a better place for non-actionable ideas.

mooskagh commented on July 30, 2024

I think a good place for writeups like these would be a section on our lczero.org website (even if the idea is already implemented or no longer relevant).
So I'm moving this issue there so that we don't forget to migrate it; afterwards it can be closed.
