zxqfl / mcts

Generic, parallel Monte Carlo tree search library

License: MIT License

Rust 100.00%

mcts's Introduction

mcts

This is a library for Monte Carlo tree search (MCTS) in Rust.

The implementation is parallel and lock-free. The generic design allows it to be used in a wide variety of domains. It can accommodate different search methodologies (for example, rollout-based or neural-net-based leaf evaluation).

Documentation

mcts's Issues

Multi-Agent (Players)

Hi,

Thank you so much for publishing this implementation of MCTS. Using it has helped me greatly in learning how MCTS works and experimenting 👍

Do you have any examples or suggestions for how to handle a multi-player environment? The only example in the repo is a simple counting game, which is single-player. I see that there are player methods, but the example does not show how to use them.

Also, are there any considerations for whether the player moves are simultaneous vs serial per turn? In the example I'm working on, there may be 2 to 8 players, one of whom is the player I'm trying to maximize, and the others are opponents I'm trying to minimize, and all players always make one of their available moves simultaneously once per turn. In other words, my player does not know which of the moves each other player will make, until the next child node has been simulated (assuming I choose randomly what the other player is likely to do).

Thanks in advance!
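
One common way to fit simultaneous moves into a turn-based search is to serialize them: each player picks in a fixed order, the picks are buffered, and the joint move is applied only once everyone has chosen. The sketch below is a toy state that does this. It is not taken from the library; `SimultaneousGame`, `current_player`, and `make_move` are hypothetical names, and the "game" (each move is simply added to that player's score) is made up for illustration. Note that in a serialized tree, later players' nodes sit below earlier players' choices, so the search can behave as if later players had seen those choices.

```rust
/// Toy state for a game where all players move simultaneously.
/// Hypothetical illustration; not part of the mcts crate.
#[derive(Clone, Debug, PartialEq)]
struct SimultaneousGame {
    /// One score per player.
    scores: Vec<i64>,
    /// Moves chosen so far in the current turn, in player order.
    pending: Vec<i64>,
}

impl SimultaneousGame {
    fn new(players: usize) -> Self {
        assert!(players > 0, "need at least one player");
        SimultaneousGame {
            scores: vec![0; players],
            pending: Vec::with_capacity(players),
        }
    }

    /// The player whose choice is being collected next.
    fn current_player(&self) -> usize {
        self.pending.len()
    }

    /// Buffer one player's move; resolve the turn once all have chosen.
    fn make_move(&mut self, mv: i64) {
        self.pending.push(mv);
        if self.pending.len() == self.scores.len() {
            // All players have committed: apply the joint move at once.
            for (score, mv) in self.scores.iter_mut().zip(self.pending.drain(..)) {
                *score += mv;
            }
        }
    }
}
```

With three players, the scores stay unchanged after the first two calls to `make_move` and are updated together on the third, after which `current_player` returns 0 again.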

Purpose of the TranspositionHash trait

I have noticed that a custom trait, TranspositionHash, is used for the transposition table. Is there a reason why the standard Hash trait cannot be used here?
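
For context, the standard `Hash` trait does not return a value; it feeds bytes into a `Hasher`, and the caller decides which hasher to use. A table that is keyed directly on an integer needs the finished value. The sketch below shows one way to bridge the two, assuming the custom trait boils down to "return a `u64` key" (that is an assumption here, not something confirmed by the source). `KeyHash`, `key_from_std_hash`, and `Position` are hypothetical names used only for illustration.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Hypothetical stand-in for a trait that yields a table key directly.
trait KeyHash {
    fn key(&self) -> u64;
}

/// Derive a u64 key from any type that implements the standard `Hash`.
fn key_from_std_hash<T: Hash>(value: &T) -> u64 {
    // DefaultHasher::new() uses fixed keys, so the result is stable
    // within one build of the program.
    let mut hasher = DefaultHasher::new();
    value.hash(&mut hasher);
    hasher.finish()
}

/// Made-up game state used only to exercise the adapter.
#[derive(Hash)]
struct Position {
    cells: [u8; 9],
    player_to_move: u8,
}

impl KeyHash for Position {
    fn key(&self) -> u64 {
        key_from_std_hash(self)
    }
}
```

A dedicated trait also leaves room for domain-specific keys (for example, an incrementally updated Zobrist-style hash) that a derived `Hash` implementation would not produce.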

Is there a way to diagnose number of playouts?

Hi,

When using playout_parallel_for, is there a way to diagnose the final number of playouts that were done between when it was started and when the timeout was reached?

My use case is to set a constant timeout, and use the final number of playouts as a performance measurement. There is a way to diagnose the number of nodes (and this is included in .diagnose()) but I didn't see anything for playouts.

Thanks!
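
Whether the library exposes a playout count is not stated here. As a possible workaround, a shared atomic counter can be incremented from whatever code runs once per playout (that "once per playout" hook is an assumption, and where it lives in user code depends on the setup). The sketch below shows only the counting pattern; `PlayoutCounter` and `run_workers` are hypothetical names, and the loop body is a stand-in for a real playout.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Arc;
use std::thread;

/// Cheaply clonable counter that can be shared across search threads.
#[derive(Clone, Default)]
struct PlayoutCounter(Arc<AtomicUsize>);

impl PlayoutCounter {
    /// Call once per playout.
    fn record(&self) {
        self.0.fetch_add(1, Ordering::Relaxed);
    }

    /// Total number of recorded playouts.
    fn total(&self) -> usize {
        self.0.load(Ordering::Relaxed)
    }
}

/// Simulate several worker threads, each doing a fixed number of playouts.
fn run_workers(counter: &PlayoutCounter, threads: usize, playouts_each: usize) {
    let handles: Vec<_> = (0..threads)
        .map(|_| {
            let counter = counter.clone();
            thread::spawn(move || {
                for _ in 0..playouts_each {
                    // Stand-in for one playout.
                    counter.record();
                }
            })
        })
        .collect();
    for handle in handles {
        handle.join().unwrap();
    }
}
```

Reading `total()` after the search has stopped gives the number of playouts recorded during that run; dividing by the timeout gives a playouts-per-second figure.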

Committing to moves

First of all, thanks a lot for this library. It's very well thought out and designed, and it works great.

I'm using this in a real-time context in a deterministic full information game.

For performance purposes, I would like to be able to register that a player "commits" to one of the available moves at the root, just dropping the other possible moves out of the tree but maintaining the computation of everything that was below that move (since it's typically one of the main variations, this is where a very significant portion of the nodes should be).

Would that be something that can be achieved? (I would probably work on it, but the limited documentation makes it somewhat hard to be sure of all the unsafe invariants that have to be upheld.)

Thank you for your help!

Thoughts:

This seems pretty straightforward if we do not reclaim memory (we just change the root node and put all the other nodes in orphaned). Reclaiming memory, however, is a whole other story, mostly due to potential cycles.

Thoughts on reclaiming memory:

As nodes may reference each other a lot, in order to reclaim space in the main structure we'd have to figure out a way to determine a new owner for a node when we discard its owning node, while properly handling cyclic references to avoid memory leaks.
I use transposition_table::ApproxTable. If I understand correctly, space can never be reclaimed in this particular struct, because otherwise we'd have to remove the early stop on finding an empty entry. Overall, if we intend to free memory, we need an extra CleanableTranspositionTable trait (or similar) that enables removing entries, and commit_move would have to be conditional on the transposition table supporting it.
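
To make the idea of committing to a move concrete, here is a minimal sketch on a plain owned tree. It is not the library's lock-free structure: there is no transposition table, every node has exactly one owner, and so the sharing and cycle problems described above do not arise. `Node` and this free-standing `commit_move` are hypothetical and only illustrate "keep the chosen subtree, drop the siblings".

```rust
/// Simplified search-tree node that owns its children outright.
#[derive(Debug)]
struct Node {
    visits: u64,
    /// (move id, subtree reached by that move)
    children: Vec<(u32, Node)>,
}

/// Consume the old root and return the subtree under `mv`, if any.
/// All sibling subtrees are dropped (and their memory freed) here.
fn commit_move(root: Node, mv: u32) -> Option<Node> {
    root.children
        .into_iter()
        .find(|(candidate, _)| *candidate == mv)
        .map(|(_, child)| child)
}
```

The statistics accumulated below the chosen move survive the re-rooting, which is the property the request is after. With shared nodes, dropping the siblings would no longer be safe without some ownership or reachability check.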

Need clarifications

Hi,
I don't know if you are still maintaining this project, but if you are ever available, I would like some clarification on how to use your library, which has a nice API.

I'm not an MCTS pro, but is the function evaluate_new_state meant to evaluate all the possible new states of a node?

If you ever have time, maybe you could also upgrade the dependencies to bring the codebase up to date.

Thanks for your library.

How are move evaluations used? (Empty list in example)

Hi,

I was just curious how move evaluations are meant to be used? In the counting game example, the evaluation returns a vector of () for each move, and just returns a state evaluation instead.

I can see in the code that there is an AlphaGo policy that uses the move evaluations (that makes sense). So I just wanted to check to see if I got this right: If I'm using the UCTPolicy, I should return only a state evaluation, and not move evaluations, and if I'm using the AlphaGoPolicy, I should return move evaluations and a state evaluation? Or does the AlphaGoPolicy want only move evaluations and not a state evaluation?

Thanks!
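
As background for the question, the textbook selection formulas show where a per-move evaluation would enter. Plain UCT scores a child from its mean value and visit counts only, while the AlphaGo-style PUCT rule additionally weights exploration by a prior probability for the move. The sketch below implements those textbook formulas; the function names and parameters are hypothetical, and the library's UCTPolicy and AlphaGoPolicy may differ in detail.

```rust
/// Textbook UCT: mean value plus an exploration bonus from visit counts.
/// No per-move prior is involved.
fn uct_score(mean_value: f64, parent_visits: u64, child_visits: u64, c: f64) -> f64 {
    if child_visits == 0 {
        // Unvisited children are tried first.
        return f64::INFINITY;
    }
    mean_value + c * ((parent_visits as f64).ln() / child_visits as f64).sqrt()
}

/// Textbook PUCT (AlphaGo-style): the exploration bonus is scaled by a
/// prior probability assigned to the move.
fn puct_score(
    mean_value: f64,
    prior: f64,
    parent_visits: u64,
    child_visits: u64,
    c: f64,
) -> f64 {
    mean_value + c * prior * (parent_visits as f64).sqrt() / (1.0 + child_visits as f64)
}
```

In both formulas `mean_value` comes from backed-up state evaluations, so under these textbook definitions a prior-based policy uses move evaluations in addition to, not instead of, a state evaluation.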
