alphazero_connect4's People

Contributors

plkmo

alphazero_connect4's Issues

Get RuntimeError: CUDA error: unspecified launch failure

I get a runtime error when I run the code. What happened? My GPU is a GTX 1080:

RuntimeError: CUDA error: unspecified launch failure
 17%|███████████████                                                                           | 20/120 [1:34:56<7:54:44, 284.85s/it]
Process Process-1:
Traceback (most recent call last):
  File "/usr/lib/python3.7/multiprocessing/process.py", line 297, in _bootstrap
    self.run()
  File "/usr/lib/python3.7/multiprocessing/process.py", line 99, in run
    self._target(*self._args, **self._kwargs)
  File "/home/netcan/Workspace/Python/AlphaZero_Connect4/src/MCTS_c4.py", line 177, in MCTS_self_play
    root = UCT_search(current_board,777,connectnet,t)
  File "/home/netcan/Workspace/Python/AlphaZero_Connect4/src/MCTS_c4.py", line 137, in UCT_search
    child_priors = child_priors.detach().cpu().numpy().reshape(-1); value_estimate = value_estimate.item()
RuntimeError: CUDA error: unspecified launch failure
terminate called after throwing an instance of 'c10::Error'
  what():  CUDA error: unspecified launch failure (operator() at /pytorch/c10/cuda/CUDACachingAllocator.cpp:943)
frame #0: c10::Error::Error(c10::SourceLocation, std::string const&) + 0x46 (0x7fcad1403536 in /home/netcan/.local/lib/python3.7/site-packages/torch/lib/libc10.so)
frame #1: <unknown function> + 0x1635f (0x7fcad164635f in /home/netcan/.local/lib/python3.7/site-packages/torch/lib/libc10_cuda.so)
frame #2: <unknown function> + 0x7e047a (0x7fcab421147a in /home/netcan/.local/lib/python3.7/site-packages/torch/lib/libtorch_python.so)
frame #3: THCIpcDeleter::~THCIpcDeleter() + 0x51 (0x7fca6987ba61 in /home/netcan/.local/lib/python3.7/site-packages/torch/lib/libtorch_cuda.so)
frame #4: <unknown function> + 0x1070a8e (0x7fca6987ba8e in /home/netcan/.local/lib/python3.7/site-packages/torch/lib/libtorch_cuda.so)
frame #5: c10::TensorImpl::release_resources() + 0x4d (0x7fcad13f3abd in /home/netcan/.local/lib/python3.7/site-packages/torch/lib/libc10.so)
frame #6: <unknown function> + 0x5236b2 (0x7fcab3f546b2 in /home/netcan/.local/lib/python3.7/site-packages/torch/lib/libtorch_python.so)
frame #7: <unknown function> + 0x523756 (0x7fcab3f54756 in /home/netcan/.local/lib/python3.7/site-packages/torch/lib/libtorch_python.so)
frame #8: /usr/bin/python3() [0x5968b1]
frame #9: /usr/bin/python3() [0x5bc890]
frame #10: /usr/bin/python3() [0x4d2ece]
frame #11: /usr/bin/python3() [0x5bc890]
frame #12: /usr/bin/python3() [0x59688d]
frame #13: /usr/bin/python3() [0x5bc890]
frame #14: /usr/bin/python3() [0x4d2ece]
frame #15: /usr/bin/python3() [0x5bc890]
frame #16: /usr/bin/python3() [0x59688d]
frame #17: /usr/bin/python3() [0x5bc890]
frame #18: /usr/bin/python3() [0x4d2ece]
frame #19: /usr/bin/python3() [0x5bc890]
frame #20: /usr/bin/python3() [0x59688d]
frame #21: /usr/bin/python3() [0x59661b]
frame #22: /usr/bin/python3() [0x5bc890]
frame #23: /usr/bin/python3() [0x59688d]
frame #24: PyDict_SetItem + 0x337 (0x5b8ca7 in /usr/bin/python3)
frame #25: _PyModule_ClearDict + 0x107 (0x5aaae7 in /usr/bin/python3)
frame #26: PyImport_Cleanup + 0x354 (0x5386b4 in /usr/bin/python3)
frame #27: Py_FinalizeEx + 0x6e (0x633f9e in /usr/bin/python3)
frame #28: Py_Exit + 0x8 (0x6340b8 in /usr/bin/python3)
frame #29: /usr/bin/python3() [0x6309d0]
frame #30: PyErr_PrintEx + 0x1c (0x6309fc in /usr/bin/python3)
frame #31: PyRun_SimpleStringFlags + 0x4f (0x630bff in /usr/bin/python3)
frame #32: /usr/bin/python3() [0x6540d8]
frame #33: _Py_UnixMain + 0x2e (0x65420e in /usr/bin/python3)
frame #34: __libc_start_main + 0xeb (0x7fcad568909b in /lib/x86_64-linux-gnu/libc.so.6)
frame #35: _start + 0x2a (0x5df66a in /usr/bin/python3)

terminate called after throwing an instance of 'c10::Error'
  what():  CUDA error: unspecified launch failure (operator() at /pytorch/c10/cuda/CUDACachingAllocator.cpp:943)
frame #0: c10::Error::Error(c10::SourceLocation, std::string const&) + 0x46 (0x7f9cda0d1536 in /home/netcan/.local/lib/python3.7/site-packages/torch/lib/libc10.so)
frame #1: <unknown function> + 0x1635f (0x7f9cda31435f in /home/netcan/.local/lib/python3.7/site-packages/torch/lib/libc10_cuda.so)
frame #2: <unknown function> + 0x7e047a (0x7f9cbcedf47a in /home/netcan/.local/lib/python3.7/site-packages/torch/lib/libtorch_python.so)
frame #3: THCIpcDeleter::~THCIpcDeleter() + 0x51 (0x7f9c72549a61 in /home/netcan/.local/lib/python3.7/site-packages/torch/lib/libtorch_cuda.so)
frame #4: <unknown function> + 0x1070a8e (0x7f9c72549a8e in /home/netcan/.local/lib/python3.7/site-packages/torch/lib/libtorch_cuda.so)
frame #5: c10::TensorImpl::release_resources() + 0x4d (0x7f9cda0c1abd in /home/netcan/.local/lib/python3.7/site-packages/torch/lib/libc10.so)
frame #6: <unknown function> + 0x5236b2 (0x7f9cbcc226b2 in /home/netcan/.local/lib/python3.7/site-packages/torch/lib/libtorch_python.so)
frame #7: <unknown function> + 0x523756 (0x7f9cbcc22756 in /home/netcan/.local/lib/python3.7/site-packages/torch/lib/libtorch_python.so)
frame #8: /usr/bin/python3() [0x5968b1]
frame #9: /usr/bin/python3() [0x5bc890]
frame #10: /usr/bin/python3() [0x4d2ece]
frame #11: /usr/bin/python3() [0x5bc890]
frame #12: /usr/bin/python3() [0x59688d]
frame #13: /usr/bin/python3() [0x5bc890]
frame #14: /usr/bin/python3() [0x4d2ece]
frame #15: /usr/bin/python3() [0x5bc890]
frame #16: /usr/bin/python3() [0x59688d]
frame #17: /usr/bin/python3() [0x5bc890]
frame #18: /usr/bin/python3() [0x4d2ece]
frame #19: /usr/bin/python3() [0x5bc890]
frame #20: /usr/bin/python3() [0x59688d]
frame #21: /usr/bin/python3() [0x59661b]
frame #22: /usr/bin/python3() [0x5bc890]
frame #23: /usr/bin/python3() [0x59688d]
frame #24: PyDict_SetItem + 0x337 (0x5b8ca7 in /usr/bin/python3)
frame #25: _PyModule_ClearDict + 0x107 (0x5aaae7 in /usr/bin/python3)
frame #26: PyImport_Cleanup + 0x354 (0x5386b4 in /usr/bin/python3)
frame #27: Py_FinalizeEx + 0x6e (0x633f9e in /usr/bin/python3)
frame #28: Py_Exit + 0x8 (0x6340b8 in /usr/bin/python3)
frame #29: /usr/bin/python3() [0x6309d0]
frame #30: PyErr_PrintEx + 0x1c (0x6309fc in /usr/bin/python3)
frame #31: PyRun_SimpleStringFlags + 0x4f (0x630bff in /usr/bin/python3)
frame #32: /usr/bin/python3() [0x6540d8]
frame #33: _Py_UnixMain + 0x2e (0x65420e in /usr/bin/python3)
frame #34: __libc_start_main + 0xeb (0x7f9cde35709b in /lib/x86_64-linux-gnu/libc.so.6)
frame #35: _start + 0x2a (0x5df66a in /usr/bin/python3)

terminate called after throwing an instance of 'c10::Error'
  what():  CUDA error: unspecified launch failure (operator() at /pytorch/c10/cuda/CUDACachingAllocator.cpp:943)
frame #0: c10::Error::Error(c10::SourceLocation, std::string const&) + 0x46 (0x7f8358859536 in /home/netcan/.local/lib/python3.7/site-packages/torch/lib/libc10.so)
frame #1: <unknown function> + 0x1635f (0x7f8358a9c35f in /home/netcan/.local/lib/python3.7/site-packages/torch/lib/libc10_cuda.so)
frame #2: <unknown function> + 0x7e047a (0x7f835f70c47a in /home/netcan/.local/lib/python3.7/site-packages/torch/lib/libtorch_python.so)
frame #3: THCIpcDeleter::~THCIpcDeleter() + 0x51 (0x7f82f8b02a61 in /home/netcan/.local/lib/python3.7/site-packages/torch/lib/libtorch_cuda.so)
frame #4: <unknown function> + 0x1070a8e (0x7f82f8b02a8e in /home/netcan/.local/lib/python3.7/site-packages/torch/lib/libtorch_cuda.so)
frame #5: c10::TensorImpl::release_resources() + 0x4d (0x7f8358849abd in /home/netcan/.local/lib/python3.7/site-packages/torch/lib/libc10.so)
frame #6: <unknown function> + 0x5236b2 (0x7f835f44f6b2 in /home/netcan/.local/lib/python3.7/site-packages/torch/lib/libtorch_python.so)
frame #7: <unknown function> + 0x523756 (0x7f835f44f756 in /home/netcan/.local/lib/python3.7/site-packages/torch/lib/libtorch_python.so)
frame #8: /usr/bin/python3() [0x5968b1]
frame #9: /usr/bin/python3() [0x5bc890]
frame #10: /usr/bin/python3() [0x4d2ece]
frame #11: /usr/bin/python3() [0x5bc890]
frame #12: /usr/bin/python3() [0x59688d]
frame #13: /usr/bin/python3() [0x5bc890]
frame #14: /usr/bin/python3() [0x4d2ece]
frame #15: /usr/bin/python3() [0x5bc890]
frame #16: /usr/bin/python3() [0x59688d]
frame #17: /usr/bin/python3() [0x5bc890]
frame #18: /usr/bin/python3() [0x4d2ece]
frame #19: /usr/bin/python3() [0x5bc890]
frame #20: /usr/bin/python3() [0x59688d]
frame #21: /usr/bin/python3() [0x59661b]
frame #22: /usr/bin/python3() [0x5bc890]
frame #23: /usr/bin/python3() [0x59688d]
frame #24: PyDict_SetItem + 0x337 (0x5b8ca7 in /usr/bin/python3)
frame #25: _PyModule_ClearDict + 0x107 (0x5aaae7 in /usr/bin/python3)
frame #26: PyImport_Cleanup + 0x354 (0x5386b4 in /usr/bin/python3)
frame #27: Py_FinalizeEx + 0x6e (0x633f9e in /usr/bin/python3)
frame #28: Py_Exit + 0x8 (0x6340b8 in /usr/bin/python3)
frame #29: /usr/bin/python3() [0x6309d0]
frame #30: PyErr_PrintEx + 0x1c (0x6309fc in /usr/bin/python3)
frame #31: PyRun_SimpleStringFlags + 0x4f (0x630bff in /usr/bin/python3)
frame #32: /usr/bin/python3() [0x6540d8]
frame #33: _Py_UnixMain + 0x2e (0x65420e in /usr/bin/python3)
frame #34: __libc_start_main + 0xeb (0x7f8363a2909b in /lib/x86_64-linux-gnu/libc.so.6)
frame #35: _start + 0x2a (0x5df66a in /usr/bin/python3)

terminate called after throwing an instance of 'c10::Error'
  what():  CUDA error: unspecified launch failure (operator() at /pytorch/c10/cuda/CUDACachingAllocator.cpp:943)
frame #0: c10::Error::Error(c10::SourceLocation, std::string const&) + 0x46 (0x7f7350cc7536 in /home/netcan/.local/lib/python3.7/site-packages/torch/lib/libc10.so)
frame #1: <unknown function> + 0x1635f (0x7f7350f0a35f in /home/netcan/.local/lib/python3.7/site-packages/torch/lib/libc10_cuda.so)
frame #2: <unknown function> + 0x7e047a (0x7f7333ad547a in /home/netcan/.local/lib/python3.7/site-packages/torch/lib/libtorch_python.so)
frame #3: THCIpcDeleter::~THCIpcDeleter() + 0x51 (0x7f72e913fa61 in /home/netcan/.local/lib/python3.7/site-packages/torch/lib/libtorch_cuda.so)
frame #4: <unknown function> + 0x1070a8e (0x7f72e913fa8e in /home/netcan/.local/lib/python3.7/site-packages/torch/lib/libtorch_cuda.so)
frame #5: c10::TensorImpl::release_resources() + 0x4d (0x7f7350cb7abd in /home/netcan/.local/lib/python3.7/site-packages/torch/lib/libc10.so)
frame #6: <unknown function> + 0x5236b2 (0x7f73338186b2 in /home/netcan/.local/lib/python3.7/site-packages/torch/lib/libtorch_python.so)
frame #7: <unknown function> + 0x523756 (0x7f7333818756 in /home/netcan/.local/lib/python3.7/site-packages/torch/lib/libtorch_python.so)
frame #8: /usr/bin/python3() [0x5968b1]
frame #9: /usr/bin/python3() [0x5bc890]
frame #10: /usr/bin/python3() [0x4d2ece]
frame #11: /usr/bin/python3() [0x5bc890]
frame #12: /usr/bin/python3() [0x59688d]
frame #13: /usr/bin/python3() [0x5bc890]
frame #14: /usr/bin/python3() [0x4d2ece]
frame #15: /usr/bin/python3() [0x5bc890]
frame #16: /usr/bin/python3() [0x59688d]
frame #17: /usr/bin/python3() [0x5bc890]
frame #18: /usr/bin/python3() [0x4d2ece]
frame #19: /usr/bin/python3() [0x5bc890]
frame #20: /usr/bin/python3() [0x59688d]
frame #21: /usr/bin/python3() [0x59661b]
frame #22: /usr/bin/python3() [0x5bc890]
frame #23: /usr/bin/python3() [0x59688d]
frame #24: PyDict_SetItem + 0x337 (0x5b8ca7 in /usr/bin/python3)
frame #25: _PyModule_ClearDict + 0x107 (0x5aaae7 in /usr/bin/python3)
frame #26: PyImport_Cleanup + 0x354 (0x5386b4 in /usr/bin/python3)
frame #27: Py_FinalizeEx + 0x6e (0x633f9e in /usr/bin/python3)
frame #28: Py_Exit + 0x8 (0x6340b8 in /usr/bin/python3)
frame #29: /usr/bin/python3() [0x6309d0]
frame #30: PyErr_PrintEx + 0x1c (0x6309fc in /usr/bin/python3)
frame #31: PyRun_SimpleStringFlags + 0x4f (0x630bff in /usr/bin/python3)
frame #32: /usr/bin/python3() [0x6540d8]
frame #33: _Py_UnixMain + 0x2e (0x65420e in /usr/bin/python3)
frame #34: __libc_start_main + 0xeb (0x7f7354f4d09b in /lib/x86_64-linux-gnu/libc.so.6)
frame #35: _start + 0x2a (0x5df66a in /usr/bin/python3)

^CTraceback (most recent call last):
  File "main_pipeline.py", line 35, in <module>
    run_MCTS(args, start_idx=0, iteration=i)
  File "/home/netcan/Workspace/Python/AlphaZero_Connect4/src/MCTS_c4.py", line 240, in run_MCTS
    p.join()
  File "/usr/lib/python3.7/multiprocessing/process.py", line 140, in join
    res = self._popen.wait(timeout)
  File "/usr/lib/python3.7/multiprocessing/popen_fork.py", line 48, in wait
    return self.poll(os.WNOHANG if timeout == 0.0 else 0)
  File "/usr/lib/python3.7/multiprocessing/popen_fork.py", line 28, in poll
    pid, sts = os.waitpid(self.pid, flag)
KeyboardInterrupt
^CError in atexit._run_exitfuncs:
Traceback (most recent call last):
  File "/usr/lib/python3.7/multiprocessing/popen_fork.py", line 28, in poll
    pid, sts = os.waitpid(self.pid, flag)
KeyboardInterrupt
Segmentation fault

implementation of drop_piece makes no sense

The method drop_piece below appears to drop a piece at the (5, column) coordinate of the board whenever (0, column) is unoccupied. What's the point of that? Isn't it supposed to drop a piece at a general coordinate?

import numpy as np

class board():
    def __init__(self):
        # 6x7 grid of strings; " " marks an empty cell
        self.init_board = np.zeros([6,7]).astype(str)
        self.init_board[self.init_board == "0.0"] = " "
        self.player = 0
        self.current_board = self.init_board

    def drop_piece(self, column):
        # the column is full once its top cell (row 0) is occupied
        if self.current_board[0, column] != " ":
            return "Invalid move"
        else:
            # scan down from the top until the first occupied cell or the bottom
            row = 0; pos = " "
            while (pos == " "):
                if row == 6:
                    row += 1
                    break
                pos = self.current_board[row, column]
                row += 1
            # row-2 is the lowest empty cell in this column
            if self.player == 0:
                self.current_board[row-2, column] = "O"
                self.player = 1
            elif self.player == 1:
                self.current_board[row-2, column] = "X"
                self.player = 0

The issue of despair in solvable games

This looks like a wonderful project, but I see from your Medium article that you may be running into a problem I ran into myself when I wrote my first Connect 4 program many years ago. The game is a first-player win, and even traditional search methods can feasibly establish that from the first move, so the second player doesn't know what to do. It will rightly conclude that it loses against good play no matter what it does, so its choice doesn't matter and it plays random moves! It's almost as if the algorithm was overcome with despair!

It looks like you may be getting this problem even with AlphaZero's much more sophisticated search.

The solution I implemented back then was to add a slight reward for losing late over losing early, to encourage the second player to at least postpone the inevitable as long as possible.
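
A minimal sketch of that idea, assuming terminal values in [-1, 1] and a game length measured in plies. The function name `shaped_value` and the constant `late_bonus` are hypothetical, and the mirrored preference for faster wins is an added assumption rather than something described above.

```python
MAX_PLIES = 42  # a 6x7 Connect 4 board holds at most 42 moves

def shaped_value(outcome, ply, late_bonus=0.1):
    """Terminal value in [-1, 1] from one player's perspective.

    outcome: +1 for a win, 0 for a draw, -1 for a loss.
    ply: number of moves played when the game ended.
    late_bonus: hypothetical tuning constant; keep it well below 1 so
                that any win still outranks any draw or loss.
    """
    if outcome == 0:
        return 0.0
    progress = ply / MAX_PLIES  # 0.0 at the start, 1.0 on a full board
    if outcome < 0:
        return -1.0 + late_bonus * progress  # later losses hurt slightly less
    return 1.0 - late_bonus * progress       # earlier wins score slightly more

# A loss on ply 40 is scored slightly higher than a loss on ply 10.
print(shaped_value(-1, 10), shaped_value(-1, 40))
```

With a shaping term like this, a player in a theoretically lost position still has a reason to prefer moves that prolong the game, instead of treating every losing move as equal.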

Incorrect scalar weight type in conv2d_forward call

First, thank you for writing your article and sharing the code on GitHub. It provides a great starting point for someone interested in reinforcement learning.

I have encountered the problem mentioned in the issue title, but let me first describe my development environment: a basic Windows 10 machine (i7-4790, 16GB) without CUDA support, the JetBrains PyCharm IDE, and Python 3.7. To run the code under PyCharm, I needed to prefix some of the imports for your files with "src." since they are loaded as part of the same project. I also used PyCharm's built-in import optimizer to re-order the imports, but I don't think that broke anything.

I removed the .cuda() call in MCTS_c4.DummyNode.UCT_search() to fix an immediate execution error.

After 19 hours (and 120 games for each of 5 processors), the first iteration finished and the neural network began to train. Here is a trace log excerpt:

12/28/2019 06:49:38 AM [INFO]: Finished multi-process MCTS!
12/28/2019 06:49:38 AM [INFO]: Loading training data...
12/28/2019 06:49:39 AM [INFO]: Loaded data from ./datasets/iter_0/.
12/28/2019 06:49:39 AM [INFO]: Loaded checkpoint model ./model_data/cc4_current_net__iter0.pth.tar.
12/28/2019 06:49:39 AM [INFO]: Starting training process...
Update step size: 32
Traceback (most recent call last):
  File "C:/InstallationKits/AlphaZero_Connect4-master/src/main_pipeline.py", line 37, in <module>
    train_connectnet(args, iteration=i, new_optim_state=True)
  File "C:\InstallationKits\AlphaZero_Connect4-master\src\train_c4.py", line 155, in train_connectnet
    train(net, datasets, optimizer, scheduler, start_epoch, 0, args, iteration)
  File "C:\InstallationKits\AlphaZero_Connect4-master\src\train_c4.py", line 84, in train
    policy_pred, value_pred = net(state) # policy_pred = torch.Size([batch, 4672]) value_pred = torch.Size([batch, 1])
  File "C:\Program Files\Python37\lib\site-packages\torch\nn\modules\module.py", line 541, in __call__
    result = self.forward(*input, **kwargs)
  File "C:\InstallationKits\AlphaZero_Connect4-master\src\alpha_net_c4.py", line 95, in forward
    s = self.conv(s)
  File "C:\Program Files\Python37\lib\site-packages\torch\nn\modules\module.py", line 541, in __call__
    result = self.forward(*input, **kwargs)
  File "C:\InstallationKits\AlphaZero_Connect4-master\src\alpha_net_c4.py", line 35, in forward
    s = F.relu(self.bn1(self.conv1(s)))
  File "C:\Program Files\Python37\lib\site-packages\torch\nn\modules\module.py", line 541, in __call__
    result = self.forward(*input, **kwargs)
  File "C:\Program Files\Python37\lib\site-packages\torch\nn\modules\conv.py", line 345, in forward
    return self.conv2d_forward(input, self.weight)
  File "C:\Program Files\Python37\lib\site-packages\torch\nn\modules\conv.py", line 342, in conv2d_forward
    self.padding, self.dilation, self.groups)
RuntimeError: Expected object of scalar type Long but got scalar type Float for argument #2 'weight' in call to _thnn_conv2d_forward

Process finished with exit code 1


Thank you in advance for any help you can give us.

Missing model weights

Thank you for the wonderful work. I am trying to run the code but the initial weights of the network are missing:

FileNotFoundError: [Errno 2] No such file or directory: './model_data/c4_current_net_trained2_iter0.pth.tar'

Could you share the weights file?

MuZero!!

Hey, since MuZero is very similar but more general, could you PLEASE do a similar article and repo for it? Many applications will do better with a simpler version that doesn't have to scale across a datacenter, so your style would be perfect, and with an explanation like the Medium article it would be golden!

Also, could you post your article in markdown in this repo? I could not even read the article because Medium says I have to 'upgrade and pay' them to be able to read your article. (I'd rather pay you).

Error during the first run

Hello, I ran into trouble during the first run:

python3 main_pipeline.py

I get:

"alpha_net_c4.py", line 14, in __init__
    self.X = dataset[:,0]
IndexError: too many indices for array

Can someone help me?
Thanks for your time.
