
madmario's People

Contributors

subramen


madmario's Issues

Error when running the tutorial v2.ipynb on Colab

ModuleNotFoundError Traceback (most recent call last)
in ()
15
16 #NES Emulator for OpenAI Gym
---> 17 from nes_py.wrappers import JoypadSpace
18
19 # Super Mario environment for OpenAI Gym

ModuleNotFoundError: No module named 'nes_py'
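
The traceback means the nes_py module is not installed in the Colab runtime. A minimal sketch of a pre-flight check, assuming the notebook's imports are provided by the PyPI packages nes-py and gym-super-mario-bros (the helper name missing_modules and the module list are illustrative, not taken from the notebook):

```python
import importlib.util


def missing_modules(names):
    """Return the module names from `names` that cannot be found for import."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# ASSUMED top-level module names behind the notebook's imports.
required = ["nes_py", "gym_super_mario_bros"]
missing = missing_modules(required)
if missing:
    print("Missing modules:", ", ".join(missing))
    print("In a Colab cell, try: !pip install nes-py gym-super-mario-bros")
```

Running the install in a cell before the import cell should make the import succeed in a fresh runtime.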

Pretrained weights perform really badly?

Hi, thanks for the amazing repo!

I downloaded the trained weights mentioned in the README from
https://drive.google.com/file/d/1RRwhSMUrpBBRyAsfHLPGt1rlYFoiuus2/view?usp=sharing

I then loaded the state dict into the Mario network successfully:

file_id = '1RRwhSMUrpBBRyAsfHLPGt1rlYFoiuus2'
url = f'https://drive.google.com/uc?id={file_id}'
!gdown {url} # I run in Colab

ckp = torch.load('./trained_mario.chkpt', map_location=('cuda' if use_cuda else 'cpu'))
mario.exploration_rate = ckp.get('exploration_rate')
mario.net.load_state_dict(ckp.get('model'))
<All keys matched successfully>

However, when I play using this trained model, Mario always dies very quickly at the start of the level (e.g. within 40 frames).
Is the link above still the correct path to the pretrained weights?
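
One thing worth checking when replaying a checkpoint is the exploration rate restored from it. The sketch below assumes act() chooses actions epsilon-greedily, which this issue does not confirm; choose_action and the Q-values are hypothetical stand-ins, not the repository's code. It shows that any nonzero exploration rate still injects random actions during play:

```python
import random


def choose_action(q_values, exploration_rate, rng=random):
    """Epsilon-greedy: random action with probability exploration_rate, else argmax."""
    if rng.random() < exploration_rate:
        return rng.randrange(len(q_values))
    return max(range(len(q_values)), key=lambda a: q_values[a])


q_values = [0.1, 2.5, -0.3]  # made-up Q-values for three actions
rng = random.Random(0)       # seeded so the run is repeatable
greedy = [choose_action(q_values, 0.0, rng) for _ in range(1000)]
noisy = [choose_action(q_values, 0.5, rng) for _ in range(1000)]
print("non-greedy picks at rate 0.0:", sum(a != 1 for a in greedy))
print("non-greedy picks at rate 0.5:", sum(a != 1 for a in noisy))
```

If that assumption holds, printing mario.exploration_rate after loading and comparing a run at that value with a run at a lower value would help separate exploration noise from a genuinely weak policy.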

Running out of GPU memory after several minutes of training

Hi,

I got a CUDA out-of-memory error after several minutes of training. Is there a way to fix it?

(py38) C:\Src\GitHub\MadMario>python main.py
Loading model at checkpoints\2021-02-20T16-13-06\trained_mario.chkpt with exploration rate 0.1
Episode 0 - Step 660 - Epsilon 0.1 - Mean Reward 2990.0 - Mean Length 660.0 - Mean Loss 0.0 - Mean Q Value 0.0 - Time Delta 10.198 - Time 2021-02-20T16:29:03
Episode 20 - Step 5262 - Epsilon 0.1 - Mean Reward 1311.095 - Mean Length 250.571 - Mean Loss 0.0 - Mean Q Value 0.0 - Time Delta 61.936 - Time 2021-02-20T16:30:05
Episode 40 - Step 9888 - Epsilon 0.1 - Mean Reward 1149.829 - Mean Length 241.171 - Mean Loss 0.0 - Mean Q Value 0.0 - Time Delta 62.843 - Time 2021-02-20T16:31:08
Episode 60 - Step 13407 - Epsilon 0.1 - Mean Reward 1072.361 - Mean Length 219.787 - Mean Loss 0.0 - Mean Q Value 0.0 - Time Delta 47.898 - Time 2021-02-20T16:31:56
Episode 80 - Step 19197 - Epsilon 0.1 - Mean Reward 1144.407 - Mean Length 237.0 - Mean Loss 0.0 - Mean Q Value 0.0 - Time Delta 77.715 - Time 2021-02-20T16:33:14
Episode 100 - Step 22474 - Epsilon 0.1 - Mean Reward 1060.12 - Mean Length 218.14 - Mean Loss 0.0 - Mean Q Value 0.0 - Time Delta 44.237 - Time 2021-02-20T16:33:58
Episode 120 - Step 26864 - Epsilon 0.1 - Mean Reward 1015.29 - Mean Length 216.02 - Mean Loss 0.0 - Mean Q Value 0.0 - Time Delta 58.86 - Time 2021-02-20T16:34:57
Episode 140 - Step 32109 - Epsilon 0.1 - Mean Reward 1094.56 - Mean Length 222.21 - Mean Loss 0.0 - Mean Q Value 0.0 - Time Delta 71.322 - Time 2021-02-20T16:36:08
Traceback (most recent call last):
  File "main.py", line 59, in <module>
    action = mario.act(state)
  File "C:\Src\GitHub\MadMario\agent.py", line 57, in act
    state = torch.FloatTensor(state).cuda() if self.use_cuda else torch.FloatTensor(state)
RuntimeError: CUDA out of memory. Tried to allocate 2.00 MiB (GPU 0; 10.00 GiB total capacity; 7.56 GiB already allocated; 0 bytes free; 7.74 GiB reserved in total by PyTorch)
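
In the log above, Mean Loss and Mean Q Value stay at 0.0 while allocated GPU memory reaches 7.56 GiB, which would be consistent with transitions accumulating on the GPU before learning starts. That is a guess, not a confirmed diagnosis. A back-of-the-envelope sketch, assuming each stored transition holds two float32 states of shape 4 x 84 x 84 kept on the GPU (the shape, the dtype and the helper replay_bytes are assumptions, not taken from the traceback):

```python
from math import prod


def replay_bytes(transitions, state_shape, bytes_per_element=4, states_per_transition=2):
    """Estimate the bytes used by the state tensors stored for `transitions` steps."""
    return transitions * states_per_transition * prod(state_shape) * bytes_per_element


steps = 32109               # last step count printed before the crash
state_shape = (4, 84, 84)   # ASSUMED stacked-frame shape
as_float32 = replay_bytes(steps, state_shape, bytes_per_element=4)
as_uint8 = replay_bytes(steps, state_shape, bytes_per_element=1)
print(f"float32 states: {as_float32 / 2**30:.2f} GiB")
print(f"uint8 states:   {as_uint8 / 2**30:.2f} GiB")
```

Under these assumptions the float32 estimate is about 6.75 GiB, the same order as the 7.56 GiB reported. If the stored transitions do live on the GPU, keeping them in CPU memory and moving only the sampled batch to the GPU is one remedy to try.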

GPU Speed Up Benchmark

Here we compare training time on a MacBook Pro (CPU) vs. Google Colab (GPU). In the terminal output below, note the Step Time: the average time per iteration in seconds, covering act(), step(), learn() and remember().

MacBook Pro CPU

Episode 20 - Step 3603 - Step Time 0.065 - Epsilon 0.999 - Mean Reward 578.905 - Mean Length 171.571 - Mean Loss 2.127 - Mean Q Value 4.123 - Time 2020-06-05T20:23:33
Episode 21 - Step 3643 - Step Time 0.066 - Epsilon 0.999 - Mean Reward 563.091 - Mean Length 165.591 - Mean Loss 2.056 - Mean Q Value 4.087 - Time 2020-06-05T20:23:36
Episode 22 - Step 4097 - Step Time 0.068 - Epsilon 0.999 - Mean Reward 581.696 - Mean Length 178.13 - Mean Loss 1.994 - Mean Q Value 4.063 - Time 2020-06-05T20:24:06
Episode 23 - Step 4195 - Step Time 0.07 - Epsilon 0.999 - Mean Reward 583.542 - Mean Length 174.792 - Mean Loss 1.934 - Mean Q Value 4.041 - Time 2020-06-05T20:24:13
Episode 24 - Step 4235 - Step Time 0.071 - Epsilon 0.999 - Mean Reward 569.44 - Mean Length 169.4 - Mean Loss 1.877 - Mean Q Value 4.019 - Time 2020-06-05T20:24:16
Episode 25 - Step 4493 - Step Time 0.068 - Epsilon 0.999 - Mean Reward 576.231 - Mean Length 172.808 - Mean Loss 1.824 - Mean Q Value 4.001 - Time 2020-06-05T20:24:34

Google Colab GPU

Episode 41 - Step 9149 - Step Time 0.018 - Epsilon 0.998 - Mean Reward 733.976 - Mean Length 217.833 - Mean Loss 0.859 - Mean Q Value 2.953 - Time 2020-06-06T03:22:58
Episode 42 - Step 9568 - Step Time 0.019 - Epsilon 0.998 - Mean Reward 742.163 - Mean Length 222.512 - Mean Loss 0.848 - Mean Q Value 2.963 - Time 2020-06-06T03:23:06
Episode 43 - Step 10622 - Step Time 0.019 - Epsilon 0.997 - Mean Reward 745.114 - Mean Length 241.409 - Mean Loss 0.842 - Mean Q Value 3.004 - Time 2020-06-06T03:23:26
Episode 44 - Step 10662 - Step Time 0.019 - Epsilon 0.997 - Mean Reward 733.689 - Mean Length 236.933 - Mean Loss 0.837 - Mean Q Value 3.054 - Time 2020-06-06T03:23:26
Episode 45 - Step 10702 - Step Time 0.018 - Epsilon 0.997 - Mean Reward 722.761 - Mean Length 232.652 - Mean Loss 0.832 - Mean Q Value 3.104 - Time 2020-06-06T03:23:27
Episode 46 - Step 10828 - Step Time 0.018 - Epsilon 0.997 - Mean Reward 719.851 - Mean Length 230.383 - Mean Loss 0.827 - Mean Q Value 3.16 - Time 2020-06-06T03:23:30

Training on the Colab GPU gives a speedup of roughly 3.7x, close to 4x: the mean Step Time is about 0.068 s on the CPU vs. about 0.0185 s on the GPU.
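
The speedup can be checked directly from the Step Time values in the two logs above. A small sketch (the variable names are illustrative):

```python
from statistics import mean

# Step Time values copied from the log lines above, in seconds per iteration.
cpu_step_times = [0.065, 0.066, 0.068, 0.07, 0.071, 0.068]
gpu_step_times = [0.018, 0.019, 0.019, 0.019, 0.018, 0.018]

cpu_mean = mean(cpu_step_times)
gpu_mean = mean(gpu_step_times)
speedup = cpu_mean / gpu_mean
print(f"CPU {cpu_mean:.4f} s, GPU {gpu_mean:.4f} s, speedup {speedup:.1f}x")
```

For these twelve samples the ratio comes out at roughly 3.7x. The two logs cover different episodes, so this is a rough comparison rather than a controlled benchmark.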

Keep Colab session active

Training on Colab takes around 8 hours, so the session has to stay active for that long. Run this snippet in the browser's developer console to keep the session alive.

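function ClickConnect() {
    // The button may be missing if the page layout changes, so check before clicking.
    const button = document.querySelector("colab-connect-button");
    if (button) {
        console.log("Clicked on connect button");
        button.click();
    }
}
setInterval(ClickConnect, 60000); // click once every 60 seconds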

colab-connect-button is the button on the upper right that shows RAM and Disk usage.
