kautenja / gym-super-mario-bros

An OpenAI Gym interface to Super Mario Bros. & Super Mario Bros. 2 (Lost Levels) on the NES

License: Other

Python 99.14% Makefile 0.86%
nes-py openai-gym super-mario-bros super-mario-bros-2-lost-levels

gym-super-mario-bros's People

Contributors

eliashasle, jjshoots, kautenja, lukewood, roclark

gym-super-mario-bros's Issues

Lost Levels Support

Super Mario Bros Lost Levels is the more difficult sequel to Super Mario Bros that was never released in the U.S. Despite the higher difficulty, it likely has similar (if not identical) memory mapping. As such, an integration should be as simple as acquiring a ROM for the game. This would be a cool additional feature.

Parameterized Frame Skip

Frame skipping is hard coded into the Super Mario Bros lua file. Is there a way to parameterize this value in the NESEnv class?
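One way to picture a parameterized frame skip is as a thin wrapper on the Python side rather than a constant in the Lua file. The sketch below is only an illustration: FrameSkip and CountingEnv are hypothetical names, and it assumes the wrapped object exposes reset() and a step(action) that returns (observation, reward, done, info). It does not touch NESEnv or the Lua script.

```python
class FrameSkip:
    """Repeat each action for `skip` frames and sum the rewards (hypothetical wrapper)."""

    def __init__(self, env, skip=4):
        if skip < 1:
            raise ValueError("skip must be at least 1")
        self.env = env
        self.skip = skip

    def reset(self):
        return self.env.reset()

    def step(self, action):
        total_reward = 0.0
        for _ in range(self.skip):
            obs, reward, done, info = self.env.step(action)
            total_reward += reward
            if done:
                # stop early so no frames are stepped after the episode ends
                break
        return obs, total_reward, done, info


class CountingEnv:
    """Hypothetical stand-in environment that ends after `length` frames."""

    def __init__(self, length=10):
        self.length = length
        self.frame = 0

    def reset(self):
        self.frame = 0
        return self.frame

    def step(self, action):
        self.frame += 1
        return self.frame, 1.0, self.frame >= self.length, {}
```

Wrapping an environment as FrameSkip(env, skip=8) would then change the skip without editing any Lua code, at the cost of doing the repetition in Python.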

Boosted Frame Rate

Is there any way to boost the frame rate more, possibly by editing the FCEUX source and maintaining a local copy?

Individual Level Environments

Is there a way to access individual levels to provide each of the 32 as environments? The memory map shows spots for world, level, part, etc. and a spot to force reload so it should be doable. (gym-super-mario-bros) implements this functionality.
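As a rough illustration of what 32 per-level environments could look like, the sketch below enumerates one identifier per world and level. The naming pattern follows the SuperMarioBros-1-1-v0 identifier that appears elsewhere in this tracker; the helper name level_env_ids and the assumption that every generated identifier is actually registered are hypothetical.

```python
def level_env_ids(worlds=8, levels=4, version=0):
    """Build one environment id per (world, level) pair.

    The id pattern is an assumption modelled on 'SuperMarioBros-1-1-v0'.
    """
    return [
        "SuperMarioBros-{}-{}-v{}".format(world, level, version)
        for world in range(1, worlds + 1)
        for level in range(1, levels + 1)
    ]
```

With the defaults this yields 8 worlds times 4 levels, i.e. the 32 environments mentioned above.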

Down-sampled Environment

The down-sampled environment still contains artifacts that could be removed, such as bushes, mountains, and seaweed: static background images that don't contribute to the game dynamics. It's challenging to locate the correct ROM addresses for these textures to black them out.

Get Mario Information

Is your feature request related to a problem? Please describe.

I want to know Mario's status,
e.g. small Mario, big Mario, fire Mario.

Describe the solution you'd like

I think the status should be added to the info dictionary.

Describe alternatives you've considered

Nope

Additional context

Nope

(gym.Monitor) OSError: [Errno 12] Cannot allocate memory

Running with a gym.wrappers.Monitor on an environment produces the spurious error:

Traceback (most recent call last):
  File "dddqn_train.py", line 62, in <module>
    agent.train(callback=callback)
  File "/home/bitcommander/Documents/Projects/deep-learning-project/src/agents/deep_q_agent.py", line 252, in train
    state = self._initial_state()
  File "/home/bitcommander/Documents/Projects/deep-learning-project/src/agents/agent.py", line 39, in _initial_state
    state = self.env.reset()
  File "/usr/local/lib/python3.5/dist-packages/gym/wrappers/monitor.py", line 39, in reset
    self._after_reset(observation)
  File "/usr/local/lib/python3.5/dist-packages/gym/wrappers/monitor.py", line 193, in _after_reset
    self.reset_video_recorder()
  File "/usr/local/lib/python3.5/dist-packages/gym/wrappers/monitor.py", line 214, in reset_video_recorder
    self.video_recorder.capture_frame()
  File "/usr/local/lib/python3.5/dist-packages/gym/wrappers/monitoring/video_recorder.py", line 116, in capture_frame
    self._encode_image_frame(frame)
  File "/usr/local/lib/python3.5/dist-packages/gym/wrappers/monitoring/video_recorder.py", line 162, in _encode_image_frame
    self.encoder = ImageEncoder(self.path, frame.shape, self.frames_per_sec)
  File "/usr/local/lib/python3.5/dist-packages/gym/wrappers/monitoring/video_recorder.py", line 256, in __init__
    self.start()
  File "/usr/local/lib/python3.5/dist-packages/gym/wrappers/monitoring/video_recorder.py", line 288, in start
    self.proc = subprocess.Popen(self.cmdline, stdin=subprocess.PIPE, preexec_fn=os.setsid)
  File "/usr/lib/python3.5/subprocess.py", line 947, in __init__
    restore_signals, start_new_session)
  File "/usr/lib/python3.5/subprocess.py", line 1490, in _execute_child
    restore_signals, start_new_session, preexec_fn)
OSError: [Errno 12] Cannot allocate memory

It seems the monitor is using subprocess and fails to allocate memory. Could this be from the process running FCEUX hogging memory, or is something else wrong here? I am currently running a long training session without the monitor enabled to see whether the monitor is truly the problem.

Headless FCEUX

Headless FCEUX would be nice for server-side running.

import os
os.environ['SDL_VIDEODRIVER'] = 'dummy'

The only problem is that, when playing the game as a human, this also disables the pygame window.
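One possible way around the trade-off is to make the dummy driver opt-in, so that human play keeps its window. The following is a minimal sketch under that assumption; configure_video and its headless parameter are hypothetical names, and the function would have to be called before pygame or SDL is initialized to have any effect.

```python
import os


def configure_video(headless, environ=None):
    """Select SDL's dummy video driver only when running headless.

    `environ` defaults to os.environ; passing a plain dict is useful for testing.
    """
    environ = os.environ if environ is None else environ
    if headless:
        environ["SDL_VIDEODRIVER"] = "dummy"
    elif environ.get("SDL_VIDEODRIVER") == "dummy":
        # undo a previous headless setting, but leave any other driver choice alone
        del environ["SDL_VIDEODRIVER"]
    return environ
```

A training script would call configure_video(True) at startup, while a human-play script would call configure_video(False) or skip the call entirely.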

Negative Reward on Level Completion

The standard meta environment (i.e. SuperMarioBros-v0) seems to provide a large negative reward for completing a level, as if Mario had died. This is not desired functionality, as it will discourage agents from getting to the flag pole (completing levels). Some logic needs to be injected into the Lua script to prevent this from happening and potentially provide a large positive reward for flag get events.

Create multiple environment Error

Hi.
I have used the multiprocessing module to create multiple environments. However, when one environment becomes Done, all other environments are also done. How do you handle this issue?
thank you. :)

Love for Luigi

Luigi gets no love. Let's find a way to add an option to play as Luigi instead of Mario.

Individual Level Environments for Lost Levels

Worlds 1-4 of Lost Levels work correctly with the level select mechanism; however, worlds 5, 6, 7, 8, 9, A, B, C, and D don't seem to work, for reasons not yet understood. How can we get this functionality implemented for Lost Levels?

Graphics Glitch

A reset appears to cause unusual behavior that is hard to replicate: the background fails to render, although the foreground sprites are unaffected. Recording actions from human input while trying to trigger the error by dying may be in order, to produce a state from which the error can be reproduced and debugged. It's hard to say whether this bug is the result of RAM hacking or something more serious in the underlying nes-py emulator's PPU.

screenshot from 2018-07-22 13-29-21

Possible Solutions

  • redesign nes-py as object oriented and completely reset the memory on resets
  • add a save and restore state feature to nes-py to reduce the overhead of resets and reduce the possibility of this bug

Can't make environment

Describe the bug

Can't make environment

To Reproduce

Steps to reproduce the behavior:

  1. update the package to 4.0.1
  2. make an environment

Expected behavior

Screenshots

image

Environment

  • Operating System: centos7
  • Python version: 3.6
  • gym-super-mario-bros version: 4.0.1
  • nes-gym version: 2.0.0

Additional context

Level envs do not fire the done flag when a flagpole is reached

Describe the bug

The individual level environments don't fire the done flag (to terminate an episode) when the end of a level is reached (flagpole, Bowser, etc.).

To Reproduce
Steps to reproduce the behavior:

  1. step any level env (e.g. SuperMarioBros-1-1-v0) close to the end and back up a save state
  2. finish the level and observe the loading of the next level
  3. restore the save state and observe that the bug is invariably reproducible

Expected behavior

When Mario reaches the end of a level in a level env, the done flag returned by the step method of an instance of SMBLevelEnv should return True indicating that the episode is over.

Screenshots
If applicable, add screenshots to help explain your problem.

Desktop (please complete the following information):

  • OS: all

Additional context

NA

Human Control in Distributed Code

The human control module is limited to the GitHub repo and doesn't make it into the PyPI distribution. This code should be cleaned up into a class of some sort and packaged for use on deployed instances. It's mostly useful for testing the emulator (Lua code), but there are likely use cases for deployed instances to have it as well.

info dictionary item for Flagpole Get

Is your feature request related to a problem? Please describe.

There is no indication of when Mario gets a flagpole at the end of a level.

Describe the solution you'd like

  • a flag that indicates that a flagpole has been reached in the info dict returned by (SMBEnv).step(...)
  • an environment wrapper to add a reward based on this event
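The two requested pieces could fit together roughly as follows. This is a sketch only: the wrapper name FlagRewardWrapper, the info key "flag_get", and the bonus value are assumptions made for illustration, and the wrapped object is assumed to return (observation, reward, done, info) from step.

```python
class FlagRewardWrapper:
    """Add a bonus to the reward whenever the info dict reports a flagpole.

    The key name 'flag_get' is an assumption, not a confirmed part of the API.
    """

    def __init__(self, env, bonus=100.0, key="flag_get"):
        self.env = env
        self.bonus = bonus
        self.key = key

    def reset(self):
        return self.env.reset()

    def step(self, action):
        obs, reward, done, info = self.env.step(action)
        if info.get(self.key, False):
            # the flag is absent or False on ordinary steps, so no bonus is added
            reward += self.bonus
        return obs, reward, done, info
```

Keeping the bonus in a wrapper leaves the base reward untouched for users who do not want the extra signal.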

Describe alternatives you've considered

NA

Additional context

NA

Full Game Environment Lag During Stage Transition

Describe the bug

When Mario clears stage 1-1, the environment stops for a few seconds.

To Reproduce

Steps to reproduce the behavior:

  1. Mario clears 1-1 (SuperMarioBros-v2)
  2. the environment stops for a few seconds

Expected behavior

The environment should not stop.

Screenshots

Environment

  • Operating System: Ubuntu 16.04
  • Python version: 3.6
  • gym-super-mario-bros version: SuperMarioBros-v2
  • nes-gym version: I don't know

Additional context

Customizable Reward Space

The reward space is statically defined in the super-mario-bros Lua file. Is there a way to parameterize the different elements of the reward space so they can be accessed through the Python API? For instance:

  • optional move right reward
  • optional time penalty
  • optional death penalty
  • optional points reward
  • optional coins reward
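The options above could be gathered into a single configuration object with one weight per term, where a weight of zero switches a term off. The sketch below assumes that design; RewardConfig, compute_reward, the default weights, and the per-step deltas (dx, dt, dpoints, dcoins) are all hypothetical and do not reflect the values used in the Lua file.

```python
from dataclasses import dataclass


@dataclass
class RewardConfig:
    """Weights for each reward term; 0 disables a term (all values are assumptions)."""
    move_right: float = 1.0
    time_penalty: float = 1.0
    death_penalty: float = 25.0
    points: float = 0.0
    coins: float = 0.0


def compute_reward(config, dx, dt, died, dpoints=0, dcoins=0):
    """Combine per-step deltas into one scalar reward."""
    reward = config.move_right * dx       # progress to the right
    reward -= config.time_penalty * dt    # in-game clock ticks elapsed
    if died:
        reward -= config.death_penalty
    reward += config.points * dpoints     # score gained this step
    reward += config.coins * dcoins       # coins gained this step
    return reward
```

For example, RewardConfig(time_penalty=0.0, coins=5.0) would drop the time penalty and reward coin collection without any change to the emulator script.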

env.step(action) not executing correct action

Hi, I use your environment for a university project. With the new version 3.0, env.step(action) results in either doing nothing or jumping.

After downgrading to version 2.3.1, the issue disappeared.

Hope that helps, please tell me if you need further input ;)

Flag Grab Reward / Ending Level Sequence

There is currently no logic to reward the agent for grabbing the flag (without rewarding for points or checking the memory locations for some completion flag or measuring x distance or something) nor any logic to ensure that the agent doesn't receive cut-scene data for the replay memory. Perhaps manually getting close to the flag pole by hand and creating a save state for the agent to start from is a good way to get this functionality designed and implemented.

Sprite Flicker

It is well documented that the NES has a sprite limit based on the limitations of the original hardware. This results in spurious flickering of sprites (particularly Mario when he is losing power-ups, or enemies moving along). FCEUX has an option to disable the sprite limit, though it can cause problems with certain ROMs. Perhaps this avenue should be explored to guarantee the agent isn't starved of relevant sprites as a result of the frame skip. From what I have read, it could potentially improve the performance of the emulator too.

A question to ask

Hello,
This project is amazing, and I have a question. I am trying to install this project on Windows. Can I get "nes-py", and its wheel, on Windows?

Windows 10 + FCEUX + Python >=3.5.x

I would be eternally grateful if there is a step by step guide on how to get FCEUX and gym super mario brothers working on windows 10.
Thank you

Unique Pipe Names

The pipes currently use a static name, so only one instance of the environment can run on a machine at a given time. Using a timestamp, random numbers, or some better mechanism, we can define the pipes with unique names and then pass the name to the Lua script through an environment key, enabling multiple instances of the emulator on a single machine.
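A minimal sketch of the idea, assuming the name is built from the process id plus a random UUID and handed to the emulator process through an environment variable. The function names and the SMB_PIPE_NAME key are hypothetical; the Lua side would need to read the same key.

```python
import os
import tempfile
import uuid


def unique_pipe_name(prefix="smb-pipe"):
    """Return a pipe path that is unique per call (pid plus a random UUID)."""
    name = "{}-{}-{}".format(prefix, os.getpid(), uuid.uuid4().hex)
    return os.path.join(tempfile.gettempdir(), name)


def emulator_environment(pipe_name, key="SMB_PIPE_NAME"):
    """Copy the current environment and add the pipe name under `key`.

    The key name is an assumption; the copy leaves os.environ untouched.
    """
    env = dict(os.environ)
    env[key] = pipe_name
    return env
```

The returned dictionary can be passed as the env argument of subprocess.Popen when launching the emulator, so each instance sees its own pipe name.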

2 Player Mode

Super Mario Bros supports a 2 player mode. Although potentially challenging and annoying to implement, a two player mode provides a unique opportunity for collaborative RL. Essentially, observations would stay the same, but the action space would expect a tuple of 2 actions: one for Mario and one for Luigi. Naturally, the reward streams would also need to uniquely identify rewards for each player. The terminal flag would remain unchanged. This is by no means a pressing feature, but it's worth noting the possibility on the roadmap for this project.

Check for FCEUX before runtime execution

There are no checks for FCEUX's availability. This results in crashes late in the execution cycle that should be caught much sooner, in either setup.py or the initializer for NESEnv.
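An early check could be as small as a PATH lookup before anything else is started. The sketch below uses shutil.which from the standard library; the function name find_emulator and the executable name "fceux" are assumptions about how the binary is installed.

```python
import shutil


def find_emulator(executable="fceux"):
    """Return the full path of the emulator, or fail early with a clear message."""
    path = shutil.which(executable)
    if path is None:
        raise RuntimeError(
            "{!r} was not found on PATH; install it before creating an "
            "environment".format(executable)
        )
    return path
```

Calling this from the environment initializer would surface a missing emulator immediately instead of partway through a run.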

Recording Human Control

It could be cool to record human control for filling replay memory instead of starting from randomness. This builds on #17 by introducing a new feature to the proposed class (recording)
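Recording could be layered on top of whatever object handles human control, as in the following sketch. TransitionRecorder is a hypothetical name, the wrapped object is assumed to expose reset() and a step(action) returning (observation, reward, done, info), and the tuple layout is just one common replay-memory convention.

```python
from collections import deque


class TransitionRecorder:
    """Store (state, action, reward, next_state, done) tuples while playing."""

    def __init__(self, env, capacity=10000):
        self.env = env
        # deque with maxlen discards the oldest transitions once full
        self.memory = deque(maxlen=capacity)
        self._state = None

    def reset(self):
        self._state = self.env.reset()
        return self._state

    def step(self, action):
        next_state, reward, done, info = self.env.step(action)
        self.memory.append((self._state, action, reward, next_state, done))
        self._state = next_state
        return next_state, reward, done, info
```

After a human session, the contents of memory could be copied into an agent's replay buffer before training starts.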

Resizeable window

Would it be possible to make it so that the emulator gives us a resizeable window?

Installation Issues

pip install gym-super-mario-bros
Collecting gym-super-mario-bros
Using cached https://files.pythonhosted.org/packages/a9/f9/ff8254f8115a46c1cad551ec98e56da1d0a95396f25e130bb98a62ff87e0/gym_super_mario_bros-1.1.0.tar.gz
Complete output from command python setup.py egg_info:
Traceback (most recent call last):
File "<string>", line 1, in <module>
File "/tmp/pip-install-TDJ6Mj/gym-super-mario-bros/setup.py", line 4
def README() -> str:
^
SyntaxError: invalid syntax

----------------------------------------

Command "python setup.py egg_info" failed with error code 1 in /tmp/pip-install-TDJ6Mj/gym-super-mario-bros/

Tried installing but faced these issues. Any help is appreciated. Cheers

Action Space

In hindsight, the action space seems to be missing some potentially necessary actions. Should this be revisited? One alternative is to avoid the discrete action space entirely and define the action space as a binary vector over UDLRAB, with one bit per button. This allows all potential button combinations; however, it increases the search space complexity.

self.actions = [
    'U',   # Up
    'D',   # Down
    'L',   # Left
    'R',   # Right
    'UR',  # Up + Right
    'DR',  # Down + Right
    'URA', # Up + Right + A
    'DRB', # Down + Right + B
    'A',   # A
    'B',   # B
    'RB',  # Right + B
    'RA'   # Right + A
]
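To make the size of the alternative concrete, the sketch below treats an action as six bits, one per button in UDLRAB order, and decodes it into the same string form used in the list above. The names BUTTONS, decode, and all_combinations are hypothetical helpers for illustration only.

```python
from itertools import product

BUTTONS = "UDLRAB"  # Up, Down, Left, Right, A, B


def decode(vector):
    """Turn a 6-element 0/1 vector into a button string such as 'RA'."""
    if len(vector) != len(BUTTONS):
        raise ValueError("expected {} entries".format(len(BUTTONS)))
    return "".join(button for button, pressed in zip(BUTTONS, vector) if pressed)


def all_combinations():
    """Enumerate every button combination the vector encoding can express."""
    return [decode(bits) for bits in product((0, 1), repeat=len(BUTTONS))]
```

The encoding covers 2^6 = 64 combinations, including the empty action and contradictory pairs such as Up with Down, compared with the 12 hand-picked actions in the discrete list.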

Unexpected Screen in End of World cutscene

Describe the bug

At the end of the world on a standard meta env (e.g. SuperMarioBros-v0), when finishing a castle (4th level of any world), the game shows an additional frame from the Toad cutscene.

To Reproduce

Steps to reproduce the behavior:

  1. step a standard env (e.g. SuperMarioBros-v0) to the end of level 4
  2. make a backup state
  3. finish the world
  4. observe a brief showing of a screen with toad
  5. restore the save state and observe reproducibility of the issue

Expected behavior

No screens should be shown between levels.

Question about env.step(action) return value.

This is a really nice environment for learning reinforcement learning. Thank you so much. However, I encountered a small problem while using it.
When I call env.step(action), I receive four return values: next_state, reward, done, and info. I wanted to inspect those values, but when I display info, it is empty. I use Windows 10.
Why does this happen?

Pixel and Rectangle ROMs for Lost Levels

It would be nice to have similar pixel and rectangle ROMs for Lost Levels like those for the original Super Mario Bros.

  • Rectangle
  • Pixel

The rectangle environment should be much easier, as all sprites are just converted to a single color. The pixel environment is more complex. It might be better to contact the original creator for some help or guidance relating to this ROM.

play_human.py crashes on invalid input

If the player presses invalid inputs using the play_human.py script, the interface crashes. This is related to a bug in the OpenAI Gym play script. I've opened a PR on their repository, but it has received no attention from the maintainers. It might be better to just copy their script and fix the bugs here, removing the dependency altogether.

Faulty Synchronization

After 1.7M steps, the emulator got stuck in an infinite loop during either a reset or a standard death.

Exception ignored in: <module 'threading' from '/usr/lib/python3.5/threading.py'>
Traceback (most recent call last):
  File "/usr/lib/python3.5/threading.py", line 1288, in _shutdown
    t.join()
  File "/usr/lib/python3.5/threading.py", line 1054, in join
    self._wait_for_tstate_lock()
  File "/usr/lib/python3.5/threading.py", line 1070, in _wait_for_tstate_lock
    elif lock.acquire(block, timeout):
KeyboardInterrupt
pipe closed

Emulation speed 100.0%
Script died of natural causes.

Headless FCEUX

Is there a way to use FCEUX headlessly to run this environment on servers and such?

Human Control Wrapper

A wrapper for playing the game with human control would be nice for testing and data collection purposes.

Include play*.py files in distribution

Providing access to play_random.py and play_human.py through the pip distribution would be useful. Otherwise, users have to download the repository to use these scripts.

Final downsampled environment artifacts in SMB

The downsampled environment has a few final artifacts that still need to be removed:

  • Blue background for water?
    • Mario changes his animations and sprites in water. The blue is the final source of massive background noise in the environment.
  • Bushes, if possible.
    • This may not be possible: the remaining artifact is just a block of color, and removing the
      color causes collateral damage to pipes. The only way would be to find where bushes
      are placed in levels and somehow black those values out.

Sprite Repository

A repository of game sprites for analysis, etc. would be nice. A collection of sprite gifs has been added to a new branch sprites. Some interfacing and pre-processing will be necessary to make this available at the module level.

Sprites for SMB2

The sprites repository needs to be updated with sprites unique to SMB2: Lost Levels.

Skip pipe and vine animations

There are two untested animation sequences that may need to be tuned up:

  • pipe entry
  • vine climbing / entry

Hopefully these are tied to easy-to-find timers that can be RAM hacked to 0 to skip the animations.

Environment Wrappers

Packaging commonly used wrappers (RGB->Y, 84x84, 4-frame stack, reward clip, etc.) with this code would be convenient.
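A few of these transformations are small enough to sketch with the standard library alone. The helpers below are illustrative only: the names are hypothetical, the luminance weights are the common ITU-R BT.601 coefficients, frames are treated as opaque objects, and the 84x84 resize is left out because it needs an image library.

```python
from collections import deque


def clip_reward(reward, low=-1.0, high=1.0):
    """Clamp a reward into [low, high]."""
    return max(low, min(high, reward))


def to_luminance(pixel):
    """Convert one (R, G, B) pixel to a Y (luma) value using BT.601 weights."""
    red, green, blue = pixel
    return 0.299 * red + 0.587 * green + 0.114 * blue


class FrameStack:
    """Keep the most recent `size` frames as one stacked observation."""

    def __init__(self, size=4):
        self.frames = deque(maxlen=size)

    def reset(self, frame):
        # fill the stack with copies of the first frame of an episode
        self.frames.clear()
        self.frames.extend([frame] * self.frames.maxlen)
        return tuple(self.frames)

    def push(self, frame):
        # appending to a full deque drops the oldest frame automatically
        self.frames.append(frame)
        return tuple(self.frames)
```

In a packaged version these would more likely be written as Gym wrappers operating on NumPy arrays; the plain-Python form here only shows the logic.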
