kautenja / gym-super-mario-bros

An OpenAI Gym interface to Super Mario Bros. & Super Mario Bros. 2 (Lost Levels) on the NES

License: Other

Python 99.14% Makefile 0.86%

nes-py openai-gym super-mario-bros super-mario-bros-2-lost-levels

gym-super-mario-bros's Issues

Lost Levels Support

Super Mario Bros Lost Levels is the more difficult sequel to Super Mario Bros that was never released in the U.S. Despite the higher difficulty, it likely has similar (if not identical) memory mapping. As such, an integration should be as simple as acquiring a ROM for the game. This would be a cool additional feature.

Parameterized Frame Skip

Frame skipping is hard coded into the Super Mario Bros lua file. Is there a way to parameterize this value in the NESEnv class?
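Until the Lua value is configurable, one option is to repeat actions on the Python side. The following is a minimal sketch under assumptions, not the NESEnv API: `FrameSkip` is a hypothetical class that wraps any object with Gym-style `reset()` and `step(action)` methods, where `step` returns `(state, reward, done, info)`. It can only multiply whatever skip the Lua script already applies.

```python
class FrameSkip:
    """Hypothetical wrapper: repeat each action for `skip` environment steps."""

    def __init__(self, env, skip=4):
        if skip < 1:
            raise ValueError('skip must be at least 1')
        self.env = env
        self.skip = skip

    def reset(self):
        return self.env.reset()

    def step(self, action):
        total_reward = 0.0
        for _ in range(self.skip):
            state, reward, done, info = self.env.step(action)
            # accumulate the reward of every repeated step
            total_reward += reward
            if done:
                # stop early so no steps are taken after the episode ends
                break
        return state, total_reward, done, info
```

Wrapping an environment as `FrameSkip(env, skip=2)` would then make the skip a constructor argument rather than a constant.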

Descriptive Image in README

Boosted Frame Rate

Is there any way to boost the frame rate more, possibly by editing the FCEUX source and maintaining a local copy?

Individual Level Environments

Is there a way to access individual levels to provide each of the 32 as environments? The memory map shows spots for world, level, part, etc. and a spot to force reload so it should be doable. (gym-super-mario-bros) implements this functionality.

Down-sampled Environment

The down-sampled environment still contains artifacts that could be removed: bushes, mountains, seaweed, and other static background images that don't contribute to the game dynamics. It's challenging to locate the correct ROM addresses for these textures to black them out.

Get Mario Information

Is your feature request related to a problem? Please describe.

I want to know Mario's status,
e.g. small Mario, big Mario, fire Mario.

Describe the solution you'd like

I think the status should be added to the info dictionary.

Describe alternatives you've considered

Nope

Additional context

Nope

(gym.Monitor) OSError: [Errno 12] Cannot allocate memory

Running with a gym.wrappers.Monitor on an environment produces the following spurious error:

Traceback (most recent call last):
  File "dddqn_train.py", line 62, in <module>
    agent.train(callback=callback)
  File "/home/bitcommander/Documents/Projects/deep-learning-project/src/agents/deep_q_agent.py", line 252, in train
    state = self._initial_state()
  File "/home/bitcommander/Documents/Projects/deep-learning-project/src/agents/agent.py", line 39, in _initial_state
    state = self.env.reset()
  File "/usr/local/lib/python3.5/dist-packages/gym/wrappers/monitor.py", line 39, in reset
    self._after_reset(observation)
  File "/usr/local/lib/python3.5/dist-packages/gym/wrappers/monitor.py", line 193, in _after_reset
    self.reset_video_recorder()
  File "/usr/local/lib/python3.5/dist-packages/gym/wrappers/monitor.py", line 214, in reset_video_recorder
    self.video_recorder.capture_frame()
  File "/usr/local/lib/python3.5/dist-packages/gym/wrappers/monitoring/video_recorder.py", line 116, in capture_frame
    self._encode_image_frame(frame)
  File "/usr/local/lib/python3.5/dist-packages/gym/wrappers/monitoring/video_recorder.py", line 162, in _encode_image_frame
    self.encoder = ImageEncoder(self.path, frame.shape, self.frames_per_sec)
  File "/usr/local/lib/python3.5/dist-packages/gym/wrappers/monitoring/video_recorder.py", line 256, in __init__
    self.start()
  File "/usr/local/lib/python3.5/dist-packages/gym/wrappers/monitoring/video_recorder.py", line 288, in start
    self.proc = subprocess.Popen(self.cmdline, stdin=subprocess.PIPE, preexec_fn=os.setsid)
  File "/usr/lib/python3.5/subprocess.py", line 947, in __init__
    restore_signals, start_new_session)
  File "/usr/lib/python3.5/subprocess.py", line 1490, in _execute_child
    restore_signals, start_new_session, preexec_fn)
OSError: [Errno 12] Cannot allocate memory

It seems the monitor is using subprocess and fails to allocate memory. Could this be from the process running FCEUX hogging memory, or is something else wrong here? I am currently running a long training session without the monitor enabled to see if the monitor is truly the problem.

Headless FCEUX

Headless FCEUX would be nice for running on servers.

import os
os.environ['SDL_VIDEODRIVER'] = 'dummy'

The only problem is that this also disables the pygame window, which is needed for playing the game as a human.
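One way around this is to set the variable only when a window is not needed. The sketch below is an assumption about how that could look; `configure_display` is a hypothetical helper, and it has to run before pygame (and therefore SDL) is initialized for the setting to take effect.

```python
import os


def configure_display(headless):
    """Select the SDL dummy video driver only when no window is needed."""
    if headless:
        os.environ['SDL_VIDEODRIVER'] = 'dummy'
    elif os.environ.get('SDL_VIDEODRIVER') == 'dummy':
        # undo a previous headless setting so the pygame window can open;
        # any other user-chosen driver is left untouched
        del os.environ['SDL_VIDEODRIVER']
    return os.environ.get('SDL_VIDEODRIVER')
```

A training script would call `configure_display(True)`, while a human-play script would call `configure_display(False)`.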

Negative Reward on Level Completion

The standard meta environment (i.e. SuperMarioBros-v0) seems to be providing a large negative reward for completion of a level, as if Mario has died. This is not desired functionality, as it will discourage agents from getting to the flag pole (completing levels). Some logic needs to be injected into the Lua script to prevent this from happening and potentially provide a large positive reward for flag get events.

Create multiple environment Error

Hi.
I have used the multiprocessing module to create multiple environments. However, when one environment becomes Done, all other environments are also done. How do you handle this issue?
thank you. :)

Love for Luigi

Luigi gets no love. Let's find a way to add an option to play as Luigi instead of Mario.

Individual Level Environments for Lost Levels

Worlds 1-4 of Lost Levels work correctly with the level select mechanism; however, worlds 5, 6, 7, 8, 9, A, B, C, and D don't work, for reasons that are not yet understood. How can we get this functionality implemented for Lost Levels?

Graphics Glitch

reset appears to cause unusual behavior that is hard to replicate. Although the background fails to render, the foreground sprites are unaffected. Recording actions from human input while trying to trigger the error by dying may be the way to produce a state in which the error can be reproduced and debugged. It's hard to say whether this bug is the result of RAM hacking or of something more serious in the underlying nes-py emulator's PPU.

Possible Solutions

redesign nes-py as object oriented and completely reset the memory on resets
add a save and restore state feature to nes-py to reduce the overhead of resets and reduce the possibility of this bug

Downsampled artifacts in SMB2

The SMB2 downsampled ROM needs to have the standard static textures removed.

Can't make environment

Describe the bug

Can't make environment

To Reproduce

Steps to reproduce the behavior:

Update the package to 4.0.1
Make an environment

Expected behavior

Screenshots

Environment

Operating System: centos7
Python version: 3.6
gym-super-mario-bros version: 4.0.1
nes-gym version: 2.0.0

Additional context

Level envs do not fire the done flag when a flagpole is reached

Describe the bug

The individual level environments don't fire the done flag (to terminate an episode) when the end of a level is reached (flagpole, Bowser, etc.).

To Reproduce
Steps to reproduce the behavior:

Step any level env (e.g. SuperMarioBros-1-1-v0) close to the end and back up a save state
Finish the level and observe the loading of the next level
Restore the save state and observe that the bug is invariably reproducible

Expected behavior

When Mario reaches the end of a level in a level env, the done flag returned by the step method of an instance of SMBLevelEnv should be True, indicating that the episode is over.

Screenshots

Desktop (please complete the following information):

OS: all

Additional context

State size/Observation space

Is there any command that will allow me to grab the state size or the observation space of the environment?
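Gym environments expose these as the `observation_space` and `action_space` attributes. A minimal sketch follows; `describe_spaces` is a hypothetical helper, and it only assumes the environment follows the Gym convention where a Box space has `shape` and a Discrete space has `n`.

```python
def describe_spaces(env):
    """Summarize the spaces of any Gym-style environment."""
    return {
        # shape of a single observation, if the space defines one
        'observation_shape': getattr(env.observation_space, 'shape', None),
        # number of discrete actions, if the action space is discrete
        'num_actions': getattr(env.action_space, 'n', None),
    }


# Example usage (assumes the package is installed; how the environment
# is created may differ between versions):
#   import gym_super_mario_bros
#   env = gym_super_mario_bros.make('SuperMarioBros-v0')
#   print(describe_spaces(env))
```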

Human Control in Distributed Code

The human control module is limited to the GitHub repo and doesn't make it into the PyPI distribution. This code should be cleaned up into a class of some sort and packaged for use on deployed instances. It's mostly useful for testing the emulator (Lua code), but there are likely use cases for deployed instances to have it as well.

info dictionary item for Flagpole Get

Is your feature request related to a problem? Please describe.

There is no indication of when Mario gets a flagpole at the end of a level.

Describe the solution you'd like

a flag that indicates that a flagpole has been reached in the info dict returned by (SMBEnv).step(...)
an environment wrapper to add a reward based on this event
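The two requested pieces could fit together roughly as sketched below. Everything here is an assumption for illustration: `FlagpoleBonus` is a hypothetical wrapper, and the info key `'flag_get'` and the bonus size are placeholders, since this request does not define either.

```python
class FlagpoleBonus:
    """Hypothetical wrapper: add a bonus when info reports a flagpole."""

    def __init__(self, env, bonus=100.0, key='flag_get'):
        self.env = env
        self.bonus = bonus
        self.key = key  # assumed name of the info dictionary entry

    def reset(self):
        return self.env.reset()

    def step(self, action):
        state, reward, done, info = self.env.step(action)
        # a missing key is treated as "no flagpole on this step"
        if info.get(self.key, False):
            reward += self.bonus
        return state, reward, done, info
```

Keeping the bonus in a wrapper leaves the base reward untouched for users who don't want it.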

Describe alternatives you've considered

Additional context

Full Game Environment Lag During Stage Transition

Describe the bug

When Mario clears stage 1-1, the environment stops for a few seconds.

To Reproduce

Steps to reproduce the behavior:

Mario clears 1-1 (SuperMarioBros-v2)
The environment stops for a few seconds

Expected behavior

The environment should not stop.

Screenshots

Environment

Operating System: Ubuntu 16.04
Python version: 3.6
gym-super-mario-bros version: SuperMarioBros-v2
nes-gym version: i don't know

Additional context

Customizable Reward Space

The reward space is statically defined in the super-mario-bros lua file. Is there a way to parameterize the different elements of the reward space to access through the Python API? For instance

env.step(action) not executing correct action

Hi, I use your environment for a university project, and with your new version 3.0, env.step(action) results in Mario either doing nothing or jumping.

After downgrading to version 2.3.1, the issue disappeared.

Hope that helps, please tell me if you need further input ;)

Flag Grab Reward / Ending Level Sequence

There is currently no logic to reward the agent for grabbing the flag (without rewarding for points or checking the memory locations for some completion flag or measuring x distance or something) nor any logic to ensure that the agent doesn't receive cut-scene data for the replay memory. Perhaps manually getting close to the flag pole by hand and creating a save state for the agent to start from is a good way to get this functionality designed and implemented.

Sprite Flicker

It is well documented that the NES has a sprite limit based on the limitations of the original hardware. This results in spurious flickering of sprites (particularly Mario when he is losing power-ups, or enemies moving along). FCEUX has an option to disable the sprite limit, though it can cause problems with certain ROMs. Perhaps this avenue should be explored to guarantee the agent isn't starved of relevant sprites as a result of the frame skip. From what I have read, it could potentially improve the performance of the emulator too.

A question to ask

Hello,
This project is amazing, and I have a question. I am trying to install this project on Windows. Can I get "nes-py" and its wheel on Windows?

Windows 10 + FCEUX + Python >=3.5.x

I would be eternally grateful if there is a step by step guide on how to get FCEUX and gym super mario brothers working on windows 10.
Thank you

Unique Pipe Names

The pipes currently use a static name, so only one instance of the environment can run on a machine at a given time. Using a timestamp, random numbers, or some better mechanism, we can give the pipes unique names and pass the name to the Lua script through an environment key, enabling multiple instances of the emulator on a single machine.
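A minimal sketch of the naming side of this idea, under stated assumptions: `unique_pipe_names`, the `smb-pipe` prefix, the `SMB_PIPE_NAME` key and the `-in`/`-out` suffixes are all hypothetical. The Lua script would then have to read the key (for example with `os.getenv`, assuming the emulator's Lua build exposes it).

```python
import os
import uuid


def unique_pipe_names(prefix='smb-pipe', env_key='SMB_PIPE_NAME'):
    """Build per-instance pipe names and publish the base name via os.environ."""
    # the process id plus a random UUID keeps names distinct across
    # processes and across several environments in one process
    base = '{}-{}-{}'.format(prefix, os.getpid(), uuid.uuid4().hex)
    # child processes started afterwards inherit this environment key
    os.environ[env_key] = base
    return base + '-in', base + '-out'
```

Because the key is overwritten on every call, the emulator process for an environment would need to be launched right after its names are generated.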

2 Player Mode

Super Mario Bros supports a 2 player mode. Although potentially challenging and annoying to implement, a two player mode provides a unique opportunity for collaborative RL. Essentially, observations would stay the same, but the action space would expect a tuple of 2 actions: one for Mario and one for Luigi. Naturally, the reward streams would also need to uniquely identify rewards for each player. The terminal flag would remain unchanged. This is by no means a pressing feature, but it's worth noting the possibility on the roadmap for this project.

Check for FCEUX before runtime execution

There are no checks for FCEUX's availability. This results in crashes pretty late in the execution cycle that should be caught much sooner, in either setup.py or the initializer for NESEnv.
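Such a check could be as small as the sketch below, which looks the executable up on the PATH with the standard library. `require_executable` is a hypothetical helper, and the executable name `fceux` is an assumption that may differ per platform.

```python
import shutil


def require_executable(name='fceux'):
    """Return the full path of `name`, or fail early with a clear message."""
    path = shutil.which(name)
    if path is None:
        raise EnvironmentError(
            '{} was not found on the PATH; '
            'install it before creating an environment'.format(name)
        )
    return path
```

Calling this at the top of the NESEnv initializer would surface a missing emulator before any pipes or threads are created.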

Accelerate Emulation Speed

How can you accelerate your emulation/training?

Example Screenshots for each environment

Some example images for each environment would be helpful, given that many are hacks that people won't know about off the bat.

Recording Human Control

It could be cool to record human control for filling replay memory instead of starting from randomness. This builds on #17 by introducing a new feature to the proposed class (recording)

Resizeable window

Would it be possible to make it so that the emulator gives us a resizeable window?

Installation Issues

pip install gym-super-mario-bros
Collecting gym-super-mario-bros
Using cached https://files.pythonhosted.org/packages/a9/f9/ff8254f8115a46c1cad551ec98e56da1d0a95396f25e130bb98a62ff87e0/gym_super_mario_bros-1.1.0.tar.gz
Complete output from command python setup.py egg_info:
Traceback (most recent call last):
File "", line 1, in
File "/tmp/pip-install-TDJ6Mj/gym-super-mario-bros/setup.py", line 4
def README() -> str:
^
SyntaxError: invalid syntax

----------------------------------------

Command "python setup.py egg_info" failed with error code 1 in /tmp/pip-install-TDJ6Mj/gym-super-mario-bros/

Tried installing but faced these issues. Any help is appreciated. Cheers

Action Space

In hindsight, the action space seems to be missing some potentially necessary actions. Should this be revisited? One alternative is to avoid the discrete action space entirely and define the action space as a binary vector over UDLRAB (one bit per button). This allows all potential button combinations, but it increases the size of the search space.

self.actions = [
    'U',   # Up
    'D',   # Down
    'L',   # Left
    'R',   # Right
    'UR',  # Up + Right
    'DR',  # Down + Right
    'URA', # Up + Right + A
    'DRB', # Down + Right + B
    'A',   # A
    'B',   # B
    'RB',  # Right + B
    'RA'   # Right + A
]
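To make the alternative concrete, the sketch below encodes button strings like those in the list above as 0/1 vectors over UDLRAB and enumerates every combination. The names `BUTTONS`, `encode`, `decode` and `ALL_COMBINATIONS` are hypothetical and not part of the environment.

```python
from itertools import product

BUTTONS = 'UDLRAB'


def encode(combo):
    """Map a button string such as 'URA' to a 6-element 0/1 vector."""
    unknown = set(combo) - set(BUTTONS)
    if unknown:
        raise ValueError('unknown buttons: {}'.format(sorted(unknown)))
    return [1 if button in combo else 0 for button in BUTTONS]


def decode(vector):
    """Map a 0/1 vector back to a button string in UDLRAB order."""
    return ''.join(button for button, bit in zip(BUTTONS, vector) if bit)


# every possible vector, including the all-zero "no button" action
ALL_COMBINATIONS = [decode(bits) for bits in product((0, 1), repeat=len(BUTTONS))]
```

With six buttons there are 2^6 = 64 vectors, compared with the 12 actions listed above. Some of them, such as left and right together, are unlikely to be useful, which is part of the added search cost.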

Unexpected Screen in End of World cutscene

Describe the bug

At the end of the world on a standard meta env (e.g. SuperMarioBros-v0), when finishing a castle (4th level of any world), the game shows an additional frame from the Toad cutscene.

To Reproduce

Steps to reproduce the behavior:

step a standard env (e.g. SuperMarioBros-v0) to the end of level 4
make a backup state
finish the world
observe a brief showing of a screen with toad
restore the save state and observe reproducibility of the issue

Expected behavior

No screens are shown between any levels.

Question about env.step(action) return value.

This is a really nice environment for learning reinforcement learning. Thank you so much. However, I encountered a small problem while using it.
When I call env.step(action), I receive four return values: next_state, reward, done and info. When I display the info value, it is empty. I use Windows 10.
Why does this happen?

Pixel and Rectangle ROMs for Lost Levels

It would be nice to have similar pixel and rectangle ROMs for Lost Levels like those for the original Super Mario Bros.

Rectangle
Pixel

The rectangle environment should be much easier, as all sprites are just converted to a single color. The pixel environment is more complex. It might be better to contact the original creator for some help or guidance relating to this ROM.

play_human.py crashes on invalid input

If the player presses invalid inputs using the play_human.py script, the interface crashes. This is related to a bug in the OpenAI Gym play script. I've opened a PR over on their repository, but it has received no attention from the maintainers. It might be better to just copy their script and fix the bugs here, removing the dependency altogether.

Faulty Synchronization

After 1.7M steps, the emulator got stuck in an infinite loop during either a reset or a standard death.

Exception ignored in: <module 'threading' from '/usr/lib/python3.5/threading.py'>
Traceback (most recent call last):
  File "/usr/lib/python3.5/threading.py", line 1288, in _shutdown
    t.join()
  File "/usr/lib/python3.5/threading.py", line 1054, in join
    self._wait_for_tstate_lock()
  File "/usr/lib/python3.5/threading.py", line 1070, in _wait_for_tstate_lock
    elif lock.acquire(block, timeout):
KeyboardInterrupt
pipe closed

Emulation speed 100.0%
Script died of natural causes.

Headless FCEUX

Is there a way to use FCEUX headlessly to run this environment on servers and such?

Human Control Wrapper

A wrapper for playing the game with human control would be nice for testing and data collection purposes.

Include play*.py files in distribution

Providing access to play_random.py and play_human.py through the pip distribution would be useful. Otherwise, users have to download the repository to use these scripts.

4 Frames per action?

Is 4 frames per action the best default value for frame-skipping?

Final downsampled environment artifacts in SMB

The downsampled environment has a few final artifacts that still need to be removed:

Blue background for water?
- Mario changes his animations and sprites in water. the blue is the final source of massive background noise in the environment
Bushes, if possible.
- this may not be possible, the remaining artifact is just a block of color. Removing the
  color causes collateral damage to pipes. The only way would be to find where bushes
  are placed in levels and somehow black those values out

pipe entry
vine climbing / entry

Hopefully these are connected to easy-to-find timers that can be RAM hacked to 0 to skip them.

Environment Wrappers

Packaging commonly used wrappers (RGB->Y, 84x84, 4-frame stack, reward clip, etc.) with this code would be convenient.
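As a rough illustration of three of the listed pieces, here is a dependency-free sketch. The names `rgb_to_luma`, `clip_reward` and `FrameStack` are hypothetical; a packaged version would more likely operate on whole NumPy frames and also handle the 84x84 resize, which is omitted here.

```python
from collections import deque


def rgb_to_luma(pixel):
    """Convert one (r, g, b) pixel to luma (Y) with ITU-R BT.601 weights."""
    r, g, b = pixel
    return 0.299 * r + 0.587 * g + 0.114 * b


def clip_reward(reward, low=-1.0, high=1.0):
    """Clamp a reward into the range [low, high]."""
    return max(low, min(high, reward))


class FrameStack:
    """Keep the most recent `size` frames, oldest first."""

    def __init__(self, size=4):
        self.frames = deque(maxlen=size)

    def reset(self, frame):
        # fill the whole stack with the first frame of the episode
        for _ in range(self.frames.maxlen):
            self.frames.append(frame)
        return list(self.frames)

    def push(self, frame):
        # appending to a full deque drops the oldest frame automatically
        self.frames.append(frame)
        return list(self.frames)
```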
