kautenja / gym-super-mario-bros
An OpenAI Gym interface to Super Mario Bros. & Super Mario Bros. 2 (Lost Levels) on The NES
License: Other
Super Mario Bros. Lost Levels is the more difficult sequel to Super Mario Bros. that was never released in the U.S. It likely has a similar (if not identical) memory mapping, so an integration should be as simple as acquiring a ROM for the game. This would be a cool additional feature.
Frame skipping is hard-coded into the Super Mario Bros Lua file. Is there a way to parameterize this value in the NESEnv class?
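One possible approach, sketched here rather than taken from the project, is to repeat each action on the Python side so that the skip becomes an ordinary constructor argument. FrameSkip and CountingEnv are hypothetical names used only for illustration; the sketch assumes nothing beyond the classic Gym step contract (observation, reward, done, info) and is not how NESEnv itself is implemented.

```python
class FrameSkip:
    """Hypothetical wrapper: repeat each action `skip` times and sum the rewards."""

    def __init__(self, env, skip=4):
        if skip < 1:
            raise ValueError("skip must be at least 1")
        self.env = env
        self.skip = skip

    def reset(self):
        return self.env.reset()

    def step(self, action):
        total_reward = 0.0
        for _ in range(self.skip):
            obs, reward, done, info = self.env.step(action)
            total_reward += reward
            if done:
                # Stop repeating as soon as the episode ends.
                break
        return obs, total_reward, done, info


class CountingEnv:
    """Stand-in environment (assumption): reward 1 per frame, done at frame 10."""

    def __init__(self):
        self.frame = 0

    def reset(self):
        self.frame = 0
        return self.frame

    def step(self, action):
        self.frame += 1
        return self.frame, 1.0, self.frame >= 10, {}
```

With this shape, FrameSkip(env, skip=2) and FrameSkip(env, skip=8) can be compared without touching the Lua script, provided the Lua-side skip is set to one frame.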
Is there any way to boost the frame rate more, possibly by editing the FCEUX source and maintaining a local copy?
Is there a way to access individual levels to provide each of the 32 as environments? The memory map shows spots for world, level, part, etc. and a spot to force reload so it should be doable. (gym-super-mario-bros) implements this functionality.
The down-sampled environment still has artifacts that could be simplified further, such as bushes, mountains, and seaweed: static background images that don't contribute to the game dynamics. It's challenging to locate the correct ROM addresses for these textures to black them out.
I would like to know Mario's status,
e.g. small Mario, big Mario, fire Mario.
I think this status should be added to info.
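A minimal sketch of what such an info entry might look like follows. The numeric encoding (0 for small, 1 for big, 2 or more for fire) is an assumption about the game's player-state byte and is not confirmed anywhere in this thread; mario_status and add_status are hypothetical helper names.

```python
# Assumed encoding of the player-state byte (not confirmed by this thread):
# 0 = small, 1 = big, 2 or more = fire.
STATUS_LABELS = {0: "small", 1: "big"}


def mario_status(state_byte):
    """Translate the raw state byte into a readable label."""
    if state_byte < 0:
        raise ValueError("state byte must be non-negative")
    return STATUS_LABELS.get(state_byte, "fire")


def add_status(info, state_byte):
    """Return a copy of the info dict with a 'status' entry added."""
    updated = dict(info)
    updated["status"] = mario_status(state_byte)
    return updated
```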
Nope
Nope
Running with a gym.wrappers.Monitor on an environment produces the spurious error:
Traceback (most recent call last):
  File "dddqn_train.py", line 62, in <module>
    agent.train(callback=callback)
  File "/home/bitcommander/Documents/Projects/deep-learning-project/src/agents/deep_q_agent.py", line 252, in train
    state = self._initial_state()
  File "/home/bitcommander/Documents/Projects/deep-learning-project/src/agents/agent.py", line 39, in _initial_state
    state = self.env.reset()
  File "/usr/local/lib/python3.5/dist-packages/gym/wrappers/monitor.py", line 39, in reset
    self._after_reset(observation)
  File "/usr/local/lib/python3.5/dist-packages/gym/wrappers/monitor.py", line 193, in _after_reset
    self.reset_video_recorder()
  File "/usr/local/lib/python3.5/dist-packages/gym/wrappers/monitor.py", line 214, in reset_video_recorder
    self.video_recorder.capture_frame()
  File "/usr/local/lib/python3.5/dist-packages/gym/wrappers/monitoring/video_recorder.py", line 116, in capture_frame
    self._encode_image_frame(frame)
  File "/usr/local/lib/python3.5/dist-packages/gym/wrappers/monitoring/video_recorder.py", line 162, in _encode_image_frame
    self.encoder = ImageEncoder(self.path, frame.shape, self.frames_per_sec)
  File "/usr/local/lib/python3.5/dist-packages/gym/wrappers/monitoring/video_recorder.py", line 256, in __init__
    self.start()
  File "/usr/local/lib/python3.5/dist-packages/gym/wrappers/monitoring/video_recorder.py", line 288, in start
    self.proc = subprocess.Popen(self.cmdline, stdin=subprocess.PIPE, preexec_fn=os.setsid)
  File "/usr/lib/python3.5/subprocess.py", line 947, in __init__
    restore_signals, start_new_session)
  File "/usr/lib/python3.5/subprocess.py", line 1490, in _execute_child
    restore_signals, start_new_session, preexec_fn)
OSError: [Errno 12] Cannot allocate memory
It seems the monitor uses subprocess and fails to allocate memory. Could this be caused by the process running FCEUX hogging memory, or is something else wrong here? I am currently running a long training session without the monitor enabled to see whether the monitor is truly the problem.
Headless FCEUX would be nice for server-side running.
import os
os.environ['SDL_VIDEODRIVER'] = 'dummy'
The only problem is that, when playing the game as a human, this also disables the pygame window.
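One way around that trade-off, sketched under the assumption that the choice can be made before any SDL or pygame initialization happens, is to set the dummy driver only when a headless flag is requested. configure_video is a hypothetical helper, not part of this package.

```python
import os


def configure_video(headless):
    """Select the SDL dummy driver only for headless runs (hypothetical helper).

    Call this before pygame or any other SDL-based code is initialized.
    """
    if headless:
        os.environ["SDL_VIDEODRIVER"] = "dummy"
    else:
        # Fall back to SDL's default driver so the human-play window can open.
        os.environ.pop("SDL_VIDEODRIVER", None)
```

A training script would call configure_video(True), while a human-play script would call configure_video(False). Note that the non-headless branch also discards any driver the user had set explicitly.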
The standard meta environment (i.e. SuperMarioBros-v0) seems to provide a large negative reward for completing a level, as if Mario had died. This is not the desired behavior, as it will discourage agents from reaching the flagpole (completing levels). Some logic needs to be injected into the Lua script to prevent this from happening and potentially provide a large positive reward for flag-get events.
Hi.
I have used the multiprocessing module to create multiple environments. However, when one environment becomes Done, all other environments are also done. How do you handle this issue?
thank you. :)
Luigi gets no love. Let's find a way to add an option to play as Luigi instead of Mario.
Worlds 1-4 of Lost Levels work correctly with the level select mechanism; however, worlds 5, 6, 7, 8, 9, A, B, C, and D don't seem to work, for no apparent reason. How can we get this functionality implemented for Lost Levels?
reset appears to cause unusual behavior that is hard to replicate. Although the background fails to render, the foreground sprites are unaffected. Perhaps recording actions from human input while trying to trigger the error by dying would produce a state from which the error can be reproduced and debugged. It's hard to say whether this bug is the result of RAM hacking or of something more serious in the underlying nes-py emulator's PPU.
The SMB2 downsampled ROM needs to have the standard static textures removed.
Describe the bug
The individual level environments don't fire the done flag (to terminate an episode) when the end of a level is reached (flagpole, Bowser, etc.).
To Reproduce
Steps to reproduce the behavior:
(SuperMarioBros-1-1-v0) close to the end and back up a save state
Expected behavior
When Mario reaches the end of a level in a level env, the done flag returned by the step method of an instance of SMBLevelEnv should be True, indicating that the episode is over.
Screenshots
If applicable, add screenshots to help explain your problem.
Desktop (please complete the following information):
Additional context
NA
Is there any command that will allow me to grab the state size or the observation space of the environment?
The human control module is limited to the GitHub repo and doesn't make it into the PyPI distribution. This code should be cleaned up into a class of some sort and packaged for use on deployed instances. It's mostly useful for testing the emulator (Lua code), but there are likely use cases for deployed instances to have it as well.
Is your feature request related to a problem? Please describe.
There is no indication of when Mario gets a flagpole at the end of a level.
Describe the solution you'd like
info dict returned by (SMBEnv).step(...)
Describe alternatives you've considered
NA
Additional context
NA
When Mario clears stage 1-1, the env stops for a few seconds.
Steps to reproduce the behavior:
The environment will not stop.
gym-super-mario-bros version: SuperMarioBros-v2
nes-gym version: I don't know
The reward space is statically defined in the super-mario-bros Lua file. Is there a way to parameterize the different elements of the reward space to access through the Python API? For instance
Hi, I use your environment for a university project, and with your new version 3.0, env.step(action) results in Mario either doing nothing or jumping.
After downgrading to version 2.3.1, the issue disappeared.
Hope that helps, please tell me if you need further input ;)
There is currently no logic to reward the agent for grabbing the flag (without rewarding for points or checking the memory locations for some completion flag or measuring x distance or something) nor any logic to ensure that the agent doesn't receive cut-scene data for the replay memory. Perhaps manually getting close to the flag pole by hand and creating a save state for the agent to start from is a good way to get this functionality designed and implemented.
It is well documented that the NES has a sprite limit based on the limitations of the original hardware. This results in spurious flickering of sprites (particularly Mario when he is losing power-ups, or enemies moving along). FCEUX has an option to disable the sprite limit, though it can cause problems with certain ROMs. Perhaps this avenue should be explored to guarantee the agent isn't starved of relevant sprites as a result of the frame skip. From what I have read, it could potentially improve the performance of the emulator too.
Hello,
This project is amazing, and I have a question. I am trying to install this project on Windows. Can I get "nes-py" on Windows, and is there a wheel for it?
I would be eternally grateful for a step-by-step guide on how to get FCEUX and gym-super-mario-bros working on Windows 10.
Thank you
The pipes currently use a static name, so only one instance of the environment can run on a machine at a given time. Using a timestamp, random numbers, or some better mechanism, we can give the pipes unique names and then pass the names to the Lua script through an environment key, enabling multiple instances of the emulator on a single machine.
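As a rough sketch of that idea, the names could be built from a random UUID and handed to the emulator process through environment variables. The helper names and the keys SMB_PIPE_IN and SMB_PIPE_OUT are hypothetical; the Lua script would need matching os.getenv lookups, which are not shown here.

```python
import os
import tempfile
import uuid


def make_pipe_names(prefix="smb-pipe"):
    """Build a unique (in, out) pair of pipe paths for one emulator instance."""
    token = uuid.uuid4().hex  # random per call, so instances do not collide
    directory = tempfile.gettempdir()
    pipe_in = os.path.join(directory, "{}-in-{}".format(prefix, token))
    pipe_out = os.path.join(directory, "{}-out-{}".format(prefix, token))
    return pipe_in, pipe_out


def child_environment(pipe_in, pipe_out):
    """Copy the current environment and add the pipe names for the child process."""
    env = dict(os.environ)
    env["SMB_PIPE_IN"] = pipe_in    # hypothetical key name
    env["SMB_PIPE_OUT"] = pipe_out  # hypothetical key name
    return env
```

The returned dict can be passed as the env argument of subprocess.Popen when launching the emulator, so each instance sees only its own pair of names.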
Super Mario Bros supports a 2-player mode. Although potentially challenging and annoying to implement, a two-player mode provides a unique opportunity for collaborative RL. Essentially, observations would stay the same, but the action space would expect a tuple of 2 actions: one for Mario and one for Luigi. Naturally, the reward streams would also need to uniquely identify rewards for each player. The terminal flag would remain unchanged. This is by no means a pressing feature, but it's worth noting the possibility on the roadmap for this project.
There are no checks for FCEUX's availability. This results in crashes quite late in the execution cycle that should be caught much sooner, in either setup.py or the initializer for NESEnv.
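A minimal sketch of an early check, using the standard library's shutil.which to look the emulator up on the PATH, might look like this. The executable name fceux is an assumption about how the emulator is installed, and require_executable is a hypothetical helper.

```python
import shutil


def require_executable(name="fceux"):
    """Return the full path of `name`, or fail early with a clear message."""
    path = shutil.which(name)
    if path is None:
        raise RuntimeError(
            "could not find '{}' on the PATH; install it before "
            "creating an environment".format(name)
        )
    return path
```

Calling this at the top of the environment initializer would surface a missing emulator immediately rather than after the pipes have been opened.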
How can you accelerate your emulation/training?
Some example images for each environment would be helpful, given that many are hacks that people won't know about off the bat.
It could be cool to record human control for filling replay memory instead of starting from randomness. This builds on #17 by introducing a new feature to the proposed class (recording)
Would it be possible to make it so that the emulator gives us a resizeable window?
pip install gym-super-mario-bros
Collecting gym-super-mario-bros
  Using cached https://files.pythonhosted.org/packages/a9/f9/ff8254f8115a46c1cad551ec98e56da1d0a95396f25e130bb98a62ff87e0/gym_super_mario_bros-1.1.0.tar.gz
    Complete output from command python setup.py egg_info:
    Traceback (most recent call last):
      File "<string>", line 1, in <module>
      File "/tmp/pip-install-TDJ6Mj/gym-super-mario-bros/setup.py", line 4
        def README() -> str:
                     ^
    SyntaxError: invalid syntax
    ----------------------------------------
Command "python setup.py egg_info" failed with error code 1 in /tmp/pip-install-TDJ6Mj/gym-super-mario-bros/
Tried installing but faced these issues. Any help is appreciated. Cheers
In hindsight, the action space seems to be missing some potentially necessary actions. Should this be revisited? One alternative is to avoid the discrete action space entirely and define the action space as a binary (multi-hot) vector over UDLRAB. This allows all potential button combinations; however, it increases the size of the search space.
self.actions = [
    'U',    # Up
    'D',    # Down
    'L',    # Left
    'R',    # Right
    'UR',   # Up + Right
    'DR',   # Down + Right
    'URA',  # Up + Right + A
    'DRB',  # Down + Right + B
    'A',    # A
    'B',    # B
    'RB',   # Right + B
    'RA'    # Right + A
]
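To make the trade-off concrete, here is a small sketch of the vector encoding over UDLRAB, with one bit per button. The function names are hypothetical and the bit order simply follows the string UDLRAB; it is not claimed to match the emulator's controller byte layout.

```python
from itertools import product

BUTTONS = "UDLRAB"  # one bit per button, in this order (assumed ordering)


def encode(action):
    """Convert a button string such as 'URA' into a list of six 0/1 flags."""
    unknown = set(action) - set(BUTTONS)
    if unknown:
        raise ValueError("unknown buttons: {}".format(sorted(unknown)))
    return [1 if button in action else 0 for button in BUTTONS]


def all_combinations():
    """Enumerate every possible press pattern over the six buttons."""
    return [list(bits) for bits in product((0, 1), repeat=len(BUTTONS))]
```

The full vector space has 2^6 = 64 patterns compared with the 12 discrete actions listed above, and it includes combinations such as Left + Right that are unlikely to be useful.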
At the end of a world on a standard meta env (e.g. SuperMarioBros-v0), when finishing a castle (the 4th level of any world), the game shows an additional frame from the Toad cutscene.
Steps to reproduce the behavior:
(SuperMarioBros-v0) to the end of level 4
No screens are shown between any levels.
This is a really nice environment for learning reinforcement learning. Thank you so much. However, while using it I ran into a small problem.
When I call env.step(action), I receive four return values: next_state, reward, done, and info. When I display the info value, it is empty. I use Windows 10.
Why does this happen?
It would be nice to have similar pixel and rectangle ROMs for Lost Levels like those for the original Super Mario Bros.
The rectangle environment should be much easier, as all sprites are just converted to a single color. The pixel environment is more complex. It might be better to contact the original creator for some help or guidance relating to this ROM.
If the player presses invalid inputs using the play_human.py script, the interface crashes. This is related to a bug in the OpenAI Gym play script. I've opened a PR on their repository, but it has received no attention from the maintainers. It might be better to just copy their script and fix the bugs here, decoupling the dependency altogether.
After 1.7M steps, the emulator got stuck in an infinite loop during either a reset or a standard death.
Exception ignored in: <module 'threading' from '/usr/lib/python3.5/threading.py'>
Traceback (most recent call last):
  File "/usr/lib/python3.5/threading.py", line 1288, in _shutdown
    t.join()
  File "/usr/lib/python3.5/threading.py", line 1054, in join
    self._wait_for_tstate_lock()
  File "/usr/lib/python3.5/threading.py", line 1070, in _wait_for_tstate_lock
    elif lock.acquire(block, timeout):
KeyboardInterrupt
pipe closed
Emulation speed 100.0%
Script died of natural causes.
Is there a way to use FCEUX headlessly to run this environment on servers and such?
A wrapper for playing the game with human control would be nice for testing and data collection purposes.
Providing access to play_random.py and play_human.py through the pip distribution would be useful. Otherwise, users have to download the repository to use these scripts.
Is 4 frames per action the best default value for frame-skipping?
The downsampled environment has a few final artifacts that still need to be removed:
A repository of game sprites for analysis, etc. would be nice. A collection of sprite gifs has been added to a new branch sprites. Some interfacing and pre-processing will be necessary to make this available at the module level.
The sprites repository needs to be updated with sprites unique to SMB2: Lost Levels.
There are two untested animation sequences that may need to be tuned up:
Hopefully these are tied to easy-to-find timers in RAM that can be hacked to 0 to skip them.
Packaging commonly used wrappers (RGB->Y, 84x84, 4-frame stack, reward clip, etc.) with this code would be convenient.
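As an illustration of what such a bundle could contain, the sketch below implements three of the pieces in plain Python: sign-based reward clipping, an RGB-to-luma conversion for a single pixel, and a fixed-length frame stack. All names are hypothetical, the luma weights are the common ITU-R BT.601 coefficients, and resizing to 84x84 is left out because it needs an image library.

```python
from collections import deque


def clip_reward(reward):
    """Clip a reward to -1, 0, or 1 according to its sign."""
    return (reward > 0) - (reward < 0)


def rgb_to_luma(pixel):
    """Convert one (R, G, B) pixel to a luma value (BT.601 weights)."""
    red, green, blue = pixel
    return 0.299 * red + 0.587 * green + 0.114 * blue


class FrameStack:
    """Keep the most recent `size` frames, oldest first."""

    def __init__(self, size=4):
        self.frames = deque(maxlen=size)

    def reset(self, frame):
        # Fill the stack with copies of the first frame of the episode.
        self.frames.clear()
        for _ in range(self.frames.maxlen):
            self.frames.append(frame)
        return list(self.frames)

    def push(self, frame):
        # Appending to a full deque drops the oldest frame automatically.
        self.frames.append(frame)
        return list(self.frames)
```

In a real wrapper, rgb_to_luma would be applied across the whole frame with an array library rather than pixel by pixel.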