
pku-alignment / safety-gymnasium

317 stars · 8 watchers · 47 forks · 501.66 MB

NeurIPS 2023: Safety-Gymnasium: A Unified Safe Reinforcement Learning Benchmark

Home Page: https://safety-gymnasium.readthedocs.io/en/latest/

License: Apache License 2.0

Python 97.95% Makefile 0.70% HTML 1.30% Batchfile 0.05%
constraint-satisfaction-problem reinforcement-learning safe-reinforcement-learning safe-reinforcement-learning-environments safety-critical safety-critical-systems constraint-rl safe-policy-optimization

safety-gymnasium's People

Contributors

dependabot[bot], gaiejj, hdadong, muchvo, pre-commit-ci[bot], rockmagma02, xuehaipan, zmsn-2077


safety-gymnasium's Issues

[Question] Duplicating an environment including its current state

Required prerequisites

Questions

Hi, thank you very much for providing Safety-Gymnasium and supporting us in using it.

I tried to copy an environment with deepcopy:

from copy import deepcopy

import safety_gymnasium
import gymnasium as gym

env = gym.make('SafetyPointGoal1Gymnasium-v0')
env.step(...)
env_duplicate = deepcopy(env)

However, I receive the following error trace (originating from the deepcopy line above):

File "/usr/lib/python3.8/copy.py", line 172, in deepcopy
    y = _reconstruct(x, memo, *rv)
  File "/usr/lib/python3.8/copy.py", line 270, in _reconstruct
    state = deepcopy(state, memo)
  File "/usr/lib/python3.8/copy.py", line 146, in deepcopy
    y = copier(x, memo)
  File "/usr/lib/python3.8/copy.py", line 230, in _deepcopy_dict
    y[deepcopy(key, memo)] = deepcopy(value, memo)
  File "/usr/lib/python3.8/copy.py", line 172, in deepcopy
    y = _reconstruct(x, memo, *rv)
  File "/usr/lib/python3.8/copy.py", line 270, in _reconstruct
    state = deepcopy(state, memo)
  File "/usr/lib/python3.8/copy.py", line 146, in deepcopy
    y = copier(x, memo)
  File "/usr/lib/python3.8/copy.py", line 230, in _deepcopy_dict
    y[deepcopy(key, memo)] = deepcopy(value, memo)
  File "/usr/lib/python3.8/copy.py", line 172, in deepcopy
    y = _reconstruct(x, memo, *rv)
  File "/usr/lib/python3.8/copy.py", line 270, in _reconstruct
    state = deepcopy(state, memo)
  File "/usr/lib/python3.8/copy.py", line 146, in deepcopy
    y = copier(x, memo)
  File "/usr/lib/python3.8/copy.py", line 230, in _deepcopy_dict
    y[deepcopy(key, memo)] = deepcopy(value, memo)
  File "/usr/lib/python3.8/copy.py", line 172, in deepcopy
    y = _reconstruct(x, memo, *rv)
  File "/usr/lib/python3.8/copy.py", line 270, in _reconstruct
    state = deepcopy(state, memo)
  File "/usr/lib/python3.8/copy.py", line 146, in deepcopy
    y = copier(x, memo)
  File "/usr/lib/python3.8/copy.py", line 230, in _deepcopy_dict
    y[deepcopy(key, memo)] = deepcopy(value, memo)
  File "/usr/lib/python3.8/copy.py", line 172, in deepcopy
    y = _reconstruct(x, memo, *rv)
  File "/usr/lib/python3.8/copy.py", line 270, in _reconstruct
    state = deepcopy(state, memo)
  File "/usr/lib/python3.8/copy.py", line 146, in deepcopy
    y = copier(x, memo)
  File "/usr/lib/python3.8/copy.py", line 230, in _deepcopy_dict
    y[deepcopy(key, memo)] = deepcopy(value, memo)
  File "/usr/lib/python3.8/copy.py", line 172, in deepcopy
    y = _reconstruct(x, memo, *rv)
  File "/usr/lib/python3.8/copy.py", line 270, in _reconstruct
    state = deepcopy(state, memo)
  File "/usr/lib/python3.8/copy.py", line 146, in deepcopy
    y = copier(x, memo)
  File "/usr/lib/python3.8/copy.py", line 230, in _deepcopy_dict
    y[deepcopy(key, memo)] = deepcopy(value, memo)
  File "/usr/lib/python3.8/copy.py", line 172, in deepcopy
    y = _reconstruct(x, memo, *rv)
  File "/usr/lib/python3.8/copy.py", line 270, in _reconstruct
    state = deepcopy(state, memo)
  File "/usr/lib/python3.8/copy.py", line 146, in deepcopy
    y = copier(x, memo)
  File "/usr/lib/python3.8/copy.py", line 230, in _deepcopy_dict
    y[deepcopy(key, memo)] = deepcopy(value, memo)
  File "/usr/lib/python3.8/copy.py", line 172, in deepcopy
    y = _reconstruct(x, memo, *rv)
  File "/usr/lib/python3.8/copy.py", line 272, in _reconstruct
    y.__setstate__(state)
  File "<MY_PROJECT_DIR>/.venv/lib/python3.8/site-packages/gymnasium/utils/ezpickle.py", line 35, in __setstate__
    out = type(self)(*d["_ezpickle_args"], **d["_ezpickle_kwargs"])
TypeError: __init__() missing 1 required positional argument: 'task_id'

Is it not possible to duplicate safety-gymnasium environments? Is there any other way to save and restore the exact state after calling step?
My use case requires me to implement a sort of "undo" functionality, and I was trying to simply save snapshots.

Thank you very much for your time!
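For reference, a hedged workaround until the EzPickle reconstruction is fixed: for MuJoCo-based envs you can snapshot the raw physics state instead of the whole Python object. A minimal sketch, assuming you can reach the underlying mujoco.MjModel and mujoco.MjData of the unwrapped env (how exactly to reach them is an assumption), and noting that task-level bookkeeping (e.g. the last goal distance) would need to be saved separately:

import numpy as np
import mujoco


def snapshot(data):
    # Copy the generalized positions and velocities from MjData.
    return np.copy(data.qpos), np.copy(data.qvel)


def restore(model, data, state):
    # Write a saved state back and recompute all derived quantities.
    qpos, qvel = state
    data.qpos[:] = qpos
    data.qvel[:] = qvel
    mujoco.mj_forward(model, data)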

[Question] Any benchmark results available for the set of tasks?

Required prerequisites

Questions

Hi authors, thanks for the great work. I am planning to use this repo as one of my testing playgrounds. I am wondering if training results are available for the Goal, Button, and Push tasks, since I found that using my PPO-Lag algorithm with a cost limit of 20 on these tasks results in very low rewards (~6 for PointGoal1). Some benchmark results would help me understand whether my obtained reward/cost ranges are normal for these tasks. Thanks!

[BUG] Problem encountered in Windows

Required prerequisites

What version of Safety Gymnasium are you using?

0.1.1

System information

Windows
0.1.1

Problem description

What a nice repo! When I use safety_gymnasium on the Windows platform, I encounter the following problem:
[screenshot from the original issue]
How can I solve it? Thanks!

Reproducible example code

The Python snippets:

Traceback

No response

Expected behavior

No response

Additional context

No response

[Feature Request] Upgrade to Gymnasium v0.28.1

Required prerequisites

Motivation

Hi, I just wanted to open a quick issue to suggest upgrading to Gymnasium v0.28.1. It shouldn't require many, if any, code changes from v0.26.3, but a number of bugs have been fixed since then. We have this repository listed in our third-party environments list (https://gymnasium.farama.org/environments/third_party_environments/) and want to make sure the entries are as up to date as possible.

Solution

No response

Alternatives

No response

Additional context

No response

[Question] Can Safety-Gymnasium run on an SSH server?

Required prerequisites

Questions

Hi, a quick question: does safety-gymnasium have to run in a local environment, or can we run it on an SSH server?

I get the following errors from the glfw package when running the toy example provided in the examples folder on the SSH server:
.local/lib/python3.8/site-packages/glfw/__init__.py:912: GLFWError: (65544) b'X11: Failed to open display :0.0'
.local/lib/python3.8/site-packages/glfw/__init__.py:912: GLFWError: (65537) b'The GLFW library is not initialized'
python: /builds/florianrhiem/pyGLFW/glfw-3.3.8/src/monitor.c:445: glfwGetVideoMode: Assertion 'monitor != ((void *)0)' failed. Aborted (core dumped)

Or is there any way to bypass the GUI initialization and run safety-gymnasium GUI-free?
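
A possible answer sketch: MuJoCo's Python bindings support off-screen rendering backends selected via the MUJOCO_GL environment variable, so the GUI can be bypassed entirely. A minimal sketch, assuming your MuJoCo build ships the EGL (GPU, no display) or OSMesa (pure software) plugin:

import os

# Must be set before MuJoCo is first imported.
os.environ.setdefault('MUJOCO_GL', 'egl')  # or 'osmesa'

import safety_gymnasium

env = safety_gymnasium.make('SafetyPointGoal1-v0', render_mode='rgb_array')
obs, info = env.reset(seed=0)
frame = env.render()  # returns an array instead of opening a window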

[Feature Request] Adding support for standard gymnasium API by removing the `cost` from the return variables

Required prerequisites

Motivation

Hi all, I have been using this repo for a while and really appreciate your efforts in organizing all the tasks, which will for sure benefit the entire safe RL community. However, I am a bit confused regarding the design of explicitly returning the cost value when executing the step() function, i.e.:

obs, reward, cost, terminated, truncated, info = env.step(act)

I am curious what the particular motivation is for modifying the original gymnasium API, since I personally think that sticking to the standard one, i.e.,

obs, reward, terminated, truncated, info = env.step(act)

where "cost" is in the info dict`, would be better and more convenient for usage. My thoughts are as follows:

  1. Using the standard API can help us to better integrate with popular RL libraries, such as tianshou and stable-baselines3, without modifying too much about their data collection part. I noticed that the level-0 tasks are mostly constraint-free, so using the standard API can help us to test the basic environment more conveniently with other non-safe RL libraries. I personally have had this issue since modifying the collector in tianshou is not a trivial thing and might not be elegant.
  2. I think the current API only supports a single constraint setting, which might not be easily extended to multiple constraints. For example, if there are two constraints and we need to consider their costs separately, then the current cost returning variable seems to be in an awkward position. In contrast, adding more entries/keys in the info dict will be much more extendable as well as making the API clean and consistent. I do believe that extending the set of environments to multiple-constraint settings in the future would be interesting and promising.
  3. We may not be able to apply standard gymnasium wrappers easily due to the API changes.
  4. Converting the current API to the standard gymnasium with a wrapper may not be very elegant in terms of implementation.

I tried to use this repo in the standard gymnasium way with the following code snippet, but it failed:

import gymnasium as gym
import safety_gymnasium

env = gym.make("SafetyCarCircle1-v0")
env.reset()
env.step(env.action_space.sample())

If I want to convert it to standard gymnasium API, I have to do the following:

import gymnasium as gym
import safety_gymnasium as sgym

class SafetyGymnasiumWrapper(gym.Wrapper):
    def __init__(self, env: gym.Env):
        super().__init__(env)
    def step(self, action):
        obs, reward, cost, terminated, truncated, info = super().step(action)
        return obs, reward, terminated, truncated, info

env = SafetyGymnasiumWrapper(sgym.make("SafetyCarCircle1-v0"))
env.reset()
print(env.step(env.action_space.sample()))
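
As an editorial aside, the first issue on this page constructs 'SafetyPointGoal1Gymnasium-v0' via gym.make, which suggests standard-API variants with a Gymnasium suffix are already registered. A sketch, assuming the same suffix exists for this task and that the cost signal is then carried in info:

import gymnasium as gym
import safety_gymnasium  # importing registers the environments with gymnasium

env = gym.make('SafetyCarCircle1Gymnasium-v0')  # note the 'Gymnasium' suffix
obs, info = env.reset()
obs, reward, terminated, truncated, info = env.step(env.action_space.sample())
print(info)  # assumption: the cost appears as a key in info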

Solution

I am wondering if the authors could provide an argument when creating the environment, such that we can easily switch between the proposed step API and the standard gymnasium API. For example, the ideal usage could be something like:

import gymnasium as gym
import safety_gymnasium as sgym

# use standard gymnasium API
env = gym.make("SafetyCarCircle1-v0", standard_api=True)
env.reset()
obs, reward, terminated, truncated, info = env.step(env.action_space.sample())

# use the current safety-gymnasium API
env = sgym.make("SafetyCarCircle1-v0", standard_api=False)
env.reset()
obs, reward, cost, terminated, truncated, info = env.step(env.action_space.sample())

Ideally, the default value for the standard_api argument should be True to stick with the standard gymnasium style usage.

I am looking forward to hearing from the authors regarding your thoughts on this problem. Thanks.

Alternatives

No response

Additional context

By the way, it seems that the Discussions tab is not enabled; it always 404s on my end.

[Question] How to save images as videos in depth_array render mode?

Required prerequisites

Questions

Hi, I am using the depth_array render mode, so I get an np.ndarray when I call env.render(). I noticed that safety-gymnasium/examples/vision_env.py shows how to save a list of arrays as a video, but it is an rgb_array example; depth_array frames have different dimensions, so simply following that example does not work.
Please give me some help.
Thanks a lot.
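
A possible answer sketch: depth_array frames are typically 2-D float maps, so they need to be normalized to uint8 and replicated to three channels before most video writers will accept them. A minimal sketch, assuming each env.render() call returned a (H, W) float array collected into depth_frames (a hypothetical name), and using imageio (an assumption; MP4 output needs the imageio-ffmpeg backend):

import numpy as np
import imageio


def depth_to_rgb(depth):
    # Normalize a (H, W) float depth map to uint8 and replicate it to 3 channels.
    span = max(float(depth.max() - depth.min()), 1e-8)
    gray = (255 * (depth - depth.min()) / span).astype(np.uint8)
    return np.stack([gray, gray, gray], axis=-1)


frames = [depth_to_rgb(d) for d in depth_frames]
imageio.mimsave('depth_video.mp4', frames, fps=30)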

[Question] How to get the reward and cost design for each environment?

Required prerequisites

Questions

I am curious about the rewards and costs in the tasks: I want to know when rewards are granted and when costs are incurred. Which file should I check, e.g., for "SafetyPointGoal1-v0"?
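
While waiting for a pointer to the task source, the signals can also be probed empirically; a minimal sketch:

import safety_gymnasium

env = safety_gymnasium.make('SafetyPointGoal1-v0')
obs, info = env.reset(seed=0)
for _ in range(10):
    obs, reward, cost, terminated, truncated, info = env.step(env.action_space.sample())
    print(reward, cost, info)  # watch when reward grows and cost becomes nonzero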

[Question] How to use Safe Isaac Gym?

Required prerequisites

Questions

Hi, I would like to use the environments provided in the Safe Isaac Gym section. However, there is no documentation on how to import and make them. I used gym.make("ShadowHandCatchOver2UnderarmSafeJoint") to create the environment, but I get the error "Environment ShadowHandCatchOver2UnderarmSafeJoint is not registered in safety-gymnasium".
How can I make the environment?

[BUG] Cannot render MuJoCo env

Required prerequisites

What version of Safety-Gymnasium are you using?

1.0.0

System information

sys.version: 3.10.12 (main, Nov 20 2023, 15:14:05) [GCC 11.4.0]
sys.platform: linux
I am running on Colab.

Problem description

I am trying to render the SafetyRacecarButton2-v0 environment, but I get the errors below.

Reproducible example code

import safety_gymnasium
from gymnasium.wrappers import RecordVideo
from safety_gymnasium.wrappers import SafeAutoResetWrapper, SafeRescaleAction, SafeUnsqueeze
from safepo.common.wrappers import SafeNormalizeObservation
eval_env = safety_gymnasium.make("SafetyRacecarButton2-v0", max_episode_steps=1000, render_mode="rgb_array", camera_name="fixednear")
eval_env.reset(seed=456)
eval_env = SafeAutoResetWrapper(eval_env)
eval_env = SafeRescaleAction(eval_env, -1.0, 1.0)
eval_env = SafeNormalizeObservation(eval_env)
eval_env = SafeUnsqueeze(eval_env)

trigger = lambda t: t % 300 == 0
eval_env = RecordVideo(eval_env, video_folder="./ppo_lag_video", episode_trigger=trigger)

eval_env.reset()
eval_env.render()

Traceback

/usr/local/lib/python3.10/dist-packages/glfw/__init__.py:916: GLFWError: (65544) b'X11: The DISPLAY environment variable is missing'
  warnings.warn(message, GLFWError)
/usr/local/lib/python3.10/dist-packages/glfw/__init__.py:916: GLFWError: (65537) b'The GLFW library is not initialized'
  warnings.warn(message, GLFWError)
FatalError                                Traceback (most recent call last)
<ipython-input-6-b4c042a36a8b> in <cell line: 2>()
      1 eval_env.reset()
----> 2 eval_env.render()

12 frames
/usr/local/lib/python3.10/dist-packages/gymnasium/core.py in render(self)
    416     def render(self) -> RenderFrame | list[RenderFrame] | None:
    417         """Uses the :meth:`render` of the :attr:`env` that can be overwritten to change the returned data."""
--> 418         return self.env.render()
    419 
    420     def close(self):

/usr/local/lib/python3.10/dist-packages/gymnasium/core.py in render(self)
    416     def render(self) -> RenderFrame | list[RenderFrame] | None:
    417         """Uses the :meth:`render` of the :attr:`env` that can be overwritten to change the returned data."""
--> 418         return self.env.render()
    419 
    420     def close(self):

/usr/local/lib/python3.10/dist-packages/gymnasium/core.py in render(self)
    416     def render(self) -> RenderFrame | list[RenderFrame] | None:
    417         """Uses the :meth:`render` of the :attr:`env` that can be overwritten to change the returned data."""
--> 418         return self.env.render()
    419 
    420     def close(self):

/usr/local/lib/python3.10/dist-packages/gymnasium/core.py in render(self)
    416     def render(self) -> RenderFrame | list[RenderFrame] | None:
    417         """Uses the :meth:`render` of the :attr:`env` that can be overwritten to change the returned data."""
--> 418         return self.env.render()
    419 
    420     def close(self):

/usr/local/lib/python3.10/dist-packages/gymnasium/core.py in render(self)
    416     def render(self) -> RenderFrame | list[RenderFrame] | None:
    417         """Uses the :meth:`render` of the :attr:`env` that can be overwritten to change the returned data."""
--> 418         return self.env.render()
    419 
    420     def close(self):

/usr/local/lib/python3.10/dist-packages/gymnasium/wrappers/order_enforcing.py in render(self, *args, **kwargs)
     68                 "set `disable_render_order_enforcing=True` on the OrderEnforcer wrapper."
     69             )
---> 70         return self.env.render(*args, **kwargs)
     71 
     72     @property

/usr/local/lib/python3.10/dist-packages/gymnasium/wrappers/env_checker.py in render(self, *args, **kwargs)
     61         if self.checked_render is False:
     62             self.checked_render = True
---> 63             return env_render_passive_checker(self.env, *args, **kwargs)
     64         else:
     65             return self.env.render(*args, **kwargs)

/usr/local/lib/python3.10/dist-packages/gymnasium/utils/passive_env_checker.py in env_render_passive_checker(env)
    389             )
    390 
--> 391     result = env.render()
    392     if env.render_mode is not None:
    393         _check_render_return(env.render_mode, result)

/usr/local/lib/python3.10/dist-packages/safety_gymnasium/builder.py in render(self)
    315             not self.task.observe_vision
    316         ), 'When you use vision envs, you should not call this function explicitly.'
--> 317         return self.task.render(cost=self.cost, **asdict(self.render_parameters))
    318 
    319     @property

/usr/local/lib/python3.10/dist-packages/safety_gymnasium/bases/underlying.py in render(self, width, height, mode, camera_id, camera_name, cost)
    513                 )
    514 
--> 515         self._get_viewer(mode)
    516 
    517         # Turn all the geom groups on

/usr/local/lib/python3.10/dist-packages/safety_gymnasium/bases/underlying.py in _get_viewer(self, mode)
    566                 )
    567             elif mode in {'rgb_array', 'depth_array'}:
--> 568                 self.viewer = OffScreenViewer(self.model, self.data)
    569             else:
    570                 raise AttributeError(f'Unexpected mode: {mode}')

/usr/local/lib/python3.10/dist-packages/gymnasium/envs/mujoco/mujoco_rendering.py in __init__(self, model, data)
    142         self._get_opengl_backend(width, height)
    143 
--> 144         super().__init__(model, data, width, height)
    145 
    146         self._init_camera()

/usr/local/lib/python3.10/dist-packages/gymnasium/envs/mujoco/mujoco_rendering.py in __init__(self, model, data, width, height)
     59 
     60         # Keep in Mujoco Context
---> 61         self.con = mujoco.MjrContext(self.model, mujoco.mjtFontScale.mjFONTSCALE_150)
     62 
     63         self._set_mujoco_buffer()

FatalError: gladLoadGL error

Expected behavior

No response

Additional context

No response
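
A possible answer sketch: the GLFW errors say no X display exists on Colab. Two common workarounds are setting MUJOCO_GL=egl before importing anything MuJoCo-related (as sketched in the SSH question above) or starting a virtual display. A sketch of the latter; xvfb and pyvirtualdisplay are standard tools but not part of this repo:

# In a Colab cell first:
#   !apt-get install -y xvfb
#   !pip install pyvirtualdisplay

from pyvirtualdisplay import Display

display = Display(visible=0, size=(1400, 900))
display.start()  # provides the DISPLAY variable that GLFW is missing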

[BUG] Aborted (core dumped)

Required prerequisites

What version of Safety Gymnasium are you using?

0.1.2.dev2+ge3e3fa1

System information

3.8.16 (default, Jan 17 2023, 23:13:24)
[GCC 11.2.0] linux
0.1.2.dev2+ge3e3fa1

Problem description

When I execute examples/env.py, I get the following error:

root@autodl-container-b37a11a83c-666dc781:~/safety-gymnasium/examples# python env.py 
/root/miniconda3/lib/python3.8/site-packages/glfw/__init__.py:912: GLFWError: (65544) b'X11: The DISPLAY environment variable is missing'
  warnings.warn(message, GLFWError)
/root/miniconda3/lib/python3.8/site-packages/glfw/__init__.py:912: GLFWError: (65537) b'The GLFW library is not initialized'
  warnings.warn(message, GLFWError)
python: /builds/florianrhiem/pyGLFW/glfw-3.3.8/src/monitor.c:445: glfwGetVideoMode: Assertion `monitor != ((void *)0)' failed.
Aborted (core dumped)

Reproducible example code

The Python snippets:

# Copyright 2022 Safety Gymnasium Team. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#     http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ==============================================================================
"""Examples for environments."""

import argparse

import safety_gymnasium


def run_random(env_name):
    """Random run."""
    env = safety_gymnasium.make(env_name, render_mode='human')
    obs, info = env.reset()  # pylint: disable=unused-variable
    # Use below to specify seed.
    # obs, _ = env.reset(seed=0)
    terminated, truncated = False, False
    ep_ret, ep_cost = 0, 0
    while True:
        if terminated or truncated:
            print(f'Episode Return: {ep_ret} \t Episode Cost: {ep_cost}')
            ep_ret, ep_cost = 0, 0
            obs, info = env.reset()  # pylint: disable=unused-variable
        assert env.observation_space.contains(obs)
        act = env.action_space.sample()
        assert env.action_space.contains(act)
        # Use the environment's built_in max_episode_steps
        if hasattr(env, '_max_episode_steps'):  # pylint: disable=unused-variable
            max_ep_len = env._max_episode_steps  # pylint: disable=unused-variable,protected-access
        # pylint: disable-next=unused-variable
        obs, reward, cost, terminated, truncated, info = env.step(act)
        #print("obs",env.observation_space.spaces)

        ep_ret += reward
        ep_cost += cost


if __name__ == '__main__':
    parser = argparse.ArgumentParser()
    parser.add_argument('--env', default='SafetyCarGoal2-v0')
    args = parser.parse_args()
    run_random(args.env)

Command lines:

python env.py

Traceback

No response

Expected behavior

No response

Additional context

No response

[Question] gladLoadGL error

Required prerequisites

Questions

I encountered this issue when I use env.render() with render_mode="rgb_array":
[screenshot from the original issue]

In my bashrc there is a line related to the OpenGL library, export LD_PRELOAD=/usr/lib/x86_64-linux-gnu/libGLEW.so:/usr/lib/nvidia-415/libGL.so, but it still doesn't work.

[Feature Request] Enable type annotation checker

Required prerequisites

Motivation

I found many wrong type annotations in the codebase after a quick glance.

https://github.com/OmniSafeAI/safety-gymnasium/blob/c1c598ea45f23535405f9941695800bae41d3a4a/safety_gymnasium/utils/random_generator.py#L170-L173

-def constrain_placement(self, placement: dict, keepout: float) -> tuple[float]:
+def constrain_placement(self, placement: list[float], keepout: float) -> tuple[float, float, float, float]:

Also, mypy reports:

Found 129 errors in 13 files (checked 82 source files)

We should enable a CI workflow to check that the type annotations are correct.

Solution

Update the type annotations, enable mypy in the CI workflow, and remove:

https://github.com/OmniSafeAI/safety-gymnasium/blob/c1c598ea45f23535405f9941695800bae41d3a4a/pyproject.toml#L150-L156

Alternatives

No response

Additional context

No response

[BUG] randomize_layout fixes environment layout regardless of initial random seed

Required prerequisites

What version of Safety-Gymnasium are you using?

1.2.1

System information

Installed from source. On python 3.10.13.

Problem description

I'm using the safe_navigation tasks within Safety-Gymnasium, e.g. button_level2. My intention is to give the environment a new random layout on each new training run. However, when self.mechanism.conf.randomize_layout is set to False in the __init__ method of the environment, the layout does not change, even when numpy's random seeds are set to different values. Even when calling self.random_generator.set_random_seed(seed) in the __init__ method of the environment, the layout does not change. Setting a random seed in calls to env.reset(seed=seed) is not allowed when self.mechanism.conf.randomize_layout = False. I previously asked in #87 how to fix the environment layout, expecting fixed layouts to differ across separate executions of the code, but I am noticing this does not appear to be the case.

What changes need to be made, if any, to ensure that the initial layouts differ across independent executions of the code when randomize_layout is set to False?
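
One hedged workaround under standard Gymnasium seeding semantics: keep randomize_layout enabled and derive the layout from a per-run seed passed to reset(), re-using that seed whenever the same layout is needed within a run. A sketch (the task id is illustrative):

import safety_gymnasium

run_seed = 12345  # vary across runs to get different fixed layouts

env = safety_gymnasium.make('SafetyPointButton2-v0')
obs, info = env.reset(seed=run_seed)  # layout drawn from this seed
obs, info = env.reset(seed=run_seed)  # same seed -> same layout again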

Reproducible example code

Set self.mechanism.conf.randomize_layout to False in any safe_navigation task, change the initial random seed, and observe the layouts.

Traceback

No response

Expected behavior

No response

Additional context

No response

[Question] Rendering when using vision environments

Required prerequisites

Questions

I am using the code below, copied from the documentation, to play around with vision environments. When I run it, I get the error shown after the snippet.
I want an env where the vision camera can be used as the state input. Thank you for your help!

import safety_gymnasium
import matplotlib.pyplot as plt

env = safety_gymnasium.make('SafetyCarGoal1Vision-v0', render_mode='human')

obs, info = env.reset()
terminated, truncated = False, False
ep_ret, ep_cost = 0, 0
for _ in range(1000):
    assert env.observation_space.contains(obs)
    act = env.action_space.sample()
    assert env.action_space.contains(act)
    # modified for Safe RL, added cost
    obs, reward, cost, terminated, truncated, info = env.step(act)
    ep_ret += reward
    ep_cost += cost
    if terminated or truncated:
        obs, info = env.reset()

env.close()
  File ~/Documents/safety-gymnasium/safety_gymnasium/utils/passive_env_checker.py:25 in env_step_passive_checker
    result = env.step(action)

  File ~/Documents/safety-gymnasium/safety_gymnasium/builder.py:248 in step
    self.render()

  File ~/Documents/safety-gymnasium/safety_gymnasium/builder.py:314 in render
    assert (

AssertionError: When you use vision envs, you should not call this function explicitly.
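
A possible answer sketch: the assertion says that for the *Vision tasks the camera image is produced internally as part of the observation, so render() should not be called explicitly (and render_mode='human' triggers it). A sketch; the exact observation layout is an assumption, so inspect the space to confirm where the image lives:

import safety_gymnasium

env = safety_gymnasium.make('SafetyCarGoal1Vision-v0')  # no render_mode
obs, info = env.reset()
print(env.observation_space)  # look for an image-shaped entry to use as state input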

[Question] How to efficiently design a custom navigation environment?

Required prerequisites

Questions

Hello, I am trying to create a simple custom goal environment. I managed to come up with the following code to make an environment:

from __future__ import annotations

from typing import Dict
import safety_gymnasium
from safety_gymnasium.assets.geoms import Goal
from safety_gymnasium.assets.geoms import Hazards
from safety_gymnasium.agents.point import Point
from safety_gymnasium.agents.car import Car
from safety_gymnasium.bases.base_task import BaseTask
from safety_gymnasium.builder import Builder
from safety_gymnasium.utils.task_utils import get_task_class_name

agents = {
    "point": Point,
    "car": Car,
}
        
class SimpleGoalLevel1(BaseTask):
    """
    Custom safety gym environment
    """
    def __init__(
        self, 
        config:Dict=dict(),
    ):
        super(SimpleGoalLevel1, self).__init__(config=config)
        # Increased difficulty and randomization
        self.placements_conf.extents = [-1.5, -1.5, 1.5, 1.5]

        # Instantiate and register hazards
        self._add_geoms(Hazards(
            num=1,
            size=0.7,
            locations=[(0,0)],
            is_lidar_observed=True,
            is_constrained=True,
            keepout=0.705))

        # Instantiate and register the goal
        self._add_geoms(Goal(
            keepout=0.305, 
            size=0.3, 
            locations=[(1.1, 1.1)],
            is_lidar_observed=True))
        
        self.lidar_conf.max_dist = 3
        self.lidar_conf.num_bins = 16
        
        self.last_dist_goal = None
    
    def _build_agent(self, agent_name:str) -> None:
        if agent_name in agents:
            self.agent = agents[agent_name](random_generator=self.random_generator)
        else:
            super()._build_agent(agent_name=agent_name)
         
    def calculate_reward(self):
        """Determine reward depending on the agent and tasks."""
        # pylint: disable=no-member
        reward = 0.0
        dist_goal = self.dist_goal()
        reward += (self.last_dist_goal - dist_goal) * self.goal.reward_distance
        self.last_dist_goal = dist_goal

        if self.goal_achieved:
            reward += self.goal.reward_goal

        return reward
            
    def specific_reset(self):
        pass

    def specific_step(self):
        pass

    def update_world(self):
        """Build a new goal position, maybe with resampling due to hazards."""
        self.build_goal_position()
        self.last_dist_goal = self.dist_goal()

    @property
    def goal_achieved(self):
        """Whether the goal of task is achieved."""
        # pylint: disable-next=no-member
        return self.dist_goal() <= self.goal.size

tasks = {
    "SimpleGoalLevel1": SimpleGoalLevel1
}

class CustomBuilder(Builder):
    def _get_task(self):
        class_name = get_task_class_name(self.task_id)
        if class_name in tasks:
            task_class = tasks[class_name]
            task = task_class(config=self.config)
            task.build_observation_space()
        else:
            task = super()._get_task()
        return task

if __name__ == "__main__":
    env = CustomBuilder(task_id="SafetyPointSimpleGoal1-v0",
                        config={"agent_name":"point"})
    s, i = env.reset()
    env = safety_gymnasium.wrappers.SafetyGymnasium2Gymnasium(env)
    for k in range(1000):
        s, a, d, t, i = env.step(env.action_space.sample())

But it feels like a very complicated way to make an environment, and I had to use a few hacks to make it work. There are some parts of the code I'd like to avoid. For example:

  1. Overloading the method _build_agent. If this is not done, the following error is thrown:
  File "/miniconda3/envs/ray/lib/python3.8/site-packages/safety_gymnasium/bases/underlying.py", line 227, in __init__
    self._build_agent(self.agent_name)
  File "/miniconda3/envs/ray/lib/python3.8/site-packages/safety_gymnasium/bases/underlying.py", line 251, in _build_agent
    self.agent = agent_cls(random_generator=self.random_generator)
TypeError: 'module' object is not callable
  2. To pass the task to the environment, I am creating a subclass of the Builder class with the overloaded _get_task method.
  3. When I create the environment, I have to pass the robot name both in the config dictionary and in the task_id, that is: env = CustomBuilder(task_id="SafetyPointSimpleGoal1-v0", config={"agent_name":"point"})

I have the following questions:

  1. Can this solution be simplified?
  2. Why is the error above thrown if the method _build_agent is not overloaded?
  3. Can the overloaded task be passed to the builder without inheritance?
  4. Why does the agent need to be specified twice, in the task_id and in the config?

Thanks in advance!
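
On question 1, one possible simplification (a hedged sketch, not taken from this repo's docs): register the CustomBuilder with gymnasium once and then create it via gym.make, which also removes the need to call the builder directly:

import gymnasium as gym
from gymnasium.envs.registration import register

register(
    id='SafetyPointSimpleGoal1-v0',
    entry_point=CustomBuilder,  # the Builder subclass defined above
    kwargs={'task_id': 'SafetyPointSimpleGoal1-v0', 'config': {'agent_name': 'point'}},
)

# disable_env_checker avoids gymnasium's 5-tuple step check on the 6-tuple API
env = gym.make('SafetyPointSimpleGoal1-v0', disable_env_checker=True)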

[Feature Request] Creating a custom navigation task from the config dictionary similarly to the safety gym

Required prerequisites

Motivation

Motivated by issue #96, I propose a feature similar to the original Safety Gym: creating a custom navigation environment purely from the config dictionary. In this case, the hazards, the vases, the goals, etc. can be specified in the config. For example, to specify a simple goal environment with one large hazard, the config would look like:

    config = {
        'task_name': 'Goal',
        'agent_name': "Point",
        "lidar_conf.max_dist": 3,
        "lidar_conf.num_bins": 16,
        "placements_conf.extents": [-1.5, -1.5, 1.5, 1.5],
        "Hazards": dict(
            num=1,
            size=0.7,
            locations=[(0,0)],
            is_lidar_observed=True,
            is_constrained=True,
            keepout=0.705),
        "Goal": dict(
            keepout=0.305,
            size=0.3,
            locations=[(1.1, 1.1)],
            is_lidar_observed=True)
        }

The environment is then created through make, passing the config:

env_id = "SafetyCustomNavigation-v0"
env = safety_gymnasium.make(env_id, config=config)

I think this can be done for vision tasks as well.

Solution

  1. Introduce a custom level base class for all tasks. For example, for the Goal task we can define:
class GoalLevelC(BaseTask):
    """An agent must navigate to a goal."""

    def calculate_reward(self):
        """Determine reward depending on the agent and tasks."""
        # pylint: disable=no-member
        reward = 0.0
        dist_goal = self.dist_goal()
        reward += (self.last_dist_goal - dist_goal) * self.goal.reward_distance
        self.last_dist_goal = dist_goal

        if self.goal_achieved:
            reward += self.goal.reward_goal

        return reward

    def specific_reset(self):
        pass

    def specific_step(self):
        pass

    def update_world(self):
        """Build a new goal position, maybe with resampling due to hazards."""
        self.build_goal_position()
        self.last_dist_goal = self.dist_goal()

    @property
    def goal_achieved(self):
        """Whether the goal of task is achieved."""
        # pylint: disable-next=no-member
        return self.dist_goal() <= self.goal.size

Note that the __init__ method is not overloaded in this class.

  2. The method _parse in the class Underlying can handle adding hazards, vases, gremlins, and goals. For example, to handle hazards and goals we can modify the method to:
    import safety_gymnasium.assets.geoms as geoms
    for key, value in config.items():
        if '.' in key:
            obj, key = key.split('.')
            assert hasattr(self, obj) and hasattr(getattr(self, obj), key), f'Bad key {key}'
            setattr(getattr(self, obj), key, value)
        elif hasattr(geoms, key):
            self._add_geoms(getattr(geoms, key)(**value))
        else:
            assert hasattr(self, key), f'Bad key {key}'
            setattr(self, key, value)

Alternatives

No response

Additional context

No response

[BUG] python3.8/multiprocessing/connection.py EOFError

Required prerequisites

What version of Safety Gymnasium are you using?

0.2.0

System information

print(sys.version, sys.platform)
3.8.16 (default, Mar 2 2023, 03:21:46)
[GCC 11.2.0] linux
print(safety_gymnasium.__version__)
0.2.0

Problem description

Running the first code example in the documentation (tested on my Ubuntu 16.04, 18.04, and Windows machines), I met the same problem in the multiprocessing files:
File "/home/tian/anaconda3/envs/safe/lib/python3.8/multiprocessing/connection.py", line 414, in _recv_bytes
File "/home/tian/anaconda3/envs/safe/lib/python3.8/multiprocessing/connection.py", line 383, in _recv
EOFError:

Reproducible example code

The Python snippets:

Command lines:

Extra dependencies:


Steps to reproduce:

Running the following code, which is from the documentation:

import safety_gymnasium

env = safety_gymnasium.vector.make("SafetyCarGoal1-v0", render_mode="human", num_envs=8)
observation, info = env.reset(seed=0)
for _ in range(1000):
    action = env.action_space.sample()  # this is where you would insert your policy
    observation, reward, terminated, truncated, info = env.step(action)

    if terminated or truncated:
        observation, info = env.reset()
env.close()

Traceback

(safe) tian@ROG:~/safety-gymnasium$ python testrepo.py 
/home/tian/anaconda3/envs/safe/lib/python3.8/site-packages/glfw/__init__.py:916: GLFWError: (65543) b'GLX: Failed to create context: BadValue (integer parameter out of range for operation)'
  warnings.warn(message, GLFWError)
/home/tian/anaconda3/envs/safe/lib/python3.8/site-packages/glfw/__init__.py:916: GLFWError: (65543) b'GLX: Failed to create context: BadValue (integer parameter out of range for operation)'
  warnings.warn(message, GLFWError)
/home/tian/anaconda3/envs/safe/lib/python3.8/site-packages/glfw/__init__.py:916: GLFWError: (65543) b'GLX: Failed to create context: BadValue (integer parameter out of range for operation)'
  warnings.warn(message, GLFWError)
/home/tian/anaconda3/envs/safe/lib/python3.8/site-packages/glfw/__init__.py:916: GLFWError: (65543) b'GLX: Failed to create context: BadValue (integer parameter out of range for operation)'
  warnings.warn(message, GLFWError)
/home/tian/anaconda3/envs/safe/lib/python3.8/site-packages/glfw/__init__.py:916: GLFWError: (65538) b'Cannot set swap interval without a current OpenGL or OpenGL ES context'
  warnings.warn(message, GLFWError)
/home/tian/anaconda3/envs/safe/lib/python3.8/site-packages/glfw/__init__.py:916: GLFWError: (65538) b'Cannot set swap interval without a current OpenGL or OpenGL ES context'
  warnings.warn(message, GLFWError)
/home/tian/anaconda3/envs/safe/lib/python3.8/site-packages/glfw/__init__.py:916: GLFWError: (65538) b'Cannot set swap interval without a current OpenGL or OpenGL ES context'
  warnings.warn(message, GLFWError)
/home/tian/anaconda3/envs/safe/lib/python3.8/site-packages/glfw/__init__.py:916: GLFWError: (65538) b'Cannot set swap interval without a current OpenGL or OpenGL ES context'
  warnings.warn(message, GLFWError)
python: /builds/florianrhiem/pyGLFW/glfw-3.3.8/src/window.c:646: glfwGetFramebufferSize: Assertion `window != ((void *)0)' failed.
python: /builds/florianrhiem/pyGLFW/glfw-3.3.8/src/window.c:646: glfwGetFramebufferSize: Assertion `window != ((void *)0)' failed.
python: /builds/florianrhiem/pyGLFW/glfw-3.3.8/src/window.c:646: glfwGetFramebufferSize: Assertion `window != ((void *)0)' failed.
python: /builds/florianrhiem/pyGLFW/glfw-3.3.8/src/window.c:646: glfwGetFramebufferSize: Assertion `window != ((void *)0)' failed.
/home/tian/anaconda3/envs/safe/lib/python3.8/site-packages/glfw/__init__.py:916: GLFWError: (65543) b'GLX: Failed to create context: BadValue (integer parameter out of range for operation)'
  warnings.warn(message, GLFWError)
/home/tian/anaconda3/envs/safe/lib/python3.8/site-packages/glfw/__init__.py:916: GLFWError: (65543) b'GLX: Failed to create context: BadValue (integer parameter out of range for operation)'
  warnings.warn(message, GLFWError)
/home/tian/anaconda3/envs/safe/lib/python3.8/site-packages/glfw/__init__.py:916: GLFWError: (65538) b'Cannot set swap interval without a current OpenGL or OpenGL ES context'
  warnings.warn(message, GLFWError)
python: /builds/florianrhiem/pyGLFW/glfw-3.3.8/src/window.c:646: glfwGetFramebufferSize: Assertion `window != ((void *)0)' failed.
/home/tian/anaconda3/envs/safe/lib/python3.8/site-packages/glfw/__init__.py:916: GLFWError: (65538) b'Cannot set swap interval without a current OpenGL or OpenGL ES context'
  warnings.warn(message, GLFWError)
python: /builds/florianrhiem/pyGLFW/glfw-3.3.8/src/window.c:646: glfwGetFramebufferSize: Assertion `window != ((void *)0)' failed.
/home/tian/anaconda3/envs/safe/lib/python3.8/site-packages/glfw/__init__.py:916: GLFWError: (65543) b'GLX: Failed to create context: BadValue (integer parameter out of range for operation)'
  warnings.warn(message, GLFWError)
/home/tian/anaconda3/envs/safe/lib/python3.8/site-packages/glfw/__init__.py:916: GLFWError: (65543) b'GLX: Failed to create context: BadValue (integer parameter out of range for operation)'
  warnings.warn(message, GLFWError)
/home/tian/anaconda3/envs/safe/lib/python3.8/site-packages/glfw/__init__.py:916: GLFWError: (65538) b'Cannot set swap interval without a current OpenGL or OpenGL ES context'
  warnings.warn(message, GLFWError)
python: /builds/florianrhiem/pyGLFW/glfw-3.3.8/src/window.c:646: glfwGetFramebufferSize: Assertion `window != ((void *)0)' failed.
/home/tian/anaconda3/envs/safe/lib/python3.8/site-packages/glfw/__init__.py:916: GLFWError: (65538) b'Cannot set swap interval without a current OpenGL or OpenGL ES context'
  warnings.warn(message, GLFWError)
python: /builds/florianrhiem/pyGLFW/glfw-3.3.8/src/window.c:646: glfwGetFramebufferSize: Assertion `window != ((void *)0)' failed.
Traceback (most recent call last):
  File "testrepo.py", line 6, in <module>
    observation, reward, terminated, truncated, info = env.step(action)
  File "/home/tian/anaconda3/envs/safe/lib/python3.8/site-packages/gymnasium/vector/vector_env.py", line 197, in step
    return self.step_wait()
  File "/home/tian/safety-gymnasium/safety_gymnasium/vector/async_vector_env.py", line 122, in step_wait
    result, success = pipe.recv()
  File "/home/tian/anaconda3/envs/safe/lib/python3.8/multiprocessing/connection.py", line 250, in recv
    buf = self._recv_bytes()
  File "/home/tian/anaconda3/envs/safe/lib/python3.8/multiprocessing/connection.py", line 414, in _recv_bytes
    buf = self._recv(4)
  File "/home/tian/anaconda3/envs/safe/lib/python3.8/multiprocessing/connection.py", line 383, in _recv
    raise EOFError
EOFError
/home/tian/anaconda3/envs/safe/lib/python3.8/site-packages/gymnasium/vector/async_vector_env.py:457: UserWarning: WARN: Calling `close` while waiting for a pending call to `step` to complete.
Exception ignored in: <function AsyncVectorEnv.__del__ at 0x7f11c2ebd430>
Traceback (most recent call last):
  File "/home/tian/anaconda3/envs/safe/lib/python3.8/site-packages/gymnasium/vector/async_vector_env.py", line 546, in __del__
  File "/home/tian/anaconda3/envs/safe/lib/python3.8/site-packages/gymnasium/vector/vector_env.py", line 265, in close
  File "/home/tian/anaconda3/envs/safe/lib/python3.8/site-packages/gymnasium/vector/async_vector_env.py", line 461, in close_extras
  File "/home/tian/safety-gymnasium/safety_gymnasium/vector/async_vector_env.py", line 122, in step_wait
  File "/home/tian/anaconda3/envs/safe/lib/python3.8/multiprocessing/connection.py", line 250, in recv
  File "/home/tian/anaconda3/envs/safe/lib/python3.8/multiprocessing/connection.py", line 414, in _recv_bytes
  File "/home/tian/anaconda3/envs/safe/lib/python3.8/multiprocessing/connection.py", line 383, in _recv
EOFError:

Expected behavior

No response

Additional context

No response

[BUG] can't use autoreset

Required prerequisites

What version of Safety Gymnasium are you using?

0.1.0

System information

In [0]: print(sys.version, sys.platform)
3.8.16 (default, Jan 17 2023, 23:13:24) 
[GCC 11.2.0] linux

In [1]: safety_gymnasium.__version__
Out[2]: '0.1.0'

Problem description

I can't use autoreset in a single env (autoreset works well in the vector env).

I guess it is because you need to redefine an AutoResetWrapper in safety_gymnasium: when I use autoreset by passing the argument autoreset=True to the make function, safety_gymnasium uses the AutoResetWrapper from gymnasium, which doesn't handle the cost.

Reproducible example code

The Python snippets:

import safety_gymnasium
import numpy as np

env = safety_gymnasium.make('SafetyPointGoal1-v0', autoreset=True)
env.reset()
env.step(np.array([0.0, 0.0]))

Traceback

----> 1 env.step(np.array([0.0, 0.0]))

File ~/anaconda3/envs/typing/lib/python3.8/site-packages/gymnasium/wrappers/autoreset.py:45, in AutoResetWrapper.step(self, action)
     36 def step(self, action):
     37     """Steps through the environment with action and resets the environment if a terminated or truncated signal is encountered.
     38 
     39     Args:
   (...)
     43         The autoreset environment :meth:`step`
     44     """
---> 45     obs, reward, terminated, truncated, info = self.env.step(action)
     46     if terminated or truncated:
     48         new_obs, new_info = self.env.reset()

ValueError: too many values to unpack (expected 5)

Expected behavior

No response

Additional context

If you agree, I can open a pull request with a fix that adds an AutoResetWrapper, following CONTRIBUTING.md.
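
For reference, the wrapper the reporter describes could look roughly like the sketch below, mirroring gymnasium's AutoResetWrapper bookkeeping but unpacking the six-value step (the SafeAutoResetWrapper imported in the rendering issue above suggests such a wrapper was eventually added); this is not the repository's actual implementation:

import gymnasium as gym


class SafeAutoResetWrapper(gym.Wrapper):
    """Autoreset wrapper aware of the 6-tuple (cost-returning) step API."""

    def step(self, action):
        obs, reward, cost, terminated, truncated, info = self.env.step(action)
        if terminated or truncated:
            new_obs, new_info = self.env.reset()
            new_info['final_observation'] = obs  # mirror gymnasium's bookkeeping
            new_info['final_info'] = info
            obs, info = new_obs, new_info
        return obs, reward, cost, terminated, truncated, info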

[Feature Request] Safety gym bug report

Required prerequisites

Motivation

I am considering switching from Safety Gym to Safety-Gymnasium and want to figure out the differences between them. I heard that Safety-Gymnasium fixed several bugs from Safety Gym, but when I checked the bug report, it is an empty document. Could you please help update it? Thanks!

Solution

No response

Alternatives

No response

Additional context

No response
