instadeepai / jumanji

🕹️ A diverse suite of scalable reinforcement learning environments in JAX

Home Page: https://instadeepai.github.io/jumanji

License: Apache License 2.0

JavaScript 0.01% Python 100.00%

jax python reinforcement-learning research

jumanji's Introduction

Environments | Installation | Quickstart | Training | Citation | Docs

Jumanji @ ICLR 2024

Jumanji has been accepted at ICLR 2024. Check out our research paper.

Welcome to the Jungle! 🌴

Jumanji is a diverse suite of scalable reinforcement learning environments written in JAX. It now features 22 environments!

Jumanji is helping pioneer a new wave of hardware-accelerated research and development in the field of RL. Jumanji's high-speed environments enable faster iteration and large-scale experimentation while simultaneously reducing complexity. Originating in the research team at InstaDeep, Jumanji is now developed jointly with the open-source community. To join us in these efforts, reach out, raise issues and read our contribution guidelines or just star 🌟 to stay up to date with the latest developments!

Goals 🚀

Provide a simple, well-tested API for JAX-based environments.
Make research in RL more accessible.
Facilitate the research on RL for problems in the industry and help close the gap between research and industrial applications.
Provide environments whose difficulty can be scaled to be arbitrarily hard.

Overview 🦜

🥑 Environment API: core abstractions for JAX-based environments.
🕹️ Environment Suite: a collection of RL environments ranging from simple games to NP-hard combinatorial problems.
🍬 Wrappers: easily connect to your favourite RL frameworks and libraries such as Acme, Stable Baselines3, RLlib, OpenAI Gym and DeepMind-Env through our dm_env and gym wrappers.
🎓 Examples: guides to facilitate Jumanji's adoption and highlight the added value of JAX-based environments.
🏎️ Training: example agents that can be used as inspiration for the agents one may implement in their research.

Environments 🌍

Jumanji provides a diverse range of environments ranging from simple games to NP-hard combinatorial problems.

Environment	Category	Registered Version(s)	Source	Description
🔢 Game2048	Logic	`Game2048-v1`	code	doc
🎨 GraphColoring	Logic	`GraphColoring-v0`	code	doc
💣 Minesweeper	Logic	`Minesweeper-v0`	code	doc
🎲 RubiksCube	Logic	`RubiksCube-v0` `RubiksCube-partly-scrambled-v0`	code	doc
🔀 SlidingTilePuzzle	Logic	`SlidingTilePuzzle-v0`	code	doc
✏️ Sudoku	Logic	`Sudoku-v0` `Sudoku-very-easy-v0`	code	doc
📦 BinPack (3D BinPacking Problem)	Packing	`BinPack-v1`	code	doc
🧩 FlatPack (2D Grid Filling Problem)	Packing	`FlatPack-v0`	code	doc
🏭 JobShop (Job Shop Scheduling Problem)	Packing	`JobShop-v0`	code	doc
🎒 Knapsack	Packing	`Knapsack-v1`	code	doc
▒ Tetris	Packing	`Tetris-v0`	code	doc
🧹 Cleaner	Routing	`Cleaner-v0`	code	doc
🔗 Connector	Routing	`Connector-v2`	code	doc
🚚 CVRP (Capacitated Vehicle Routing Problem)	Routing	`CVRP-v1`	code	doc
🚚 MultiCVRP (Multi-Agent Capacitated Vehicle Routing Problem)	Routing	`MultiCVRP-v0`	code	doc
🔍 Maze	Routing	`Maze-v0`	code	doc
🤖 RobotWarehouse	Routing	`RobotWarehouse-v0`	code	doc
🐍 Snake	Routing	`Snake-v1`	code	doc
📬 TSP (Travelling Salesman Problem)	Routing	`TSP-v1`	code	doc
Multi Minimum Spanning Tree Problem	Routing	`MMST-v0`	code	doc
ᗧ•••ᗣ•• PacMan	Routing	`PacMan-v1`	code	doc
👾 Sokoban	Routing	`Sokoban-v0`	code	doc

Installation 🎬

You can install the latest release of Jumanji from PyPI:

pip install -U jumanji

Alternatively, you can install the latest development version directly from GitHub:

pip install git+https://github.com/instadeepai/jumanji.git

Jumanji has been tested on Python 3.8 and 3.9. Note that because the installation of JAX differs depending on your hardware accelerator, we advise users to explicitly install the correct JAX version (see the official installation guide).

Rendering: Matplotlib is used for rendering all the environments. To visualize the environments you will need a GUI backend. For example, on Linux, you can install Tk via: apt-get install python3-tk, or using conda: conda install tk. Check out Matplotlib backends for a list of backends you can use.

Quickstart ⚡

RL practitioners will find Jumanji's interface familiar as it combines the widely adopted OpenAI Gym and DeepMind Environment interfaces. From OpenAI Gym, we adopted the idea of a registry and the render method, while our TimeStep structure is inspired by DeepMind Environment.

Basic Usage 🧑‍💻

import jax
import jumanji

# Instantiate a Jumanji environment using the registry
env = jumanji.make('Snake-v1')

# Reset your (jit-able) environment
key = jax.random.PRNGKey(0)
state, timestep = jax.jit(env.reset)(key)

# (Optional) Render the env state
env.render(state)

# Interact with the (jit-able) environment
action = env.action_spec.generate_value()          # Action selection (dummy value here)
state, timestep = jax.jit(env.step)(state, action)   # Take a step and observe the next state and time step

state represents the internal state of the environment: it contains all the information required to take a step when executing an action. This should not be confused with the observation contained in the timestep, which is the information perceived by the agent.
timestep is a dataclass containing step_type, reward, discount, observation and extras. This structure is similar to dm_env.TimeStep except for the extras field, which was added to let users log environment metrics that are neither part of the agent's observation nor part of the environment's internal state.

Advanced Usage 🧑‍🔬

Being written in JAX, Jumanji's environments benefit from many of its features including automatic vectorization/parallelization (jax.vmap, jax.pmap) and JIT-compilation (jax.jit), which can be composed arbitrarily. We provide an example of a more advanced usage in the advanced usage guide.

Registry and Versioning 📖

Like OpenAI Gym, Jumanji keeps a strict versioning of its environments for reproducibility reasons. We maintain a registry of standard environments with their configuration. For each environment, a version suffix is appended, e.g. Snake-v1. When changes are made to environments that might impact learning results, the version number is incremented by one to prevent potential confusion. For a full list of registered versions of each environment, check out the documentation.

Training 🏎️

To showcase how to train RL agents on Jumanji environments, we provide a random agent and a vanilla actor-critic (A2C) agent. These agents can be found in jumanji/training/.

Because the environment framework in Jumanji is so flexible, it allows pretty much any problem to be implemented as a Jumanji environment, giving rise to very diverse observations. For this reason, environment-specific networks are required to capture the symmetries of each environment. Alongside the A2C agent implementation, we provide examples of such environment-specific actor-critic networks in jumanji/training/networks.

⚠️ The example agents in jumanji/training are only meant to serve as inspiration for how one can implement an agent. Jumanji is first and foremost a library of environments - as such, the agents and networks will not be maintained to a production standard.

For more information on how to use the example agents, see the training guide.

Contributing 🤝

Contributions are welcome! See our issue tracker for good first issues. Please read our contributing guidelines for details on how to submit pull requests, our Contributor License Agreement, and community guidelines.

Citing Jumanji ✏️

If you use Jumanji in your work, please cite the library using:

@misc{bonnet2024jumanji,
    title={Jumanji: a Diverse Suite of Scalable Reinforcement Learning Environments in JAX},
    author={Clément Bonnet and Daniel Luo and Donal Byrne and Shikha Surana and Sasha Abramowitz and Paul Duckworth and Vincent Coyette and Laurence I. Midgley and Elshadai Tegegn and Tristan Kalloniatis and Omayma Mahjoub and Matthew Macfarlane and Andries P. Smit and Nathan Grinsztajn and Raphael Boige and Cemlyn N. Waters and Mohamed A. Mimouni and Ulrich A. Mbou Sob and Ruan de Kock and Siddarth Singh and Daniel Furelos-Blanco and Victor Le and Arnu Pretorius and Alexandre Laterre},
    year={2024},
    eprint={2306.09884},
    url={https://arxiv.org/abs/2306.09884},
    archivePrefix={arXiv},
    primaryClass={cs.LG}
}

Acknowledgements 🙏

The development of this library was supported with Cloud TPUs from Google's TPU Research Cloud (TRC) 🌤.

jumanji's Issues

Make `wrappers.jumanji_specs_to_dm_env_specs` compatible with PyTrees

Is your feature request related to a problem? Please describe

In jumanji.wrappers the conversion of jumanji.specs to dm_env.specs does not accept general PyTree Nodes. As mentioned in: google-deepmind/dm_env#10, this should simply be compatible.

Describe the solution you'd like

Replace the else statement within jumanji.wrappers.jumanji_specs_to_dm_env_specs on line 465 from this:

def jumanji_specs_to_dm_env_specs(
    spec: Spec,
) -> Union[dm_env.specs.DiscreteArray, dm_env.specs.BoundedArray, dm_env.specs.Array]:
    if isinstance(spec, DiscreteArray):
        ...
    elif ...:
        ...
    else:
        raise ValueError(
            f"spec {spec} of type {type(spec)} is not available in a deepmind environment. "
            "Please override the observation_spec or action_spec method to output spec of type "
            "`dm_env.specs.Array`."
        )

to something like this:

def jumanji_specs_to_dm_env_specs(
    spec: Spec,
) -> Union[dm_env.specs.DiscreteArray, dm_env.specs.BoundedArray, dm_env.specs.Array]:
    if isinstance(spec, DiscreteArray):
        ...
    elif ...:
        ...
    else:
        try:
            # Recursively call this function for nested Specs
            return jax.tree_util.tree_map(jumanji_specs_to_dm_env_specs, spec)
        except ...:
            raise ValueError(...)

docs: update copyright year to 2023 instead of 2022

Change copyright year to 2023 instead of 2022.

bug: import jumanji crash from a console that is not a notebook

Description

If I run import jumanji from a console (a PyCharm console in my case), I get the following error from this line:

IPython.core.error.UsageError: Invalid GUI request 'notebook', valid ones are:dict_keys(['none', 'osx', 'tk', 'gtk', 'wx', 'qt', 'qt4', 'qt5', 'glut', 'pyglet', 'gtk3'])

I think it comes from the fact that we check the environment in order to set up the matching matplotlib backend. However, the current version seems to assume a Jupyter notebook when it is actually a Python console, thus breaking at import time.

The solution would be to set up the backend optionally, i.e. if an error is encountered, then a default backend is set. Or to figure out how to properly differentiate jupyter notebooks from something else. In any case, we should not break at import time because of rendering!
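The "set up the backend optionally" idea could look roughly like the sketch below. It uses only the standard library: `select_backend`, `activate` and `fake_activate` are hypothetical names introduced for illustration, and `activate` stands in for whatever call actually switches the rendering backend.

```python
from typing import Callable, Iterable, Optional


def select_backend(
    candidates: Iterable[str],
    activate: Callable[[str], None],
    default: Optional[str] = None,
) -> Optional[str]:
    """Return the first backend that `activate` accepts, else `default`.

    `activate` is a hypothetical hook that raises when a backend is unusable.
    """
    for name in candidates:
        try:
            activate(name)
        except Exception:
            # Rendering setup must never break `import`: try the next candidate.
            continue
        return name
    return default


def fake_activate(name: str) -> None:
    # Stand-in for a real activation call; it rejects the notebook GUI request.
    if name == "notebook":
        raise ValueError(f"Invalid GUI request {name!r}")


chosen = select_backend(["notebook", "tk"], fake_activate, default="agg")
```

With this shape, a failed backend request degrades to the next candidate or to the default instead of raising at import time.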

What Jumanji version are you using?

v0.1.1

Additional System Info

Linux

refactor(cleaner): return state from the generator

Is your feature request related to a problem? Please describe

The instance generator for the cleaner environment currently returns an array containing the initial grid. It should instead return the full environment state to be consistent with other environments.

Describe the solution you'd like

Return an instance of cleaner.State in the instance generator.

docs: add example notebook

Is your feature request related to a problem? Please describe

People are more likely to use Jumanji if they can quickly try it out without a lot of admin. We should have an example notebook (we previously had the anakin snake notebook but it was removed because it was outdated).

docs: make contributing link and menu bar shortcut work on the doc

Is your feature request related to a problem? Please describe

On the documentation website, there are hyperlinks/shortcuts that do not work properly, although they are working on the readme on GitHub.

Describe the solution you'd like

Clicking the shortcuts on the menu bar (Installation | Quickstart | ... | Reference Docs) should redirect the user to the corresponding locations on the page. There is a problem with the emojis that are sometimes part of the link (on the GitHub's readme I think) and sometimes not (on the website I believe). Also, the "contributing guidelines" link does not work on the website because there is no page hosted for it.

Additional context

It seems that one has to reconcile the way GitHub renders the readme and the way mkdocs builds the doc when it comes to hyperlinks.

docs: state that we use the google style guide

Is your feature request related to a problem? Please describe

Make it clear to the community what style guide they should be abiding by which could reduce the number of review iterations required to get a pull request in.

Describe the solution you'd like

Add a reference to the google style guide in the developer documentation or readme.

ci: add continuous integration pipeline for Python 3.7

Jumanji supports Python 3.7 so we should add a ci pipeline in GitHub Actions for running linters, tests, etc. for Python 3.7.

feat(registration): default to latest version

Is your feature request related to a problem? Please describe

jumanji.make("TSP") should default to the latest version of TSP, e.g. "TSP-v1". This way, a user who sees the codebase and agrees with the current version of TSP can use TSP without having to look up which version it is at.

docs: show test coverage in badge

Is your feature request related to a problem? Please describe

We should display the test coverage in a badge on the README. This is important to show since a well-tested API is one of our goals.

feat: deprecate Connect4

Is your feature request related to a problem? Please describe

Connect4 will be removed in a future release (v0.2).

Describe the solution you'd like

Raise a deprecation warning when using Connect4 as it will be removed soon.
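A minimal sketch of such a warning, using the standard `warnings` module. The `Connect4` class here is a stand-in written for this example, not the real environment's constructor.

```python
import warnings


class Connect4:
    """Stand-in class; the real environment's constructor is not shown here."""

    def __init__(self) -> None:
        warnings.warn(
            "Connect4 is deprecated and will be removed in a future release (v0.2).",
            DeprecationWarning,
            stacklevel=2,  # point the warning at the caller, not at this line
        )


with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")  # make sure the warning is not filtered out
    Connect4()
```

`stacklevel=2` is a design choice: it attributes the warning to the user code that instantiates the environment, which is where the change has to be made.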

Remark

Bug in the reset function: action_mask = jnp.ones((BOARD_WIDTH,), dtype=jnp.int8) should be changed to action_mask = jnp.ones((BOARD_WIDTH,), dtype=bool).

refactor: remove environment factory

Is your feature request related to a problem? Please describe

The environment factory ENV_FACTORY in setup_train.py is no longer used and can be removed.

refactor: use named tuples instead of dataclasses for timestep and state

Is your feature request related to a problem? Please describe

Since the dataclasses we use are mutable, some side effects may occur when working with environment State and TimeStep.

Describe the solution you'd like

Use NamedTuple instead.

Describe alternatives you've considered

We could also freeze the dataclasses to be immutable but the NamedTuple option is preferred.
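The side effect in question can be reproduced with plain Python. The sketch below uses hypothetical `MutableState` and `ImmutableState` types with a single field, not the actual Jumanji State or TimeStep.

```python
import dataclasses
from typing import NamedTuple


@dataclasses.dataclass
class MutableState:
    step_count: int


class ImmutableState(NamedTuple):
    step_count: int


mutable = MutableState(step_count=0)
alias = mutable
alias.step_count = 5  # silently changes `mutable` too

immutable = ImmutableState(step_count=0)
updated = immutable._replace(step_count=5)  # returns a new object instead

try:
    immutable.step_count = 5  # type: ignore[misc]
    mutation_blocked = False
except AttributeError:
    mutation_blocked = True
```

A frozen dataclass (`@dataclasses.dataclass(frozen=True)` with `dataclasses.replace`) would block the same mutation; the difference is mostly in ergonomics.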

docs: add environment speeds to readme

Describe the solution you'd like

Add to the README environment table the environment speed (in steps/s).

feat(jobshop): create instance generator with known optimal solution

Is your feature request related to a problem? Please describe

Create a JobShop instance generator whose optimal makespan is known. This will be useful to benchmark agents better since the best solution will be known in advance.

Describe the solution you'd like

The generator will generate a random schedule based on a specified makespan (length of the schedule), number of machines, number of jobs, and max number of operations per job.

bug: protocol not supported in Python 3.7

Description

Jumanji supports Python 3.7; however, Protocol, which is imported from typing, is not available in Python 3.7.

What Jumanji version are you using?

v0.1.3

(Optional) Suggestion

Import Protocol from typing_extensions instead of typing.
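One common way to write this is a guarded import, sketched below. The `Renderable` protocol and `TextViewer` class are hypothetical and only demonstrate that the imported name behaves the same either way.

```python
try:
    from typing import Protocol  # available from Python 3.8
except ImportError:  # Python 3.7
    from typing_extensions import Protocol


class Renderable(Protocol):
    """Hypothetical protocol, used only to show that the import works."""

    def render(self) -> str:
        ...


class TextViewer:
    def render(self) -> str:
        return "frame"


def show(viewer: Renderable) -> str:
    # TextViewer satisfies Renderable structurally, without inheriting from it.
    return viewer.render()


output = show(TextViewer())
```

Importing unconditionally from `typing_extensions` also works, at the cost of an extra dependency on newer Python versions.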

feat(rubiks): configurable generator and viewer

feat: expose state and key as attributes in DMEnvWrapper

Is your feature request related to a problem? Please describe

When working with a dm_env.Environment version of a Jumanji environment using the JumanjiToDMEnvWrapper, one may need to get/set the state and key of the environment e.g. to allow planning and "restart" the environment to its previous state.

Describe the solution you'd like

Right now, this is possible by accessing wrapped_env._state and wrapped_env._key, which should not be allowed since key and state are private attributes.
A solution would be to properly expose them as properties and to implement setters for these properties (in the common style).

Checklist:

Expose state and key of JumanjiToDMEnvWrapper as properties
Implement setters for these properties
Test that one can instantiate a Jumanji environment (e.g. jumanji.make("Snake-6x6-v0")), wrap it with JumanjiToDMEnvWrapper and then get and set the corresponding state and key attributes without leading underscores
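The checklist above can be sketched with standard Python properties. `WrapperSketch` is a hypothetical stand-in that only holds the two private attributes; it is not the real JumanjiToDMEnvWrapper.

```python
from typing import Any


class WrapperSketch:
    """Hypothetical stand-in for a wrapper holding private `_state` and `_key`."""

    def __init__(self, state: Any, key: Any) -> None:
        self._state = state
        self._key = key

    @property
    def state(self) -> Any:
        return self._state

    @state.setter
    def state(self, value: Any) -> None:
        self._state = value

    @property
    def key(self) -> Any:
        return self._key

    @key.setter
    def key(self, value: Any) -> None:
        self._key = value


wrapped_env = WrapperSketch(state={"step": 3}, key=0)
saved_state, saved_key = wrapped_env.state, wrapped_env.key  # snapshot for planning
wrapped_env.state, wrapped_env.key = {"step": 4}, 1          # environment moves on
wrapped_env.state, wrapped_env.key = saved_state, saved_key  # "restart" from the snapshot
```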

ci: support python 3.10

Is your feature request related to a problem? Please describe

We would like Jumanji to support Python 3.10.

Checklist

Add 3.10 checks in the GitHub pipeline.
Update the documentation to reflect this extended support.

doc: fix mypy badge in online documentation

Is your feature request related to a problem? Please describe

The mypy badge appears broken on the online documentation (build hosted by GitHub Pages).

feat: add a py.typed file so that mypy will know to use the type annotations in the published package

We should add a py.typed file so that the mypy type checker knows to use the type hints provided by the published jumanji package. See this for more information.

docs: hyperlink to registered environments in main README

Add a hyperlink to the autogenerated docs in the section describing the environment registry and versioning.

feat: use future annotations for same class type

Is your feature request related to a problem? Please describe

This issue is about cleaning the type hinting in the repository. Currently, when a method (inside a given class) returns an object whose type is the class itself (e.g. in Environment), the return type is given with quotes:

class Environment:

    ...

    def unwrapped(self) -> "Environment":
        ...

According to this thread, this is needed for Python<3.7, but can be removed with future annotations from Python 3.7+.

Describe the solution you'd like

Since Jumanji supports Python>=3.8, we could get rid of these quotes throughout the repository and use future annotations instead.

Fix Typo in environments/games/snake/env.py

Fix typo in line 50, that says snale instead of snake.

refactor: create env viewer interface

Is your feature request related to a problem? Please describe

Create an abstract base class Viewer which all environment viewers inherit from. This will help standardise how rendering is done across all environments.
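A rough sketch of what such an interface might look like. The method names `render` and `close`, and the `GridTextViewer` subclass, are assumptions made for this example rather than the interface that was eventually adopted.

```python
import abc
from typing import Any, List


class Viewer(abc.ABC):
    """Hypothetical base class that every environment viewer would inherit from."""

    @abc.abstractmethod
    def render(self, state: Any) -> Any:
        """Produce a visual representation of `state`."""

    @abc.abstractmethod
    def close(self) -> None:
        """Release any rendering resources."""


class GridTextViewer(Viewer):
    """Toy viewer that renders a grid of characters as text."""

    def render(self, state: List[List[str]]) -> str:
        return "\n".join("".join(row) for row in state)

    def close(self) -> None:
        pass  # nothing to release for a text viewer


frame = GridTextViewer().render([["#", "."], [".", "#"]])
```

Because the methods are abstract, a viewer that forgets to implement one of them fails at instantiation rather than at render time.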

fix: type checking of dataclasses

Is your feature request related to a problem? Please describe

At multiple places in the code, a workaround is used to check typing of chex dataclasses. This involves doing a conditional import using if TYPE_CHECKING - we would like to avoid this if possible. It is related to this issue.

Describe the solution you'd like

We would like a solution which avoids doing the conditional import.

Alternatives considered

Use a built-in dataclass instead of a chex dataclass. Example implementation where mypy doesn't complain:

@dataclasses.dataclass(init=False)
class TimeStep(Generic[Observation]):
    step_type: StepType
    reward: Array
    discount: Array
    observation: Observation
    extras: Optional[Dict]

    def __init__(self, step_type: StepType, reward: Array, discount: Array, observation: Observation,
                 extras: Optional[Dict] = None):
        self.step_type = step_type
        self.reward = reward
        self.discount = discount
        self.observation = observation
        self.extras = extras

    def first(self) -> Array:
        return self.step_type == StepType.FIRST

    def mid(self) -> Array:
        return self.step_type == StepType.MID

    def last(self) -> Array:
        return self.step_type == StepType.LAST

fix: step type in autoreset wrapper

Is your feature request related to a problem? Please describe

When using AutoResetWrapper, the sequence of timestep.step_type returned during a rollout shows 0 values where the env has been auto reset instead of 2 values. Recall that

0 corresponds to the first step
1 corresponds to a mid step
2 corresponds to the last step

Describe the solution you'd like

Move the auto reset bug: [1, 1, 1, 2, 0, 1, 1, 1] has to be converted to [1, 1, 1, X, 1, 1, 1] with X being 0 or 2. Right now it is 0 but it should be 2 to warn the user that the episode got terminated. This would be important e.g. if using while not timestep.last().

timestep = timestep.replace(  # type: ignore
    observation=reset_timestep.observation
)
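The intended behaviour can be checked on a toy example in plain Python. Everything in this sketch (`TimeStepSketch`, `CountdownEnv`, `auto_reset_step`) is hypothetical and only mirrors the idea of keeping the last step type while swapping in the reset observation.

```python
import dataclasses
from typing import List, Tuple

FIRST, MID, LAST = 0, 1, 2


@dataclasses.dataclass(frozen=True)
class TimeStepSketch:
    step_type: int
    observation: int


class CountdownEnv:
    """Toy environment: an episode lasts `length` steps."""

    def __init__(self, length: int) -> None:
        self.length = length

    def reset(self) -> Tuple[int, TimeStepSketch]:
        return 0, TimeStepSketch(FIRST, 0)

    def step(self, state: int) -> Tuple[int, TimeStepSketch]:
        state += 1
        step_type = LAST if state == self.length else MID
        return state, TimeStepSketch(step_type, state)


def auto_reset_step(env: CountdownEnv, state: int) -> Tuple[int, TimeStepSketch]:
    state, timestep = env.step(state)
    if timestep.step_type == LAST:
        state, reset_timestep = env.reset()
        # Keep step_type == LAST so the caller sees the termination;
        # only the observation comes from the freshly reset episode.
        timestep = dataclasses.replace(
            timestep, observation=reset_timestep.observation
        )
    return state, timestep


env = CountdownEnv(length=4)
state, _ = env.reset()
step_types: List[int] = []
for _ in range(7):
    state, timestep = auto_reset_step(env, state)
    step_types.append(timestep.step_type)
```

Here `step_types` ends up as `[1, 1, 1, 2, 1, 1, 1]`, which is the sequence with X = 2 described above.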

feat: improve environment reset speed by changing jax.random.choice

Is your feature request related to a problem? Please describe

jax.random.choice seems to be slow especially when sampling without replacement. Sampling with replacement seems to be much faster, but even without replacement is probably slower than jax.random.randint (to be verified).

Describe the solution you'd like

Find a way to use jax.random.choice(replace=False) as little as possible to improve environment speed.

Alternatives considered

As a first study towards this, it turns out that jax.random.choice(..., replace=True) is faster than jax.random.categorical for sampling with replacement. jax.random.choice(..., replace=False) appears much slower than the other two. When sampling without replacement is needed, we still have to study what the best approach is.
Source: notebook

Alternatives to jax.random.choice(..., replace=False) that could be considered and assessed include:

Sampling once from the joint distribution where the joint gathers all the valid pairs
Creating two random partitions p1 and p2, sampling one index i only, and get the two samples by p1[i] and p2[i]
Sequentially sampling one index and then a second one from the conditional distribution given the first one is not available

It may be that jax.random.choice(..., replace=False) ends up being the most optimised version. In any case, the solution may depend on how many samples we need to sample without replacement (e.g. 2 in the case of Snake).
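For the case of exactly two samples, the sequential alternative listed above can be sketched as follows. This is a plain-Python illustration using the standard `random` module rather than `jax.random`, and `sample_two_distinct` is a hypothetical helper; whether the equivalent JAX code is actually faster than `jax.random.choice(..., replace=False)` would still need to be measured.

```python
import random
from typing import Tuple


def sample_two_distinct(rng: random.Random, n: int) -> Tuple[int, int]:
    """Sample an ordered pair of distinct indices from range(n), uniformly."""
    if n < 2:
        raise ValueError("need at least two items to sample without replacement")
    first = rng.randrange(n)
    # Draw from the n - 1 remaining slots, then skip over `first`.
    second = rng.randrange(n - 1)
    if second >= first:
        second += 1
    return first, second


rng = random.Random(0)
pairs = [sample_two_distinct(rng, 5) for _ in range(1000)]
```

The shift by one maps the n - 1 possible draws onto the indices other than `first`, so only two bounded integer draws are needed and no permutation of the whole range is built.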

Remarks

We need to take this into account for random policies. It is likely that the random action selection influences the environment speed by a lot, hence biasing speed benchmarks.

docs: state that we follow the conventional commits specification

We should state explicitly that we follow the conventional commits specification in the CONTRIBUTING.md.

bug(ci): ci is broken because of default python version in pre-commit

Description

The CI is broken due to the default python version

fix: vmap-ed brax environment with BraxToJumanjiWrapper

Is your feature request related to a problem? Please describe

Currently, the BraxToJumanjiWrapper cannot accept a Brax environment that has been wrapped by Brax's VmapWrapper because jax.lax.cond does not broadcast in the case of a vmap-ed environment. A current workaround is to skip Brax's VmapWrapper, convert the environment to Jumanji with BraxToJumanjiWrapper, and then apply Jumanji's VmapWrapper, which produces the desired outcome.
To summarize,

from jumanji import wrappers as jumanji_wrappers
from brax.envs import create, wrappers as brax_wrappers

brax_env = create("ant")

# This does not work
jumanji_env = jumanji_wrappers.BraxToJumanjiWrapper(brax_wrappers.VmapWrapper(brax_env))

# This works
jumanji_env = jumanji_wrappers.VmapWrapper(jumanji_wrappers.BraxToJumanjiWrapper(brax_env))

The goal of this issue is to make the first solution work as well for consistency.

Describe the solution you'd like

The jax.lax.cond line in the step function of BraxToJumanjiWrapper should be changed to handle the case where
state.done is a vector and not a scalar (i.e., when the Brax environment originates from a VmapWrapper).

Issue reproduction

import jax
from brax.envs import create
from brax.envs import wrappers as brax_wrappers
from jumanji.wrappers import BraxToJumanjiWrapper

brax_env = create("ant")
# Wrapping the vmap-ed Brax environment is the combination that currently fails
jumanji_env = BraxToJumanjiWrapper(brax_wrappers.VmapWrapper(brax_env))
state, timestep = jax.jit(jumanji_env.reset)(jax.random.split(jax.random.PRNGKey(0), num=1))
action = jumanji_env.action_spec().generate_value()[None, ...]
state, timestep = jax.jit(jumanji_env.step)(state, action)

Checklist

jumanji_env = jumanji_wrappers.BraxToJumanjiWrapper(brax_wrappers.VmapWrapper(brax_env)). Calling jumanji_env.reset and jumanji_env.step should work without raising exceptions.
Write a unit test to check for this feature

refactor: improve consistency of extras

Is your feature request related to a problem? Please describe

The extras field of TimeStep can contain environment information useful for decision-making (e.g. Connect4's current player ID) or environment metrics (e.g. BinPack's volume utilisation). There is an inconsistency in what the extras field is used for as it is sometimes meant to be used by the algorithm and sometimes just logged as a metric.

Describe the solution you'd like

We should move any algorithm-related information from extras to the environment observation (e.g. Connect4's observation could have another field called current_player or something). We should update the documentation/docstrings accordingly to explicitly mention that TimeStep.extras does not contain stuff that is meant to be observed by the agent as those should be in the observation.

TODOs

adapt docstrings, doc, codes, etc to make explicit the fact that TimeStep.extras does not contain any info meant to be observed
move agent-specific extras (e.g. Connect4's current player ID) to environment observations

refactor(training): remove param_size from parametric distributions

Is your feature request related to a problem? Please describe

param_size is not used so it should be removed. This might make the network builder functions lighter as some may not need the env specs anymore (e.g. for TSP).

bug: flake8 fails in ci pipeline

Description

The CI pipeline fails at the linting step, specifically flake8 fails with the following error:

An unexpected error has occurred: CalledProcessError: command: ('/usr/bin/git', 'fetch', 'origin', '--tags')
return code: 128
expected return code: 0
stdout: (none)
stderr:
    fatal: could not read Username for 'https://gitlab.com/': No such device or address
    
Check the log at /home/runner/.cache/pre-commit/pre-commit.log

What Jumanji version are you using?

v0.1.1

Which accelerator(s) are you using?

N/A

Additional System Info

N/A

(Optional) Suggestion

This error occurs because flake8 took down their GitLab repository in favour of their GitHub repository.
However, by default, pre-commit uses the GitLab link. Thus, we need to replace https://gitlab.com/PyCQA/flake8 with https://github.com/PyCQA/flake8 in the .pre-commit-config.yaml. For more information, see this

feat(registry): retrieve the latest version of an environment

Is your feature request related to a problem? Please describe

As of now, when calling make with an environment name omitting the version, the current behaviour is to fetch version v0. This will become a problem when we start having more than one version per environment. Getting v0 doesn't make sense.

Describe the solution you'd like

I suggest we throw an error if the version number is missing. This would simplify the code and force users to be explicit about the version they want. It is also better for reproducibility. It is slightly less user-friendly because they would need to look up the listing of the registered environments before calling make.

Describe alternatives you've considered

An alternative solution is to change the described behaviour to fetch the latest version of an environment if the version number is omitted.
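Both options can be compared on a small sketch. The helpers `parse_env_id`, `resolve_strict` and `resolve_latest` are hypothetical and not part of the Jumanji registry, and the `registered` set is example data (a `TSP-v0` entry is assumed so that there are two versions to choose from).

```python
import re
from typing import Collection, Optional, Tuple

_VERSION_RE = re.compile(r"^(?P<name>.+)-v(?P<version>\d+)$")


def parse_env_id(env_id: str) -> Tuple[str, Optional[int]]:
    """Split 'Name-vN' into ('Name', N); the version is None when omitted."""
    match = _VERSION_RE.match(env_id)
    if match is None:
        return env_id, None
    return match.group("name"), int(match.group("version"))


def resolve_strict(env_id: str, registered: Collection[str]) -> str:
    """Proposed behaviour: refuse IDs that omit the version."""
    _, version = parse_env_id(env_id)
    if version is None:
        raise ValueError(f"{env_id!r} has no version suffix, e.g. '{env_id}-v0'")
    if env_id not in registered:
        raise KeyError(env_id)
    return env_id


def resolve_latest(env_id: str, registered: Collection[str]) -> str:
    """Alternative behaviour: fall back to the highest registered version."""
    name, version = parse_env_id(env_id)
    if version is not None:
        return resolve_strict(env_id, registered)
    parsed = (parse_env_id(candidate) for candidate in registered)
    versions = [v for n, v in parsed if n == name and v is not None]
    if not versions:
        raise KeyError(env_id)
    return f"{name}-v{max(versions)}"


# Example data only; "TSP-v0" is assumed here for illustration.
registered = {"TSP-v0", "TSP-v1", "Snake-v1", "RubiksCube-partly-scrambled-v0"}
latest_tsp = resolve_latest("TSP", registered)
```

The strict variant keeps experiments reproducible because the version is always spelled out; the fallback variant is friendlier but can silently change behaviour when a new version is registered.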

fix: link to contributing markdown is broken in the documentation

Description

If you go to: https://instadeepai.github.io/jumanji/#contributing

And click on contributing guidelines the server will return a 404.

fix: colab link

Point colab link to notebook in main branch

feat(specs): make environment specs properties instead of methods

Is your feature request related to a problem? Please describe

env.observation_spec is a method and creates a new spec each time it is called. If we make it a property, we can JIT a function that gets these specs.

Describe the solution you'd like

def __init__(self):
    # Build the spec once at construction time and store it as an attribute
    self.observation_spec = self._make_observation_spec()

def _make_observation_spec(self):
    return specs.Spec(...)
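A variant of the same idea is to build the spec lazily and cache it, sketched here with `functools.cached_property` (Python 3.8+). `EnvSketch` is hypothetical, and a plain dictionary stands in for the real spec object.

```python
import functools


class EnvSketch:
    """Hypothetical environment; a dict stands in for a real spec object."""

    def __init__(self) -> None:
        self.spec_builds = 0  # counts how often the spec is constructed

    @functools.cached_property
    def observation_spec(self) -> dict:
        return self._make_observation_spec()

    def _make_observation_spec(self) -> dict:
        self.spec_builds += 1
        return {"shape": (6, 6), "dtype": "float32"}


env = EnvSketch()
first = env.observation_spec   # built on first access
second = env.observation_spec  # served from the cache
```

Compared with assigning in `__init__`, the cached property does not require subclasses to call the parent constructor before the spec is available.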

style: move root files to their own folder

Is your feature request related to a problem? Please describe

The root is quite messy with lots of files (e.g. commitlint.config.js, mkdocs.yml, license_header.txt).

Describe the solution you'd like

We could move a lot of these files to separate folders and keep only necessary files in the root directory (e.g. README, and setup.py).

docs: mention request for Pgx

Sorry for the comment from out of the blue. Jumanji's sophisticated API is great, and its application to problems like TSP is really interesting.

Today, we released Pgx, a collection of JAX-based RL environments dedicated to classic board games like Go. We have implemented over 15 environments, including Backgammon, Shogi, and Go, and confirmed that they are considerably faster than existing C++/Python implementations. We also plan to implement Chess and Contract Bridge in the coming weeks.

We believe Jumanji and Pgx can complement each other as both implement JAX-based RL environments but focus on different domains. We would be happy if you could kindly mention Pgx in the README like other JAX-based RL environments if you like it. For example,

🎲 Pgx provides classic board game environments like Backgammon, Shogi, and Go.

🎲 [Pgx](https://github.com/sotetsuk/pgx) provides classic board game environments like Backgammon, Shogi, and Go.

Thanks!

feat(specs): implement sample method

Is your feature request related to a problem? Please describe

Jumanji specs have a generate_value method which essentially returns zeros of the correct pytree/shape. It would be nice to be able to sample values (with an optional mask) from the action spec. This would give us random policies for free for all environments.

Describe the solution you'd like

Add a sample method to the specs similar to Gym.

Remarks

We need to decide whether it is redundant to have both sample and generate_value.
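
As a rough sketch of what such a method could look like for a discrete action spec, assuming a boolean mask where True marks a valid action. `DiscreteSpec` here is a hypothetical class written for illustration, not Jumanji's spec implementation; the sampling itself uses `jax.random.categorical` with masked-out logits set to negative infinity.

```python
from typing import Optional

import jax
import jax.numpy as jnp


class DiscreteSpec:
    """Hypothetical discrete spec; not Jumanji's actual class."""

    def __init__(self, num_values: int) -> None:
        self.num_values = num_values

    def generate_value(self) -> jnp.ndarray:
        # Mirrors the behaviour described above: a zero of the right shape.
        return jnp.zeros((), dtype=jnp.int32)

    def sample(
        self, key: jnp.ndarray, mask: Optional[jnp.ndarray] = None
    ) -> jnp.ndarray:
        # Uniform logits; masked-out values get -inf so they are never drawn.
        logits = jnp.zeros((self.num_values,))
        if mask is not None:
            logits = jnp.where(mask, logits, -jnp.inf)
        return jax.random.categorical(key, logits).astype(jnp.int32)


spec = DiscreteSpec(num_values=4)
key = jax.random.PRNGKey(0)
unmasked_action = spec.sample(key)
masked_action = spec.sample(key, mask=jnp.array([False, False, True, False]))
```

With only one valid entry in the mask, the sampled action is forced to that entry, which is the property a random policy over legal actions needs.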

feat(binpack): improve speed of the environment

Is your feature request related to a problem? Please describe

There is a potential for speedup in the BinPack environment's step method. If the computation is quite sparse, using jax.lax.map instead of jax.vmap may speed the environment up when lots of EMSs are not alive.

Describe the solution you'd like

POC to be done with the timer.
Faster version of BinPack's step method.
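
To illustrate why the two might differ, here is a small sketch with a hypothetical per-EMS update (`update_ems` is invented for this example and is not BinPack's code). Under `jax.vmap`, a `jax.lax.cond` with a batched predicate is lowered to a select, so both branches run for every element; under `jax.lax.map`, elements are processed sequentially and the branch is actually taken or skipped. Whether this is faster in practice is exactly what the POC would need to measure.

```python
import jax
import jax.numpy as jnp


def update_ems(ems, alive):
    """Hypothetical per-EMS update: do work only if the EMS is alive."""
    return jax.lax.cond(
        alive,
        lambda x: jnp.sin(x) * 2.0,  # stand-in for the expensive branch
        lambda x: x,  # dead EMSs pass through unchanged
        ems,
    )


ems = jnp.arange(6, dtype=jnp.float32)
alive = jnp.array([True, False, False, True, False, False])

# vmap batches the predicate, so cond is lowered to a select and both
# branches are evaluated for every element.
vmapped = jax.vmap(update_ems)(ems, alive)

# lax.map runs the function sequentially (as a scan), so cond stays a real
# branch and the expensive path is skipped for dead EMSs.
mapped = jax.lax.map(lambda args: update_ems(*args), (ems, alive))
```

Both calls produce the same values; the trade-off is parallel work on every element versus sequential work on only the live ones.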

bug: make pull requests trigger workflow

Description

Previously the workflow was not triggered.

What Jumanji version are you using?

No response

Which accelerator(s) are you using?

No response

Additional System Info

No response

Additional Context

No response

(Optional) Suggestion

No response

docs: update the contributing document with jumanji api 0.1.x

Is your feature request related to a problem? Please describe

The current CONTRIBUTING.md is outdated and refers to the old Jumanji API (from before v0.1).

Describe the solution you'd like

Update the document accordingly.

bug: protocol not supported in Python 3.7

Description

Jumanji supports Python 3.7; however, Protocol was only added to the typing module in Python 3.8, so importing it from typing fails on Python 3.7.

What Jumanji version are you using?

v0.1.3

Which accelerator(s) are you using?

No response

Additional System Info

No response

Additional Context

No response

(Optional) Suggestion

Import Protocol from typing_extensions instead of typing.
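
One common way to write this, sketched here as a version-gated import so that newer interpreters keep using the standard library. It assumes the `typing_extensions` package is installed on Python 3.7; the `HasName` protocol is a hypothetical example only.

```python
import sys

if sys.version_info >= (3, 8):
    from typing import Protocol
else:
    # Requires the `typing_extensions` package on Python 3.7.
    from typing_extensions import Protocol


class HasName(Protocol):
    """Hypothetical protocol, used only to show the import works."""

    name: str
```

Importing unconditionally from `typing_extensions` would also work, at the cost of making it a dependency on every Python version.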

bug: images are not displayed on PyPI

Description

The Jumanji logo, badges and environment GIF do not appear on the PyPI description page. This is probably due to not exporting the images when uploading to PyPI.

What Jumanji version are you using?

v0.1.1

Which accelerator(s) are you using?

No response

Additional System Info

No response

Additional Context

No response

(Optional) Suggestion

No response

[Doc] Table in Examples section renders incorrectly on https://instadeepai.github.io/jumanji/#examples

Description

The table in the Examples section of the README renders correctly on GitHub but incorrectly on the documentation site.

What Jumanji version are you using?

N/A

Which accelerator(s) are you using?

N/A

Additional System Info

N/A

Additional Context

No response

(Optional) Suggestion

Fix: Add a space between the table and the text.

bug: importing jumanji makes pytest tests segfault

Description

Good morning,

Importing jumanji makes my other tests segfault.

Even when no test calls or imports code that relies on Jumanji, adding a bare import jumanji makes the tests segfault. If I remove the import, the tests pass.

If I comment out the import of the binpack sub-module in jumanji/__init__.py (from jumanji.environments.combinatorial import binpack as _binpack) along with the associated env registrations, the tests pass.

I suspect the cause is something in jumanji/environments/__init__.py, which is loaded when importing binpack. Since there are many imports in that __init__.py, it is hard to find the culprit.

I can't disclose the dependencies I am using.

Thanks,

Cyprien

What Jumanji version are you using?

0.1.3

Which accelerator(s) are you using?

CPU/GPU

Additional System Info

Python 3.10.8 - WSL - Ubuntu 20.04 LTS

Additional Context

No response

(Optional) Suggestion

I have forked the repo and am working on making the registration of the binpack env not require any import from the env.

Instead of:

register(
    id="BinPack-rand20-v0",
    entry_point="jumanji.environments:BinPack",
    kwargs={
        "instance_generator": _binpack.instance_generator.RandomInstanceGenerator(
            max_num_items=20,
            max_num_ems=80,
        ),
        "obs_num_ems": 40,
    },
)

We could have:

register(
    id="BinPack-rand20-v0",
    entry_point="jumanji.environments:BinPack",
    kwargs={
        "instance_generator": {
            "type": "random",
            "max_num_items": 20,
            "max_num_ems": 80,
        },
        "obs_num_ems": 40,
    },
)

or,

register(
    id="BinPack-rand20-v0",
    entry_point="jumanji.environments:BinPack",
    kwargs={
        "instance_generator": "random",
        "max_num_items": 20,
        "max_num_ems": 80,
        "obs_num_ems": 40,
    },
)

feat: constrain all environment states to have a key attribute

Is your feature request related to a problem? Please describe

One assumption about environment states is that they have a key (jax random key) attribute to manage stochasticity in the environment step function. This key is then used in wrappers such as JumanjiToDMEnvWrapper.

Describe the solution you'd like

I am not sure which solution to go for. I have identified two possible ways: using protocols to make it explicit that states have a key attribute, or using an abstract State class that makes the key attribute mandatory.

What is the best way of forcing the environment's State to have a key attribute?
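
For reference, the protocol option could look roughly like the sketch below. The state classes are hypothetical, and `key` is typed as `Any` to keep the example dependency-free (it would be a JAX random key in practice). Note that `typing.Protocol` requires Python 3.8 or later.

```python
from dataclasses import dataclass
from typing import Any, Protocol, runtime_checkable


@runtime_checkable
class State(Protocol):
    """Structural type: anything exposing a `key` attribute qualifies."""

    key: Any  # would be a JAX PRNG key in practice


@dataclass
class SnakeState:
    """Hypothetical environment state that carries a key."""

    position: int
    key: Any


@dataclass
class KeylessState:
    """Hypothetical state that forgot the key."""

    position: int


def get_key(state: State) -> Any:
    # A wrapper can rely on `state.key`; a static type checker enforces it.
    return state.key


good = SnakeState(position=0, key=(0, 42))
bad = KeylessState(position=0)
```

The protocol keeps existing state classes unchanged, since no inheritance is needed, whereas an abstract base class would force every state to subclass it but makes the requirement visible in the class hierarchy.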
