XLand-100B: A Large-Scale Multi-Task Dataset for In-Context Reinforcement Learning

License: Apache License 2.0


XLand-100B: A Large-Scale Dataset for In-Context RL

Official code for the paper "XLand-100B: A Large-Scale Multi-Task Dataset for In-Context Reinforcement Learning", which presents two large datasets for in-context RL based on the XLand-MiniGrid environment: XLand-100B and a smaller version, XLand-Trivial-20B. Together, they contain about 3.5B episodes, 130B transitions and 40,000 unique tasks, which is more than in any other dataset currently available in RL. Furthermore, our datasets are unique in that they contain the complete training histories of the base algorithms, rather than just expert transitions or partial replay buffers. With these datasets, we aim to democratize research in the rapidly growing field of in-context RL and provide a solid foundation for further scaling.

As part of the code release, we provide the utilities used to collect the datasets as well as the code used for the experiments with AD and DPT methods. As these parts are not semantically related to each other, they are split into separate directories for ease of use. See the README in each directory for instructions.

Downloading the datasets

Both the XLand-100B and XLand-Trivial-20B datasets are hosted on a public S3 bucket and are freely available to everyone under the CC BY-SA 4.0 license.

We advise starting with the Trivial dataset for debugging, since it is smaller and faster to download. Both datasets have an identical structure. For additional details, we refer to the paper.

The datasets can be downloaded with the curl utility (or any similar tool, such as wget) as follows:

# XLand-Trivial-20B, approx 60GB size
curl -L -o xland-trivial-20b.hdf5 https://sc.link/A4rEW

# XLand-100B, approx 325GB size
curl -L -o xland-100b.hdf5 https://sc.link/MoCvZ

What's inside

The datasets are stored in HDF5 format. For each task, we provide 32 complete learning histories and all the metadata necessary for evaluation, such as the environment, benchmark and task IDs from XLand-MiniGrid (see the .attrs property of each history). Each learning history stores states, actions, rewards, dones and expert_actions sequentially, without splitting them into individual episodes, which makes it convenient to sample cross-episode sequences for training. The expert_actions are relabeled with the final policy and are needed for DPT-like methods (see the paper for details).
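Because each history is one flat sequence, episode boundaries have to be recovered from `dones` when per-episode statistics are needed. The following is a minimal sketch on toy arrays, not code from the repository. It assumes that `dones[t]` is true on the last transition of an episode; the helper names `episode_slices` and `episode_returns` are hypothetical.

```python
import numpy as np


def episode_slices(dones):
    """Return (start, end) index pairs for every completed episode in one flat history.

    Assumption: dones[t] is True on the final transition of an episode.
    A trailing, unfinished episode is dropped.
    """
    ends = np.flatnonzero(dones) + 1            # one past each terminal step
    starts = np.concatenate(([0], ends[:-1]))   # an episode starts where the previous one ended
    return list(zip(starts.tolist(), ends.tolist()))


def episode_returns(rewards, dones):
    """Sum rewards per completed episode (accumulated in float32 to avoid float16 rounding)."""
    return np.array(
        [rewards[s:e].astype(np.float32).sum() for s, e in episode_slices(dones)]
    )


# Toy history with the same dtypes as the dataset fields (values are made up).
dones = np.array([0, 0, 1, 0, 1, 0, 0, 0, 1, 0], dtype=bool)
rewards = np.array([0, 0, 1, 0, 0.5, 0, 0, 0, 0, 0], dtype=np.float16)

slices = episode_slices(dones)
returns = episode_returns(rewards, dones)
```

Plotting such per-episode returns against the episode index is one way to see the learning progress captured in a history.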

Name            Dtype       Shape (XLand-Trivial-20B)   Shape (XLand-100B)
states          np.uint8    (32, 60928, 5, 5)           (32, 121856, 5, 5)
actions         np.uint8    (32, 60928)                 (32, 121856)
rewards         np.float16  (32, 60928)                 (32, 121856)
dones           np.bool     (32, 60928)                 (32, 121856)
expert_actions  np.uint8    (32, 60928)                 (32, 121856)
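Given these shapes, a cross-episode training sequence is just a contiguous window along the second axis of one history. The sketch below illustrates this on small synthetic arrays; it is not the repository's dataloader (see the baselines for that). The function `sample_window` is hypothetical, and the idea of passing a per-task `h5py` group in place of the toy dictionary assumes that each task is stored as a group holding these five datasets.

```python
import numpy as np


def sample_window(history, seq_len, rng):
    """Sample one contiguous window of `seq_len` transitions from a random history.

    `history` is any mapping from field name to an array-like of shape
    (num_histories, num_transitions, ...). A dict of NumPy arrays works, and an
    h5py group should behave the same way (assumption about the file layout).
    """
    num_histories, total_steps = history["actions"].shape[:2]
    h = int(rng.integers(num_histories))
    start = int(rng.integers(total_steps - seq_len + 1))
    # The window may span several episodes, because histories are not split by episode.
    return {name: np.asarray(arr[h, start:start + seq_len]) for name, arr in history.items()}


# Small synthetic stand-in: 4 histories of 64 transitions instead of 32 x 60928.
rng = np.random.default_rng(0)
toy = {
    "states": rng.integers(0, 200, size=(4, 64, 5, 5), dtype=np.uint8),
    "actions": rng.integers(0, 6, size=(4, 64), dtype=np.uint8),
    "rewards": np.zeros((4, 64), dtype=np.float16),
    "dones": np.zeros((4, 64), dtype=bool),
    "expert_actions": rng.integers(0, 6, size=(4, 64), dtype=np.uint8),
}

batch = sample_window(toy, seq_len=16, rng=rng)
```

Reading only the requested slice matters for the real files: loading a whole task into memory before slicing would defeat the purpose of keeping the data in HDF5.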

NB! We have also compressed the observations to reduce the size of the dataset. The original observations from XLand-MiniGrid have the shape (5, 5, 2), not (5, 5)! After sampling, decompress them as follows (see also the dataloaders in the baselines):

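# see collection/training/utils.py
import numpy as np
from xminigrid.core.constants import NUM_COLORS

# obs: compressed states sampled from the dataset, shape (..., 5, 5)
# result: original observations, shape (..., 5, 5, 2)
obs = np.stack(np.divmod(obs, NUM_COLORS), axis=-1)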
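To see why `divmod` restores both channels, it helps to look at the packing that it inverts: the two channels are combined into a single integer as `first * NUM_COLORS + second`. The round trip below is a sketch of that idea on random data. The `compress` function is an assumption inferred from the decompression snippet rather than a copy of the collection code, and the value of `NUM_COLORS` used here is a placeholder; import the real constant from `xminigrid.core.constants` when working with the datasets.

```python
import numpy as np

NUM_COLORS = 14  # placeholder for illustration only, not taken from xminigrid


def compress(obs):
    """Pack (..., 5, 5, 2) observations into (..., 5, 5) (assumed inverse of decompress)."""
    packed = obs[..., 0].astype(np.uint16) * NUM_COLORS + obs[..., 1]
    return packed.astype(np.uint8)


def decompress(packed):
    """Unpack (..., 5, 5) states into (..., 5, 5, 2), as in the snippet above."""
    return np.stack(np.divmod(packed, NUM_COLORS), axis=-1)


# Random two-channel observations; the second channel must stay below NUM_COLORS.
rng = np.random.default_rng(0)
first = rng.integers(0, 13, size=(8, 5, 5), dtype=np.uint8)
second = rng.integers(0, NUM_COLORS, size=(8, 5, 5), dtype=np.uint8)
obs = np.stack([first, second], axis=-1)

packed = compress(obs)
restored = decompress(packed)
```

The round trip is lossless only while every packed value fits into `np.uint8` and the second channel is smaller than `NUM_COLORS`, which is why the toy ranges above are kept small.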

Dependencies

We provide specific dependencies for experiments and data collection in the appropriate directories. However, we have also prepared a Dockerfile for easier setup of the working environment. It should contain all the necessary dependencies.

docker build . -t xland-dataset
docker run -itd --rm --gpus all -v $(pwd):/workspace --name xland-dataset xland-dataset

Appreciation

This work was supported by Artificial Intelligence Research Institute (AIRI).

Citing

If you use this code or the datasets in your research, please consider citing the paper with the following BibTeX entry:

@article{nikulin2024xland,
  title={XLand-100B: A Large-Scale Multi-Task Dataset for In-Context Reinforcement Learning},
  author={Nikulin, Alexander and Zisman, Ilya and Zemtsov, Alexey and Sinii, Viacheslav and Kurenkov, Vladislav and Kolesnikov, Sergey},
  journal={arXiv preprint arXiv:2406.08973},
  year={2024}
}

Dataset Metadata

The following table is necessary for this dataset to be indexed by search engines such as Google Dataset Search.

Property     Value
name         XLand-100B
url
description  A large-scale dataset for in-context reinforcement learning based on the XLand-MiniGrid environment. It contains complete learning histories for nearly 30,000 different tasks, covering 100B transitions and 2.5B episodes.
provider     name: AIRI
license      name: CC BY-SA 4.0
