
salesforce / warp-drive

Extremely Fast End-to-End Deep Multi-Agent Reinforcement Learning Framework on a GPU (JMLR 2022)

License: BSD 3-Clause "New" or "Revised" License

reinforcement-learning gpu cuda multiagent-reinforcement-learning deep-learning high-throughput pytorch numba

warp-drive's Introduction

WarpDrive: Extremely Fast End-to-End Single or Multi-Agent Deep Reinforcement Learning on a GPU

WarpDrive is a flexible, lightweight, and easy-to-use open-source reinforcement learning (RL) framework that implements end-to-end multi-agent RL on a single GPU or multiple GPUs (Graphics Processing Units).

Using the extreme parallelization capability of GPUs, WarpDrive enables orders-of-magnitude faster RL compared to CPU simulation + GPU model implementations. It is extremely efficient as it avoids back-and-forth data copying between the CPU and the GPU, and runs simulations across multiple agents and multiple environment replicas in parallel. Together, these allow the user to run thousands or even millions of concurrent simulations and train on extremely large batches of experience, achieving at least 100x throughput over CPU-based counterparts.

The table below provides a visual overview of WarpDrive's key features and scalability across various dimensions.

| | Support | Concurrency | Version |
|:---|:---|:---|:---|
| Environments | Single ✅ Multi ✅ | 1 to 1000 per GPU | 1.0+ |
| Agents | Single ✅ Multi ✅ | 1 to 1024 per environment | 1.0+ |
| Agents | Multi across blocks ✅ | 1024 per block | 1.6+ |
| Discrete Actions | Single ✅ Multi ✅ | - | 1.0+ |
| Continuous Actions | Single ✅ Multi ✅ | - | 2.7+ |
| On-Policy Policy Gradient | A2C ✅ PPO ✅ | - | 1.0+ |
| Off-Policy Policy Gradient | DDPG ✅ | - | 2.7+ |
| Auto-Scaling | | - | 1.3+ |
| Distributed Simulation | 1 GPU ✅ 2-16 GPU node ✅ | - | 1.4+ |
| Environment Backend | CUDA C ✅ | - | 1.0+ |
| Environment Backend | CUDA C ✅ Numba ✅ | - | 2.0+ |
| Training Backend | PyTorch ✅ | - | 1.0+ |

Environments

  1. Game of "Tag": In the "Tag" games, taggers try to chase and tag the runners. These are fairly complex benchmark and test environments that exercise thread synchronization, shared memory, and high-dimensional indexing for thousands of interacting agents. Below, we show multi-agent RL policies trained with WarpDrive for different tagger:runner speed ratios. These environments can run at millions of steps per second and train in just a few hours, all on a single GPU!

  2. Complex two-level multi-agent environments, such as the Covid-19 environment and the climate change environment, have been developed on top of WarpDrive; see examples in Real-World Problems and Collaborations.

  3. Classic control: We include single-agent environments from gym.classic_control. In WarpDrive, single-agent is a special case of multi-agent; since each environment has only one agent, the scalability is even higher.

  4. Catalytic reaction pathways: We include environments that convert quantum density functional theory into a reinforcement learning representation and enable an automated search for optimal chemical reaction pathways in noisy chemical systems. See examples in Real-World Problems and Collaborations.

Throughput, Scalability and Convergence

Multi Agent

Below, we compare the training speed on an N1 16-CPU node versus a single A100 GPU (using WarpDrive), for the Tag environment with 100 runners and 5 taggers. With the same environment configuration and training parameters, WarpDrive on a GPU is about 10× faster. Both scenarios are with 60 environment replicas running in parallel. Using more environments on the CPU node is infeasible as data copying gets too expensive. With WarpDrive, it is possible to scale up the number of environment replicas at least 10-fold, for even faster training.

Single Agent

Below, we compare the training speed on a single A100 GPU (using WarpDrive) for (top) Cartpole-v1 and (bottom) Acrobot-v1, with 10, 100, 1K, and 10K environment replicas running in parallel for 3000 epochs (hyperparameters are identical). Convergence and speed both hold up as WarpDrive scales to these very large numbers of parallel environments.

Code Structure

WarpDrive provides a CUDA (or Numba) + Python framework and quality-of-life tools, so you can quickly build fast, flexible, and massively distributed multi-agent RL systems. The following figure illustrates a bottom-up overview of the design and components of WarpDrive. The user only needs to write a CUDA or Numba step function at the CUDA environment layer; the rest is a pure Python interface. We have step-by-step tutorials for you to master the workflow.
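
To give a flavor of what the environment-layer code looks like, here is a minimal, hypothetical Numba step kernel. The array names, shapes, and signature are illustrative only (this is not WarpDrive's actual Tag kernel or API); what it does follow is WarpDrive's convention of one thread block per environment replica and one thread per agent, with all data resident in GPU memory:

from numba import cuda

@cuda.jit
def cuda_step(loc_x, loc_y, actions, rewards, done, timestep, episode_length):
    # WarpDrive convention: blockIdx.x indexes the environment replica,
    # threadIdx.x indexes the agent within that replica.
    env_id = cuda.blockIdx.x
    agent_id = cuda.threadIdx.x

    # Apply this agent's action to its own state; everything stays on the GPU,
    # so there is no host <-> device copying between steps.
    loc_x[env_id, agent_id] += actions[env_id, agent_id, 0]
    loc_y[env_id, agent_id] += actions[env_id, agent_id, 1]

    # Wait until all agents in this environment have moved, so quantities that
    # depend on the joint state (e.g., pairwise distances) are consistent.
    cuda.syncthreads()

    # A toy per-agent reward based on the agent's own position.
    dx = loc_x[env_id, agent_id]
    dy = loc_y[env_id, agent_id]
    rewards[env_id, agent_id] = -(dx * dx + dy * dy)

    # Let a single thread per environment advance time and set the done flag.
    if agent_id == 0:
        timestep[env_id] += 1
        if timestep[env_id] >= episode_length:
            done[env_id] = 1

In an actual WarpDrive environment, a kernel like this is invoked from the environment's step() via the CUDA function manager, as the tutorials show.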

Python Interface

WarpDrive provides tools to build and train multi-agent RL systems quickly with just a few lines of code. Here is a short example to train tagger and runner agents:

# Create a wrapped environment object via the EnvWrapper
# Ensure that env_backend is set to 'pycuda' or 'numba' (in order to run on the GPU)
env_wrapper = EnvWrapper(
    TagContinuous(**run_config["env"]),
    num_envs=run_config["trainer"]["num_envs"], 
    env_backend="pycuda"
)

# Agents can share policy models: this dictionary maps policy model names to agent ids.
policy_tag_to_agent_id_map = {
    "tagger": list(env_wrapper.env.taggers),
    "runner": list(env_wrapper.env.runners),
}

# Create the trainer object
trainer = Trainer(
    env_wrapper=env_wrapper,
    config=run_config,
    policy_tag_to_agent_id_map=policy_tag_to_agent_id_map,
)

# Perform training!
trainer.train()
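
The run_config above is typically loaded from one of the YAML configuration files that ship with the example scripts. Purely as a hypothetical sketch of its shape (the keys and values below are illustrative only, not the exact schema; refer to the example configs in the repository):

run_config = {
    "env": {
        # Keyword arguments forwarded to TagContinuous(**run_config["env"]); illustrative values.
        "num_taggers": 5,
        "num_runners": 100,
        "episode_length": 500,
    },
    "trainer": {
        "num_envs": 400,            # environment replicas run in parallel on the GPU
        "train_batch_size": 10000,  # total environment steps collected per training iteration
        "num_episodes": 500,
    },
    "policy": {
        # One entry per policy name used in policy_tag_to_agent_id_map.
        "tagger": {"to_train": True, "gamma": 0.98, "lr": 0.005},
        "runner": {"to_train": True, "gamma": 0.98, "lr": 0.005},
    },
}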

Papers and Citing WarpDrive

Our paper is published in the Journal of Machine Learning Research (JMLR): https://jmlr.org/papers/v23/22-0185.html. You can also find more details in our white paper: https://arxiv.org/abs/2108.13976.

If you're using WarpDrive in your research or applications, please cite using this BibTeX:

@article{JMLR:v23:22-0185,
  author  = {Tian Lan and Sunil Srinivasa and Huan Wang and Stephan Zheng},
  title   = {WarpDrive: Fast End-to-End Deep Multi-Agent Reinforcement Learning on a GPU},
  journal = {Journal of Machine Learning Research},
  year    = {2022},
  volume  = {23},
  number  = {316},
  pages   = {1--6},
  url     = {http://jmlr.org/papers/v23/22-0185.html}
}

@misc{lan2021warpdrive,
  title={WarpDrive: Extremely Fast End-to-End Deep Multi-Agent Reinforcement Learning on a GPU}, 
  author={Tian Lan and Sunil Srinivasa and Huan Wang and Caiming Xiong and Silvio Savarese and Stephan Zheng},
  year={2021},
  eprint={2108.13976},
  archivePrefix={arXiv},
  primaryClass={cs.LG}
}

Tutorials and Quick Start

Tutorials

Familiarize yourself with WarpDrive by running these tutorials on Colab or in the NGC container!

You may also run these tutorials locally, but you will need a GPU machine with the nvcc compiler installed and a compatible Nvidia GPU driver. You will also need Jupyter; see https://jupyter.readthedocs.io/en/latest/install.html for installation instructions.

Example Training Script

We provide example scripts so you can quickly start end-to-end training. For example, to train the tag_continuous environment (10 taggers and 100 runners) with 2 GPUs and the CUDA C backend:

python example_training_script_pycuda.py -e tag_continuous -n 2

or switch to the JIT-compiled Numba backend with 1 GPU:

python example_training_script_numba.py -e tag_continuous

You can find full reference documentation here.

Real-World Problems and Collaborations

Installation Instructions

To get started, you'll need Python 3.7+ and the nvcc compiler installed, along with a compatible Nvidia GPU and CUDA driver.

CUDA (which includes nvcc) can be installed by following Nvidia's instructions here: https://developer.nvidia.com/cuda-downloads.
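
After installation, you can verify that nvcc is visible on your path with:

nvcc --version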

Docker Image

V100 GPU: You can refer to the example Dockerfile to configure your system.

A100 GPU: Our latest image is published and maintained by NVIDIA NGC. We recommend you download the latest image from NGC catalog.

If you want to build your own customized environment, we suggest you visit the Nvidia Docker Hub to download CUDA and cuDNN images compatible with your system. You should be able to use the nvidia-smi command-line utility to monitor the NVIDIA GPU devices in your system:

nvidia-smi

and see something like this:

+-----------------------------------------------------------------------------+
| NVIDIA-SMI 450.51.06    Driver Version: 450.51.06    CUDA Version: 11.0     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  Tesla V100-SXM2...  Off  | 00000000:00:04.0 Off |                    0 |
| N/A   37C    P0    32W / 300W |      0MiB / 16160MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+

In this snapshot, you can see we are using a Tesla V100 GPU and CUDA version 11.0.

Installing using Pip

You can install WarpDrive using the Python package manager:

pip install rl_warp_drive

Installing from Source

  1. Clone this repository to your machine:

    git clone https://www.github.com/salesforce/warp-drive
    
  2. Optional, but recommended for first tries: Create a new conda environment (named "warp_drive" below) and activate it:

    conda create --name warp_drive python=3.7 --yes
    conda activate warp_drive
    
  3. Install as an editable Python package:

    cd warp_drive
    pip install -e .
    

Testing your Installation

You can run the following Python scripts directly to test all the modules and the end-to-end training workflow:

python warp_drive/utils/unittests/run_unittests_pycuda.py
python warp_drive/utils/unittests/run_unittests_numba.py
python warp_drive/utils/unittests/run_trainer_tests.py

Learn More

For more information, please check out our blog, white paper, and code documentation.

If you're interested in extending this framework, or have questions, join the AI Economist Slack channel using this invite link.

warp-drive's People

Contributors

blchu, emerald01, mustious, sunil-s, svc-scm


warp-drive's Issues

Confusion of step function

Hi, is the step() function in warp_drive individual for each thread agent? In Figure 1 of the paper, it seems that each agent thread in the block maintains an individual step function, so that the transition function of the system can be parallelized. However, in many MARL scenarios, the system transition requires all agents' actions before the step function can move forward. Is this a limitation of the repo?

Addition of Other Reinforcement Learning Algorithms (i.e., Q-Learning)

Dear WarpDrive Team,

May I find out if it is possible to implement other reinforcement learning algorithms in WarpDrive (e.g., Q-Learning)?

If not, may I ask whether PPO and A2C are considered among the better algorithms in the field? I am not that well informed about the algorithms and their individual advantages, but from what I have gathered from online searches:

It can be observed that PPO provides a better convergence and performance rate than other techniques but is sensitive to changes. DQN alone is unstable and gives poor convergence, hence requires several add-ons.

Reference:
https://medium.datadriveninvestor.com/which-reinforcement-learning-rl-algorithm-to-use-where-when-and-in-what-scenario-e3e7617fb0b1

found invalid values error

When running the example here:
https://github.com/salesforce/warp-drive/blob/master/tutorials/simple-end-to-end-example.ipynb

Note:

repo commit: b5d46d4
These tests passed successfully:
python warp_drive/utils/unittests/run_unittests_pycuda.py
python warp_drive/utils/unittests/run_trainer_tests.py
NVIDIA p104-100 8GB

I get this output:

Device: 0
Iterations Completed : 1 / 50

Speed performance stats

Mean policy eval time per iter (ms) : 196.94
Mean action sample time per iter (ms) : 37.12
Mean env. step time per iter (ms) : 85.96
Mean training time per iter (ms) : 123.15
Mean total time per iter (ms) : 453.86
Mean steps per sec (policy eval) : 50775.87
Mean steps per sec (action sample) : 269373.56
Mean steps per sec (env. step) : 116335.91
Mean steps per sec (training time) : 81202.92
Mean steps per sec (total) : 22033.34

Metrics for policy 'runner'

VF loss coefficient : 0.01000
Entropy coefficient : 0.05000
Total loss : 0.09430
Policy loss : 0.33186
Value function loss : 0.20734
Mean rewards : 0.00085
Max. rewards : 1.00000
Min. rewards : -1.00000
Mean value function : 0.04290
Mean advantages : 0.06929
Mean (norm.) advantages : 0.06929
Mean (discounted) returns : 0.11219
Mean normalized returns : 0.11219
Mean entropy : 4.79267
Variance explained by the value function: 0.01151
Std. of action_0 over agents : 3.13083
Std. of action_0 over envs : 3.14615
Std. of action_0 over time : 3.14577
Std. of action_1 over agents : 3.17047
Std. of action_1 over envs : 3.18386
Std. of action_1 over time : 3.18446
Current timestep : 10000.00000
Gradient norm : 0.00000
Learning rate : 0.00500
Mean episodic reward : 1.71000
Mean episodic steps : 100.00000

Metrics for policy 'tagger'

VF loss coefficient : 0.01000
Entropy coefficient : 0.05000
Total loss : 1.78037
Policy loss : 2.01399
Value function loss : 0.59261
Mean rewards : 0.01810
Max. rewards : 1.00000
Min. rewards : 0.00000
Mean value function : 0.06817
Mean advantages : 0.42039
Mean (norm.) advantages : 0.42039
Mean (discounted) returns : 0.48856
Mean normalized returns : 0.48856
Mean entropy : 4.79084
Variance explained by the value function: -0.00882
Std. of action_0 over agents : 3.06860
Std. of action_0 over envs : 3.17762
Std. of action_0 over time : 3.17566
Std. of action_1 over agents : 3.05678
Std. of action_1 over envs : 3.16503
Std. of action_1 over time : 3.16620
Current timestep : 10000.00000
Gradient norm : 0.00000
Learning rate : 0.00200
Mean episodic reward : 9.05000
Mean episodic steps : 100.00000

[Device 0]: Saving the results to the file '/tmp/continuous_tag/example/1679065351/results.json'
[Device 0]: Saving the 'runner' torch model to the file: '/tmp/continuous_tag/example/1679065351/runner_10000.state_dict'.
[Device 0]: Saving the 'tagger' torch model to the file: '/tmp/continuous_tag/example/1679065351/tagger_10000.state_dict'.
Traceback (most recent call last):
File "wd_test.py", line 84, in
trainer.train()
File "/home/warp/github/warp-drive/warp_drive/training/trainer.py", line 415, in train
metrics = self._update_model_params(iteration)
File "/home/warp/github/warp-drive/warp_drive/training/trainer.py", line 710, in _update_model_params
perform_logging=logging_flag,
File "/home/warp/github/warp-drive/warp_drive/training/algorithms/policygradient/a2c.py", line 102, in compute_loss_and_metrics
m = Categorical(action_probabilities_batch[idx])
File "/home/warp/anaconda3/envs/warp_drive/lib/python3.7/site-packages/torch/distributions/categorical.py", line 64, in init
super(Categorical, self).init(batch_shape, validate_args=validate_args)
File "/home/warp/anaconda3/envs/warp_drive/lib/python3.7/site-packages/torch/distributions/distribution.py", line 56, in init
f"Expected parameter {param} "
ValueError: Expected parameter probs (Tensor of shape (100, 100, 5, 11)) of distribution Categorical(probs: torch.Size([100, 100, 5, 11])) to satisfy the constraint Simplex(), but found invalid values:
tensor([[[[ 1.2426e+00, -1.2945e+00,  4.1014e-01,  ...,  5.5622e-01,
           -6.7214e-01, -1.2349e+00],
          ...,
          [ 1.1307e+00,  2.1601e+00,  1.0504e-01,  ...,  6.5564e-01,
           -1.1295e+00, -2.6074e+00]]]], device='cuda:0',
       grad_fn=<DivBackward0>)

Exception ignored in: <function PyCUDASampler.__del__ at 0x7f66e6914830>
Traceback (most recent call last):
File "/home/warp/github/warp-drive/warp_drive/managers/pycuda_managers/pycuda_function_manager.py", line 510, in __del__
File "/home/warp/anaconda3/envs/warp_drive/lib/python3.7/site-packages/pycuda/driver.py", line 480, in function_call
pycuda._driver.LogicError: cuFuncSetBlockShape failed: invalid resource handle

Creating a 4D custom environment from Gridworld 2D env

Dear all, I am new to reinforcement learning, but I am fascinated with WarpDrive. I was wondering if you could help me build a custom env for my little study project. The story of my env is as follows:
I want to create a gym 4D environment: a 468x225x182x54 space (i.e., 1,034,888,400 unique cells), where every cell has a unique value. My agent (e.g. a rabbit) can jump anywhere in this space, and any cell it lands on drops to zero value (it is "burned" once the rabbit collects its points). The agent is rewarded based on the reduction of the environment's overall points (e.g. 2000) as cell values are set to zero. Which cells hold more points is unknown to the agent but fixed, and it is the agent's task to find out, by jumping so as to burn the higher-value cells before the episode length runs out. I thought my action space could be defined as

import gym

class CustomEnv(gym.Env):
    def __init__(self):
        self.action_space = gym.spaces.MultiDiscrete([468, 225, 182, 54])

For example

print(CustomEnv().action_space.sample())
[172 54 101 37]

where my agent collects the reward at the location [172 54 101 37], and all values at this cell are now zero.
When the game starts, the agent jumps into this 4D space (I assume it is better to start the first episode at a fixed position with a buffer action, where no values are zeroed, and let the agent learn during policy training to begin with an action that yields a globally better reward). Furthermore, I want the step function to work like this: the rabbit makes a jump, then the reward is returned. The returned state is the 4D space with the same shape, but its values change as cells are zeroed by previous actions.

However, I don't know how I should define my observation space, and I would really appreciate your help.

So far, for example, here is how I would modify your gridworld example env:

import numpy as np
from gym import spaces
from gym.utils import seeding

# seeding code from https://github.com/openai/gym/blob/master/gym/utils/seeding.py
from warp_drive.utils.constants import Constants
from warp_drive.utils.data_feed import DataFeed
from warp_drive.utils.gpu_environment_context import CUDAEnvironmentContext

_OBSERVATIONS = Constants.OBSERVATIONS
_ACTIONS = Constants.ACTIONS
_REWARDS = Constants.REWARDS

# Our Custom field, where it is 4D space of size 468x225x182x54, and each cell has a random value
RabbitField_World = np.array([np.random.randint(0,5,468), np.random.randint(0,5,225), np.random.randint(0,5,182), np.random.randint(0,5,54)])
RabbitField_World_Fixed_Points = (sum(RabbitField_World[0])+sum(RabbitField_World[1])+sum(RabbitField_World[2])+sum(RabbitField_World[3]))
_LOC_X = "cells_dim_x"
_LOC_Y = "cells_dim_y"
_LOC_Z = "cells_dim_z"
_LOC_K = "cells_dim_k"


def burning(dim_world, jump_pos):
    dim_world[jump_pos] = 0 
    return dim_world

class RabbitField:
    """
    The game of tag on a 4D 468x225x182x54 plane.
    There are a number of agents (Rabbits) trying to minimize the plane overall points.
    A cell might have a value from range of 0 to 5. An agent jumps on a cell and collects 
        the point of it, and the value of the cell becomes zero.
    The reward will be the remaining points in the 4D plane.
    """

    def __init__(
        self,
        num_agents=1,
        grid_dim_one=468,
        grid_dim_two=225,
        grid_dim_three=182,
        grid_dim_four=54,
        episode_length=100,
        starting_cells_x=RabbitField_World[0],
        starting_cells_y=RabbitField_World[1],
        starting_cells_z=RabbitField_World[2],
        starting_cells_k=RabbitField_World[3],
        finish_point = 1000,
        seed=None,
        step_cost_for_agent=0.01,
        use_full_observation=True,
        env_backend="cpu"
    ):
        """
        :param num_agents (int): the total number of rabbits. In this env,
            each env can have a single rabbit (num_agents = 1) or multiple rabbits.
        :param grid_dim_# (int): the world is a 4D space,
        :param episode_length (int): episode length
        :param starting_location_x ([ndarray], optional): starting x axis cells values
            of the 4D plane.
        :param starting_location_y ([ndarray], optional): starting y axis cells values
            of the 4D agents.
        :param starting_location_z ([ndarray], optional): starting z axis cells values
            of the 4D agents.
        :param starting_location_k ([ndarray], optional): starting k axis cells values
            of the 4D agents.
        :param finish_point (int): the reward sufficient to finish the game (default 1000).
        :param seed: seeding parameter.
        :param step_cost_for_agent (float): penalty for each jump that rabbit makes
        :param use_full_observation (bool): boolean indicating whether to
            include all the agents' data in the use_full_observation or
            just the nearest neighbor. Defaults to True.
        """
        assert num_agents > 0
        self.num_agents = num_agents

        assert episode_length > 0
        self.episode_length = episode_length

        self.grid_dim_one = grid_dim_one
        self.grid_dim_two = grid_dim_two
        self.grid_dim_three = grid_dim_three
        self.grid_dim_four = grid_dim_four

        # Seeding
        self.np_random = np.random
        if seed is not None:
            self.seed(seed)


        self.starting_cells_x = starting_cells_x
        self.starting_cells_y = starting_cells_y
        self.starting_cells_z = starting_cells_z
        self.starting_cells_k = starting_cells_k

        # Each possible action is a cell position in the self.RabbitField_World 
        self.step_actions = [468, 225, 182, 54]

        # Defining observation and action spaces
        self.observation_space = None  # Note: this will be set via the env_wrapper

        self.action_space = {
            agent_id: spaces.MultiDiscrete(self.step_actions)
            for agent_id in range(self.num_agents)
        }

        # These will be set during reset (see below)
        self.timestep = None
        self.global_state = None

        # For reward computation
        self.step_cost_for_agent = step_cost_for_agent
        self.finish_point = finish_point  #this is a fixed reward defined by us to end the game
        self.reward_penalty = np.zeros(self.num_agents)
        self.use_full_observation = use_full_observation

        self.env_backend = env_backend

    name = "RabbitField"

    def seed(self, seed=None):
        self.np_random, seed = seeding.np_random(seed)
        return [seed]

    def set_global_state(self, key=None, value=None, t=None, dtype=None):
        assert key is not None
        if dtype is None:
            dtype = np.int32

        # If no values are passed, set everything to zeros.
        if key not in self.global_state:
            self.global_state[key] = np.zeros(
                (self.episode_length + 1, self.num_agents), dtype=dtype
            )

        if t is not None and value is not None:
            assert isinstance(value, np.ndarray)
            assert value.shape[0] == self.global_state[key].shape[1]

            self.global_state[key][t] = value



    def update_state(self, actions_x, actions_y, actions_z, actions_k):
        loc_x_prev_t = self.global_state[_LOC_X][self.timestep - 1]
        loc_y_prev_t = self.global_state[_LOC_Y][self.timestep - 1]
        loc_z_prev_t = self.global_state[_LOC_Z][self.timestep - 1]
        loc_k_prev_t = self.global_state[_LOC_K][self.timestep - 1]


        loc_x_curr_t = burning(loc_x_prev_t, actions_x)
        loc_y_curr_t = burning(loc_y_prev_t, actions_y)
        loc_z_curr_t = burning(loc_z_prev_t, actions_z)
        loc_k_curr_t = burning(loc_k_prev_t, actions_k)


        self.set_global_state(key=_LOC_X, value=loc_x_curr_t, t=self.timestep)
        self.set_global_state(key=_LOC_Y, value=loc_y_curr_t, t=self.timestep)
        self.set_global_state(key=_LOC_Z, value=loc_z_curr_t, t=self.timestep)
        self.set_global_state(key=_LOC_K, value=loc_k_curr_t, t=self.timestep)

        # Our RabbitField custom reward from collecting points: the more overall value
        # the current 4D plane loses, the larger the reward.
        self.reward_collection = RabbitField_World_Fixed_Points - (
            sum(loc_x_curr_t) + sum(loc_y_curr_t) + sum(loc_z_curr_t) + sum(loc_k_curr_t)
        )
        # 'tag' must be initialized, otherwise it is undefined when the threshold is not reached.
        tag = False
        if self.reward_collection >= self.finish_point:
            tag = True

        # The collected reward is a scalar shared by all agents.
        reward = self.reward_collection
        rew = {agent_id: reward for agent_id in range(self.num_agents)}

        return rew, tag

    def generate_observation(self):
        obs = {}
        if self.use_full_observation:
            common_obs = None
            for feature in [
                _LOC_X,
                _LOC_Y,
                _LOC_Z,
                _LOC_K,
            ]:
                if common_obs is None:
                    common_obs = self.global_state[feature][self.timestep]
                else:
                    common_obs = np.vstack(
                        (common_obs, self.global_state[feature][self.timestep])
                    )
            normalized_common_obs = common_obs 

            agent_types = np.array(
                [self.agent_type[agent_id] for agent_id in range(self.num_agents)]
            )

            for agent_id in range(self.num_agents):
                agent_indicators = np.zeros(self.num_agents)
                agent_indicators[agent_id] = 1
                obs[agent_id] = np.concatenate(
                    [
                        np.vstack(
                            (normalized_common_obs, agent_types, agent_indicators)
                        ).reshape(-1),
                        np.array([float(self.timestep) / self.episode_length]),
                    ]
                )
        else:
            for agent_id in range(self.num_agents):
                feature_list = []
                for feature in [
                    _LOC_X,
                    _LOC_Y,
                    _LOC_Z,
                    _LOC_K,
                ]:
                    feature_list.append(
                        self.global_state[feature][self.timestep][agent_id]
                    )
                if agent_id < self.num_agents - 1:
                    for feature in [
                        _LOC_X,
                        _LOC_Y,
                        _LOC_Z,
                        _LOC_K,
                    ]:
                        feature_list.append(
                            self.global_state[feature][self.timestep][-1]
                        )
                else:
                    dist_array = None
                    for feature in [
                        _LOC_X,
                        _LOC_Y,
                        _LOC_Z,
                        _LOC_K,
                    ]:
                        if dist_array is None:
                            dist_array = np.square(
                                self.global_state[feature][self.timestep][:-1]
                                - self.global_state[feature][self.timestep][-1]
                            )
                        else:
                            dist_array += np.square(
                                self.global_state[feature][self.timestep][:-1]
                                - self.global_state[feature][self.timestep][-1]
                            )
                    min_agent_id = np.argmin(dist_array)
                    for feature in [
                        _LOC_X,
                        _LOC_Y,
                        _LOC_Z,
                        _LOC_K,
                    ]:
                        feature_list.append(
                            self.global_state[feature][self.timestep][min_agent_id]
                        )
                feature_list += [
                    self.agent_type[agent_id],
                    float(self.timestep) / self.episode_length,
                ]
                obs[agent_id] = np.array(feature_list)
        return obs

    def reset(self):
        # Reset time to the beginning
        self.timestep = 0

        # Re-initialize the global state
        self.global_state = {}
        self.set_global_state(
            key=_LOC_X, value=self.starting_cells_x, t=self.timestep, dtype=np.int32
        )
        self.set_global_state(
            key=_LOC_Y, value=self.starting_cells_y, t=self.timestep, dtype=np.int32
        )
        self.set_global_state(
            key=_LOC_Z, value=self.starting_cells_z, t=self.timestep, dtype=np.int32
        )
        self.set_global_state(
            key=_LOC_K, value=self.starting_cells_k, t=self.timestep, dtype=np.int32
        )
        return self.generate_observation()

    def step(
        self,
        actions=None,
    ):
        self.timestep += 1
        assert isinstance(actions, dict)
        assert len(actions) == self.num_agents

        actions_x = np.array(
            [
                actions[agent_id][0]
                for agent_id in range(self.num_agents)
            ]
        )
        actions_y = np.array(
            [
                actions[agent_id][1]
                for agent_id in range(self.num_agents)
            ]
        )
        actions_z = np.array(
            [
                actions[agent_id][2]
                for agent_id in range(self.num_agents)
            ]
        )
        actions_k = np.array(
            [
                actions[agent_id][3]
                for agent_id in range(self.num_agents)
            ]
        )

        rew, tag = self.update_state(actions_x, actions_y, actions_z, actions_k)
        obs = self.generate_observation()
        done = {"__all__": self.timestep >= self.episode_length or tag}
        info = {}

        return obs, rew, done, info


class CUDARabbitField(RabbitField, CUDAEnvironmentContext):
    """
    CUDA version of the RabbitField environment.
    Note: this class subclasses the Python environment class RabbitField,
    and also the  CUDAEnvironmentContext
    """

    def get_data_dictionary(self):
        data_dict = DataFeed()
        for feature in [
            _LOC_X,
            _LOC_Y,
            _LOC_Z,
            _LOC_K,
        ]:
            data_dict.add_data(
                name=feature,
                data=self.global_state[feature][0],
                save_copy_and_apply_at_reset=True,
                log_data_across_episode=True,
            )
        data_dict.add_data_list(
            [
                ("finish_point", self.finish_point),
                ("step_cost_for_agent", self.step_cost_for_agent),
                ("use_full_observation", self.use_full_observation),
            ]
        )
        return data_dict

    def get_tensor_dictionary(self):
        tensor_dict = DataFeed()
        return tensor_dict

    def step(self, actions=None):
        self.timestep += 1
        args = [
            _LOC_X,
            _LOC_Y,
            _LOC_Z,
            _LOC_K,
            _ACTIONS,
            "_done_",
            _REWARDS,
            _OBSERVATIONS,
            "finish_point",
            "step_cost_for_agent",
            "use_full_observation",
            "_timestep_",
            ("episode_length", "meta"),
        ]
        if self.env_backend == "pycuda":
            self.cuda_step(
                *self.cuda_step_function_feed(args),
                block=self.cuda_function_manager.block,
                grid=self.cuda_function_manager.grid,
            )
        elif self.env_backend == "numba":
            self.cuda_step[
                self.cuda_function_manager.grid, self.cuda_function_manager.block
            ](*self.cuda_step_function_feed(args))
        else:
            raise Exception("CUDARabbitField expects env_backend = 'pycuda' or 'numba' ")

a question for cuda env

There is a reset() in the Python environment, but the CUDA environments do not have one. Suppose I want the agents in each environment to be able to start from a different initial position; how can I write that in CUDA?

Update README.md for limitations

The readme states that this library is useful/fast for simple RL problems, and that the environments it contains were kept simple for ease of understanding. That leads me to my question [apologies, as I don't know of another way of sending this message to you without a GitHub issue; it's just my lack of understanding, but perhaps it will help others].

What are the limitations of this library?

Could I create an environment such as a humanoid, create multiple instances of humanoids in one environment, and have them learn [or "cheat"] from each other to find the fastest way to get across the environment [a 100m dash, for example]?

Could this library be used to train agents within a Unity environment [they probably wouldn't actually be trained in the Unity environment itself, but rather visualized in Unity after training]?

problem with running environment

Hi,

Need some help to make this run. The issue is happening in Jupyter and in PyCharm: a 'make bin file failed..' exception is raised over and over again. The exception is raised in the simple-end-to-end example and also while running both tests built into the library.
Fresh Conda environment.

Thanks for any suggestions

appdirs 1.4.4 pypi_0 pypi
atomicwrites 1.4.0 py_0
attrs 21.4.0 pyhd3eb1b0_0
backcall 0.2.0 pypi_0 pypi
blas 1.0 mkl
brotli 1.0.9 h2bbff1b_7
brotli-bin 1.0.9 h2bbff1b_7
ca-certificates 2022.07.19 haa95532_0
certifi 2022.6.15 py37haa95532_0
charset-normalizer 2.1.1 pypi_0 pypi
cloudpickle 2.1.0 pypi_0 pypi
colorama 0.4.5 py37haa95532_0
cycler 0.11.0 pyhd3eb1b0_0
debugpy 1.6.3 pypi_0 pypi
decorator 5.1.1 pypi_0 pypi
entrypoints 0.4 pypi_0 pypi
fonttools 4.25.0 pyhd3eb1b0_0
freetype 2.10.4 hd328e21_0
glib 2.69.1 h5dc1a3c_1
gst-plugins-base 1.18.5 h9e645db_0
gstreamer 1.18.5 hd78058f_0
gym 0.25.2 pypi_0 pypi
gym-notices 0.0.8 pypi_0 pypi
icu 58.2 ha925a31_3
idna 3.3 pypi_0 pypi
importlib-metadata 4.11.3 py37haa95532_0
importlib_metadata 4.11.3 hd3eb1b0_0
iniconfig 1.1.1 pyhd3eb1b0_0
intel-openmp 2021.4.0 haa95532_3556
ipykernel 6.15.1 pypi_0 pypi
ipython 7.34.0 pypi_0 pypi
jedi 0.18.1 pypi_0 pypi
jpeg 9e h2bbff1b_0
jupyter-client 7.3.4 pypi_0 pypi
jupyter-core 4.11.1 pypi_0 pypi
kiwisolver 1.4.2 py37hd77b12b_0
lerc 3.0 hd77b12b_0
libbrotlicommon 1.0.9 h2bbff1b_7
libbrotlidec 1.0.9 h2bbff1b_7
libbrotlienc 1.0.9 h2bbff1b_7
libclang 12.0.0 default_h627e005_2
libdeflate 1.8 h2bbff1b_5
libffi 3.4.2 hd77b12b_4
libiconv 1.16 h2bbff1b_2
libogg 1.3.5 h2bbff1b_1
libpng 1.6.37 h2a8f88b_0
libtiff 4.4.0 h8a3f274_0
libvorbis 1.3.7 he774522_0
libwebp 1.2.2 h2bbff1b_0
libxml2 2.9.14 h0ad7f3c_0
libxslt 1.1.35 h2bbff1b_0
lz4-c 1.9.3 h2bbff1b_1
mako 1.2.1 pypi_0 pypi
markupsafe 2.1.1 pypi_0 pypi
matplotlib 3.5.1 py37haa95532_1
matplotlib-base 3.5.1 py37hd77b12b_1
matplotlib-inline 0.1.6 pypi_0 pypi
mkl 2021.4.0 haa95532_640
mkl-service 2.4.0 py37h2bbff1b_0
mkl_fft 1.3.1 py37h277e83a_0
mkl_random 1.2.2 py37hf11a4ad_0
munkres 1.1.4 py_0
nest-asyncio 1.5.5 pypi_0 pypi
numpy 1.21.5 py37h7a0a035_3
numpy-base 1.21.5 py37hca35cd5_3
openssl 1.1.1q h2bbff1b_0
packaging 21.3 pyhd3eb1b0_0
parso 0.8.3 pypi_0 pypi
pcre 8.45 hd77b12b_0
pickleshare 0.7.5 pypi_0 pypi
pillow 9.2.0 py37hdc2b20a_1
pip 22.1.2 py37haa95532_0
platformdirs 2.5.2 pypi_0 pypi
pluggy 1.0.0 py37haa95532_1
ply 3.11 py37_0
prompt-toolkit 3.0.30 pypi_0 pypi
psutil 5.9.1 pypi_0 pypi
py 1.11.0 pyhd3eb1b0_0
pycuda 2021.1 pypi_0 pypi
pygments 2.13.0 pypi_0 pypi
pyparsing 3.0.4 pyhd3eb1b0_0
pyqt 5.15.7 py37hd77b12b_0
pyqt5-sip 12.11.0 py37hd77b12b_0
pytest 7.1.2 py37haa95532_0
python 3.7.13 h6244533_0
python-dateutil 2.8.2 pyhd3eb1b0_0
pytools 2022.1.12 pypi_0 pypi
pywin32 304 pypi_0 pypi
pyyaml 6.0 py37h2bbff1b_1
pyzmq 23.2.1 pypi_0 pypi
qt-main 5.15.2 he8e5bd7_7
qt-webengine 5.15.9 hb9a9bb5_4
qtwebkit 5.212 h3ad3cdb_4
requests 2.28.1 pypi_0 pypi
rl-warp-drive 1.6.7 pypi_0 pypi
setuptools 61.2.0 py37haa95532_0
sip 6.6.2 py37hd77b12b_0
six 1.16.0 pyhd3eb1b0_1
sqlite 3.39.2 h2bbff1b_0
tk 8.6.12 h2bbff1b_0
toml 0.10.2 pyhd3eb1b0_0
tomli 2.0.1 py37haa95532_0
torch 1.10.2 pypi_0 pypi
torchaudio 0.12.1+cu116 pypi_0 pypi
torchtext 0.11.2 pypi_0 pypi
torchvision 0.11.3 pypi_0 pypi
tornado 6.1 py37h2bbff1b_0
tqdm 4.64.0 pypi_0 pypi
traitlets 5.3.0 pypi_0 pypi
typing_extensions 4.3.0 py37haa95532_0
urllib3 1.26.11 pypi_0 pypi
vc 14.2 h21ff451_1
vs2015_runtime 14.27.29016 h5e58377_2
wcwidth 0.2.5 pypi_0 pypi
wheel 0.37.1 pyhd3eb1b0_0
wincertstore 0.2 py37haa95532_2
xz 5.2.5 h8cc25b3_1
yaml 0.2.5 he774522_0
zipp 3.8.0 py37haa95532_0
zlib 1.2.12 h8cc25b3_2
zstd 1.5.2 h19a0ad4_0

Numba #Enhancement

Perhaps I misunderstand [very possible] what Numba is for, but maybe it can be used to replace learning how to write CUDA C code: instead, you just write Python code that Numba translates into code that runs directly on the GPU?

An issue with installation of warp-drive: Failed building wheel for pycuda

Hello all!

I followed the instructions in this repository to install warp-drive on my laptop:

Processor Intel(R) Core(TM) i7-8550U CPU @ 1.80GHz 1.99 GHz
Installed RAM 16.0 GB (15.9 GB usable)
System type 64-bit operating system, x64-based processor
Edition Windows 10 Pro

However, it gives the following error for "pycuda":

 C:\Users\Aslan\AppData\Local\Temp\pip-install-6zudjbjm\pycuda_26af0be0537a4731b787cf7208c68c7e\src\cpp\cuda.hpp(14): fatal error C1083: Cannot open include file: 'cuda.h': No such file or directory
  C:\Users\Aslan\AppData\Local\Temp\pip-build-env-22hj7b1u\overlay\Lib\site-packages\setuptools\command\build_py.py:153: SetuptoolsDeprecationWarning:     Installing 'pycuda.cuda' as data is deprecated, please list it in `packages`.
      !!


      ############################
      # Package would be ignored #
      ############################
      Python recognizes 'pycuda.cuda' as an importable package,
      but it is not listed in the `packages` configuration of setuptools.

      'pycuda.cuda' has been automatically added to the distribution only
      because it may contain data files, but this behavior is likely to change
      in future versions of setuptools (and therefore is considered deprecated).

      Please make sure that 'pycuda.cuda' is included as a package by using
      the `packages` configuration field or the proper discovery methods
      (for example by using `find_namespace_packages(...)`/`find_namespace:`
      instead of `find_packages(...)`/`find:`).

      You can read more about "package discovery" and "data files" on setuptools
      documentation page.


  !!

    check.warn(importable)
  error: command 'C:\\Program Files (x86)\\Microsoft Visual Studio\\2019\\BuildTools\\VC\\Tools\\MSVC\\14.26.28801\\bin\\HostX86\\x64\\cl.exe' failed with exit code 2
  [end of output]

note: This error originates from a subprocess, and is likely not a problem with pip.
ERROR: Failed building wheel for pycuda
Failed to build pycuda
ERROR: Could not build wheels for pycuda, which is required to install pyproject.toml-based projects
WARNING: Ignoring invalid distribution -ffi (c:\users\aslan\anaconda3\envs\ai-economist\lib\site-packages)
WARNING: Ignoring invalid distribution -ffi (c:\users\aslan\anaconda3\envs\ai-economist\lib\site-packages)
WARNING: Ignoring invalid distribution -ffi (c:\users\aslan\anaconda3\envs\ai-economist\lib\site-packages)

As you see the problem is with pycuda.

Here is my installed packages:

Name Version Build Channel

absl-py 1.1.0 pypi_0 pypi
aiosignal 1.2.0 pypi_0 pypi
alabaster 0.7.12 py37_0 anaconda
astroid 2.9.0 py37haa95532_0 anaconda
astunparse 1.6.3 pypi_0 pypi
attrs 21.2.0 pypi_0 pypi
babel 2.9.1 pyhd3eb1b0_0 anaconda
backcall 0.2.0 pyhd3eb1b0_0 anaconda
beautifulsoup4 4.11.1 py37haa95532_0 anaconda
blas 1.0 mkl
bleach 4.1.0 pyhd3eb1b0_0 anaconda
brotlipy 0.7.0 py37h2bbff1b_1003 anaconda
ca-certificates 2022.4.26 haa95532_0 anaconda
cachetools 5.2.0 pypi_0 pypi
certifi 2022.6.15 py37haa95532_0 anaconda
cffi 1.15.0 pypi_0 pypi
chardet 4.0.0 py37haa95532_1003 anaconda
charset-normalizer 2.0.4 pyhd3eb1b0_0 anaconda
cloudpickle 2.0.0 pyhd3eb1b0_0 anaconda
colorama 0.4.4 pyhd3eb1b0_0 anaconda
cryptography 36.0.0 py37h21b164f_0 anaconda
cudatoolkit 11.6.0 hc0ea762_10 conda-forge
debugpy 1.5.1 py37hd77b12b_0 anaconda
decorator 5.0.9 pypi_0 pypi
defusedxml 0.7.1 pyhd3eb1b0_0 anaconda
distlib 0.3.4 pypi_0 pypi
docutils 0.17.1 py37haa95532_1 anaconda
entrypoints 0.4 py37haa95532_0 anaconda
filelock 3.7.1 pypi_0 pypi
flatbuffers 1.12 pypi_0 pypi
freetype 2.10.4 hd328e21_0
frozenlist 1.3.0 pypi_0 pypi
gast 0.4.0 pypi_0 pypi
google-auth 2.8.0 pypi_0 pypi
google-auth-oauthlib 0.4.6 pypi_0 pypi
google-pasta 0.2.0 pypi_0 pypi
grpcio 1.43.0 pypi_0 pypi
gym 0.21.0 pypi_0 pypi
icu 58.2 vc14hc45fdbb_0 [vc14] anaconda
idna 3.3 pyhd3eb1b0_0 anaconda
imageio 2.19.3 pypi_0 pypi
imagesize 1.3.0 pyhd3eb1b0_0 anaconda
importlib-metadata 4.11.3 py37haa95532_0 anaconda
importlib_metadata 4.11.3 hd3eb1b0_0 anaconda
importlib_resources 5.2.0 pyhd3eb1b0_1 anaconda
intel-openmp 2021.4.0 haa95532_3556
ipykernel 6.15.0 pypi_0 pypi
ipython 7.34.0 pypi_0 pypi
ipython_genutils 0.2.0 pyhd3eb1b0_1 anaconda
isort 5.9.3 pyhd3eb1b0_0 anaconda
jedi 0.18.0 pypi_0 pypi
jinja2 3.0.3 pyhd3eb1b0_0 anaconda
jpeg 9b vc14h4d7706e_1 [vc14] anaconda
jsonschema 4.4.0 py37haa95532_0 anaconda
jupyter-client 7.3.4 pypi_0 pypi
jupyter_client 7.2.2 py37haa95532_0
jupyter_core 4.10.0 py37haa95532_0
jupyterlab-pygments 0.2.2 pypi_0 pypi
jupyterlab_pygments 0.1.2 py_0 anaconda
keras 2.9.0 pypi_0 pypi
keras-preprocessing 1.1.2 pypi_0 pypi
keyring 23.4.0 py37haa95532_0 anaconda
lazy-object-proxy 1.6.0 py37h2bbff1b_0 anaconda
libclang 14.0.1 pypi_0 pypi
libpng 1.6.37 h2a8f88b_0 anaconda
libtiff 4.2.0 hd0e1b90_0
libuv 1.40.0 he774522_0
libwebp 1.2.2 h2bbff1b_0
lz4-c 1.9.3 h2bbff1b_1
markdown 3.3.7 pypi_0 pypi
markupsafe 2.0.1 py37h2bbff1b_0 anaconda
matplotlib-inline 0.1.2 pyhd3eb1b0_2 anaconda
mccabe 0.7.0 pyhd3eb1b0_0 anaconda
mistune 0.8.4 py37hfa6e2cd_1001 anaconda
mkl 2021.4.0 haa95532_640
mkl-service 2.4.0 py37h2bbff1b_0
mkl_fft 1.3.1 py37h277e83a_0
mkl_random 1.2.2 py37hf11a4ad_0
msgpack 1.0.4 pypi_0 pypi
nbclient 0.5.13 py37haa95532_0 anaconda
nbconvert 6.4.4 py37haa95532_0 anaconda
nbformat 5.3.0 py37haa95532_0 anaconda
nest-asyncio 1.5.5 py37haa95532_0 anaconda
networkx 2.6.3 pypi_0 pypi
numpy 1.21.5 py37h7a0a035_3
numpy-base 1.21.5 py37hca35cd5_3
numpydoc 1.2 pyhd3eb1b0_0 anaconda
oauthlib 3.2.0 pypi_0 pypi
openssl 1.1.1o h2bbff1b_0 anaconda
opt-einsum 3.3.0 pypi_0 pypi
packaging 20.9 pypi_0 pypi
pandocfilters 1.5.0 pyhd3eb1b0_0 anaconda
parso 0.8.2 pypi_0 pypi
pickleshare 0.7.5 pyhd3eb1b0_1003 anaconda
pillow 9.0.1 py37hdc2b20a_0
pip 22.1.2 pypi_0 pypi
platformdirs 2.4.0 pyhd3eb1b0_0 anaconda
prompt-toolkit 3.0.18 pypi_0 pypi
psutil 5.8.0 py37h2bbff1b_1 anaconda
pyasn1 0.4.8 pypi_0 pypi
pyasn1-modules 0.2.8 pypi_0 pypi
pycodestyle 2.8.0 pyhd3eb1b0_0 anaconda
pycparser 2.20 pypi_0 pypi
pyflakes 2.4.0 pyhd3eb1b0_0 anaconda
pygments 2.9.0 pypi_0 pypi
pylint 2.12.2 py37haa95532_1 anaconda
pyopenssl 22.0.0 pyhd3eb1b0_0 anaconda
pyparsing 3.0.4 pyhd3eb1b0_0 anaconda
pyqt 5.9.2 py37ha878b3d_0 anaconda
pyrsistent 0.17.3 pypi_0 pypi
pysocks 1.7.1 py37_1 anaconda
python 3.7.13 h6244533_0
python-dateutil 2.8.2 pyhd3eb1b0_0
python-fastjsonschema 2.15.1 pyhd3eb1b0_0 anaconda
python_abi 3.7 2_cp37m conda-forge
pytorch 1.12.0 py3.7_cuda11.6_cudnn8_0 pytorch
pytorch-mutex 1.0 cuda pytorch
pytz 2021.3 pyhd3eb1b0_0 anaconda
pywavelets 1.3.0 pypi_0 pypi
pywin32 302 py37h2bbff1b_2 anaconda
pywin32-ctypes 0.2.0 py37_1001 anaconda
pywinpty 2.0.5 pypi_0 pypi
pyzmq 23.2.0 pypi_0 pypi
qt 5.9.7 vc14h73c81de_0 [vc14] anaconda
qtawesome 1.0.3 pyhd3eb1b0_0 anaconda
qtconsole 5.3.0 pyhd3eb1b0_0 anaconda
qtpy 2.0.1 pyhd3eb1b0_0 anaconda
ray 1.13.0 pypi_0 pypi
requests 2.27.1 pyhd3eb1b0_0 anaconda
requests-oauthlib 1.3.1 pypi_0 pypi
rope 0.22.0 pyhd3eb1b0_0 anaconda
rsa 4.8 pypi_0 pypi
scikit-image 0.19.3 pypi_0 pypi
setuptools 62.6.0 pypi_0 pypi
sip 6.5.1 py37hd77b12b_0 anaconda
six 1.16.0 pyhd3eb1b0_1 anaconda
snowballstemmer 2.2.0 pyhd3eb1b0_0 anaconda
soupsieve 2.3.1 pyhd3eb1b0_0 anaconda
sphinx 4.4.0 pyhd3eb1b0_0 anaconda
sphinxcontrib-applehelp 1.0.2 pyhd3eb1b0_0 anaconda
sphinxcontrib-devhelp 1.0.2 pyhd3eb1b0_0 anaconda
sphinxcontrib-htmlhelp 2.0.0 pyhd3eb1b0_0 anaconda
sphinxcontrib-jsmath 1.0.1 pyhd3eb1b0_0 anaconda
sphinxcontrib-qthelp 1.0.3 pyhd3eb1b0_0 anaconda
sphinxcontrib-serializinghtml 1.1.5 pyhd3eb1b0_0 anaconda
spyder 3.3.6 py37_0 anaconda
spyder-kernels 0.5.2 py37_0 anaconda
sqlite 3.38.3 h2bbff1b_0
tabulate 0.8.9 pypi_0 pypi
tensorboard 2.9.1 pypi_0 pypi
tensorboard-data-server 0.6.1 pypi_0 pypi
tensorboard-plugin-wit 1.8.1 pypi_0 pypi
tensorboardx 2.5.1 pypi_0 pypi
tensorflow 2.9.1 pypi_0 pypi
tensorflow-estimator 2.9.0 pypi_0 pypi
tensorflow-io-gcs-filesystem 0.26.0 pypi_0 pypi
termcolor 1.1.0 pypi_0 pypi
testpath 0.5.0 pyhd3eb1b0_0 anaconda
tifffile 2021.11.2 pypi_0 pypi
tk 8.6.12 h2bbff1b_0
toml 0.10.2 pyhd3eb1b0_0 anaconda
torchaudio 0.12.0 py37_cu116 pytorch
torchvision 0.13.0 py37_cu116 pytorch
tornado 6.1 py37h2bbff1b_0 anaconda
traitlets 5.3.0 pypi_0 pypi
typed-ast 1.4.3 py37h2bbff1b_1 anaconda
typing-extensions 3.10.0.0 pypi_0 pypi
typing_extensions 4.1.1 pyh06a4308_0
urllib3 1.26.9 py37haa95532_0 anaconda
vc 14.2 h21ff451_1
virtualenv 20.14.1 pypi_0 pypi
vs2015_runtime 14.27.29016 h5e58377_2
wcwidth 0.2.5 pyhd3eb1b0_0 anaconda
webencodings 0.5.1 py37_1 anaconda
werkzeug 2.1.2 pypi_0 pypi
wheel 0.37.1 pyhd3eb1b0_0
win_inet_pton 1.1.0 py37haa95532_0 anaconda
wincertstore 0.2 py37haa95532_2
wrapt 1.13.3 py37h2bbff1b_2 anaconda
xz 5.2.5 h8cc25b3_1
zipp 3.7.0 pyhd3eb1b0_0 anaconda
zlib 1.2.11 vc14h1cdd9ab_1 [vc14] anaconda
zstd 1.4.9 h19a0ad4_0

I was wondering if someone could be helpful in this regard. I would be happy to share more information if you need.

The other question is: is there any plan in the near future to make a version of warp-drive for the MacBook Pro with Apple M1?

Many thanks in advance!

Error trying to use PyTorch with two GPUs

Hi,
Not sure what is not working here. I followed the implementation of the PyTorch Lightning tutorial. I am trying to use this code to run my training on 2 GPUs (NVIDIA GeForce GTX 1080 Ti).
My configuration is unchanged from the single-GPU setup except for what I found in the PyTorch Lightning tutorial, and I get the following warnings:

/project/MARL_env/lib/python3.8/site-packages/pytorch_lightning/trainer/connectors/data_connector.py:224: 
PossibleUserWarning: The dataloader, train_dataloader, does not have many workers which may be a bottleneck. 
Consider increasing the value of the `num_workers` argument` (try 32 which is the number of cpus on this machine) in the `DataLoader` init to improve performance.
  rank_zero_warn(
/project/MARL_env/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py:1609: PossibleUserWarning: 
The number of training batches (1) is smaller than the logging interval Trainer(log_every_n_steps=10). 
Set a lower value for log_every_n_steps if you want to see logs for the training epoch.
  rank_zero_warn(

and

/project_ghent/MARL_env/lib/python3.8/site-packages/pytorch_lightning/trainer/connectors/logger_connector/result.py:232: 
UserWarning: You called `self.log('VF loss coefficient_prey', ...)` in your `training_step` but the value needs to be floating point. Converting it to torch.float32.

Not sure what parameters I can change to correct those.
These are only warnings and the computation happens, but I am not sure if it is done correctly (results-wise and speed-wise). It seems to me that the speed doesn't increase that much, so I am afraid the second GPU is not used (which seems to be confirmed by my GPU utilization metrics).
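
Not a WarpDrive-specific answer, but the two warnings quoted above are controlled by standard PyTorch Lightning / DataLoader arguments. A sketch, with the caveat that argument names can differ slightly across Lightning versions and that whether multi-GPU actually speeds things up depends on the tutorial's setup:

from pytorch_lightning import Trainer

trainer = Trainer(
    accelerator="gpu",
    devices=2,             # request both GTX 1080 Ti cards; check nvidia-smi during training
    strategy="ddp",        # one process per GPU
    log_every_n_steps=1,   # addresses the "number of training batches (1)" warning
    max_epochs=10,
)
# The num_workers warning comes from the DataLoader itself, e.g.:
# DataLoader(dataset, batch_size=..., num_workers=32)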

Error: Invalid Resource handle.

Hello WarpDrive Team,

A good MARL library indeed. I have tried this library on an old machine and it works fine.

However, when I moved to a new machine, I ran into the following error.

(warp_drive) ***@***-lab-gpu:~/warp-drive-master/warp_drive$ python training/example_training_script.py --env tag_continuous --num_gpus 1 --results_dir ..
We have successfully found 1 GPUs!
Training with 1 GPU(s).
Traceback (most recent call last):
  File "training/example_training_script.py", line 224, in <module>
    setup_trainer_and_train(run_config, results_directory=results_dir)
  File "training/example_training_script.py", line 126, in setup_trainer_and_train
    trainer.train()
  File "/home/mwj/warp-drive-master/warp_drive/training/trainer.py", line 402, in train
    metrics = self._update_model_params(iteration)
  File "/home/mwj/warp-drive-master/warp_drive/training/trainer.py", line 741, in _update_model_params
    loss.backward()
  File "/home/mwj/anaconda3/envs/warp_drive/lib/python3.7/site-packages/torch/_tensor.py", line 363, in backward
    torch.autograd.backward(self, gradient, retain_graph, create_graph, inputs=inputs)
  File "/home/mwj/anaconda3/envs/warp_drive/lib/python3.7/site-packages/torch/autograd/__init__.py", line 175, in backward
    allow_unreachable=True, accumulate_grad=True)  # Calls into the C++ engine to run the backward pass
RuntimeError: Event device type CUDA does not match blocking stream's device type CPU.
Exception ignored in: <function CUDASampler.__del__ at 0x7f86b065e9e0>
Traceback (most recent call last):
  File "/home/mwj/warp-drive-master/warp_drive/managers/function_manager.py", line 637, in __del__
    free(block=self._block, grid=self._grid)
  File "/home/mwj/anaconda3/envs/warp_drive/lib/python3.7/site-packages/pycuda/driver.py", line 480, in function_call
    func._set_block_shape(*block)
pycuda._driver.LogicError: cuFuncSetBlockShape failed: invalid resource handle

And my nvidia-smi command looks like this.

Tue Apr  5 23:10:52 2022       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 510.39.01    Driver Version: 510.39.01    CUDA Version: 11.6     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  NVIDIA GeForce ...  On   | 00000000:01:00.0  On |                  N/A |
| 30%   24C    P8    34W / 350W |    326MiB / 24576MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|    0   N/A  N/A      1268      G   /usr/lib/xorg/Xorg                 35MiB |
|    0   N/A  N/A      1771      G   /usr/lib/xorg/Xorg                144MiB |
|    0   N/A  N/A      1884      G   /usr/bin/gnome-shell               55MiB |
|    0   N/A  N/A      3043      G   gnome-control-center               12MiB |
|    0   N/A  N/A      6784      G   ...792671094337050779,131072       46MiB |
|    0   N/A  N/A     12488      G   ...RendererForSitePerProcess       15MiB |
+-----------------------------------------------------------------------------+

The result of running run_unittest.py looks like this.

(warp_drive) mwj@mwj-lab-gpu:~/warp-drive-master/warp_drive$ python utils/run_unittests.py
Running Unit tests ... 
/home/mwj/warp-drive-master/warp_drive/cuda_includes/../../example_envs/tag_gridworld/tag_gridworld_step.cu(151): warning #2361-D: invalid narrowing conversion from "unsigned int" to "int"

====================================================================================== test session starts =======================================================================================
platform linux -- Python 3.7.12, pytest-6.2.5, py-1.11.0, pluggy-1.0.0
rootdir: /home/mwj/warp-drive-master
collected 13 items                                                                                                                                                                               

../tests/example_envs/test_tag_continuous.py .                                                                                                                                             [  7%]
../tests/example_envs/test_tag_gridworld.py .                                                                                                                                              [ 15%]
../tests/example_envs/test_tag_gridworld_step_cuda.py .                                                                                                                                    [ 23%]
../tests/example_envs/test_tag_gridworld_step_python.py ..                                                                                                                                 [ 38%]
../tests/warp_drive/test_action_sampler.py ...                                                                                                                                             [ 61%]
../tests/warp_drive/test_data_manager.py ...                                                                                                                                               [ 84%]
../tests/warp_drive/test_env_reset.py .                                                                                                                                                    [ 92%]
../tests/warp_drive/test_function_manager.py .                                                                                                                                             [100%]

======================================================================================== warnings summary ========================================================================================
../../anaconda3/envs/warp_drive/lib/python3.7/site-packages/gym/envs/registration.py:250
  /home/mwj/anaconda3/envs/warp_drive/lib/python3.7/site-packages/gym/envs/registration.py:250: DeprecationWarning: SelectableGroups dict interface is deprecated. Use select.
    for plugin in metadata.entry_points().get(entry_point, []):

../../anaconda3/envs/warp_drive/lib/python3.7/site-packages/pycuda/compyte/dtypes.py:120
  /home/mwj/anaconda3/envs/warp_drive/lib/python3.7/site-packages/pycuda/compyte/dtypes.py:120: DeprecationWarning: `np.bool` is a deprecated alias for the builtin `bool`. To silence this warning, use `bool` by itself. Doing this will not modify any behavior and is safe. If you specifically wanted the numpy scalar type, use `np.bool_` here.
  Deprecated in NumPy 1.20; for more details and guidance: https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations
    reg.get_or_register_dtype("bool", np.bool)

-- Docs: https://docs.pytest.org/en/stable/warnings.html
================================================================================= 13 passed, 2 warnings in 5.38s =================================================================================

As the unit tests have passed, I think the cuda version mismatch may not be an issue.

Also, as there are many other environments on this machine, I wonder if there is a solution that changes my environment as little as possible.

So what can I do to fix this issue? Any idea helps.

Many thanks!

GPU requirements

I am testing WarpDrive on a P104-100 (8 GB).
I can successfully run tag_gridworld env with the test script provided in this repo:
example_training_script_pycuda.py

but tag_continuous:

python example_training_script_pycuda.py -e tag_continuous

it gives me an out-of-memory error:

RuntimeError: CUDA out of memory. Tried to allocate 2.38 GiB (GPU 0; 7.93 GiB total capacity; 5.56 GiB already allocated; 1.50 GiB free; 5.58 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF

So what are the minimum VRAM requirements for most envs?
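
I don't know the official minimum, but the memory footprint of tag_continuous is driven mainly by the number of environment replicas and agents in the run configuration. A rough sketch of shrinking it before training; the YAML path is an assumption, the key names follow the test snippets quoted elsewhere in these issues, and the concrete values are just guesses rather than recommended settings:

import yaml

with open("warp_drive/training/run_configs/tag_continuous.yaml", "r") as f:
    run_config = yaml.safe_load(f)

run_config["trainer"]["num_envs"] = 20   # fewer parallel environment replicas
run_config["env"]["num_runners"] = 20    # fewer agents per environment
run_config["env"]["num_taggers"] = 2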

support gymnasium >= 0.26?

Thanks for sharing this impressive resource! I see the config currently pins the installation to versions before the breaking changes introduced in the gymnasium package's 0.26 release. Are there plans to migrate to supporting the latest version of the gym/gymnasium package? You are probably already familiar with the official migration guide.
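
For anyone landing here, the headline breaking change in gym/gymnasium 0.26 is the step/reset API. This is unrelated to WarpDrive internals, but it shows what a migration would have to absorb:

import gymnasium

env = gymnasium.make("CartPole-v1")
obs, info = env.reset(seed=0)  # reset now returns (obs, info)
obs, reward, terminated, truncated, info = env.step(env.action_space.sample())
done = terminated or truncated  # the old single `done` flag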

ERROR:root:module 'warp_drive.numba_includes.env_runner' has no attribute 'NumbaCustomEnvStep'

Hi, thanks for your help setting this up.

Following the same tutorial and these examples, I have set up the CPU env in custom_env.py (where the CustomEnv class is) and the Numba env in custom_env_step_numba.py (where the NumbaCustomEnvStep function is).

The environment I register is the first one, custom_env.py, this way:

env_registrar.add_cuda_env_src_path(CustomEnv.name, "custom_env", env_backend="numba")

So I have been able to load my CPU environment without any problem.

env_wrapper = EnvWrapper(
    env_obj=CustomEnv(**run_config["env"]),
    env_name=CustomEnv.name,
    num_envs=run_config["trainer"]["num_envs"],
    env_backend="cpu",
    env_registrar=env_registrar
)

However, when I try to set env_backend="numba", I get the following error: ERROR:root:module 'warp_drive.numba_includes.env_runner' has no attribute 'NumbaCustomEnvStep'

Not sure where this error is coming from. warp_drive should have found the NumbaCustomEnvStep function in custom_env_step_numba.py, but it obviously did not.

What am I missing again?
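
A guess, based on the error message: the Numba backend seems to look up the step function in the module registered via add_cuda_env_src_path, so registering the module that actually defines NumbaCustomEnvStep (rather than the CPU-side custom_env module) might resolve it:

env_registrar.add_cuda_env_src_path(
    CustomEnv.name, "custom_env_step_numba", env_backend="numba"
)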

Support for WSL 2

Has anyone been able to install warp-drive on WSL 2? I am not able to install pycuda.

[Tutorial 5] 'Trainer' object has no attribute 'cuda_sample_controller'

Dear WarpDrive team,

In tutorial 5, for this particular line of code:

anim = generate_tag_env_rollout_animation(trainer)

I have encountered an error:

AttributeError: 'Trainer' object has no attribute 'cuda_sample_controller'

I have also tried to access this particular attribute in a separate cell but to no avail.

Like the previous issue regarding Tutorial 2, is there a solution for this?

Thank you and I look forward to your help!

Ryan :)

[Tutorial 2 + 3] Error when loading test_build.fatbin file in tutorials (No kernel image is available for execution on the device)

Dear WarpDrive team,

I have come across an error that is consistent across tutorials 2 and 3.

The error occurs when this particular line of code is run:

cuda_function_manager.load_cuda_from_binary_file(f"{_CUBIN_FILEPATH}/test_build.fatbin")

and the error that pops up is:

RuntimeError: cuModuleLoad failed: no kernel image is available for execution on the device

Is there a solution or a work around for this?

I am trying to learn WarpDrive in order to implement a MARL path-finding scenario for possible use in applications (e.g., warehouses).

Thank you and I appreciate your prompt reply in this :)

Correct way to wrap a gymnasium environment.

I couldn't find a tutorial about how to wrap a gymnasium environment.
I want to do something like this:

# Import the PPO algorithm from Stable Baselines 3
from stable_baselines3 import PPO

# Import the gymnasium module
import gymnasium

# Import the EnvWrapper class from WarpDrive
from warp_drive.utils.env_wrapper import EnvWrapper

# Define the number of parallel environments
n_envs = 256

# Choose an environment from Gymnasium
env_name = 'LunarLander-v2'

# Create a list of environment constructors with custom arguments
envs = [lambda: gymnasium.make(env_name, env_kwargs={
    "continuous": False,
    "gravity": -10.0,
    "enable_wind": False,
    "wind_power": 15.0,
    "turbulence_power": 1.5,
}) for _ in range(n_envs)]

# Create a wrapped environment object via the EnvWrapper
env_wrapper = EnvWrapper(envs, num_envs=n_envs, env_backend='pycuda')

model = PPO('MlpPolicy', env_wrapper)
model.learn(total_timesteps=1000000)
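
For comparison, the EnvWrapper calls shown elsewhere in these issues take a single WarpDrive-style environment object with a registered GPU step function, rather than a list of gym constructors. A sketch in that style, where CustomEnv and run_config are placeholders for an environment written against WarpDrive's API:

from warp_drive.utils.env_registrar import EnvironmentRegistrar
from warp_drive.utils.env_wrapper import EnvWrapper

env_registrar = EnvironmentRegistrar()
# ... register CustomEnv's CUDA or Numba step source with the registrar here ...

env_wrapper = EnvWrapper(
    env_obj=CustomEnv(**run_config["env"]),
    env_name=CustomEnv.name,
    num_envs=256,
    env_backend="pycuda",
    env_registrar=env_registrar,
)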

Unique env with mixed # of threads/block and chained CUDA kernels. Is Warp-Drive appropriate?

Hello! I have a weird environment which I am having difficulty implementing in warp-drive. Essentially, the environment has N agents place M units on their own boards. After all agents are done placing units, the boards are matched against each other and intensive computations are performed to determine per-agent rewards.

I was thinking I could have a CUDA Step function with N agents (threads) per environment (1 block per env) which would handle the overall state/action. When the agents are done performing actions, a CUDA BoardStep function with M units (threads) per board (1 block per board) would run, fed the mapped state -> board_state input (the mapping would be done by a separate CUDA function).
I am essentially attempting the below:

step():  # 4 agents per env
    CudaEnvStep(_state_, _action_, _done_)  # 4 agents per block
    if (_done_ && !board_done):
        CudaMapEnvToBoard(_state_, board_state)
    while (_done_ && !board_done):
        CudaBoardStep(board_state, board_done, board_reward)   # 24 units per block
    if (_done_ && board_done):
        CudaCombineRewards(board_reward, _reward_)   # 4 agents per block again

I have implemented CudaBoardStep(). I am not sure whether Warp-Drive's Trainer can handle multiple CUDAFunctionManagers with different threads per block, and whether this impacts Warp-Drive's performance. Looking at the example environments, I do not see a mixed-thread or chained-CUDA-kernel environment.

Questions:

  1. Does warp-drive support chained CUDA kernels? Can I make every operation in my step a separate CUDA kernel if necessary and warp-drive will chain them together similar to CUDA Graphs?
  2. Can I have CUDA functions with a different # of threads per block (aka different # of "agents" per environment) mixed within a step() without expecting a significant performance loss?
  3. Would branch/loop operations like if/while run on GPU? I am not sure if the if/while operations are running within PyTorch GPU context or not.

How to implement CTDE-based MARL algorithms on the platform?

How can one implement joint-learning-based MARL algorithms (e.g., MAPPO, QMIX, etc.) rather than independent-learning-based algorithms (such as the PPO implemented in the paper) on warp-drive? Do you have plans to provide some tutorials about this?
Thanks a lot~

fail install package with GPU

hello, not sure what I am doing wrong, but I was not able to install this package in a Linux environment... https://dev.azure.com/Lightning-AI/Tutorials/_build/results?buildId=209848&view=logs&j=39b3c594-f55d-5bf7-f035-2947d37d58a7&t=76638fff-2d3c-56ee-85c7-984ea34044a4

pip install torchvision rl-warp-drive==2.5 matplotlib 'pytorch-lightning>=2.0, <2.1.0' torchtext 'torchmetrics>=1.0, <1.3' ffmpeg-python 'numpy <2.0' 'torch>=1.8.1, <2.1.0' --quiet --extra-index-url=https://download.pytorch.org/whl/cu118
  error: subprocess-exited-with-error
  
  × Building wheel for pycuda (pyproject.toml) did not run successfully.
  │ exit code: 1
  ╰─> [12328 lines of output]
      ***************************************************************
      *** WARNING: nvcc not in path.
      *** May need to set CUDA_INC_DIR for installation to succeed.
      ***************************************************************
      *************************************************************
      *** I have detected that you have not run configure.py.
      *************************************************************
      *** Additionally, no global config files were found.
      *** I will go ahead with the default configuration.
      *** In all likelihood, this will not work out.
      ***
      *** See README_SETUP.txt for more information.
      ***
      *** If the build does fail, just re-run configure.py with the
      *** correct arguments, and then retry. Good luck!
      *************************************************************
      *** HIT Ctrl-C NOW IF THIS IS NOT WHAT YOU WANT
      *************************************************************
      Continuing in 10 seconds...
      Continuing in 9 seconds...
...

and many logs ending with:

      In file included from src/cpp/cuda.cpp:4:
      src/cpp/cuda.hpp:14:10: fatal error: cuda.h: No such file or directory
         14 | #include <cuda.h>
            |          ^~~~~~~~
      compilation terminated.
      error: command '/usr/bin/x86_64-linux-gnu-gcc' failed with exit code 1
      [end of output]
  
  note: This error originates from a subprocess, and is likely not a problem with pip.
  ERROR: Failed building wheel for pycuda
ERROR: ERROR: Failed to build installable wheels for some pyproject.toml based projects (pycuda)

Some question about the environment

I think the idea of environment scheduling is very novel. Multiple environments and multiple agents are scheduled on the GPU, which improves the GPU utilization ratio.
I have some questions about tag-continuous:

  • Does "continuous" represent a continuous action space? As far as I can see, the action space of tag-continuous is actually discrete.
  • Is there any example of the PPO algorithm with tag or gridworld?

Using a custom policy model

Hello !

I was wondering how to save and use a new policy template. How to do this for an environment is well explained in the tutorials; however, I haven't found a way to do something similar with a custom policy.

For instance, if I want to create an alternative to fully_connected.py in my project, I would like to create a custom_connected.py file with

class CustomConnected(nn.Module):
    blablabla

How do I use this?
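
A minimal sketch of what custom_connected.py could contain. The constructor and forward signatures below are invented for illustration; to plug into WarpDrive's trainer, the class would have to mirror whatever interface fully_connected.py actually exposes, and the trainer config would presumably need to point at this module/class instead of fully_connected:

import torch.nn as nn

class CustomConnected(nn.Module):
    def __init__(self, obs_dim, num_actions, hidden_dim=64):
        super().__init__()
        self.body = nn.Sequential(
            nn.Linear(obs_dim, hidden_dim), nn.ReLU(),
            nn.Linear(hidden_dim, hidden_dim), nn.ReLU(),
        )
        self.policy_head = nn.Linear(hidden_dim, num_actions)  # action logits
        self.value_head = nn.Linear(hidden_dim, 1)              # state-value estimate

    def forward(self, obs):
        h = self.body(obs)
        return self.policy_head(h), self.value_head(h)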

The syntax of the command is incorrect.

Dear WarpDrive team,

I have run into another issue, this time regarding testing the installation of WarpDrive via pip (rl-warp-drive).

I also git-cloned the repository onto my local machine.

I am running on Windows 11, Python 3.8.10, Pytorch 1.9.0+cu111

Running the python warp_drive/utils/run_unittests.py command in Visual Studio Code gives the error:

The syntax of the command is incorrect.
Traceback (most recent call last):
  File "warp_drive/utils/run_unittests.py", line 21, in <module>
    cuda_function_manager.compile(main_file, cubin_file)
  File "C:\Users\ryanl\AppData\Local\Programs\Python\Python38\lib\site-packages\warp_drive\managers\function_manager.py", line 262, in _compile
    raise Exception("make bin file failed ... ")
Exception: make bin file failed ...

On the other hand running the python .\run_trainer_tests.py command gives the error:

=============================================================================== FAILURES =============================================================================== 
_______________________________________________________________ MyTestCase.test_tag_continuous_training ________________________________________________________________ 

self = <tests.wd_training.test_env_training.MyTestCase testMethod=test_tag_continuous_training>

    def test_tag_continuous_training(self):
        run_config = self.get_config("tag_continuous")
        try:
            run_config["env"]["num_taggers"] = 2
            run_config["env"]["num_runners"] = 10
>           launch_process(
                setup_trainer_and_train,
                kwargs={"run_configuration": run_config, "verbose": False},
            )

C:\Users\ryanl\AppData\Local\Programs\Python\Python38\lib\site-packages\tests\wd_training\test_env_training.py:66:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _  

func = <function setup_trainer_and_train at 0x00000294A9BB6700>
kwargs = {'run_configuration': {'env': {'edge_hit_penalty': 0.0, 'end_of_game_reward_for_runner': 1.0, 'episode_length': 500, '.../tmp', 'metrics_log_freq': 100, 'model_params_save_freq': 5000, 'name': 'tag_continuous', ...}, ...}, 'verbose': False}

    def launch_process(func, kwargs):
        """
        Run a Python function on a separate process.
        """
        p = ProcessWrapper(target=func, kwargs=kwargs)
        p.start()
        p.join()
        if p.exception:
>           raise p.exception
E           Exception: make bin file failed ...

C:\Users\ryanl\AppData\Local\Programs\Python\Python38\lib\site-packages\tests\wd_training\test_env_training.py:28: Exception
------------------------------------------------------------------------- Captured stdout call ------------------------------------------------------------------------- 
The syntax of the command is incorrect.
------------------------------------------------------------------------- Captured stderr call ------------------------------------------------------------------------- 
C:\Users\ryanl\AppData\Local\Programs\Python\Python38\lib\site-packages\gym\utils\seeding.py:41: DeprecationWarning: WARN: Function `rng.rand(*size)` is marked as deprecated and will be removed in the future. Please use `Generator.random(size)` instead.
  deprecation(
ERROR:root:make bin file failed ...
________________________________________________________________ MyTestCase.test_tag_gridworld_training ________________________________________________________________ 

self = <tests.wd_training.test_env_training.MyTestCase testMethod=test_tag_gridworld_training>

    def test_tag_gridworld_training(self):
        run_config = self.get_config("tag_gridworld")
        try:
>           launch_process(
                setup_trainer_and_train,
                kwargs={"run_configuration": run_config, "verbose": False},
        p.join()
        if p.exception:
>           raise p.exception
E           Exception: make bin file failed ...

C:\Users\ryanl\AppData\Local\Programs\Python\Python38\lib\site-packages\tests\wd_training\test_env_training.py:28: Exception
------------------------------------------------------------------------- Captured stdout call ------------------------------------------------------------------------- The syntax of the command is incorrect.
------------------------------------------------------------------------- Captured stderr call ------------------------------------------------------------------------- 
ERROR:root:make bin file failed ...
=========================================================================== warnings summary =========================================================================== 
..\..\..\..\..\..\AppData\Local\Programs\Python\Python38\lib\site-packages\pycuda\compyte\dtypes.py:120
  C:\Users\ryanl\AppData\Local\Programs\Python\Python38\lib\site-packages\pycuda\compyte\dtypes.py:120: DeprecationWarning: `np.bool` is a deprecated alias for the builtin `bool`. To silence this warning, use `bool` by itself. Doing this will not modify any behavior and is safe. If you specifically wanted the numpy scalar type, use `np.bool_` here.
  Deprecated in NumPy 1.20; for more details and guidance: https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations
    reg.get_or_register_dtype("bool", np.bool)

AppData/Local/Programs/Python/Python38/lib/site-packages/tests/wd_training/test_env_training.py::MyTestCase::test_tag_gridworld_training_with_multiple_devices
  C:\Users\ryanl\AppData\Local\Programs\Python\Python38\lib\site-packages\tests\wd_training\test_env_training.py:75: UserWarning: Only single GPU is detected, we skip trainer test for multiple devices
    warnings.warn("Only single GPU is detected, we skip trainer test for multiple devices")

-- Docs: https://docs.pytest.org/en/stable/how-to/capture-warnings.html
======================================================================= short test summary info ======================================================================== 
FAILED ..\..\..\..\..\..\AppData\Local\Programs\Python\Python38\lib\site-packages\tests\wd_training\test_env_training.py::MyTestCase::test_tag_continuous_training - E...
FAILED ..\..\..\..\..\..\AppData\Local\Programs\Python\Python38\lib\site-packages\tests\wd_training\test_env_training.py::MyTestCase::test_tag_gridworld_training - Ex...
=============================================================== 2 failed, 1 passed, 2 warnings in 11.18s ===============================================================

I was not able to deduce the issue behind this error.

Also, could I find out whether it is better to run it in a Linux environment or a Windows environment?

I appreciate any help I can get on this!

Thank you :)

Warp Drive PyCuda Error

I am currently running a training script using warp-drive.

I have my environment initialized in this dockerfile.

When running my training_script, I get the following error:

python training_script.py --env simple_wood_and_stone

Inside training_script.py: 1 GPUs are available.
Inside env_wrapper.py: 1 GPUs are available.
/home/miniconda/lib/python3.7/site-packages/torch/cuda/__init__.py:120: UserWarning:
    Found GPU%d %s which is of cuda capability %d.%d.
    PyTorch no longer supports this GPU because it is too old.
    The minimum cuda capability supported by this library is %d.%d.

  warnings.warn(old_gpu_warn.format(d, name, major, minor, min_arch // 10, min_arch % 10))
Initializing the CUDA data manager...
Initializing the CUDA function manager...
WARNING:root:the destination header file /home/miniconda/lib/python3.7/site-packages/warp_drive/cuda_includes/env_config.h already exists; remove and rebuild.
WARNING:root:the destination runner file /home/miniconda/lib/python3.7/site-packages/warp_drive/cuda_includes/env_runner.cu already exists; remove and rebuild.
Traceback (most recent call last):
  File "training_script.py", line 109, in <module>
    customized_env_registrar=env_registry,
  File "/home/miniconda/lib/python3.7/site-packages/ai_economist/foundation/env_wrapper.py", line 208, in __init__
    self.cuda_function_manager.initialize_functions([step_function])
  File "/home/miniconda/lib/python3.7/site-packages/warp_drive/managers/function_manager.py", line 330, in initialize_functions
    self._cuda_functions[fname] = self._CUDA_module.get_function(fname)
pycuda._driver.LogicError: cuModuleGetFunction failed: named symbol not found

I was wondering if someone has run into this before or has any idea how to fix it?

Confusion about environment reset on device and save_and_apply_at_reset

Hi! I would like to ask about resetting an environment that is running on the GPU. From what I have read, if the GPU backend is used, the environment is only reset once (via the reset() function we define for our environment); subsequently all data is pushed onto the device, and the reset() function is not called anymore, as warp_drive uses the data that is save_and_apply_at_reset to automatically initialize the environment.

If my understanding is right, then how should we deal with environment attributes that are not save_and_apply_at_reset, but need to be re-calculated or re-generated at every reset? For example, I need to randomly generate some environment attributes (like a 2D NumPy array representing random city terrain) at every reset, and since this generation is called in reset() of my customized environment, it seems that I am stuck with the initially generated values throughout the entire training process.

Thanks a lot in advance!
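
For what it's worth, the flag that appears in the unit tests quoted further down in this thread is save_copy_and_apply_at_reset: the array pushed to the device is snapshotted once and restored verbatim on every reset. A sketch of how such an attribute gets pushed (the import path is assumed), which also illustrates why a re-randomization written only in the Python reset() never runs again once everything lives on the GPU:

import numpy as np
from warp_drive.utils.data_feed import DataFeed

num_envs = 2
data_feed = DataFeed()
data_feed.add_data(
    name="city_terrain",
    data=np.random.randn(num_envs, 100, 100).astype(np.float32),
    save_copy_and_apply_at_reset=True,  # restored (not re-generated) at each reset
)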

AssertionError: the customized environment is expected to be a valid PYTHONPATH

Hi again,

I am trying to register a custom environment with warp_drive using:

from warp_drive.utils.env_registrar import EnvironmentRegistrar
from stage import Environment

env_registrar = EnvironmentRegistrar()
env_registrar.add_cuda_env_src_path(Environment.name, os.path.abspath("stage.py"), env_backend="numba")

However, I get the following error: AssertionError: the customized environment is expected to be a valid PYTHONPATH

As the documentation for env_registrar specifies (:param cuda_env_src_path: ABSOLUTE path to the customized environment source code in CUDA), os.path.abspath("file_name") is supposed to give me this absolute path.

However, this error is raised by these lines of code in env_registrar:

        elif env_backend == "numba":
            assert (
                "/" not in cuda_env_src_path
            ), "the customized environment is expected to be a valid PYTHONPATH"

But isn't that contradictory with the documentation asking for an absolute path?
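
Given that assertion, the Numba backend apparently expects an importable (dotted) Python module path rather than a filesystem path, so presumably something like this is what the registrar wants here (assuming stage.py is on the PYTHONPATH):

env_registrar.add_cuda_env_src_path(Environment.name, "stage", env_backend="numba")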

Failure when running test script

Hi! I encountered a failure when running the test script, with the following command:
python utils/unittests/run_unittests_pycuda.py
Note that I am running this inside the ~/warp-drive/warp_drive directory.
The test results are attached at the bottom. As you can see, there is one failure when running the TestEnvironmentReset.test_reset_for_different_dim function. May I know how this error affects the package? Or how should I troubleshoot this? I would be using WarpDrive to develop my customized multi-agent RL environment to run on a single GPU device.

I cloned and installed WarpDrive from GitHub, inside a separate environment in miniconda. The OS I am using is Ubuntu 22.04, and I have the NVIDIA driver and CUDA toolkit installed. I have tested torch.cuda.is_available() in this separate environment, and it returns True.

As for the other two test commands, they pass without errors, just with some deprecation warnings and skipped tests for multiple GPUs (which is fine, since I only have a single GPU).

This is my first time opening an issue, and many thanks for your assistance! I would be happy to provide more information when requested. I have also attached the packages installed inside the same environment, after the test result.

Running Unit tests: pytest /home/hibiki/miniconda3/envs/warp_drive/lib/python3.7/site-packages/tests/warp_drive/pycuda_tests 
============================= test session starts ==============================
platform linux -- Python 3.7.16, pytest-7.3.1, pluggy-1.0.0
rootdir: /home/hibiki
collected 8 items                                                              

../../miniconda3/envs/warp_drive/lib/python3.7/site-packages/tests/warp_drive/pycuda_tests/test_action_sampler.py . [ 12%]
..                                                                       [ 37%]
../../miniconda3/envs/warp_drive/lib/python3.7/site-packages/tests/warp_drive/pycuda_tests/test_data_manager.py . [ 50%]
..                                                                       [ 75%]
../../miniconda3/envs/warp_drive/lib/python3.7/site-packages/tests/warp_drive/pycuda_tests/test_env_reset.py . [ 87%]
                                                                         [ 87%]
../../miniconda3/envs/warp_drive/lib/python3.7/site-packages/tests/warp_drive/pycuda_tests/test_function_manager.py . [100%]

============================== 8 passed in 5.07s ===============================
Running Unit tests: pytest /home/hibiki/miniconda3/envs/warp_drive/lib/python3.7/site-packages/tests/example_envs/pycuda_tests 
============================= test session starts ==============================
platform linux -- Python 3.7.16, pytest-7.3.1, pluggy-1.0.0
rootdir: /home/hibiki
collected 5 items                                                              

../../miniconda3/envs/warp_drive/lib/python3.7/site-packages/tests/example_envs/pycuda_tests/test_tag_continuous.py . [ 20%]
                                                                         [ 20%]
../../miniconda3/envs/warp_drive/lib/python3.7/site-packages/tests/example_envs/pycuda_tests/test_tag_gridworld.py . [ 40%]
                                                                         [ 40%]
../../miniconda3/envs/warp_drive/lib/python3.7/site-packages/tests/example_envs/pycuda_tests/test_tag_gridworld_step_cuda.py . [ 60%]
                                                                         [ 60%]
../../miniconda3/envs/warp_drive/lib/python3.7/site-packages/tests/example_envs/pycuda_tests/test_tag_gridworld_step_python.py . [ 80%]
.                                                                        [100%]

=============================== warnings summary ===============================
miniconda3/envs/warp_drive/lib/python3.7/site-packages/tests/example_envs/pycuda_tests/test_tag_continuous.py: 24 warnings
  /home/hibiki/miniconda3/envs/warp_drive/lib/python3.7/site-packages/gym/utils/seeding.py:42: DeprecationWarning: WARN: Function `rng.rand(*size)` is marked as deprecated and will be removed in the future. Please use `Generator.random(size)` instead.
    "Function `rng.rand(*size)` is marked as deprecated "

miniconda3/envs/warp_drive/lib/python3.7/site-packages/tests/example_envs/pycuda_tests/test_tag_continuous.py: 8000 warnings
  /home/hibiki/miniconda3/envs/warp_drive/lib/python3.7/site-packages/gym/utils/seeding.py:64: DeprecationWarning: WARN: Function `rng.randint(low, [high, size, dtype])` is marked as deprecated and will be removed in the future. Please use `rng.integers(low, [high, size, dtype])` instead.
    "Function `rng.randint(low, [high, size, dtype])` is marked as deprecated "

-- Docs: https://docs.pytest.org/en/stable/how-to/capture-warnings.html
======================= 5 passed, 8024 warnings in 6.67s =======================
/home/hibiki/miniconda3/envs/warp_drive/lib/python3.7/site-packages/warp_drive/cuda_includes/../../example_envs/tag_gridworld/tag_gridworld_step_pycuda.cu(151): warning #2361-D: invalid narrowing conversion from "unsigned int" to "int"
      int global_state_arr_shape[] = {gridDim.x, wkNumberAgents};
                                      ^

Remark: The warnings can be suppressed with "-diag-suppress <warning-number>"

Running Unit tests: pytest /home/hibiki/miniconda3/envs/warp_drive/lib/python3.7/site-packages/tests/multiblocks_per_env/warp_drive/pycuda_tests 
============================= test session starts ==============================
platform linux -- Python 3.7.16, pytest-7.3.1, pluggy-1.0.0
rootdir: /home/hibiki
collected 5 items                                                              

../../miniconda3/envs/warp_drive/lib/python3.7/site-packages/tests/multiblocks_per_env/warp_drive/pycuda_tests/test_action_sampler_multiblocks.py . [ 20%]
..                                                                       [ 60%]
../../miniconda3/envs/warp_drive/lib/python3.7/site-packages/tests/multiblocks_per_env/warp_drive/pycuda_tests/test_env_reset_multiblocks.py F [ 80%]
                                                                         [ 80%]
../../miniconda3/envs/warp_drive/lib/python3.7/site-packages/tests/multiblocks_per_env/warp_drive/pycuda_tests/test_function_manager_multiblocks.py . [100%]

=================================== FAILURES ===================================
______________ TestEnvironmentReset.test_reset_for_different_dim _______________

self = <tests.multiblocks_per_env.warp_drive.pycuda_tests.test_env_reset_multiblocks.TestEnvironmentReset testMethod=test_reset_for_different_dim>

    def test_reset_for_different_dim(self):
    
        self.dm.data_on_device_via_torch("_done_")[:] = torch.from_numpy(
            np.array([1, 0])
        ).cuda()
    
        done = self.dm.pull_data_from_device("_done_")
        self.assertSequenceEqual(list(done), [1, 0])
    
        data_feed = DataFeed()
        data_feed.add_data(
            name="a", data=np.random.randn(2, 10, 3), save_copy_and_apply_at_reset=True
        )
        data_feed.add_data(
            name="b", data=np.random.randn(2, 10), save_copy_and_apply_at_reset=True
        )
        data_feed.add_data(
            name="c", data=np.random.randn(2), save_copy_and_apply_at_reset=True
        )
        data_feed.add_data(
            name="d",
            data=np.random.randint(10, size=(2, 10, 3), dtype=np.int32),
            save_copy_and_apply_at_reset=True,
        )
        data_feed.add_data(
            name="e",
            data=np.random.randint(10, size=(2, 10), dtype=np.int32),
            save_copy_and_apply_at_reset=True,
        )
        data_feed.add_data(
            name="f",
            data=np.random.randint(10, size=2, dtype=np.int32),
            save_copy_and_apply_at_reset=True,
        )
    
        self.dm.push_data_to_device(data_feed)
    
        torch_data_feed = DataFeed()
        torch_data_feed.add_data(
            name="at", data=np.random.randn(2, 10, 3), save_copy_and_apply_at_reset=True
        )
        torch_data_feed.add_data(
            name="bt", data=np.random.randn(2, 10), save_copy_and_apply_at_reset=True
        )
        torch_data_feed.add_data(
            name="ct", data=np.random.randn(2), save_copy_and_apply_at_reset=True
        )
        torch_data_feed.add_data(
            name="dt",
            data=np.random.randint(10, size=(2, 10, 3), dtype=np.int32),
            save_copy_and_apply_at_reset=True,
        )
        torch_data_feed.add_data(
            name="et",
            data=np.random.randint(10, size=(2, 10), dtype=np.int32),
            save_copy_and_apply_at_reset=True,
        )
        torch_data_feed.add_data(
            name="ft",
            data=np.random.randint(10, size=2, dtype=np.int32),
            save_copy_and_apply_at_reset=True,
        )
        self.dm.push_data_to_device(torch_data_feed, torch_accessible=True)
    
        a = self.dm.pull_data_from_device("a")
        b = self.dm.pull_data_from_device("b")
        c = self.dm.pull_data_from_device("c")
        d = self.dm.pull_data_from_device("d")
        e = self.dm.pull_data_from_device("e")
        f = self.dm.pull_data_from_device("f")
        at = self.dm.pull_data_from_device("at")
        bt = self.dm.pull_data_from_device("bt")
        ct = self.dm.pull_data_from_device("ct")
        dt = self.dm.pull_data_from_device("dt")
        et = self.dm.pull_data_from_device("et")
        ft = self.dm.pull_data_from_device("ft")
    
        # change the value in place
        self.dm.data_on_device_via_torch("at")[:] = torch.rand(2, 10, 3).cuda()
        self.dm.data_on_device_via_torch("bt")[:] = torch.rand(2, 10).cuda()
        self.dm.data_on_device_via_torch("ct")[:] = torch.rand(2).cuda()
        self.dm.data_on_device_via_torch("dt")[:] = torch.randint(
            10, size=(2, 10, 3)
        ).cuda()
        self.dm.data_on_device_via_torch("et")[:] = torch.randint(
            10, size=(2, 10)
        ).cuda()
        self.dm.data_on_device_via_torch("ft")[:] = torch.randint(10, size=(2,)).cuda()
    
        self.resetter.reset_when_done(self.dm)
    
        a_after_reset = self.dm.pull_data_from_device("a")
        b_after_reset = self.dm.pull_data_from_device("b")
        c_after_reset = self.dm.pull_data_from_device("c")
        d_after_reset = self.dm.pull_data_from_device("d")
        e_after_reset = self.dm.pull_data_from_device("e")
        f_after_reset = self.dm.pull_data_from_device("f")
    
        at_after_reset = self.dm.pull_data_from_device("at")
        bt_after_reset = self.dm.pull_data_from_device("bt")
        ct_after_reset = self.dm.pull_data_from_device("ct")
        dt_after_reset = self.dm.pull_data_from_device("dt")
        et_after_reset = self.dm.pull_data_from_device("et")
        ft_after_reset = self.dm.pull_data_from_device("ft")
    
        self.assertTrue(np.absolute((a - a_after_reset).mean()) < 1e-5)
        self.assertTrue(np.absolute((b - b_after_reset).mean()) < 1e-5)
        self.assertTrue(np.absolute((c - c_after_reset).mean()) < 1e-5)
        self.assertTrue(np.count_nonzero(d - d_after_reset) == 0)
        self.assertTrue(np.count_nonzero(e - e_after_reset) == 0)
        self.assertTrue(np.count_nonzero(f - f_after_reset) == 0)
    
        # so after the soft reset, only env_0 got reset because it has done flag on
        self.assertTrue(np.absolute((at - at_after_reset)[0].mean()) < 1e-5)
        self.assertTrue(np.absolute((bt - bt_after_reset)[0].mean()) < 1e-5)
        self.assertTrue(np.absolute((ct - ct_after_reset)[0].mean()) < 1e-5)
        self.assertTrue(np.absolute((at - at_after_reset)[1].mean()) > 1e-5)
        self.assertTrue(np.absolute((bt - bt_after_reset)[1].mean()) > 1e-5)
        self.assertTrue(np.absolute((ct - ct_after_reset)[1].mean()) > 1e-5)
        self.assertTrue(np.count_nonzero((dt - dt_after_reset)[0]) == 0)
        self.assertTrue(np.count_nonzero((et - et_after_reset)[0]) == 0)
        self.assertTrue(np.count_nonzero((ft - ft_after_reset)[0]) == 0)
        self.assertTrue(np.count_nonzero((dt - dt_after_reset)[1]) > 0)
        self.assertTrue(np.count_nonzero((et - et_after_reset)[1]) > 0)
        self.assertTrue(np.count_nonzero((ft - ft_after_reset)[1]) >= 0)
    
        done = self.dm.pull_data_from_device("_done_")
        self.assertSequenceEqual(list(done), [0, 0])
    
        # Now test if mode="force_reset" works
    
        torch_data_feed2 = DataFeed()
        torch_data_feed2.add_data(
            name="af", data=np.random.randn(2, 10, 3), save_copy_and_apply_at_reset=True
        )
        torch_data_feed2.add_data(
            name="bf", data=np.random.randn(2, 10), save_copy_and_apply_at_reset=True
        )
        torch_data_feed2.add_data(
            name="cf", data=np.random.randn(2), save_copy_and_apply_at_reset=True
        )
        torch_data_feed2.add_data(
            name="df",
            data=np.random.randint(10, size=(2, 10, 3), dtype=np.int32),
            save_copy_and_apply_at_reset=True,
        )
        torch_data_feed2.add_data(
            name="ef",
            data=np.random.randint(10, size=(2, 10), dtype=np.int32),
            save_copy_and_apply_at_reset=True,
        )
        torch_data_feed2.add_data(
            name="ff",
            data=np.random.randint(10, size=2, dtype=np.int32),
            save_copy_and_apply_at_reset=True,
        )
        self.dm.push_data_to_device(torch_data_feed2, torch_accessible=True)
    
        af = self.dm.pull_data_from_device("af")
        bf = self.dm.pull_data_from_device("bf")
        cf = self.dm.pull_data_from_device("cf")
        df = self.dm.pull_data_from_device("df")
        ef = self.dm.pull_data_from_device("ef")
        ff = self.dm.pull_data_from_device("ff")
    
        # change the value in place
        self.dm.data_on_device_via_torch("af")[:] = torch.rand(2, 10, 3).cuda()
        self.dm.data_on_device_via_torch("bf")[:] = torch.rand(2, 10).cuda()
        self.dm.data_on_device_via_torch("cf")[:] = torch.rand(2).cuda()
        self.dm.data_on_device_via_torch("df")[:] = torch.randint(
            10, size=(2, 10, 3)
        ).cuda()
        self.dm.data_on_device_via_torch("ef")[:] = torch.randint(
            10, size=(2, 10)
        ).cuda()
        self.dm.data_on_device_via_torch("ff")[:] = torch.randint(10, size=(2,)).cuda()
    
        self.resetter.reset_when_done(self.dm)
    
        af_after_soft_reset = self.dm.pull_data_from_device("af")
        bf_after_soft_reset = self.dm.pull_data_from_device("bf")
        cf_after_soft_reset = self.dm.pull_data_from_device("cf")
        df_after_soft_reset = self.dm.pull_data_from_device("df")
        ef_after_soft_reset = self.dm.pull_data_from_device("ef")
        ff_after_soft_reset = self.dm.pull_data_from_device("ff")
    
        self.assertTrue(np.absolute((af - af_after_soft_reset).mean()) > 1e-5)
        self.assertTrue(np.absolute((bf - bf_after_soft_reset).mean()) > 1e-5)
        self.assertTrue(np.absolute((cf - cf_after_soft_reset).mean()) > 1e-5)
        self.assertTrue(np.count_nonzero(df - df_after_soft_reset) > 0)
        self.assertTrue(np.count_nonzero(ef - ef_after_soft_reset) > 0)
>       self.assertTrue(np.count_nonzero(ff - ff_after_soft_reset) > 0)
E       AssertionError: False is not true

../../miniconda3/envs/warp_drive/lib/python3.7/site-packages/tests/multiblocks_per_env/warp_drive/pycuda_tests/test_env_reset_multiblocks.py:236: AssertionError
------------------------------ Captured log setup ------------------------------
WARNING  root:function_manager.py:57 
                `num_agents` cannot be divisible by `blocks_per_env`.
                Therefore, the running threads for the last block could
                possibly EXCEED the boundaries of the output arrays and
                incurs index our-of-range bugs.
                Consider to have a proper thread index boundary check,
                for example if you have already checked
                `if (kThisAgentId < NumAgents)`, please ignore this warning.
------------------------------ Captured log call -------------------------------
WARNING  root:data_manager.py:427 PyCUDADataManager casts the data 'a' from type float64 to float32
WARNING  root:data_manager.py:427 PyCUDADataManager casts the data 'b' from type float64 to float32
WARNING  root:data_manager.py:427 PyCUDADataManager casts the data 'c' from type float64 to float32
WARNING  root:data_manager.py:427 PyCUDADataManager casts the data 'at' from type float64 to float32
WARNING  root:data_manager.py:427 PyCUDADataManager casts the data 'bt' from type float64 to float32
WARNING  root:data_manager.py:427 PyCUDADataManager casts the data 'ct' from type float64 to float32
WARNING  root:data_manager.py:427 PyCUDADataManager casts the data 'af' from type float64 to float32
WARNING  root:data_manager.py:427 PyCUDADataManager casts the data 'bf' from type float64 to float32
WARNING  root:data_manager.py:427 PyCUDADataManager casts the data 'cf' from type float64 to float32
=========================== short test summary info ============================
FAILED ../../miniconda3/envs/warp_drive/lib/python3.7/site-packages/tests/multiblocks_per_env/warp_drive/pycuda_tests/test_env_reset_multiblocks.py::TestEnvironmentReset::test_reset_for_different_dim - AssertionError: False is not true
========================= 1 failed, 4 passed in 3.85s ==========================
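
For context, the "Captured log setup" warning above refers to the per-thread boundary guard that a custom step kernel typically needs when `num_agents` is not a multiple of the threads per block. Below is a minimal sketch of that guard using the Numba backend; the kernel name, arguments, and launch configuration are illustrative assumptions, not WarpDrive's actual step-function signature.

```python
from numba import cuda
import numpy as np

@cuda.jit
def step_sketch(observations, num_agents):
    # One thread per agent: compute this thread's agent index.
    k_this_agent_id = cuda.blockIdx.x * cuda.blockDim.x + cuda.threadIdx.x

    # Boundary check referenced by the warning: without it, threads in the
    # last block can index past the output arrays when num_agents is not
    # divisible by the number of threads per block.
    if k_this_agent_id < num_agents:
        observations[k_this_agent_id] = 0.0  # placeholder per-agent work


# Hypothetical launch: 5 agents across 2 blocks of 4 threads, so the last
# 3 threads are idle and the guard above prevents out-of-range writes.
obs = cuda.to_device(np.ones(5, dtype=np.float32))
step_sketch[2, 4](obs, 5)
```
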
Running Unit tests: pytest /home/hibiki/miniconda3/envs/warp_drive/lib/python3.7/site-packages/tests/multiblocks_per_env/example_envs/pycuda_tests 
============================= test session starts ==============================
platform linux -- Python 3.7.16, pytest-7.3.1, pluggy-1.0.0
rootdir: /home/hibiki
collected 1 item                                                               

../../miniconda3/envs/warp_drive/lib/python3.7/site-packages/tests/multiblocks_per_env/example_envs/pycuda_tests/test_tag_continuous_multiblocks.py . [100%]

=============================== warnings summary ===============================
miniconda3/envs/warp_drive/lib/python3.7/site-packages/tests/multiblocks_per_env/example_envs/pycuda_tests/test_tag_continuous_multiblocks.py: 24 warnings
  /home/hibiki/miniconda3/envs/warp_drive/lib/python3.7/site-packages/gym/utils/seeding.py:42: DeprecationWarning: WARN: Function `rng.rand(*size)` is marked as deprecated and will be removed in the future. Please use `Generator.random(size)` instead.
    "Function `rng.rand(*size)` is marked as deprecated "

miniconda3/envs/warp_drive/lib/python3.7/site-packages/tests/multiblocks_per_env/example_envs/pycuda_tests/test_tag_continuous_multiblocks.py: 8000 warnings
  /home/hibiki/miniconda3/envs/warp_drive/lib/python3.7/site-packages/gym/utils/seeding.py:64: DeprecationWarning: WARN: Function `rng.randint(low, [high, size, dtype])` is marked as deprecated and will be removed in the future. Please use `rng.integers(low, [high, size, dtype])` instead.
    "Function `rng.randint(low, [high, size, dtype])` is marked as deprecated "

-- Docs: https://docs.pytest.org/en/stable/how-to/capture-warnings.html
======================= 1 passed, 8024 warnings in 6.55s =======================
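
The deprecation warnings above come from gym's own seeding utilities rather than from WarpDrive itself. For reference, the NumPy `Generator` calls that the warnings point to look like this (a small, self-contained illustration, not code from the test suite):

```python
import numpy as np

rng = np.random.default_rng(seed=42)

# Replacements suggested by the warnings:
#   rng.rand(*size)                       -> Generator.random(size)
#   rng.randint(low, [high, size, dtype]) -> rng.integers(low, [high, size, dtype])
uniform_samples = rng.random((2, 10))             # floats in [0, 1)
integer_samples = rng.integers(0, 10, size=(2,))  # ints in [0, 10)
```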

Running `conda list` with the environment activated returns the following:

# packages in environment at /home/hibiki/miniconda3/envs/warp_drive:
#
# Name                    Version                   Build  Channel
_libgcc_mutex             0.1                 conda_forge    conda-forge
_openmp_mutex             4.5                  2_kmp_llvm    conda-forge
aiohttp                   3.8.4                    pypi_0    pypi
aiosignal                 1.3.1                    pypi_0    pypi
appdirs                   1.4.4              pyh9f0ad1d_0    conda-forge
async-timeout             4.0.2                    pypi_0    pypi
asynctest                 0.13.0                   pypi_0    pypi
attrs                     23.1.0                   pypi_0    pypi
boost-cpp                 1.78.0               h6582d0a_3    conda-forge
bzip2                     1.0.8                h7f98852_4    conda-forge
ca-certificates           2023.5.7             hbcca054_0    conda-forge
certifi                   2023.5.7           pyhd8ed1ab_0    conda-forge
charset-normalizer        3.1.0                    pypi_0    pypi
cloudpickle               2.2.1                    pypi_0    pypi
cudatoolkit               11.8.0              h37601d7_11    conda-forge
cycler                    0.11.0                   pypi_0    pypi
exceptiongroup            1.1.1                    pypi_0    pypi
fonttools                 4.38.0                   pypi_0    pypi
frozenlist                1.3.3                    pypi_0    pypi
fsspec                    2023.1.0                 pypi_0    pypi
gym                       0.25.2                   pypi_0    pypi
gym-notices               0.0.8                    pypi_0    pypi
icu                       72.1                 hcb278e6_0    conda-forge
idna                      3.4                      pypi_0    pypi
importlib-metadata        6.6.0                    pypi_0    pypi
iniconfig                 2.0.0                    pypi_0    pypi
kiwisolver                1.4.4                    pypi_0    pypi
ld_impl_linux-64          2.38                 h1181459_1  
libblas                   3.9.0           16_linux64_openblas    conda-forge
libcblas                  3.9.0           16_linux64_openblas    conda-forge
libffi                    3.4.4                h6a678d5_0  
libgcc-ng                 12.2.0              h65d4601_19    conda-forge
libgfortran-ng            12.2.0              h69a702a_19    conda-forge
libgfortran5              12.2.0              h337968e_19    conda-forge
liblapack                 3.9.0           16_linux64_openblas    conda-forge
libopenblas               0.3.21          pthreads_h78a6416_3    conda-forge
libstdcxx-ng              12.2.0              h46fd767_19    conda-forge
libzlib                   1.2.13               h166bdaf_4    conda-forge
lightning-utilities       0.8.0                    pypi_0    pypi
llvm-openmp               16.0.4               h4dfa4b3_0    conda-forge
llvmlite                  0.39.1                   pypi_0    pypi
mako                      1.2.0              pyhd8ed1ab_0    conda-forge
markupsafe                2.1.1            py37h540881e_1    conda-forge
matplotlib                3.5.3                    pypi_0    pypi
multidict                 6.0.4                    pypi_0    pypi
ncurses                   6.4                  h6a678d5_0  
numba                     0.56.4                   pypi_0    pypi
numpy                     1.21.6           py37h976b520_0    conda-forge
openssl                   1.1.1t               h0b41bf4_0    conda-forge
packaging                 23.1                     pypi_0    pypi
pillow                    9.5.0                    pypi_0    pypi
pip                       22.3.1           py37h06a4308_0  
platformdirs              3.5.1              pyhd8ed1ab_0    conda-forge
pluggy                    1.0.0                    pypi_0    pypi
pycuda                    2022.1           py37h790c342_1    conda-forge
pyparsing                 3.0.9                    pypi_0    pypi
pytest                    7.3.1                    pypi_0    pypi
python                    3.7.16               h7a1cb2a_0  
python-dateutil           2.8.2                    pypi_0    pypi
python_abi                3.7                     2_cp37m    conda-forge
pytools                   2022.1.14          pyhd8ed1ab_0    conda-forge
pytorch-lightning         1.9.5                    pypi_0    pypi
pyyaml                    6.0                      pypi_0    pypi
readline                  8.2                  h5eee18b_0  
requests                  2.31.0                   pypi_0    pypi
rl-warp-drive             2.3                      pypi_0    pypi
setuptools                65.6.3           py37h06a4308_0  
six                       1.16.0                   pypi_0    pypi
sqlite                    3.41.2               h5eee18b_0  
tk                        8.6.12               h1ccaba5_0  
tomli                     2.0.1                    pypi_0    pypi
torch                     1.10.2                   pypi_0    pypi
torchmetrics              0.11.4                   pypi_0    pypi
tqdm                      4.65.0                   pypi_0    pypi
typing-extensions         4.6.2                hd8ed1ab_0    conda-forge
typing_extensions         4.6.2              pyha770c72_0    conda-forge
urllib3                   2.0.2                    pypi_0    pypi
wheel                     0.38.4           py37h06a4308_0  
xz                        5.4.2                h5eee18b_0  
yarl                      1.9.2                    pypi_0    pypi
zipp                      3.15.0                   pypi_0    pypi
zlib                      1.2.13               h166bdaf_4    conda-forge
zstd                      1.5.2                h3eb15da_6    conda-forge
