catalyst-team / catalyst-rl

License: Apache License 2.0

Makefile 0.12% Shell 4.67% Python 95.21%

catalyst-rl's Introduction

Accelerated RL

A PyTorch framework for RL research. It was developed with a focus on reproducibility, fast experimentation, and reuse of code and ideas, so that you can research and develop something new rather than write yet another regular train loop.
Break the cycle - use the Catalyst!

Project manifest. Part of PyTorch Ecosystem. Part of Catalyst Ecosystem:

Alchemy - Experiments logging & visualization
Catalyst - Accelerated Deep Learning Research and Development
Reaction - Convenient Deep Learning models serving

Catalyst at AI Landscape.

Installation

Common installation:

pip install -U catalyst-rl

Catalyst.RL is compatible with Python 3.6+ and PyTorch 1.0.0+.

Getting started

For an introduction to Catalyst.RL, please follow the OpenAI Gym example.

Docs and examples

Demo with minimal examples for CV, NLP, RecSys and GANs
Detailed classification tutorial
Advanced segmentation tutorial
Comprehensive classification pipeline
Binary and semantic segmentation pipeline

API documentation and an overview of the library can be found here.
In the examples folder of the repository, you can find advanced tutorials and Catalyst best practices.

Infos

To learn more about Catalyst internals and to be aware of the most important features, you can read Catalyst-info – our blog where we regularly write facts about the framework.

We also maintain the Awesome Catalyst list – Catalyst-powered projects, tutorials and talks.
Feel free to make a PR adding your project to the list. And don't forget to check out the current list; there are many interesting projects.

Releases

We deploy a major release once a month, with a name like YY.MM.
During the month we also publish micro-releases with framework improvements, in the format YY.MM.#.
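The naming scheme above can be sketched as a small parser. This is only an illustration of the YY.MM and YY.MM.# formats: the function names and the example version strings are hypothetical, and it assumes the month may be written with or without a leading zero.

```python
import re
from typing import NamedTuple, Optional

# The two release formats named above: YY.MM (major) and YY.MM.# (micro).
# Assumption: the month may appear with or without a leading zero.
_VERSION_RE = re.compile(r"^(\d{2})\.(\d{1,2})(?:\.(\d+))?$")


class Release(NamedTuple):
    year: int             # two-digit year, e.g. 20
    month: int            # 1..12
    micro: Optional[int]  # None for a major (monthly) release


def parse_release(version: str) -> Release:
    """Parse 'YY.MM' or 'YY.MM.#'; raise ValueError for anything else."""
    match = _VERSION_RE.match(version)
    if match is None:
        raise ValueError(f"not a YY.MM or YY.MM.# version: {version!r}")
    year, month, micro = match.groups()
    if not 1 <= int(month) <= 12:
        raise ValueError(f"month out of range in {version!r}")
    return Release(int(year), int(month), None if micro is None else int(micro))


def is_major(version: str) -> bool:
    """A major release has no micro component."""
    return parse_release(version).micro is None
```

With this sketch, a string such as "20.03" would count as a major release and "20.03.1" as a micro-release of the same month.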

You can view the changelog on the GitHub Releases page.

Overview

Catalyst.RL helps you write compact but full-featured RL pipelines in a few lines of code. You get a training loop with metrics, early-stopping, model checkpointing and other features without the boilerplate.

Features

Universal train/inference loop.
Configuration files for model/data hyperparameters.
Reproducibility – all source code and environment variables will be saved.
Callbacks – reusable train/inference pipeline parts.
Training stages support.
Easy customization.
PyTorch best practices (SWA, AdamW, Ranger optimizer, OneCycle, FP16 and more).
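To give a feel for the callback idea in the list above, here is a toy loop with an early-stopping callback. It is a minimal sketch in plain Python, not Catalyst's API: the `Callback`, `EarlyStopping` and `run` names, the `state` dictionary and the hook name are all hypothetical.

```python
class Callback:
    """Hypothetical base class: hooks called by the toy loop below."""

    def on_epoch_end(self, state: dict) -> None:
        pass


class EarlyStopping(Callback):
    """Requests a stop after `patience` epochs without a new best loss."""

    def __init__(self, patience: int):
        self.patience = patience
        self.best = float("inf")
        self.bad_epochs = 0

    def on_epoch_end(self, state: dict) -> None:
        if state["loss"] < self.best:
            self.best = state["loss"]
            self.bad_epochs = 0
        else:
            self.bad_epochs += 1
        if self.bad_epochs >= self.patience:
            state["stop"] = True


def run(losses, callbacks):
    """Toy loop: `losses` stands in for the result of each training epoch."""
    state = {"epoch": 0, "loss": None, "stop": False}
    for epoch, loss in enumerate(losses, start=1):
        state["epoch"], state["loss"] = epoch, loss
        for callback in callbacks:
            callback.on_epoch_end(state)
        if state["stop"]:
            break
    return state


final_state = run([1.0, 0.8, 0.9, 0.85, 0.7], [EarlyStopping(patience=2)])
```

The loop itself knows nothing about early stopping; the behaviour lives entirely in the callback, which is what makes such parts reusable across pipelines.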

Structure

RL – scalable Reinforcement Learning: implementations of all popular model-free algorithms and their improvements, with distributed training support.
contrib - additional modules contributed by Catalyst users.
utils - different useful utils for Deep Learning research.

Contribution guide

We appreciate all contributions. If you are planning to contribute back bug-fixes, please do so without any further discussion. If you plan to contribute new features, utility functions or extensions, please first open an issue and discuss the feature with us.

Please see the contribution guide for more information.
By participating in this project, you agree to abide by its Code of Conduct.

License

This project is licensed under the Apache License, Version 2.0. See the LICENSE file for details.

Citation

Please use this bibtex if you want to cite this repository in your publications:

@misc{catalyst,
    author = {Kolesnikov, Sergey},
    title = {Accelerated RL.},
    year = {2018},
    publisher = {GitHub},
    journal = {GitHub repository},
    howpublished = {\url{https://github.com/catalyst-team/catalyst-rl}},
}

catalyst-rl's Issues

Differences between SAC original paper and catalyst implementation

Hi,
I'm trying to apply the Soft Actor-Critic approach to my RL task. I looked into the details of the Catalyst implementation and noticed that it differs from the original paper: https://arxiv.org/abs/1801.01290
For example:

Actions are not sampled from the policy. The policy itself produces actions rather than the parameters of an action distribution. So, compared to the original paper, there is no exploration naturally built into the policy, and that's why you need to add action noise (and parameter noise, which freezes the sampling for me, not sure why).
There is no state value function network V. As a consequence, there is no moving average for V. Instead, the moving average is applied to each of the two state-action value function networks Q. All these changes lead to different gradient equations.

Could you please shed some light on the details of your implementation? Does it work better than the original? Is your implementation based on some other paper?

I'm really interested in all these details because I want to compare SAC and TD3, but to me these two algorithms in Catalyst do not look very different: both use double Q-learning and a deterministic policy. Thanks!
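For readers following this issue, the distinctions it draws can be sketched in a few lines of plain Python. This is a rough illustration under stated assumptions, not code from Catalyst or from the paper's authors: actions are scalars for brevity, the function names are made up, and clipping to [-1, 1] assumes normalized action bounds.

```python
import math
import random


def sample_tanh_gaussian(mean, log_std, rng):
    """Stochastic policy: the network outputs distribution parameters
    (mean, log_std); the action is sampled and squashed with tanh."""
    std = math.exp(log_std)
    eps = rng.gauss(0.0, 1.0)  # reparameterization noise
    action = math.tanh(mean + std * eps)
    # Gaussian log-density of the sample plus the tanh change-of-variables term.
    log_prob = (
        -0.5 * eps * eps - log_std - 0.5 * math.log(2.0 * math.pi)
        - math.log(1.0 - action * action + 1e-6)
    )
    return action, log_prob


def deterministic_with_noise(action, noise_std, rng):
    """Deterministic policy: the network outputs the action itself, and
    exploration comes from noise added outside the policy."""
    noisy = action + rng.gauss(0.0, noise_std)
    return max(-1.0, min(1.0, noisy))  # assumed action bounds [-1, 1]


def soft_update(target, source, tau):
    """Moving average of parameters: target <- (1 - tau) * target + tau * source."""
    return [(1.0 - tau) * t + tau * s for t, s in zip(target, source)]
```

In the first function exploration is part of the policy and yields a log-probability that an entropy term can use; in the second, the policy has no distribution to take an entropy of. The `soft_update` helper is the moving average mentioned above, which can be applied to the parameters of a V network or of each Q network alike.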

No factory with name '{name}' was registered

The command is:
CUDA_VISIBLE_DEVICES='0' catalyst-rl run-trainer --config configs/config.yml
It fails with this error:
catalyst.utils.tools.registry.RegistryException: No factory with name 'CoppeliaSimEnvWrapper' was registered

Full traceback:
Traceback (most recent call last):
  File "/workspace/mazhengyu/anaconda3/envs/catalystenv/bin/catalyst-rl", line 8, in <module>
    sys.exit(main())
  File "/workspace/mazhengyu/anaconda3/envs/catalystenv/lib/python3.6/site-packages/catalyst/rl/main.py", line 44, in main
    COMMANDS[args.command].main(args, uargs)
  File "/workspace/mazhengyu/anaconda3/envs/catalystenv/lib/python3.6/site-packages/catalyst/rl/scripts/run_trainer.py", line 69, in main
    env = ENVIRONMENTS.get_from_params(**config["environment"])
  File "/workspace/mazhengyu/anaconda3/envs/catalystenv/lib/python3.6/site-packages/catalyst/utils/tools/registry.py", line 244, in get_from_params
    return self.get_instance(name, meta_factory=meta_factory, **kwargs)
  File "/workspace/mazhengyu/anaconda3/envs/catalystenv/lib/python3.6/site-packages/catalyst/utils/tools/registry.py", line 216, in get_instance
    f = self.get(name)
  File "/workspace/mazhengyu/anaconda3/envs/catalystenv/lib/python3.6/site-packages/catalyst/utils/tools/registry.py", line 187, in get
    f"No factory with name '{name}' was registered"
catalyst.utils.tools.registry.RegistryException: No factory with name 'CoppeliaSimEnvWrapper' was registered
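The traceback shows a lookup by name failing inside a registry. The toy registry below is a hypothetical stand-in, not Catalyst's actual class, and the `scene` parameter is invented for illustration. It only shows the general pattern: a name from the config resolves only if some code registered a factory under that name before the lookup ran.

```python
class RegistryException(Exception):
    """Raised when a name has no registered factory (toy version)."""


class Registry:
    """Toy name -> factory registry; a hypothetical stand-in."""

    def __init__(self):
        self._factories = {}

    def add(self, factory):
        self._factories[factory.__name__] = factory
        return factory  # returning it lets add() be used as a decorator

    def get_from_params(self, name, **kwargs):
        if name not in self._factories:
            raise RegistryException(f"No factory with name '{name}' was registered")
        return self._factories[name](**kwargs)


class CoppeliaSimEnvWrapper:
    """Placeholder for the user's environment wrapper."""

    def __init__(self, scene="default"):  # `scene` is a made-up parameter
        self.scene = scene


registry = Registry()

# Lookup before registration: fails like the traceback above.
try:
    registry.get_from_params("CoppeliaSimEnvWrapper")
except RegistryException as error:
    first_attempt = str(error)

# After the class is registered, the same lookup succeeds.
registry.add(CoppeliaSimEnvWrapper)
env = registry.get_from_params("CoppeliaSimEnvWrapper", scene="arm")
```

If the real registry behaves similarly, the likely cause is that the module defining CoppeliaSimEnvWrapper was never imported or registered in the process that ran the trainer.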

Failed building wheel for pyarrow

Hi,
I'm using Ubuntu 22.04 and Python 3.9. When I try to install catalyst-rl, I get:
ERROR: Failed building wheel for pyarrow

I've tried upgrading pip, setuptools, wheel, cmake ...

If I install the latest version of pyarrow (10.0.1), it works.
But the catalyst-rl requirements specifically pin pyarrow==0.15.1, and that version does not install.

Thanks for all the help!

Implementing RNNs into RL algorithms

Hey!

First of all thank you for this library!

I would like to take your actors and critics and implement RNN-enhanced TD3 algorithm as described here: https://arxiv.org/pdf/1710.06537.pdf.

I have looked through the source code, and it seems that your implementations of the RL algorithms do not support recurrent networks. Have you considered adding this? Is there anything you can recommend that might help me make a seamless transition to a memory-based approach?

Cheers!

Edit: Would the "history len" parameter in the code be what I am looking for?
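One common reading of a history-length parameter is a fixed window of the most recent observations stacked into a single network input. Whether Catalyst's parameter works exactly this way is an assumption here; the `HistoryBuffer` helper below is hypothetical and only illustrates the idea.

```python
from collections import deque


class HistoryBuffer:
    """Keeps the last `history_len` observations (hypothetical helper)."""

    def __init__(self, history_len, obs_size):
        self.history_len = history_len
        self.obs_size = obs_size
        self.frames = deque(maxlen=history_len)
        self.reset()

    def reset(self):
        """Start an episode with a window of zero-filled observations."""
        self.frames.clear()
        for _ in range(self.history_len):
            self.frames.append([0.0] * self.obs_size)

    def push(self, observation):
        """Add the newest observation; the oldest one falls out of the window."""
        self.frames.append(list(observation))
        return [list(frame) for frame in self.frames]  # history_len x obs_size


buffer = HistoryBuffer(history_len=3, obs_size=2)
window = buffer.push([1.0, 2.0])
```

A window like this gives a feed-forward network short-term context, but it is not the same as a recurrent network: anything older than `history_len` steps is forgotten, whereas an RNN carries a learned hidden state across the whole episode.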

How to load the last checkpoint

Hi, how do I load the last checkpoint and set the save_path for checkpoints? Can I load a previous checkpoint by editing config.yaml?
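Independently of how the config handles this, the most recent checkpoint file in a log directory can be located with the standard library. This is a sketch under assumptions: the function name is hypothetical, and both the `.pth` extension and the directory layout are guesses, not something this page confirms.

```python
from pathlib import Path
from typing import Optional


def find_latest_checkpoint(logdir, pattern="*.pth") -> Optional[Path]:
    """Return the most recently modified file under `logdir` that matches
    `pattern`, searching subdirectories too, or None if nothing matches."""
    candidates = [path for path in Path(logdir).rglob(pattern) if path.is_file()]
    if not candidates:
        return None
    # The newest modification time is treated as "the last checkpoint".
    return max(candidates, key=lambda path: path.stat().st_mtime)
```

The returned path could then be passed to `torch.load` to inspect or restore the saved state; which keys the checkpoint contains depends on how it was written.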