catalyst-rl's Introduction

Accelerated RL

PyTorch framework for RL research. It was developed with a focus on reproducibility, fast experimentation, and reuse of code and ideas, so that you can research and develop something new rather than write yet another regular train loop.
Break the cycle - use the Catalyst!

Project manifest. Part of PyTorch Ecosystem. Part of Catalyst Ecosystem:

  • Alchemy - Experiments logging & visualization
  • Catalyst - Accelerated Deep Learning Research and Development
  • Reaction - Convenient Deep Learning models serving

Catalyst at AI Landscape.


Installation

Common installation:

pip install -U catalyst-rl

Catalyst.RL is compatible with Python 3.6+ and PyTorch 1.0.0+.

Getting started

For an introduction to Catalyst.RL, please follow the OpenAI Gym example.

Docs and examples

API documentation and an overview of the library can be found in the Docs.
In the examples folder of the repository, you can find advanced tutorials and Catalyst best practices.

Infos

To learn more about Catalyst internals and to be aware of the most important features, you can read Catalyst-info – our blog where we regularly write facts about the framework.

We also maintain the Awesome Catalyst list – Catalyst-powered projects, tutorials and talks.
Feel free to make a PR adding your project to the list. And don't forget to check out the current list; there are many interesting projects.

Releases

We publish a major release once a month, named in the format YY.MM.
During the month we also publish micro-releases with framework improvements, named in the format YY.MM.#.
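
As a rough illustration of this naming scheme, the sketch below parses a release name into its parts. The helper names (`ReleaseTag`, `parse_release`, `is_major`) are hypothetical and not part of Catalyst.RL, and the two-digit year and month are an assumption read off the YY.MM pattern.

```python
import re
from typing import NamedTuple, Optional


class ReleaseTag(NamedTuple):
    """Parsed release name (hypothetical helper, not part of Catalyst.RL)."""
    year: int
    month: int
    micro: Optional[int]  # None for a monthly (major) release


# Assumption: two-digit year, two-digit month, optional numeric micro part.
_RELEASE_PATTERN = re.compile(r"^(\d{2})\.(\d{2})(?:\.(\d+))?$")


def parse_release(tag: str) -> ReleaseTag:
    """Split 'YY.MM' or 'YY.MM.#' into its numeric components."""
    match = _RELEASE_PATTERN.match(tag)
    if match is None:
        raise ValueError(f"not a YY.MM or YY.MM.# release name: {tag!r}")
    year, month, micro = match.groups()
    month_number = int(month)
    if not 1 <= month_number <= 12:
        raise ValueError(f"month out of range in release name: {tag!r}")
    return ReleaseTag(int(year), month_number, None if micro is None else int(micro))


def is_major(tag: str) -> bool:
    """A release without a micro part is the monthly major release."""
    return parse_release(tag).micro is None
```

With this sketch, an illustrative name such as "20.03" parses as a major release, while "20.03.2" parses as its second micro-release.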

You can view the changelog on the GitHub Releases page.

Overview

Catalyst.RL helps you write compact but full-featured RL pipelines in a few lines of code. You get a training loop with metrics, early-stopping, model checkpointing and other features without the boilerplate.

Features

  • Universal train/inference loop.
  • Configuration files for model/data hyperparameters.
  • Reproducibility – all source code and environment variables will be saved.
  • Callbacks – reusable train/inference pipeline parts.
  • Training stages support.
  • Easy customization.
  • PyTorch best practices (SWA, AdamW, Ranger optimizer, OneCycle, FP16 and more).

Structure

  • RL – scalable Reinforcement Learning: implementations of all popular model-free algorithms and their improvements, with distributed training support.
  • contrib - additional modules contributed by Catalyst users.
  • utils - assorted utilities for Deep Learning research.

Contribution guide

We appreciate all contributions. If you are planning to contribute back bug-fixes, please do so without any further discussion. If you plan to contribute new features, utility functions or extensions, please first open an issue and discuss the feature with us.

License

This project is licensed under the Apache License, Version 2.0. See the LICENSE file for details.

Citation

Please use this bibtex if you want to cite this repository in your publications:

@misc{catalyst,
    author = {Kolesnikov, Sergey},
    title = {Accelerated RL.},
    year = {2018},
    publisher = {GitHub},
    journal = {GitHub repository},
    howpublished = {\url{https://github.com/catalyst-team/catalyst-rl}},
}

catalyst-rl's Issues

Differences between SAC original paper and catalyst implementation

Hi,
I'm trying to apply the Soft Actor-Critic (SAC) approach to my RL task. I checked the details of the Catalyst implementation and noticed that it differs from the original paper: https://arxiv.org/abs/1801.01290
For example:

  1. Actions are not sampled from the policy. The policy itself produces actions rather than the parameters of an action distribution. So, compared to the original paper, there is no exploration naturally built into the policy, which is why you need to add action noise (and parameter noise, which freezes the sampling for me, not sure why).
  2. There is no V state-value function network and, as a consequence, no moving average for the V state-value function. Instead, this moving average is applied to each of the two Q state-action value function networks. All these changes lead to different gradient equations.
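
For readers following the two points above, here is a minimal standard-library sketch of the mechanisms being contrasted: sampling an action from a squashed Gaussian policy (as in the original paper) versus emitting the action directly, and the exponential moving average used for target networks. It is not the Catalyst implementation; the function names are hypothetical and plain lists of floats stand in for network outputs and weights.

```python
import math
import random


def sample_squashed_gaussian(mean, log_std, rng):
    """Stochastic policy: a = tanh(mean + std * eps), eps ~ N(0, 1), per dimension.

    The policy outputs distribution parameters (mean, log_std), and exploration
    comes from the sampling itself.
    """
    return [
        math.tanh(m + math.exp(s) * rng.gauss(0.0, 1.0))
        for m, s in zip(mean, log_std)
    ]


def deterministic_action(mean):
    """Deterministic policy: the output is the action, so any exploration
    noise has to be added from outside."""
    return [math.tanh(m) for m in mean]


def soft_update(target, online, tau):
    """Moving average of weights: target <- (1 - tau) * target + tau * online."""
    return [(1.0 - tau) * t + tau * o for t, o in zip(target, online)]
```

In the original paper the moving average is applied to the weights of the V network; according to the description above, the same kind of update would instead be applied to each of the two Q networks.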

Could you please shed some light on the details of your implementation? Does it work better than the original? Have you based your implementation on some other paper?

I'm really interested in all these details because I want to compare SAC and TD3, but to me these two algorithms in Catalyst do not look very different: both use double Q-learning and a deterministic policy. Thanks!

No factory with name '{name}' was registered

The command is:

    CUDA_VISIBLE_DEVICES='0' catalyst-rl run-trainer --config configs/config.yml

It fails with the error:

    catalyst.utils.tools.registry.RegistryException: No factory with name 'CoppeliaSimEnvWrapper' was registered

Full traceback:

    Traceback (most recent call last):
      File "/workspace/mazhengyu/anaconda3/envs/catalystenv/bin/catalyst-rl", line 8, in <module>
        sys.exit(main())
      File "/workspace/mazhengyu/anaconda3/envs/catalystenv/lib/python3.6/site-packages/catalyst/rl/main.py", line 44, in main
        COMMANDS[args.command].main(args, uargs)
      File "/workspace/mazhengyu/anaconda3/envs/catalystenv/lib/python3.6/site-packages/catalyst/rl/scripts/run_trainer.py", line 69, in main
        env = ENVIRONMENTS.get_from_params(**config["environment"])
      File "/workspace/mazhengyu/anaconda3/envs/catalystenv/lib/python3.6/site-packages/catalyst/utils/tools/registry.py", line 244, in get_from_params
        return self.get_instance(name, meta_factory=meta_factory, **kwargs)
      File "/workspace/mazhengyu/anaconda3/envs/catalystenv/lib/python3.6/site-packages/catalyst/utils/tools/registry.py", line 216, in get_instance
        f = self.get(name)
      File "/workspace/mazhengyu/anaconda3/envs/catalystenv/lib/python3.6/site-packages/catalyst/utils/tools/registry.py", line 187, in get
        f"No factory with name '{name}' was registered"
    catalyst.utils.tools.registry.RegistryException: No factory with name 'CoppeliaSimEnvWrapper' was registered
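
The traceback shows a name-to-factory lookup failing: the name taken from the config was not present in the environment registry when the trainer started. The sketch below reproduces that general mechanism with a hypothetical `Registry` class. It is not Catalyst's actual registry, the `name` config key is an assumption, and the `CoppeliaSimEnvWrapper` class here is only a stand-in for the user's own wrapper.

```python
class RegistryError(Exception):
    """Raised when a name has no registered factory (hypothetical class)."""


class Registry:
    """Minimal name -> factory mapping, for illustration only."""

    def __init__(self):
        self._factories = {}

    def add(self, factory, name=None):
        """Register a callable under its own name or an explicit one."""
        self._factories[name or factory.__name__] = factory
        return factory

    def get(self, name):
        try:
            return self._factories[name]
        except KeyError:
            raise RegistryError(
                f"No factory with name '{name}' was registered"
            ) from None

    def get_from_params(self, **params):
        """Build an instance from config-style parameters.

        Assumption: the factory name is stored under the key 'name'.
        """
        params = dict(params)
        factory = self.get(params.pop("name"))
        return factory(**params)


ENVIRONMENTS = Registry()


class CoppeliaSimEnvWrapper:
    """Stand-in for the user's environment wrapper."""

    def __init__(self, max_steps=100):
        self.max_steps = max_steps
```

In this sketch, calling `ENVIRONMENTS.get_from_params(name="CoppeliaSimEnvWrapper")` raises `RegistryError` until `ENVIRONMENTS.add(CoppeliaSimEnvWrapper)` has run, which suggests checking that the module defining the custom environment is imported and registered before the config is resolved.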

Failed building wheel for pyarrow

Hi,
I'm using Ubuntu 22.04 and Python 3.9. When I try to install catalyst-rl, I get:
ERROR: Failed building wheel for pyarrow

I've tried upgrading pip, setuptools, wheel, cmake ...

If I install the latest version of pyarrow (10.0.1), it works.
But the catalyst-rl requirements specifically pin pyarrow==0.15.1, and that version fails to build.

Thanks for all the help!

Implementing RNNs into RL algorithms

Hey!

First of all thank you for this library!

I would like to take your actors and critics and implement an RNN-enhanced TD3 algorithm as described here: https://arxiv.org/pdf/1710.06537.pdf.

I have looked through the source code, and it seems that your implementation of the RL algorithms does not support recurrent networks. Have you considered it? Is there anything you can recommend that might help me make a seamless transition to a memory-based approach?

Cheers!

Edit: Would the "history len" parameter in the code be what I am looking for?
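
To make the distinction in this question concrete, here is a minimal sketch of observation stacking, assuming that "history len" means feeding the network the last N observations. The class `ObservationHistory` is hypothetical and is not taken from the Catalyst source.

```python
from collections import deque


class ObservationHistory:
    """Keeps the last `history_len` observations (hypothetical helper)."""

    def __init__(self, history_len, obs_size):
        # Start with zero-padded frames so the stack always has a fixed shape.
        self._frames = deque(
            ([0.0] * obs_size for _ in range(history_len)),
            maxlen=history_len,
        )

    def push(self, observation):
        """Add the newest observation; the oldest one is dropped automatically."""
        self._frames.append(list(observation))
        return self.stacked()

    def stacked(self):
        """Return the frames ordered from oldest to newest."""
        return [list(frame) for frame in self._frames]
```

Under that assumption, stacking gives the network a fixed window of past observations, whereas a recurrent network carries a learned hidden state across steps, so the two are related but not equivalent.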

How to load the last checkpoint

Hi, how do I load the last checkpoint and set the save_path for checkpoints? Can I load a previous checkpoint by editing config.yaml?
