jeappen / idp-offline-rl

This project forked from takuseno/d3rlpy

Information-Directed Pessimism for Offline Reinforcement Learning [ICML 2024]

License: MIT License

[ICML 2024] Kernelized Stein Discrepancy for Offline Reinforcement Learning

This code builds off d3rlpy (MIT License), a framework for Offline Reinforcement Learning. We share the same dataset handling functionality. Follow the instructions below to set up the environment:

  • Create a new Python 3.8 environment with Miniconda and install the requirements
conda create -y -n offrl python=3.8 
conda activate offrl
conda install -y numpy pandas seaborn cython==0.29.21
pip install -r requirements.txt
pip install -e .
  • Register this environment as a kernel for Jupyter Notebook
conda install -y -c anaconda ipykernel
python -m ipykernel install --user --name offrl --display-name "Python (offrl)"
  • Now you can run the scripts and notebooks in the tutorials directory of this repo
  • Run pytest from the root to check your installation (disable tqdm for the test run to avoid conflicts)
    TQDM_DISABLE=1 pytest tests/lcb

  • All code for the main algorithms in the paper is in d3rlpy/lcb.
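
The installation checks above can also be scripted. The following is a minimal sketch, not part of this repo: it reports which of the conda-installed packages are importable and builds an environment with TQDM_DISABLE set, so that a pytest child process actually sees the variable. The names `missing_packages`, `tqdm_disabled_env`, and `run_lcb_tests` are hypothetical helpers introduced for illustration.

```python
import importlib.util
import os
import subprocess
import sys

# Import names of the packages installed with conda in the setup steps above.
# "Cython" is the import name of the cython package.
REQUIRED = ["numpy", "pandas", "seaborn", "Cython"]


def missing_packages(names):
    """Return the names that cannot be found in the current environment."""
    return [n for n in names if importlib.util.find_spec(n) is None]


def tqdm_disabled_env(base_env=None):
    """Copy the environment and set TQDM_DISABLE=1 so child processes see it."""
    env = dict(os.environ if base_env is None else base_env)
    env["TQDM_DISABLE"] = "1"
    return env


def run_lcb_tests():
    """Run `pytest tests/lcb` from the repo root and return pytest's exit code."""
    cmd = [sys.executable, "-m", "pytest", "tests/lcb"]
    return subprocess.run(cmd, env=tqdm_disabled_env()).returncode


# Quick report; call run_lcb_tests() from the repo root to run the tests.
print("Python version:", sys.version.split()[0])  # the setup above uses 3.8
print("Missing packages:", missing_packages(REQUIRED))
```

Passing the variable through `env=` avoids relying on shell export semantics, which differ between an inline assignment and a separate statement.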

File Structure

.
├── ...
├── tests                   # Test files
│   ├── lcb                 # Basic functionality tests for paper algos
│   .
├── d3rlpy                  # All source files for d3rlpy (with some minor edits)
│   ├── lcb                 # Our code for the paper algorithms
│   .
├── tutorials               # Scripts and notebooks (front-end)
│   ├── run_*.py            # Scripts with options to choose data type for each environment
│   ├── PlotPickle.ipynb    # Edit paths and use to plot graphs once results are generated
│   .
└── ...

Scripts

Example Usage

conda activate offrl
# The run_*.py scripts live in tutorials/, so run these commands from that directory
# Run the random MDP experiments with the random policy dataset
python run_mdp.py --run-mode all --extra-exp-prefix run_random_test --dataset rand
# Run the random MDP Q-learning experiments with the easy dataset
python run_mdp.py --run-mode qling --extra-exp-prefix run_random_test --dataset easy
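
To queue several runs, the two commands above can be generated programmatically. This is a minimal sketch, assuming it is launched from the directory that contains run_mdp.py with the offrl environment active. `build_command` and `run_all` are hypothetical helpers, and only the flag values shown in the example usage are taken from this README.

```python
import subprocess
import sys


def build_command(run_mode, dataset, prefix="run_random_test"):
    """Build the argument list for one run_mdp.py invocation."""
    return [
        sys.executable, "run_mdp.py",
        "--run-mode", run_mode,
        "--extra-exp-prefix", prefix,
        "--dataset", dataset,
    ]


# (run mode, dataset) pairs taken from the example usage above.
RUNS = [("all", "rand"), ("qling", "easy")]


def run_all(runs=RUNS):
    """Run each experiment in turn, stopping at the first failure."""
    for run_mode, dataset in runs:
        subprocess.run(build_command(run_mode, dataset), check=True)
```

Calling `run_all()` executes the two example experiments one after the other. `check=True` raises an error if a run exits with a non-zero status, so a failed experiment does not go unnoticed.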

Notebooks

Misc.

For more example usage, refer to the IPython notebook tutorials/OfflineRL_frozenlake.ipynb.

Bibtex

@inproceedings{koppel2024informationdirected,
title={Information-Directed Pessimism for Offline Reinforcement Learning},
author={Koppel, Alec and Bhatt, Sujay and Guo, Jiacheng and Eappen, Joe and Wang, Mengdi and Ganesh, Sumitra},
booktitle={Forty-first International Conference on Machine Learning},
year={2024},
}

D3RLPY Old README

See the old README for the original d3rlpy README.md.
