An independent implementation of the RL method Policy Optimization with Model-based Explorations (POME).
conda env create -f environment.yml
TensorBoard logs are written under the runs/ directory. To visualize them after a POME run:
tensorboard --logdir runs
or
python -m tensorboard.main --logdir=./experiments/{algo}/runs
- Currently, the single-file implementation pome.py runs without error, but the algorithm has not been tested
- Reproducing OpenAI Baselines requires many additional implementation details, according to this site
- Visualization
- code for other environments
- fixed-length trajectory segments
- Orthogonal Initialization of Weights and Constant Initialization of biases
- Mini-batch Updates
- Skip Frame
- Resize images
- Scaling the Images to Range [0, 1]
- minibatch standardization
- remove reward estimation
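
Two of the items listed above, mini-batch updates and minibatch standardization, can be sketched as follows. This is a minimal illustration, not the code in pome.py. It assumes that "minibatch standardization" means standardizing advantage estimates within each minibatch, which the list does not state. The names `iter_minibatches` and `standardize` and the sample advantage values are hypothetical. It uses only the Python standard library.

```python
import random
import statistics


def iter_minibatches(batch_size, minibatch_size, rng):
    """Yield shuffled index lists that together cover range(batch_size) once."""
    indices = list(range(batch_size))
    rng.shuffle(indices)
    for start in range(0, batch_size, minibatch_size):
        yield indices[start:start + minibatch_size]


def standardize(values, eps=1e-8):
    """Shift values to zero mean and scale by the population standard deviation."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    # eps avoids division by zero when all values are identical.
    return [(v - mean) / (std + eps) for v in values]


# Hypothetical advantage estimates from one fixed-length trajectory segment.
advantages = [0.5, -1.0, 2.0, 0.0, 1.5, -0.5, 3.0, 1.0]
rng = random.Random(0)

for mb_indices in iter_minibatches(len(advantages), minibatch_size=4, rng=rng):
    # Standardize per minibatch (assumption), not over the whole batch.
    mb_adv = standardize([advantages[i] for i in mb_indices])
    # A real update step would use mb_adv in the policy loss here.
    print(mb_indices, [round(a, 3) for a in mb_adv])
```

Each pass over the data reshuffles the indices, so every sample is used exactly once per pass while the minibatch composition changes between passes.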