
tu-rbo / learning-state-representations-with-robotic-priors


Code and data accompanying the paper "Learning State Representations with Robotic Priors" (Jonschkowski and Brock, 2015).

Home Page: http://tinyurl.com/gly9sma

License: MIT License

Python 100.00%


Learning State Representations with Robotic Priors

Author and Contact

Rico Jonschkowski ([email protected])

Introduction

This folder contains a simple implementation of the method for state representation learning described in our paper "Learning State Representations with Robotic Priors" (Jonschkowski and Brock, 2015). This implementation complements the paper: it provides sufficient detail for reproducing our results and for reusing the method in other research, while keeping code overhead to a minimum. Extensive explanations and descriptions are omitted here and can be found in the paper: http://tinyurl.com/gly9sma.

If you are using this implementation in your research, please consider giving credit by citing our paper:

@article{jonschkowski2015learning,
  title={Learning state representations with robotic priors},
  author={Jonschkowski, Rico and Brock, Oliver},
  journal={Autonomous Robots},
  volume={39},
  number={3},
  pages={407--428},
  year={2015},
  publisher={Springer}
}

Files

main.py -- python3 script that includes our method, plotting functions, and batch learning experiments for two different tasks

main_tf.py -- python3 script that is equivalent to main.py but uses TensorFlow and Sonnet instead of Theano

*.npz -- training and test data for two different tasks (described in detail below, see Data)

Dependencies

Our code builds on the following python3 libraries:

numpy

sudo apt-get install python3-numpy

matplotlib

sudo apt-get install python3-matplotlib

either lasagne (and theano) --> http://lasagne.readthedocs.io/en/latest/user/installation.html

or sonnet --> https://github.com/deepmind/sonnet and Tensorflow --> https://www.tensorflow.org/install/

Usage

If all dependencies are met, simply run

python3 main.py

or

python3 main_tf.py

which should first learn state representations for the simple navigation task, followed by the slot car task. For each task, the representation is learned from one batch of data (*train.npz) and then applied to a new batch of data (*test.npz). The command line output will walk you through the process.

Data

Each of the four .npz files includes data from 5000 consecutive steps that were generated by random exploration. Each .npz file contains four numpy arrays: observations, actions, rewards, episode_starts. Observations are flattened 16x16 RGB images, actions are integers, rewards are floats, and episode_starts are flags that denote whether the current step is the beginning of a new episode. (Note that the action at time step t is taken at time t but only influences observations and rewards at time step t+1.)

The following excerpt from main.py shows how to load the data in python.

import numpy as np
training_data = np.load('simple_navigation_task_train.npz')
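Once loaded, the four arrays can be accessed by name. The following is only a sketch, not part of main.py: it writes a small synthetic file (the file name, the number of steps, the number of actions, and the array shapes and dtypes are assumptions based on the description above) so that it runs without the repository data, and then pairs each step t with step t+1 while skipping pairs that cross an episode boundary.

```python
import os
import tempfile

import numpy as np

# Synthetic stand-in for one of the .npz files. Assumption: observations are
# stored as one flattened 16x16x3 row per step; the real files have 5000 steps.
num_steps = 10
rng = np.random.default_rng(0)
path = os.path.join(tempfile.mkdtemp(), 'synthetic_train.npz')  # hypothetical name
np.savez(
    path,
    observations=rng.random((num_steps, 16 * 16 * 3)),
    actions=rng.integers(0, 4, size=num_steps),    # 4 actions is an assumption
    rewards=rng.random(num_steps),
    episode_starts=np.arange(num_steps) % 5 == 0,  # a new episode every 5 steps
)

training_data = np.load(path)
observations = training_data['observations']
actions = training_data['actions']
rewards = training_data['rewards']
episode_starts = training_data['episode_starts']

# Un-flatten for viewing. The (height, width, channel) order is an assumption.
images = observations.reshape(-1, 16, 16, 3)

# The action at step t only influences the observation and reward at t+1,
# so a transition (t, t+1) is usable only if step t+1 does not start a new episode.
t = np.flatnonzero(~episode_starts[1:])
next_t = t + 1
print(len(t), 'usable transitions out of', num_steps, 'steps')
```

With the real data, replace the synthetic file with one of the provided files, for example 'simple_navigation_task_train.npz'.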


learning-state-representations-with-robotic-priors's Issues

Wrong number of elements when creating minibatchlist

Hello,

It seems that when creating the minibatchlist (main.py#L121), the wrong total number of elements is used (because the last observation of each episode has been removed):

minibatchlist = [np.array(sorted(indices[start_idx:start_idx + self.batchsize]))
                for start_idx in range(0, num_samples - self.batchsize + 1, self.batchsize)]

indices does not have a length of num_samples, so instead of num_samples - self.batchsize + 1, the upper bound should be len(indices) - self.batchsize + 1:

minibatchlist = [np.array(sorted(indices[start_idx:start_idx + self.batchsize]))
                 for start_idx in range(0, len(indices) - self.batchsize + 1, self.batchsize)]

In fact, the current code always works when we have num_episodes < batchsize, but breaks when we have a lot of episodes (num_episodes > 2 * batchsize).
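The effect can be reproduced outside of main.py with a small standalone sketch. The helper make_minibatches and the way the last step of each episode is dropped are assumptions made for illustration, not the repository's code; only the list comprehension mirrors the snippets above.

```python
import numpy as np


def make_minibatches(indices, batchsize, total):
    """Split indices into sorted minibatches, using `total` as the range bound."""
    return [np.array(sorted(indices[start_idx:start_idx + batchsize]))
            for start_idx in range(0, total - batchsize + 1, batchsize)]


num_samples, batchsize = 100, 10
episode_starts = np.zeros(num_samples, dtype=bool)
episode_starts[::4] = True  # 25 episodes of 4 steps each, so num_episodes > 2 * batchsize

# Assumption: the last step of every episode is dropped, i.e. every step
# that is followed by an episode start, plus the very last step.
is_last = np.append(episode_starts[1:], True)
indices = np.random.default_rng(0).permutation(np.flatnonzero(~is_last))

buggy = make_minibatches(indices, batchsize, num_samples)   # bound from num_samples
fixed = make_minibatches(indices, batchsize, len(indices))  # bound from len(indices)

print('len(indices):', len(indices))
print('buggy batch sizes:', [len(b) for b in buggy])
print('fixed batch sizes:', [len(b) for b in fixed])
```

In this setup only 75 of the 100 indices remain, so the bound based on num_samples produces start offsets past the end of indices, which yields a short minibatch followed by empty ones. Using len(indices) as the bound produces only full minibatches.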

This bug was spotted by @Elovir and @hill-a.
