avivt / vin

Value Iteration Networks

License: Other

Python 50.87% MATLAB 30.75% Shell 18.39%

vin's Introduction

Allan Lab Website

This is the website of our academic research group at Leiden University.

This website is powered by Jekyll, with some Bootstrap and Bootswatch. We tried to make it simple yet adaptable, so that it is easy for you to use as a template. Please feel free to copy and modify it for your own purposes. You don't have to link to us or mention us (but of course we appreciate it).

Go to aboutwebsite.md to learn how to copy and modify this page for your purposes.

vin's Issues

Difficulties in reproducing results from the paper

I've run into some issues reproducing the results from the VIN paper. I ran the commands from the scripts/nips_gridworld_experiments_VIN.sh file, but I could only reproduce the results for the 8x8 grid world. I trained the network 6 times on the 16x16 grid world problem, and the best run reached a loss of 0.1006 and an accuracy of 95.7% (on average, loss 0.11 and accuracy 95%). I also trained the network on the 28x28 grid world problem, but after 120 epochs the loss was 0.26 and the accuracy 89.37%.

Is there anything that I can do to get the same (or similar) accuracy as in the paper?

Does VIN naturally work with reinforcement learning?

As I read the paper, the examples in the main text are mainly aimed at supervised learning (imitation learning), though a few use reinforcement learning. So my question is: does VIN work naturally with RL? In addition, almost all of the examples involve extracting some high-level grid-world representation of the state space. It is not clear how this model could be applied to a more realistic domain where representing all states may be infeasible.

Why aren't obstacles considered during rollout testing?

Hi,

Could you please let me know why obstacles are not taken into account in the rollout testing phase?
As far as I can tell, the code only considers transitions to free cells, not whether a transition is valid under a particular action.

Thanks

Cannot generate grid_world data with script_make_data.m

I tried to generate the grid_world data with script_make_data.m. However, I got the following message:

no obstacles added, or problem with border, regenerating map

I printed dom and checked the variables n_obs and add_border_res. I found that dom is always 1, and that add_border_res makes the condition n_obs == 0 || add_border_res true.

Could you please help on this? Thanks!

Purpose of image_shape and filter_shape in conv2D_keep_shape?

In your conv2D_keep_shape function, which I copied below, what is the purpose of the image_shape and filter_shape args?

def conv2D_keep_shape(x, w, image_shape, filter_shape, subsample=(1, 1)):
    # crop output to same size as input
    fs = T.shape(w)[2] - 1  # this is the filter size minus 1
    ims = T.shape(x)[2]     # this is the image size
    return theano.sandbox.cuda.dnn.dnn_conv(img=x,
                                            kerns=w,
                                            border_mode='full',
                                            subsample=subsample,
                                            )[:, :, fs/2:ims+fs/2, fs/2:ims+fs/2]

These parameters are passed every time the function is called to do a convolution, but they aren't used in any way. For example:

self.h = conv2D_keep_shape(in_x, self.w0, image_shape=[batchsize, self.w0.shape.eval()[1], imsize[0], imsize[1]], filter_shape=self.w0.shape.eval())

Are they intended to be used somehow or is this a remnant of old code being refactored? I ask because I implemented this architecture in TensorFlow and am attempting to replicate the results.
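
For anyone porting this function, the cropping arithmetic can be checked independently of Theano. The following is a minimal pure-Python sketch in one dimension, assuming an odd filter width; `full_conv1d` and `conv1d_keep_shape` are hypothetical names for illustration and are not part of the repository. It needs only the runtime lengths of the input and the kernel, which is consistent with the observation that the extra shape arguments play no role in the crop itself.

```python
def full_conv1d(x, w):
    """'Full' 1-D convolution: output length is len(x) + len(w) - 1."""
    out = [0.0] * (len(x) + len(w) - 1)
    for i, xi in enumerate(x):
        for j, wj in enumerate(w):
            out[i + j] += xi * wj
    return out


def conv1d_keep_shape(x, w):
    """Crop the full convolution back to len(x), mirroring the slice above."""
    fs = len(w) - 1          # filter size minus 1
    ims = len(x)             # input size
    start = fs // 2          # integer division, like fs/2 on Python 2 ints
    return full_conv1d(x, w)[start:start + ims]


x = [1.0, 2.0, 3.0, 4.0, 5.0]
identity = [0.0, 1.0, 0.0]   # odd-width kernel that leaves the input unchanged
y = conv1d_keep_shape(x, identity)
box = conv1d_keep_shape(x, [1.0, 1.0, 1.0])
```

With the identity kernel the cropped output equals the input, and with any 3-tap kernel the output keeps the input length. For plain Python integers, `/` yields a float on Python 3, so a port would need `//` in the slice bounds.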

Release code for Web Navigation ?

Hi, could you release the code for the web navigation experiments as well?
Thanks!

Missing a function when running data generation file.

Hi! I want to run some experiments on larger domains, such as a 100x100 gridworld. But when I run the data generation file "make_data_gridworld_nips.m", a missing function called "shortest_paths()" causes the code to fail at this line:
[~, pred] = shortest_paths(G_inv,goal_s,options);
Could you please release the code for the function "shortest_paths()"? Thank you very much!
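
The call above returns a distance vector (discarded) and a predecessor vector for shortest paths from the goal. A function with this name and signature exists in the MatlabBGL toolbox, which may be where it is expected to come from, though that is an assumption rather than something stated in this thread. For readers who only need the same kind of output, here is a minimal Python sketch of Dijkstra's algorithm that returns distances and predecessors. It is a stand-in for illustration, not the function the repository uses, and the adjacency-list format is an assumption.

```python
import heapq


def shortest_paths(adj, source):
    """Dijkstra from `source` over adj: {node: [(neighbor, weight), ...]}.

    Returns (dist, pred); unreachable nodes keep dist=inf and pred=None.
    """
    dist = {node: float("inf") for node in adj}
    pred = {node: None for node in adj}
    dist[source] = 0.0
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue  # stale queue entry, a shorter path was already found
        for v, w in adj[u]:
            nd = d + w
            if nd < dist[v]:
                dist[v] = nd
                pred[v] = u
                heapq.heappush(heap, (nd, v))
    return dist, pred


# Tiny 2x2 grid with 4-connected, unit-cost moves; nodes are (row, col).
adj = {}
for r in range(2):
    for c in range(2):
        neighbors = []
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if 0 <= nr < 2 and 0 <= nc < 2:
                neighbors.append(((nr, nc), 1.0))
        adj[(r, c)] = neighbors

dist, pred = shortest_paths(adj, (0, 0))
```

Following `pred` backwards from any cell reconstructs a shortest path to the source, which is how a predecessor array is typically used to label optimal actions.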

Shape mismatch when taking out the Q values out from ConvNet

When I run the program with the provided data, I get a shape mismatch error. The command I used is:

root@32f2d2ff7341:~/VIN# python NN_run_training.py --model valIterBatch --input ./data/gridworld_28.mat --output ./tmp

The error is raised when the Q values are taken out of the ConvNet output. Full log:

Using gpu device 0: GeForce GTX TITAN X (CNMeM is disabled, cuDNN 5105)
/usr/local/lib/python2.7/dist-packages/theano/sandbox/cuda/__init__.py:600: UserWarning: Your cuDNN version is more recent than the one Theano officially supports. If you see any problems, try updating Theano or downgrading cuDNN to version 5.
  warnings.warn(warn)
/usr/local/lib/python2.7/dist-packages/theano/tensor/signal/downsample.py:6: UserWarning: downsample module has been moved to the theano.tensor.signal.pool module.
  "downsample module has been moved to the theano.tensor.signal.pool module.")
valIterBatch
     Epoch |  Train NLL |  Train Err |   Test NLL |   Test Err | Epoch Time
Traceback (most recent call last):
  File "NN_run_training.py", line 95, in <module>
    main()
  File "NN_run_training.py", line 91, in main
    grad_check=args.grad_check, batch_size=args.batchsize, data_fraction=args.data_fraction)
  File "/root/VIN/vin.py", line 103, in run_training
    ytrain[start*self.statebatchsize:end*self.statebatchsize])
  File "/usr/local/lib/python2.7/dist-packages/theano/compile/function_module.py", line 871, in __call__
    storage_map=getattr(self.fn, 'storage_map', None))
  File "/usr/local/lib/python2.7/dist-packages/theano/gof/link.py", line 314, in raise_with_op
    reraise(exc_type, exc_value, exc_trace)
  File "/usr/local/lib/python2.7/dist-packages/theano/compile/function_module.py", line 859, in __call__
    outputs = self.fn()
  File "/usr/local/lib/python2.7/dist-packages/theano/gof/op.py", line 912, in rval
    r = p(n, [x[0] for x in i], o)
  File "/usr/local/lib/python2.7/dist-packages/theano/tensor/subtensor.py", line 2166, in perform
    out[0] = inputs[0].__getitem__(inputs[1:])
IndexError: shape mismatch: indexing arrays could not be broadcast together with shapes (128,) (1280,) (1280,)
Apply node that caused the error: AdvancedSubtensor(HostFromGpu.0, Reshape{1}.0, SliceConstant{None, None, None}, Reshape{1}.0, Reshape{1}.0)
Toposort index: 243
Inputs types: [TensorType(float32, 4D), TensorType(int64, vector), <theano.tensor.type_other.SliceType object at 0x7fbbf44c0a50>, TensorType(int8, vector), TensorType(int8, vector)]
Inputs shapes: [(128, 10, 28, 28), (128,), 'No shapes', (1280,), (1280,)]
Inputs strides: [(31360, 3136, 112, 4), (8,), 'No strides', (1,), (1,)]
Inputs values: ['not shown', 'not shown', slice(None, None, None), 'not shown', 'not shown']
Outputs clients: [[GpuFromHost(AdvancedSubtensor.0)]]

Backtrace when the node is created(use Theano flag traceback.limit=N to make it longer):
  File "NN_run_training.py", line 95, in <module>
    main()
  File "NN_run_training.py", line 62, in main
    batchsize=args.batchsize, statebatchsize=args.statebatchsize)
  File "/root/VIN/vin.py", line 34, in __init__
    k=self.k)
  File "/root/VIN/vin.py", line 246, in __init__
    in_s2.flatten()]

I am not good at debugging Theano (it is much harder than TensorFlow). I guess the error is caused by the : indexing. Could you please help with this?
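
The IndexError follows from the broadcasting rule for advanced indexing: all index arrays must broadcast to one common shape. In the log, the batch index has 128 entries while the two state-coordinate indices have 1280 (10 states per image). One plausible reading, not confirmed in this thread, is that the value of `args.statebatchsize` does not match the number of states per sample in the data file. The sketch below reproduces the rule in pure Python; `broadcast_shape` and `repeat_index` are hypothetical helpers written for illustration.

```python
def broadcast_shape(*shapes):
    """Return the shape the given shapes broadcast to, NumPy-style.

    Raises ValueError when two aligned dimensions differ and neither is 1.
    """
    ndim = max(len(s) for s in shapes)
    result = []
    for axis in range(1, ndim + 1):          # walk from the trailing axis
        dim = 1
        for s in shapes:
            if axis > len(s):
                continue                      # missing leading axes act as 1
            d = s[-axis]
            if d != 1 and dim != 1 and d != dim:
                raise ValueError("shape mismatch: %r" % (shapes,))
            dim = max(dim, d)
        result.append(dim)
    return tuple(reversed(result))


def repeat_index(n_rows, repeats):
    """[0, 0, ..., 1, 1, ...]: each row index repeated `repeats` times."""
    return [row for row in range(n_rows) for _ in range(repeats)]


batch, states_per_image = 128, 10

# The shapes from the traceback do not broadcast together.
try:
    broadcast_shape((batch,), (1280,), (1280,))
    mismatch = False
except ValueError:
    mismatch = True

# Repeating each batch index once per state gives a matching length.
rows = repeat_index(batch, states_per_image)
fixed = broadcast_shape((len(rows),), (1280,), (1280,))
```

If the batch index is repeated once per state, all three index arrays have length 1280 and the lookup selects one Q vector per (image, state) pair.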

Why not just use Python packages? Nobody uses the old grandfather called MATLAB

It is a waste of time to learn two libraries, especially damn MATLAB. I have not used MATLAB since I left university 6 years ago and have never come across any company in the real world using it. Some die-hard colleges that don't want to let go still use MATLAB, but the majority have moved on to Python.
There are TensorFlow, Theano and others that would accomplish the same task, probably in an easier way, and would help the majority of people adopt your research.

Release code for Mars Navigation and MuJoCo continuous control?

Would it also be possible to release the code for the Mars navigation dataset and models, and particularly for continuous control with the MuJoCo simulator? Did you use OpenAI Gym for it, by the way?

In addition, there are other continuous control tasks in OpenAI Gym, e.g. LunarLander, CarRacing, BipedalWalker. Do you think VIN could be applied to these tasks? Perhaps even to some games in OpenAI Universe?

Code for hierarchical VI modules

Hi, do you plan to release the code for hierarchical VI modules at some point?

Error importing the Theano module when MATLAB calls Python

Hi! Recently we tried to reproduce the experiment, but the visualization part (script_viz_policy.m) requires MATLAB to call Python code (vin.py). That code imports the theano module (which I have already installed), but MATLAB reported an error:
Undefined variable "py" or class "py.vin.vin".

Error in script_viz_policy (line 3)
tmp = py.vin.vin; clear tmp; % to load Python

Following the MATLAB online help, I tested with the command py.importlib.import_module('vin'), which still reported an error:

Error using theano_utils><module> (line 3)
Python Error: ImportError: No module named theano

Error in vin><module> (line 3)
from theano_utils import *

Error in __init__>import_module (line 37)
__import__(name)

But I can import theano in Python:

import theano
import theano.tensor as T
theano.test()
Theano version 0.8.2
theano is installed in //anaconda/lib/python2.7/site-packages/theano
NumPy version 1.11.1
NumPy relaxed strides checking option: False
NumPy is installed in //anaconda/lib/python2.7/site-packages/numpy
Python version 2.7.12 |Anaconda 4.2.0 (x86_64)| (default, Jul 2 2016, 17:43:17) [GCC 4.2.1 (Based on Apple Inc. build 5658) (LLVM build 2336.11.00)]
nose version 1.3.7

I couldn't figure out where the problem is. Thank you in advance!
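
A common cause of this pattern is that MATLAB is bound to a different Python interpreter than the one where Theano is installed (here, the Anaconda one). Whether that applies in this case is an assumption. A small diagnostic like the following, run once from the terminal Python and once through MATLAB's `py.` interface, makes the two environments easy to compare. `can_import` and `describe_interpreter` are hypothetical helper names.

```python
import sys


def can_import(name):
    """Return True if `name` imports in this interpreter, else False."""
    try:
        __import__(name)
        return True
    except ImportError:
        return False


def describe_interpreter(modules=("theano",)):
    """Collect the facts worth comparing between two Python environments."""
    return {
        "executable": sys.executable,          # path of the running interpreter
        "version": sys.version.split()[0],     # e.g. "2.7.12"
        "importable": {m: can_import(m) for m in modules},
    }


# Demonstration with one module that always exists and one that does not.
info = describe_interpreter(modules=("sys", "module_that_does_not_exist_xyz"))
```

In MATLAB, the `pyversion` command reports which interpreter is in use. If its executable path differs from the one printed by the terminal Python, the two are looking at different site-packages directories.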

Release code for gridworld with reinforcement learning?

Hi, could you release the code for gridworld with reinforcement learning? Thanks a lot!

Why does l_q equal 10 rather than 8?

Hi,

VIN/vin.py

Line 30 in fe11bb1

l_q = 10 # channels in q layer (~actions)

Why does l_q equal 10 rather than 8?
Thanks