Giter Site home page Giter Site logo

xuanlinli17 / cs285_fa19_deep_reinforcement_learning Goto Github PK

View Code? Open in Web Editor NEW
116.0 116.0 37.0 52.33 MB

My solutions to UC Berkeley CS285 (originally CS294-112, deeprlcourse) Fall 2019 assignments

Home Page: https://github.com/xuanlinli17/CS285_Fa19_Deep_Reinforcement_Learning

Python 99.66% Jupyter Notebook 0.02% Shell 0.33%

cs285_fa19_deep_reinforcement_learning's People

Contributors

xuanlinli17 avatar

Stargazers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar

Watchers

 avatar  avatar  avatar

cs285_fa19_deep_reinforcement_learning's Issues

Reproducing the result of hw1 problem 1(b)

Hi there! I am trying to reproduce the result of homework 1, problem 1(b). I use the file requirements.txt to install all my dependencies. And when I ran the command:

python cs285/scripts/run_hw1_behavior_cloning.py --expert_policy_file cs285/policies/experts/HalfCheetah.pkl --env_name HalfCheetah-v2 --exp_name test_bc_hcheetah --n_iter 1 --expert_data cs285/expert_data/expert_data_HalfCheetah-v2.pkl --batch_size=1000 --eval_batch_size=5000

what I got:

Loading expert policy from... cs285/policies/experts/HalfCheetah.pkl
obs (1, 17) (1, 17)
Done restoring expert policy...


********** Iteration 0 ************

Training agent using sampled data from replay buffer...

Beginning logging procedure...

Collecting data for eval...
Eval_AverageReturn : 4.991946220397949
Eval_StdReturn : 17.147544860839844
Eval_MaxReturn : 32.29301452636719
Eval_MinReturn : -9.376068115234375
Eval_AverageEpLen : 1000.0
Train_AverageReturn : 4205.7783203125
Train_StdReturn : 83.038818359375
Train_MaxReturn : 4288.81689453125
Train_MinReturn : 4122.7392578125
Train_AverageEpLen : 1000.0
Train_EnvstepsSoFar : 0
TimeSinceStart : 4.198240041732788
Initial_DataCollection_AverageReturn : 4205.7783203125
Done logging...



Saving agent's actor...

So the average return of evaluation is about 4.99, which does not match the result provided in folder ./hw1/run_logs/bc_test_bc_hcheetah_HalfCheetah-v2_16-09-2019_00-58-58/. I was wondering which part I've done wrong and it would be nice if you could help me figure it out. Many thanks!

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.