minerllabs / baselines Goto Github PK

A collection of baselines for the MineRL environment/datasets & the NeurIPS 2021 MineRL competitions

License: MIT License

Python 95.76% Shell 4.24%

baselines's Issues

ResetTrimInfoWrapper no longer needed

With minerl-0.2.3, env.reset() returns only obs, so ResetTrimInfoWrapper is no longer needed. Should it be commented out or deleted?

The baseline code is using this wrapper for PPO/Rainbow and other implementations.
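One way to keep the baselines working on both sides of this change is a reset wrapper that accepts either return shape. The following is only a sketch under stated assumptions: ResetObsOnly, _OldStyleEnv and _NewStyleEnv are hypothetical names made up for illustration, the check assumes observations are never themselves 2-tuples, and a real version would probably subclass gym.Wrapper instead of delegating by hand.

```python
class ResetObsOnly:
    """Hypothetical helper: make reset() return only the observation."""

    def __init__(self, env):
        self.env = env

    def reset(self, **kwargs):
        result = self.env.reset(**kwargs)
        # Older behaviour: an (obs, info) tuple. Newer behaviour (per the issue): obs alone.
        if isinstance(result, tuple) and len(result) == 2:
            return result[0]
        return result

    def __getattr__(self, name):
        # Delegate everything else (step, close, ...) to the wrapped env.
        if name == "env":
            raise AttributeError(name)
        return getattr(self.env, name)


class _OldStyleEnv:
    """Stand-in env whose reset() returns (obs, info)."""

    def reset(self):
        return {"pov": 0}, {"extra": True}


class _NewStyleEnv:
    """Stand-in env whose reset() returns obs only."""

    def reset(self):
        return {"pov": 0}


old_obs = ResetObsOnly(_OldStyleEnv()).reset()
new_obs = ResetObsOnly(_NewStyleEnv()).reset()
print(old_obs, new_obs)
```

With a wrapper of this kind the agent code sees the same observation regardless of which minerl version is installed, so the call sites would not need to change.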

ZeroDivisionError: float division by zero issue

When I run the DQfD code with

python train_dqfd.py --env MineRLTreechop-v0 --expert-demo-path ./expert_dataset/MineRLTreechop-v0/ --frame-skip 4 --frame-stack 4 --gpu 0 --lr 6.25e-5 --minibatch-size 32 --n-experts 16 --use-noisy-net before-pretraining

I get a "ZeroDivisionError: float division by zero" error.

I tried to fix the issue myself, but I have not been able to solve it yet.

unknown option `remote-subodules'

When I run git clone https://github.com/minerllabs/baselines.git --recurse-submodules --remote-subodules, I get the error "unknown option `remote-subodules'". Can you help me?

TypeError: a bytes-like object is required, not 'NoneType' for rainbow on treechop

I've put the log below. It looks as if a None action might be passed into env.step?

INFO     - 2019-08-16 12:26:14,778 - [utils log_versions 9] 3.6.6 (default, Sep  3 2018, 15:31:46) 
[GCC 4.8.5 20150623 (Red Hat 4.8.5-28)]
INFO     - 2019-08-16 12:26:15,566 - [utils log_versions 10] absl-py==0.4.1,asn1crypto==0.24.0,astor==0.7.1,backcall==0.1.0,bcrypt==3.1.6,bitarray==0.9.3,bleach==2.1.4,boto==2.48.0,boto3==1.9.7,botocore==1.12.7,bunch==1.0.1,cached-property==1.4.3,cachetools==2.1.0,certifi==2019.3.9,cffi==1.12.3,chainer==4.4.0,chainerrl==0.7.0,chardet==3.0.4,cloudpickle==1.2.1,coloredlogs==10.0,crowdai-api==0.1.21,cryptography==2.7,cupy==4.0.0,cupy-cuda91==4.0.0,cycler==0.10.0,Cython==0.28.5,dask==0.19.0,decorator==4.4.0,Django==1.9,docutils==0.14,entrypoints==0.2.3,fastrlock==0.4,filelock==3.0.6,Flask==0.10.1,Flask-SocketIO==2.6,Flask-WTF==0.12,future==0.17.1,gast==0.2.0,getch==1.0,gevent==1.1.0,gevent-websocket==0.9.3,google-api-core==1.4.0,google-auth==1.5.1,google-cloud-core==0.28.1,google-cloud-dataproc==0.2.0,google-cloud-logging==1.7.0,google-cloud-storage==1.12.0,google-resumable-media==0.3.1,googleapis-common-protos==1.5.3,greenlet==0.4.14,grpcio==1.14.2,gym==0.14.0,h5py==2.6.0,html5lib==1.0.1,humanfriendly==4.18,idna==2.8,image==1.5.24,ipykernel==4.9.0,ipython==7.7.0,ipython-genutils==0.2.0,ipywidgets==7.4.1,itsdangerous==0.24,jedi==0.13.3,Jinja2==2.10,jmespath==0.9.3,jsonschema==2.6.0,jupyter==1.0.0,jupyter-client==5.2.3,jupyter-console==5.2.0,jupyter-core==4.4.0,kaggle==1.5.4,Keras==2.2.4,Keras-Applications==1.0.6,Keras-Preprocessing==1.0.5,kiwisolver==1.1.0,leveldb==0.194,lmdb==0.87,lxml==4.4.1,malmo==0.36.0.0,Markdown==2.6.11,MarkupSafe==1.0,marlo==0.0.1.dev15,matplotlib==3.0.3,minerl==0.2.3,mistune==0.8.3,mock==2.0.0,mrjob==0.6.5,nbconvert==5.3.1,nbformat==4.4.0,networkx==2.1,nose==1.3.7,notebook==5.6.0,numpy==1.17.0,olefile==0.45.1,openai==0.1,opencv-python==4.1.0.25,pandas==0.23.4,pandocfilters==1.4.2,paramiko==2.5.0,parso==0.4.0,pbr==5.1.3,pexpect==4.7.0,pickleshare==0.7.5,Pillow==6.0.0,pip==10.0.1,pipenv==2018.7.1,pkginfo==1.4.2,prometheus-client==0.3.1,prompt-toolkit==2.0.9,protobuf==3.6.1,psutil==5.6.3,ptyprocess==0.6.0,pyasn1==0.4.4,pyasn1-modules==0.2.2,pycpars
er==2.19,pydotplus==2.0.2,pyglet==1.3.2,Pygments==2.4.2,PyNaCl==1.3.0,pyparsing==2.4.0,PyQt5==5.9.1,PyQt5-sip==4.19.13,Pyro4==4.76,pyserial==3.4,pySmartDL==1.3.1,python-dateutil==2.8.0,python-engineio==2.2.0,python-gflags==3.1.2,python-gitlab==1.8.0,python-magic==0.2,python-slugify==3.0.2,python-socketio==2.0.0,pytz==2018.5,PyWavelets==1.0.0,PyYAML==3.13,pyzmq==17.1.2,qtconsole==4.4.1,redis==3.2.1,requests==2.22.0,requests-toolbelt==0.8.0,rsa==4.0,s3transfer==0.1.13,scikit-fmm==0.0.9,scikit-image==0.14.0,scikit-learn==0.19.2,scipy==1.3.0,seaborn==0.9.0,Send2Trash==1.5.0,serpent==1.28,setuptools==41.0.1,simplegeneric==0.8.1,sip==4.19.8,six==1.12.0,spyder-kernels==0.3.0,tensorboard==1.13.1,tensorflow==1.12.0,tensorflow-estimator==1.13.0,tensorflow-gpu==1.13.1,termcolor==1.1.0,terminado==0.8.1,testpath==0.3.1,text-unidecode==1.2,toolz==0.9.0,torch==0.4.1,torchvision==0.2.1,tornado==5.1,tqdm==4.33.0,traitlets==4.3.2,twine==1.11.0,typing==3.6.6,urllib3==1.25.3,virtualenv==16.0.0,virtualenv-clone==0.3.0,wcwidth==0.1.7,webencodings==0.5.1,Werkzeug==0.14.1,wheel==0.31.1,widgetsnbextension==3.4.1,WTForms==2.1,wurlitzer==1.0.2,XBee==2.3.2
INFO     - 2019-08-16 12:26:15,568 - [__main__ _main 137] The first `gym.make(MineRL*)` may take several minutes. Be patient!
INFO     - 2019-08-16 12:26:20,518 - [minerl.env.malmo.instance.04a184 _launch_minecraft 635] Starting Minecraft process: ['/tmp/tmpipots4mk/Minecraft/launchClient.sh', '-port', '9002', '-env', '-runDir', '/tmp/tmpipots4mk/Minecraft/run']
INFO     - 2019-08-16 12:26:20,643 - [minerl.env.malmo.instance.04a184 _launch_process_watcher 658] Starting process watcher for process 4524 @ localhost:9002
INFO     - 2019-08-16 12:31:11,890 - [minerl.env.malmo.instance.04a184 launch 498] Minecraft process ready
INFO     - 2019-08-16 12:31:11,904 - [__main__ wrap_env 151] Detected `gym.wrappers.TimeLimit`! Unwrap it and re-wrap our own time limit.
INFO     - 2019-08-16 12:31:11,908 - [minerl.env.malmo log_to_file 513] Logging output of Minecraft to results/MineRLTreechop-v0/rainbow/20190816T122613.869530/logs/mc_2.log
INFO     - 2019-08-16 12:31:11,942 - [env_wrappers __init__ 270] always pressing keys: []
INFO     - 2019-08-16 12:31:11,946 - [env_wrappers __init__ 276] reversed pressing keys: []
INFO     - 2019-08-16 12:31:11,947 - [env_wrappers __init__ 281] always ignored keys: ['back', 'left', 'right', 'sneak', 'sprint']
INFO     - 2019-08-16 12:31:11,968 - [env_wrappers __init__ 315] Dict(attack:Discrete(2), back:Discrete(2), camera:Box(2,), forward:Discrete(2), jump:Discrete(2), left:Discrete(2), right:Discrete(2), sneak:Discrete(2), sprint:Discrete(2)) is converted to Discrete(6).
INFO     - 2019-08-16 12:31:11,970 - [__main__ wrap_env 151] Detected `gym.wrappers.TimeLimit`! Unwrap it and re-wrap our own time limit.
INFO     - 2019-08-16 12:31:12,099 - [env_wrappers __init__ 270] always pressing keys: []
INFO     - 2019-08-16 12:31:12,110 - [env_wrappers __init__ 276] reversed pressing keys: []
INFO     - 2019-08-16 12:31:12,112 - [env_wrappers __init__ 281] always ignored keys: ['back', 'left', 'right', 'sneak', 'sprint']
INFO     - 2019-08-16 12:31:12,133 - [env_wrappers __init__ 315] Dict(attack:Discrete(2), back:Discrete(2), camera:Box(2,), forward:Discrete(2), jump:Discrete(2), left:Discrete(2), right:Discrete(2), sneak:Discrete(2), sprint:Discrete(2)) is converted to Discrete(6).
INFO     - 2019-08-16 12:52:51,814 - [chainerrl.experiments.train_agent train_agent 67] outdir:results/MineRLTreechop-v0/rainbow/20190816T122613.869530 step:8000 episode:0 R:0.0
INFO     - 2019-08-16 12:52:52,043 - [chainerrl.experiments.train_agent train_agent 68] statistics:[('average_q', 0.1425044723945982), ('average_loss', 3.8487588996725908), ('n_updates', 748)]
INFO     - 2019-08-16 13:10:50,996 - [chainerrl.experiments.train_agent train_agent 67] outdir:results/MineRLTreechop-v0/rainbow/20190816T122613.869530 step:16000 episode:1 R:0.0
INFO     - 2019-08-16 13:10:51,246 - [chainerrl.experiments.train_agent train_agent 68] statistics:[('average_q', 0.13654071304094723), ('average_loss', 3.7585677461190676), ('n_updates', 2748)]
INFO     - 2019-08-16 13:12:34,895 - [chainerrl.experiments.train_agent save_agent 274] Saved the agent to results/MineRLTreechop-v0/rainbow/20190816T122613.869530/16384_except
ERROR    - 2019-08-16 13:12:34,898 - [__main__ main 132] execution failed.
Traceback (most recent call last):
  File "dqn_family.py", line 130, in main
    _main(args)
  File "dqn_family.py", line 263, in _main
    outdir=args.outdir, eval_env=eval_env, save_best_so_far_agent=True,
  File "/homes/mdl31/.local/lib/python3.6/site-packages/chainerrl/experiments/train_agent.py", line 163, in train_agent_with_evaluation
    logger=logger)
  File "/homes/mdl31/.local/lib/python3.6/site-packages/chainerrl/experiments/train_agent.py", line 54, in train_agent
    obs, r, done, info = env.step(action)
  File "/homes/mdl31/.local/lib/python3.6/site-packages/gym/core.py", line 285, in step
    return self.env.step(self.action(action))
  File "/homes/mdl31/.local/lib/python3.6/site-packages/gym/core.py", line 261, in step
    observation, reward, done, info = self.env.step(action)
  File "/homes/mdl31/.local/lib/python3.6/site-packages/gym/core.py", line 261, in step
    observation, reward, done, info = self.env.step(action)
  File "/homes/mdl31/.local/lib/python3.6/site-packages/gym/core.py", line 261, in step
    observation, reward, done, info = self.env.step(action)
  File "/homes/mdl31/.local/lib/python3.6/site-packages/chainerrl/wrappers/continuing_time_limit.py", line 38, in step
    observation, reward, done, info = self.env.step(action)
  File "/homes/mdl31/.local/lib/python3.6/site-packages/minerl/env/core.py", line 536, in step
    reward, done, sent = struct.unpack('!dbb', reply)
TypeError: a bytes-like object is required, not 'NoneType'
Traceback (most recent call last):
  File "dqn_family.py", line 271, in <module>
    main()
  File "dqn_family.py", line 130, in main
    _main(args)
  File "dqn_family.py", line 263, in _main
    outdir=args.outdir, eval_env=eval_env, save_best_so_far_agent=True,
  File "/homes/mdl31/.local/lib/python3.6/site-packages/chainerrl/experiments/train_agent.py", line 163, in train_agent_with_evaluation
    logger=logger)
  File "/homes/mdl31/.local/lib/python3.6/site-packages/chainerrl/experiments/train_agent.py", line 54, in train_agent
    obs, r, done, info = env.step(action)
  File "/homes/mdl31/.local/lib/python3.6/site-packages/gym/core.py", line 285, in step
    return self.env.step(self.action(action))
  File "/homes/mdl31/.local/lib/python3.6/site-packages/gym/core.py", line 261, in step
    observation, reward, done, info = self.env.step(action)
  File "/homes/mdl31/.local/lib/python3.6/site-packages/gym/core.py", line 261, in step
    observation, reward, done, info = self.env.step(action)
  File "/homes/mdl31/.local/lib/python3.6/site-packages/gym/core.py", line 261, in step
    observation, reward, done, info = self.env.step(action)
  File "/homes/mdl31/.local/lib/python3.6/site-packages/chainerrl/wrappers/continuing_time_limit.py", line 38, in step
    observation, reward, done, info = self.env.step(action)
  File "/homes/mdl31/.local/lib/python3.6/site-packages/minerl/env/core.py", line 536, in step
    reward, done, sent = struct.unpack('!dbb', reply)
TypeError: a bytes-like object is required, not 'NoneType'
ERROR    - 2019-08-16 13:12:36,078 - [minerl.env.malmo _kill_minecraft_via_malmoenv 688] Attempted to send kill command to minecraft process and failed.
INFO     - 2019-08-16 13:12:36,164 - [minerl.env.malmo on_terminate 345] Minecraft process psutil.Process(pid=4524, status='terminated') terminated with exit code 0
/usr/bin/xvfb-run: line 186: kill: (4389) - No such process
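
The traceback above ends in struct.unpack('!dbb', reply) being called with reply set to None. The sketch below reproduces only that failure mode with the standard library and says nothing about why the reply went missing; parse_step_reply is a hypothetical helper written for illustration, not part of minerl.

```python
import struct

STEP_REPLY_FORMAT = '!dbb'  # format string taken from the traceback


def parse_step_reply(reply):
    """Hypothetical helper: decode a step reply, failing clearly on None."""
    if reply is None:
        raise ConnectionError("no reply received from the Minecraft process")
    reward, done, sent = struct.unpack(STEP_REPLY_FORMAT, reply)
    return reward, bool(done), bool(sent)


# A well-formed reply: one double followed by two signed bytes (10 bytes).
good_reply = struct.pack(STEP_REPLY_FORMAT, 1.5, 0, 1)
print(parse_step_reply(good_reply))

# Passing None straight to struct.unpack raises a TypeError, as in the log.
raw_error = None
try:
    struct.unpack(STEP_REPLY_FORMAT, None)
except TypeError as exc:
    raw_error = exc
    print(type(exc).__name__)
```

An explicit check of this kind would turn the opaque TypeError into an error that names the missing reply, which may make it easier to tell a dropped connection from a bad action.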

When will you release the competition baseline?

There is only a README in the directory.

Question about DQFD result

Hello,

I am trying to reproduce DQfD on the MineRLTreechop-v0 task by running the dqfd.bash file. I set the pretraining step parameter to 250000 and 500000 because the training time is rather long. After pretraining, my agent keeps looking only at the sky, not at the trees.

Can you tell me anything about this issue?

running multiple environments

Is there support for training on multiple environments simultaneously (i.e., PPO with 16 envs)?

Unable to clone quick_start: Permission denied.

I do not have the rights to access the repository. Can you please assist? Thanks!

Missing files in /data for the SQIL baseline agent

Hi,
I tried to train the SQIL baseline agent, but errors occur. It seems that some files that should be in the /data directory are missing, so the following cannot be imported:

from data.pipeline_wrapper import DataPipelineWrapper
from data.observation_converter import (
    GrayScaleConverter, PoVOnlyConverter, VectorCombineConverter,
    MoveAxisConverter, ScaledFloatConverter)
from data.action_converter import (
    VectorActionConverter, VectorDiscretizeConverter,
    KMeansActionConverter, DualKMeansActionConverter)

Could you please look into it? Thank you in advance!

Error occurred during training rainbow on Treechop

I just ran your Rainbow script on my machine, and an error occurred while training Rainbow on Treechop.

I think something goes wrong while env.step is receiving a message from Malmo. Can you help me? The error messages are below.

  File "/root/miniconda/lib/python3.7/site-packages/minerl/env/core.py", line 527, in step
    reward, done, sent = struct.unpack('!dbb', reply)
TypeError: a bytes-like object is required, not 'NoneType'

execute rainbow.sh
.....
INFO - 2019-07-22 04:45:35,103 - [chainerrl.experiments.train_agent train_agent 68] statistics:[('average_q', 0.1883753465166074), ('average_loss', 3.5857960946642597), ('n_updates', 8567)]
INFO - 2019-07-22 04:47:50,474 - [chainerrl.experiments.train_agent train_agent 67] outdir:/shared/storage/personal/jinwoo.yoon/results/MineRLTreechop-v0/rainbow/20190722T040028.790414 step:41278 episode:21 R:0.0
INFO - 2019-07-22 04:47:50,474 - [chainerrl.experiments.train_agent train_agent 68] statistics:[('average_q', 0.20016808421641685), ('average_loss', 3.4986483235618375), ('n_updates', 9067)]
INFO - 2019-07-22 04:49:53,632 - [chainerrl.experiments.train_agent train_agent 67] outdir:/shared/storage/personal/jinwoo.yoon/results/MineRLTreechop-v0/rainbow/20190722T040028.790414 step:43278 episode:22 R:4.0
INFO - 2019-07-22 04:49:53,635 - [chainerrl.experiments.train_agent train_agent 68] statistics:[('average_q', 0.1945552027176348), ('average_loss', 3.4944816582370826), ('n_updates', 9567)]
INFO - 2019-07-22 04:53:11,385 - [chainerrl.experiments.train_agent save_agent 274] Saved the agent to /shared/storage/personal/jinwoo.yoon/results/MineRLTreechop-v0/rainbow/20190722T040028.790414/44291_except
ERROR - 2019-07-22 04:53:11,386 - [__main__ main 132] execution failed.

Traceback (most recent call last):
  File "dqn_family.py", line 273, in <module>
    main()
  File "dqn_family.py", line 130, in main
    _main(args)
  File "dqn_family.py", line 265, in _main
    outdir=args.outdir, eval_env=eval_env, save_best_so_far_agent=True,
  File "/root/miniconda/lib/python3.7/site-packages/chainerrl/experiments/train_agent.py", line 163, in train_agent_with_evaluation
    logger=logger)
  File "/root/miniconda/lib/python3.7/site-packages/chainerrl/experiments/train_agent.py", line 54, in train_agent
    obs, r, done, info = env.step(action)
  File "/root/.local/lib/python3.7/site-packages/gym/core.py", line 282, in step
    return self.env.step(self.action(action))
  File "/root/miniconda/lib/python3.7/site-packages/chainerrl/wrappers/atari_wrappers.py", line 210, in step
    ob, reward, done, info = self.env.step(action)
  File "/root/.local/lib/python3.7/site-packages/gym/core.py", line 258, in step
    observation, reward, done, info = self.env.step(action)
  File "/root/.local/lib/python3.7/site-packages/gym/core.py", line 258, in step
    observation, reward, done, info = self.env.step(action)
  File "/root/.local/lib/python3.7/site-packages/gym/core.py", line 258, in step
    observation, reward, done, info = self.env.step(action)
  File "/root/quick_start/chainerrl_baselines/baselines/env_wrappers.py", line 110, in step
    obs, reward, done, info = self.env.step(action)
  File "/root/.local/lib/python3.7/site-packages/gym/core.py", line 224, in step
    return self.env.step(action)
  File "/root/miniconda/lib/python3.7/site-packages/chainerrl/wrappers/continuing_time_limit.py", line 38, in step
    observation, reward, done, info = self.env.step(action)
  File "/root/miniconda/lib/python3.7/site-packages/minerl/env/core.py", line 527, in step
    reward, done, sent = struct.unpack('!dbb', reply)
TypeError: a bytes-like object is required, not 'NoneType'
ERROR - 2019-07-22 04:53:12,426 - [minerl.env.malmo _kill_minecraft_via_malmoenv 606] Attempted to send kill command to minecraft process and failed.
INFO - 2019-07-22 04:53:12,427 - [minerl.env.malmo on_terminate 277] Minecraft process psutil.Process(pid=2194, status='terminated') terminated with exit code 0

[Question] cv2 error on using expert_dataset.py (follow-up from DeprecationWarning: `DataPipeline.sarsd_iter`)

Hello. There is a cv2 error when running any baseline code from general/chainerrl/baselines (after fixing the DeprecationWarning issue). Here's the traceback from running the gail.py baseline:

Traceback (most recent call last):
  File "gail.py", line 421, in <module>
    main()
  File "gail.py", line 146, in main
    _main(args)
  File "gail.py", line 291, in _main
    shuffle=False)
  File "C:\MineRL\baselines\general\chainerrl\baselines\expert_dataset.py", line 110, in __init__
    self._convert(original_dataset)
  File "C:\MineRL\baselines\general\chainerrl\baselines\expert_dataset.py", line 120, in _convert
    obs = self.observation_converter(orig_obs)
  File "C:\MineRL\baselines\general\chainerrl\baselines\observation_wrappers.py", line 24, in converter
    cv2.cvtColor(obs, cv2.COLOR_RGB2GRAY), axis=-1)
cv2.error: OpenCV(4.3.0) c:\projects\opencv-python\opencv\modules\imgproc\src\color.simd_helpers.hpp:92: error: (-2:Unspecified error) in function '__cdecl cv::impl::`anonymous-namespace'::CvtHelper<struct cv::impl::`anonymous namespace'::Set<3,4,-1>,struct cv::impl::A0xe227985e::Set<1,-1,-1>,struct cv::impl::A0xe227985e::Set<0,2,5>,2>::CvtHelper(const class cv::_InputArray &,const class cv::_OutputArray &,int)'
> Invalid number of channels in input image:
>     'VScn::contains(scn)'
> where
>     'scn' is 1

Exception in thread QueueManagerThread:
Traceback (most recent call last):
  File "C:\Users\prabh\Miniconda3\envs\minerlpk\lib\threading.py", line 917, in _bootstrap_inner
    self.run()
  File "C:\Users\prabh\Miniconda3\envs\minerlpk\lib\threading.py", line 865, in run
    self._target(*self._args, **self._kwargs)
  File "C:\Users\prabh\Miniconda3\envs\minerlpk\lib\concurrent\futures\process.py", line 354, in _queue_management_worker
    ready = wait(readers + worker_sentinels)
  File "C:\Users\prabh\Miniconda3\envs\minerlpk\lib\multiprocessing\connection.py", line 872, in wait
    ov.cancel()
OSError: [WinError 6] The handle is invalid

Failed to delete the temporary minecraft directory.

On tracing the issue, it turns out that what is fed into self.observation_converter() and self.action_converter() has shape (1, 32, 64, 64, 3) instead of (32, 64, 64, 3). The extra leading dimension comes from the first argument of the new DataPipeline.batch_iter(1, 32, 1) call.
So, in both observation_wrappers.py and action_wrappers.py, we need to ignore the first dimension (use dict[keys][0] instead of dict[keys]) and add it back on return (return np.expand_dims(list, 0) instead of return list). This seems to fix the problem.

Note: dict stands for observation and action, and list stands for ret and val, in the two files respectively.

PS: If the spaces weren't dict spaces, we could fix this by simply initializing the converters as lambda x: x[0] instead of lambda x: x.
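
The workaround described above (index with [0] on the way in, np.expand_dims on the way out) can be sketched as a small adapter around a converter. This is an illustration only: wrap_converter, squeeze_leading_dim and pov_to_gray are hypothetical names, and the grayscale step is replaced by a channel mean so the example needs only NumPy rather than cv2.

```python
import numpy as np


def squeeze_leading_dim(batch):
    """Drop the extra leading axis of size 1 from every array in a dict."""
    return {key: np.asarray(value)[0] for key, value in batch.items()}


def wrap_converter(converter):
    """Adapt a converter written for (32, 64, 64, 3) inputs to (1, 32, 64, 64, 3)."""
    def wrapped(batch):
        result = converter(squeeze_leading_dim(batch))
        # Re-append the leading dimension so callers see the batched shape.
        return np.expand_dims(result, 0)
    return wrapped


def pov_to_gray(obs):
    # Stand-in for the grayscale conversion: average the colour channels
    # and keep a trailing channel axis of size 1.
    return obs["pov"].mean(axis=-1, keepdims=True)


obs = {"pov": np.zeros((1, 32, 64, 64, 3), dtype=np.float32)}
out = wrap_converter(pov_to_gray)(obs)
print(out.shape)  # (1, 32, 64, 64, 1)
```

Wrapping the converters this way keeps the original converter bodies unchanged, at the cost of assuming the leading dimension is always exactly 1.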

System Info

Windows 10
Python 3.7.0
Tensorflow 2.2.0
minerl 0.3.5

Result in Treechop

I retrained PPO in the Treechop environment, but the result differs from the paper: I only get a final reward of 20. I didn't change anything. What could the problem be?
