Comments (7)

hshaikusa commented on August 16, 2024

Another error:

Command issued for the MNIST example:

C:\mlperf\mlbox_11062020\box_examples\mnist> docker run --rm --net=host --privileged=true --volume C:\mlperf\mlbox_11062020\box_examples\mnist\workspace/data:/mlbox_io0/data --volume C:\mlperf\mlbox_11062020\box_examples\mnist\workspace/download_logs:/mlbox_io1/download_logs serebrya/mlbox_mnist:0.0.2 download --data_dir=/mlbox_io0/data --log_dir=/mlbox_io1/download_logs

Here is the error:

2020-11-10 16:58:42.772479: W tensorflow/stream_executor/platform/default/dso_loader.cc:55] Could not load dynamic library 'libnvinfer.so.6'; dlerror: libnvinfer.so.6: cannot open shared object file: No such file or directory
2020-11-10 16:58:42.772697: W tensorflow/stream_executor/platform/default/dso_loader.cc:55] Could not load dynamic library 'libnvinfer_plugin.so.6'; dlerror: libnvinfer_plugin.so.6: cannot open shared object file: No such file or directory
2020-11-10 16:58:42.772714: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:30] Cannot dlopen some TensorRT libraries. If you would like to use Nvidia GPU with TensorRT, please make sure the missing libraries mentioned above are installed properly.

from mlcube.

sergey-serebryakov commented on August 16, 2024

@hshaikusa These messages are OK: they are warnings, not errors. When no GPUs are available, TF should fall back to the CPU compute backend. I see these messages on Linux machines as well.

hshaikusa commented on August 16, 2024

@sergey-serebryakov OK.
Here is another error I am facing with the MNIST example:

command:
C:\mlperf\mlbox_11062020\box_examples\mnist> mlcommons_box_docker run --mlbox=. --platform=platforms/docker.yaml --task=run/train.yaml

outcome:

MLBox(root=C:\mlperf\mlbox_11062020\box_examples\mnist, name=mnist, version=0.1.0, task=MLBoxTask(inputs={'data_dir': 'directory', 'parameters_file': 'file'}, outputs={'log_dir': 'directory', 'model_dir': 'directory'}), invoke=MLBoxInvoke(task_name=train, input_binding={'data_dir': '$WORKSPACE/data', 'parameters_file': '$WORKSPACE/parameters/default.parameters.yaml'}, output_binding={'log_dir': '$WORKSPACE/train_logs', 'model_dir': '$WORKSPACE/model'}), platform=<mlcommons_box.common.objects.platform_config.PlatformConfig object at 0x0000015A78854F48>)
docker inspect --type=image serebrya/mlbox_mnist:0.0.2 > /dev/null 2>&1
The system cannot find the path specified.
Docker image (serebrya/mlbox_mnist:0.0.2) does not exist. Running 'configure' phase.
docker pull serebrya/mlbox_mnist:0.0.2
0.0.2: Pulling from serebrya/mlbox_mnist
Digest: sha256:75667646473cda957bd23b52b6f660fb462986d7776d323a654ae59269ce02b9
Status: Image is up to date for serebrya/mlbox_mnist:0.0.2
docker.io/serebrya/mlbox_mnist:0.0.2
mounts={'C:\mlperf\mlbox_11062020\box_examples\mnist\workspace/data': '/mlbox_io0/data', 'C:\mlperf\mlbox_11062020\box_examples\mnist\workspace/parameters': '/mlbox_io1/C:\mlperf\mlbox_11062020\box_examples\mnist\workspace/parameters', 'C:\mlperf\mlbox_11062020\box_examples\mnist\workspace/train_logs': '/mlbox_io2/train_logs', 'C:\mlperf\mlbox_11062020\box_examples\mnist\workspace/model': '/mlbox_io3/model'}, args=['train', '--data_dir=/mlbox_io0/data', '--parameters_file=/mlbox_io1/C:\mlperf\mlbox_11062020\box_examples\mnist\workspace/parameters/default.parameters.yaml', '--log_dir=/mlbox_io2/train_logs', '--model_dir=/mlbox_io3/model']
docker run --rm --net=host --privileged=true --volume C:\mlperf\mlbox_11062020\box_examples\mnist\workspace/data:/mlbox_io0/data --volume C:\mlperf\mlbox_11062020\box_examples\mnist\workspace/parameters:/mlbox_io1/C:\mlperf\mlbox_11062020\box_examples\mnist\workspace/parameters --volume C:\mlperf\mlbox_11062020\box_examples\mnist\workspace/train_logs:/mlbox_io2/train_logs --volume C:\mlperf\mlbox_11062020\box_examples\mnist\workspace/model:/mlbox_io3/model serebrya/mlbox_mnist:0.0.2 train --data_dir=/mlbox_io0/data --parameters_file=/mlbox_io1/C:\mlperf\mlbox_11062020\box_examples\mnist\workspace/parameters/default.parameters.yaml --log_dir=/mlbox_io2/train_logs --model_dir=/mlbox_io3/model

docker: Error response from daemon: invalid mode: \mlperf\mlbox_11062020\box_examples\mnist\workspace/parameters.
See 'docker run --help'.
Traceback (most recent call last):
  File "c:\programdata\anaconda3\envs\mlbox_11062020\lib\runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "c:\programdata\anaconda3\envs\mlbox_11062020\lib\runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "C:\ProgramData\Anaconda3\envs\mlbox_11062020\Scripts\mlcommons_box_docker.exe\__main__.py", line 7, in <module>
  File "c:\programdata\anaconda3\envs\mlbox_11062020\lib\site-packages\click\core.py", line 829, in __call__
    return self.main(*args, **kwargs)
  File "c:\programdata\anaconda3\envs\mlbox_11062020\lib\site-packages\click\core.py", line 782, in main
    rv = self.invoke(ctx)
  File "c:\programdata\anaconda3\envs\mlbox_11062020\lib\site-packages\click\core.py", line 1259, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "c:\programdata\anaconda3\envs\mlbox_11062020\lib\site-packages\click\core.py", line 1066, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "c:\programdata\anaconda3\envs\mlbox_11062020\lib\site-packages\click\core.py", line 610, in invoke
    return callback(*args, **kwargs)
  File "c:\programdata\anaconda3\envs\mlbox_11062020\lib\site-packages\mlcommons_box_docker\__main__.py", line 45, in run
    runner.run()
  File "c:\programdata\anaconda3\envs\mlbox_11062020\lib\site-packages\mlcommons_box_docker\docker_run.py", line 72, in run
    self._run_or_die(cmd)
  File "c:\programdata\anaconda3\envs\mlbox_11062020\lib\site-packages\mlcommons_box_docker\docker_run.py", line 117, in _run_or_die
    raise RuntimeError('Command failed: {}'.format(cmd))
RuntimeError: Command failed: docker run --rm --net=host --privileged=true --volume C:\mlperf\mlbox_11062020\box_examples\mnist\workspace/data:/mlbox_io0/data --volume C:\mlperf\mlbox_11062020\box_examples\mnist\workspace/parameters:/mlbox_io1/C:\mlperf\mlbox_11062020\box_examples\mnist\workspace/parameters --volume C:\mlperf\mlbox_11062020\box_examples\mnist\workspace/train_logs:/mlbox_io2/train_logs --volume C:\mlperf\mlbox_11062020\box_examples\mnist\workspace/model:/mlbox_io3/model serebrya/mlbox_mnist:0.0.2 train --data_dir=/mlbox_io0/data --parameters_file=/mlbox_io1/C:\mlperf\mlbox_11062020\box_examples\mnist\workspace/parameters/default.parameters.yaml --log_dir=/mlbox_io2/train_logs --model_dir=/mlbox_io3/model

sergey-serebryakov commented on August 16, 2024

@hshaikusa Thanks, there's one more issue to be fixed associated with how mount points are constructed. I updated the first message in this thread.

I cannot run Docker on my Windows laptop (probably due to McAfee). I asked our admins to allocate a Windows virtual instance that I can use for testing.
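
For readers wondering why Docker reports "invalid mode" here, the following is a rough illustration only, not Docker's actual parser. It assumes a `--volume` value is read as `host:container[:mode]`, with a leading drive letter kept as part of the host path. The function name `split_volume_spec` is hypothetical. Under that assumption, the extra `C:` inside the container path of the parameters mount produces a third field, which matches the string shown in the error message.

```python
import re


def split_volume_spec(spec: str) -> list:
    """Split a --volume value on colons (illustrative, not Docker's code).

    A leading Windows drive letter such as `C:` stays attached to the
    host path so that it is not mistaken for a field separator.
    """
    drive = ""
    match = re.match(r"^[A-Za-z]:(?=[\\/])", spec)
    if match:
        drive, spec = match.group(0), spec[match.end():]
    parts = spec.split(":")
    parts[0] = drive + parts[0]
    return parts


# The parameters mount taken from the failing command in this thread.
BAD = (r"C:\mlperf\mlbox_11062020\box_examples\mnist\workspace/parameters"
       r":/mlbox_io1/C:\mlperf\mlbox_11062020\box_examples\mnist\workspace/parameters")
# The data mount from the same command, which has no second drive letter.
GOOD = r"C:\mlperf\mlbox_11062020\box_examples\mnist\workspace/data:/mlbox_io0/data"

for spec in (GOOD, BAD):
    fields = split_volume_spec(spec)
    # Two fields mean host:container; a third field is read as the mode.
    print(len(fields), fields[2:] or "no mode field")
```

The data mount splits into two fields, while the parameters mount splits into three, and the third is the `\mlperf\...\workspace/parameters` text quoted in the daemon's error.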

swiftdiaries commented on August 16, 2024

I think we might need to support Windows-specific file path construction. A workaround for now (while we're working to stabilize the code) is probably to use WSL and add instructions for that.
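
As a minimal sketch of what Windows-aware path construction could look like (this is not the project's actual fix), the host side can be handled with `pathlib.PureWindowsPath` and the container side with `pathlib.PurePosixPath`. The function name `bind`, its parameters, and the assumption that `$WORKSPACE` resolves to the `workspace` directory seen in the log are all hypothetical. The `/mlbox_io<N>` prefix follows the naming visible in the log above.

```python
from pathlib import PurePosixPath, PureWindowsPath
from typing import Tuple


def bind(host_path: str, workspace: str, index: int,
         is_file: bool = False) -> Tuple[str, str]:
    """Return (volume_spec, container_arg) for one task input or output.

    Hypothetical helper: the container path is built only from the part
    of the host path that is relative to the workspace, so no drive
    letter or backslash ends up on the container side.
    """
    host = PureWindowsPath(host_path)
    # For a file, mount its parent directory and pass the file name on.
    host_dir = host.parent if is_file else host
    rel_dir = host_dir.relative_to(PureWindowsPath(workspace))
    container_dir = PurePosixPath("/mlbox_io{}".format(index)) / rel_dir.as_posix()
    container_arg = container_dir / host.name if is_file else container_dir
    return "{}:{}".format(host_dir, container_dir), str(container_arg)


# Assumed workspace location, taken from the paths in the log.
WORKSPACE = r"C:\mlperf\mlbox_11062020\box_examples\mnist\workspace"

volume, arg = bind(WORKSPACE + "/parameters/default.parameters.yaml",
                   WORKSPACE, 1, is_file=True)
print("--volume", volume)
print("--parameters_file=" + arg)
```

Because the pure path classes do no filesystem access, the same code behaves identically on Windows and Linux, which also makes it easy to unit-test without a Windows machine.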

sergey-serebryakov commented on August 16, 2024

Update: I got access to a Windows server and was able to install Docker. I should be able to provide a fix for Windows systems (local Docker runner) next week.
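
One more Windows-specific detail is visible in the log earlier in this thread: the image check is run as `docker inspect ... > /dev/null 2>&1`, and it is followed by "The system cannot find the path specified.", most likely because `/dev/null` does not exist in the Windows shell. The runner then concludes that the image is missing even though the pull reports it as up to date. A portable way to discard output, sketched below, is to let Python do the redirection with `subprocess.DEVNULL`. The helper name `command_succeeds` is hypothetical and this is not the runner's actual code.

```python
import subprocess
import sys


def command_succeeds(cmd: list) -> bool:
    """Run a command, discard its output, and report whether it exited with 0.

    Hypothetical helper. The command is passed as a list and no shell is
    involved, so there is no shell-specific redirect such as `> /dev/null`.
    """
    try:
        result = subprocess.run(cmd,
                                stdout=subprocess.DEVNULL,
                                stderr=subprocess.DEVNULL)
    except FileNotFoundError:
        # The executable itself (e.g. docker) is not installed or not on PATH.
        return False
    return result.returncode == 0


# Self-check that works without Docker: run the current Python interpreter.
print(command_succeeds([sys.executable, "-c", "pass"]))
```

With Docker installed, the image check from the log could then be written as `command_succeeds(["docker", "inspect", "--type=image", "serebrya/mlbox_mnist:0.0.2"])`.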

hshaikusa commented on August 16, 2024

@sergey-serebryakov Cool, looking forward to the fixes. Please plan to push them to PyPI once you are done with your level of validation. I would like to validate them as an outsider who downloads them as per the instructions and plays with them.
