mlcommons / mlcube

MLCube® is a project that reduces friction for machine learning by ensuring that models are easily portable and reproducible.

Home Page: https://mlcommons.github.io/mlcube/

License: Apache License 2.0

Python 100.00%
mlcube machine-learning mlperf mlcommons

mlcube's Introduction

MLCube

MLCube® brings the concept of interchangeable parts to the world of machine learning models. It is the shipping container that enables researchers and developers to easily share the software that powers machine learning.

MLCube is a set of common conventions for creating ML software that can just "plug-and-play" on many systems. MLCube makes it easier for researchers to share innovative ML models, for a developer to experiment with many models, and for software companies to create infrastructure for models. It creates opportunities by putting ML in the hands of more people.

MLCube isn’t a new framework or service; MLCube is a consistent interface to machine learning models in containers like Docker. Models published with the MLCube interface can be run on local machines, on a variety of major clouds, or in Kubernetes clusters - all using the same code. MLCommons provides open source “runners” for each of these environments that make training a model in an MLCube a single command.

Note: This project is in its very early stages and under active development. Some parts may behave unexpectedly or inconsistently.

Installing MLCube

Install from PyPI:

pip install mlcube

To uninstall:

pip uninstall mlcube

Usage Examples

See the examples and the MLCube wiki for detailed usage.

License

MLCube is licensed under the Apache License 2.0.

See LICENSE for more information.

MLCube is a trademark of the MLCommons® Association.

Support

Create a GitHub issue

mlcube's Issues

move old readme according to new directory structure

This error occurred while validating the published documentation at https://mlperf.github.io/mlbox/:

(mlbox) PS C:\mlperf\mlbox> dir

Directory: C:\mlperf\mlbox

Mode    LastWriteTime      Length  Name
----    -------------      ------  ----
d-----  9/1/2020 2:22 AM           .github
d-----  9/1/2020 2:22 AM           docs
d-----  9/1/2020 2:22 AM           examples
d-----  9/1/2020 2:22 AM           mlcommons_box
d-----  9/1/2020 2:22 AM           runners
------  9/1/2020 2:22 AM     1225  .gitignore
------  9/1/2020 2:22 AM    11357  LICENSE
------  9/1/2020 2:22 AM       75  mkdocs-requirements.txt
------  9/1/2020 2:22 AM      666  mkdocs.yml
------  9/1/2020 2:22 AM     1698  README.md
------  9/1/2020 2:22 AM       23  requirements.txt
------  9/1/2020 2:22 AM     1254  setup.py

(mlbox) PS C:\mlperf\mlbox> pip install .
Processing c:\mlperf\mlbox
ERROR: Command errored out with exit status 1:

error: package directory 'mlbox' does not exist
----------------------------------------

ERROR: Command errored out with exit status 1: python setup.py egg_info Check the logs for full command output.

Release 0.2

Release mlcommons-box-ssh, mlcommons-box-docker, and mlcommons-box-singularity for the 0.1.1 release.

Runners 0.2.1 depend on mlcommons-box 0.2

In my earlier PR #107 I changed the version of the runners from 0.2 to 0.2.1 but forgot to update the runners' dependencies (the requirements.txt files). As a result, the 0.2.1 runners still depend on mlcommons-box version 0.2, which produces the following error:

pip install mlcommons-box mlcommons-box-docker
...
ERROR: mlcommons-box-docker 0.2.1 has requirement mlcommons-box==0.2, but you'll have mlcommons-box 0.2.1 which is incompatible.
...

Installing mlcommons-box-docker on its own does not help, since the runners do require mlcommons-box 0.2.1. What helps is pinning the docker runner version: pip install mlcommons-box-docker==0.2 (this will work with previous versions of mlboxes).

What would be the best way to solve it? In version 0.2.2?
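
One way to catch this kind of mismatch before publishing is a small release check that compares a runner's own version with its pinned mlcommons-box requirement. The sketch below is illustrative only: `check_pin` is a hypothetical helper, and it assumes pins are written in the `name==version` form shown in the error message.

```python
def check_pin(runner_version, requirements, core="mlcommons-box"):
    """Return True if `requirements` pins `core` to exactly `runner_version`.

    Hypothetical helper: assumes pins are written as 'name==version'.
    """
    for line in requirements:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        name, sep, version = line.partition("==")
        if sep and name.strip() == core:
            return version.strip() == runner_version
    return False  # no exact pin on the core package was found


# The situation described above: runner 0.2.1 still pins the core at 0.2.
print(check_pin("0.2.1", ["mlcommons-box==0.2"]))
print(check_pin("0.2.1", ["mlcommons-box==0.2.1"]))
```

Run in CI against each runner's requirements.txt, a check like this would fail the build whenever the runner version and the pinned core version drift apart.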

OCI compliant Images in mlcube

Creating this issue to document the conversation about OCI-compliant images and to discuss the recent news that Docker is being removed from Kubernetes and what that means for users of mlcube.

mlperf-mlbox sprint #3

Sept 6 - Sept 17

Goals

  • Work towards 0.1 release for Reference Implementation for Training WG.
  • Implement runner spec v0.1 #66
  • Complete WIP PRs from sprint #2
  • Complete Runner implementation #77
  • Rename to mlcommons_box

Milestone date: Dec 10, 2020

MLCube a Training Model

Let's pick another training model to MLCube and create a PR to the training repo.

Take a look at these two benchmarks and pick one to MLCube (based on which one is easier for you to run or work with):
https://github.com/mlcommons/training/tree/master/image_segmentation/pytorch
https://github.com/mlcommons/training/tree/master/rnn_speech_recognition/pytorch

Sergey can answer a lot of questions here, so I added him.

You can create a PR to the training repository like the SSD mlcube.

Running MLBoxes on Windows machines

The Docker runner and the other MLCommons-Box runners assume they run in a Linux environment. Several updates are required to support Windows machines as well. Let's use this thread to track what is required and to document the process of running boxes on Windows.

How to run docker-based MLBoxes on Windows machines?

  • Do this ...
  • Do that ...

Fixed:

  • docker run command #134.

To be fixed:

  • docker inspect command that uses /dev/null. Error:
    Could not find a part of the path 'C:\dev\null'
    
    On Windows, the /dev/null redirect should either be removed, or the docker runner needs to detect the shell it runs in (cmd or PowerShell) and use NUL or $null accordingly.
  • The function that creates mount points needs to be updated. Currently, for file names the following is generated:
    mounts:
        C:\mlperf\mlbox_11062020\box_examples\mnist\workspace/parameters: '/mlbox_io1/C:\mlperf\mlbox_11062020\box_examples\mnist\workspace/parameters'
    
  • Paths on a command line need to be quoted.
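
A minimal sketch of how a runner could sidestep both the null-device and the quoting problems listed above, assuming the runner launches commands from Python. `subprocess.DEVNULL` discards output on any platform, and passing the command as a list avoids shell quoting. `run_quietly` is a hypothetical helper, not the runner's actual code.

```python
import subprocess
import sys


def run_quietly(cmd):
    """Run `cmd` (a list of arguments) and discard its output on any OS.

    subprocess.DEVNULL maps to the platform's null device, so neither
    '/dev/null', 'NUL' nor '$null' has to appear on the command line.
    Passing a list (no shell) also means paths with spaces need no quoting.
    """
    result = subprocess.run(
        cmd, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL
    )
    return result.returncode


# Stand-in for a real command such as ['docker', 'inspect', image_name].
exit_code = run_quietly([sys.executable, "-c", "print('hello')"])
print(exit_code)
```

Only the return code is inspected here, which is typically all an existence check such as docker inspect needs.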

Prepare Hello World Run Tutorial

Write a tutorial that walks through running the hello world box, starting from the pip install commands.

Basically, the steps could look like:

  1. Set up the host (install docker, mlbox, etc.)
  2. Download the hello world box
  3. Run the box, modifying inputs / reading outputs
  4. Make a small change to the box and re-build?

[Design] Packaging and distributing mlboxes.

We should start looking at options for packaging and distributing mlboxes.

We can probably start with a tarball or a zip file for the box, but to make boxes reusable across teams and to automate them in CI/CD workflows we will probably need a longer-term solution.
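
As a rough illustration of the tarball option, the sketch below packs a box directory into a gzip-compressed archive with Python's standard `tarfile` module. The helper names and the `mlbox.yaml` file used in the demo are assumptions for illustration, not an agreed packaging format.

```python
import tarfile
import tempfile
from pathlib import Path


def pack_box(box_dir, archive_path):
    """Write the directory `box_dir` into a gzip-compressed tarball."""
    box_dir = Path(box_dir)
    with tarfile.open(archive_path, "w:gz") as tar:
        # arcname keeps only the box directory name inside the archive.
        tar.add(box_dir, arcname=box_dir.name)
    return Path(archive_path)


def list_box(archive_path):
    """Return the sorted member names stored in the archive."""
    with tarfile.open(archive_path, "r:gz") as tar:
        return sorted(tar.getnames())


# Demo with a throwaway box; 'mlbox.yaml' is a made-up file name.
with tempfile.TemporaryDirectory() as tmp:
    box = Path(tmp) / "hello_world"
    box.mkdir()
    (box / "mlbox.yaml").write_text("name: hello_world\n")
    archive = pack_box(box, Path(tmp) / "hello_world.tar.gz")
    members = list_box(archive)

print(members)
```

A plain archive like this carries no version or dependency metadata, which is one reason a longer-term distribution mechanism is likely to be needed.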

running hello_world example in windows: docker: Error response from daemon: invalid mode: \mlperf\mlbox_11062020\box_examples\hello_world\workspace/chats

Running the hello_world example on Windows produces the errors shown below.

Installed mlcommons versions:
mlcommons-box 0.2.3
mlcommons-box-docker 0.2.3
mlcommons-box-singularity 0.2.3
mlcommons-box-ssh 0.2.3
mlspeclib 1.0.0

docker run --rm --net=host --privileged=true --volume C:\mlperf\mlbox_11062020\box_examples\hello_world\workspace/names:/mlbox_io0/C:\mlperf\mlbox_11062020\box_examples\hello_world\workspace/names --volume C:\mlperf\mlbox_11062020\box_examples\hello_world\workspace/chats:/mlbox_io1/C:\mlperf\mlbox_11062020\box_examples\hello_world\workspace/chats mlperf/mlbox_hello_world:0.0.1 hello --name=/mlbox_io0/C:\mlperf\mlbox_11062020\box_examples\hello_world\workspace/names/alice.txt --chat=/mlbox_io1/C:\mlperf\mlbox_11062020\box_examples\hello_world\workspace/chats/chat_with_alice.txt
docker: Error response from daemon: invalid mode: \mlperf\mlbox_11062020\box_examples\hello_world\workspace/chats.
See 'docker run --help'.
Traceback (most recent call last):
  File "c:\programdata\anaconda3\envs\mlbox_11062020\lib\runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "c:\programdata\anaconda3\envs\mlbox_11062020\lib\runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "C:\ProgramData\Anaconda3\envs\mlbox_11062020\Scripts\mlcommons_box_docker.exe\__main__.py", line 7, in <module>
  File "c:\programdata\anaconda3\envs\mlbox_11062020\lib\site-packages\click\core.py", line 829, in __call__
    return self.main(*args, **kwargs)
  File "c:\programdata\anaconda3\envs\mlbox_11062020\lib\site-packages\click\core.py", line 782, in main
    rv = self.invoke(ctx)
  File "c:\programdata\anaconda3\envs\mlbox_11062020\lib\site-packages\click\core.py", line 1259, in invoke
    return process_result(sub_ctx.command.invoke(sub_ctx))
  File "c:\programdata\anaconda3\envs\mlbox_11062020\lib\site-packages\click\core.py", line 1066, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "c:\programdata\anaconda3\envs\mlbox_11062020\lib\site-packages\click\core.py", line 610, in invoke
    return callback(*args, **kwargs)
  File "c:\programdata\anaconda3\envs\mlbox_11062020\lib\site-packages\mlcommons_box_docker\__main__.py", line 45, in run
    runner.run()
  File "c:\programdata\anaconda3\envs\mlbox_11062020\lib\site-packages\mlcommons_box_docker\docker_run.py", line 72, in run
    self._run_or_die(cmd)
  File "c:\programdata\anaconda3\envs\mlbox_11062020\lib\site-packages\mlcommons_box_docker\docker_run.py", line 117, in _run_or_die
    raise RuntimeError('Command failed: {}'.format(cmd))
RuntimeError: Command failed: docker run --rm --net=host --privileged=true --volume C:\mlperf\mlbox_11062020\box_examples\hello_world\workspace/names:/mlbox_io0/C:\mlperf\mlbox_11062020\box_examples\hello_world\workspace/names --volume C:\mlperf\mlbox_11062020\box_examples\hello_world\workspace/chats:/mlbox_io1/C:\mlperf\mlbox_11062020\box_examples\hello_world\workspace/chats mlperf/mlbox_hello_world:0.0.1 hello --name=/mlbox_io0/C:\mlperf\mlbox_11062020\box_examples\hello_world\workspace/names/alice.txt --chat=/mlbox_io1/C:\mlperf\mlbox_11062020\box_examples\hello_world\workspace/chats/chat_with_alice.txt
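
In the command above, the full Windows host path is reused as the container-side path, so each --volume value contains a second drive-letter colon, and Docker appears to treat the text after it as the mount mode. The sketch below shows one possible way to build the container path from the mount index and the last path component only. `make_volume_arg` is a hypothetical helper, and the `/mlbox_io<index>/<name>` layout is an assumption modelled on the paths in the log.

```python
from pathlib import PureWindowsPath


def make_volume_arg(host_path, index):
    """Build a 'host:container' value for docker's --volume option.

    Assumption: the container side only needs '/mlbox_io<index>/<name>',
    never the host drive letter or backslashes.
    """
    # PureWindowsPath understands both '\' and '/' separators on any OS.
    name = PureWindowsPath(host_path).name
    container_path = "/mlbox_io{}/{}".format(index, name)
    return "{}:{}".format(host_path, container_path), container_path


volume, target = make_volume_arg(
    r"C:\mlperf\mlbox_11062020\box_examples\hello_world\workspace/chats", 1
)
print(volume)
print(target)
```

With this layout, task arguments such as --chat would point below the container path (for example `target + "/chat_with_alice.txt"`) rather than embedding the host path.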

mlperf-mlbox sprint #4

Sept 22 - Oct 2

Goals

  • Work towards 0.1 release for Reference Implementation for Training WG.
  • Implement runner spec v0.1 #66
  • Update Documentation
  • Focus on the public release
  • Oct 30 Demo with SSH/Docker Runners

OpenFL + MLCube

Let's create a proof of concept (POC) of OpenFL + MLCube.

Xin and Jurado - can you work on this together?

mlcube-0.0.1: wrong file inside

Somehow, the mlcube-0.0.1 was packaged with the wrong file.

Error:

Traceback (most recent call last):
  File "/tmp/env/bin/mlcube_docker", line 5, in <module>
    from mlcube_docker.__main__ import cli
  File "/tmp/env/lib/python3.8/site-packages/mlcube_docker/__main__.py", line 4, in <module>
    from mlcube.common import mlcube_metadata
ImportError: cannot import name 'mlcube_metadata' from 'mlcube.common' (/tmp/env/lib/python3.8/site-packages/mlcube/common/__init__.py)

Expected: file name is mlcube/common/mlcube_metadata.py
Observed: file name is mlcube/common/mlbox_metadata.py
Solution: the master branch contains the right file, will be fixed in the next release.

k8s runner should not hard code task name

The task name is hard-coded as "kubernetes" in the k8s runner, which causes errors when a custom task name is used. The runner should expect a custom task name and pass it automatically to the command it generates.
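
A minimal sketch of the idea, assuming the generated command has the `task --key=value` shape seen in the docker runner's hello_world command elsewhere in this thread. `build_task_command` is a hypothetical helper, and the k8s runner's real command format may differ.

```python
def build_task_command(task_name, params):
    """Build the argument list for a task from its name and parameters.

    Hypothetical helper: the task name is an argument, not a constant.
    """
    args = [task_name]
    for key, value in params.items():
        args.append("--{}={}".format(key, value))
    return args


# Any task name works, not only "kubernetes".
cmd = build_task_command("hello", {"name": "/mlbox_io0/names/alice.txt"})
print(cmd)
```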

CI is failing for 0.2.1 release

  • Add Actions to release to Test PyPI
  • Add a new test that builds from source for PRs
  • Refactor the test that installs packages from PyPI

Use mlcommons_box to run all MLBoxes

Once MLCommons-Box and MLCommons-Box Docker runner are installed (pip install mlcommons-box-docker), users run docker-based boxes using the following command line:

mlcommons_box_docker run --mlbox= ...

We have discussed a plugin-based architecture that automatically selects the appropriate MLCommons-Box runner, so that users can run all mlboxes like this:

mlcommons_box run --mlbox= ...

Currently, the mlcommons_box entry point does not support this.

This is not planned for the current sprint, nor is it a task for the December release. I think it's a useful feature, and I will be glad to implement it with some help on the exact plugin approach.
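
One possible shape for such a plugin mechanism is a registry that maps a platform name to a runner class, with a single entry point that looks the runner up. The sketch below is an assumption about how this could work, not the project's design: the registry, the decorator, and the toy runner classes are all hypothetical.

```python
RUNNERS = {}


def register_runner(platform):
    """Class decorator that registers a runner under a platform name."""
    def decorator(cls):
        RUNNERS[platform] = cls
        return cls
    return decorator


@register_runner("docker")
class DockerRunner:
    """Toy stand-in for the docker runner."""

    def run(self, mlbox):
        return "docker runner -> {}".format(mlbox)


@register_runner("ssh")
class SSHRunner:
    """Toy stand-in for the SSH runner."""

    def run(self, mlbox):
        return "ssh runner -> {}".format(mlbox)


def run_box(mlbox, platform):
    """Pick the runner registered for `platform` and run the box with it."""
    try:
        runner_cls = RUNNERS[platform]
    except KeyError:
        raise ValueError("No runner installed for platform '{}'".format(platform))
    return runner_cls().run(mlbox)


print(run_box("examples/hello_world", "docker"))
```

In a real package, each runner distribution would need some way to register itself at install time; packaging entry points are a common choice for that, but which mechanism fits here is an open question.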

mlperf-mlbox sprint #2

Aug 24 - Sept 4

Goals

  • Work towards 0.1 release for Reference Implementation for Training WG.

Milestone date: Dec 10, 2020

Implement runner spec v0.1

Base runner library

A base runner library is needed to provide common APIs and functionality for runners to use. This will make it much easier to standardize runners.
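
To make the idea concrete, here is a sketch of what a shared base class could look like. The method names (`configure`, `run`, `execute`) and the `EchoRunner` subclass are assumptions for illustration and are not taken from the runner spec.

```python
from abc import ABC, abstractmethod


class BaseRunner(ABC):
    """Hypothetical common interface that concrete runners would implement."""

    def __init__(self, mlbox, task):
        self.mlbox = mlbox
        self.task = task

    @abstractmethod
    def configure(self):
        """Prepare the environment (for example, build or pull an image)."""

    @abstractmethod
    def run(self):
        """Execute the task and return its exit code."""

    def execute(self):
        """Shared entry point: configure first, then run."""
        self.configure()
        return self.run()


class EchoRunner(BaseRunner):
    """Toy runner used only to show how the base class is extended."""

    def __init__(self, mlbox, task):
        super().__init__(mlbox, task)
        self.log = []

    def configure(self):
        self.log.append("configure {}".format(self.mlbox))

    def run(self):
        self.log.append("run {}".format(self.task))
        return 0


runner = EchoRunner("examples/hello_world", "hello")
print(runner.execute(), runner.log)
```

Keeping the configure-then-run sequence in the base class means every runner follows the same lifecycle, which is the kind of standardization described above.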

change sub-mod/gh-action-pypi-publish@master to pypa/gh-action-pypi-publish

pypa/gh-action-pypi-publish had issues with pypa/warehouse, so I created sub-mod/gh-action-pypi-publish to debug and fix them.
Now that the pypa/warehouse issues are fixed and the code changes from sub-mod/gh-action-pypi-publish have been merged into pypa/gh-action-pypi-publish, we should switch back to pypa/gh-action-pypi-publish.

mlperf-mlbox sprint #1

Aug 10 - Aug 21

Goals

  • Work towards 0.1 release for Reference Implementation for Training WG.
    • Clean up the project directory structure for 0.1 release
    • Add SSH, Docker runner and examples
    • Add documentation
    • Getting Started Document/Video Demo
    • Add tests and CI (TravisCI, GitHub Actions)
    • Target training MNIST
    • Target one cloud platform/workflow tool (next sprint)
      • SSH runner for GCP (next sprint)
    • Package 0.1 release and release
  • Resolve issues older than 06/2020

Milestone date: Dec 10, 2020

Docs site admin setup

The docs-site GitHub Action needs some initial setup work, which requires admin privileges.

I'll detail some of the work here. Someone who has admin rights can pick this up.

First, the GitHub Action needs a separate deploy key; the link on the first task has setup instructions for that.
Second, we need to enable GitHub Pages to deploy the static site from the gh-pages branch; the second task has a link to setup instructions for that.
