
mlrun / mlrun

MLRun is an open source MLOps platform for quickly building and managing continuous ML applications across their lifecycle. MLRun integrates into your development and CI/CD environment and automates the delivery of production data, ML pipelines, and online applications.

Home Page: https://mlrun.org

License: Apache License 2.0

Dockerfile 0.24% Makefile 0.38% Python 97.70% Mako 0.01% Shell 0.24% Go 1.38% HTML 0.01% Jupyter Notebook 0.06%
mlops python data-science machine-learning data-engineering experiment-tracking model-serving mlops-workflow workflow kubernetes

mlrun's Introduction

MLRun logo

Using MLRun

MLRun is an open MLOps platform for quickly building and managing continuous ML applications across their lifecycle. MLRun integrates into your development and CI/CD environment and automates the delivery of production data, ML pipelines, and online applications, significantly reducing engineering efforts, time to production, and computation resources. With MLRun, you can choose any IDE on your local machine or on the cloud. MLRun breaks the silos between data, ML, software, and DevOps/MLOps teams, enabling collaboration and fast continuous improvements.

Get started with MLRun Tutorials and Examples, Installation and setup guide, or read about MLRun Architecture.

This page explains how MLRun addresses the MLOps tasks and describes the MLRun core components.

MLOps tasks

mlrun-tasks


The MLOps development workflow section describes the different tasks and stages in detail. MLRun can be used to automate and orchestrate all the different tasks or just specific tasks (and integrate them with what you have already deployed).

Project management and CI/CD automation

In MLRun the assets, metadata, and services (data, functions, jobs, artifacts, models, secrets, etc.) are organized into projects. Projects can be imported/exported as a whole, mapped to git repositories or IDE projects (in PyCharm, VSCode, etc.), which enables versioning, collaboration, and CI/CD. Project access can be restricted to a set of users and roles.

See: Docs: Projects and Automation, CI/CD Integration, Tutorials: Quick start, Automated ML Pipeline, Video: Quick start.
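A minimal usage sketch (exact API names vary by MLRun version; the code_to_function parameters follow the signature quoted in the issues below, so treat the rest as assumptions):

import mlrun

# create or load a project mapped to the current directory / git repo
project = mlrun.get_or_create_project('quick-start', context='./')

# package local code as an MLRun function and run it as a Kubernetes job
fn = mlrun.code_to_function(name='trainer', filename='train.py',
                            kind='job', image='mlrun/mlrun', handler='train')
run = fn.run(params={'lr': 0.1})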

Ingest and process data

MLRun provides abstract interfaces to various offline and online data sources, supports batch or realtime data processing at scale, data lineage and versioning, structured and unstructured data, and more. In addition, the MLRun Feature Store automates the collection, transformation, storage, catalog, serving, and monitoring of data features across the ML lifecycle and enables feature reuse and sharing.

See: Docs: Ingest and process data, Feature Store, Data & Artifacts; Tutorials: Quick start, Feature Store.

Develop and train models

MLRun allows you to easily build ML pipelines that take data from various sources or the Feature Store and process it, train models at scale with multiple parameters, test models, track each experiment, and register, version, and deploy models. MLRun provides scalable built-in or custom model training services that integrate with any framework and can work with third-party training/auto-ML services. You can also bring your own pre-trained model and use it in the pipeline.

See: Docs: Develop and train models, Model Training and Tracking, Batch Runs and Workflows; Tutorials: Train, compare, and register models, Automated ML Pipeline; Video: Train and compare models.

Deploy models and applications

MLRun rapidly deploys and manages production-grade real-time or batch application pipelines using elastic and resilient serverless functions. MLRun addresses the entire ML application: intercepting application/user requests, running data processing tasks, inferencing using one or more models, driving actions, and integrating with the application logic.

See: Docs: Deploy models and applications, Realtime Pipelines, Batch Inference, Tutorials: Realtime Serving, Batch Inference, Advanced Pipeline; Video: Serving pre-trained models.

Monitor and alert

Observability is built into the different MLRun objects (data, functions, jobs, models, pipelines, etc.), eliminating the need for complex integrations and code instrumentation. With MLRun, you can observe the application/model resource usage and model behavior (drift, performance, etc.), define custom app metrics, and trigger alerts or retraining jobs.

See: Docs: Monitor and alert, Model Monitoring Overview, Tutorials: Model Monitoring & Drift Detection.

MLRun core components

mlrun-core


MLRun includes the following major components:

Project Management: A service (API, SDK, DB, UI) that manages the different project assets (data, functions, jobs, workflows, secrets, etc.) and provides a central control and metadata layer.

Functions: Automatically deployed software packages with one or more methods and runtime-specific attributes (such as image, libraries, command, arguments, resources, etc.).

Data & Artifacts: Glueless connectivity to various data sources, metadata management, catalog, and versioning for structured/unstructured artifacts.

Feature Store: Automatically collects, prepares, catalogs, and serves production data features for development (offline) and real-time (online) deployment using minimal engineering effort.

Batch Runs & Workflows: Execute one or more functions with specific parameters and collect, track, and compare all their results and artifacts.

Real-Time Serving Pipeline: Rapid deployment of scalable data and ML pipelines using real-time serverless technology, including API handling, data preparation/enrichment, model serving, ensembles, driving and measuring actions, etc.

Real-Time Monitoring: Monitors data, models, resources, and production components and provides a feedback loop for exploring production data, identifying drift, alerting on anomalies or data quality issues, triggering retraining jobs, measuring business impact, etc.

mlrun's People

Contributors

alonmr, alxtkr77, assaf758, benbd86, daniels290813, davesh0812, dinal, eyal-danieli, george0st, gilad-shaham, gtopper, guy1992l, hedingber, jillnogold, jond01, katyakats, laurybueno, liranbg, quaark, rokatyy, tankilevitch, tebeka, thesaarco, tomerm-iguazio, tomershor, yacouby, yaelgen, yanburman, yaronha, yonishelach

mlrun's Issues

Auto generate function documentation

We'd like a bi-directional way to communicate help between Python and YAML:

Python -> YAML

Parse function docstring and populate YAML from it

YAML -> Python

Have a way to view help on the function object in Jupyter, probably by populating __doc__ from the YAML.
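A minimal sketch of both directions using only the standard library; the spec field names here are assumptions, not the actual mlrun schema:

import inspect

def docstring_to_spec(func) -> dict:
    # Python -> YAML: collect name, docstring, and parameters into a dict
    # that can be dumped into the function YAML
    sig = inspect.signature(func)
    return {
        'name': func.__name__,
        'doc': inspect.getdoc(func) or '',
        'params': [{'name': name} for name in sig.parameters],
    }

def spec_to_doc(obj, spec: dict):
    # YAML -> Python: populate __doc__ so help(obj) works in Jupyter
    params = ', '.join(p['name'] for p in spec.get('params', []))
    obj.__doc__ = '{}\n\nParameters: {}'.format(spec.get('doc', ''), params)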

httpd/db artifacts broken

@tebeka the artifact get/store operations are broken:

a. You must use artifact.to_json() rather than dumps (as in filedb), since to_json() accounts for model details; also, the body is not stored in the DB.

b. The artifact object path should be /artifact/<project>/<uid/tag>/<key> (key not as an arg). The uid/tag part is like a git tree and can also list all keys in a tree; if the user specifies a tag (i.e. not latest) we can add the tag via args, but on get they don't need to pass the tag as an arg, since a get is on either uid or tag, never both (think git hash & tag).

def store_artifact(self, key, artifact, uid, tag='', project=''):
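A hedged sketch of the proposed path layout as a Flask route (the HTTP DB uses Flask per the issues below); the handler body and query-arg names are assumptions:

from flask import Flask, request

app = Flask(__name__)

@app.route('/artifact/<project>/<uid_or_tag>/<path:key>', methods=['GET', 'POST'])
def artifact(project, uid_or_tag, key):
    # uid_or_tag behaves like a git hash or tag: a get resolves one, never both
    if request.method == 'POST':
        tag = request.args.get('tag', '')  # optional explicit tag on store
        ...
    ...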

AttributeError: 'NameConstant' object has no attribute 'id'

Happens when running this notebook

---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-43-a34f02098586> in <module>
      1 # create an ML function from the notebook, attache it to iguazio data fabric (v3io)
----> 2 trainer = code_to_function(name='my-trainer', runtime='job', with_doc=True)
~/.pythonlibs/lib/python3.6/site-packages/mlrun/run.py in code_to_function(name, project, tag, filename, handler, runtime, kind, image, embed_code, with_doc)
    376 
    377     if with_doc:
--> 378         handlers = find_handlers(code)
    379         r.spec.entry_points = {h[name]: h for h in handlers}
    380     return r
~/.pythonlibs/lib/python3.6/site-packages/mlrun/funcdoc.py in find_handlers(code, handlers)
    209     visitor = ASTVisitor()
    210     visitor.visit(mod)
--> 211     funcs = [ast_func_info(fn) for fn in visitor.funcs]
    212     if handlers:
    213         return [f for f in funcs if f['name'] in handlers]
~/.pythonlibs/lib/python3.6/site-packages/mlrun/funcdoc.py in <listcomp>(.0)
    209     visitor = ASTVisitor()
    210     visitor.visit(mod)
--> 211     funcs = [ast_func_info(fn) for fn in visitor.funcs]
    212     if handlers:
    213         return [f for f in funcs if f['name'] in handlers]
~/.pythonlibs/lib/python3.6/site-packages/mlrun/funcdoc.py in ast_func_info(func)
    160 def ast_func_info(func: ast.FunctionDef):
    161     doc = ast.get_docstring(func) or ''
--> 162     rtype = func.returns.id if func.returns else ''
    163     params = [ast_param_dict(p) for p in func.args.args]
    164     defaults = func.args.defaults
AttributeError: 'NameConstant' object has no attribute 'id'
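The crash is in ast_func_info: for a function annotated -> None, func.returns is an ast.NameConstant (ast.Constant on Python 3.8+), which has no .id attribute. A hedged fix sketch:

import ast

def annotation_name(node) -> str:
    if node is None:                    # no return annotation
        return ''
    if isinstance(node, ast.Name):      # e.g. "-> int"
        return node.id
    # "-> None" parses to NameConstant on py3.6/3.7, Constant on 3.8+
    if isinstance(node, getattr(ast, 'NameConstant', ast.Constant)):
        return str(node.value)
    return ast.dump(node)               # fallback for complex annotations

# in ast_func_info:
#     rtype = annotation_name(func.returns)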

Test notebooks

We'd like to test some notebooks that use mlrun.

Have the ability to get credentials from the environment and, if they exist, run against the k8s/Nuclio dashboard.

@handler decorator

We'd like to have a @handler decorator. This will allow us to know which functions are handlers and which are utilities.

@handler should get some optional parameters.
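A hedged sketch of what such a decorator could look like (the metadata fields are assumptions):

def handler(_func=None, *, name=None, outputs=None):
    # tag the function so tooling can tell handlers from utilities
    def decorate(func):
        func._is_handler = True
        func._handler_name = name or func.__name__
        func._handler_outputs = outputs or []
        return func
    return decorate if _func is None else decorate(_func)

@handler
def train(context, data):
    ...

@handler(name='validate', outputs=['accuracy'])
def validate(context, model, data):
    ...

def _load_csv(path):  # plain utility, not a handler
    ...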

async build/submit API

We'd like to have build/submit return some kind of job ID and let the client poll on it
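A hedged sketch of the client side under this proposal; the endpoints and response fields are assumptions:

import time
import requests

base = 'http://localhost:8080/api'             # assumed server address
resp = requests.post(base + '/build', json={'function': 'my-trainer'})
job_id = resp.json()['job_id']                 # assumed response shape

while True:
    state = requests.get('{}/build/{}/status'.format(base, job_id)).json()['state']
    if state in ('completed', 'failed'):
        break
    time.sleep(2)                              # poll until the job settles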

Options to run HTTP DB

We'd like to have:

  • A docker image that runs the HTTP DB server
  • Command line executable (mlrun-db)
  • Command line option for port
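A hedged sketch of the command-line entry point; the app import is an assumption based on the module paths in the tracebacks on this page:

import argparse

from mlrun.db.httpd import app  # assumed location of the Flask app

def main():
    parser = argparse.ArgumentParser(prog='mlrun-db')
    parser.add_argument('--port', type=int, default=8080)
    args = parser.parse_args()
    app.run(host='0.0.0.0', port=args.port)

if __name__ == '__main__':
    main()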

Remove SQLAlchemy warnings

When running mlrun you see the following warning:

/home/miki/work/envs/mlrun/lib/python3.8/site-packages/sqlalchemy/ext/declarative/clsregistry.py:125: SAWarning: This declarative base already contains a class with the same class name and module name as mlrun.db.sqldb.Label, and will be replaced in the string-lookup table.

Make it go away :)
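The root cause is usually a declarative class being defined twice against the same Base (e.g. a module imported twice, or a class built inside a factory called more than once). As a hedged stopgap while that is tracked down, the warning can be filtered:

import warnings
from sqlalchemy.exc import SAWarning

warnings.filterwarnings('ignore', category=SAWarning)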

Add periodic task to HTTP db

We'd like the HTTP DB daemon to run a periodic task. The task should be initialized and then run at a fixed interval.
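A minimal sketch of such a runner (the real code appears to live in mlrun/db/periodic.py, per a traceback elsewhere on this page); the stop-event design is an assumption:

import threading

def schedule(interval: float, task, *args, **kwargs):
    # run task every `interval` seconds on a daemon thread; returns an
    # event that callers (e.g. test teardown) can set to stop the loop
    stop = threading.Event()

    def loop():
        while not stop.wait(interval):
            task(*args, **kwargs)

    threading.Thread(target=loop, daemon=True).start()
    return stop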

Move test output outside source directory

Currently, when running tests, some files are left over in the source directory; we'd like them to be created elsewhere, e.g. under pytest's tmp_path (see the sketch after this list).

Files/directories are:

  • chart
  • chart.html
  • dask-worker-space/
  • dataset.csv
  • iteration_results.csv
  • model.txt
  • results.html
  • tests/test_results/
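A hedged sketch of the pytest side, assuming the output paths can be injected (generate_chart is a stand-in for the real code under test):

def generate_chart(output):          # stand-in for the real code under test
    with open(output, 'w') as fp:
        fp.write('<html></html>')

def test_gen_chart(tmp_path):
    # tmp_path is a per-test directory managed by pytest, outside the source tree
    out = tmp_path / 'chart.html'
    generate_chart(output=str(out))
    assert out.exists()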

Clean logging error from tests

At the end of the tests we see the following output. It doesn't fail the test but it's annoying.

--- Logging error ---
Traceback (most recent call last):
  File "/usr/local/lib/python3.6/logging/__init__.py", line 996, in emit
    stream.write(msg)
  File "/usr/local/lib/python3.6/site-packages/_pytest/capture.py", line 427, in write
    self.buffer.write(obj)
ValueError: I/O operation on closed file
Call stack:
  File "/usr/local/lib/python3.6/threading.py", line 884, in _bootstrap
    self._bootstrap_inner()
  File "/usr/local/lib/python3.6/threading.py", line 916, in _bootstrap_inner
    self.run()
  File "/usr/local/lib/python3.6/threading.py", line 864, in run
    self._target(*self._args, **self._kwargs)
  File "/mlrun/mlrun/db/periodic.py", line 35, in _schedule
    logger.info('running periodic task')
Message: 'running periodic task'
Arguments: ()

Support secrets encryption

We'd like the ability to encrypt using private/public key.

Basically, a function that gets a dict and returns the dict with all values encrypted.
Provide an example of how to generate the key (openssh?).
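A hedged sketch using the cryptography package (one possible mechanism; the issue leaves the choice open):

from cryptography.hazmat.primitives import hashes, serialization
from cryptography.hazmat.primitives.asymmetric import padding

def encrypt_values(data: dict, public_key_pem: bytes) -> dict:
    # encrypt every value with an RSA public key; decryption would use
    # the matching private key
    pub = serialization.load_pem_public_key(public_key_pem)
    pad = padding.OAEP(mgf=padding.MGF1(algorithm=hashes.SHA256()),
                       algorithm=hashes.SHA256(), label=None)
    return {k: pub.encrypt(str(v).encode(), pad) for k, v in data.items()}

# generating the key pair, e.g. with openssl:
#   openssl genrsa -out private.pem 2048
#   openssl rsa -in private.pem -pubout -out public.pem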

Have config populate on import

Since some people will use the SDK, we don't want them to have to call config.populate() manually (they'll forget or miss the docs). We'd like to change config.populate() to re-run instead of being a no-op the second time.

This way population happens automatically, and it can also be re-run manually. Probably rename the method to reload then.

Config System

@yaronha

Add config class which can override defaults from a file (can potentially add env var options)

We can put the "default namespace" in it as an example (search for all places with the 'default-namespace' string); more will be added later based on it.

We'd like at least two levels in the configuration (e.g. http.port).
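A hedged sketch of a two-level config with file overrides (field names and defaults are placeholders):

import os
import yaml

class Config:
    def __init__(self, values: dict):
        for key, value in values.items():
            # nested dicts become nested Config objects, giving config.http.port
            setattr(self, key, Config(value) if isinstance(value, dict) else value)

def load_config(path=None) -> Config:
    defaults = {'default_namespace': 'default', 'http': {'port': 8080}}
    if path and os.path.exists(path):
        with open(path) as fp:
            for key, value in (yaml.safe_load(fp) or {}).items():
                if isinstance(value, dict) and isinstance(defaults.get(key), dict):
                    defaults[key].update(value)
                else:
                    defaults[key] = value
    return Config(defaults)

config = load_config(os.environ.get('MLRUN_CONFIG_FILE'))  # possible env-var hook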

Support "outputs" in function documentation

Apart from what a function returns, we have outputs that are emitted via mlrun functions.
We'd like a way to document these outputs (in the docstring) and have func_info and friends return them.
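A hedged sketch of one possible docstring convention (the :outputs: section name is an assumption, not an established format); func_info would parse it into an 'outputs' list:

def train(context, dataset):
    """Train a model on the dataset.

    :param dataset: input data path

    :outputs:
        - model: trained model artifact
        - accuracy: validation accuracy value
    """
    ...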

Support function input sources

We'd like to have the user specify where some input parameters come from.

See

  • kubeflow input types
  • mlflow path type
  • luigi
  • pytest.mark.parametrize
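A hedged sketch of one possible direction, using typing.Annotated markers in the spirit of the prior art above (the marker classes are hypothetical; typing.Annotated needs Python 3.9+):

from typing import Annotated

class FromPath:   # value resolved from a data path/URL
    pass

class FromParam:  # value resolved from run parameters
    pass

def train(dataset: Annotated[str, FromPath],
          lr: Annotated[float, FromParam] = 0.1):
    ...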

SQL rundb

Create a SQL rundb. When it's ready, have the HTTP server use it.

Add pagination to SQL DB

Queries might return a lot of rows (run...); this will choke the UI. We'd like to add pagination to some queries in the database.

Probably change the return value to have (total_count, results)
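A hedged sketch of the (total_count, results) shape with offset/limit, assuming a SQLAlchemy session and a Run model like the one sketched under "SQL DB Errors" below:

def list_runs(session, project, page=0, page_size=100):
    query = session.query(Run).filter(Run.project == project)
    total_count = query.count()                       # pre-pagination total
    results = query.offset(page * page_size).limit(page_size).all()
    return total_count, results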

SQL DB Errors

@yaronha said:

File "/mlrun/mlrun/db/httpd.py", line 275, in update_run
_db.update_run(data, uid, project, iter=iter)
File "/mlrun/mlrun/db/sqldb.py", line 145, in update_run
run = self._query(Run, uid=uid, project=project).one_or_none()
File "/usr/local/lib/python3.6/site-packages/sqlalchemy/orm/query.py", line 3303, in one_or_none
"Multiple rows were found for one_or_none()"
sqlalchemy.orm.exc.MultipleResultsFound: Multiple rows were found for one_or_none()
10.233.81.40 - - [03/Dec/2019 22:57:28] "PATCH /api/run/default/6d259e0c13854d11b6064c7a8acdd2c9?iter=0
HTTP/1.1" 500 -

Looks like the table keys weren't set properly; e.g., in runs the key is project+uid+iter, so every write gets a new ID (i.e., it adds rows instead of updating the same row). You also seem to ignore iter in read/store/update of runs.
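A hedged sketch of the suggested fix: make (project, uid, iteration) a composite unique key so writes update the same row (column names are assumptions):

from sqlalchemy import Column, Integer, String, UniqueConstraint
from sqlalchemy.ext.declarative import declarative_base

Base = declarative_base()

class Run(Base):
    __tablename__ = 'runs'
    id = Column(Integer, primary_key=True)
    project = Column(String)
    uid = Column(String)
    iteration = Column(Integer)
    __table_args__ = (UniqueConstraint('project', 'uid', 'iteration'),)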

Docker for HTTP DB with gunicorn

Write a Dockerfile that runs the HTTP DB with gunicorn.

For now, ignore the fact that the underlying filedb is not thread/process safe.
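A hedged Dockerfile sketch (base image, port, and worker count are assumptions; mlrun.db.httpd:app follows the module path seen in the tracebacks on this page):

FROM python:3.6-slim
RUN pip install mlrun gunicorn
EXPOSE 8080
CMD ["gunicorn", "--bind", "0.0.0.0:8080", "--workers", "4", "mlrun.db.httpd:app"]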

Add functions to the database

We'd like to store functions in the database. We're going to store YAML/JSON.

@yaronha said:

  • YAML, metadata spec, labels
  • Key is project + name + version tag
  • put/get/list/delete
  • search by project + name + labels
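A hedged sketch of the resulting DB interface (method names mirror the run/artifact methods elsewhere on this page; the exact signatures are assumptions):

class RunDBInterface:
    def store_function(self, func_yaml, name, project='', tag=''):
        # key is project + name + version tag
        ...

    def get_function(self, name, project='', tag='latest'):
        ...

    def list_functions(self, name=None, project='', labels=None):
        # search by project + name + labels
        ...

    def delete_function(self, name, project='', tag=''):
        ...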

Better HTTP server

Currently we're using the Flask development server as the HTTP server. For performance we might consider using gunicorn or uwsgi.

However, once we run several Python processes, we need to check that the underlying filedb behaves well under concurrency.

Nuke mutable default values

In several places we have function definitions like def gen_list(items=[], tag='td'):, which is bad; the mutable default is created once and shared across calls.

Find all places in code and nuke them. The usual solution is:

def gen_list(items=None, tag='td'):
    items = [] if items is None else items
    ...

We can also consider adding flake8 and flake8-bugbear as a pre-test step (bugbear's B006 check flags mutable argument defaults).
