mlinfra-io / mlinfra

Deploy ML infrastructure and MLOps tooling anywhere, quickly and with best practices, using a single command.

Home Page: https://mlinfra.io/

License: Apache License 2.0

Smarty 0.40% HCL 43.83% Shell 2.82% Python 52.95%
mlops platform-engineering distributed-ml python software-engineering ml-infrastructure

mlinfra's Introduction

Open source MLOps infrastructure deployment on Public Cloud providers

Open source MLOps: Open source tools for the different stages of an MLOps lifecycle.
Public Cloud Providers: Supporting all major cloud providers, including AWS, GCP, Azure and Alibaba.

mlinfra is the Swiss Army knife for deploying scalable MLOps infrastructure. It aims to make MLOps infrastructure deployment easy and accessible to all ML teams by liberating the IaC logic for creating MLOps stacks, which is usually tied to other frameworks.

Contribute to the project by opening an issue or by joining the roadmap and design discussions on Discord. The complete roadmap will be released soon!

🚀 Installation

Requirements

mlinfra requires the following to run perfectly:

  • terraform >= 1.4.0 should be installed on the system.
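
The version requirement above can be checked before a deployment. The following is a minimal sketch, not part of mlinfra itself: the helper names `parse_version` and `meets_minimum` are hypothetical, and the input is assumed to be a version string such as the one printed by `terraform version`.

```python
import re

# Minimum Terraform version, taken from the requirement stated above.
MINIMUM_TERRAFORM = (1, 4, 0)


def parse_version(text: str) -> tuple:
    """Extract the first MAJOR.MINOR.PATCH triple found in a version string."""
    match = re.search(r"(\d+)\.(\d+)\.(\d+)", text)
    if match is None:
        raise ValueError(f"no version number found in {text!r}")
    return tuple(int(part) for part in match.groups())


def meets_minimum(text: str, minimum: tuple = MINIMUM_TERRAFORM) -> bool:
    """Return True when the version in `text` is at least `minimum`."""
    # Tuples compare element by element, so (1, 6, 5) >= (1, 4, 0) holds.
    return parse_version(text) >= minimum
```

For example, `meets_minimum("Terraform v1.6.5")` evaluates to `True`, while `meets_minimum("1.3.9")` evaluates to `False`.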

mlinfra can be installed by creating a Python virtual environment and installing the mlinfra pip package:

python -m venv venv
source venv/bin/activate
pip install mlinfra

Copy a deployment config from the examples folder, set your AWS account ID in the config file, configure your AWS credentials, and deploy the configuration using:

mlinfra terraform --action apply --stack-config-path <path-to-your-config>
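
When the command above is driven from a script, it can help to assemble the arguments in one place. This is a minimal sketch under stated assumptions: `build_mlinfra_command` is a hypothetical helper, and the set of accepted actions is assumed to be the plan, apply and destroy actions named in the Development section of this README.

```python
import shlex

# Assumption: the actions named in the Development section of this README.
ALLOWED_ACTIONS = ("plan", "apply", "destroy")


def build_mlinfra_command(action: str, stack_config_path: str) -> list:
    """Build the argument list for `mlinfra terraform` without running it."""
    if action not in ALLOWED_ACTIONS:
        raise ValueError(f"unsupported action {action!r}; expected one of {ALLOWED_ACTIONS}")
    return [
        "mlinfra", "terraform",
        "--action", action,
        "--stack-config-path", stack_config_path,
    ]


if __name__ == "__main__":
    # Print the command for inspection; the list could instead be passed
    # to subprocess.run(command, check=True) to execute it.
    command = build_mlinfra_command("plan", "my-stack.yaml")
    print(shlex.join(command))
```

Building the command as a list rather than a single string avoids shell quoting problems when the config path contains spaces.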

For more information, read the mlinfra user guide

Supported Providers

The core purpose is to build for all cloud and deployment platforms out there. Any user should be able to change just the cloud provider or runtime environment (whether Linux or Windows) and still deploy the same tools.

Currently, a lot of the work has been done around AWS.

This project will support the following providers: AWS, GCP, Azure and Alibaba.

Supported MLOps Tools

mlinfra intends to support as many MLOps tools as possible, deployable on a platform in both standalone and high-availability modes, across the different layers of an MLOps stack:

  • data_versioning
  • experiment_tracker
  • orchestrator
  • artifact_tracker / model_registry
  • model_inference
  • monitoring
  • alerting

Deployment Config

  • mlinfra deploys infrastructure using a declarative approach. It requires resources to be defined in a YAML file with the following format:
name: aws-mlops-stack
provider:
  name: aws
  account-id: xxxxxxxxx
  region: eu-central-1
deployment:
  type: cloud_vm # (this would create ec2 instances and then deploy applications on it)
stack:
  data_versioning:
    - lakefs # can also be pachyderm or neptune and so on
  secrets_manager:
    - secrets_manager # can also be vault or any other
  experiment_tracker:
    - mlflow # can be weights and biases or determined, or neptune or clearml and so on...
  orchestrator:
    - zenml # can also be argo, or luigi, or airflow, or dagster, or prefect or flyte or kubeflow and so on...
  orchestrator:
    - aws-batch # can also be aws step functions or aws-fargate or aws-eks or azure-aks and so on...
  runtime_engine:
    - ray # can also be horovod or apache spark
  artifact_tracker:
    - mlflow # can also be neptune or clearml or lakefs or pachyderm or determined or wandb and so on...
  # model registry and serving are quite close, need to think about them...
  model_registry:
    - bentoml # can also be mlflow or neptune or determined and so on...
  model_serving:
    - nvidia triton # can also be bentoml or fastapi or cog or ray or seldoncore or tf serving
  monitoring:
    - nannyML # can be grafana or alibi or evidently or neptune or mlflow or prometheus or weaveworks and so on...
  alerting:
    - mlflow # can also be neptune or determined or weaveworks or prometheus or grafana and so on...
  • This is a minimal spec for AWS as the infrastructure provider with custom applications. Other stack layers, such as feature_store, event streamers, loggers or cost dashboards, can be added via community requests.
  • For more information, please check out the docs for detailed documentation.
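
Note that the example spec lists `orchestrator` twice under `stack`. Some YAML loaders keep only the last occurrence of a repeated key without warning, so one of the two entries could be silently dropped. The sketch below is a pre-flight check for that case. It is a line-based scan rather than a YAML parser, it assumes the two-space indentation used in the example, and `find_duplicate_stack_keys` is a hypothetical helper, not part of mlinfra.

```python
from collections import Counter


def find_duplicate_stack_keys(yaml_text: str) -> list:
    """Return keys that appear more than once directly under `stack:`."""
    counts = Counter()
    in_stack = False
    for raw_line in yaml_text.splitlines():
        line = raw_line.split("#", 1)[0].rstrip()  # drop trailing comments
        if not line.strip():
            continue  # skip blank and comment-only lines
        indent = len(line) - len(line.lstrip())
        if indent == 0:
            # A new top-level key starts; track whether it is `stack:`.
            in_stack = line.strip() == "stack:"
        elif in_stack and indent == 2 and line.endswith(":"):
            counts[line.strip()[:-1]] += 1
    return [key for key, seen in counts.items() if seen > 1]


EXAMPLE = """\
stack:
  experiment_tracker:
    - mlflow # comment text is ignored
  orchestrator:
    - zenml
  orchestrator:
    - aws-batch
"""

print(find_duplicate_stack_keys(EXAMPLE))  # prints ['orchestrator']
```

Running such a check before deployment makes a repeated layer visible instead of leaving the outcome to the loader.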

Vision

  • Over years of creating and deploying ML platforms for multiple teams, I realised that MLOps infrastructure deployment is neither as easy nor as common as it should be. Teams often start on the wrong foot, which leads to months of planning and implementing MLOps infrastructure. This project is an attempt to create a common MLOps infrastructure deployment framework that any ML team can use to deploy their MLOps stack with a single command.

Development

  • This project relies on Terraform for IaC code and on Python to glue it all together.
  • To get started, install Terraform and Python.
  • You can install the required Python packages by running pip install -r requirements-dev.txt
  • You can run any of the examples in the examples folder by changing into the src directory (cd src) and running invoke terraform --stack-config-path examples/<application>/<cloud>-<application>.yaml --action <action>, where <action> is a Terraform action such as plan, apply or destroy.

For more information, please refer to the project's Engineering Wiki (https://mlinfra.io/user_guide/), which describes the different components of the project and how they work together.

Contributions

  • Contributions are welcome! Help us onboard all of the available MLOps tools on the currently available cloud providers.
  • For major changes, please open an issue first to discuss what you would like to change. A team member will get back to you soon.
  • For information on the general development workflow, see the contribution guide.

License

The mlinfra library is distributed under the Apache License 2.0.

mlinfra's People

Contributors

aliabbasjaffri, dependabot[bot]

mlinfra's Issues

[BUG]: mlinfra not taking the path of local file

Python Version

3.10.7

Terraform Version

v1.6.5

Other Dependencies

No response

Issue Details

In cloud_infra_deployment we're using relative_project_root instead of absolute_project_root. This needs to be corrected.
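
To illustrate the difference the report describes, here is a minimal sketch of turning a user-supplied path into an absolute one. It is not the project's actual fix: `resolve_config_path` is a hypothetical helper, and anchoring relative paths at the current working directory is an assumption.

```python
from pathlib import Path


def resolve_config_path(stack_config_path: str) -> Path:
    """Return an absolute path for a config file given on the command line.

    Relative inputs are anchored at the current working directory, so the
    result does not depend on where the package itself is installed.
    """
    return Path(stack_config_path).expanduser().resolve()


if __name__ == "__main__":
    # A relative path copied from the examples folder becomes absolute.
    print(resolve_config_path("examples/my-stack.yaml"))
```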

Steps to Reproduce

  1. Copy an example file and run a terraform command on it

Previous Report Check

  • I confirm that I have searched the existing issues and this issue has not been previously reported.

Willingness to Contribute

Yes, I would like to work on this issue.

Configure amplitude for gathering statistics of usage

Currently the only statistics that we're gathering are stargazers and PyPI downloads. We want to use Amplitude to track project metrics better, e.g. how many times platinfra apply and platinfra destroy have been used, which docs are viewed the most, and which docs can be improved.

update-github-issue-template

This issue aims to update the GitHub config and issue template to:

  • allow better tagging of issues
  • gather comprehensive information from the user
  • gather enough context about an issue before jumping on it
