MLOps Zoomcamp

Our MLOps Zoomcamp course

Overview

Objective

Teach practical aspects of productionizing ML services — from collecting requirements to model deployment and monitoring.

Target audience

Data scientists and ML engineers, as well as software engineers and data engineers interested in learning about putting ML into production.

Pre-requisites

  • Python
  • Docker
  • Being comfortable with the command line
  • Prior exposure to machine learning (at work or from other courses, e.g. from ML Zoomcamp)
  • Prior programming experience (1+ years of professional experience)

Timeline

Course start: May 16

Syllabus

There are six modules in the course and one project at the end. Each module consists of 1-2 lessons and a homework assignment. Each lesson is 60-90 minutes long.

This is a draft and will change.

Module 1: Introduction

  • What is MLOps
  • MLOps maturity model
  • Running example: NY Taxi trips dataset
  • Why do we need MLOps
  • Course overview
  • Environment preparation

Module 2: Processes

  • CRISP-DM, CRISP-ML
  • ML Canvas
  • Data Landscape canvas
  • (optional) MLOps Stack Canvas
  • Documentation practices in ML projects (Model Cards Toolkit)

Instructors: Larysa Visengeriyeva

2 hours

Module 3: Training

  • Tracking experiments
  • MLflow
  • Model registry
  • ML pipelines, TFX, Kubeflow Pipelines
  • Scheduling pipelines (Airflow?)
  • Model testing

Instructors: Cristian Martinez, Theofilos Papapanagiotou

Homework:

  • ? Possibly something with MLflow, as it’s easier to run locally

Module 4: Serving

  • Batch vs online
  • For online: web services vs streaming
  • Serving models with Kubeflow+Kubernetes (refer to ML Zoomcamp)
  • Serving models in Batch mode (AWS Batch, Spark)
  • Streaming (Kinesis/SQS + AWS Lambda)

Instructors: Alexey Grigorev

Homework:

  • Deploy a model with Spark (local mode)
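
To make the batch vs streaming distinction concrete, here is a minimal sketch, not course material. It assumes a toy `predict_duration` function standing in for a trained model; the function names and the record fields (`ride_id`, `trip_distance`) are hypothetical. The same prediction logic is reused in both modes, and only the way records arrive differs.

```python
from typing import Dict, Iterable, Iterator, List, Tuple


def predict_duration(ride: Dict) -> float:
    """Hypothetical stand-in for a trained model: minutes from distance."""
    return 5.0 + 3.0 * ride["trip_distance"]


def score_batch(rides: Iterable[Dict]) -> List[float]:
    """Batch mode: score a whole dataset in one go, e.g. on a schedule."""
    return [predict_duration(ride) for ride in rides]


def score_stream(events: Iterable[Dict]) -> Iterator[Tuple[str, float]]:
    """Streaming mode: emit one prediction per event as it arrives."""
    for event in events:
        yield event["ride_id"], predict_duration(event)
```

In a real deployment the batch path would likely read from files or tables (for example with Spark), and the streaming path would consume from a queue such as Kinesis or SQS; both are left out here.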

Module 5: Monitoring

  • ML monitoring vs software monitoring
  • Data quality monitoring
  • Data drift / concept drift
  • Batch vs real-time monitoring
  • Tools: Evidently
  • Tools: Prometheus/Grafana

Instructors: Emeli Dral

Homework:

  • ?

Other things:

  • Data quality issues
  • Alerts
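
As a rough illustration of what a data drift check can look like, here is a minimal sketch using the Population Stability Index (PSI) with only the standard library. It is an assumption-laden stand-in, not the Evidently API and not necessarily the method taught in the course; the function name `psi`, the bin count and the `eps` floor are all hypothetical choices.

```python
import math
from collections import Counter


def psi(reference, current, bins=10, eps=1e-6):
    """Population Stability Index between two numeric samples.

    Bins are equal-width and defined by the reference sample's range.
    """
    lo, hi = min(reference), max(reference)
    width = (hi - lo) / bins or 1.0  # avoid zero width for constant data

    def shares(values):
        # Assign each value to a reference-defined bin, clipping outliers
        # into the first or last bin.
        counts = Counter(
            min(max(int((v - lo) / width), 0), bins - 1) for v in values
        )
        # Floor empty bins at eps so the logarithm below stays defined.
        return [max(counts.get(b, 0) / len(values), eps) for b in range(bins)]

    ref, cur = shares(reference), shares(current)
    return sum((c - r) * math.log(c / r) for r, c in zip(ref, cur))
```

Identical samples give a PSI of 0, and the value grows as the current distribution moves away from the reference. Which threshold should trigger an alert depends on the use case and is not fixed by this sketch.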

Module 6: Best practices

  • DevOps
  • Virtual environments and Docker
  • Python: logging, linting
  • Testing: unit, integration, regression
  • CI/CD (GitHub Actions)
  • Infrastructure as code (Terraform, CloudFormation)
  • Cookiecutter
  • Makefiles

Instructors: Sejal Vaidya

Homework:

  • ?
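
To give a flavour of the logging and unit testing topics, here is a minimal sketch, assuming a hypothetical feature-preparation step for the taxi trips example. The function name and the field names (`pickup_zone`, `dropoff_zone`, `trip_distance`) are made up for illustration. The test functions follow the pytest naming convention but use plain `assert`, so they need no extra dependencies.

```python
import logging

logger = logging.getLogger(__name__)


def prepare_features(ride):
    """Turn one raw ride record into model features (hypothetical schema)."""
    distance = ride["trip_distance"]
    if distance < 0:
        # Log the bad value before failing so it shows up in service logs.
        logger.warning("Rejecting ride with negative distance: %s", distance)
        raise ValueError("trip_distance must be non-negative")
    return {
        "route": f"{ride['pickup_zone']}_{ride['dropoff_zone']}",
        "trip_distance": distance,
    }


def test_prepare_features_builds_route():
    ride = {"pickup_zone": 43, "dropoff_zone": 151, "trip_distance": 1.5}
    assert prepare_features(ride) == {"route": "43_151", "trip_distance": 1.5}


def test_prepare_features_rejects_negative_distance():
    ride = {"pickup_zone": 43, "dropoff_zone": 151, "trip_distance": -1.0}
    try:
        prepare_features(ride)
    except ValueError:
        return
    raise AssertionError("expected ValueError")
```

Saved in a file such as `test_features.py` (a hypothetical name), these tests could be run locally with pytest or wired into a CI workflow.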

Project

  • End-to-end project with all the things above

Running example

To make it easier to connect different modules together, we’d like to use the same running example throughout the course.

Possible candidates:

Instructors

  • Larysa Visengeriyeva
  • Cristian Martinez
  • Theofilos Papapanagiotou
  • Alexey Grigorev
  • Emeli Dral
  • Sejal Vaidya

Other courses from DataTalks.Club:

FAQ

I want to start preparing for the course. What can I do?

If you haven't used Flask or Docker

If you have no previous experience with ML

  • Check Module 1 from ML Zoomcamp for an overview
  • Module 3 of ML Zoomcamp will also be helpful if you want to learn Scikit-Learn (we'll use it in this course)
