
annielytix / advanced-databricks-for-ml-build-2019


Using Azure Databricks (Spark) for ML, this is the //build 2019 repository with homework examples, code and notebooks



build2019-Advanced Azure Databricks for ML

This is the repository presented at //build 2019 on using Azure Databricks (Spark) for ML, with additional homework examples, code, and notebooks.

Welcome

Welcome to the //build 2019 Advanced Databricks Challenge. We will focus on hands-on activities that develop proficiency in advanced Databricks concepts such as data exploration using Spark, building supervised and unsupervised learning models, evaluating models, and using advanced libraries like MMLSpark. These challenges assume an introductory-to-intermediate knowledge of Azure Databricks. If that does not describe you, please work through the Introduction to Databricks challenges first.

Goals

Most of the difficulties customers run into in these areas come from stitching multiple services together. As such, where possible, we have tried to place key concepts in the context of a broader example.

At the end of this workshop, you should be able to:

  • Use Azure Databricks to build ML models, including:

    • Supervised Learning (classification)
    • Unsupervised Learning (clustering / recommendation)
  • Evaluate those models using Azure Databricks

  • Understand libraries: an introduction to MMLSpark and when to use it

  • Get an introduction to Deep Learning

Background Knowledge

This workshop is meant for a Data Scientist on Azure who actively scripts using a common data science language like Python. Since this is only a short workshop, there are certain things you will need to read or set up after you arrive.

First, you should have some previous exposure to Python. We will be using it for everything we build in the workshop, so you should be familiar with how to use it to create ML models. Additionally, this is not a class that teaches you how to choose the correct algorithm for a business scenario. We assume you have some familiarity with these concepts ahead of time.

Second, you should have some experience with Azure Databricks and its core concepts, including workspaces, libraries, and so on. If not, please check out the Intro to Azure Databricks workshop first.

Third, you should have experience with the Azure portal and be able to create resources (and spend money) on Azure. We will not be providing Azure passes for this workshop.

For fun, I have included an EU soccer example (.DBC) as well as a Retail Fashion example and, by popular demand, a Pandas UDF Benchmark notebook to help you get started with your User Defined Functions with Pandas. Please let me know if you have any questions.

Challenges

Business Case I - Azure Databricks

  1. Start by following the steps in the README to provision your Azure environment and fork both the labs below and the notebooks used in the challenges.
  2. Challenge 0 - Administration. Please note: you do not need to run through Administration if you are an attendee of //build (see the note below for when to use this Databricks archive).
  3. Challenge 1 - Exploring Data with Spark.
  4. Challenge 2 - Building Supervised Learning Models.
  5. Challenge 3 - Evaluating Supervised Learning Models.
  6. Challenge 4 - Recommenders and Clustering.
  7. Challenge 5 - Using the MMLSpark Library.

Note: The Challenge 0 - Administration archive is provided to help you run this workshop in your own offices after the fact.
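
As a rough illustration of the kind of evaluation that the supervised learning challenges are about, here is a minimal, framework-independent sketch of the standard binary classification metrics. It is not taken from the workshop notebooks: the names `BinaryMetrics` and `evaluate_binary` are hypothetical, and the code assumes labels and predictions are already collected as plain lists of 0/1 values.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class BinaryMetrics:
    """Summary metrics for a binary classifier (hypothetical helper type)."""
    accuracy: float
    precision: float
    recall: float
    f1: float


def evaluate_binary(labels: Sequence[int], predictions: Sequence[int]) -> BinaryMetrics:
    """Compute accuracy, precision, recall and F1 from 0/1 labels and predictions."""
    if len(labels) != len(predictions):
        raise ValueError("labels and predictions must have the same length")
    if len(labels) == 0:
        raise ValueError("at least one example is required")

    pairs = list(zip(labels, predictions))
    # Count the four cells of the confusion matrix.
    tp = sum(1 for y, p in pairs if y == 1 and p == 1)
    tn = sum(1 for y, p in pairs if y == 0 and p == 0)
    fp = sum(1 for y, p in pairs if y == 0 and p == 1)
    fn = sum(1 for y, p in pairs if y == 1 and p == 0)

    accuracy = (tp + tn) / len(pairs)
    # Fall back to 0.0 when a denominator is zero (nothing predicted
    # positive, or no positive labels present).
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return BinaryMetrics(accuracy, precision, recall, f1)


if __name__ == "__main__":
    y_true = [1, 1, 1, 0, 0, 0, 1, 0]
    y_pred = [1, 1, 0, 0, 0, 1, 1, 0]
    print(evaluate_binary(y_true, y_pred))
```

On a Databricks cluster you would more likely use Spark ML's built-in evaluators in `pyspark.ml.evaluation`, such as `BinaryClassificationEvaluator` or `MulticlassClassificationEvaluator`, which compute metrics directly on a DataFrame of predictions. The plain-Python version above is only meant to show what those numbers mean.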

Discussion Forum

  • SWAG given for most active participants
  • Q&A and Feedback
