azure-databricks-exercise's Introduction

Azure Databricks Hands-on (Tutorials)

Follow the instructions in each notebook below.

  1. Storage Settings
  2. Basics of PySpark and Spark Machine Learning
  3. Spark Machine Learning Pipeline
  4. Hyper-parameter Tuning
  5. MLeap (requires ML runtime)
  6. Horovod Runner on Databricks Runtime for ML (requires ML runtime)
  7. Structured Streaming (Basic)
  8. Structured Streaming with Azure Event Hubs or Kafka
  9. Delta Lake
  10. Work with MLflow (requires ML runtime)
  11. Orchestration with Azure Data Services

Before you start

  • Create an Azure Databricks resource in Microsoft Azure and launch the workspace. For details, ask your instructor or see the Quickstart.

  • Create a compute cluster in the Databricks workspace. (Select "Compute" in the workspace UI.)
    Databricks Runtime 10.2 ML or later is recommended for this tutorial.

  • Download HandsOn.dbc and import it into your workspace.

    • Select "Workspace" in the workspace UI.
    • Go to your user folder.
    • Click the arrow to the right of your e-mail address and select the "Import" command to import HandsOn.dbc.
  • Open the imported notebook and attach your cluster to it. (Select the cluster at the top of the notebook.)
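
Once a cluster is attached, the runtime recommendation above can be checked from a notebook cell. The following is a minimal sketch, not part of the original notebooks. It assumes the cluster exposes its runtime version in the `DATABRICKS_RUNTIME_VERSION` environment variable as a string that starts with `major.minor` (for example `10.2`); the helper names `parse_runtime_version` and `meets_minimum_runtime` are hypothetical, and the check does not tell you whether the runtime is an ML runtime.

```python
import os
import re

# Minimum version recommended by this tutorial (10.2).
MINIMUM_RUNTIME = (10, 2)


def parse_runtime_version(version):
    """Return (major, minor) from a string that starts with 'major.minor'."""
    match = re.match(r"\s*(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"Unrecognized runtime version: {version!r}")
    return int(match.group(1)), int(match.group(2))


def meets_minimum_runtime(version, minimum=MINIMUM_RUNTIME):
    """True if the given version string is at least the minimum version."""
    return parse_runtime_version(version) >= minimum


# Assumption: this variable is set on the cluster; outside Databricks it is absent.
runtime = os.environ.get("DATABRICKS_RUNTIME_VERSION")
if runtime is None:
    print("DATABRICKS_RUNTIME_VERSION is not set; is this running on a cluster?")
elif meets_minimum_runtime(runtime):
    print(f"Runtime {runtime} meets the recommended minimum.")
else:
    print(f"Runtime {runtime} is older than the recommended 10.2.")
```

If the variable is missing or formatted differently in your environment, the runtime version shown on the cluster's configuration page in the "Compute" view is the authoritative source.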

Note: You cannot use an Azure free trial subscription, because its vCPU quota is too limited. If you are on a trial subscription, upgrade it to Pay-As-You-Go. (Your remaining credit is retained when you move to Pay-As-You-Go.)

Additional resources for further exploration

Modified by Ed Fine @Afinepoint
Links to code provided to keep up to date. Original code by Tsuyoshi Matsuzaki @ Microsoft
