Giter Site home page Giter Site logo

dask-databricks's Introduction

dask-databricks

Cluster tools for running Dask on Databricks multi-node clusters.

Quickstart

To launch a Dask cluster on Databricks you need to create an init script with the following contents and configure your multi-node cluster to use it.

#!/bin/bash

# Install Dask + Dask Databricks
/databricks/python/bin/pip install --upgrade dask[complete] dask-databricks

# Start Dask cluster components
dask databricks run

Then from your Databricks Notebook you can quickly connect a Dask Client to the scheduler running on the Spark Driver Node.

import dask_databricks

client = dask_databricks.get_client()

Now you can submit work from your notebook to the multi-node Dask cluster.

def inc(x):
    return x + 1

x = client.submit(inc, 10)
x.result()

Dashboard

You can access the Dask dashboard via the Databricks driver-node proxy. The link can be found in Client or DatabricksCluster repr or via client.dashboard_link.

>>> print(client.dashboard_link)
https://dbc-dp-xxxx.cloud.databricks.com/driver-proxy/o/xxxx/xx-xxx-xxxx/8087/status

Releasing

Releases of this project are automated using GitHub Actions and the pypa/gh-action-pypi-publish action.

To create a new release push a tag to the upstream repo in the format x.x.x. The package will be built and pushed to PyPI automatically and then later picked up by conda-forge.

# Make sure you have an upstream remote
git remote add upstream [email protected]:dask-contrib/dask-databricks.git

# Create a tag and push it upstream
git tag x.x.x && git push upstream main --tags

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.