Comments (8)

benrutter commented on June 12, 2024

I've taken a bit of a look at this, although I haven't been able to get anything up and running yet. Just posting some progress here in case anyone else is looking at it too.

Databricks has this article on getting Ray up and running, which looks helpful and very similar to what getting Dask running should look like. As you put in the description, it suggests using DB_IS_DRIVER and DB_DRIVER_IP.

By my reckoning, that should imply this simple init script would work:

set -e

# Install dask
pip install "dask[complete]"

# On the driver node, start the dask scheduler
if [[ $DB_IS_DRIVER = "TRUE" ]]; then
  dask scheduler &
# On every other node, wait a while, then start a dask worker
else
  sleep 40
  dask worker tcp://$DB_DRIVER_IP:8786 &
fi

Spoiler alert: it doesn't. It seems to start the scheduler fine, but not any workers, or at least something different must be needed to communicate with the workers than what I've put together.

This script registers the client OK, but the submitted job just hangs permanently on "pending", suggesting no workers are registered:

from dask.distributed import Client
import os

client = Client(f'{os.environ["SPARK_LOCAL_IP"]}:8786')

def inc(x):
    return x + 1

x = client.submit(inc, 10)

from dask-databricks.

jacobtomlinson commented on June 12, 2024

Also, @skirui-source shared this updated version of the script that uses nc to poll for the scheduler instead of sleeping for a fixed time, which is a little more robust.

#!/bin/bash

set -e

echo "DB_IS_DRIVER = $DB_IS_DRIVER"
echo "DB_DRIVER_IP = $DB_DRIVER_IP"

pip install --upgrade pip "dask[complete]"

if [[ $DB_IS_DRIVER = "TRUE" ]]; then
  echo "This node is the Dask scheduler."
  dask scheduler &
else
  echo "This node is a Dask worker."
  echo "Connecting to Dask scheduler at $DB_DRIVER_IP:8786"
  # Wait for the scheduler to start
  while ! nc -z "$DB_DRIVER_IP" 8786; do
    echo "Scheduler not available yet. Waiting..."
    sleep 1
  done
  dask worker tcp://$DB_DRIVER_IP:8786 &
fi
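
For anyone who would rather do the polling from Python than rely on nc being installed, here is a minimal sketch of the same idea. The function name `wait_for_port` and the timeout values are my own assumptions, not part of Dask or Databricks. Unlike the shell loop above, it gives up after a deadline instead of waiting forever.

```python
import socket
import time


def wait_for_port(host: str, port: int, timeout: float = 300.0, interval: float = 1.0) -> bool:
    """Return True once a TCP connection to host:port succeeds, False on timeout."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            # Roughly what `nc -z` does: open a connection, then close it again.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            # Scheduler not reachable yet (refused, unreachable or timed out).
            time.sleep(interval)
    return False
```

On a worker node this could be called as `wait_for_port(os.environ["DB_DRIVER_IP"], 8786)` before launching `dask worker`, assuming the scheduler listens on its default port as in the scripts above.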

jacobtomlinson commented on June 12, 2024

Thanks for that @benrutter. I just tried your example and it worked for me!

[screenshot of the notebook output]

benrutter commented on June 12, 2024

Oh nice one! Did submitting a job to the client work and return a response?

I'm curious about why it didn't work on the cluster I tried. What version of the Databricks runtime and what node setup did you have?

jacobtomlinson commented on June 12, 2024

Yeah you can see the result of 11 printed in my screenshot.

Here's the configuration I used.

[screenshot of the cluster configuration]

I wonder if the sleep of 40 seconds is too short to work consistently?

benrutter commented on June 12, 2024

Not sure how I missed that in the screenshot! Could be the 40 seconds, I'll see if I can get it to work by extending it.

benrutter commented on June 12, 2024

Yup, I think it must have been either:

  • The 40 seconds not being enough
  • The driver node not having a wait time, so it was callable before the workers were up

I don't really know enough about when Databricks considers a cluster "ready", but given I was starting up a cluster to immediately run a notebook, it's possible I just needed to wait an extra 40 seconds before scheduling the job.

Either way, I tried again with a 100-second wait both before starting the workers and after the Dask setup on the driver node, and it all worked great.
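
Rather than guessing at a fixed wait on the notebook side, one option is to poll until workers have actually registered before submitting anything. Below is a minimal sketch of that idea. The helper `wait_for_workers` and its `count_workers` callable are hypothetical names of mine, and I haven't run this on a Databricks cluster.

```python
import time
from typing import Callable


def wait_for_workers(count_workers: Callable[[], int], expected: int = 1,
                     timeout: float = 120.0, interval: float = 1.0) -> int:
    """Poll count_workers() until it reports at least `expected` workers.

    Returns the last observed count; raises TimeoutError once the deadline passes.
    """
    deadline = time.monotonic() + timeout
    while True:
        seen = count_workers()
        if seen >= expected:
            return seen
        if time.monotonic() >= deadline:
            raise TimeoutError(f"only {seen} of {expected} workers registered")
        time.sleep(interval)
```

In practice `dask.distributed.Client` already has a `wait_for_workers` method that blocks in a similar way, so calling `client.wait_for_workers(n_workers=1)` before `client.submit` is probably the simpler route. The sketch just makes the logic explicit.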

Nice! What would be involved in setting up a runner to use this method?

jacobtomlinson commented on June 12, 2024

@benrutter awesome, glad you had success!

Now that we've figured some things out and have a POC, I'm not convinced this fits into the runner paradigm anymore. Given the use of init scripts, it would probably be better for us to just provide a small utility to replace most of that script, and also provide some client-side tools for use in the notebook.
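
As a rough illustration of the kind of small utility meant here, the sketch below picks the Dask command for a node from the same DB_IS_DRIVER and DB_DRIVER_IP variables the init scripts above use. The function name `dask_command` is hypothetical and is not an existing dask-databricks API. The default port 8786 is taken from the scripts in this thread.

```python
import os
from typing import List, Mapping


def dask_command(env: Mapping[str, str] = os.environ, port: int = 8786) -> List[str]:
    """Choose the Dask CLI command for this node from the init-script variables."""
    if env.get("DB_IS_DRIVER") == "TRUE":
        # The driver node hosts the scheduler.
        return ["dask", "scheduler"]
    # Every other node runs a worker that connects to the driver.
    return ["dask", "worker", f"tcp://{env['DB_DRIVER_IP']}:{port}"]
```

An init script could then shrink to installing Dask and launching the result, for example with `subprocess.Popen(dask_command())`, ideally after waiting for the scheduler port on worker nodes.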

I've created another repo which will probably also get moved to dask-contrib at some point. I'm going to transfer this issue over there and close it out. Please feel free to engage in the other discussions I'm starting there.
