Comments (4)
I think scaling is theoretically possible with something like this: https://docs.databricks.com/api/workspace/clusters#resize I guess the tricky bits are:
- Additional access tokens etc. are required to manage the cluster with the API
- Not accidentally giving people footguns to scale above previously specified max-clusters
- Maybe most importantly, Databricks provides "autoscale" functionality, which I think is normally on (although I have no stats at all). Worst case scenario would be the Dask cluster and Databricks getting in an argument over who's in charge of scaling.
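To make the last two bullets concrete, here is a rough sketch of how a resize request could be guarded before it is sent. It assumes the `POST /api/2.0/clusters/resize` endpoint with a `cluster_id`/`num_workers` body, and that the cluster description returned by the clusters API contains an `autoscale` key when autoscaling is enabled. The function names, the `max_workers` argument and the `host`/`token` parameters are hypothetical and are not part of dask-databricks.

```python
import json
import urllib.request


def build_resize_payload(cluster_info: dict, requested_workers: int, max_workers: int) -> dict:
    """Build a resize request body, refusing to fight Databricks autoscaling.

    cluster_info is assumed to look like the cluster description returned by
    the Databricks clusters API (hypothetical input for this sketch).
    """
    if "autoscale" in cluster_info:
        # Databricks is already in charge of the worker count; back off.
        raise RuntimeError("cluster has autoscale enabled; not resizing manually")
    if requested_workers < 0:
        raise ValueError("requested_workers must be non-negative")
    # Clamp so callers cannot scale past the agreed maximum.
    num_workers = min(requested_workers, max_workers)
    return {"cluster_id": cluster_info["cluster_id"], "num_workers": num_workers}


def send_resize(host: str, token: str, payload: dict) -> int:
    """POST the payload to the (assumed) resize endpoint and return the HTTP status."""
    request = urllib.request.Request(
        f"{host}/api/2.0/clusters/resize",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return response.status
```

Keeping the payload construction separate from the HTTP call means the clamping and autoscale checks can be tested without a workspace or a token.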
from dask-databricks.
In #9 @benrutter added an initial implementation of this, but we should explore migrating the base class over to Cluster instead of LocalCluster.
I'd also be very curious to know if we can somehow scale the Databricks cluster from within our notebooks. If so we could wire up the DatabricksCluster.scale() method.
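As a rough illustration of what "wiring up" could mean, the sketch below delegates `scale(n)` to an injected resize callable. `ScalableClusterSketch` and the `resize` callable are hypothetical stand-ins for this example only; this is not the actual DatabricksCluster implementation, and it does not subclass anything from `distributed`.

```python
from typing import Callable, Optional


class ScalableClusterSketch:
    """Hypothetical stand-in showing a scale() method that delegates to a resize hook."""

    def __init__(self, cluster_id: str, resize: Callable[[str, int], None]):
        self.cluster_id = cluster_id
        # resize is an assumed hook, e.g. a function that calls the Databricks API.
        self._resize = resize
        self.requested_workers: Optional[int] = None

    def scale(self, n: int) -> None:
        """Ask the backing Databricks cluster for n workers."""
        if n < 0:
            raise ValueError("n must be non-negative")
        self._resize(self.cluster_id, n)
        # Record the request only after the resize call did not raise.
        self.requested_workers = n


# Usage with a fake resize function that just records what it was asked to do.
calls = []
cluster = ScalableClusterSketch("abc-123", lambda cid, n: calls.append((cid, n)))
cluster.scale(4)
```

Injecting the resize function keeps the question of how Databricks is contacted (tokens, endpoint, limits) separate from the Dask-facing method.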
Thanks @benrutter, these are all good points. It might not be a sensible thing to do. Your comment does lead me to wonder how the autoscaling is triggered: is it based on node properties such as CPU/memory, or does Spark make those decisions? If it's Spark, then there is no way for Dask to trigger that scale-up.
In general, I think autoscaling Dask on Databricks is an advanced topic that is best saved for later.