
rafaelpierre / pyjaws


PyJaws: A Pythonic Way to Define Databricks Jobs and Workflows

Home Page: https://pyjaws.readthedocs.io

License: MIT License

Makefile 0.23% Python 83.14% Jinja 16.63%
airflow cicd data-engineering databricks mlops orchestration spark apache-airflow apache-spark pyspark

pyjaws's People

Contributors

aaravind100, rafaelpierre, rafaelvp-db

pyjaws's Issues

Support for Databricks SDK

Expected Behavior

Currently, PyJaws uses the Databricks Jobs and Workflows REST API. We should replace it with the Databricks Python SDK. Benefits:

  • Not having to adapt to potential changes in the API
  • Removing dependency on Databricks CLI (which doesn't have 100% coverage at the moment)
  • Removing the complexity associated with Jinja templating, which will help increase the coverage of PyJaws as well
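
A minimal sketch of what an SDK-based path might look like, assuming the `databricks-sdk` package is installed and workspace credentials are configured in the environment. `TaskSpec`, `to_sdk_kwargs`, and `create_job` are hypothetical names used for illustration and are not part of PyJaws. The SDK import is deferred so the pure-Python mapping can be exercised without it.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class TaskSpec:
    """Hypothetical, simplified stand-in for a PyJaws task definition."""
    key: str
    notebook_path: str
    cluster_id: str
    depends_on: List[str] = field(default_factory=list)


def to_sdk_kwargs(spec: TaskSpec) -> dict:
    """Map a TaskSpec to plain keyword values, with no templating involved."""
    return {
        "task_key": spec.key,
        "notebook_path": spec.notebook_path,
        "existing_cluster_id": spec.cluster_id,
        "depends_on": list(spec.depends_on),
    }


def create_job(name: str, specs: List[TaskSpec]) -> int:
    """Create a job through the Databricks SDK instead of a rendered REST payload."""
    # Deferred import: assumes `pip install databricks-sdk`.
    from databricks.sdk import WorkspaceClient
    from databricks.sdk.service import jobs

    tasks = []
    for spec in specs:
        kw = to_sdk_kwargs(spec)
        tasks.append(
            jobs.Task(
                task_key=kw["task_key"],
                existing_cluster_id=kw["existing_cluster_id"],
                notebook_task=jobs.NotebookTask(notebook_path=kw["notebook_path"]),
                depends_on=[jobs.TaskDependency(task_key=k) for k in kw["depends_on"]],
            )
        )
    # WorkspaceClient picks up credentials from environment variables or a config profile.
    client = WorkspaceClient()
    return client.jobs.create(name=name, tasks=tasks).job_id
```

With this shape, typed objects are handed to the SDK directly, so no Jinja template has to be rendered into a JSON request body.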

Current Behavior

PyJaws integrates with the Databricks REST API.

Steps to Reproduce (for bugs)

Context

Your Environment

  • pyjaws version used:

Add an option to export JSON / YAML

Expected Behavior

It should be possible to output the job payload as JSON or YAML if needed.
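
One way such an export could work, sketched under the assumption that the payload is already available as a plain dict. `export_payload` and the sample payload are hypothetical. YAML output assumes the third-party PyYAML package, which is imported only when that format is requested.

```python
import json


def export_payload(payload: dict, fmt: str = "json") -> str:
    """Serialize a job payload dict to a JSON or YAML string."""
    if fmt == "json":
        return json.dumps(payload, indent=2)
    if fmt == "yaml":
        # Deferred import: PyYAML is a third-party dependency (assumption).
        import yaml
        return yaml.safe_dump(payload, sort_keys=False)
    raise ValueError(f"Unsupported format: {fmt!r}")


# Hypothetical payload in the rough shape of a Databricks job definition.
payload = {"name": "demo", "tasks": [{"task_key": "ingest"}]}
print(export_payload(payload, "json"))
```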

Current Behavior

Steps to Reproduce (for bugs)

Context

Your Environment

  • pyjaws version used:

implicit context managers

Expected Behavior

Change the way Cluster context managers work so that clusters no longer need to be assigned explicitly to tasks declared inside a with Cluster(...) as cluster: block.
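
The behavior requested here could be sketched roughly as follows, using the standard library's `contextvars` to track the active cluster. The `Cluster` and `Task` classes below are simplified, hypothetical stand-ins and do not reflect the actual PyJaws classes or their constructor arguments.

```python
from contextvars import ContextVar
from typing import Optional

# Holds the cluster of the innermost active `with Cluster(...)` block, if any.
_current_cluster = ContextVar("current_cluster", default=None)


class Cluster:
    """Hypothetical, simplified cluster definition."""

    def __init__(self, name: str):
        self.name = name
        self._token = None

    def __enter__(self):
        # Remember the previous value so nested blocks restore it on exit.
        self._token = _current_cluster.set(self)
        return self

    def __exit__(self, exc_type, exc, tb):
        _current_cluster.reset(self._token)
        return False  # do not swallow exceptions


class Task:
    """Hypothetical, simplified task definition."""

    def __init__(self, key: str, cluster: Optional[Cluster] = None):
        self.key = key
        # An explicit argument wins; otherwise fall back to the enclosing block.
        self.cluster = cluster if cluster is not None else _current_cluster.get()


with Cluster("etl-cluster"):
    ingest = Task("ingest")        # no cluster argument needed
    transform = Task("transform")

orphan = Task("orphan")            # declared outside any block

print(ingest.cluster.name, transform.cluster.name, orphan.cluster)
# prints: etl-cluster etl-cluster None
```

Keeping the explicit `cluster=` argument as an override means existing job definitions that pass the cluster by hand would continue to work.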

Current Behavior

Steps to Reproduce (for bugs)

Context

Your Environment

  • pyjaws version used:

Airflow DAG converter

Expected Behavior

Users should be able to convert Airflow DAGs to PyJaws workflows.
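
As a rough sketch of the dependency-mapping part of such a converter: the function below reads only `dag.tasks`, `task.task_id`, and `task.upstream_task_ids`, which Airflow DAGs and operators expose, and emits dicts using the `task_key` / `depends_on` layout of Databricks job tasks. `dag_to_task_specs` is a hypothetical name, and translating what each operator actually runs is left out of scope.

```python
from types import SimpleNamespace


def dag_to_task_specs(dag) -> list:
    """Convert an Airflow-like DAG into Databricks-style task dicts.

    Only `dag.tasks`, `task.task_id`, and `task.upstream_task_ids` are read.
    """
    specs = []
    for task in dag.tasks:
        specs.append({
            "task_key": task.task_id,
            # Sort upstream ids so the output is deterministic.
            "depends_on": [{"task_key": up} for up in sorted(task.upstream_task_ids)],
        })
    return specs


# Stand-in objects mimicking the attributes read above (no Airflow import needed).
fake_dag = SimpleNamespace(tasks=[
    SimpleNamespace(task_id="extract", upstream_task_ids=set()),
    SimpleNamespace(task_id="load", upstream_task_ids={"extract"}),
])
print(dag_to_task_specs(fake_dag))
```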

Current Behavior

Steps to Reproduce (for bugs)

Context

Your Environment

  • pyjaws version used:

context manager capabilities for cluster config

Expected Behavior

It should be possible to write something like with Cluster(...) as cluster: and declare tasks within this with block. The cluster object would then be assigned to all tasks declared within the block.

Current Behavior

NA

Steps to Reproduce (for bugs)

Context

Your Environment

  • pyjaws version used:
