Comments (20)
#1256 adds formal support for Airflow in Metaflow. Docs & release announcement to follow soon!
from metaflow.
I would second @impredicative 's comment that this is probably too broad.
In particular, I think there's potential, independent value of having a plugin implementation of a k8s cli, compute environment, and decorator. Based on a quick scan, it doesn't seem like there's too much functionality there to implement -- just make a kube job definition, come up with an annotation scheme (probably can do something similar to what airflow does), and handle cleanup. Drop in some example RBAC templates and you're probably good to go.
I think it would probably be fine to stop at container/job orchestration, and leave things like cluster autoscaling to pointers to existing k8s docs and tools.
The scheduler (e.g., Airflow, AWS Step Functions, or Argo) seems like a separate discussion that's out of scope for a question about Kubernetes.
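To make the "kube job definition plus annotation scheme plus cleanup" idea above concrete, here is a minimal sketch in plain Python. It is not Metaflow's actual implementation: the `build_step_job` helper, the `example.org/...` annotation keys, the image, and the command are all hypothetical. It only builds a standard `batch/v1` Job manifest as a dict and relies on `ttlSecondsAfterFinished` for cleanup.

```python
import json


def build_step_job(flow_name, run_id, step_name, image, command, ttl_seconds=600):
    """Return a batch/v1 Job manifest (as a dict) for a single flow step.

    Hypothetical helper for illustration; not part of Metaflow.
    """
    # Kubernetes object names must be lowercase and may not contain underscores.
    job_name = f"{flow_name}-{run_id}-{step_name}".lower().replace("_", "-")

    # Hypothetical annotation scheme that ties the Job back to its flow run.
    annotations = {
        "example.org/flow-name": flow_name,
        "example.org/run-id": str(run_id),
        "example.org/step-name": step_name,
    }

    return {
        "apiVersion": "batch/v1",
        "kind": "Job",
        "metadata": {"name": job_name, "annotations": annotations},
        "spec": {
            # Do not let Kubernetes retry the pod; leave retries to the caller.
            "backoffLimit": 0,
            # Cleanup: the Job and its pods are deleted this long after finishing.
            "ttlSecondsAfterFinished": ttl_seconds,
            "template": {
                "metadata": {"annotations": annotations},
                "spec": {
                    "restartPolicy": "Never",
                    "containers": [
                        {"name": "step", "image": image, "command": list(command)}
                    ],
                },
            },
        },
    }


manifest = build_step_job(
    flow_name="HelloFlow",
    run_id=17,
    step_name="train",
    image="python:3.11",  # placeholder image
    command=["python", "-c", "print('train step')"],  # placeholder command
)
print(json.dumps(manifest, indent=2))
```

The printed JSON could be passed to `kubectl apply -f -`, assuming the caller's service account has RBAC permission to create Jobs in the target namespace.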
from metaflow.
Why not compile to Kubeflow Pipelines via an intermediate representation (IR) [1]?
from metaflow.
For folks following this thread, we recently announced equivalent support for AWS Step Functions. Here is an article with more details.
from metaflow.
@JoshZastrow I thought that Metaflow could integrate with Kubeflow, which is the machine learning toolkit for Kubernetes.
from metaflow.
@JoshZastrow Thanks for opening the issue! Yes, we are evaluating and prioritizing our roadmap currently.
from metaflow.
And what about using Argo instead of Airflow? (https://github.com/argoproj/argo)
Can it be included in this issue, or should it be a separate one?
from metaflow.
@nlaille Let's track that as a separate issue so that people can vote and weigh in with their opinions.
from metaflow.
IMHO this issue is too broad. Let me separate the use of Airflow with and without Kubernetes. You probably don't need Metaflow if you're using Airflow with Kubernetes. You may need Metaflow as an Airflow executor and an Airflow operator if you're using Airflow without Kubernetes.
Admittedly, I'm not entirely familiar with everything Metaflow offers just yet.
I love open source software and solutions including Airflow which I use, but I believe this issue should be closed unless the o.p. can substantiate what Metaflow would meaningfully add to the Airflow with Kubernetes combo.
from metaflow.
The orchestration part could be handled by a cloud service like AWS Step Functions, a container-based orchestrator like Argo, or another orchestrator like Airflow.
One reasonable option is to map the Metaflow DAG to a Step Functions/Argo/Airflow DAG and execute it remotely. The compute resources would need to change correspondingly. I totally agree with @impredicative's point: unless users have clear requirements, this integration is not that meaningful.
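As a rough illustration of what mapping a flow's DAG onto another scheduler's DAG involves, the sketch below uses a hypothetical in-memory representation (a dict from each step to the steps that follow it) and only the Python standard library. It is not how Metaflow compiles flows; it just derives the two things most schedulers need: a per-task list of upstream dependencies and a valid execution order.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical flow: each step maps to the steps that run after it.
flow_graph = {
    "start": ["prepare_a", "prepare_b"],
    "prepare_a": ["join"],
    "prepare_b": ["join"],
    "join": ["end"],
    "end": [],
}


def to_upstream(graph):
    """Invert 'step -> next steps' into 'step -> set of upstream steps'."""
    upstream = {step: set() for step in graph}
    for step, next_steps in graph.items():
        for nxt in next_steps:
            upstream.setdefault(nxt, set()).add(step)
    return upstream


def dependency_edges(graph):
    """List (upstream, downstream) pairs, the shape most schedulers accept."""
    return sorted(
        (step, nxt) for step, next_steps in graph.items() for nxt in next_steps
    )


upstream = to_upstream(flow_graph)
# TopologicalSorter expects 'node -> predecessors', which is what upstream holds.
order = list(TopologicalSorter(upstream).static_order())
edges = dependency_edges(flow_graph)

print("execution order:", order)
for up, down in edges:
    print(f"{up} >> {down}")
```

Each `(upstream, downstream)` pair would become one dependency declaration in the target system, for example a task-ordering statement in an Airflow DAG file or a `dependencies` entry in an Argo DAG template.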
from metaflow.
@talebzeghmi Yes, an IR for KfP would be great. Is there an RFC for it? We are happy to contribute our thoughts.
from metaflow.
There are existing mechanisms for triggering workflows based on external events.
For clarity, what are these? What if I want to trigger it on a schedule like Airflow allows me to do?
from metaflow.
For AWS Step Functions, we provide time-based triggers out of the box right now. You can very easily configure other triggers (say data availability in S3 using Amazon EventBridge).
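For readers unfamiliar with EventBridge, a data-availability trigger is typically expressed as an event pattern on a rule whose target is the state machine. The sketch below builds such a pattern for S3 "Object Created" events as plain JSON. The bucket name and the `matches` helper are hypothetical, and `matches` covers only the exact-value subset of EventBridge pattern matching, for illustration. It also assumes EventBridge notifications are enabled on the bucket.

```python
import json


def s3_object_created_pattern(bucket):
    """Event pattern for 'Object Created' events from a single S3 bucket."""
    return {
        "source": ["aws.s3"],
        "detail-type": ["Object Created"],
        "detail": {"bucket": {"name": [bucket]}},
    }


def matches(pattern, event):
    """Simplified matcher: nested dicts plus lists of allowed exact values.

    Real EventBridge patterns support more operators (prefix, numeric, ...).
    """
    for key, expected in pattern.items():
        if key not in event:
            return False
        actual = event[key]
        if isinstance(expected, dict):
            if not isinstance(actual, dict) or not matches(expected, actual):
                return False
        elif actual not in expected:
            return False
    return True


pattern = s3_object_created_pattern("my-training-data")  # hypothetical bucket

sample_event = {
    "source": "aws.s3",
    "detail-type": "Object Created",
    "detail": {
        "bucket": {"name": "my-training-data"},
        "object": {"key": "daily/2024-01-01.csv"},
    },
}

print(json.dumps(pattern, indent=2))
print("sample event matches:", matches(pattern, sample_event))
```

The printed pattern is what would go into the rule's event pattern field. A time-based rule would use a schedule expression such as `cron(0 8 * * ? *)` (daily at 08:00 UTC) instead of a pattern.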
from metaflow.
@savingoyal is there any support for event-based triggers? (e.g. REST API)
from metaflow.
@lucianoviola Yes, you can use Amazon EventBridge to do event-based triggering of Step Functions workflows.
from metaflow.
#50 (comment) If you would like to try out and give feedback on our Kubernetes integration, please reach out at http://slack.outerbounds.co
from metaflow.
#992 provides GA support for Kubernetes. https://github.com/outerbounds/metaflow/tree/airflow is tracking the Airflow integration on top of Kubernetes.
from metaflow.
Kubernetes support was delivered through Argo Workflows, great!
#992 (Dispatch Metaflow flows to Argo Workflows)
from metaflow.
This branch tracks the work for this issue.
from metaflow.
https://outerbounds.com/blog/better-airflow-with-metaflow/
from metaflow.