googlecloudplatform / airflow-operator

Kubernetes custom controller and CRDs to manage Airflow

License: Apache License 2.0

Languages: Go 98.23%, Makefile 1.48%, Dockerfile 0.30%
kubernetes kubernetes-operator apache-airflow airflow crd kubernetes-controller workflow-engine airflow-operator

airflow-operator's Introduction


This is not an officially supported Google product.

Community

Project Status

Alpha

The Airflow Operator is still under active development and has not been extensively tested in production environments. Backward compatibility of the APIs is not guaranteed for alpha releases.

Prerequisites

  • Kubernetes version >= 1.9.
  • Airflow 1.9 (1.10.1+ for the Kubernetes Executor).
  • Redis 4.0.x (for the Celery executor).
  • MySQL 5.7.

Get Started

One Click Deployment from Google Cloud Marketplace to your GKE cluster

Get started quickly with the Airflow Operator using the Quick Start Guide

For more information, check the Design and the detailed User Guide.
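
As a rough sketch of the deployment flow (the Makefile targets and kustomize pipeline below are taken from the issues quoted later on this page and may differ by release):

git clone https://github.com/GoogleCloudPlatform/airflow-operator.git
cd airflow-operator
make install NOTGCP=true     # per the installation issue below; NOTGCP skips the gcloud steps
make deploy                  # per the kustomize issue below: kustomize build config/default | kubectl apply -f -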

Airflow Operator Overview

Airflow Operator is a custom Kubernetes operator that makes it easy to deploy and manage Apache Airflow on Kubernetes. Apache Airflow is a platform to programmatically author, schedule and monitor workflows. Using the Airflow Operator, an Airflow cluster is split into two parts, represented by the AirflowBase and AirflowCluster custom resources. The Airflow Operator performs these jobs:

  • Creates and manages the necessary Kubernetes resources for an Airflow deployment.
  • Updates the corresponding Kubernetes resources when the AirflowBase or AirflowCluster specification changes.
  • Restores managed Kubernetes resources that are deleted.
  • Supports creation of Airflow schedulers with different Executors.
  • Supports sharing of the AirflowBase across multiple AirflowClusters.

Check out the Design.
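
As a rough illustration of the two-resource split (the AirflowCluster fields follow the samples quoted in the issues below; the AirflowBase fields and the executor value are assumptions and may not match the current CRD schema):

apiVersion: airflow.k8s.io/v1alpha1
kind: AirflowBase
metadata:
  name: pc-base
spec:
  mysql: {}        # assumption: enables the shared MySQL component
  storage: {}      # assumption: enables the shared NFS/DAG storage component
---
apiVersion: airflow.k8s.io/v1alpha1
kind: AirflowCluster
metadata:
  name: pc-cluster
spec:
  executor: Celery   # assumption: Local and Kubernetes executors appear in samples below
  ui:
    replicas: 1
  scheduler: {}
  worker:
    replicas: 2
  airflowbase:
    name: pc-base    # points the cluster at the shared base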

Airflow Cluster

Development

Refer to the Design and Development Guide.

Managed Airflow solution

Google Cloud Composer is a fully managed workflow orchestration service targeting customers that need a workflow manager in the cloud.

airflow-operator's People

Contributors

barney-s, cedbossneo, hxquangnhat, jcunhasilva, jie8357ioii, migueltp, pabloem


airflow-operator's Issues

Add comprehensive e2e tests

The end-to-end (e2e) tests to be added are BDD tests using Ginkgo.
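
A skeleton of what one of these specs could look like (a sketch only, using Ginkgo/Gomega as proposed here; the helper functions are placeholders, not existing test utilities):

package e2e

import (
	"time"

	. "github.com/onsi/ginkgo"
	. "github.com/onsi/gomega"
)

var _ = Describe("AirflowBase", func() {
	It("should create airflow-base mysql and nfs components", func() {
		By("creating an AirflowBase object with storage and mysql enabled")
		base := createAirflowBase() // placeholder helper

		By("waiting for the mysql and nfs StatefulSets to become ready")
		Eventually(func() bool { return statefulSetsReady(base) }, // placeholder helper
			5*time.Minute, 10*time.Second).Should(BeTrue())

		By("verifying the root password secret exists and each StatefulSet has 1 pod")
		Expect(rootPasswordSecretExists(base)).To(BeTrue()) // placeholder helper
	})
})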

It(“should create airflow-base mysql and nfs components”)

Step 1: create AirflowBase object with storage and mysql enabled
Step 2: wait for mysql StatefulSet and nfs StatefulSet to become ready
all pods have to be available
Step 3: verify
for mysql, the root password secret should be created
MySQL and nfs stateful sets should have 1 pod each

It(“should create airflow-base sqlproxy and nfs components”)

Step 1: create AirflowBase object with storage and sqlproxy enabled
Step 2: wait for sqlproxy StatefulSet and nfs StatefulSet to become ready
all pods have to be available
Step 3: verify
for cloudsql (sqlproxy), the root password secret should be created
sqlproxy and nfs stateful sets should have 1 pod each

It(“should create airflow-cluster components using mysql base and celery executor”)

Step 1: create AirflowBase object with storage and mysql enabled
Step 2: wait for mysql StatefulSet and nfs StatefulSet to become ready
all pods have to be available
Step 3: create AirflowCluster object with redis, ui, scheduler and workers enabled and celery executor
Step 4: wait for redis, ui, scheduler and worker StatefulSets to become ready
all pods have to be available
Step 5: verify
All stateful sets have 1 pod each except workers which should have 2 pods
Scheduler is configured correctly to connect to mysql, and celery connection string points to redis instance
UI is configured correctly to connect to mysql
Workers celery config and mysql config is correct
Scheduler, ui and workers all synced the DAGs from git repo

It(“should create airflow-cluster components using cloudsql base and celery executor”)

Step 1: create AirflowBase object with storage and sqlproxy enabled
Step 2: wait for sqlproxy StatefulSet and nfs StatefulSet to become ready
all pods have to be available
Step 3: create AirflowCluster object with redis, ui, scheduler and workers enabled and celery executor
Step 4: wait for redis, ui, scheduler and worker StatefulSets to become ready
all pods have to be available
Step 5: verify
All stateful sets have 1 pod each except workers which should have 2 pods
Scheduler is configured correctly to connect to cloudsql, and celery connection string points to redis instance
UI is configured correctly to connect to mysql
Workers celery config and mysql config is correct
Scheduler, ui and workers all synced the DAGs from git repo

It(“should create airflow-cluster components using mysql base and celery executor with DAGs in GCS bucket”)

Step 1: create AirflowBase object with storage and mysql enabled
Step 2: wait for mysql StatefulSet and nfs StatefulSet to become ready
all pods have to be available
Step 3: create AirflowCluster object with redis, ui, scheduler and workers enabled and DAGs pointing to GCS bucket and celery executor
Step 4: wait for redis, ui, scheduler and worker StatefulSets to become ready
all pods have to be available
Step 5: verify
All stateful sets have 1 pod each except workers which should have 2 pods
Scheduler is configured correctly to connect to mysql, and celery connection string points to redis instance
UI is configured correctly to connect to mysql
Workers celery config and mysql config is correct
Scheduler, ui and workers all synced the DAGs from GCS bucket

It(“should support monitoring by scraping output of INFO command”)

Step 1: create AirflowBase object with storage and mysql enabled
Step 2: wait for mysql StatefulSet and nfs StatefulSet to become ready
all pods have to be available
Step 3: create AirflowCluster object with redis, ui, scheduler and workers enabled and celery executor
Step 4: wait for redis, ui, scheduler and worker StatefulSets to become ready
all pods have to be available
Step 5: verify
use curl to scrape monitoring endpoint of scheduler for prometheus-style INFO details
check that the scraped INFO details have the necessary airflow metrics

It(“should support scaling for workers”)

Step 1: create AirflowBase object with storage and mysql enabled
Step 2: wait for mysql StatefulSet and nfs StatefulSet to become ready
all pods have to be available
Step 3: create AirflowCluster object with redis, ui, scheduler and workers enabled and celery executor
Step 4: wait for redis, ui, scheduler and worker StatefulSets to become ready
all pods have to be available
Step 5: scale up workers by one (via modifying .spec.worker.replicas)
Step 6: verify new worker pod is created
Step 7: scale down workers by one (via modifying .spec.worker.replicas)
Step 8: verify the latest worker pod is deleted
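
Outside the e2e suite, the same scale step can be sketched with kubectl (assuming the CRD resource is named airflowcluster and the cluster is called my-airflow-cluster; the label selector is a placeholder):

# bump .spec.worker.replicas on the custom resource
kubectl patch airflowcluster my-airflow-cluster \
  --type merge -p '{"spec":{"worker":{"replicas":3}}}'

# confirm the new worker pod appears (placeholder label selector)
kubectl get pods -l app=airflow-worker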

It(“should respect topology constraints when scheduling shard and sentinel pods”)

Step 1: create AirflowBase object with storage and mysql enabled
Step 2: wait for mysql StatefulSet and nfs StatefulSet to become ready
all pods have to be available
Step 3: create AirflowCluster object with redis, ui, scheduler and workers enabled and celery executor and .spec.affinity set to “failure-domain.beta.kubernetes.io/zone”
Step 4: wait for redis, ui, scheduler and worker StatefulSets to become ready
all pods have to be available
Step 5: check all airflow cluster pods are scheduled with respect to topology constraint
Step 6: repeat steps 1 to 5 with .spec.affinity specified as kubernetes.io/hostname

It(“should create airflow-cluster components using mysql base and local executor”)

Step 1: create AirflowBase object with storage and mysql enabled
Step 2: wait for mysql StatefulSet and nfs StatefulSet to become ready
all pods have to be available
Step 3: create AirflowCluster object with ui, scheduler enabled and local executor
Step 4: wait for ui, scheduler StatefulSets to become ready
all pods have to be available
Step 5: verify
All stateful sets have 1 pod each
Scheduler is configured correctly to connect to mysql
UI is configured correctly to connect to mysql
Scheduler, ui all synced the DAGs from git repo

It(“should create airflow-cluster components using cloudsql base and local executor”)

Step 1: create AirflowBase object with storage and sqlproxy enabled
Step 2: wait for sqlproxy StatefulSet and nfs StatefulSet to become ready
all pods have to be available
Step 3: create AirflowCluster object with ui, scheduler enabled and local executor
Step 4: wait for ui, scheduler StatefulSets to become ready
all pods have to be available
Step 5: verify
All stateful sets have 1 pod each
Scheduler is configured correctly to connect to sqlproxy
UI is configured correctly to connect to sqlproxy
Scheduler, ui all synced the DAGs from git repo

It(“should create airflow-cluster components using mysql base and k8s executor”)

Step 1: create AirflowBase object with storage and mysql enabled
Step 2: wait for mysql StatefulSet and nfs StatefulSet to become ready
all pods have to be available
Step 3: create AirflowCluster object with ui, scheduler enabled and k8s executor
Step 4: wait for ui, scheduler StatefulSets to become ready
all pods have to be available
Step 5: verify
All stateful sets have 1 pod each
Scheduler is configured correctly to connect to mysql
UI is configured correctly to connect to mysql
Scheduler, ui all synced the DAGs from git repo

It(“should create airflow-cluster components using cloudsql base and k8s executor”)

Step 1: create AirflowBase object with storage and sqlproxy enabled
Step 2: wait for sqlproxy StatefulSet and nfs StatefulSet to become ready
all pods have to be available
Step 3: create AirflowCluster object with ui, scheduler enabled and k8s executor
Step 4: wait for ui, scheduler StatefulSets to become ready
all pods have to be available
Step 5: verify
All stateful sets have 1 pod each
Scheduler is configured correctly to connect to sqlproxy
UI is configured correctly to connect to sqlproxy
Scheduler, ui all synced the DAGs from git repo

git-sync on airflow does not support SSH authentication

The current object only supports the username + password authentication method:

if sp.DAGs.Git.CredSecretRef != nil {

In order to fully support git-sync, we need to (1) set GIT_SYNC_SSH to true and (2) mount a volume that points to the SSH key, as described in https://github.com/kubernetes/git-sync/blob/9ceb61f7947fbe463b1cc6e9ae5d719f5d8eebd2/docs/ssh.md#step-3-configure-git-sync-container
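
A sketch of what the git-sync sidecar would need, following the linked git-sync SSH docs (the image tag, mount path and Secret name are placeholders and depend on the git-sync version):

containers:
- name: git-sync
  image: k8s.gcr.io/git-sync:v3.x   # placeholder tag
  env:
  - name: GIT_SYNC_REPO
    value: git@github.com:MyOrg/my-repo.git
  - name: GIT_SYNC_SSH
    value: "true"
  volumeMounts:
  - name: git-secret
    mountPath: /etc/git-secret
    readOnly: true
volumes:
- name: git-secret
  secret:
    secretName: git-creds        # Secret containing the SSH private key
    defaultMode: 0400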

Manifest generation doesn't work with Kustomize v2.0+

Following the instructions in quickstart.md

$ kustomize version
Version: {KustomizeVersion:v2.0.3 GitCommit:a6f65144121d1955266b0cd836ce954c04122dc8 BuildDate:2019-03-18T22:15:21+00:00 GoOs:darwin GoArch:amd64}

$ make deploy
kustomize build config/default | kubectl apply -f -
Error: rawResources failed to read Resources: Load from path ../rbac/rbac_role.yaml failed: security; file '../rbac/rbac_role.yaml' is not in or below '/Users/zx8/.go/src/github.com/GoogleCloudPlatform/airflow-operator/config/default'

Related issue: kubernetes-sigs/kustomize#766

Advice - handling deployments whilst a DAG is running

First of all, I am thrilled you're working on this operator! Also, great work on Composer.

I was wondering if anyone was prepared to discuss achieving DAG reliability whilst a component is being deployed on Airflow. Since Kubernetes can routinely reschedule pods, I imagine this requires higher reliability from DAGs.

When using Airflow to say run a Spark job on Dataproc, what would happen to a DAG run if a restart were to happen? Do you have any advice on improving reliability?

Please feel free to reply and talk offline if that's useful. Hopefully you can provide some input.

Kubernetes Executor: worker pods are not executing

I was testing the fixes made here: #31

Although Git authentication is working now, I am still not able to run our workers using the Kubernetes executor.

When the worker pod is launched, no work is done; at least, the scheduler doesn't get any feedback and the UI doesn't display any execution updates for the DAG run.

GCS DAG Sync - Service Account Secret

Currently, the GCS DAG sync uses the node's service account to connect to the GCS bucket.
The node's SA may not have access to the GCS bucket.

DAG sync should be able to mount a secret containing the SA to use.
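
One hedged way to support this (a standard GKE pattern, not something the operator exposes today) is to mount the SA key from a Secret into the GCS sync container and point GOOGLE_APPLICATION_CREDENTIALS at it; the Secret name below is a placeholder:

# container fragment (sketch only)
env:
- name: GOOGLE_APPLICATION_CREDENTIALS
  value: /var/secrets/google/key.json
volumeMounts:
- name: gcs-sa
  mountPath: /var/secrets/google
  readOnly: true
# pod-level volume
volumes:
- name: gcs-sa
  secret:
    secretName: gcs-dag-sync-sa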

GCS & 1.10.2

Using DAG provisioning through GCS and the latest release, the pods continuously crash-loop.

airflow.exceptions.AirflowConfigException: In kubernetes mode the following must be set in the `kubernetes` config section: `dags_volume_claim` or `dags_volume_host` or `dags_in_image` or `git_repo and git_branch and git_dags_folder_mount_point`

Is there any specific config to bypass this? I've tried manually setting dags_volume_host, but it does not seem to be passed to the actual pods.

Create documentation

Create a docs folder explaining the design and the Custom Resources, as well as proposals.

Airflow 1.10.0

Any plan to upgrade to the latest Airflow, which was released not so long ago?

Allow dags_in_image option for k8s

I was trying to run this project using the k8s executor with the dags_in_image option and no dags section:

apiVersion: airflow.k8s.io/v1alpha1
kind: AirflowCluster
metadata:
  name: pk-cluster
spec:
  executor: Kubernetes
  config:
    airflow:
      AIRFLOW__KUBERNETES__DAGS_IN_IMAGE: "1"
  ui:
    image: "my-custom-airflow-image"
    version: "1.10.2"
    replicas: 1
  scheduler:
    image: "my-custom-airflow-image"
    version: "1.10.2"
  worker:
    image: "my-custom-airflow-image"
    version: "1.10.2"
  airflowbase:
    name: pc-base

However, I was getting errors when trying to run the operator:

  Conditions:
    Last Transition Time:  2019-06-06T06:39:42Z
    Last Update Time:      2019-06-06T06:39:42Z
    Message:               templates/airflow-configmap.yaml:template: tmpl:174:25: executing "tmpl" at <.Cluster.Spec.DAGs.G...>: can't evaluate field Repo in type *v1alpha1.GitSpec
    Reason:                ErrorSeen
    Status:                True
    Type:                  Error

I believe the issue is caused by a hardcoded lookup on the spec values for git:

git_repo = {{.Cluster.Spec.DAGs.Git.Repo}}
git_branch = {{.Cluster.Spec.DAGs.Git.Branch}}
git_subpath = {{.Cluster.Spec.DAGs.DagSubdir}}
git_dags_folder_mount_point = /usr/local/airflow/dags/
git_sync_dest = gitdags
git_user =
git_password =

Not exactly sure how to fix this in a proper manner, but removing these lines in the operator image was enough to get me past this issue.

Thanks a lot
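
One possible direction, sketched as an assumption against the template lines quoted above, is to guard the git-specific keys so they only render when a Git spec is present (the remaining git_* keys would need the same treatment):

{{ if .Cluster.Spec.DAGs.Git }}
git_repo = {{ .Cluster.Spec.DAGs.Git.Repo }}
git_branch = {{ .Cluster.Spec.DAGs.Git.Branch }}
git_subpath = {{ .Cluster.Spec.DAGs.DagSubdir }}
{{ end }}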

GIT authentication is not working during cluster setup

I created the following cluster using the airflow operator deployed directly from Google Cloud Marketplace:

apiVersion: airflow.k8s.io/v1alpha1
kind: AirflowCluster
metadata:
  name: my-airflow-cluster
spec:
  executor: Kubernetes
  ui:
    replicas: 1
    version: "1.10.1"
  scheduler:
    version: "1.10.1"
  worker:
    version: "1.10.1"
  config:
    airflow:
      AIRFLOW__SCHEDULER__DAG_DIR_LIST_INTERVAL: 100
  dags:
    subdir: "pipelines/"
    git:
      repo: "https://github.com/MyOrg/my-repo.git"
      once: false
      branch: master
      user: MY_USER
      cred:
        name: MY_GIT_SECRET
  airflowbase:
    name: my-airflow-base

Once the cluster is up and running, I get the following error in the git-sync sidecar container:
Cloning into '/git'...\nfatal: could not read Username for 'https://github.com': No such device or address\n"

So it seems the container is struggling with GIT authentication. When inspecting the container's environment variables I can see that these variables were created:

GIT_PASSWORD: defined as key password of MY_GIT_SECRET
GIT_USER: my_user

I stumbled upon this issue in the git-sync project where other people had issues setting up authentication:
kubernetes/git-sync#126

In this issue we have the following configuration

...
    - name: GIT_SYNC_USERNAME
      valueFrom:
        secretKeyRef:
          name: git-creds
          key: username
    - name: GIT_SYNC_PASSWORD
      valueFrom:
        secretKeyRef:
          name: git-creds
          key: password
...

So it seems the airflow-operator is using the wrong prefix for the authentication ("GIT_" instead of "GIT_SYNC_").

Unit tests return errors

When I run make test, I get the following error:
E0206 17:47:07.034186 87463 genericreconciler.go:50] Failed: [*v1alpha1.AirflowBase/default/foo] observing resources. no matches for kind "Application" in version "app.k8s.io/v1beta1"

Does anyone have any idea on this error?

Quick Start Guide does not work

The vendors/k8s.io/ directory structure has changed since the guide was written, and the Makefile needs to be updated to reflect the change.

Support for secrets in the spec.config.airflow field

Environment variables are passed to our airflow containers using the "airflow" field:

config:
    airflow:
      AIRFLOW__CORE__FERNET_KEY: "MY_SECRET_KEY"

In this case we are defining the fernet key, which is sensitive information. We should be able to use secrets in this field.

We have this block in airflow.go:

for _, k := range keys {
    env = append(env, corev1.EnvVar{Name: k, Value: sp.Config.AirflowEnv[k]})
}

Maybe we can have a separate field ("airflowSecrets") to inject secret values:

for _, k := range keys {
    env = append(env, corev1.EnvVar{Name: k, ValueFrom: envFromSecret(sp.Config.AirflowSecrets[k].name, sp.Config.AirflowSecrets[k].field)})
}
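
A hedged sketch of such a helper (envFromSecret does not exist in the codebase today; the signature simply mirrors the proposal above):

import corev1 "k8s.io/api/core/v1"

// envFromSecret builds an EnvVarSource that reads the value from a key in a Secret.
func envFromSecret(name, key string) *corev1.EnvVarSource {
	return &corev1.EnvVarSource{
		SecretKeyRef: &corev1.SecretKeySelector{
			LocalObjectReference: corev1.LocalObjectReference{Name: name},
			Key:                  key,
		},
	}
}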

Infinite restart loop when Airflow authentication is activated

I tried setting Airflow authentication using the config field in the cluster:

apiVersion: airflow.k8s.io/v1alpha1
kind: AirflowCluster
metadata:
  name: my-airflow-cluster
spec:
  executor: Local
  ui:
    replicas: 1
    version: "1.10.1"
  scheduler:
    version: "1.10.1"
  config:
    airflow:
      AIRFLOW__SCHEDULER__DAG_DIR_LIST_INTERVAL: "100"
      AIRFLOW__WEBSERVER__AUTHENTICATE: "True"
      AIRFLOW__WEBSERVER__AUTH_BACKEND: "airflow.contrib.auth.backends.password_auth"

The env variables work properly as the login screen is displayed when we enter the Airflow UI. However, both the ui and scheduler are being killed and restarted every 2 minutes. It seems that the controller thinks they are in an inconsistent state and forces the restart.

When I inspect the controller logs I don't see any apparent difference from those available when the authentication is turned off.

./hack/sample configurations not working

Created an image based off of the tip of master and deployed.

Attempted to use the Postgres/Celery and MySQL/Celery configurations from the ./hack/sample directories, and the scheduler never successfully started.

The scheduler log showed that it was getting access denied for the account.
After connecting to the DB and setting the password to the value stored in the webui secret, the scheduler started normally.

Support multiple worker 'types'

Add support for multiple worker types. For example, a high memory worker and a 'default' worker.

A couple of ways this could be implemented:

  • An option for an AirflowCluster to be worker-only (i.e. no flower, ui, or scheduler) and attach to the same base. Manually set your Redis?
  • An AirflowWorker CRD so you can define them multiple times
  • A slice of workers ([]Worker) within AirflowCluster

Airflow Operator + Postgres Cloud SQL - Airflow UI pod won't restart after it is killed

After we start the Airflow Operator with the base and cluster configurations, using the Postgres Cloud SQL Configuration, an error is thrown in the "postgres-dbcreate" init container whenever the Airflow UI pod is restarted:
Database already exists

This container needs to check if the database already exists before creating the database and tables.
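
A minimal sketch of an idempotent check the init container could run instead (the database name and connection details below are placeholders):

# create the database only if it does not already exist
psql -h "$DB_HOST" -U postgres -tAc \
  "SELECT 1 FROM pg_database WHERE datname = 'airflow'" | grep -q 1 || \
  psql -h "$DB_HOST" -U postgres -c "CREATE DATABASE airflow"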

Incorrect `Kind` in ownerRef

I am trying to run Airflow Operator on OpenShift 4 and I am hitting an interesting issue:

ERROR: logging before flag.Parse: E0930 15:45:08.736404       1 genericreconciler.go:52] Failed: [*v1alpha1.AirflowBase/airflowop-system/pc-base(cmpnt:*airflowbase.Postgres)] Create. statefulsets.apps "pc-base-postgres" is forbidden: cannot set blockOwnerDeletion in this case because cannot find RESTMapping for APIVersion airflow.k8s.io/v1alpha1 Kind *v1alpha1.AirflowBase: no matches for kind "*v1alpha1.AirflowBase" in version "airflow.k8s.io/v1alpha1"

I traced the source of this to https://github.com/GoogleCloudPlatform/airflow-operator/blob/master/vendor/sigs.k8s.io/controller-reconciler/pkg/genericreconciler/genericreconciler.go#L219 - i.e. the Kind gets set to the actual type (*v1alpha1.AirflowBase) instead of just AirflowBase.

I also noticed this function https://github.com/GoogleCloudPlatform/airflow-operator/blob/master/pkg/apis/airflow/v1alpha1/airflowbase_types.go#L513 which looks like that is what we actually want to be used, but it never enters that function during execution.

It also seems the controller-reconciler project is gone.

I was able to work around the issue by turning off blockOwnerDeletion, but I don't like that at all :) It unblocked me to experiment more with Airflow, but I would like to find a real solution.

Any thoughts on how to solve this?

Add Env variable to Container

How to overwrite airflow config with environment variables?
I tried the following in the AirflowCluster YAML, but the operator did not pick it up:

spec:
  config:
    airflow:
      AIRFLOW__WEBSERVER__NAVBAR_COLOR: #f5f5f5
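
Separately from operator support, note that an unquoted # starts a comment in YAML, so the value above is parsed as empty; it needs to be quoted:

spec:
  config:
    airflow:
      AIRFLOW__WEBSERVER__NAVBAR_COLOR: "#f5f5f5"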

Add support for SQL Proxy connections in our workers

Currently if we want to connect to a Cloud SQL database in our workers/dags, we need to create a connection using the database's Public or Private IP.
The best way to connect to these databases is through SQL Proxy side containers as described here:
https://cloud.google.com/sql/docs/postgres/connect-kubernetes-engine

We should have a way of specifying predefined sql proxy connections to be attached to the scheduler or worker pods. We could define these connections directly in the cluster configuration.

The configuration spec should be similar to the one currently available in the base YAML configuration:

spec:
  sqlproxy:
    project: kubeflow-193622
    region: us-central1
    instance: testsql-cluster

airflow operator question

Hi @barney-s ,

We (Lyft) are exploring running Airflow on k8s internally. I wonder whether Google has run this repo in production. Any hiccups encountered so far?

Thanks,
-Tao

Create application CRD

The Airflow operator requires an application CRD so that details of cluster resources can be displayed via GCP console. I will work on this.

Quickstart Guide references files which do not exist

docs/quickstart.md references files under the manifests dir, which does not exist.

Either:

  1. Tell me which files referenced in the quickstart guide no longer exist under those names, and I am happy to update the docs to point to the correct files.
  2. Update docs/quickstart.md to point to the correct locations.

Local Executor: display worker logs in Stackdriver

When we run our workers using the Local executor, the worker logs are placed in a local folder (/usr/local/airflow/logs/). It would be nice if we could also see these logs in Stackdriver since they disappear when the scheduler pod is restarted.

Documentation on installation

Trying to install on a non-GCE cluster, I get an error right off the bat:

$> make install
make: gcloud: Command not found
make: gcloud: Command not found
go run vendor/sigs.k8s.io/controller-tools/cmd/controller-gen/main.go all
vendor/sigs.k8s.io/controller-tools/cmd/controller-gen/main.go:25:2: cannot find package "sigs.k8s.io/controller-tools/pkg/crd/generator" in any of:
        /usr/src/sigs.k8s.io/controller-tools/pkg/crd/generator (from $GOROOT)
        /home/user/go/src/sigs.k8s.io/controller-tools/pkg/crd/generator (from $GOPATH)
vendor/sigs.k8s.io/controller-tools/cmd/controller-gen/main.go:26:2: cannot find package "sigs.k8s.io/controller-tools/pkg/generate/rbac" in any of:
        /usr/src/sigs.k8s.io/controller-tools/pkg/generate/rbac (from $GOROOT)
        /home/user/go/src/sigs.k8s.io/controller-tools/pkg/generate/rbac (from $GOPATH)
make: *** [Makefile:47: manifests] Error 1

If I try with NOTGCP, I get a different error:

make install NOTGCP=true
go run vendor/sigs.k8s.io/controller-tools/cmd/controller-gen/main.go all
vendor/sigs.k8s.io/controller-tools/cmd/controller-gen/main.go:25:2: cannot find package "sigs.k8s.io/controller-tools/pkg/crd/generator" in any of:
        /usr/src/sigs.k8s.io/controller-tools/pkg/crd/generator (from $GOROOT)
        /home/user/go/src/sigs.k8s.io/controller-tools/pkg/crd/generator (from $GOPATH)
vendor/sigs.k8s.io/controller-tools/cmd/controller-gen/main.go:26:2: cannot find package "sigs.k8s.io/controller-tools/pkg/generate/rbac" in any of:
        /usr/src/sigs.k8s.io/controller-tools/pkg/generate/rbac (from $GOROOT)
        /home/user/go/src/sigs.k8s.io/controller-tools/pkg/generate/rbac (from $GOPATH)
make: *** [Makefile:47: manifests] Error 1

I am probably doing something wrong. I tried installing those packages, but only see this in the corresponding directory:

ls src/sigs.k8s.io/controller-tools/pkg         
total 40K
drwxrwxr-x. 10 user user 4.0K Aug 13 20:18 .
drwxrwxr-x.  6 user user 4.0K Aug 13 20:18 ..
drwxrwxr-x.  4 user user 4.0K Aug 13 20:18 crd
drwxrwxr-x.  3 user user 4.0K Aug 13 20:18 deepcopy
drwxrwxr-x.  3 user user 4.0K Aug 13 20:18 genall
drwxrwxr-x.  3 user user 4.0K Aug 13 20:18 loader
drwxrwxr-x.  2 user user 4.0K Aug 13 20:18 markers
drwxrwxr-x.  3 user user 4.0K Aug 13 20:18 rbac
drwxrwxr-x.  2 user user 4.0K Aug 13 20:18 typescaffold
drwxrwxr-x.  2 user user 4.0K Aug 13 20:18 webhook

Probably user error, but maybe this would benefit from better documentation.

Env variables for custom airflow docker images

Just giving this a spin with my own images, and one of the reasons it doesn't currently work for me is that certain env variables (AIRFLOW__CORE__SQL_ALCHEMY_CONN and AIRFLOW__CORE__FERNET_KEY in particular) are composed in the entrypoint.sh rather than taken directly from env vars. It appears the airflow docker image is based on https://github.com/puckel/docker-airflow, is that correct? Are there reasons why the airflow Dockerfile wouldn't be part of this repo? (I'm using a different docker image which doesn't have the same entrypoint as that one.)

In any event, my suggestion would be to try to keep the Go build process decoupled from any particular docker image/entrypoint. I see the args are configurable in utils.go. For sensitive config parameters (e.g. sql_alchemy_conn and fernet_key), these would probably have to come from Kubernetes secrets and be injected as env vars in the UI, scheduler, and k8s worker.

Clarify docs/api.md regarding field CredSecretRef

Attempting to set up git-sync to use a username and OAuth token, but cannot seem to figure out how to use the CredSecretRef field - can we clarify the docs, perhaps providing an example?

I have tried to set the field using the UID / selfLink of the secret containing the token / password, however I am getting the following error:

spec.dags.git.cred in body must be of type object: "string"

Any clarification would be appreciated!
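
For reference, the AirflowCluster sample quoted earlier on this page sets cred as an object that names a Kubernetes Secret (the Secret is expected to hold the password/token), not a string UID or selfLink; something like:

dags:
  subdir: "pipelines/"
  git:
    repo: "https://github.com/MyOrg/my-repo.git"
    branch: master
    user: MY_USER
    cred:
      name: MY_GIT_SECRET   # name of the Secret holding the token/password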

Add Ingress for UI, Flower

The UI and Flower don't currently have an Ingress option, and you have to port-forward to reach them.

Please add Ingress(es) for the UI and, if using the CeleryExecutor, for Flower.
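
A minimal sketch of the kind of Ingress being requested; the host, Service name and port are hypothetical placeholders, since the operator does not document what it creates for the UI:

apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: airflow-ui
spec:
  rules:
  - host: airflow.example.com            # placeholder host
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: my-airflow-cluster-ui   # hypothetical Service name
            port:
              number: 8080                # hypothetical port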

CloudSQL over private IP address

Currently CloudSQL appears to only allow connections via SQLProxy.

It would be great for this to be extended to allow connections over a private IP.

I understand this is more complex due to selecting networks, etc., if you're creating the SQL instance via this operator, but personally I'd like to use this with an existing Cloud SQL server.
