googlecloudplatform / airflow-operator

Kubernetes custom controller and CRDs to manage Airflow

License: Apache License 2.0

Languages: Go 98.23%, Makefile 1.48%, Dockerfile 0.30%
kubernetes kubernetes-operator apache-airflow airflow crd kubernetes-controller workflow-engine airflow-operator

airflow-operator's Introduction


This is not an officially supported Google product.

Community

Project Status

Alpha

The Airflow Operator is still under active development and has not been extensively tested in production environments. Backward compatibility of the APIs is not guaranteed for alpha releases.

Prerequisites

  • Kubernetes version >= 1.9.
  • Airflow 1.9 (1.10.1+ for the Kubernetes Executor).
  • Redis 4.0.x (for the Celery executor).
  • MySQL 5.7.

Get Started

One Click Deployment from Google Cloud Marketplace to your GKE cluster

Get started quickly with the Airflow Operator using the Quick Start Guide

For more information, check the Design and the detailed User Guide.
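
As a rough sketch of the deployment flow (the Makefile targets and kustomize pipeline below are taken from the issues quoted later on this page and may differ by release):

git clone https://github.com/GoogleCloudPlatform/airflow-operator.git
cd airflow-operator
make install NOTGCP=true     # per the installation issue below; NOTGCP skips the gcloud steps
make deploy                  # per the kustomize issue below: kustomize build config/default | kubectl apply -f -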

Airflow Operator Overview

Airflow Operator is a custom Kubernetes operator that makes it easy to deploy and manage Apache Airflow on Kubernetes. Apache Airflow is a platform to programmatically author, schedule and monitor workflows. Using the Airflow Operator, an Airflow cluster is split into two parts, represented by the AirflowBase and AirflowCluster custom resources. The Airflow Operator performs these jobs:

  • Creates and manages the necessary Kubernetes resources for an Airflow deployment.
  • Updates the corresponding Kubernetes resources when the AirflowBase or AirflowCluster specification changes.
  • Restores managed Kubernetes resources that are deleted.
  • Supports creation of Airflow schedulers with different Executors.
  • Supports sharing of the AirflowBase across multiple AirflowClusters.

Check out the Design.
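
As a rough illustration of the two-resource split (the AirflowCluster fields follow the samples quoted in the issues below; the AirflowBase fields and the executor value are assumptions and may not match the current CRD schema):

apiVersion: airflow.k8s.io/v1alpha1
kind: AirflowBase
metadata:
  name: pc-base
spec:
  mysql: {}        # assumption: enables the shared MySQL component
  storage: {}      # assumption: enables the shared NFS/DAG storage component
---
apiVersion: airflow.k8s.io/v1alpha1
kind: AirflowCluster
metadata:
  name: pc-cluster
spec:
  executor: Celery   # assumption: Local and Kubernetes executors appear in samples below
  ui:
    replicas: 1
  scheduler: {}
  worker:
    replicas: 2
  airflowbase:
    name: pc-base    # points the cluster at the shared base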

Airflow Cluster

Development

Refer to the Design and Development Guide.

Managed Airflow solution

Google Cloud Composer is a fully managed workflow orchestration service targeting customers that need a workflow manager in the cloud.

airflow-operator's People

Contributors

barney-s, cedbossneo, hxquangnhat, jcunhasilva, jie8357ioii, migueltp, pabloem


airflow-operator's Issues

Add comprehensive e2e tests

The end-to-end (e2e) tests to be added are BDD tests using Ginkgo.
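
A skeleton of what one of these specs could look like (a sketch only, using Ginkgo/Gomega as proposed here; the helper functions are placeholders, not existing test utilities):

package e2e

import (
	"time"

	. "github.com/onsi/ginkgo"
	. "github.com/onsi/gomega"
)

var _ = Describe("AirflowBase", func() {
	It("should create airflow-base mysql and nfs components", func() {
		By("creating an AirflowBase object with storage and mysql enabled")
		base := createAirflowBase() // placeholder helper

		By("waiting for the mysql and nfs StatefulSets to become ready")
		Eventually(func() bool { return statefulSetsReady(base) }, // placeholder helper
			5*time.Minute, 10*time.Second).Should(BeTrue())

		By("verifying the root password secret exists and each StatefulSet has 1 pod")
		Expect(rootPasswordSecretExists(base)).To(BeTrue()) // placeholder helper
	})
})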

It(“should create airflow-base mysql and nfs components”)

Step 1: create AirflowBase object with storage and mysql enabled
Step 2: wait for mysql StatefulSet and nfs StatefulSet to become ready
all pods have to be available
Step 3: verify
for mysql, the root password secret should be created
MySQL and nfs stateful sets should have 1 pod each

It(“should create airflow-base sqlproxy and nfs components”)

Step 1: create AirflowBase object with storage and sqlproxy enabled
Step 2: wait for sqlproxy StatefulSet and nfs StatefulSet to become ready
all pods have to be available
Step 3: verify
for cloudsql (sqlproxy), the root password secret should be created
sqlproxy and nfs stateful sets should have 1 pod each

It(“should create airflow-cluster components using mysql base and celery executor”)

Step 1: create AirflowBase object with storage and mysql enabled
Step 2: wait for mysql StatefulSet and nfs StatefulSet to become ready
all pods have to be available
Step 3: create AirflowCluster object with redis, ui, scheduler and workers enabled and celery executor
Step 4: wait for redis, ui, scheduler and worker StatefulSets to become ready
all pods have to be available
Step 5: verify
All stateful sets have 1 pod each except workers which should have 2 pods
Scheduler is configured correctly to connect to mysql, and celery connection string points to redis instance
UI is configured correctly to connect to mysql
Workers celery config and mysql config is correct
Scheduler, ui and workers all synced the DAGs from git repo

It(“should create airflow-cluster components using cloudsql base and celery executor”)

Step 1: create AirflowBase object with storage and sqlproxy enabled
Step 2: wait for sqlproxy StatefulSet and nfs StatefulSet to become ready
all pods have to be available
Step 3: create AirflowCluster object with redis, ui, scheduler and workers enabled and celery executor
Step 4: wait for redis, ui, scheduler and worker StatefulSets to become ready
all pods have to be available
Step 5: verify
All stateful sets have 1 pod each except workers which should have 2 pods
Scheduler is configured correctly to connect to cloudsql, and celery connection string points to redis instance
UI is configured correctly to connect to mysql
Workers celery config and mysql config is correct
Scheduler, ui and workers all synced the DAGs from git repo

It(“should create airflow-cluster components using mysql base and celery executor with DAGs in GCS bucket”)

Step 1: create AirflowBase object with storage and mysql enabled
Step 2: wait for mysql StatefulSet and nfs StatefulSet to become ready
all pods have to be available
Step 3: create AirflowCluster object with redis, ui, scheduler and workers enabled and DAGs pointing to GCS bucket and celery executor
Step 4: wait for redis, ui, scheduler and worker StatefulSets to become ready
all pods have to be available
Step 5: verify
All stateful sets have 1 pod each except workers which should have 2 pods
Scheduler is configured correctly to connect to mysql, and celery connection string points to redis instance
UI is configured correctly to connect to mysql
Workers celery config and mysql config is correct
Scheduler, ui and workers all synced the DAGs from GCS bucket

It(“should support monitoring by scraping output of INFO command”)

Step 1: create AirflowBase object with storage and mysql enabled
Step 2: wait for mysql StatefulSet and nfs StatefulSet to become ready
all pods have to be available
Step 3: create AirflowCluster object with redis, ui, scheduler and workers enabled and celery executor
Step 4: wait for redis, ui, scheduler and worker StatefulSets to become ready
all pods have to be available
Step 5: verify
use curl to scrape monitoring endpoint of scheduler for prometheus-style INFO details
check that the scraped INFO details have the necessary airflow metrics

It(“should support scaling for workers”)

Step 1: create AirflowBase object with storage and mysql enabled
Step 2: wait for mysql StatefulSet and nfs StatefulSet to become ready
all pods have to be available
Step 3: create AirflowCluster object with redis, ui, scheduler and workers enabled and celery executor
Step 4: wait for redis, ui, scheduler and worker StatefulSets to become ready
all pods have to be available
Step 5: scale up workers by one (via modifying .spec.worker.replicas)
Step 6: verify new worker pod is created
Step 7: scale down workers by one (via modifying .spec.worker.replicas)
Step 8: verify the latest worker pod is deleted
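
Outside the e2e suite, the same scale step can be sketched with kubectl (assuming the CRD resource is named airflowcluster and the cluster is called my-airflow-cluster; the label selector is a placeholder):

# bump .spec.worker.replicas on the custom resource
kubectl patch airflowcluster my-airflow-cluster \
  --type merge -p '{"spec":{"worker":{"replicas":3}}}'

# confirm the new worker pod appears (placeholder label selector)
kubectl get pods -l app=airflow-worker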

It(“should respect topology constraints when scheduling shard and sentinel pods”)

Step 1: create AirflowBase object with storage and mysql enabled
Step 2: wait for mysql StatefulSet and nfs StatefulSet to become ready
all pods have to be available
Step 3: create AirflowCluster object with redis, ui, scheduler and workers enabled and celery executor and .spec.affinity set to “failure-domain.beta.kubernetes.io/zone”
Step 4: wait for redis, ui, scheduler and worker StatefulSets to become ready
all pods have to be available
Step 5: check all airflow cluster pods are scheduled with respect to topology constraint
Step 6: repeat steps 1 to 5 with .spec.affinity specified as kubernetes.io/hostname

It(“should create airflow-cluster components using mysql base and local executor”)

Step 1: create AirflowBase object with storage and mysql enabled
Step 2: wait for mysql StatefulSet and nfs StatefulSet to become ready
all pods have to be available
Step 3: create AirflowCluster object with ui, scheduler enabled and local executor
Step 4: wait for ui, scheduler StatefulSets to become ready
all pods have to be available
Step 5: verify
All stateful sets have 1 pod each
Scheduler is configured correctly to connect to mysql
UI is configured correctly to connect to mysql
Scheduler, ui all synced the DAGs from git repo

It(“should create airflow-cluster components using cloudsql base and local executor”)

Step 1: create AirflowBase object with storage and sqlproxy enabled
Step 2: wait for sqlproxy StatefulSet and nfs StatefulSet to become ready
all pods have to be available
Step 3: create AirflowCluster object with ui, scheduler enabled and local executor
Step 4: wait for ui, scheduler StatefulSets to become ready
all pods have to be available
Step 5: verify
All stateful sets have 1 pod each
Scheduler is configured correctly to connect to sqlproxy
UI is configured correctly to connect to sqlproxy
Scheduler, ui all synced the DAGs from git repo

It(“should create airflow-cluster components using mysql base and k8s executor”)

Step 1: create AirflowBase object with storage and mysql enabled
Step 2: wait for mysql StatefulSet and nfs StatefulSet to become ready
all pods have to be available
Step 3: create AirflowCluster object with ui, scheduler enabled and k8s executor
Step 4: wait for ui, scheduler StatefulSets to become ready
all pods have to be available
Step 5: verify
All stateful sets have 1 pod each
Scheduler is configured correctly to connect to mysql
UI is configured correctly to connect to mysql
Scheduler, ui all synced the DAGs from git repo

It(“should create airflow-cluster components using cloudsql base and k8s executor”)

Step 1: create AirflowBase object with storage and sqlproxy enabled
Step 2: wait for sqlproxy StatefulSet and nfs StatefulSet to become ready
all pods have to be available
Step 3: create AirflowCluster object with ui, scheduler enabled and k8s executor
Step 4: wait for ui, scheduler StatefulSets to become ready
all pods have to be available
Step 5: verify
All stateful sets have 1 pod each
Scheduler is configured correctly to connect to sqlproxy
UI is configured correctly to connect to sqlproxy
Scheduler, ui all synced the DAGs from git repo

git-sync on airflow does not support SSH authentication

The current object only supports the username + password authentication method:

if sp.DAGs.Git.CredSecretRef != nil {

In order to fully support git-sync, we need to (1) set GIT_SYNC_SSH to true and (2) mount a volume that points to the SSH key, as described in https://github.com/kubernetes/git-sync/blob/9ceb61f7947fbe463b1cc6e9ae5d719f5d8eebd2/docs/ssh.md#step-3-configure-git-sync-container
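
A sketch of what the git-sync sidecar would need, following the linked git-sync SSH docs (the image tag, mount path and Secret name are placeholders and depend on the git-sync version):

containers:
- name: git-sync
  image: k8s.gcr.io/git-sync:v3.x   # placeholder tag
  env:
  - name: GIT_SYNC_REPO
    value: git@github.com:MyOrg/my-repo.git
  - name: GIT_SYNC_SSH
    value: "true"
  volumeMounts:
  - name: git-secret
    mountPath: /etc/git-secret
    readOnly: true
volumes:
- name: git-secret
  secret:
    secretName: git-creds        # Secret containing the SSH private key
    defaultMode: 0400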

Manifest generation doesn't work with Kustomize v2.0+

Following the instructions in quickstart.md

$ kustomize version
Version: {KustomizeVersion:v2.0.3 GitCommit:a6f65144121d1955266b0cd836ce954c04122dc8 BuildDate:2019-03-18T22:15:21+00:00 GoOs:darwin GoArch:amd64}

$ make deploy
kustomize build config/default | kubectl apply -f -
Error: rawResources failed to read Resources: Load from path ../rbac/rbac_role.yaml failed: security; file '../rbac/rbac_role.yaml' is not in or below '/Users/zx8/.go/src/github.com/GoogleCloudPlatform/airflow-operator/config/default'

Related issue: kubernetes-sigs/kustomize#766

Advice - handling deployments whilst a DAG is running

First of all, I am thrilled you're working on this operator! Also, great work on Composer.

I was wondering if anyone was prepared to discuss achieving DAG reliability whilst a component is being deployed on Airflow. Since Kubernetes can routinely reschedule pods, I imagine this requires higher reliability from DAGs.

When using Airflow to say run a Spark job on Dataproc, what would happen to a DAG run if a restart were to happen? Do you have any advice on improving reliability?

Please feel free to reply and talk offline if that's useful. Hopefully you can provide some input.

Kubernetes Executor: worker pods are not executing

I was testing the fixes made here: #31

Although Git authentication is working now, I am still not able to run our workers using the Kubernetes executor.

When the worker pod is launched, no work is done; at least, the scheduler doesn't get any feedback and the UI doesn't display any execution updates for the DAG run.

GCS DAG Sync - Service Account Secret

Currently, the GCS DAG sync uses the node's service account to connect to the GCS bucket.
The node's SA may not have access to the GCS bucket.

DAG sync should be able to mount a secret containing the SA to use.
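
One hedged way to support this (a standard GKE pattern, not something the operator exposes today) is to mount the SA key from a Secret into the GCS sync container and point GOOGLE_APPLICATION_CREDENTIALS at it; the Secret name below is a placeholder:

# container fragment (sketch only)
env:
- name: GOOGLE_APPLICATION_CREDENTIALS
  value: /var/secrets/google/key.json
volumeMounts:
- name: gcs-sa
  mountPath: /var/secrets/google
  readOnly: true
# pod-level volume
volumes:
- name: gcs-sa
  secret:
    secretName: gcs-dag-sync-sa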

GCS & 1.10.2

Using DAG provisioning through GCS and the latest release, the pods continuously crash-loop.

airflow.exceptions.AirflowConfigException: In kubernetes mode the following must be set in the `kubernetes` config section: `dags_volume_claim` or `dags_volume_host` or `dags_in_image` or `git_repo and git_branch and git_dags_folder_mount_point`

Is there any specific config to bypass this? I've tried manually setting dags_volume_host, but it does not seem to be passed to the actual pods.

Create documentation

Create a docs folder explaining the design and the Custom Resources, as well as proposals.

Airflow 1.10.0

Any plan to upgrade to the latest Airflow, which was released not so long ago?

Allow dags_in_image option for k8s

I was trying to run this project using the k8s executor with the dags_in_image option and no dags section:

apiVersion: airflow.k8s.io/v1alpha1
kind: AirflowCluster
metadata:
  name: pk-cluster
spec:
  executor: Kubernetes
  config:
    airflow:
      AIRFLOW__KUBERNETES__DAGS_IN_IMAGE: "1"
  ui:
    image: "my-custom-airflow-image"
    version: "1.10.2"
    replicas: 1
  scheduler:
    image: "my-custom-airflow-image"
    version: "1.10.2"
  worker:
    image: "my-custom-airflow-image"
    version: "1.10.2"
  airflowbase:
    name: pc-base

However, I was getting errors when trying to run the operator:

  Conditions:
    Last Transition Time:  2019-06-06T06:39:42Z
    Last Update Time:      2019-06-06T06:39:42Z
    Message:               templates/airflow-configmap.yaml:template: tmpl:174:25: executing "tmpl" at <.Cluster.Spec.DAGs.G...>: can't evaluate field Repo in type *v1alpha1.GitSpec
    Reason:                ErrorSeen
    Status:                True
    Type:                  Error

I believe the issue is caused by a hardcoded lookup on the spec values for git:

git_repo = {{.Cluster.Spec.DAGs.Git.Repo}}
git_branch = {{.Cluster.Spec.DAGs.Git.Branch}}
git_subpath = {{.Cluster.Spec.DAGs.DagSubdir}}
git_dags_folder_mount_point = /usr/local/airflow/dags/
git_sync_dest = gitdags
git_user =
git_password =

Not exactly sure how to fix this in a proper manner, but removing these lines in the operator image was enough to get me past this issue.

Thanks a lot
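
One possible direction, sketched as an assumption against the template lines quoted above, is to guard the git-specific keys so they only render when a Git spec is present (the remaining git_* keys would need the same treatment):

{{ if .Cluster.Spec.DAGs.Git }}
git_repo = {{ .Cluster.Spec.DAGs.Git.Repo }}
git_branch = {{ .Cluster.Spec.DAGs.Git.Branch }}
git_subpath = {{ .Cluster.Spec.DAGs.DagSubdir }}
{{ end }}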

GIT authentication is not working during cluster setup

I created the following cluster using the airflow operator deployed directly from Google Cloud Marketplace:

apiVersion: airflow.k8s.io/v1alpha1
kind: AirflowCluster
metadata:
  name: my-airflow-cluster
spec:
  executor: Kubernetes
  ui:
    replicas: 1
    version: "1.10.1"
  scheduler:
    version: "1.10.1"
  worker:
    version: "1.10.1"
  config:
    airflow:
      AIRFLOW__SCHEDULER__DAG_DIR_LIST_INTERVAL: 100
  dags:
    subdir: "pipelines/"
    git:
      repo: "https://github.com/MyOrg/my-repo.git"
      once: false
      branch: master
      user: MY_USER
      cred:
        name: MY_GIT_SECRET
  airflowbase:
    name: my-airflow-base

Once the cluster is up and running, I get the following error in the git-sync sidecar container:
Cloning into '/git'...\nfatal: could not read Username for 'https://github.com': No such device or address\n"

So it seems the container is struggling with GIT authentication. When inspecting the container's environment variables I can see that these variables were created:

GIT_PASSWORD: defined as key password of MY_GIT_SECRET
GIT_USER: my_user

I stumbled upon this issue in the git-sync project where other people had issues setting up authentication:
kubernetes/git-sync#126

In this issue we have the following configuration

...
    - name: GIT_SYNC_USERNAME
      valueFrom:
        secretKeyRef:
          name: git-creds
          key: username
    - name: GIT_SYNC_PASSWORD
      valueFrom:
        secretKeyRef:
          name: git-creds
          key: password
...

So it seems the airflow-operator is using the wrong prefix for the authentication ("GIT_" instead of "GIT_SYNC_").

Unit tests return errors

When I run make test, I get the following error:
E0206 17:47:07.034186 87463 genericreconciler.go:50] Failed: [*v1alpha1.AirflowBase/default/foo] observing resources. no matches for kind "Application" in version "app.k8s.io/v1beta1"

Does anyone have any idea on this error?

Quick Start Guide does not work

The vendors/k8s.io/ directory structure has changed since the guide was written, and the Makefile needs to be updated to reflect the change.

Support for secrets in the spec.config.airflow field

Environment variables are passed to our airflow containers using the "airflow" field:

config:
    airflow:
      AIRFLOW__CORE__FERNET_KEY: "MY_SECRET_KEY"

In this case we are defining the fernet key, which is sensitive information. We should be able to use secrets in this field.

We have this block in airflow.go:

for _, k := range keys {
    env = append(env, corev1.EnvVar{Name: k, Value: sp.Config.AirflowEnv[k]})
}

Maybe we can have a separate field ("airflowSecrets") to inject secret values:

for _, k := range keys {
    env = append(env, corev1.EnvVar{Name: k, ValueFrom: envFromSecret(sp.Config.AirflowSecrets[k].name, sp.Config.AirflowSecrets[k].field)})
}
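
A hedged sketch of such a helper (envFromSecret does not exist in the codebase today; the signature simply mirrors the proposal above):

import corev1 "k8s.io/api/core/v1"

// envFromSecret builds an EnvVarSource that reads the value from a key in a Secret.
func envFromSecret(name, key string) *corev1.EnvVarSource {
	return &corev1.EnvVarSource{
		SecretKeyRef: &corev1.SecretKeySelector{
			LocalObjectReference: corev1.LocalObjectReference{Name: name},
			Key:                  key,
		},
	}
}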

Infinite restart loop when Airflow authentication is activated

I tried setting Airflow authentication using the config field in the cluster:

apiVersion: airflow.k8s.io/v1alpha1
kind: AirflowCluster
metadata:
  name: my-airflow-cluster
spec:
  executor: Local
  ui:
    replicas: 1
    version: "1.10.1"
  scheduler:
    version: "1.10.1"
  config:
    airflow:
      AIRFLOW__SCHEDULER__DAG_DIR_LIST_INTERVAL: "100"
      AIRFLOW__WEBSERVER__AUTHENTICATE: "True"
      AIRFLOW__WEBSERVER__AUTH_BACKEND: "airflow.contrib.auth.backends.password_auth"

The env variables work properly as the login screen is displayed when we enter the Airflow UI. However, both the ui and scheduler are being killed and restarted every 2 minutes. It seems that the controller thinks they are in an inconsistent state and forces the restart.

When I inspect the controller logs I don't see any apparent difference from those available when the authentication is turned off.

./hack/sample configurations not working

Created an image based off of the tip of master and deployed.

Attempted to use the Postgres/Celery and MySQL/Celery configurations from the ./hack/sample directories, and the scheduler never successfully started.

The scheduler log showed that it was getting access denied for the account.
After connecting to the DB and setting the password to the value stored in the webui secret, the scheduler started normally.

Support multiple worker 'types'

Add support for multiple worker types. For example, a high memory worker and a 'default' worker.

A couple of ways this could be implemented:

  • An option for an AirflowCluster to be worker-only (i.e. no flower, ui, or scheduler) and attach to the same base. Manually set your Redis?
  • An AirflowWorker CRD so you can define them multiple times
  • A slice of workers ([]Worker) within AirflowCluster

Airflow Operator + Postgres Cloud SQL - Airflow UI pod won't restart after it is killed

After we start the Airflow Operator with the base and cluster configurations, using the Postgres Cloud SQL Configuration, an error is thrown in the "postgres-dbcreate" init container whenever the Airflow UI pod is restarted:
Database already exists

This container needs to check if the database already exists before creating the database and tables.
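
A minimal sketch of an idempotent check the init container could run instead (the database name and connection details below are placeholders):

# create the database only if it does not already exist
psql -h "$DB_HOST" -U postgres -tAc \
  "SELECT 1 FROM pg_database WHERE datname = 'airflow'" | grep -q 1 || \
  psql -h "$DB_HOST" -U postgres -c "CREATE DATABASE airflow"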

Incorrect `Kind` in ownerRef

I am trying to run Airflow Operator on OpenShift 4 and I am hitting an interesting issue:

ERROR: logging before flag.Parse: E0930 15:45:08.736404       1 genericreconciler.go:52] Failed: [*v1alpha1.AirflowBase/airflowop-system/pc-base(cmpnt:*airflowbase.Postgres)] Create. statefulsets.apps "pc-base-postgres" is forbidden: cannot set blockOwnerDeletion in this case because cannot find RESTMapping for APIVersion airflow.k8s.io/v1alpha1 Kind *v1alpha1.AirflowBase: no matches for kind "*v1alpha1.AirflowBase" in version "airflow.k8s.io/v1alpha1"

I traced the source of this to https://github.com/GoogleCloudPlatform/airflow-operator/blob/master/vendor/sigs.k8s.io/controller-reconciler/pkg/genericreconciler/genericreconciler.go#L219 - i.e. the Kind gets set to the actual type (*v1alpha1.AirflowBase) instead of just AirflowBase.

I also noticed this function https://github.com/GoogleCloudPlatform/airflow-operator/blob/master/pkg/apis/airflow/v1alpha1/airflowbase_types.go#L513 which looks like that is what we actually want to be used, but it never enters that function during execution.

It also seems the controller-reconciler project is gone.

I was able to work around the issue by turning off blockOwnerDeletion, but I don't like that at all :) It unblocked me to experiment more with Airflow, but I would like to find a real solution.

Any thoughts on how to solve this?

Add Env variable to Container

How to overwrite airflow config with environment variables?
I tried the following in the AirflowCluster YAML, but the operator did not pick it up:

spec:
  config:
    airflow:
      AIRFLOW__WEBSERVER__NAVBAR_COLOR: #f5f5f5
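
Separately from operator support, note that an unquoted # starts a comment in YAML, so the value above is parsed as empty; it needs to be quoted:

spec:
  config:
    airflow:
      AIRFLOW__WEBSERVER__NAVBAR_COLOR: "#f5f5f5"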

Add support for SQL Proxy connections in our workers

Currently if we want to connect to a Cloud SQL database in our workers/dags, we need to create a connection using the database's Public or Private IP.
The best way to connect to these databases is through SQL Proxy side containers as described here:
https://cloud.google.com/sql/docs/postgres/connect-kubernetes-engine

We should have a way of specifying predefined sql proxy connections to be attached to the scheduler or worker pods. We could define these connections directly in the cluster configuration.

The configuration spec should be similar to the one currently available in the base YAML configuration:

spec:
  sqlproxy:
    project: kubeflow-193622
    region: us-central1
    instance: testsql-cluster

airflow operator question

Hi @barney-s ,

We (Lyft) are exploring running Airflow on k8s internally. I wonder whether Google has run this repo in production. Any hiccups encountered so far?

Thanks,
-Tao

Create application CRD

The Airflow operator requires an application CRD so that details of cluster resources can be displayed via GCP console. I will work on this.

Quickstart Guide references files which do not exist

docs/quickstart.md references files under the manifests dir, which does not exist.

Either:

  1. Tell me which files referenced in the quickstart guide no longer exist under those names, and I am happy to update the docs to point to the correct files.
  2. Update docs/quickstart.md to point to the correct locations.

Local Executor: display worker logs in Stackdriver

When we run our workers using the Local executor, the worker logs are placed in a local folder (/usr/local/airflow/logs/). It would be nice if we could also see these logs in Stackdriver since they disappear when the scheduler pod is restarted.

Documentation on installation

Trying to install on a non-GCE cluster, I get an error right off the bat:

$> make install
make: gcloud: Command not found
make: gcloud: Command not found
go run vendor/sigs.k8s.io/controller-tools/cmd/controller-gen/main.go all
vendor/sigs.k8s.io/controller-tools/cmd/controller-gen/main.go:25:2: cannot find package "sigs.k8s.io/controller-tools/pkg/crd/generator" in any of:
        /usr/src/sigs.k8s.io/controller-tools/pkg/crd/generator (from $GOROOT)
        /home/user/go/src/sigs.k8s.io/controller-tools/pkg/crd/generator (from $GOPATH)
vendor/sigs.k8s.io/controller-tools/cmd/controller-gen/main.go:26:2: cannot find package "sigs.k8s.io/controller-tools/pkg/generate/rbac" in any of:
        /usr/src/sigs.k8s.io/controller-tools/pkg/generate/rbac (from $GOROOT)
        /home/user/go/src/sigs.k8s.io/controller-tools/pkg/generate/rbac (from $GOPATH)
make: *** [Makefile:47: manifests] Error 1

If I try with NOTGCP, I get a different error:

make install NOTGCP=true
go run vendor/sigs.k8s.io/controller-tools/cmd/controller-gen/main.go all
vendor/sigs.k8s.io/controller-tools/cmd/controller-gen/main.go:25:2: cannot find package "sigs.k8s.io/controller-tools/pkg/crd/generator" in any of:
        /usr/src/sigs.k8s.io/controller-tools/pkg/crd/generator (from $GOROOT)
        /home/user/go/src/sigs.k8s.io/controller-tools/pkg/crd/generator (from $GOPATH)
vendor/sigs.k8s.io/controller-tools/cmd/controller-gen/main.go:26:2: cannot find package "sigs.k8s.io/controller-tools/pkg/generate/rbac" in any of:
        /usr/src/sigs.k8s.io/controller-tools/pkg/generate/rbac (from $GOROOT)
        /home/user/go/src/sigs.k8s.io/controller-tools/pkg/generate/rbac (from $GOPATH)
make: *** [Makefile:47: manifests] Error 1

I am probably doing something wrong. I tried installing those packages, but only see this in the corresponding directory:

ls src/sigs.k8s.io/controller-tools/pkg         
total 40K
drwxrwxr-x. 10 user user 4.0K Aug 13 20:18 .
drwxrwxr-x.  6 user user 4.0K Aug 13 20:18 ..
drwxrwxr-x.  4 user user 4.0K Aug 13 20:18 crd
drwxrwxr-x.  3 user user 4.0K Aug 13 20:18 deepcopy
drwxrwxr-x.  3 user user 4.0K Aug 13 20:18 genall
drwxrwxr-x.  3 user user 4.0K Aug 13 20:18 loader
drwxrwxr-x.  2 user user 4.0K Aug 13 20:18 markers
drwxrwxr-x.  3 user user 4.0K Aug 13 20:18 rbac
drwxrwxr-x.  2 user user 4.0K Aug 13 20:18 typescaffold
drwxrwxr-x.  2 user user 4.0K Aug 13 20:18 webhook

Probably user error, but maybe this would benefit from better documentation.

Env variables for custom airflow docker images

Just giving this a spin with my own images, and one of the reasons it doesn't currently work for me is that certain env variables (AIRFLOW__CORE__SQL_ALCHEMY_CONN and AIRFLOW__CORE__FERNET_KEY in particular) are composed in the entrypoint.sh rather than taken directly from env vars. It appears the airflow docker image is based on https://github.com/puckel/docker-airflow, is that correct? Are there reasons why the airflow Dockerfile wouldn't be part of this repo? (I'm using a different docker image which doesn't have the same entrypoint as that one.)

In any event, my suggestion would be to try to keep the Go build process decoupled from any particular docker image/entrypoint. I see the args are configurable in utils.go. For sensitive config parameters (e.g. sql_alchemy_conn and fernet_key), these would probably have to come from Kubernetes secrets and be injected as env vars in the UI, scheduler, and k8s worker.

Clarify docs/api.md regarding field CredSecretRef

Attempting to set up git-sync to use a username and OAuth token, but cannot seem to figure out how to use the CredSecretRef field - can we clarify the docs, perhaps providing an example?

I have tried to set the field using the UID / selfLink of the secret containing the token / password, however I am getting the following error:

spec.dags.git.cred in body must be of type object: "string"

Any clarification would be appreciated!
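
For reference, the AirflowCluster sample quoted earlier on this page sets cred as an object that names a Kubernetes Secret (the Secret is expected to hold the password/token), not a string UID or selfLink; something like:

dags:
  subdir: "pipelines/"
  git:
    repo: "https://github.com/MyOrg/my-repo.git"
    branch: master
    user: MY_USER
    cred:
      name: MY_GIT_SECRET   # name of the Secret holding the token/password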

Add Ingress for UI, Flower

The UI and Flower don't currently have an Ingress option, and you have to port-forward to reach them.

Please add Ingress(es) for the UI and, if using the CeleryExecutor, for Flower.
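
A minimal sketch of the kind of Ingress being requested; the host, Service name and port are hypothetical placeholders, since the operator does not document what it creates for the UI:

apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: airflow-ui
spec:
  rules:
  - host: airflow.example.com            # placeholder host
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: my-airflow-cluster-ui   # hypothetical Service name
            port:
              number: 8080                # hypothetical port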

CloudSQL over private IP address

Currently CloudSQL appears to only allow connections via SQLProxy.

It would be great for this to be extended to allow connections over a private IP.

I understand this is more complex due to selecting networks, etc., if you're creating the SQL instance via this operator, but personally I'd like to use this with an existing Cloud SQL server.
