
provider-gcp's Issues

CloudSQL thrashing in constant update loop

What happened?

I believe the combination of our v1beta1 logic and cross resource referencing has created issues with the CloudSQL controller. It uses the following logic to determine whether a resource is up to date:

// IsUpToDate checks whether current state is up-to-date compared to the given
// set of parameters.
func IsUpToDate(in *v1beta1.CloudSQLInstanceParameters, currentState sqladmin.DatabaseInstance) bool {
	currentParams := &v1beta1.CloudSQLInstanceParameters{}
	LateInitializeSpec(currentParams, currentState)
	return reflect.DeepEqual(in, currentParams)
}

The problem is that IsUpToDate (which slightly predates the CRR work) assumes it can create an exact copy (currentParams) of in by late initializing an empty CloudSQLInstanceParameters from the currently observed values, but this is not quite equal to in because in has cross resource references set that can't be inferred from observing the external resource.

I've confirmed that the following fixes this issue:

- 	return reflect.DeepEqual(in, currentParams)
+	return cmp.Equal(in, currentParams, cmpopts.IgnoreInterfaces(struct{ resource.AttributeReferencer }{}))

How can we reproduce it?

  1. Create a CloudSQLInstance
  2. Run stack-gcp with the debug flag turned on.
  3. Watch it reconcile every second or so.

What environment did it happen in?

Crossplane version:

Support GKE Horizontal Pod Autoscaling

What problem are you facing?

As an Upbound infrastructure engineer I'd like to dogfood Crossplane by using it to manage my GKE clusters. These GKE clusters may require horizontal pod autoscaler support, per https://cloud.google.com/kubernetes-engine/docs/how-to/scaling-apps.

How could Crossplane help solve your problem?

stack-gcp could support GKE HPA, e.g. expose https://cloud.google.com/kubernetes-engine/docs/reference/rest/v1/projects.locations.clusters#horizontalpodautoscaling

Support network policy enabled GKE clusters

https://cloud.google.com/kubernetes-engine/docs/how-to/network-policy

What problem are you facing?

As an Upbound infrastructure engineer I'd like to dogfood Crossplane by using it to manage my GKE clusters. These GKE clusters require network policy per https://cloud.google.com/kubernetes-engine/docs/how-to/network-policy.

How could Crossplane help solve your problem?

Crossplane could expose the option to enable network policy support when creating a GKE cluster, e.g. https://cloud.google.com/kubernetes-engine/docs/reference/rest/v1/projects.locations.clusters#Cluster.NetworkPolicy

Support GKE cluster autoscaling

What problem are you facing?

As an Upbound infrastructure engineer I'd like to dogfood Crossplane by using it to manage my GKE clusters. These GKE clusters require cluster autoscaling support per https://cloud.google.com/kubernetes-engine/docs/concepts/cluster-autoscaler.

How could Crossplane help solve your problem?

stack-gcp could support configuring cluster autoscaling, per https://cloud.google.com/kubernetes-engine/docs/reference/rest/v1/projects.locations.clusters#Cluster.ClusterAutoscaling.

Kubeconfig in connection secret is not working in the latest (0.4) version

What happened?

The kubeconfig in the connection secret seems to be missing the username and password (both are empty). When I try to use the kubeconfig locally via the CLI to access the cluster, I am asked for a username and password.
When I looked at this cluster's config in the GKE console, Basic Authentication was disabled.

How can we reproduce it?

Use Crossplane 0.6 and stack-gcp 0.4. Try provisioning a cluster using the following cluster class, claim, and node pool:

# class
apiVersion: container.gcp.crossplane.io/v1beta1
kind: GKEClusterClass
metadata:
  labels:
    className: "app-kubernetes-class"
  name: app-kubernetes-class
  namespace: crossplane-system
specTemplate:
  forProvider:
    location: us-central1
  providerRef:
    name: gcp-provider
  reclaimPolicy: Delete
  writeConnectionSecretsToNamespace: crossplane-system
---
# claim
apiVersion: compute.crossplane.io/v1alpha1
kind: KubernetesCluster
metadata:
  name: app-kubernetes
  namespace: crossplane-system
  annotations:
    crossplane.io/external-name: foobarbaz
spec:
  classSelector:
    matchLabels:
      className: "app-kubernetes-class"
  writeConnectionSecretToRef:
    name: app-kubernetes
---
# nodepool
apiVersion: container.gcp.crossplane.io/v1alpha1
kind: NodePool
metadata:
  name: gke-nodepool
  namespace: crossplane-system
spec:
  providerRef:
    name: gcp-provider
  writeConnectionSecretToRef:
    name: gke-nodepool
    namespace: crossplane-system

  forProvider:
    cluster: "projects/myproject-12345/locations/us-central1/clusters/foobarbaz"
    initialNodeCount: 2

Check the cluster connection secret using

$ kubectl get secret app-kubernetes -o yaml

The username and password fields are empty.

What environment did it happen in?

  • I am running crossplane on minikube
  • Crossplane version: 0.6
  • Cloud provider: GCP
  • Kubernetes version (use kubectl version)
Client Version: version.Info{Major:"1", Minor:"16", GitVersion:"v1.16.2", GitCommit:"c97fe5036ef3df2967d086711e6c0c405941e14b", GitTreeState:"clean", BuildDate:"2019-10-15T19:18:23Z", GoVersion:"go1.12.10", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"15", GitVersion:"v1.15.0", GitCommit:"e8462b5b5dc2584fdcd18e6bcfe9f1e4d970a529", GitTreeState:"clean", BuildDate:"2019-06-19T16:32:14Z", GoVersion:"go1.12.5", Compiler:"gc", Platform:"linux/amd64"}
  • Kubernetes distribution: GKE

More details

I tried provisioning a cluster manually first, providing a username in masterAuth by referring to this code:
https://github.com/crossplaneio/stack-gcp/blob/a6131969f4d1b2d6cbb0abd84cf4d452a1400367/pkg/clients/gke/gke.go#L98

through GKE's web API (https://cloud.google.com/kubernetes-engine/docs/reference/rest/v1/projects.locations.clusters/create). If I provide a username in the API here, sure enough Basic Authentication is enabled and I get a username and password.
This makes me wonder if there's some problem on our side. I am looking into the codebase to check if there's something wrong.

Update 1

  • The masterAuth code above is for v1alpha3; v1beta1 does not seem to enable Basic Authentication by default:

https://github.com/crossplaneio/stack-gcp/blob/a6131969f4d1b2d6cbb0abd84cf4d452a1400367/pkg/clients/cluster/cluster.go#L255

Adopt new external name feature

What problem are you facing?

The new version of crossplane-runtime introduced an external name annotation that is used as the source of truth for the resource's name identifier on the provider's systems. By default, the managed reconciler takes care of the propagation logic from claim to managed resource and vice versa.

However, resources that don't get to choose their name, like a VPC in AWS, or that can't work with the new <namespace>-<name>-<5char random> naming scheme, should opt out of the ManagedNameAsExternalName initializer and set the external name in their own way: either by supplying their own Initializer or by calling meta.SetExternalName(Managed, string) in their ExternalClient calls.

See:
https://github.com/crossplaneio/crossplane/blob/master/design/one-pager-managed-resource-api-design.md#external-resource-name
crossplane/crossplane-runtime#45
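For illustration, a stdlib stand-in for what meta.SetExternalName does (the annotation key is the real one used by crossplane-runtime; the helper itself is a simplified sketch, and the VPC ID in the usage is made up):

```go
package main

import "fmt"

// externalNameKey is the annotation key crossplane-runtime uses to record the
// external name of a managed resource.
const externalNameKey = "crossplane.io/external-name"

// setExternalName is a simplified stand-in for meta.SetExternalName: resources
// that opt out of the ManagedNameAsExternalName initializer record the
// provider-assigned identifier themselves, e.g. from the Create response.
func setExternalName(annotations map[string]string, name string) {
	annotations[externalNameKey] = name
}

func main() {
	ann := map[string]string{}
	setExternalName(ann, "vpc-0a1b2c3d") // e.g. an AWS-assigned VPC ID
	fmt.Println(ann[externalNameKey])
}
```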

How could Crossplane help solve your problem?

This issue tracks the adoption of this feature in this stack. Please close this once all resources that use the managed reconciler have adopted the feature.

CloudSQL instances in PENDING_CREATE should be considered "Creating"

What happened?

I noticed when creating a Postgres CloudSQL instance that it seems to start in state PENDING_CREATE, which we evidently map to the Unavailable() condition. It would probably be more accurate to map it to the Creating() condition.

Status:
  Conditions:
    Last Transition Time:  2019-09-13T05:22:52Z
    Reason:                Managed resource is not available for use
    Status:                False
    Type:                  Ready
    Last Transition Time:  2019-09-13T05:22:52Z
    Reason:                Successfully reconciled managed resource
    Status:                True
    Type:                  Synced
  Public Ip:               35.235.101.158
  State:                   PENDING_CREATE
Events:                    <none>
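The proposed mapping can be sketched as follows; the returned strings stand in for the crossplane-runtime Creating()/Available()/Unavailable() condition helpers, and the state names are the ones the CloudSQL Admin API reports:

```go
package main

import "fmt"

// conditionFor sketches the proposed state-to-condition mapping. Only the
// states relevant to this issue are handled; everything else stays Unavailable.
func conditionFor(state string) string {
	switch state {
	case "PENDING_CREATE":
		return "Creating" // currently mapped to Unavailable, per this issue
	case "RUNNABLE":
		return "Available"
	default:
		return "Unavailable"
	}
}

func main() {
	fmt.Println(conditionFor("PENDING_CREATE")) // Creating
}
```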

How can we reproduce it?

Use the below resource class to create a CloudSQL instance:

apiVersion: database.gcp.crossplane.io/v1alpha2
kind: CloudsqlInstanceClass
metadata:
  name: cloudsqlinstancepostgresql
  namespace: crossplane-system
specTemplate:
  databaseVersion: POSTGRES_9_6
  tier: db-custom-1-3840
  region: us-west2
  storageType: PD_SSD
  storageGB: 10
  ipv4Enabled: true
  providerRef:
    name: example
    namespace: crossplane-system
  reclaimPolicy: Delete

What environment did it happen in?

Crossplane version:

Allow adding labels to CloudSQL instances

What problem are you facing?

We run in heavily multi-tenant environments (as in, multiple departments from the same company) and it would be really convenient if all resources created can be labelled automatically. This is already possible for the CloudMemorystoreInstanceClass, so I hope it won't be too much of a problem to add to CloudSQL as well.

How could Crossplane help solve your problem?

It would be great if Crossplane could have the option to set labels on all resources.

Implement Simple Resource Class Selection for GCP

What problem are you facing?

Dynamic provisioning is complicated. See crossplane/crossplane#926 for full context.

How could Crossplane help solve your problem?

Implement the patterns described in crossplane/crossplane#926 for the GCP Stack. Specifically:

  • Make providers, classes, and managed resources cluster scoped.
  • Match classes to claims using label selectors
  • Fall back to using a resource class annotated as the default.

This will depend on crossplane/crossplane#927 and crossplane/crossplane-runtime#48.

Implement integration tests

Part of crossplane/crossplane#1033

What problem are you facing?

Currently, all integration testing is being run in a manual ad-hoc manner. It is desirable to automate this process and run the tests on a more frequent basis.

How could Crossplane help solve your problem?

Initial implementation should use the framework developed to create tests for a single managed resource, as well as a Jenkins stage / separate pipeline to execute the test.

Re-using a GCP CloudSQLInstance

There are potentially 2 separate use cases here:

  • I would like to be able to bind to a SQL instance that already exists.

  • I would like to be able to use Crossplane to create a SQL instance, and then have it re-used by multiple claims (e.g. I deploy the same app twice).

The objective is just to save time waiting for the resource to be created. I don't want to wait 5 minutes every time I start an application, especially if I am giving a demo, or iterating fast. I am aware that the database might contain data from a previous deployment. Applications are used to dealing with that sort of thing - the database is persistent and has a long lifecycle, and the applications that consume the data are short-lived and liable to change more frequently.

What actually happens currently is

  • If I point a CloudSQLInstance to an existing resource, Crossplane happily creates a secret that contains a lot of useful information, but does not contain the database password. You can then create a claim for the instance (using MySQLInstance) and another secret gets created and the claim looks successful. So an application that uses the secret and tries to connect thinks it can do it, but fails when it actually needs to create a connection.

  • Then there is a (possibly separate) problem that the default reclaimPolicy causes a CloudSQLInstance to be marked as Released when a claim is deleted. The result is that you can subsequently claim the resource but the MySQLInstance gets stuck in a state where it is "Managed claim is waiting for managed resource to become bindable". Again, it looks successful, but never actually becomes usable by an application. In this case the secret is never created.

If it isn't possible to re-use instances, or is deemed bad practice (e.g. because the database might leak data into the next claim), maybe it would be better to make it obvious that it has failed? Refuse to accept the claim, for instance.

CloudSQL: read-replica mode with SSL-enabled support

What problem are you facing?

CloudSQL can be run in read-replica mode with an external master. Though in order to configure it to talk with master through a SSL connection, one needs to provide the SSL client information generated from the master's certificate. Currently, there is no way to give an SSL certificate as an input to CloudSQL resource. Details here.

How could Crossplane help solve your problem?

The actual issue here is that there is no way to provide an input in the form of a secret to resource provisioning. Once we have that, it should be fairly easy to create a secret, reference it in the resource, and use it in the Insert call.

GCP: Refactor all managed resource controllers into generic managed reconciler

Refactor existing GCP managed resources to adopt the latest managed resource controller pattern implemented in crossplane/crossplane#603

Part of crossplane/crossplane#615

What seems to be the problem?

We need to eliminate inconsistency around controller patterns in the codebase so that the project is easily extensible and easier to troubleshoot during operations.

What does it look like when we're done?

  • All managed GCP resources use the generic managed reconciler, with their version being v1alpha2.
  • All possible value validations are in place in the types, using the CRD validation mechanism.

Resources

  • CloudMemory Instance #4
  • CloudSQL Instance #33
  • GKE Cluster #34
  • Bucket #32

Support Certificate-based cluster authentication

What problem are you facing?

Currently, authentication to newly created GKE clusters is done through static username-password credentials. This is not a recommended approach: it doesn't allow leveraging RBAC for authorization, and the credentials cannot be rotated.

How could Crossplane help solve your problem?

Support certificate-based client authentication. Although Google recommends OIDC token-based authentication, that would limit users to Google auth tokens, whereas a user might want more flexible certificate-based authentication to support multiple cloud providers and an independent authentication provider.

Need to bump stack-gcp version in app.yaml to match released versions

What happened?

Crossplane getting started docs shows stack-gcp version as v0.5.0 for the helm chart:

[screenshot: getting started docs showing stack-gcp v0.5.0 in the helm chart]

But the stack version in app.yaml (https://github.com/crossplaneio/stack-gcp/blob/15ff8c6d8fc7a82956d1e57bf71981a1da3f281e/config/stack/manifests/app.yaml#L24) is 0.0.1.

What can we do to help?

Keep the versions in sync so we know we're looking at the same version of the stack, i.e. update gcp/blob/15ff8c6d8fc7a82956d1e57bf71981a1da3f281e/config/stack/manifests/app.yaml#L24 to match the released version of the stack so it's consistent in all the places.

TODO

  • create associated tickets to bump app.yaml versions of stack-aws, stack-azure, stack-minimal-gcp, stack-minimal-aws, stack-minimal-azure, sample-stack-wordpress

v1beta1 GKECluster and GKENodePool

Definition of done: GKE support is considered v1beta1.

v1beta1 managed resources comply with our Managed Resource API Patterns design. The CloudSQL and CloudMemorystore controllers are good reference implementations of v1beta1 resources.

The GKE API currently allows you to create nodes by:

  • Specifying a node config and count for the cluster via the Cluster API object.
  • Specifying node pools inline in the Cluster API object.
  • Specifying node pools as distinct API objects.

As part of moving GKE to v1beta1 we'd like to support modelling node pools as a distinct managed resource, i.e. GKENodePool. We may also want to support modelling them inline in GKECluster.

Support enabling Private Service Connections

What problem are you facing?

Right now, if you'd like to give access to your CloudSQL instance from your private VPC network, you have to create a Private Service Connection that will peer the network of CloudSQL instance and your VPC. See https://cloud.google.com/sql/docs/mysql/configure-private-ip

However, we don't have that as a managed resource.

How could Crossplane help solve your problem?

We can either implement the creation of that connection embedded in each service's managed reconciler logic, or give it a controller of its own. Short term, it's easier to just call the Service Networking API to create it if spec.privateNetwork is specified in the CloudSQL managed resource. But in the long term, we need to consider whether we want to implement a CRD+controller for that type.

CloudSQLInstance keeps modifying the instance

What happened?

After creating a CloudSQLInstance, the system keeps trying to make modifications to it. This can be seen in the console itself (the instances are perpetually in a state of being modified) and in status.atProvider.settingsVersion, which keeps incrementing every few seconds. Sadly, nothing logged in stack-gcp or the other Crossplane Pods gives any indication of what it's trying to do.

How can we reproduce it?

This is the resource we created:

apiVersion: database.gcp.crossplane.io/v1beta1
kind: CloudSQLInstance
metadata:
  annotations:
    crossplane.io/external-name: infra-xplane-test-jkgk7
  creationTimestamp: "2020-02-07T14:03:51Z"
  finalizers:
  - finalizer.managedresource.crossplane.io
  generateName: infra-xplane-test-
  generation: 2
  name: infra-xplane-test-jkgk7
  resourceVersion: "35099427"
  selfLink: /apis/database.gcp.crossplane.io/v1beta1/cloudsqlinstances/infra-xplane-test-jkgk7
  uid: 0201d2c2-3418-4d55-8f7f-18efd83d8e4a
spec:
  claimRef:
    apiVersion: database.crossplane.io/v1alpha1
    kind: PostgreSQLInstance
    name: xplane-test
    namespace: infra
    uid: 8f12b8aa-1691-486b-9694-4a9c854e1a6b
  classRef:
    apiVersion: database.gcp.crossplane.io/v1beta1
    kind: CloudSQLInstanceClass
    name: postgres-9.6
    uid: fa73b0ff-9f9b-4786-84b2-9f89153ceb4a
  forProvider:
    databaseVersion: POSTGRES_9_6
    gceZone: europe-west1-d
    instanceType: CLOUD_SQL_INSTANCE
    region: europe-west1
    settings:
      activationPolicy: ALWAYS
      backupConfiguration:
        binaryLogEnabled: false
        enabled: false
        startTime: "09:00"
      dataDiskSizeGb: 10
      dataDiskType: PD_SSD
      ipConfiguration:
        privateNetwork: projects/redacted/global/networks/default
      locationPreference:
        zone: europe-west1-d
      pricingPlan: PER_USE
      replicationType: SYNCHRONOUS
      storageAutoResize: true
      storageAutoResizeLimit: 500
      tier: db-g1-small
  providerRef:
    name: redacted-crossplane-provider
  reclaimPolicy: Delete
  writeConnectionSecretToRef:
    name: 8f12b8aa-1691-486b-9694-4a9c854e1a6b
    namespace: infra
status:
  atProvider:
    backendType: SECOND_GEN
    connectionName: redacted:europe-west1:infra-xplane-test-jkgk7
    gceZone: europe-west1-d
    ipAddresses:
    - ipAddress: 10.6.1.11
      type: PRIVATE
    project: redacted
    selfLink: https://www.googleapis.com/sql/v1beta4/projects/redacted/instances/infra-xplane-test-jkgk7
    serviceAccountEmailAddress: [email protected]
    settingsVersion: 58
    state: RUNNABLE
  bindingPhase: Bound
  conditions:
  - lastTransitionTime: "2020-02-07T14:03:53Z"
    reason: Successfully resolved resource references to other resources
    status: "True"
    type: ReferencesResolved
  - lastTransitionTime: "2020-02-07T14:08:35Z"
    reason: Resource is available for use
    status: "True"
    type: Ready
  - lastTransitionTime: "2020-02-07T14:03:55Z"
    reason: Successfully reconciled resource
    status: "True"
    type: Synced

No settings were changed after the creation. Here's a screenshot from the console:
Console screenshot

What environment did it happen in?

Crossplane version: 0.7.0
Stack GCP version: 0.5.0
Kubernetes version: 1.15.9

Let me know if you need any additional information!

Examples for resources without claims

What problem are you facing?

We have some resources, such as Network and Subnetwork, that do not have corresponding claims. Only end-to-end tutorials expose them to users as ready-to-use YAML files.

How could Crossplane help solve your problem?

Add examples for those resources.

Expose the connectionName for a CloudSQLInstance

When you bind to an SQL instance using CloudSQLInstance you get a secret with useful things like "username", "publicIP", and "password". Generally speaking you get hold of that secret either directly, if it is in your namespace, or via a "claim" (implemented as a MySQLInstance), which causes the secret to be copied into its own namespace.

What the secret does not contain is the magic GCP connection string that allows (for instance) a cloud_sql_proxy to connect from a sidecar. The application needs that string, and it has no way of knowing what it is without being able to consult the kubernetes or gcloud APIs. It would be better if it was just there in the secret. In fact it is already there in the status of the CloudSQLInstance so it could be copied into the secret:

$ kubectl get cloudsqlinstance -o yaml
apiVersion: v1
items:
- apiVersion: database.gcp.crossplane.io/v1beta1
  kind: CloudSQLInstance
  metadata:
    annotations:
      crossplane.io/external-name: default-mysql-claim-j2b85
...
  status:
    atProvider:
      backendType: SECOND_GEN
      connectionName: cf-sandbox-dsyer:us-central1:default-mysql-claim-j2b85
...

(it's that thing called "connectionName").

(Copied from a comment on another issue.)
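A sketch of what publishing the field could look like, assuming the connection details are built as a simple map from key to bytes (the connectionName key name and the helper are illustrative, not the stack's actual constants):

```go
package main

import "fmt"

// connectionDetails sketches extending the published connection secret with
// the instance's connectionName from status.atProvider, alongside the
// credentials that are already exposed today.
func connectionDetails(username, endpoint, connectionName string) map[string][]byte {
	return map[string][]byte{
		"username":       []byte(username),
		"endpoint":       []byte(endpoint),
		"connectionName": []byte(connectionName), // proposed addition
	}
}

func main() {
	d := connectionDetails("root", "35.1.2.3",
		"cf-sandbox-dsyer:us-central1:default-mysql-claim-j2b85")
	fmt.Println(string(d["connectionName"]))
}
```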

GCP: Add DNS, SSL, and Ingress support

What problem are you facing?

Integration of DNS, SSL, and ingress in Crossplane. I've combined these into one ticket as they are often related.

If I configure the dns, I can provision certs with a major provider, and by extension I can associate those certs with a load balancer.

Although this can be supported somewhat in a self-service manner by applying ExternalDNS and cert-manager to a target Kubernetes cluster alongside the workload, that moves it out of Crossplane's control and has downsides. cert-manager can be less than ideal in some cases, like a zero-downtime migration to a different cluster: you don't get certs on the new cluster until DNS resolves to it, which takes however long the DNS migration takes.

As part of this story, full automation makes for a great demo, but we would likely also want to allow users to set a private key and CA and let Crossplane associate that cert with any load balancer in the major providers.

How could Crossplane help solve your problem?

Example flow with GCP cloud DNS + AWS EKS to setup SSL, DNS, Ingress:

  1. Want to deploy app in a target EKS cluster behind https://myhost.com
  2. AWS ACM - Request Cert
  3. Create a DNS entry in Cloud DNS on GCP to verify control of domain
  4. Associate ACM Cert with EKS ALB
  5. Point Cloud DNS at the AWS ALB

Further related reading:
GCP K8s multi-cluster ingress
google managed certs
google pre-shared certs
Import external cert to AWS ACM
Static IPs for ALBs

Please allow for more verbose logging

What problem are you facing?

It would be nice if logging verbosity could be increased, at least to the level that it shows, for each action, what it's trying to do. This would also help in providing more information for bug reports. For instance, for my last bug report #164, it's impossible (or at least not documented) how I should go about making stack-gcp log what it's actually trying to do with the Google API.

How could Crossplane help solve your problem?

Allowing logging of API calls in stack-gcp would be a good start.

Cloudsql unnecessary updates cause error in status

What happened?

During the creation of a CloudSQL instance, the following appears as an error in its status:

- lastTransitionTime: "2019-12-25T15:33:23Z"
    message: 'update failed: cannot update the CloudSQL instance: googleapi: Error
      409: Operation failed because another operation was already in progress., operationInProgress'
    reason: Encountered an error during managed resource reconciliation
    status: "False"
    type: Synced

The reason is that we don't skip the update while creation or another update is in progress, which is something we do for other resources whose APIs complain about this.
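A minimal sketch of that guard, assuming a hypothetical needsUpdate helper that short-circuits while an operation is in flight (state names are from the CloudSQL Admin API; specChanged stands in for the IsUpToDate comparison):

```go
package main

import "fmt"

// needsUpdate sketches the proposed guard: don't issue an update while the
// instance is still being created or another operation is running, so we
// avoid GCP's 409 operationInProgress error.
func needsUpdate(state string, specChanged bool) bool {
	if state == "PENDING_CREATE" || state == "MAINTENANCE" {
		return false // wait for the in-flight operation to finish
	}
	return specChanged
}

func main() {
	fmt.Println(needsUpdate("PENDING_CREATE", true)) // false
	fmt.Println(needsUpdate("RUNNABLE", true))       // true
}
```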

How can we reproduce it?

Create a CloudSQLInstance and watch for its status.

What environment did it happen in?

Crossplane version: 0.6.0

CloudMemorystore does not test UpToDate logic in ExternalObservation

What problem are you facing?

As of crossplane/crossplane-runtime@ab3cac0 the Observe() method reports back whether an external resource needs to be updated or not. Therefore, the IsUpToDate logic is checked in Observe() rather than Update(). When CloudMemorystore was moved to v1beta1, it started using this pattern, but TestObserve does not test the value of ResourceUpToDate in the returned ExternalObservation.

How could Crossplane help solve your problem?

TestObserve should be updated to verify that we report the resource as needing an update when IsUpToDate returns false.
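The missing assertion could look roughly like this; ExternalObservation is reproduced as a two-field stand-in for the crossplane-runtime struct, and observe is a hypothetical stand-in for the CloudMemorystore Observe() path:

```go
package main

import "fmt"

// ExternalObservation mirrors the crossplane-runtime struct Observe() returns;
// only the two fields relevant here are reproduced.
type ExternalObservation struct {
	ResourceExists   bool
	ResourceUpToDate bool
}

// observe stands in for the controller's Observe(): it must propagate the
// IsUpToDate result into ResourceUpToDate.
func observe(upToDate bool) ExternalObservation {
	return ExternalObservation{ResourceExists: true, ResourceUpToDate: upToDate}
}

func main() {
	// The assertion TestObserve is missing: when IsUpToDate returns false,
	// Observe must report that the resource needs an update.
	got := observe(false)
	fmt.Println(got.ResourceUpToDate) // false
}
```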

Subnetwork should not update 'network' field

What happened?

Subnetworks can be created successfully, but fail when attempting to update with the following error:

update of Subnetwork resource has failed: googleapi: Error 400: Invalid
        value for field ''resource'': ''{  "name": "example",  "network": "projects/REDACTED/global/networks/example",  "ipCidr...''.
        The following field(s) specified in the request cannot be modified: [network],
        invalid
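One possible shape of the fix, sketched with a stand-in Subnetwork type: clear immutable fields before building the update request body, so they are never resent to the API.

```go
package main

import "fmt"

// Subnetwork stands in for the GCP compute API struct; only the fields needed
// to illustrate the fix are included.
type Subnetwork struct {
	Name                  string
	Network               string
	PrivateIpGoogleAccess bool
}

// updatePayload sketches the proposed fix: drop immutable fields such as
// Network from the body sent to the update call, so GCP stops rejecting the
// request with Error 400.
func updatePayload(desired Subnetwork) Subnetwork {
	desired.Network = "" // immutable after creation; must not be resent
	return desired
}

func main() {
	p := updatePayload(Subnetwork{
		Name:    "example",
		Network: "projects/REDACTED/global/networks/example",
	})
	fmt.Println(p.Network == "") // true
}
```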

How can we reproduce it?

Create a subnetwork with the following spec:

apiVersion: v1
items:
- apiVersion: compute.gcp.crossplane.io/v1alpha1
  kind: Subnetwork
  metadata:
    name: example
    namespace: example
  spec:
    ipCidrRange: 192.168.0.0/24
    name: example
    network: projects/REDACTED/global/networks/example
    privateIpGoogleAccess: true
    providerRef:
      name: example
      namespace: crossplane-system
    reclaimPolicy: Delete
    region: us-central1
    secondaryIpRanges:
    - ipCidrRange: 10.0.0.0/8
      rangeName: pods
    - ipCidrRange: 172.16.0.0/16
      rangeName: services
    writeConnectionSecretToRef: {}
  status:
    PrivateIPGoogleAccess: true
    conditions:
    - lastTransitionTime: "2019-09-06T11:18:17Z"
      reason: Managed resource is available for use
      status: "True"
      type: Ready
    - lastTransitionTime: "2019-09-06T11:18:18Z"
      message: 'update of Subnetwork resource has failed: googleapi: Error 400: Invalid
        value for field ''resource'': ''{  "name": "example",  "network": "projects/REDACTED/global/networks/example",  "ipCidr...''.
        The following field(s) specified in the request cannot be modified: [network],
        invalid'
      reason: Encountered an error during managed resource reconciliation
      status: "False"
      type: Synced
    creationTimestamp: "2019-09-06T04:18:16.766-07:00"
    fingerprint: ENdy1UwLSIc=
    gatewayAddress: 192.168.0.1
    id: 482972742549765783
    ipCidrRange: 192.168.0.0/24
    kind: compute#subnetwork
    name: example
    network: https://www.googleapis.com/compute/v1/projects/REDACTED/global/networks/example
    region: https://www.googleapis.com/compute/v1/projects/REDACTED/regions/us-central1
    secondaryIpRanges:
    - ipCidrRange: 10.0.0.0/8
      rangeName: pods
    - ipCidrRange: 172.16.0.0/16
      rangeName: services
    selfLink: https://www.googleapis.com/compute/v1/projects/REDACTED/regions/us-central1/subnetworks/example

What environment did it happen in?

Crossplane version:

Expose the projectId for a CloudSQLInstance

I guess it might be useful for nearly all GCP objects, but CloudSQLInstance is the one I am interested in for now. Currently I occasionally need the project ID in application configuration, and the only way to get it is to hard-code it somewhere. It is always available to Crossplane when it creates the secret, so it could just be included as its own key.

See #159 for similar change adding SQL specific key.

Support gVisor enabled GKE clusters

What problem are you facing?

As an Upbound infrastructure engineer I'd like to dogfood Crossplane by using it to manage my GKE clusters. These GKE clusters require sandbox, aka gVisor, support per https://cloud.google.com/kubernetes-engine/sandbox/.

How could Crossplane help solve your problem?

Crossplane could expose the option to enable gVisor support when creating a GKE cluster, e.g. https://cloud.google.com/kubernetes-engine/docs/reference/rest/v1beta1/NodeConfig#SandboxConfig. Note that this cannot be done for the default node pool, so this will require us to either model node pools as a distinct managed resource (per #86), or to support managing additional node pools inline as part of the existing GKECluster managed resource.

To enable GKE Sandbox, you configure a node pool. The default node pool (the first node pool in your cluster, created when the cluster is created) cannot use GKE Sandbox. To enable GKE Sandbox during cluster creation, you must add a second node pool when you create the cluster.

GCP BucketPolicyOnly Error on Provisioning

What happened?

When creating a Bucket on GCP, a validation error occurs on the managed resource when the bucketPolicyOnly field is set. This is due to the lockedTime (metav1.Time) sub-field of bucketPolicyOnly having a zero value of null rather than "".

How can we reproduce it?

  1. Create a GCP BucketClass with bucketPolicyOnly defined.
  2. Create a Bucket claim that references the class.

What environment did it happen in?

Crossplane version: Master (commit 43aa434)

Error Body

cannot create managed resource: Bucket.storage.gcp.crossplane.io "bucket-f6e19330-065c-49b1-9e7f-977f6ed1f964" is invalid: []: Invalid value: map[string]interface {}{"apiVersion":"storage.gcp.crossplane.io/v1alpha1", "kind":"Bucket", "metadata":map[string]interface {}{"creationTimestamp":"2019-08-27T20:36:59Z", "generation":1, "name":"bucket-f6e19330-065c-49b1-9e7f-977f6ed1f964", "namespace":"crossplane-system", "ownerReferences":[]interface {}{map[string]interface {}{"apiVersion":"storage.crossplane.io/v1alpha1", "kind":"Bucket", "name":"gitlab-artifacts", "uid":"f6e19330-065c-49b1-9e7f-977f6ed1f964"}}, "uid":"95c4267d-e57d-4285-8605-bf13c6ec3a16"}, "spec":map[string]interface {}{"bucketPolicyOnly":map[string]interface {}{"enabled":true, "lockedTime":interface {}(nil)}, "claimRef":map[string]interface {}{"apiVersion":"storage.crossplane.io/v1alpha1", "kind":"Bucket", "name":"gitlab-artifacts", "namespace":"default", "uid":"f6e19330-065c-49b1-9e7f-977f6ed1f964"}, "classRef":map[string]interface {}{"apiVersion":"storage.gcp.crossplane.io/v1alpha1", "kind":"BucketClass", "name":"standard-gcp-bucket", "namespace":"crossplane-system", "uid":"52d3aa4c-4e19-47cf-9e3a-ab0bac25c89a"}, "labels":map[string]interface {}{"app":"gitlab-demo"}, "lifecycle":map[string]interface {}{}, "location":"US", "nameFormat":"gitlab-demo-artifacts-%s", "providerRef":map[string]interface {}{"name":"example", "namespace":"crossplane-system"}, "reclaimPolicy":"Delete", "serviceAccountSecretRef":map[string]interface {}{"name":"demo-gcs-creds"}, "storageClass":"MULTI_REGIONAL", "writeConnectionSecretToRef":map[string]interface {}{"name":"f6e19330-065c-49b1-9e7f-977f6ed1f964"}}}: validation failure list:
spec.bucketPolicyOnly.lockedTime in body must be of type string: "null"
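
The root cause can be reproduced with a minimal sketch. The types below are illustrative stand-ins for the Crossplane Bucket types, not the actual API definitions (metav1.Time is modeled here with the standard library's time.Time, whose zero value is likewise never skipped by omitempty): a time field held by value is always emitted, while a nil pointer field is genuinely omitted, so validation never sees it.

```go
package main

import (
	"encoding/json"
	"fmt"
	"time"
)

// lockedByValue mirrors the failing shape: omitempty never skips a
// struct value, so an unset LockedTime is still serialized.
type lockedByValue struct {
	Enabled    bool      `json:"enabled"`
	LockedTime time.Time `json:"lockedTime,omitempty"`
}

// lockedByPointer is the usual fix: a nil pointer is genuinely omitted.
type lockedByPointer struct {
	Enabled    bool       `json:"enabled"`
	LockedTime *time.Time `json:"lockedTime,omitempty"`
}

// marshalBoth marshals one instance of each type with LockedTime unset.
func marshalBoth() (string, string) {
	v, _ := json.Marshal(lockedByValue{Enabled: true})
	p, _ := json.Marshal(lockedByPointer{Enabled: true})
	return string(v), string(p)
}

func main() {
	v, p := marshalBoth()
	fmt.Println(v) // lockedTime is emitted even though it was never set
	fmt.Println(p) // lockedTime is absent entirely
}
```

With metav1.Time specifically, the by-value zero serializes as null instead of a timestamp string, which is exactly what the OpenAPI "must be of type string" error reports; switching the field to *metav1.Time with omitempty would avoid emitting it at all.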

Stack binary should be named "stack-gcp"

What happened?

The entrypoint binary for stack-gcp appears to be named crossplane. This breaks at least the make run command, which expects the output binary to be the same as the project name.

How can we reproduce it?

make run

What environment did it happen in?

Crossplane version:

ServiceNetworking Connection reports managed resource being created when it is in error state

What happened?

When creating a GlobalAddress and a Connection that references it, after the Connection's reference to the GlobalAddress was resolved, it became stuck in "Managed resource is being created". Upon further investigation, the controller's Create was completing without returning an error, but Observe was then unable to find the Connection. This is not inherently wrong, as the controller is still attempting to recreate the resource (similar to a scenario where a resource is deleted externally after Crossplane creates it). However, it is not clear that the resource will ever be created, so it is somewhat misleading.

I believe this situation is a result of some strange behavior of GCP private service connections, which cannot be modified and may have their name reserved for some time after they are deleted. I have seen similar confusion in the following places:

The last in the list calls out some GCP documentation:

After you have established a private services access connection, and created a Cloud SQL instance with private IP configured for that connection, the corresponding (internal) subnet and range used by the Cloud SQL service cannot be modified or deleted. This is true even if you delete the peering and your IP range.

How can we reproduce it?

Set up Crossplane and stack-gcp, then create the network.yaml from the GCP services guide: https://crossplane.io/docs/master/services/gcp-services-guide.html

What environment did it happen in?

Crossplane version: v0.4.0

Validation for possible values of a field

What problem are you facing?

In GCP libraries, for many fields it's documented what values that are acceptable in the comments. An example:

// AddressType: The type of address to reserve, either INTERNAL or
// EXTERNAL. If unspecified, defaults to EXTERNAL.
//
// Possible values:
//   "EXTERNAL"
//   "INTERNAL"
//   "UNSPECIFIED_TYPE"
// +optional
AddressType string `json:"addressType,omitempty"`

As you see, the type of the field is string but one could easily enumerate the possible values.

How could Crossplane help solve your problem?

We can have the exact comments in our structs as well (which is already the case for recently implemented types). Then implement an admission validation webhook for all GCP resources that checks the possible values, which are already documented but just not enforced programmatically. It could either be generated (more effort) or implemented manually. We could run it as a sidecar container alongside the stack-gcp pod, or in a pod of its own.
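
A minimal sketch of the runtime check such a webhook might perform, using the AddressType example above. The function name and structure here are hypothetical, not existing stack-gcp code; note also that for simple enumerations, CRD-level enforcement via kubebuilder validation markers (e.g. `// +kubebuilder:validation:Enum=EXTERNAL;INTERNAL;UNSPECIFIED_TYPE`) could achieve the same result without a webhook.

```go
package main

import "fmt"

// validAddressTypes enumerates the values GCP documents for AddressType.
var validAddressTypes = map[string]bool{
	"EXTERNAL":         true,
	"INTERNAL":         true,
	"UNSPECIFIED_TYPE": true,
}

// validateAddressType is a hypothetical admission-time check: it accepts
// the empty string (GCP defaults it to EXTERNAL) and any documented value,
// and rejects everything else with a descriptive error.
func validateAddressType(v string) error {
	if v == "" {
		return nil
	}
	if !validAddressTypes[v] {
		return fmt.Errorf("addressType %q is not one of EXTERNAL, INTERNAL, UNSPECIFIED_TYPE", v)
	}
	return nil
}

func main() {
	fmt.Println(validateAddressType("INTERNAL")) // accepted
	fmt.Println(validateAddressType("BOGUS"))    // rejected with an error
}
```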

GCP networking resources to v1beta1

What problem are you facing?

Would like a v1beta1 version of networking resources

How could Crossplane help solve your problem?

Move networking resources to v1beta1 standards

CloudSQLInstance stuck in FAILED if using private connectivity and Connection does not exist

What happened?

When creating a GlobalAddress, Connection, and CloudSQLInstance, I noticed the CloudSQLInstance would enter a FAILED state and never come out of it. The error was:

Failed to create subnetwork. Please create Service Networking connection with service 'servicenetworking.googleapis.com' from consumer project '283222062215' network 'argo' again.

Upon reaching this state the instance had to be deleted and recreated because it was observed as existing but could not be updated.

How can we reproduce it?

  1. Create a GlobalAddress, a Connection (that references it), and a CloudSQLInstance with secure connectivity enabled all at the same time in the same network.
  2. The GlobalAddress should become available almost immediately, but the Connection will wait a short time before creation because its first reconciliation will result in unresolved references.
  3. The CloudSQLInstance will begin creation because it does not have references to either of the GlobalAddress or Connection, but will fail if it gets to the subnetwork creation step before the Connection is created because it will be unable to peer the network.

Note: The need to delete and recreate the resource in this scenario is not a major problem, because it is somewhat unlikely that this scenario will be exercised frequently, and the instance will never be created, so there is no risk of losing data. However, it does somewhat hamper the immediate bootstrap of a full environment that includes a database.

What environment did it happen in?

Crossplane version: v0.4.0
stack-gcp version: v0.2.0

Add a managed resource for Node Pools, to support more advanced configurations

What problem are you facing?

Some of the configurations of the GKE cluster nodes (like pod sandboxing and auto-scaling), are only possible through non-default Node Pools. However Node Pools are not currently supported, and when creating a GKE cluster, the corresponding parameter is not populated.

How could Crossplane help solve your problem?

I would envision tackling this issue by:

  • Support Node Pool external resource, by developing its equivalent managed resource
  • Support providing an existing Node Pool when creating a GKE cluster, instead of creating the default Node Pool

CloudSQL instance gets inexplicably deleted

What happened?

A couple of times this weekend I had a cloudsqlinstance managed resource get deleted for no reason that I can tell. The cloudsqlinstance was dynamically provisioned from a postgresqlinstance claim; after some time the claim still exists but is left pointing to a cloudsqlinstance resource that exists in neither the Kubernetes API nor GCP.

> kubectl -n crossplane-auto-devops-31-production get postgresqlinstance
NAME              STATUS   CLASS-KIND              CLASS-NAME                            RESOURCE-KIND      RESOURCE-NAME                                                AGE
production-demo   Bound    CloudSQLInstanceClass   cloudsqlinstancepostgresql-standard   CloudSQLInstance   crossplane-auto-devops-31-production-production-demo-lfbc7   4h

> kubectl get cloudsqlinstance
No resources found.

How can we reproduce it?

This is currently reproducing using the GitLab managed app and Auto Devops pipeline. Documentation can be found at https://gitlab.com/gitlab-org/gitlab/blob/master/doc/user/clusters/crossplane.md

This is the basic flow:

  1. Crossplane v0.4.1 and stack-gcp v0.2.0 are installed into the GKE cluster via GitLab managed app
  2. GlobalAddress and Connection are created
  3. CloudSQLInstanceClass is created
  4. Auto Devops pipeline runs and produces this PostgreSQLInstance
  5. A CloudSQLInstance is dynamically provisioned in GCP via the CloudSQLInstanceClass
  6. The setup is left to sit for some time. I've seen this repro after 15 mins and after 5 hours.
  7. The CloudSQLInstance disappears from both K8s API and GCP. GCP activity feed shows a deletion operation occurring.

Full -o yaml output and stack-gcp log can be found in https://gist.github.com/jbw976/b1a582b7a878258a2ad0d491d8064c7f

Note the flurry of network issues over a couple minute time span such as:

  • dial tcp 10.0.16.1:443: connect: connection refused"
  • net/http: TLS handshake timeout
  • dial tcp 10.0.16.1:443: i/o timeout"

We also see some errors for the CloudSQLInstance resource, such as:

{"level":"error","ts":1574041845.3841505,"logger":"controller-runtime.controller","msg":"Reconciler error","controller":"postgresqlinstance.cloudsqlinstance.database.gcp.crossplane.io","request":"crossplane-auto-devops-31-production/production-demo","error":"cannot update resource claim status:
Operation cannot be fulfilled on postgresqlinstances.database.crossplane.io \"production-demo\": StorageError: invalid object, Code: 4, Key: /registry/database.crossplane.io/postgresqlinstances/crossplane-auto-devops-31-production/production-demo, ResourceVersion: 0,
AdditionalErrorMsg: Precondition failed: UID in precondition: 9809b429-098b-11ea-917f-42010aa80077, UID in object meta: ",
"errorVerbose":"Operation cannot be fulfilled on postgresqlinstances.database.crossplane.io \"production-demo\": StorageError: invalid object, Code: 4, Key: /registry/database.crossplane.io/postgresqlinstances/crossplane-auto-devops-31-production/production-demo, ResourceVersion: 0,
AdditionalErrorMsg: Precondition failed: UID in precondition: 9809b429-098b-11ea-917f-42010aa80077, UID in object meta: \ncannot update resource claim status

Related to kubernetes/kubernetes#82130?

What environment did it happen in?

Crossplane version: crossplane/crossplane:v0.4.1
stack-gcp version: crossplane/stack-gcp:v0.2.0

  • Cloud provider or hardware configuration: GKE with k8s 1.13.11
  • Kubernetes version (use kubectl version)
Client Version: version.Info{Major:"1", Minor:"15", GitVersion:"v1.15.1", GitCommit:"4485c6f18cee9a5d3c3b4e523bd27972b1b53892", GitTreeState:"clean", BuildDate:"2019-07-18T09:18:22Z", GoVersion:"go1.12.5", Compiler:"gc", Platform:"darwin/amd64"}
Server Version: version.Info{Major:"1", Minor:"13+", GitVersion:"v1.13.11-gke.14", GitCommit:"56d89863d1033f9668ddd6e1c1aea81cd846ef88", GitTreeState:"clean", BuildDate:"2019-11-07T19:12:22Z", GoVersion:"go1.12.11b4", Compiler:"gc", Platform:"linux/amd64"}
  • Kubernetes distribution (e.g. Tectonic, GKE, OpenShift): GKE
  • OS (e.g. from /etc/os-release)
  • Kernel (e.g. uname -a)

GCP CloudSQL: Proxy connections

What problem are you facing?

This is related to connecting to CloudSQL instances from an application managed by Crossplane.

@ichekrygin and I were talking about different connectivity models for accessing CloudSQL instances from managed applications, and I want to capture some of the notes from that discussion. Maybe it makes sense to turn this into a one-pager in the future. This is all early thinking, and is not necessarily well thought out yet : ).

The question is: how do we connect to a Crossplane-managed CloudSQL instance from a managed application? Because it's a managed application and a managed database, the following requirements apply:

  • The specification should not be provider-specific; it should be portable
  • The specification should be simple

There is also a consideration of supporting connections between an application in a non-GCP provider and a database in GCP. But I consider this more of a tradeoff between approaches than a requirement.

Proposal

The discussion here will explore the CloudSQL Proxy option. The reason for focusing on the proxy approach is the following positive tradeoffs:

  • On the application side, the proxy is portable across all environments and providers, and it sidesteps any network connectivity questions
  • The proxy model is similar to access models for other services (e.g. Redis), and could inform our thinking around other proxy types.

From the application configuration side, the naive approach is to configure the proxy next to the application, which is provider-specific and not portable. The portable way to do this would be for the application to receive a connection string which points it to the cloudsql proxy. This is also a better developer experience.

To determine the connection string, the most likely scenario is that the crossplane controller would need to set up a cloudsql proxy container for the application, and would need to return a connection string that could be injected into the application container's environment. In order for the controller to know how to configure the proxy container, it would need to know which database the application is trying to connect to.

To summarize, the proposed model for interacting with a CloudSQL instance using a proxy would be:

  • The application configuration specifies which database it depends on (at the crossplane level, so for example, the name of a crossplane-managed database claim)
  • The controller managing KubernetesApplications (or maybe KubernetesApplicationResources; I'm not sure at this time) spins up a CloudSQL proxy for an application which declares that it depends on a particular database (if that database is bound to a CloudSQL instance). This could also be a controller which is separate from those other controllers. Maybe it goes as far as being modeled as a claim on a database connection, for example. Or maybe the same controller which manages a database claim handles this.
  • The aforementioned controller exposes the connection string for the application to consume
  • The application is now able to consume the connection string in the regular way

Further reading and related issues

GCP defaults do not match zero-value of Go structs

What happened?

When the value of a field is set to its type's zero value (false for bools, 0 for ints, etc.), the GCP client doesn't include it in the REST call. When a field isn't included in the REST call, GCP falls back to the default value for that setting, which in some cases is not the same as Go's zero value. For example, the CloudSQL IPv4Enabled setting defaults to true on GCP, so if you set its value to false we have to force-send it; otherwise the setting is lost.

How can we reproduce it?

See #14

Proposed Solution

In our translation code, where we convert our CRs into Google's structs, we should force-send all fields so that our settings propagate correctly without anything being missed.
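
The generated google-api-go-client structs expose a ForceSendFields mechanism for exactly this purpose. The sketch below models it in simplified, self-contained form (the struct and its MarshalJSON are stand-ins, not the real library code): with plain omitempty, a false Ipv4Enabled vanishes from the request body and GCP re-applies its server-side default of true; listing the field in ForceSendFields keeps it in the payload.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// ipConfiguration is a simplified stand-in for the generated
// google-api-go-client struct: bool fields carry omitempty, so a false
// value is dropped from the request body unless force-sent.
type ipConfiguration struct {
	Ipv4Enabled bool `json:"ipv4Enabled,omitempty"`

	// ForceSendFields lists Go field names to serialize even at their
	// zero value, mirroring the mechanism the real library provides.
	ForceSendFields []string `json:"-"`
}

// MarshalJSON first marshals via the struct tags, then re-adds any
// force-sent fields that omitempty dropped.
func (c ipConfiguration) MarshalJSON() ([]byte, error) {
	type plain ipConfiguration // local type avoids MarshalJSON recursion
	raw, err := json.Marshal(plain(c))
	if err != nil {
		return nil, err
	}
	m := map[string]interface{}{}
	if err := json.Unmarshal(raw, &m); err != nil {
		return nil, err
	}
	for _, f := range c.ForceSendFields {
		if f == "Ipv4Enabled" {
			m["ipv4Enabled"] = c.Ipv4Enabled
		}
	}
	return json.Marshal(m)
}

// marshal renders a config as the JSON that would go on the wire.
func marshal(c ipConfiguration) string {
	b, _ := json.Marshal(c)
	return string(b)
}

func main() {
	dropped := marshal(ipConfiguration{Ipv4Enabled: false})
	forced := marshal(ipConfiguration{Ipv4Enabled: false, ForceSendFields: []string{"Ipv4Enabled"}})
	fmt.Println(dropped) // {} — GCP would fall back to its default of true
	fmt.Println(forced)  // {"ipv4Enabled":false} — our setting propagates
}
```

In the translation code this would amount to populating ForceSendFields on the real sqladmin structs for every field we set, so zero values survive the round trip.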

BucketAccessControl managed resource implementation

What problem are you facing?

GCP Buckets allow access to buckets and specific objects through the BucketAccessControl object. You can define this object on the Buckets API object as well, but in Crossplane we respect the cloud provider's resource separation decisions.

So, we need a managed resource named BucketAccessControl, and the Bucket resource should refer to these objects to grant access.

How could Crossplane help solve your problem?

Implement the BucketAccessControl object and make the Acl and defaultObjectAcl fields references to that managed resource.

GCP storage buckets to v1beta1

What problem are you facing?

Would like a v1beta1 version of storage buckets

How could Crossplane help solve your problem?

Move storage buckets to v1beta1 standards

Update INSTALL.md to include local development instructions

I have built my gcp-stack image following the INSTALL.md

But it ends at the Install section with TBD. I also tried installing it the crossplane repo's way (by running cluster/local/minikube.sh up), but there are no scripts under the cluster/ folder.

Will running the built image as a deployment do, or is there a canonical/recommended way to install gcp-stack in my minikube?

CloudSQL: there is currently no way to rotate SSL certificate

What problem are you facing?

Currently, the CloudSQL connection secret includes the SSL certificate information for clients that want to access the instance via SSL. However, when the certificate expires, GCP requires you to take a manual action and rotate the keys. Details are here.

If the user takes rotation action on GCP Console, Crossplane does propagate it back to the connection secret. However, there is no mechanism to trigger that rotation through Crossplane.

How could Crossplane help solve your problem?

This is an imperative action, so we'd probably need to come up with a generic way of handling imperative actions and apply it here. The first thing that comes to mind is a field like certExpired: true that we'd update on each reconcile. If its value is false but the certificate did expire, that means the user changed it, so we'd call the rotation action. But this would require the certificate to be expired before rotation, so it's not a really bright solution.

Error message unclear when name of resource has been in use recently

Explained in the Slack Channel already, but I'll summarise it here again:

When using static provisioning (and thus non-random naming) for your CloudSQL Instances, it can happen that the name I'm trying to use has been in use recently, but deleted. The timeframe between "instance is deleted" and "new instance with same name can be provisioned" can be up to a week.

The thing is that Crossplane then adds the following message to the status of the CloudSQLInstance resource:

    Message:               cannot create new CloudSQL instance: googleapi: Error 409: The instance or operation is not in an appropriate state to handle the request., invalidState

I found the reason for this here.

Like terraform, it would be nice if Crossplane could report a nicer error than what you see above.

I think the code that generates this message is coming from here.

Terraform fixed this by checking on the return code (409).

However, there is a test that checks for an already existing instance with the same name, which also returns a 409. Although returning an error like

the name %s is unavailable because it was used recently

is not wrong, the instance could also still be in use.

I'd be happy to submit a PR if someone tells me exactly what to implement. Checking on the 409 should be quite straightforward; checking whether the DB still exists in order to return the proper message would be more work.
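
A sketch of the straightforward variant, checking on the HTTP 409 status as Terraform does. In real code the returned error would be a *googleapi.Error, which carries the HTTP status in its Code field; the apiError type and friendlyCreateError helper below are local stand-ins for illustration, not existing stack-gcp code.

```go
package main

import (
	"errors"
	"fmt"
	"net/http"
)

// apiError stands in for googleapi.Error, which exposes the HTTP
// status of a failed call in its Code field.
type apiError struct {
	Code    int
	Message string
}

func (e *apiError) Error() string {
	return fmt.Sprintf("googleapi: Error %d: %s", e.Code, e.Message)
}

// friendlyCreateError is a hypothetical helper: it translates a 409
// Conflict from instance creation into the clearer message discussed
// in this issue, and passes every other error through unchanged.
func friendlyCreateError(name string, err error) error {
	var ae *apiError
	if errors.As(err, &ae) && ae.Code == http.StatusConflict {
		return fmt.Errorf("the name %s is unavailable: it either exists already or was used recently", name)
	}
	return err
}

func main() {
	conflict := &apiError{Code: 409, Message: "The instance or operation is not in an appropriate state to handle the request., invalidState"}
	fmt.Println(friendlyCreateError("my-db", conflict))

	other := &apiError{Code: 500, Message: "backend error"}
	fmt.Println(friendlyCreateError("my-db", other))
}
```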
