Comments (8)

tomhjp commented on June 12, 2024

Thanks for adding those configs. I need to write up some documentation to explain this limitation, but the spec.parameters.vaultAddress value is what's tripping it up here. The helm chart configures the CSI provider to use a unix socket pointed at the Agent as its Vault address, but setting the CSI provider's -vault-addr flag or setting the SPC's vaultAddress parameter will configure it to reach out directly to Vault instead. That means the Agent isn't aware of the dynamic lease that needs renewing, so those Vault address options become a bit of a trip hazard.

I'd like to make it less of a trip hazard, but haven't come up with any plans yet. Let me know if you have thoughts. Perhaps at the very least the CSI provider can log a warning if it detects that an Agent sidecar is being bypassed.
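For illustration, here is a sketch of the SecretProviderClass from this issue with the vaultAddress parameter dropped, so that the provider falls back to the Agent address that the helm chart configures. It assumes the chart's Agent sidecar is enabled and that the remaining parameters are already correct for your cluster; treat it as a starting point rather than a verified config.

```yaml
apiVersion: secrets-store.csi.x-k8s.io/v1
kind: SecretProviderClass
metadata:
  name: vault-csi-secretsync
  namespace: my-kube-namespace
spec:
  provider: vault
  parameters:
    # vaultAddress is intentionally omitted: setting it makes the provider
    # talk to Vault directly, so the Agent never sees the lease to renew.
    vaultKubernetesMountPath: "kube/my-cluster"
    roleName: "my-service-account"
    objects: |
      - objectName: "my-vault-namespace.database.creds.dynamic_write_role"
        secretPath: "my-vault-namespace/database/creds/dynamic_write_role"
```

The same applies to the provider's -vault-addr flag: leaving it unset keeps requests going through the Agent's unix socket.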

from vault-csi-provider.

tomhjp commented on June 12, 2024

If you're seeing new leases get created when the pod restarts, that's expected behaviour as the Kubernetes token used to authenticate to Vault is bound to both the service account and pod (including the UIDs). Apologies if the changelogs didn't make that clear enough - the docs site could probably do with some updating too to explain the ins and outs in one place.

However, if you're seeing new leases on the same pod before the TTL is up then that sounds like a misconfiguration between the provider and the Vault Agent sidecar. If that's the case, please could you share the deployed pod yaml and the SecretProviderClass spec?

tjjosep commented on June 12, 2024

Thanks for the quick response @tomhjp. We don't see new secret leases unless the pod is restarted.

The issue is that the secret lease is NOT being renewed while the pod is alive, so it ends up revoked. When the lease reaches the 1hr (3600s) default TTL, the dynamic database secret is revoked by Vault because it was never renewed.

The expected outcome was for the secret lease to be renewed (by the newly added vault agent injector) for as long as the pod is alive. The lease should only be revoked in one of the scenarios below:

  • The same pod is still alive and the lease is revoked when it reaches the 10hr (36000s) max TTL.
  • The pod is terminated and the lease is revoked due to non-renewal.
---
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app: ms-demo-ephemeral-pg-secret-csi
  name: ms-demo-ephemeral-pg-secret-csi
  namespace: my-kube-namespace
spec:
  selector:
    matchLabels:
      app: ms-demo-ephemeral-pg-secret-csi
  template:
    metadata:
      labels:
        app: ms-demo-ephemeral-pg-secret-csi
    spec:
      containers:
      - env:
        - name: KUBE_VOLUME_MOUNT_PATH_PG_DB_SECRET
          value: /var/run/secrets/my-service-account-csi-store/my-vault-namespace.database.creds.dynamic_write_role
        image: docker-dev.artifactory.com/ms-demo-ephemeral-pg-secret-csi
        name: ms-demo-ephemeral-pg-secret-csi
        volumeMounts:
        - mountPath: /var/run/secrets/my-service-account-csi-store
          name: my-service-account-csi-store
          readOnly: true
      serviceAccountName: my-service-account  
      volumes:
      - csi:
          driver: secrets-store.csi.k8s.io
          readOnly: true
          volumeAttributes:
            secretProviderClass: vault-csi-secretsync
        name: my-service-account-csi-store
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: my-service-account  
  namespace: my-kube-namespace

---
apiVersion: secrets-store.csi.x-k8s.io/v1
kind: SecretProviderClass
metadata:
  name: vault-csi-secretsync
  namespace: my-kube-namespace
spec:
  provider: vault
  parameters:
    vaultAddress: "https://vault-aws.my-company.com/"  
    vaultKubernetesMountPath: "kube/my-cluster"  
    roleName: "my-service-account"  
    objects: |

      # Get raw Vault response
      - objectName: "my-vault-namespace.database.creds.dynamic_write_role"
        secretPath: "my-vault-namespace/database/creds/dynamic_write_role"

Below is the database secret role setup:

vault write database/roles/dynamic_write_role \
    db_name=my-postgres-db \
    creation_statements=@pg_dynamic_user_creation.sql \
    revocation_statements=@pg_dynamic_user_revocation.sql \
    rollback_statements=@pg_dynamic_user_rollback.sql \
    renew_statements=@pg_dynamic_user_renew.sql \
    default_ttl=3600s \
    max_ttl=36000s

tjjosep commented on June 12, 2024

That worked.

We found that if we route the CSI traffic through the Agent, then whenever the default TTL is reached, it renews the database secret, up to the max TTL. So the secret retrieved by the pod remains valid as long as a period token is enabled on the kube service account's Vault auth role.

This shows that we can keep a shorter TTL (e.g. 10 mins) on the dynamic DB secrets, but a period token (which can also have a short period, such as 10 mins) must be enabled on the auth token. We still need a longer max TTL so that we can schedule a pod recycle before the DB secret's max TTL expires.

If the period token is not enabled on the auth token, the lease is revoked whenever the auth token's TTL is reached.

Type                     Renewal                 Expiry
Database Dynamic Secret  Default TTL: 10 mins    Max TTL: 8 days (the pod will be recycled on the 7th day, so the max is never reached)
Auth Token               Period Token: 10 mins   32 days (is this called the TTL? Is it ignored if a period token is specified? Can it be shorter?)

tomhjp commented on June 12, 2024

msitworld commented on June 12, 2024

Hello,

I tried to route the CSI traffic through the vault-agent-injector, but the CSI provider cannot connect to it because of a TLS error. I searched for a fix and tried adding some annotations, but that did not work. Did you encounter this error? Any ideas?

reconciler.go:223] "failed to reconcile spc for pod" err="failed to rotate objects for pod application/app-655fd88bfd-qc7z2, err:
rpc error: code = Unknown desc = error making mount request: couldn't read secret  \"test\": failed to login: Post
\"https://vault-agent-injector-svc/v1/auth/kubernetes/login\": tls: failed to verify certificate: x509: certificate signed by
 unknown authority" spc="test-spc" pod="app-655fd88bfd-qc7z2" controller="rotation"

As a heads up, I installed Vault using the official helm chart and enabled the vault-agent-injector in the helm values.yaml.

tomhjp commented on June 12, 2024

@msitworld that sounds like a missing CA cert somewhere. If you can mount Vault's CA cert into the pod using csi.volumes and csi.volumeMounts, then you can use csi.agent.extraArgs to pass in the path to the CA using -ca-cert=/path/to/ca.pem. It does seem like the helm chart could do with some additional options around setting up TLS for the Agent as well though, as it doesn't look easy to do with environment variables or custom Agent config currently.
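As a rough sketch of what that could look like in the helm values, assuming the CA is stored in a Kubernetes Secret: the Secret name vault-ca, the mount path /vault/ca, and the file name ca.pem are placeholders made up for this example, not chart defaults.

```yaml
csi:
  enabled: true
  # Extra volume holding Vault's CA certificate.
  # "vault-ca" is a hypothetical Secret name; use whatever holds your CA.
  volumes:
    - name: vault-ca
      secret:
        secretName: vault-ca
  # Mount the CA into the CSI provider pod (path is an assumption).
  volumeMounts:
    - name: vault-ca
      mountPath: /vault/ca
      readOnly: true
  agent:
    enabled: true
    # Point the Agent at the mounted CA file.
    extraArgs:
      - -ca-cert=/vault/ca/ca.pem
```

It is worth checking the rendered daemonset (for example with helm template) to confirm the volume is actually mounted into the container that needs the CA.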

tjjosep commented on June 12, 2024

The original issue was resolved once the traffic was routed through the csi-provider-agent-sidecar, using csi-provider version 1.4.1. Closing the issue.

Appreciate the help.
