open-metadata / openmetadata-helm-charts
License: Apache License 2.0
For OpenMetadata, we need to restrict the Auth Provider values provided under global.authentication.provider
to only the valid options below -
Validate the input and error out with a meaningful message if the value does not match one of these options.
Add the validation logic with Go templates.
Ref - https://openmetadata.slack.com/archives/C02B6955S4S/p1675754184053859
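The validation could be a small named template that fails rendering on an unknown provider. A minimal sketch, assuming a helper in _helpers.tpl; the provider list here is illustrative, not the chart's authoritative set:

```yaml
{{- define "openmetadata.validateAuthProvider" -}}
{{- $valid := list "basic" "google" "okta" "auth0" "azure" "custom-oidc" "ldap" "saml" -}}
{{- $provider := .Values.global.authentication.provider -}}
{{- if not (has $provider $valid) -}}
{{- fail (printf "Unsupported global.authentication.provider %q. Valid values are: %s" $provider (join ", " $valid)) -}}
{{- end -}}
{{- end -}}
```

Calling {{ include "openmetadata.validateAuthProvider" . }} from a rendered template would then abort helm install/upgrade with a meaningful error.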
Helm upgrades from versions prior to 0.0.51 to the latest releases fail with an error in the helm hooks: secret "openmetadata-secret" not found, with the pod status being CreateContainerConfigError.
The reason is that helm hooks are called in the pre-upgrade step, while the OpenMetadata Secret for the Fernet key introduced with #66 is only created during the helm upgrade itself. This means the secret is not available to the pre-upgrade step during helm upgrades.
The solution is to use annotations on the secret object so that it is available before the helm hooks run!
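A sketch of the kind of annotations that make the Secret render as a hook resource before the other hooks run (the hook weight and delete policy shown are assumptions):

```yaml
apiVersion: v1
kind: Secret
metadata:
  name: openmetadata-secret
  annotations:
    "helm.sh/hook": pre-install,pre-upgrade
    "helm.sh/hook-weight": "-10"  # create the secret before the migration hook job runs
    "helm.sh/hook-delete-policy": before-hook-creation
```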
Should this be openmetadata-elasticsearch-password?
After enabling airflow in values.yaml, I got the following error:
vi values.yaml
airflow:
  enabled: true
helm install -f values.yaml
Error: UPGRADE FAILED: YAML parse error on openmetadata/templates/deployment.yaml: error converting YAML to JSON: yaml: line 83: block sequence entries are not allowed in this context
Looks like a template error in deployment.yaml.
Provide custom helm values to be used for port mappings on the OpenMetadata Helm Kubernetes Deployment object's containerPorts. Currently, we hard-code the TCP protocol with an overridable port number (still within the TCP protocol).
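A possible values.yaml shape for this; the key names below are illustrative assumptions, not the chart's current API:

```yaml
openmetadata:
  containerPorts:
    - name: http
      containerPort: 8585
      protocol: TCP
    - name: admin
      containerPort: 8586
      protocol: TCP
```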
Environment variables created from array values currently do not support query string parameters in URLs like https://my.website.com/auth2/keys?test=123. This prevents the openmetadata server from starting, failing with a YAML syntax parsing error.
The environment variable value configured on the pod for AUTHENTICATION_PUBLIC_KEYS looks like '[http://localhost:8585/api/v1/config/jwks?test=123]'.
Ideally, it should look like '["http://localhost:8585/api/v1/config/jwks?test=123"]'. Notice the quotes wrapping the array element values.
Use a helper function to achieve the above configuration in the openmetadata helm charts.
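One way such a helper could work is to render the list through toJson, which quotes each element (a sketch; the values path is assumed):

```yaml
- name: AUTHENTICATION_PUBLIC_KEYS
  value: {{ .Values.global.authentication.publicKeys | toJson | squote }}
```

For a single-element list this renders as '["http://localhost:8585/api/v1/config/jwks?test=123"]', with the quotes around each element preserved.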
Currently, OM 1.0.3 only supports "noop", "aws", and "aws-ssm". It would be really appreciated if the helm chart were updated to support the parameters below:
"noop", "managed-aws", "aws", "managed-aws-ssm", "aws-ssm", "in-memory"
Also, it would be great to update the values.yaml file with the latest KMS configuration example.
When we helm template the main chart, it does not add the namespace to the manifest. It would be great to have this; otherwise you always need to do a helm install.
To replicate:
helm repo add open-metadata https://helm.open-metadata.org/
helm repo update
helm template openmetadata open-metadata/openmetadata --namespace my_namespace
Expected:
Manifest is filled with namespaces. Ex.:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: openmetadata
  namespace: my_namespace
  labels:
    helm.sh/chart: openmetadata-1.0.1
    app.kubernetes.io/name: openmetadata
    app.kubernetes.io/instance: openmetadata
    app.kubernetes.io/version: "1.0.0"
    app.kubernetes.io/managed-by: Helm
...
Instead of hardcoding the fernet key in the values file
fernetkey:
  value: "jJ/9sz0g0OHxsfxOoSfdFdmk3ysNmPRnH3TUAbz3IHA="
  secretRef: ""
  secretKey: ""
a safer way is to use (randAlphaNum 32 | b64enc) as part of the initialization when fernetKey.secretRef is ~.
I am curious whether the hardcoded value is indeed intended.
With that, it would also be easier to move fernetKey.secretRef and fernetKey.secretKey into a .tpl helper and format their naming with {{ .Release.Name }}, which I think would be another scope to discuss.
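A sketch of what that initialization could look like in a Secret template (the secret and key names are assumptions); using lookup keeps an already-generated key stable across upgrades:

```yaml
{{- if not .Values.openmetadata.config.fernetkey.secretRef }}
apiVersion: v1
kind: Secret
metadata:
  name: {{ .Release.Name }}-fernet-key
type: Opaque
data:
  {{- $existing := lookup "v1" "Secret" .Release.Namespace (printf "%s-fernet-key" .Release.Name) }}
  {{- if $existing }}
  FERNET_KEY: {{ index $existing.data "FERNET_KEY" }}
  {{- else }}
  FERNET_KEY: {{ randAlphaNum 32 | b64enc }}
  {{- end }}
{{- end }}
```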
Below is the list of conditions to handle as part of this issue:
- global.authorizer.className will only have a fixed value of org.openmetadata.service.security.DefaultAuthorizer
- global.authorizer.containerRequestFilter will only have a fixed value of org.openmetadata.service.security.JwtFilter
- global.secretsManager.provider can only have the values noop, aws, aws-ssm
- If the global.airflow.openmetadata.authProvider value is azure, then the global.airflow.openmetadata.authConfig.azure.clientSecret.secretRef, global.airflow.openmetadata.authConfig.azure.clientSecret.secretKey, global.airflow.openmetadata.authConfig.azure.authority, global.airflow.openmetadata.authConfig.azure.clientId and global.airflow.openmetadata.authConfig.azure.scopes fields are required
- If the global.airflow.openmetadata.authProvider value is google, then global.airflow.openmetadata.authConfig.google.secretKeyPath and global.airflow.openmetadata.authConfig.google.audience are required
- If the global.airflow.openmetadata.authProvider value is okta, then global.airflow.openmetadata.authConfig.okta.privateKey.secretRef, global.airflow.openmetadata.authConfig.okta.privateKey.secretKey, global.airflow.openmetadata.authConfig.okta.email, global.airflow.openmetadata.authConfig.okta.clientId, global.airflow.openmetadata.authConfig.okta.orgUrl and global.airflow.openmetadata.authConfig.okta.scopes are required
- If the global.airflow.openmetadata.authProvider value is auth0, then global.airflow.openmetadata.authConfig.auth0.clientId, global.airflow.openmetadata.authConfig.auth0.secretKey.secretRef, global.airflow.openmetadata.authConfig.auth0.secretKey.secretKey and global.airflow.openmetadata.authConfig.auth0.domain are required
- If the global.airflow.openmetadata.authProvider value is openmetadata, then global.airflow.openmetadata.authConfig.openMetadata.jwtToken.secretRef and global.airflow.openmetadata.authConfig.openMetadata.jwtToken.secretKey are required
- If the global.airflow.openmetadata.authProvider value is custom-oidc, then global.airflow.openmetadata.authConfig.customOidc.secretKey.secretRef, global.airflow.openmetadata.authConfig.customOidc.secretKey.secretKey, global.airflow.openmetadata.authConfig.customOidc.clientId and global.airflow.openmetadata.authConfig.customOidc.tokenEndpoint
are required
Affected module
Backend
Describe the bug
When setting trustStore.enabled=true in a custom helm values.yaml for the openmetadata deployment and trying to helm install, there is an error thrown:
Error: INSTALLATION FAILED: template: openmetadata/templates/deployment.yaml:144:34: executing "openmetadata/templates/deployment.yaml" at <.password.secretRef>: nil pointer evaluating interface {}.secretRef
The reason is a check in the deployment template, which does the following:
{{- with .Values.global.elasticsearch.trustStore.password }}
- name: ELASTICSEARCH_TRUST_STORE_PASSWORD
  valueFrom:
    secretKeyRef:
      name: {{ .password.secretRef }}
      key: {{ .password.secretKey }}
The with clause ends its path with password, while the valueFrom body also looks up .password.secretRef and .password.secretKey. This builds a wrong path in which password is duplicated; the with clause should not include the trailing .password segment.
Another issue is the if clause which checks if the trustStore is present or not:
{{- if .Values.global.elasticsearch.trustStore.enabled -}}
- name: ELASTICSEARCH_TRUST_STORE_PATH
With the trailing "-}}" this will not generate a new line before the ELASTICSEARCH_TRUST_STORE_PATH variable. Example:
- name: ELASTICSEARCH_PASSWORD
  valueFrom:
    secretKeyRef:
      name: elasticsearch-secrets
      key: openmetadata-elasticsearch-password- name: ELASTICSEARCH_TRUST_STORE_PATH
  value: /opt/truststore.jks
A valid similar example which operates correctly can be seen here:
{{- if .Values.global.jwtTokenConfiguration.enabled }}
- name: RSA_PUBLIC_KEY_FILE_PATH
  value: "{{ .Values.global.jwtTokenConfiguration.rsapublicKeyFilePath }}"
Notice the missing end "-" in the if section. The same should apply for the trustStore.enabled check.
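Combining both fixes, the corrected trustStore block would look roughly like this (a sketch; the path key is taken from the sample values in this issue):

```yaml
{{- if .Values.global.elasticsearch.trustStore.enabled }}
- name: ELASTICSEARCH_TRUST_STORE_PATH
  value: {{ .Values.global.elasticsearch.trustStore.path }}
{{- with .Values.global.elasticsearch.trustStore }}
- name: ELASTICSEARCH_TRUST_STORE_PASSWORD
  valueFrom:
    secretKeyRef:
      name: {{ .password.secretRef }}
      key: {{ .password.secretKey }}
{{- end }}
{{- end }}
```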
To Reproduce
Install the openmetadata helm chart with the trustStore enabled, e.g. with these sample values:
global:
  elasticsearch:
    host: es-cluster.com
    port: 9200
    scheme: https
    connectionTimeoutSecs: 5
    socketTimeoutSecs: 60
    batchSize: 10
    trustStore:
      enabled: true
      path: /opt/truststore.jks
      password:
        secretRef: elasticsearch-truststore-secrets
        secretKey: openmetadata-elasticsearch-truststore-password
Expected behavior
The helm install should successfully install the openmetadata chart.
Version:
OpenMetadata version: 0.0.39
When configuring Openmetadata with TLS I get the following error:
Warning Unhealthy 3m45s (x2822 over 16h) kubelet Readiness probe failed: Get "http://10.244.1.16:8585/": net/http: HTTP/1.x transport connection broken: malformed HTTP response "\x15\x03\x03\x00\x02\x02P"
The HTTP check fails because the server is serving HTTPS.
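One way to fix this is to switch the probe scheme to HTTPS. A sketch, assuming the chart allows overriding the probe blocks (the key names and probe paths are assumptions):

```yaml
openmetadata:
  readinessProbe:
    httpGet:
      path: /
      port: 8585
      scheme: HTTPS
  livenessProbe:
    httpGet:
      path: /
      port: 8585
      scheme: HTTPS
```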
The OpenMetadata Server Kubernetes Service does not expose the admin port. Looks like a typo in the service.yaml file.
As per the changes done in PR 11134, Helm pre-upgrade hooks are not required anymore. They served the purpose of repairing migrations before starting the upgrade, which is now handled by the migrate-all script call available in openmetadata-start.sh.
In order to deprecate this functionality and simplify the helm charts, we need to remove the below changes -
The MySQL and airflow password keys are inconsistent
https://github.com/open-metadata/openmetadata-helm-charts#quickstart
The MySQL pod does not start because /bitnami/mysql complains about being out of space.
This can be used to automate processes such as initialization.
ref: https://github.com/hashicorp/vault-helm/blob/main/values.yaml#L494
In my production deployment, I use an external MySQL rather than the one from the helm dependency, and I have overridden the openmetadata.yaml file to reflect that change.
However, openmetadata-start.sh contains
while ! wget -O /dev/null -o /dev/null mysql:3306; do sleep 5; done
which assumes mysql is the hostname of the MySQL server. This does not work when MySQL is customised to another URL.
As a workaround, I did the following hack in the K8s spec, which skips the MySQL check and starts the server directly.
- name: openmetadata
  image: "openmetadata/server:0.7.0"
  args:
    - "/openmetadata-0.7.0/bin/openmetadata-server-start.sh"
    - "/openmetadata-0.7.0/conf/openmetadata.yaml"
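A less invasive fix would be to parameterize the wait loop in openmetadata-start.sh (a sketch; the MYSQL_HOST and MYSQL_PORT variable names are assumptions):

```shell
# Fall back to the old defaults when no override is provided.
DB_HOST="${MYSQL_HOST:-mysql}"
DB_PORT="${MYSQL_PORT:-3306}"
while ! wget -O /dev/null -o /dev/null "${DB_HOST}:${DB_PORT}"; do
  echo "Waiting for MySQL at ${DB_HOST}:${DB_PORT}..."
  sleep 5
done
```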
We need to use Elastic.co for production deployment, the default configuration comes with a password, so it should be configurable.
My workaround:
extraEnvs:
  - name: ELASTICSEARCH_USERNAME
    value: elastic
  - name: ELASTICSEARCH_PASSWORD
    valueFrom:
      secretKeyRef:
        name: "es-secrets"
        key: "es-password"
The AIRFLOW_AUTH_PROVIDER value should be custom-oidc instead of customOidc.
Slack Context - https://openmetadata.slack.com/archives/C02B6955S4S/p1653895330507059
It may be worth adding to the README somewhere that, due to the k8s.gcr.io registry being shut down soon, users may see errors with old helm charts.
The indent should be 10 here; with 12, it doesn't generate a valid YAML file.
We would like to be able to customise the startup script in the prod environment.
In the future, we will need our own bash script to start the service.
This customisation allows us to continue using the official image while maintaining some flexibility to add audit functionality to the setup.
With the default value being the same as it is now, it will not impact the normal setup.
image: "openmetadata/server:0.8.0"
imagePullPolicy: Always
command: # Customise this command
args: # And this
It would be great to be able to configure Google SSO via the helm chart.
Idea: maybe put the openmetadata.yaml config into a ConfigMap so it can be templated more easily.
Greetings,
Upgrading from 0.0.53 to 0.0.56 results in:
Error: UPGRADE FAILED: pre-upgrade hooks failed: timed out waiting for the condition
Detailed log
STDERR:
history.go:56: [debug] getting history for release openmetadata
upgrade.go:144: [debug] preparing upgrade for openmetadata
upgrade.go:152: [debug] performing update for openmetadata
upgrade.go:324: [debug] creating upgraded release for openmetadata
client.go:477: [debug] Starting delete for "db-migrations-cm-hook" ConfigMap
client.go:133: [debug] creating 1 resource(s)
client.go:477: [debug] Starting delete for "openmetadata-db-migrations-hook" Job
client.go:133: [debug] creating 1 resource(s)
client.go:703: [debug] Watching for changes to Job openmetadata-db-migrations-hook with timeout of 5m0s
client.go:731: [debug] Add/Modify event for openmetadata-db-migrations-hook: ADDED
client.go:770: [debug] openmetadata-db-migrations-hook: Jobs active: 1, jobs failed: 0, jobs succeeded: 0
upgrade.go:436: [debug] warning: Upgrade "openmetadata" failed: pre-upgrade hooks failed: timed out waiting for the condition
Error: UPGRADE FAILED: pre-upgrade hooks failed: timed out waiting for the condition
helm.go:84: [debug] pre-upgrade hooks failed: timed out waiting for the condition
UPGRADE FAILED
main.newUpgradeCmd.func2
helm.sh/helm/v3/cmd/helm/upgrade.go:203
github.com/spf13/cobra.(*Command).execute
github.com/spf13/cobra@<version>/command.go:916
github.com/spf13/cobra.(*Command).ExecuteC
github.com/spf13/cobra@<version>/command.go:1044
github.com/spf13/cobra.(*Command).Execute
github.com/spf13/cobra@<version>/command.go:968
main.main
helm.sh/helm/v3/cmd/helm/helm.go:83
runtime.main
runtime/proc.go:250
runtime.goexit
runtime/asm_amd64.s:1594
Could you please check whether this issue is reproducible?
Please advise
Error: UPGRADE FAILED: values don't meet the specifications of the schema(s) in the following chart(s):
openmetadata:
- global.pipelineServiceClientConfig: Additional property hostIp is not allowed
It seems the JSON schema is missing the YAML configuration for hostIp. To fix this, a new helm charts release will need to be published!
Ability to support and enable PostgreSQL as backend for OpenMetadata and Airflow as part of OpenMetadata Dependencies Helm Chart.
Hi, I'm trying to reuse the underlying airflow instance for other purposes and adding new DAGs in the mounted DAG folder.
However, I cannot seem to run the KubernetesPodOperator and my DAGs cannot be imported.
When I use the airflow.contrib one
from airflow.contrib.operators.kubernetes_pod_operator import KubernetesPodOperator
I get
# log
Broken DAG: [/airflow-dags/dags/dbt.py] Traceback (most recent call last):
File "/airflow-dags/dags/dbt.py", line 3, in <module>
from airflow.contrib.operators.kubernetes_pod_operator import KubernetesPodOperator
File "/home/airflow/.local/lib/python3.9/site-packages/airflow/contrib/operators/kubernetes_pod_operator.py", line 25, in <module>
from airflow.providers.cncf.kubernetes.operators.kubernetes_pod import KubernetesPodOperator # noqa
ModuleNotFoundError: No module named 'airflow.providers.cncf.kubernetes'
When I use the airflow.providers one
from airflow.providers.cncf.kubernetes.operators.kubernetes_pod import KubernetesPodOperator
I get
from airflow.providers.cncf.kubernetes.operators.kubernetes_pod import KubernetesPodOperator
I'm surprised the jobs from OMD can actually run and spawn new kubernetes workers/jobs without this dependency. Looking at the deployed code, the airflow pods seem to use your custom docker image docker.getcollate.io/openmetadata/ingestion:0.13.2. Could the KubernetesPodOperator be integrated into this image, or maybe a different docker image more suited for kubernetes could be provided?
The airflow setup represents several GBs of pod memory, so I'm not keen on duplicating Airflow instances and having one just for OMD and one for the rest. I'd rather reuse the one from your OMD-dependencies helm chart, which is already connected to OMD. But I'm open to other suggestions.
Slack discussion there
Example -
List value:
global:
  authorizer:
    initialAdmins:
      - "abc"
      - "def"
Expected ENV value - '[abc,def]'
ENV Value currently generated - '[abc def]'
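A sketch of a template expression that would produce the expected form, using Sprig's join instead of the default space-separated rendering (the env var name here is illustrative):

```yaml
- name: AUTHORIZER_ADMIN_PRINCIPALS
  value: '[{{ join "," .Values.global.authorizer.initialAdmins }}]'
```

For the list above this renders as '[abc,def]'.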
In v0.0.7, helm gives me an error:
UPGRADE FAILED: pre-upgrade hooks failed: warning: Hook pre-upgrade openmetadata/templates/check-db-migrations-job-hook.yaml failed: Job in version "v1" cannot be handled as a Job: v1.Job.Spec: v1.JobSpec.Template: v1.PodTemplateSpec.Spec: v1.PodSpec.Containers: []v1.Container: v1.Container.Env: []v1.EnvVar: v1.EnvVar.Value: ReadString: expects " or n, but found 8, error found in #10 byte of ...|,"value":8585},{"nam|..., bigger context ...|e":"openmetadata"},{"name":"SERVER_PORT","value":8585},{"name":"SERVER_ADMIN_PORT","value":8586},{"n|..
e.g. the deployment.yaml rendered looks like this:
...
containers:
- name: openmetadata
...
env:
- name: SERVER_HOST
value: openmetadata
- name: SERVER_PORT
value: 8585 #<--- this is not allowed because string type is expected, but for yaml it is not a string but a number
- name: SERVER_ADMIN_PORT
value: 8586
- name: ELASTICSEARCH_HOST
value: elasticsearch-master.openmetadata.svc.cluster.local
- name: ELASTICSEARCH_PORT
value: 9200
- name: ELASTICSEARCH_SCHEME
value: http
- name: MYSQL_HOST
value: mysql.openmetadata.svc.cluster.local
- name: MYSQL_PORT
value: 3306
Currently, when deploying OpenMetadata in GKE, we can specify a specific node pool via this entry in the values.yml file:
nodeSelector:
  cloud.google.com/gke-nodepool: <nodepool-name>
This works for everything except the "check-db-migrations-job-hook" pod.
So the fix would be:
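Presumably, propagating the chart's nodeSelector into the hook Job's pod template, roughly like this (a sketch against check-db-migrations-job-hook.yaml; the values key is an assumption):

```yaml
spec:
  template:
    spec:
      {{- with .Values.nodeSelector }}
      nodeSelector:
        {{- toYaml . | nindent 8 }}
      {{- end }}
```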
Fix Helm Charts security reports for critical and high CVEs.
When installing the default steps, we end up with a few deployments (generated by openmetadata-dependencies) that are stuck failing with the error:
CreateContainerConfigError: secret "airflow-mysql-secrets" not found
OpenMetadata currently uses Environment Variables to map values from values.yaml. Improve on this to use Kubernetes Secrets within the Helm Charts for better security.
There is a bug where the MySQL database init script has a hardcoded password.
initdbScripts:
  init_openmetadata_db_scripts.sql: |
    CREATE DATABASE openmetadata_db;
    CREATE USER 'openmetadata_user'@'%' IDENTIFIED BY 'openmetadata_password';
    GRANT ALL PRIVILEGES ON openmetadata_db.* TO 'openmetadata_user'@'%' WITH GRANT OPTION;
The readme mentions creating a password of your choice under the secret mysql-secrets. As the init script has a hardcoded value, the OpenMetadata server is unable to connect to the MySQL DB on deployment.
Update Helm values to be able to override the Readiness and Liveness probe blocks for OpenMetadata deployments.
After enabling the ingress config, browser reloads end up in a 404, while navigating via the links still works.
Anything to pay attention to here?
MySQL uses jks as the SSL certificate format, which is a binary file. Our environment doesn't work with binary files in the configuration manager; it only works with text format.
127.0.0.1 as the mysql server config.
The proposed solution does not complicate the general setup, but gives flexibility to set up in a more security-restricted environment.
Hey guys,
I am trying to configure airflow as the main pipeline service for OpenMetadata in my Kubernetes cluster. I am using the official OpenMetadata helm chart along with the airflow chart managed by Bitnami. See the chart links section for links.
What have I done:
For the bitnami/airflow values.yaml,
I installed the python packages as mentioned in the official page: https://docs.open-metadata.org/openmetadata/connectors/pipeline/airflow/lineage-backend
openmetadata-ingestion
openmetadata-airflow-managed-apis
and set the following environment variables to configure a lineage backend:
AIRFLOW__LINEAGE__BACKEND="airflow_provider_openmetadata.lineage.openmetadata.OpenMetadataLineageBackend"
AIRFLOW__LINEAGE__AIRFLOW_SERVICE_NAME="airflow_helm"
AIRFLOW__LINEAGE__OPENMETADATA_API_ENDPOINT="http://openmetadata:8585/api"
AIRFLOW__LINEAGE__AUTH_PROVIDER_TYPE="no-auth"
For OpenMetaData I disable the default airflow pod that comes with OpenMetaData and add the configuration for the bitnami/airflow.
Issue:
Now, going to the OpenMetadata UI I am getting the following error:
Airflow Exception [Failed to get Pipeline Service host IP.] due to [Failed to get Pipeline Service host IP due to { "detail": "The requested URL was not found on the server. If you entered the URL manually please check your spelling and try again.", "status": 404, "title": "Not Found", "type": "about:blank" } ].
Basically, OpenMetadata cannot find Airflow. Note that both are in the same namespace, which is not the default one. I tried debugging for hours and could not find the source of the issue.
Thanks for the help. I attached the values for both OpenMetadata and bitnami/airflow.
Chart versions links:
Open meta data helm - https://github.com/open-metadata/openmetadata-helm-charts
Airflow helm - https://github.com/bitnami/charts/tree/master/bitnami/airflow
Versions:
Helm version: 3.2.1
airflow_values.txt
openmeta_dep_values.txt
openmeta_values.txt
I'm not sure where the best place to report this is, but on this page of the documentation it appears that the global.airflow.openmetadata.customoidc.secretKeyPath parameter used to set the client secret is inaccurate according to here. If you look closely, there is an additional .authConfig group, as well as an object called secretKey that specifies the secretRef and secretKey to reference.
I thought perhaps this repo would be the place to add a PR correcting the documentation, but I'm not seeing anything in the content directory that leads me to the markdown that I need to edit.
It would be easier if there were a way to store TLS certificates or openmetadata.keystore.jks in a Kubernetes Secret that gets mounted into the OM deployment container.
Relates to open-metadata/OpenMetadata#9022
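A sketch of what that could look like in the deployment spec (the secret name and mount path are assumptions):

```yaml
volumes:
  - name: om-truststore
    secret:
      secretName: openmetadata-truststore
containers:
  - name: openmetadata
    volumeMounts:
      - name: om-truststore
        mountPath: /opt/certs
        readOnly: true
```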
Hi Team,
I am trying to install the ingestion chart locally, but it fails to create a PVC with ReadWriteMany.
Also, the validation fails if I keep ReadWriteOnce.
Any help to run this locally? Maybe I need to disable the template validation?
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal WaitForFirstConsumer 41m persistentvolume-controller waiting for first consumer to be created before binding
Normal Provisioning 10m (x8 over 41m) rancher.io/local-path_local-path-provisioner-684f458cdd-vztxn_237b9e7f-993b-49b4-938b-2731994ad6e0 External provisioner is provisioning volume for claim "default/openmetadata-dependencies-logs"
Warning ProvisioningFailed 10m (x8 over 41m) rancher.io/local-path_local-path-provisioner-684f458cdd-vztxn_237b9e7f-993b-49b4-938b-2731994ad6e0 failed to provision volume with StorageClass "standard": Only support ReadWriteOnce access mode
Normal ExternalProvisioning 114s (x162 over 41m) persistentvolume-controller waiting for a volume to be created, either by external provisioner "rancher.io/local-path" or manually created by system administrator
Error: INSTALLATION FAILED: execution error at (openmetadata-dependencies/charts/airflow/templates/_helpers/validate-values.tpl:105:5): The logs.persistence.accessMode must be ReadWriteMany!
The mysql pod gets restarted due to a startup probe failure. After disabling the liveness probe, it started to work.
Startup probe failed: mysqladmin: [Warning] Using a password on the command line interface can be insecure. �mysqladmin: connect to server at 'localhost' failed error: 'Can't connect to local MySQL server through socket '/opt/bitnami/mysql/tmp/mysql.sock' (2)' Check that mysqld is running and that the socket: '/opt/bitnami/mysql/tmp/mysql.sock' exists!
Helm values:
mysql:
  image:
    debug: true
  startupProbe:
    enabled: false
The debug output didn't show any error.
Hi,
I have found an issue with the AWS Secrets Manager: when a service is removed from OpenMetadata, the secrets still remain on AWS.
For more details, please see the thread below:
https://openmetadata.slack.com/archives/C02B6955S4S/p1686560318700229
Let's have airflow pods that spin up from openmetadata-dependencies use a full override name from airflow-helm values.