
Temporal Helm Chart


Temporal is a distributed, scalable, durable, and highly available orchestration engine designed to execute asynchronous long-running business logic in a resilient way.

This repo contains a basic V3 Helm chart that deploys Temporal to a Kubernetes cluster. The dependencies that are bundled with this solution by default offer an easy way to experiment with Temporal software. This Helm chart can also be used to install just the Temporal server, configured to connect to dependencies (such as a Cassandra, MySQL, or PostgreSQL database) that you may already have available in your environment.

The only portions of the helm chart that are production ready are the parts that configure and manage Temporal Server itself—not Cassandra, Elasticsearch, Prometheus, or Grafana.

This Helm Chart code is tested by a dedicated test pipeline. It is also used extensively by other Temporal pipelines for testing various aspects of Temporal systems. Our test pipeline currently uses Helm 3.1.1.

Install Temporal service on a Kubernetes cluster

Prerequisites

This sequence assumes

  • that your system is configured to access a Kubernetes cluster (e.g. AWS EKS, kind, or minikube), and
  • that your machine has Helm v3 and kubectl installed.

This repo currently contains only one chart, but it is structured as a standard Helm repository. This means you will find the chart in the charts/temporal directory. All example helm commands below should be run from that directory.
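For example, a minimal sequence to get a working copy and land in the chart directory (a sketch; adjust paths if you already have a checkout):

# clone the helm-charts repo and change into the chart directory
git clone https://github.com/temporalio/helm-charts.git
cd helm-charts/charts/temporal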

Download Helm Chart Dependencies

Download Helm dependencies:

helm dependencies update

Install Temporal with Helm Chart

Temporal can be configured to run with various dependencies. The default "Batteries Included" Helm Chart configuration deploys and configures the following components:

  • Cassandra
  • ElasticSearch
  • Prometheus
  • Grafana

The sections that follow describe various deployment configurations, from a minimal one-replica installation using included dependencies, to a replicated deployment on existing infrastructure.

Minimal installation with required dependencies only

To install Temporal in a limited but working and self-contained configuration (one replica of Cassandra and of each of Temporal's services, with no metrics or Elasticsearch), you can run the following command:

helm install \
    --set server.replicaCount=1 \
    --set cassandra.config.cluster_size=1 \
    --set prometheus.enabled=false \
    --set grafana.enabled=false \
    --set elasticsearch.enabled=false \
    temporaltest . --timeout 15m

This configuration consumes limited resources and is useful for small-scale tests (such as on minikube).

Below is an example of an environment installed in this configuration:

$ kubectl get pods
NAME                                           READY   STATUS    RESTARTS   AGE
temporaltest-admintools-6cdf56b869-xdxz2       1/1     Running   0          11m
temporaltest-cassandra-0                       1/1     Running   0          11m
temporaltest-frontend-5d5b6d9c59-v9g5j         1/1     Running   2          11m
temporaltest-history-64b9ddbc4b-bwk6j          1/1     Running   2          11m
temporaltest-matching-c8887ddc4-jnzg2          1/1     Running   2          11m
temporaltest-metrics-server-7fbbf65cff-rp2ks   1/1     Running   0          11m
temporaltest-web-77f68bff76-ndkzf              1/1     Running   0          11m
temporaltest-worker-7c9d68f4cf-8tzfw           1/1     Running   2          11m

Install with required and optional dependencies

This method requires a three-node Kubernetes cluster to successfully bring up all the dependencies.

By default, the Temporal Helm Chart configures Temporal to run with a three-node Cassandra cluster (for persistence), Elasticsearch (for "visibility" features), Prometheus, and Grafana, and it installs all of these dependencies out of the box.

To install Temporal with all of its dependencies, run this command:

helm install temporaltest . --timeout 900s

To use your own instance of ElasticSearch, MySQL, PostgreSQL, or Cassandra, please read the "Bring Your Own" sections below.

Other components (Prometheus, Grafana) can be omitted from the installation by setting their corresponding enabled flag to false:

helm install \
    --set prometheus.enabled=false \
    --set grafana.enabled=false \
    temporaltest . --timeout 900s

Install with sidecar containers

You may need to provide your own sidecar containers.

To do so, look at the example for Google's Cloud SQL Proxy in values/values.cloudsqlproxy.yaml and pass that file to helm install.

Example:

helm install -f values/values.cloudsqlproxy.yaml temporaltest . --timeout 900s
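As a rough sketch only, such a values file adds extra container specs alongside the Temporal server containers. The server.sidecarContainers key and every field below are assumptions for illustration; check values/values.cloudsqlproxy.yaml in this repo for the actual structure and key names.

# hypothetical sketch -- verify key names against values/values.cloudsqlproxy.yaml
server:
  sidecarContainers:                                     # assumed key name
    - name: cloud-sql-proxy                              # example sidecar name
      image: gcr.io/cloudsql-docker/gce-proxy:<TAG>      # placeholder image tag
      command:
        - "/cloud_sql_proxy"
        - "-instances=<PROJECT>:<REGION>:<INSTANCE>=tcp:3306"   # placeholder instance string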

Install with your own ElasticSearch

You might already be operating an instance of ElasticSearch that you want to use with Temporal.

To do so, fill in the relevant configuration values in values/values.elasticsearch.yaml, and pass the file to helm install.

Example:

helm install -f values/values.elasticsearch.yaml temporaltest . --timeout 900s
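As a sketch of the kind of settings involved (the key names follow the elasticsearch block used elsewhere with this chart, and the values are placeholders; compare with the actual values/values.elasticsearch.yaml):

elasticsearch:
  enabled: false                  # do not deploy the bundled Elasticsearch
  external: true                  # point Temporal at an existing cluster instead
  host: "es.example.internal"     # placeholder hostname
  port: "9200"
  version: "v7"
  scheme: "http"
  logLevel: "info"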

Install with your own MySQL

You might already be operating a MySQL instance that you want to use with Temporal.

In this case, create and configure the temporal databases on your MySQL host with temporal-sql-tool. The tool is part of the temporal repo and relies on the schema definitions in the same repo.

Here are examples of commands you can use to create and initialize the databases:

# in https://github.com/temporalio/temporal git repo dir
export SQL_PLUGIN=mysql8
export SQL_HOST=mysql_host
export SQL_PORT=3306
export SQL_USER=mysql_user
export SQL_PASSWORD=mysql_password

make temporal-sql-tool

./temporal-sql-tool --database temporal create-database
SQL_DATABASE=temporal ./temporal-sql-tool setup-schema -v 0.0
SQL_DATABASE=temporal ./temporal-sql-tool update -schema-dir schema/mysql/v8/temporal/versioned

./temporal-sql-tool --database temporal_visibility create-database
SQL_DATABASE=temporal_visibility ./temporal-sql-tool setup-schema -v 0.0
SQL_DATABASE=temporal_visibility ./temporal-sql-tool update -schema-dir schema/mysql/v8/visibility/versioned

Once you have initialized the two databases, fill in the configuration values in values/values.mysql.yaml, and run

# in https://github.com/temporalio/helm-charts git repo dir
helm install -f values/values.mysql.yaml temporaltest . --timeout 900s
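For orientation, the persistence portion of such a values file takes roughly the shape sketched below. Hosts, users, and passwords are placeholders, and the SQL driver name is an assumption based on the SQL_PLUGIN used above; compare with the actual values/values.mysql.yaml shipped in the repo.

server:
  config:
    persistence:
      default:
        driver: "sql"
        sql:
          driver: "mysql8"             # assumption; older chart versions use "mysql"
          host: mysql_host
          port: 3306
          database: temporal
          user: mysql_user
          password: mysql_password
      visibility:
        driver: "sql"
        sql:
          driver: "mysql8"
          host: mysql_host
          port: 3306
          database: temporal_visibility
          user: mysql_user
          password: mysql_password
cassandra:
  enabled: false                       # do not deploy the bundled Cassandra
mysql:
  enabled: true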

Alternatively, instead of modifying values/values.mysql.yaml, you can supply those values on the command line:

# in https://github.com/temporalio/helm-charts git repo dir
helm install -f values/values.mysql.yaml temporaltest \
  --set elasticsearch.enabled=false \
  --set server.config.persistence.default.sql.user=mysql_user \
  --set server.config.persistence.default.sql.password=mysql_password \
  --set server.config.persistence.visibility.sql.user=mysql_user \
  --set server.config.persistence.visibility.sql.password=mysql_password \
  --set server.config.persistence.default.sql.host=mysql_host \
  --set server.config.persistence.visibility.sql.host=mysql_host . --timeout 900s

NOTE: Requires MySQL 8.0.17+; older versions are not supported.

Install with your own PostgreSQL

You might already be operating a PostgreSQL instance that you want to use with Temporal.

In this case, create and configure the temporal databases on your PostgreSQL host with temporal-sql-tool. The tool is part of the temporal repo and relies on the schema definitions in the same repo.

Here are examples of commands you can use to create and initialize the databases:

# in https://github.com/temporalio/temporal git repo dir
export SQL_PLUGIN=postgres12
export SQL_HOST=postgresql_host
export SQL_PORT=5432
export SQL_USER=postgresql_user
export SQL_PASSWORD=postgresql_password

make temporal-sql-tool

./temporal-sql-tool --database temporal create-database
SQL_DATABASE=temporal ./temporal-sql-tool setup-schema -v 0.0
SQL_DATABASE=temporal ./temporal-sql-tool update -schema-dir schema/postgresql/v12/temporal/versioned

./temporal-sql-tool --database temporal_visibility create-database
SQL_DATABASE=temporal_visibility ./temporal-sql-tool setup-schema -v 0.0
SQL_DATABASE=temporal_visibility ./temporal-sql-tool update -schema-dir schema/postgresql/v12/visibility/versioned

Once you have initialized the two databases, fill in the configuration values in values/values.postgresql.yaml, and run

# in https://github.com/temporalio/helm-charts git repo dir
helm install -f values/values.postgresql.yaml temporaltest . --timeout 900s
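The shape mirrors the MySQL sketch above. Again, hosts and credentials are placeholders and the driver name is an assumption based on the SQL_PLUGIN used above; compare with the actual values/values.postgresql.yaml.

server:
  config:
    persistence:
      default:
        driver: "sql"
        sql:
          driver: "postgres12"         # assumption; older chart versions use "postgres"
          host: postgresql_host
          port: 5432
          database: temporal
          user: postgresql_user
          password: postgresql_password
      visibility:
        driver: "sql"
        sql:
          driver: "postgres12"
          host: postgresql_host
          port: 5432
          database: temporal_visibility
          user: postgresql_user
          password: postgresql_password
cassandra:
  enabled: false                       # do not deploy the bundled Cassandra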

Alternatively, instead of modifying values/values.postgresql.yaml, you can supply those values on the command line:

# in https://github.com/temporalio/helm-charts git repo dir
helm install -f values/values.postgresql.yaml temporaltest \
  --set elasticsearch.enabled=false \
  --set server.config.persistence.default.sql.user=postgresql_user \
  --set server.config.persistence.default.sql.password=postgresql_password \
  --set server.config.persistence.visibility.sql.user=postgresql_user \
  --set server.config.persistence.visibility.sql.password=postgresql_password \
  --set server.config.persistence.default.sql.host=postgresql_host \
  --set server.config.persistence.visibility.sql.host=postgresql_host . --timeout 900s

NOTE: Requires PostgreSQL 12+; older versions are not supported.

Install with your own Cassandra

You might already be operating a Cassandra instance that you want to use with Temporal.

In this case, create and set up the keyspace in your Cassandra instance with temporal-cassandra-tool. The tool is part of the temporal repo and relies on the schema definitions in the same repo.

Here are examples of commands you can use to create and initialize the keyspace:

# in https://github.com/temporalio/temporal git repo dir
export CASSANDRA_HOST=cassandra_host
export CASSANDRA_PORT=9042
export CASSANDRA_USER=cassandra_user
export CASSANDRA_PASSWORD=cassandra_user_password

./temporal-cassandra-tool create-Keyspace -k temporal
CASSANDRA_KEYSPACE=temporal ./temporal-cassandra-tool setup-schema -v 0.0
CASSANDRA_KEYSPACE=temporal ./temporal-cassandra-tool update -schema-dir schema/cassandra/temporal/versioned

Once you have initialized the keyspace, fill in the configuration values in values/values.cassandra.yaml, and run

helm install -f values/values.cassandra.yaml temporaltest . --timeout 900s

Note that Temporal cannot run without a store for visibility, and Cassandra is not a supported visibility database. We recommend using Elasticsearch in this case (see the "Install with your own ElasticSearch" section above for setup).
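The persistence keys involved mirror the --set flags used in the combined installation example later in this README. A rough sketch of what the default-store portion of values/values.cassandra.yaml can look like (hosts, credentials, and keyspace are placeholders; compare with the actual file in the repo):

server:
  config:
    persistence:
      default:
        driver: "cassandra"
        cassandra:
          hosts: ["cassandra.example.internal"]   # placeholder host list
          port: 9042
          keyspace: temporal
          user: cassandra_user
          password: cassandra_user_password
          replicationFactor: 3
cassandra:
  enabled: false                                  # do not deploy the bundled Cassandra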

Enable Archival

By default, archival is disabled. You can enable it by picking one of the three provider options:

  • File Store, values file values/values.archival.filestore.yaml
  • S3, values file values/values.archival.s3.yaml
  • GCloud, values file values/values.archival.gcloud.yaml

For example, to repeat the minimal installation above with archival enabled using the file store provider:

helm install -f values/values.archival.filestore.yaml \
    --set server.replicaCount=1 \
    --set cassandra.config.cluster_size=1 \
    --set prometheus.enabled=false \
    --set grafana.enabled=false \
    --set elasticsearch.enabled=false \
    temporaltest . --timeout 15m

Note that if archival is enabled, it is also enabled for all newly created namespaces. Make sure to update the values file for your chosen archival provider with your own configuration.

Install and configure Temporal

If a live application environment already uses systems that Temporal can use as dependencies, then those systems can continue to be used. This Helm chart can install the minimal pieces of Temporal such that it can then be configured to use those systems as its dependencies.

The example below demonstrates a few things:

  1. How to set values via the command line rather than the environment.
  2. How to configure a database (shows Cassandra, but MySQL works the same way)
  3. How to enable TLS for the database connection.
  4. How to enable Auth for the Web UI
helm install temporaltest \
    -f values/values.cassandra.yaml \
    -f values/values.elasticsearch.yaml \
    --set elasticsearch.enabled=true \
    --set grafana.enabled=false \
    --set prometheus.enabled=false \
    --set server.replicaCount=5 \
    --set server.config.persistence.default.cassandra.hosts=cassandra.data.host.example \
    --set server.config.persistence.default.cassandra.user=cassandra_user \
    --set server.config.persistence.default.cassandra.password=cassandra_user_password \
    --set server.config.persistence.default.cassandra.tls.caData=$(base64 --wrap=0 cassandra.ca.pem) \
    --set server.config.persistence.default.cassandra.tls.enabled=true \
    --set server.config.persistence.default.cassandra.replicationFactor=3 \
    --set server.config.persistence.default.cassandra.keyspace=temporal \
    . \
    --timeout 15m \
    --wait

Play With It

Exploring Your Cluster

You can use your favorite kubernetes tools (k9s, kubectl, etc.) to interact with your cluster.

$ kubectl get svc
NAME                                   TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)                                        AGE
...
temporaltest-admintools                ClusterIP   172.20.237.59    <none>        22/TCP                                         15m
temporaltest-frontend-headless         ClusterIP   None             <none>        7233/TCP,9090/TCP                              15m
temporaltest-history-headless          ClusterIP   None             <none>        7234/TCP,9090/TCP                              15m
temporaltest-matching-headless         ClusterIP   None             <none>        7235/TCP,9090/TCP                              15m
temporaltest-worker-headless           ClusterIP   None             <none>        7239/TCP,9090/TCP                              15m
...
$ kubectl get pods
...
temporaltest-admintools-7b6c599855-8bk4x                1/1     Running   0          25m
temporaltest-frontend-54d94fdcc4-bx89b                  1/1     Running   2          25m
temporaltest-history-86d8d7869-lzb6f                    1/1     Running   2          25m
temporaltest-matching-6c7d6d7489-kj5pj                  1/1     Running   3          25m
temporaltest-worker-769b996fd-qmvbw                     1/1     Running   2          25m
...

Running Temporal CLI From the Admin Tools Container

You can also shell into the admin-tools container via k9s or by running

$ kubectl exec -it services/temporaltest-admintools -- /bin/bash
bash-5.0#

and run Temporal CLI from there:

bash-5.0# tctl namespace list
Name: temporal-system
Id: 32049b68-7872-4094-8e63-d0dd59896a83
Description: Temporal internal system namespace
OwnerEmail: [email protected]
NamespaceData: map[string]string(nil)
Status: Registered
RetentionInDays: 7
EmitMetrics: true
ActiveClusterName: active
Clusters: active
HistoryArchivalStatus: Disabled
VisibilityArchivalStatus: Disabled
Bad binaries to reset:
+-----------------+----------+------------+--------+
| BINARY CHECKSUM | OPERATOR | START TIME | REASON |
+-----------------+----------+------------+--------+
+-----------------+----------+------------+--------+
bash-5.0# tctl --namespace nonesuch namespace desc
Error: Namespace nonesuch does not exist.
Error Details: Namespace nonesuch does not exist.
bash-5.0# tctl --namespace nonesuch namespace re
Namespace nonesuch successfully registered.
bash-5.0# tctl --namespace nonesuch namespace desc
Name: nonesuch
UUID: 465bb575-8c01-43f8-a67d-d676e1ae5eae
Description:
OwnerEmail:
NamespaceData: map[string]string(nil)
Status: Registered
RetentionInDays: 3
EmitMetrics: false
ActiveClusterName: active
Clusters: active
HistoryArchivalStatus: ArchivalStatusDisabled
VisibilityArchivalStatus: ArchivalStatusDisabled
Bad binaries to reset:
+-----------------+----------+------------+--------+
| BINARY CHECKSUM | OPERATOR | START TIME | REASON |
+-----------------+----------+------------+--------+
+-----------------+----------+------------+--------+

Forwarding Your Machine's Local Port to Temporal FrontEnd

You can also expose your instance's front end port on your local machine:

$ kubectl port-forward services/temporaltest-frontend-headless 7233:7233
Forwarding from 127.0.0.1:7233 -> 7233
Forwarding from [::1]:7233 -> 7233

and, from a separate window, use the local port to access the service from your application or Temporal samples.
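For example, if you have tctl installed locally (it also ships in the admin-tools container), it should be able to talk to the forwarded port; the --address flag below is standard tctl usage, shown here as a sketch:

# run against the port-forwarded frontend from your local machine
tctl --address 127.0.0.1:7233 namespace list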

Forwarding Your Machine's Local Port to Temporal Web UI

Similarly to how you accessed the Temporal frontend via Kubernetes port forwarding, you can access your Temporal instance's web user interface.

To do so, forward your machine's local port to the Web service in your Temporal installation:

$ kubectl port-forward services/temporaltest-web 8080:8080
Forwarding from 127.0.0.1:8080 -> 8080
Forwarding from [::1]:8080 -> 8080

and navigate to http://127.0.0.1:8080 in your browser.

Exploring Metrics via Grafana

By default, the full "Batteries Included" configuration comes with a few Grafana dashboards.

To access those dashboards, follow these steps:

  1. Extract Grafana's admin password from your installation:
$ kubectl get secret --namespace default temporaltest-grafana -o jsonpath="{.data.admin-password}" | base64 --decode

t7EqZQpiB6BztZV321dEDppXbeisdpiEAMgnu6yy%
  2. Set up port forwarding so you can access Grafana from your host:
$ kubectl port-forward services/temporaltest-grafana 8081:80
Forwarding from 127.0.0.1:8081 -> 3000
Forwarding from [::1]:8081 -> 3000
...
  3. Navigate to the forwarded Grafana port in your browser (http://localhost:8081/), log in as admin (using the password from step 1), and click the "Home" button (upper left corner) to see the available dashboards.

Updating Dynamic Configs

By default, dynamic config is empty. If you want to override some properties for your cluster, you should:

  1. Create a yaml file with your config (for example dc.yaml); a minimal sketch of such a file follows the install command below.
  2. Populate it with values under the server.dynamicConfig prefix (use the sample provided at values/values.dynamic_config.yaml as a starting point).
  3. Install your helm configuration:
helm install -f values/values.dynamic_config.yaml temporaltest . --timeout 900s
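As a sketch, entries live under server.dynamicConfig, each given as a list of value/constraints pairs. The property name below is only illustrative; use values/values.dynamic_config.yaml for real examples.

server:
  dynamicConfig:
    # illustrative property name; replace with the setting you actually want to override
    frontend.enableClientVersionCheck:
      - value: true
        constraints: {}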

Note that if you already have a running cluster you can use the "helm upgrade" command to change dynamic config values:

helm upgrade -f values/values.dynamic_config.yaml temporaltest . --timeout 900s

WARNING: The "helm upgrade" approach will trigger a rolling upgrade of all the pods.

If a rolling upgrade is not desirable, you can also generate the ConfigMap file explicitly and then apply it using the following command:

kubectl apply -f dynamicconfigmap.yaml

You can use helm upgrade with the "--dry-run" option to generate the content for the dynamicconfigmap.yaml.
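A possible sequence (a sketch; the exact name of the rendered ConfigMap depends on your release name and chart version):

# render the manifests without applying them
helm upgrade -f values/values.dynamic_config.yaml temporaltest . --dry-run > rendered.yaml

# copy the dynamic-config ConfigMap document from rendered.yaml into dynamicconfigmap.yaml,
# then apply only that object
kubectl apply -f dynamicconfigmap.yaml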

The dynamic-config ConfigMap is referenced as a mounted volume within the Temporal containers, so any applied change will be automatically picked up by all pods within a few minutes without the need for pod recycling. See the Kubernetes documentation (https://kubernetes.io/docs/tasks/configure-pod-container/configure-pod-configmap/#mounted-configmaps-are-updated-automatically) for more details on how this works.

Updating Temporal Web Config

The config file server/config.yml for the Temporal Web UI is referenced as a mounted volume within the Temporal Web UI container. It can be populated by setting values in the web.config section of values.yaml; see https://github.com/temporalio/web#configuring-authentication-optional for the available options.
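As a rough sketch only: the auth-related fields below are assumptions meant to illustrate the shape of a web.config override; the authoritative schema is the temporalio/web documentation linked above.

web:
  config:
    auth:
      enabled: true                      # turn on authentication for the Web UI
      providers:                         # assumed OIDC provider fields; see the temporalio/web docs
        - label: "sso"
          type: oidc
          issuer: https://accounts.example.com      # placeholder issuer
          client_id: "your-client-id"
          client_secret: "your-client-secret"
          scope: "openid profile email"
          callback_base_uri: "http://localhost:8080"   # placeholder callback base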

Uninstalling

Note: in this example chart, uninstalling a Temporal instance also removes all the data that might have been created during its lifetime.

helm uninstall temporaltest

Upgrading

To upgrade your cluster, upgrade your database schema (if the release includes schema changes), and then use the helm upgrade command to perform a rolling upgrade of your installation.

Note:

  • Not supported: running newer binaries with an older schema.
  • Supported: downgrading binaries – running older binaries with a newer schema.

Example:

Upgrade Schema

Here are examples of commands you can use to upgrade the "default" schema in your "bring your own" Cassandra database.

Upgrade default schema:

temporal_v1.2.1 $ temporal-cassandra-tool \
   --tls \
   --tls-ca-file ... \
   --user cassandra-user \
   --password cassandra-password \
   --endpoint cassandra.example.com \
   --keyspace temporal \
   --timeout 120 \
   update \
   --schema-dir ./schema/cassandra/temporal/versioned

To upgrade your MySQL database, use temporal-sql-tool instead of temporal-cassandra-tool.
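For example, reusing the environment variables from the MySQL setup section above, a schema upgrade looks roughly like this (a sketch; check temporal-sql-tool --help for the flags available in your version):

# in https://github.com/temporalio/temporal git repo dir
export SQL_PLUGIN=mysql8
export SQL_HOST=mysql_host
export SQL_PORT=3306
export SQL_USER=mysql_user
export SQL_PASSWORD=mysql_password

SQL_DATABASE=temporal ./temporal-sql-tool update -schema-dir schema/mysql/v8/temporal/versioned
SQL_DATABASE=temporal_visibility ./temporal-sql-tool update -schema-dir schema/mysql/v8/visibility/versioned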

Upgrade Temporal Instance's Docker Images

Here is an example of a helm upgrade command that can be used to upgrade a cluster:

helm \
    upgrade \
    temporaltest \
    -f values/values.cassandra.yaml \
    -f values/values.elasticsearch.yaml \
    --set elasticsearch.enabled=true \
    --set server.replicaCount=8 \
    --set server.config.persistence.default.cassandra.hosts='{c1.example.com,c2.example.com,c3.example.com}' \
    --set server.config.persistence.default.cassandra.user=cassandra-user \
    --set server.config.persistence.default.cassandra.password=cassandra-password \
    --set server.config.persistence.default.cassandra.tls.caData=... \
    --set server.config.persistence.default.cassandra.tls.enabled=true \
    --set server.config.persistence.default.cassandra.replicationFactor=3 \
    --set server.config.persistence.default.cassandra.keyspace=temporal \
    --set server.image.tag=1.2.1 \
    --set server.image.repository=temporalio/server \
    --set admintools.image.tag=1.2.1 \
    --set admintools.image.repository=temporalio/admin-tools \
    --set web.image.tag=1.1.1 \
    --set web.image.repository=temporalio/web \
    . \
    --wait \
    --timeout 15m

Acknowledgements

Many thanks to Banzai Cloud whose Cadence Helm Charts heavily inspired this work.



helm-charts's Issues

Admintools has an invalid TEMPORAL_CLI_ADDRESS when deployed in a non-default namespace

Describe the bug

Error Details: rpc error: code = Unavailable desc = connection error: desc = "transport: Error while dialing dial tcp: lookup workflow-frontend on 10.0.0.10:53: no such host"

Upon launching a fresh deployment in a non-default namespace, it appears my admintools service lacks a valid TEMPORAL_CLI_ADDRESS envvar. The one set does not match the services launched. I don't know the full effects of the issue but I know when I exec into the container, I need to manually set the env var to a valid hostname.

To Reproduce
Steps to reproduce the behavior:

  1. Launch a chart into a non-default namespace with the temporal chart as a dependency.
    helm install workflow -n workflow ./ --create-namespace
  2. Exec into the admintools pod
    kubectl exec -it -n workflow services/workflow-temporal-admintools /bin/bash
  3. List your namespaces
    tctl namespace list
  4. This should fail with a no such host.

Expected behavior
A list of declared namespaces.

Screenshots/Terminal output

% kubectl exec -it -nworkflow services/workflow-temporal-admintools /bin/bash
kubectl exec [POD] [COMMAND] is DEPRECATED and will be removed in a future version. Use kubectl exec [POD] -- [COMMAND] instead.
bash-5.0# tctl namespace list
Error: Error when list namespaces info
Error Details: rpc error: code = Unavailable desc = connection error: desc = "transport: Error while dialing dial tcp: lookup workflow-frontend on 10.0.0.10:53: no such host"
('export TEMPORAL_CLI_SHOW_STACKS=1' to see stack traces)

# export TEMPORAL_CLI_ADDRESS=workflow-temporal-frontend:7233

# tctl namespace list
Name: temporal-system
...
...

Versions (please complete the following information where relevant):

  • OS: Linux
  • Temporal Version: 1.7.0
  • are you using Docker or Kubernetes or building Temporal from source? Kubernetes

Additional context
The web deployment follows a different pattern:

https://github.com/iamjohnnym/helm-charts/blob/master/templates/web-deployment.yaml#L38

[Bug] Wrong ES template creation

Describe the bug
In v1.10.5 the path of the ES index template is schema/elasticsearch/v7/visibility/index_template.json and not schema/elasticsearch/visibility/index_template_v7.json, therefore the temporal-es-index-setup does not set up ES correctly.

Moreover, the default index name temporal_visibility_v1_dev in the values gives a wrong hint because the index template wouldn't match the index since the index pattern has the following values:

"index_patterns": [
    "temporal-visibility-*"
  ],

To Reproduce
Steps to reproduce the behavior:
Just perform a clean deployment from the current master (eac55bf)

Expected behavior
Create the right index template

Screenshots/Terminal output

Warning: Couldn't read data from file 
Warning: "schema/elasticsearch/visibility/index_template_v7.json", this makes 
Warning: an empty POST.
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100   163  100   163    0     0   4657      0 --:--:-- --:--:-- --:--:--  4657
{"error":{"root_cause":[{"type":"parse_exception","reason":"request body is required"}],"type":"parse_exception","reason":"request body is required"},"status":400}  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100    85  100    85    0     0    310      0 --:--:-- --:--:-- --:--:--   310
..."shards_acknowledged":true,"index":"temporal_visibility_v1_dev"}

Custom annotation results in unable to decode error

Hello,

When I am doing a helm install and trying to specify a custom annotation using something like this:

--set server.frontend.service.annotations.service\.beta\.kubernetes\.io/azure-load-balancer-internal=true
It returns with this error:

Error: unable to build kubernetes objects from release manifest: unable to decode "": resource.metadataOnlyObject.ObjectMeta: v1.ObjectMeta.Annotations: ReadString: expects " or n, but found t, error found in #10 byte of ...|nternal":true},"labe|..., bigger context ...|beta.kubernetes.io/azure-load-balancer-internal":true},"labels":{"app.kubernetes.io/component":"fron|...
helm.go:84: [debug] unable to decode "": resource.metadataOnlyObject.ObjectMeta: v1.ObjectMeta.Annotations: ReadString: expects " or n, but found t, error found in #10 byte of ...|nternal":true},"labe|..., bigger context ...|beta.kubernetes.io/azure-load-balancer-internal":true},"labels":{"app.kubernetes.io/component":"fron|...
unable to build kubernetes objects from release manifest

Even when adding double quotes, escaped or not, the result is something funky like this:

service.beta.kubernetes.io/azure-load-balancer-internal: '"true"'

These are all the parameters I am passing. It only complains about the custom annotation I am trying to set.

helm install \
          --debug \
          --dry-run \
          -n temporal \
          -f values/values.cassandra.yaml \
          --set prometheus.enabled=false \
          --set grafana.enabled=false \
          --set elasticsearch.enabled=false \
          --set kafka.enabled=false \
          --set server.replicaCount=1 \
          --set server.frontend.service.annotations.service\.beta\.kubernetes\.io/azure-load-balancer-internal=true \
          --set server.image.repository=temporalio/server \
          --set admintools.image.repository=temporalio/admin-tools \
          --set web.image.repository=temporalio/web \
          --set server.config.persistence.default.cassandra.hosts=cas-temporal.db.westus.test.azure.com \
          --set server.config.persistence.default.cassandra.user=temporalpe \
          --set server.config.persistence.default.cassandra.password=temporalpe \
          --set server.config.persistence.visibility.cassandra.hosts=cas-temporal.db.westus.test.azure.com \
          --set server.config.persistence.visibility.cassandra.user=temporalpe \
          --set server.config.persistence.visibility.cassandra.password=temporalpe \
          --set web.ingress.enabled=true \
          --set server.frontend.service.annotations.service\.beta\.kubernetes\.io/azure-load-balancer-internal="true" \
          --set web.ingress.hosts={temporal.svc.westus.test.azure.com} \
          --set web.ingress.annotations\.kubernetes\.io/ingress\.class="nginx" \
          --set web.ingress.annotations\.nginx\.org/mergeable-ingress-type="minion" \
          --set server.frontend.service.type=LoadBalancer" \
          temporal . --timeout 15m

web-ingress.yaml tls and rules indentation

In web-ingress.yaml, the tls and rules keys are at the same indentation level as the spec key, which causes the install to fail.

Error: unable to build kubernetes objects from release manifest: error validating "": error validating data: [ValidationError(Ingress): unknown field "rules" in io.k8s.api.networking.v1beta1.Ingress, ValidationError(Ingress): unknown field "tls" in io.k8s.api.networking.v1beta1.Ingress]

kindly change the indentation.

[Bug] Duplicated entry error when deploy temporal using helm

Describe the bug
Temporal Server fails to start.

To Reproduce
The configuration is provided in Additional context below.

Expected behavior
Temporal Server starts successfully.

Screenshots/Terminal output
Unable to start server. Error: unable to initialize system namespace: unable to register system namespace: CreateNamespace operation failed. Failed to commit transaction. Error: Error 1062: Duplicate entry '54321-2�hxr@��c��Y�j�' for key 'PRIMARY'

Versions (please complete the following information where relevant):

  • OS: Linux
  • Temporal Version v1.9.2
  • Using Helm to deploy Temporal. Helm Version: v1.9.2

Additional context

First, I initialized the databases using temporal-sql-tool version v1.9.2.

The Database Version is TiDB v4.0.x.

Several yaml configs are rewritten.

values/values.elasticsearch.yaml

elasticsearch:
  enabled: false
  external: true
  host: "xxx"
  port: "xxx"
  version: "v6"
  scheme: "http"
  logLevel: "info"

values/values.mysql.yaml

server:
  config:
    persistence:
      default:
        driver: "sql"
 
 
        sql:
          driver: "mysql"
          host: xxx
          port: 3306
          database: temporal
          user: root
          password: xxx
          maxConns: 20
          maxConnLifetime: "1h"
 
      visibility:
        driver: "sql"
 
        sql:
          driver: "mysql"
          host: xxx
          port: 3306
          database: temporal_visibility
          user: root
          password: xxx
          maxConns: 20
          maxConnLifetime: "1h"
 
cassandra:
  enabled: false
 
mysql:
  enabled: true
 
postgresql:
  enabled: false
 
schema:
  setup:
    enabled: false
  update:
    enabled: false

install cmd:

helm install -f values/values.elasticsearch.yaml -f values/values.mysql.yaml temporaltest \
    --set prometheus.enabled=false \
    --set grafana.enabled=false . --timeout 900s

[Feature Request] Split out dependencies from chart and create separate dev chart with dependencies

Currently parts of our helm chart are production ready and parts are development only. This makes it difficult to communicate clearly about the status of the chart.

To fix this we would like to shift this repo to house two charts: one production chart ("temporal") for only the Temporal services, and another for development ("temporal-development-deps") that is more like the current chart but consumes the to-be-created Temporal-service-only chart as a dependency alongside the other dependencies.

After doing this we would also like to set up real helm releases for our charts.

'helm template' command results in a newline character being stripped

Bug
Running the helm template command with helm v2 can result in whitespace being removed, invalidating the helm chart.

The output of the range templates in server-service.yaml and server-deployment.yaml results in:

---apiVersion: v1
kind: Service

instead of

---
apiVersion: v1
kind: Service

See this issue with Cadence template for more detail: helm/helm#7149

To Reproduce

  1. Using any helm 2.x
  2. Run helm template

Expected behavior
Helm should not strip whitespace.

Screenshots/Terminal output

Versions (please complete the following information where relevant):

  • OS: mac
  • Temporal Version 1.9.2
  • using Kubernetes and helm v2

Additional context

I think this could be fixed by changing:

{{- range $service := (list "frontend" "history" "matching" "worker") }}
{{- $serviceValues := index $.Values.server $service -}}

to

{{- range $service := (list "frontend" "history" "matching" "worker") }}
{{- $serviceValues := index $.Values.server $service }}

which will not remove whitespace.

I'm not sure if this will have any unintended impact

-temporal- sometimes injected into pod/svc names

The first time I installed this I ended up having longer names for pods/svcs than indicated in the docs:

$ k get pods
NAME                                                     READY   STATUS    RESTARTS   AGE
elasticsearch-master-0                                   1/1     Running   0          12m
elasticsearch-master-1                                   1/1     Running   0          12m
elasticsearch-master-2                                   1/1     Running   0          12m
temporlicious-cassandra-0                                1/1     Running   0          12m
temporlicious-cassandra-1                                1/1     Running   0          10m
temporlicious-cassandra-2                                1/1     Running   0          7m43s
temporlicious-grafana-8684f55d85-cspgp                   1/1     Running   0          12m
temporlicious-kafka-0                                    1/1     Running   5          12m
temporlicious-kube-state-metrics-689fdb76cc-6cv7n        1/1     Running   0          12m
temporlicious-prometheus-alertmanager-69fb7f4f6d-twwcw   2/2     Running   0          12m
temporlicious-prometheus-pushgateway-6dd8fdbbbc-mk64k    1/1     Running   0          12m
temporlicious-prometheus-server-564fbc54d9-4w6xq         2/2     Running   0          12m
temporlicious-temporal-admintools-5df8cdbd55-rt69b       1/1     Running   0          12m
temporlicious-temporal-frontend-5687b9f84c-rjdls         1/1     Running   0          12m
temporlicious-temporal-history-66b88465c6-9gst5          1/1     Running   4          12m
temporlicious-temporal-matching-7745757866-cfhms         1/1     Running   0          12m
temporlicious-temporal-web-85cf5cfdcc-77rr9              1/1     Running   0          12m
temporlicious-temporal-worker-57965bc4dc-bnwtx           1/1     Running   4          12m
temporlicious-zookeeper-0                                1/1     Running   0          12m

This also had the side effect of tctl in the admintools pod not having the correct address for the frontend (it was looking for temporlicious-frontend rather than temporlicious-temporal-frontend).

I thought this was because I installed into the default namespace, and when I tried installing into a temporal namespace the names were as expected (without the -temporal- labels in the middle), but then when I tried installing in default again I couldn't repro this issue.

[Feature Request] Approaching Official PostgreSQL 9.6 EOL Date

Is your feature request related to a problem? Please describe.

PostgreSQL 9.6 has an EOL date of November 11, 2021. Officially, the Temporal platform seems to support only PostgreSQL v9.6.

https://docs.temporal.io/docs/server/versions-and-dependencies/#persistence
https://www.postgresql.org/support/versioning/

Describe the solution you'd like

Official verbiage that more recent versions have been tested and are deemed validated as a supported database.

Additional context

PostgreSQL EOL
Temporal Supported Databases

Aurora MySQL fails deployment - Unknown system variable 'transaction_isolation'

DB: Aurora MySQL 5.7.12, Serverless
Temporal: v1.3.0

Deployment fails when connecting to Aurora MySQL due to version <5.7.20:

Error:

{"level":"info","ts":"2020-11-15T13:47:53.291Z","msg":"Starting server for services","value":"[worker]","logging-call-at":"server.go:109
"}
Unable to start server: sql schema version compatibility check failed: unable to create SQL connection: Error 1193: Unknown system variable 'transaction_isolation'.

From MySQL 8.0 Release Notes:

The tx_isolation and tx_read_only system variables have been removed. Use transaction_isolation and transaction_read_only instead.

From MySQL 5.7 Release Notes: https://dev.mysql.com/doc/refman/5.7/en/added-deprecated-removed.html#optvars-deprecated

tx_isolation: Default transaction isolation level. Deprecated as of MySQL 5.7.20.

Found similar issue addressed by Cadence here: banzaicloud/banzai-charts@8fbf828

Fix docker config permissions

From slack:

Hi, we tested Cadence in an OpenShift environment; we used the Banzai Cloud Helm chart for Cadence (only supported with Kubernetes). In fact it doesn't work on OpenShift, but a small change in the Dockerfile makes it work under OpenShift. Could you add this to your Dockerfile? (FROM ubercadence/server:0.11.0
RUN chmod 775 /etc/cadence/config). This chmod is not a problem with Kubernetes and it will work on OpenShift. It's a win-win change, I think.

Missing SQL store support

I have been trying to deploy Temporal.io with MySQL store support. I see the rest of the chart is heavily inspired by banzaicloud's cadence chart, but it seems MySQL support has been explicitly removed from the chart while Temporal itself supports it.

Is there a specific reason?

I have made changes to this chart to support it for my use for now. If the Temporal team/community is interested, I can create a PR.

Make sure logger is configured correctly by default

If the server config is missing the log section, then the server sets the output stream to stderr.
Users need to explicitly configure the logger by providing the following section in the config:

log:
  stdout: true
  level: info

Relevant code which sets the logger.

ClusterMetadata: Default cluster potentially uses wrong frontend RPC address

We're working on standing up the temporal service via helm and I noticed this while I was configuring the various yaml files. If a user configures a custom gRPC port for the frontend service, then the hardcoded default of 7933 will be incorrect.

rpcAddress: "127.0.0.1:7933"

It also seems that the localhost address 127.0.0.1 would be incorrect in a deployed environment, assuming that the various services (history, matching, frontend, worker) are deployed separately.

https://github.com/temporalio/temporal/blob/e2e26004552cbc0867afb342238bb3f9efeee6ce/client/clientBean.go#L87-L96

not enough master nodes discovered during pinging, elastic search

I just did a fresh install using your helm chart


org.elasticsearch.cluster.ClusterStateObserver$ContextPreservingListener.onTimeout(ClusterStateObserver.java:322) [elasticsearch-6.8.8.jar:6.8.8] 
	at org.elasticsearch.cluster.ClusterStateObserver$ObserverClusterStateListener.onTimeout(ClusterStateObserver.java:249) [elasticsearch-6.8.8.jar:6.8.8] 
	at org.elasticsearch.cluster.service.ClusterApplierService$NotifyTimeout.run(ClusterApplierService.java:564) [elasticsearch-6.8.8.jar:6.8.8] 
	at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingRunnable.run(ThreadContext.java:681) [elasticsearch-6.8.8.jar:6.8.8] 
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1130) [?:?] 
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:630) [?:?] 
	at java.lang.Thread.run(Thread.java:832) [?:?] 
[2021-01-18T22:44:42,691][WARN ][o.e.d.z.ZenDiscovery     ] [elasticsearch-master-0] not enough master nodes discovered during pinging (found [[Candidate{node={elasticsearch-master-0}{fT0h0QSLRviF288vMzreMQ}{JIIqoPB_SD2J6Ugj3Cqu6w}{10.10.2.16}{10.10.2.16:9300}{ml.machine_memory=2147483648, xpack.installed=true, ml.max_open_jobs=20, ml.enabled=true}, clusterStateVersion=-1}]], but needed [2]), pinging again 

Any Suggestions? Thank you

Add support for Kubernetes sidecar

This is a feature request from a customer: adding support for Kubernetes sidecars (e.g. a SQL proxy).

Seems like sidecar pattern is best according to docs (https://cloud.google.com/sql/docs/postgres/connect-kubernetes-engine).
It seems like supporting generic sidecarContainers for temporal would be useful in the helm-charts anyways like prometheus does for server.sidecarContainers (https://github.com/helm/charts/tree/master/stable/prometheus#configuration).
Then you could extend it like in this comment: https://stackoverflow.com/a/62910122

Connection with password protected ElasticSearch

Describe the bug
Getting an issue while connecting to an external Elasticsearch that expects a username and password. Connecting the Temporal server to an external Elasticsearch that does not require a password works fine, but trying with a password-protected one gives an error. The error occurs when a request to Elasticsearch is made without the username and password.

To Reproduce
Steps to reproduce the behavior:

  1. Install Elasticsearch and enable authentication
  2. Set the Elasticsearch variables properly

Expected behavior
The Temporal server should interact with Elasticsearch by supplying the username and password.

Screenshots/Terminal output
{"error":{"root_cause":[{"type":"security_exception","reason":"missing authentication credentials for REST request [/_template/temporal-visibility-template]","header":{"WWW-Authenticate":"Basic realm=\"security\" charset=\"UTF-8\""}}],"type":"security_exception","reason":"missing authentication credentials for REST request [/_template/temporal-visibility-template]","header":{"WWW-Authenticate":"Basic realm=\"security\" charset=\"UTF-8\""}},"status":401}{"error":{"root_cause":[{"type":"security_exception","reason":"missing authentication credentials for REST request [/prod-temporal-visibility]","header":{"WWW-Authenticate":"Basic realm=\"security\" charset=\"UTF-8\""}}],"type":"security_exception","reason":"missing authentication credentials for REST request [/prod-temporal-visibility]","header":{"WWW-Authenticate":"Basic realm=\"security\" charset=\"UTF-8\""}},"status":401}{"level":"info","ts":"2021-06-08T13:18:36.909Z","msg":"Updated dynamic config","logging-call-at":"file_based_client.go:262"}

Versions (please complete the following information where relevant):

  • OS: Linux
  • Docker Server Version:- 1.9.2
  • Using helm-chart to start Temporal server

Additional context

The username and password environment variables required by the Helm chart are also getting set properly in the containers, verified by echoing them.

Do not assume a specific installation name (`temporaltest`)

Reported by a customer:

https://temporalio.slack.com/archives/CTRCR8RBP/p1588095830424300

I’m installing the current helm chart into a kube cluster with all defaults and have run into the following problem:

{"level":"fatal","ts":"2020-04-28T17:00:33.140Z","msg":"Creating visibility producer failed","service":"history","error":"kafka: client has run out of available brokers to talk to (Is your cluster reachable?)","logging-call-at":"service.go:380","stacktrace":"github.com/temporalio/temporal/common/log/loggerimpl.(*loggerImpl).Fatal\n\t/temporal/common/log/loggerimpl/logger.go:140\ngithub.com/temporalio/temporal/service/history.NewService.func1\n\t/temporal/service/history/service.go:380\ngithub.com/temporalio/temporal/common/resource.New\n\t/temporal/common/resource/resourceImpl.go:211\ngithub.com/temporalio/temporal/service/history.NewService\n\t/temporal/service/history/service.go:393\ngithub.com/temporalio/temporal/cmd/server/temporal.(*server).startService\n\t/temporal/cmd/server/temporal/server.go:234\ngithub.com/temporalio/temporal/cmd/server/temporal.(*server).Start\n\t/temporal/cmd/server/temporal/server.go:79\ngithub.com/temporalio/temporal/cmd/server/temporal.startHandler\n\t/temporal/cmd/server/temporal/temporal.go:87\ngithub.com/temporalio/temporal/cmd/server/temporal.BuildCLI.func1\n\t/temporal/cmd/server/temporal/temporal.go:207\ngithub.com/urfave/cli.HandleAction\n\t/go/pkg/mod/github.com/urfave/[email protected]/app.go:492\ngithub.com/urfave/cli.Command.Run\n\t/go/pkg/mod/github.com/urfave/[email protected]/command.go:210\ngithub.com/urfave/cli.(*App).Run\n\t/go/pkg/mod/github.com/urfave/[email protected]/app.go:255\nmain.main\n\t/temporal/cmd/server/main.go:34\nruntime.main\n\t/usr/local/go/src/runtime/proc.go:203"}

AFAICT, the kafka pod is online, am i missing something in the docs?

(Thank you, Joseph!)

It turns out, our helm configs are relying on a specific helm deployment name (temporaltest) 🤦‍♂️:

https://github.com/temporalio/temporal-helm-charts/blob/93be1064017acd32f40e84e433c7b9edc58a0b1a/templates/server-configmap.yaml#L118

Please remove this dependency, so the system does not have to be named temporaltest 🤦‍♂️

Provide out of the box support for dynamic config

Temporal provides a file-based implementation to drive the dynamic config experience for the server.

Need better support, or at least documentation of best practices, on how to configure this using the Helm charts.

SchemaUpdate: Kube spec only supports cassandra for schema update job

As the title says, the schema update job rendering is broken. As-is, if a user is not using Cassandra, the update job will attempt to start the Temporal server since the container receives no args.

{{- if eq (include "temporal.persistence.driver" (list $ $store)) "cassandra" }}
# args: ["temporal-cassandra-tool", "update-schema", "-d", "/etc/temporal/schema/cassandra/{{ include "temporal.persistence.schema" $store }}/versioned"]
args: ['sh', '-c', 'temporal-cassandra-tool update-schema -d /etc/temporal/schema/cassandra/{{ include "temporal.persistence.schema" $store }}/versioned']
{{- end }}

versus:

args: ["temporal-{{ include "temporal.persistence.driver" (list $ $store) }}-tool", "setup-schema", "-v", "0.0"]

Postgres TLS

Is your feature request related to a problem? Please describe.

There doesn't seem to be a way to configure TLS options for Postgres via the helm chart. The underlying server supports it so just seems like it needs to be surfaced in values.

Describe the solution you'd like

Ability to configure Postgres TLS options via Temporal Helm Chart.

[Feature Request] Create keyspaces for BYO Cassandra

I'm installing this chart with my own Cassandra but I'd still love it to automatically create keyspaces and do all the initialization as if the Cassandra was built-in. The need to build and manually run temporal-cassandra-tool disrupts my automation workflow.

[Bug] Easy way to use chart with Postgresql

Describe the bug
I was trying to disable Cassandra and use the chart with PostgreSQL easily. It looks like it cannot be done because of the server-job template.

To Reproduce
Steps to reproduce the behavior:

  1. Download 1.9.1 release of the chart
  2. Try to install with
helm install \
    --set server.replicaCount=1 \
    --set cassandra.enabled=false \
    --set elasticsearch.enabled=false \
    --set kafka.enabled=false \
    --set postgresql.enabled=true \
    temporaltest . --timeout 15m
  3. You get: Error: execution error at (temporal/templates/server-job.yaml:92:24): Please specify cassandra port for default store

Expected behavior
Temporal starts without cassandra but with postgresql.

Versions (please complete the following information where relevant):

  • OS: Mac
  • Temporal Version 1.9.1
  • are you using Docker or Kubernetes or building Temporal from source?

Additional context

Missing TLS config for Cassandra and support for TLS Cert

Hopefully Temporal ported over the Cassandra + CQL updates to allow Cassandra over TLS support. Currently we are maintaining our own fork of the old Cadence helm charts to add support for a configmap TLS CA cert and TLS server config to load said CA cert and use TLS to connect to Cassandra. Not sure if I have time to port them over + test them. Quick dump of some changes. These are kinda hacky but they work:

--- a/_infra/charts/cadence/templates/server-deployment.yaml
+++ b/_infra/charts/cadence/templates/server-deployment.yaml
@@ -102,12 +102,22 @@ spec:
             - name: config
               mountPath: /etc/cadence/config/config_template.yaml
               subPath: config_template.yaml
+            {{- if $.Values.tls.enabled }}
+            - name: certs
+              mountPath: /tlsfiles/caCert.pem
+              subPath: caCert.pem
+            {{- end}}
           resources:
             {{- toYaml (default $.Values.server.resources $serviceValues.resources) | nindent 12 }}
       volumes:
         - name: config
           configMap:
             name: {{ include "cadence.fullname" $ }}
+        {{- if $.Values.tls.enabled }}
+        - name: certs
+          configMap:
+            name: {{ include "cadence.fullname" $ }}-tlsfiles
+        {{- end}}
       {{- with (default $.Values.server.nodeSelector $serviceValues.nodeSelector) }}
       nodeSelector:
         {{- toYaml . | nindent 8 }}
diff --git a/_infra/charts/cadence/templates/tls-configmap.yaml b/_infra/charts/cadence/templates/tls-configmap.yaml
new file mode 100644
index 0000000..bf8f49b
--- /dev/null
+++ b/_infra/charts/cadence/templates/tls-configmap.yaml
@@ -0,0 +1,16 @@
+{{- if .Values.tls.enabled }}
+apiVersion: v1
+kind: ConfigMap
+metadata:
+  name: {{ include "cadence.fullname" . }}-tlsfiles
+  labels:
+    app.kubernetes.io/name: {{ include "cadence.name" . }}
+    helm.sh/chart: {{ include "cadence.chart" . }}
+    app.kubernetes.io/managed-by: {{ .Release.Service }}
+    app.kubernetes.io/instance: {{ .Release.Name }}
+    app.kubernetes.io/version: {{ .Chart.AppVersion | replace "+" "_" }}
+    app.kubernetes.io/part-of: {{ .Chart.Name }}
+data:
+  caCert.pem: |
+{{ .Files.Get (printf "%s" .Values.tls.caCert) | indent 4 }}
+{{- end }}
\ No newline at end of file

Example Usage:

diff --git a/_infra/charts/cadence/values-staging.yaml b/_infra/charts/cadence/values-staging.yaml
index 26b78f5..d132bb3 100644
--- a/_infra/charts/cadence/values-staging.yaml
+++ b/_infra/charts/cadence/values-staging.yaml
@@ -19,6 +19,9 @@ server:
           keyspace: cadence001
           user: "cadence001"
           existingSecret: "cadence001-default-store"
+          tls:
+            enabled: true
+            caFile: "/tlsfiles/caCert.pem"
       visibility:
         driver: "cassandra"
         cassandra:
@@ -27,6 +30,9 @@ server:
           keyspace: cadence001_visibility
           user: "cadence001_visibility"
           existingSecret: "cadence001-visibility-store"
+          tls:
+            enabled: true
+            caFile: "/tlsfiles/caCert.pem"
 schema:
   setup:
     enabled: false
@@ -36,3 +42,7 @@ cassandra:
   enabled: false
 mysql:
   enabled: false
+
+tls:
+  enabled: true
+  caCert: "staging-cert.pem"

[Bug] Docker images not available

Describe the bug

Pods can not download docker images.

To Reproduce
Steps to reproduce the behavior:

  1. helm install \
--set server.replicaCount=1 \
--set cassandra.config.cluster_size=1 \
--set prometheus.enabled=false \
--set grafana.enabled=false \
--set elasticsearch.enabled=false \
--set kafka.enabled=false \
temporaltest . --timeout 15m

Expected behavior
To see all pods up and Running

Screenshots/Terminal output

kubectl get pods
NAME READY STATUS RESTARTS AGE
temporaltest-admintools-7d58dc8455-9zzfp 0/1 ContainerCreating 0 37s
temporaltest-cassandra-0 0/1 ErrImagePull 0 37s
temporaltest-frontend-57d9458c7c-ntwvs 0/1 Init:ImagePullBackOff 0 37s
temporaltest-history-79c944d586-m6nwj 0/1 Init:ImagePullBackOff 0 37s
temporaltest-matching-687d4bdd6c-8qbvx 0/1 Init:0/4 0 37s
temporaltest-schema-setup-rp5w4 0/2 Init:ErrImagePull 0 37s
temporaltest-web-6d47bff77d-dbl66 0/1 ImagePullBackOff 0 37s
temporaltest-worker-7fd7db64fc-gxg6s 0/1 Init:ImagePullBackOff 0 37s

kubectl describe pod temporaltest-cassandra-0
Events:
Type Reason Age From Message


Normal Scheduled 54s default-scheduler Successfully assigned default/temporaltest-cassandra-0 to kind-worker2
Warning Failed 26s kubelet Failed to pull image "cassandra:3.11.3": rpc error: code = Unknown desc = failed to pull and unpack image "docker.io/library/cassandra:3.11.3": failedto copy: httpReaderSeeker: failed open: unexpected status code https://registry-1.docker.io/v2/library/cassandra/manifests/sha256:ce85468c5badfa2e0a04ae6825eee9421b42d9b12d1a781c0dd154f70d1ca288: 429 Too Many Requests - Server message: toomanyrequests: You have reached your pull rate limit. You may increase the limit by authenticating and upgrading: https://www.docker.com/increase-rate-limit
Warning Failed 26s kubelet Error: ErrImagePull
Normal BackOff 25s kubelet Back-off pulling image "cassandra:3.11.3"
Warning Failed 25s kubelet Error: ImagePullBackOff
Normal Pulling 11s (x2 over 53s) kubelet Pulling image "cassandra:3.11.3"

Though I can download the cassandra using same docker account as above:

docker pull docker.io/library/cassandra:3.11.3
3.11.3: Pulling from library/cassandra
6ae821421a7d: Pull complete
a0fef69a7a19: Pull complete
6849fd6936d9: Pull complete
832b4e4feae8: Pull complete
12e36f0fa0d9: Pull complete
625655f45ec7: Pull complete
c4392f7e9b96: Pull complete
4f6f85e6e245: Pull complete
e60258d103eb: Pull complete
30a7210918ab: Pull complete
Digest: sha256:ce85468c5badfa2e0a04ae6825eee9421b42d9b12d1a781c0dd154f70d1ca288
Status: Downloaded newer image for cassandra:3.11.3
docker.io/library/cassandra:3.11.3

Versions (please complete the following information where relevant):

  • OS: Linux
  • Temporal Version : 1.8.2
  • are you using Docker or Kubernetes or building Temporal from source? Running on kind

Additional context

[Bug] Unable to disable elastic authentication

Describe the bug
If an external Elasticsearch is used, you cannot disable user authentication, and because of that the init containers will fail during the ES checks (e.g. with the AWS ES service).

To Reproduce
Steps to reproduce the behavior:

  1. Deploy with external ES without username, password
  2. Check check-elasticsearch-index init container in server deployment
  3. See error

Expected behavior
Add the possibility to continue without a username and password.

Archival feature is not being set to Enabled

Necessary changes that should enable archival on cluster level #63
There is still something missing in the PR

Repro:

  1. Install temporal service
helm install temporaltest . --timeout 900s
  2. Create a namespace with archival enabled
kubectl exec -it services/temporaltest-admintools -- bash -c "tctl --ns nstest n register -has enabled -vas enabled"
  3. Check whether archival status is enabled
kubectl exec -it services/temporaltest-admintools -- bash -c "tctl --ns nstest namespace describe"

Expected:
Archivals are enabled
Actual:
Archivals are disabled

Kibana version incompatible with Elasticsearch version

The Kibana pod fails to come up:

The error logs from running kubectl -n temporal logs pod/temporal-kibana-f95df4f85-cb26q:

{"type":"log","@timestamp":"2020-07-10T16:02:03Z","tags":["info","plugins-service"],"pid":8,"message":"Plugin \"case\" is disabled."}
{"type":"log","@timestamp":"2020-07-10T16:02:33Z","tags":["info","plugins-system"],"pid":8,"message":"Setting up [37] plugins: [taskManager,siem,licensing,infra,encryptedSavedObjects,code,usageCollection,metrics,canvas,timelion,features,security,apm_oss,translations,reporting,uiActions,data,navigation,status_page,share,newsfeed,kibana_legacy,management,dev_tools,inspector,expressions,visualizations,embeddable,advancedUiActions,dashboard_embeddable_container,home,spaces,cloud,apm,graph,eui_utils,bfetch]"}
{"type":"log","@timestamp":"2020-07-10T16:02:33Z","tags":["info","plugins","taskManager"],"pid":8,"message":"Setting up plugin"}
{"type":"log","@timestamp":"2020-07-10T16:02:33Z","tags":["info","plugins","siem"],"pid":8,"message":"Setting up plugin"}
{"type":"log","@timestamp":"2020-07-10T16:02:33Z","tags":["info","plugins","licensing"],"pid":8,"message":"Setting up plugin"}
{"type":"log","@timestamp":"2020-07-10T16:02:33Z","tags":["info","plugins","infra"],"pid":8,"message":"Setting up plugin"}
{"type":"log","@timestamp":"2020-07-10T16:02:33Z","tags":["info","plugins","encryptedSavedObjects"],"pid":8,"message":"Setting up plugin"}
{"type":"log","@timestamp":"2020-07-10T16:02:33Z","tags":["warning","plugins","encryptedSavedObjects","config"],"pid":8,"message":"Generating a random key for xpack.encryptedSavedObjects.encryptionKey. To be able to decrypt encrypted saved objects attributes after restart, please set xpack.encryptedSavedObjects.encryptionKey in kibana.yml"}
{"type":"log","@timestamp":"2020-07-10T16:02:33Z","tags":["info","plugins","code"],"pid":8,"message":"Setting up plugin"}
{"type":"log","@timestamp":"2020-07-10T16:02:33Z","tags":["info","plugins","usageCollection"],"pid":8,"message":"Setting up plugin"}
{"type":"log","@timestamp":"2020-07-10T16:02:33Z","tags":["info","plugins","metrics"],"pid":8,"message":"Setting up plugin"}
{"type":"log","@timestamp":"2020-07-10T16:02:33Z","tags":["info","plugins","canvas"],"pid":8,"message":"Setting up plugin"}
{"type":"log","@timestamp":"2020-07-10T16:02:33Z","tags":["info","plugins","timelion"],"pid":8,"message":"Setting up plugin"}
{"type":"log","@timestamp":"2020-07-10T16:02:33Z","tags":["info","plugins","features"],"pid":8,"message":"Setting up plugin"}
{"type":"log","@timestamp":"2020-07-10T16:02:33Z","tags":["info","plugins","security"],"pid":8,"message":"Setting up plugin"}
{"type":"log","@timestamp":"2020-07-10T16:02:33Z","tags":["warning","plugins","security","config"],"pid":8,"message":"Generating a random key for xpack.security.encryptionKey. To prevent sessions from being invalidated on restart, please set xpack.security.encryptionKey in kibana.yml"}
{"type":"log","@timestamp":"2020-07-10T16:02:33Z","tags":["warning","plugins","security","config"],"pid":8,"message":"Session cookies will be transmitted over insecure connections. This is not recommended."}
{"type":"log","@timestamp":"2020-07-10T16:02:33Z","tags":["info","plugins","apm_oss"],"pid":8,"message":"Setting up plugin"}
{"type":"log","@timestamp":"2020-07-10T16:02:33Z","tags":["info","plugins","translations"],"pid":8,"message":"Setting up plugin"}
{"type":"log","@timestamp":"2020-07-10T16:02:33Z","tags":["info","plugins","data"],"pid":8,"message":"Setting up plugin"}
{"type":"log","@timestamp":"2020-07-10T16:02:33Z","tags":["info","plugins","share"],"pid":8,"message":"Setting up plugin"}
{"type":"log","@timestamp":"2020-07-10T16:02:33Z","tags":["info","plugins","home"],"pid":8,"message":"Setting up plugin"}
{"type":"log","@timestamp":"2020-07-10T16:02:33Z","tags":["info","plugins","spaces"],"pid":8,"message":"Setting up plugin"}
{"type":"log","@timestamp":"2020-07-10T16:02:33Z","tags":["info","plugins","cloud"],"pid":8,"message":"Setting up plugin"}
{"type":"log","@timestamp":"2020-07-10T16:02:33Z","tags":["info","plugins","apm"],"pid":8,"message":"Setting up plugin"}
{"type":"log","@timestamp":"2020-07-10T16:02:33Z","tags":["info","plugins","graph"],"pid":8,"message":"Setting up plugin"}
{"type":"log","@timestamp":"2020-07-10T16:02:33Z","tags":["info","plugins","bfetch"],"pid":8,"message":"Setting up plugin"}
{"type":"log","@timestamp":"2020-07-10T16:02:33Z","tags":["info","savedobjects-service"],"pid":8,"message":"Waiting until all Elasticsearch nodes are compatible with Kibana before starting saved objects migrations..."}
{"type":"log","@timestamp":"2020-07-10T16:02:34Z","tags":["error","elasticsearch","data"],"pid":8,"message":"Request error, retrying\nGET http://elasticsearch-master:9200/_xpack => connect ECONNREFUSED 10.43.81.26:9200"}
{"type":"log","@timestamp":"2020-07-10T16:02:34Z","tags":["error","elasticsearch","admin"],"pid":8,"message":"Request error, retrying\nGET http://elasticsearch-master:9200/_nodes?filter_path=nodes.*.version%2Cnodes.*.http.publish_address%2Cnodes.*.ip => connect ECONNREFUSED 10.43.81.26:9200"}
{"type":"log","@timestamp":"2020-07-10T16:02:34Z","tags":["error","elasticsearch","data"],"pid":8,"message":"Request error, retrying\nHEAD http://elasticsearch-master:9200/.apm-agent-configuration => connect ECONNREFUSED 10.43.81.26:9200"}
{"type":"log","@timestamp":"2020-07-10T16:02:35Z","tags":["warning","elasticsearch","data"],"pid":8,"message":"Unable to revive connection: http://elasticsearch-master:9200/"}
{"type":"log","@timestamp":"2020-07-10T16:02:35Z","tags":["warning","elasticsearch","data"],"pid":8,"message":"No living connections"}
Could not create APM Agent configuration: No Living connections
{"type":"log","@timestamp":"2020-07-10T16:02:37Z","tags":["warning","elasticsearch","data"],"pid":8,"message":"Unable to revive connection: http://elasticsearch-master:9200/"}
{"type":"log","@timestamp":"2020-07-10T16:02:37Z","tags":["warning","elasticsearch","data"],"pid":8,"message":"No living connections"}
{"type":"log","@timestamp":"2020-07-10T16:02:37Z","tags":["warning","plugins","licensing"],"pid":8,"message":"License information could not be obtained from Elasticsearch due to Error: No Living connections error"}
{"type":"log","@timestamp":"2020-07-10T16:02:37Z","tags":["warning","elasticsearch","admin"],"pid":8,"message":"Unable to revive connection: http://elasticsearch-master:9200/"}
{"type":"log","@timestamp":"2020-07-10T16:02:37Z","tags":["warning","elasticsearch","admin"],"pid":8,"message":"No living connections"}
{"type":"log","@timestamp":"2020-07-10T16:02:37Z","tags":["error","elasticsearch-service"],"pid":8,"message":"Unable to retrieve version information from Elasticsearch nodes."}
{"type":"log","@timestamp":"2020-07-10T16:02:39Z","tags":["warning","elasticsearch","admin"],"pid":8,"message":"Unable to revive connection: http://elasticsearch-master:9200/"}
{"type":"log","@timestamp":"2020-07-10T16:02:39Z","tags":["warning","elasticsearch","admin"],"pid":8,"message":"No living connections"}
{"type":"log","@timestamp":"2020-07-10T16:02:42Z","tags":["warning","elasticsearch","admin"],"pid":8,"message":"Unable to revive connection: http://elasticsearch-master:9200/"}
{"type":"log","@timestamp":"2020-07-10T16:02:42Z","tags":["warning","elasticsearch","admin"],"pid":8,"message":"No living connections"}
{"type":"log","@timestamp":"2020-07-10T16:02:44Z","tags":["warning","elasticsearch","admin"],"pid":8,"message":"Unable to revive connection: http://elasticsearch-master:9200/"}
{"type":"log","@timestamp":"2020-07-10T16:02:44Z","tags":["warning","elasticsearch","admin"],"pid":8,"message":"No living connections"}
{"type":"log","@timestamp":"2020-07-10T16:02:47Z","tags":["warning","elasticsearch","admin"],"pid":8,"message":"Unable to revive connection: http://elasticsearch-master:9200/"}
{"type":"log","@timestamp":"2020-07-10T16:02:47Z","tags":["warning","elasticsearch","admin"],"pid":8,"message":"No living connections"}
{"type":"log","@timestamp":"2020-07-10T16:02:49Z","tags":["error","elasticsearch-service"],"pid":8,"message":"This version of Kibana (v7.6.1) is incompatible with the following Elasticsearch nodes in your cluster: v6.8.8 @ 10.42.0.196:9200 (10.42.0.196), v6.8.8 @ 10.42.0.198:9200 (10.42.0.198), v6.8.8 @ 10.42.0.192:9200 (10.42.0.192)"}

Most notably:

This version of Kibana (v7.6.1) is incompatible with the following Elasticsearch nodes in your cluster: v6.8.8 @ 10.42.0.196:9200 (10.42.0.196), v6.8.8 @ 10.42.0.198:9200 (10.42.0.198), v6.8.8 @ 10.42.0.192:9200 (10.42.0.192)

(Using default helm chart)
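One possible workaround (a sketch under assumptions: that the bundled subchart is the Elastic Kibana chart and honors an imageTag override, and that kibana.enabled is the subchart toggle — both should be checked against this chart's requirements.yaml and values.yaml) is to pin Kibana to the same version as the bundled Elasticsearch nodes, or to disable Kibana entirely:

# pin Kibana to match the v6.8.8 Elasticsearch nodes (illustrative tag)
helm install temporaltest . --set kibana.imageTag=6.8.8 --timeout 15m

# ...or skip Kibana altogether
helm install temporaltest . --set kibana.enabled=false --timeout 15m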

[Bug] temporal-web pods stuck in ImagePullBackoff

Describe the bug
When installed with the default web.image.tag of 1.10.1, the temporal-web pods fail to start (ImagePullBackoff).
It appears that the 1.10.1 tag does not yet exist for the temporalio/web image (tags).

To Reproduce
Steps to reproduce the behavior:

  1. Install this chart as described in the installation instructions.
  2. Observe that the deployed temporal-web pods remain in the ImagePullBackoff status.

Expected behavior
All pods deployed as a result of the chart installation eventually reach a Running status.

Screenshots/Terminal output

$ kubectl get pods -n temporal                                  
NAME                                  READY   STATUS             RESTARTS   AGE
temporal-admintools-8b848bc98-wsvzv   1/1     Running            0          4m38s
temporal-cassandra-0                  1/1     Running            0          4m38s
temporal-cassandra-1                  1/1     Running            0          2m51s
temporal-cassandra-2                  0/1     Running            0          65s
temporal-frontend-cbf6c8767-n97qw     1/1     Running            3          4m38s
temporal-history-6cd9cb7676-4jfjd     1/1     Running            3          4m38s
temporal-matching-658b7464cf-wjd84    1/1     Running            3          4m38s
temporal-web-569975fff5-k7cxr         0/1     ImagePullBackOff   0          4m38s
temporal-worker-5c9bc769ff-d9zzq      1/1     Running            3          4m38s
$ kubectl describe pod temporal-web-569975fff5-k7cxr -n temporal
Name:         temporal-web-569975fff5-k7cxr
Namespace:    temporal
Priority:     0
Node:         <elided>
Start Time:   Fri, 09 Jul 2021 14:48:03 -0500
Labels:       app.kubernetes.io/component=web
              app.kubernetes.io/instance=temporal
              app.kubernetes.io/managed-by=Helm
              app.kubernetes.io/name=temporal
              app.kubernetes.io/part-of=temporal
              app.kubernetes.io/version=1.10.1
              helm.sh/chart=temporal-0.10.1
              pod-template-hash=569975fff5
Annotations:  <none>
Status:       Pending
IP:           <elided>
IPs:
  IP:           <elided>
Controlled By:  ReplicaSet/temporal-web-569975fff5
Containers:
  temporal-web:
    Container ID:   
    Image:          temporalio/web:1.10.1
    Image ID:       
    Port:           8088/TCP
    Host Port:      0/TCP
    State:          Waiting
      Reason:       ImagePullBackOff
    Ready:          False
    Restart Count:  0
    Environment:
      TEMPORAL_GRPC_ENDPOINT:  temporal-frontend.temporal.svc:7233
    Mounts:
      /var/run/secrets/kubernetes.io/serviceaccount from default-token-6hf98 (ro)
Conditions:
  Type              Status
  Initialized       True 
  Ready             False 
  ContainersReady   False 
  PodScheduled      True 
Volumes:
  default-token-6hf98:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  default-token-6hf98
    Optional:    false
QoS Class:       BestEffort
Node-Selectors:  <none>
Tolerations:     node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
                 node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
  Type     Reason     Age                  From               Message
  ----     ------     ----                 ----               -------
  Normal   Scheduled  11m                  default-scheduler  Successfully assigned temporal/temporal-web-569975fff5-k7cxr to <elided>
  Normal   Pulling    9m53s (x4 over 11m)  kubelet            Pulling image "temporalio/web:1.10.1"
  Warning  Failed     9m53s (x4 over 11m)  kubelet            Failed to pull image "temporalio/web:1.10.1": rpc error: code = NotFound desc = failed to pull and unpack image "docker.io/temporalio/web:1.10.1": failed to resolve reference "docker.io/temporalio/web:1.10.1": docker.io/temporalio/web:1.10.1: not found
  Warning  Failed     9m53s (x4 over 11m)  kubelet            Error: ErrImagePull
  Normal   BackOff    9m27s (x7 over 11m)  kubelet            Back-off pulling image "temporalio/web:1.10.1"
  Warning  Failed     82s (x42 over 11m)   kubelet            Error: ImagePullBackOff

Additional context
Until temporalio/web:1.10.1 is available, users can set web.image.tag to 1.10.0 at install/upgrade time.
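For example (a minimal sketch; web.image.tag is the only relevant override, and the other flags follow the usual install instructions):

# fresh install
helm install temporaltest . --set web.image.tag=1.10.0 --timeout 15m

# or for an existing release
helm upgrade temporaltest . --reuse-values --set web.image.tag=1.10.0 --timeout 15m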

Deprecation warnings for ClusterRole / ClusterRoleBinding / Ingress

Hi, I'm running Kubernetes 1.19 with RKE, and when I install the chart, these deprecation warnings appear:

W1208 21:07:40.172608  495469 warnings.go:67] networking.k8s.io/v1beta1 Ingress is deprecated in v1.19+, unavailable in v1.22+; use networking.k8s.io/v1 Ingress
W1208 21:07:40.589121  495469 warnings.go:67] rbac.authorization.k8s.io/v1beta1 ClusterRoleBinding is deprecated in v1.17+, unavailable in v1.22+; use rbac.authorization.k8s.io/v1 ClusterRoleBinding
W1208 21:07:40.644555  495469 warnings.go:67] rbac.authorization.k8s.io/v1beta1 ClusterRole is deprecated in v1.17+, unavailable in v1.22+; use rbac.authorization.k8s.io/v1 ClusterRole

Kubernetes 1.22 is scheduled for release in mid-2021.

[Bug] service init containers not finishing due to waiting for default keyspace to become ready

Describe the bug
Installing the helm chart as follows:

  • elasticsearch disabled
  • mysql disabled
  • prometheus disabled
  • grafana enabled
  • cassandra enabled
  • cassandra persistence disabled

Got an error in many of the services' (frontend, history, matching, etc.) init containers when installing:

waiting for default keyspace to become ready

Apparently, the default keyspace only becomes ready once this Job has run: https://github.com/temporalio/helm-charts/blob/master/templates/server-job.yaml

The job has the following annotations:

  annotations:
    {{- if .Values.cassandra.enabled }}
    "helm.sh/hook": post-install
    {{- else }}
    "helm.sh/hook": pre-install
    {{- end }}

...it seems that the post-install hook doesn't execute until the pods are ready, which never happens because they are waiting for this job to run.

The job already has init containers that are waiting for cassandra to come up, so I'm not sure the install hooks are necessary.
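A sketch of the change this suggests (an assumption about a possible fix, not a confirmed one): drop the hook annotations entirely so the job is created alongside the other resources, letting its own init containers handle the wait for Cassandra:

  # no helm.sh/hook annotations: the job is installed like any other resource,
  # and its init containers wait for Cassandra themselves before running the
  # schema setup
  annotations: {}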

To Reproduce
Steps to reproduce the behavior:

  1. create a temporal namespace
  2. use the following values.yaml
server:
  enabled: true
  replicaCount: 1
  config:
    persistence:
      default:
        driver: "cassandra"
        cassandra:
          hosts: ["temporal-cassandra.temporal.svc.cluster.local"]
          # port: 9042
          keyspace: "temporal"
          user: "user"
          password: "password"
          existingSecret: ""
          replicationFactor: 1
          consistency:
            default:
              consistency: "local_quorum"
              serialConsistency: "local_serial"
      visibility:
        driver: "cassandra"

        cassandra:
          hosts: ["temporal-cassandra.temporal.svc.cluster.local"]
          keyspace: "temporal_visibility"
          user: "user"
          password: "password"
          existingSecret: ""
          replicationFactor: 1
          consistency:
            default:
              consistency: "local_quorum"
              serialConsistency: "local_serial"
  frontend:
    replicaCount: 1

  history:
    replicaCount: 1

  matching:
    replicaCount: 1

  worker:
    replicaCount: 1

admintools:
  enabled: true
web:
  enabled: true
  replicaCount: 1
schema:
  setup:
    enabled: true
    backoffLimit: 100
  update:
    enabled: true
    backoffLimit: 100
elasticsearch:
  enabled: false
prometheus:
  enabled: false
grafana:
  enabled: false
cassandra:
  enabled: true
  persistence:
    enabled: false
  config:
    cluster_size: 3
    ports:
      cql: 9042
    num_tokens: 4
    max_heap_size: 512M
    heap_new_size: 128M
    seed_size: 0
  env:
    CASSANDRA_PASSWORD: password
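
For reference, installing with that file might look like the following (a minimal sketch; the release name and namespace are illustrative):

helm install temporaltest . -f values.yaml --namespace temporal --timeout 15m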

Expected behavior
The init containers complete and all server pods eventually reach a Running status.

Screenshots/Terminal output
Logging the check-cassandra-temporal-schema init container of the frontend service during the deployment yields:

waiting for default keyspace to become ready

Versions (please complete the following information where relevant):

  • OS / Kubernetes version:
Client Version: version.Info{Major:"1", Minor:"19", GitVersion:"v1.19.3", GitCommit:"1e11e4a2108024935ecfcb2912226cedeafd99df", GitTreeState:"clean", BuildDate:"2020-10-14T12:50:19Z", GoVersion:"go1.15.2", Compiler:"gc", Platform:"darwin/amd64"}
Server Version: version.Info{Major:"1", Minor:"17+", GitVersion:"v1.17.17-eks-c5067d", GitCommit:"c5067dd1eb324e934de1f5bd4c593b3cddc19d88", GitTreeState:"clean", BuildDate:"2021-03-05T23:39:01Z", GoVersion:"go1.13.15", Compiler:"gc", Platform:"linux/amd64"}
  • Temporal Version
    • Chart version 0.9.3
  • are you using Docker or Kubernetes or building Temporal from source?
    • not building from source
