
tezos-on-gke's Introduction

midl-website

Repository for midl.dev website source code

tezos-on-gke's People

Contributors

denver-s, hodl-dot-farm, nicolasochem, oksanaprotsukha

tezos-on-gke's Issues

Upgrade from v1.0

Hello, is there a guide on how to safely upgrade from v1.0?

I am mostly concerned about the v1.0 signer which is probably not compatible with the v1.1 cloud baker.

Do I need to upgrade one of the two signers, then set up a v1.1 cloud baker connected to the upgraded signer, and then stop the v1.0 baker (to avoid double baking) once I see that everything is working fine?

Endorser is stuck in a loop

The endorser got stuck after my PVCs ran out of space. I had to clean up the public/private node PVCs and restart all the pods. After restarting, the endorser was stuck in a loop while accessing client-pvc. Logs are attached.
endorser_stuck_in_loop.log

Public nodes crash when restarted

When the public node pods are restarted (i.e. scaled to 0), the following error occurs:
Aug 30 00:40:52 - node.main: (chain = TEZOS_ALPHANET_CARTHAGE_2019-11-28T13:02:13Z)
Aug 30 00:40:52 - node.main: disabled local peer discovery
Aug 30 00:40:52 - node.main: read identity file (peer_id = idrWXra7g5mHLrZk9Jn1JFfHnkc4ff)
Aug 30 00:40:52 - main: shell-node initialization: bootstrapping
Aug 30 00:40:52 - main: shell-node initialization: p2p_maintain_started
Aug 30 00:40:52 - block_validator_process_external: Initialized
Aug 30 00:40:52 - block_validator_process_external: Block validator started on pid 29
tezos-node: Error:
Cannot switch from history mode 'full' to 'rolling'. In order to change your history mode please refer to the Tezos node documentation.

Add Telegram alerts

I see that it's possible to set up a Slack bot for alerts, which is great.

I was thinking about doing the same but using a Telegram bot. They are very easy to create and use.

Private node warning "p2p.maintenance: Too few connections (1)" [SOLVED]

For some reason, private-node failed to connect to public-node-0.

To fix this problem I connected to the gcloud shell:

gcloud container clusters get-credentials blockchain --region us-central1 --project <PROJECT_ID> && kubectl exec xtz-tezos-private-baking-node-mynode-<POD_ID> -c tezos-private-node --namespace tezos -it -- /bin/sh

Checked the p2p stat for the node:

tezos-admin-client -A xtz-tezos-private-baking-node-mynode p2p stat (-A is not used since version v8.0)
tezos-admin-client -E http://xtz-tezos-private-baking-node-mynode:8732 p2p stat

GLOBAL STATS
  ↗ 43.58 MiB (24 B/s) ↘ 75.58 MiB (114 B/s)
CONNECTIONS
  ↗ idsxxxxxxxxxxxx 10.104.3.14:9732 (TEZOS_MAINNET.0 (p2p: 1)) 
KNOWN PEERS
  ⚏  1 idqyyyyyyyyyyy ↗ 0 B (0 B/s) ↘ 0 B (0 B/s)  
  ⚌  1 idsxxxxxxxxxxxx ↗ 358.83 kiB (24 B/s) ↘ 868.79 kiB (114 B/s)  
KNOWN POINTS
  ⚌  10.104.3.14:9732 idsxxxxxxxxxxxx 
  ⚏  x.y.z.k:9732
  ⚏  z.y.z.k:9732
  ⚏  j.y.z.k:9732
  ⚏  t.y.z.k:9732

So, the private node is connected to public-node-1 (10.104.3.14:9732) and the unconnected node is public-node-0 (10.104.2.14:9732).

tezos-admin-client -A xtz-tezos-private-baking-node-mynode connect address 10.104.2.14:9732 (-A is not used since version v8.0)

tezos-admin-client -E http://xtz-tezos-private-baking-node-mynode:8732 trust address 10.104.2.14:9732
tezos-admin-client -E http://xtz-tezos-private-baking-node-mynode:8732 connect address 10.104.2.14:9732

Now the p2p stat shows:

GLOBAL STATS
  ↗ 43.60 MiB (22 B/s) ↘ 75.62 MiB (28 B/s)
CONNECTIONS
  ↗ idqyyyyyyyyyyy 10.104.2.14:9732 (TEZOS_MAINNET.0 (p2p: 1)) 
  ↗ idsxxxxxxxxxxxx 10.104.3.14:9732 (TEZOS_MAINNET.0 (p2p: 1)) 
KNOWN PEERS
  ⚌  1 idqyyyyyyyyyyy ↗ 9.77 kiB (12 B/s) ↘ 10.90 kiB (17 B/s)  
  ⚌  1 idsxxxxxxxxxxxx ↗ 374.26 kiB (7 B/s) ↘ 906.24 kiB (9 B/s)

Hope it can help others!
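
The manual fix above can be wrapped in a small helper. This is only a minimal sketch, assuming the v8.0+ -E endpoint syntax quoted in this issue; reconnect_peer and the TEZOS_ADMIN_CLIENT override are hypothetical names introduced for illustration, and the subcommands are the ones used in the report.

```shell
# Binary to invoke; overridable for dry runs or testing (hypothetical variable).
TEZOS_ADMIN_CLIENT="${TEZOS_ADMIN_CLIENT:-tezos-admin-client}"

# reconnect_peer NODE_RPC_ENDPOINT PEER_ADDRESS
# Marks the peer as trusted, asks the node to connect to it,
# then prints the p2p statistics so the new connection can be checked.
reconnect_peer() (
  endpoint="$1"
  peer="$2"
  "$TEZOS_ADMIN_CLIENT" -E "$endpoint" trust address "$peer" &&
  "$TEZOS_ADMIN_CLIENT" -E "$endpoint" connect address "$peer" &&
  "$TEZOS_ADMIN_CLIENT" -E "$endpoint" p2p stat
)
```

With the values from this report, the call would be: reconnect_peer http://xtz-tezos-private-baking-node-mynode:8732 10.104.2.14:9732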

Error when deploying with terraform

Hi, I'm having the following error when deploying docker image:

local_file.k8s_kustomization: Creation complete after 0s [id=0728e4ecd9dbc70fc144a0e57fdc92a56293af0d]
null_resource.push_containers: Provisioning with 'local-exec'...
null_resource.push_containers (local-exec): Executing: ["/bin/sh" "-c" "\nset -x\n\nbuild_container () {\n set -x\n cd $1\n container=$(basename $1)\n cp Dockerfile.template Dockerfile\n sed -i "s/((tezos_sentry_version))/latest-release/" Dockerfile\n sed -i "s/((tezos_private_version))/latest-release/" Dockerfile\n cat << EOY > cloudbuild.yaml\nsteps:\n- name: 'gcr.io/cloud-builders/docker'\n args: ['build', '-t', "gcr.io/my-tezos-project/$container:latest", '.']\nimages: ["gcr.io/my-tezos-project/$container:latest"]\nEOY\n gcloud builds submit --project my-tezos-project --config cloudbuild.yaml .\n rm -v Dockerfile\n rm cloudbuild.yaml\n}\nexport -f build_container\nfind ./../docker -mindepth 1 -type d -exec bash -c 'build_container "$0"' {} \; -printf '%f\n'\n"]
module.terraform-gke-blockchain.google_container_node_pool.blockchain_cluster_node_pool: Creating...
null_resource.push_containers (local-exec): + export -f build_container
null_resource.push_containers (local-exec): /bin/sh: 21: export: Illegal option -f

module.terraform-gke-blockchain.google_container_node_pool.blockchain_cluster_node_pool: Still creating... [10s elapsed]
module.terraform-gke-blockchain.google_container_node_pool.blockchain_cluster_node_pool: Still creating... [20s elapsed]
module.terraform-gke-blockchain.google_container_node_pool.blockchain_cluster_node_pool: Still creating... [30s elapsed]
module.terraform-gke-blockchain.google_container_node_pool.blockchain_cluster_node_pool: Still creating... [40s elapsed]
module.terraform-gke-blockchain.google_container_node_pool.blockchain_cluster_node_pool: Still creating... [50s elapsed]
module.terraform-gke-blockchain.google_container_node_pool.blockchain_cluster_node_pool: Still creating... [1m0s elapsed]
module.terraform-gke-blockchain.google_container_node_pool.blockchain_cluster_node_pool: Still creating... [1m10s elapsed]
module.terraform-gke-blockchain.google_container_node_pool.blockchain_cluster_node_pool: Still creating... [1m20s elapsed]
module.terraform-gke-blockchain.google_container_node_pool.blockchain_cluster_node_pool: Still creating... [1m30s elapsed]
module.terraform-gke-blockchain.google_container_node_pool.blockchain_cluster_node_pool: Still creating... [1m40s elapsed]
module.terraform-gke-blockchain.google_container_node_pool.blockchain_cluster_node_pool: Still creating... [1m50s elapsed]
module.terraform-gke-blockchain.google_container_node_pool.blockchain_cluster_node_pool: Still creating... [2m0s elapsed]
module.terraform-gke-blockchain.google_container_node_pool.blockchain_cluster_node_pool: Creation complete after 2m5s [id=projects/my-tezos-project/locations/us-central1/clusters/blockchain/nodePools/tzbaker-pool]

Error: Error running command '
set -x

build_container () {
set -x
cd $1
container=$(basename $1)
cp Dockerfile.template Dockerfile
sed -i "s/((tezos_sentry_version))/latest-release/" Dockerfile
sed -i "s/((tezos_private_version))/latest-release/" Dockerfile
cat << EOY > cloudbuild.yaml
steps:
- name: 'gcr.io/cloud-builders/docker'
  args: ['build', '-t', "gcr.io/my-tezos-project/$container:latest", '.']
images: ["gcr.io/my-tezos-project/$container:latest"]
EOY
gcloud builds submit --project my-tezos-project --config cloudbuild.yaml .
rm -v Dockerfile
rm cloudbuild.yaml
}
export -f build_container
find ./../docker -mindepth 1 -type d -exec bash -c 'build_container "$0"' {} \; -printf '%f\n'
': exit status 2. Output: + export -f build_container
/bin/sh: 21: export: Illegal option -f
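
The "Illegal option -f" message has the form that dash prints, which suggests /bin/sh on this machine is not bash; export -f is a bash extension. One way around it, sketched here under that assumption, is to loop over the directories in the current shell so that no function export is needed. build_container is reduced to a placeholder and build_all is a hypothetical name; unlike the original find, this sketch only visits immediate subdirectories.

```shell
# build_container DIR
# Placeholder for the real build step; it only reports the container name.
build_container() (
  cd "$1" || exit 1
  container=$(basename "$1")
  echo "would build $container"
)

# build_all DOCKER_DIR
# Calls build_container for each immediate subdirectory, in the current
# shell, so the function never has to be exported to a child process.
build_all() {
  for dir in "$1"/*/; do
    [ -d "$dir" ] || continue
    build_container "${dir%/}"
  done
}
```

The script would then call build_all ./../docker in place of the export and find lines.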

pushd/popd: not found

null_resource.push_containers: Creating...
null_resource.push_containers: Provisioning with 'local-exec'...
null_resource.push_containers (local-exec): Executing: ["/bin/sh" "-c" "#!/bin/bash\ngcloud auth configure-docker --project \"mainnet0d64d979842ed401\"\n\nfind ./../docker -mindepth 1 -type d  -printf '%f\\n'| while read container; do\n  pushd ./../docker/$container\n  sed -e \"s/((tezos_network))/mainnet/\" Dockerfile.template > Dockerfile\n  tag=\"gcr.io/mainnet0d64d979842ed401/$container:latest\"\n  docker build -t $tag .\n  docker push $tag\n  rm -v Dockerfile\n  popd\ndone\n"]
kubernetes_secret.hot_wallet_private_key: Creating...
kubernetes_secret.website_builder_key: Creating...
kubernetes_secret.hot_wallet_private_key: Creation complete after 0s [id=default/hot-wallet]
kubernetes_secret.website_builder_key: Creation complete after 0s [id=default/website-builder-credentials]
null_resource.push_containers (local-exec): gcloud credential helpers already registered correctly.
null_resource.push_containers (local-exec): /bin/sh: 5: pushd: not found
null_resource.push_containers (local-exec): sed: can't read Dockerfile.template: No such file or directory
null_resource.push_containers (local-exec): time="2019-12-02T20:49:24+01:00" level=error msg="failed to dial gRPC: cannot connect to the Docker daemon. Is 'docker daemon' running on this host?: dial unix /var/run/docker.sock: connect: permission denied"
null_resource.push_containers (local-exec): error during connect: Post http://%2Fvar%2Frun%2Fdocker.sock/v1.40/build?buildargs=%7B%7D&cachefrom=%5B%5D&cgroupparent=&cpuperiod=0&cpuquota=0&cpusetcpus=&cpusetmems=&cpushares=0&dockerfile=Dockerfile&labels=%7B%7D&memory=0&memswap=0&networkmode=default&rm=1&session=izl2j9z7k52lur3lca081ijlk&shmsize=0&t=gcr.io%2Fmainnet0d64d979842ed401%2Ftezos-private-node-connectivity-checker%3Alatest&target=&ulimits=null&version=1: context canceled
null_resource.push_containers (local-exec): Got permission denied while trying to connect to the Docker daemon socket at unix:///var/run/docker.sock: Post http://%2Fvar%2Frun%2Fdocker.sock/v1.40/images/gcr.io/mainnet0d64d979842ed401/tezos-private-node-connectivity-checker/push?tag=latest: dial unix /var/run/docker.sock: connect: permission denied
null_resource.push_containers (local-exec): 'Dockerfile' removed
null_resource.push_containers (local-exec): /bin/sh: 11: popd: not found
null_resource.push_containers (local-exec): /bin/sh: 5: pushd: not found
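
pushd and popd are builtins of bash (and some other shells) but are not part of POSIX sh, which is why /bin/sh reports them as not found here even though the script starts with a bash shebang line. A minimal sketch of a portable alternative, assuming the loop body only needs to run inside each container directory, is to use a subshell; in_dir is a hypothetical helper name.

```shell
# in_dir DIR COMMAND [ARGS...]
# Runs COMMAND inside DIR using a subshell, so the caller's working
# directory is untouched and neither pushd nor popd is required.
in_dir() (
  cd "$1" || exit 1
  shift
  "$@"
)
```

Because the directory change fails loudly, a missing directory no longer lets the later sed and docker steps run in the wrong place. The Docker socket "permission denied" errors in the log are a separate problem that this sketch does not address.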

Pod restarting but "/dev/termination-log" is empty

Hey! Sorry to bother but I've got an issue.

Once a day the pods restart, for no reason apparently. They become inactive for 30 seconds which is terminationGracePeriodSeconds but I don't get why.

I checked the YAML and read terminationMessagePath: /dev/termination-log and terminationMessagePolicy: File but that file is empty.
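
When the termination log is empty, the cluster's own records may still say why a pod went away. The following is a hedged sketch of where one might look, not a diagnosis of this report; restart_clues and the KUBECTL override are hypothetical names, and the pod and namespace are supplied by the caller.

```shell
# Binary to invoke; overridable for dry runs or testing (hypothetical variable).
KUBECTL="${KUBECTL:-kubectl}"

# restart_clues POD NAMESPACE
restart_clues() (
  pod="$1"
  ns="$2"
  # Reason recorded for the previous termination of each container, if any.
  "$KUBECTL" get pod "$pod" -n "$ns" \
    -o jsonpath='{.status.containerStatuses[*].lastState.terminated.reason}'
  printf '\n'
  # Namespace events (evictions, kills, rescheduling), sorted by timestamp.
  "$KUBECTL" get events -n "$ns" --sort-by=.lastTimestamp
)
```

If the whole pod is replaced rather than a container restarted, the first command may print nothing, and the events list is the more likely place to find a cause.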

V3 documentation

I find it quite difficult to understand how to deploy this infrastructure.

Deploy a baker with remote signer https://tezos-docs.midl.dev/deploy-remote-signer.html

Quickstart link points to https://github.com/midl-dev/tezos-on-gke (why not to a page in the documentation website?)

Production hardening https://tezos-docs.midl.dev/production-readiness.html

Which one is to be followed?

If I want to follow the "production best practices", am I supposed to use https://github.com/midl-dev/terraform-gke-blockchain as described? What commands should I execute?

Problems with tezos-sidecar

I could not make it work with or without experimental_active_standby_mode.

With experimental_active_standby_mode = false I get this:

rpc error: code = Unknown desc = Error response from daemon: manifest for gcr.io/project-gcp/tezos-sidecar:tezos-latest not found: manifest unknown: Failed to fetch "tezos-latest" from request "/v2/project-gcp/tezos-sidecar/manifests/tezos-latest".: ErrImagePull

Error creating zone with Cloudflare because it already exists

I manually created a zone for the website (I thought it wasn't necessary!) on my Cloudflare and so it gives me this error.

Error: Error creating zone "[DOMAIN]": error from makeRequest: HTTP status 400: content "{\"success\":false,\"errors\":[{\"code\":1061,\"message\":\"[DOMAIN] already exists\"}],\"messages\":[],\"result\":null}"

  on cloudflare.tf line 12, in resource "cloudflare_zone" "tezos_baker_zone":
  12: resource "cloudflare_zone" "tezos_baker_zone" {

Terraform apply, "Error running command 'set -x"

It seems when running terraform apply from the /terraform dir that the resource "null_resource" "push_containers" in k8s.tf is causing the following error:

null_resource.push_containers: Creating...
null_resource.push_containers: Provisioning with 'local-exec'...
null_resource.push_containers (local-exec): Executing: ["/bin/bash" "-c" "set -x\n\nbuild_container () {\n  set -x\n  cd $1\n  container=$(basename $1)\n  cp Dockerfile.template Dockerfile\n  sed -i \"s/((tezos_sentry_version))/latest-release/\" Dockerfile\n  sed -i \"s/((tezos_private_version))/latest-release/\" Dockerfile\n  cat << EOY > cloudbuild.yaml\nsteps:\n- name: 'gcr.io/cloud-builders/docker'\n  args: ['build', '-t', \"gcr.io/solo-tezos-baker/$container:tezos-latest\", '.']\nimages: [\"gcr.io/solo-tezos-baker/$container:tezos-latest\"]\nEOY\n  gcloud builds submit --project solo-tezos-baker --config cloudbuild.yaml .\n  rm -v Dockerfile\n  rm cloudbuild.yaml\n}\nexport -f build_container\nfind ./../docker -mindepth 1 -maxdepth 1 -type d -exec bash -c 'build_container \"$0\"' {} \\; -printf '%f\\n'\n"]
null_resource.push_containers (local-exec): + export -f build_container
null_resource.push_containers (local-exec): + find ./../docker -mindepth 1 -maxdepth 1 -type d -exec bash -c 'build_container "$0"' '{}' ';' -printf '%f\n'
null_resource.push_containers (local-exec): find: -printf: unknown primary or operator


Error: Error running command 'set -x

build_container () {
  set -x
  cd $1
  container=$(basename $1)
  cp Dockerfile.template Dockerfile
  sed -i "s/((tezos_sentry_version))/latest-release/" Dockerfile
  sed -i "s/((tezos_private_version))/latest-release/" Dockerfile
  cat << EOY > cloudbuild.yaml
steps:
- name: 'gcr.io/cloud-builders/docker'
  args: ['build', '-t', "gcr.io/solo-tezos-baker/$container:tezos-latest", '.']
images: ["gcr.io/solo-tezos-baker/$container:tezos-latest"]
EOY
  gcloud builds submit --project solo-tezos-baker --config cloudbuild.yaml .
  rm -v Dockerfile
  rm cloudbuild.yaml
}
export -f build_container
find ./../docker -mindepth 1 -maxdepth 1 -type d -exec bash -c 'build_container "$0"' {} \; -printf '%f\n'
': exit status 1. Output: + export -f build_container
+ find ./../docker -mindepth 1 -maxdepth 1 -type d -exec bash -c 'build_container "$0"' '{}' ';' -printf '%f\n'
find: -printf: unknown primary or operator

I am simply running apply with a few variables set:

project="the-baker"
cluster_name="the-baker-cluster"
kubernetes_namespace="tezos"
baking_nodes = {
  mynode = {
    mybaker = {
      public_baking_key="edpku3Xg8pYtZP8era5amwhLoip8YGnwpNXT1GJ1xpCDtkK3Pf93xx"
      public_baking_key_hash="tz1NfwaFPNrQi3icf37Ybz7bbqDmVmarFWxx"
      ledger_authorized_path="ledger://four-animal-words-here/ed25519/0h/1h"
      authorized_signers= {
          ssh_pubkey="ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQClk976UOhyjMCfcvp5UxyhGM1Unh4lxGDlRCaZFrzBt7DkQRUBX+0EGSkyB7TI5bmGn5UW8OXSFpKJjFB6evZ8PJtUWy5T/TB2aSfMGpNrhDIJmLZ2hGOh7lxScvdjRhXd+plPBaTLoR/cdezSuYJs1OywHFYbqJ90msxHZilSXntTArTu9oCatChi9oNqcYbZM0BDJHMZWcTICtraIAt0b9uSzdPXT8UK/a32DkAMeU92x1xsGEPOOyaI9J4lFC7dpRaLmkYPEFjqRXyXHs2gS9P9VmsqIIRIM1+YUDCPZjWOq1hwxjfbHcF9jRMxG3U+D8S6i1567ZsY/W70cUxx baker@raspberrypi"
          signer_port="22"
          tunnel_endpoint_port="22"
      }
    }
  }
}

Curious if anyone else has run into this or what I may be doing wrong? Thanks!
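
The "unknown primary or operator" message is what BSD find (for example on macOS) prints for -printf, which is a GNU extension. A minimal sketch of a portable way to list the container directory names, assuming that is all the -printf '%f\n' part was for; list_containers is a hypothetical name.

```shell
# list_containers DOCKER_DIR
# Prints the name of each immediate subdirectory, one per line.
# -exec basename stands in for GNU find's -printf '%f\n'.
list_containers() {
  find "$1" -mindepth 1 -maxdepth 1 -type d -exec basename {} \; | sort
}
```

Installing GNU findutils would be another option; this sketch only avoids the dependency.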

Terraform warning messages

Hello, I wanted to test this (great) project but I found some warnings when executing the command terraform plan -out plan.out.

An example is:

Warning: Quoted references are deprecated

on website.tf line 47, in resource "google_compute_managed_ssl_certificate" "default":
47: provider = "google-beta"

In this context, references are expected literally rather than in quotes.
Terraform 0.11 and earlier required quotes, but quoted references are now
deprecated and will be removed in a future version of Terraform. Remove the
quotes surrounding this reference to silence this warning.

Also, can you explain what and how to "collect the necessary information and put it in terraform.tfvars"?
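
For the deprecation warning quoted above, the fix Terraform suggests is to write the provider reference without quotes (provider = google-beta). A small sketch for locating the remaining quoted references, assuming the .tf files sit directly in one directory; find_quoted_providers is a hypothetical name.

```shell
# find_quoted_providers DIR
# Lists lines in DIR/*.tf where the provider meta-argument is a quoted string.
find_quoted_providers() {
  grep -nE '^[[:space:]]*provider[[:space:]]*=[[:space:]]*"' "$1"/*.tf
}
```

Each reported line can then be edited by hand to drop the quotes.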

Error: googleapi: Error 400: Service account "[email protected]" does not exist., badRequest

terraform plan shows Plan: 28 to add, 0 to change, 0 to destroy.
After apply I get this error:
module.terraform-gke-blockchain.google_project_service.service[3]: Creation complete after 4s [id=tezos-308212/iam.googleapis.com]
module.terraform-gke-blockchain.google_compute_firewall.gke-master-to-kubelet: Still creating... [10s elapsed]
module.terraform-gke-blockchain.google_compute_firewall.gke-master-to-kubelet: Creation complete after 12s [id=projects/tezos-308212/global/firewalls/k8s-master-to-kubelets]
module.terraform-gke-blockchain.google_project_iam_member.service-account[0]: Still creating... [10s elapsed]
module.terraform-gke-blockchain.google_project_iam_member.service-account[2]: Still creating... [10s elapsed]
module.terraform-gke-blockchain.google_project_iam_member.service-account[3]: Still creating... [10s elapsed]
module.terraform-gke-blockchain.google_project_iam_member.service-account[1]: Still creating... [10s elapsed]
module.terraform-gke-blockchain.google_project_iam_member.service-account[3]: Creation complete after 11s [id=tezos-308212/roles/storage.objectViewer/serviceaccount:[email protected]]
module.terraform-gke-blockchain.google_project_iam_member.service-account[2]: Creation complete after 12s [id=tezos-308212/roles/monitoring.viewer/serviceaccount:[email protected]]
module.terraform-gke-blockchain.google_project_iam_member.service-account[1]: Creation complete after 13s [id=tezos-308212/roles/monitoring.metricWriter/serviceaccount:[email protected]]
module.terraform-gke-blockchain.google_project_iam_member.service-account[0]: Creation complete after 14s [id=tezos-308212/roles/logging.logWriter/serviceaccount:[email protected]]
module.terraform-gke-blockchain.google_container_cluster.blockchain_cluster: Creating...

Error: googleapi: Error 400: Service account "[email protected]" does not exist., badRequest

on .terraform/modules/terraform-gke-blockchain/gcp.tf line 176, in resource "google_container_cluster" "blockchain_cluster":
176: resource "google_container_cluster" "blockchain_cluster" {

Afterwards, terraform plan shows Plan: 9 to add, 0 to change, 0 to destroy.
Only part of the resources were created.
I don't have this service account [email protected].

Error 403: Compute Engine API has not been used in project [...] before or it is disabled.

Error: Error when reading or editing Subnetwork Not Found : default: googleapi: Error 403: Compute Engine API has not been used in project [xxxx] before or it is disabled. Enable it by visiting https://console.developers.google.com/apis/api/compute.googleapis.com/overview?project=[xxxx] then retry. If you enabled this API recently, wait a few minutes for the action to propagate to our systems and retry., accessNotConfigured
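
Besides the console link in the message, the API can be enabled from the command line. A minimal sketch, assuming gcloud is installed and authenticated for the project; enable_compute_api and the GCLOUD override are hypothetical names.

```shell
# Binary to invoke; overridable for dry runs or testing (hypothetical variable).
GCLOUD="${GCLOUD:-gcloud}"

# enable_compute_api PROJECT_ID
# Enables the Compute Engine API for the given project.
enable_compute_api() {
  "$GCLOUD" services enable compute.googleapis.com --project "$1"
}
```

As the error text notes, it can take a few minutes for the change to propagate before terraform apply succeeds.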

googleapi: Error creating GlobalAddress, ManagedSslCertificate and BackendBucket

Error: Error creating GlobalAddress: googleapi: Error 403: Project 1008....82 is not found and cannot be used for API calls. If it is recently created, enable Compute Engine API by visiting https://console.developers.google.com/apis/api/compute.googleapis.com/overview?project=1008....82 then retry. If you enabled this API recently, wait a few minutes for the action to propagate to our systems and retry., accessNotConfigured

  on website.tf line 1, in resource "google_compute_global_address" "default":
   1: resource "google_compute_global_address" "default" {
Error: Error creating ManagedSslCertificate: googleapi: Error 403: Project 1008....82 is not found and cannot be used for API calls. If it is recently created, enable Compute Engine API by visiting https://console.developers.google.com/apis/api/compute.googleapis.com/overview?project=1008....82 then retry. If you enabled this API recently, wait a few minutes for the action to propagate to our systems and retry., accessNotConfigured

  on website.tf line 46, in resource "google_compute_managed_ssl_certificate" "default":
  46: resource "google_compute_managed_ssl_certificate" "default" {
Error: Error creating BackendBucket: googleapi: Error 403: Project 1008....82 is not found and cannot be used for API calls. If it is recently created, enable Compute Engine API by visiting https://console.developers.google.com/apis/api/compute.googleapis.com/overview?project=1008....82 then retry. If you enabled this API recently, wait a few minutes for the action to propagate to our systems and retry., accessNotConfigured

sshd: no hostkeys available -- exiting

In variables.tf I did not set signer_target_host_key, as I did not understand what it is or how to generate it; furthermore, it seemed optional. My pod tezos-remote-signer-forwarder-mybaker-0 gives me this error:

[screenshot of the pod log; not preserved]

Improving installation guide & how-to use

Some questions that I personally find difficult to answer myself.

What is signer_target_host_key? Is it optional? How do I generate it?

Provide a complete example of baking_nodes in variables.tf, such as:

variable "baking_nodes" {
  type = map
  description = "Structured data related to baking, including public key and signer configuration"
  default = {
    mynode = {
      mybaker = {
        public_baking_key="edpku...."
        public_baking_key_hash="tz1....."
        ledger_authorized_path="ledger://bob-alice-bob-alice/ed25519/0h/0h"
        authorized_signers = [
          {
            ssh_pubkey = "ssh-rsa AAAAB3...."
            signer_port = 8443
            tunnel_endpoint_port = 58255
          }
        ]
      }
    }
  }
}

To be continued.

Unsupported argument username, password and identity_namespace

I am trying to create from scratch a simple public node based on this guide https://tezos-docs.midl.dev/deploy-public, but this error happens:

❯ terraform plan
╷
│ Error: Unsupported argument
│ 
│   on .terraform/modules/terraform-gke-blockchain/gcp.tf line 217, in resource "google_container_cluster" "blockchain_cluster":
│  217:     username = ""
│ 
│ An argument named "username" is not expected here.
╵
╷
│ Error: Unsupported argument
│ 
│   on .terraform/modules/terraform-gke-blockchain/gcp.tf line 218, in resource "google_container_cluster" "blockchain_cluster":
│  218:     password = ""
│ 
│ An argument named "password" is not expected here.
╵
╷
│ Error: Unsupported argument
│ 
│   on .terraform/modules/terraform-gke-blockchain/gcp.tf line 282, in resource "google_container_cluster" "blockchain_cluster":
│  282:     identity_namespace = "${data.google_project.blockchain_cluster.project_id}.svc.id.goog"
│ 
│ An argument named "identity_namespace" is not expected here.

I discussed this with @nicolasochem, who thinks it is a bug from a recent version of the Terraform Google provider.

Error when deploying cluster first time without authorized_signer

When deploying the cluster for the first time without authorized_signer, I got the following error:

Error: Error in function call: Call to function "templatefile" failed: ./../k8s/tezos-remote-signer-loadbalancer-tmpl/kustomization.yaml.tmpl:17,76-79: Invalid index; The given key does not identify an element in this collection value..

Note that instructions mention "Note that authorized_signers is empty for now." like below:

baking_nodes = {
mynode = {
mybaker = {
public_baking_key="tz1YmsrYxQFJo5nGj4MEaXMPdLrcRf2a5mAU"
ledger_authorized_path="ledger://my-four-key-words/ed25519/0h/1h",
authorized_signers : []

}

}
}

"no global Slack API URL set"

My pod alertmanager-monitoring-prom-kube-prome-alertmanager-0 is giving me this error:

Loading configuration file failed" file=/etc/alertmanager/config/alertmanager.yaml err="no global Slack API URL set"

File variables.tf is:

terraform {
  required_version = ">= 0.13"
}

variable "org_id" {
  type        = string
  description = "Organization ID."
  default = "123456789"
}

variable "billing_account" {
  type        = string
  description = "Billing account ID."
  default = "0XXXXX0"
}

variable "project" {
  type        = string
  default     = "mainnet-xxx-01"
  description = "Project ID where Terraform is authenticated to run to create additional projects. If provided, Terraform will great the GKE and Tezos cluster inside this project. If not given, Terraform will generate a new project."
}

variable "region" {
  type        = string
  default     = "us-central1"
  description = "Region in which to create the cluster, or region where the cluster exists."
}

variable "node_locations" {
  type        = list
  default     = [ "us-central1-b", "us-central1-f" ]
  description = "Zones in which to create the nodes"
}


variable "kubernetes_namespace" {
  type = string
  description = "kubernetes namespace to deploy the resource into"
  default = "tezos"
}

variable "kubernetes_name_prefix" {
  type = string
  description = "kubernetes name prefix to prepend to all resources (should be short, like xtz)"
  default = "xtz"
}

variable "kubernetes_endpoint" {
  type = string
  description = "name of the kubernetes endpoint"
  default = ""
}

variable "cluster_ca_certificate" {
  type = string
  description = "kubernetes cluster certificate"
  default = ""
}

variable "cluster_name" {
  type = string
  description = "name of the kubernetes cluster"
  default = ""
}

variable "kubernetes_access_token" {
  type = string
  description = "name of the kubernetes endpoint"
  default = ""
}

variable "terraform_service_account_credentials" {
  type = string
  description = "path to terraform service account file, created following the instructions in https://cloud.google.com/community/tutorials/managing-gcp-projects-with-terraform"
  default = "~/.config/gcloud/application_default_credentials.json"
}

variable "kubernetes_pool_name" {
  type = string
  description = "when kubernetes cluster has several node pools, specify which ones to deploy the baking setup into. only effective when deploying on an external cluster with terraform_no_cluster_create"
  default = "blockchain-pool"
}

#
# Tezos node and baker options
# ------------------------------

variable "baking_nodes" {
  type = map
  description = "Structured data related to baking, including public key and signer configuration"
  default = {
    mynode = {
      mybaker = {
        public_baking_key="edpkxxxxxxxx"
        public_baking_key_hash="tz1xxxxxxxxxxxxxxxx"
        ledger_authorized_path="ledger://xxxx-xxxx-xxxx-xxxx/ed25519/0h/0h"
        monitoring_slack_url="https://hooks.slack.com/services/xxxx/xxxx/xxxxxxxxxx"
        monitoring_slack_channel="#tezos-blockchain"
        authorized_signers = [
          {
            ssh_pubkey = "ssh-rsa AAAAB3NzaCxxxxxx"
            signer_port = 8443
            tunnel_endpoint_port = 58255
          }
        ]
      }
    }
  }
}

variable "tezos_network" {
  type =string
  description = "The tezos network i.e. mainnet, carthagenet..."
  default = "mainnet"
}

variable "tezos_sentry_version" {
  type =string
  default = "latest-release"
}

variable "tezos_private_version" {
  type =string
   default = "latest-release"
}

variable "signer_target_host_key" {
  type = string
  description = "ssh host key for the ssh endpoint the remote signer connects to. if you leave it empty, sshd will generate it but it may change, cutting your access to the remote signers."
  default = ""
}

variable "protocols" {
  type = list
  description = "The list of Tezos protocols currently in use, following the naming convention used in the baker/endorser binary names, for example 006-PsCARTHA. Baking and endorsing daemons will be spun up for every protocol provided i>
  default = [ "006-PsCARTHA", "007-PsDELPH1" ]
  validation {
    condition     = length(sort(var.protocols)) == length(distinct(sort(var.protocols)))
    error_message = "You must pass different protocols, passing the same protocol twice is not allowed as it introduces double-baking risk."
  }
}

variable "snapshot_url" {
  type = string
  description = "url of the snapshot of type rolling to download"
  default = "https://mainnet.xtz-shots.io/rolling"
}

variable "history_mode" {
  type = string
  description = "history mode of the tezos nodes"
  default = "rolling"
}

variable "node_storage_size" {
  type = string
  description = "storage size for the nodes, in gi"
  default = "15"
}

variable "rpc_public_hostname" {
  type = string
  description = "if set, expose the rpc of the public node through a load balancer and create a certificate for the given hostname"
  default = ""
}

variable "rpc_subnet_whitelist" {
  type = list
  description = "ip address whitelisting for the public rpc. defaults to open to everyone"
  default = [ "0.0.0.0/0" ]
}

variable "monitoring_slack_url" {
  type = string
  default = "https://hooks.slack.com/services/xxxx/xxxxx/xxxxx"
  description = "slack api url to send prometheus alerts to"
}

Replace deprecated --addr and --port with --endpoint (v8-release only)

Flags --addr and --port are deprecated. They should be replaced by --endpoint, such as:

--endpoint "http://$NODE_HOST:$NODE_RPC_PORT"

Here:

and here:

Edit: this flag will be available on v8-release.
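
A minimal sketch of how the --endpoint value could be assembled from the two existing variables, assuming plain HTTP as in the example above; node_endpoint is a hypothetical helper name.

```shell
# node_endpoint HOST PORT
# Builds the value for --endpoint from the node host and RPC port.
node_endpoint() {
  printf 'http://%s:%s\n' "$1" "$2"
}
```

A client call would then pass --endpoint "$(node_endpoint "$NODE_HOST" "$NODE_RPC_PORT")" where it previously passed --addr and --port.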

remove sentry node

I have had many missed bakes and endorsements since the Florence activation or the v9 release, even with v9.2, which was supposed to include a fix.

I am now trying to run with "naked nodes" not protected by sentries, to see if that fixes it. I will post my findings.

We are an exception: no one else (Kiln or big bakers) is using sentries for their baking operations.

I will monitor for a few days and see if it fixes it. If it does, I will push the code for sentry removal here.
