discovery.etcd.io's Issues

discovery.etcd.io not resolving

$ dig discovery.etcd.io

; <<>> DiG 9.10.3-P4-Debian <<>> discovery.etcd.io
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NXDOMAIN, id: 39690
;; flags: qr rd ra; QUERY: 1, ANSWER: 0, AUTHORITY: 1, ADDITIONAL: 1

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 512
;; QUESTION SECTION:
;discovery.etcd.io.             IN      A

;; AUTHORITY SECTION:
etcd.io.                1748    IN      SOA     dns1.p06.nsone.net. hostmaster.nsone.net. 1559080105 43200 7200 1209600 3600

;; Query time: 41 msec
;; SERVER: 100.115.92.193#53(100.115.92.193)
;; WHEN: Tue May 28 15:45:42 PDT 2019
;; MSG SIZE  rcvd: 111
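
The key detail in the output above is `status: NXDOMAIN` together with `ANSWER: 0`: the resolver is reporting that the name does not exist, rather than timing out. As a minimal sketch for anyone scripting this check, the status can be pulled out of captured dig output as below. The function name `dig_status` is hypothetical, and the regex assumes the header layout shown in the transcript.

```python
import re

# Matches the status field of dig's header line, e.g.
# ";; ->>HEADER<<- opcode: QUERY, status: NXDOMAIN, id: 39690"
_STATUS_RE = re.compile(r"->>HEADER<<-.*?status:\s*([A-Z]+)")


def dig_status(dig_output):
    """Return the response status (e.g. 'NXDOMAIN', 'NOERROR') or None."""
    match = _STATUS_RE.search(dig_output)
    return match.group(1) if match else None


sample = ";; ->>HEADER<<- opcode: QUERY, status: NXDOMAIN, id: 39690"
print(dig_status(sample))  # NXDOMAIN
```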

discovery.etcd.io expired cert 2024-05-10

Thanks for maintaining this service.

If this is not already known, https://discovery.etcd.io/ has an expired cert at the moment.

$ openssl s_client -servername discovery.etcd.io -connect 35.225.64.149:443 2>/dev/null | openssl x509 -noout -dates
notBefore=Feb 10 12:43:27 2024 GMT
notAfter=May 10 12:43:26 2024 GMT
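
For monitoring purposes, the `notAfter` line printed above can be compared against the current time. The following is only a sketch: the helper names `parse_openssl_date` and `is_expired` are made up for illustration, and the parsing assumes the English month abbreviations and `GMT` suffix that appear in the output above.

```python
from datetime import datetime, timezone


def parse_openssl_date(line):
    """Parse a line such as 'notAfter=May 10 12:43:26 2024 GMT' as UTC."""
    _, _, value = line.strip().partition("=")
    if value.endswith(" GMT"):
        value = value[: -len(" GMT")]
    # %b relies on English month names (the default C locale).
    parsed = datetime.strptime(value, "%b %d %H:%M:%S %Y")
    return parsed.replace(tzinfo=timezone.utc)


def is_expired(not_after_line, now):
    """True if the certificate's notAfter time is at or before `now`."""
    return parse_openssl_date(not_after_line) <= now


not_after = "notAfter=May 10 12:43:26 2024 GMT"
print(is_expired(not_after, datetime.now(timezone.utc)))
```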

SSL certificate expired 2022-02-20

discovery.etcd.io service is not accessible for creating new etcd clusters as the domain certificate has expired.

expire date: Feb 20 09:35:52 2022 GMT

curl, verbose output showing (and ignoring) cert errors.

$ curl -vk  https://discovery.etcd.io/new
*   Trying 35.225.64.149...
* TCP_NODELAY set
* Connected to discovery.etcd.io (35.225.64.149) port 443 (#0)
* ALPN, offering h2
* ALPN, offering http/1.1
* successfully set certificate verify locations:
*   CAfile: /etc/ssl/certs/ca-certificates.crt
  CApath: /etc/ssl/certs
* TLSv1.3 (OUT), TLS handshake, Client hello (1):
* TLSv1.3 (IN), TLS handshake, Server hello (2):
* TLSv1.3 (IN), TLS Unknown, Certificate Status (22):
* TLSv1.3 (IN), TLS handshake, Unknown (8):
* TLSv1.3 (IN), TLS Unknown, Certificate Status (22):
* TLSv1.3 (IN), TLS handshake, Certificate (11):
* TLSv1.3 (IN), TLS Unknown, Certificate Status (22):
* TLSv1.3 (IN), TLS handshake, CERT verify (15):
* TLSv1.3 (IN), TLS Unknown, Certificate Status (22):
* TLSv1.3 (IN), TLS handshake, Finished (20):
* TLSv1.3 (OUT), TLS change cipher, Client hello (1):
* TLSv1.3 (OUT), TLS Unknown, Certificate Status (22):
* TLSv1.3 (OUT), TLS handshake, Finished (20):
* SSL connection using TLSv1.3 / TLS_AES_256_GCM_SHA384
* ALPN, server accepted to use h2
* Server certificate:
*  subject: CN=discovery.etcd.io
*  start date: Nov 22 09:35:53 2021 GMT
*  expire date: Feb 20 09:35:52 2022 GMT
*  issuer: C=US; O=Let's Encrypt; CN=R3
*  SSL certificate verify result: certificate has expired (10), continuing anyway.
* Using HTTP2, server supports multi-use
* Connection state changed (HTTP/2 confirmed)
* Copying HTTP/2 data in stream buffer to connection buffer after upgrade: len=0
* TLSv1.3 (OUT), TLS Unknown, Unknown (23):
* TLSv1.3 (OUT), TLS Unknown, Unknown (23):
* TLSv1.3 (OUT), TLS Unknown, Unknown (23):
* Using Stream ID: 1 (easy handle 0x55ef5cc77620)
* TLSv1.3 (OUT), TLS Unknown, Unknown (23):
> GET /new HTTP/2
> Host: discovery.etcd.io
> User-Agent: curl/7.58.0
> Accept: */*
> 
* TLSv1.3 (IN), TLS Unknown, Certificate Status (22):
* TLSv1.3 (IN), TLS handshake, Newsession Ticket (4):
* TLSv1.3 (IN), TLS Unknown, Certificate Status (22):
* TLSv1.3 (IN), TLS handshake, Newsession Ticket (4):
* TLSv1.3 (IN), TLS Unknown, Unknown (23):
* Connection state changed (MAX_CONCURRENT_STREAMS updated)!
* TLSv1.3 (OUT), TLS Unknown, Unknown (23):
* TLSv1.3 (IN), TLS Unknown, Unknown (23):
* TLSv1.3 (IN), TLS Unknown, Unknown (23):
< HTTP/2 200 
< date: Sun, 20 Feb 2022 20:55:57 GMT
< content-type: text/plain; charset=utf-8
< content-length: 58
< strict-transport-security: max-age=15724800; includeSubDomains
< 
* TLSv1.3 (IN), TLS Unknown, Unknown (23):
* Connection #0 to host discovery.etcd.io left intact
https://discovery.etcd.io/ad9d3e1202fcef9afe39a62610d08509

OpenSSL output.

$ openssl s_client -connect discovery.etcd.io:443 -showcerts
CONNECTED(00000005)
depth=2 C = US, O = Internet Security Research Group, CN = ISRG Root X1
verify return:1
depth=1 C = US, O = Let's Encrypt, CN = R3
verify return:1
depth=0 CN = discovery.etcd.io
verify error:num=10:certificate has expired
notAfter=Feb 20 09:35:52 2022 GMT
verify return:1
depth=0 CN = discovery.etcd.io
notAfter=Feb 20 09:35:52 2022 GMT
verify return:1
---
<removed>
---
SSL handshake has read 4603 bytes and written 399 bytes
Verification error: certificate has expired
---
<removed>

Default TTL of token?

@philips @idvoretskyi Could you please let me know the expiry time of a token? I'm trying to document it in OpenStack Magnum so that we can rely on this service with more confidence. Thank you.

By invalidating all old etcd discovery tokens you have broken my workflow.

etcd 3.3.9

In response to coreos/discovery.etcd.io#64 (comment)

Previous workflow.

Create discovery url, create 5 node etcd cluster in asg.

When I want to roll over one of those nodes I do the following:

  1. Delete the old node (shut down etcd first)

  2. On a current cluster member, remove the old member

  3. On a current cluster member, add the new member

  4. On the replacement node, create a drop-in systemd unit without discovery and with the current members and itself, then daemon-reload and restart etcd

  5. Confirm success, remove the restore drop-in, daemon-reload

That doesn't work anymore on clusters with 'old' discovery tokens.

Instead, the new node, with the old discovery token, starts up and then crashes:

May 16 16:08:20 ip-172-27-187-218 etcd-wrapper[1170]: 2019-05-16 16:08:20.689784 E | etcdmain: failed to join discovery cluster (discovery: bad discovery endpoint)
May 16 16:08:20 ip-172-27-187-218 etcd-wrapper[1170]: 2019-05-16 16:08:20.689816 I | etcdmain: discovery token https://discovery.etcd.io/9a63c50f66e2803fb5ad005643cb7e60 was used, but failed to bootstrap the cluster.
May 16 16:08:20 ip-172-27-187-218 etcd-wrapper[1170]: 2019-05-16 16:08:20.689822 I | etcdmain: please generate a new discovery token and try to bootstrap again.
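
For readers unfamiliar with step 4 of the workflow above, here is a rough sketch of what the drop-in's static membership could look like. The environment variable names correspond to etcd's `--initial-cluster` and `--initial-cluster-state` flags. The member names, peer URLs, and helper functions are hypothetical. How the discovery setting is removed depends on how the original unit passes it, which this post does not show, so that part is left out.

```python
def initial_cluster(members):
    """Join {name: peer_url} pairs into etcd's 'name=url,name=url' form."""
    return ",".join(f"{name}={url}" for name, url in members.items())


def render_dropin(members):
    """Render a systemd drop-in that pins membership and joins as 'existing'."""
    return "\n".join([
        "[Service]",
        f'Environment="ETCD_INITIAL_CLUSTER={initial_cluster(members)}"',
        'Environment="ETCD_INITIAL_CLUSTER_STATE=existing"',
        "",
    ])


# Hypothetical member names and peer URLs, for illustration only.
members = {
    "etcd-a": "https://10.0.0.11:2380",
    "etcd-new": "https://10.0.0.15:2380",
}
print(render_dropin(members))
```

The list passed in should contain every current member plus the replacement node itself, matching what was registered when the new member was added in step 3.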

Access for sig-k8s-infra-leads to `etcd-io-dev` and `etcd-io` gcp projects

Following the creation of sig-etcd, and in advance of upcoming pricing changes for Google Workspace, the etcd project is aiming to move the existing GCP projects we own under the Kubernetes shared GCP org, to reduce costs and improve the oversight and management of these projects.

Refer:

The initial stage of this requires us to add [email protected] to the gcp projects with the Owner role so they can assess resource usage and confirm we can proceed absorbing these under the kubernetes org.

This has been completed for etcd-development. However, as per the terraform modules in this repository, we also have projects etcd-io-dev and etcd-io, which we don't seem to have access to, so we are unable to grant access on them.

@victortrac can you please confirm if these projects are still in use? If so can you please grant access as described above?

Any questions, please feel free to ping me on Kubernetes Slack 🙏

CNCF Handoff

  • Add CNCF on-call to stackdriver discovery.etcd.io/health alert
  • Ensure CNCF has access to the GKE cluster and stackdriver
  • Answer any questions about the architecture
  • Create an SLO on upgrades

Backport VPC and GKE to terraform

Currently, the GCP environment that runs discovery.etcd.io is manually built. It'd be nice to turn it into infrastructure-as-code so that it becomes reproducible and has a change audit log.

Certificate issue w/ discovery.etcd.io

Hi,

Since September 30th, we've been getting a certificate error while starting our Kubernetes infrastructure [1]. The curl command returns a valid certificate [2], but openssl s_client reports an invalid one [3].

[1]

2021-10-05 10:47:47.258 1756 ERROR magnum.drivers.heat.template_def [req-d7d8ccc7-4fb6-46bb-b19a-c5d0850456c5 - - - - -] [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed (_ssl.c:618): SSLError: [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed (_ssl.c:618)
2021-10-05 10:47:47.304 1756 ERROR oslo_messaging.rpc.server [req-d7d8ccc7-4fb6-46bb-b19a-c5d0850456c5 - - - - -] Exception during message handling: GetDiscoveryUrlFailed: Failed to get discovery url from 'https://discovery.etcd.io/new?size=1'.

[2]

$ curl -v 'https://discovery.etcd.io/new?size=1'
[...]
* Server certificate:
*  subject: CN=discovery.etcd.io
*  start date: Sep 23 10:35:28 2021 GMT
*  expire date: Dec 22 10:35:27 2021 GMT
*  subjectAltName: host "discovery.etcd.io" matched cert's "discovery.etcd.io"
*  issuer: C=US; O=Let's Encrypt; CN=R3
[...]

[3]

$ openssl s_client -showcerts -connect discovery.etcd.io:443 -servername discovery.etcd.io
CONNECTED(00000005)
depth=1 O = Digital Signature Trust Co., CN = DST Root CA X3
verify error:num=10:certificate has expired
notAfter=Sep 30 14:01:15 2021 GMT
verify return:0
depth=1 O = Digital Signature Trust Co., CN = DST Root CA X3
verify error:num=10:certificate has expired
notAfter=Sep 30 14:01:15 2021 GMT
verify return:0
depth=3 O = Digital Signature Trust Co., CN = DST Root CA X3
verify error:num=10:certificate has expired
notAfter=Sep 30 14:01:15 2021 GMT
verify return:0
---

production burn down

This is a list of things that need to happen to ensure long term production stability:

  • Rollout branch with token garbage collection
  • Hook-up https://discovery.etcd.io/health to pingdom, etc
  • Configure and deploy S3 backups (xref #1)
  • Automatic building of container

Implement k8s cluster backups

It'd be nice to install and configure velero on the k8s cluster to automatically back up workload data and cluster configuration into Google Cloud Storage.

Create a dev/staging environment

There's only a single environment for discovery.etcd.io, which means that there's not an environment to test upgrades to GKE, etcd, or the discovery service itself. Once #8 is done, we should use that template to generate a pre-prod environment to enable testing of changes to the service.

https://discovery.etcd.io/new?size=3 gives back http url

At some point recently, it appears discovery.etcd.io has started giving back http urls instead of https urls when requesting a new token url for discovery. The http url gives a 308 back to https when used so I'm guessing this was not intentional.
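
Until the service returns https again, a client could normalize the scheme itself instead of relying on the 308 redirect. This is only a sketch of such a workaround, not something the service or etcd provides. The helper name `force_https` is made up, and the token in the example is the one from the curl transcript in the certificate issue above.

```python
from urllib.parse import urlsplit, urlunsplit


def force_https(url):
    """Return `url` with its scheme switched to https if it was http."""
    parts = urlsplit(url.strip())
    if parts.scheme == "http":
        parts = parts._replace(scheme="https")
    return urlunsplit(parts)


print(force_https("http://discovery.etcd.io/ad9d3e1202fcef9afe39a62610d08509"))
# https://discovery.etcd.io/ad9d3e1202fcef9afe39a62610d08509
```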

Upgrade GKE to latest stable version

It'd be nice to upgrade the GKE cluster to the latest stable version of k8s to take advantage of new GKE features like VPC aliasing, workload identity, regional masters, private API endpoint, and to get the latest security updates.
