
Comments (9)

kflynn commented on June 10, 2024

@Wenliang-CHEN Keeping fingers crossed for you -- enjoy the holiday! 🙂

from linkerd2.

kflynn commented on June 10, 2024

@Wenliang-CHEN Happy new year!! Just wanted to make sure this was still on your radar. 🙂


adleong commented on June 10, 2024

Hi @Wenliang-CHEN

This sounds a bit similar to an issue we had where the destination controller could become locked and stop processing service discovery updates. However, this bug was fixed in stable-2.14.2 and should not affect you in stable-2.14.3. In order to rule out that possibility, you could take a look at the endpoints_updates counter metric exposed by the destination controller:

linkerd diagnostics controller-metrics | grep endpoints_updates

You should see this counter incremented when the endpoints of a service change. If, instead, this counter remains at the same value, it means that the destination controller is not processing updates for some reason.
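As a rough sketch of that check, the snippet below compares two snapshots of the metrics output and reports whether the counter moved. It assumes the output of `linkerd diagnostics controller-metrics` contains Prometheus-style sample lines whose metric name is exactly `endpoints_updates`; the helper names `sum_counter` and `counter_advanced` are made up for this illustration and are not part of any Linkerd tooling.

```python
def sum_counter(metrics_text: str, name: str) -> float:
    """Sum every sample of counter `name` in Prometheus text-format output."""
    total = 0.0
    for line in metrics_text.splitlines():
        line = line.strip()
        # Skip blank lines and comment lines such as "# HELP" / "# TYPE".
        if not line or line.startswith("#"):
            continue
        # Separate the metric name (plus optional {labels}) from the value.
        if "}" in line:
            head, _, rest = line.rpartition("}")
            metric = head.split("{", 1)[0]
        else:
            metric, _, rest = line.partition(" ")
        if metric != name:
            continue
        fields = rest.split()
        if fields:
            # The first field is the sample value; a timestamp may follow.
            total += float(fields[0])
    return total


def counter_advanced(before: str, after: str,
                     name: str = "endpoints_updates") -> bool:
    """Return True if the summed counter is higher in the later snapshot."""
    return sum_counter(after, name) > sum_counter(before, name)
```

To use it, save the command output once before and once after a change to the service's endpoints, then pass both texts to `counter_advanced`. A `False` result would match the "counter remains at the same value" case described above.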

In stable-2.14.4 we added *_informer_lag_secs histogram metrics to the destination controller for even more visibility. If you upgrade to stable-2.14.4 or later you can use these histograms to see if there is a substantial lag between when endpoints are updated in Kubernetes vs when the destination controller processes those updates.
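One hedged way to read such a histogram is to take its cumulative bucket counts and find the smallest bucket bound that covers a given fraction of observations. The sketch below assumes the buckets have already been collected into a dictionary mapping each `le` upper bound to its cumulative count; the function name and the sample numbers in the usage note are invented for illustration.

```python
import math


def quantile_upper_bound(buckets: dict[float, float], q: float) -> float:
    """Smallest `le` bound whose cumulative count covers fraction q.

    `buckets` maps each histogram upper bound to its cumulative count,
    as in Prometheus-style `_bucket` series. Returns NaN when the
    histogram has no observations at all.
    """
    bounds = sorted(buckets)
    total = buckets[bounds[-1]]  # the largest bucket holds the total count
    if total == 0:
        return math.nan
    target = q * total
    for le in bounds:
        if buckets[le] >= target:
            return le
    return bounds[-1]
```

For example, with cumulative counts `{0.5: 80, 1.0: 95, 5.0: 99, math.inf: 100}`, a call with `q=0.9` returns `1.0`, meaning roughly 90% of updates were processed within one second. A histogram whose buckets are all zero yields NaN, which only says that nothing was recorded.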


Wenliang-CHEN commented on June 10, 2024

Hey @adleong, thanks for the reply.

And yes, I do see the endpoints_updates counter increment after the target service (service A) is deployed, so I assume the destination controller was processing updates.

A couple of things worth mentioning:

  • The issue happens about 20 minutes after the target service is deployed.
  • If we restart the deployment that owns the outbound pod, the issue is resolved.

Does that change anything?

As an action item, we will try to upgrade to stable-2.14.4 and take a look at the *_informer_lag_secs metrics as well.

Meanwhile, if we find anything new, we will report it in this thread.

Thanks!


kflynn commented on June 10, 2024

@Wenliang-CHEN Any joy trying with stable-2.14.4? 🙂


Wenliang-CHEN commented on June 10, 2024

Hey @kflynn, not yet... it is around the Christmas holiday. I will let you know 😄

There has not been another instance since I reported the issue, but to be safe, we are still observing...


Wenliang-CHEN commented on June 10, 2024

Hey @kflynn, happy new year!

And yes, we have not forgotten this. We just upgraded to v2.14.9, and so far we have not received any reports of the same issue.

Hopefully the upgrade somehow fixed it. We will monitor it throughout February. If there are no further reports, I think we can close this for now. Thanks!


Wenliang-CHEN commented on June 10, 2024

Okay, the issue happened again.

We are able to get the linkerd.endpoints_updates, linkerd.endpointslices_informer_lag_seconds.bucket and linkerd.endpoints_informer_lag_seconds.bucket metrics.

They seem to follow a pattern: linkerd.endpointslices_informer_lag_seconds.bucket moves together with linkerd.endpoints_updates:

[Screenshot 2024-02-08 at 14 54 34] [Screenshot 2024-02-08 at 14 54 43]

And linkerd.endpoints_informer_lag_seconds.bucket is always 0:

[Screenshot 2024-02-08 at 14 54 50]

We are not sure how to interpret this. Does it mean anything in particular, or is it completely normal?


stale commented on June 10, 2024

This issue has been automatically marked as stale because it has not had recent activity. It will be closed in 14 days if no further activity occurs. Thank you for your contributions.

