
Comments (5)

ZoabKapoor commented on August 16, 2024

Oh, one important thing to mention is that we're providing the BrokerList as a single hostname, which our internal DNS then resolves into multiple hosts:

<property name="consumer.BrokerList" value="kafka.<OURDOMAIN>.com:<KAFKA_PORT>">
dig kafka.<OURDOMAIN>.com

;; ANSWER SECTION:
kafka.<OURDOMAIN>.com. 3 IN A 10.199.80.101
kafka.<OURDOMAIN>.com. 3 IN A 10.199.81.75
kafka.<OURDOMAIN>.com. 3 IN A 10.199.53.167
... more IPs

from sarama.

thorseraq commented on August 16, 2024

It may be DNS caching, so that the name resolves to the same IP each time. Ideally, if you provide multiple addresses, sarama initializes brokers with those addresses and picks one via "LeastLoadedBroker": first a previously connected broker with the fewest pending response promises to handle or, if there is none, one of the unused seedBrokers (if the connection fails, the broker is moved to deadSeeds).
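The selection order described above can be sketched as a toy model. This is not Sarama's actual code: the `broker` and `pool` types, the `inFlight` counter, and the `dialFails` flag are hypothetical names invented for illustration, and real connection handling is replaced by a boolean.

```go
package main

import (
	"errors"
	"fmt"
)

// broker is a hypothetical stand-in for a client-side broker handle.
type broker struct {
	addr      string
	connected bool
	inFlight  int  // responses still awaited on this connection
	dialFails bool // simulates an unreachable address
}

// pool is a simplified model of what the client knows about the cluster.
type pool struct {
	brokers   []*broker // brokers already known to the client
	seeds     []*broker // bootstrap addresses not yet tried
	deadSeeds []*broker // seeds whose connection attempt failed
}

// pick returns the connected broker with the fewest in-flight responses.
// If no broker is connected, it tries the unused seeds in order and moves
// every seed that fails to connect into deadSeeds.
func (p *pool) pick() (*broker, error) {
	var best *broker
	for _, b := range p.brokers {
		if b.connected && (best == nil || b.inFlight < best.inFlight) {
			best = b
		}
	}
	if best != nil {
		return best, nil
	}
	for len(p.seeds) > 0 {
		s := p.seeds[0]
		p.seeds = p.seeds[1:]
		if s.dialFails {
			p.deadSeeds = append(p.deadSeeds, s)
			continue
		}
		s.connected = true
		return s, nil
	}
	return nil, errors.New("no reachable broker")
}

func main() {
	// Made-up addresses: the first seed is unreachable, the second is fine.
	p := &pool{seeds: []*broker{
		{addr: "10.0.0.1:9092", dialFails: true},
		{addr: "10.0.0.2:9092"},
	}}
	b, err := p.pick()
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("picked:", b.addr, "dead seeds:", len(p.deadSeeds))
}
```

In this model a single hostname in the broker list would be a single seed entry, however many A records sit behind it, so any spreading across IPs would have to come from name resolution at dial time.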


ZoabKapoor commented on August 16, 2024

Interesting, DNS is definitely one of the things I'm considering, thanks @thorseraq! Do you have any insight into how I'd verify whether this is the case, or how to determine what values Sarama is getting out of the DNS cache? What's weird is that when I dig AAAA or dig the hostname, it both (1) resolves correctly and (2) shuffles the order each time, so I'm not sure why Sarama might be getting something different repeatedly.

One thing we did do is hook up the Sarama logging, and we're hoping to see if this happens again if the Sarama-level logs can tell us a little more about what's going on.


dnwe commented on August 16, 2024

Hi @ZoabKapoor

The key thing to remember with Kafka is that the broker connection addresses that you give to Sarama are simply what it can use to "bootstrap" a connection to the cluster. So by providing Sarama with kafka.<OURDOMAIN>.com:<KAFKA_PORT> you're telling it that there is one broker/hostname that can be used to find out where things live on the cluster.

The individual brokers themselves have a config for ADVERTISED_LISTENER (advertised.listeners in the broker properties), which is the hostname:port that should be used to connect to that individual broker. The log snippet above is very small, but do you know what configuration you are using for the advertised listener on each broker?
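To make the bootstrap versus advertised-listener distinction concrete, here is a toy model in plain Go. It involves no Kafka or Sarama calls: the `metadata` type, its fields, and every address are made up for illustration, and a real metadata response carries much more.

```go
package main

import "fmt"

// metadata is a hypothetical, heavily simplified view of what a client
// learns from the cluster once it has bootstrapped.
type metadata struct {
	advertised map[int32]string // broker ID -> advertised host:port
	leader     map[int32]int32  // partition -> leader broker ID
}

// leaderAddr returns the address a client would dial to reach the leader
// of the given partition, taken from the advertised addresses rather than
// from the bootstrap address.
func (m metadata) leaderAddr(partition int32) (string, bool) {
	id, ok := m.leader[partition]
	if !ok {
		return "", false
	}
	addr, ok := m.advertised[id]
	return addr, ok
}

func main() {
	// The bootstrap address is only the way in; it is not used below.
	bootstrap := "kafka.example.internal:9092"

	md := metadata{
		advertised: map[int32]string{
			1: "broker-1.example.internal:9092",
			2: "broker-2.example.internal:9092",
		},
		leader: map[int32]int32{0: 2, 1: 1},
	}

	fmt.Println("bootstrap:", bootstrap)
	for p := int32(0); p < 2; p++ {
		if addr, ok := md.leaderAddr(p); ok {
			fmt.Printf("partition %d -> %s\n", p, addr)
		}
	}
}
```

The point of the model is that after bootstrap, the addresses a consumer connects to are whatever the brokers advertise, so the DNS record behind the bootstrap hostname stops mattering.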


ZoabKapoor commented on August 16, 2024

Hey @dnwe, that makes sense. I don't have direct access to the brokers to see what they're configuring as their ADVERTISED_LISTENER, unless that's something I can see in the Sarama runtime using a debugger.

One interesting thing is that I've been trying to reproduce this issue locally and haven't been able to. Originally I thought the dial might be returning the same value each time it resolved the DNS record for kafka.<OURDOMAIN>.com:<KAFKA_PORT>, but testing it multiple times I could see that the returned IP varied. And to your point, that only matters at bootstrap time: once the client is directly connected to the leaders for the partitions it's consuming, it doesn't care about the initial bootstrap connection.

My guess as to what might have been happening (hard to verify because we can't reproduce it) is that during the 7 seconds in which we saw many calls to connect to 2a04:f547:43:e046::11b, the broker at 2a04:f547:43:e046::11b1 may have been down while the partition leader had not failed over yet. We should be able to determine whether this is the case if it happens again, as we've enabled more detailed logging on the Sarama side.

For now, I will close this issue and reopen once we have more info (if it recurs) from the Sarama-level logs.

Thanks to both of you for your help so far!

