Comments (5)
Oh, one important thing to mention is that we're providing the BrokerList
as a single hostname, which our internal DNS then resolves to multiple hosts:
<property name="consumer.BrokerList" value="kafka.<OURDOMAIN>.com:<KAFKA_PORT>">
dig kafka.<OURDOMAIN>.com
;; ANSWER SECTION:
kafka.<OURDOMAIN>.com. 3 IN A 10.199.80.101
kafka.<OURDOMAIN>.com. 3 IN A 10.199.81.75
kafka.<OURDOMAIN>.com. 3 IN A 10.199.53.167
... more IPs
from sarama.
It may be DNS caching, so that the name resolves to the same IP each time. Ideally, if you provide multiple addresses, sarama would initialize brokers with those addresses and pick one via "LeastLoadedBroker": first a previously connected broker with the fewest pending response promises to handle or, if there is none, one of the unused seedBrokers (if the connection fails, the broker is moved to deadSeeds).
Interesting, DNS is definitely one of the things I'm considering, thanks @thorseraq ! Do you have any insight into how I'd verify whether this is the case, or how to determine what values Sarama is getting out of the DNS cache? What's weird is that when I dig AAAA or dig the hostname, it correctly both (1) resolves and (2) shuffles the order each time, so I'm not sure why Sarama might be getting something different repeatedly.
One thing we did do is hook up the Sarama logging, and if this happens again we're hoping the Sarama-level logs can tell us a little more about what's going on.
Hi @ZoabKapoor
The key thing to remember with Kafka is that the broker connection addresses that you give to Sarama are simply what it can use to "bootstrap" a connection to the cluster. So by providing Sarama with kafka.<OURDOMAIN>.com:<KAFKA_PORT> you're telling it that there is one broker/hostname that can be used to find out where things live on the cluster.
The individual brokers themselves have a config for ADVERTISED_LISTENER which is the hostname:port that should be used to connect to the given individual broker. The log snippet above is very small, but do you know what configuration you are using for advertised listener on each broker?
Hey @dnwe , that makes sense... I don't have direct access to the brokers to see what they're configuring as their ADVERTISED_LISTENER, unless that's something I can see in the sarama runtime using a debugger.
One thing that's interesting is that I've been trying to reproduce this issue locally and I'm unable to. Originally I thought that maybe the dial was returning the same value each time it resolved the DNS record kafka.<OURDOMAIN>.com:<KAFKA_PORT>, but testing it multiple times I could see that the IP returned varied. And to your point, that only matters at bootstrap time: once the client is directly connected to the leaders for the partitions it's consuming, it doesn't care about the initial bootstrap connection.
My guess as to what might have been happening (hard to verify because we can't reproduce) is that during the 7 seconds in which we saw many calls to connect to 2a04:f547:43:e046::11b, the broker at 2a04:f547:43:e046::11b1 may have been down but the partition leader had not failed over yet... we should be able to determine if this happens again as we've enabled more detailed logging on the Sarama side.
For now, I will close this issue and reopen once we have more info (if it recurs) from the Sarama-level logs.
Thanks to both of you for your help so far!