Hi, While running kafka-connect-elasticsearch in distributed mode, w


Distributed mode and versionning about kafka-connect-elasticsearch HOT 4 CLOSED

confluentinc commented on July 1, 2024

Distributed mode and versionning

from kafka-connect-elasticsearch.

Comments (4)

shikhar commented on July 1, 2024

That's a good point: when using key.ignore=true, using the offset as the ES document version is not useful, since the topic/partition/offset is already encoded in the key.

Can you elaborate on 'lots of warnings'? I would expect some warnings when you start the connector after an unclean shutdown, but they should not be persistent. Basically, the warnings should only appear if the connector has to perform some recovery and re-send documents that were already indexed.
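To illustrate why a re-send produces these warnings, here is a minimal Python sketch of external versioning. It is a toy model, not the connector's or Elasticsearch's code: `ExternalVersionIndex`, `VersionConflict`, `deliver`, and the `topic+partition+offset` id format are all hypothetical names chosen for illustration. The only rule it models is that a write is rejected unless its version is strictly greater than the stored one.

```python
class VersionConflict(Exception):
    """Raised when an incoming version is not greater than the stored one."""


class ExternalVersionIndex:
    """Toy in-memory index that applies an external-version check on write."""

    def __init__(self):
        self._docs = {}  # doc_id -> (version, source)

    def index(self, doc_id, version, source):
        stored = self._docs.get(doc_id)
        # Reject the write unless the new version is strictly greater.
        if stored is not None and version <= stored[0]:
            raise VersionConflict(
                f"{doc_id}: version {version} <= stored version {stored[0]}"
            )
        self._docs[doc_id] = (version, source)


def deliver(index, records):
    """Index (topic, partition, offset, value) records; return the conflict count."""
    conflicts = 0
    for topic, partition, offset, value in records:
        # Assumed id format for the key.ignore=true case (illustrative only).
        doc_id = f"{topic}+{partition}+{offset}"
        try:
            index.index(doc_id, offset, value)  # the offset doubles as the version
        except VersionConflict:
            conflicts += 1  # this is where a warning would be logged
    return conflicts


index = ExternalVersionIndex()
batch = [("logs", 0, offset, {"n": offset}) for offset in range(5)]

first_pass = deliver(index, batch)  # fresh documents, nothing to conflict with
replay = deliver(index, batch)      # same records re-sent after a recovery
print(first_pass, replay)
```

In this model every replayed record conflicts, because its id and version are identical to what is already stored. That matches the expectation that warnings are tied to re-delivery rather than to normal operation.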

If it is more frequent than that, it would be great to see a complete Connect worker log if possible.


shikhar commented on July 1, 2024

INFO or DEBUG for this log line may be more appropriate in retrospect.

UPDATE: done in b929b1d


simplesteph commented on July 1, 2024

Seeing the same issue in a similar setup.
The error that triggers it is:

[2017-01-06 04:59:58,458] ERROR Commit of WorkerSinkTask{id=elasticsearch-topic-0} offsets threw an unexpected exception:  (org.apache.kafka.connect.runtime.WorkerSinkTask)
org.apache.kafka.connect.errors.ConnectException: Flush timeout expired with unflushed records: 560
	at io.confluent.connect.elasticsearch.bulk.BulkProcessor.flush(BulkProcessor.java:302)
	at io.confluent.connect.elasticsearch.ElasticsearchWriter.flush(ElasticsearchWriter.java:217)
	at io.confluent.connect.elasticsearch.ElasticsearchSinkTask.flush(ElasticsearchSinkTask.java:125)
	at org.apache.kafka.connect.runtime.WorkerSinkTask.commitOffsets(WorkerSinkTask.java:287)
	at org.apache.kafka.connect.runtime.WorkerSinkTask.iteration(WorkerSinkTask.java:157)
	at org.apache.kafka.connect.runtime.WorkerSinkTask.execute(WorkerSinkTask.java:143)
	at org.apache.kafka.connect.runtime.WorkerTask.doRun(WorkerTask.java:140)
	at org.apache.kafka.connect.runtime.WorkerTask.run(WorkerTask.java:175)
	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
	at java.lang.Thread.run(Thread.java:745)

Which is very odd...
Then I'm getting all these warnings.

PS: How can I benefit from the elasticsearch connector updates if I'm using the cp-docker-images?


commented on July 1, 2024

@shikhar Sorry for the very long delay, I didn't notice the notification...

By 'lots of warnings' I meant thousands of warnings, not just a few when starting the connector after an unclean shutdown.
Right now, we don't use the kafka elasticsearch connector anymore, so I am not sure I can help move this issue forward.

My current hypothesis is that the connector was reading too many items from kafka and was not able to index them into elasticsearch before session.timeout.ms expired. The commit fails (because it comes too late) and the items are handed to another thread. However, some of them are already indexed.
(Not 100% sure, but it is the problem we hit with logstash and the kafka input plugin and elasticsearch output plugin).
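A rough Python sketch of this hypothesis, as a toy model only: it does not use Kafka or Elasticsearch, and `simulate` and all the numbers below are made up for illustration. A batch is indexed in full, and its offsets are committed only if indexing finished within the session timeout. Otherwise the same offsets are polled again and arrive as duplicates.

```python
def simulate(total_records, max_poll_records, seconds_per_record,
             session_timeout_s, polls):
    """Return (committed_offset, duplicates_seen) after a number of polls."""
    indexed = set()   # offsets already written to the index
    committed = 0     # last committed offset
    duplicates = 0
    for _ in range(polls):
        batch = range(committed, min(committed + max_poll_records, total_records))
        # Records that were indexed earlier but never committed come back here.
        duplicates += sum(1 for offset in batch if offset in indexed)
        indexed.update(batch)
        # The commit only succeeds if indexing fit inside the session timeout.
        if len(batch) * seconds_per_record <= session_timeout_s:
            committed = batch.stop
    return committed, duplicates


# Hypothetical numbers: 0.1 s per record, 30 s session timeout, 4 polls.
large_batches = simulate(1000, 500, 0.1, 30.0, 4)
small_batches = simulate(1000, 200, 0.1, 30.0, 4)
print(large_batches, small_batches)
```

With the large batch the commit never succeeds, so the same 500 records are re-delivered on every poll. With the smaller batch each commit lands in time and no duplicates appear. In a real deployment the analogous knobs would be the consumer's `max.poll.records` and `session.timeout.ms`, but whether tuning them resolves this particular issue was not verified in the thread.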

Because I will not be able to help test any solution right now, from my point of view, you can close this issue.

