Comments (4)
That's a good point that when using key.ignore=true
having the offset as ES document version is not useful since the topic/partition/offset is already encoded in the key.
Can you elaborate on 'lots of warnings'. I would expect some warnings when you are starting the connector if the shutdown was not clean, but they should not be persistent. Basically the warnings should only happen if the connector has to perform some recovery and re-send documents that were already indexed.
If it is more frequent than that, it would be great to see a complete Connect worker log if possible.
from kafka-connect-elasticsearch.
INFO
or DEBUG
for this log line may be more appropriate in retrospect.
UPDATE: done in b929b1d
from kafka-connect-elasticsearch.
Seeing the same issue in a similar setup.
The triggering line error is:
[2017-01-06 04:59:58,458] ERROR Commit of WorkerSinkTask{id=elasticsearch-topic-0} offsets threw an unexpected exception: (org.apache.kafka.connect.runtime.WorkerSinkTask)
org.apache.kafka.connect.errors.ConnectException: Flush timeout expired with unflushed records: 560
at io.confluent.connect.elasticsearch.bulk.BulkProcessor.flush(BulkProcessor.java:302)
at io.confluent.connect.elasticsearch.ElasticsearchWriter.flush(ElasticsearchWriter.java:217)
at io.confluent.connect.elasticsearch.ElasticsearchSinkTask.flush(ElasticsearchSinkTask.java:125)
at org.apache.kafka.connect.runtime.WorkerSinkTask.commitOffsets(WorkerSinkTask.java:287)
at org.apache.kafka.connect.runtime.WorkerSinkTask.iteration(WorkerSinkTask.java:157)
at org.apache.kafka.connect.runtime.WorkerSinkTask.execute(WorkerSinkTask.java:143)
at org.apache.kafka.connect.runtime.WorkerTask.doRun(WorkerTask.java:140)
at org.apache.kafka.connect.runtime.WorkerTask.run(WorkerTask.java:175)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Which is very odd...
Then I'm getting all these warnings.
PS: How can I benefit from the elasticsearch connector updates if I'm using the cp-docker-images ?
from kafka-connect-elasticsearch.
@shikhar Sorry for the very long delay, I didn't notice the notification...
By 'lots of warning' I meant thousands or warnings. not just a few when starting the connector when the shutdown was not clean.
Right now, we don't use the kafka elasticsearch connector anymore, so I am not sure I can help moving on with this issue.
My current hypothesis is that the connector was reading to many items from kafka and was not able to index them into elasticsearch before the session.timeout.ms triggered. The commit fails (because it is too late) and the items are given to another thread. However, part of them are already indexed.
(Not 100% sure, but it is the problem we hit with logstash and the kafka input plugin and elasticsearch output plugin).
Because i will not be able to help test any solution right now, from my point of view, you can close this issue.
from kafka-connect-elasticsearch.
Related Issues (20)
- ERROR Failed to create client to verify connection (Invalid or missing build flavor [oss]) HOT 1
- Mapping of topic to specific index in elastic-sink connector HOT 6
- [BUG] connector crash without a legit reason [type:mapper_parsing_exception][reason: array_index_out_of_bounds_exception Index -1 out of bounds for length 0] HOT 1
- kafka-connect-elasticsearch error message Failed to execute the bulk request HOT 3
- Restriction of Data Stream Type
- How to Convert JSON String field to ES Object?
- Capture Kafka key without using it as ID HOT 2
- Suggestion for INSERT operation "Ignoring EXTERNAL version conflict for operation INDEX on document"
- Used Elastic Java REST client is deprecated in 7.15.0 HOT 1
- Error with `"behavior.on.null.values": "delete"`
- Consumer paused indefinitely when using `AsyncOffsetTracker` with lot of null values
- Cannot use data stream with time_series mode HOT 2
- Error: Cannot infer mapping without schema HOT 1
- Connector fails with payloads >20 MB HOT 1
- Can't create a connector even if its loaded in Strimzi
- Support requests per second configuration options
- Log when there are too many requests errors
- [BUG] `TOO_MANY_REQUESTS` error craches the tasks with a unrecoverable exceptions without retries
- Ignore 'document_parsing_exception' HOT 1
- Inconsistent Logging for Tombstone Messages in Elastic Sink Connector
Recommend Projects
-
React
A declarative, efficient, and flexible JavaScript library for building user interfaces.
-
Vue.js
🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.
-
Typescript
TypeScript is a superset of JavaScript that compiles to clean JavaScript output.
-
TensorFlow
An Open Source Machine Learning Framework for Everyone
-
Django
The Web framework for perfectionists with deadlines.
-
Laravel
A PHP framework for web artisans
-
D3
Bring data to life with SVG, Canvas and HTML. 📊📈🎉
-
Recommend Topics
-
javascript
JavaScript (JS) is a lightweight interpreted programming language with first-class functions.
-
web
Some thing interesting about web. New door for the world.
-
server
A server is a program made to process requests and deliver data to clients.
-
Machine learning
Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.
-
Visualization
Some thing interesting about visualization, use data art
-
Game
Some thing interesting about game, make everyone happy.
Recommend Org
-
Facebook
We are working to build community through open source technology. NB: members must have two-factor auth.
-
Microsoft
Open source projects and samples from Microsoft.
-
Google
Google ❤️ Open Source for everyone.
-
Alibaba
Alibaba Open Source for everyone
-
D3
Data-Driven Documents codes.
-
Tencent
China tencent open source team.
from kafka-connect-elasticsearch.