Comments (13)
Any updates on this issue?
from kafka-connect-elasticsearch.
Is there any update on this enhancement?
I'm also looking for this support, as it's going to dramatically improve the Elasticsearch queries we are currently running.
Why would passing the pipeline as a parameter be the desired approach rather than setting index.default_pipeline on the ES side of things?
You might need to apply a different pipeline depending on the type of log you're trying to index.
Say I have a topic "es-logs" that the connector uses to index documents into the index "es-prod-logs", and it contains both system and Apache logs. I'd like the Apache logs to be parsed by the default Filebeat Apache pipeline, while the system logs are parsed by the default Filebeat syslog pipeline.
That would be useful because Filebeat has a metadata field that specifies the ingest pipeline to use (https://www.elastic.co/guide/en/logstash/current/use-ingest-pipelines.html).
As a workaround, you could have Filebeat conditionally send logs to different topics.
If the logs come from the Apache log folder, send them to the "elk_apache_logs" topic, which then indexes them into the "elk_apache_logs" index, whose "index.default_pipeline" setting points at the Filebeat Apache pipeline.
The same would apply to system logs, and so on.
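The workaround above can be sketched with Filebeat's Kafka output, which supports conditional topic selection. The topic names, broker address, and log paths below are illustrative, not taken from this thread:

```yaml
# filebeat.yml (sketch): route events to a per-source topic so each
# Elasticsearch index can keep its own index.default_pipeline setting.
output.kafka:
  hosts: ["kafka:9092"]
  topic: "elk_system_logs"          # fallback topic
  topics:
    - topic: "elk_apache_logs"
      when.contains:
        log.file.path: "/var/log/apache2"
    - topic: "elk_system_logs"
      when.contains:
        log.file.path: "/var/log/syslog"
```

Each topic then feeds a connector sink whose target index has "index.default_pipeline" set to the matching Filebeat pipeline, so no connector-side pipeline option is needed.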
Why would passing the pipeline as a parameter be the desired approach rather than setting index.default_pipeline on the ES side of things?
Given that starting point, I can see the use case.
I think the blocker is mostly the Jest client for now, as it does not support that.
Hello, we also need this feature. Since this Elasticsearch connector is based on the Bulk API (https://www.elastic.co/guide/en/elasticsearch/reference/current/docs-bulk.html), maybe it's possible to add an option to the configuration file to pass user-defined URL parameters?
For example, when writing data to Elasticsearch, the connector would call http://<elasticsearch_url>/_bulk?pipeline=some_pipeline, where the parameters come from an option defined in the properties file: bulk.url.parameters=pipeline=some_pipeline. By default, this option would be empty.
That's the basic idea; the parameters might be better defined as a map of key-value pairs.
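The proposal above could be sketched as a small helper that joins a key-value map (parsed from a hypothetical "bulk.url.parameters" option) into the query string of the _bulk endpoint. The class and method names are illustrative; nothing here exists in the connector:

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.stream.Collectors;

public class BulkUrlBuilder {

    // Hypothetical helper: appends user-defined query parameters
    // (e.g. {"pipeline": "some_pipeline"}) to the _bulk endpoint URL.
    static String bulkUrl(String baseUrl, Map<String, String> params) {
        if (params.isEmpty()) {
            return baseUrl + "/_bulk";
        }
        String query = params.entrySet().stream()
                .map(e -> e.getKey() + "=" + e.getValue())
                .collect(Collectors.joining("&"));
        return baseUrl + "/_bulk?" + query;
    }

    public static void main(String[] args) {
        Map<String, String> params = new LinkedHashMap<>();
        params.put("pipeline", "some_pipeline");
        System.out.println(bulkUrl("http://localhost:9200", params));
        // prints http://localhost:9200/_bulk?pipeline=some_pipeline
    }
}
```

An empty option would simply leave the URL untouched, keeping today's behavior as the default.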
Is there any update on this feature? We need it in our production environment too, thanks.
I see a few commits on this issue, and also code that provides a pipeline config parameter. Does it work? I'm trying to specify a pipeline in my sink config file, but nothing happens.
We also require this option in production. The only other viable workaround is to ditch this connector and use Logstash to consume messages from Apache Kafka, adding another potential point of failure and making our Elasticsearch deployment more complicated at scale.
Pipelines are an integral part of processing data with Elasticsearch; please consider merging the branches above.
I see a few commits on this issue, and also code that provides a pipeline config parameter. Does it work? I'm trying to specify a pipeline in my sink config file, but nothing happens.
There was a merge request (almost a year ago), but master doesn't seem to contain any related changes in the config code:
ElasticsearchSinkConnectorConfig.java
That's strange, because this feature has been requested by many users for a long time.
The lack of this feature is a deciding factor in choosing Logstash over kafka-connect-elasticsearch.
As agi0rgi mentioned, being able to apply a pipeline at runtime based on the topic, key, or message attributes is a common need for ingest processes.