
Comments (7)

binarylogic avatar binarylogic commented on August 25, 2024 2

Hi @sam701, this source is a little more tricky since:

  1. Vector must implement continuous shard discovery.
  2. Vector must handle various shard states (ex: a closed stream).
  3. Vector must coordinate with other Vector instances to ensure only one Vector instance is reading a shard (distributed locking).
  4. Vector must maintain checkpoint state.

Kafka, for example, handles a lot of this bookkeeping, making the integration much easier. That's why this source is not done yet. It's questionable if all of the above fits within the scope of Vector. Especially stream exclusivity, which would require distributed locking. That would obviously need to be delegated to a system designed for that.
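To make the bookkeeping in points 3 and 4 concrete, here is a minimal sketch of lease-based shard ownership with checkpoints. It is not Vector code and not how the KCL is implemented; `LeaseTable`, `try_acquire`, and `checkpoint` are hypothetical names, and the in-memory dict stands in for a shared store with conditional writes, which a real deployment would need.

```python
import time
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class Lease:
    owner: str
    expires_at: float
    checkpoint: Optional[str] = None  # last processed sequence number


class LeaseTable:
    """Hypothetical in-memory stand-in for a shared store with conditional writes."""

    def __init__(self, ttl_seconds: float = 10.0) -> None:
        self.ttl = ttl_seconds
        self.leases: Dict[str, Lease] = {}

    def try_acquire(self, shard_id: str, worker: str, now: Optional[float] = None) -> bool:
        """Take the lease if it is free, expired, or already ours."""
        now = time.monotonic() if now is None else now
        lease = self.leases.get(shard_id)
        if lease is not None and lease.owner != worker and lease.expires_at > now:
            return False  # another worker holds a live lease
        # Carry the previous checkpoint over so the new owner can resume from it.
        previous = lease.checkpoint if lease is not None else None
        self.leases[shard_id] = Lease(worker, now + self.ttl, previous)
        return True

    def checkpoint(self, shard_id: str, worker: str, sequence_number: str,
                   now: Optional[float] = None) -> bool:
        """Record progress and renew the lease; fails if the lease was lost."""
        now = time.monotonic() if now is None else now
        lease = self.leases.get(shard_id)
        if lease is None or lease.owner != worker or lease.expires_at <= now:
            return False
        lease.checkpoint = sequence_number
        lease.expires_at = now + self.ttl
        return True
```

A worker would call `try_acquire` before reading a shard and `checkpoint` after each batch. A `False` result from `checkpoint` means the lease was lost and the worker should stop reading that shard.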

The best solution, imo, is to wrap Vector in a system that handles the above, which is easier said than done. But, for example, if Vector were integrated into an AWS Lambda function, you could leverage AWS's Kinesis -> Lambda integration, which handles all of this for you.
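As a rough illustration of the Lambda route, the sketch below decodes the records that the Kinesis -> Lambda integration passes to a function (each record carries a base64-encoded payload under `kinesis.data`). The function names are hypothetical, and how the decoded lines would reach Vector is left open, since that part is not settled in this thread.

```python
import base64
from typing import Any, Dict, List


def decode_kinesis_event(event: Dict[str, Any]) -> List[str]:
    """Return the payload of every record in a Kinesis-triggered Lambda event."""
    lines = []
    for record in event.get("Records", []):
        payload = base64.b64decode(record["kinesis"]["data"])
        # Assumes UTF-8 text payloads; undecodable bytes are replaced, not dropped.
        lines.append(payload.decode("utf-8", errors="replace"))
    return lines


def handler(event: Dict[str, Any], context: Any) -> Dict[str, int]:
    """Hypothetical Lambda entry point."""
    lines = decode_kinesis_event(event)
    # Forwarding `lines` to Vector would go here (an assumption, not shown).
    return {"decoded": len(lines)}
```

In this setup, shard discovery, leasing, and checkpointing are done by the Lambda event source mapping, so the function only sees batches of records.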

from vector.

nikolay-te avatar nikolay-te commented on August 25, 2024 1

There is a client for Rust, already used by the Kinesis sink, but the issue is the complexity of synchronization and shard discovery. The Logstash input uses the AWS KCL, which needs a DynamoDB table (similar to how Kafka uses ZooKeeper for state).
@binarylogic's comment explains well the complexity behind this.

I think however another option could be to run the AWS Java KCL MultiLangDaemon and pipe the output to a Vector stdin source.


sam701 avatar sam701 commented on August 25, 2024

@binarylogic How would you recommend handling this scenario in spring 2020? A Lambda plus Vector with the HTTP source? I saw that the HTTP source is not production ready yet.


joseluisjimenez1 avatar joseluisjimenez1 commented on August 25, 2024

Any news on this?


RichardoC avatar RichardoC commented on August 25, 2024

I'd also be appreciative of this feature being added. In terms of the complexity, logstash already supports this so perhaps its solutions to the locking problem/etc can be reused?
https://www.elastic.co/guide/en/logstash/current/plugins-inputs-kinesis.html


joseluisjimenez1 avatar joseluisjimenez1 commented on August 25, 2024

I'd also be appreciative of this feature being added. In terms of the complexity, logstash already supports this so perhaps its solutions to the locking problem/etc can be reused? https://www.elastic.co/guide/en/logstash/current/plugins-inputs-kinesis.html

It looks like Logstash can support this because it uses the AWS Kinesis client, but there is no Rust one at the moment: https://github.com/awslabs?q=kinesis-client&type=all&language=&sort=

I guess we could call some Java or Python code from Rust, but that seems bad to me.


RichardoC avatar RichardoC commented on August 25, 2024

There is https://crates.io/crates/aws-sdk-kinesis, linked from https://awslabs.github.io/aws-sdk-rust/, so perhaps that has most of what's needed?
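For a sense of what a low-level SDK covers, here is a minimal sketch of reading one shard using the `GetShardIterator` and `GetRecords` API operations. It is written in Python against a boto3-shaped client for brevity (an assumption; it is not the Rust crate's API), and `read_shard` is a hypothetical helper. It deliberately leaves out coordination between instances, persistent checkpoints, and rate limiting.

```python
from typing import Any, Iterator, Optional


def read_shard(client: Any, stream_name: str, shard_id: str,
               after_sequence_number: Optional[str] = None,
               max_polls: int = 100) -> Iterator[dict]:
    """Yield records from one shard until it is closed or max_polls is reached."""
    if after_sequence_number is None:
        # No checkpoint yet: start from the oldest available record.
        resp = client.get_shard_iterator(
            StreamName=stream_name, ShardId=shard_id,
            ShardIteratorType="TRIM_HORIZON")
    else:
        # Resume just after the last checkpointed record.
        resp = client.get_shard_iterator(
            StreamName=stream_name, ShardId=shard_id,
            ShardIteratorType="AFTER_SEQUENCE_NUMBER",
            StartingSequenceNumber=after_sequence_number)
    iterator = resp["ShardIterator"]
    for _ in range(max_polls):
        if iterator is None:
            return  # a closed shard that is fully read has no next iterator
        out = client.get_records(ShardIterator=iterator, Limit=1000)
        yield from out["Records"]
        iterator = out.get("NextShardIterator")
```

With boto3 this could be called as `read_shard(boto3.client("kinesis"), "my-stream", shard_id)`, where `my-stream` is a placeholder. A real reader would also pause between polls to stay within the per-shard read limits.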

