logstash-output-kinesis's Introduction

Kinesis Output Plugin

This is a plugin for Logstash. It will send log records to a Kinesis stream, using the Kinesis Producer Library (KPL).

This version is intended for use with Logstash 5.x; older versions of Logstash require correspondingly older releases of this plugin.

Configuration

Minimum required configuration to get this plugin chugging along:

output {
  kinesis {
    stream_name => "logs-stream"
    region => "ap-southeast-2"
  }
}

This plugin accepts a wide range of configuration options, most of which come from the underlying KPL library itself. View the full list of KPL configuration options here.

Please note that configuration options are snake_cased instead of camelCased. So, where KinesisProducerConfiguration offers a setMetricsLevel option, this plugin accepts a metrics_level option.
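For example, here is a sketch of a config that tunes a few KPL-derived settings via their snake_cased names (the specific options shown are illustrative of the naming convention — confirm each one against the plugin's full option list before relying on it):

output {
  kinesis {
    stream_name => "logs-stream"
    region => "ap-southeast-2"

    # KPL setRecordMaxBufferedTime -> record_max_buffered_time (milliseconds)
    record_max_buffered_time => 200

    # KPL setRequestTimeout -> request_timeout (milliseconds)
    request_timeout => 6000

    # KPL setMetricsLevel -> metrics_level
    metrics_level => "summary"
  }
}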

Dynamic stream name

You can dictate the name of the stream to send a record to, based on data in the record itself.

output {
  kinesis {
    stream_name => "%{myfield}-%{myotherfield}"
  }
}
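One caveat worth noting: if a referenced field is missing from an event, Logstash leaves the %{...} reference in place as a literal string, which will not match any real stream. A minimal sketch of guarding against that (using the same placeholder field name as above):

filter {
  # Give "myfield" a fallback value so the computed stream name is always valid.
  if ![myfield] {
    mutate { add_field => { "myfield" => "default" } }
  }
}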

Metrics

The underlying KPL library defaults to sending CloudWatch metrics to give insight into what it's actually doing at runtime. It's highly recommended you ensure these metrics are flowing through, and use them to monitor the health of your log shipping.

If for some reason you want to switch them off, you can easily do so:

output {
  kinesis {
    # ...

    metrics_level => "none"
  }
}

If you choose to keep metrics enabled, ensure the AWS credentials you provide to this plugin are able to write to Kinesis and write to CloudWatch.
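As a rough starting point — a sketch only, to be tightened to your own security requirements — an IAM policy along these lines covers both the Kinesis writes and the CloudWatch metrics the KPL emits:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "kinesis:PutRecords",
        "kinesis:DescribeStream",
        "cloudwatch:PutMetricData"
      ],
      "Resource": "*"
    }
  ]
}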

Authentication

By default, this plugin will use the AWS SDK DefaultAWSCredentialsProviderChain to obtain credentials for communication with the Kinesis stream (and CloudWatch, if metrics are enabled). The following places will be checked for credentials:

  • AWS_ACCESS_KEY_ID / AWS_SECRET_KEY environment variables available to the Logstash process
  • ~/.aws/credentials credentials file (see the example below)
  • Instance profile (if Logstash is running on an EC2 instance)
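For reference, the credentials file uses the standard AWS SDK format (shown here with the same fake keys used elsewhere in this README):

# ~/.aws/credentials
[default]
aws_access_key_id = AKIAIDFAKECREDENTIAL
aws_secret_access_key = KX0ofakeLcredentialsGrightJherepOlolPkQk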

If you want to provide credentials directly in the config file, you can do so:

output {
  kinesis {
    # ...

    access_key => "AKIAIDFAKECREDENTIAL"
    secret_key => "KX0ofakeLcredentialsGrightJherepOlolPkQk"

    # You can provide specific credentials for CloudWatch metrics:
    metrics_access_key => "AKIAIDFAKECREDENTIAL"
    metrics_secret_key => "KX0ofakeLcredentialsGrightJherepOlolPkQk"
  }
}

If access_key and secret_key are provided, they will be used for communicating with both Kinesis and CloudWatch. If metrics_access_key and metrics_secret_key are provided, they will be used for communication with CloudWatch. If only the metrics credentials are provided, Kinesis uses the default credentials provider (explained above) and CloudWatch uses the specific credentials. Confused? Good!
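To make the fallback behaviour concrete, here is a sketch of the metrics-only case — Kinesis traffic uses the default provider chain, while only the CloudWatch calls use the explicit keys:

output {
  kinesis {
    # ...

    # No access_key / secret_key here, so Kinesis writes fall back to the
    # default credentials provider chain. These keys are used only for
    # CloudWatch metrics.
    metrics_access_key => "AKIAIDFAKECREDENTIAL"
    metrics_secret_key => "KX0ofakeLcredentialsGrightJherepOlolPkQk"
  }
}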

Using STS

You can also configure this plugin to use AWS STS to "assume" a role that has access to Kinesis and CloudWatch. If you use this in combination with EC2 instance profiles (which the default credentials provider explained above supports), you can configure your Logstash instances to write to Kinesis and CloudWatch without any hardcoded credentials.

output {
  kinesis {
    # ...

    role_arn => "arn:aws:iam::123456789:role/my-kinesis-producer-role"

    # You can also provide a specific role to assume for CloudWatch metrics:
    metrics_role_arn => "arn:aws:iam::123456789:role/my-metrics-role"
  }
}

You can combine role_arn / metrics_role_arn with the explicit AWS credentials config explained earlier, too.

All of this can be mixed, too - if you want to use hardcoded credentials for Kinesis but assume a role via STS for accessing CloudWatch, you can do that. Vice versa works as well: assume a role for accessing Kinesis and provide hardcoded credentials for CloudWatch. Make things as arbitrarily complicated for yourself as you like ;)
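For instance, a sketch of the first combination — hardcoded credentials for Kinesis writes, an assumed role for CloudWatch:

output {
  kinesis {
    # ...

    # Hardcoded credentials used for the Kinesis writes...
    access_key => "AKIAIDFAKECREDENTIAL"
    secret_key => "KX0ofakeLcredentialsGrightJherepOlolPkQk"

    # ...and a role assumed via STS for CloudWatch metrics.
    metrics_role_arn => "arn:aws:iam::123456789:role/my-metrics-role"
  }
}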

Building a partition key

Kinesis demands a partition key be provided for each record. By default, this plugin will provide a very boring partition key of -. However, you can configure it to compute a partition key from fields in your log events.

output {
  kinesis {
    # ...
    event_partition_keys => ["[field1]", "[field2]"]
  }
}

Randomised partition keys

If you don't care about the ordering of your logs in the Kinesis stream, you might want to use a random partition key. This way, your log stream will be more or less uniformly spread across all available shards in the Kinesis stream.

output {
  kinesis {
    randomized_partition_key => true
  }
}

Record Aggregation

The Amazon KPL library can aggregate your records when writing to the Kinesis stream. This behaviour is enabled by default.

If you are using an older version of the Amazon KCL library to consume your records, or not using KCL at all, your consumer application(s) will probably not behave correctly. See the matrix on this page for more info, and read more about de-aggregating records here.

If you wish to simply disable record aggregation, that's easy:

output {
  kinesis {
    aggregation_enabled => false
  }
}

Backpressure

This plugin will enforce backpressure if the records passing through Logstash's pipeline are not making it up to Kinesis fast enough. When this happens, Logstash will stop accepting records for input or filtering, and a warning will be emitted in the Logstash logs.

By default, the threshold for blocking is 1000 pending records. If you want to throw more memory / CPU cycles at buffering lots of stuff before it makes it to Kinesis, you can control the high-watermark:

output {
  kinesis {
    max_pending_records => 10000
  }
}

Logging configuration

The underlying KPL uses SLF4J for logging, and a binding for Log4j (which Logstash uses) is included in the plugin package. Logging levels can therefore be controlled with the log4j2.properties file provided by Logstash.

As the KPL can be quite noisy at the INFO level, you may want to dial it down with the following configuration in log4j2.properties:

...
logger.kinesis.name = com.amazonaws.services.kinesis
logger.kinesis.level = WARN
logger.kinesis.additivity = false
logger.kinesis.appenderRef.console.ref = console
...

Known Issues

Alpine Linux is not supported

This Logstash plugin uses the KPL daemon under the covers. This daemon is linked against, and specifically requires, glibc. See awslabs/amazon-kinesis-producer#86.

Noisy shutdown

During shutdown of Logstash, you might get noisy warnings like this:

[pool-1-thread-6] WARN com.amazonaws.services.kinesis.producer.Daemon - Exception during updateCredentials
java.lang.InterruptedException: sleep interrupted
at java.lang.Thread.sleep(Native Method)
at com.amazonaws.services.kinesis.producer.Daemon$5.run(Daemon.java:316)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
at java.util.concurrent.FutureTask.run(FutureTask.java:166)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:724)

This is caused by awslabs/amazon-kinesis-producer#10.

Noisy warnings about Error during socket read

While your Logstash instance is running, you may occasionally get a warning on stderr that looks like this:

[2015-10-20 06:31:08.441640] [0x00007f36c9402700] [error] [io_service_socket.h:229] Error during socket read: End of file; 0 bytes read so far (kinesis.us-west-1.amazonaws.com:443)

This is being tracked in awslabs/amazon-kinesis-producer#17. This log message seems to just be noise - your logs should still be delivering to Kinesis fine (but of course, you should independently verify this!).
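One way to do that independent verification (a sketch that assumes the AWS CLI is installed and configured with read access to the stream) is to read a few records straight off a shard:

# Fetch an iterator for the first shard of the stream, then read from it.
SHARD_ITERATOR=$(aws kinesis get-shard-iterator \
  --stream-name logs-stream \
  --shard-id shardId-000000000000 \
  --shard-iterator-type LATEST \
  --query 'ShardIterator' --output text)
aws kinesis get-records --shard-iterator "$SHARD_ITERATOR"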

Contributions

Are more than welcome. See CONTRIBUTING.md

License

Apache License 2.0

logstash-output-kinesis's Issues

Error writing event to Kinesis

Hi, I want to output from Logstash to Kinesis, but I get this error message:

message=>"Error writing event to Kinesis", :exception=>com.amazonaws.services.kinesis.producer.DaemonException: The child process has been shutdown and can no longer accept messages., :event=>#<LogStash::Event:0x5739e2f7 @metadata={"path"=>"/var/log/nginx/api.oregon.access.log", "partition_key"=>"-"}

My logstash config file:

output {
  kinesis {
    stream_name => "cwlTest1-KinesisSubscriptionStream-EW4WU6QL3HDQ"
    region => "ap-northeast-1"
    aggregation_enabled => false
    event_partition_keys => ["[field1]", "[field2]"]
  }
}

What is causing this error message? How can I fix it?

Please help me.

Kinesis Firehose target results in no error but no output

When I run the attached kinesis_waves.conf and pass in short.csv, nothing shows up in Kinesis if the target is a Kinesis Firehose. CloudWatch's UserRecordsPut metric is zero, as are the other measures. Everything works if the target is an ordinary Kinesis stream.

I'm running Logstash 5.6.0, with my logstash.yml file entirely commented out, on OSX 10.13 with Java 1.8.0_131-b11.

Is this intended behavior?

short.csv.txt
kinesis_waves.conf.txt

logstash-plain.log

Plugin offline install

Hi @samcday

I'm trying to install the plugin using the offline method because my VM cannot connect to https://rubygems.org.
My environment is: Logstash 5.0.0, JDK 1.8.0.

I tried the following steps but failed; could you please help me?

  1. download plugin zip file from github, extract it and upload the folder logstash-output-kinesis-master to VM
  2. add one line to the Logstash Gemfile: gem "logstash-output-kinesis", :path => "/tmp/logstash-output-kinesis-master"
  3. run #bin/logstash-plugin install --no-verify

Error info:
[root@ip-172-31-5-94 logstash-5.0.0]# bin/logstash-plugin install --no-verify
Installing...
Error Bundler::GemspecError, retrying 1/10
There was a Errno::ENOENT while loading logstash-output-kinesis.gemspec:
No such file or directory - git from
/tmp/logstash-output-kinesis-master/logstash-output-kinesis.gemspec:17:in `eval_gemspec'

How can I finish the offline install? Thank you!
Xun

Write Throughput Exceeded

Hi,

I am using your plugin to backhaul a whole bunch of data from Kafka to Kinesis, and right now I'm exceeding the write throughput.

Since I don't see anything explicit in the logs, I was just wondering what Logstash's behaviour is in this case.
I would assume you just retry failed records until they succeed, but I wanted to double check with you :-)

Thanks!

Support to produce UTF-8 JSON

Currently using this output plugin like this:

output {
    kinesis {
        stream_name => "test-stream"
        region => "us-east-1"
    }
}

When connecting my stream with Kinesis Analytics (https://aws.amazon.com/kinesis/analytics/) I get the following error:

There was an issue updating your application. Error message: 1 validation error detected: Value 'windows-1253' at 'input.inputSchema.recordEncoding' failed to satisfy constraint: Member must satisfy regular expression pattern: UTF-8

DatatypeConverter Missing in version 7.3.2

When using a Dockerfile to add this plugin to the base image of Logstash 7.3.2, Logstash is not able to start:

dockerfile:

#logstash-base image, plus d2l configurations
FROM docker.elastic.co/logstash/logstash-oss:7.3.2

#remove default configs
RUN rm -rf /usr/share/logstash/pipeline
RUN rm -f /usr/share/logstash/config/logstash-sample.conf
RUN /usr/share/logstash/bin/logstash-plugin install logstash-output-kinesis

#add custom settings files
ADD config/ /usr/share/logstash/config/
ADD pipeline/ /usr/share/logstash/pipeline/

docker build:

docker build . --tag testing-logstash:latest -m 2g

docker run:

docker run testing-logstash:latest

error:

[2019-09-27T17:26:36,241][ERROR][logstash.agent           ] Failed to execute action {:id=>:main, :action_type=>LogStash::ConvergeResult::FailedAction, :message=>"Could not execute action: PipelineAction::Create<main>, action_result: false", :backtrace=>nil}
[2019-09-27T17:26:36,456][ERROR][org.logstash.Logstash    ] java.lang.NoClassDefFoundError: javax/xml/bind/DatatypeConverter

Does it work on Java 11?

I see the message below in the Logstash logs on Java 11:

[2020-10-07T09:20:56,880][ERROR][org.logstash.Logstash ] java.lang.NoClassDefFoundError: javax/xml/bind/DatatypeConverter
[2020-10-07T09:21:09,466][INFO ][logstash.runner ] Starting Logstash {"logstash.version"=>"6.8.12"}
[2020-10-07T09:21:11,479][INFO ][logstash.pipeline ] Starting pipeline {:pipeline_id=>"main", "pipeline.workers"=>8, "pipeline.batch.size"=>125, "pipeline.batch.delay"=>50}
[2020-10-07T09:21:11,567][INFO ][com.amazonaws.services.kinesis.producer.KinesisProducer] Extracting binaries to /tmp/amazon-kinesis-producer-native-binaries
[2020-10-07T09:21:12,140][ERROR][logstash.agent ] Failed to execute action {:id=>:main, :action_type=>LogStash::ConvergeResult::FailedAction, :message=>"Could not execute action: PipelineAction::Create<main>, action_result: false", :backtrace=>nil}

jar dependencies conflict with elasticsearch output plugin

On various servers running Java 8, I had the latest logstash-output-elasticsearch plugin installed concurrently with this plugin, and Logstash was consistently crashing shortly after startup with a NoMethodError exception on set_validate_after_inactivity.

It turns out the elasticsearch plugin depends on manticore, which calls setValidateAfterInactivity, a method added in v4.4 of httpclient. manticore requires version 4.5 of httpclient in its gemspec, while this plugin ships with 4.3. I believe Java 8 is somehow loading these such that the 4.3 version takes precedence.

I am able to fix the issue locally (and haven't run into any other problems) by swapping the httpclient and httpcore files for newer versions in this plugin's runtime-jars folder.

The KPL must be upgraded to version 0.12.6 or later

According to AWS,
On February 9, 2018, at 9:00 AM PST, Amazon Kinesis Data Streams will install ATS certificates. To continue to be able to write records to Kinesis Data Streams using the Kinesis Producer Library (KPL), you must upgrade your installation of the KPL to version 0.12.6 or later by that time.

Reference: https://docs.aws.amazon.com/streams/latest/dev/kinesis-kpl-upgrades.html

The current KPL version used in this plugin is older than 0.12.6, so it will no longer work on or after February 9, 2018.

error with Logstash 5.4.2

I get this error when trying to run this plugin with logstash 5.4.2:

Pipeline aborted due to error {:exception=>java.lang.RuntimeException: Could not copy native binaries to temp directory /tmp/amazon-kinesis-producer-native-binaries, :backtrace=>["com.amazonaws.services.kinesis.producer.KinesisProducer.extractBinaries(com/amazonaws/services/kinesis/producer/KinesisProducer.java:856)", "com.amazonaws.services.kinesis.producer.KinesisProducer.<init>(com/amazonaws/services/kinesis/producer/KinesisProducer.java:245)", "java.lang.reflect.Constructor.newInstance(java/lang/reflect/Constructor.java:423)", "RUBY.register(/usr/share/logstash/vendor/bundle/jruby/1.9/gems/logstash-output-kinesis-5.1.0-java/lib/logstash/outputs/kinesis.rb:82)", "org.jruby.RubyArray.each(org/jruby/RubyArray.java:1613)", "RUBY.register(/usr/share/logstash/logstash-core/lib/logstash/output_delegator_strategies/legacy.rb:17)", "RUBY.register(/usr/share/logstash/logstash-core/lib/logstash/output_delegator.rb:41)", "RUBY.register_plugin(/usr/share/logstash/logstash-core/lib/logstash/pipeline.rb:268)", "RUBY.register_plugins(/usr/share/logstash/logstash-core/lib/logstash/pipeline.rb:279)", "org.jruby.RubyArray.each(org/jruby/RubyArray.java:1613)", "RUBY.register_plugins(/usr/share/logstash/logstash-core/lib/logstash/pipeline.rb:279)", "RUBY.start_workers(/usr/share/logstash/logstash-core/lib/logstash/pipeline.rb:288)", "RUBY.run(/usr/share/logstash/logstash-core/lib/logstash/pipeline.rb:214)", "RUBY.start_pipeline(/usr/share/logstash/logstash-core/lib/logstash/agent.rb:398)", "java.lang.Thread.run(java/lang/Thread.java:745)"]}

Any ideas on how to fix this or what causes it?

Installing kinesis output plugin on an Amazon Linux box...

Hello,

I'm trying to install my kinesis plugin onto my logstash instance and have it push logs into my AWS kinesis stream.

I'm running logstash 2.0 on an amazon linux box and have put the kinesis plugin into the /etc/logstash/conf.d/ config directory. When I check my logstash.log, I receive the following error:

The error reported is:
Couldn't find any output plugin named 'kinesis'. Are you sure this is correct? Trying to load the kinesis output plugin resulted in this error: no such file to load -- logstash/outputs/kinesis

I know that the path /logstash/output/kinesis would be for a windows box.

Have I put the kinesis output plugin in the right place on my logstash linux box?

Many thanks

Example IAM policy? AccessDenied

I'm testing with the following policy:

    {
      "Effect": "Allow",
      "Action": [
        "firehose:*",
        "kinesis:*",
        "cloudwatch:*",
        "logs:*",
        "events:*"
      ],
      "Resource": [ "*" ]
    },

attached to the following role:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Action": "sts:AssumeRole",
      "Principal": {
        "Service": "ec2.amazonaws.com"
      },
      "Effect": "Allow",
      "Sid": ""
    },
    {
      "Effect": "Allow",
      "Principal": {
        "Service": "firehose.amazonaws.com"
      },
      "Action": "sts:AssumeRole"
    }
  ]
}

and getting:
Caused by: com.amazonaws.services.securitytoken.model.AWSSecurityTokenServiceException: Access denied (Service: AWSSecurityTokenService; Status Code: 403; Error Code: AccessDenied; Request ID: $redacted)

Could you provide an example policy?

logstash-output-kinesis does not work inside alpine linux

I'm going to open and then close this issue, because I'd like folks to find it and avoid the debugging hassle I just went through.

Short story: the logstash-output-kinesis plugin won't work inside an Alpine Linux container, because the KPL library is linked against glibc. See awslabs/amazon-kinesis-producer#86 for more info.

Before mucking around, all I could see in the logstash output was this error, immediately after startup:

22:47:49.812 [[main]>worker1] WARN  logstash.outputs.kinesis - Error writing event to Kinesis {:exception=>com.amazonaws.services.kinesis.producer.DaemonException: The child process has been shutdown and can no longer accept messages.}

This led me to #16, but in my case the credentials were fine. The root cause was an error at startup time which was not logged by default, making this even harder to track down. To enable logging internal to the KPL library, I downloaded the latest release of slf4j, and copied their "simple" logger jar into the java class path:

$ wget https://www.slf4j.org/dist/slf4j-1.7.25.tar.gz
$ tar xzvf slf4j-1.7.25.tar.gz
$ cp slf4j-1.7.25/slf4j-simple-1.7.25.jar  /usr/share/logstash/vendor/bundle/jruby/1.9/gems/logstash-output-kinesis-5.0.0-java/vendor/jar-dependencies/runtime-jars/slf4j-simple.jar

then, restarting logstash, I finally saw on stderr:

[kpl-daemon-0000] ERROR com.amazonaws.services.kinesis.producer.KinesisProducer - Error in child process
com.amazonaws.services.kinesis.producer.IrrecoverableError: Error starting child process
	at com.amazonaws.services.kinesis.producer.Daemon.fatalError(Daemon.java:525)
	at com.amazonaws.services.kinesis.producer.Daemon.startChildProcess(Daemon.java:456)
	at com.amazonaws.services.kinesis.producer.Daemon.access$100(Daemon.java:63)
	at com.amazonaws.services.kinesis.producer.Daemon$1.run(Daemon.java:133)
	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
	at java.lang.Thread.run(Thread.java:748)
Caused by: java.io.IOException: Cannot run program "/tmp/amazon-kinesis-producer-native-binaries/kinesis_producer_d93825f806782576ef9f09eef67a2baeadfec35c": error=2, No such file or directory
	at java.lang.ProcessBuilder.start(ProcessBuilder.java:1048)
	at com.amazonaws.services.kinesis.producer.Daemon.startChildProcess(Daemon.java:454)
	... 7 more
Caused by: java.io.IOException: error=2, No such file or directory
	at java.lang.UNIXProcess.forkAndExec(Native Method)
	at java.lang.UNIXProcess.<init>(UNIXProcess.java:247)
	at java.lang.ProcessImpl.start(ProcessImpl.java:134)
	at java.lang.ProcessBuilder.start(ProcessBuilder.java:1029)
	... 8 more

That led me to awslabs/amazon-kinesis-producer#86 👍

Amazon Kinesis Data Streams will install ATS certificates

On February 9, 2018 at 9AM PST Amazon Kinesis Data Streams will install ATS certificates[1]. Producer clients older than version 0.12.6 will fail to write records to Kinesis at this point because these clients do not support ATS certificates.

throughput issue

Hi Team,

I am trying to stream logs with an autoscaled Logstash GKE cluster that has this plugin installed, with max_pending_records => 5000.

My Kinesis stream is set to 100 shards - enough capacity.

Still, I am getting warning messages in my Logstash logs:

[2019-02-02T08:48:12,822][INFO ][com.amazonaws.services.kinesis.producer.LogInputStreamReader] [2019-02-02 08:48:12.821877] [0x00000032][0x00007f2e2d5eb700] [info] [processing_statistics_logger.cc:112] Stage 2 Triggers: { stream: 'streampoc1', manual: 9, count: 0, size: 0, matches: 0, timed: 387, KinesisRecords: 477, PutRecords: 396 }
[2019-02-02T08:48:26,861][WARN ][logstash.outputs.kinesis ] Kinesis is too busy - blocking until things have cleared up

Happy to do a Zoom call and show you the issue.

In my use case, I am planning to stream 10 GB of logs in under 30 minutes, all the way to Kinesis.

Any help is really appreciated. Please contact me at [email protected] if you need more information.

Log message duplication issues

Around mid May we received a notification from the AWS team stating that we needed to upgrade KPL versions due to deprecation, as we were running an older version of this Logstash plugin at the time.
We upgraded to Logstash 5.x and installed the latest version of this plugin and things seemed to be fine.
AWS then emailed us again to notify that the change on their back end has been applied.

This occurred around the 22nd June 2018, and since then we have found log message duplication of between 2x and 180x; after investigation we believe this is occurring when records are pushed to Kinesis via this plugin.

Has anyone else experienced this?

Kinesis is too busy?

Hi,
I'm getting this log output consistently:

{:timestamp=>"2016-05-05T19:07:33.472000+0000", :message=>"Pipeline main started"}
{:timestamp=>"2016-05-05T19:07:46.186000+0000", :message=>"Kinesis is too busy - blocking until things have cleared up", :level=>:warn}
{:timestamp=>"2016-05-05T19:08:01.541000+0000", :message=>"Kinesis is too busy - blocking until things have cleared up", :level=>:warn}
{:timestamp=>"2016-05-05T19:08:24.668000+0000", :message=>"Kinesis is too busy - blocking until things have cleared up", :level=>:warn}
{:timestamp=>"2016-05-05T19:08:40.208000+0000", :message=>"Kinesis is too busy - blocking until things have cleared up", :level=>:warn}

I have a very simple configuration:

  kinesis {
    stream_name => "KinesisStreamTest"
    region => "us-east-1"
    access_key => "ACCESSKEY"
    secret_key => "SECRETKEY"
  }

Is there anything I'm doing wrong?

Socket Read Error

Hi,

Just configured this plugin as a test:

output {
  kinesis {
    stream_name => "TestStream"
    region => "eu-west-1"
  }
}

All works as expected with messages being pushed from Logstash to Kinesis, but I am seeing this in the Logstash error file:

[2015-10-19 15:11:27.918276] [0x00007f626f6cd700] [error] [io_service_socket.h:229] Error during socket read: End of file; 0 bytes read so far (kinesis.eu-west-1.amazonaws.com:443)
[2015-10-19 15:11:44.970761] [0x00007f626f6cd700] [error] [io_service_socket.h:229] Error during socket read: End of file; 0 bytes read so far (kinesis.eu-west-1.amazonaws.com:443)
[2015-10-19 15:12:30.086751] [0x00007f626f6cd700] [error] [io_service_socket.h:229] Error during socket read: End of file; 0 bytes read so far (kinesis.eu-west-1.amazonaws.com:443)
[2015-10-19 15:13:10.198026] [0x00007f626f6cd700] [error] [io_service_socket.h:229] Error during socket read: End of file; 0 bytes read so far (kinesis.eu-west-1.amazonaws.com:443)
[2015-10-19 15:13:41.241707] [0x00007f626f6cd700] [error] [io_service_socket.h:229] Error during socket read: End of file; 0 bytes read so far (kinesis.eu-west-1.amazonaws.com:443)
[2015-10-19 15:14:16.297197] [0x00007f626f6cd700] [error] [io_service_socket.h:229] Error during socket read: End of file; 0 bytes read so far (kinesis.eu-west-1.amazonaws.com:443)
[2015-10-19 15:14:24.315864] [0x00007f626f6cd700] [error] [io_service_socket.h:229] Error during socket read: End of file; 0 bytes read so far (kinesis.eu-west-1.amazonaws.com:443)

The IAM for this is as follows:

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "Stmt1389182023000",
            "Effect": "Allow",
            "Action": [
                "kinesis:*",
                "cloudwatch:*"
            ],
            "Resource": "*"
        }
    ]
}

Any ideas?

Paul

Error Writing to Kinesis on EMR spinup

Hi -

I'm seeing this issue when we start up Logstash as a step during EMR spinup to forward logs to Kinesis. The strange thing is that once the EMR cluster is 'Ready', restarting Logstash connects to Kinesis just fine. It might be an AWS issue, which we're exploring, but I'm wondering if this issue has been logged before.

The error:
{:timestamp=>"2016-12-07T19:36:17.136000+0000", :message=>"Error writing event to Kinesis", :exception=>com.amazonaws.services.kinesis.producer.DaemonException: The child process has been shutdown and can no longer accept messages., :level=>:warn}

Kinesis output config:

kinesis {
  stream_name => "${KINESIS_STREAM:XXX}"
  region => "us-east-1"
  randomized_partition_key => true
  aggregation_enabled => false
  max_pending_records => 10000
}

I can provide additional information if needed.

rg

Does it emit records in batches?

While configuring my Logstash instance, I noticed that the plugin outputs to the Kinesis stream using kinesis:PutRecord instead of kinesis:PutRecords.

Is there a way to configure the plugin to use PutRecords in batches? @samcday

Invalid Signature Exception

Here's my logstash.conf:

...
output {
  stdout {
    codec => "json"
  }
  kinesis {
    access_key => "xxx"
    secret_key => "xxx"
    stream_name => "apiplatform-audit-dev"
    region => "us-west-2"
    randomized_partition_key => true
  }
}

When logstash attempts to push an event onto a Kinesis, the following error is thrown:

[shard_map.cc:172] Shard map update for stream "apiplatform-audit-dev" failed: {"__type":"InvalidSignatureException","message":"The request signature we calculated does not match the signature you provided. Check your AWS Secret Access Key and signing method. Consult the service documentation for detail. ...}

My access and secret keys are fine (I have double checked).

This repo is archived

I don't anticipate any future scenario where I have the time/energy to maintain this repo properly, and so I'm archiving it.

This repo still continues to get a bit of activity here and there, which suggests there's enough appetite for it that maybe it's worth someone's time to get it upstreamed into @logstash-plugins.

This Plugin is unstable, as Certificate files are removed after 3 days

The implementation of this plugin with the KPL on the server results in the certificate files being removed after 3 days.

As documented here: awslabs/amazon-kinesis-producer#81
And apparently fixed in KPL version 12.8: awslabs/amazon-kinesis-producer#165

Since the KPL version this plugin currently uses is based on 12.5, we still have the issue.
We're going to attempt a workaround on our AMI, but I wanted to document this for anyone else looking to use this output.

Plugin is not producing logs of underlying KPL

It seems there is no SLF4J binding at all in the package, so the following log entries are printed and no output from the KPL library makes it into the logs:

SLF4J: Failed to load class "org.slf4j.impl.StaticLoggerBinder".
SLF4J: Defaulting to no-operation (NOP) logger implementation
SLF4J: See http://www.slf4j.org/codes.html#StaticLoggerBinder for further details.

I have created a PR (#35) as a fix candidate for this.
Feel free to comment and suggest improvements.

endpoint setting not supported

In order to test locally with something like Localstack, an endpoint setting is required. Trying to add an endpoint => "${KINESIS_ENDPOINT}" setting results in [logstash.outputs.kinesis ] Unknown setting 'endpoint' for kinesis

Currently testing with Logstash 7.5.1.

Someday, one of the Kinesis stream daemon processes shuts down

Dear samcday,
I really appreciate your work developing this excellent application, which I have recently been using in my project. But I have noticed something odd, specifically with the Kinesis stream output process.

When stream data is sent through this output plugin to our system, I have found that the number of KPL daemon processes matters: when 3 daemon processes are running, the output data is correct, but when fewer than 3 are running, our output data is incorrect.

If this issue is already known, please tell me how to handle it.

  1. Is there an appropriate daemon process count - is it fixed at 3?
    If so, does each process have its own role, or are they only used for parallel processing?

  2. Under what circumstances does one of the daemon processes shut down?
    For instance, when Logstash is busy, or when the Kinesis stream is idle and it shuts itself down - any case is useful to me.

bundle error

Hi

we get this error when building the plugin:
~/logstash-output-kinesis$ bundler install
Fetching gem metadata from https://rubygems.org/
Fetching version metadata from https://rubygems.org/
Fetching dependency metadata from https://rubygems.org/
Could not find gem 'logstash-output-kinesis' in source at ..
Source contains 'logstash-output-kinesis' at: 2.1.0 java

Can you please help?

Thanks,
Oliver

Not all records make it into Kinesis unless I pass the --debug flag to logstash

I'm running Logstash 2.0.0 on a MacBook Pro (10.10.5), using the logstash-output-kinesis plugin to send the output to a Kinesis stream in us-east-1. I've configured Logstash to read from stdin and am piping a large file of Apache log events into Logstash.

In looking at the CloudWatch stats (as well as the stats generated by my Spark job, which is reading the events from the Kinesis stream), I observed that I never got as many records going into Kinesis as I expected. The test file I'm using contains 152,925 total events. Of those, 7 are malformed and so get rejected by Logstash, meaning that I would expect to see 152,918 events going into Kinesis. However, what I was observing was that about 50,000-55,000 events would make it into Kinesis each time (it varied a bit per test run).

To try to get some insight into what was going on, I included the --debug flag on the command line when starting Logstash. Suddenly, I was seeing all 152,918 events making it into Kinesis. If I removed the --debug flag, it went back to about 50,000-55,000 events. I tried it both ways a few times, and the results were consistent each time. I also tried using the --verbose flag instead of the --debug flag, but that still resulted in only 50,000-55,000 events getting into Kinesis. Only the --debug flag gave me 100% of the expected events.

Here's the command I use to ingest the Apache log file that results in 50,000-55,000 events getting into Kinesis:

bin/logstash -f logstash-stdin.conf < myApacheLogEvents.txt

And here's the command that results in 100% of events getting into Kinesis:

bin/logstash --debug -f logstash-stdin.conf < myApacheLogEvents.txt

And here's the config I'm using for the logstash-output-kinesis plugin in my Logstash config file:

output {
  #stdout {
  #  codec => json
  #}

  kinesis {
    stream_name => "dreams-webonenike-logs"
    region => "us-east-1"
    #metrics_level => "none"
    access_key => "xxx"
    secret_key => "xxx"
    randomized_partition_key => true
    aggregation_enabled => false
  }
}

As a test, I tried commenting out the kinesis output and uncommenting the stdout output. In that case, I saw that I got 100% of my events out of Logstash, even without including the --debug flag. This suggests to me that whatever is happening involves the Kinesis plugin, and not just Logstash itself.

Dropping lots of messages

I am testing a simple setup of

logstash-forwarder -> logstash -> kinesis -> lambda -> ES

I can get individual messages to flow through without issue.

However, when logstash-forwarder batches lines (say, submitting 10 log lines at once), I only see some or none of the lines in ES. I found that disabling aggregation helped somewhat (when it was true, no batched messages appeared in ES at all).

I'm not seeing any errors in the logstash output, or in my lambda function. Anyone have insights?
