Comments (14)

driskell commented on July 23, 2024

Can you share your courier and logstash configs? It's best to close #43 and put the entry here, as it is likely related.

from log-courier.

tedder commented on July 23, 2024

Here is the top of my courier config; the rest is picking up more files, I don't think it's relevant, but lemme know. The logcourier_transport is set to TCP.

Logcourier client

{
  "general": {
    "persist directory": "/opt/logcourier",
    "log stdout": false,
    "log syslog": true,
    "log level": "warning"
  },
  "network": {
    "servers": [ "{{logcourier_forwarder_host}}:{{logcourier_forwarder_port}}" ],
    "transport": "{{logcourier_transport}}"
  },
  "files": [
    {
      "paths": [
        "/var/log/syslog"
      ],

Logcourier server

Again, transport is TCP.

input {
  courier {
    # The port to listen on
    port => {{logcourier_input_port}}
    transport => {{logcourier_transport}}
  }
}

driskell commented on July 23, 2024

Do you get any other messages in the logstash logs apart from failed to flush?

When courier gets EOF it normally means logstash plugin closed the connection - but it normally states the reason in logstash logs. Can you check?

driskell commented on July 23, 2024

Hi @tedder did you manage to get any further into this?

tedder commented on July 23, 2024

hey- you sent this as I was leaving on vacation. I'm back, I'll get it to you within the next few days.

aaliang commented on July 23, 2024

hey @driskell, this may be an unrelated issue to @tedder's error but it appears that max_packet_size is no longer being defaulted on logstash's side. Seems to work correctly if you set a max_packet_size in the logstash conf

I have commit 7cd0c63

log-courier gives me transport error EOF's
without specifying a max_packet_size I consistently get an error on transport:

'[LogCourierServer] ["org/jruby/RubyFixnum.java:931:in `>'", "/home/ec2-user/logstash-1.4.2/vendor/bundle/jruby/1.9/gems/log-courier-0.13/lib/log-courier/server_tcp.rb:176:in `run'", "org/jruby/RubyKernel.java:1521:in `loop'", "/home/ec2-user/logstash-1.4.2/vendor/bundle/jruby/1.9/gems/log-courier-0.13/lib/log-courier/server_tcp.rb:167:in `run'", "/home/ec2-user/logstash-1.4.2/vendor/bundle/jruby/1.9/gems/log-courier-0.13/lib/log-courier/server_tcp.rb:136:in `run'"]: comparison of Fixnum with nil failed (ArgumentError) {:level=>:warn}'

driskell commented on July 23, 2024

Hi @aaliang - I've addressed that in the latest commit. Sorry about that.

tedder commented on July 23, 2024

@driskell, I dug through my servers ("clients" and elasticsearch/logstash receivers) and don't have errors that indicate anything with these EOFs. The last major errors were the ones seen in #43 with the flush, and none of those are recent.

I'm going to increase verbosity of logstash (literally --verbose) and see what happens.

tedder commented on July 23, 2024

Here's the EOF:

Sep 16 01:49:24 ip-xxxx log-courier[10937]: Transport error, will try again: EOF

Here's the same event from the server side:

{:timestamp=>"2014-09-16T01:49:24.343000+0000", :message=>"[LogCourierServer] Connection from 0.0.0.15:55310 closed", :level=>:info}

Here's the context of that specific message. I've included one extra line on each side as context.

{:timestamp=>"2014-09-16T01:48:11.590000+0000", :message=>"[LogCourierServer] New connection from 0.0.0.15:55308", :level=>:info}
{:timestamp=>"2014-09-16T01:48:12.569000+0000", :message=>"[LogCourierServer] Connection from 0.0.0.126:38846 closed", :level=>:info}
{:timestamp=>"2014-09-16T01:48:24.361000+0000", :message=>"[LogCourierServer] New connection from 0.0.0.15:55310", :level=>:info}
{:timestamp=>"2014-09-16T01:49:11.343000+0000", :message=>"[LogCourierServer] Connection from 0.0.0.15:55308 closed", :level=>:info}
{:timestamp=>"2014-09-16T01:49:18.369000+0000", :message=>"[LogCourierServer] New connection from 0.0.0.126:38890", :level=>:info}
{:timestamp=>"2014-09-16T01:49:24.343000+0000", :message=>"[LogCourierServer] Connection from 0.0.0.15:55310 closed", :level=>:info}
{:timestamp=>"2014-09-16T01:49:25.361000+0000", :message=>"[LogCourierServer] New connection from 0.0.0.15:55333", :level=>:info}

One thing I notice is that the connection is open EXACTLY for 60 seconds. That smells like a timeout. (in fact, it's the Amazon TCP ELB idle timeout). If you compare that to an event that was apparently successful, port 55308, its time is not exactly a minute.

Hmm. Does courier leave a connection open and expect to be able to talk to it without doing a keepalive?

(I've scrubbed the IP address)
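
To check the "exactly 60 seconds" observation without reading timestamps by eye, the open and close events can be paired per peer address. This is only a minimal sketch: the `connection_durations` helper and the regular expression are illustrative and not part of Log Courier or Logstash, and the log format is assumed to match the lines quoted above.

```python
import re
from datetime import datetime

# Log lines copied from the Logstash output quoted above (IPs already scrubbed).
LOG = """\
{:timestamp=>"2014-09-16T01:48:11.590000+0000", :message=>"[LogCourierServer] New connection from 0.0.0.15:55308", :level=>:info}
{:timestamp=>"2014-09-16T01:48:12.569000+0000", :message=>"[LogCourierServer] Connection from 0.0.0.126:38846 closed", :level=>:info}
{:timestamp=>"2014-09-16T01:48:24.361000+0000", :message=>"[LogCourierServer] New connection from 0.0.0.15:55310", :level=>:info}
{:timestamp=>"2014-09-16T01:49:11.343000+0000", :message=>"[LogCourierServer] Connection from 0.0.0.15:55308 closed", :level=>:info}
{:timestamp=>"2014-09-16T01:49:18.369000+0000", :message=>"[LogCourierServer] New connection from 0.0.0.126:38890", :level=>:info}
{:timestamp=>"2014-09-16T01:49:24.343000+0000", :message=>"[LogCourierServer] Connection from 0.0.0.15:55310 closed", :level=>:info}
{:timestamp=>"2014-09-16T01:49:25.361000+0000", :message=>"[LogCourierServer] New connection from 0.0.0.15:55333", :level=>:info}
"""

# Hypothetical pattern for this log format: timestamp, event type, peer address.
LINE = re.compile(
    r':timestamp=>"(?P<ts>[^"]+)".*?'
    r'(?P<event>New connection from|Connection from) (?P<peer>[\d.]+:\d+)'
)


def connection_durations(log_text):
    """Return {peer: seconds open} for peers with both an open and a close event."""
    opened, durations = {}, {}
    for line in log_text.splitlines():
        match = LINE.search(line)
        if not match:
            continue
        ts = datetime.strptime(match["ts"], "%Y-%m-%dT%H:%M:%S.%f%z")
        peer = match["peer"]
        if match["event"] == "New connection from":
            opened[peer] = ts
        elif peer in opened:
            # A close for a peer we saw open: record how long it lasted.
            durations[peer] = (ts - opened.pop(peer)).total_seconds()
    return durations


for peer, seconds in connection_durations(LOG).items():
    print(f"{peer} was open for {seconds:.3f}s")
```

Connections whose open or close falls outside the excerpt are skipped, so only complete open/close pairs are reported.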

driskell commented on July 23, 2024

Hi @tedder

Ah, you didn't mention ELB.

Yes it will open a connection and expect it to remain open. Unencumbered, a TCP connection would remain open indefinitely.

Firewalls and NAT generally limit it to as low as an hour. Thus courier has an internal keepalive that triggers every 30 minutes.

A one-minute timeout is very low. You should increase it if you can. IIRC you can increase it now for ELB.

Jason

tedder commented on July 23, 2024

Sorry for not mentioning it. I was trying to figure out the key issues; that should have been more obvious. As of this summer it's settable, so I'll increase it to an hour.

I see the timeouts are here.. too bad I can't configure them to ping every 30 seconds.

And for posterity, here's how to set it in AWS ELB:

driskell commented on July 23, 2024

It's actually here for Log Courier:

keepalive_timeout time.Duration = 900 * time.Second

The first one you found is the keep alive for the courier output Logstash plugin:

@keepalive_timeout = 1800

The other one you found is for the admin connection:

// TODO: Make idle timeout configurable

I'll be making the Log Courier one configurable at some point in the future, and probably the admin connection one too. With regards to the Logstash output plugin, not any time soon - that's behind in development in many areas already as I don't know of anyone using it (it's more of a forwarder legacy plugin I kept just in case). Once I see people using the output plugin I'll start catching it up to Log Courier standards. I'll also be documenting it as such shortly as those documentation pages are my next piece.

tedder commented on July 23, 2024

Thanks for fixing my file searches. I just looked for "1800". No worries about making it configurable, I can normalize my stack to have a long timeout.

Feel free to close this (or I will).

tedder commented on July 23, 2024

I reconfigured the timeout (including plumbing it through Ansible), zero transport errors since then. yay!

For posterity/SEO, here's the aws-cli command to reconfigure it:

AWS_ACCESS_KEY_ID="xx" AWS_SECRET_ACCESS_KEY="yy" aws elb modify-load-balancer-attributes --load-balancer-name LBNAME --load-balancer-attributes "{\"ConnectionSettings\":{\"IdleTimeout\":3600}}"
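
Why the longer idle timeout helps comes down to simple arithmetic: an idle connection only survives if a keepalive is sent before the load balancer's idle timeout expires. The sketch below works through the numbers quoted in this thread; `keepalive_survives` and the constant names are hypothetical, and the model assumes the keepalive is the only traffic on an otherwise idle connection.

```python
def keepalive_survives(keepalive_interval_s, idle_timeout_s):
    """True if a keepalive fires before the idle timeout would drop the connection."""
    return keepalive_interval_s < idle_timeout_s


# Values quoted in the thread (names are illustrative only).
LOG_COURIER_KEEPALIVE_S = 900     # keepalive_timeout in Log Courier
OUTPUT_PLUGIN_KEEPALIVE_S = 1800  # @keepalive_timeout in the Logstash output plugin
ELB_IDLE_BEFORE_S = 60            # idle timeout observed in the logs
ELB_IDLE_AFTER_S = 3600           # IdleTimeout set with the aws-cli command

for label, idle in (("before", ELB_IDLE_BEFORE_S), ("after", ELB_IDLE_AFTER_S)):
    for name, keepalive in (
        ("Log Courier", LOG_COURIER_KEEPALIVE_S),
        ("output plugin", OUTPUT_PLUGIN_KEEPALIVE_S),
    ):
        verdict = "stays open" if keepalive_survives(keepalive, idle) else "gets dropped"
        print(f"{label} ({idle}s idle timeout): {name} keepalive {keepalive}s {verdict}")
```

With the 60-second timeout neither keepalive fires in time, while a 3600-second timeout leaves room for both.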
