filebeat-kubernetes


Filebeat container, an alternative to fluentd, used to ship Kubernetes cluster and pod logs.

Getting Started

This container is designed to run in a pod on Kubernetes and ship logs to Logstash for further processing. You can set the following environment variables to customize it.

LOGSTASH_HOSTS=example.com:4083,example.com:4084
LOG_LEVEL=info  # log level for filebeat. Defaults to "error".
FILEBEAT_HOST=ip-a-b-c-d # custom "host" field. See the manifest below to set it to the k8s nodeName.
CLUSTER_NAME=my_cluster # Kubernetes cluster name, to identify clusters if you run more than one. Defaults to "default".

The endpoints listed in LOGSTASH_HOSTS should be listening with the Beats input plugin.
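For reference, a minimal Logstash input sketch that listens with the Beats input plugin (the port number is an assumption matching the example hosts above):

```
input {
  beats {
    port => 4083
  }
}
```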

This should be run as a Kubernetes DaemonSet (a pod on every node).

The updateStrategy determines how changes to the DaemonSet are rolled out; see the K8s docs.

Example manifest:

apiVersion: extensions/v1beta1
kind: DaemonSet
metadata:
  name: filebeat
  namespace: kube-system
  labels:
    app: filebeat
spec:
  updateStrategy:
    type: RollingUpdate
  template:
    metadata:
      labels:
        app: filebeat
      name: filebeat
    spec:
      containers:
      - name: filebeat
        image: apsops/filebeat-kubernetes:v0.4
        resources:
          limits:
            cpu: 50m
            memory: 50Mi
        env:
          - name: LOGSTASH_HOSTS
            value: myhost.com:5000
          - name: LOG_LEVEL
            value: info
          - name: CLUSTER_NAME
            value: my_cluster
          - name: FILEBEAT_HOST
            valueFrom:
                fieldRef:
                  fieldPath: spec.nodeName
        volumeMounts:
        - name: varlog
          mountPath: /var/log/containers
        - name: varlogpods
          mountPath: /var/log/pods
          readOnly: true
        - name: varlibdockercontainers
          mountPath: /var/lib/docker/containers
          readOnly: true
      terminationGracePeriodSeconds: 30
      tolerations:
      - key: node-role.kubernetes.io/master
        effect: NoSchedule
      volumes:
      - name: varlog
        hostPath:
          path: /var/log/containers
      - name: varlogpods
        hostPath:
          path: /var/log/pods
      - name: varlibdockercontainers
        hostPath:
          path: /var/lib/docker/containers

Filebeat parses the Docker JSON logs and applies the multiline filter on the node before pushing logs to Logstash.

Make sure you add a filter to your Logstash configuration if you want to process the actual log lines.

filter {
  if [type] == "kube-logs" {

    mutate {
      rename => ["log", "message"]
    }

    date {
      match => ["time", "ISO8601"]
      remove_field => ["time"]
    }

    grok {
      match => { "source" => "/var/log/containers/%{DATA:pod_name}_%{DATA:namespace}_%{GREEDYDATA:container_name}-%{DATA:container_id}.log" }
      remove_field => ["source"]
    }
  }
}

This grok pattern adds the fields pod_name, namespace, container_name, and container_id to the log entry in Elasticsearch.
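As a standalone illustration (not part of the project), the grok pattern above corresponds roughly to the following regex: %{DATA} maps to a non-greedy .*? and %{GREEDYDATA} to a greedy .*. The sample path and names here are made up:

```python
import re

# Rough Python equivalent of the grok pattern above.
PATTERN = re.compile(
    r"/var/log/containers/"
    r"(?P<pod_name>.*?)_(?P<namespace>.*?)_"
    r"(?P<container_name>.*)-(?P<container_id>.*?)\.log"
)

# Hypothetical log file path following the k8s naming scheme:
# <pod>_<namespace>_<container>-<container id>.log
source = "/var/log/containers/nginx-4d5f_default_nginx-0123456789ab.log"
m = PATTERN.match(source)
print(m.groupdict())
# pod_name, namespace, container_name, and container_id are extracted
```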

Contributing

I plan to make this more modular and reliable.

Feel free to open issues and pull requests for bug fixes or features.

License

This project is licensed under the MIT License. Refer to LICENSE for details.

filebeat-kubernetes's People

Contributors

apsops · c0psrul3 · timbuchwaldt · toddiuszho


filebeat-kubernetes's Issues

Host field should show the physical hostname

Today it shows the pod name as the host.

The FluentD daemonset (https://github.com/fluent/fluentd-kubernetes-daemonset) follows a better practice: it shows the physical name of the node the log came from.

I think the physical name would be more helpful in a troubleshooting situation: patterns pointing to node failure would be easier to identify. Plus, showing the pod name is redundant, since it appears in other fields as well.

Symlink File Mounting Issue

My Filebeat DaemonSet was not able to read any container log files using the provided YAML. I kept getting the error: stat (logfile-path.log): no such file or directory.

To fix the issue I had to also mount the /var/log/pods directory into the container.

I wasn't able to track down the exact changelog entry, but I believe this is due to changes in how the log files are mounted in Kubernetes 1.6.

My working daemonset:

apiVersion: extensions/v1beta1
kind: DaemonSet
metadata:
  name: filebeat
  labels:
    app: filebeat
spec:
  template:
    metadata:
      labels:
        app: filebeat
      name: filebeat
    spec:
      containers:
      - name: filebeat
        image: apsops/filebeat-kubernetes:v0.3
        resources:
          limits:
            cpu: 50m
            memory: 50Mi
        env:
          - name: LOGSTASH_HOSTS
            value: my-host:my-port
          - name: LOG_LEVEL
            value: debug
        volumeMounts:
        - name: varlog
          mountPath: /var/log/containers
        - name: varlibdockercontainers
          mountPath: /var/lib/docker/containers
          readOnly: true
        - name: varlogpods
          mountPath: /var/log/pods
          readOnly: true
      terminationGracePeriodSeconds: 30
      volumes:
      - name: varlog
        hostPath:
          path: /var/log/containers
      - name: varlibdockercontainers
        hostPath:
          path: /var/lib/docker/containers
      - name: varlogpods
        hostPath:
          path: /var/log/pods

PS. Thanks for putting this project together, it saved me quite a bit of time.

Why mount /var/lib/docker/containers?

Hi,

I'm also trying to use filebeat + ELK instead of a default fluentd solution to collect k8s logs, and this repository is really helpful. Thank you!

Could I just ask why you're mounting the original log storage location, /var/lib/docker/containers, and not only /var/log/containers with read-only permission? Is it necessary for running this container?

variable replacement

In my container, the variables in /etc/filebeat/filebeat.yml aren't being replaced. How does the process work?

Kubernetes system logs

It seems the Filebeat config only ships the pod logs from /var/log/containers/*.log.

I'm running Kubernetes in Rancher, where Kubernetes itself is containerized and logs to the standard /var/lib/docker/containers folder. These files are not as nicely named as the symlinked ones in /var/log/containers, but they hold very important information, like the kubelet log.

Is this shipper aiming to solve this problem?

I'm happy to contribute, but don't know what strategy to follow.

Error in resolving LOGSTASH_HOSTS

Hi

I get the following error when running the container:

tcp.go:26: WARN DNS lookup failure "'logstash.foo": lookup 'logstash.foo: invalid domain name

My yaml file looks like:

   - name: LOGSTASH_HOSTS
     value: "'logstash.foo:5959'" 

The container works correctly if I pass logstash.foo:5959 in filebeat.yml e.g.

hosts: ["logstash.foo:5959"]

Any ideas?
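A likely cause (an assumption, not confirmed in this thread): the inner single quotes are part of the YAML string, so Filebeat receives 'logstash.foo:5959' with literal quotes, which matches the DNS lookup failure on "'logstash.foo". A sketch of the fix is to quote the value only once:

```
   - name: LOGSTASH_HOSTS
     value: "logstash.foo:5959"
```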

cluster_name field not being shipped

Hello, I started using the container recently, and it's working great so far. There's an env variable to add the cluster_name if you are running multiple clusters; however, I'm unable to see that field after shipping our logs through our logging pipeline. We are shipping to Logstash 5.2.2. This is not a big deal for us at the moment, but I wanted to bring it to your attention.
