
inovex / trovilo


trovilo collects and prepares files from Kubernetes ConfigMaps for Prometheus & friends

License: Apache License 2.0

Go 89.27% Makefile 9.42% Dockerfile 1.31%
prometheus alertmanager alerts kubernetes monitoring grafana dashboards configmap

trovilo's Introduction

trovilo


trovilo collects and prepares files from Kubernetes ConfigMaps for Prometheus & friends.

Philosophy

This simple helper tool collects ConfigMaps (files) via the Kubernetes API and writes them to the filesystem, where they can be processed by apps such as Prometheus or Grafana. It aims to serve this single purpose in a very generic way: it is not meant to work with one specific app only and won't contain app-specific code. Instead, we try to provide an extensive UI and to keep the code maintainable.

Contributions are highly appreciated. :)

Getting Started

These instructions will get you a copy of the project up and running on your local machine for development and testing purposes. See deployment for notes on how to deploy the project on a live system.

Set up your Go environment if not already done:

$ export GOPATH=${HOME}/GOPATH GOBIN=${HOME}/GOPATH/bin
$ go get -u github.com/inovex/trovilo/cmd/trovilo
$ $GOBIN/trovilo --help
usage: trovilo --config=CONFIG [<flags>]

Trovilo collects and prepares files from Kubernetes ConfigMaps for Prometheus & friends

Flags:
  -h, --help                   Show context-sensitive help (also try --help-long and --help-man).
      --config=CONFIG          YAML configuration file.
      --kubeconfig=KUBECONFIG  Optional kubectl configuration file. If undefined we expect trovilo is running in a pod.
      --log-level="info"       Specify log level (debug, info, warn, error).
      --log-json               Enable JSON-formatted logging on STDOUT.
  -v, --version                Show application version.

Deployment

Deploy the binary to your target systems or use the official Docker image. Note: the tools-tagged Docker image additionally contains useful tools for the verify and post-deploy commands.

Simple trovilo example configuration file:

$ cat trovilo-config.yaml
# Which namespace to check (empty string means all namespaces)
#namespace: ""
jobs:
  # Arbitrary name for identification (and troubleshooting in logs)
  - name: alert-rules
    # Kubernetes-styled label selector to define how to find ConfigMaps
    selector:
      type: prometheus-alerts
    verify:
      # Example verification step to check whether the contents of the ConfigMap are valid Prometheus alert files. %s will be replaced by the ConfigMap's file path(s).
      - name: verify alert rule validity
        cmd: ["promtool", "check", "rules", "%s"]
    target-dir: /etc/prometheus-alerts/
    # Enable directory flattening so all ConfigMap files will be placed into a single directory
    flatten: true
    # After successfully verifying the ConfigMap and deploying it into the target-dir, run the following commands to trigger (e.g. Prometheus) manual config reloads
    post-deploy:
      - name: reload prometheus
        cmd: ["curl", "-s", "-X", "POST", "http://localhost:9090/-/reload"]
  # Another job example, but for Grafana dashboards (JSON model)
  - name: grafana-dashboards
    selector:
      type: grafana-dashboards
    target-dir: tmp/target-grafana-dashboards/
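
The `%s` placeholder in `cmd` is described above as being replaced by the ConfigMap's file path(s). A minimal sketch of such a substitution follows. The helper name `expandCmd` and the example path are assumptions for illustration, and trovilo's real implementation may differ.

```go
package main

import (
	"fmt"
	"strings"
)

// expandCmd returns a copy of cmd in which every "%s" placeholder is
// replaced by path. The input slice is left unmodified.
// Hypothetical helper for illustration; not taken from trovilo's source.
func expandCmd(cmd []string, path string) []string {
	out := make([]string, len(cmd))
	for i, arg := range cmd {
		out[i] = strings.ReplaceAll(arg, "%s", path)
	}
	return out
}

func main() {
	verify := []string{"promtool", "check", "rules", "%s"}
	// The file path below is an illustrative assumption.
	fmt.Println(expandCmd(verify, "/etc/prometheus-alerts/team1.rules"))
	// prints: [promtool check rules /etc/prometheus-alerts/team1.rules]
}
```

Keeping the command as an argument list rather than a single shell string means the path is passed as one argument, so no shell quoting is needed.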

Full example Kubernetes deployment with Prometheus:

$ kubectl apply \
  -f https://raw.githubusercontent.com/inovex/trovilo/master/examples/k8s/alert-rules-team1.yaml \
  -f https://raw.githubusercontent.com/inovex/trovilo/master/examples/k8s/prometheus-config.yaml \
  -f https://raw.githubusercontent.com/inovex/trovilo/master/examples/k8s/trovilo-config.yaml \
  -f https://raw.githubusercontent.com/inovex/trovilo/master/examples/k8s/deployment.yaml

Alternatives

Some projects with very similar use case(s):

License

This project is licensed under the Apache License 2.0. See the LICENSE file for details.

trovilo's People

Contributors

arnisoph, frittentheke, hikhvar, johscheuer, tilladamdmde


trovilo's Issues

add docs

Add README and useful examples

Possible race condition?

Currently there is a possible race condition in the way trovilo is implemented (AFAIK):

Imagine the following flow:

1.) trovilo gets started
2.) add a ConfigMap with the expected labels for trovilo
3.) trovilo adds the ConfigMap (or, more precisely, the content of the ConfigMap) to "disk"
4.) trovilo crashes
5.) Delete the ConfigMap from above
6.) trovilo recovers

--> If a ConfigMap is deleted while trovilo is down after a crash, the ConfigMap's files will never be cleaned up, correct? This line will never be called: https://github.com/inovex/trovilo/blob/master/cmd/trovilo/main.go#L104 or, to be precise, trovilo never checks the initial state of the targetDir.

Fix travis CI

Currently the CI run fails in the publish stage. This should be fixed.

Refactor trovilo job design

Currently trovilo supports multiple jobs (e.g. to gather information for Prometheus and for Grafana). Since trovilo is designed to run inside Kubernetes environments, I don't actually see a benefit in supporting multiple jobs (and adding the complexity). From my perspective, trovilo should always be used as a sidecar to the corresponding service, like in the Prometheus example: https://github.com/inovex/trovilo/blob/master/examples/k8s/deployment.yaml#L60

Are there any reasons why we should keep supporting multiple jobs?

Allow for a decorator phase / command to e.g. force-tag alerts

Thanks for this very helpful tool!

I'd love to be able to define a decorator that applies changes to the data collected from configmaps before feeding it to e.g. Alertmanager or Grafana.

One very distinct use case is Prometheus alert definitions that are collected from multiple Kubernetes namespaces. If one wants to route the alerts based on the source namespace the configmap was picked up from, this metadata needs to be immutably available. In the case of alerts, this requires the source to either ensure the PromQL query leaves the namespace as a label on the data, or to set an additional label or annotation containing the namespace for each and every alert. Ensuring this over hundreds of alerts and many different teams and people, without maintaining the source namespace info for each alert behind the scenes, is prone to fail.

My suggestion is to simply allow a command to run for each configmap collected by trovilo, which then receives the Kubernetes metadata of the individual configmap as environment variables, e.g. K8S_METADATA_NAMESPACE, K8S_METADATA_NAME. The command could simply be a call of sed or maybe even a jsonpatch which decorates the source data with additional info.

Running this arbitrary command and simply providing some variables to it does not make trovilo any more domain specific.
But especially in multi-tenancy the namespace might just be the most important piece of information one wants to add / keep on the data that is then given to Alertmanager or Grafana.
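
A minimal sketch of what this suggestion could look like in Go is shown below. It only builds the command with the two environment variables named above and does not run it. `decoratorCmd` is a hypothetical name, and nothing here reflects an existing trovilo feature.

```go
package main

import (
	"errors"
	"fmt"
	"os"
	"os/exec"
)

// decoratorCmd builds (but does not run) a command that sees the ConfigMap's
// namespace and name as environment variables, in addition to the
// environment of the current process.
// Hypothetical helper for illustration; not an existing trovilo feature.
func decoratorCmd(argv []string, namespace, name string) (*exec.Cmd, error) {
	if len(argv) == 0 {
		return nil, errors.New("decorator command must not be empty")
	}
	cmd := exec.Command(argv[0], argv[1:]...)
	cmd.Env = append(os.Environ(),
		"K8S_METADATA_NAMESPACE="+namespace,
		"K8S_METADATA_NAME="+name,
	)
	return cmd, nil
}

func main() {
	// The sed expression and file path are illustrative assumptions.
	argv := []string{"sed", "-i", "s/^/# decorated\\n/", "/tmp/alerts.rules"}
	cmd, err := decoratorCmd(argv, "team1", "alert-rules")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	// Show only the two variables added for the decorator.
	fmt.Println(cmd.Env[len(cmd.Env)-2:])
}
```

Calling `cmd.Run()` on the result would execute the decorator once per ConfigMap. The command itself stays arbitrary, which keeps the tool generic.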

Trovilo crashes without helpful error message / cause

After running for hours without any issue, Trovilo sometimes crashes with a very short error message like:

"{"error":"EOF","level":"fatal","msg":"Kubernetes ConfigMap watcher encountered an error. Exit..","time":"2018-08-30T16:04:31Z"}"

Unfortunately, there is no indication of what could have caused this. After the container is restarted, it works again for a long time until it crashes in the same manner again. While I cannot rule out an influence from the consumed configmaps, I am fairly certain this is not caused by a change in the configmaps: it also happens when Kubernetes resources are not changed at all over a longer period of time.

Maybe some timeout when talking to the API which is not handled well?
