shamil / graphout

Graphout lets you query Graphite or Prometheus, then forward the results to different external services

Home Page: http://shamil.github.io/graphout

License: MIT License

Topics: graphite, prometheus, statuspage, cloudwatch, zabbix, metrics

graphout's Introduction


What is Graphout

Graphout lets you query Graphite or Prometheus, then forward the results to external services such as Zabbix or StatusPage.io custom metrics.

The project is considered BETA, although everything should work. Submit issues and suggestions; pull requests are always welcome.

Why?

Graphite collects metrics, which is very cool, but how can you make use of those metrics beyond visualizing them? What if you have a central monitoring system like Zabbix that is responsible for sending alerts, and you want to alert based on Graphite data? Or what if you want to do AWS Auto Scaling based on Graphite data? How do you get that data into CloudWatch? I'm sure you have your own reasons to get data out of Graphite and into some external tool or service.

So I decided I needed something that could answer the questions above. That is how Graphout started. 💪

Features

  • Accepts any query supported by the Graphite render API
  • HTTPS and HTTP basic authentication
  • Average, maximum and minimum calculation (per output)
  • Query filtering (per output)
  • Log, Zabbix, CloudWatch and StatusPage.io outputs
  • New output modules are easy to write
  • New: support for Prometheus as a query source
  • Docker image available on Docker Hub

Future work

  • Allow setting the interval per query
  • Write unit tests (help needed)
  • Create Upstart and systemd service scripts
  • Nice to have: a Puppet module

Quick start guide

Install

# npm install --global graphout

Usage

# graphout --help
usage: graphout --config <config-path> --pid <pid-path> [-v]

Run

  1. Download the example configuration and save it to /etc/graphout/graphout.json
  2. Adjust the configuration to match your Graphite settings
  3. Make sure the example query works in your environment; if not, change it
  4. Now you can run Graphout:

graphout --pid /tmp/graphout.pid --config /etc/graphout/graphout.json

Result

If all is well, you should see data being written to a log file (/tmp/logoutput.log) by the logoutput module. If not, try setting log_level to debug in the configuration, or post your issue(s) and I'll try to help you get started.

Configuration

Configuration is one or more typical JSON files, with one addition: comments are allowed. You can also include configuration files from the master config; see the include option. The configuration files are validated using a JSON schema, and invalid configuration properties will cause Graphout to exit immediately. Read the schema for the accepted configuration format.

Query engines

Starting from Graphout version 0.4.0, there is support for query engines, which allow using a query source other than Graphite. Currently, the prometheus query engine is supported in addition to graphite.

Graphout allows a single query engine per configuration, which means you can't use graphite and prometheus together; you must specify which query engine you want to use.
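For example, switching the engine could look like this. This is a minimal sketch: the Prometheus URL is a placeholder, and the query name is illustrative.

```json
{
    "query_engine": "prometheus",
    "prometheus_url": "http://prometheus.example.com:9090",

    "queries":
    {
        "node_cpu.5m.avg":
        {
            "query": "sum(irate(node_cpu{mode!='idle'}[5m])) * 100"
        }
    },

    "outputs":
    {
        "logfile":
        {
            "output": "./logoutput",
            "params": { "path": "/tmp/logoutput.log" }
        }
    }
}
```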

Minimal configuration

Note: by default, Graphout uses the graphite query engine.

  • graphite_url is mandatory
  • at least one query must be configured
  • at least one output must be configured

Example

{
    "graphite_url": "http://graphite.example.com:8080",

    "queries":
    {
        "go-carbon.updateOperations":
        {
            "query": "sumSeries(carbon.agents.*.persister.updateOperations)",
            "from": "-1min",
            "until": "now"
        }
    },

    "outputs":
    {
        "logfile":
        {
            "output": "./logoutput",
            "params": {
                "path": "/tmp/logoutput.log"
            }
        }
    }
}

Available configuration options

query_engine

Which query engine to use when executing queries; one of graphite or prometheus. Default is graphite.

graphite_url/prometheus_url

URL of the graphite-web or Prometheus API server. The option must conform to the URI format.

graphite_auth/prometheus_auth

HTTP basic authentication credentials in <username>:<password> format. Optional.

interval

Query interval in seconds. Default is 60 seconds.

log_file

Full path to the log file, default is /var/log/graphout/graphout.log. Set this to /dev/stdout to print to console.

log_level

Minimal log level that will be printed, default is info. Available levels are: error, warn, info and debug.

splay

Delays each query by a consistent random number of seconds. If enabled, the delay is between 1 second and the query interval. Default is false.

include

The include option is a list of configuration files to load. The files are loaded and merged in the specified order. Each include element can contain glob-based wildcards.

Example:

"include": ["/etc/graphout/conf.d/*.json", "/etc/graphout/example.json"]

queries

Query objects, for graphite or prometheus.

For graphite, the format is:

// Alphanumeric unique query name, with dots and hyphens allowed.
"go-carbon.updateOperations":
{
    // the graphite target
    "query": "sumSeries(carbon.agents.*.persister.updateOperations)",

    // relative or absolute time period
    "from": "-1min",

    // relative or absolute time period
    "until": "now",
}

For more information about the query (target), from and until options, read the Graphite Render URL API manual.

Note that Graphout uses the maxDataPoints API option to return at most 60 consolidated data points. The maxDataPoints option has been available since Graphite 0.9.13, so it's best to run a recent version of graphite-web.
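Combining the query options above, the render API request presumably looks roughly like the URL built below. This is a sketch only: the host is the example one from the minimal configuration, and the exact parameter handling in Graphout may differ.

```javascript
// Sketch of the render API URL implied by a query's options.
// URLSearchParams is a Node.js global (Node 10+).
const params = new URLSearchParams({
    target: 'sumSeries(carbon.agents.*.persister.updateOperations)',
    from: '-1min',
    until: 'now',
    format: 'json',
    maxDataPoints: '60'   // cap on consolidated data points
});
const url = `http://graphite.example.com:8080/render?${params}`;
```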

For prometheus, the format is:

// Alphanumeric unique query name, with dots and hyphens allowed.
"prometheus_cpu.5m.avg":
{
    // the prometheus instant-query
    "query": "sum(irate(node_cpu{role='prometheus', mode!='idle'}[5m])) * 100",

    // time=<rfc3339 | unix_timestamp>: evaluation timestamp, optional.
    "time": ""
}

For more information about the query (instant-query) and time options, read the Prometheus HTTP API manual. Currently, Graphout supports only vector result types; open a feature request if you need the matrix type as well.
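For reference, a vector result from the Prometheus instant-query API has this general shape (a trimmed sample; the labels, timestamp and value are illustrative):

```json
{
    "status": "success",
    "data": {
        "resultType": "vector",
        "result": [
            {
                "metric": { "instance": "localhost:9100" },
                "value": [ 1539620504.123, "42.5" ]
            }
        ]
    }
}
```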

outputs

Output objects. The format is:

// Alphanumeric unique output name
"logfile":
{
// output module name, Graphout will use "require" to load the module
    "output": "./logoutput",

    // filter can be used to process only matched queries (using regular expression)
    // default: all queries are processed by the outputs.
    "filter": ".*",

    // the calculation method of the values received from query_engine
    // available methods: "avg", "min", "max"
    // default: "avg"
    "calculation": "avg"

    // "params" properties are specific to the "output" module
    "params": {
        "path": "/tmp/logoutput.log"
    }
}
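The three calculation methods behave as their names suggest; the sketch below illustrates them (assuming, per the values event described further down, that nulls have already been omitted before calculation):

```javascript
// Illustration of the "avg", "min" and "max" calculation methods.
const values = [10, 20, 30];

const calc = {
    avg: v => v.reduce((a, b) => a + b, 0) / v.length,
    min: v => Math.min(...v),
    max: v => Math.max(...v)
};
```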

Outputs configuration

Each output is a Node.js module. The only exception is the built-in logoutput output, which is part of this project. The other currently available outputs, CloudWatch, StatusPage.io and Zabbix, are separate packages. Those outputs are dependencies of this project, so they're installed automatically when you install Graphout.

logoutput

The only param for this output is path, the log file where all query results will be written.

Documentation of supported outputs:

Custom outputs

Custom outputs are easy to write: you write a function that accepts three arguments, listen for incoming events inside it, and process them as you like. Take a look at the logoutput output as an example.

Function arguments

  • events (EventEmitter): where all the events are sent.
  • logger: the logger you can send your logs to.
  • params: the output parameters, i.e. everything passed to the output module (see output params above).

Available events

raw

The very first event; it includes exactly the same data as was retrieved from the query_engine, as a JavaScript object. Two arguments are passed to the event: first the raw data, second the query options object.

values

The values array of the query, before any calculation has been applied (nulls are omitted). Two arguments are passed to the event: first the values array, second the query options object.

result

The calculated result after applying avg, min or max, depending on what was requested in the query options. Two arguments are passed to the event: first the result value, second the query options object.

Internal architecture

diagram

License

Licensed under the MIT License. See the LICENSE file for details.

graphout's People

Contributors

dependabot-preview[bot], dependabot[bot], shamil


graphout's Issues

error with prometheus api

I get an error while trying graphout with prometheus v2.

2018-10-15 18:21:44 severity="info" interval="60000"
2018-10-15 18:21:44 severity="info" log_level="debug"
2018-10-15 18:21:44 severity="info" query_engine="prometheus"
2018-10-15 18:21:44 severity="info" splay="false"
2018-10-15 18:21:44 message="loading query engine" severity="info" engine="prometheus"
2018-10-15 18:21:44 message="loading output" severity="info" output="logfile" module="graphout-output-statuspage-io"
2018-10-15 18:21:44 message="executing query" severity="debug" query="prometheus_cpu.5m.avg" request="{'_pd':null,'protocol':'https:','hostname':XXX','port':null,'path':'/api/v1/query?_=123456&query=up&time=rfc3339','method':'GET','headers':{'Accept':'application/json, text/javascript'}}"
2018-10-15 18:21:44 message="query failed" severity="error" query="prometheus_cpu.5m.avg" error="bad HTTP status (400)"

A simple curl against Prometheus retrieves the metrics without the _=124356 parameter. What is the purpose of this specific parameter?

CloudWatch

Hi

Thanks for this useful tool, which seems to work well. I have tested reading from Prometheus and writing to CloudWatch, and the basic functionality works.

I have two small issues though, that I would like to report.

  1. Decimals are rounded to the nearest value. I use this query to get the memory pressure of my Kubernetes cluster, and would like to get the result into CloudWatch with one or two decimals. A configurable rounding option would be nice:
    // queries section
    "queries":
    {
		"k8sprod2.WorkerMemoryUtilization":
		{
			// the prometheus instant-query
			"query": "sum(container_memory_working_set_bytes{id='/',kube_aws_coreos_com_role='worker'}) / sum(machine_memory_bytes{kube_aws_coreos_com_role='worker'}) * 100",

			// time=<rfc3339 | unix_timestamp>: evaluation timestamp, optional.
			"time": ""
		}
    },
  2. There is no dimension assigned to the metric

graphite metric values passed by graphout are different from those provided by graphite

It seems that graphout logs a value that differs from the one returned by graphite.

From Graphite:
[{"target": "netapp.perf.nl.sdt-cdot1.node.sdt-cdot1-01.processor.avg_processor_busy", "datapoints": [[25.33446831696655, 1461842400], [25.7734740653523, 1461842520], [24.3570468278944, 1461842640], [23.711759390607803, 1461842760], [20.9118678938947, 1461842880], [19.24948126962095, 1461843000], [17.3139677711872, 1461843120], [16.9410851940636, 1461843240], [18.42542053910585, 1461843360], [17.364125610583102, 1461843480], [17.9149330001911, 1461843600], [20.77903744127045, 1461843720], [24.8942426859944, 1461843840], [24.571219714625748, 1461843960], [24.243233581547052, 1461844080], [25.46816505205415, 1461844200], [39.16704786828345, 1461844320], [39.40435667720285, 1461844440], [36.336761017686854, 1461844560], [22.27124974191695, 1461844680], [18.248599238215952, 1461844800], [16.91937757664115, 1461844920], [17.973630250759697, 1461845040], [18.342401167242002, 1461845160], [18.70834590831095, 1461845280], [18.0758846361824, 1461845400], [18.6187341950368, 1461845520], [31.877802993933052, 1461845640], [28.141148342675102, 1461845760], [32.63457091455205, 1461845880], [24.5021316754224, 1461846000], [17.8243685593147, 1461846120], [16.870662293467298, 1461846240], [16.56807217429585, 1461846360], [18.128912716322553, 1461846480], [17.182529652129297, 1461846600], [19.72172020432835, 1461846720], [18.298302881740298, 1461846840], [25.06567155973265, 1461846960], [37.972886343796546, 1461847080], [36.64555787314655, 1461847200], [23.7312807510069, 1461847320], [22.4088348458965, 1461847440], [23.09373883044265, 1461847560], [16.236113348037648, 1461847680], [16.26398928346125, 1461847800], [17.06447199255425, 1461847920], [16.961864672827247, 1461848040], [19.62282370327315, 1461848160], [20.3817087678584, 1461848280]]}]

From /tmp/logoutput.log:

2016-04-28 20:17:47 severity="info" result="25.5372" query="netapp.perf.nl.sdt-cdot1.node.sdt-cdot1-01.processor.avg_processor_busy" from="-1min" until="now" name="sdt-cdot1-01.avg_processor_busy"
2016-04-28 20:18:47 severity="info" result="25.5465" query="netapp.perf.nl.sdt-cdot1.node.sdt-cdot1-01.processor.avg_processor_busy" from="-1min" until="now" name="sdt-cdot1-01.avg_processor_busy"
2016-04-28 20:19:47 severity="info" result="25.5524" query="netapp.perf.nl.sdt-cdot1.node.sdt-cdot1-01.processor.avg_processor_busy" from="-1min" until="now" name="sdt-cdot1-01.avg_processor_busy"
2016-04-28 20:20:47 severity="info" result="25.5496" query="netapp.perf.nl.sdt-cdot1.node.sdt-cdot1-01.processor.avg_processor_busy" from="-1min" until="now" name="sdt-cdot1-01.avg_processor_busy"
2016-04-28 20:21:47 severity="info" result="25.5442" query="netapp.perf.nl.sdt-cdot1.node.sdt-cdot1-01.processor.avg_processor_busy" from="-1min" until="now" name="sdt-cdot1-01.avg_processor_busy"
2016-04-28 20:22:47 severity="info" result="25.5406" query="netapp.perf.nl.sdt-cdot1.node.sdt-cdot1-01.processor.avg_processor_busy" from="-1min" until="now" name="sdt-cdot1-01.avg_processor_busy"
2016-04-28 20:23:47 severity="info" result="25.5363" query="netapp.perf.nl.sdt-cdot1.node.sdt-cdot1-01.processor.avg_processor_busy" from="-1min" until="now" name="sdt-cdot1-01.avg_processor_busy"
2016-04-28 20:24:48 severity="info" result="25.5319" query="netapp.perf.nl.sdt-cdot1.node.sdt-cdot1-01.processor.avg_processor_busy" from="-1min" until="now" name="sdt-cdot1-01.avg_processor_busy"
2016-04-28 20:25:48 severity="info" result="25.5261" query="netapp.perf.nl.sdt-cdot1.node.sdt-cdot1-01.processor.avg_processor_busy" from="-1min" until="now" name="sdt-cdot1-01.avg_processor_busy"
2016-04-28 20:26:48 severity="info" result="25.5212" query="netapp.perf.nl.sdt-cdot1.node.sdt-cdot1-01.processor.avg_processor_busy" from="-1min" until="now" name="sdt-cdot1-01.avg_processor_busy"
2016-04-28 20:27:48 severity="info" result="25.519" query="netapp.perf.nl.sdt-cdot1.node.sdt-cdot1-01.processor.avg_processor_busy" from="-1min" until="now" name="sdt-cdot1-01.avg_processor_busy"
2016-04-28 20:28:48 severity="info" result="25.5176" query="netapp.perf.nl.sdt-cdot1.node.sdt-cdot1-01.processor.avg_processor_busy" from="-1min" until="now" name="sdt-cdot1-01.avg_processor_busy"
2016-04-28 20:29:48 severity="info" result="25.5154" query="netapp.perf.nl.sdt-cdot1.node.sdt-cdot1-01.processor.avg_processor_busy" from="-1min" until="now" name="sdt-cdot1-01.avg_processor_busy"
2016-04-28 20:30:48 severity="info" result="25.5133" query="netapp.perf.nl.sdt-cdot1.node.sdt-cdot1-01.processor.avg_processor_busy" from="-1min" until="now" name="sdt-cdot1-01.avg_processor_busy"
2016-04-28 20:31:48 severity="info" result="25.5123" query="netapp.perf.nl.sdt-cdot1.node.sdt-cdot1-01.processor.avg_processor_busy" from="-1min" until="now" name="sdt-cdot1-01.avg_processor_busy"
2016-04-28 20:32:48 severity="info" result="25.5123" query="netapp.perf.nl.sdt-cdot1.node.sdt-cdot1-01.processor.avg_processor_busy" from="-1min" until="now" name="sdt-cdot1-01.avg_processor_busy"
2016-04-28 20:33:48 severity="info" result="25.5112" query="netapp.perf.nl.sdt-cdot1.node.sdt-cdot1-01.processor.avg_processor_busy" from="-1min" until="now" name="sdt-cdot1-01.avg_processor_busy"
2016-04-28 20:34:48 severity="info" result="25.5098" query="netapp.perf.nl.sdt-cdot1.node.sdt-cdot1-01.processor.avg_processor_busy" from="-1min" until="now" name="sdt-cdot1-01.avg_processor_busy"
2016-04-28 20:35:48 severity="info" result="25.5085" query="netapp.perf.nl.sdt-cdot1.node.sdt-cdot1-01.processor.avg_processor_busy" from="-1min" until="now" name="sdt-cdot1-01.avg_processor_busy"
2016-04-28 20:36:48 severity="info" result="25.5146" query="netapp.perf.nl.sdt-cdot1.node.sdt-cdot1-01.processor.avg_processor_busy" from="-1min" until="now" name="sdt-cdot1-01.avg_processor_busy"
2016-04-28 20:37:48 severity="info" result="25.5271" query="netapp.perf.nl.sdt-cdot1.node.sdt-cdot1-01.processor.avg_processor_busy" from="-1min" until="now" name="sdt-cdot1-01.avg_processor_busy"
2016-04-28 20:38:43 severity="info" result="25.5378" query="netapp.perf.nl.sdt-cdot1.node.sdt-cdot1-01.processor.avg_processor_busy" from="-1min" until="now" name="sdt-cdot1-01.avg_processor_busy"
2016-04-28 20:43:47 severity="info" result="25.5603" query="netapp.perf.nl.sdt-cdot1.node.sdt-cdot1-01.processor.avg_processor_busy" from="-1min" until="now" name="sdt-cdot1-01.avg_processor_busy"
2016-04-28 20:45:53 severity="info" result="25.5607" query="netapp.perf.nl.sdt-cdot1.node.sdt-cdot1-01.processor.avg_processor_busy" from="-1min" until="now" name="sdt-cdot1-01.avg_processor_busy"
2016-04-28 20:46:18 severity="info" result="25.5659" query="netapp.perf.nl.sdt-cdot1.node.sdt-cdot1-01.processor.avg_processor_busy" from="-10min" until="now" name="sdt-cdot1-01.avg_processor_busy"
2016-04-28 20:47:18 severity="info" result="25.5654" query="netapp.perf.nl.sdt-cdot1.node.sdt-cdot1-01.processor.avg_processor_busy" from="-10min" until="now" name="sdt-cdot1-01.avg_processor_busy"
2016-04-28 20:48:18 severity="info" result="25.5656" query="netapp.perf.nl.sdt-cdot1.node.sdt-cdot1-01.processor.avg_processor_busy" from="-10min" until="now" name="sdt-cdot1-01.avg_processor_busy"
2016-04-28 20:49:18 severity="info" result="25.5641" query="netapp.perf.nl.sdt-cdot1.node.sdt-cdot1-01.processor.avg_processor_busy" from="-10min" until="now" name="sdt-cdot1-01.avg_processor_busy"
2016-04-28 20:50:06 severity="info" result="25.5592" query="netapp.perf.nl.sdt-cdot1.node.sdt-cdot1-01.processor.avg_processor_busy" from="-10min" until="now" name="sdt-cdot1-01.avg_processor_busy"
2016-04-28 20:51:06 severity="info" result="25.5538" query="netapp.perf.nl.sdt-cdot1.node.sdt-cdot1-01.processor.avg_processor_busy" from="-10min" until="now" name="sdt-cdot1-01.avg_processor_busy"
2016-04-28 20:52:06 severity="info" result="25.5468" query="netapp.perf.nl.sdt-cdot1.node.sdt-cdot1-01.processor.avg_processor_busy" from="-10min" until="now" name="sdt-cdot1-01.avg_processor_busy"
2016-04-28 20:53:06 severity="info" result="25.5413" query="netapp.perf.nl.sdt-cdot1.node.sdt-cdot1-01.processor.avg_processor_busy" from="-10min" until="now" name="sdt-cdot1-01.avg_processor_busy"
2016-04-28 20:54:32 severity="info" result="25.5352" query="netapp.perf.nl.sdt-cdot1.node.sdt-cdot1-01.processor.avg_processor_busy" from="-10min" until="now" name="sdt-cdot1-01.avg_processor_busy"
2016-04-28 20:55:42 severity="info" result="25.5266" query="netapp.perf.nl.sdt-cdot1.node.sdt-cdot1-01.processor.avg_processor_busy" from="-10min" until="now" name="sdt-cdot1-01.avg_processor_busy"

my "/etc/graphout/graphout.json" file:

/**
 * example Graphout configuration file
 */
{
    // graphite-web options
    "graphite_url": "http://sdt-graphite.nltestlab.hq.netapp.com:81",

    // log file options
    "log_file": "/dev/stdout",
    "log_level": "debug",

    // query interval (in seconds)
    "interval": 60,

    // delay each query by consistent random of seconds
    // if enabled, delay between 1 second and the query interval
    "splay": false,

    // queries section
    "queries":
    {
        "sdt-cdot1-01.avg_processor_busy":
        {
            "query": "netapp.perf.nl.sdt-cdot1.node.sdt-cdot1-01.processor.avg_processor_busy",
            "from": "-1min",
            "until": "now"
        }
    },

    // outputs section
    "outputs":
    {
        "logfile":
        {
            "output": "./logoutput",
            "params": {
                "path": "/tmp/logoutput.log"
            }
        },
        "zabbix":
        {
                "output": "graphout-output-zabbix",
                "params":
                {
                        "host": "localhost",
                        "port": 10051,
                        "target": "monitor",
                        "namespace": "graphout"
                }
        }
    }
}

Am I doing something wrong here or is there a bug?
