intelsdi-x / snap-plugin-publisher-influxdb

Publishes Snap metrics to InfluxDB

Home Page: http://snap-telemetry.io/

License: Apache License 2.0

Shell 40.03% Go 41.32% Makefile 1.75% Ruby 14.58% Dockerfile 2.32%

snap-plugin-publisher-influxdb's Introduction

DISCONTINUATION OF PROJECT.

This project will no longer be maintained by Intel.

This project has been identified as having known security escapes.

Intel has ceased development and contributions including, but not limited to, maintenance, bug fixes, new releases, or updates, to this project.

Intel no longer accepts patches to this project.

DISCONTINUATION OF PROJECT

This project will no longer be maintained by Intel. Intel will not provide or guarantee development of or support for this project, including but not limited to, maintenance, bug fixes, new releases or updates. Patches to this project are no longer accepted by Intel. If you have an ongoing need to use this project, are interested in independently developing it, or would like to maintain patches for the community, please create your own fork of the project.

Snap publisher plugin - InfluxDB

This plugin supports pushing metrics into an InfluxDB instance.

It's used in the Snap framework.

  1. Getting Started
  2. Documentation
  3. Community Support
  4. Contributing
  5. License
  6. Acknowledgements

Getting Started

System Requirements

Support Matrix

InfluxDB   InfluxDB Publisher   Snap
1.0        16                   1.0.0
1.1        16                   1.0.0
1.1.1      16                   1.0.0

Known Limitation

  • InfluxDB (tested with InfluxDB 1.0) does not support uint64 as a data type. The Snap publisher plugin converts uint64 metrics to int64. uint64 values above the maximum int64 value are stored in InfluxDB as negative numbers. Overflow cases are logged.
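
The wrap-around described above is ordinary Go integer conversion. The following is a minimal sketch of that arithmetic, not the plugin's code; the helper name toInt64 is hypothetical.

```go
package main

import (
	"fmt"
	"math"
)

// toInt64 (hypothetical helper) performs a plain uint64 -> int64 conversion
// and reports whether the input was too large to survive it unchanged.
func toInt64(v uint64) (converted int64, overflowed bool) {
	return int64(v), v > math.MaxInt64
}

func main() {
	for _, v := range []uint64{42, math.MaxInt64, math.MaxInt64 + 1, math.MaxUint64} {
		converted, overflowed := toInt64(v)
		fmt.Printf("%d -> %d (overflow: %t)\n", v, converted, overflowed)
	}
}
```

Values up to the int64 maximum pass through unchanged; anything larger comes out negative, which is the case the plugin is said to log.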

Installation

Download InfluxDB plugin binary:

You can get the pre-built binaries for your OS and architecture on the plugin's GitHub Releases page.

To build the plugin binary:

Fork https://github.com/intelsdi-x/snap-plugin-publisher-influxdb

Clone repo into $GOPATH/src/github.com/intelsdi-x/:

$ git clone https://github.com/<yourGithubID>/snap-plugin-publisher-influxdb.git

Build the plugin by running make within the cloned repo:

$ make

This builds the plugin in ./build

Configuration and Usage

Documentation

The plugin expects you to provide the following parameters:

  • host
  • database
  • user
  • password

You can also set the following options if needed:

  • skip-verify defaults to false (boolean). Set to true to skip verification of the server certificate, for example when it is not issued by a trusted CA.
  • precision defaults to s (string). The value can be changed to any of the following: n,u,ms,s,m,h. This will determine the precision of timestamps.
  • isMultiFields defaults to false (boolean). When it's true, plugin groups common namespaces, those that differ at the leaf and have same tags including values, into one data point with multiple influx fields.
  • port defaults to 8086 which works with http and https. The port is 4444 for udp in the example.
  • scheme defaults to http.
    • http
    • https
    • udp
  • retention defaults to autogen. It names the retention policy for the database, whose duration determines how long InfluxDB keeps the data. For more information, read Retention Policy Management.
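
To make the options concrete, here is a hedged sketch of how they could map onto the InfluxDB 1.x HTTP write endpoint, whose shape (/write with db, precision and rp query parameters) also appears in the log lines quoted in the issues. The Config type and writeURL function are hypothetical and cover only the http and https schemes; this is not the plugin's implementation.

```go
package main

import (
	"fmt"
	"net"
	"net/url"
)

// Config is a hypothetical holder for the documented options.
type Config struct {
	Scheme, Host, Port, Database, Precision, Retention string
}

// writeURL applies the documented defaults and builds the write endpoint URL.
func writeURL(c Config) string {
	if c.Scheme == "" {
		c.Scheme = "http"
	}
	if c.Port == "" {
		c.Port = "8086"
	}
	if c.Precision == "" {
		c.Precision = "s"
	}
	if c.Retention == "" {
		c.Retention = "autogen"
	}
	q := url.Values{}
	q.Set("db", c.Database)
	q.Set("precision", c.Precision)
	q.Set("rp", c.Retention)
	u := url.URL{
		Scheme:   c.Scheme,
		Host:     net.JoinHostPort(c.Host, c.Port),
		Path:     "/write",
		RawQuery: q.Encode(), // keys are emitted in sorted order
	}
	return u.String()
}

func main() {
	fmt.Println(writeURL(Config{Host: "127.0.0.1", Database: "snap"}))
	// prints: http://127.0.0.1:8086/write?db=snap&precision=s&rp=autogen
}
```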

Examples

See examples/tasks folder for examples.

Here are samples to illustrate the effect of the isMultiFields flag. When isMultiFields is false, which is the default setting, you have to query each measurement separately. When isMultiFields is true, the plugin groups common namespaces (those that differ only at the leaf and have the same tags, including values) into one data point with multiple InfluxDB fields, and you query the common namespace.

Sample isMultiFields=false

select * from "/intel/psutil/load/load1"
time                  source          unit      value
1483997727411599704   egu-mac01.lan   Load/1M   1.76
1483997728412178616   egu-mac01.lan   Load/1M   1.76

Sample isMultiFields=true

select * from "/intel/psutil/load"
time load1 load15 load5 source unit
1483996289995839909 2.05 egu-mac01.lan Load/1M
1483996289995839909 6.21 egu-mac01.lan Load/1M
1483996289995839909 5.26 egu-mac01.lan Load/1M

Roadmap

There isn't a current roadmap for this plugin, but it is in active development. As we launch this plugin, we do not have any outstanding requirements for the next release.

If you have a feature request, please add it as an issue and/or submit a pull request.

Community Support

This repository is one of many plugins in Snap, a powerful telemetry framework. See the full project at http://github.com/intelsdi-x/snap. To reach out to other users, head to the main framework.

Contributing

We love contributions!

There's more than one way to give back, from examples to blogs to code updates. See our recommended process in CONTRIBUTING.md.

License

Snap, along with this plugin, is open source software released under the Apache 2.0 License.

Acknowledgements

And thank you! Your contribution, through code and participation, is incredibly important to us.

snap-plugin-publisher-influxdb's People

Contributors

alrim42, andrzej-k, candysmurf, croseborough, geauxvirtual, ircody, izabellaraulin, jcooklin, katarzyna-z, kdembler, kindermoumoute, kjlyon, luizoz, lynxbat, marcin-krolik, marcin-ol, marcintao, mbbroberg, nanliu, nichelle-hall, patrykmatyjasek, rdower, taotod, thomastaylor312

snap-plugin-publisher-influxdb's Issues

Error with publisher job, connection reset by peer

Snap daemon version : 2.0.0

Environment:

  • Cloud provider or hardware configuration: Azure
  • OS (e.g. from /etc/os-release): CentOS Linux release 7.0.1406 (Core)
  • Kernel (e.g. uname -a): Linux CAZURENDRC012 3.10.0-123.20.1.el7.x86_64 #1 SMP Thu Jan 29 18:05:33 UTC 2015 x86_64 x86_64 x86_64 GNU/Linux
  • Relevant tools (e.g. plugins used with Snap):
file 		 2 		 publisher 	 false 		 loaded 	 Fri, 13 Oct 2017 10:27:30 CEST
psutil 		 14 		 collector 	 false 		 loaded 	 Mon, 16 Oct 2017 09:41:05 CEST
cpu 		 7 		 collector 	 false 		 loaded 	 Fri, 13 Oct 2017 10:48:47 CEST
influxdb 	 25 		 publisher 	 false 		 loaded 	 Fri, 13 Oct 2017 11:53:45 CEST
disk 		 6 		 collector 	 false 		 loaded 	 Fri, 13 Oct 2017 15:47:35 CEST
meminfo 	 4 		 collector 	 false 		 loaded 	 Fri, 13 Oct 2017 17:00:41 CEST
ethtool 	 5 		 collector 	 false 		 loaded 	 Fri, 13 Oct 2017 17:01:53 CEST

What happened:
Publishing metrics to a remote influxdb instance, after some successful hits, the connection gets lost.
The error msg from tail -f /var/log/snap/snapteld.log is:

time="2017-10-16T09:53:57+02:00" level=error msg="error with publisher job" _module=scheduler-job block=run error="Post http://www.mydomain.com:myport/write?consistency=&db=&precision=ns&rp=autogen: read tcp host_private_ip_address:46513->public_ip_of_mydomain:myport: read: connection reset by peer" job-type=publisher plugin-config=map[password:{} skip-verify:{true} user:{} precision:{ns} database:{} isMultiFields:{true} port:{myport} scheme:{http} retention:{autogen} host:{www.mydomain.com}] plugin-name=influxdb plugin-version=-1

invalid plugin name

The plugin name suggested in the repository is "influxdb", when in fact the plugin itself (inside Snap) is called "influx".

This is a minor issue, but when using this plugin either the name should change or the repository.

Some field values saved as string

Snap daemon version (use snapteld -v):
snapteld version 1.2.0

Environment:

  • Cloud provider or hardware configuration:
  • OS (e.g. from /etc/os-release):
    Centos 5
  • Kernel (e.g. uname -a):
    2.6.18-238.9.1.el5
  • Relevant tools (e.g. plugins used with Snap):
    psutil
  • Others (e.g. deploying with Ansible):
    isMultiFields: true
    scheme: http

What happened:
Values for intel/psutil/vm metrics are saved as strings. Others are not saved as strings (intel/psutil/load, intel/psutil/cpu/cpu-total).

What you expected to happen:
Save field values as numeric so I can use aggregations.

Steps to reproduce it (as minimally and precisely as possible):

---
  version: 1
  name: "psutil"
  schedule:
    type: "simple"
    interval: "60s"
  workflow:
    collect:
      metrics:
        /intel/psutil/load/load1: {}
        /intel/psutil/load/load15: {}
        /intel/psutil/load/load5: {}
        /intel/psutil/vm/available: {}
        /intel/psutil/vm/free: {}
        /intel/psutil/vm/used: {}
        /intel/psutil/vm/used_percent: {}
        /intel/psutil/vm/active: {}
        /intel/psutil/vm/buffers: {}
        /intel/psutil/vm/cached: {}
        /intel/psutil/vm/total: {}
        /intel/psutil/cpu/cpu-total/idle: {}
        /intel/psutil/cpu/cpu-total/iowait: {}
        /intel/psutil/cpu/cpu-total/irq: {}
        /intel/psutil/cpu/cpu-total/softirq: {}
        /intel/psutil/cpu/cpu-total/system: {}
        /intel/psutil/cpu/cpu-total/user: {}
      publish:
        - plugin_name: "influxdb"
          config:
            isMultiFields: true
            scheme: http

Incorrect timestamps sent to database

With this plugin built from master, metrics published to influx (v0.9.5) have incorrect timestamps:

Data from task watch (snap built from master 7cb26be): /intel/test/test 2234 2016-02-04 13:02:41.0238152 -0800 PST taylorth-mac03.pdx.intel.com

Data from publisher (I added a print to stdout to get the data):

2016/02/04 13:02:41 Timestamp: 2016-02-04 13:02:41.0238152 -0800 PST
2016/02/04 13:02:41 Struct: {intel/test/test map[source:taylorth-mac03.pdx.intel.com] 2016-02-04 13:02:41.0238152 -0800 PST map[value:2234] s }

Data that appears in the database: 1970-01-01T00:00:01.454618963Z "taylorth-mac03.pdx.intel.com" 2234

It appears that each timestamp in the DB increments by a little bit, but all still show as being in the 1970s. Could this be related to a change in the InfluxDB library?
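
One reading consistent with the stored value is a unit mismatch: the digits 1454618963 look like Unix seconds that were interpreted as nanoseconds. This is an assumption about the cause, not a confirmed diagnosis; the sketch below only shows the arithmetic with Go's standard time package.

```go
package main

import (
	"fmt"
	"time"
)

// stored holds the digits from the timestamp in the database row above.
const stored int64 = 1454618963

// asNanoseconds formats v as if it counted nanoseconds since the Unix epoch.
func asNanoseconds(v int64) string {
	return time.Unix(0, v).UTC().Format(time.RFC3339Nano)
}

// asSeconds formats v as if it counted seconds since the Unix epoch.
func asSeconds(v int64) string {
	return time.Unix(v, 0).UTC().Format(time.RFC3339Nano)
}

func main() {
	fmt.Println(asNanoseconds(stored)) // prints: 1970-01-01T00:00:01.454618963Z
	fmt.Println(asSeconds(stored))     // prints: 2016-02-04T20:49:23Z
}
```

Read as nanoseconds, the value reproduces the 1970 timestamp exactly; read as seconds, it lands on the day of the report. That would also explain why each stored timestamp "increments by a little bit".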

Too many open files

Hi,

I've hit a "too many open files" error while publishing to the InfluxDB. I've noticed that publisher creates a new socket to the database for each publish action, but sockets are not being closed. I'm using Snap v0.13.0-beta.

root@node-8:~# lsof -p 56250 |grep 8086 |wc -l
527

snapd.conf:


---
log_level: 1
log_path: /var/log/snap
control:
  auto_discover_path: /etc/snap/plugin
  plugin_trust_level: 0

task file:


---
version: 1
schedule:
  type: 'simple'
  interval: '1s'
workflow:
  collect:
    metrics:
      /intel/linux/iostat/avg-cpu/%idle: {}
      /intel/linux/iostat/avg-cpu/%iowait: {}
      /intel/linux/iostat/avg-cpu/%nice: {}
      /intel/linux/iostat/avg-cpu/%steal: {}
      /intel/linux/iostat/avg-cpu/%system: {}
      /intel/linux/iostat/avg-cpu/%user: {}
      /intel/linux/iostat/device/ALL/%util: {}
      /intel/linux/iostat/device/ALL/avgqu-sz: {}
      /intel/linux/iostat/device/ALL/avgrq-sz: {}
      /intel/linux/iostat/device/ALL/await: {}
      /intel/linux/iostat/device/ALL/r_await: {}
      /intel/linux/iostat/device/ALL/r_per_sec: {}
      /intel/linux/iostat/device/ALL/rkB_per_sec: {}
      /intel/linux/iostat/device/ALL/rrqm_per_sec: {}
      /intel/linux/iostat/device/ALL/svctm: {}
      /intel/linux/iostat/device/ALL/w_await: {}
      /intel/linux/iostat/device/ALL/w_per_sec: {}
      /intel/linux/iostat/device/ALL/wkB_per_sec: {}
      /intel/linux/iostat/device/ALL/wrqm_per_sec: {}
      /intel/procfs/meminfo/MemUsed: {}
      /intel/procfs/meminfo/MemUsed_perc: {}
      /intel/procfs/meminfo/SwapFree: {}
      /intel/procfs/meminfo/SwapFree_perc: {}
      /intel/procfs/meminfo/MemFree: {}
      /intel/procfs/meminfo/MemFree_perc: {}
      /intel/procfs/load/min1: {}
      /intel/procfs/load/min5: {}
      /intel/procfs/load/min15: {}
      /intel/procfs/iface/br-mgmt/bytes_recv: {}
      /intel/procfs/iface/br-mgmt/bytes_sent: {}
      /intel/procfs/iface/br-storage/bytes_recv: {}
      /intel/procfs/iface/br-storage/bytes_sent: {}
      /intel/procfs/swap/all/cached_bytes: {}
      /intel/procfs/swap/all/cached_percent: {}
      /intel/procfs/swap/all/free_bytes: {}
      /intel/procfs/swap/all/free_percent: {}
      /intel/procfs/swap/all/used_bytes: {}
      /intel/procfs/swap/all/used_percent: {}
    process:
      -
        plugin_name: 'passthru'
        process: null
        publish:
          -
            plugin_name: 'influx'
            config:
              host: 
              port: 8086
              database: snap
              user: 
              password: 

Related log entries:

time="2016-04-28T12:22:45Z" level=debug msg="Batch submission of process and publish nodes" _block=work-jobs _module=scheduler-workflow count-process-nodes=0 count-publish-nodes=1 parent-node-type=processor task-id=f36c8347-4b7d-42e9-85da-3da2099b9062 task-name=Task-f36c8347-4b7d-42e9-85da-3da2099b9062 
time="2016-04-28T12:22:45Z" level=debug msg="Submitting publish job" _block=submit-publish-job _module=scheduler-workflow parent-node-type=processor publish-name=influx publish-version=-1 task-id=f36c8347-4b7d-42e9-85da-3da2099b9062 task-name=Task-f36c8347-4b7d-42e9-85da-3da2099b9062 
time="2016-04-28T12:22:45Z" level=debug msg="starting publisher job" _module=scheduler-job block=run content-type=snap.gob job-type=publisher plugin-config=map[password:{} port:{8086} user:{} database:{snap} host:{}] plugin-name=influx plugin-version=-1 
time="2016-04-28T12:22:45Z" level=debug msg="plugin selected" _module=control-routing block=select hitcount=1014 index="publisher:influx:v12:id1" pool size=1 strategy=least-recently-used 
time="2016-04-28T12:22:45Z" level=error msg="error with publisher job" _module=scheduler-job block=run content-type=snap.gob error="Publish call error: Post http://:8086/write?consistency=&db=snap&precision=s&rp=default: dial tcp :8086: socket: too many open files" job-type=publisher plugin-config=map[password:{} port:{8086} user:{} database:{snap} host:{}] plugin-name=influx plugin-version=-1 
time="2016-04-28T12:22:45Z" level=warning msg="Publish job failed" _block=submit-publish-job _module=scheduler-workflow parent-node-type=processor publish-name=influx publish-version=-1 task-id=f36c8347-4b7d-42e9-85da-3da2099b9062 task-name=Task-f36c8347-4b7d-42e9-85da-3da2099b9062 
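
A common cause of this symptom in Go HTTP clients is building a new client or transport for every request, so that no earlier connection can be reused. Whether that was the cause here is an assumption. The sketch below uses a local httptest server as a stand-in for InfluxDB and counts how many connections the server sees; countNewConns is a hypothetical name.

```go
package main

import (
	"fmt"
	"io"
	"net"
	"net/http"
	"net/http/httptest"
	"strings"
	"sync/atomic"
)

// countNewConns sends n writes to a stand-in server and returns how many
// TCP connections the server saw being opened.
func countNewConns(n int, shareClient bool) int64 {
	var opened int64
	srv := httptest.NewUnstartedServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		io.Copy(io.Discard, r.Body)
		w.WriteHeader(http.StatusNoContent) // InfluxDB 1.x answers a successful write with 204
	}))
	srv.Config.ConnState = func(c net.Conn, s http.ConnState) {
		if s == http.StateNew {
			atomic.AddInt64(&opened, 1)
		}
	}
	srv.Start()
	defer srv.Close()

	shared := &http.Client{Transport: &http.Transport{}}
	defer shared.CloseIdleConnections()
	for i := 0; i < n; i++ {
		client := shared
		if !shareClient {
			// A fresh transport per publish cannot reuse earlier connections,
			// and its idle connection stays open until the server closes it.
			client = &http.Client{Transport: &http.Transport{}}
		}
		resp, err := client.Post(srv.URL+"/write?db=snap", "text/plain", strings.NewReader("load value=1.76"))
		if err != nil {
			panic(err)
		}
		io.Copy(io.Discard, resp.Body) // drain so the connection can be reused
		resp.Body.Close()
	}
	return atomic.LoadInt64(&opened)
}

func main() {
	fmt.Println("shared client:", countNewConns(5, true))
	fmt.Println("client per write:", countNewConns(5, false))
}
```

With a shared client the sequential writes travel over one connection; with a client per write, every publish opens another socket, which at a 1s interval adds up quickly.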

Better exception handling based on user experience on long write

Notes from intelsdi-x/snap#731 -

  • @thomastaylor312 found that err="NaN is an unsupported value for field value" plugin-name=influx plugin-type=publisher plugin-version=10 point=<nil> caused a nil pointer reference and made the plugin panic. The cause was a mixture of bad data and a hard-to-read log (it was all in byte arrays).
  • @geauxvirtual notes that this brings up a valid point that unexpected EOF could be improved upon. While that may be an error message received, it just becomes noise and has little value in knowing exactly why the job failed.
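
One way to avoid the panic described in the first note is to reject NaN and infinite values before a point is built, and to log the rejected field names in readable form. The following is a sketch under that assumption; finiteFields is a hypothetical helper, not part of the plugin.

```go
package main

import (
	"fmt"
	"math"
	"sort"
)

// finiteFields splits fields into values that are safe to write and the
// names of fields whose value is NaN or infinite.
func finiteFields(fields map[string]float64) (map[string]float64, []string) {
	kept := make(map[string]float64, len(fields))
	var rejected []string
	for name, v := range fields {
		if math.IsNaN(v) || math.IsInf(v, 0) {
			rejected = append(rejected, name)
			continue
		}
		kept[name] = v
	}
	sort.Strings(rejected) // map iteration order is random; sort for stable logs
	return kept, rejected
}

func main() {
	kept, rejected := finiteFields(map[string]float64{
		"load1": 1.76,
		"ratio": math.NaN(),
		"rate":  math.Inf(1),
	})
	fmt.Println(kept, rejected) // prints: map[load1:1.76] [rate ratio]
}
```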

No dynamic metrics with 'isMultiFields' option

What happened:
When using InfluxDB with the isMultiFields option set to true, metrics are not dynamic.
i.e. docker collector metrics look like this:
/intel/docker/22e33jfhk/cgroups/cpu_stats/cpu_shares
instead of:
/intel/docker/cgroups/cpu_stats/cpu_shares with docker_id in tag.

Steps to reproduce it (as minimally and precisely as possible):

  1. Run Snap with Docker collector and InfluxDB publisher
  2. Create task with option isMultiFields set to true.
  3. See results in task watch or Grafana.

Anything else do we need to know (e.g. issue happens only occasionally):
Namespaces with dynamic elements should also be handled when isMultiFields is set to true.
https://github.com/intelsdi-x/snap-plugin-publisher-influxdb/blob/master/influxdb/influxdb.go#L252

Medium test should cleanup containers after execution

What happened:
If the medium test fails, the docker-compose containers remain running, and the test will fail again during next execution. This is originally reported in: intelsdi-x/snap-pluginsync#66

What you expected to happen:
Medium test should cleanup docker-compose containers. A trap or something similar would ensure the test is cleaned up.

Steps to reproduce it (as minimally and precisely as possible):

  1. block network connection to cause glide to fail
  2. after test failure docker containers will remain running
  3. re-running medium tests will execute on the existing containers and keep failing.

HTTPS issue

Hello,

PR #121 does not honor the skip-verify option, so both the database creation and the "if exists" check fail:

DEBU[2017-05-23T22:34:10Z] starting publisher job _module=scheduler-job block=run job-type=publisher plugin-config=map[user:{Value:aa} retention:{Value:autogen} precision:{Value:ns} database:{Value:snap} isMultiFields:{Value:true} scheme:{Value:https} skip-verify:{Value:true} host:{Value:192.168.0.2} password:{Value:aa} port:{Value:8086}] plugin-name=influxdb plugin-version=-1

DEBU[2017-05-23T22:34:10Z] time="2017-05-23T22:34:10Z" level=error msg="Post https://192.168.0.2:8086/query?q=CREATE+DATABASE+snap: x509: certificate signed by unknown authority" plugin-name=influxdb plugin-type=publisher plugin-version=22 _module=plugin-exec io=stderr plugin=influxdb

influxdb 1.0 GA

Now that influxdb has a stable 1.0 release would it make sense to standardize this plugin to work with that release?

Trimming namespace with metric labels is broken

When there is more than one label, trimming the namespace does not work correctly.

example namespace and labels

ns := []string{"intel", "foo", "to_remove", "also_to_remove", "bar"}
labels := []core.Label{{Name: "to_remove", Index: 2}, {Name: "also_to_remove", Index: 3}}

Below loop will not remove unwanted namespace parts

for _, label := range m.Labels_ {
    tags[label.Name] = m.Namespace()[label.Index]
    ns = append(m.Namespace()[:label.Index], m.Namespace()[label.Index+1:]...)
}

Proposed solution:

for _, label := range m.Labels_ {
    tags[label.Name] = m.Namespace()[label.Index]
    ns = str.Filter(
        ns,
        func (n string) bool {
            return n != label.Name
        },
    )
}
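
As written, each iteration of the original loop starts again from m.Namespace() rather than from the already trimmed ns, so earlier removals are lost. The proposed filter compares elements with label.Name, which fits the example because the elements and label names coincide. An alternative sketch that removes elements by Index instead is shown below; Label is a local stand-in for core.Label and splitLabels is a hypothetical name.

```go
package main

import "fmt"

// Label is a local stand-in for core.Label from the issue.
type Label struct {
	Name  string
	Index int
}

// splitLabels returns the namespace without the labelled elements,
// plus one tag per label holding the removed element.
func splitLabels(ns []string, labels []Label) ([]string, map[string]string) {
	tags := make(map[string]string, len(labels))
	labelled := make(map[int]bool, len(labels))
	for _, l := range labels {
		if l.Index < 0 || l.Index >= len(ns) {
			continue // ignore labels that point outside the namespace
		}
		tags[l.Name] = ns[l.Index]
		labelled[l.Index] = true
	}
	// Build a fresh slice so the caller's namespace is never modified.
	trimmed := make([]string, 0, len(ns))
	for i, part := range ns {
		if !labelled[i] {
			trimmed = append(trimmed, part)
		}
	}
	return trimmed, tags
}

func main() {
	ns := []string{"intel", "foo", "to_remove", "also_to_remove", "bar"}
	labels := []Label{{Name: "to_remove", Index: 2}, {Name: "also_to_remove", Index: 3}}
	trimmed, tags := splitLabels(ns, labels)
	fmt.Println(trimmed, tags)
}
```

Collecting the indices first avoids the shifting positions that make in-place removal error-prone when several labels are present.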

1.0 support and updated support matrix

The support matrix in the README doesn't mention the current version influxdb 1.0. I think we should update this to include the current release version (16) and the compatibility (or incompatibility) with influxdb 1.0.

config:password is not used

Snap version (use snapctl -v): 0.17.0

Environment:

  • Cloud provider or hardware configuration: KVM VMs
  • OS (e.g. from /etc/os-release): Centos 7.2
  • Kernel (e.g. uname -a): 3.10.0-327.36.1.el7.x86_64
  • Relevant tools (e.g. plugins used with Snap): snap-plugin-publisher-influxdb
  • Others (e.g. deploying with Ansible):

What happened:

  • Configured the InfluxDB publisher plugin to write with user and password (where they're different)
  • Doesn't work, authentication failure

What you expected to happen:

  • Connection to InfluxDB to work.

Steps to reproduce it (as minimally and precisely as possible):

  1. Create an InfluxDB user where user != password
  2. Configure the publish with correct user and password
  3. Run the task

Anything else do we need to know (e.g. issue happens only occasionally):

I have pinpointed it to this line of code, probably a typo https://github.com/intelsdi-x/snap-plugin-publisher-influxdb/blob/master/influxdb/influxdb.go#L103

Plugin returns an error for nil value of metric

The plugin always returns an error when a metric's value is nil. If the publisher plugin fails 10 times, the task is disabled.

ID                   NAME                        STATE       HIT     MISS    FAIL    CREATED         LAST FAILURE
0633c781-0200-4b9e-891a-a8095ebb6a93     Task-0633c781-0200-4b9e-891a-a8095ebb6a93   Disabled    10      0   10      11:07AM 7-13-2016   Publish call error: {"error":"unable to parse 'intel/mysql/mysql_commands/select,source=kzab-Z97X-UD7-TH value= 1468400924': missing field value"}

To reproduce:

  1. create a collector which either always returns 'nil' value of metric or 10 times returns 'nil' value of metric,
  2. create task for publishing metrics from collector in Influxdb.
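
One possible mitigation, sketched under the assumption that skipping a valueless metric is preferable to failing the whole publish call, is to filter nil values out of the batch and report them. Metric is a minimal stand-in for a Snap metric and dropNil is a hypothetical helper.

```go
package main

import "fmt"

// Metric is a minimal stand-in for a Snap metric; only the fields used here.
type Metric struct {
	Namespace string
	Data      interface{}
}

// dropNil returns the metrics that carry a value and the namespaces of
// those that do not, so the caller can log them instead of failing.
func dropNil(metrics []Metric) (kept []Metric, skipped []string) {
	for _, m := range metrics {
		if m.Data == nil {
			skipped = append(skipped, m.Namespace)
			continue
		}
		kept = append(kept, m)
	}
	return kept, skipped
}

func main() {
	kept, skipped := dropNil([]Metric{
		{Namespace: "intel/mysql/mysql_commands/select", Data: nil},
		{Namespace: "intel/mysql/mysql_commands/insert", Data: 12},
	})
	fmt.Println(len(kept), skipped)
}
```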

InfluxDB 1.0 incompatibility

In InfluxDB 1.0 they renamed the default retention policy that gets generated on every database from "default" to "autogen".

The plugin still references "default", and since (as far as I can see in the code) this is not configurable, it is simply unable to push any data to InfluxDB.

I'd suggest:

  • make the retention policy name that is being used configurable
  • change it to "autogen" from "default" as the default value

Publishing metrics with user specified Ids

Snap daemon version (use snapteld -v):

Environment:

  • Cloud provider or hardware configuration:
  • OS (e.g. from /etc/os-release):
  • Kernel (e.g. uname -a):
  • Relevant tools (e.g. plugins used with Snap):
  • Others (e.g. deploying with Ansible):

What happened:

What you expected to happen:

Steps to reproduce it (as minimally and precisely as possible):

Anything else do we need to know (e.g. issue happens only occasionally):

Influxdb configuration for Snap

Hi Team,
Snap daemon version (use snapteld -v): snapteld version 1.0.0
Environment: VM hosted in Esxi

Cloud provider or hardware configuration:NA
OS (e.g. from /etc/os-release): ubuntu 16.04 x86
Kernel (e.g. uname -a): Linux ubuntu 4.4.0-59-generic #80-Ubuntu SMP Fri Jan 6 17:47:47 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux
Relevant tools (e.g. plugins used with Snap): Grafana/influxdb
Others (e.g. deploying with Ansible):NA

What happened:
I have installed Snap and would like to use InfluxDB to store metrics from Snap and display them in Grafana.
1. Is there a plugin to push logs to InfluxDB? I referred to https://github.com/katarzyna-z/snap-plugin-publisher-influxdb, but after running make the next steps are not clear to me. How do I get Snap metrics into InfluxDB, and how do I configure it?
2. Snap DS does not support alert creation in Grafana. Is there any plan to support it in a future release?

What you expected to happen:
I should be able to configure graphs on grafana and grafana should be able to pull metrics from influxdb

Thanks
Malleshi CN

Configurable source of timestamps

The source of timestamps could be configurable, with an option to use either the timestamps assigned to the metrics or timestamps created by the publisher plugin.

Plugin publishes wrong timestamp to the InfluxDB 0.10

I tried to use InfluxDB 0.10, and all entries are stored with the wrong date, 1 January 1970, for example 1970-01-01T00:00:01.456154571Z. A screenshot of the InfluxDB query is attached.

[iolie@irild016 ~]$ ~/snap/opt/snap/bin/snapctl plugin list
NAME       VERSION   TYPE        SIGNED   STATUS   LOADED TIME
file       3         publisher   false    loaded   Mon, 22 Feb 2016 15:22:41 GMT
influx     9         publisher   false    loaded   Mon, 22 Feb 2016 15:22:41 GMT
passthru   1         processor   false    loaded   Mon, 22 Feb 2016 15:22:41 GMT
csl        1         collector   false    loaded   Mon, 22 Feb 2016 15:22:38 GMT
osv        2         collector   false    loaded   Mon, 22 Feb 2016 15:22:40 GMT

Error in large tests

The following errors occur in large tests:

2016-11-24 14:22:52 UTC [     info] creating and starting a task
Using task manifest to create task
Task created
ID: bc7d8324-ea06-41ab-a85e-7b64846251bc
Name: Task-bc7d8324-ea06-41ab-a85e-7b64846251bc
State: Running
[task is running] not ok
[task is hitting] ok
[task has no errors] not ok
error: error executing remote command: command terminated with non-zero exit code: Error executing in Docker Container: 255
+ clean_up
+ kubectl delete ns testrunner-9cfa699f53f013b37190c687305b7fb5

float64 and int conflict

When using the disk collector plugin with influx as publisher, I get the following error:

Publish call error: {"error":"field type conflict: input field \"value\" on measurement \"intel/procfs/disk/merged_read\" is type float64, already exists as type integer"}

When we discussed this on Slack, @IRCody suggested that this could be a publisher issue.

I'm using InfluxDB 0.13, with snap master, influx master and go 1.6.3.

If @intelsdi-x/plugin-maintainers could please take a look, that'd be great.
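
InfluxDB fixes a field's type on first write, so a measurement that receives an integer once and a float later is rejected as in the error above. One possible mitigation, offered as an assumption rather than as the plugin's behaviour, is to convert every numeric value to float64 before writing. The helper name toFloat64 is hypothetical.

```go
package main

import "fmt"

// toFloat64 converts the numeric types a collector might emit to float64.
// The second result is false for non-numeric values.
func toFloat64(v interface{}) (float64, bool) {
	switch n := v.(type) {
	case float64:
		return n, true
	case float32:
		return float64(n), true
	case int:
		return float64(n), true
	case int32:
		return float64(n), true
	case int64:
		return float64(n), true
	case uint32:
		return float64(n), true
	case uint64:
		return float64(n), true
	default:
		return 0, false
	}
}

func main() {
	for _, v := range []interface{}{int64(12), 12.5, uint64(7), "n/a"} {
		f, ok := toFloat64(v)
		fmt.Println(f, ok)
	}
}
```

Two caveats: integers above 2^53 lose precision as float64, and a field already stored as integer keeps that type, so this only helps for new measurements or fields.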

Support mapping (many) snap.Metrics to a (single) data point with (many) fields

Today we publish every snap.Metric as a separate data point with the only influx field being Data. I propose that we add support for mapping common namespaces, those that differ at the leaf, into one data point with multiple influx fields.

For example consider the following namespaces:
/intel/psutil/cpu/idle
/intel/psutil/cpu/user
/intel/psutil/cpu/system

Today the above metrics would result in publishing a data point to three separate series. However, we could enable mapping of these three namespaces to one series and publish one data point that includes fields for idle, user, and system.
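
The proposed mapping can be sketched as grouping by everything before the last namespace element and using the leaf as the field name. This is an illustration only: groupByParent is a hypothetical name, the sample values are made up, and tags are ignored here.

```go
package main

import (
	"fmt"
	"strings"
)

// groupByParent maps namespaces that share everything but the leaf onto
// one measurement, with the leaf as the field name.
func groupByParent(values map[string]float64) map[string]map[string]float64 {
	points := make(map[string]map[string]float64)
	for ns, v := range values {
		i := strings.LastIndex(ns, "/")
		if i <= 0 {
			// No parent to group under; keep the namespace as its own measurement.
			points[ns] = map[string]float64{"value": v}
			continue
		}
		parent, leaf := ns[:i], ns[i+1:]
		if points[parent] == nil {
			points[parent] = make(map[string]float64)
		}
		points[parent][leaf] = v
	}
	return points
}

func main() {
	points := groupByParent(map[string]float64{ // illustrative values
		"/intel/psutil/cpu/idle":   90,
		"/intel/psutil/cpu/user":   7,
		"/intel/psutil/cpu/system": 3,
	})
	fmt.Println(points)
}
```

The three namespaces from the example collapse into one measurement, /intel/psutil/cpu, carrying the fields idle, user and system.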

Publisher not closing tcp connections

I'm trying to test publishing some stats to InfluxDB v1.0 on CentOS 7.
The setup is from the Getting Started Tutorial
I start snapd

sudo nohup /usr/local/bin/snapd --config myconfig.json &
{
    "log_level": 1,
    "log_path": "/var/log/snap",
    "gomaxprocs": 2,
    "control": {
        "auto_discover_path": "/home/centos/snap/plugins_use",
        "plugin_trust_level": 0,
        "plugins": {
            "all": {
                "user": "none",
                "log-level": "debug",
                "password": "none"
            },
            "collector": {
                "all": {
                },
                "psutil": {
                    "all": {
                        "path": "/usr/local/bin/psutil"
                    }
                }
            },
            "publisher": {
            }
        }
    }
}

with snap-plugin-collector-psutil and snap-plugin-publisher-influxdb loaded, and then I create a task with the following config:

snapctl task create -t  tasks/psutil-influx.json
{
    "version": 1,
    "schedule": {
        "type": "simple",
        "interval": "1s"
    },
    "workflow": {
        "collect": {
            "metrics": {
                "/intel/psutil/load/load1": {},
                "/intel/psutil/load/load5": {},
                "/intel/psutil/load/load15": {},
                "/intel/psutil/vm/available": {},
                "/intel/psutil/vm/free": {},
                "/intel/psutil/vm/used": {}
            },
            "process": null,
            "publish": [
                {
                    "plugin_name": "influx",
                    "config": {
                        "log-level" : "debug",
                        "version": 13,
                        "host": "127.0.0.1",
                        "port": 8086,
                        "database": "db_test",
                        "user": "admin",
                        "password": "password"
                    }
                }
            ]
        }
    }
}

The task starts successfully and begins to push data to InfluxDB, but the number of TCP connections continues to grow fast until I stop the task. After that, it returns more or less to the initial number.
Initially I got a "too many open files" error like in issue #51, so I increased the ulimit, but the problem is that connections are not getting closed, and I can see tens of thousands of them after 30 minutes.

[centos@vm snap]$ sudo lsof |grep influx | wc -l
383
[centos@vm snap]$ snapctl task create -t  tasks/psutil-influx.json
Using task manifest to create task
Task created
ID: d7d26338-44e4-4888-adaa-be7e50d79f45
Name: Task-d7d26338-44e4-4888-adaa-be7e50d79f45
State: Running
[centos@vm snap]$ snapctl task list
ID                   NAME                        STATE       HIT     MISS    FAIL    CREATED         LAST FAILURE
d7d26338-44e4-4888-adaa-be7e50d79f45     Task-d7d26338-44e4-4888-adaa-be7e50d79f45   Running     3   0   0   9:46PM 6-29-2016    
[centos@vm snap]$ snapctl task list
ID                   NAME                        STATE       HIT     MISS    FAIL    CREATED         LAST FAILURE
d7d26338-44e4-4888-adaa-be7e50d79f45     Task-d7d26338-44e4-4888-adaa-be7e50d79f45   Running     6   0   0   9:46PM 6-29-2016    
[centos@vm snap]$ sudo lsof |grep influx | wc -l
535
[centos@vm snap]$ sudo lsof |grep influx | wc -l
1765
[centos@vm snap]$ sudo lsof |grep influx | wc -l
17655
[...]
[centos@vm snap]$ snapctl task stop d7d26338-44e4-4888-adaa-be7e50d79f45
Task stopped:
ID: d7d26338-44e4-4888-adaa-be7e50d79f45
[centos@vm snap]$  sudo lsof |grep influx | wc -l
395
[centos@vm snap]$ sudo lsof |grep influx
[...]
influxd   21061 21066 influxdb 1011u     IPv6             221017       0t0        TCP localhost:d-s-n->localhost:37671 (ESTABLISHED)
influxd   21061 21066 influxdb 1012u     IPv6             221019       0t0        TCP localhost:d-s-n->localhost:37672 (ESTABLISHED)
influxd   21061 21066 influxdb 1013u     IPv6             207517       0t0        TCP localhost:d-s-n->localhost:37673 (ESTABLISHED)
influxd   21061 21066 influxdb 1014u     IPv6             221021       0t0        TCP localhost:d-s-n->localhost:37674 (ESTABLISHED)
influxd   21061 21066 influxdb 1015u     IPv6             207520       0t0        TCP localhost:d-s-n->localhost:37675 (ESTABLISHED)
influxd   21061 21066 influxdb 1016u     IPv6             207533       0t0        TCP localhost:d-s-n->localhost:37679 (ESTABLISHED)

Also I can see some logging in the connection pool code (this commit):

logger.Debug("Opening new InfluxDB connection[", user, "@", db, " ", u.String(), "]")
[...]
logger.Debug("Using open InfluxDB connection[", user, "@", db, " ", u.String(), "]")

but I cannot find where the log is being written, nothing about opening or closing connections in /var/log/snap/snapd.log

Integers written to InfluxDB as strings

When using the influxdb publisher with the psutil collector, all integers are written to the database as strings. This causes issues with the aggregator functions when querying the data.

I don't see anything I can do differently to affect the data types submitted to the database. If I have misconfigured something, please let me know how to correct the issue.

Environment

snaptel version

# snapteld -v
snapteld version 1.2.0

# snaptel plugin list
NAME 		 VERSION 	 TYPE 		 SIGNED 	 STATUS 	 LOADED TIME
psutil 		 10 		 collector 	 false 		 loaded 	 Tue, 27 Jun 2017 07:41:58 PDT
file 		 2 		 publisher 	 false 		 loaded 	 Tue, 27 Jun 2017 07:41:59 PDT
influxdb 	 22 		 publisher 	 false 		 loaded 	 Tue, 27 Jun 2017 07:41:59 PDT 

Task Definition

# cat /opt/snap/tasks/psutil.yml
---
version: 1
schedule:
    type: simple
    interval: 5s
workflow:
    collect:
        metrics:
            /intel/psutil/load/load1: {}
            /intel/psutil/load/load5: {}
            /intel/psutil/load/load15: {}

            /intel/psutil/vm/available: {}
            /intel/psutil/vm/buffers: {}
            /intel/psutil/vm/cached: {}
            /intel/psutil/vm/free: {}
            /intel/psutil/vm/inactive: {}
            /intel/psutil/vm/total: {}
            /intel/psutil/vm/used: {}
            /intel/psutil/vm/used_percent: {}
            /intel/psutil/vm/wired: {}

        publish:
          - plugin_name: "file"
            config:
                file: "/tmp/psutil.log"
          - plugin_name: influxdb
            config:
                host: xx.xxx.xx.xxx
                port: 8086
                database: snap
                scheme: http
                isMultiFields: True
                user: admin
                password: admin

Influx Database Version

# influx -version
InfluxDB shell version: 1.2.4

# cat /var/log/syslog | grep -e influxd.*version
Jun 27 08:55:08 hostname influxd[8631]: [I] 2017-06-27T15:55:08Z InfluxDB starting, version 1.2.4, branch master, commit 77909d7c7826afe597b12d957996d6e16cd1afaa
Jun 27 08:55:08 hostname influxd[8631]: [I] 2017-06-27T15:55:08Z Go version go1.7.4, GOMAXPROCS set to 24

Example

Data types of fields

Note that the load fields are all float and the vm used_percent field is also float, but all other vm fields are typed as string.

# influx -database snap -execute 'show field keys'
name: intel/psutil/load
fieldKey fieldType
-------- ---------
load1    float
load15   float
load5    float

name: intel/psutil/vm
fieldKey     fieldType
--------     ---------
available    string
buffers      string
cached       string
free         string
inactive     string
total        string
used         string
used_percent float
wired        string

Queries

If you query for the raw records, they are returned and everything appears as expected.

# influx -database snap -execute 'select "available" from "intel/psutil/vm" where time > now() - 30s'
name: intel/psutil/vm
time                available
----                ---------
1498581320505602065 61972746240
1498581325505964922 61973139456
1498581330505170802 61965983744
1498581335507270905 61965783040
1498581340506459073 61964587008
1498581345506721712 61964021760

However, if you query using an aggregator, the same query fails. Obviously you can't find the mean value of a set of strings.

# influx -database snap -execute 'select mean("available") from "intel/psutil/vm" where time > now() - 30s group by time(10s)'
ERR: unsupported mean iterator type: *influxql.stringInterruptIterator
unsupported mean iterator type: *influxql.stringInterruptIterator

If you run the same queries on a field with a numeric datatype, they function as expected.

# influx -database snap -execute 'select "used_percent" from "intel/psutil/vm" where time > now() - 30s'
name: intel/psutil/vm
time                used_percent
----                ------------
1498581445510829361 8.237404328803912
1498581450510185747 8.236955438811805
1498581455509704698 8.237489253937554
1498581460509345807 8.236087989232464
1498581465511396128 8.236542945305544
1498581470510772477 8.236797720706472

# influx -database snap -execute 'select mean("used_percent") from "intel/psutil/vm" where time > now() - 30s group by time(10s)'
name: intel/psutil/vm
time                mean
----                ----
1498581470000000000 8.23509315195266
1498581480000000000 8.232788041182381
1498581490000000000 8.233097411312077
1498581500000000000 8.2323330851093

The JSON from the file publisher shows the values as integers (no quotes).

# tail -n1 /tmp/psutil.log | python -m json.tool
[
    {
        "data": 0.13,
        "last_advertised_time": "2017-06-27T09:58:14.166305127-07:00",
        "namespace": "/intel/psutil/load/load1",
        "tags": {
            "plugin_running_on": "hostname"
        },
        "timestamp": "2017-06-27T09:58:14.161033571-07:00",
        "unit": "Load/1M",
        "version": 0
    },
    {
        "data": 0.16,
        "last_advertised_time": "2017-06-27T09:58:14.166305783-07:00",
        "namespace": "/intel/psutil/load/load15",
        "tags": {
            "plugin_running_on": "hostname"
        },
        "timestamp": "2017-06-27T09:58:14.161034316-07:00",
        "unit": "Load/15M",
        "version": 0
    },
    {
        "data": 0.21,
        "last_advertised_time": "2017-06-27T09:58:14.166306012-07:00",
        "namespace": "/intel/psutil/load/load5",
        "tags": {
            "plugin_running_on": "hostname"
        },
        "timestamp": "2017-06-27T09:58:14.161034665-07:00",
        "unit": "Load/5M",
        "version": 0
    },
    {
        "data": 62013546496,
        "last_advertised_time": "2017-06-27T09:58:14.166306244-07:00",
        "namespace": "/intel/psutil/vm/available",
        "tags": {
            "plugin_running_on": "hostname"
        },
        "timestamp": "2017-06-27T09:58:14.163725418-07:00",
        "unit": "B",
        "version": 0
    },
    {
        "data": 3328962560,
        "last_advertised_time": "2017-06-27T09:58:14.166306479-07:00",
        "namespace": "/intel/psutil/vm/cached",
        "tags": {
            "plugin_running_on": "hostname"
        },
        "timestamp": "2017-06-27T09:58:14.163725953-07:00",
        "unit": "B",
        "version": 0
    },
    {
        "data": 59098464256,
        "last_advertised_time": "2017-06-27T09:58:14.166306709-07:00",
        "namespace": "/intel/psutil/vm/free",
        "tags": {
            "plugin_running_on": "hostname"
        },
        "timestamp": "2017-06-27T09:58:14.163726333-07:00",
        "unit": "B",
        "version": 0
    },
    {
        "data": 1058287616,
        "last_advertised_time": "2017-06-27T09:58:14.166306933-07:00",
        "namespace": "/intel/psutil/vm/inactive",
        "tags": {
            "plugin_running_on": "hostname"
        },
        "timestamp": "2017-06-27T09:58:14.163726588-07:00",
        "unit": "B",
        "version": 0
    },
    {
        "data": 8.159376329229993,
        "last_advertised_time": "2017-06-27T09:58:14.166307195-07:00",
        "namespace": "/intel/psutil/vm/used_percent",
        "tags": {
            "plugin_running_on": "hostname"
        },
        "timestamp": "2017-06-27T09:58:14.163726898-07:00",
        "unit": "B",
        "version": 0
    },
    {
        "data": 0,
        "last_advertised_time": "2017-06-27T09:58:14.166307426-07:00",
        "namespace": "/intel/psutil/vm/wired",
        "tags": {
            "plugin_running_on": "hostname"
        },
        "timestamp": "2017-06-27T09:58:14.163727224-07:00",
        "unit": "B",
        "version": 0
    },
    {
        "data": 202137600,
        "last_advertised_time": "2017-06-27T09:58:14.166307673-07:00",
        "namespace": "/intel/psutil/vm/buffers",
        "tags": {
            "plugin_running_on": "hostname"
        },
        "timestamp": "2017-06-27T09:58:14.163727517-07:00",
        "unit": "B",
        "version": 0
    },
    {
        "data": 67523002368,
        "last_advertised_time": "2017-06-27T09:58:14.166307913-07:00",
        "namespace": "/intel/psutil/vm/total",
        "tags": {
            "plugin_running_on": "hostname"
        },
        "timestamp": "2017-06-27T09:58:14.163727676-07:00",
        "unit": "B",
        "version": 0
    },
    {
        "data": 5509455872,
        "last_advertised_time": "2017-06-27T09:58:14.166308142-07:00",
        "namespace": "/intel/psutil/vm/used",
        "tags": {
            "plugin_running_on": "hostname"
        },
        "timestamp": "2017-06-27T09:58:14.163727879-07:00",
        "unit": "B",
        "version": 0
    }
]
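The same check can be done programmatically rather than by eye. The sketch below decodes file publisher output of the shape shown above and reports whether each `data` value is an integer, a float, or a string. The helper names `classify` and `kinds` are hypothetical, and the sample input is trimmed to two entries from the output above.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// classify reports the JSON kind of a decoded "data" value.
// It expects numbers to have been decoded as json.Number (see kinds).
func classify(data interface{}) string {
	switch v := data.(type) {
	case string:
		return "string"
	case json.Number:
		// Int64 only succeeds for literals without a fraction or exponent.
		if _, err := v.Int64(); err == nil {
			return "integer"
		}
		return "float"
	default:
		return "other"
	}
}

// kinds maps each metric namespace to the kind of its "data" value.
func kinds(input string) (map[string]string, error) {
	dec := json.NewDecoder(strings.NewReader(input))
	dec.UseNumber() // keep numbers as json.Number instead of float64
	var metrics []struct {
		Namespace string      `json:"namespace"`
		Data      interface{} `json:"data"`
	}
	if err := dec.Decode(&metrics); err != nil {
		return nil, err
	}
	out := make(map[string]string, len(metrics))
	for _, m := range metrics {
		out[m.Namespace] = classify(m.Data)
	}
	return out, nil
}

func main() {
	// Two entries taken from the file publisher output, other keys omitted.
	sample := `[
		{"namespace": "/intel/psutil/vm/available", "data": 62013546496},
		{"namespace": "/intel/psutil/vm/used_percent", "data": 8.159376329229993}
	]`
	result, err := kinds(sample)
	if err != nil {
		fmt.Println("decode error:", err)
		return
	}
	for ns, kind := range result {
		fmt.Println(ns, "=>", kind)
	}
}
```

`UseNumber` matters here: without it, `encoding/json` decodes every number into `float64`, which would hide the integer/float distinction the report depends on.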
