performancecopilot / speed


A Go implementation of the PCP instrumentation API

License: MIT License

Makefile 0.48% Go 99.52%

pcp metrics vector go go-kit performance monitoring observability

speed's Introduction

Golang implementation of the Performance Co-Pilot (PCP) instrumentation API

Install
Walkthrough
- SingletonMetric
- InstanceMetric
- Counter
- CounterVector
- Gauge
- GaugeVector
- Timer
- Histogram
Go Kit

Install

Prerequisites

PCP

Install Performance Co-Pilot on your local machine, either using prebuilt archives or by getting and building the source code. For detailed instructions, read the page from pcp.readthedocs.io.

Go

Set up a Go environment on your computer. For more information about these steps, read How to Write Go Code.

- Download and install Go 1.6 or above from https://golang.org/dl
- Set $GOPATH to the root folder where you want to keep your Go code
- Add $GOPATH/bin to your $PATH by adding export PATH=$GOPATH/bin:$PATH to your shell configuration file, such as .bashrc if you use a Bourne shell variant

The grafana-pcp plugin provides PCP metrics in the popular Grafana visualization tool. It includes PCP Vector, a live datasource for metrics exposed using Performance Co-Pilot. Metrics you create with Speed are immediately visible in Grafana using this datasource.

Getting the library

go get github.com/performancecopilot/speed

Getting the examples

All examples are executable Go programs. Running

go get github.com/performancecopilot/speed/examples/<example name>

fetches the example and installs an executable in $GOPATH/bin. If that directory is on your path, running

<example name>

executes the example.

Walkthrough

The library defines three main components: a Client, a Registry and a Metric. A client is created using an application name, and the same name is used to create a memory mapped file in PCP_TMP_DIR. Each client contains a registry of the metrics it holds, which it publishes when activated. It also has a SetFlag method that lets you set an MMV flag while a mapping is not active, to one of three values: NoPrefixFlag, ProcessFlag and SentinelFlag. ProcessFlag is the default and reports metrics prefixed with the application name (for example, mmv.app_name.metric.name). NoPrefixFlag reports metrics without the application name prefix (for example, mmv.metric.name), which can lead to namespace collisions, so use it only when you are sure the names are unique.
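The naming difference between ProcessFlag and NoPrefixFlag can be sketched with the standard library alone. The reportedName helper below is hypothetical and is not part of speed; it only reproduces the two name shapes given above.

```go
package main

import (
	"fmt"
	"strings"
)

// reportedName is a hypothetical helper that composes the metric name as it
// would appear in PCP, with or without the application name prefix.
func reportedName(appName, metricName string, prefixWithApp bool) string {
	parts := []string{"mmv"}
	if prefixWithApp {
		parts = append(parts, appName)
	}
	parts = append(parts, metricName)
	return strings.Join(parts, ".")
}

func main() {
	fmt.Println(reportedName("app_name", "metric.name", true))  // mmv.app_name.metric.name
	fmt.Println(reportedName("app_name", "metric.name", false)) // mmv.metric.name
}
```

Two applications that both publish metric.name without a prefix would end up with the same reported name, which is the collision risk mentioned above.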

A client can register metrics through two interfaces. The first is the Register method, which takes a raw metric object. The second is RegisterString, which takes a string describing the metrics and instances to register, similar to the interface in parfait, along with type, semantics and unit, in that order. A client is activated by calling the Start method and deactivated by calling the Stop method. While a client is active, no new metrics can be registered, but an active client can be stopped in order to register more metrics.
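The registration rule described above (no registration while active, stop first to register more) can be illustrated with a small mock. The mockClient type is hypothetical and uses only the standard library; it is not speed's Client and makes no claim about how speed enforces the rule internally.

```go
package main

import (
	"errors"
	"fmt"
)

// mockClient is a hypothetical stand-in used only to illustrate the
// lifecycle; it is not speed's Client implementation.
type mockClient struct {
	active  bool
	metrics []string
}

// Register adds a metric name and fails while the client is active.
func (c *mockClient) Register(name string) error {
	if c.active {
		return errors.New("cannot register " + name + " while client is active")
	}
	c.metrics = append(c.metrics, name)
	return nil
}

// Start activates the client, after which registration is rejected.
func (c *mockClient) Start() { c.active = true }

// Stop deactivates the client so that registration is allowed again.
func (c *mockClient) Stop() { c.active = false }

func main() {
	c := &mockClient{}
	fmt.Println(c.Register("simple.counter")) // allowed: client is inactive

	c.Start()
	fmt.Println(c.Register("late.metric")) // rejected: client is active

	c.Stop()
	fmt.Println(c.Register("late.metric")) // allowed again after Stop
	c.Start()

	fmt.Println(len(c.metrics), "metrics registered")
}
```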

Each client contains an instance of the Registry interface, which can give different information like the number of registered metrics and instance domains. It also exports methods to register metrics and instance domains.

Finally, metrics are defined as implementations of different metric interfaces, all of which implement the Metric interface. The metric types defined are:

SingletonMetric

This type defines a metric with no instance domain and only one value. It requires type, semantics and unit for construction, and optionally takes a couple of description strings. A simple construction:

metric, err := speed.NewPCPSingletonMetric(
	42,                                                             // initial value
	"simple.counter",                                               // name
	speed.Int32Type,                                                // type
	speed.CounterSemantics,                                         // semantics
	speed.OneUnit,                                                  // unit
	"A Simple Metric",                                              // short description
	"This is a simple counter metric to demonstrate the speed API", // long description
)

A SingletonMetric supports a Val method that returns the metric value and a Set(interface{}) method that sets the metric value.

InstanceMetric

An InstanceMetric is a single metric object containing multiple values of the same type for multiple instances. It requires an instance domain along with type, semantics and unit for construction, and optionally takes a couple of description strings. A simple construction:

indom, err := speed.NewPCPInstanceDomain(
	"Acme Products",                                          // name
	[]string{"Anvils", "Rockets", "Giant_Rubber_Bands"},      // instances
	"Acme products",                                          // short description
	"Most popular products produced by the Acme Corporation", // long description
)

...

countmetric, err := speed.NewPCPInstanceMetric(
	speed.Instances{
		"Anvils":             0,
		"Rockets":            0,
		"Giant_Rubber_Bands": 0,
	},
	"products.count",
	indom,
	speed.Uint64Type,
	speed.CounterSemantics,
	speed.OneUnit,
	"Acme factory product throughput",
	`Monotonic increasing counter of products produced in the Acme Corporation
	factory since starting the Acme production application.  Quality guaranteed.`,
)

An InstanceMetric supports a ValInstance(string) method that returns the value of a particular instance, and a SetInstance(interface{}, string) method that sets it.

Counter

A counter is simply a PCPSingletonMetric with Int64Type, CounterSemantics and OneUnit. It can optionally take a short and a long description.

A simple example

c, err := speed.NewPCPCounter(0, "a.simple.counter")

A counter supports Set(int64) to set a value, Inc(int64) to increment by a custom delta and Up() to increment by 1.

CounterVector

A CounterVector is a PCPInstanceMetric with Int64Type, CounterSemantics and OneUnit, and an instance domain that is created and registered on initialization with the name metric_name.indom.

A simple example

c, err := speed.NewPCPCounterVector(
	map[string]uint64{
		"instance1": 0,
		"instance2": 1,
	},
	"another.simple.counter",
)

It supports Val(string), Set(uint64, string), Inc(uint64, string) and Up(string) amongst other things.

Gauge

A Gauge is a simple SingletonMetric storing float64 values, i.e. a PCP Singleton Metric with DoubleType, InstantSemantics and OneUnit.

A simple example

g, err := speed.NewPCPGauge(0, "a.sample.gauge")

It supports Val(), Set(float64), Inc(float64) and Dec(float64).

GaugeVector

A GaugeVector is a PCP instance metric with DoubleType, InstantSemantics and OneUnit and an autogenerated instance domain. A simple example:

g, err := speed.NewPCPGaugeVector(map[string]float64{
	"instance1": 1.2,
	"instance2": 2.4,
}, "met")

It supports Val(string), Set(float64, string), Inc(float64, string) and Dec(float64, string).

Timer

A timer stores the time elapsed for different operations. It is not compatible with PCP's elapsed type metrics. It takes a name and a TimeUnit for construction.

timer, err := speed.NewPCPTimer("test", speed.NanosecondUnit)

Calling timer.Start() signals the start of an operation.

Calling timer.Stop() signals the end of an operation and returns the total elapsed time calculated by the metric so far.
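To illustrate the Start/Stop semantics described above, here is a minimal standard-library sketch of a timer that accumulates elapsed time across operations. The stopwatch type and its error behaviour are hypothetical and are not part of speed; they only mirror the description that Stop returns the total time measured so far.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// stopwatch is a hypothetical stand-in for a timer metric: it accumulates
// the elapsed time of every completed Start/Stop pair.
type stopwatch struct {
	started bool
	begin   time.Time
	total   time.Duration
}

// Start marks the beginning of an operation.
func (s *stopwatch) Start() error {
	if s.started {
		return errors.New("stopwatch already started")
	}
	s.started = true
	s.begin = time.Now()
	return nil
}

// Stop marks the end of an operation and returns the total time
// accumulated so far, across all operations.
func (s *stopwatch) Stop() (time.Duration, error) {
	if !s.started {
		return 0, errors.New("stopwatch not started")
	}
	s.total += time.Since(s.begin)
	s.started = false
	return s.total, nil
}

func main() {
	var sw stopwatch
	for i := 0; i < 2; i++ {
		if err := sw.Start(); err != nil {
			panic(err)
		}
		time.Sleep(10 * time.Millisecond)
		total, err := sw.Stop()
		if err != nil {
			panic(err)
		}
		fmt.Println("total elapsed so far:", total)
	}
}
```

The second line printed is larger than the first because the total covers both operations, not just the most recent one.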

Histogram

A histogram implements a PCP Instance Metric that reports the mean, variance and standard_deviation, backed by codahale's hdrhistogram implementation in Go. It also returns a custom percentile and buckets for plotting graphs. Construction requires a low value, a high value and the number of significant figures.

m, err := speed.NewPCPHistogram("hist", 0, 1000, 5)
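As a rough illustration of what the mean, variance and standard_deviation instances represent, the sketch below computes them incrementally with Welford's algorithm using only the standard library. The summary type is hypothetical, and the use of the population variance is an assumption; the text above does not say which variant the histogram reports, and this is not how hdrhistogram stores its data.

```go
package main

import (
	"fmt"
	"math"
)

// summary keeps running statistics using Welford's online algorithm.
type summary struct {
	n    int64
	mean float64
	m2   float64 // sum of squared deviations from the running mean
}

// record folds one observed value into the running statistics.
func (s *summary) record(v float64) {
	s.n++
	delta := v - s.mean
	s.mean += delta / float64(s.n)
	s.m2 += delta * (v - s.mean)
}

// variance returns the population variance (an assumption made for this
// sketch), or 0 when nothing has been recorded.
func (s *summary) variance() float64 {
	if s.n == 0 {
		return 0
	}
	return s.m2 / float64(s.n)
}

// stddev returns the square root of the variance.
func (s *summary) stddev() float64 { return math.Sqrt(s.variance()) }

func main() {
	var s summary
	for _, v := range []float64{2, 4, 4, 4, 5, 5, 7, 9} {
		s.record(v)
	}
	fmt.Printf("mean=%.2f variance=%.2f standard_deviation=%.2f\n",
		s.mean, s.variance(), s.stddev())
}
```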

Go Kit

Go kit provides a wrapper package over speed that can be used for building microservices that expose metrics using PCP.

For modified versions of the go-kit examples that use PCP to report metrics, see suyash/kit-pcp-examples.

speed's People


owenbutler saurvs lzap linus5 natoscott adrianbiro

speed's Issues

add support for measuring quantiles and apdex scores and subsequently implement histograms and summaries

https://prometheus.io/docs/practices/histograms/

add load tests to verify concurrent updates to different metric types

add precision based tests for timer

github.com/performancecopilot/speed/bytewriter package does not compile on Windows.

C:\>go version
go version go1.8 windows/amd64

C:\>go get -u -v github.com/performancecopilot/speed/bytewriter
github.com/performancecopilot/speed (download)
github.com/performancecopilot/speed/bytewriter
# github.com/performancecopilot/speed/bytewriter
..\..\performancecopilot\speed\bytewriter\memorymappedwriter.go:46: undefined: syscall.Mmap
..\..\performancecopilot\speed\bytewriter\memorymappedwriter.go:46: undefined: syscall.PROT_READ
..\..\performancecopilot\speed\bytewriter\memorymappedwriter.go:46: undefined: syscall.PROT_WRITE
..\..\performancecopilot\speed\bytewriter\memorymappedwriter.go:46: undefined: syscall.MAP_SHARED
..\..\performancecopilot\speed\bytewriter\memorymappedwriter.go:60: undefined: syscall.Munmap

implement custom metric types

Counters, Gauges with sensible types, semantics and units that require much less info for construction than a raw PCPMetric

rename Writer to Client

The fundamental reason for doing this is that the name writer in the golang community implies to most that the type implements io.Writer, and initially it did, but later on all the writing capability was abstracted away into bytebuffer, and the Buffer type does implement io.Writer, but speed.Writer doesn't, and I don't think the name is apt anymore. Looking at the current definition of the interface, I think Client is a more appropriate name, and will probably create less confusion.

Test failures on big-endian system (s390x)

I've recently packaged this library for Debian, and when its tests are run on a big-endian system (s390x), several of the tests fail. My initial guess is that since the mmvdump test files were created on a little-endian system, they are being read improperly on the big-endian system.

	cd _build && go test -vet=off -v -p 10 github.com/performancecopilot/speed github.com/performancecopilot/speed/bytewriter github.com/performancecopilot/speed/mmvdump
error initializing config. maybe PCP isn't installed properly
=== RUN   TestMmvFileLocation
--- PASS: TestMmvFileLocation (0.00s)
=== RUN   TestTocCountAndLength
--- PASS: TestTocCountAndLength (0.00s)
=== RUN   TestMapping
--- PASS: TestMapping (0.00s)
=== RUN   TestWritingSingletonMetric
    client_test.go:373: Incomplete/Partially Written TOC
--- FAIL: TestWritingSingletonMetric (0.03s)
=== RUN   TestUpdatingSingletonMetric
    client_test.go:427: Cannot extract dump from the writer buffer
--- FAIL: TestUpdatingSingletonMetric (0.02s)
=== RUN   TestWritingInstanceMetric
    client_test.go:539: Incomplete/Partially Written TOC
--- FAIL: TestWritingInstanceMetric (0.06s)
=== RUN   TestUpdatingInstanceMetric
    client_test.go:582: cannot get dump, error: Incomplete/Partially Written TOC
    client_test.go:342: expected 1 metrics, got 0
    client_test.go:346: expected 2 values, got 0
    client_test.go:301: expected a metric of name met.1
    client_test.go:486: expected 2 instances, got 0
    client_test.go:493: expected an instance domain of name met
    client_test.go:500: expected an instance domain at 216
    client_test.go:500: expected an instance domain at 136
    client_test.go:612: cannot get dump, error: Incomplete/Partially Written TOC
    client_test.go:342: expected 1 metrics, got 0
    client_test.go:346: expected 2 values, got 0
    client_test.go:301: expected a metric of name met.1
    client_test.go:486: expected 2 instances, got 0
    client_test.go:493: expected an instance domain of name met
    client_test.go:500: expected an instance domain at 136
    client_test.go:500: expected an instance domain at 216
--- FAIL: TestUpdatingInstanceMetric (0.24s)
=== RUN   TestStringValueWriting
    client_test.go:638: Incomplete/Partially Written TOC
--- FAIL: TestStringValueWriting (0.09s)
=== RUN   TestWritingDifferentSemantics
    client_test.go:705: cannot create dump: Incomplete/Partially Written TOC
    client_test.go:342: expected 8 metrics, got 0
    client_test.go:346: expected 12 values, got 0
    client_test.go:284: expected a metric of name m.2
    client_test.go:284: expected a metric of name m.3
    client_test.go:284: expected a metric of name m.4
    client_test.go:301: expected a metric of name m.5
    client_test.go:301: expected a metric of name m.6
    client_test.go:301: expected a metric of name m.7
    client_test.go:301: expected a metric of name m.8
    client_test.go:284: expected a metric of name m.1
    client_test.go:486: expected 2 instances, got 0
    client_test.go:493: expected an instance domain of name m
    client_test.go:500: expected an instance domain at 136
    client_test.go:500: expected an instance domain at 216
--- FAIL: TestWritingDifferentSemantics (0.13s)
=== RUN   TestWritingDifferentUnits
    client_test.go:758: cannot get dump: Incomplete/Partially Written TOC
--- FAIL: TestWritingDifferentUnits (0.13s)
=== RUN   TestWritingDifferentTypes
    client_test.go:794: cannot get dump: Incomplete/Partially Written TOC
--- FAIL: TestWritingDifferentTypes (0.24s)
=== RUN   TestMMV2MetricWriting
    client_test.go:817: cannot create dump, error: Incomplete/Partially Written TOC
--- FAIL: TestMMV2MetricWriting (0.63s)
panic: runtime error: invalid memory address or nil pointer dereference [recovered]
	panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x0 pc=0x155e8c]

goroutine 215 [running]:
testing.tRunner.func1.2({0x177be0, 0x2a6190})
	/usr/lib/go-1.19/src/testing/testing.go:1396 +0x2e8
testing.tRunner.func1()
	/usr/lib/go-1.19/src/testing/testing.go:1399 +0x3fc
panic({0x177be0, 0x2a6190})
	/usr/lib/go-1.19/src/runtime/panic.go:884 +0x240
github.com/performancecopilot/speed.TestMMV2MetricWriting(0xc018521a00)
	/tmp/autopkgtest-lxc.jhd0e6dy/downtmp/autopkgtest_tmp/_build/src/github.com/performancecopilot/speed/client_test.go:820 +0x2bc
testing.tRunner(0xc018521a00, 0x1a8d30)
	/usr/lib/go-1.19/src/testing/testing.go:1446 +0x128
created by testing.(*T).Run
	/usr/lib/go-1.19/src/testing/testing.go:1493 +0x448
FAIL	github.com/performancecopilot/speed	1.614s
=== RUN   TestWriteInt32
--- PASS: TestWriteInt32 (0.00s)
=== RUN   TestWriteInt64
--- PASS: TestWriteInt64 (0.00s)
=== RUN   TestWriteString
--- PASS: TestWriteString (0.00s)
=== RUN   TestOffset
--- PASS: TestOffset (0.00s)
=== RUN   TestMemoryMappedWriter
--- PASS: TestMemoryMappedWriter (0.01s)
PASS
ok  	github.com/performancecopilot/speed/bytewriter	0.023s
=== RUN   TestMmvDump1
    mmvdump_test.go:17: Incomplete/Partially Written TOC
--- FAIL: TestMmvDump1 (0.05s)
=== RUN   TestInputs
    mmvdump_test.go:67: Incomplete/Partially Written TOC
--- FAIL: TestInputs (0.03s)
FAIL
FAIL	github.com/performancecopilot/speed/mmvdump	0.098s
FAIL
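The byte-order hypothesis in this report can be illustrated with the standard library: the same four bytes decode to different integers depending on the byte order assumed by the reader. This is only a sketch of the general effect, using a hypothetical decodeBoth helper; it does not reproduce the mmvdump parser or confirm the cause of the failures above.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// decodeBoth interprets the first four bytes of buf as a uint32 in both
// little-endian and big-endian byte order.
func decodeBoth(buf []byte) (little, big uint32) {
	return binary.LittleEndian.Uint32(buf), binary.BigEndian.Uint32(buf)
}

func main() {
	// The value 1 as a 32-bit integer written by a little-endian host.
	buf := []byte{0x01, 0x00, 0x00, 0x00}
	little, big := decodeBoth(buf)
	fmt.Printf("little-endian: %d, big-endian: %d\n", little, big)
}
```

A reader that decodes with an explicit byte order, rather than the host's native order, returns the same value on every architecture.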

add support for expvar as a backend

https://golang.org/pkg/expvar/

one of the original goals of the project was to create generic interfaces that could be implemented for any metrics reporting backend, and expvar should be the simplest to implement
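A minimal sketch of what an expvar-backed counter could look like, assuming a counter interface with Inc, Up and Val methods like the one described in the walkthrough. The expvarCounter type and its constructor are hypothetical; only the expvar calls themselves are standard library API.

```go
package main

import (
	"expvar"
	"fmt"
)

// expvarCounter is a hypothetical counter that stores its value in an
// expvar.Int, which expvar publishes under the given name.
type expvarCounter struct {
	v *expvar.Int
}

// newExpvarCounter publishes a new counter. expvar.NewInt panics if the
// name is already in use, so each name may be created only once.
func newExpvarCounter(name string) *expvarCounter {
	return &expvarCounter{v: expvar.NewInt(name)}
}

// Inc increments the counter by delta.
func (c *expvarCounter) Inc(delta int64) { c.v.Add(delta) }

// Up increments the counter by 1.
func (c *expvarCounter) Up() { c.Inc(1) }

// Val returns the current value.
func (c *expvarCounter) Val() int64 { return c.v.Value() }

func main() {
	c := newExpvarCounter("a.simple.counter")
	c.Up()
	c.Inc(2)
	fmt.Println(c.Val()) // 3
}
```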

install pcp and test visibility on travis

need to completely figure this out but we can install pcp and check actual visibility of metrics on travis

Histogram percentile support

Hey,

are there plans to give PCPHistogram percentile support, so that when the update function is called it provides the percentiles as instances? One idea would be to have an array of percentiles the user is interested in, and those would be added as instances named "perc_99" or "perc_95". For simplicity, just integer percentiles would be fine (50, 90, 95, 99). Would you accept such a patch?

If this is not planned or wanted, what is the best way of "plugging-in" the update function so percentiles get passed into PCP? Maybe a callback function or similar pattern would do it so I could write my own handler.
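The instance naming proposed in this request could be sketched as follows. The percentileInstances function is hypothetical and uses the nearest-rank method on a sorted copy of the samples, which is an assumption; a real implementation would more likely query the underlying histogram.

```go
package main

import (
	"fmt"
	"sort"
)

// percentileInstances maps instance names such as "perc_95" to the value at
// that integer percentile, using the nearest-rank method.
func percentileInstances(samples []int64, percentiles []int) map[string]int64 {
	out := make(map[string]int64, len(percentiles))
	if len(samples) == 0 {
		return out
	}
	sorted := append([]int64(nil), samples...)
	sort.Slice(sorted, func(i, j int) bool { return sorted[i] < sorted[j] })
	n := len(sorted)
	for _, p := range percentiles {
		if p < 1 || p > 100 {
			continue // only integer percentiles in 1..100 are supported
		}
		rank := (p*n + 99) / 100 // ceil(p*n/100) in integer arithmetic
		out[fmt.Sprintf("perc_%d", p)] = sorted[rank-1]
	}
	return out
}

func main() {
	samples := make([]int64, 0, 100)
	for v := int64(100); v >= 1; v-- {
		samples = append(samples, v)
	}
	fmt.Println(percentileInstances(samples, []int{50, 90, 95, 99}))
}
```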

Thanks!

Expose Go runtime metrics

The Go runtime package exposes metrics related to the host CPU, memory usage and garbage collector.

We can either add a new example demonstrating how to expose those metrics using the existing API, or we can implement a new PCPInstanceMetric called GoRuntime, which exposes those metrics and implements a SetTimeResolution method for periodically updating the metrics.
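As a starting point for the first option, the values in question can be read with the runtime package alone. The runtimeSnapshot function and the instance names it uses are hypothetical; mapping the result onto a PCP instance metric is left out of this sketch.

```go
package main

import (
	"fmt"
	"runtime"
)

// runtimeSnapshot collects a few Go runtime values, keyed by hypothetical
// instance names that an instance metric could use.
func runtimeSnapshot() map[string]uint64 {
	var ms runtime.MemStats
	runtime.ReadMemStats(&ms)
	return map[string]uint64{
		"goroutines": uint64(runtime.NumGoroutine()),
		"num_cpu":    uint64(runtime.NumCPU()),
		"heap_alloc": ms.HeapAlloc,
		"num_gc":     uint64(ms.NumGC),
	}
}

func main() {
	for name, value := range runtimeSnapshot() {
		fmt.Println(name, value)
	}
}
```

Calling runtimeSnapshot from a time.Ticker loop would give the periodic update that the proposed SetTimeResolution method is meant to control.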

implement a go port of mmvdump

should help in better testing the writer

implement prometheus style `Must` methods that panic automatically on an error instead of returning them

can be done for writer start and stop to allow a safe defer writer.Stop() and metric registration, as well as writing values in the bytebuffer to avoid all those errcheck linting errors from gometalinter
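The Must pattern referred to here wraps an error-returning function in one that panics instead. A minimal sketch, with hypothetical start and mustStart functions standing in for the writer methods:

```go
package main

import (
	"errors"
	"fmt"
)

// start is a hypothetical operation that can fail.
func start(name string) error {
	if name == "" {
		return errors.New("empty client name")
	}
	return nil
}

// mustStart is the Must variant: it panics instead of returning the error,
// so callers need no error check.
func mustStart(name string) {
	if err := start(name); err != nil {
		panic(err)
	}
}

func main() {
	mustStart("app") // succeeds silently

	defer func() {
		// Recover only to show the panic value in this demonstration.
		fmt.Println("recovered:", recover())
	}()
	mustStart("") // panics with the underlying error
}
```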

Is this package still maintained?

As the title says: Is this package still maintained? Do you accept PRs?

I'd like to update the dependencies of this package, particularly github.com/codahale/hdrhistogram as it's been moved to a new place.

Would you accept a PR like that? Or am I better off forking the library?

Thanks!

shouldn't have to 'go get' vendor packages in travis

https://github.com/performancecopilot/speed/blob/master/.travis.yml#L16-L18

go vendoring works by default for go1.6+, and does work locally, but for some reason, not on travis

figure out ways to make instances mutable in an InstanceMetric

should be able to at least add new instances after creation

explore making instance domains a completely internal concept

we already have shorthands (RegisterString, CounterVector, GaugeVector) that initialize an instance domain alongside an instance metric, so instance domains can probably be made an internal concept and completely removed from the public api

use pkg/errors to create errors with contexts

https://github.com/pkg/errors

helps get better stack frames on errors than this

PMID hash collisions

Hey,

I am hitting a hard wall of 2^10 maximum metrics and I am getting collisions, which cause pmval: pmGetInDom(70.1560651): Unknown or illegal instance domain identifier when trying to read the values via the CLI tool. I see it with just a few dozen metrics:

3 fm_rails_http_request_db_duration.hosts_controller.index
16 fm_rails_activerecord_instances.Location
16 fm_rails_ruby_gc_allocated_objects.environments_controller.index
23 fm_rails_http_request_view_duration.discovered_hosts_controller.index
23 fm_rails_ruby_gc_minor_count.subnets_controller.index
111 fm_rails_ruby_gc_major_count.environments_controller.index
156 fm_rails_activerecord_instances.Host__Managed
156 fm_rails_ruby_gc_count.domains_controller.index
171 fm_rails_http_request_total_duration.hosts_controller.get_power_state
171 fm_rails_ruby_gc_count.hosts_controller.show
340 fm_rails_http_requests.domains_controller.index
340 fm_rails_ruby_gc_freed_objects.compute_resources_controller.index
380 fm_rails_http_request_view_duration.api_v2_bookmarks_controller.index
380 fm_rails_ruby_gc_major_count.compute_resources_controller.index
999 fm_rails_http_requests.notification_recipients_controller.index
999 fm_rails_http_request_total_duration.hosts_controller.runtime

Possible solutions include explicit metric ID assignment instead of a hash, which would perhaps require storing the ID in some "cache" file. Alternatively, there are plenty of bits in the PMID "cluster" field, but I am unsure what it is supposed to be for. In Speed, the cluster seems to be bound to the client.

I need to implement support for instances to bring the number of metrics down to about a dozen and hope for no collisions. But I assume many users can be unlucky and symptoms are hard to track.

Edit: For the record here is the utility I generated PMIDs with (pipe through sort -n for best results):

package main

import (
	"bufio"
	"fmt"
	"hash/fnv"
	"os"
)

// hash returns the 32-bit FNV-1a hash of s, masked to the lowest b bits
// (or the full hash when b is 0).
func hash(s string, b uint32) uint32 {
	h := fnv.New32a()

	_, err := h.Write([]byte(s))
	if err != nil {
		panic(err)
	}

	val := h.Sum32()
	if b == 0 {
		return val
	}

	return val & ((1 << b) - 1)
}

func main() {
	// Print the 10-bit hash of each name read from standard input.
	scanner := bufio.NewScanner(os.Stdin)
	for scanner.Scan() {
		text := scanner.Text()
		fmt.Printf("%d %s\n", hash(text, 10), text)
	}
}
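Building on the same masked FNV-1a hash, a small check could flag colliding names before they are registered. The maskedHash and collisions functions and the metric names in main are hypothetical, and whether speed derives its identifiers in exactly this way is an assumption taken from the utility above.

```go
package main

import (
	"fmt"
	"hash/fnv"
	"sort"
)

// maskedHash returns the FNV-1a hash of s reduced to the lowest bits bits.
func maskedHash(s string, bits uint32) uint32 {
	h := fnv.New32a()
	_, _ = h.Write([]byte(s)) // hash.Hash.Write never returns an error
	return h.Sum32() & ((1 << bits) - 1)
}

// collisions groups names by masked hash and keeps only the groups that
// contain more than one name.
func collisions(names []string, bits uint32) map[uint32][]string {
	buckets := make(map[uint32][]string)
	for _, name := range names {
		id := maskedHash(name, bits)
		buckets[id] = append(buckets[id], name)
	}
	for id, group := range buckets {
		if len(group) < 2 {
			delete(buckets, id)
		}
	}
	return buckets
}

func main() {
	names := []string{"requests.total", "requests.failed", "gc.count", "gc.pause"}
	// One bit leaves only two possible identifiers, so four names must collide.
	found := collisions(names, 1)
	ids := make([]int, 0, len(found))
	for id := range found {
		ids = append(ids, int(id))
	}
	sort.Ints(ids)
	for _, id := range ids {
		fmt.Println(id, found[uint32(id)])
	}
}
```

Running the same check with 10 bits against an application's real metric names shows which of them would share an identifier.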

Allow external configuration of logger

I would like to have speed's internal logger under my control; ideally I would be able to provide my own logger configuration (or logger instance). Currently it depends on an external custom formatter, which I don't like. I would like to send everything to journald/syslog by default.
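One possible shape for such a hook, sketched with the standard log package. The SetLogOutput function, the package-level logger and the memoryWriter type are all hypothetical; speed's actual logging setup is not shown here.

```go
package main

import (
	"io"
	"log"
	"os"
)

// logger is the package-level logger of a hypothetical library.
var logger = log.New(os.Stderr, "", log.LstdFlags)

// SetLogOutput is a hypothetical hook that lets the application choose
// where the library's log lines go and how they are prefixed.
func SetLogOutput(w io.Writer, prefix string) {
	logger = log.New(w, prefix, 0)
}

// registerMetric stands in for library code that logs internally.
func registerMetric() {
	logger.Println("metric registered")
}

// memoryWriter is a tiny io.Writer that collects output in memory.
type memoryWriter struct {
	data []byte
}

func (m *memoryWriter) Write(p []byte) (int, error) {
	m.data = append(m.data, p...)
	return len(p), nil
}

func main() {
	sink := &memoryWriter{}
	SetLogOutput(sink, "speed: ")
	registerMetric()
	os.Stdout.Write(sink.data)
}
```

Any io.Writer can be passed in the same way, including one that forwards to syslog or journald.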

add support for string data types for metrics

mmvdump: add PCP qa tests

match pcp mmvdump qa outputs for the implemented mmvdump package

Reuse instances for histograms

Hey,

we use Speed to report about a thousand histograms, which creates about a thousand sets of min, max, mean and std_dev instances. I am under the impression that it should be possible to create those instances just once and then reuse their IDs.

If I am wrong just close the RFE ticket, sorry for the noise :-)

implement an agent in go to export metrics from the API directly

similar to parfait-agent

Add mmv v2 support

Client name with invalid characters does not work

It does not appear in PCP, I was tracking it down for an hour until I found that "client-123" won't work. Just a reminder for googlers.
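A defensive check along these lines could catch the problem before the client is created. The issue does not say which characters are valid, so the rule below (a leading letter followed by letters, digits or underscores) is an assumption, and validName and sanitizeName are hypothetical helpers rather than part of speed.

```go
package main

import (
	"fmt"
	"strings"
)

// validName reports whether name starts with an ASCII letter and contains
// only ASCII letters, digits and underscores. This rule is an assumption.
func validName(name string) bool {
	if name == "" {
		return false
	}
	for i, r := range name {
		letter := (r >= 'a' && r <= 'z') || (r >= 'A' && r <= 'Z')
		digit := r >= '0' && r <= '9'
		if !(letter || (i > 0 && (digit || r == '_'))) {
			return false
		}
	}
	return true
}

// sanitizeName replaces every character outside the assumed set with an
// underscore. It does not repair a leading digit or underscore.
func sanitizeName(name string) string {
	return strings.Map(func(r rune) rune {
		switch {
		case r >= 'a' && r <= 'z', r >= 'A' && r <= 'Z', r >= '0' && r <= '9', r == '_':
			return r
		}
		return '_'
	}, name)
}

func main() {
	name := "client-123"
	fmt.Println(validName(name))               // false
	fmt.Println(sanitizeName(name))            // client_123
	fmt.Println(validName(sanitizeName(name))) // true
}
```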

explore using golang supplemental sys packages for MemoryMappedWriter

https://github.com/golang/sys

built for much more archs than core golang syscall

Add better tests for mmvdump using qas from PCP

Currently there is a basic test using the 'simple output'. Better tests mean better integrity for the package.

Explain metric registration in documentation

Hello,

I am building an adapter or bridge that will read statsd protocol data and write to PCP using your library, but I don't understand how metrics survive a restart of the PCP daemon. The statsd protocol is a pretty dynamic environment where clients simply send metrics, whereas in PCP all metrics must be registered at initialization.

I tried to register metrics dynamically by stopping the client first, but it did not work well (I was running into issues trying to stop an already stopped client, maybe just a race condition). Can you confirm it should be possible to post-register a new metric for an already started client (stopping it first, of course)? The documentation only mentions that the client must be stopped, so this could work. Will this approach work with archiving and long-term monitoring?

Thanks
