
rrdtool-2.x's Introduction

NAME

RRDtool 2.x - Re-Engineering RRDtool for the next 15 Years

CONTEXT

Since its release in 1999, RRDtool has become an integral part of many monitoring applications, way beyond my initial vision. RRDtool is used everywhere: in tiny embedded systems as well as in huge enterprise monitoring systems with hundreds of thousands of data sources.

My main concern when designing applications is to come up with simple and logical interfaces that cover all the necessary functionality and provide a path for further enhancement. This has worked very well with RRDtool 1.x, but there are some central design elements that are not so easily changed and are therefore blocking incremental development.

VISION

Part of the success of RRDtool is based on the fact that a single package takes care of all your time series data storage, retrieval and presentation needs. Any future version of RRDtool will do the same, only more so.

A prime objective of the 2.x rewrite is to create clear internal APIs for the interaction of the individual components of RRDtool. This will make it possible to replace or drastically modify one component without changes to the rest of the system. To some extent, this pattern was already present in RRDtool 1.x, but especially in the data storage layer the structure does not lend itself to extensions all that well.

COMPONENTS

Time Series Database

The database component stores time series data.

Data Retrieval and Postprocessing

The Data Retrieval and Postprocessing component takes care of all data retrieval and postprocessing needs. DEF, CDEF and VDEF functionality from RRDtool 1.x is located here, and is available to all data consumers.

Graphing

The graphing component draws the charts. Its structure allows for multiple chart types to be implemented.

Web API

RRDtool comes with a REST API. Unfortunately this raises a number of dependency issues: the actual web server, authentication, and encryption. Ideally, RRDtool itself would provide minimal implementations of these, so that it can be used standalone.

DEVELOPMENT PLAN

Rewriting RRDtool is a major software engineering effort. Here is the plan to achieve it.

  1. Collect requirements, using the GitHub Issue Tracker.

  2. Create Engineering Documents in the GitHub Wiki.

  3. Create and document a coherent, modular design, down to the internal API level.

  4. Plan and budget the implementation.

  5. Find financing, through large corporate sponsors or crowdfunding.

  6. Implement.

  7. Release 2.0.

REQUIREMENTS

Test Suite

All 2.x functionality is exercised by a test suite.

Backward Compatibility

The 2.x design addresses the complex issues that could not easily be changed by altering RRDtool 1.x. Most of the 1.x functionality is present in RRDtool 2.x. A 1.x compatibility API emulates the 1.x behavior on top of the 2.x API.

FREE SOFTWARE LICENSE

A suitable Free Software License for RRDtool 2.x is to be determined based on feedback from the people financing the development. It could be the GNU GPL v2 or v3, or the GNU LGPL.

NOTE

This document will evolve as the project takes shape.

AUTHOR

Tobi Oetiker <[email protected]>


rrdtool-2.x's Issues

Sane time interval defaults for "fetch"

rrdtool fetch, including at least its Perl interfaces RRDs and RRDTool::OO, by default only fetches data from the past 24 hours, regardless of whether the RRD contains any data sets from the past 24 hours or not.

From my point of view, sane defaults would be

  • Output all contained data by default (with regard to the time interval as well as the consolidation functions)
  • Output all data from the densest RRA

If fetch has to remain backward-compatible, i.e. changing these defaults is out of scope, I strongly recommend adding a new rrdtool subcommand which does the same, but with saner default values for time intervals.
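
As a point of reference, this is roughly what a caller has to do today to emulate "fetch everything" against the existing 1.x C API. rrd_first_r(), rrd_last_r() and rrd_fetch_r() are real librrd entry points; treating RRA index 0 as the densest archive is an assumption of this sketch, and example.rrd is a placeholder.

    #include <stdio.h>
    #include <stdlib.h>
    #include <time.h>
    #include <rrd.h>

    int main(void)
    {
        const char *file = "example.rrd";     /* hypothetical file   */
        time_t start = rrd_first_r(file, 0);  /* first sample, RRA 0 */
        time_t end   = rrd_last_r(file);      /* last update         */
        unsigned long step = 0, ds_cnt = 0;
        char **ds_namv = NULL;
        rrd_value_t *data = NULL;

        if (rrd_fetch_r(file, "AVERAGE", &start, &end,
                        &step, &ds_cnt, &ds_namv, &data) != 0) {
            fprintf(stderr, "fetch failed: %s\n", rrd_get_error());
            return 1;
        }
        printf("%lu data sources, step %lu s\n", ds_cnt, step);
        /* ... walk the data array here ... */
        free(data);
        for (unsigned long i = 0; i < ds_cnt; i++)
            free(ds_namv[i]);
        free(ds_namv);
        return 0;
    }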

More flexibility in VDEF/CDEF etc

It should be possible to use the value of a VDEF in a CDEF, and to use more of the RPN functions in general in a VDEF. We should also remove the requirement to have at least one DS in a CDEF.
This would allow, for example, things such as:

  • Define an HRULE on the 95th percentile without having to do 2 passes
  • Define a vertical line at the time of maximum for the value
  • Have a CDEF as a function of time and not need to use a throwaway DS
  • Calculate an average temperature in Fahrenheit using a VDEF when the DS is in Celsius (already possible, but you need to make an extra CDEF to do it)
  • Have a CDEF that calculates the variation from the average (i.e., x - AVG(x)), which requires using a VDEF value

Of course, you'd need to be careful to identify circular references between CDEFs and VDEFs...
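
To make the request concrete, here is a sketch of what such graph arguments could look like in 2.x. This is NOT valid 1.x syntax, and every name in it (net.rrd, p95, overage) is invented; the point is that the VDEF result p95 is reused inside a CDEF and an HRULE, so the 95th-percentile rule needs only one pass.

    const char *graph_args[] = {
        "DEF:in=net.rrd:in:AVERAGE",
        "VDEF:p95=in,95,PERCENT",
        "CDEF:overage=in,p95,GT,in,p95,-,0,IF", /* traffic above p95 */
        "HRULE:p95#FF0000:95th percentile",
        "LINE1:overage#00AA00:overage",
    };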

support for heatmap/treemap

Would be sweet to have support for plotting treemaps and heatmaps using rrd2.

These are very valuable types of plots when looking at 20-25 servers at once. Currently this is possible using R or some JavaScript libraries. It should be available by default in rrd2, along with the other default plotting capabilities.

Respect variable month length in data aggregation

Currently rrdtool uses a fixed-size aggregation period, which makes it impossible to calculate monthly statistics. It would be great if there were an option to specify monthly aggregation (from the 1st to the last day of the month).
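
Calendar months cannot be expressed as a fixed step. A minimal sketch of the boundary arithmetic a variable-length aggregation period would need, using only standard C (mktime() normalizes the out-of-range month value):

    #include <time.h>

    /* Start of the month containing t, shifted by months_offset. */
    time_t month_start(time_t t, int months_offset)
    {
        struct tm tm = *localtime(&t);
        tm.tm_mday = 1;
        tm.tm_hour = tm.tm_min = tm.tm_sec = 0;
        tm.tm_mon += months_offset;  /* may overflow; mktime fixes it */
        tm.tm_isdst = -1;            /* let mktime decide DST         */
        return mktime(&tm);
    }

    /* "This month" then spans month_start(now, 0) up to, but not
     * including, month_start(now, 1). */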

RRD data as a network service

This is potentially related to issue #13.

It would be neat if RRD data could be "updated" and "fetched" as a network service.

The network service could offer a level of header caching and deferred writes (see issue #13). The RRD files (or the administrator of the service?) would need to specify how much write caching is acceptable. I personally would probably risk batching up writes to 30 updates per physical write, gambling a 30 minute loss (at a one-minute update step) in exchange for a significant improvement in how much a single box can do.

The other side of this would be that RRD graph would allow DEFs to point to network-service-based data storage. This would allow a single RRD graph to use multiple back-end nodes, possibly clustered together or possibly geographically diverse. This would allow us to finally unify a view on a distributed RRD footprint.

This is probably getting too much into the details, but I would imagine that at least one "fetch" output format should be optimized for RRD graph purposes: not necessarily XML, which is expensive over the wire and expensive to parse.
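
Purely as an illustration of what a graph-friendly binary fetch reply might look like; nothing here is an existing RRDtool format, and every field is invented:

    #include <stdint.h>

    struct fetch_reply_hdr {
        uint64_t start;   /* epoch seconds of the first row */
        uint64_t step;    /* seconds between rows           */
        uint32_t ds_cnt;  /* data sources per row           */
        uint32_t row_cnt; /* rows that follow               */
        /* followed by row_cnt * ds_cnt little-endian IEEE-754
         * doubles, NaN marking unknown values */
    };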

Separate data-storage / -retrieval and graphing

Hi,

This has been requested many times, and it's already mentioned in a few other issues.

The idea is to have (at least) two different libraries: one taking care of all the data retrieval, data update, etc. functions (let's call that librrd-data for now) and one taking care of the graphing, built on top of the other library (let's call that librrd-graph for now). librrd-data would itself be based on the low-level I/O layer which, for maximum flexibility, could also be a separate library.

See my other ticket about moving RPN stuff from "graph" to "fetch" for some more ideas about the implementation.

Cheers,
Sebastian

External Graphic Dependencies

rrdtool needs external graphics libraries (like cairo or pango).

Building these can sometimes be a rather lengthy exercise...

If the build process for these external libraries could be (better) automated within RRDTool, that would be helpful for compiling new rrdtool builds.

Cheers,
Roman

Focus also on Embedded Devices

RRDTool v1 is used in embedded devices, so the new RRDTool v2 should also have a focus on such usage and devices:

  • rather small memory usage
  • tested on embedded CPUs (like ARM, Intel Atom, ...)
  • maybe separate 'graphic engine' from 'store / calculate engine'

Cheers,
Roman

atomic data type

Hi,

I would like to see an atomic data type, to be able to store atomic data.
An example of atomic data is a UPS status whose value can be either online or onbattery.

Sure, standard aggregations cannot be applied to such data.

Atomic data can be used, for instance, in IF constructions to modify the color of different LINE or AREA elements.

Using atoms instead of a numeric integer state representation is more convenient; you don't need to explicitly store the mapping on both the backend and frontend sides.
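
For contrast, a minimal sketch of the status-quo encoding this proposal wants to replace; the states and values here are invented, and both the collector and the graphing frontend must agree on them out of band:

    /* integer state encoding, duplicated on backend and frontend */
    enum ups_status { UPS_ONLINE = 0, UPS_ONBATTERY = 1 };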

Vala for implementation

The Vala programming language might provide an interesting way of sticking with C for portability and speed while getting OO ease in handling our data and managing extensibility. Bindings for dynamic languages come for free, and since we are already depending on glib, this would be quite a nice fit.

https://wiki.gnome.org/Projects/Vala

Histogram mode

It would be great to have support for histograms based on the time series data. This might even be a new form of RRA or aggregation function. This is probably non-trivial to get right.

This is something I need quite often, because it takes the data out of the time domain to find new underlying properties not immediately obvious there.
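
To make the idea concrete, here is a minimal binning sketch (plain C, not an RRDtool API) of the computation such a histogram RRA or aggregation function would perform:

    #include <stddef.h>

    /* Count samples into nbins equal-width buckets over [lo, hi). */
    void histogram(const double *v, size_t n, double lo, double hi,
                   unsigned *bins, size_t nbins)
    {
        for (size_t i = 0; i < n; i++) {
            if (!(v[i] >= lo && v[i] < hi))
                continue;            /* skips NaN and out-of-range */
            size_t k = (size_t)((v[i] - lo) * nbins / (hi - lo));
            if (k >= nbins)          /* guard against FP rounding  */
                k = nbins - 1;
            bins[k]++;
        }
    }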

RRA selection when data fetched

Currently, RRDTool returns all data at the same resolution, using the best match from the available RRAs.

This has the unfortunate side effect that, if even one requested endpoint falls outside the RRA you requested, you'll get another, lower-resolution one.

It would be preferable if RRDTool always returned the resolution requested, using the closest-match RRA available, averaging or copying as required. That is, if the first half of your requested window can be provided at the requested resolution, then use that; for the other half, use a lower-resolution RRA and duplicate values.

Thus, for any subset of your requested window, you'd always get the best data available at the resolution you requested.

This might cause some awkwardness if the RRAs are not evenly sized, though; e.g., if your RRAs are sized 1cdp=3dp and 1cdp=5dp and you request a cdp=2dp fetch, then RRDTool would need to approximate a 2dp spread from a 3dp average...
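
A sketch of the "copy values" fallback described above, assuming the rows of a coarser RRA are already in memory; names and layout are invented:

    #include <time.h>

    /* Serve a request at a finer step by duplicating the covering
     * row of a coarser RRA (step_src seconds per row). Averaging a
     * finer RRA down would be the symmetric case. */
    double sample_at(const double *rows, time_t rra_start,
                     unsigned long step_src, time_t t)
    {
        return rows[(t - rra_start) / step_src];
    }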

multi DS DEF

Ability for a single DEF to specify multiple sequential data sources:

    DEF:def0=/filepath1:ds-name:AVERAGE:start=1395292500:end=1395465540,/filepath2:ds-name:AVERAGE

data compression

LZ4 real-time compression would improve performance, as there is an abundance of CPU relative to I/O.
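
The liblz4 call below is the real lz4 API; the idea of feeding it a block of consolidated rows per RRA is the hypothetical part:

    #include <lz4.h>

    /* Compress nbytes of row data before it goes to disk; returns
     * the compressed size, or 0 if dst_cap was too small. */
    int compress_rows(const char *rows, int nbytes,
                      char *dst, int dst_cap)
    {
        return LZ4_compress_default(rows, dst, nbytes, dst_cap);
    }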

Wiki: TIRD link is broken?

On the wiki page Portable Dataformat, in the section Workaround, the TIRD link seems to be broken. I tried some searching on the internet to figure out what the updated link would be, but couldn't find anything.

Portable Dataformat

RRD files should be binary compatible on all architectures, without dump / restore.

Alerting

I'm not sure whether this actually is in scope for RRDTool, so I just thought there's no harm in bringing it up.

Firing alerts based on definable situations can make for a powerful monitoring tool. Usually, similar systems can fire alerts when a sudden change in the data happens (a 200% increase or decrease, for example), when values go outside of a safe range (below 1% or above 80%, for example), or when no data comes in at all.

Alerts can be fired by calling web hooks, running scripts, or anything similar.

Make current graph resolution available for CDEF/VDEF calculations

Currently you don't know what time resolution rrdtool selected when drawing a graph (or doing an export). Of course you can calculate it externally, knowing what RRAs are there, but this value is quite frequently used in calculations, so it would be much easier if rrdtool provided it as a variable in RPN expressions.
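
For illustration only: if the selected step were exposed as an RPN operand (the name STEPWIDTH is assumed here, not a committed 2.x feature), converting a per-second rate into a per-row quantity would need no externally computed constant:

    /* hypothetical RPN: bytes per graph row from a bytes/sec rate */
    const char *cdef = "CDEF:bytes_per_row=rate,STEPWIDTH,*";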

Legend control

One thing I've wanted to be able to do for some time is to add a single coloured box to the legend on its own. This is in order to have more control over the order in which the legend entries are printed, independently of the order in which the lines on the graph are rendered.

network transport layer

Provide network access to rrdtool functionality. The existing pipe interface and a new REST interface come to mind.

Put (CDEF, VDEF, etc.) calculation into "fetch" (rather than "graph")

Hi,

In RRDtool 1.x, all RPN calculation lives inside rrdgraph. That makes it rather hard to do any "complex" queries on the data using rrdfetch. The only way around that is to abuse rrdgraph by writing the image to something like /dev/null and using PRINT to access the data.

It would be nice to have RPN calculation support in rrdfetch and building rrdgraph on top of that (or rather on top of the fetch API).

Cheers,
Sebastian
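
Spelled out, the 1.x workaround mentioned above looks like this; data.rrd and ds0 are placeholders, the graph syntax itself is valid 1.x:

    /* render to /dev/null; PRINT emits the computed value on stdout */
    const char *workaround[] = {
        "rrdtool", "graph", "/dev/null",
        "DEF:x=data.rrd:ds0:AVERAGE",
        "CDEF:xbits=x,8,*",
        "VDEF:avg=xbits,AVERAGE",
        "PRINT:avg:%.2lf",
    };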

SQL like query tool

I'd like rrdtool-2 to come with some sort of query tool which returns the actual values and timestamps where a criterion was met, e.g. where the value of a DS was above a certain threshold, or when it was NaN. I know that you can achieve this with rrdtool-1 and a handcrafted graph with e.g. vertical bars, or by filtering all data using a Perl script. But it would be handy as a native function of rrdtool-2 for hunting errors, spikes, missing data points, and so on.
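
An invented strawman, only to make the request concrete; no such query language exists in RRDtool:

    /* hypothetical: every value of ds0 above 1e6, or unknown */
    const char *query = "SELECT time, value FROM 'router.rrd':ds0 "
                        "WHERE value > 1000000 OR value IS NAN";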

RRDtool 2 Datafile

  • Extent based RRA
    • fixed size extents defined upfront; reasonable default 16KB?
      • fixed size will permit the re-use of extents if one stops being used.
      • it's easier
    • RRA (round-robin) notion modified to 'time' not rows
      • RRA1:
        90 days of 5 minute CF <- user provides
        25,920 rows <- rrd requires
        13 16KB extents <- what rrd consumes
      • same thing, but less complicated -- users don't need complication
      • makes the RRA0 idea (below) much simpler to support
    • Extent offset tracking would likely need a larger header and/or extent based header
      • like an inode tracks to other inodes
    • use posix_fallocate() to avoid excessive I/O (see the allocation sketch after this list)
  • RRA0 stores 'literal storage': no consolidation, no normalization
    • RRA1+ would be CF (not support RAW ... anyone using RAW must use extent-growth on RRA0 to retain data)
    • RRAs would age by time, not by rows, given example:
      RRA0 - store data for 3 days (settable) no matter how many datapoints inserted, sub-second supported (micro or nano second insanity even.)
      RRA1 - a 60-step average store for 1 month
      RRA2 - a 300-step average store for 1 year
      RRAx - whatever
    • special RRA CF types like COMPUTE, Holt-Winters types, etc would only be able to operate on RRA1 not RRA0
    • RRA0 holding all data for RRA1+ would be able to coalesce writes:
      boundary step timing of updates into RRA0 would trigger a normalize/coalesce for a DS in the other RRAs;
      then normalization is the same thing we have now: an update passing RRA1's boundary would normalize into RRA2, etc.
    • everyone cares about precision for 'what is happening now'; RRA0 gives them exactly that
    • no one really cares about that precision a long time ago; RRA1+ gives them that.
      • btw: I'd use RRA1+ as 1-min/5-min and start collecting data more often in RRA0: 30,15,10,1 second intervals.
  • RAW DST - new datasource type
    • store exactly what was given at time it was given; no normalization
      • it's a minor issue to know up front all numeric types
      • people get confused about this, getting it wrong stores wrong information
      • getting it wrong is easy; example: riak/memcache/mysql all have stats output which gives key/value pairs and NO numeric type ... one must basically guess at the type up front.
    • if rrd had a data type RAW then a system which was capable could pass in the DST dynamically and rrdtool would do the operation on export.
    • DST against RAW would require heartbeat:min:max to be passed in as well.
    • RAW could then be used for any numeric which was unknown at the cost of performance due to ignorance.
    • RAW could have an rrdtool modify feature which would convert RAW into another datatype when it was learned (for those so inclined.)
    • RAW would only be supported in RRA0
  • file size currently is an efficient 8 bytes per datapoint; storing time in RRA0 will impact this
    • LZO compression built-in would be highly beneficial if definable per RRA:
      • super long-term datafiles
      • cold-storage rrds that won't be updated again.
      • could be set on RRA1+ (being CFs of RRA0 are not modified as often)
  • delayed RRA1+ CF ( consolidations )
    • tunable delayed compactions: as RRA1+ isn't required in near-time, its consolidation can be delayed
    • delay wouldn't work for algorithms like holt-winters and COMPUTE which would have to operate on RRA1 (not [0] due to above RRA0 implementation)
    • would reduce IO
    • the delay would be randomized (offset) to avoid many consolidations running in unison, benefiting systems with multi-millions of rrd datafiles
  • DEF to support datafile list in RRD xport/graph syntax so offline cold-storage (compressed) rrds can be supported
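
A minimal sketch of fixed-size extent allocation under the layout proposed above; posix_fallocate() is the real POSIX call, while the extent header and all names are hypothetical:

    #include <fcntl.h>
    #include <stdint.h>

    #define EXTENT_SIZE (16 * 1024)  /* the proposed 16KB default */

    struct extent_header {           /* hypothetical on-disk record */
        uint32_t rra_id;             /* which RRA owns this extent  */
        uint32_t seq;                /* position in the RRA's chain */
        uint64_t first_step;         /* time of the first row held  */
    };

    /* Reserve extent number 'idx' up front, so later row writes
     * never have to extend the file; returns 0 on success. */
    int extent_alloc(int fd, uint64_t idx)
    {
        return posix_fallocate(fd, (off_t)(idx * EXTENT_SIZE),
                               EXTENT_SIZE);
    }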

additions from ryan 2013-09-25

  • permitting individual updates to distinct datasources:
    rrd_update file1 ds-nameA,time1:v1,time2:v2 ds-nameB,time1:v1,time2:v2
    rrd_update file1 ds-nameC,time1:v1,time2:v2 ds-nameD,time1:v1,time2:v2
    • note: two calls to rrd_update both inserting into the same time but for different ds-names each
    • note: think about a database with millions of datasources and the rrd datafiles all open writable once-only (mmap()'ed) and the db constantly updating with msync() only, never close() (a sketch of this model follows this list)
    • note: OS overhead on open() is high enough to engineer around it ... the only way to do that is to store more in one file and mmap() that file, but we can't make a user 'coalesce' the data for writing like rrdtool requires now with the 'write once into time-frame' model
    • note: msync() will flush blocks to user/shared space and permit any external app to read it ( without having to go through the write database for reads )
  • fixed extent size during datafile creation
  • ensure the first extent offsets into the extent like rrd does now:
    this actually still helps quite a lot even in a new format, because creation IO won't all happen at the same 'moment' but be spread out over time.
    This is only required on the first extent; subsequent extents need not do it (the benefit is gained from the original offset), and once max-extents is reached the first extent will be round-robin'ed to.
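
A sketch of the "open once, mmap(), update in place, msync(), never close()" model described in the notes above, using only standard POSIX calls; the row layout is left out:

    #include <fcntl.h>
    #include <sys/mman.h>
    #include <sys/stat.h>
    #include <unistd.h>

    /* Map a datafile once; afterwards an update is a plain store
     * into the mapping, and msync() publishes it to readers. */
    void *map_datafile(const char *path, size_t *len)
    {
        int fd = open(path, O_RDWR);
        if (fd < 0)
            return NULL;
        struct stat st;
        if (fstat(fd, &st) < 0) {
            close(fd);
            return NULL;
        }
        *len = (size_t)st.st_size;
        void *base = mmap(NULL, *len, PROT_READ | PROT_WRITE,
                          MAP_SHARED, fd, 0);
        close(fd);  /* the mapping outlives the descriptor */
        return base == MAP_FAILED ? NULL : base;
    }

    /* after storing new values: msync(base, len, MS_ASYNC); */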

design data point - btrdb

You may or may not have seen it, but a lesser-known database called btrdb was developed around 2017 that is similar to how I imagined rrdtool2 could operate. The open source version is here: https://github.com/BTrDB/btrdb with a paper describing it here: https://www.usenix.org/system/files/conference/fast16/fast16-papers-andersen.pdf .

Actually, I recently started contracting for a company called PingThings (https://www.pingthings.io/), which uses a modified version of btrdb for storing power grid time series. (We are also looking for new engineers/contractors.)

database alteration

  • capability of appending a DS to an existing datafile
  • adding/removing RRAs
  • altering RRA properties
  • re-populating with existing data

API abstraction of the storage backend

Would be nice if the new API has an abstract interface allowing different storage backends.

The abstraction interface should mirror the basic operations of the file descriptor API like open, close, read, write and sync. All algorithms in RRDv2 should use this API to read and write the data.

This would allow different backends: files, mmap, memory buffers, network protocols, caching daemons, ...
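
One way the abstract interface could look: a vtable mirroring open/close/read/write/sync as suggested above. All names are hypothetical:

    #include <stddef.h>
    #include <sys/types.h>

    typedef struct rrd_backend rrd_backend_t;

    struct rrd_backend {
        void *ctx;  /* backend-private state: fd, buffer, socket... */
        int     (*open) (rrd_backend_t *b, const char *uri, int flags);
        int     (*close)(rrd_backend_t *b);
        ssize_t (*read) (rrd_backend_t *b, void *buf,
                         size_t n, off_t off);
        ssize_t (*write)(rrd_backend_t *b, const void *buf,
                         size_t n, off_t off);
        int     (*sync) (rrd_backend_t *b);
    };

    /* A file backend would wrap pread()/pwrite(); an mmap backend
     * would copy into its mapping and msync() in sync(); a network
     * backend would translate the calls into protocol messages. */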

Minimize storage I/O

The single biggest cost with deploying RRD, in my experience, has been the I/O cost of a single .rrd file:
seek, read header; seek, write header; and one or more (seek to RRA + offset, write).

Smashing more DS's into a single RRD helps, to an extent, but there is a tradeoff in flexibility by doing that, particularly when your logical breakdown of what goes into a single .rrd needs to add or remove DS's to accommodate changes in the instrumentation of applications.

Likewise, reducing the number of RRA's helps with some of the cost. A single RRA of 1-minute data for 1 year reduces the write costs, but it carries a penalty when graphing long periods of time later.

Throwing hardware at it only goes so far. RAID10 with fast drives helps, but even with that, some of our deployments still require 20 or more RRD servers. The follow-up problem from distributing the data across so many servers then becomes bringing it back to one place for a single combined graph.

While these problems exist no matter what, some emphasis on the I/O cost of a single update may help alleviate this for a number of customers; it may at least lower the capex costs for the rest.
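
The per-update pattern described above, expressed as calls (all offsets are placeholders); everything an update touches is one header read-modify-write plus one row write per RRA:

    #include <unistd.h>

    int write_update(int fd, void *hdr, size_t hdr_len,
                     const void *row, size_t row_len, off_t row_off)
    {
        if (pread(fd, hdr, hdr_len, 0) != (ssize_t)hdr_len)
            return -1;                 /* read header           */
        /* ... update last-update time etc. in hdr here ...     */
        if (pwrite(fd, hdr, hdr_len, 0) != (ssize_t)hdr_len)
            return -1;                 /* write header back     */
        if (pwrite(fd, row, row_len, row_off) != (ssize_t)row_len)
            return -1;                 /* one row write per RRA */
        return 0;
    }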
