License: MIT License

kove's Introduction

kove logo


Watch your in-cluster Kubernetes manifests for OPA policy violations and export them as Prometheus metrics.
Craft alerts & dashboards based on the structured data of the objects live in your environment(s).

About

Open Policy Agent provides the fearsome-but-trustworthy Gatekeeper, which allows for admission control of Kubernetes manifests being submitted to the API. This is really nice, and lets administrators control incoming manifests in as fine-grained a manner as they please.

However, administrators may not always want to take direct action (such as denial) on manifests arriving at the API. This is where kove comes in.
It allows administrators of Kubernetes clusters to define Rego policies that they want to flag violations for by exposing a Prometheus metric.

Some example use cases include monitoring the use of deprecated APIs, unwanted Docker images, or container environment variables containing strings like API_KEY, etc.
Administrators can craft dashboards or alerts when such conditions are observed to better expose this information to users.

kove is built on an informer model rather than admission control, so it works on any existing objects in your cluster instead of only evaluating them as they arrive at the API (upon create/update). This means it'll expose policy violators that may otherwise go unnoticed if they're not updated often.

Example Implementations

kove-deprecations

A good example built on top of kove is the kove-deprecations Helm Chart.
It provides metrics for objects using APIs, annotations, and other such properties which are deprecated (or soon to be).

Simple alerts or Grafana dashboards can then be built for an overview: Grafana Deprecations Example

This grants administrators automated visibility over the objects in their cluster that meet such criteria, which in turn allows for easier preparation of cluster upgrades and alignment with best practices.

Take a look at the policies/ directory (heavily based on the policies from swade1987/deprek8ion).

Metrics

| Metric | Description |
| --- | --- |
| `opa_policy_violation` | Represents a Kubernetes object that violates the provided Rego expression. Includes the labels `name`, `namespace`, `kind`, `api_version`, `ruleset`, and `data` |
| `opa_policy_violations_total` | Total number of policy violations observed |
| `opa_policy_violations_resolved_total` | Total number of policy violation resolutions observed |
| `opa_object_evaluations_total` | Total number of object evaluations conducted |
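As an illustration (not shipped with the project), a Prometheus alerting rule built on these metrics might look like the following. The alert name, `for` duration, and severity are assumptions, as is the convention that the `opa_policy_violation` gauge is set to 1 while a violation is active:

```yaml
groups:
  - name: kove
    rules:
      # Fire while any object is actively violating a policy.
      - alert: OpaPolicyViolation
        expr: opa_policy_violation == 1
        for: 10m
        labels:
          severity: warning
        annotations:
          summary: >-
            {{ $labels.kind }} {{ $labels.namespace }}/{{ $labels.name }}
            violates ruleset "{{ $labels.ruleset }}"
```

The `ruleset` and `data` labels carry the `RuleSet` and `Data` fields from your policy (see the policies section below), so alerts can explain *why* an object is in violation.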

Usage

ConfigMap objects containing the Rego policy/policies and the application configuration can be mounted to configure what you want to evaluate and how you want to evaluate it.

Options

| Option | Default | Description |
| --- | --- | --- |
| `config` | `""` | Path to the config file. If not set, this will look for the file `config.yaml` in the current directory |

config

A YAML manifest can be provided in the following format to describe how and what you want to watch for evaluation:

namespace: default
ignoreChildren: true
regoQuery: data.pkgname.blah
policies:
  - example/policies
objects:
  - group: apps
    version: v1
    resource: deployments
  - group: apps
    version: v1
    resource: daemonsets
  - group: apps
    version: v1
    resource: replicasets
| Option | Default | Description |
| --- | --- | --- |
| `namespace` | `""` | Kubernetes namespace to watch objects in. If empty or omitted, all namespaces will be observed |
| `ignoreChildren` | `false` | Whether objects spawned as part of a user-managed object (such as a ReplicaSet from a user-managed Deployment) should be skipped during evaluation |
| `regoQuery` | `data[_].main` | The Rego query to read evaluation results from. This should match the expression in your policy that surfaces violation data |
| `policies` | none | A list of files/directories containing Rego policies to evaluate objects against |
| `objects` | none | A list of GroupVersionResource expressions to observe and evaluate. If empty, all object kinds will be evaluated (apart from those defined in `ignoreKinds`) |
| `ignoreKinds` | `[apiservice, endpoint, endpoints, endpointslice, event, flowschema, lease, limitrange, namespace, prioritylevelconfiguration, replicationcontroller, runtimeclass]` | A list of object kinds to ignore for evaluation |
| `ignoreDifferingPaths` | `[metadata/resourceVersion, metadata/managedFields/0/time, status/observedGeneration]` | A list of JSON paths to ignore for reevaluation when a change in the monitored object is observed |

The above example configuration would instruct kove to monitor apps/v1/Deployment, apps/v1/DaemonSet, and apps/v1/ReplicaSet objects in the default namespace, but ignore child objects, yielding its results from the data.pkgname.blah expression in the provided policy.

policies

There are some important semantics to understand when crafting your Rego policies for use with kove.
The expression that you evaluate from your query must return structured data with the following fields:

  • Name: The name of the object being evaluated
  • Namespace: The namespace in which the object you're evaluating resides
  • Kind: The kind of object being evaluated
  • ApiVersion: The version of the Kubernetes API the object is using
  • RuleSet: A short description that describes why this is a violation
  • Data: Additional arbitrary data you wish to expose about the object

The above data are provided by kove when it evaluates an object, with the exception of RuleSet & Data which should be defined in the Rego expression. For instance, if we were to evaluate the query data.example.bad, our policy may look something like this:

package example

# Label matchers we want to look for.
labels["secure"] = ["nope"]

# Kinds of objects we care about evaluating.
# This isn't strictly necessary if you're satisfied with the 'objects' configuration
# option for kove; it'll only watch what it's told.
kinds = ["Deployment", "StatefulSet", "DaemonSet"]

bad[stuff] {
	# Assign our object manifest (input) to the variable 'r'
	r := input

	# Does our object's 'kind' field match any in our 'kinds' array?
	r.kind == kinds[_]

	# Does our object's 'secure' label match any in our 'labels.secure' array?
	r.metadata.labels.secure == labels.secure[_]

	# If the above conditions are true, express a set containing various pieces of
	# information about our object. As you can see, we're assigning this to the variable
	# 'stuff', which you may notice in the expression signature is what we're returning.
	# This information is then used to expose a Prometheus metric with labels using this
	# information.
	stuff := {
		"Name": r.metadata.name,
		"Namespace": r.metadata.namespace,
		"Kind": r.kind,
		"ApiVersion": r.apiVersion,
		"RuleSet": "Insecure object", # Explain why this is a violation
		"Data": r.metadata.annotations["something"] # Additional arbitrary information
	}
}

If you have a test cluster (perhaps built on kind), you can try out the evaluation of this policy against a violating Deployment.
Check out more examples in the examples/ directory.
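For reference, a minimal Deployment that would trip the example policy above (because its top-level `secure: nope` label matches the policy's label matcher) might look like this; the names and image are arbitrary:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: violation-deployment
  namespace: default
  labels:
    secure: nope
  annotations:
    # Surfaced via the policy's "Data" field
    something: "extra detail about this object"
spec:
  replicas: 1
  selector:
    matchLabels:
      app: violation
  template:
    metadata:
      labels:
        app: violation
    spec:
      containers:
        - name: app
          image: nginx:latest
```

Applying this manifest to a cluster watched by kove (with the example policy loaded and `regoQuery: data.example.bad`) should yield an `opa_policy_violation` metric labelled with `ruleset="Insecure object"`.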

Deployment

Artifact HUB

A Helm Chart is available on Artifact HUB, with accompanying implementation charts built on top of it, like kove-deprecations.

kove's People

Contributors

cmacrae, rcjames


kove's Issues

Mitigate throttling

Even in an "empty" cluster, when watching all object kinds, informers can get a bit too enthusiastic and start getting throttled.

Implement a work queue.

Old violations aren't removed

Currently, violation metrics are never deleted after they are set. This means once a violation is resolved, the metric for it continues to be exported until the exporter is restarted.

Ideally, once a violation ceases to exist, it should then be removed from the Prometheus vector.

Structured logging

It'd be nice to have some logging (structured, if implemented).
Perhaps using klog.

Expose rego query as a parameter

Initial implementation statically defines the following query:

	r := rego.New(
		rego.Query("data[_].main"),
	)

We should expose this to allow users to assess whatever results they want.

Provide config option to ignore child objects

If child objects are created in the API by a user-controlled manifest (such as a ReplicaSet that's created by a user-defined Deployment), it's possible that a violation will be observed for the child in addition to the parent (depending on the policy, of course).

This may not be desirable in some cases, so it'd be nice to have something like an ignore_children: true option.

This can be checked for in the metadata.ownerReferences field of the manifest. For example:

	"ownerReferences": [
	    {
		"apiVersion": "apps/v1",
		"blockOwnerDeletion": true,
		"controller": true,
		"kind": "Deployment",
		"name": "violation-deployment",
		"uid": "32715671-2d99-4a0d-8fde-800207ef4532"
	    }
	]
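A minimal sketch of such a check, using only the standard library against a decoded manifest (the helper name `hasOwner` is hypothetical, not kove's actual implementation):

```go
package main

import (
	"encoding/json"
	"fmt"
)

// hasOwner reports whether a decoded Kubernetes manifest declares any
// ownerReferences, i.e. whether it is a child of another object. An
// ignore-children option would skip evaluation when this returns true.
func hasOwner(manifest map[string]interface{}) bool {
	meta, ok := manifest["metadata"].(map[string]interface{})
	if !ok {
		return false
	}
	owners, ok := meta["ownerReferences"].([]interface{})
	return ok && len(owners) > 0
}

func main() {
	// A trimmed-down child manifest, mirroring the example above.
	raw := `{"kind":"ReplicaSet","metadata":{"name":"violation-deployment-abc",
		"ownerReferences":[{"apiVersion":"apps/v1","kind":"Deployment",
		"name":"violation-deployment"}]}}`
	var obj map[string]interface{}
	if err := json.Unmarshal([]byte(raw), &obj); err != nil {
		panic(err)
	}
	fmt.Println(hasOwner(obj)) // true: this ReplicaSet is owned by a Deployment
}
```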

Multiple reevaluations of fresh objects

When creating an object of a kind that's being monitored, since 1.20.0 the manifest changes enough in its initial lifecycle that we reevaluate several times, almost immediately.

Allow users to express what constructs they want to monitor

At the moment, the following expression is used to collect a static set of construct types:

	gvrs := []schema.GroupVersionResource{
		{Group: "apps", Version: "v1", Resource: "daemonsets"},
		{Group: "apps", Version: "v1", Resource: "deployments"},
		{Group: "apps", Version: "v1", Resource: "replicasets"},
		{Group: "apps", Version: "v1", Resource: "statefulsets"},
		{Group: "networking.k8s.io", Version: "v1", Resource: "networkpolicies"},
		{Group: "policy", Version: "v1beta1", Resource: "podsecuritypolicies"},
		{Group: "extensions", Version: "v1beta1", Resource: "ingresses"},
	}

A configuration schema (perhaps using viper) should be implemented to allow users to express which constructs from which APIs they want to evaluate.

Load policies from file/ConfigMap

The PoC implementation loads an included policy borrowed from kube-no-trouble.
We need to allow users to define their own policies; doing this with a flag pointing at a file (and thus a ConfigMap mount) is nice and simple.

Allow configuration of list options

We're chucking an empty metav1.ListOptions when we look up resources.
People may want to filter based on a LabelSelector/FieldSelector, etc.

Add functional metrics

Rather than just metrics for the policy violations themselves, there should probably also be additional metrics about the exporter functionality:

  • Total objects evaluated
  • Total violations flagged

Better evaluation trigger

Initial implementation of the evaluation trigger is about as crude as it gets: every minute, we pull all the manifests in, then evaluate them... we even have to wait for the first minute before our timer's initial tick.

There has to be a nicer way to do this. Perhaps based on watches
