bergerx / kubectl-status

A kubectl plugin to print a human-friendly output that focuses on the status fields of the resources in kubernetes.

License: Apache License 2.0

Makefile 0.90% Go 99.10%
kubectl-plugin kubectl-plugins kubernetes

kubectl-status's Introduction

kubectl status

A kubectl plugin to print a human-friendly output that focuses on the status fields of the resources in kubernetes.

Just a different representation of the kubernetes resources (next to get and describe).

This plugin uses templates for well-known API conventions and has support for hardcoded resources. Not all resources are fully supported.

Installation

You can install kubectl status using Krew, the package manager for kubectl plugins.

After you install Krew, just run:

kubectl krew install status
kubectl status --help

Upgrade

Assuming you installed using Krew:

kubectl krew upgrade status

Demo

Example Pod: (screenshot)

Example StatefulSet: (screenshot)

Example Deployment and ReplicaSet: (screenshot)

Example Service: (screenshot)

Features

  • aims for ease of understanding the status of a given resource,
  • aligned with other kubectl cli subcommand usages (just like kubectl get or kubectl describe),
  • uses colors extensively for a better look-and-feel experience: a white-ish output means everything is OK, while a red-ish output strongly indicates something is wrong,
  • erroneous/impacting states are explicit and obvious,
  • explicit messages for not-so-easy-to-understand status (e.g., ongoing rollout),
  • goes the extra mile for better expressing the status (e.g., show spec diff for ongoing rollouts),
  • compact, non-extensive output to keep it sharp,
  • no external dependencies, doesn't shell out, and so doesn't depend on client/workstation configuration

Usage

In most cases, replacing a kubectl get ... with a kubectl status ... would be sufficient.

Examples:

kubectl status pods                     # Show status of all pods in the current namespace
kubectl status pods --all-namespaces    # Show status of all pods in all namespaces
kubectl status deploy,sts               # Show status of all Deployments and StatefulSets in the current namespace
kubectl status nodes                    # Show status of all nodes
kubectl status pod my-pod1 my-pod2      # Show status of some pods
kubectl status pod/my-pod1 pod/my-pod2  # Same as the previous
kubectl status svc/my-svc1 pod/my-pod2  # Show status of various resources
kubectl status deployment my-dep        # Show status of a particular deployment
kubectl status deployments.v1.apps      # Show deployments in the "v1" version of the "apps" API group.
kubectl status node -l node-role.kubernetes.io/master  # Show status of nodes marked as master

Development

Please see the CONTRIBUTING.md file for development-related documentation.

License

Apache 2.0. See LICENSE.

kubectl-status's People

Contributors

bergerx · dependabot-preview[bot] · dependabot[bot]


kubectl-status's Issues

adopt cli-utils kstatus library

Seems like there are some commonalities between this and kstatus from the kubernetes-sigs cli-utils repository, which aims to provide a standard library for handling the status of objects, and even defines a set of standard conditions that the library will understand and that developers are encouraged to adopt in their CRDs.
https://github.com/kubernetes-sigs/cli-utils/tree/5e6805052d6c39a77fc1355ad560b2f270677e63/pkg/kstatus

It could be better to adopt both:

  • the library, and
  • the terminology the library follows as part of the reported status.

Include each pod's resource usage details when querying a node

Currently, describe prints the details below when querying a node, but the real-time utilization is missing from that picture.

Non-terminated Pods:          (12 in total)
  Namespace                   Name                                CPU Requests  CPU Limits  Memory Requests  Memory Limits  AGE
  ---------                   ----                                ------------  ----------  ---------------  -------------  ---
  kube-system                 coredns-6955765f44-gcntl            100m (5%)     0 (0%)      70Mi (3%)        170Mi (9%)     3h56m
  kube-system                 coredns-6955765f44-knpzb            100m (5%)     0 (0%)      70Mi (3%)        170Mi (9%)     18d
  kube-system                 etcd-minikube                       0 (0%)        0 (0%)      0 (0%)           0 (0%)         18d
  kube-system                 kube-apiserver-minikube             250m (12%)    0 (0%)      0 (0%)           0 (0%)         18d
  kube-system                 kube-controller-manager-minikube    200m (10%)    0 (0%)      0 (0%)           0 (0%)         18d
  kube-system                 kube-proxy-8n656                    0 (0%)        0 (0%)      0 (0%)           0 (0%)         18d
  kube-system                 kube-scheduler-minikube             100m (5%)     0 (0%)      0 (0%)           0 (0%)         18d
  kube-system                 metrics-server-6754dbc9df-frtfb     0 (0%)        0 (0%)      0 (0%)           0 (0%)         13d
  kube-system                 storage-provisioner                 0 (0%)        0 (0%)      0 (0%)           0 (0%)         18d
  test1                       web-0                               0 (0%)        0 (0%)      0 (0%)           0 (0%)         12d
  test1                       web-1                               0 (0%)        0 (0%)      0 (0%)           0 (0%)         12d
  test1                       web-2                               0 (0%)        0 (0%)      0 (0%)           0 (0%)         12d

But operators would like to answer the question "Which pod is 'actually' using the most resources?"

(This may be a bit of a stretch goal for this plugin.)
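A minimal sketch of how the per-pod answer could be computed from the kubelet's /stats/summary numbers. The pod names and nanocore values below are made-up illustrations, and `topPodsByCPU`/`cores` are hypothetical helpers, not part of the plugin:

```go
package main

import (
	"fmt"
	"sort"
)

// podCPU holds a pod's real-time CPU usage as reported by the kubelet
// /stats/summary endpoint (.pods[].cpu.usageNanoCores). The values used
// below are made-up illustrations, not real measurements.
type podCPU struct {
	Name           string
	UsageNanoCores uint64
}

// topPodsByCPU sorts pods by real-time CPU usage, highest first, so an
// operator can answer "which pod is actually using the most resources?".
func topPodsByCPU(pods []podCPU) []podCPU {
	sorted := append([]podCPU(nil), pods...)
	sort.Slice(sorted, func(i, j int) bool {
		return sorted[i].UsageNanoCores > sorted[j].UsageNanoCores
	})
	return sorted
}

// cores converts nanocores to cores.
func cores(nanoCores uint64) float64 {
	return float64(nanoCores) / 1e9
}

func main() {
	pods := []podCPU{
		{"kube-apiserver-minikube", 250_000_000},
		{"coredns-6955765f44-gcntl", 12_000_000},
		{"web-0", 480_000_000},
	}
	for _, p := range topPodsByCPU(pods) {
		fmt.Printf("%-30s %.3f cores\n", p.Name, cores(p.UsageNanoCores))
	}
}
```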

Create a developer/contributor document

We should cover:

  • how to contribute - should aim for a good "first 10 minutes" experience
  • requirements for acceptable contributions (e.g., go fmt, go vet)
  • code review requirements, including how code review is conducted, what must be checked, and what is required to be acceptable.
  • communication channels

Add watch capability

It would be really helpful if we implement a watch functionality to see the updates in real-time.

But watching an object while also querying other resources and rendering them may print inconsistent or invalid states. Some updates to objects are applied quite fast, and it's also better not to overwhelm the Kubernetes apiserver.

It may be better to implement this after the shallow flag from #186 lands; when watch is set, shallow mode could be implicitly enabled.

Dependabot can't resolve your Go dependency files

Dependabot can't resolve your Go dependency files.

As a result, Dependabot couldn't update your dependencies.

The error Dependabot encountered was:

vbom.ml/[email protected]: unrecognized import path "vbom.ml/util" (https fetch: Get https://vbom.ml/util?go-get=1: EOF)

If you think the above is an error on Dependabot's side please don't hesitate to get in touch - we'll do whatever we can to fix it.

View the update logs.

Reports incorrect relative times

This is from kubectl describe hpa which is accurate:

  Normal  SuccessfulRescale  29m (x2 over 41m)  horizontal-pod-autoscaler  New size: 100; reason: external metric AWS-SQS-Queue-ApproximateNumberOfMessages-helloworld(&LabelSelector{MatchLabels:map[string]string{deploymentName: sqs-consumer,},MatchExpressions:[]LabelSelectorRequirement{},}) above target

This is from kubectl status hpa:

    SuccessfulRescale 4h ago (x2 over 4h) from horizontal-pod-autoscaler: New size: 100; reason: external metric AWS-SQS-Queue-ApproximateNumberOfMessages-helloworld(&LabelSelector{MatchLabels:map[string]string{deploymentName: sqs-consumer,},MatchExpressions:[]LabelSelectorRequirement{},}) above target

My local TZ offset is -0500, not sure if that's related or not.

Seeing the same with other object types (e.g. pod, hpa is just an example).

extended ingress domain and TLS secret checks

Push #199 further with these stretch goals for ingress checks:

  • the cert in the secret is not self-signed and must match the host
  • another ingress using the same host+port (may be hard to catch since regexes can be used and there can be many ingresses in a cluster; checking only the ingresses in the same namespace would catch most instances for most operators)
  • verify the domain and cert by requesting the host+path (this may be misleading when operators are behind a VPN or the ingress has IP restrictions)

ResourceQuota template is missing

If you are actively using ResourceQuotas, please think about contributing, because we don't yet use ResourceQuotas actively and our knowledge of its details is quite limited.

Show diff for ongoing deployment rollout as STS does

We can show a diff if there is an ongoing STS operation with the changes.

We can do something similar with Deployments, but it's a bit trickier to handle.

At a given time, an STS has only one currentRevision and one updateRevision, each referring to a Kind: ControllerRevision, which allows us to compute a diff.

But a Deployment may have many pods, each from a different ReplicaSet. E.g., at a given time, a Deployment may match 3 pods, each from a different ReplicaSet. Calculating a diff in that case would be troublesome. Maybe we can try to identify the last two and warn if there are more.
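A minimal sketch of the "identify the last two" idea, under the assumption that each ReplicaSet carries a numeric deployment.kubernetes.io/revision annotation (which real Deployments set); the `rs` type and `lastTwoRevisions` helper are hypothetical stand-ins:

```go
package main

import (
	"fmt"
	"sort"
	"strconv"
)

// rs is a stripped-down stand-in for a ReplicaSet: just its name and the
// value of its deployment.kubernetes.io/revision annotation.
type rs struct {
	Name     string
	Revision string
}

// lastTwoRevisions sorts ReplicaSets by revision (numerically, since "10"
// must sort after "9") and returns the two most recent ones, the only pair
// a diff can reasonably be computed between. It also reports whether more
// than two ReplicaSets exist, so the caller can warn. Assumes len >= 2.
func lastTwoRevisions(sets []rs) (latest, previous rs, more bool) {
	sorted := append([]rs(nil), sets...)
	sort.Slice(sorted, func(i, j int) bool {
		a, _ := strconv.Atoi(sorted[i].Revision)
		b, _ := strconv.Atoi(sorted[j].Revision)
		return a > b
	})
	return sorted[0], sorted[1], len(sorted) > 2
}

func main() {
	latest, previous, more := lastTwoRevisions([]rs{
		{"web-5d4f8", "2"}, {"web-7b9c1", "10"}, {"web-6a2e3", "9"},
	})
	fmt.Println(latest.Name, previous.Name, more)
}
```

Sorting numerically rather than lexically matters here: a string sort would put revision "10" before "9".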

())

Truncated code excerpt around the TODO comment:

	DoRaw(context.TODO())
	if err != nil {
		// ignore any errors querying the endpoint
		return nil
	}
	nodeStatsSummary := make(map[string]interface{})


This issue was generated by todo based on a TODO comment in 5e4b280. It's been assigned to @bergerx because they committed the code.

add render related flags/options

We can let users choose what is to be displayed. Here are some flags that may be implemented:

  • include owners: render owner objects (this is the current default behavior)
  • include dependents: reverse of owners
  • include events
  • include matching services
  • include matching ingresses
  • include app information (e.g helm)
  • cascade render options: apply the same render options when rendering a related object
  • shallow: render only the object itself, without querying the Kubernetes API for any resources beyond the immediate one; would likely disable most includes

.cpu.usageCoreNanoSeconds: 71853537319612

Truncated template excerpt around the TODO comment:

	TODO: .cpu.usageCoreNanoSeconds: 71853537319612
	{{- */ -}}
	{{- define "node_stats_summary_resources" }}
	{{- with .cpu.usageNanoCores }}cpu {{ . | float64 | divFloat64 1000000000 | printf "%.2g" }}core/sec, {{ end }}
	{{- with .memory -}}
	mem {{ .usageBytes | float64 | humanizeSI "B" -}}


This issue was generated by todo based on a TODO comment in 5e4b280. It's been assigned to @bergerx because they committed the code.

enable debug/verbose mode/flag

We should be able to print out the details about the queries made to the API server.

Something like this:

$ kubectl options
...
  -v, --v=0: number for the log level verbosity
...

add ability to show changes in last/recent Deployment rollout

When checking a recent rollout, it makes sense to see what has changed.

E.g. here is an example rollout by AKS on their managed resources, it would be nice to see what exactly they updated:
(screenshot)

We already have the relevant information kept inside the ReplicaSets. What's needed is to figure out some recent rollouts.

This feature needs to be placed behind a flag (#186), or it will bloat the Deployment output. In some cases it could also be nice to tell it how many recent rollouts to show, or maybe even a time or timeframe; we need to figure this out:

  --show-recent-rollouts=N
  --show-last-rollout
  --rollouts-since=2020-12-31
  --rollouts-in=1d

This feature is different from #167, which is about showing changes for ongoing rollouts.

If a pod/rs has any problems include its owners' and matching services' statuses

In some cases when checking a pod, the root cause could be a recent broken rollout, which won't be visible from the pod's perspective alone.

As an example, there is a recent STS/Deployment rollout with a broken image reference. It would be useful to see the status of the parent object to see details about the rollout which caused the issue.

0.135core/sec(0.01% of available)

Truncated template excerpt around the TODO comments:

	TODO: 0.135core/sec(0.01% of available)
	TODO: .cpu.usageCoreNanoSeconds: 71853537319612
	{{- */ -}}
	{{- define "node_stats_summary_resources" }}
	{{- with .cpu.usageNanoCores }}cpu {{ . | float64 | divFloat64 1000000000 | printf "%.2g" }}core/sec, {{ end }}
	{{- with .memory -}}


This issue was generated by todo based on a TODO comment in 5e4b280. It's been assigned to @bergerx because they committed the code.

support ingress v1

It would be better to move the whole implementation into a template function, as described in #183.

if unsupportedApiVersion := checkUnsupportedIngressApiVersion(obj); unsupportedApiVersion != "" {
	out["unsupportedApiVersion"] = unsupportedApiVersion
	return nil
}

func checkUnsupportedIngressApiVersion(obj runtime.Object) string {
	objectGroupVersion := obj.GetObjectKind().GroupVersionKind().GroupVersion().String()
	supportedGroupVersion := v1beta1.SchemeGroupVersion.String()
	if objectGroupVersion != supportedGroupVersion {
		return objectGroupVersion
	}
	return ""
}
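One way the check could be generalized to accept both the legacy and the v1 group/versions. This is a plain-string sketch rather than the plugin's actual code; `checkUnsupportedIngressApiVersion` here is a hypothetical variant of the function above:

```go
package main

import "fmt"

// supportedIngressGroupVersions lists the API group/versions the template
// could handle once both the legacy and the v1 Ingress are supported.
// The strings mirror the real Kubernetes group/versions.
var supportedIngressGroupVersions = map[string]bool{
	"extensions/v1beta1":        true,
	"networking.k8s.io/v1beta1": true,
	"networking.k8s.io/v1":      true,
}

// checkUnsupportedIngressApiVersion generalizes the single-version check:
// it returns the group/version string only when it is unsupported, and an
// empty string when the version is fine.
func checkUnsupportedIngressApiVersion(groupVersion string) string {
	if supportedIngressGroupVersions[groupVersion] {
		return ""
	}
	return groupVersion
}

func main() {
	fmt.Println(checkUnsupportedIngressApiVersion("networking.k8s.io/v1"))       // prints an empty line (supported)
	fmt.Println(checkUnsupportedIngressApiVersion("networking.k8s.io/v2alpha1")) // echoed back (unsupported)
}
```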

Dependabot can't resolve your Go dependency files

Dependabot can't resolve your Go dependency files.

As a result, Dependabot couldn't update your dependencies.

The error Dependabot encountered was:

vbom.ml/[email protected]: unrecognized import path "vbom.ml/util" (https fetch: Get http://vbom.ml/util/?go-get=1: redirected from secure URL https://vbom.ml/util?go-get=1 to insecure URL http://vbom.ml/util/?go-get=1)

If you think the above is an error on Dependabot's side please don't hesitate to get in touch - we'll do whatever we can to fix it.

View the update logs.

when a status.conditions has unknown it should be explicit

Currently, we only color-code it based on the status field value. But conventionally the status field has one of three values: "True", "False", "Unknown".

The current behavior treats Unknown as an explicit faulty status rather than as missing information. We should be explicit about the "Unknown" value in such cases.

See https://github.com/kubernetes/community/blob/master/contributors/devel/sig-architecture/api-conventions.md#typical-status-properties for more details.
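A sketch of rendering Unknown explicitly instead of folding it into the faulty bucket; `describeCondition` is a hypothetical helper, not the plugin's renderer:

```go
package main

import "fmt"

// describeCondition renders a condition status explicitly instead of only
// color-coding it: per the API conventions, status is one of "True",
// "False" or "Unknown", and Unknown means the controller could not
// determine the state, not that it is a confirmed failure.
func describeCondition(condType, status string) string {
	switch status {
	case "True":
		return condType
	case "False":
		return "Not" + condType
	case "Unknown":
		return condType + " is Unknown (the controller could not determine it)"
	default:
		// Anything else is off-convention; surface it verbatim.
		return condType + "=" + status
	}
}

func main() {
	fmt.Println(describeCondition("Ready", "Unknown"))
}
```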

If pod is not a guaranteed/high priority pod, warn about the likelihood of OOM issues based on node's memory utilisation

If the underlying node's memory usage is already high (or the node is already under memory pressure), it's more likely that burstable/besteffort pods will soon be killed on that node.

Note, though, that the OOM killer acts based on the oom_score (see https://kubernetes.io/docs/concepts/scheduling-eviction/node-pressure-eviction/#node-out-of-memory-behavior).

I'm not really sure how the QoS level vs. the pod priority plays into the oom_score_adj value. E.g., can a Guaranteed pod with very low priority be killed before a BestEffort pod with very high priority?
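For reference, the kubelet's documented oom_score_adj assignment can be sketched as below (constants and formula per the Kubernetes node-pressure-eviction docs; `oomScoreAdj` is a hypothetical helper, not kubelet code):

```go
package main

import "fmt"

// oomScoreAdj sketches the documented kubelet policy: Guaranteed pods get
// -997, BestEffort pods get 1000, and Burstable pods get a value that
// shrinks as their memory request grows relative to node capacity,
// clamped to the [2, 999] range.
func oomScoreAdj(qos string, memRequestBytes, nodeCapacityBytes int64) int {
	switch qos {
	case "Guaranteed":
		return -997
	case "BestEffort":
		return 1000
	default: // Burstable
		adj := 1000 - (1000*memRequestBytes)/nodeCapacityBytes
		if adj < 2 {
			adj = 2
		}
		if adj > 999 {
			adj = 999
		}
		return int(adj)
	}
}

func main() {
	// A Burstable pod requesting 10% of node memory scores 900: far more
	// likely to be OOM-killed than any Guaranteed pod (-997).
	fmt.Println(oomScoreAdj("Burstable", 100<<20, 1000<<20))
	fmt.Println(oomScoreAdj("Guaranteed", 100<<20, 100<<20))
}
```

As far as the documented formula goes, pod priority does not enter the oom_score_adj at all; priority influences the kubelet's own eviction ordering instead, so the kernel OOM killer could indeed kill a low-priority BestEffort pod before a high-priority one only by coincidence of memory usage.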

verify that ingress.spec.tls.secretName exists

In some cases, the certificate for an ingress may be missing, and it falls back to the default certificate.

It would be good if we can catch these cases:

  • ingress should have a .spec.tls, or its TLS will be left to the default ingress config
  • if a .spec.tls[*].hosts field exists without a secretName next to it, the default ingress cert will be used
  • the secret exists
  • the secret is type: kubernetes.io/tls
  • the secret has the right keys (tls.crt and tls.key): https://kubernetes.io/docs/concepts/services-networking/ingress/#tls

I pushed some stretch goals into #200
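The checks above can be sketched as a small validation function; the `secret` struct and `checkTLSSecret` helper are hypothetical stand-ins for the real client-go types:

```go
package main

import "fmt"

// secret is a minimal stand-in for a v1 Secret: its type plus data keys.
type secret struct {
	Type string
	Data map[string][]byte
}

// checkTLSSecret applies the listed checks to a referenced secret: it must
// exist, be of type kubernetes.io/tls, and carry tls.crt and tls.key.
// A nil argument models a dangling secretName reference.
func checkTLSSecret(s *secret) []string {
	if s == nil {
		return []string{"referenced secret does not exist; the default ingress certificate will be served"}
	}
	var problems []string
	if s.Type != "kubernetes.io/tls" {
		problems = append(problems, "secret type is "+s.Type+", expected kubernetes.io/tls")
	}
	for _, key := range []string{"tls.crt", "tls.key"} {
		if _, ok := s.Data[key]; !ok {
			problems = append(problems, "secret is missing key "+key)
		}
	}
	return problems
}

func main() {
	s := &secret{Type: "Opaque", Data: map[string][]byte{"tls.crt": nil}}
	for _, p := range checkTLSSecret(s) {
		fmt.Println(p)
	}
}
```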

rollouts are not sorted

When displaying recent rollouts for statefulsets and deployments, the rollouts are randomly sorted:
(screenshot)

tests for go templates

Currently, template development depends on developers having access to live objects. The Kubernetes API has lots of optional fields that show up only when a value is set, and in many cases they depend on other fields' existence. We need to figure out a way to validate templates and also provide a wide range of test artifacts. With that, we could work on template improvements just from a get -o yaml output provided by users.

We need to be able to test templates free from a cluster, just by providing some basic information.

This will make template development faster and easier, without depending on developers' environments.

replace the `kindInjectFuncMap` with template functions

Convert the methods in this map to template methods:

kindInjectFuncMap := map[string][]func(obj runtime.Object, restConfig *rest.Config, out map[string]interface{}) error{
	"Node":        {includePodDetailsOnNode, includeNodeStatsSummary},
	"Pod":         {includePodMetrics}, // kubectl get --raw /api/v1/nodes/minikube/proxy/stats/summary --> .pods[] | select podRef | containers[] | select name
	"StatefulSet": {includeStatefulSetDiff},
	"Ingress":     {includeIngressServices},
}

There should be a notification if metrics-server is missing

Users may be confused with inconsistent outputs of the same command on different environments.

kubectl-status tries to query and incorporate information from related resources (e.g., Kind: NodeMetrics is included in the Kind: Node results) when they are available.

kubectl-status should display a notification when such resources are not available to include some extra bit of information.

Here is an example of how kubectl top reports a missing metrics-server:

$ kubectl top node
error: Metrics API not available

include helm chart/release (app) info if available

It could be helpful for operators to see which Helm chart/release a resource was created from.

Here is some metadata added by Helm. These are mostly conventions, but surfacing them would still be helpful for the majority of users.

metadata:
  annotations:
    deployment.kubernetes.io/revision: "2"
    meta.helm.sh/release-name: stable-prometheus-operator
    meta.helm.sh/release-namespace: monitoring
  labels:
    app.kubernetes.io/instance: stable-prometheus-operator
    app.kubernetes.io/managed-by: Helm
    app.kubernetes.io/name: grafana
    app.kubernetes.io/version: 7.4.3
    helm.sh/chart: grafana-6.6.3

This could be something like this:

Deployment/stable-prometheus-operator-grafana -n monitoring, created 5mo ago, gen:2
  Helm release:stable-prometheus-operator chart:grafana-6.6.3 

Even for non-Helm users, the app.kubernetes.io/* labels would still be helpful to identify the application.
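Extracting that metadata is straightforward; a sketch with a hypothetical `helmInfo` helper over plain label/annotation maps:

```go
package main

import "fmt"

// helmInfo pulls the conventional Helm metadata out of an object's labels
// and annotations, returning an empty string when none is present. The
// keys follow the conventions shown in the issue above.
func helmInfo(labels, annotations map[string]string) string {
	release := annotations["meta.helm.sh/release-name"]
	if release == "" {
		// Fall back to the app.kubernetes.io convention.
		release = labels["app.kubernetes.io/instance"]
	}
	chart := labels["helm.sh/chart"]
	if release == "" && chart == "" {
		return ""
	}
	return fmt.Sprintf("Helm release:%s chart:%s", release, chart)
}

func main() {
	fmt.Println(helmInfo(
		map[string]string{
			"app.kubernetes.io/managed-by": "Helm",
			"helm.sh/chart":                "grafana-6.6.3",
		},
		map[string]string{"meta.helm.sh/release-name": "stable-prometheus-operator"},
	))
}
```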

improve pv/pvc templates

Currently, kubectl-status is not really helpful for identifying PV/PVC issues.

There are cases like multiple PVs being created for a single PVC, or a PVC being deleted while its PV is still kept around.

(), metav1.ListOptions{FieldSelector: fieldSelector.String()})

Truncated code excerpt around the TODO comment:

	List(context.TODO(), metav1.ListOptions{FieldSelector: fieldSelector.String()})
	if err != nil {
		return errors.WithMessage(err, "Failed getting non-terminated Pods for Node")
	}
	var podsList []interface{}
	for _, pod := range nodeNonTerminatedPodsList.Items {


This issue was generated by todo based on a TODO comment in 5e4b280. It's been assigned to @bergerx because they committed the code.

Place an explicit explanation for stuck sts rollout

We have been suffering from kubernetes/kubernetes#78709 for a while.

Once an existing statefulset is deployed with some faulty config, there seems to be no way to get it to recover without manually deleting the stuck pod. From the PR:

There are a few cases where a StatefulSet can become bricked. Usually from something like setting an invalid image tag in the container. When this occurs, manual intervention is required in order to clear out the bad StatefulSet pods and allow k8s to spawn new ones. In this PR, I took a shot at detecting when a pod is stuck and we have reasonable confidence that replacing that pod will result in a better case than performing a no-op.

And here is the relevant code from the PR:

// isStatefullyStuck returns true if a pod in a stateful set is stuck
// due to a previously bad roll-out. We can detect this by checking all of:
// 1) The pod is in a pending state
// 2) The pod is at a different revision than the update revision
// 3) The update strategy is rolling
// 4) The pod should be updated
func isStatefullyStuck(set *apps.StatefulSet, pod *v1.Pod, currentRevision, updateRevision *apps.ControllerRevision) bool {
	return isPending(pod) &&
		getPodRevision(pod) == currentRevision.Name &&
		currentRevision.Name != updateRevision.Name &&
		set.Spec.UpdateStrategy.Type == apps.RollingUpdateStatefulSetStrategyType &&
		set.Spec.UpdateStrategy.RollingUpdate.Partition != nil &&
		getOrdinal(pod) >= int(*set.Spec.UpdateStrategy.RollingUpdate.Partition)
}

Incomplete Data Reporting

Hello, I checked the behavior in 2 clusters:

Kubernetes: v1.18.0
OS: Oracle Linux v7.8
Kernel: 4.1.12-124.26.5.el7uek.x86_64
Conf: IaC (based on identical template in both cases)

In one of the clusters it reports:

Node / worker1, created 13d ago
linux Oracle Linux Server 7.8 (amd64), kernel 4.1.12-124.35.4.el7uek.x86_64, kubelet v1.18.2, kube-proxy v1.18.2
cpu: 0.107 / 1 (11%) <-
mem: 2.9GB / 7.1GB (42%) <-
ephemeral-storage: 2.2GB
images 50

Values from a node describe:

Addresses:
Hostname: worker1
Capacity:
cpu: 2
ephemeral-storage: 3030800Ki
hugepages-2Mi: 0
memory: 8155976Ki
pods: 110
Allocatable:
cpu: 1
ephemeral-storage: 2256314364
hugepages-2Mi: 0
memory: 7005000Ki
pods: 110

Allocated resources:
  (Total limits may be over 100 percent, i.e., overcommitted.)
  Resource           Requests     Limits
  --------           --------     ------
  cpu                550m (55%)   200m (20%)
  memory             562Mi (8%)   562Mi (8%)
  ephemeral-storage  0 (0%)       0 (0%)
  hugepages-2Mi      0 (0%)       0 (0%)

And in the other Cluster reports:

Node / worker2, created 14d ago
linux Oracle Linux Server 7.8 (amd64), kernel 4.1.12-124.26.5.el7uek.x86_64, kubelet v1.18.2, kube-proxy v1.18.2
cpu: 5, mem: 7.1GB, ephemeral-storage: 2.2GB <--
images 50

Values from a node describe:

Addresses:
Hostname: worker2
Capacity:
cpu: 6
ephemeral-storage: 3030800Ki
hugepages-2Mi: 0
memory: 8156080Ki
pods: 110
Allocatable:
cpu: 5
ephemeral-storage: 2256314364
hugepages-2Mi: 0
memory: 7005104Ki
pods: 110

Allocated resources:
  (Total limits may be over 100 percent, i.e., overcommitted.)
  Resource           Requests      Limits
  --------           --------      ------
  cpu                1710m (34%)   2200m (44%)
  memory             2042Mi (29%)  450Mi (6%)
  ephemeral-storage  0 (0%)        0 (0%)
  hugepages-2Mi      0 (0%)        0 (0%)

In both cases these are recent installations sharing versions and parameters with each other. I cannot understand why one presents the CPU and RAM occupancy ratios correctly while the other does not.
