giantswarm / prometheus

Kubernetes Setup for Prometheus and Grafana

License: Apache License 2.0

Shell 36.95% Mustache 63.05%
prometheus grafana dashboard kubernetes metrics monitoring helm-chart

Introduction


Kubernetes Setup for Prometheus and Grafana

Quick start

To quickly start all the things, just run:

kubectl apply \
  --filename https://raw.githubusercontent.com/giantswarm/prometheus/master/manifests-all.yaml

This will create the namespace monitoring and bring up all components in it.
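
As a quick sanity check that everything came up (a minimal sketch; pod names and counts will differ per cluster):

kubectl get pods --namespace monitoring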

To shut down all components again you can just delete that namespace:

kubectl delete namespace monitoring

Default Dashboards

If you want to re-import the default dashboards from this setup, run this job:

kubectl apply --filename ./manifests/grafana/import-dashboards/job.yaml

In case the job already exists from an earlier run, delete it first:

kubectl --namespace monitoring delete job grafana-import-dashboards
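
A minimal sketch combining both steps so the import can be re-run at any time; the --ignore-not-found flag makes the delete a no-op when the job does not exist yet:

kubectl --namespace monitoring delete job grafana-import-dashboards --ignore-not-found
kubectl apply --filename ./manifests/grafana/import-dashboards/job.yaml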

To access Grafana, you can use kubectl's port-forward functionality:

kubectl port-forward --namespace monitoring service/grafana 3000:3000

You should then be able to access Grafana at http://localhost:3000/login

More Dashboards

See grafana.net for some example dashboards and plugins. The steps below use the Grafana UI; a scripted alternative using Grafana's HTTP API follows the list.

  • Configure Prometheus data source for Grafana.
    Grafana UI / Data Sources / Add data source

    • Name: prometheus
    • Type: Prometheus
    • Url: http://prometheus:9090
    • Add
  • Import Prometheus Stats:
    Grafana UI / Dashboards / Import

    • Grafana.net Dashboard: https://grafana.net/dashboards/2
    • Load
    • Prometheus: prometheus
    • Save & Open
  • Import Kubernetes cluster monitoring:
    Grafana UI / Dashboards / Import

    • Grafana.net Dashboard: https://grafana.net/dashboards/162
    • Load
    • Prometheus: prometheus
    • Save & Open
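
If you prefer to script the data source step instead of clicking through the UI, Grafana's HTTP API can create it. A sketch, assuming the port-forward from above is running and the default admin:admin credentials are still in place:

curl --silent --request POST http://admin:admin@localhost:3000/api/datasources \
  --header 'Content-Type: application/json' \
  --data '{"name":"prometheus","type":"prometheus","url":"http://prometheus:9090","access":"proxy"}'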

Credit

Alertmanager configs and integration in this repository were heavily inspired by the implementation in kayrus/prometheus-kubernetes.

People

Contributors: choffmeister, fuminori-ido, indyfree, josephsalisbury, joshes, jperville, kabakaev, koep, lentil1016, marcelmue, marians, oliver006, pipo02mix, puja108, rahulmahale, slahser, stevenaldinger, stone-z, thomaspeitz, v1k0d3n, webwurst, wombat, zatricky


Issues

grafana-net-2-dashboard.json and grafana-net-737-dashboard.json are obsolete, causing empty graphs in Grafana

I installed kubernetes-prometheus using

kubectl create -f manifests-all.yaml

and everything seems fine, but there are no graphs in Grafana. Why?

[root@k8s-master prometheus]# kubectl version
Client Version: version.Info{Major:"1", Minor:"7", GitVersion:"v1.7.0", GitCommit:"d3ada0119e776222f11ec7945e6d860061339aad", GitTreeState:"clean", BuildDate:"2017-06-29T23:15:59Z", GoVersion:"go1.8.3", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"7", GitVersion:"v1.7.0", GitCommit:"d3ada0119e776222f11ec7945e6d860061339aad", GitTreeState:"clean", BuildDate:"2017-06-29T22:55:19Z", GoVersion:"go1.8.3", Compiler:"gc", Platform:"linux/amd64"}
[root@k8s-master prometheus]# kubectl get pods --namespace monitoring -o wide
NAME                                 READY     STATUS    RESTARTS   AGE       IP              NODE
alertmanager-1345290433-51n0m        1/1       Running   0          1h        172.16.1.221    k8s-node1
grafana-core-277836579-bpndq         1/1       Running   0          1h        172.16.1.222    k8s-node1
kube-state-metrics-458253108-7mnkl   1/1       Running   0          1h        172.16.1.224    k8s-node1
kube-state-metrics-458253108-p3xsm   1/1       Running   0          1h        172.16.0.107    k8s-master
node-directory-size-metrics-3w1pl    2/2       Running   0          1h        172.16.0.105    k8s-master
node-directory-size-metrics-78jkk    2/2       Running   0          1h        172.16.2.68     k8s-node2
node-directory-size-metrics-j3zj4    2/2       Running   0          1h        172.16.1.223    k8s-node1
prometheus-core-3823466589-j6g4b     1/1       Running   0          1h        172.16.0.108    k8s-master
prometheus-node-exporter-3jvcm       1/1       Running   0          1h        10.161.233.80   k8s-node2
prometheus-node-exporter-bvc4l       1/1       Running   0          1h        10.165.97.219   k8s-node1
prometheus-node-exporter-dwh0k       1/1       Running   0          1h        10.132.41.234   k8s-master


Error from server (NotFound): error when stopping "manifests-all.yaml"

When I remove kubernetes-prometheus, I issue the command "kubectl delete -f manifests-all.yaml"; it takes a long time to finish and throws errors like the following:

[root@k8s-master prometheus]# kubectl version
Client Version: version.Info{Major:"1", Minor:"7", GitVersion:"v1.7.0", GitCommit:"d3ada0119e776222f11ec7945e6d860061339aad", GitTreeState:"clean", BuildDate:"2017-06-29T23:15:59Z", GoVersion:"go1.8.3", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"7", GitVersion:"v1.7.0", GitCommit:"d3ada0119e776222f11ec7945e6d860061339aad", GitTreeState:"clean", BuildDate:"2017-06-29T22:55:19Z", GoVersion:"go1.8.3", Compiler:"gc", Platform:"linux/amd64"}
[root@k8s-master prometheus]# kubectl delete -f manifests-all.yaml
namespace "monitoring" deleted
configmap "alertmanager-templates" deleted
configmap "alertmanager" deleted
deployment "alertmanager" deleted
service "alertmanager" deleted
configmap "grafana-import-dashboards" deleted
job "grafana-import-dashboards" deleted
service "grafana" deleted
configmap "prometheus-core" deleted
service "kube-state-metrics" deleted
clusterrolebinding "prometheus" deleted
clusterrole "prometheus" deleted
Error from server (NotFound): error when stopping "manifests-all.yaml": deployments.extensions "grafana-core" not found
Error from server (NotFound): error when stopping "manifests-all.yaml": deployments.extensions "prometheus-core" not found
Error from server (NotFound): error when stopping "manifests-all.yaml": deployments.extensions "kube-state-metrics" not found
Error from server (NotFound): error when deleting "manifests-all.yaml": serviceaccounts "kube-state-metrics" not found
error when stopping "manifests-all.yaml": timed out waiting for the condition
Error from server (NotFound): error when stopping "manifests-all.yaml": daemonsets.extensions "prometheus-node-exporter" not found
Error from server (NotFound): error when stopping "manifests-all.yaml": services "prometheus-node-exporter" not found
Error from server (NotFound): error when deleting "manifests-all.yaml": namespaces "monitoring" not found
Error from server (NotFound): error when deleting "manifests-all.yaml": namespaces "monitoring" not found
Error from server (NotFound): error when stopping "manifests-all.yaml": services "prometheus" not found

is grafana dashboard UI on localhost:3000?


No UI on localhost:3000; what am I missing here?

Rams-MacBook-Pro:k8smonitoring ram.dhakne$ sudo kubectl get pods --namespace=monitoring
NAME                                  READY   STATUS    RESTARTS   AGE
alertmanager-64fd9d59f9-vsv9g         1/1     Running   1          40m
grafana-core-5cf6b555cc-x27p9         1/1     Running   1          40m
kube-state-metrics-568457dff4-49blf   1/1     Running   1          40m
kube-state-metrics-568457dff4-dq65c   1/1     Running   1          40m
node-directory-size-metrics-s7zv6     2/2     Running   2          40m
prometheus-core-79648bf5cc-kp4t9      1/1     Running   1          40m
prometheus-node-exporter-9xwm2        1/1     Running   1          40m

...
...
Normal SuccessfulMountVolume 5m kubelet, minikube MountVolume.SetUp succeeded for volume "grafana-persistent-storage"
Normal SuccessfulMountVolume 5m kubelet, minikube MountVolume.SetUp succeeded for volume "default-token-2w6fv"
Normal SandboxChanged 5m kubelet, minikube Pod sandbox changed, it will be killed and re-created.
Normal Pulled 5m kubelet, minikube Container image "grafana/grafana:4.2.0" already present on machine
Normal Created 4m kubelet, minikube Created container
Normal Started 4m kubelet, minikube Started container


node low data disk doesn't show alert

I defined a node low data disk alert for testing, like this. The node low root disk alert fires, but the data disk one doesn't:
ALERT NodeLowDataDisk
IF ((node_filesystem_size{mountpoint="/var/lib/docker/"} - node_filesystem_free{mountpoint="/var/lib/docker/"}) / node_filesystem_size{mountpoint="/var/lib/docker/"} * 100) > 1
FOR 2m
LABELS {severity="page"}
ANNOTATIONS {DESCRIPTION="{{$labels.instance}}: Data disk usage is above 1% (current value is: {{ $value }})", SUMMARY="{{$labels.instance}}: Low data disk space"}

The disk mounted at /var/lib/docker looks like this:
[root@master2 prometheus]# df -h
Filesystem           Size  Used  Avail  Use%  Mounted on
/dev/mapper/cl-root   37G  4.0G    33G   11%  /
devtmpfs             3.9G  4.0K   3.9G    1%  /dev
shm                   64M     0    64M    0%  /dev/shm
tmpfs                3.9G  247M   3.6G    7%  /run
tmpfs                3.9G     0   3.9G    0%  /sys/fs/cgroup
/dev/sdc              40G   26G    15G   64%  /var/lib/docke
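
One likely cause, judging from the df output above: the mountpoint label has to match the mount point exactly as node_exporter reports it, which is normally without a trailing slash. A hedged sketch of the same rule with the label corrected (verify the exact label value with a node_filesystem_size query in the Prometheus UI first):

ALERT NodeLowDataDisk
  IF ((node_filesystem_size{mountpoint="/var/lib/docker"} - node_filesystem_free{mountpoint="/var/lib/docker"}) / node_filesystem_size{mountpoint="/var/lib/docker"} * 100) > 1
  FOR 2m
  LABELS {severity="page"}
  ANNOTATIONS {DESCRIPTION="{{$labels.instance}}: Data disk usage is above 1% (current value is: {{ $value }})", SUMMARY="{{$labels.instance}}: Low data disk space"}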

Putting a PVC in Grafana's Deployment doesn't save the data.

The Grafana image has a VOLUME spec at /var/lib/grafana.

The current Deployment uses this:

        volumeMounts:
        - name: grafana-persistent-storage
          mountPath: /var

so the storage isn't persisted even if you change it to a PVC or something, because the VOLUME tag in the image will create an emptyDir at /var/lib/grafana and store it there. We should correct the deployment.yaml to read:

        volumeMounts:
        - name: grafana-persistent-storage
          mountPath: /var/lib/grafana
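
For the data to also survive pod rescheduling, the mount needs to be backed by a claim rather than the default emptyDir. A hedged sketch of the matching volumes section (the claim name grafana-pvc is hypothetical; the PVC must exist in the monitoring namespace):

      volumes:
      - name: grafana-persistent-storage
        persistentVolumeClaim:
          claimName: grafana-pvc   # hypothetical claim name; create the PVC separately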

Default Grafana Login rejected

After I set up the services with kubectl apply --filename https://raw.githubusercontent.com/giantswarm/kubernetes-prometheus/master/manifests-all.yaml and tried to log in with admin:admin, I couldn't access the dashboard and got this error message (browser console):

Possibly unhandled rejection:
{
   "paths":[
      "/api",
      "/api/v1",
      "/apis",
      "/apis/apps",
      "/apis/apps/v1beta1",
      "/apis/authentication.k8s.io",
      "/apis/authentication.k8s.io/v1beta1",
      "/apis/authorization.k8s.io",
      "/apis/authorization.k8s.io/v1beta1",
      "/apis/autoscaling",
      "/apis/autoscaling/v1",
      "/apis/batch",
      "/apis/batch/v1",
      "/apis/batch/v2alpha1",
      "/apis/certificates.k8s.io",
      "/apis/certificates.k8s.io/v1alpha1",
      "/apis/extensions",
      "/apis/extensions/v1beta1",
      "/apis/policy",
      "/apis/policy/v1beta1",
      "/apis/rbac.authorization.k8s.io",
      "/apis/rbac.authorization.k8s.io/v1alpha1",
      "/apis/storage.k8s.io",
      "/apis/storage.k8s.io/v1beta1",
      "/healthz",
      "/healthz/poststarthook/bootstrap-controller",
      "/healthz/poststarthook/extensions/third-party-resources",
      "/healthz/poststarthook/rbac/bootstrap-roles",
      "/logs",
      "/metrics",
      "/swaggerapi/",
      "/ui/",
      "/version"
   ],
   "severity":"warning"
}

System:

Client Version: version.Info{Major:"1", Minor:"7", GitVersion:"v1.7.4", GitCommit:"793658f2d7ca7f064d2bdf606519f9fe1229c381", GitTreeState:"clean", BuildDate:"2017-08-17T08:48:23Z", GoVersion:"go1.8.3", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"5+", GitVersion:"v1.5.6-4+abe34653415733", GitCommit:"abe346534157336e6bd5a70702756cff19d43a49", GitTreeState:"clean", BuildDate:"2017-05-18T16:52:50Z", GoVersion:"go1.7.4", Compiler:"gc", Platform:"linux/amd64"}

add a quick TLDR way of deploying

For those in a hurry or just wanting to set it up again (like me for the nth time now) it would be cool to have a TLDR way (a script or compressed YAMLs or something) to get to a running state.

What do you think @webwurst ?

wait-for-endpoints init-containers fails to load with k8s 1.6.0

Hi,

I just updated to k8s 1.6.0 (via kubeadm) and found that the grafana-import-dashboards job is failing to pick up the kubernetes api.

I am assuming this is because of the new RBAC roles that were added in 1.6, but I am unsure how to fix this issue or hack around it (a hedged RBAC sketch is at the end of this issue).

I believe this issue is around this block of code:

      annotations:
        pod.beta.kubernetes.io/init-containers: '[
          {
            "name": "wait-for-endpoints",
            "image": "giantswarm/tiny-tools",
            "imagePullPolicy": "IfNotPresent",
            "command": ["fish", "-c", "echo \"waiting for endpoints...\"; while true; set endpoints (curl -s --cacert /var/run/secrets/kubernetes.io/serviceaccount/ca.crt --header \"Authorization: Bearer \"(cat /var/run/secrets/kubernetes.io/serviceaccount/token) https://kubernetes.default.svc/api/v1/namespaces/monitoring/endpoints/grafana); echo $endpoints | jq \".\"; if test (echo $endpoints | jq -r \".subsets[].addresses | length\") -gt 0; exit 0; end; echo \"waiting...\";sleep 1; end"],
            "args": ["monitoring", "grafana"]
          }
        ]'

Here is some debugging information.

$ kubectl version
Client Version: version.Info{Major:"1", Minor:"6", GitVersion:"v1.6.0", GitCommit:"fff5156092b56e6bd60fff75aad4dc9de6b6ef37", GitTreeState:"clean", BuildDate:"2017-03-28T19:15:41Z", GoVersion:"go1.8", Compiler:"gc", Platform:"darwin/amd64"}
Server Version: version.Info{Major:"1", Minor:"6", GitVersion:"v1.6.0", GitCommit:"fff5156092b56e6bd60fff75aad4dc9de6b6ef37", GitTreeState:"clean", BuildDate:"2017-03-28T16:24:30Z", GoVersion:"go1.7.5", Compiler:"gc", Platform:"linux/amd64"}

Pods

$ kubectl -n monitoring get pods
NAME                                  READY     STATUS     RESTARTS   AGE
grafana-core-2777125642-hzj36         1/1       Running    0          6m
grafana-import-dashboards-r0kh8       0/1       Init:0/1   0          6m
kube-state-metrics-3573491037-sr51m   1/1       Running    0          6m
node-directory-size-metrics-3gnkn     2/2       Running    0          6m
node-directory-size-metrics-qh9zk     2/2       Running    0          6m
prometheus-core-4230560888-jqh5r      1/1       Running    0          6m
prometheus-node-exporter-3d4sm        1/1       Running    0          6m
prometheus-node-exporter-hqzdm        1/1       Running    0          6m

Logs for the initContainer:

kubectl -n monitoring logs grafana-import-dashboards-r0kh8 -c wait-for-endpoints

waiting...
test: Missing argument at index 2
parse error: Invalid numeric literal at line 1, column 5
parse error: Invalid numeric literal at line 1, column 5
waiting...

I am able to hit the endpoints API via the dashboard:

// 20170407140649
// http://localhost:8001/api/v1/namespaces/monitoring/endpoints/grafana

{
  "kind": "Endpoints",
  "apiVersion": "v1",
  "metadata": {
    "name": "grafana",
    "namespace": "monitoring",
    "selfLink": "/api/v1/namespaces/monitoring/endpoints/grafana",
    "uid": "xxx",
    "resourceVersion": "5366",
    "creationTimestamp": "2017-04-07T17:57:00Z",
    "labels": {
      "app": "grafana",
      "component": "core"
    }
  },
  "subsets": [
    {
      "addresses": [
        {
          "ip": "xxx",
          "nodeName": "xxx-kube-node-0",
          "targetRef": {
            "kind": "Pod",
            "namespace": "monitoring",
            "name": "grafana-core-2777125642-hzj36",
            "uid": "xxx",
            "resourceVersion": "5363"
          }
        }
      ],
      "ports": [
        {
          "port": 3000,
          "protocol": "TCP"
        }
      ]
    }
  ]
}
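
If the RBAC guess is right, granting the job's pod read access to the grafana endpoints should unblock the init container. A minimal sketch, assuming the job runs under the default service account in the monitoring namespace (adjust the subject if it uses a dedicated account):

kubectl apply -f - <<'EOF'
apiVersion: rbac.authorization.k8s.io/v1beta1
kind: Role
metadata:
  name: endpoints-reader
  namespace: monitoring
rules:
- apiGroups: [""]
  resources: ["endpoints"]
  verbs: ["get", "list"]
---
apiVersion: rbac.authorization.k8s.io/v1beta1
kind: RoleBinding
metadata:
  name: endpoints-reader
  namespace: monitoring
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: endpoints-reader
subjects:
- kind: ServiceAccount
  name: default
  namespace: monitoring
EOF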

Grafana ingress template fails to create ingress

After running kubectl apply -f manifests-all.yaml, it fails to create the ingress with this error:

namespace "monitoring" created
clusterrolebinding "kube-state-metrics" created
clusterrole "kube-state-metrics" created
serviceaccount "kube-state-metrics" created
clusterrolebinding "prometheus" created
clusterrole "prometheus" created
serviceaccount "prometheus-k8s" created
deployment "grafana-core" created
configmap "grafana-import-dashboards" created
job "grafana-import-dashboards" created
error: error validating "manifests-all.yaml": error validating data: field spec for v1beta1.IngressSpec: expected object of type map[string]interface{}, but the actual type is []interface {}; if you choose to ignore these errors, turn validation off with --validate=false

The fix is to add rules: at line number 2053 here.
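
For reference, a hedged sketch of a spec that passes validation, with rules: present (extensions/v1beta1 Ingress, as used at the time; the host is a placeholder):

spec:
  rules:
  - host: grafana.example.com
    http:
      paths:
      - path: /
        backend:
          serviceName: grafana
          servicePort: 3000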

Update Prometheus

1.1.0 got out [recently](https://github.com/prometheus/prometheus/releases/tag/v1.1.0) and seems to have some updates for K8s SD; we might want to try that out.

add new informative cluster dashboard

Add new or change the standard k8s grafana dashboard to fit a typical GS user better.

Some things to consider:

  • resource usage per node (+ worker)
  • anything else?

scale components

Should be easy for Prometheus, I guess.
For Grafana, we need to set up a shared MySQL instance.
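
For Grafana, that presumably means pointing all replicas at one database via environment variables in the deployment (names follow Grafana's GF_<section>_<key> convention; the host and credentials below are placeholders):

        env:
        - name: GF_DATABASE_TYPE
          value: mysql
        - name: GF_DATABASE_HOST
          value: mysql:3306          # placeholder shared MySQL service
        - name: GF_DATABASE_NAME
          value: grafana
        - name: GF_DATABASE_USER
          value: grafana
        - name: GF_DATABASE_PASSWORD
          value: changeme            # placeholder; use a Secret in practice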

unable to add prometheus data source

Hi,

Great work on this project! My problem is adding a prometheus data source.

HTTP settings:
URL: http://localhost:9090
Access: proxy

kubectl logs -f [name-of-pod]

These are my errors. Can anyone give any pointers? Has anyone else seen this? Are there more steps needed to set up the Prometheus data source?

t=2018-06-11T10:19:48+0000 lvl=info msg="Plugin dir created" logger=plugins dir=/var/lib/grafana/plugins
t=2018-06-11T10:19:48+0000 lvl=info msg="Initializing Stream Manager"
t=2018-06-11T10:19:48+0000 lvl=info msg="Initializing Alerting" logger=alerting.engine
t=2018-06-11T10:19:48+0000 lvl=info msg="Initializing CleanUpService" logger=cleanup
t=2018-06-11T10:19:48+0000 lvl=info msg="Initializing HTTP Server" logger=http.server address=0.0.0.0:3000 protocol=http subUrl=
t=2018-06-11T11:03:38+0000 lvl=info msg="Request Completed" logger=context userId=0 orgId=0 uname= method=GET path=/ status=302 remote_addr=100.X.Y.Z time_ms=0s size=29
t=2018-06-11T11:10:48+0000 lvl=info msg="Request Completed" logger=context userId=1 orgId=1 uname=admin method=GET path=/api/v1/label/__name__/values status=502 remote_addr=100.X.Y.Z time_ms=8ns size=0
2018/06/11 11:10:48 http: proxy error: dial tcp [::1]:9090: getsockopt: connection refused
t=2018-06-11T11:12:29+0000 lvl=eror msg="Unable to call AWS API" logger=context userId=1 orgId=1 uname=admin error="MissingRegion: could not find region configuration"
t=2018-06-11T11:12:29+0000 lvl=eror msg="Request Completed" logger=context userId=1 orgId=1 uname=admin method=POST path=/api/datasources/proxy/3 status=500 remote_addr=100.X.Y.Z time_ms=252ns size=36
2018/06/11 11:20:39 http: proxy error: dial tcp [::1]:8080: getsockopt: connection refused
t=2018-06-11T11:20:39+0000 lvl=info msg="Request Completed" logger=context userId=1 orgId=1 uname=admin method=GET path=/metrics/find status=502 remote_addr=100.X.Y.Z time_ms=2ns size=0
2018/06/11 11:20:52 http: proxy error: dial tcp [::1]:9090: getsockopt: connection refused
t=2018-06-11T11:20:52+0000 lvl=info msg="Request Completed" logger=context userId=1 orgId=1 uname=admin method=GET path=/api/v1/label/__name__/values status=502 remote_addr=100.X.Y.Z time_ms=2ns size=0
2018/06/11 11:20:53 http: proxy error: dial tcp [::1]:9090: getsockopt: connection refused
t=2018-06-11T11:20:53+0000 lvl=info msg="Request Completed" logger=context userId=1 orgId=1 uname=admin method=GET path=/api/v1/label/__name__/values status=502 remote_addr=100.X.Y.Z time_ms=2ns size=0
2018/06/11 11:20:54 http: proxy error: dial tcp [::1]:9090: getsockopt: connection refused
t=2018-06-11T11:20:54+0000 lvl=info msg="Request Completed" logger=context userId=1 orgId=1 uname=admin method=GET path=/api/v1/label/__name__/values status=502 remote_addr=100.X.Y.Z time_ms=2ns size=0
2018/06/11 11:22:06 http: proxy error: dial tcp [::1]:9090: getsockopt: connection refused
t=2018-06-11T11:22:06+0000 lvl=info msg="Request Completed" logger=context userId=1 orgId=1 uname=admin method=GET path=/api/v1/label/__name__/values status=502 remote_addr=100.X.Y.Z time_ms=2ns size=0

Add RBAC roles and serviceaccount binding

Prometheus talks to the API server for SD; we should check upstream whether there's already an RBAC role for it and integrate it into our setup. This is not very urgent, but it would be nice to have for clusters running RBAC, so we stop handing out admin roles to everything. A sketch of what this could look like is below.
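
A minimal sketch, bound to the prometheus-k8s service account the manifests already create (the resource list matches what the Kubernetes SD watches; rbac.authorization.k8s.io/v1beta1 for clusters of that era):

kubectl apply -f - <<'EOF'
apiVersion: rbac.authorization.k8s.io/v1beta1
kind: ClusterRole
metadata:
  name: prometheus
rules:
- apiGroups: [""]
  resources: ["nodes", "services", "endpoints", "pods"]
  verbs: ["get", "list", "watch"]
---
apiVersion: rbac.authorization.k8s.io/v1beta1
kind: ClusterRoleBinding
metadata:
  name: prometheus
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: prometheus
subjects:
- kind: ServiceAccount
  name: prometheus-k8s
  namespace: monitoring
EOF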

Prometheus failed to list resources

Seems like the Prometheus service can't access k8s resources:

2017-09-05T05:55:25.325516476Z time="2017-09-05T05:55:25Z" level=error msg="github.com/prometheus/prometheus/discovery/kubernetes/kubernetes.go:227: Failed to list *v1.Service: Forbidden: "/api/v1/services?resourceVersion=0" (get services)" component="kube_client_runtime" source="kubernetes.go:75" 
2017-09-05T05:55:25.544671315Z time="2017-09-05T05:55:25Z" level=error msg="github.com/prometheus/prometheus/discovery/kubernetes/kubernetes.go:179: Failed to list *v1.Endpoints: Forbidden: "/api/v1/endpoints?resourceVersion=0" (get endpoints)" component="kube_client_runtime" source="kubernetes.go:75" 
2017-09-05T05:55:25.645500647Z time="2017-09-05T05:55:25Z" level=error msg="github.com/prometheus/prometheus/discovery/kubernetes/kubernetes.go:181: Failed to list *v1.Pod: Forbidden: "/api/v1/pods?resourceVersion=0" (get pods)" component="kube_client_runtime" source="kubernetes.go:75" 
2017-09-05T05:55:25.703932970Z time="2017-09-05T05:55:25Z" level=error msg="github.com/prometheus/prometheus/discovery/kubernetes/kubernetes.go:207: Failed to list *v1.Pod: Forbidden: "/api/v1/pods?resourceVersion=0" (get pods)" component="kube_client_runtime" source="kubernetes.go:75" 
2017-09-05T05:55:25.794057778Z time="2017-09-05T05:55:25Z" level=error msg="github.com/prometheus/prometheus/discovery/kubernetes/kubernetes.go:245: Failed to list *v1.Node: Forbidden: "/api/v1/nodes?resourceVersion=0" (get nodes)" component="kube_client_runtime" source="kubernetes.go:75" 
2017-09-05T05:55:25.794077568Z time="2017-09-05T05:55:25Z" level=error msg="github.com/prometheus/prometheus/discovery/kubernetes/kubernetes.go:180: Failed to list *v1.Service: Forbidden: "/api/v1/services?resourceVersion=0" (get services)" component="kube_client_runtime" source="kubernetes.go:75" 

(the same Forbidden errors for *v1.Service, *v1.Endpoints, *v1.Pod, and *v1.Node listings repeat every second; the remainder of the log is omitted here)

Comment out ingresses

Let's provide ingress configuration for all services where it makes sense, but comment it out, since ingresses are quite specific to each setup. That way they just serve as examples here.

Issue getting application status up in prometheus dashboard

Hi,

I am trying to deploy an Nginx application; it is the first application I am deploying.
Below is my code for the Deployment, Service & ServiceAccount.

(screenshot of the manifests omitted)

I am not sure what the issue is, but the application shows as down on my dashboard. Not sure if there is any mistake in the above code. Dashboard screenshot below.

(dashboard screenshot omitted)

Thanks,

Cannot get Grafana to work

I followed the quickstart doc and ran
kubectl apply --filename https://raw.githubusercontent.com/giantswarm/kubernetes-prometheus/master/manifests-all.yaml

Now when I log in to Grafana, the default Kubernetes Pod Resources dashboard shows N/A for all metrics. All components are running and there are no errors in any logs. Am I missing some other config?
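
One way to narrow this down is to check whether Prometheus is actually scraping its targets, via its HTTP API. A sketch (jq is optional; any target whose health is "down" points at the missing component or permission):

kubectl port-forward --namespace monitoring service/prometheus 9090:9090 &
curl --silent http://localhost:9090/api/v1/targets | jq '.data.activeTargets[] | {job: .labels.job, health: .health}'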

Alertmanager Templates

I want to modify the subject and body of the email alerts sent by Alertmanager. I have tried changing the default ConfigMap of Alertmanager (manifests/alertmanager/alertmanager-templates.yaml), but changing the ConfigMap seems to have no effect on the email template.
Please suggest what to do. Thanks.

End-to-end test

Could we come up with an easy-to-use, simple end-to-end test? It would automatically check whether this setup works for different Kubernetes/Minikube versions, maybe also against kubernetes-anywhere or others. I didn't look at the Kubernetes e2e setups so far; it's just an idea.

/ping @puja108
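
A rough sketch of what a minimal smoke test could look like (assumes kubectl is pointed at a throwaway cluster such as Minikube):

kubectl apply --filename manifests-all.yaml
kubectl --namespace monitoring rollout status deployment/prometheus-core
kubectl --namespace monitoring rollout status deployment/grafana-core
kubectl port-forward --namespace monitoring service/grafana 3000:3000 &
curl --fail --silent http://localhost:3000/login > /dev/null && echo "grafana is up"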

Additional Nodes

First of all, this script is awesome; I got my Prometheus environment started in seconds.
Now I have a question that needs clarification.

  1. Not all the pods running on my master node appear in the Prometheus dashboard.
    (screenshot omitted)

On the master node:
(screenshot omitted)

From the above screenshot: I want the Nginx pod to appear in the Prometheus dashboard.

I need assistance please, as I am doing a POC. My main intention is to deploy a Tomcat application in a container and monitor its resources.

Thanks...

Prometheus-core Insufficient cpu (3) bug

Prometheus-core pods can't be created because of this error:

No nodes are available that match all of the following predicates:: Insufficient cpu (3).

Kubernetes is 1.7.5

There are no logs, because the pod can't even start.
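
The scheduler message means no node has enough unreserved CPU left to satisfy the pod's resources.requests. A hedged sketch of the knob to lower in the prometheus-core container spec (the values are illustrative; they just have to fit what your nodes can spare):

        resources:
          requests:
            cpu: 100m      # lower this if the nodes are small or already busy
            memory: 500Mi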

Prometheus deprecated flag memory-chunks

time="2017-05-01T11:01:31Z" level=warning msg="Flag -storage.local.memory-chunks is deprecated. Its value 500000 is used to override -storage.local.target-heap-size to 1536000000." source="config.go:317"

How to scrape cAdvisor in Kubernetes 1.8

How do I scrape cAdvisor in Kubernetes 1.8? The default configuration is unable to scrape it.

I see this in the official docs. How do I adapt it for this project?

  - job_name: 'kubernetes-nodes-cadvisor'
    tls_config:
      ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
    bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
    kubernetes_sd_configs:
    - role: node
    relabel_configs:
    - action: labelmap
      regex: __meta_kubernetes_node_label_(.+)
    - target_label: __address__
      replacement: kubernetes.default.svc:443
    - source_labels: [__meta_kubernetes_node_name]
      regex: (.+)
      target_label: __metrics_path__
      replacement: /api/v1/nodes/${1}:4194/proxy/metrics/cadvisor
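
After putting this job into the prometheus-core ConfigMap, the running server still has to pick up the change. A hedged sketch (the manifest path is hypothetical; Prometheus 1.x answers POST /-/reload by default, and deleting the pod so the deployment recreates it works too):

kubectl --namespace monitoring apply -f manifests/prometheus/configmap.yaml   # hypothetical path, adjust to this repo
kubectl port-forward --namespace monitoring service/prometheus 9090:9090 &
curl --request POST http://localhost:9090/-/reload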

node-directory-size-metrics

I ran manifests-all.yaml on my Kubernetes master. Everything is up and running, but I can't see container metrics in my Prometheus or Grafana dashboard.
Below are the logs of node-directory-size-metrics:

du: cannot access '/mnt/var/lib/docker/aufs/mnt/925a67dd5dc3f1fb97da6fdf8c77fcc152fc2c3868714781d092a28fe6acac79/proc/8/task/8/fd/4': No such file or directory
du: cannot access '/mnt/var/lib/docker/aufs/mnt/925a67dd5dc3f1fb97da6fdf8c77fcc152fc2c3868714781d092a28fe6acac79/proc/8/task/8/fdinfo/4': No such file or directory
du: cannot access '/mnt/var/lib/docker/aufs/mnt/925a67dd5dc3f1fb97da6fdf8c77fcc152fc2c3868714781d092a28fe6acac79/proc/8/fd/3': No such file or directory
du: cannot access '/mnt/var/lib/docker/aufs/mnt/925a67dd5dc3f1fb97da6fdf8c77fcc152fc2c3868714781d092a28fe6acac79/proc/8/fdinfo/3': No such file or directory

(the same "du: cannot access ... No such file or directory" errors repeat for /proc/16, /proc/23, /proc/30, /proc/37, /proc/44, and /proc/51 under the same mount)

k8s 1.8.x support

Does giantswarm/kubernetes-prometheus support k8s v1.8.x?

Here's my setup details

K8s version : 1.8.2
OS: Ubuntu 16.04

I have updated the kube-state-metrics version to 1.1.0 in manifests-all.yaml after checking compatibility:

image: gcr.io/google_containers/kube-state-metrics:v1.1.0

The kube-state-metrics pod is failing with the error below:

 docker logs -f 3dd00e9c68a9
I1107 11:40:06.519973       1 main.go:164] Using default collectors
I1107 11:40:06.520335       1 main.go:171] Using all namespace
I1107 11:40:06.520529       1 main.go:180] apiserver set to: https://10.242.138.118:6443
I1107 11:40:06.521481       1 main.go:218] service account token present: true
I1107 11:40:06.521900       1 main.go:219] service host: https://10.242.138.118:6443
I1107 11:40:06.527870       1 main.go:245] Testing communication with server
F1107 11:40:36.528884       1 main.go:187] Failed to create client: ERROR communicating with apiserver: Get https://10.242.138.118:6443/version: dial tcp 10.242.138.118:6443: i/o timeout

All other pods are up and running:

NAME                                  READY     STATUS             RESTARTS   AGE
alertmanager-5d66dcb4b9-whzrv         1/1       Running            0          5m
grafana-core-cdb657b7b-dkkwq          0/1       Running            0          5m
kube-state-metrics-84bff58f4b-46j42   0/1       CrashLoopBackOff   4          5m
kube-state-metrics-84bff58f4b-fgz8d   0/1       CrashLoopBackOff   4          5m
node-directory-size-metrics-6xz7b     2/2       Running            0          5m
prometheus-core-64dbb76c65-msgsg      1/1       Running            0          5m
prometheus-node-exporter-mltv7        1/1       Running            0          5m


Any insights on this issue please?

How to monitor all namespaces

I am using this project and have just started with Kubernetes monitoring. I can see in the endpoints that only the deployments, pods, and services of the 'monitoring' namespace are probed. How do I get data for all namespaces? (See the sketch below.)
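
Prometheus' Kubernetes service discovery covers all namespaces unless a scrape config narrows it down. A hedged sketch of the difference to look for in the prometheus-core ConfigMap (the namespaces block needs Prometheus 1.7+; cluster-wide discovery also requires cluster-wide RBAC permissions, see the RBAC issue above):

    - job_name: 'kubernetes-service-endpoints'
      kubernetes_sd_configs:
      - role: endpoints                # no namespaces block: all namespaces are discovered
      # a restricted variant would instead contain:
      #   namespaces:
      #     names: ['monitoring']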

No datapoints in CPU Usage graph for pods

Hello,

It seems the CPU usage stats disappear shortly after deploying. I tried adjusting cpu & memory, which helped the graphs load quicker and temporarily fixed the CPU usage graph, but the issue resurfaced shortly afterwards.
As a side note, this query for CPU stats in another dashboard seems to work: sum (rate (container_cpu_usage_seconds_total{image!="",name=~"^k8s_.*",kubernetes_io_hostname=~"^$Node$"}[1m])) by (pod_name)

Hope this helps!

Not much data displayed

Most of the data is not displayed. That is on the "fix" branch. Too bad, as this seemed like a pretty comprehensive install. Looks like all the pieces are there, just not actually working :).

Whoops, I take that back. I think I just wasn't waiting long enough!

Is this still considered the standard install, or is the new CoreOS operator option the "new way"?

On Kubernetes 1.7.3 the data is null

When I use this on Kubernetes 1.7.3 (installed by kubeadm), the data is null. What happened?
(screenshot omitted)
The cluster status is also null:
(screenshot omitted)

My environment:
OS: CentOS 7.3 (1611), x86_64
Kubernetes:

Client Version: version.Info{Major:"1", Minor:"7", GitVersion:"v1.7.3", GitCommit:"2c2fe6e8278a5db2d15a013987b53968c743f2a1", GitTreeState:"clean", BuildDate:"2017-08-03T07:00:21Z", GoVersion:"go1.8.3", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"7", GitVersion:"v1.7.3", GitCommit:"2c2fe6e8278a5db2d15a013987b53968c743f2a1", GitTreeState:"clean", BuildDate:"2017-08-03T06:43:48Z", GoVersion:"go1.8.3", Compiler:"gc", Platform:"linux/amd64"}

kubectl get pod --all-namespaces

NAMESPACE        NAME                                        READY   STATUS    RESTARTS   AGE
default          default-http-backend-3515556356-sn52n       1/1     Running   1          3d
devo             devo-ui-c8bsx                               1/1     Running   1          2d
devo             facp-controller-vbv46                       1/1     Running   0          1d
kube-system      etcd-node147                                1/1     Running   1          3d
kube-system      heapster-3904197848-8h054                   1/1     Running   0          2d
kube-system      kube-apiserver-node147                      1/1     Running   1          3d
kube-system      kube-controller-manager-node147             1/1     Running   1          3d
kube-system      kube-dns-2425271678-050lp                   3/3     Running   3          3d
kube-system      kube-flannel-ds-0lsvg                       2/2     Running   3          3d
kube-system      kube-flannel-ds-czmtc                       2/2     Running   3          3d
kube-system      kube-proxy-88q3w                            1/1     Running   1          3d
kube-system      kube-proxy-jd8gr                            1/1     Running   2          3d
kube-system      kube-scheduler-node147                      1/1     Running   1          3d
kube-system      kubernetes-dashboard-3313488171-lk1m7       1/1     Running   0          2d
kube-system      monitoring-grafana-2027494249-1nlh2         1/1     Running   0          2d
kube-system      monitoring-influxdb-3487384708-30jfj        1/1     Running   0          2d
monitoring       alertmanager-4158139002-92636               1/1     Running   0          2h
monitoring       grafana-core-1069951769-v3895               1/1     Running   0          2h
monitoring       kube-state-metrics-654070635-lnt29          1/1     Running   0          1h
monitoring       kube-state-metrics-654070635-nzh0l          1/1     Running   0          1h
monitoring       node-directory-size-metrics-s08vh           2/2     Running   0          1h
monitoring       node-directory-size-metrics-vbw4v           2/2     Running   0          1h
monitoring       prometheus-core-669051596-68rgp             1/1     Running   0          50m
monitoring       prometheus-node-exporter-gd33b              1/1     Running   0          1h
monitoring       prometheus-node-exporter-mxgn5              1/1     Running   0          1h
nginx-ingress    nginx-ingress-controller-2029042266-gfrzt   1/1     Running   1          3d
nginx-ingress    nginx-ingress-controller-2029042266-vzlt6   1/1     Running   2          2d

imagePullPolicy: Required Value

Seems like this deployment is also impacted by the recent Kubernetes issue:

The Job "grafana-import-dashboards" is invalid: spec.template.spec.initContainers[0].imagePullPolicy: Required value

Just wanted to let you know, in case you decide to use the recommended workaround.

short time frame data in grafana

Hi,

First of all, very nice work, and very easy installation.

I have a working setup, but only a very short time frame of data is stored; the data seems to be truncated every few hours.
Is there something I might be missing?

Many thanks!
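
Two things are worth checking: Prometheus 1.x keeps samples for 15 days by default, so a short window usually means either an explicit retention flag in the prometheus-core args or a pod that restarts and loses its non-persistent storage (watch the RESTARTS column of kubectl get pods). A hedged sketch of pinning the retention explicitly:

        args:
        - '-storage.local.retention=360h'   # keep 15 days of samples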
