Comments (13)

Pionerd avatar Pionerd commented on July 17, 2024 1

Additionally, are you using long-term storage with prometheus to feed VPA?

Yes, we use Thanos.

from charts.

Pionerd avatar Pionerd commented on July 17, 2024 1

Additional remark: we have multiple clients using our setup, and all of the EKS clients (and only the EKS clients) are suffering from this. The AKS customers are not affected after the same upgrade.

Pionerd avatar Pionerd commented on July 17, 2024 1

Relevant parameters:

I0816 16:14:07.067381       1 flags.go:57] FLAG: --add-dir-header="false"
I0816 16:14:07.067486       1 flags.go:57] FLAG: --address=":8942"
I0816 16:14:07.067492       1 flags.go:57] FLAG: --alsologtostderr="false"
I0816 16:14:07.067495       1 flags.go:57] FLAG: --checkpoints-gc-interval="10m0s"
I0816 16:14:07.067499       1 flags.go:57] FLAG: --checkpoints-timeout="1m0s"
I0816 16:14:07.067504       1 flags.go:57] FLAG: --container-name-label="container"
I0816 16:14:07.067509       1 flags.go:57] FLAG: --container-namespace-label="namespace"
I0816 16:14:07.067514       1 flags.go:57] FLAG: --container-pod-name-label="pod"
I0816 16:14:07.067517       1 flags.go:57] FLAG: --cpu-histogram-decay-half-life="24h0m0s"
I0816 16:14:07.067522       1 flags.go:57] FLAG: --cpu-integer-post-processor-enabled="false"
I0816 16:14:07.067526       1 flags.go:57] FLAG: --history-length="8d"
I0816 16:14:07.067531       1 flags.go:57] FLAG: --history-resolution="1h"
I0816 16:14:07.067535       1 flags.go:57] FLAG: --kube-api-burst="10"
I0816 16:14:07.067541       1 flags.go:57] FLAG: --kube-api-qps="5"
I0816 16:14:07.067547       1 flags.go:57] FLAG: --kubeconfig=""
I0816 16:14:07.067552       1 flags.go:57] FLAG: --log-backtrace-at=":0"
I0816 16:14:07.067566       1 flags.go:57] FLAG: --log-dir=""
I0816 16:14:07.067571       1 flags.go:57] FLAG: --log-file=""
I0816 16:14:07.067575       1 flags.go:57] FLAG: --log-file-max-size="1800"
I0816 16:14:07.067579       1 flags.go:57] FLAG: --logtostderr="true"
I0816 16:14:07.067584       1 flags.go:57] FLAG: --memory-aggregation-interval="24h0m0s"
I0816 16:14:07.067589       1 flags.go:57] FLAG: --memory-aggregation-interval-count="8"
I0816 16:14:07.067593       1 flags.go:57] FLAG: --memory-histogram-decay-half-life="24h0m0s"
I0816 16:14:07.067597       1 flags.go:57] FLAG: --memory-saver="false"
I0816 16:14:07.067601       1 flags.go:57] FLAG: --metric-for-pod-labels="kube_pod_labels{job=\"kube-state-metrics\"}[8d]"
I0816 16:14:07.067605       1 flags.go:57] FLAG: --min-checkpoints="10"
I0816 16:14:07.067609       1 flags.go:57] FLAG: --one-output="false"
I0816 16:14:07.067613       1 flags.go:57] FLAG: --oom-bump-up-ratio="1.2"
I0816 16:14:07.067618       1 flags.go:57] FLAG: --oom-min-bump-up-bytes="1.048576e+08"
I0816 16:14:07.067623       1 flags.go:57] FLAG: --pod-label-prefix=""
I0816 16:14:07.067627       1 flags.go:57] FLAG: --pod-name-label="pod"
I0816 16:14:07.067631       1 flags.go:57] FLAG: --pod-namespace-label="namespace"
I0816 16:14:07.067635       1 flags.go:57] FLAG: --pod-recommendation-min-cpu-millicores="5"
I0816 16:14:07.067640       1 flags.go:57] FLAG: --pod-recommendation-min-memory-mb="25"
I0816 16:14:07.067645       1 flags.go:57] FLAG: --prometheus-address="http://thanos-query-frontend.prometheus-stack:9090"
I0816 16:14:07.067649       1 flags.go:57] FLAG: --prometheus-cadvisor-job-name="kubelet"
I0816 16:14:07.067653       1 flags.go:57] FLAG: --prometheus-query-timeout="5m"
I0816 16:14:07.067657       1 flags.go:57] FLAG: --recommendation-margin-fraction="0.15"
I0816 16:14:07.067662       1 flags.go:57] FLAG: --recommender-interval="1m0s"
I0816 16:14:07.067667       1 flags.go:57] FLAG: --recommender-name="default"
I0816 16:14:07.067671       1 flags.go:57] FLAG: --skip-headers="false"
I0816 16:14:07.067675       1 flags.go:57] FLAG: --skip-log-headers="false"
I0816 16:14:07.067679       1 flags.go:57] FLAG: --stderrthreshold="2"
I0816 16:14:07.067683       1 flags.go:57] FLAG: --storage="prometheus"
I0816 16:14:07.067686       1 flags.go:57] FLAG: --target-cpu-percentile="0.9"
I0816 16:14:07.067690       1 flags.go:57] FLAG: --v="10"
I0816 16:14:07.067693       1 flags.go:57] FLAG: --vmodule=""
I0816 16:14:07.067697       1 flags.go:57] FLAG: --vpa-object-namespace=""
I0816 16:14:07.067702       1 main.go:82] Vertical Pod Autoscaler 0.13.0 Recommender: 0xc00004d820

The full logs are in your mail :) so as not to leak any sensitive info here.
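Since only the EKS clusters misbehave, it may help to compare the effective recommender flags between an affected and an unaffected cluster. The following is a minimal sketch, assuming both logs contain klog `FLAG:` lines in the format pasted above. The helper names `parse_flags` and `diff_flags` are hypothetical, and the second log is invented purely to show the diff.

```python
import re

# Matches klog lines of the form: ... FLAG: --name="value"
FLAG_RE = re.compile(r'FLAG: --([A-Za-z0-9-]+)="(.*)"\s*$')

def parse_flags(log_text):
    """Return {flag_name: value} for every FLAG line in a recommender log."""
    flags = {}
    for line in log_text.splitlines():
        match = FLAG_RE.search(line)
        if match:
            name, value = match.groups()
            # The value is printed with escaped inner quotes, so undo them.
            flags[name] = value.replace('\\"', '"')
    return flags

def diff_flags(a, b):
    """Return {flag: (value_in_a, value_in_b)} for flags that differ."""
    return {
        name: (a.get(name), b.get(name))
        for name in sorted(set(a) | set(b))
        if a.get(name) != b.get(name)
    }

# Three lines taken from the log above.
eks_log = r'''
I0816 16:14:07.067531       1 flags.go:57] FLAG: --history-resolution="1h"
I0816 16:14:07.067601       1 flags.go:57] FLAG: --metric-for-pod-labels="kube_pod_labels{job=\"kube-state-metrics\"}[8d]"
I0816 16:14:07.067690       1 flags.go:57] FLAG: --v="10"
'''
# Hypothetical second cluster, invented only to demonstrate the diff.
other_log = r'''
I0816 16:14:07.067531       1 flags.go:57] FLAG: --history-resolution="1h"
I0816 16:14:07.067690       1 flags.go:57] FLAG: --v="4"
'''

eks_flags = parse_flags(eks_log)
other_flags = parse_flags(other_log)
for name, (eks_value, other_value) in diff_flags(eks_flags, other_flags).items():
    print(f"{name}: EKS={eks_value!r} other={other_value!r}")
```

A flag that is missing on one side shows up as `None`, which makes an unset default easy to spot.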

Pionerd avatar Pionerd commented on July 17, 2024 1

Helm values are not much different:

vpa:
  recommender:
    extraArgs:
      storage: "prometheus"
      # The prometheus_server_endpoint should have the form http://<service-name>.<namespace-name>.svc:portnumber
      prometheus-address: "http://thanos-query-frontend.prometheus-stack:9090"
      prometheus-cadvisor-job-name: kubelet
      pod-label-prefix: ""
      pod-namespace-label: namespace
      pod-name-label: pod
      container-pod-name-label: pod
      container-name-label: container
      metric-for-pod-labels: kube_pod_labels{job="kube-state-metrics"}[8d]
      pod-recommendation-min-cpu-millicores: 5
      pod-recommendation-min-memory-mb: 25
      v: 10
  updater:
    enabled: false
  admissionController:
    enabled: false
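To rule out a templating problem, the `extraArgs` above can be checked against the flags the recommender actually logged at startup. This is only a sketch: `unapplied_args` is a hypothetical helper, the dictionaries are typed in by hand from this thread, and it assumes each `extraArgs` key maps to a flag of the same name.

```python
def unapplied_args(extra_args, logged_flags):
    """Return {name: (wanted, logged)} for extraArgs not reflected in the log."""
    mismatches = {}
    for name, wanted in extra_args.items():
        logged = logged_flags.get(name)
        # Logged flag values are always strings, so compare as strings.
        if logged != str(wanted):
            mismatches[name] = (str(wanted), logged)
    return mismatches

# A subset of the Helm values shown above.
extra_args = {
    "storage": "prometheus",
    "prometheus-address": "http://thanos-query-frontend.prometheus-stack:9090",
    "pod-recommendation-min-cpu-millicores": 5,
    "v": 10,
}
# The matching values from the FLAG lines posted earlier in the thread.
logged_flags = {
    "storage": "prometheus",
    "prometheus-address": "http://thanos-query-frontend.prometheus-stack:9090",
    "pod-recommendation-min-cpu-millicores": "5",
    "v": "10",
}

print(unapplied_args(extra_args, logged_flags))
```

An empty result means every listed value reached the recommender, which is what the pasted log suggests here.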

sudermanjr avatar sudermanjr commented on July 17, 2024

How are you pulling these metrics into Grafana? Is it possible there's actually just an issue with the metrics reporting rather than the actual VPA recommendation itself? The changes from 1.7.5 to 2.x are almost entirely unrelated to the recommender deployment itself.

sudermanjr avatar sudermanjr commented on July 17, 2024

Additionally, are you using long-term storage with prometheus to feed VPA?

Pionerd avatar Pionerd commented on July 17, 2024

We use kube-state-metrics to scrape the VPA recommendations. The values in the Grafana dashboard are the same as when checking using kubectl get vpa.

I also cannot understand why this change would lead to this behaviour. Have you not seen anything like this before?

sudermanjr avatar sudermanjr commented on July 17, 2024

The only time I've seen erratic recommendations is when I'm not using Prometheus data to feed the recommendations and I don't wait long enough for VPA to generate a good recommendation. Here's a cluster with 53 VPAs, using Prometheus data and the latest chart (also using kube-state-metrics to poll the VPA data).

[Screenshot 2023-08-16 at 10 06 27 AM]

sudermanjr avatar sudermanjr commented on July 17, 2024

Maybe try turning the log level on the recommender up to 10?

sudermanjr avatar sudermanjr commented on July 17, 2024

I just realized that the cluster I'm showing in the graph above uses the VPA 0.14.0 image. Perhaps there's a bugfix in that version. Worth trying.

It would help if you could share your exact values so that I can try to reproduce the issue.

sudermanjr avatar sudermanjr commented on July 17, 2024

Aha. You're using uncappedTarget, which does not respect limits set on the VPA or in the defaults.

kubernetes/autoscaler#2747 (comment)

Uncapped Target gives the recommendation before applying constraints specified in the VPA spec, such as min or max.

I would imagine that switching that metric to target would provide more consistent data (that's what my graph above uses).
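To see where the two values actually diverge, one option is to read them straight from the VPA status instead of from the dashboard. The following is a minimal sketch, assuming the JSON comes from `kubectl get vpa <name> -o json`. The helper name `capped_containers` is hypothetical, and the sample object and its numbers are invented for illustration.

```python
import json

def capped_containers(vpa):
    """Yield (container, resource, target, uncapped) where the two differ."""
    recs = (vpa.get("status", {})
               .get("recommendation", {})
               .get("containerRecommendations", []))
    for rec in recs:
        target = rec.get("target", {})
        uncapped = rec.get("uncappedTarget", {})
        for resource in sorted(set(target) | set(uncapped)):
            # Quantities are compared as strings, so equal amounts written
            # differently (e.g. "1" and "1000m") would be reported as different.
            if target.get(resource) != uncapped.get(resource):
                yield (rec.get("containerName"), resource,
                       target.get(resource), uncapped.get(resource))

# Hypothetical VPA object; the values are invented for illustration.
sample = json.loads("""
{
  "status": {
    "recommendation": {
      "containerRecommendations": [
        {
          "containerName": "app",
          "target": {"cpu": "500m", "memory": "262144k"},
          "uncappedTarget": {"cpu": "1200m", "memory": "262144k"}
        }
      ]
    }
  }
}
""")

for container, resource, target, uncapped in capped_containers(sample):
    print(f"{container} {resource}: target={target} uncappedTarget={uncapped}")
```

If nothing is printed for a real VPA, no min or max constraint is currently being applied, and both metrics should graph the same values.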

Pionerd avatar Pionerd commented on July 17, 2024

That was just the first graph shown by Grafana :) The graphs for Target look similar:
[image]

sudermanjr avatar sudermanjr commented on July 17, 2024

Well now I'm at a loss. Perhaps the VPA folks can help explain why the recommendation status would oscillate so much. I personally haven't seen it do this in my various tests.

I'm guessing that the chart change itself has nothing to do with it, and that this is instead something triggered by the re-deploy of the VPA pods. But that's just a hunch.
