Comments (10)
@mkocher I consider this change a must and a great enhancement. It doesn't make sense to show the app's CPU usage as a percentage of the CPU available to the whole VM. It would be much better to show how much of the CPU the app is entitled to is being used at the moment.
Here is an example from an app running in one of our foundations, in which the difference can be clearly seen:
cf app
cf app cf-app-monitoring
Showing health and status for app cf-app-monitoring in org <redacted> / space <redacted> as <redacted>...
name: cf-app-monitoring
requested state: started
routes: cf-app-monitoring.<redacted>
last uploaded: Thu 23 Mar 13:47:59 UTC 2023
stack: cflinuxfs4
buildpacks:
name                   version   detect output   buildpack name
staticfile_buildpack   1.6.0     staticfile      staticfile
type: web
sidecars:
instances: 2/2
memory usage: 64M
     state     since                  cpu    memory         disk         logging            details
#0   running   2024-01-12T17:01:09Z   1.1%   15.6M of 64M   5.2M of 1G   0/s of unlimited
#1   running   2024-01-12T17:43:15Z   1.1%   15.7M of 64M   5.2M of 1G   0/s of unlimited
cf cpu-entitlement
cf cpu-entitlement cf-app-monitoring
Note: This plugin is experimental.
Showing CPU usage against entitlement for app cf-app-monitoring in org <redacted> / space <redacted> as <redacted>...
     avg usage   curr usage
#0   55.98%      54.78%
#1   58.97%      57.16%
WARNING: Instance #0 was over entitlement from 2024-01-12 17:01:11 to 2024-01-12 17:01:26
WARNING: Instance #1 was over entitlement from 2024-01-12 17:43:23 to 2024-01-12 17:44:23
We should be careful when rolling this out, because it becomes a breaking change if we stop emitting the current metric by default. We should announce it loudly and provide ops files in cf-deployment for activating the new metric and switching the configuration.
from diego-release.
👍 glad to hear you're in favor
Agreed, we need to make this backwards compatible, though I'd prefer to turn off the old metrics by default sooner rather than later. I don't think many people look at them, and container metrics generate a ton of individual time series, which can put a burden on some metric stores.
from diego-release.
We checked App Autoscaler Release and searched for absolute_entitlement and absolute_usage and got no results. So we think this is safe from that perspective.
from diego-release.
Dear @mkocher, @chombium, as far as I remember, the 'cf app' CPU metric shows the percentage of a single CPU core that the container is currently using, not the percentage of the entire host VM's CPU.
That is, in @chombium's example above, the app is currently consuming 1.1% of a single host CPU core.
Our CF deployments allow CPU bursting, and when an application uses more CPU we have seen this metric spike to several hundred percent. At 300%, for example, the container is consuming 3 of the CPU cores available on the host. In general, the maximum value this metric can reach is (100*N)%, where N is the number of CPU cores on the host VM. This makes the metric an easy way to see whether an application is currently bursting when debugging.
The CPUEntitlement metric is a really good one, but it has different semantics: it shows where the container's average/current CPU consumption stands relative to what it is entitled to. Also keep in mind that the first metric comes for free, while for 'cf cpu-entitlement' you need to install a cf CLI plugin.
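To make the two semantics concrete, here is a minimal Python sketch. The function names and the 0.02-core entitlement are illustrative assumptions, not values or code taken from Diego or the plugin; the entitlement figure was only picked so that 1.1% of a core lands near the roughly 55% shown by the plugin above.

```python
def max_cpu_percent(host_cores: int) -> float:
    """Upper bound of the per-core CPU metric: 100% for every host core."""
    if host_cores < 1:
        raise ValueError("host_cores must be at least 1")
    return 100.0 * host_cores


def percent_of_entitlement(cpu_percent: float, entitled_cores: float) -> float:
    """Re-express a per-core CPU percentage relative to an entitlement.

    entitled_cores is a hypothetical parameter: the fraction of a core
    (or number of cores) the container is assumed to be entitled to.
    """
    if entitled_cores <= 0:
        raise ValueError("entitled_cores must be positive")
    cores_used = cpu_percent / 100.0
    return cores_used / entitled_cores * 100.0


if __name__ == "__main__":
    # A 4-core host can report up to 400% on the per-core metric.
    print(max_cpu_percent(4))
    # 1.1% of one core against an assumed 0.02-core entitlement.
    print(round(percent_of_entitlement(1.1, 0.02), 2))
```

Under these assumptions the same workload reads as about 1% on one scale and about 55% on the other, which is why the two numbers cannot be compared directly.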
from diego-release.
Yep, the current cpu metric is out of 100*NumberOfCores, not 100. I'm not sure why we do that as an industry, but it is the convention. Apps however aren't allocated cores, they're allocated shares. So using more than 100% doesn't indicate one way or the other if the app is bursting.
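As a rough illustration of why a value above 100% says nothing about bursting on its own, here is a small Python sketch. It assumes that, under full contention, CPU time is divided in proportion to shares (the usual behaviour of Linux CPU shares); how Diego actually derives an app's entitlement is not described in this thread, so the function names and all numbers below are hypothetical.

```python
def guaranteed_cores(shares: int, total_shares: int, host_cores: int) -> float:
    """Cores a container would get if every container on the host were busy.

    Assumes a purely proportional split of the host's cores by shares.
    """
    if total_shares <= 0 or not 0 <= shares <= total_shares:
        raise ValueError("shares must be between 0 and total_shares")
    return host_cores * shares / total_shares


def is_bursting(cpu_percent: float, shares: int, total_shares: int, host_cores: int) -> bool:
    """True when usage (in percent of one core) exceeds the proportional guarantee."""
    return cpu_percent / 100.0 > guaranteed_cores(shares, total_shares, host_cores)


if __name__ == "__main__":
    # 300% on an 8-core host with half of all shares: 3 cores used, 4 guaranteed.
    print(is_bursting(300.0, shares=512, total_shares=1024, host_cores=8))
    # 50% with a small slice of shares: 0.5 cores used, 0.25 guaranteed.
    print(is_bursting(50.0, shares=32, total_shares=1024, host_cores=8))
```

In this sketch the container at 300% is still within its guarantee, while the one at 50% is bursting, so the per-core percentage has to be read together with the container's shares.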
from diego-release.
@mkocher CI is failing with rep-spec windows diversion. Should this be applied to rep-windows too?
from diego-release.
Oops. It has been 0️⃣ days since we forgot about windows.
As far as we can tell this should be applied verbatim to Windows as well. We'll take a look.
from diego-release.
#901 fixes the Windows issue. It also makes Diego releasable again, as it makes the change non-breaking.
from diego-release.
Released in https://github.com/cloudfoundry/diego-release/releases/tag/v2.93.0
from diego-release.
Will the official container metrics documentation still be updated?
from diego-release.