Comments (14)
For Hypothetical scenario 1 where no log lines are allowed - I can imagine customers being confused that they're missing log lines entirely. If 1 log line is larger in bytes than the max bytes per second, can we truncate the line so they know the line exceeds the limit?
from diego-release.
We could make a change in CAPI to not allow log rate limits that are below the max log line size.
This makes sense to me. Having some kind of API level failures where the minimum has to be met.
Too many log rate limit exceeded error messages
I think this comment makes sense as the goal of this feature. I think that instead of emitting every second, if we can figure out when it starts and stops, it makes sense.
Update the generic error message to include more meaningful information.
Can you elaborate what type of meaningful information this could be?
PSA: planning to discuss this proposal at the working group meeting on Wednesday, May 5th.
Just for the record: here's a rephrasing of the state we're in
As part of implementing app level rate limiting with quotas, some changes to the algorithm of log rate limiting were made.
One of those dramatically increased the number of "log rate exceeded" messages a user could see.
What we're looking to do is possibly:
- reduce the number of log-rate-exceeded messages, reducing the load on the platform
- increase the number of contiguous blocks of logs the user sees, possibly making it easier to interpret the logs that are emitted
- preserve the current level of clarity about which logs are being elided due to log rate limits.
There are also some future thoughts on:
- trying to make sure applications can utilize more of their log-rate-limit when we can allow it, but that's not implemented in the PR we made this time. An example would be truncating oversized logs rather than triggering the timeout immediately without logging.
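To make the truncation idea concrete, here is a minimal Go sketch of cutting an oversized log line down to a byte budget instead of dropping it. This is only an illustration: `truncateLine`, `truncationMarker`, and the marker text are hypothetical names chosen for this example and are not taken from the executor code or from the PR.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// truncationMarker is a hypothetical suffix (an assumption for this sketch)
// that tells the reader the line was cut to fit the limit.
const truncationMarker = "...[truncated]"

// truncateLine returns line unchanged if it fits within maxBytes. Otherwise
// it cuts the line at a UTF-8 rune boundary and appends truncationMarker, so
// the result is never longer than maxBytes.
func truncateLine(line string, maxBytes int) string {
	if maxBytes <= 0 {
		return ""
	}
	if len(line) <= maxBytes {
		return line
	}
	if maxBytes <= len(truncationMarker) {
		// No room for any payload: emit as much of the marker as fits.
		return truncationMarker[:maxBytes]
	}
	cut := maxBytes - len(truncationMarker)
	// Back up so a multi-byte rune is never split in half.
	for cut > 0 && !utf8.RuneStart(line[cut]) {
		cut--
	}
	return line[:cut] + truncationMarker
}

func main() {
	fmt.Println(truncateLine("a very long application log line", 24))
}
```

With something like this, an app whose single line exceeds its limit would still see the start of that line plus a marker, rather than nothing at all.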
cloudfoundry/executor#73 was merged, which implements (2).
cloudfoundry/executor#73 has now been released in v2.77.0
Started a discussion in CF Slack.
Added a new step to the proposal based off Jochen's suggestion in Slack: Update the generic error message to include more meaningful information.
I think we've been explicitly avoiding truncating logs to meet the rate limit. Maybe it's something we should have considered more, fair.
TY for bringing it up!!
I'm not sure I agree with a minimum. People should feel free to set their limit low, or entirely off if need be. I'm not sure we have a good argument as to why that shouldn't be allowed. Perhaps a "hey, you sure about that" on the CLI? To me it doesn't feel like a "never".
Can you elaborate what type of meaningful information this could be?
An example we've talked about: if we want to go to a timeout-type limit, we could include that information ("You are being timed out for 1 second") in the outage warning log.
I think that's a good rephrase, thanks Ben! Having a clear idea of (what we think are) the benefits of the proposal definitely helps with evaluating it.
I think that instead of emitting every second, if we can figure out when it starts and stops, it makes sense.
That's why we're talking about doing a timeout box. It both allows us to reduce the number of messages saying when the logs are being emitted, and allows us to demarcate more clearly when logs are being dropped.
The outcome of the proposal discussion at the working group was general agreement on the proposal (by the stakeholders present at the meeting).
I am going to close this issue since all related PRs have been merged. Please re-open if that's not the case.