Comments (9)
I just realized there was a major Træfik version release during my development cycle! That’ll teach me to pin my containers…
Anyway, just for the record, I tested with container tag v2.11.3 and the result is much more stable (although I understand there were other problems with the former traefik_entrypoint_open_connections metric versus the new traefik_open_connections one).
Now I’ll test a modified v3.0 with the +1/-1 in the mutex critical section.
from traefik.
Digging a bit more… I don’t really speak Go but I’ll try… Clearly the function RemoveConnection is being called too often, or AddConnection not often enough.
from traefik.
Errr, so something looks off to me, but probably is not. Reproducing the code of RemoveConnection here:
func (c *connectionTracker) RemoveConnection(conn net.Conn) {
	c.connsMu.Lock()
	delete(c.conns, conn)
	c.connsMu.Unlock()
	if c.openConnectionsGauge != nil {
		c.openConnectionsGauge.Add(-1)
	}
}
Is it normal that the gauge decrement is outside the mutex section? I guess the instrumentation library is already thread-safe, isn't it?
from traefik.
@rtribotte I see that you authored this highly relevant commit 7c2af10 from PR #9656, might you be able to chime in?
from traefik.
I have an issue too
my metrics:
# HELP traefik_open_connections How many open connections exist, by entryPoint and protocol
# TYPE traefik_open_connections gauge
traefik_open_connections{entrypoint="metrics",protocol="TCP"} 2
traefik_open_connections{entrypoint="traefik",protocol="TCP"} -26
traefik_open_connections{entrypoint="web",protocol="TCP"} -26
traefik_open_connections{entrypoint="websecure",protocol="TCP"} 8
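In case it helps others check their own instances, here is a small sketch that scans a metrics dump like the one above and lists the series with a negative value. The function name negativeSamples is made up for this example, and the parsing is deliberately simplified: it assumes one "series value" pair per line, with no timestamp and no spaces inside label values.

```go
package main

import (
	"bufio"
	"fmt"
	"strconv"
	"strings"
)

// negativeSamples returns the series (name plus labels) whose sample
// value is below zero. Comment lines starting with '#' are skipped.
func negativeSamples(exposition string) []string {
	var out []string
	sc := bufio.NewScanner(strings.NewReader(exposition))
	for sc.Scan() {
		line := strings.TrimSpace(sc.Text())
		if line == "" || strings.HasPrefix(line, "#") {
			continue
		}
		// The value is assumed to be the last space-separated field.
		i := strings.LastIndex(line, " ")
		if i < 0 {
			continue
		}
		v, err := strconv.ParseFloat(line[i+1:], 64)
		if err != nil {
			continue
		}
		if v < 0 {
			out = append(out, line[:i])
		}
	}
	return out
}

func main() {
	dump := `# TYPE traefik_open_connections gauge
traefik_open_connections{entrypoint="metrics",protocol="TCP"} 2
traefik_open_connections{entrypoint="traefik",protocol="TCP"} -26
traefik_open_connections{entrypoint="web",protocol="TCP"} -26
traefik_open_connections{entrypoint="websecure",protocol="TCP"} 8`
	for _, s := range negativeSamples(dump) {
		fmt.Println(s)
	}
}
```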
from traefik.
Hello @navaati, thanks for opening this!
> Is it normal that the gauge decrement is outside the mutex section? I guess the instrumentation library is already thread-safe, isn't it?
It should indeed be thread-safe (https://github.com/prometheus/client_golang/blob/release-1.17/prometheus/gauge.go#L122C1-L130C2). Have you had a chance to try the alternative (protecting it with the mutex)? Did it fix the bug?
from traefik.
Hi.
No, I haven’t tested anything yet: I first need to figure out how to build the binary and then the container, for which I haven’t found instructions yet (at least nothing in the contributing documentation). EDIT: found the doc: https://doc.traefik.io/traefik/contributing/building-testing/
I’ll try to test it, although with that link you dug up I don’t have much hope. Is it actually the official prom client which is being used here, though? I see the type of the gauge is gokitmetrics.Gauge.
from traefik.
Yeah no, it’s just an abstraction layer over the official prom client: https://github.com/go-kit/kit/blob/dfe43fa6a8d72c23e2205d0b80e762346e203f78/metrics/prometheus/prometheus.go#L84.
from traefik.
I don't have much useful information to add, other than that I am running 3.0.2 and have seen many negative values for open connections. My Prometheus data confirms that this was not an issue before.
I have added two extra queries to demonstrate:
You can see min-2.0 never goes below 0, while min-3.0 maxed out at almost 500.
The two fields used are:
- traefik_open_connections (3.0 and later)
- traefik_entrypoint_open_connections (2.x and earlier)
from traefik.