sfproductlabs / tracker
Growth Tracker. GDPR friendly Telemetry. Subsystem of SFPL Experimentation Framework.
License: Apache License 2.0
Implement these overload-protection features from NGINX:
Then:
Leave some reserved connection slots for /ping, e.g.:
On NGINX it looks like this: a maximum of 2048 accepted connections, of which at most 2000 may be used for /api/, which leaves 48 slots for /ping/ (and for returning the 503 "overloaded" response mentioned above).
Necessary for:
Making sure "/ping" still works on an overloaded system. Otherwise the AWS ALB health check tends to accidentally take down heavily loaded containers, which makes the load situation even worse (the next one fails, is taken down as well, and so on). So the health check (/ping/) needs some extra slots or prioritization.
Some sources:
Connection-Limits in Go:
https://stackoverflow.com/questions/22625367/how-to-limit-the-connection-count-of-an-http-server-implemented-in-go
HTTP "Retry-After":
https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/Retry-After
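A minimal sketch of the reserved-slot idea in Go, to make the request more concrete. It is not the tracker's code: the names `limiter`, `newLimiter`, `acquire` and `wrap` are made up for illustration, and the Retry-After value of 5 seconds is an assumption. Note that this limits in-flight requests inside the handler chain rather than accepted TCP connections, so a listener-level cap (as discussed in the Stack Overflow link above) would still be a separate piece.

```go
package main

import (
	"fmt"
	"net/http"
	"strings"
)

// limiter caps in-flight requests. Capacity is split into a general pool
// and a small reserve that only health-check requests may use.
type limiter struct {
	general chan struct{} // slots for regular traffic, e.g. 2000
	reserve chan struct{} // extra slots for /ping, e.g. 48
}

func newLimiter(generalSlots, reservedSlots int) *limiter {
	return &limiter{
		general: make(chan struct{}, generalSlots),
		reserve: make(chan struct{}, reservedSlots),
	}
}

// acquire tries to take a slot without blocking. It returns a release
// function and true on success, or nil and false when no slot is free.
func (l *limiter) acquire(path string) (func(), bool) {
	select {
	case l.general <- struct{}{}:
		return func() { <-l.general }, true
	default:
	}
	// Only health checks may fall back to the reserved slots.
	if strings.HasPrefix(path, "/ping") {
		select {
		case l.reserve <- struct{}{}:
			return func() { <-l.reserve }, true
		default:
		}
	}
	return nil, false
}

// wrap answers 503 with a Retry-After header once no slot is available.
func (l *limiter) wrap(next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		release, ok := l.acquire(r.URL.Path)
		if !ok {
			w.Header().Set("Retry-After", "5") // seconds; assumed value
			http.Error(w, "overloaded", http.StatusServiceUnavailable)
			return
		}
		defer release()
		next.ServeHTTP(w, r)
	})
}

func main() {
	l := newLimiter(2, 1) // tiny numbers so the effect is visible
	_, a := l.acquire("/api/a")
	_, b := l.acquire("/api/b")
	_, c := l.acquire("/api/c") // general pool exhausted
	_, p := l.acquire("/ping/") // served from the reserve
	fmt.Println(a, b, c, p)     // true true false true
	_ = l.wrap(http.NotFoundHandler())
}
```

With the NGINX numbers from above this would be `newLimiter(2000, 48)`.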
to session
This is related to multi-site/multi-customer tracking (not sure what you call it exactly).
Consider a tracker hosted by some 'companyA' for an internal 'project1' and for different customers, with the domains set in the config like this (without any load balancer providing certificates):
"Domains": [
"tr.companyA.com",
"tr.project1.companyA.com",
"tr.customer1.com",
"tr.customer2.com"
],
But currently you can only specify a single certificate (if provided manually):
"TLSCert" : "./.setup/keys/example/server.crt",
"TLSKey" : "./.setup/keys/example/server.key",
AFAIK such a setup would require a single certificate that lists all domains mentioned in the config in its SAN. For the built-in Let's Encrypt support that is no problem; it is supported and probably the way to go. But for manual certificates (e.g. an existing one provided by a new customer) this could become a problem. The same applies when 'companyA' tries to maintain one purchased certificate for all customers, because the SAN has to be updated every time a customer is added or removed.
If possible, please consider changing it so that one can optionally provide a certificate per domain/site (as you can in NGINX too)...
On AWS the TCP connections come from a private network IP which is owned by the ALB.
Any sort of limiting on this IP would be counterproductive.
Instead any sort of IP based limits should apply to the 'X-Forwarded-For' IP.
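A small sketch of how the limit key could be derived. It assumes the ALB appends the address it saw to the end of 'X-Forwarded-For', so the last entry is used and anything further left is treated as client-supplied; `clientIPFrom` and `clientIP` are illustrative names only.

```go
package main

import (
	"fmt"
	"net"
	"net/http"
	"strings"
)

// clientIPFrom picks the address to rate-limit on. It uses the last entry
// of X-Forwarded-For (assumed to be appended by the ALB) and falls back to
// the TCP peer address when the header is absent or not a valid IP.
func clientIPFrom(xff, remoteAddr string) string {
	if xff != "" {
		parts := strings.Split(xff, ",")
		last := strings.TrimSpace(parts[len(parts)-1])
		if net.ParseIP(last) != nil {
			return last
		}
	}
	host, _, err := net.SplitHostPort(remoteAddr)
	if err != nil {
		return remoteAddr
	}
	return host
}

// clientIP is the http.Request convenience wrapper.
func clientIP(r *http.Request) string {
	return clientIPFrom(r.Header.Get("X-Forwarded-For"), r.RemoteAddr)
}

func main() {
	r := &http.Request{Header: http.Header{}, RemoteAddr: "10.0.0.7:41234"}
	r.Header.Set("X-Forwarded-For", "198.51.100.9, 203.0.113.5")
	fmt.Println(clientIP(r)) // 203.0.113.5
}
```

Without a trusted proxy in front, the header should not be trusted at all, so this probably wants to be a config switch.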
Hey @dioptre ,
leaving you a list of our tweaked NGINX settings here. We don't need a config value for each of them in the tracker (though some might come in handy). In most cases it would be enough to know what the Go HTTP server's value is and whether it's OK for our purpose. I would like to clarify this before switching from NGINX to Go.
Setting | NGINX | Go/Tracker |
---|---|---|
send_timeout | 40s | |
client_header_timeout | 40s | |
client_body_timeout | 40s | |
keepalive_timeout | 20s | |
keepalive_requests | 1000 | |
reset_timedout_connection | on | |
Setting | NGINX | Go/Tracker |
---|---|---|
client_body_buffer_size | 8k | |
client_header_buffer_size | 1k | |
client_max_body_size | 8m | |
large_client_header_buffers | 4 8k | |
Our NGINX compresses all responses that are larger than 512 bytes and have one of the listed MIME types. Does the tracker have any compression yet?
There is a stats page in NGINX which shows how many clients are connected etc. I noticed you started something similar in the tracker. Could you please add the number of connected clients there?
NGINX kills a connection at the TCP level if it tries to access a completely invalid route (the default case at the end, e.g. a path not starting with /api/, /tr/ or one of the other known prefixes). Since we use this for our backends, which are only accessed by our own frontends, we can assume that anything trying a complete trash route is malicious or at least unwanted/external (likely backdoor scanners, bots, etc.), and we don't want to waste our resources on keeping such connections open.
The sfpla.logs table has a 'topic' column.
However, the tracker ignores this value when inserting a record into the DB.
See:
https://github.com/dioptre/tracker/blob/2944b222d7996699697742b5535ba4d42fdf587b/cassandra.go#L177
Hosts:
tr1.company.com [IP:A]
tr2.company.com [IP:B]
tr3.company.com [IP:C]
Load-Balanced/Failsafe Endpoint for Clients:
tr.company.com [IP:A,B,C]
The built-in Let's Encrypt support fails when trying to use 'tr.company.com' in the tracker, e.g. in the config of tr1, tr2 and tr3:
"Domains": [
"tr.company.com"
]
This is because Let's Encrypt tries to validate the challenge on an arbitrary IP returned from DNS (A, B or C) and not necessarily on the one that is waiting for it (e.g. A).
A similar problem is also described here:
https://community.letsencrypt.org/t/a-record-with-multiple-ips/72035
It more or less suggests designating one instance as the "Let's Encrypt master" and having every other instance answer the challenge with a 301 redirect to this master. This should be enough to receive the cert+key successfully on the master, but the cert+key would still need to be shared with the other instances...
Maybe you have a cool idea how to support this?
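Just to illustrate the redirect part of that suggestion (not the cert sharing, which stays open): a non-master instance could forward HTTP-01 challenge requests roughly like this. `challengeTarget` and `forwardChallenges` are made-up names, and using tr1.company.com as the master is only an assumption based on the host list above.

```go
package main

import (
	"fmt"
	"net/http"
	"strings"
)

// challengePrefix is the well-known path used by ACME HTTP-01 validation.
const challengePrefix = "/.well-known/acme-challenge/"

// challengeTarget returns the URL on the master for a challenge path, or
// false when the path is not a challenge request.
func challengeTarget(masterHost, path string) (string, bool) {
	if !strings.HasPrefix(path, challengePrefix) {
		return "", false
	}
	return "http://" + masterHost + path, true
}

// forwardChallenges redirects challenge requests to the master and passes
// everything else on to the regular handler.
func forwardChallenges(masterHost string, next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		if target, ok := challengeTarget(masterHost, r.URL.Path); ok {
			http.Redirect(w, r, target, http.StatusMovedPermanently)
			return
		}
		next.ServeHTTP(w, r)
	})
}

func main() {
	target, ok := challengeTarget("tr1.company.com", challengePrefix+"some-token")
	fmt.Println(target, ok)
	_ = forwardChallenges("tr1.company.com", http.NotFoundHandler())
}
```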
The fields device, vp, os, country, latlon and tz are missing: they are not passed through to visitors or sessions.
As per: https://docs.google.com/document/d/1QwMOh48lnnmgOPwFd546rj0PXWrVo5hS7YfA-AOWlFI/edit?usp=sharing
It would be really helpful to be able to keep our own custom config.company.json in the fork.
Fewer conflicts when merging your updates...
Startup would look like:
./tracker -c config.company.json
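Something along these lines with the standard `flag` package would probably be enough. The default of "config.json" and the helper name `configPath` are assumptions on my side.

```go
package main

import (
	"flag"
	"fmt"
	"io"
	"os"
)

// configPath parses the command line and returns the config file to load.
// "config.json" as the default is an assumption, not the tracker's
// documented behaviour.
func configPath(args []string) (string, error) {
	fs := flag.NewFlagSet("tracker", flag.ContinueOnError)
	fs.SetOutput(io.Discard) // let the caller decide how to report errors
	path := fs.String("c", "config.json", "path to the configuration file")
	if err := fs.Parse(args); err != nil {
		return "", err
	}
	return *path, nil
}

func main() {
	path, err := configPath(os.Args[1:])
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(2)
	}
	fmt.Println("using config:", path)
}
```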
The transmitted 'ltime' logging value looks like this when it leaves Node.js for NATS:
"ltime":"15:03:02.044652999"
However, any record in sfpla.logs table always ends up as:
00:00:00.000000000
The code looks like it completely ignores the transmitted 'ltime' value and instead tries to figure out its own: https://github.com/dioptre/tracker/blob/2944b222d7996699697742b5535ba4d42fdf587b/cassandra.go#L165
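A possible way to keep the transmitted value, assuming 'ltime' is a CQL `time` column (nanoseconds since midnight) and that the driver accepts a `time.Duration` for it; I have not verified that against the tracker's schema. `parseLTime` is a hypothetical helper.

```go
package main

import (
	"fmt"
	"time"
)

// parseLTime converts a wall-clock string such as "15:03:02.044652999"
// into the duration since midnight.
func parseLTime(s string) (time.Duration, error) {
	t, err := time.Parse("15:04:05.999999999", s)
	if err != nil {
		return 0, err
	}
	return time.Duration(t.Hour())*time.Hour +
		time.Duration(t.Minute())*time.Minute +
		time.Duration(t.Second())*time.Second +
		time.Duration(t.Nanosecond()), nil
}

func main() {
	d, err := parseLTime("15:03:02.044652999")
	fmt.Println(d, err)
}
```

If parsing fails, falling back to the server-side time (what the code does today) would still make sense.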