Comments (26)
@edwarnicke It's up and running.
Thanks for the great support, we can close this issue. :)
from cmd-nsmgr.
Hello @rpiceage
NSMgr and the other endpoints use the "google.golang.org/grpc/health/grpc_health_v1" server, so please consider using a google.golang.org/grpc/health/grpc_health_v1 client to solve the issue.
from cmd-nsmgr.
Hi @denis-tingaikin
Thanks for the quick answer and the info. We will look into this.
from cmd-nsmgr.
@rpiceage Feel free to reopen this if the problem is still relevant :)
from cmd-nsmgr.
Hi,
I'm trying to use grpc-health-probe as client to the grpc_health_v1 server in the nsmgr.
https://github.com/grpc-ecosystem/grpc-health-probe
I put the binary in the docker image of the nsmgr, and tried to use that for K8s liveness and readiness probes, but I cannot reach the grpc server of nsmgr either on the containerPort 5001, or using the unix socket /var/lib/networkservicemesh/nsm.io.sock.
I tried to configure TLS for the connection, with no success.
My probes look something like this:
readinessProbe:
  exec:
    command: ["/bin/grpc_health_probe", "-addr=:5001", "-tls", "-tls-no-verify"]
  initialDelaySeconds: 15
livenessProbe:
  exec:
    command: ["/bin/grpc_health_probe", "-addr=:5001", "-tls", "-tls-no-verify"]
  initialDelaySeconds: 20
Do you have any idea what might be the problem? What URL is to be used for communication with the grpc server?
from cmd-nsmgr.
One note on this issue. Right now, out of the box, every endpoint exposes a gRPC liveness probe.
There's pretty good documentation for adding grpc health probing:
https://kubernetes.io/blog/2018/10/01/health-checking-grpc-servers-on-kubernetes/
https://codeburst.io/kubernetes-grpc-services-and-probes-by-example-1cb611da45ab
I don't think it requires any code changes... but we may need to add the grpc health probe to our docker containers and update our yaml files to do so.
from cmd-nsmgr.
Hi,
Thank you for the answer.
My question was mainly about the configuration of the health probe client. I added the grpc-health-probe to the container and I tried to reach the GRPC server on ListenOn (unix:///var/lib/networkservicemesh/nsm.io.sock and also tcp://:5001) with no success, and it is really hard to debug why I could not connect.
If you add the probe to the container and provide a working example, that would be a really big help.
from cmd-nsmgr.
@rpiceage Hmm... that feels suspiciously like it might be a TLS-related issue... what kinds of errors are you getting when you try to connect?
from cmd-nsmgr.
@rpiceage I've also asked the question on the Spire slack: https://spiffe.slack.com/archives/C7XDP01HB/p1617189517055500
from cmd-nsmgr.
Thank you @edwarnicke
The error was a pretty generic message in the "kubectl describe pod" output:
Liveness probe failed: timeout: failed to connect service "unix:///var/lib/networkservicemesh/nsm.io.sock" within 1s
The message was the same every time I changed the URL.
from cmd-nsmgr.
Got it... seems even more likely to be the TLS thing... let's see what the Spire folks say :)
from cmd-nsmgr.
@rpiceage I'm working on a PR for grpc-health-probe that would be able to utilize spire:
https://github.com/edwarnicke/grpc-health-probe/tree/spiffe
I haven't had a chance to test it yet (hope to get to that today). I wanted you to have the opportunity to kick the tires if you so desire in case you get to it before I do :)
from cmd-nsmgr.
OK, so the grpc-health-probe had no support for spire...
I will try to build your branch and test it.
Thanks a lot @edwarnicke
from cmd-nsmgr.
Tried it, but with no success; I got the same not very informative error messages as before.
I tried using "-addr=/var/lib/networkservicemesh/nsm.io.sock" and "-addr=:5001", and neither of them worked.
from cmd-nsmgr.
@rpiceage Thank you for trying... I'll go poke at it some more :)
from cmd-nsmgr.
@rpiceage I've pushed grpc-ecosystem/grpc-health-probe#63 and am having an interesting conversation about how to productively add such functionality.
from cmd-nsmgr.
@edwarnicke Thanks for your efforts.
Yes, I can understand their concern about SPIFFE integration. On the other hand, maybe it would add considerable value to the health-probe if it had support for frameworks like SPIFFE without additional steps.
from cmd-nsmgr.
@rpiceage They've actually come back quite a bit more positively :) Also, I've tested with Spire in K8s with nsmgr and the probe as submitted in that PR does work nicely there :)
from cmd-nsmgr.
@edwarnicke thanks for the good news, I will try some more testing then.
from cmd-nsmgr.
@rpiceage This is what worked for me:
readinessProbe:
  exec:
    command: [ "/bin/grpc-health-probe", "-spiffe", "-addr=:5001" ]
  initialDelaySeconds: 5
livenessProbe:
  exec:
    command: [ "/bin/grpc-health-probe", "-spiffe", "-addr=:5001" ]
  initialDelaySeconds: 10
from cmd-nsmgr.
@rpiceage This should now be working as of: networkservicemesh/deployments-k8s#1133
from cmd-nsmgr.
@edwarnicke Works like a charm, thanks :)
Any chance that this feature will be available in the vpp-forwarder? I saw that cmd-registry-memory is updated with it, it also works fine.
from cmd-nsmgr.
@rpiceage VPP Forwarder doesn't expose any external ports, it only listens on a Unix File socket. Currently that unix file socket is created randomly (tempdir style). It could be made more deterministic and we could then use the same approach with grpc-health-probe. Would that meet the need?
from cmd-nsmgr.
@edwarnicke Yes, certainly. Many thanks.
from cmd-nsmgr.
@rpiceage networkservicemesh/cmd-forwarder-vpp#170 and networkservicemesh/deployments-k8s#1178 should, when merged, give you readiness/liveliness for cmd-forwarder-vpp.
from cmd-nsmgr.
@rpiceage Both are now merged. Let me know if that meets your need for cmd-forwarder-vpp and readiness/liveliness :)
from cmd-nsmgr.