jharting / openshift-rabbitmq-cluster
Deploys a RabbitMQ cluster in OpenShift
Hi,
I'm facing the issue below while deploying the RabbitMQ cluster in OpenShift 3.9.
| 2019-03-14 05:09:59.378 [info] <0.266.0> FHC write buffering: ON
| 2019-03-14 05:09:59.383 [info] <0.224.0> Node database directory at /var/lib/rabbitmq/mnesia/rabbit@rabbitmq-cluster-1.rabbitmq-cluster.xyz-dev.svc.cluster.local is empty. Assuming we need to join an existing cluster or initialise from scratch...
| 2019-03-14 05:09:59.383 [info] <0.224.0> Configured peer discovery backend: rabbit_peer_discovery_k8s
| 2019-03-14 05:09:59.383 [info] <0.224.0> Will try to lock with peer discovery backend rabbit_peer_discovery_k8s
| 2019-03-14 05:09:59.383 [info] <0.224.0> Peer discovery backend does not support locking, falling back to randomized delay
| 2019-03-14 05:09:59.383 [info] <0.224.0> Peer discovery backend rabbit_peer_discovery_k8s does not support registration, skipping randomized startup delay.
| 2019-03-14 05:09:59.405 [info] <0.224.0> All discovered existing cluster peers: rabbit@rabbitmq-cluster-1.rabbitmq-cluster.development.svc.cluster.local, rabbit@rabbitmq-cluster-0.rabbitmq-cluster.development.svc.cluster.local
| 2019-03-14 05:09:59.405 [info] <0.224.0> Peer nodes we can cluster with: rabbit@rabbitmq-cluster-1.rabbitmq-cluster.development.svc.cluster.local, rabbit@rabbitmq-cluster-0.rabbitmq-cluster.development.svc.cluster.local
| 2019-03-14 05:09:59.409 [warning] <0.224.0> Could not auto-cluster with node rabbit@rabbitmq-cluster-1.rabbitmq-cluster.development.svc.cluster.local: {badrpc,nodedown}
| 2019-03-14 05:09:59.419 [warning] <0.224.0> Could not auto-cluster with node rabbit@rabbitmq-cluster-0.rabbitmq-cluster.development.svc.cluster.local: {badrpc,nodedown}
| 2019-03-14 05:09:59.419 [warning] <0.224.0> Could not successfully contact any node of: rabbit@rabbitmq-cluster-1.rabbitmq-cluster.development.svc.cluster.local,rabbit@rabbitmq-cluster-0.rabbitmq-cluster.development.svc.cluster.local (as in Erlang distribution). Starting as a blank standalone node...
| 2019-03-14 05:09:59.436 [info] <0.43.0> Application mnesia exited with reason: stopped
| 2019-03-14 05:10:02.039 [info] <0.224.0> Waiting for Mnesia tables for 30000 ms, 9 retries left
| 2019-03-14 05:10:02.104 [info] <0.224.0> Waiting for Mnesia tables for 30000 ms, 9 retries left
| 2019-03-14 05:10:02.210 [info] <0.224.0> Waiting for Mnesia tables for 30000 ms, 9 retries left
| 2019-03-14 05:10:02.210 [info] <0.224.0> Peer discovery backend
Hello @jharting ,
Thank you for sharing. Is this config really working with OpenShift?
I'm using OpenShift 3.7 and cannot get service discovery to run.
I'm facing 2 issues:
1.) If I'm using a headless service (clusterIP: None) and a fully qualified hostname, name resolution doesn't work:
ERROR: epmd error for host rabbitmq-68-7fcdw.rabbitmq-cluster.dcrpi-omsf-dev0.svc.cluster.local: nxdomain (non-existing domain)
With a clusterIP, name resolution works, but it resolves to the IP address of the service, not of the pod. In this case I get an epmd timeout.
2.) Setting cluster_formation.k8s.address_type = hostname doesn't work, as the OpenShift API doesn't return the hostname like Kubernetes does.
I have already opened a feature request: rabbitmq/rabbitmq-peer-discovery-k8s#33
Side note: I'm using a DeploymentConfig, not a StatefulSet. But this shouldn't make a difference.
I think both issues are related to the same root cause, as the pod-hostname field is used for the dns discovery:
https://kubernetes.io/docs/concepts/services-networking/dns-pod-service/
Any help is appreciated. Thank you
Roberto
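A minimal Python sketch for checking the name resolution described in this report, assuming the naming scheme from the linked Kubernetes DNS page (hostname.subdomain.namespace.svc.cluster-domain). The helper names pod_fqdn and resolve are hypothetical, and "cluster.local" is assumed as the cluster domain because it appears in the error above:

```python
import socket


def pod_fqdn(hostname, subdomain, namespace, cluster_domain="cluster.local"):
    """Build the per-pod DNS name that a headless service is expected to publish."""
    return f"{hostname}.{subdomain}.{namespace}.svc.{cluster_domain}"


def resolve(name):
    """Return the sorted addresses for name, or [] if lookup fails (e.g. nxdomain)."""
    try:
        infos = socket.getaddrinfo(name, None)
    except socket.gaierror:
        return []
    # Each entry's sockaddr tuple starts with the address string.
    return sorted({info[4][0] for info in infos})


if __name__ == "__main__":
    name = pod_fqdn("rabbitmq-68-7fcdw", "rabbitmq-cluster", "dcrpi-omsf-dev0")
    print(name, "->", resolve(name) or "does not resolve")
```

Run from inside a pod in the same cluster, an empty result for the pod name would match the nxdomain error quoted above, while a single service IP would match the clusterIP case.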
When starting up, I see the following errors:
/usr/lib/rabbitmq/bin/rabbitmq-plugins: 86: cd: can't cd to /var/log/rabbitmq
/usr/lib/rabbitmq/bin/rabbitmq-server: 86: cd: can't cd to /var/log/rabbitmq
2018-11-14 03:13:32.860 [error] <0.98.0> Failed to open crash log file /var/log/rabbitmq/log/crash.log with error: permission denied
And running any command also shows the following error:
$ rabbitmqctl list_queues
/usr/lib/rabbitmq/bin/rabbitmqctl: 86: cd: can't cd to /var/log/rabbitmq
Timeout: 60.0 seconds ...
Listing queues for vhost / ...
Most of the errors go away if we add an environment variable: RABBITMQ_LOG_BASE = -
2022-02-24 19:27:55.565 [warning] <0.274.0> Could not auto-cluster with node [email protected]: {badrpc,nodedown}
2022-02-24 19:27:55.565 [info] <0.274.0> Trying to join discovered peers failed. Will retry after a delay of 500 ms, 6 retries left...
The following NetworkPolicy fixes the above issue; in effect, both ingress and egress rules need to be added:
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: rabbitmq-internal-access
spec:
  podSelector:
    matchLabels:
      app: rabbitmq
  ingress:
  - from:
    - podSelector:
        matchLabels:
          app: rabbitmq
  egress:
  - to:
    - podSelector:
        matchLabels:
          app: rabbitmq
Thanks for providing this! I did notice that I needed the anyuid SCC when using this on OpenShift v3.7.0+7ed6862 (old, I know).
Just required:
oc adm policy add-scc-to-user anyuid -z rabbitmq-discovery
I did not need to add the SCC on a v3.9.0+ba7faec-1 cluster.
Hi All,
I'm still trying to fix this but I am having trouble using this:
oc create -f rabbitmq-cluster-template.yaml
results in
error converting YAML to JSON: yaml: line 115: mapping values are not allowed in this context
I ran the YAML through http://www.yamllint.com/ and it looks fine. I even tried using the reformatted version and still get the same error.
Not sure which line is 115 though :)
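Since the error only reports a line number, one way to see which line the parser means is to print it with a little context. A minimal Python sketch; the helper name show_context is hypothetical, and the file name is the one from the command above:

```python
from pathlib import Path


def show_context(path, line_no, radius=2):
    """Return the lines around line_no (1-indexed), each prefixed with its number."""
    lines = Path(path).read_text(encoding="utf-8").splitlines()
    start = max(line_no - radius, 1)
    end = min(line_no + radius, len(lines))
    out = []
    for n in range(start, end + 1):
        marker = ">>" if n == line_no else "  "
        # repr() makes tabs and trailing whitespace visible.
        out.append(f"{marker} {n:4d}: {lines[n - 1]!r}")
    return out


if __name__ == "__main__":
    target = Path("rabbitmq-cluster-template.yaml")
    if target.exists():
        print("\n".join(show_context(target, 115)))
```

The message "mapping values are not allowed in this context" often points at an indentation problem or an unquoted colon inside a value, so the marked line and the one before it are the first places to look.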
warning: /var/lib/rabbitmq/.erlang.cookie contents do not match RABBITMQ_ERLANG_COOKIE
sed: couldn't open temporary file /etc/rabbitmq/sedRe7mWL: Read-only file system
running "VolumeBinding" filter plugin for pod "rabbitmq-cluster-0": pod has unbound immediate PersistentVolumeClaims.
What am I missing here?
Are there plans for introducing TLS support? To secure client and also inter-node communication? It would be good to have such an option and be able to specify a TLS k8s secret as a provider of a certificate which should be used for it.
One of the pods goes into CrashLoopBackOff when I try to deploy this template. The other pod is running and ready.
Error in logs: ** Connection attempt from disallowed node 'rabbitmqcli-1384-rabbit@rabbitmq-cluster-1.rabbitmq-cluster.rabbitmq-ha.svc.cluster.local' **
In events: (combined from similar events): Readiness probe failed: Error: unable to perform an operation on node 'rabbit@rabbitmq-cluster-1.rabbitmq-cluster.rabbitmq-ha.svc.cluster.local'. Please see diagnostics information and suggestions below.
Most common reasons for this are:
* Target node is unreachable (e.g. due to hostname resolution, TCP connection or firewall issues)
* CLI tool fails to authenticate with the server (e.g. due to CLI tool's Erlang cookie not matching that of the server)
* Target node is not running
In addition to the diagnostics info below:
* See the CLI, clustering and networking guides on https://rabbitmq.com/documentation.html to learn more
* Consult server logs on node rabbit@rabbitmq-cluster-1.rabbitmq-cluster.rabbitmq-ha.svc.cluster.local
* If target node is configured to use long node names, don't forget to use --longnames with CLI tools
DIAGNOSTICS
===========
attempted to contact: ['rabbit@rabbitmq-cluster-1.rabbitmq-cluster.rabbitmq-ha.svc.cluster.local']
rabbit@rabbitmq-cluster-1.rabbitmq-cluster.rabbitmq-ha.svc.cluster.local:
* connected to epmd (port 4369) on rabbitmq-cluster-1.rabbitmq-cluster.rabbitmq-ha.svc.cluster.local
* epmd reports node 'rabbit' uses port 25672 for inter-node and CLI tool traffic
* TCP connection succeeded but Erlang distribution failed
* suggestion: check if the Erlang cookie identical for all server nodes and CLI tools
* suggestion: check if all server nodes and CLI tools use consistent hostnames when addressing each other
* suggestion: check if inter-node connections may be configured to use TLS. If so, all nodes and CLI tools must do that
* suggestion: see the CLI, clustering and networking guides on https://rabbitmq.com/documentation.html to learn more
Current node details:
* node name: 'rabbitmqcli-492-rabbit@rabbitmq-cluster-1.rabbitmq-cluster.rabbitmq-ha.svc.cluster.local'
* effective user's home directory: /var/lib/rabbitmq
* Erlang cookie hash: 6ofCpuGcLssmT/U34mLOFg==
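The "Erlang cookie hash" at the end of this output can be compared against the cookie each node actually has. The sketch below assumes the hash is the Base64-encoded MD5 digest of the cookie string (the 24-character value is consistent with a 16-byte digest, but treat the format as an assumption); the function name cookie_hash is hypothetical, and the cookie path is the one mentioned in a warning elsewhere on this page.

```python
import base64
import hashlib
from pathlib import Path


def cookie_hash(cookie: str) -> str:
    """Base64-encoded MD5 digest of the cookie string (assumed CLI format)."""
    digest = hashlib.md5(cookie.encode("utf-8")).digest()
    return base64.b64encode(digest).decode("ascii")


if __name__ == "__main__":
    cookie_file = Path("/var/lib/rabbitmq/.erlang.cookie")
    if cookie_file.exists():
        # Strip a trailing newline, which would otherwise change the hash.
        print(cookie_hash(cookie_file.read_text(encoding="utf-8").strip()))
```

If two pods print different values, or neither matches the hash in the diagnostics, a cookie mismatch would explain why the TCP connection succeeds but Erlang distribution fails.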
What would be the easiest way to add a policy?
Specifically, I am trying to add the following example:
https://github.com/rabbitmq/rabbitmq-server/blob/master/docs/set_rabbitmq_policy.sh.example
But it doesn't seem to work if I add it as a ConfigMap.
Any help or suggestions would be appreciated.
Am I missing something?
Running on an OpenShift 4 cluster:
Error creating: pods "rabbitmq-cluster-operator-555fd7d956-" is forbidden: unable to validate against any security context constraint: [fsGroup: Invalid value: []int64{1000}: 1000 is not an allowed group spec.containers[0].securityContext.securityContext.runAsUser: Invalid value: 1000: must be in the ranges: [1001030000, 1001039999]]
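The message says the requested runAsUser is outside the UID range that the security context constraint allows. A small Python sketch that extracts that range from such a message and checks a UID against it; the function names are hypothetical, and the regular expression only targets the "must be in the ranges: [low, high]" wording seen above:

```python
import re


def allowed_uid_range(message):
    """Extract the (low, high) UID range from an SCC validation message, or None."""
    match = re.search(r"must be in the ranges: \[(\d+), (\d+)\]", message)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def uid_allowed(uid, message):
    """Return True if uid falls inside the range reported in the message."""
    bounds = allowed_uid_range(message)
    return bounds is not None and bounds[0] <= uid <= bounds[1]


if __name__ == "__main__":
    msg = "runAsUser: Invalid value: 1000: must be in the ranges: [1001030000, 1001039999]"
    print(allowed_uid_range(msg), uid_allowed(1000, msg))
```

For the error above, UID 1000 is not in the reported range, so the pod either needs a UID from that range (or none at all, letting OpenShift assign one) or a service account bound to a less restrictive SCC.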
This template creates two services:
How do I expose port 5672 to external traffic? I created routes to both services, but clients can't connect. Any suggestions?
Thanks
RABBITMQ_CONFIG_FILE is set to "/var/lib/rabbitmq/rabbitmq.conf", but docker-entrypoint.sh expects it without the ".conf" extension (https://github.com/docker-library/rabbitmq/blob/master/3.7/debian/docker-entrypoint.sh#L201-L203).
So in your version it creates a new file, rabbitmq.conf.conf, and the new params (i.e. RABBITMQ_DEFAULT_USER and RABBITMQ_DEFAULT_PASS) are written there, so the rabbitmq-server process doesn't see them.
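The effect can be illustrated with a short Python model of the behavior described in this report. It is not the entrypoint script itself; both function names are hypothetical, and the only assumption is the one stated above, that the script appends ".conf" to the value of RABBITMQ_CONFIG_FILE.

```python
def entrypoint_config_path(config_file_env):
    """Model of the reported behavior: '.conf' is appended to the variable's value."""
    return config_file_env + ".conf"


def normalized_env_value(config_file_env):
    """Value to set so that the appended suffix yields the intended file name."""
    suffix = ".conf"
    if config_file_env.endswith(suffix):
        return config_file_env[: -len(suffix)]
    return config_file_env


if __name__ == "__main__":
    current = "/var/lib/rabbitmq/rabbitmq.conf"
    print(entrypoint_config_path(current))
    print(entrypoint_config_path(normalized_env_value(current)))
```

Under that assumption, setting RABBITMQ_CONFIG_FILE to "/var/lib/rabbitmq/rabbitmq" would make the generated file and the file read by rabbitmq-server the same.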