openshift-rabbitmq-cluster's Issues

Could not auto-cluster with node

Hi,

I'm facing the issue below while deploying the RabbitMQ cluster on OpenShift 3.9.

Error:
2019-03-14 05:09:59.378 [info] <0.266.0> FHC read buffering: OFF
2019-03-14 05:09:59.378 [info] <0.266.0> FHC write buffering: ON
2019-03-14 05:09:59.383 [info] <0.224.0> Node database directory at /var/lib/rabbitmq/mnesia/rabbit@rabbitmq-cluster-1.rabbitmq-cluster.xyz-dev.svc.cluster.local is empty. Assuming we need to join an existing cluster or initialise from scratch...
2019-03-14 05:09:59.383 [info] <0.224.0> Configured peer discovery backend: rabbit_peer_discovery_k8s
2019-03-14 05:09:59.383 [info] <0.224.0> Will try to lock with peer discovery backend rabbit_peer_discovery_k8s
2019-03-14 05:09:59.383 [info] <0.224.0> Peer discovery backend does not support locking, falling back to randomized delay
2019-03-14 05:09:59.383 [info] <0.224.0> Peer discovery backend rabbit_peer_discovery_k8s does not support registration, skipping randomized startup delay.
2019-03-14 05:09:59.405 [info] <0.224.0> All discovered existing cluster peers: rabbit@rabbitmq-cluster-1.rabbitmq-cluster.development.svc.cluster.local, rabbit@rabbitmq-cluster-0.rabbitmq-cluster.development.svc.cluster.local
2019-03-14 05:09:59.405 [info] <0.224.0> Peer nodes we can cluster with: rabbit@rabbitmq-cluster-1.rabbitmq-cluster.development.svc.cluster.local, rabbit@rabbitmq-cluster-0.rabbitmq-cluster.development.svc.cluster.local
2019-03-14 05:09:59.409 [warning] <0.224.0> Could not auto-cluster with node rabbit@rabbitmq-cluster-1.rabbitmq-cluster.development.svc.cluster.local: {badrpc,nodedown}
2019-03-14 05:09:59.419 [warning] <0.224.0> Could not auto-cluster with node rabbit@rabbitmq-cluster-0.rabbitmq-cluster.development.svc.cluster.local: {badrpc,nodedown}
2019-03-14 05:09:59.419 [warning] <0.224.0> Could not successfully contact any node of: rabbit@rabbitmq-cluster-1.rabbitmq-cluster.development.svc.cluster.local,rabbit@rabbitmq-cluster-0.rabbitmq-cluster.development.svc.cluster.local (as in Erlang distribution). Starting as a blank standalone node...
2019-03-14 05:09:59.436 [info] <0.43.0> Application mnesia exited with reason: stopped
2019-03-14 05:10:02.039 [info] <0.224.0> Waiting for Mnesia tables for 30000 ms, 9 retries left
2019-03-14 05:10:02.104 [info] <0.224.0> Waiting for Mnesia tables for 30000 ms, 9 retries left
2019-03-14 05:10:02.210 [info] <0.224.0> Waiting for Mnesia tables for 30000 ms, 9 retries left
2019-03-14 05:10:02.210 [info] <0.224.0> Peer discovery backend

Pod Service DNS, AddressType: hostname

Hello @jharting ,
Thank you for sharing. Is this config really working with OpenShift?
I'm using OpenShift 3.7 and cannot get service discovery to run.
I'm facing two issues:
1.) If I use a headless service (clusterIP: None) and a fully qualified hostname, name resolution doesn't work:
ERROR: epmd error for host rabbitmq-68-7fcdw.rabbitmq-cluster.dcrpi-omsf-dev0.svc.cluster.local: nxdomain (non-existing domain)
With a clusterIP, name resolution works, but it resolves to the IP address of the service, not of the pod. In this case I get an epmd timeout.

2.) The setting cluster_formation.k8s.address_type = hostname doesn't work, as the OpenShift API doesn't return the hostname like Kubernetes does.
I have already opened a feature request: rabbitmq/rabbitmq-peer-discovery-k8s#33

Side note: I'm using a DeploymentConfig, not a StatefulSet. But this shouldn't make a difference.

I think both issues are related to the same root cause, as the pod hostname field is used for DNS discovery:
https://kubernetes.io/docs/concepts/services-networking/dns-pod-service/

Any help is appreciated. Thank you
Roberto
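
One point that may be worth checking for the nxdomain error above: Kubernetes only publishes per-pod DNS names of the form <pod>.<service>.<namespace>.svc.cluster.local when the pod's hostname and subdomain fields are set. A StatefulSet sets both automatically through serviceName, while pods created by a DeploymentConfig get generated names and no such record. The following is a minimal sketch, not the template from this repository. The label app: rabbitmq, the port list, and the image tag are assumptions for illustration, and older clusters may only serve a beta version of the apps API group.

```yaml
# Headless Service: DNS answers with pod IPs instead of a single service IP.
apiVersion: v1
kind: Service
metadata:
  name: rabbitmq-cluster
spec:
  clusterIP: None
  selector:
    app: rabbitmq            # assumed label
  ports:
    - name: epmd
      port: 4369
    - name: amqp
      port: 5672
---
# StatefulSet: pods are named rabbitmq-cluster-0, rabbitmq-cluster-1, ...
# and each gets a stable DNS record under the Service named in serviceName.
apiVersion: apps/v1          # assumption: older clusters may need a beta apps version
kind: StatefulSet
metadata:
  name: rabbitmq-cluster
spec:
  serviceName: rabbitmq-cluster   # must match the headless Service name
  replicas: 2
  selector:
    matchLabels:
      app: rabbitmq
  template:
    metadata:
      labels:
        app: rabbitmq
    spec:
      containers:
        - name: rabbitmq
          image: rabbitmq:3.7     # assumed image tag
```

With this layout the node names seen elsewhere on this page, such as rabbitmq-cluster-0.rabbitmq-cluster.<namespace>.svc.cluster.local, should resolve to individual pods.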

can't cd to /var/log/rabbitmq

When starting up, I see the following errors:
/usr/lib/rabbitmq/bin/rabbitmq-plugins: 86: cd: can't cd to /var/log/rabbitmq
/usr/lib/rabbitmq/bin/rabbitmq-server: 86: cd: can't cd to /var/log/rabbitmq
2018-11-14 03:13:32.860 [error] <0.98.0> Failed to open crash log file /var/log/rabbitmq/log/crash.log with error: permission denied

And running any command also shows the following error:
$ rabbitmqctl list_queues
/usr/lib/rabbitmq/bin/rabbitmqctl: 86: cd: can't cd to /var/log/rabbitmq
Timeout: 60.0 seconds ...
Listing queues for vhost / ...

Most of the errors go away if we add the environment variable RABBITMQ_LOG_BASE = -
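
As a sketch of how the variable reported above could be set in the pod template (the container name is an assumption, and this is a fragment of a pod spec, not a complete manifest):

```yaml
# Fragment of spec.template.spec in the StatefulSet or DeploymentConfig.
containers:
  - name: rabbitmq              # assumed container name
    env:
      - name: RABBITMQ_LOG_BASE
        value: "-"              # quoted so YAML treats it as a string, not a list marker
```

The quotes matter: env values must be strings, and an unquoted - is likely to be rejected by the YAML parser.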

Network policy update required

2022-02-24 19:27:55.565 [warning] <0.274.0> Could not auto-cluster with node [email protected]: {badrpc,nodedown}
2022-02-24 19:27:55.565 [info] <0.274.0> Trying to join discovered peers failed. Will retry after a delay of 500 ms, 6 retries left...

The following NetworkPolicy fixes the above issue. In effect, both ingress and egress rules need to be added:

apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: rabbitmq-internal-access
spec:
  podSelector:
    matchLabels:
      app: rabbitmq
  ingress:
    - from:
        - podSelector:
            matchLabels:
              app: rabbitmq
  egress:
    - to:
        - podSelector:
            matchLabels:
              app: rabbitmq

Openshift SCC requirement

Thanks for providing this! I did notice that I needed the anyuid SCC when using this on OpenShift v3.7.0+7ed6862 (old, I know).

Just required:

oc adm policy add-scc-to-user anyuid -z rabbitmq-discovery

I did not need to add the SCC on a v3.9.0+ba7faec-1 cluster.

error converting YAML to JSON: yaml when using Openshift 4

Hi All,
I'm still trying to fix this but I am having trouble using this:
oc create -f rabbitmq-cluster-template.yaml
results in
error converting YAML to JSON: yaml: line 115: mapping values are not allowed in this context

I ran the YAML through http://www.yamllint.com/ and it looks fine. I even tried using the reformatted version and still get the same error.
Not sure which line is 115 though :)
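
Without the template at hand it is not possible to say what line 115 contains, but "mapping values are not allowed in this context" usually has one of two causes. The sketch below shows both with hypothetical keys and values; the broken forms are commented out so the block itself parses.

```yaml
# Cause 1: an unquoted value that itself contains ": ".
# Broken:
#   description: RabbitMQ cluster: two nodes
# Fixed by quoting the value:
description: "RabbitMQ cluster: two nodes"

# Cause 2: a key indented deeper than its sibling, so the parser reads it
# as a continuation of the previous value and then trips over the colon.
# Broken:
#   metadata:
#     name: rabbitmq-cluster
#       labels:
#         app: rabbitmq
# Fixed by aligning the sibling keys:
metadata:
  name: rabbitmq-cluster
  labels:
    app: rabbitmq
```

Copying a template from a rendered web page can also change indentation or introduce tabs, so downloading the raw file may be worth trying.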

PersistentVolumeClaims

running "VolumeBinding" filter plugin for pod "rabbitmq-cluster-0": pod has unbound immediate PersistentVolumeClaims.
What am I missing here ?
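
This scheduler message generally means the pod's PersistentVolumeClaim has not been bound yet, for example because no PersistentVolume matches the request or the cluster has no default StorageClass. A sketch of a claim template that names a storage class explicitly follows; the claim name, class name, and size are hypothetical and need to match what the cluster actually offers (oc get storageclass lists the available classes).

```yaml
# Fragment of a StatefulSet spec.
volumeClaimTemplates:
  - metadata:
      name: rabbitmq-storage        # hypothetical claim name
    spec:
      accessModes: ["ReadWriteOnce"]
      storageClassName: standard    # hypothetical; use a class that exists in the cluster
      resources:
        requests:
          storage: 1Gi              # hypothetical size
```

oc describe pvc on the pending claim usually shows why binding failed.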

Is TLS support planned?

Are there plans for introducing TLS support? To secure client and also inter-node communication? It would be good to have such an option and be able to specify a TLS k8s secret as a provider of a certificate which should be used for it.

** Connection attempt from disallowed node 'rabbitmqcli-1384-rabbit@rabbitmq-cluster-1.rabbitmq-cluster.rabbitmq-ha.svc.cluster.local' **

One of the pods goes into CrashLoopBackOff when I try to deploy this template. The other pod is running and ready.

Error in logs: ** Connection attempt from disallowed node 'rabbitmqcli-1384-rabbit@rabbitmq-cluster-1.rabbitmq-cluster.rabbitmq-ha.svc.cluster.local' **

In events: (combined from similar events): Readiness probe failed: Error: unable to perform an operation on node 'rabbit@rabbitmq-cluster-1.rabbitmq-cluster.rabbitmq-ha.svc.cluster.local'. Please see diagnostics information and suggestions below.

Most common reasons for this are:

 * Target node is unreachable (e.g. due to hostname resolution, TCP connection or firewall issues)
 * CLI tool fails to authenticate with the server (e.g. due to CLI tool's Erlang cookie not matching that of the server)
 * Target node is not running

In addition to the diagnostics info below:

 * See the CLI, clustering and networking guides on https://rabbitmq.com/documentation.html to learn more
 * Consult server logs on node rabbit@rabbitmq-cluster-1.rabbitmq-cluster.rabbitmq-ha.svc.cluster.local
 * If target node is configured to use long node names, don't forget to use --longnames with CLI tools

DIAGNOSTICS
===========

attempted to contact: ['rabbit@rabbitmq-cluster-1.rabbitmq-cluster.rabbitmq-ha.svc.cluster.local']

rabbit@rabbitmq-cluster-1.rabbitmq-cluster.rabbitmq-ha.svc.cluster.local:
  * connected to epmd (port 4369) on rabbitmq-cluster-1.rabbitmq-cluster.rabbitmq-ha.svc.cluster.local
  * epmd reports node 'rabbit' uses port 25672 for inter-node and CLI tool traffic
  * TCP connection succeeded but Erlang distribution failed
  * suggestion: check if the Erlang cookie identical for all server nodes and CLI tools
  * suggestion: check if all server nodes and CLI tools use consistent hostnames when addressing each other
  * suggestion: check if inter-node connections may be configured to use TLS. If so, all nodes and CLI tools must do that
  * suggestion: see the CLI, clustering and networking guides on https://rabbitmq.com/documentation.html to learn more

Current node details:
 * node name: 'rabbitmqcli-492-rabbit@rabbitmq-cluster-1.rabbitmq-cluster.rabbitmq-ha.svc.cluster.local'
 * effective user's home directory: /var/lib/rabbitmq
 * Erlang cookie hash: 6ofCpuGcLssmT/U34mLOFg==
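
"Connection attempt from disallowed node" is what Erlang logs when the connecting side presents a different cookie, which matches the first suggestion in the diagnostics. One way to keep the cookie identical across pods and CLI tools is to source it from a single Secret. The sketch below assumes the container image reads the RABBITMQ_ERLANG_COOKIE environment variable, which Docker images of that era did; newer images may instead expect the cookie as a file at /var/lib/rabbitmq/.erlang.cookie. The Secret name, key, and container name are hypothetical.

```yaml
apiVersion: v1
kind: Secret
metadata:
  name: rabbitmq-erlang-cookie      # hypothetical name
type: Opaque
stringData:
  cookie: replace-with-a-long-random-string
---
# Fragment of the pod template, not a complete manifest.
containers:
  - name: rabbitmq                  # assumed container name
    env:
      - name: RABBITMQ_ERLANG_COOKIE   # assumption: honoured by the image in use
        valueFrom:
          secretKeyRef:
            name: rabbitmq-erlang-cookie
            key: cookie
```

Comparing the "Erlang cookie hash" line printed by the CLI inside each pod is a quick way to confirm whether the cookies actually differ.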

Error creating replica

Am I missing something?

Running on an OpenShift 4 cluster.

Error creating: pods "rabbitmq-cluster-operator-555fd7d956-" is forbidden: unable to validate against any security context constraint: [fsGroup: Invalid value: []int64{1000}: 1000 is not an allowed group spec.containers[0].securityContext.securityContext.runAsUser: Invalid value: 1000: must be in the ranges: [1001030000, 1001039999]]
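
The error says the manifest pins runAsUser and fsGroup to 1000, while the security context constraint applied to the pod only allows IDs from the project's range (1001030000 to 1001039999 here). One option, besides granting a more permissive SCC such as anyuid, is to leave those fields unset so OpenShift assigns values from the allowed range. The sketch below is a hypothetical fragment, not the actual operator manifest, and it assumes the image can run under an arbitrary UID.

```yaml
# Fragment of the Deployment's pod template.
spec:
  securityContext: {}          # no fsGroup: let the SCC assign one from the project's range
  containers:
    - name: operator           # assumed container name
      securityContext:
        runAsNonRoot: true     # no runAsUser: OpenShift picks a UID from the allowed range
```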

how do i expose rabbitmq to external access?

This template creates two services:

  • rabbitmq-cluster (5672/TCP (amqp) 5672)
  • rabbitmq-cluster-balancer (15672/TCP (http) 15672 and 5672/TCP (amqp) 5672)

How do I expose port 5672 to external traffic? I created routes to both services, but clients can't connect. Any suggestions?

Thanks
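
A likely reason the routes do not help: OpenShift routes carry HTTP, HTTPS, and TLS traffic with SNI, so the management UI on 15672 can go through a route, but plain AMQP on 5672 cannot. A common alternative is a NodePort (or LoadBalancer) Service. The sketch below assumes the pods carry the label app: rabbitmq; the Service name and node port number are hypothetical.

```yaml
apiVersion: v1
kind: Service
metadata:
  name: rabbitmq-cluster-external   # hypothetical name
spec:
  type: NodePort
  selector:
    app: rabbitmq                   # assumed label; must match the RabbitMQ pods
  ports:
    - name: amqp
      port: 5672
      targetPort: 5672
      nodePort: 30672               # hypothetical; must lie in the cluster's node port range
```

Clients would then connect to any node's address on the chosen node port, provided firewalls allow it.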
