
ondrejhome / ansible.ha-cluster-pacemaker


ansible role for HA pacemaker cluster (RHEL/CentOS/AlmaLinux/Rocky Linux/Fedora/Debian)

License: GNU General Public License v3.0

pcs cluster pacemaker ansible-role ansible

ansible.ha-cluster-pacemaker's Introduction

ha-cluster-pacemaker

Role for configuring and expanding a basic pacemaker cluster on CentOS/RHEL 6/7/8/9, AlmaLinux 8/9, Rocky Linux 8/9, Fedora 31/32/33/34/35/36/37/38/39/40 and CentOS 8/9 Stream systems.

This role can configure the following aspects of a pacemaker cluster:

  • enable needed system repositories
  • install needed packages
  • create and configure users and groups for running pacemaker cluster
  • configure firewall
  • generate items in /etc/hosts
  • authorize cluster nodes
  • create a cluster or expand an existing cluster (check allow_cluster_expansion)
    • "2 or more" node cluster
    • single heartbeat, rrp or knet with up to 8 links
    • remote nodes
    • use autodetected or custom selected interfaces/IPs for heartbeat
  • start and enable cluster on boot
  • configure stonith devices
    • by default install and configure fence_xvm stonith devices
    • optionally configure fence_kdump
    • optionally configure fence_vmware (SOAP/REST) or any other fence_* stonith devices
    • optionally configure fence_aws

The role fully supports --check mode for the default configuration and partially supports it for most other options.

When reporting an issue please provide the following information (if possible):

  • used ansible version
  • OS from which ansible was run
  • playbook and inventory file that produced the error (remove sensitive information where appropriate)
  • error message or description of the misbehaviour that you encountered

Requirements

This role depends on the role ondrejhome.pcs-modules-2.

Ansible 2.8 or later. (NOTE: it might be possible to use earlier versions, in case of issues please try updating Ansible to 2.8+)
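
A minimal sketch of installing this role together with its dependency from Ansible Galaxy (the file name requirements.yml is illustrative; the version pin follows the pcs-0.11 note further below and can be omitted to get the latest version):

    # requirements.yml  (install with: ansible-galaxy install -r requirements.yml)
    - src: ondrejhome.pcs-modules-2
      version: '27.0.0'   # illustrative pin; pcs-0.11 distributions need 27.0.0 or newer
    - src: ondrejhome.ha-cluster-pacemaker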

RHEL 6/7/8: It is expected that machines will already be registered. The role will by default enable access to the 'High Availability' or 'Resilient Storage' channel. If this is not desired check the enable_repos variable.

RHEL/CentOS 7: This role requires at least version 2.9 of the python-jinja2 library. If it is not present you may encounter the error described in Issue #6. To get the updated version of python-jinja2 and its dependencies you can use the following RPM repository - https://copr.fedorainfracloud.org/coprs/ondrejhome/ansible-deps-el7/ - for both CentOS 7 and RHEL 7.

CentOS 8 Stream: Tested with version 20240129; the minimal recommended ansible version is 2.11.0, which starts to identify the system as 'CentOS' instead of 'RedHat' (unlike CentOS Linux). For the older CentOS 8 Stream version 20201211 the minimal usable ansible version is 2.9.16/2.10.4. Version 2.8.18 was not working at the time of testing. This is related to 'Service is in unknown state' #71528.

CentOS 9 Stream: Tested with version 20240129; the minimal recommended ansible version is 2.11.0.

Debian Buster: Tested with version 20210310 and ansible 2.10; Debian Bullseye: Tested with version 20220326 and ansible 2.12. The Debian part of this role does not include the stonith configuration and the firewall configuration. Note: This role went through only limited testing on Debian - not all features of this role were tested.

Debian Bookworm: Tested with ansible version 2.14 and Debian Bookworm. The Debian part of this role does not include the stonith configuration and the firewall configuration. Note: This role went through only limited testing on Debian - not all features of this role were tested.

Ansible versions 2.9.10 and 2.9.11 will fail with the error "'hostvars' is undefined" when trying to configure remote nodes. This applies only when there is at least one node with cluster_node_is_remote=True. Avoid these Ansible versions if you plan to configure remote nodes with this role.

On CentOS Linux 8 you have to ensure that the BaseOS and Appstream repositories are working properly. As CentOS Linux 8 is in the End-Of-Life phase, this role will configure the HA repository to point to vault.centos.org if repository configuration is requested (enable_repos: true, which is the default).

Distributions with pcs-0.11 (AlmaLinux 9, Rocky Linux 9, RHEL 9, Fedora 36/37/38) are supported only with ondrejhome.pcs-modules-2 version 27.0.0 or higher.

Role Variables

  • user used for authorizing cluster nodes

    cluster_user: 'hacluster'
    
  • password for user used for authorizing cluster nodes

    cluster_user_pass: 'testtest'
    
  • group to which cluster user belongs (should be 'haclient')

    cluster_group: 'haclient'
    
  • name of the cluster

    cluster_name: 'pacemaker'
    
  • configuration of the firewall for clustering; NOTE: on RHEL/CentOS 6 this replaces the iptables configuration file!

    cluster_firewall: true
    
  • enable cluster on boot on normal (not pacemaker_remote) nodes

    cluster_enable_service: true
    
  • configure cluster with the fence_xvm fencing device? This will copy /etc/cluster/fence_xvm.key to nodes and add fencing devices to the cluster. NOTE: you need to define 'vm_name' in the inventory for each cluster node

    cluster_configure_fence_xvm: true
    
  • configure cluster with the fence_vmware_soap/fence_vmware_rest fencing device? This will install the fence_vmware_soap/fence_vmware_rest fencing agent and configure it. When this is enabled you have to specify 3 additional variables with information on accessing the vCenter. NOTE: You also need to define 'vm_name' in the inventory for each cluster node, specifying the name or UUID of the VM as seen on the hypervisor or in the output of the fence_vmware_soap/fence_vmware_rest -o list command.

    cluster_configure_fence_vmware_soap: false
    cluster_configure_fence_vmware_rest: false
    fence_vmware_ipaddr: ''
    fence_vmware_login: ''
    fence_vmware_passwd: ''
    

    You can optionally change the additional attributes passed to fence_vmware_soap/fence_vmware_rest using the variable fence_vmware_options. By default this variable enables encryption but disables validation of certificates.

    fence_vmware_options: 'ssl="1" ssl_insecure="1"'
    

    NOTE: Only one of fence_vmware_soap/fence_vmware_rest can be configured as stonith devices share same name.

  • configure cluster with fence_kdump fencing device ? This starts kdump service and defines the fence_kdump stonith devices. NOTE: if the kdump service is not started this won't work properly or at all

    cluster_configure_fence_kdump: false
    
  • configure cluster with the fence_aws fencing device? You must provide the instance id/region of AWS and an Instance Profile that is able to start/stop instances for this cluster. When this is enabled you have to specify the fence_aws_region variable with information on the AWS region. NOTE: If you don't set up the instance profile, it won't work properly or at all

    cluster_configure_fence_aws: false
    fence_aws_region: ''
    

    NOTE: You also need to define instance_id in the inventory for each cluster node by specifying the instance id as seen in the AWS web console or in the output of fence_aws -o list command. (man fence_aws)

    You can optionally change the additional attributes passed to fence_aws using the fence_aws_options variable.

    fence_aws_options: ''
    

    NOTE: Examples of proper options for some specific use cases can be found in documents below.
    https://access.redhat.com/articles/4175371#create-stonith
    https://docs.aws.amazon.com/sap/latest/sap-hana/sap-hana-on-aws-cluster-resources-1.html

  • How to map fence devices to cluster nodes? By default a separate stonith device is created for every cluster node ('one-device-per-node'). Some fence agents can fence multiple nodes using the same stonith device ('one-device-per-cluster') and can have trouble when using multiple devices due to login count limits for the same user. Available options:

    • one-device-per-node - (default) - one stonith device per cluster node is created
    • one-device-per-cluster - (on supported fence agents) - only one cluster-wide stonith device is created for all nodes, supported fence agents: fence_vmware_rest, fence_vmware_soap, fence_xvm, fence_kdump
    cluster_configure_stonith_style: 'one-device-per-node'
    
  • (RHEL/CentOS/AlmaLinux/Rocky) enable the repositories containing needed packages

    enable_repos: true
    
  • (RHEL only) enable the Extended Update Support (EUS) repositories containing the needed packages

    enable_eus_repos: false
    
  • (RHEL only) enable the SAP Solutions update services (E4S) repositories containing the needed packages

    enable_e4s_repos: false
    
  • (RHEL only) enable Beta repositories containing the needed packages

    enable_beta_repos: false
    
  • (RHEL only) type of enabled repositories; note that E4S repos have only the 'ha' type available

    • ha - High-Availability
    • rs - Resilient Storage
    repos_type: 'ha'
    
  • (RHEL only) custom_repository allows an arbitrarily named repository to be enabled. RHEL8 repo names can be found at http://downloads.redhat.com/redhat/rhel/rhel-8-beta/rhel-8-beta.repo

    custom_repository: ''
    
    
  • (CentOS only) install the needed packages from the CD-ROM media available at /dev/cdrom

    use_local_media: false
    
  • Enable or disable the PCSD web GUI. By default the role keeps the installation default, which means the PCSD web GUI is disabled on CentOS/RHEL 6.X and enabled on CentOS/RHEL 7.X. true or false can be passed to this variable to make sure that the PCSD web GUI is enabled or disabled.

    enable_pcsd_gui: 'nochange'
    
  • Cluster transport protocol. By default this role will use what is the default for the given OS. For CentOS/RHEL 6.X this means 'udp' (UDP multicast) and for CentOS/RHEL 7.X this means 'udpu' (UDP unicast). This variable accepts the following options: default, udp and udpu.

    cluster_transport: 'default'
    
  • Allow adding nodes to existing cluster when used with ondrejhome.pcs-modules-2 v16 or newer.

    allow_cluster_expansion: false
    
  • Cluster network interface. If specified the role will map hosts to the primary IPv4 address from this interface. By default the IPv4 address from ansible_default_ipv4 or the first IPv4 from ansible_all_ipv4_addresses is used. For example, to use the primary IPv4 address from interface ens8 use cluster_net_iface: 'ens8'. The interface must exist on all cluster nodes.

    cluster_net_iface: ''
    
  • Redundant network interface. If specified the role will setup a corosync redundant ring using the default IPv4 from this interface. Interface must exist on all cluster nodes.

      rrp_interface: ''
    

    NOTE: you can define this variable either in defaults/main.yml, in which case the same rrp_interface name is used for all hosts in the hosts file, or you can specify an interface for each host present in the hosts file: this allows using a specific interface name for each host (in case they don't have the same interface name). Also note that instead of defining rrp_interface for a host you can define rrp_ip: in this case this alternate IP is used to configure corosync RRP (this IP must be different from the host's default IPv4 address). This allows using an alternate IP belonging to the same primary interface.

  • Whether to add hosts to /etc/hosts. By default an entry for the hostname given by cluster_hostname_fact is added for each host to /etc/hosts. This can be disabled by setting cluster_etc_hosts to false.

    cluster_etc_hosts: true
    
  • Which Ansible fact to use as the hostname of cluster nodes. By default this role uses the ansible_hostname fact as the hostname for each host. In some environments it may be useful to use the Fully Qualified Domain Name (FQDN) ansible_fqdn or node name ansible_nodename.

    cluster_hostname_fact: "ansible_hostname"
    
  • Whether the node should be setup as a remote pacemaker node. By default this is false, and the node will be a full member of the Pacemaker cluster. Pacemaker remote nodes are not full members of the cluster, and allow exceeding the maximum cluster size of 32 full members. Note that remote nodes are supported by this role only on EL7 and EL8.

    cluster_node_is_remote: false
    
  • Ordered list of variables for detecting the primary cluster IP (ring0). The first matched IPv4 is used and the rest of the detected IPv4s are skipped. In the majority of cases this should not require change; in some special cases, such as when there is no default GW or a non-primary IPv4 from a given interface should be used, this can be adjusted.

    ring0_ip_ordered_detection_list:
      - "{{ hostvars[inventory_hostname]['ansible_'+cluster_net_iface].ipv4.address|default('') }}"
      - "{{ ansible_default_ipv4.address|default('') }}"
      - "{{ ansible_all_ipv4_addresses[0]|default('') }}"
    
    
  • Configure cluster properties (Not mandatory)

    cluster_property:
      - name: required
        node: optional
        value: optional
    
  • Configure cluster resource defaults (Not mandatory)

    cluster_resource_defaults:
      - name: required
        defaults_type: optional
        value: optional
    
  • Configure cluster resources (Not mandatory)

    cluster_resource:
      - name: required
        resource_class: optional
        resource_type: optional
        options: optional
        force_resource_update: optional
        ignored_meta_attributes: optional
        child_name: optional
    
  • Configure cluster order constraints (Not mandatory)

    cluster_constraint_order:
      - resource1: required
        resource1_action: optional
        resource2: required
        resource2_action: optional
        kind: optional
        symmetrical: optional
    
  • Configure cluster colocation constraints (Not mandatory)

    cluster_constraint_colocation:
      - resource1: required
        resource1_role: optional
        resource2: required
        resource2_role: optional
        score: optional
        influence: optional
    
  • Configure cluster location constraints (Not mandatory)

    node based

    cluster_constraint_location:
      - resource: required
        node_name: required
        score: optional
    

    rule based (needs ondrejhome.pcs-modules-2 version 30.0.0 or newer)

    cluster_constraint_location:
      - resource: required
        constraint_id: required
        rule: required
        score: optional
    
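    For example, a minimal sketch of a rule-based location constraint (the constraint id and the rule expression are illustrative; the rule assumes a 'pingd' node attribute maintained by an ocf:pacemaker:ping resource, which is not configured by this snippet):

    cluster_constraint_location:
      - resource: 'apache2'
        constraint_id: 'apache2-needs-connectivity'
        rule: 'not_defined pingd or pingd lt 1'
        score: '-INFINITY'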

Security considerations

Please consider updating the default value for cluster_user_pass.

To protect the sensitive values in variables passed to this role you can use ansible-vault to encrypt them. The recommended approach is to create a separate file with desired variables and their values, encrypt the whole file with ansible-vault encrypt and then include this file in pre_tasks: section so it is loaded before the role is executed. Example below illustrates this whole process.

Creating encrypted_vars.yaml file

    1. Create plain text encrypted_vars.yaml file with your desired secret values
    # cat encrypted_vars.yaml
    ---
    cluster_user_pass: 'cluster-user-pass'
    fence_vmware_login: 'vcenter-user'
    fence_vmware_passwd: 'vcenter-pass'

    2. Encrypt the file using ansible-vault
    # ansible-vault encrypt encrypted_vars.yaml

    3. Verify the new content of encrypted_vars.yaml
    # cat encrypted_vars.yaml
    $ANSIBLE_VAULT;1.1;AES256
    31306461386430...
    

Example playbook that is using values from encrypted_vars.yaml

- hosts: cluster
  pre_tasks:
    - include_vars: encrypted_vars.yaml
  roles:
    - { role: 'ondrejhome.ha-cluster-pacemaker', cluster_name: 'test-cluster' }

NOTE: Encrypting only the variable's value and putting it into vars: is discouraged as it could result in errors like argument 1 must be str, not AnsibleVaultEncryptedUnicode. The approach of encrypting the whole file does not seem to be affected by this issue.

Ansible module_defaults

While this role does not expose all configuration options through variables, one can use module_defaults to change the default values of parameters that this role does not use. Below is a non-exhaustive list of examples where this may become useful.

Example module_default A for setting the totem token to 15 seconds

- hosts: cluster
  module_defaults:
    pcs_cluster:
      token: 15000               # default is 'null' - depends on OS default value

Example module_default B for disabling installation of weak dependencies on EL8+/Fedora systems

- hosts: cluster
  module_defaults:
    yum:
      install_weak_deps: false   # default is 'true'

Example module_default C for disabling installation of package recommends on Debian systems

- hosts: cluster
  module_defaults:
    apt:
      install_recommends: false  # default is 'null' - depends on OS configuration

NOTE: module_defaults only applies to options that are not specified in a task - you cannot override a value that is set by a task in this role; only the values of options that are not used by the role can be changed.

Example Playbook

Example playbook A for creating cluster named 'test-cluster' enabled on boot, with fence_xvm and firewall settings. NOTE: cluster_name is optional and defaults to pacemaker.

- hosts: cluster
  roles:
     - { role: 'ondrejhome.ha-cluster-pacemaker', cluster_name: 'test-cluster' }

Example playbook B for creating a cluster named 'test-cluster' without configuring the firewall and without fence_xvm. For the cluster to get properly authorized it is expected that the firewall is already configured or disabled.

- hosts: cluster
  roles:
     - { role: 'ondrejhome.ha-cluster-pacemaker', cluster_name: 'test-cluster', cluster_firewall: false, cluster_configure_fence_xvm: false }

Example playbook C for creating cluster named vmware-cluster with fence_vmware_soap fencing device.

- hosts: cluster
  vars:
    fence_vmware_ipaddr: 'vcenter-hostname-or-ip'
    fence_vmware_login: 'vcenter-username'
    fence_vmware_passwd: 'vcenter-password-for-username'
  roles:
     - { role: 'ondrejhome.ha-cluster-pacemaker', cluster_name: 'vmware-cluster', cluster_configure_fence_xvm: false, cluster_configure_fence_vmware_soap: true }

Example playbook D for creating cluster named test-cluster where /etc/hosts is not modified:

- hosts: cluster
  roles:
     - { role: 'ondrejhome.ha-cluster-pacemaker', cluster_name: 'test-cluster', cluster_etc_hosts: false }

Example playbook E for creating cluster named vmware-cluster with single fence_vmware_rest fencing device for all cluster nodes.

- hosts: cluster
  vars:
    fence_vmware_ipaddr: 'vcenter-hostname-or-ip'
    fence_vmware_login: 'vcenter-username'
    fence_vmware_passwd: 'vcenter-password-for-username'
  roles:
     - { role: 'ondrejhome.ha-cluster-pacemaker', cluster_name: 'vmware-cluster', cluster_configure_fence_xvm: false, cluster_configure_fence_vmware_rest: true, cluster_configure_stonith_style: 'one-device-per-cluster' }

Example playbook F for creating cluster named aws-cluster with single fence_aws fencing device for all cluster nodes.

- hosts: cluster
  roles:
    - { role: 'ondrejhome.ha-cluster-pacemaker', cluster_name: 'aws-cluster', cluster_configure_fence_xvm: false, cluster_configure_fence_aws: true, cluster_configure_stonith_style: 'one-device-per-cluster', enable_repos: false, fence_aws_region: 'aws-region' }
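
Example playbook G, a sketch of a cluster that also uses fence_kdump (it assumes the kdump service can be started on all nodes, as noted in Role Variables; the cluster name is illustrative and fence_xvm is disabled to keep the example minimal).

- hosts: cluster
  roles:
     - { role: 'ondrejhome.ha-cluster-pacemaker', cluster_name: 'kdump-cluster', cluster_configure_fence_xvm: false, cluster_configure_fence_kdump: true }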

Example playbook for resources configuration.

- hosts: cluster
  vars:
    cluster_property:
      - name: 'maintenance-mode'
        value: 'true'
    cluster_resource:
      - name: 'apache2'
        resource_type: 'systemd:apache2'
        options: 'meta migration-threshold=2 op monitor interval=20s timeout=10s'
      - name: 'cluster_vip'
        resource_type: 'ocf:heartbeat:IPaddr2'
        options: 'ip=192.168.1.150 cidr_netmask=24 meta migration-threshold=2 op monitor interval=20'
    cluster_constraint_colocation:
      - resource1: 'cluster_vip'
        resource2: 'apache2'
        score: 'INFINITY'
    cluster_resource_defaults:
      - name: 'failure-timeout'
        value: '30'
  roles:
     - { role: 'ondrejhome.ha-cluster-pacemaker', cluster_name: 'apache-cluster'}

Inventory file example for CentOS/RHEL/Fedora systems creating basic clusters.

[cluster-centos7]
192.168.22.21 vm_name=fastvm-centos-7.8-21
192.168.22.22 vm_name=fastvm-centos-7.8-22
[cluster-fedora32]
192.168.22.23 vm_name=fastvm-fedora32-23
192.168.22.24 vm_name=fastvm-fedora32-24
[cluster-rhel8]
192.168.22.25 vm_name=fastvm-rhel-8.0-25
192.168.22.26 vm_name=fastvm-rhel-8.0-26

Inventory file example for a cluster using RRP interconnect on a custom interface and/or using a custom IP for RRP

[cluster-centos7-rrp]
192.168.22.27 vm_name=fastvm-centos-7.6-21 rrp_interface=ens6
192.168.22.28 vm_name=fastvm-centos-7.6-22 rrp_ip=192.168.22.29

Inventory file example with two full members and two remote nodes:

[cluster]
192.168.22.21 vm_name=fastvm-centos-7.6-21
192.168.22.22 vm_name=fastvm-centos-7.6-22
192.168.22.23 vm_name=fastvm-centos-7.6-23 cluster_node_is_remote=True
192.168.22.24 vm_name=fastvm-centos-7.6-24 cluster_node_is_remote=True

Inventory file example with fence_aws:

[cluster]
172.31.0.1	instance_id="i-acbdefg1234567890"
172.31.0.2	instance_id="i-acbdefg0987654321"

Old video examples of running the role with defaults:

License

GPLv3

Author Information

To get in touch with the author you can use the email [email protected] or create an issue on GitHub when requesting a feature.

ansible.ha-cluster-pacemaker's People

Contributors

adamaze, chhanz, felixb, jability, jackhodgkiss, markgoddard, mib1185, ofamera-test, olipou, ondrejhome, paktosan, soliverr, spitchag, tleguern


ansible.ha-cluster-pacemaker's Issues

Feature request: Corosync totem timeout support in the role

Hello again,

We have had a hard time with the default timeouts in corosync, which were triggering cluster disjoins and some weird scenarios with Pacemaker. We increased both the totem timeout value and the totem consensus value accordingly on the servers.

We could see that your modules do support at least the totem timeout, but it is not implemented in the role.

I'd like to know if this is something that can be added (either by you, or by me via a PR)?

Thank you
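
For reference, the module_defaults approach shown earlier in this README (Example module_default A) can be used for this today without new role variables; a minimal sketch setting only the totem token (value in milliseconds, as in the README example):

- hosts: cluster
  module_defaults:
    pcs_cluster:
      token: 15000   # totem token in ms; the OS default applies when unset
  roles:
     - { role: 'ondrejhome.ha-cluster-pacemaker' }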

video examples on the usage of the role

Idea:

TODO:

  • default config with latest version of this role (update the previous video)
  • example with custom fencing device configuration
  • example with minimal configuration (no fencing devices, no firewall or repository configuration)

enable PCS GUI

To enable the GUI, edit /etc/sysconfig/pcsd:
vi /etc/sysconfig/pcsd

change PCSD_DISABLE_GUI=true to PCSD_DISABLE_GUI=false

service pcsd restart
You should now be able to get to the GUI via https://(host or dns):2224
You will also need to open iptables to the new port:
iptables -I INPUT -p tcp --dport 2224 -m state --state NEW,ESTABLISHED -j ACCEPT

This only needs to be done on one host to manage the cluster via the GUI.
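
Note: the role now exposes this through the enable_pcsd_gui variable described in Role Variables above; a minimal sketch:

- hosts: cluster
  roles:
     - { role: 'ondrejhome.ha-cluster-pacemaker', enable_pcsd_gui: true }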

how to enable resource from playbook

Hey @OndrejHome,

Looks like you built the best corosync/pacemaker role available. Trying to set up a simple 2-node cluster with a shared IP for failover.

But I can't enable that resource. So I feel I'm missing something from the docs. Role + vars are set up like so:

    - role: ondrejhome.ha-cluster-pacemaker
      tags: cluster
      cluster_name: "mailx-cluster"
      cluster_user_pass: ngzj27rgr4k7
      cluster_firewall: false
      cluster_configure_fence_xvm: false
      cluster_resource:
        - name: "sharedIP-213"
          state: present
          resource_type: "ocf:heartbeat:IPaddr2"
          options: "ip=xxx.yyy.zzz.213 cidr_netmask=24 op monitor interval=10s"

crm_mon gives:

Cluster Summary:
  * Stack: corosync
  * Current DC: mailx-06 (version 2.0.5-ba59be7122) - partition with quorum
  * Last updated: Wed Nov 23 19:25:43 2022
  * Last change:  Wed Nov 23 19:18:44 2022 by root via cibadmin on mailx-05
  * 2 nodes configured
  * 1 resource instance configured

Node List:
  * Online: [ mailx-05 mailx-06 ]

Active Resources:
  * No active resources

while pcs resource config knows that resource properly:

 Resource: sharedIP-213 (class=ocf provider=heartbeat type=IPaddr2)
  Attributes: cidr_netmask=24 ip=xxx.yyy.zzz.213
  Operations: monitor interval=10s (sharedIP-213-monitor-interval-10s)
              start interval=0s timeout=20s (sharedIP-213-start-interval-0s)
              stop interval=0s timeout=20s (sharedIP-213-stop-interval-0s)

and even after pcs resource enable sharedIP-213 that resource remains "inactive"

Any hints? What am I missing?


Ah btw: firewall is shut down on local dev

Attempts to load nonexistent CentOS 8 Stream variables

When targeting CentOS 8 Stream systems the role will fail on the "Include distribution version specific variables" task with the following error message: No file was found when using first_found. Use errors='ignore' to allow this task to be skipped if no files are found.

When looking at the value of ansible_distribution_file_variety and ansible_distribution it is clear that it is now attempting to load vars/centos8.yml which doesn't exist. Fortunately, copying the contents of vars/redhat8.yml to vars/centos8.yml works and resolves the issue.

run cluster in UNICAST

Did some playing around; on CentOS 6, if you execute the following before enabling the cluster:

- name: Configure cluster for Unicast
  command: ccs -f /etc/cluster/cluster.conf --setcman broadcast="no" expected_votes="1" transport="udpu" two_node="1"

the cluster starts up in unicast mode instead of multicast.
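
Note: the role now exposes the transport through the cluster_transport variable described in Role Variables above; a minimal sketch for UDP unicast:

- hosts: cluster
  roles:
     - { role: 'ondrejhome.ha-cluster-pacemaker', cluster_transport: 'udpu' }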

hostvars undefined in delegate_host (commit af46318c8e46245f950888502d72f0bdd3f07493)

It seems that commit af46318 broke the playbook, at least on my setup,
could be related to ansible/awx#7725

setup:
CentOS Linux release 7.8.2003 (Core)

ansible 2.9.11
config file = None
configured module search path = ['/home/centos/.ansible/plugins/modules', '/usr/share/ansible/plugins/modules']
ansible python module location = /usr/local/lib/python3.6/site-packages/ansible
executable location = /usr/local/bin/ansible
python version = 3.6.8 (default, Apr 2 2020, 13:34:55) [GCC 4.8.5 20150623 (Red Hat 4.8.5-39)]

Setup firewall task conditional always true on CentOS

instead of this:

- name: Setup firewall for RHEL/CentOS systems
  include_tasks: "firewall-el{{ ansible_distribution_major_version }}.yml"
  when: cluster_firewall|bool and ansible_distribution == 'RedHat' or ansible_distribution == 'CentOS'

it should look like this:

- name: Setup firewall for RHEL/CentOS systems
  include_tasks: "firewall-el{{ ansible_distribution_major_version }}.yml"
  when:
    - cluster_firewall|bool
    - ansible_distribution == 'RedHat' or ansible_distribution == 'CentOS'

CentOS 8.1 repository setup

CentOS 8.1 is using a separate repository for the pacemaker packages, so compared to previous CentOS 7.x we need to add an extra step of activating that repository.

Service pcs-ruby not started on Debian 11

Hello,

When installing pacemaker on Debian, pcsd relies on the "pcsd-ruby" service, which is not started at install time (but the unit is enabled by default). The authentication then fails. After manually starting the service on each node everything is fine.

Is it possible to make the role start it?
Thank you

group_by task always reports 'changes'

Issue:
Since the inclusion of code that requires group_by, the role always reports the task with group_by as changed, even when changed_when: False is used. This issue is here as a reminder to check how (and if) this can somehow be fixed.
This seems to be limited to newer versions of Ansible (2.9.x/2.10.x); version 2.8.x has no such issue.

For possible contributors:
If you have a good idea on how to fix this issue feel free to open a PR and refer to this issue.

role fails on CentOS/RHEL 7 with `do_truncate() takes at most 4 arguments (5 given)`

The error comes from the old version of the python-jinja2 package. The version present in CentOS/RHEL 7 (2.7.2-2.el7) doesn't support the truncate function with the extra parameter. At least version 2.9 of the python-jinja2 library is needed. Newer versions of python-jinja2 also require a newer version of the python-markupsafe package - version 0.23 seems to work based on preliminary testing.

Affected are the versions of this role with tag 18.0.0 and newer, until this is fixed.

The changed version of truncate was introduced in commit 0347461.

Reproducer:

---
- hosts: localhost
  tasks:
  - debug: msg="{{ '1234567890abcdefgh' | truncate(16, True, '', 0)}}"

TODO:

  • create a COPR repo with updated packages to fix this for EPEL 7 - http://copr.fedorainfracloud.org/
  • document this behaviour better in the role's README - the issue is not related to the ansible version but rather to the libraries around it, which cannot be expressed with the role's min_ansible_version, as even newer versions of ansible with old libraries will fail in the same way
  • test on a RHEL 7.6 system

feature request: ability to configure fence_vmware_rest via role

In addition to the cluster_configure_fence_vmware_soap option and functionality, is it possible to add support for fence_vmware_rest with a similar config option?

I noticed there is a fence_custom which we can probably use, but as Red Hat is now suggesting fence_vmware_rest for environments where fence_vmware_soap is timing out, it would be nice if this was available directly from the role.

And finally, in a Red Hat article (which I cannot find right now - I can dig it out if you think it will help), I have seen recommendations to run only a single fence_vmware_* agent across all the nodes - rather than having one fence_vmware_* agent / stonith resource per node. Could we add a flag to the relevant tasks to allow only one of these devices per cluster if desired?
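
Note: this is now covered by the cluster_configure_fence_vmware_rest and cluster_configure_stonith_style variables (see Example playbook E above); a minimal sketch:

- hosts: cluster
  vars:
    fence_vmware_ipaddr: 'vcenter-hostname-or-ip'
    fence_vmware_login: 'vcenter-username'
    fence_vmware_passwd: 'vcenter-password-for-username'
  roles:
     - { role: 'ondrejhome.ha-cluster-pacemaker', cluster_configure_fence_xvm: false, cluster_configure_fence_vmware_rest: true, cluster_configure_stonith_style: 'one-device-per-cluster' }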

"validate redundant ring ip" false positive without default gateway

Issue

The following check can fail if there is no default gateway configured on cluster nodes.

https://github.com/OndrejHome/ansible.ha-cluster-pacemaker/blob/master/tasks/main.yml#L144-L150

- name: validate redundant ring ip
  fail:
    msg: >-
      invalid redundant ip {{ rrp_ip }} for {{ ansible_hostname }}:
      must be different than default ip {{ ansible_default_ipv4.address|default(ansible_all_ipv4_addresses[0]) }}
  when: rrp_ip is defined and rrp_ip == ansible_default_ipv4.address|default(ansible_all_ipv4_addresses[0])
  any_errors_fatal: true

Without a default gateway ansible_default_ipv4.address is not set, therefore the check falls back to default(ansible_all_ipv4_addresses[0]), but this fallback can cause issues because it may resolve to the wrong IP address.

Example

Let me explain this with an example:

A host (home02) has no default gateway configured and 3 network interfaces (bond[0-2])

[root@home02 ~]# ip r
10.10.0.0/16 dev bond0 proto kernel scope link src 10.10.200.2 metric 302 
10.12.0.0/16 dev bond1 proto kernel scope link src 10.12.200.2 metric 301 linkdown 
192.168.1.0/30 dev bond2 proto kernel scope link src 192.168.1.2 metric 300 
[root@home02 ~]# hostname -I
192.168.1.2 10.12.200.2 10.10.200.2 

Therefore, Ansible detects the following facts:

"ansible_all_ipv4_addresses": [
    "192.168.1.2",
    "10.10.200.2",
    "10.12.200.2"
],
"ansible_default_ipv4": {},

The hostname home02 resolves to 10.10.200.2.

[root@home02 ~]# hostname -i
10.10.200.2
[root@home02 ~]# ping -c1 home02
PING home02.service (10.10.200.2) 56(84) bytes of data.
64 bytes from home02.service (10.10.200.2): icmp_seq=1 ttl=64 time=0.009 ms

The host should be configured with ring0 on bond0 and ring1 on bond2.

node {
    ring0_addr: home02
    ring1_addr: 192.168.1.2
    name: home02
    nodeid: 2
}

Unfortunately, this is not possible because default(ansible_all_ipv4_addresses[0]) resolves to 192.168.1.2, which is incorrect: home02 resolves to 10.10.200.2, so the fallback should be 10.10.200.2.

Therefore, setting rrp_interface=bond2 for home02 fails with:

fatal: [home02]: FAILED! => changed=false 
  msg: 'invalid redundant ip 192.168.1.2 for home02: must be different than default ip 192.168.1.2'

Proposal

I see multiple options:

  1. Replace the fallback check default(ansible_all_ipv4_addresses[0]) with an actual hostname lookup
  2. Remove the fallback and skip the check if ansible_default_ipv4 is unset
  3. Allow to configure an IP address for ring0_addr, too
