ansible-collections / ansible-consul

Ansible role for HashiCorp Consul clusters

Home Page: https://galaxy.ansible.com/ansible-community/consul/

License: BSD 2-Clause "Simplified" License

Shell 19.06% Jinja 78.62% Dockerfile 2.32%
ansible ansible-role consul hacktoberfest hashicorp service-discovery

ansible-consul's People

Contributors

ahjohannessen, arledesma, arouene, bbaassssiiee, brianshumate, calebtonn, chrismckee, danielkucera, dependabot[bot], drewmullen, evilhamsterman, gardar, gofrolist, hwmrocker, jasonneurohr, judy-zz, lanefu, liuxu623, logan2211, marcaurele, mattburgess, misho-kr, nre-ableton, oliverprater, ppuschmann, rodjers, slomo, soloradish, vincele, violuke

ansible-consul's Issues

Add checks to make role idempotent

As @kostyrevaa pointed out in #14, the role does not yet offer any idempotency, which is a touch sad. 😿 Let's identify and update areas where it needs to be checking and not altering anything existing.

Some good starting points were mentioned in #14.

change start_join to retry_join

Hi
If there are no known drawbacks on your side, I would suggest changing “start_join” to “retry_join” in config.json.j2. I have problems deploying a new cluster without retry, because not all nodes start simultaneously.

'dict object' has no attribute 'stdout'

I'm new to this project and am just trying it out now. Errors come up on a few of my servers.

my inventory file:

[cluster_nodes]
adcb-zk-1.vm.elenet.me consul_node_role=bootstrap
adcb-zk-[2:3].vm.elenet.me consul_node_role=server
adcb-mesos-[18:23].vm.elenet.me consul_node_role=client

my playbook:

---
- name: Assemble Consul cluster
  hosts: cluster_nodes 
  remote_user: root 
  roles:
    - { role: brianshumate.consul }

error log:

TASK [brianshumate.consul : Save encryption key] *******************************
skipping: [adcb-mesos-18.vm.elenet.me] => {"changed": false, "skip_reason": "Conditional check failed", "skipped": true}
fatal: [adcb-mesos-18.vm.elenet.me]: FAILED! => {"failed": true, "msg": "the field 'args' has an invalid value, which appears to include a variable that is undefined. The error was: 'dict object' has no attribute 'stdout'\n\nThe error appears to have been in '/etc/ansible/roles/brianshumate.consul/tasks/main.yml': line 89, column 7, but may\nbe elsewhere in the file depending on the exact syntax problem.\n\nThe offending line appears to be:\n\n      run_once: true\n    - name: Save encryption key\n      ^ here\n"}

my starting code:

ansible-playbook -i ./hosts consul.yml --extra-vars "consul_iface={{ansible_default_ipv4.alias}}" -v

any help would be appreciated :)

Lower minimal Debian version

Is it ok if I lower the minimal Debian version from 8.5 to 8.0? The latest Raspbian is 8.0, but it does install without issues after circumventing the Debian version assert.

Cannot rerun the role to setup additional nodes

I ran the role once to set up a node, then added a new node to the inventory and re-ran the playbook.

It failed because of the combination of:

  • the when clause skipped the node I had already set up on the first run (that host, even if skipped, counts for run_once)
  • run_once ensured that the download tasks were not run on the new node
  • the cleanup action from the previous run had removed the downloaded file

Error:

TASK [ansible-consul : Install Consul] *******************************************************************************************************************************************************
skipping: [192.168.1.200]
fatal: [192.168.1.111]: FAILED! => {"changed": false, "failed": true, "msg": "Unable to find '/home/vlegoll/dev/ansible/roles/ansible-consul-brianshumate/files/consul' in expected paths."}

At least this should be documented, IMHO.

Commenting out the first (successful) node in the inventory made the new node download and install OK...

Deploy only consul clients

Hi

Please correct me if I am wrong:
ATM the role is only usable when all consul servers and clients are part of the hostgroup “cluster_nodes/consul_instances (latest)”. Therefore I have to be a server admin also when I only want to deploy a client.
-> I cannot use the role out of the box as a consul client admin.

My quick and dirty solution was:

Playbook:

- hosts: "consulclients.com"
  remote_user: centos
  become: true
  become_user: root
  become_method: sudo
  gather_facts: yes
  roles:
    - consul  
  vars:
    consul_node_role: "client"
    consul_acl_agent_token: "MyClientToken"
    consul_tls_enable: "true"
    consul_join_servers:
      - "1.2.3.4"
      - "1.2.3.5"
      - "1.2.3.6"
    ...

config.json.j2:


    {## LAN Join ##}
    "start_join": [
        {% for server in consul_join_servers %}
            "{{ server }}",
        {% endfor %}
        {% for server in _consul_lan_servers %}
            "{{ hostvars[server]['consul_bind_address'] | ipwrap }}",
        {% endfor %} ],

Is it possible to add a “clients only mode” in one of the next releases?

Building consul_tls_dir doesn't work

defaults/main.yml:
consul_tls_dir: "{{ lookup('env','CONSUL_TLS_DIR') | default('{{ consul_config_path }}/ssl', true) }}"

The default('{{ consul_config_path }}/ssl', true) part is not interpreted as intended because of the single quotes.
My workaround, since I was not able to solve the problem in one line within 5 minutes:

consul_tls_dir_default: "{{consul_config_path}}/ssl"
consul_tls_dir: "{{ lookup('env','CONSUL_TLS_DIR') | default(consul_tls_dir_default, true) }}"

'dict object' has no attribute u'ansible_eth1'

Good afternoon, Brian!

I installed your role from Ansible Galaxy and I'm trying to install consul on a standalone server.

Here's basic information:

  • ansible 2.2.0.0
  • Installing on Ubuntu 16.04 (not officially supported)

Full error:

TASK [brianshumate.consul : Bootstrap configuration] ***************************
fatal: [consul]: FAILED! => {"changed": false, "failed": true, "msg": "AnsibleUndefinedVariable: {{ hostvars[inventory_hostname]['ansible_'+consul_iface]['ipv4']['address'] }}: 'dict object' has no attribute u'ansible_eth1'"}

The playbook:

  - hosts: "consul"
    gather_facts: true
    become: yes
    vars:
      consul_node_role: "bootstrap" 
    roles:
    - { role: brianshumate.consul, tags: ["consul"] }

ansible.cfg

[defaults]
host_key_checking = False

Thanks in advance!

Key not read by servers that need it

Hi Brian,

Thanks for your work on this role!

When adding a server after initial setup, the key is not correctly read by the server.

I think it's a wrong check;
https://github.com/brianshumate/ansible-consul/blob/master/tasks/main.yml#L125

- name: Read key for servers that require it
  set_fact:
    consul_raw_key: "{{ lookup('file', '/tmp/consul_raw.key') }}"
  when: consul_raw_key is not defined and bootstrap_marker.stat.exists

Servers that require it don't have the bootstrap_marker file. So I think the check should be ... and not bootstrap_marker.stat.exists.

Happy to provide a PR if you can confirm.

How To Set a recursor

Greetings!!

Can anyone give me a hint on how to use a recursor? I tried to set the recursor list using:

--extra-vars "consul_recursors=192.168.96.245,192.168.96.212,192.168.96.234" 

However, it's not working, because it ends up in config.json in the following manner:

"recursors": [
"1",
"9",
"2",
"."
...

Any hints ?

Thanks again!!
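
A likely explanation, sketched in Python rather than Jinja: a key=value pair passed with --extra-vars arrives as a single string, and a loop over a string visits one character at a time. The helper name render_recursors below is hypothetical and only mimics a template loop that emits one entry per item; it is not the role's code. Passing the value in JSON form (for example --extra-vars '{"consul_recursors": ["192.168.96.245", "192.168.96.212"]}') should give the loop real list items instead.

```python
import json


def render_recursors(recursors):
    """Mimic a template loop that emits one JSON entry per item (hypothetical helper)."""
    return json.dumps([item for item in recursors])


# key=value form: the whole value is one string, so the loop sees characters
as_string = "192.168.96.245,192.168.96.212"
print(render_recursors(as_string))

# JSON form: the value is a real list, so the loop sees whole addresses
extra_vars = json.loads('{"consul_recursors": ["192.168.96.245", "192.168.96.212"]}')
as_list = extra_vars["consul_recursors"]
print(render_recursors(as_list))
```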

Consul encryption key is not retrieved from existing json config

I have 1 server and N clients, and I am running the playbook repeatedly. On subsequent runs the following error is manifested:

TASK [brianshumate.consul : Read key for servers that require it] **************
skipping: [10.180.45.22]
 [WARNING]: Unable to find '/tmp/consul_raw.key' in expected paths.

fatal: [10.180.45.18]: FAILED! => {"failed": true, "msg": "could not locate file in lookup: /tmp/consul_raw.key"}

It appears the playbook correctly detects an existing config file on the bootstrapping server (10.180.45.18), but it does not read it from that server. Instead it tries to read it from a host that is not designated as the bootstrapping server (10.180.45.22).

I suspect the culprit is that the playbook uses a run_once clause on tasks inside a block that has a when condition. The run_once clause is applied first and picks one host, which may or may not be a bootstrapping server, and only then is the when clause applied. If that host is not a server, there is no config to read, and later on the playbook fails with the above message.

I was able to get past this error by removing the run_once clause from the tasks enclosed in the block. I will create a PR shortly for review.
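
The suspected interaction can be modeled in a few lines of Python. This is only a toy model of the behavior described above, not Ansible's actual scheduler: the host order, the has_config flag, and both function names are made up for illustration.

```python
def run_once_then_when(hosts, condition):
    """Pick the first host (run_once), then apply the condition (when)."""
    chosen = hosts[0]
    return [chosen] if condition(chosen) else []


def when_only(hosts, condition):
    """Without run_once, every host is checked against the condition."""
    return [host for host in hosts if condition(host)]


# Assumed ordering: a host without an existing config happens to come first
hosts = [
    {"name": "10.180.45.22", "has_config": False},
    {"name": "10.180.45.18", "has_config": True},
]


def has_config(host):
    return host["has_config"]


print(run_once_then_when(hosts, has_config))  # the task ends up running nowhere
print(when_only(hosts, has_config))           # the task runs on the host with a config
```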

Not compatible with Ansible 2.3.0.0 with python3 on Ubuntu 16.04

TASK [brianshumate.consul : Check Consul package checksum file] ****************
fatal: [4cas]: FAILED! => {"changed": false, "failed": true, "module_stderr": "/bin/sh: /usr/bin/python3: No such file or directory\n", "module_stdout": "", "msg": "MODULE FAILURE", "rc": 0}

ansible --version
ansible 2.3.0.0

vars:
ansible_python_interpreter: "/usr/bin/python3"

Role does not work on second run

When I try to run the role a second time I get the following error.

Playbook:

- name: Install & Configure Consul Cluster
  hosts: consuls
  become: yes 
  roles:
    - role: consul
      consul_iface: eth0
      consul_group_name: consuls

Error:

 TASK [consul : Writing key locally to share with other servers that are new] ***
fatal: [consul-host-0]: FAILED! => {"failed": true, "msg": "Failed to get information on remote file (/tmp/consul_raw.key): MODULE FAILURE"}
fatal: [consul-host-2]: FAILED! => {"failed": true, "msg": "Failed to get information on remote file (/tmp/consul_raw.key): MODULE FAILURE"}
fatal: [consul-host-1]: FAILED! => {"failed": true, "msg": "Failed to get information on remote file (/tmp/consul_raw.key): MODULE FAILURE"}
  to retry, use: --limit @/var/lib/awx/projects/_6__consulting_bootcamp/ansible/playbooks/setup_consul.retry

I was able to resolve this by setting become: no on the local_action tasks.

value for bootstrap_expect - autoscaling

Hi

ATM the config looks like this:
"bootstrap_expect": {{ _consul_lan_servers | length }},

I think there is no need to set this value to the current number of server instances.
At some point I will try to implement autoscaling for the servers, so I need a fixed value for bootstrap_expect (3 or 5) that does not change when server instances are added or removed, at least I think so.
There is no benefit in setting "bootstrap_expect": "6" when the only useful values are 5 (or 3).

Could you implement an integer key that overrides bootstrap_expect if it is set?
Something like that:
"bootstrap_expect": {{ consul_static_bootstramp | default(_consul_lan_servers | length) }},

Maybe I am wrong and this role is only meant to deploy the first "core" of servers, and additional servers have to be deployed with a customized role.
Is anyone here using automatic or manual scaling (up and down) of an existing Consul server environment?
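
The fallback proposed above amounts to "use the fixed value when one is given, otherwise count the servers". A minimal Python sketch of that rule, with hypothetical function and parameter names standing in for the template variables:

```python
def bootstrap_expect(lan_servers, static_value=None):
    """Return the fixed override when set, otherwise the current server count."""
    if static_value is not None:
        return static_value
    return len(lan_servers)


servers = ["s1", "s2", "s3", "s4", "s5", "s6"]  # made-up server names

print(bootstrap_expect(servers))                  # no override: follows the server count
print(bootstrap_expect(servers, static_value=3))  # override: stays fixed while scaling
```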

consul_servers only works if consul_node_role is a fact

I think I found the problem you were having with the consul_node_role. When a default is used, the variable isn't available as a fact. But the used loops and look-ups need this as a fact.

Simply setting it as a fact should solve the problem.

I'll add a fix to the config layout PR.

Is there a way to use `consul keygen` & `encrypt`?

Hi, Brian;

I love this role and I've used it quite a bit both for learning and in production. I'm in the process of switching over to TLS but for the short term is there a way to use the encrypt value in the config?

My current config looks like this:

{
    "bootstrap": true,
    "server": true,
    "datacenter": "stuff",
    "data_dir": "/var/consul",
    "encrypt": "QyYc0lVJXcNl3idBPREjIYww==",
    "log_level": "INFO",
    "statsd_addr": "127.0.0.1:8125",
    "enable_syslog": true,
    "start_join": ["consul2.server.prod", "consul3.server.prod"]
}

Thanks in advance!

join advertise addresses

Hi
Is it possible to implement an option to choose whether start|retry_join in config.json.j2 uses the advertise addresses of the cluster partners?
The current implementation does not work between servers that use advertise addresses, because the bind addresses are still used to join the cluster (LAN and WAN).

Present:

    {## LAN Join ##}
    "start_join": [
        {% for server in _consul_lan_servers %}
           "{{ hostvars[server]['consul_bind_address'] | ipwrap }}",
        {% endfor %} ],

Suggestion:

    {## LAN Join ##}
    "retry_join": [
        {% for server in _consul_lan_servers %}
            "{{ hostvars[server]['consul_advertise_address'] | ipwrap }}",
        {% endfor %} ],

This change should be enough, because the default of consul_advertise_address is:
consul_advertise_address: "{{ consul_bind_address }}"

Same story for {## WAN Join ##} .

Changing cluster_nodes to consul_cluster_nodes

Hello Brian,

I'm currently using your roles for nomad and consul and I find it weird to have one role with cluster_nodes as a mandatory host group while having the nomad role with nomad_cluster_nodes.

This tiny issue makes it awkward to use both in the same playbook (the inventory is then explicit for nomad but not for consul).

Do you think it would be possible to change this? (I could make the PR if needed.)

'consul_node_role' is undefined

TASK [brianshumate.consul : Create cluster groupings] **************************
fatal: [127.0.0.1]: FAILED! => {"failed": true, "msg": "the field 'args' has an invalid value, which appears to include a variable that is undefined. The error was: 'consul_node_role' is undefined\n\nThe error appears to have been in '/etc/ansible/roles/brianshumate.consul/tasks/main.yml': line 24, column 3, but may\nbe elsewhere in the file depending on the exact syntax problem.\n\nThe offending line appears to be:\n\n\n- name: Create cluster groupings\n  ^ here\n"}

Changed to consul_node_name?

consul_bind_address invalid value

Hi,

When installing a cluster with vagrant each node with role=server has the error:

fatal: [node2]: FAILED! => {"failed": true, "msg": "the field 'args' has an invalid value, which appears to include a variable that is undefined. The error was: 'dict object' has no attribute 'consul_bind_address'\n\nThe error appears to have been in '/etc/ansible/roles/brianshumate.consul/tasks/config.yml': line 4, column 3, but may\nbe elsewhere in the file depending on the exact syntax problem.\n\nThe offending line appears to be:\n\n\n- name: Create configuration\n ^ here\n"}

The node with the bootstrap role is configured successfully.

Config files and dir structure

Currently the following config dir structure is used;

/etc/consul
└── .consul_bootstrapped
/etc/consul.d
├── bootstrap
│   └── config.json
├── client
│   └── config.json
└── server
    └── config.json

I would like to change this to a more common layout, where the agent config resides in the consul dir and all files in consul.d are included automatically.

/etc/consul
├── {{ consul_node_role }}.json
└── .consul_bootstrapped
/etc/consul.d
└── random_config.json

In this example I only added the config for the specific role of the agent. Is there a reason why the role currently provides all 3 configs? I would provide bootstrap/server to all servers and only client to clients.

Happy to provide a PR.

Support Upgrading Consul

I noticed that when I change the variable consul_version and run the role against an existing Consul cluster, each node still reports the old version in consul members.

When I checked on the machine I noticed that the binaries were not upgraded to the specified consul version.

The main issue I found is the "Check for existing Consul binary" task in tasks/main.yml.

With the guide from consul at https://www.consul.io/docs/upgrading.html I hacked together something that is not idempotent and uses a variable called consul_install_upgrade. This bypasses the stated check and runs additional tasks such as systemctl daemon-reload on Ubuntu.

I think further restructuring of this hack is required to ideally make the process idempotent.

So my question: are there plans to support upgrading Consul via the Ansible role?

Binary gets replaced each role run

Hello @brianshumate,

first thanks for creating this role. Makes the first hands on consul much easier!

We're using consul_install_remotely: true and I've noticed that the consul binary gets replaced each time the role gets applied to the servers.
This is a bit confusing for us, as the installation is already done, so why replace it each time?
Mainly, these tasks report a change on each run even when consul is already up and running:

  • Read Consul package checksum
  • Unarchive Consul and install binary (replaced the binary)

It is also curious that the Cleanup task at the end of install_remote.yml does not trigger either.

Would it be possible to alter the role so it performs the installation only if consul is absent?
Replacing the binary when consul_version differs from the installed version might not be a good idea, as a manual upgrade seems the safer way.

Thank you very much!

Best regards

Jard

issue with downloading files locally

My control machine is a mac - downloading these files locally is causing errors with the get_url module.

fatal: [consul01]: FAILED! => {"changed": false, "failed": true, "msg": "Failed to validate the SSL certificate for releases.hashicorp.com:443. Make sure your managed systems have a valid CA certificate installed. You can use validate_certs=False if you do not need to confirm the servers identity but this is unsafe and not recommended. Paths checked for this platform: /etc/ssl/certs, /etc/ansible, /usr/local/etc/openssl"}

I have tried addressing the issue with what google returns, but haven't been able to get past this error. I will issue a PR which should let all the file transferring occur on the remote host - which lets me get past it.

bootstrap without bootstrap node

From what I read about the -bootstrap-expect option in the Consul docs, I'm under the impression that it is the better choice for normal clusters (that is, not for testing).

It doesn't require a node to stay in bootstrap mode all the time, or to be restarted in normal server mode after bootstrapping (something this role currently doesn't do, but which would be good practice if I understand correctly).

I'd like to add an option to choose between both bootstrapping techniques.

What's your opinion on this @brianshumate?

discussion: default acl_agent_token for servers

Hi

Consul servers are dropping “[WARN] agent: Node info update blocked by ACLs” messages without adding an "acl_agent_token".

See:
thread

ATM I workaround the problem with "acl_agent_token": <acl_master_token>
I would like to discuss how to solve this within the role.

I see these ways:

  1. "acl_agent_token": <acl_master_token>
    Very simple to implement in the role. But are there any security issues?
{## Server ACLs ##}
{% if consul_acl_enable %}
 "acl_default_policy": "{{ consul_acl_default_policy }}",
 "acl_master_token": "{{ consul_acl_master_token }}",
 "acl_agent_token": "{{ consul_acl_master_token }}",

  2. Own token without restart
    Give/Gen an acl_agent_token for the servers and add it also to the configs.
    Start the servers
    Call the API with the master token and add the necessary acl rule.

  3. Own token with restart
    Start the server
    Call the API with the master token, add the necessary acl rule and store it locally
    Add acl_agent_token configs
    Restart servers

Maybe someone knows a better way?

Not disabling bootstrap mode

After a successful deployment, I still see:
"bootstrap": true,
in the config file of the server which had consul_node_role=bootstrap passed in ansible inventory file.

I'm not seeing that happening:
Be aware that for clustering, the included site.yml does the following:

Executes consul role (installs Consul and bootstraps cluster)
Reconfigures bootstrap node to run without bootstrap-expect setting
Restarts bootstrap node

Have I missed something?

Condition "Fail if more then one bootstrap server is defined" fails

Env

My inventory:

[consulservers]
80.some.ip.1 consul_node_role=bootstrap
80.some.ip.2 consul_node_role=server

My group_vars:

consul_version: '0.8.4'
consul_ui: yes

consul_bootstrap_expect: yes
consul_dnsmasq_enable: yes

My playbook:

- name: "Setting up consul cluster"
  hosts: consulservers
  roles:
    - role: brianshumate.consul

I am using:

» ansible --version
ansible 2.3.1.0
config file = /Users/sobolev/Documents/configs/callocal_ansible/ansible.cfg
configured module search path = Default w/o overrides
python version = 2.7.13 (default, Dec 17 2016, 23:03:43) [GCC 4.2.1 Compatible Apple LLVM 8.0.0 (clang-800.0.42.1)]

- brianshumate.consul, v1.24.1

Error

TASK [brianshumate.consul : Fail if more then one bootstrap server is defined] ***
fatal: [77.244.214.252]: FAILED! => {"failed": true, "msg": "The conditional check '_consul_bootstrap_servers | length > 1' failed. The error was: error while evaluating conditional (_consul_bootstrap_servers | length > 1): {% set __consul_bootstrap_servers = [] %}{% for server in _consul_lan_servers %}{% set _consul_node_role = hostvars[server]['consul_node_role'] | default('client', true) %}{% if _consul_node_role == 'bootstrap' %}{% if __consul_bootstrap_servers.append(server) %}{% endif %}{% endif %}{% endfor %}{{ __consul_bootstrap_servers }}: {% set __consul_lan_servers = [] %}{% for server in consul_servers %}{% set _consul_datacenter = hostvars[server]['consul_datacenter'] | default('dc1', true) %}{% if _consul_datacenter == consul_datacenter %}{% if __consul_lan_servers.append(server) %}{% endif %}{% endif %}{% endfor %}{{ __consul_lan_servers }}: {% set _consul_servers = [] %}{% for host in groups[consul_group_name] %}{% set _consul_node_role = hostvars[host]['consul_node_role'] | default('client', true) %}{% if ( _consul_node_role == 'server' or _consul_node_role == 'bootstrap') %}{% if _consul_servers.append(host) %}{% endif %}{% endif %}{% endfor %}{{ _consul_servers }}: 'dict object' has no attribute u'cluster_nodes'\n\nThe error appears to have been in '/usr/local/etc/ansible/roles/brianshumate.consul/tasks/asserts.yml': line 68, column 3, but may\nbe elsewhere in the file depending on the exact syntax problem.\n\nThe offending line appears to be:\n\n\n- name: Fail if more then one bootstrap server is defined\n ^ here\n"}
to retry, use: --limit @/Users/sobolev/Documents/configs/callocal_ansible/main.retry

I tried

  1. To change consul_node_role to every possible value.
  2. To disable consul_bootstrap_expect

No luck.

get_url failing on releases.hashicorp.com

OS: Ubuntu 14.04

Seems to be an issue with Python 2.7.6 and TLS 1.2-only servers?

TASK [brianshumate.consul : Get Consul package checksum file] ******************
fatal: [10.0.12.158]: FAILED! => {"changed": false, "failed": true, "msg": "Failed to validate the SSL certificate for releases.hashicorp.com:443. Make sure your managed systems have a valid CA certificate installed. You can use validate_certs=False if you do not need to confirm the servers identity but this is unsafe and not recommended. Paths checked for this platform: /etc/ssl/certs, /etc/ansible, /usr/local/etc/openssl"}

Issues with install -> Get Consul package checksum file

- name: Get Consul package checksum file
  become: no
  connection: local
  get_url:
    url: "{{ consul_checksum_file_url }}"
    dest: "{{ role_path }}/files/consul_{{ consul_version }}_SHA256SUMS"
  run_once: true
  tags: installation
  when: consul_checksum.stat.exists == False

I realize this is executing on my local machine, and I can use wget and curl to download the file successfully.

If I run it as is, I get the error:
Failed to validate the SSL certificate for releases.hashicorp.com:443

If I run with validate_certs set to no, I get this error:
Request failed: <urlopen error EOF occurred in violation of protocol (_ssl.c:590)>

If I download the file manually and place it in the desired location, I get similar errors when I get to downloading the Consul binary.

Any thoughts on a solution/workaround?

Failing at "Create configuration" task

I'm trying to run this role using everything as default.

TASK [ansible-consul : Create configuration] ******************************************************************************************************************************************************************
fatal: [52.90.156.175]: FAILED! => {"failed": true, "msg": "An unhandled exception occurred while running the lookup plugin 'template'. Error was a <class 'ansible.errors.AnsibleError'>, original message: {{ lookup('env','CONSUL_TLS_DIR') | default({{ consul_config_path }}/ssl, true) }}: template error while templating string: expected token ':', got '}'. String: {{ lookup('env','CONSUL_TLS_DIR') | default({{ consul_config_path }}/ssl, true) }}"}
fatal: [54.90.196.123]: FAILED! => {"failed": true, "msg": "An unhandled exception occurred while running the lookup plugin 'template'. Error was a <class 'ansible.errors.AnsibleError'>, original message: {{ lookup('env','CONSUL_TLS_DIR') | default({{ consul_config_path }}/ssl, true) }}: template error while templating string: expected token ':', got '}'. String: {{ lookup('env','CONSUL_TLS_DIR') | default({{ consul_config_path }}/ssl, true) }}"}
fatal: [54.164.166.113]: FAILED! => {"failed": true, "msg": "An unhandled exception occurred while running the lookup plugin 'template'. Error was a <class 'ansible.errors.AnsibleError'>, original message: {{ lookup('env','CONSUL_TLS_DIR') | default({{ consul_config_path }}/ssl, true) }}: template error while templating string: expected token ':', got '}'. String: {{ lookup('env','CONSUL_TLS_DIR') | default({{ consul_config_path }}/ssl, true) }}"}

Question: Deploying Consul Using Ansible

Greetings!!

Trying to deploy using ansible-consul, but for some reason it gathers facts and then exits.

site.yml

- name: cluster1
  hosts: consul_instances
  any_errors_fatal: true
  become: true
  become_user: root

and my hosts file is:

hosts

[consul_instances]
server83.localnet.net ansible_ssh_port=22 ansible_ssh_user=root consul_node_role=server consul_bootstrap_expect=true
server84.localnet.net ansible_ssh_port=22 ansible_ssh_user=root consul_node_role=server consul_bootstrap_expect=true
server85.localnet.net ansible_ssh_port=22 ansible_ssh_user=root consul_node_role=server consul_bootstrap_expect=true
server86.localnet.net ansible_ssh_port=22 ansible_ssh_user=root consul_node_role=server consul_bootstrap_expect=true
server87.localnet.net ansible_ssh_port=22 ansible_ssh_user=root consul_node_role=server consul_bootstrap_expect=true

Now, when deploying this is what I am issuing:

ansible-playbook -i hosts site.yml --extra-vars "consul_acl_master_token=95FBC040-C484-XXXXXXXX" --extra-vars "consul_datacenter=dc1" --extra-vars "consul_default_port_dns=53"

Output:

PLAY [cluster1] **********************************************************************************************************************************************************************

TASK [Gathering Facts] ******************************************************************************************************************************************************************************
ok: [server85.localnet.net]
ok: [server83.localnet.net]
ok: [server87.localnet.net]
ok: [server86.localnet.net]
ok: [server84.localnet.net]

PLAY RECAP ******************************************************************************************************************************************************************************************
server83.localnet.net  : ok=1    changed=0    unreachable=0    failed=0
server84.localnet.net  : ok=1    changed=0    unreachable=0    failed=0
server85.localnet.net  : ok=1    changed=0    unreachable=0    failed=0
server86.localnet.net  : ok=1    changed=0    unreachable=0    failed=0
server87.localnet.net  : ok=1    changed=0    unreachable=0    failed=0

Why is it not deploying? What am I doing wrong?

Thanks again!

Alex

Ubuntu 14.04 init file

I had some trouble getting the consul service to start up on a 14.04 server, but finally narrowed things down to some quotation problems in consul_debianinit.j2.

First, ${DAEMON_ARGS} apparently shouldn't be quoted. According to this post, bash and ksh on Linux will add in extra quotes that mess things up. So this section:

start-stop-daemon --start --quiet --pidfile "${PIDFILE}" --exec "${DAEMON}" --chuid "${USER}" --background --make-pidfile -- \
        "${DAEMON_ARGS}" \
        || return 2

should become this:

start-stop-daemon --start --quiet --pidfile "${PIDFILE}" --exec "${DAEMON}" --chuid "${USER}" --background --make-pidfile -- \
        ${DAEMON_ARGS} \
        || return 2
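
The effect of the quotes can be illustrated with Python's shlex module, which splits a command line using POSIX-shell-like quoting rules. This is an analogy rather than the init script itself, and the argument string is an assumed example of what DAEMON_ARGS might hold:

```python
import shlex

# Assumed example value; the real DAEMON_ARGS comes from the init template
daemon_args = "agent -config-dir /etc/consul.d -data-dir /var/consul"

# Quoted, as in the original template: the whole value arrives as one argument
quoted = shlex.split(f'consul "{daemon_args}"')

# Unquoted, as in the fix: the value is split into separate arguments
unquoted = shlex.split(f"consul {daemon_args}")

print(quoted)
print(unquoted)
```

With the quoted form, consul would receive a single argument containing spaces, so an option such as -data-dir is never seen as an option of its own.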

Second, the process check right after that section was missing a trailing quote mark on the pidfile and the user parameters. After I got the first problem fixed, the service would start correctly but these checks wouldn't run. So this:

 if ! start-stop-daemon --quiet --stop --test --pidfile "${PIDFILE} --exec "${DAEMON}" --user "${USER}; then

should be this:

if ! start-stop-daemon --quiet --stop --test --pidfile "${PIDFILE}" --exec "${DAEMON}" --user "${USER}"; then

Last, thanks to all of this I found an edge case for the "Consul up?" playbook task where it incorrectly thought the process was running. Checking for the existence of the pidfile does work, except that in my case, the pidfile stuck around even after the consul process exited. Since the DAEMON_ARGS weren't being passed properly, I would just get a "==> Must specify data directory using -data-dir" error from consul, and it would immediately exit. I think the pidfile was created because the consul process did technically start properly, but thanks to the immediate exit, it was never removed with a "service consul stop". I don't really have a good way to improve that.

I did test my template changes and was able to successfully deploy to new 14.04 servers. Thanks for the good work, it's made my life easier as I start to learn and roll out consul.

Issues with consul_iface

Hello!

Thank you for the module! One thing I am not understanding is the "consul_iface" variable.
Several of my machines do not have the same interface name, so I have attempted to set "consul_iface" to "{{ ansible_default_ipv4.interface }}"

When your module runs, it then fails with the following error.

fatal: [10.11.1.46]: FAILED! => {
    "changed": false, 
    "failed": true, 
    "invocation": {
        "module_args": {
            "dest": "/etc/consul.d/client/config.json", 
            "group": "bin", 
            "owner": "consul", 
            "src": "config_client.json.j2"
        }, 
        "module_name": "template"
    }, 
    "msg": "AnsibleUndefinedVariable: 'dict object' has no attribute u'ansible_officelanbr0'"
}

It appears that https://github.com/brianshumate/ansible-consul/blob/master/templates/config_client.json.j2#L29 is properly trying to find the var (see below for the output of ansible -m setup):

       "ansible_officelanbr0": {                                                                                                                           
            "active": true,                                                                                                                                 
            "device": "officelanbr0",                                                                                                                       
            "features": {                                                                                                                                   
                "busy_poll": "off [fixed]",                                                                                                                 
                "fcoe_mtu": "off [fixed]",                                                                                                                  
                "generic_receive_offload": "on",                                                                                                            
                "generic_segmentation_offload": "on",                                                                                                       
                "highdma": "on",                                                                                                                            
                "hw_tc_offload": "off [fixed]",                                                                                                             
                "l2_fwd_offload": "off [fixed]",                                                                                                            
                "large_receive_offload": "off [fixed]",                                                                                                     
                "loopback": "off [fixed]",                                                                                                                  
                "netns_local": "on [fixed]",                                                                                                                
                "ntuple_filters": "off [fixed]",                                                                                                            
                "receive_hashing": "off [fixed]",                                                                                                           
                "rx_all": "off [fixed]",                                                                                                                    
                "rx_checksumming": "off [fixed]",                                                                                                           
                "rx_fcs": "off [fixed]",                                                                                                                    
                "rx_vlan_filter": "off [fixed]",                                                                                                            
                "rx_vlan_offload": "off [fixed]",                                                                                                           
                "rx_vlan_stag_filter": "off [fixed]",                                                                                                       
                "rx_vlan_stag_hw_parse": "off [fixed]",                                                                                                     
                "scatter_gather": "on",                                                                                                                     
                "tcp_segmentation_offload": "on",                                                                                                           
                "tx_checksum_fcoe_crc": "off [fixed]",                                                                                                      
                "tx_checksum_ip_generic": "on",                                                                                                             
                "tx_checksum_ipv4": "off [fixed]",                                                                                                          
                "tx_checksum_ipv6": "off [fixed]",                                                                                                          
                "tx_checksum_sctp": "off [fixed]",                                                                                                          
                "tx_checksumming": "on",                                                                                                                    
                "tx_fcoe_segmentation": "off [requested on]",                                                                                               
                "tx_gre_segmentation": "on",                                                                                                                
                "tx_gso_robust": "off [requested on]",                                                                                                      
                "tx_ipip_segmentation": "on",                                                                                                               
                "tx_lockless": "on [fixed]",                                                                                                                
                "tx_nocache_copy": "off",                                                                                                                   
                "tx_scatter_gather": "on",                                                                                                                  
                "tx_scatter_gather_fraglist": "on",                                                                                                         
                "tx_sit_segmentation": "on",                                                                                                                
                "tx_tcp6_segmentation": "on",                                                                                                               
                "tx_tcp_ecn_segmentation": "on",                                                                                                            
                "tx_tcp_segmentation": "on",                                                                                                                
                "tx_udp_tnl_segmentation": "on",                                                                                                            
                "tx_vlan_offload": "on",                                                                                                                    
                "tx_vlan_stag_hw_insert": "on",                                                                                                             
                "udp_fragmentation_offload": "on",                                                                                                          
                "vlan_challenged": "off [fixed]"                                                                                                            
            },                                                                                                                                              
            "id": "8000.10c37b699bd4",                                                                                                                      
            "interfaces": [                                                                                                                                 
                "vethO17AQ9",                                                                                                                               
                "enp10s0",                                                                                                                                  
                "veth42ECAF"                                                                                                                                
            ],                                                                                                                                              
            "ipv4": {                                                                                                                                       
                "address": "10.11.1.46",                                                                                                                    
                "broadcast": "10.11.1.255",                                                                                                                 
                "netmask": "255.255.255.0",                                                                                                                 
                "network": "10.11.1.0"                                                                                                                      
            },                                                                                                                                              
            "ipv6": [                                                                                                                                       
                {                                                                                                                                           
                    "address": "fe80::12c3:7bff:fe69:9bd4",                                                                                                 
                    "prefix": "64",                                                                                                                         
                    "scope": "link"                                                                                                                         
                }                                                                                                                                           
            ],                                                                                                                                              
            "macaddress": "<removed>",                                                                                                              
            "mtu": 1500,                                                                                                                                    
            "promisc": false,                                                                                                                               
            "stp": false,                                                                                                                                   
            "type": "bridge"                                                                                                                                
        }

CentOS 6 question

Could you please clarify why CentOS 6 is banned?

TASK [brianshumate.consul : Fail if not a new release of Red Hat / CentOS] *****
fatal: [consul1.local]: FAILED! => {"changed": false, "failed": true, "msg": "6.8 is not an acceptable version of CentOS for this role"}
fatal: [consul3.local]: FAILED! => {"changed": false, "failed": true, "msg": "6.8 is not an acceptable version of CentOS for this role"}
fatal: [consul2.local]: FAILED! => {"changed": false, "failed": true, "msg": "6.8 is not an acceptable version of CentOS for this role"}

Fail if more than one bootstrap server is defined

I'm trying to run this role as-is, without overriding any of the vars or parameters it provides. However, when my playbook runs, I get this error:

fatal: [node_1]: FAILED! => {"failed": true, "msg": "The conditional check '_consul_bootstrap_servers | length > 1' failed. The error was: error while evaluating conditional (_consul_bootstrap_servers | length > 1): {% set __consul_bootstrap_servers = [] %}{% for server in _consul_lan_servers %}{% set _consul_node_role = hostvars[server]['consul_node_role'] | default('client', true) %}{% if _consul_node_role == 'bootstrap' %}{% if __consul_bootstrap_servers.append(server) %}{% endif %}{% endif %}{% endfor %}{{ __consul_bootstrap_servers }}: {% set __consul_lan_servers = [] %}{% for server in consul_servers %}{% set _consul_datacenter = hostvars[server]['consul_datacenter'] | default('dc1', true) %}{% if _consul_datacenter == consul_datacenter %}{% if __consul_lan_servers.append(server) %}{% endif %}{% endif %}{% endfor %}{{ __consul_lan_servers }}: {% set _consul_servers = [] %}{% for host in groups[consul_group_name] %}{% set _consul_node_role = hostvars[host]['consul_node_role'] | default('client', true) %}{% if ( _consul_node_role == 'server' or _consul_node_role == 'bootstrap') %}{% if _consul_servers.append(host) %}{% endif %}{% endif %}{% endfor %}{{ _consul_servers }}: 'dict object' has no attribute u'consul_instances'\n\nThe error appears to have been in '[REDACTED]/roles/brianshumate.consul/tasks/asserts.yml': line 68, column 3, but may\nbe elsewhere in the file depending on the exact syntax problem.\n\nThe offending line appears to be:\n\n\n- name: Fail if more than one bootstrap server is defined\n ^ here\n"}

Naming of repo should be brianshumate.consul

Hey Brian - the repo name should match the role name on Ansible Galaxy to avoid issues: e.g. brianshumate.consul instead of ansible-consul.

The reason I say this is that if anyone installs your role via the ansible-galaxy CLI instead of git clone, the example won't work (it has - { role: ansible-consul } instead of role: brianshumate.consul).

user 'consul' with ID 1042 created

The role is templated to use {{ consul_user }} everywhere, but one of the first things that tasks/main.yml does is to create a user with a hardcoded name (consul) and a hardcoded ID (1042). This smells like a bug to me. At the very least, the hardcoded name should probably be changed to "{{ consul_user }}". I also submit that the user ID should not be hardcoded, and that the consul user should be a system account, as follows:

- name: Add Consul user
  user:
    name: "{{ consul_user }}"
    comment: "Consul user"
    group: "{{ consul_group }}"
    system: yes

What are your thoughts?

Ubuntu Trusty (14.04) support

The galaxy page says that Trusty is supported, but Trusty still uses upstart and upstart support has been removed from this role. Full support for systemd on Ubuntu started with version 15.04, see here.

Role is not idempotent

Strangely enough, nobody has pointed out yet that this role is not idempotent.
With every Ansible run, it will again execute:

TASK [brianshumate.consul : Bootstrap configuration
TASK [brianshumate.consul : Client configuration
TASK [brianshumate.consul : Server configuration
TASK [brianshumate.consul : systemd script

not to mention
TASK [brianshumate.consul : Reconfigure bootstrap node (systemd)
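
A common way to check idempotency is to run the playbook twice and require that the second run reports no changes. Here is a minimal sketch, assuming the usual changed=N counters in the PLAY RECAP output. The recap_is_idempotent helper is hypothetical and the recap lines are made-up sample input, not output from this role.

```shell
#!/bin/sh
# Hypothetical helper: read ansible-playbook output on stdin and succeed
# only if no line reports a non-zero "changed=" counter.
recap_is_idempotent() {
    ! grep -Eq 'changed=[1-9][0-9]*'
}

# Made-up sample recap lines for illustration only.
clean='node_1 : ok=12 changed=0 unreachable=0 failed=0'
dirty='node_1 : ok=12 changed=4 unreachable=0 failed=0'

for recap in "$clean" "$dirty"; do
    if printf '%s\n' "$recap" | recap_is_idempotent; then
        echo "idempotent:     $recap"
    else
        echo "not idempotent: $recap"
    fi
done
```

In practice the second run of something like ansible-playbook -i inventory.cfg consul.yml would be piped into the helper instead of the sample strings.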

Multi Datacenter Features?

This is just a discussion/thinking issue around what kinds of support we might like to add in for multiple datacenters given the initial bits to differentiate between LAN and WAN servers which were recently added by @groggemans.

I'd like to get some ideas about any features which would be helpful to add, like:

If there are ideas for what would be handy and how we might implement it, let's coordinate here before hitting the YAML and so on. 😄

Starting Consul under well known ports

Off topic: thank you for the great role!

I'm using Red Hat and have to run Consul within the well-known port range (<1024). But there are two problems:

  • SELinux
  • Non-root users are not allowed to open well-known ports.

I still have no solution for SELinux because it's new to me. This could help with the root problem:

# - name: TODO: allow well known ports in selinux

- name: Allow well known ports for consul
  capabilities:
    path: "{{ consul_binary }}"
    capability: CAP_NET_BIND_SERVICE=+eip
    state: present
  when: (consul_ports.rpc|int < 1024)

# 1024 is the first unprivileged port, so it belongs to this branch as well
- name: Forbid well known ports for consul
  capabilities:
    path: "{{ consul_binary }}"
    capability: CAP_NET_BIND_SERVICE=-eip
    state: absent
  when: (consul_ports.rpc|int >= 1024)
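
By default on Linux the privileged range is ports 0 to 1023, so port 1024 itself needs no extra capability, and any of Consul's listeners (not only RPC) could fall into that range. The sketch below shows the boundary check for a list of ports. The needs_bind_capability helper and the port numbers are illustrative assumptions, not part of the role.

```shell
#!/bin/sh
# Hypothetical helper: succeed if at least one of the given ports lies in
# the privileged range (below 1024), meaning a non-root process would need
# CAP_NET_BIND_SERVICE to bind it.
needs_bind_capability() {
    for port in "$@"; do
        if [ "$port" -lt 1024 ]; then
            return 0
        fi
    done
    return 1
}

# Illustrative port sets only.
if needs_bind_capability 8300 53; then
    echo "8300 53: capability needed"
fi

if ! needs_bind_capability 1024 8500; then
    echo "1024 8500: no capability needed"
fi
```

The same comparison could drive the when: conditions of the two tasks, checking every configured port and not a single one.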

Fail if more then one bootstrap server is defined

I'm fairly new to ansible, and trying a simple ansible consul test install on VMs (3 * CentOS 7 on openstack)

The host launching the install has this:

$ cat /etc/redhat-release
CentOS Linux release 7.2.1511 (Core)

$ ansible --version
ansible 2.3.1.0
  config file = /etc/ansible/ansible.cfg
  configured module search path = Default w/o overrides
  python version = 2.7.5 (default, Sep 15 2016, 22:37:39) [GCC 4.8.5 20150623 (Red Hat 4.8.5-4)]

$ cat inventory.cfg
[consul]
192.168.1.200 consul_node_role=bootstrap
192.168.1.106 consul_node_role=server
192.168.1.103 consul_node_role=client

$ cat consul.yml

- hosts: consul
  roles:
    - ansible-consul

$ l -d roles/ansible-consul
drwxr-xr-x. 12 vlegoll users 4.0K Jun 23 14:18 roles/ansible-consul

This is a git clone of your repository, current master branch

$ ansible-playbook -i inventory.cfg consul.yml
[...]
TASK [ansible-consul : Fail if more then one bootstrap server is defined] ********************************************************************************************************************
fatal: [192.168.1.200]: FAILED! => {"failed": true, "msg": "The conditional check '_consul_bootstrap_servers | length > 1' failed. The error was: error while evaluating conditional (_consul_bootstrap_servers | length > 1): {% set __consul_bootstrap_servers = [] %}{% for server in _consul_lan_servers %}{% set _consul_node_role = hostvars[server]['consul_node_role'] | default('client', true) %}{% if _consul_node_role == 'bootstrap' %}{% if __consul_bootstrap_servers.append(server) %}{% endif %}{% endif %}{% endfor %}{{ __consul_bootstrap_servers }}: {% set __consul_lan_servers = [] %}{% for server in consul_servers %}{% set _consul_datacenter = hostvars[server]['consul_datacenter'] | default('dc1', true) %}{% if _consul_datacenter == consul_datacenter %}{% if __consul_lan_servers.append(server) %}{% endif %}{% endif %}{% endfor %}{{ __consul_lan_servers }}: {% set _consul_servers = [] %}{% for host in groups[consul_group_name] %}{% set _consul_node_role = hostvars[host]['consul_node_role'] | default('client', true) %}{% if ( _consul_node_role == 'server' or _consul_node_role == 'bootstrap') %}{% if _consul_servers.append(host) %}{% endif %}{% endif %}{% endfor %}{{ _consul_servers }}: 'dict object' has no attribute u'cluster_nodes'\n\nThe error appears to have been in '/home/vlegoll/ansible/roles/ansible-consul/tasks/asserts.yml': line 68, column 3, but may\nbe elsewhere in the file depending on the exact syntax problem.\n\nThe offending line appears to be:\n\n\n- name: Fail if more then one bootstrap server is defined\n ^ here\n"}
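
Reading the error text, the failing expression is groups[consul_group_name], and the final message says there is no 'cluster_nodes' entry. That suggests consul_group_name resolved to cluster_nodes while the inventory above only defines a [consul] group. The sketch below checks an INI inventory for a group header. The inventory_has_group helper is hypothetical and the inventory content is copied from this report.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the INI inventory file ($1) contains a
# header line that is exactly "[group]" for the group name in $2.
inventory_has_group() {
    grep -Fxq "[$2]" "$1"
}

inv=$(mktemp)
cat > "$inv" <<'EOF'
[consul]
192.168.1.200 consul_node_role=bootstrap
192.168.1.106 consul_node_role=server
192.168.1.103 consul_node_role=client
EOF

for group in cluster_nodes consul; do
    if inventory_has_group "$inv" "$group"; then
        echo "group ${group}: present"
    else
        echo "group ${group}: missing"
    fi
done

rm -f "$inv"
```

If that reading is right, either renaming the inventory group to cluster_nodes or setting consul_group_name to consul should let the lookup succeed. This is an assumption drawn from the error message and has not been verified against the role.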
