
terraform-aws-consul's Introduction

DISCLAIMER

This repository is no longer supported. Please consider using the successor repository for the latest and fully supported version of Consul.

Going forward, this repository will no longer be supported and will eventually be deprecated. Please move to the latest versions of our products, or fork this repository to continue using and developing it for your personal or business use.


Consul AWS Module

This repo contains a set of modules in the modules folder for deploying a Consul cluster on AWS using Terraform. Consul is a distributed, highly-available tool that you can use for service discovery and key/value storage. A Consul cluster typically includes a small number of server nodes, which are responsible for being part of the consensus quorum, and a larger number of client nodes, which you typically run alongside your apps:

Consul architecture

How to use this Module

This repo has the following folder structure:

  • modules: This folder contains several standalone, reusable, production-grade modules that you can use to deploy Consul.
  • examples: This folder shows examples of different ways to combine the modules in the modules folder to deploy Consul.
  • test: Automated tests for the modules and examples.
  • root folder: The root folder is an example of how to use the consul-cluster module to deploy a Consul cluster in AWS. The Terraform Registry requires the root of every repo to contain Terraform code, so we've put one of the examples there. This example is great for learning and experimenting, but for production use, please use the underlying modules in the modules folder directly.

To deploy Consul servers for production using this repo:

  1. Create a Consul AMI using a Packer template that references the install-consul module. Here is an example Packer template.

    If you are just experimenting with this Module, you may find it more convenient to use one of our official public AMIs. Check out the aws_ami data source usage in main.tf for how to auto-discover this AMI.

    WARNING! Do NOT use these AMIs in your production setup. In production, you should build your own AMIs in your own AWS account.

  2. Deploy that AMI across an Auto Scaling Group using the Terraform consul-cluster module and execute the run-consul script with the --server flag during boot on each Instance in the Auto Scaling Group to form the Consul cluster. Here is an example Terraform configuration to provision a Consul cluster.
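A minimal, hedged sketch of step 2 (all values are placeholders; the authoritative list of inputs is in the module's variables.tf):

module "consul_servers" {
  source = "github.com/hashicorp/terraform-aws-consul//modules/consul-cluster?ref=v0.4.0"

  cluster_name  = "consul-example"   # placeholder
  cluster_size  = 3
  instance_type = "t2.micro"
  ami_id        = "${var.ami_id}"    # the AMI you built in step 1

  # run-consul uses these tags to discover the other servers at boot
  cluster_tag_key   = "consul-cluster"
  cluster_tag_value = "consul-example"

  user_data = <<-EOF
              #!/bin/bash
              /opt/consul/bin/run-consul --server --cluster-tag-key consul-cluster --cluster-tag-value consul-example
              EOF

  vpc_id     = "${var.vpc_id}"
  subnet_ids = "${var.subnet_ids}"

  # Tighten these for production
  allowed_inbound_cidr_blocks = ["0.0.0.0/0"]
  allowed_ssh_cidr_blocks     = ["0.0.0.0/0"]
}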

To deploy Consul clients for production using this repo:

  1. Use the install-consul module to install Consul alongside your application code.
  2. Before booting your app, execute the run-consul script with the --client flag (see the sketch after this list).
  3. Your app can now use the local Consul agent for service discovery and key/value storage.
  4. Optionally, you can use the install-dnsmasq module for Ubuntu 16.04 and Amazon Linux 2, or setup-systemd-resolved for Ubuntu 18.04 and Ubuntu 20.04, to configure Consul as the DNS for a specific domain (e.g. .consul) so that URLs such as foo.service.consul resolve automatically to the IP address(es) for a service foo registered in Consul (all other domain names will continue to resolve using the default resolver on the OS).
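As referenced in step 2, a sketch of the client boot sequence as user data (the cluster tag values must match your server cluster; my-app is a hypothetical application binary):

user_data = <<-EOF
            #!/bin/bash
            # Join the cluster as a client before starting the app
            /opt/consul/bin/run-consul --client --cluster-tag-key consul-cluster --cluster-tag-value consul-example
            # Start your application once the local agent is up (hypothetical binary)
            /usr/local/bin/my-app
            EOF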

What's a Module?

A Module is a canonical, reusable, best-practices definition for how to run a single piece of infrastructure, such as a database or server cluster. Each Module is created using Terraform, and includes automated tests, examples, and documentation. It is maintained both by the open source community and companies that provide commercial support.

Instead of figuring out the details of how to run a piece of infrastructure from scratch, you can reuse existing code that has been proven in production. And instead of maintaining all that infrastructure code yourself, you can leverage the work of the Module community to pick up infrastructure improvements through a version number bump.

Who created this Module?

These modules were created by Gruntwork, in partnership with HashiCorp, in 2017 and maintained through 2021. They were deprecated in 2022 in favor of newer alternatives (see the top of the README for details).

Code included in this Module:

  • install-consul: This module installs Consul using a Packer template to create a Consul Amazon Machine Image (AMI).

  • consul-cluster: The module includes Terraform code to deploy a Consul AMI across an Auto Scaling Group.

  • run-consul: This module includes the scripts to configure and run Consul. It is used by the above Packer module at build-time to set configurations, and by the Terraform module at runtime with User Data to create the cluster.

  • install-dnsmasq module: Installs Dnsmasq for Ubuntu 16.04 and Amazon Linux 2 and configures it to forward requests for a specific domain to Consul. This allows you to use Consul as a DNS server for URLs such as foo.service.consul.

  • setup-systemd-resolved module: Sets up systemd-resolved for Ubuntu 18.04 and Ubuntu 20.04 and configures it to forward requests for a specific domain to Consul. This allows you to use Consul as a DNS server for URLs such as foo.service.consul.

  • consul-iam-policies: Defines the IAM policies necessary for a Consul cluster.

  • consul-security-group-rules: Defines the security group rules used by a Consul cluster to control the traffic that is allowed to go in and out of the cluster.

  • consul-client-security-group-rules: Defines the security group rules used by a Consul agent to control the traffic that is allowed to go in and out.
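For example, a hedged sketch of attaching the client rules to an application's own security group (the aws_security_group.my_app reference is illustrative; check the module's variables.tf for the exact inputs):

module "consul_client_sg_rules" {
  source = "github.com/hashicorp/terraform-aws-consul//modules/consul-client-security-group-rules?ref=v0.4.0"

  # Attach the Consul client rules to the app's security group
  security_group_id           = "${aws_security_group.my_app.id}"   # illustrative
  allowed_inbound_cidr_blocks = ["10.0.0.0/16"]                     # placeholder
}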

How is this Module versioned?

This Module follows the principles of Semantic Versioning. You can find each new release, along with the changelog, in the Releases Page.

During initial development, the major version will be 0 (e.g., 0.x.y), which indicates the code does not yet have a stable API. Once we hit 1.0.0, we will make every effort to maintain a backwards compatible API and use the MAJOR, MINOR, and PATCH versions on each release to indicate any incompatibilities.

License

This code is released under the Apache 2.0 License. Please see LICENSE and NOTICE for more details.

Copyright © 2017 Gruntwork, Inc.


terraform-aws-consul's Issues

consul_0.3.5_linux_amd64.zip empty

I'm very new to this, but when running install-consul, it fails when trying to unzip consul_0.3.5_linux_amd64.zip. The downloaded file actually contains an HTTP 403 Forbidden error. I've tried this with 0.3.4 & 0.3.5. Any idea what I'm doing incorrectly?

Thanks!

`sudo pip command not found ECS base image`

Hi,

I've been trying to get this module running on a base AMI of the ECS image with Packer:

source_ami_filter: 
      filters:
        virtualization-type: "hvm"
        architecture: "x86_64"
        name: "*-amazon-ecs-optimized"
        root-device-type: "ebs"
      owners: ["amazon"]
      most_recent: true

However, when I get to install-consul, it errors with sudo pip command not found.

To overcome that I installed pip: sudo easy_install pip

I can run pip -V and get a version back, but when I run sudo pip -V I get sudo pip command not found again.

I can't tell if this is an issue with the base ECS image or if the code in this module shouldn't call sudo on pip.

Any pointers gratefully received 😄

UPDATE - I have this working, but the workaround is a bit ugly: sudo ln -s /usr/local/bin/pip /usr/bin/pip

Consul module fails to read the aws provider credentials

My configuration file looks like below

provider "aws" {
  access_key = "${var.access_key}"
  secret_key = "${var.secret_key}"
  region     = "eu-west-2"
  version = "~> 1.10"
}

module "consul" {
  source = "hashicorp/consul/aws"
  version = "0.1.2"
  aws_region  = "eu-west-2"
  num_servers = "3"
}

When I run terraform.exe plan I get the following error on Windows. Note I entered the access key and secret key as part of the command-line input.

Error: Error refreshing state: 1 error(s) occurred:

I don't know how to fix this. So far I have tried terraform.exe init -upgrade to ensure the aws provider and the Consul module are current.

Yum update -y causing disconnects on CentOS 7

https://github.com/hashicorp/terraform-aws-consul/blob/master/modules/install-consul/install-consul#L125

By adding this, it seems the remote system occasionally disconnects the client - guessing a systemd restart or similar problem. Doing a yum update as part of the Consul install seems like a bad idea; that should come from the base OS rather than the Consul install. When that restart hits, the rest of the script fails to run.

IAM roles for nomad-cluster.

Hi,

Good day.

When using consul-cluster as below:

module "consul_nomad_server_cluster" {
source = "github.com/hashicorp/terraform-aws-consul//modules/consul-cluster?ref=v0.4.0"
...
}

an IAM role is automatically created to allow the instances to communicate with each other.

Is there any reason why this is not the case when using the nomad-cluster as below?:

module "nomad_client_cluster" {
  source = "github.com/hashicorp/terraform-aws-nomad//modules/nomad-cluster?ref=v0.4.5"
...
}

Thanks.

Regards.
JJ

RFE: Support TLS/Encrypt options

Opening this as a reference - I plan on working to add this at some point, as we want this when using Consul for Vault.

Documentation:
https://www.consul.io/docs/agent/encryption.html

Note there WOULD be a challenge if someone attempted to use this on an existing cluster that did NOT have encryption appropriately enabled, but that's a separate discussion.

Places that need to be modified:

  • Config file: https://github.com/hashicorp/terraform-aws-consul/blob/master/modules/run-consul/run-consul#L194
  • User data files, to pass it to run-consul: https://github.com/hashicorp/terraform-aws-consul/tree/master/examples/root-example
  • Variables for the user data scripts: https://github.com/hashicorp/terraform-aws-consul/blob/master/main.tf#L92

Might be missing things... but it's a start
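For reference, gossip encryption is enabled via the encrypt key in the agent configuration (the key comes from consul keygen); a sketch of what run-consul would ultimately need to render:

{
  "encrypt": "<16-byte-base64-key-from-consul-keygen>"
}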

Latest public AMIs not available (Consul Connect)

I was making initial experiments with this module and got an AMI-not-found error after setting the var ami_id = "ami-0cefe1c6ca6cb38f6". I double-checked via the AWS web console for us-east-1 and us-east-2 and did not find the public AMIs referenced in _docs:
https://github.com/hashicorp/terraform-aws-consul/blame/v0.3.8/_docs/amazon-linux-ami-list.md#L19

Via the AWS web console I looked up owner account 562637147889 filtered to 2018, which turned up no public Consul AMIs…
https://github.com/hashicorp/terraform-aws-consul/blame/v0.3.8/main.tf#L33

This is probably a backlog issue, but I thought mentioning it would be helpful for those wanting to easily try Consul Connect or the new UI without using Packer to build AMIs.

Error: Error applying plan:

2 error(s) occurred:

* module.consul.module.consul_servers.aws_launch_configuration.launch_configuration: 1 error(s) occurred:

* aws_launch_configuration.launch_configuration: Expected to find a Root Device name for AMI (ami-0cefe1c6ca6cb38f6), but got none
* module.consul.module.consul_clients.aws_launch_configuration.launch_configuration: 1 error(s) occurred:

* aws_launch_configuration.launch_configuration: Expected to find a Root Device name for AMI (ami-0cefe1c6ca6cb38f6), but got none

enable autoscaling group metrics

Metrics appear to be missing from the autoscaling group.

GroupMinSize, GroupMaxSize, GroupDesiredCapacity, GroupInServiceInstances, GroupPendingInstances, GroupStandbyInstances, GroupTerminatingInstances, GroupTotalInstances
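If implemented, this would presumably be a small change to the module's ASG resource; a sketch using Terraform's enabled_metrics argument on aws_autoscaling_group:

resource "aws_autoscaling_group" "autoscaling_group" {
  # ... existing arguments ...

  # Publish the group-level metrics listed above to CloudWatch
  enabled_metrics = [
    "GroupMinSize",
    "GroupMaxSize",
    "GroupDesiredCapacity",
    "GroupInServiceInstances",
    "GroupPendingInstances",
    "GroupStandbyInstances",
    "GroupTerminatingInstances",
    "GroupTotalInstances",
  ]
}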

Need to add target_group_arns to ASG configuration

In the case of adding an Application Load Balancer (ALB) to front the Consul cluster, we need to have the ALB's Target Group pointing to the ASG. Since target_group_arns is not set in the module's ASG configuration, the Terraform module tries to remove any attached Target Group, which is not what we want, e.g.

Terraform will perform the following actions:

  ~ module.prod_consul_1.module.consul_servers.aws_autoscaling_group.autoscaling_group
      target_group_arns.#:         "1" => "0"
      target_group_arns.842221997: "arn:aws:elasticloadbalancing:us-east-2:423487953385:targetgroup/my-lb-tg/2f4d7259f471d85a" => ""
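A sketch of the passthrough that would address this (target_group_arns is a real aws_autoscaling_group argument; the module variable is hypothetical):

resource "aws_autoscaling_group" "autoscaling_group" {
  # ... existing arguments ...

  # Keep externally created Target Groups attached instead of removing them
  target_group_arns = ["${var.target_group_arns}"]
}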

Add variable to allow adding a security group ingress for http port only

I would like the ability to add ingress to Consul via a security group for only the HTTP port, so that I can attach a load balancer for HTTP only.

There are already these two variables:

variable "allowed_ssh_security_group_ids" {
  description = "A list of security group IDs from which the EC2 Instances will allow SSH connections"
  type        = "list"
  default     = []
}

variable "allowed_inbound_security_group_ids" {
  description = "A list of security group IDs that will be allowed to connect to Consul"
  type        = "list"
  default     = []
}

The first adds a security group to the allowed ingress for SSH (port 22).

The second adds a security group for the allowed ingress for all the other standard ports.

It would be great to have a variable like:

variable "allowed_http_security_group_ids" {
  description = "A list of security group IDs from which the EC2 Instances will allow http connections"
  type        = "list"
  default     = []
}

which can be used for UI load balancers.

I realize that I can restrict this in the target group, only creating a listener that passes 443 to 8500, but it would be cleaner if I also didn't open non-HTTP ports to the load balancer.
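A sketch of the rule such a variable could drive, modeled on the module's existing source-security-group rules (http_api_port is assumed to be the module's existing HTTP port variable):

resource "aws_security_group_rule" "allow_http_inbound_from_security_group_ids" {
  count                    = "${length(var.allowed_http_security_group_ids)}"
  type                     = "ingress"
  from_port                = "${var.http_api_port}"
  to_port                  = "${var.http_api_port}"
  protocol                 = "tcp"
  source_security_group_id = "${element(var.allowed_http_security_group_ids, count.index)}"

  security_group_id = "${var.security_group_id}"
}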

allow specifying https port

Currently the template creates a rule for the HTTP port, but if I'm enabling HTTPS as per

{
  "ports": {
    "https": 8501
  }
}

then I currently cannot access port 8501, because no security group rule is created for it.

Support granting source security group IDs access to consul-cluster

To be consistent with the vault-cluster module, there should be allowed_ssh_security_group_ids and allowed_inbound_security_group_ids variables to allow security group IDs to access the cluster (in addition to, or instead of, cidr_blocks).

Could be implemented exactly the same way in consul-cluster/main.tf

resource "aws_security_group_rule" "allow_ssh_inbound_from_security_group_ids" {
  count                    = "${length(var.allowed_ssh_security_group_ids)}"
  type                     = "ingress"
  from_port                = "${var.ssh_port}"
  to_port                  = "${var.ssh_port}"
  protocol                 = "tcp"
  source_security_group_id = "${element(var.allowed_ssh_security_group_ids, count.index)}"

  security_group_id = "${aws_security_group.lc_security_group.id}"
}

and multiple aws_security_group_rule resources in consul-security-group-rules/main.tf

resource "aws_security_group_rule" "allow_<some_protocol>_inbound_from_security_group_ids" {
  count                    = "${length(var.allowed_inbound_security_group_ids)}"
  type                     = "ingress"
  from_port                = "${var.<some_protocol>_port}"
  to_port                  = "${var.<some_protocol>_port}"
  protocol                 = "<tcp or udp>"
  source_security_group_id = "${element(var.allowed_inbound_security_group_ids, count.index)}"

  security_group_id = "${var.security_group_id}"
}

EBS volumes generated for instances should receive tags

My team is using a mixture of hashicorp/terraform-aws-consul, hashicorp/terraform-aws-vault, and our own modules for all other types of instances. We noticed that EBS volumes generated for Consul and Vault instances do not get any tags, so they show up in the UI as having no name. Consequently, it is more difficult to determine at a glance if they are legitimate or actively-used volumes.

For our own modules, we use aws_launch_template rather than aws_launch_configuration, and we put tag_specifications blocks within them like so:

  tag_specifications {
    resource_type = "instance"

    tags {
      Name = "${var....}"
    }
  }

  tag_specifications {
    resource_type = "volume"

    tags {
      Name = "${var....}"
    }
  }

The Terraform docs suggest that the tag_specifications feature is exclusive to aws_launch_template although I hope that you may be able to determine some way to get the same thing in aws_launch_configuration as well.

Add option to set -domain in run-consul

Thanks a lot for these examples/modules! They saved me a great deal of time!

Context:
We put our internal services under private-subdomain.public-domain.com, and use consul as the internal DNS.

Suggestion:
It would be great if we could add support for the -domain config option in run-consul. I had to fork the repo and modify the script myself.
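For context, the underlying agent setting is Consul's domain configuration key; a sketch of the JSON run-consul would need to emit when given such a flag:

{
  "domain": "private-subdomain.public-domain.com"
}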

enhancement request: amazon linux 2 support

Amazon Linux 2 has a suite of numerous updates and improvements.

However, with the bump in the CentOS version it's based on, it now uses systemd instead of supervisord, so I suspect the install script will need testing and additional environment checks to use the appropriate installation method.

Can't recreate auto scale group due to security group dependency on old instances

* module.consul_servers.aws_security_group.lc_security_group (destroy): 1 error(s) occurred:

* aws_security_group.lc_security_group (deposed #0): 1 error(s) occurred:

* aws_security_group.lc_security_group (deposed #0): DependencyViolation: resource sg-XXXXX has a dependent object

Seeing this error after changing the cluster name. It seems Terraform isn't creating the new Auto Scaling Group and deleting the old one BEFORE trying to delete the security groups. As a result, it can't delete the security group while it's still attached to the EC2 instances from the prior deployment.

  • Update - there IS a circular dependency issue potentially here. Not sure of the correct solution yet.

Init script pointing to /usr/local/bin/supervisor* incorrectly

The install-consul shell script has:

  # On Amazon Linux, /usr/local/bin is not in PATH for the root user, so we add symlinks to /usr/bin, which is in PATH
  if [[ ! -f "/usr/bin/supervisorctl" ]]; then
    sudo ln -s /usr/local/bin/supervisorctl /usr/bin/supervisorctl
  fi
  if [[ ! -f "/usr/bin/supervisord" ]]; then
    sudo ln -s /usr/local/bin/supervisord /usr/bin/supervisord
  fi

However, the init script is hard coded to a path in /usr/local/bin

supervisorctl=/usr/local/bin/supervisorctl
supervisord=${SUPERVISORD-/usr/local/bin/supervisord}

The init script should probably be hard-coded to /usr/bin/supervisorctl, since the installer script creates a symlink in /usr/bin. I hit this issue trying to install this on a CentOS 7-based AMI.

Add variable to allow setting 'leave_on_terminate' value in the Consul config.

When trying to perform an upgrade of the Consul cluster, the autorestart on the supervisor configuration prevents running a consul leave command on the node to have it gracefully leave the cluster. Supervisor immediately restarts the agent as soon as the leave command terminates it.
Stopping the Consul service via supervisorctl does not cause the node to leave the cluster gracefully either. This leads to a situation where you either need to edit the supervisor configuration before leaving the cluster or use a force-leave command after stopping the service.
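For reference, the agent setting in question; a sketch of the JSON the requested variable would need to render into the Consul config:

{
  "leave_on_terminate": true
}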

Should security group rules implicitly allow inbound from self?

This follows up on #13.

The pattern from the Vault cluster module, which that PR emulates, adds an implicit "inbound from self" block, which allows for Vault request forwarding within the Vault security group:

https://github.com/hashicorp/terraform-aws-vault/blob/8d093d05c0a0d6c8861eea79c054cb7274628dc6/modules/vault-security-group-rules/main.tf#L27

I believe something similar is necessary for the security group rules in this module, in order to make sure the Consul instances can talk to each other.

If I try to allow inbound traffic just from our intended clients (a Vault cluster), the Consul cluster never selects a leader (probably because the instances cannot talk to each other).

A workaround fix is to add another instance of the security rules module explicitly:

module "consul_cluster" {
  source = "github.com/hashicorp/terraform-aws-consul//modules/consul-cluster?ref=v0.1.0"
  allowed_inbound_security_group_ids = ["${module.vault_cluster.security_group_id}"]
  allowed_inbound_cidr_blocks = []
  <snip>
}

module "consul_self_access_sg_rules" {
  source = "github.com/hashicorp/terraform-aws-consul//modules/consul-security-group-rules?ref=v0.1.0"
  security_group_id = "${module.consul_cluster.security_group_id}"
  allowed_inbound_cidr_blocks = []
  allowed_inbound_security_group_ids = ["${module.consul_cluster.security_group_id}"]
}

It's likely that not all the different inbound security rules need to be applied for intra-cluster communication - just the protocols actually used for chatter between instances.

No way to use an encrypted EBS volume for the consul data directory

Would like an option to add a separate EBS volume to be mounted at consul/data for storage of consul data. An option inside this would create the volume as encrypted, if desired.

I imagine creating a second set of variables, similar to the root_volume_* variables, for this purpose:

variable "use_separate_volume_for_data" {
  description = "Enable this flag to use a separate EBS volume for consul data storage"
  default     = false
}

variable "data_volume_type" {
  description = "The type of volume to use for data storage. Must be one of: standard, gp2, or io1. Only used if 'use_separate_volume_for_data' is enabled."
  default     = "standard"
}

variable "data_volume_size" {
  description = "The size, in GB, of the data EBS volume. Only used if 'use_separate_volume_for_data' is enabled."
  default     = 50
}

variable "data_volume_delete_on_termination" {
  description = "Whether the data volume should be destroyed on instance termination. Only used if 'use_separate_volume_for_data' is enabled."
  default     = true
}

variable "data_volume_encrypted" {
  description = "Should the data volume be encrypted. Only used if 'use_separate_volume_for_data' is enabled."
  default     = true
}

Thoughts?

I'm happy to work on these additions myself...
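A sketch of how these variables might wire into the module's launch configuration (the device name is a placeholder, and mounting the volume at the Consul data dir would still need handling in user data):

resource "aws_launch_configuration" "launch_configuration" {
  # ... existing arguments ...

  ebs_block_device {
    device_name           = "/dev/xvdf"   # placeholder
    volume_type           = "${var.data_volume_type}"
    volume_size           = "${var.data_volume_size}"
    delete_on_termination = "${var.data_volume_delete_on_termination}"
    encrypted             = "${var.data_volume_encrypted}"
  }
}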

Splitting the consul-cluster into 2 separate modules: one for masters and one for clients

Refactor the consul-cluster into 2 separate (sub-)modules: one for masters, kept as it is, and a second that adapts to the application stack, i.e. it adds rules to a given security group, adds strings to a given file, and adds tags to a given tag list supported by the application stack (is that possible to implement?), but does not deploy its own ASG. The ASG and all related resources (scaling policies, alarms, launch configuration attributes) are usually set by the application stack.

Is this code meant to be used in production?

Yes, the modules are all production-grade, but the root example is meant primarily as an example. Per the discussion on Twitter, if you'd like to use this code in production:

  1. First get the example up and running to get a feel for things.
  2. Then carefully review the example code to see which aspects don't make sense for production (e.g. we strongly recommend you build your own AMI for your deployment, rather than using ours).
  3. Then carefully review the variables.tf file for any modules you call to understand which vars you are and are not using.
  4. Deploy in prod and profit.

I'm opening this issue to document this clarification and to signal that we should consider adding this to the official docs.

On Amazon Linux AMI consul not seen when run as sudo

I am trying to add a supervisorctl config so that it starts Consul as part of a Packer image. However, the Packer build runs as ec2-user, which can't seem to run supervisorctl and results in error: <class 'socket.error'>, [Errno 13] Permission denied: file: /usr/lib64/python2.7/socket.py line: 228, so I could fix the supervisorctl permissions. However, if I set my scripts to use sudo supervisorctl, the root user then can't see Consul for some reason, so I was wondering what your recommendation would be?

Thanks

Enhancement: Add iam_role_id as variable in root main.tf to permit referencing pre-built IAM roles instead of creating them

Right now, if you run the terraform-aws-vault module as a user who does not have permission to create an IAM role (but can assign one), the build will not succeed, because it needs to create a role that just grants describe-instances, describe-groups, and describe-tags. I'd like to be able to supply the iam_role_id, but that's not possible.

https://github.com/hashicorp/terraform-aws-consul/blob/master/modules/consul-iam-policies/main.tf

I will submit a PR later to implement this. Figured I would create the issue and reference it after.

Inbound required arguments.

Hi,

Good day.

I'm using the consul_cluster module. Quick question: why is allowed_inbound_cidr_blocks required? If my cluster is deployed in a private subnet, would allowed_inbound_cidr_blocks be needed? If we only allow access from a bastion instance, only allowed_inbound_security_group_ids would be needed.

Maybe have just one of allowed_inbound_cidr_blocks or allowed_inbound_security_group_ids required? I might also be completely wrong and have missed something? :)

Regards.
JJ

Dedicated consul clients when using Vault?

If I understand correctly, this TF script deploys Consul agents in server mode, under a tier of Consul agents running as clients. The architecture shown in https://github.com/hashicorp/terraform-aws-vault implies that Vault talks directly to the Consul server agents - i.e. no Consul client agents. Is that correct?

Just wondering what a prod architecture would look like, i.e.
Vault - Consul Client - Consul Server
or
Vault - Consul Server?

Much appreciated in advance

Support selecting subnets

A VPC may contain subnets intended for different use cases. The module as written stripes Consul across all subnets in the provided VPC.

This could be accomplished via the tags argument in aws_subnet_ids and having a subnet_tags variable in main, or by accepting a list of subnets in main.
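A sketch of the first approach (subnet_tags is the hypothetical new variable; tags is a real argument on the aws_subnet_ids data source):

data "aws_subnet_ids" "default" {
  vpc_id = "${var.vpc_id}"

  # Only select subnets carrying the requested tags
  tags = "${var.subnet_tags}"
}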

consul-examples-helper.sh doesn't check for public IPs properly

I have used Packer to build multiple versions of Consul/dnsmasq into an AMI. With every version the instances spin up, but do not elect a leader. I have also tried the test with the example AMI provided on this page: https://registry.terraform.io/modules/hashicorp/consul/aws/0.0.5 Same result, no leader. I've confirmed that the instances can reach each other on all ports defined in the module, so it's not a rogue security group rule or anything like that.

I'm not sure if I'm missing something, or the module just doesn't work as expected, but I'm open to any suggestions.

Using AWS snapshot creates conflicting node ids.

Issue Description

During operation, Consul creates temporary data which is stored, in the default configuration, in the folder /opt/consul/data. That folder contains the unique ID Consul generates to identify the node.

When the AWS snapshot functionality is used to start up new instances, in, say, an autoscaling group, the persisted node ID is reused on the new instance and hence creates conflicts.

Suggested Solution

Since the run-consul script is executed only once, at startup after the instance is created, this script could be used to clean up the data directory.

I would provide a PR if this is an acceptable solution.
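A sketch of that cleanup, assuming the default data directory and Consul's standard node-id file name:

# In run-consul, before the agent is started for the first time:
readonly CONSUL_DATA_DIR="/opt/consul/data"

# Drop the node ID persisted in the snapshot/AMI so this instance
# generates a fresh one on first start
sudo rm -f "$CONSUL_DATA_DIR/node-id"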

Installation Doesn't work

This Consul installation doesn't work on CentOS 7: supervisord doesn't start after the installation. The supervisord init script isn't compatible with CentOS. The place it appears to break down: on CentOS 7, pip installs the supervisord binary to /usr/bin/supervisord (as well as supervisorctl), but the init script in this package ( https://github.com/hashicorp/terraform-aws-consul/blob/master/modules/install-consul/supervisor-init... ) expects them in /usr/local/bin

IAM permissions are incorrect

Hi,

Good day.

Had trouble getting a Consul/Nomad cluster running. Checked the logs and it could not elect a leader. It took me a while to figure out that the IAM role permissions might be incorrect.

Been following the guide here https://github.com/hashicorp/terraform-aws-nomad/tree/master/modules/nomad-cluster.

Not sure what the minimal changes are; I granted the IAM role all EC2 describe permissions:

ec2:Describe*
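A sketch of that broadened policy as Terraform (the iam_role_id output name is assumed from the module's outputs):

resource "aws_iam_role_policy" "describe_all_ec2" {
  name = "describe-all-ec2"
  role = "${module.consul_cluster.iam_role_id}"

  policy = <<EOF
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": "ec2:Describe*",
      "Resource": "*"
    }
  ]
}
EOF
}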

My consul_cluster terraform:

module "consul_cluster" {
  source = "github.com/hashicorp/terraform-aws-consul//modules/consul-cluster?ref=v0.4.0"
...
  user_data = <<-EOF
              #!/bin/bash
              /opt/consul/bin/run-consul --server --cluster-tag-key consul --cluster-tag-value consul-cluster-server
              EOF
}

This appears to happen with the nomad_cluster module as well.

My nomad_cluster terraform:

module "nomad_cluster" {
  source = "github.com/hashicorp/terraform-aws-nomad//modules/nomad-cluster?ref=v0.4.5"

  user_data = <<-EOF
              #!/bin/bash
              /opt/consul/bin/run-consul --client --cluster-tag-key consul --cluster-tag-value consul-cluster-server
              /opt/nomad/bin/run-nomad --client 
              EOF
}

After manually updating the IAM role policy, the cluster is up and running.

Regards.
JJ

Consul AMI example - missing Git for AWS example

I was trying to set up the AMI for production usage per your example code on this page. The Amazon Linux version failed with this error:

amazon-linux-ami: /tmp/script_7892.sh: line 2: git: command not found

I was able to re-run the Amazon Linux one by modifying the inline block as follows:

"inline": [
      "sudo yum install -y git",
      "git clone --branch v0.1.0 https://github.com/hashicorp/terraform-aws-consul.git /tmp/terraform-aws-consul",
      "/tmp/terraform-aws-consul/modules/install-consul/install-consul --version {{user `consul_version`}}",
      "/tmp/terraform-aws-consul/modules/install-dnsmasq/install-dnsmasq"
    ],

Ability to specify VPC

As a feature request, could the ability to specify which VPC the resources are stood up in be added? Right now the module only uses the default VPC.

Thanks!
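As a workaround, the underlying consul-cluster module does take the network explicitly; a sketch (variable names per the module's variables.tf):

module "consul" {
  source = "github.com/hashicorp/terraform-aws-consul//modules/consul-cluster?ref=v0.4.0"

  # Deploy into an explicit VPC and subnets instead of the default VPC
  vpc_id     = "${var.vpc_id}"
  subnet_ids = "${var.subnet_ids}"

  # ... remaining required arguments ...
}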

aws provider authentication issue

Version info

  • module.consul = "0.1.0"
  • provider.aws: version = "~> 1.7"
  • provider.template: version = "~> 1.0"

Situation

I read this article : https://www.terraform.io/intro/getting-started/modules.html

I wrote example.tf :

provider "aws" {
  access_key = "${var.aws_access_key}"
  secret_key = "${var.aws_secret_key}"
  region     = "${var.region}”
}

module "consul" {
    source = "hashicorp/consul/aws"
    aws_region = "${var.region}"
    num_servers = "3"
}

When I run terraform plan I get the message below.

data.template_file.user_data_client: Refreshing state...
data.template_file.user_data_server: Refreshing state...

Error: Error refreshing state: 1 error(s) occurred:

* module.consul.provider.aws: No valid credential sources found for AWS Provider.
  Please see https://terraform.io/docs/providers/aws/index.html for more information on
  providing credentials for the AWS Provider

After researching, I modified example.tf:

provider "aws" {
  access_key = "${var.aws_access_key}"
  secret_key = "${var.aws_secret_key}"
  region     = "${var.region}”
}

module "consul" {
    source = "hashicorp/consul/aws"

    providers = {
        aws = "aws"
    }
    aws_region = "${var.region}"
    num_servers = "3"
}

and it works.

Question

Has some configuration changed such that I have to specify the "provider" explicitly?
Or did I do something wrong?

Thanks. :-)

unable to execute without a default vpc

Was stoked to try this out but it turns out that my past foolishness (deleted default VPC years ago) has come back to bite me:

✔ 22:26 (jacob@JLEBC-LWS) ~/Projects/dweomer/terraform-aws-consul $ cat main.tf 
module "consul" {
  source = "hashicorp/consul/aws"

  aws_region = "us-west-2"
}
✔ 22:26 (jacob@JLEBC-LWS) ~/Projects/dweomer/terraform-aws-consul $ terraform version
Terraform v0.10.6

✔ 22:26 (jacob@JLEBC-LWS) ~/Projects/dweomer/terraform-aws-consul $ terraform init
Downloading modules...
Get: https://api.github.com/repos/hashicorp/terraform-aws-consul/tarball/v0.0.2//*?archive=tar.gz
Get: file:///home/jacob/Projects/dweomer/terraform-aws-consul/.terraform/modules/750eb34a9425e9023c286675ca4ed2e4/modules/consul-cluster
Get: file:///home/jacob/Projects/dweomer/terraform-aws-consul/.terraform/modules/750eb34a9425e9023c286675ca4ed2e4/modules/consul-cluster
Get: file:///home/jacob/Projects/dweomer/terraform-aws-consul/.terraform/modules/750eb34a9425e9023c286675ca4ed2e4/modules/consul-security-group-rules
Get: file:///home/jacob/Projects/dweomer/terraform-aws-consul/.terraform/modules/750eb34a9425e9023c286675ca4ed2e4/modules/consul-iam-policies
Get: file:///home/jacob/Projects/dweomer/terraform-aws-consul/.terraform/modules/750eb34a9425e9023c286675ca4ed2e4/modules/consul-security-group-rules
Get: file:///home/jacob/Projects/dweomer/terraform-aws-consul/.terraform/modules/750eb34a9425e9023c286675ca4ed2e4/modules/consul-iam-policies

Initializing provider plugins...
- Checking for available provider plugins on https://releases.hashicorp.com...
- Downloading plugin for provider "aws" (0.1.4)...
- Downloading plugin for provider "template" (0.1.1)...

The following providers do not have any version constraints in configuration,
so the latest version was installed.

To prevent automatic upgrades to new major versions that may contain breaking
changes, it is recommended to add version = "..." constraints to the
corresponding provider blocks in configuration, with the constraint strings
suggested below.

* provider.aws: version = "~> 0.1"
* provider.template: version = "~> 0.1"

Terraform has been successfully initialized!

You may now begin working with Terraform. Try running "terraform plan" to see
any changes that are required for your infrastructure. All Terraform commands
should now work.

If you ever set or change modules or backend configuration for Terraform,
rerun this command to reinitialize your working directory. If you forget, other
commands will detect it and remind you to do so if necessary.
✔ 22:26 (jacob@JLEBC-LWS) ~/Projects/dweomer/terraform-aws-consul $ terraform plan
Refreshing Terraform state in-memory prior to plan...
The refreshed state will be used to calculate this plan, but will not be
persisted to local or remote state storage.

data.template_file.user_data_server: Refreshing state...
data.template_file.user_data_client: Refreshing state...
data.aws_vpc.default: Refreshing state...
data.aws_ami.consul: Refreshing state...
data.aws_iam_policy_document.instance_role: Refreshing state...
data.aws_iam_policy_document.instance_role: Refreshing state...
data.aws_iam_policy_document.auto_discover_cluster: Refreshing state...
data.aws_iam_policy_document.auto_discover_cluster: Refreshing state...
Error refreshing state: 1 error(s) occurred:

* module.consul.data.aws_vpc.default: 1 error(s) occurred:

* module.consul.data.aws_vpc.default: data.aws_vpc.default: no matching VPC found
✘-1 22:26 (jacob@JLEBC-LWS) ~/Projects/dweomer/terraform-aws-consul $ 

Given that this is the "official" module for Consul on AWS, it would be nice if it were a little less opinionated. Failing that, the README should at least mention this assumption.

`run-consul` fails to look up tags for instance

In attempting to run a consul cluster, I'm receiving the following output from the user-data script that calls run-consul:

HTTPSConnectionPool(host='ec2.us-east-1.amazonaws.com', port=443): Max retries exceeded with url: / (Caused by ConnectTimeoutError(<requests.packages.urllib3.connection.VerifiedHTTPSConnection object at 0x7f67b5423278>, 'Connection to ec2.us-east-1.amazonaws.com timed out. (connect timeout=60)'))
2018-04-24 22:24:11 [WARN] [run-consul] This Instance i-09318afe7d984d50f in us-east-1 does not have any Tags.
2018-04-24 22:24:11 [WARN] [run-consul] Will sleep for 10 seconds and try again.

Executing run-consul by hand reveals that it consistently hangs and times out while trying to contact ec2.us-east-1.amazonaws.com. I am running the instances on a few private VPC subnets that can communicate with one another but are relatively locked down otherwise.

Are the tags necessary? If so, what do I need to do to configure terraform to allow access to ec2.us-east-1.amazonaws.com? Finally, are there other services/hosts that run-consul needs access to from these VPC/subnets?

Lack of log rotation

# tail /opt/consul/log/consul-
consul-error.log      consul-stdout.log.1   consul-stdout.log.2   consul-stdout.log.4   consul-stdout.log.6   consul-stdout.log.8   
consul-stdout.log     consul-stdout.log.10  consul-stdout.log.3   consul-stdout.log.5   consul-stdout.log.7   consul-stdout.log.9

If some of the checks in Consul are failing (for some reason), this can cause out-of-space errors.

"* module.consul.data.aws_vpc.default: data.aws_vpc.default: InvalidVpcID.NotFound: The vpc ID 'vpc-ac2220ca' does not exist"... yet it does

Howdy. Apologies if this is user error. I'm passing in a vpc_id that most definitely exists. I also tried to use the "VPC Name", because I see there are old AWS documentation references to AWS' own confusing nomenclature for "vpc_id", to no avail.

--------------------------------------------------
|                  DescribeVpcs                  |
+------------------------------------------------+
||                     Vpcs                     ||
|+-----------------------+----------------------+|
||  CidrBlock            |  10.50.0.0/16        ||
||  DhcpOptionsId        |  dopt-dcea0db9       ||
||  InstanceTenancy      |  default             ||
||  IsDefault            |  False               ||
||  State                |  available           ||
||  VpcId                |  vpc-ac2220ca        ||
|+-----------------------+----------------------+|
|||           CidrBlockAssociationSet          |||
||+----------------+---------------------------+||
|||  AssociationId |  vpc-cidr-assoc-164f8d7d  |||
|||  CidrBlock     |  10.50.0.0/16             |||
||+----------------+---------------------------+||
||||              CidrBlockState              ||||
|||+---------------+--------------------------+|||
||||  State        |  associated              ||||
|||+---------------+--------------------------+|||
|||                    Tags                    |||
||+-------------------------+------------------+||
|||           Key           |      Value       |||
||+-------------------------+------------------+||
|||  Name                   |  pre-dev         |||
|||  Environment            |  pre-dev         |||
|||  Terraform              |  true            |||
||+-------------------------+------------------+||

FD limit is too low

# grep 'Max open files' /proc/$(pidof consul)/limits
Limit                     Soft Limit           Hard Limit           Units
Max open files            1024                 4096                 files     
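One possible fix, assuming the supervisor-based install path: supervisord's minfds option raises supervisord's file-descriptor limit, which child processes such as Consul inherit; a sketch for supervisord.conf:

; supervisord.conf (sketch)
[supervisord]
; Raise the fd limit inherited by managed processes like Consul
minfds = 65536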
