terraform-aws-modules / terraform-aws-emr

Terraform module to create AWS EMR resources 🇺🇦

Home Page: https://registry.terraform.io/modules/terraform-aws-modules/emr/aws

License: Apache License 2.0

HCL 100.00%

aws-emr aws-emr-clusters aws-emr-serverless terraform terraform-module

terraform-aws-emr's Issues

ebs_config could not take effect

Description

If you specify ebs_config in the core_instance_fleet section, it is ignored when running "terraform plan". For example:
core_instance_fleet = {
  name                      = "core fleet"
  target_on_demand_capacity = 10

  instance_type_configs = [
    {
      bid_price_as_percentage_of_on_demand_price = 100
      ebs_config = {
        size                 = 256
        iops                 = 4000
        type                 = "gp3"
        volumes_per_instance = 4
      }
      instance_type     = "r6a.2xlarge"
      weighted_capacity = 1
    }
  ]

  launch_specifications = {
    on_demand_specification = {
      allocation_strategy = "lowest-price"
    }
  }
}
will output:

 core_instance_fleet {
   + id                             = (known after apply)
   + name                           = "core fleet"
   + provisioned_on_demand_capacity = (known after apply)
   + provisioned_spot_capacity      = (known after apply)
   + target_on_demand_capacity      = 10
   + target_spot_capacity           = 0

   + instance_type_configs {
       + bid_price_as_percentage_of_on_demand_price = 100
       + instance_type                              = "r6a.2xlarge"
       + weighted_capacity                          = 1

       + ebs_config {
           + size                 = 64
           + type                 = "gp3"
           + volumes_per_instance = 1
         }
     }

   + launch_specifications {
       + on_demand_specification {
           + allocation_strategy = "lowest-price"
         }
     }
 }

The planned EBS volume size is only 64 GB with 1 volume per instance, not the configured 256 GB with 4 volumes per instance.

When I changed ebs_config to a list (wrapped in square brackets):
core_instance_fleet = {
  name                      = "core fleet"
  target_on_demand_capacity = 10

  instance_type_configs = [
    {
      bid_price_as_percentage_of_on_demand_price = 100
      ebs_config = [
        {
          size                 = 256
          iops                 = 4000
          type                 = "gp3"
          volumes_per_instance = 4
        }
      ]
      instance_type     = "r6a.2xlarge"
      weighted_capacity = 1
    }
  ]

  launch_specifications = {
    on_demand_specification = {
      allocation_strategy = "lowest-price"
    }
  }
}

terraform plan will output:
    core_instance_fleet {
      + id                             = (known after apply)
      + name                           = "core fleet"
      + provisioned_on_demand_capacity = (known after apply)
      + provisioned_spot_capacity      = (known after apply)
      + target_on_demand_capacity      = 10
      + target_spot_capacity           = 0

      + instance_type_configs {
          + bid_price_as_percentage_of_on_demand_price = 100
          + instance_type                              = "r6a.2xlarge"
          + weighted_capacity                          = 1

          + ebs_config {
              + iops                 = 4000
              + size                 = 256
              + type                 = "gp3"
              + volumes_per_instance = 4
            }
        }

      + launch_specifications {
          + on_demand_specification {
              + allocation_strategy = "lowest-price"
            }
        }
    }

Although the plan now shows 256 GB, once the EMR cluster is created it in fact still has only 64 GB of EBS storage.

Versions

Module version [Required]:
source = "terraform-aws-modules/emr/aws"
version = "1.2.2"
Terraform version:
terraform --version
Terraform v1.5.3
on linux_amd64

provider registry.terraform.io/hashicorp/archive v2.4.0
provider registry.terraform.io/hashicorp/aws v5.31.0
provider registry.terraform.io/hashicorp/random v3.5.1
provider registry.terraform.io/hashicorp/template v2.2.0

Your version of Terraform is out of date! The latest version
is 1.7.5. You can update by downloading from https://www.terraform.io/downloads.html

Provider version(s): as listed in the terraform --version output above.

Reproduction Code [Required]

Please refer to the Terraform snippets above.

Steps to reproduce the behavior:

I use S3 as the backend.
terraform init -backend-config=backend.config
terraform plan
terraform apply

Expected behavior

The EC2 instances in the created EMR cluster have 256 GB of EBS storage rather than 64 GB.

Actual behavior

The EC2 instances in the created EMR cluster have 64 GB of EBS storage.

EMR Studio -> Input service_role_s3_bucket_arns not working as expected

Description

✋ I have searched the open/closed issues and my issue is not listed.

Versions

Module version [Required]:
Terraform version:

Provider version(s):

Reproduction Code [Required]

Steps to reproduce the behavior:

Expected behavior

As I understand it, if a user passes an S3 ARN in the service_role_s3_bucket_arns input, EMR Studio should be able to read from and write to that particular bucket only.

Actual behavior

However, even when an S3 ARN is passed in service_role_s3_bucket_arns, the policy still covers all buckets. The probable cause is the code below.

terraform-aws-emr/modules/studio/main.tf

Line 293 in d987b8d

resources = coalescelist(

This should be coalescelist(var.service_role_s3_bucket_arns, ["arn:aws:s3:::*"]), with the variable referenced directly rather than quoted as a string.
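
How coalescelist chooses between a user-supplied list and a wildcard fallback can be sketched in isolation. This is a minimal standalone illustration, not the module's code: the variable mirrors the module input by name, while the local and output names are hypothetical.

```hcl
# Standalone sketch; evaluate with `terraform plan` or `terraform console`.
variable "service_role_s3_bucket_arns" {
  description = "Bucket ARNs the service role may access (name mirrors the module input)"
  type        = list(string)
  default     = []
}

locals {
  # coalescelist returns the first argument that is a non-empty list.
  # The variable must be referenced directly: a quoted "var.…" inside a list
  # is a literal one-element list of strings, not a reference.
  s3_resources = coalescelist(var.service_role_s3_bucket_arns, ["arn:aws:s3:::*"])
}

output "s3_resources" {
  value = local.s3_resources
}
```

With the default empty list the output is the wildcard ARN. With a non-empty list, for example a single hypothetical arn:aws:s3:::example-bucket, the output is that list and the wildcard is not used.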

Terminal Output Screenshot(s)

Additional context

EMR Cluster Service Role not able to assume the EMR Cluster Autoscaling Role

Description

I've tried to deploy an EMR cluster using a custom autoscaling policy and it turned out that the cluster gets successfully created but the custom automatic scaling policy fails.

To debug this, I've started to look into the EMR events, these two were the most meaningful:

Then I looked into the CloudTrail logs and found that the EMR Cluster Service Role was not able to assume the EMR Cluster Autoscaling Role. The error message was: Unable to assume IAM role: arn:aws:iam::aws-account-id:role/Spark-ETL-autoscaling

After that, I checked the trust relationship of the Autoscaling Role, which looked like this:

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "EMRAssumeRole",
            "Effect": "Allow",
            "Principal": {
                "Service": [
                    "elasticmapreduce.amazonaws.com",
                    "application-autoscaling.amazonaws.com"
                ]
            },
            "Action": "sts:AssumeRole",
            "Condition": {
                "StringEquals": {
                    "aws:SourceAccount": "123456"
                },
                "ArnLike": {
                    "aws:SourceArn": "arn:aws:elasticmapreduce:eu-central-1:123456:*"
                }
            }
        }
    ]
}

And I've also verified the AWS doc here, regarding the trust relationship that the autoscaling role for EMR must have:

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Principal": {
                "Service": "application-autoscaling.amazonaws.com"
            },
            "Action": "sts:AssumeRole",
            "Condition": {
                "StringEquals": {
                    "aws:SourceAccount": "<account-id>"
                },
                "ArnLike": {
                    "aws:SourceArn": "arn:aws:application-autoscaling:<region>:<account-id>:scalable-target/*"
                }
            }
        }
    ]
}

It's pretty straightforward to note that the condition with "aws:SourceArn": "arn:aws:application-autoscaling:<region>:<account-id>:scalable-target/*" is missing in the module here.

To solve the issue, I had to define the trust relationship of the autoscaling role for EMR as follows:

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "EMRAssumeRole",
            "Effect": "Allow",
            "Principal": {
                "Service": [
                    "elasticmapreduce.amazonaws.com",
                    "application-autoscaling.amazonaws.com"
                ]
            },
            "Action": "sts:AssumeRole",
            "Condition": {
                "StringEquals": {
                    "aws:SourceAccount": "123456"
                },
                "ArnLike": {
                    "aws:SourceArn": [
                        "arn:aws:elasticmapreduce:eu-central-1:123456:*",
                        "arn:aws:application-autoscaling:eu-central-1:123456:scalable-target/*"
                    ]
                }
            }
        }
    ]
}
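
The working trust relationship above could also be expressed in Terraform rather than edited in the console, which would avoid the drift. The following is a sketch under assumptions, not the module's implementation: the variable, data source, and output names are hypothetical, and how the resulting JSON is attached to the autoscaling role depends on the module version and is not shown.

```hcl
# Hypothetical inputs; in the issue these were eu-central-1 and a redacted account ID.
variable "region" {
  type = string
}

variable "account_id" {
  type = string
}

data "aws_iam_policy_document" "autoscaling_assume" {
  statement {
    sid     = "EMRAssumeRole"
    effect  = "Allow"
    actions = ["sts:AssumeRole"]

    principals {
      type = "Service"
      identifiers = [
        "elasticmapreduce.amazonaws.com",
        "application-autoscaling.amazonaws.com",
      ]
    }

    condition {
      test     = "StringEquals"
      variable = "aws:SourceAccount"
      values   = [var.account_id]
    }

    # Both source ARNs are allowed: the EMR one already present in the
    # generated policy, plus the application-autoscaling scalable target.
    condition {
      test     = "ArnLike"
      variable = "aws:SourceArn"
      values = [
        "arn:aws:elasticmapreduce:${var.region}:${var.account_id}:*",
        "arn:aws:application-autoscaling:${var.region}:${var.account_id}:scalable-target/*",
      ]
    }
  }
}

output "autoscaling_assume_role_policy_json" {
  value = data.aws_iam_policy_document.autoscaling_assume.json
}
```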

✋ I have searched the open/closed issues and my issue is not listed.

Versions

Module version [Required]: v2.0.0
Terraform version: v1.5.5

Provider version(s): provider registry.terraform.io/hashicorp/aws v5.44.0

Reproduction Code [Required]

module "emr" {
  source        = "terraform-aws-modules/emr/aws"
  version       = "v2.0.0"
  name          = var.cluster_name
  release_label = var.release_label
  applications  = var.applications

  bootstrap_action = var.bootstrap_action

  vpc_id = data.terraform_remote_state.vpc.outputs.vpc_id
  log_uri = var.log_uri
  ebs_root_volume_size = var.ebs_root_volume_size
  step_concurrency_level = var.step_concurrency_level
  termination_protection = var.termination_protection
  ec2_attributes = {
    subnet_id = var.subnet_id
    key_name  = "airflow"
  }
  configurations_json = var.configurations_json
  iam_role_use_name_prefix = false
  iam_instance_profile_policies = {
    AmazonElasticMapReduceforEC2Role = "arn:aws:iam::aws:policy/service-role/AmazonElasticMapReduceforEC2Role"
    AWSGlueConsoleFullAccess = "arn:aws:iam::aws:policy/AWSGlueConsoleFullAccess"
    SecretManagerProductionReadWrite = aws_iam_policy.secret_manager_read_write.arn
    AppFlow = aws_iam_policy.app_flow.arn
  }

  # Master Group
  master_instance_group = {
    name           = "Master - 1"
    instance_count = var.master_instance_count
    instance_type  = var.master_instance_type
  }

  # Core Group
  core_instance_group = {
    name               = "Core - 1"
    instance_count     = var.core_instance_count
    instance_type      = var.core_instance_type
    autoscaling_policy = jsonencode({
    "Constraints" : {
      "MinCapacity" : 2,
      "MaxCapacity" : 8
    },
    "Rules" : [
      {
        "Action" : {
          "SimpleScalingPolicyConfiguration" : {
            "ScalingAdjustment" : 1,
            "CoolDown" : 1200,
            "AdjustmentType" : "CHANGE_IN_CAPACITY"
          }
        },
        "Trigger" : {
          "CloudWatchAlarmDefinition" : {
            "MetricName" : "ContainerPending",
            "ComparisonOperator" : "GREATER_THAN_OR_EQUAL",
            "Statistic" : "AVERAGE",
            "Period" : 300,
            "EvaluationPeriods" : 3,
            "Unit" : "COUNT",
            "Namespace" : "AWS/ElasticMapReduce",
            "Threshold" : 6
          }
        },
        "Name" : "prod_emr_core_scale_out"
      },
      {
        "Action" : {
          "SimpleScalingPolicyConfiguration" : {
            "ScalingAdjustment" : -1,
            "CoolDown" : 600,
            "AdjustmentType" : "CHANGE_IN_CAPACITY"
          }
        },
        "Trigger" : {
          "CloudWatchAlarmDefinition" : {
            "MetricName" : "ContainerPending",
            "ComparisonOperator" : "LESS_THAN_OR_EQUAL",
            "Statistic" : "AVERAGE",
            "Period" : 300,
            "EvaluationPeriods" : 8,
            "Unit" : "COUNT",
            "Namespace" : "AWS/ElasticMapReduce",
            "Threshold" : 5
          }
        },
        "Name" : "prod_emr_core_scale_in"
      }
    ]
    })
  }

    # Security Groups
  managed_security_group_use_name_prefix = false
  master_security_group_rules = [ ... ]
  slave_security_group_rules = [ ... ]
}

Steps to reproduce the behavior:

Create an EMR cluster using the above code (add a variables.tf with some values)
Note that the custom automatic scaling policies have a failed status

Expected behavior

The Service Role for EMR is able to assume the Autoscaling Role and there are no terraform drifts.

Actual behavior

The service Role for EMR is not able to assume the Autoscaling Role due to misconfigured trust-relationship for the Autoscaling Role, and I have a terraform drift since I had to manually change the trust relationship in the AWS Console.

Error deploying EMR due to insufficient ec2 permissions

Description

When doing a deployment via the example it generates an error with:

 Error: waiting for EMR Cluster (j-1P38LJGZQ23DK) to create: unexpected state 'TERMINATED_WITH_ERRORS', wanted target 'RUNNING, WAITING'. last error: VALIDATION_ERROR: Service role arn:aws:iam::xxx:role/oc-dev-data-science-emr-service-20230505161755695800000001 has insufficient EC2 permissions
│ 
│   with module.oc-aws-data-science.module.emr_instance_fleet.aws_emr_cluster.this[0],
│   on .terraform/modules/oc-aws-data-science.emr_instance_fleet/main.tf line 26, in resource "aws_emr_cluster" "this":
│   26: resource "aws_emr_cluster" "this" {

terraform code used:

module "emr" {
  source  = "terraform-aws-modules/emr/aws"
  version = "1.0.0"
  name    = "${local.full_name}-emr"

  release_label_filters = {
    emr6 = {
      prefix = "emr-6"
    }
  }
  applications = ["spark"]
  auto_termination_policy = {
    idle_timeout = 3600
  }

  bootstrap_action = {
    example = {
      name = "Just an example",
      path = "file:/bin/echo",
      args = ["Hello World!"]
    }
  }

  configurations_json = jsonencode([
    {
      "Classification" : "spark-env",
      "Configurations" : [
        {
          "Classification" : "export",
          "Properties" : {
            "JAVA_HOME" : "/usr/lib/jvm/java-1.8.0"
          }
        }
      ],
      "Properties" : {}
    }
  ])

  master_instance_group = {
    name           = "master-group"
    instance_count = 1
    instance_type  = "m5.xlarge"
  }

  core_instance_group = {
    name           = "core-group"
    instance_count = 2
    instance_type  = "c4.large"
  }

  task_instance_group = {
    name           = "task-group"
    instance_count = 2
    instance_type  = "c5.xlarge"
    bid_price      = "0.1"

    ebs_config = {
      size                 = 64
      type                 = "gp3"
      volumes_per_instance = 1
    }
    ebs_optimized = true
  }

  ebs_root_volume_size = 64
  ec2_attributes = {
    subnet_id = data.aws_subnets.intra.ids[0]
  }
  vpc_id = data.aws_vpc.this.id

  keep_job_flow_alive_when_no_steps = true
  list_steps_states                 = ["PENDING", "RUNNING", "CANCEL_PENDING", "CANCELLED", "FAILED", "INTERRUPTED", "COMPLETED"]
  log_uri                           = "s3://${var.s3_prevent_destroy == true ? aws_s3_bucket.oc-aws-data-science[0].id : aws_s3_bucket.oc-aws-data-science-destroy[0].id}/emr-logs/"

  scale_down_behavior    = "TERMINATE_AT_TASK_COMPLETION"
  step_concurrency_level = 3
  termination_protection = false
  visible_to_all_users   = true

}

versions.tf

terraform {
  required_version = ">= 1.1.5"

  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "~> 4.64"
    }
  }
}
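
The report does not record a root cause. One possibility worth ruling out, offered as an assumption rather than a diagnosis: AWS's v2 managed EMR service policy scopes many EC2 permissions to resources that carry the tag for-use-with-amazon-emr-managed-policies = true, so an untagged subnet can surface as an "insufficient EC2 permissions" validation error. Whether module version 1.0.0 attaches that policy is not confirmed here. A sketch of tagging an existing subnet follows; the variable and resource names are hypothetical.

```hcl
variable "emr_subnet_id" {
  description = "ID of the subnet passed to ec2_attributes.subnet_id (hypothetical name)"
  type        = string
}

# Adds a single tag to a subnet that is managed outside this configuration.
resource "aws_ec2_tag" "emr_subnet" {
  resource_id = var.emr_subnet_id
  key         = "for-use-with-amazon-emr-managed-policies"
  value       = "true"
}
```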

Master Security Group Rule Does Not Match Terraform Configuration

Description

I have attempted to add a specific rule to the auto-generated master security group to allow SSH access. The rule should allow port 22 access from anything on the internal network of my VPC. However, the result in AWS shows that the rule is open on all ports, not just 22.

I have the following section in my module configuration

master_security_group_rules = {
    "default" : {
      "cidr_blocks" : [
        "0.0.0.0/0"
      ],
      "description" : "Allow all egress traffic",
      "from_port" : 0,
      "ipv6_cidr_blocks" : [
        "::/0"
      ],
      "protocol" : "-1",
      "to_port" : 0,
      "type" : "egress"
    },
    "ssh_access" : {
      "cidr_blocks" : [data.terraform_remote_state.shared_state.outputs.shared_vpc_cidr],
      "description" : "Allow ssh traffic",
      "from_port" : 22,
      "ipv6_cidr_blocks" : ["::/0"],
      "protocol" : "-1",
      "to_port" : 22,
      "type" : "ingress"
    }
  }

Here is a screen shot of the respective SG ingress table.

As you can see it created the rule but for every port, not just 22.

The terraform state for the rule looks correct...

{
    "index_key": "ssh_access",
    "schema_version": 2,
    "attributes": {
      "cidr_blocks": [
        "10.3.0.0/16"
      ],
      "description": "Allow ssh traffic",
      "from_port": 22,
      "id": "sgrule-4093924578",
      "ipv6_cidr_blocks": [
        "::/0"
      ],
      "prefix_list_ids": null,
      "protocol": "-1",
      "security_group_id": "sg-04d6a320e7f0178ca",
      "security_group_rule_id": "",
      "self": false,
      "source_security_group_id": null,
      "timeouts": null,
      "to_port": 22,
      "type": "ingress"
    },
    "sensitive_attributes": [],
    "dependencies": [
      "data.aws_vpc.vpc",
      "data.terraform_remote_state.shared_state",
      "module.emr.aws_security_group.master"
    ]
  }
]
    }

The state for the security group itself does match what is in AWS but it doesn't match the rule...

"ingress": [
    {
      "cidr_blocks": [
        "10.3.0.0/16"
      ],
      "description": "Allow ssh traffic",
      "from_port": 0,
      "ipv6_cidr_blocks": [
        "::/0"
      ],
      "prefix_list_ids": [],
      "protocol": "-1",
      "security_groups": [],
      "self": false,
      "to_port": 0
    }
...
]
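
A likely explanation, stated as general AWS behavior rather than a confirmed diagnosis of this report: protocol "-1" means all protocols, and for that protocol EC2 allows traffic on all ports regardless of the ports specified, so from_port and to_port of 22 are not enforced. A sketch of the same rules with SSH restricted to TCP follows. The variable name is hypothetical, and omitting ipv6_cidr_blocks from the SSH rule (which otherwise admits every IPv6 source) assumes the module treats that key as optional.

```hcl
variable "shared_vpc_cidr" {
  description = "CIDR of the internal VPC network (hypothetical name)"
  type        = string
}

locals {
  master_security_group_rules = {
    default = {
      type             = "egress"
      description      = "Allow all egress traffic"
      protocol         = "-1"
      from_port        = 0
      to_port          = 0
      cidr_blocks      = ["0.0.0.0/0"]
      ipv6_cidr_blocks = ["::/0"]
    }
    ssh_access = {
      type        = "ingress"
      description = "Allow ssh traffic"
      protocol    = "tcp" # port ranges are only honored for tcp/udp, not "-1"
      from_port   = 22
      to_port     = 22
      cidr_blocks = [var.shared_vpc_cidr]
    }
  }
}
```

The local would then be passed to the module as master_security_group_rules = local.master_security_group_rules.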

Versions

Module version [Required]: 1.0.0
Terraform version: 1.4.6
Provider version(s): AWS v5.5.0

Reproduction Code [Required]

See above, using code that closely follows the private instance_groups https://github.com/terraform-aws-modules/terraform-aws-emr#private-cluster-w-instance-group

Steps to reproduce the behavior:

I am using terraform cloud with workspaces

Expected behavior

The SSH rule should be just port 22

Actual behavior

The SSH rule is all ports

Terminal Output Screenshot(s)

Error when installing livy application on EMR

Description

When using the EMR module for an EMR cluster (private) creation, I have issues when installing these applications: "livy", "spark", "hadoop". EMR cluster creation is finishing correctly but Livy and Spark history server are not started correctly.

If I'm installing the same cluster manually through AWS console, all applications are started correctly.

Versions

Module version [Required]: 1.2.0 (and previous versions)
Terraform version: 1.6.2
Provider version(s):

provider registry.terraform.io/hashicorp/archive v2.4.0
provider registry.terraform.io/hashicorp/aws v5.22.0

Reproduction Code [Required]

Just create the private EMR cluster using the module, without any specific configuration, with these applications
applications = ["spark", "hadoop", "livy"]

Expected behavior

Livy and the Spark history server start correctly and their URLs are displayed in the AWS console.

Actual behavior

Currently 2 of the 4 application URLs displayed in the AWS console (Applications tab) are empty because those applications did not start correctly.

If I create a cluster manually through the AWS console, all 4 application URLs are displayed correctly.

Unable to start EMR studio workspaces

Description

After deploying EMR studio using the modules/studio terraform module, users are unable to start EMR studio workspaces

Versions

Module version [Required]: v1.0.0
Terraform version: 1.2.9

Provider version(s): hashicorp/aws 4.54.0

Reproduction Code [Required]

module "test_emr_studio" {
  source  = "terraform-aws-modules/emr/aws//modules/studio"
  version = "1.0.0"

  name                = "test-emr-studio"
  auth_mode           = "IAM"
  default_s3_location = "s3://sometestbucket/test-emr-studio"

  vpc_id     = <vpcid>
  subnet_ids = [<subnet_ids>]
}

Steps to reproduce the behavior:

Deploy EMR studio using the above code
Have an IAM user with only EMR studio privileges (no EC2:CreateNetworkInterface privileges) log in to AWS
The user starts an idle workspace

Expected behavior

User able to start an idle workspace

Actual behavior

User unable to start an idle workspace

CloudTrail shows Client.UnauthorizedOperation for CreateNetworkInterface

{
    "eventVersion": "1.08",
    "userIdentity": {
        "type": "AssumedRole",
        "principalId": "XXXXXXXXXXXXXX:ElasticMapReduceEditorsSession",
        <removed>
        },
        "invokedBy": "elasticmapreduce.amazonaws.com"
    },
    "eventTime": "2023-05-18T06:36:39Z",
    "eventSource": "ec2.amazonaws.com",
    "eventName": "CreateNetworkInterface",
    "awsRegion": "us-east-1",
    "sourceIPAddress": "elasticmapreduce.amazonaws.com",
    "userAgent": "elasticmapreduce.amazonaws.com",
    "errorCode": "Client.UnauthorizedOperation",
    "errorMessage": "You are not authorized to perform this operation. Encoded authorization failure message: <removed>",
    "requestParameters": {
        "subnetId": "subnet-XXXXXXXXXXXXXXXXXX",
        "description": "ENI for attaching editor e-XXXXXXXXXXXXXXXX",
        "groupSet": {
            "items": [
                {
                    "groupId": "sg-XXXXXXXXXXXXXXXX"
                }
            ]
        },
        "privateIpAddressesSet": {},
        "tagSpecificationSet": {
            "items": [
                {
                    "resourceType": "network-interface",
                    "tags": [
                        {
                            "key": "for-use-with-amazon-emr-managed-policies",
                            "value": "true"
                        }
                    ]
                }
            ]
        },
        "clientToken": "XXXXXXXXXX"
    },
    "responseElements": null,
    "requestID": "XXXXX",
    "eventID": "XXXXXXXXX",
    "readOnly": false,
    "eventType": "AwsApiCall",
    "managementEvent": true,
    "recipientAccountId": "XXXXXXXXXXXX",
    "eventCategory": "Management"
}

Additional context

There are discrepancies between the service role policy provided by AWS in https://docs.aws.amazon.com/emr/latest/ManagementGuide/emr-studio-service-role.htm and the one in modules/studio/main.tf:data.aws_iam_policy_document.service
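
The issue does not list which statements differ. As a sketch of the kind of tag-scoped statements that would permit the failing call, assuming the service role is expected to create network interfaces only with, and only in resources carrying, the for-use-with-amazon-emr-managed-policies tag seen in the CloudTrail event: the data source name and sids below are hypothetical, and the exact scoping should be checked against the AWS page linked above.

```hcl
data "aws_iam_policy_document" "studio_eni_creation" {
  # The new ENI must be created with the EMR tag in the request.
  statement {
    sid       = "CreateEniWithEmrTag" # hypothetical sid
    actions   = ["ec2:CreateNetworkInterface"]
    resources = ["arn:aws:ec2:*:*:network-interface/*"]

    condition {
      test     = "StringEquals"
      variable = "aws:RequestTag/for-use-with-amazon-emr-managed-policies"
      values   = ["true"]
    }
  }

  # The subnet and security groups the ENI attaches to must already be tagged.
  statement {
    sid     = "CreateEniInTaggedSubnetAndSg" # hypothetical sid
    actions = ["ec2:CreateNetworkInterface"]
    resources = [
      "arn:aws:ec2:*:*:subnet/*",
      "arn:aws:ec2:*:*:security-group/*",
    ]

    condition {
      test     = "StringEquals"
      variable = "aws:ResourceTag/for-use-with-amazon-emr-managed-policies"
      values   = ["true"]
    }
  }
}
```

If statements like these are in effect, the subnets passed in subnet_ids would also need the tag for the call to be authorized.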

Insufficient Role/Rolebinding for EMR on EKS Virtual Cluster

Description

The virtual-cluster module defines a set of policy rules for the Role and RoleBinding here. It is missing some rules when compared to the official AWS documentation here. Namely:

- apiGroups: [""]
  resources: ["persistentvolumeclaims"]
  verbs: ["get", "list", "watch", "describe", "create", "edit", "delete", "annotate", "patch", "label"]
- apiGroups: ["networking.k8s.io"]
  resources: ["ingresses"]
  verbs: ["get", "list", "watch", "describe", "create", "edit", "delete", "annotate", "patch", "label"]
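
Expressed with the Kubernetes provider, the two missing rules might look like the sketch below. It is written as a separate, hypothetical role so it stands alone; in the module the rule blocks would more naturally be added to the existing kubernetes_role_v1 resource. The variable, local, and resource names are assumptions, and the role binding is left out.

```hcl
variable "namespace" {
  description = "Namespace used by the EMR on EKS virtual cluster (hypothetical name)"
  type        = string
}

locals {
  # Verb list copied from the two rules quoted above.
  emr_extra_verbs = ["get", "list", "watch", "describe", "create", "edit", "delete", "annotate", "patch", "label"]
}

resource "kubernetes_role_v1" "emr_containers_extra" {
  metadata {
    name      = "emr-containers-extra" # hypothetical name
    namespace = var.namespace
  }

  rule {
    api_groups = [""]
    resources  = ["persistentvolumeclaims"]
    verbs      = local.emr_extra_verbs
  }

  rule {
    api_groups = ["networking.k8s.io"]
    resources  = ["ingresses"]
    verbs      = local.emr_extra_verbs
  }
}
```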

[x] ✋ I have searched the open/closed issues and my issue is not listed.

Versions

Module version [Required]:
Terraform version:
Terraform v1.4.2
Provider version(s):

provider registry.terraform.io/hashicorp/aws v5.8.0

Reproduction Code [Required]

git clone https://github.com/terraform-aws-modules/terraform-aws-emr

Expected behavior

The resource kubernetes_role_v1 in modules/virtual-cluster/main.tf is expected to include all of the rules listed here.

Actual behavior

The resource is missing a couple of rules.
