silinternational / ecs-deploy

Simple shell script for initiating blue-green deployments on Amazon EC2 Container Service (ECS)

License: MIT License

ecs-deploy

This script uses the Task Definition and Service entities in Amazon's ECS to instigate an automatic blue/green deployment.

NOTE: Maintenance Only

ecs-deploy is now in maintenance mode. In other words, we are considering it "feature complete" and will generally only consider PRs if they are bugfixes or are to add support for new AWS CLI features.

Usage

One of the following is required:
    -n | --service-name     Name of service to deploy
    -d | --task-definition  Name of task definition to deploy

Required arguments:
    -k | --aws-access-key         AWS Access Key ID. May also be set as environment variable AWS_ACCESS_KEY_ID
    -s | --aws-secret-key         AWS Secret Access Key. May also be set as environment variable AWS_SECRET_ACCESS_KEY
    -r | --region                 AWS Region Name. May also be set as environment variable AWS_DEFAULT_REGION
    -p | --profile                AWS Profile to use - If you set this aws-access-key, aws-secret-key and region are not needed
       | --aws-instance-profile   Use the IAM role associated with the current AWS instance. Can only be used from within a running AWS instance. If you set this, aws-access-key and aws-secret-key are not needed
    -c | --cluster                Name of ECS cluster
    -n | --service-name           Name of service to deploy
    -i | --image                  Name of Docker image to run, ex: repo/image:latest
                                  Format: [domain][:port][/repo][/][image][:tag]
                                  Examples: mariadb, mariadb:latest, private.registry.com:8000/repo/image:tag

Optional arguments:
    -a | --aws-assume-role        ARN for AWS Role to assume for ecs-deploy operations.
    -D | --desired-count          The number of instantiations of the task to place and keep running in your service.
    -m | --min                    minimumHealthyPercent: The lower limit on the number of running tasks during a deployment. (default: 100)
    -M | --max                    maximumPercent: The upper limit on the number of running tasks during a deployment. (default: 200)
    -t | --timeout                Default is 90s. Script monitors ECS Service for new task definition to be running.
    -e | --tag-env-var            Get image tag name from environment variable. If provided this will override value specified in image name argument.
    -to | --tag-only              New tag to apply to all images defined in the task (multi-container task). If provided this will override value specified in image name argument.
    --max-definitions             Number of Task Definition Revisions to persist before deregistering oldest revisions.
                                  Note: This number must be 1 or higher (i.e. keep only the current revision ACTIVE).
                                        Max definitions causes all task revisions not matching criteria to be deregistered, even if they're created manually.
                                        Script will only perform deregistration if deployment succeeds.
    --task-definition-file        File used as task definition to deploy
    --enable-rollback             Rollback task definition if new version is not running before TIMEOUT
    --use-latest-task-def         Will use the most recently created task definition as its base, rather than the last used.
    --force-new-deployment        Force a new deployment of the service. Default is false.
    --skip-deployments-check      Skip deployments check for services that take too long to drain old tasks
    --run-task                    Run created task now. If you set this, service-name is not needed.
    --wait-for-success            Wait for task execution to complete and to receive the exitCode 0.
    --launch-type                 The launch type on which to run your task. (https://docs.aws.amazon.com/cli/latest/reference/ecs/run-task.html)
    --platform-version            The Fargate platform version on which to run your task. (https://docs.aws.amazon.com/cli/latest/reference/ecs/run-task.html)
    --network-configuration       The network configuration for the task. This parameter is required for task definitions that use
                                      the awsvpc network mode to receive their own elastic network interface, and it is not supported
                                      for other network modes. (https://docs.aws.amazon.com/cli/latest/reference/ecs/run-task.html)
    --copy-task-definition-tags   Copy the existing task definition tags to the new task definition revision
    -v | --verbose                Verbose output
         --version                Display the version

Requirements:
    aws:  AWS Command Line Interface
    jq:   Command-line JSON processor

Examples:
  Simple deployment of a service (Using env vars for AWS settings):

    ecs-deploy -c my-cluster-name -n my-service-name -i my.private.repo.com/frontend_container:latest

  All options:

    ecs-deploy -k ABC123 -s SECRETKEY -r us-east-1 -c my-cluster-name -n my-service-name -i my.private.repo.com/frontend_container -m 50 -M 100 -t 240 -D 2 -e CI_TIMESTAMP -v

  Updating a task definition with a new image:

    ecs-deploy -d my-task-definition -i my.private.repo.com/frontend_container:17

  Using profiles (for STS delegated credentials, for instance):

    ecs-deploy -p my-profile -c my-cluster-name -n my-service-name -i my.private.repo.com/frontend_container -t 240 -e CI_TIMESTAMP -v

  Update just the tag on whatever image is found in ECS Task (supports multi-container tasks):

    ecs-deploy -c staging -n core-service -to 0.1.899 -i ignore

Notes:
  - If a tag is not found in image and an ENV var is not used via -e, it will default the tag to "latest"
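
The -m and -M options are percentages of the desired count, so it can help to work out the actual task numbers before picking values. The sketch below is not part of ecs-deploy; it assumes, based on the ECS documentation as I understand it, that the minimum is rounded up and the maximum is rounded down. The function name `deployment_bounds` is hypothetical.

```shell
# deployment_bounds DESIRED_COUNT MIN_PERCENT MAX_PERCENT
# Prints how many tasks may be running during a deployment.
# Assumption: minimum rounds up, maximum rounds down.
deployment_bounds() (
  desired=$1 min_pct=$2 max_pct=$3
  # Integer ceiling of desired * min_pct / 100
  min_tasks=$(( (desired * min_pct + 99) / 100 ))
  # Integer floor of desired * max_pct / 100
  max_tasks=$(( desired * max_pct / 100 ))
  echo "min=$min_tasks max=$max_tasks"
)

deployment_bounds 4 50 200   # prints: min=2 max=8
deployment_bounds 3 50 150   # prints: min=2 max=4
```

With the defaults (-m 100 -M 200) and a desired count of 1, this gives min=1 max=2: the old task keeps running until the new one is up, which is why spare cluster capacity is needed.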

Installation

Install and configure aws-cli
Install jq
Install ecs-deploy:

curl https://raw.githubusercontent.com/silinternational/ecs-deploy/master/ecs-deploy | sudo tee /usr/bin/ecs-deploy
sudo chmod +x /usr/bin/ecs-deploy
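
Since the script depends on both aws and jq being on the PATH, a quick check after installation can save a confusing first run. This is a minimal sketch rather than anything shipped with ecs-deploy; the function name `missing_tools` is hypothetical.

```shell
# Print the name of every listed tool that cannot be found on PATH.
missing_tools() (
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || echo "$tool"
  done
)

missing=$(missing_tools aws jq)
if [ -n "$missing" ]; then
  # Word splitting is intended so the names print on one line.
  # A real wrapper script would probably stop here.
  echo "Missing required tools:" $missing >&2
fi
```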

How it works

Note: Some nouns in the next paragraphs are capitalized to indicate that they are words which have specific meanings in AWS

Remember that in the EC2 Container Service, the relationship between the group of containers which together provide a useful application (e.g. a database, web frontend, and perhaps some for maintenance/cron) is specified in a Task Definition. The Task Definition then acts as a sort of template for actually running the containers in that group. That resulting group of containers is known as a Task. Due to the way docker implements networking, generally you can only run one Task per Task Definition per Container Instance (the virtual machines providing the cluster infrastructure).

Task Definitions are automatically version controlled---the actual name of a Task Definition is composed of two parts, the Family name, and a version number, like so: phpMyAdmin:3
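
The two-part name can be taken apart with plain parameter expansion, which is handy when comparing revisions in a wrapper script. A small sketch; the helper names `task_def_family` and `task_def_revision` are hypothetical, and the ARN form is accepted too by dropping everything up to the last slash.

```shell
# Print the Family name from "family:revision" or from a full task definition ARN.
task_def_family() (
  without_revision=${1%:*}          # drop ":revision"
  echo "${without_revision##*/}"    # drop any "arn:.../" prefix
)

# Print the revision number (everything after the last colon).
task_def_revision() (
  echo "${1##*:}"
)

task_def_family phpMyAdmin:3     # prints: phpMyAdmin
task_def_revision phpMyAdmin:3   # prints: 3
```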

Since a Task is supposed to be a fully self-contained "worker unit" of a broader application, Amazon uses another configuration entity, Services, to manage the number of Tasks running at any given time. As Tasks are just instantiations of Task Definitions, a Service is just a binding between a specified revision of a Task Definition, and the number of Tasks which should be run from it.

Conveniently, Amazon allows this binding to be updated, either to change the number of Tasks running or to change the Task Definition they are built from. In the former case, the Service will respond by building or killing Tasks to bring the count to specifications. In the latter case, however, it will do a blue/green deployment, that is, before killing any of the old Tasks, it will first ensure that a new Task is brought up and ready to use, so that there is no loss of service.

Naturally, enough computing resources must be available in the ECS cluster for any of this to work.

Consequently, all that is needed to deploy a new version of an application is to update the Service which is running its Tasks to point at a new version of the Task Definition. ecs-deploy uses the Python aws utility (the AWS CLI) to do this. It:

1. Pulls the JSON representation of the in-use Task Definition, or of the most recently created one if --use-latest-task-def is given
2. Edits it
3. Defines a new version, with the changes
4. Updates the Service to use the new version
5. Waits, querying Amazon's API to make sure that the Service has been able to create a new Task
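
The final waiting step is essentially a poll with a timeout. The sketch below shows that pattern in isolation; it is not the script's actual loop, and both `poll_until` and the `new_task_is_running` check mentioned in the comment are hypothetical names.

```shell
# poll_until TIMEOUT INTERVAL COMMAND [ARGS...]
# Runs COMMAND every INTERVAL seconds until it succeeds.
# Gives up (status 1) once the next check would fall at or after TIMEOUT seconds.
poll_until() {
  _timeout=$1 _interval=$2
  shift 2
  _elapsed=0
  until "$@"; do
    _elapsed=$((_elapsed + _interval))
    [ "$_elapsed" -lt "$_timeout" ] || return 1
    sleep "$_interval"
  done
}

# Hypothetical usage, where new_task_is_running would query the ECS API:
#   poll_until 90 10 new_task_is_running || echo "ERROR: New task definition not running within 90 seconds"
```

Keeping the check as a separate command makes it easy to swap in a different readiness test without touching the timing logic.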

The second step merits more explanation: since a Task Definition [may] define multiple containers, the question arises, "what must be changed to create a new revision?" Empirically, the surprising answer is nothing; Amazon allows you to create a new but identical version of a Task Definition, and the Service will still do a blue/green deployment of identical tasks.

Nevertheless, since the system uses docker, the assumption is that improvements to the application are built into its container images, which are then pushed into a repository (public or private), to then be pulled down for use by ECS. This script therefore uses the specified image parameter as a modification key to change the tag used by a container's image. It looks for images with the same repository name as the specified parameter, and updates its tag to the one in the specified parameter.

A direct consequence of this is that if you define more than one container in your Task Definition to use the same image, all of them will be updated to the specified tag, even if you set them to use different tags initially. But this is considered to be an unlikely use case.
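
The matching rule described above can be sketched in a few lines of shell. This is an illustration of the idea, not the script's own parsing code; the function names are hypothetical, and image digests (`@sha256:...`) are deliberately not handled.

```shell
# Print the repository part of an image reference.
# A colon only starts a tag when no slash follows it, so a registry port
# such as example.com:5000/my-app is not mistaken for a tag.
image_repo() (
  case ${1##*/} in
    *:*) echo "${1%:*}" ;;
    *)   echo "$1" ;;
  esac
)

# Print the tag part of an image reference, defaulting to "latest".
image_tag() (
  last=${1##*/}
  case $last in
    *:*) echo "${last##*:}" ;;
    *)   echo "latest" ;;
  esac
)

# retag_if_same_repo CURRENT NEW
# Prints CURRENT with NEW's tag when both name the same repository,
# otherwise prints CURRENT unchanged.
retag_if_same_repo() (
  current=$1 new=$2
  if [ "$(image_repo "$current")" = "$(image_repo "$new")" ]; then
    echo "$(image_repo "$current"):$(image_tag "$new")"
  else
    echo "$current"
  fi
)

retag_if_same_repo my.private.repo.com/frontend_container:16 my.private.repo.com/frontend_container:17
# prints: my.private.repo.com/frontend_container:17
```

Applying `retag_if_same_repo` to every container's image is what makes all containers that share a repository move to the new tag together, while containers built from other repositories are left alone.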

This behavior allows two possible processes to specify which images, and therefore which configurations, to deploy. First, you may set the tag to always be latest (or some other static value), like so:

ecs-deploy -c my-cluster-name -n my-service-name -i my.private.repo.com/frontend_container:latest

This will result in identical new versions of the Task Definition being created, but the Service will still do a blue/green deployment, and in doing so will pull down the latest version (if you previously pushed it into the registry).

Alternatively, you may specify some other means of obtaining the tag, since the script evals the image string. You could use git tags as a map to docker tags:

ecs-deploy -c my-cluster-name -n my-service-name -i 'my.private.repo.com/frontend_container:`git describe`'

Or perhaps just read the docker tag from another file in your development:

ecs-deploy -c my-cluster-name -n my-service-name -i 'my.private.repo.com/frontend_container:$(< VERSION)'

In any case, just make sure your process builds, tags, and pushes the docker image you use to the repository before running this script.

Use Environment Variable for tag name value

In some cases you may want to use an environment variable for the tag name of your image. For example, we want to use a unique docker image/tag for each task definition. This gives us the ability to revert/rollback changes by just selecting a previous task definition and updating the service.

Using the -e argument you can provide the name of an environment variable that holds the value you wish to use for the tag.

For example:

ecs-deploy -c my-cluster-name -n my-service-name -i my.private.repo.com/frontend_container -e CI_TIMESTAMP
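
The order in which the tag is chosen (environment variable first, then the tag in the image name, then "latest") can be sketched as follows. This is an approximation of the documented behavior, not the script's code; `resolve_tag` is a hypothetical name, and `printenv` is used so that only exported variables are seen.

```shell
# resolve_tag IMAGE [ENV_VAR_NAME]
# Precedence: value of the named environment variable, then the tag in IMAGE, then "latest".
# Fails (non-zero status, nothing on stdout) when the named variable is empty or unset.
resolve_tag() (
  image=$1 var_name=${2:-}
  last=${image##*/}   # final path component, the only place a tag can appear
  if [ -n "$var_name" ]; then
    tag=$(printenv "$var_name" || true)
    [ -n "$tag" ] || echo "The ENV var named $var_name is empty or not set" >&2
  else
    case $last in
      *:*) tag=${last##*:} ;;
      *)   tag=latest ;;
    esac
  fi
  [ -n "$tag" ] && echo "$tag"
)

resolve_tag my.private.repo.com/frontend_container:17   # prints: 17
resolve_tag my.private.repo.com/frontend_container      # prints: latest
```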

AWS IAM Policy Configuration

Here's an example of a suitable custom policy for AWS IAM:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "ecs:DeregisterTaskDefinition",
        "ecs:DescribeServices",
        "ecs:DescribeTaskDefinition",
        "ecs:DescribeTasks",
        "ecs:ListTasks",
        "ecs:ListTaskDefinitions",
        "ecs:RegisterTaskDefinition",
        "ecs:StartTask",
        "ecs:StopTask",
        "ecs:UpdateService",
        "iam:PassRole"
      ],
      "Resource": "*"
    }
  ]
}

Troubleshooting

You must provide AWS credentials in one of the supported formats. If you do not, you'll see some error output from the AWS CLI, something like:
```
 You must specify a region. You can also configure your region by running "aws configure".
```
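
A preflight check in your CI job can turn that AWS CLI message into something more specific. The sketch below only looks at the environment variables named in the usage text; it assumes that a named profile (AWS_PROFILE) supplies everything, and it knows nothing about instance profiles or settings stored in the AWS config files. `missing_aws_settings` is a hypothetical name.

```shell
# Print the name of each AWS setting that is missing from the environment.
# Prints nothing when AWS_PROFILE is set or when all three variables are present.
missing_aws_settings() (
  if [ -z "${AWS_PROFILE:-}" ]; then
    [ -n "${AWS_ACCESS_KEY_ID:-}" ]     || echo AWS_ACCESS_KEY_ID
    [ -n "${AWS_SECRET_ACCESS_KEY:-}" ] || echo AWS_SECRET_ACCESS_KEY
    [ -n "${AWS_DEFAULT_REGION:-}" ]    || echo AWS_DEFAULT_REGION
  fi
)

# Hypothetical usage before calling ecs-deploy:
#   [ -z "$(missing_aws_settings)" ] || { missing_aws_settings >&2; exit 1; }
```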

Testing

Automated tests are performed using bats. The goal of testing is to ensure that updates/changes do not break core functionality. Unfortunately not all of ecs-deploy is testable since portions interact with AWS APIs to perform actions. So for now any parsing/processing of data locally is tested.

Any new functionality and pull requests should come with tests as well (if possible).

GitHub Actions Support

GitHub Actions support is available. Add a job similar to the one below to your workflow YAML file. Parameters are passed to the ecs-deploy tool in the 'with' section. Each parameter needs two entries: one named after the parameter with a _cmd suffix, holding the command-line option (for example, aws_access_key_cmd: '--aws-access-key'), and one named after the parameter itself (for example, aws_access_key), holding the value.

deploy_to_ecs:
  name: 'Deploy updated container image via blue/green deployment to ECS service.'
  runs-on: ubuntu-18.04
  steps:
  - uses: silinternational/ecs-deploy@master
    env:
      AWS_ACCESS_KEY_ID: ${{ secrets.AWS_ACCESS_KEY_ID }}
      AWS_SECRET_ACCESS_KEY: ${{ secrets.AWS_SECRET_ACCESS_KEY }}
      AWS_DEFAULT_REGION: 'us-east-1'
    with:
      aws_access_key_cmd: '--aws-access-key'
      aws_access_key: ${{ secrets.AWS_ACCESS_KEY_ID }}
      aws_secret_key_cmd: '--aws-secret-key'
      aws_secret_key: ${{ secrets.AWS_SECRET_ACCESS_KEY }}
      cluster_cmd: '--cluster'
      cluster: 'cluster-name'
      image_cmd: '--image'
      image: '{amazon_id}.dkr.ecr.us-east-1.amazonaws.com/cluster-name/image_name:latest'
      region_cmd: '--region'
      region: 'us-east-1'
      service_name_cmd: '--service-name'
      service_name: 'aws-service-name'
      timeout_cmd: '--timeout'
      timeout: '360'

ecs-deploy's Issues

Docker host with port will break the script.

Running the command with the -i option (which is required) while supplying a port on the docker host results in:

docker run --rm --env-file=./.aws.env silintl/ecs-deploy -c my-app-cluster -n MyAppServer -i example.com:5000/my-app:latest
Current task definition: <irrelevant information>
You must specify an image with a tag

This prevents me from using the tool with my docker host's current configuration.

I am currently investigating my own fix.

Double slash in image name

#71

Command:
/var/www/ecs-deploy.sh --cluster "batman-cluster" --service-name "batman-service" -i asotirov/tickey-serverside:r-batman-1.94

Result:
Using image name: asotirov_//_tickey-serverside:r-batman-1.94
Current task definition: arn:aws:ecs:eu-west-1:267034406528:task-definition/batman-task:24
New task definition: arn:aws:ecs:eu-west-1:267034406528:task-definition/batman-task:25

Failing to change organization

When I use ecs-deploy to change organization and tag, such as from ctrp/ctrp-frontend:XXX to fnlcr/ctrp-frontend:05-02-17-1654 I end up running ctrp/ctrp-frontend:05-02-17-1044. Seems it doesn't change the organization passed to it, just the tag?

Appears to work from the docker output, but in AWS it only changed the tag.

$ docker run -it --rm -e AWS_ACCESS_KEY_ID=$AWS_ACCESS_KEY_ID -e AWS_SECRET_ACCESS_KEY=$AWS_SECRET_ACCESS_KEY -e AWS_DEFAULT_REGION=us-east-1 silintl/ecs-deploy --cluster CTRP-IntTest-Frontend --service-name CTRP-ctrp-frontend-INTTEST -i $DOCKER_IMAGE:$DATE
Unable to find image 'silintl/ecs-deploy:latest' locally
latest: Pulling from silintl/ecs-deploy
Status: Downloaded newer image for silintl/ecs-deploy:latest
Using image name: **fnlcr/ctrp-frontend:05-02-17-1654**
Current task definition: arn:aws:ecs:us-east-1:127516845550:task-definition/CTRP-ctrp-frontend-INTTEST:64
New task definition: arn:aws:ecs:us-east-1:127516845550:task-definition/CTRP-ctrp-frontend-INTTEST:65
ERROR: New task definition not running within 90 seconds

InvalidSignatureException when calling the DescribeServices operation

I run it like this.

$ docker run -it --rm -e AWS_ACCESS_KEY_ID=XXX -e AWS_SECRET_ACCESS_KEY=XXX -e AWS_DEFAULT_REGION=eu-west-1 silintl/ecs-deploy --cluster "batman-cluster" --service-name "batman-service" -i "iii:feature/xxx"

Using image name: iii:feature/xxx

An error occurred (InvalidSignatureException) when calling the DescribeServices operation: Signature expired: 20160925T124307Z is now earlier than 20160925T125643Z (20160925T13014Z - 5 min.)
Current task definition:
usage: aws [options] <command> <subcommand> [<subcommand> ...] [parameters]
To see help text, you can run:

  aws help
  aws <command> help
  aws <command> <subcommand> help
aws: error: argument --task-definition: expected one argument

Error parsing parameter 'cli-input-json': Invalid JSON: No JSON object could be decoded
JSON received:
New task definition:
usage: aws [options] <command> <subcommand> [<subcommand> ...] [parameters]
To see help text, you can run:

  aws help
  aws <command> help
  aws <command> <subcommand> help
aws: error: argument --task-definition: expected one argument

System information

From Windows 7

docker version

Client:
Version: 1.10.3
API version: 1.22
Go version: go1.5.3
Git commit: 20f81dd
Built: Thu Mar 10 21:49:11 2016
OS/Arch: windows/amd64

Server:
Version: 1.11.1
API version: 1.23
Go version: go1.5.4
Git commit: 5604cbe
Built: Wed Apr 27 00:34:20 2016
OS/Arch: linux/amd64

docker-machine version
docker-machine.exe version 0.6.0, build e27fb87

add an installation section to readme, getting errors related to "jq"

Maybe this is obvious to most, but I think at least some (myself included) would find it valuable to have an "installation" section in the readme.

Here's what I did and the trouble I'm having:

download the ecs-deploy script: wget https://raw.githubusercontent.com/silinternational/ecs-deploy/master/ecs-deploy to a deploy directory in my project folder. (it's a dockerized node app)
run the script:

./deploy/ecs-deploy -k my_key -s my_super_long_secret -r us-east-1 -c default -n studyloop-queues -i my_id.dkr.ecr.us-east-1.amazonaws.com/my_image:latest

This gives me the errors:

./deploy/ecs-deploy: line 240: jq: command not found
Exception ignored in: <_io.TextIOWrapper name='<stdout>' mode='w' encoding='UTF-8'>
BrokenPipeError: [Errno 32] Broken pipe
Current task definition: 
usage: aws [options] <command> <subcommand> [<subcommand> ...] [parameters]
To see help text, you can run:

  aws help
  aws <command> help
  aws <command> <subcommand> help
aws: error: argument --task-definition: expected one argument
./deploy/ecs-deploy: line 250: jq: command not found
./deploy/ecs-deploy: line 253: jq: command not found

Error parsing parameter 'cli-input-json': Invalid JSON: Expecting value: line 1 column 1 (char 0)
JSON received: 
New task definition: 
usage: aws [options] <command> <subcommand> [<subcommand> ...] [parameters]
To see help text, you can run:

  aws help
  aws <command> help
  aws <command> <subcommand> help
aws: error: argument --task-definition: expected one argument
./deploy/ecs-deploy: line 269: jq: command not found
Exception ignored in: <_io.TextIOWrapper name='<stdout>' mode='w' encoding='UTF-8'>
BrokenPipeError: [Errno 32] Broken pipe
./deploy/ecs-deploy: line 272: jq: command not found
./deploy/ecs-deploy: line 269: jq: command not found
Exception ignored in: <_io.TextIOWrapper name='<stdout>' mode='w' encoding='UTF-8'>
BrokenPipeError: [Errno 32] Broken pipe
./deploy/ecs-deploy: line 272: jq: command not found
./deploy/ecs-deploy: line 269: jq: command not found
Exception ignored in: <_io.TextIOWrapper name='<stdout>' mode='w' encoding='UTF-8'>
BrokenPipeError: [Errno 32] Broken pipe
./deploy/ecs-deploy: line 272: jq: command not found

I have the aws cli configured and I'm using the default Ubuntu 15 shell.

Error: The ENV var named is empty or not set

background info: MyService already exists and is at revision 1.
ecs-deploy is at tag 2.2.0 in this test.

command line args:
$ ecs-deploy -c mycluster -n MyService -i myimagerepo/image:latest -v

result:

+ aws ecs describe-task-definition --task-def arn:aws:ecs:us-east-1:AWSACCOUNTNUMBER:task-definition/TaskDefinition:2
+ [[ myimagerepo/image:latest =~ ^[^:]+:[^:/]+$ ]]
++ echo myimagerepo/image:latest
++ cut -d: -f 1
+ im=myimagerepo/image
++ echo myimagerepo/image:latest
++ cut -d: -f 2
+ tag=latest
+ [[ -n latest ]]
+ tag=
+ [[ x == \x ]]
+ echo 'The ENV var named  is empty or not set'
The ENV var named  is empty or not set
+ exit 1

support --version

Please add a --version option to print out the current version and exit.

"TASK_DEFINITION_ARN: unbound variable" when using with task definition "-d" option

When using the -d or --task-definition option, an error is produced:

/ecs-deploy: line 464: TASK_DEFINITION_ARN: unbound variable

It seems as though the TASK_DEFINITION_ARN is not getting set since SERVICE is false (as it should be), though that branch of code is what is responsible for setting TASK_DEFINITION_ARN.
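
For readers hitting the same message: under `set -u`, expanding a variable that was never assigned aborts the script, while `${VAR:-}` expands to an empty string instead. The sketch below shows that guard; it is not the fix that was applied to ecs-deploy, and `task_definition_ref` and `TASK_DEFINITION_NAME` are hypothetical names (only TASK_DEFINITION_ARN comes from the error above).

```shell
# Print whichever task definition reference is available.
# The ${VAR:-} form keeps `set -u` from aborting when a variable was never set.
task_definition_ref() (
  set -u
  if [ -n "${TASK_DEFINITION_ARN:-}" ]; then
    echo "$TASK_DEFINITION_ARN"
  elif [ -n "${TASK_DEFINITION_NAME:-}" ]; then
    # Hypothetical variable holding the value passed with -d
    echo "$TASK_DEFINITION_NAME"
  else
    echo "No task definition available" >&2
    false
  fi
)
```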

Script reports failure if task count is 0.

When running the script on a service that has a task count of 0, it will fail while trying to see if the task has started. This has occurred once while I was debugging my setup and disabled a portion of the services and some new code was deployed. The only issue here is it returns a non-zero response which causes the CI I'm using to stop in its tracks. The script should have a graceful way of handling services that won't be instantiated right away.

Please support designating TASK_DEFINITION_ARN

I want to designate TASK_DEFINITION_ARN so that I can use the latest task definition when updating the ECS service, because the getCurrentTaskDefinition function does not get the latest task definition.

Ex.

ecs-deploy -c $cluster_name -n $service_name --task-definition-arn $task_definition_arn -i $image

How to get last task definition:

LAST_TASK_DEFINITION_ARN=$(aws ecs list-task-definitions --status ACTIVE --family-prefix "${FAMILY_PREFIX}" --sort DESC | jq -r ".taskDefinitionArns[0]")

AWS IAM Policy Configuration, needs updating

Great service, thanks for sharing. I just created a custom IAM policy to run this (after using admin for a few weeks). There seems to have been a layout change to IAM policies since you posted, so wanted to let you know so you could update your example on DockerHub. What you had converted to the new layout looks like this:
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "ecs:DeregisterTaskDefinition",
        "ecs:DescribeServices",
        "ecs:DescribeTaskDefinition",
        "ecs:DescribeTasks",
        "ecs:ListTasks",
        "ecs:RegisterTaskDefinition",
        "ecs:StartTask",
        "ecs:StopTask",
        "ecs:UpdateService"
      ],
      "Resource": "*"
    }
  ]
}

Merge of name parsing broke

67238d9

There are no version tags on the docker images, so when we pulled latest we got the latest code.

I can fork and manage our own docker images, but thought this issue should be looked at.

Unable to find image 'silintl/ecs-deploy:latest' locally
latest: Pulling from silintl/ecs-deploy
Status: Downloaded newer image for silintl/ecs-deploy:latest
Unable to parse image name: 357801675085.dkr.ecr.us-east-1.amazonaws.com/boc-staging-hcp:20160323183016-8b06035, check the format and try again
{
  "imageIds": [
    {
      "imageDigest": "sha256:7fe040d305f6d3ad9ceb7cd46908ce3b74cfeb995c4bfa158baf84ebc4b2731c"
    },
    {
      "imageDigest": "sha256:d56a04282ebd8946a2231cfb162256335c5ac09c1a0bcae88c112d56c7890f05"
    }
  ],
  "failures": []
}
Deploy of hcp to staging complete, image: 357801675085.dkr.ecr.us-east-1.amazonaws.com/boc-staging-hcp:20160323183016-8b06035
Done. Your build exited with 0.

Rollouts are now rolling back

I've been using ecs-deploy for over a year. Suddenly I'm getting a message that it is rolling back to previous versions. My rollouts never finished in 90 seconds so technically failed. Did the code change to roll back with failures now? How can I avoid this?

$ docker run -it --rm -e AWS_ACCESS_KEY_ID=$AWS_ACCESS_KEY_ID -e AWS_SECRET_ACCESS_KEY=$AWS_SECRET_ACCESS_KEY -e AWS_DEFAULT_REGION=us-east-1 silintl/ecs-deploy --cluster CTRP-IntTest-Backend --service-name CTRP-ctrp_import_ct_api-INTTEST -i $DOCKER_IMAGE:$DATE
Unable to find image 'silintl/ecs-deploy:latest' locally
latest: Pulling from silintl/ecs-deploy
Status: Downloaded newer image for silintl/ecs-deploy:latest
Using image name: ctrp/ctrp_import_ct_api:05-01-17-1354
Current task definition: arn:aws:ecs:us-east-1:127516845550:task-definition/CTRP-ctrp_import_ct_api-INTTEST:146
New task definition: arn:aws:ecs:us-east-1:127516845550:task-definition/CTRP-ctrp_import_ct_api-INTTEST:147
Service updated successfully, new task definition running.
Waiting for service deployment to complete...
Rolling back to arn:aws:ecs:us-east-1:127516845550:task-definition/CTRP-ctrp_import_ct_api-INTTEST:146

env variables

Is there a way with this script to set the value of env variables within the container you're deploying?

Not able to update task definition when it does not have a service associated

I'm trying to use the example case in the Readme to update the image in a task definition that has no service attached to it. The usage() states:

Updating a task definition with a new image:

ecs-deploy -d open-door-task -i docker.repo.com/doorman:17

However when I try running it I end up with an error message:
/ecs-deploy: line 520: TASK_DEFINITION_ARN: unbound variable

It seems the problem is in the code: if no SERVICE is specified, the function getCurrentTaskDefinition() does not set TASK_DEFINITION_ARN, and that causes this error.

Replace tag for multiple container definitions

Right now, it only replaces the tag in the first container definition. Other container definitions in the same group that share the same tag name should be replaced as well.

Include IAM policy in readme

It would be helpful for other users if we include a sample IAM policy that grants the necessary permissions for this tool. This is what I have currently:

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "Stmt1457037856137",
            "Action": [
                "ecs:DeregisterTaskDefinition",
                "ecs:DescribeServices",
                "ecs:DescribeTaskDefinition",
                "ecs:DescribeTasks",
                "ecs:ListTasks",
                "ecs:RegisterTaskDefinition",
                "ecs:StartTask",
                "ecs:StopTask",
                "ecs:UpdateService"
            ],
            "Effect": "Allow",
            "Resource": "*"
        }
    ]
}

package ecs-deploy

Right now I install it via an ugly wget. It would be nice if it were in brew or pip.

Doesn't replace image version with latest commit

The change to the sed regex has broken ecs-deploy for us. It doesn't match anything and therefore the same image version is deployed over again.

An error occurred (InvalidParameterException) when calling the DescribeTasks operation: Tasks cannot be empty when the cluster is deploying with no tasks running

This happens when there is no task running in the ECS cluster during deployment. Say you have a single container service defined, with minimum health of 0, then the deployment causes the cluster to temporarily have 0 tasks running. This causes the error below to be thrown.

ecs-deploy -c blah -n blah -i blah.dkr.ecr.us-east-1.amazonaws.com/blah/blah:95e7aab8 -t 120
Using image name: blah.dkr.ecr.us-east-1.amazonaws.com/blah/blah:95e7aab8
Current task definition: arn:aws:ecs:us-east-1:blah:task-definition/blah:2
New task definition: arn:aws:ecs:us-east-1:blah:task-definition/blah:3

An error occurred (InvalidParameterException) when calling the DescribeTasks operation: Tasks cannot be empty.
Service updated successfully, new task definition running.

Verbose:

+ RUNNING=
+ :
+ '[' '' ']'
+ sleep 10
+ i=20
+ '[' 20 -lt 120 ']'
++ /usr/local/bin/aws --output json ecs list-tasks --cluster blah --service-name blah --desired-status RUNNING
++ jq -r '.taskArns[]'
+ RUNNING_TASKS=
++ /usr/local/bin/aws --output json ecs describe-tasks --cluster blah --tasks
++ jq '.tasks[]| if .taskDefinitionArn == "arn:aws:ecs:us-east-1:blah:task-definition/blah:4" then . else empty end|.lastStatus'
++ grep -e RUNNING

An error occurred (InvalidParameterException) when calling the DescribeTasks operation: Tasks cannot be empty.

This does not impact the deployment though and the service will get correctly updated.
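
One way to avoid the noisy error is to skip describe-tasks entirely when list-tasks returned nothing. The sketch below illustrates that guard under the assumption that the task ARNs are passed in as a whitespace-separated string; `running_statuses` is a hypothetical name and this is not the script's actual fix.

```shell
# running_statuses CLUSTER TASK_ARNS TASK_DEFINITION_ARN
# TASK_ARNS is a whitespace-separated list and may be empty.
# Prints the lastStatus of each listed task that belongs to TASK_DEFINITION_ARN.
running_statuses() (
  cluster=$1 task_arns=${2:-} task_def_arn=${3:-}
  if [ -n "$task_arns" ]; then
    # Word splitting is intended: each ARN becomes its own --tasks argument.
    aws --output json ecs describe-tasks --cluster "$cluster" --tasks $task_arns \
      | jq -r --arg arn "$task_def_arn" \
          '.tasks[] | select(.taskDefinitionArn == $arn) | .lastStatus'
  fi
  # With an empty list nothing is called, so describe-tasks never sees an empty --tasks.
)
```

During the window where zero tasks are running, the function simply prints nothing and the polling loop can try again on its next pass.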

Replace tag for images

I wrote a script to replace tags for all containers in the task definition, based on your script, maybe add parameter --tag or something?

https://gist.github.com/Forever-Young/e939d9cc41bc7a105cdcf8cd7ab9d714

Main part is

NEW_CONTAINER_DEFINITIONS=$(echo "$TASK_DEFINITION" | jq --arg NEW_TAG $NEW_TAG 'def replace_tag: if . | test("[a-zA-Z0-9.]+/[a-zA-Z0-9]+:[a-zA-Z0-9]+") then sub("(?<s>[a-zA-Z0-9.]+/[a-zA-Z0-9]+:)[a-zA-Z0-9]+"; "\(.s)" + $NEW_TAG) else . end ; .containerDefinitions | [.[] | .+{image: .image | replace_tag}]')

I think test part needs an improvement, I don't check for docker-hub/private images for example

New task definition not running within 90 seconds

I try using this script but this message is displayed:

Current task definition: arn:aws:ecs:us-east-1:444480608166:task-definition/auth:5
New task definition: arn:aws:ecs:us-east-1:444480608166:task-definition/auth:7
ERROR: New task definition not running within 90 seconds

Broken on Alpine Linux (busybox)

I've been using this script (big thanks, btw!) inside an Ubuntu-based Docker container for deployment, but recently switched to an Alpine-based one and it doesn't work in there.

A client error (InvalidParameterException) occurred when calling the DescribeTasks operation: taskId longer than 36.
xargs: /usr/bin/aws: exited with status 255; aborting

I've checked the output of the commands and it seems to be double quotes around the Task ARNs:

aws --output json ecs --region eu-west-1 list-tasks --cluster MyCluster --service-name MyService --desired-status RUNNING | jq '.taskArns[]' | xargs '-I{}' echo '{}'
"arn:aws:ecs:eu-west-1:803064921768:task/4485a855-0608-4c75-bfd4-04e45fc7f34c"
"arn:aws:ecs:eu-west-1:803064921768:task/fcc976a2-2fa5-4e75-8d01-4f915a014342"

but it works fine with the "-r" switch for jq:

aws --output json ecs --region eu-west-1 list-tasks --cluster MyCluster --service-name MyService --desired-status RUNNING | jq -r '.taskArns[]' | xargs '-I{}' echo '{}'
arn:aws:ecs:eu-west-1:803064921768:task/4485a855-0608-4c75-bfd4-04e45fc7f34c
arn:aws:ecs:eu-west-1:803064921768:task/fcc976a2-2fa5-4e75-8d01-4f915a014342

Just saw issue #18 , this is probably related ;-)

Memory SoftLimit causes error

ECS service/task definitions require either Hard or Soft memory limits to be set. I've been using ecs-deploy successfully with hard (memory) limits, and when I just updated my task definition to soft (memoryReservation) limits I got this error the next time I tried to use ecs-deploy. I did have one of the two specified so I shouldn't be getting this error. I added a hard definition in addition to the soft one and it goes away.

An error occurred (ClientException) when calling the RegisterTaskDefinition operation: Invalid setting for container 'PatientAPI'. At least one of 'memory' or 'memoryReservation' must be specified.

HomeBrew formula

Hi,

I was wondering if you had considered creating a Homebrew formula for this tool?

Good work btw, makes ECS deployments a breeze.

Thanks,

Tom

Logic to check for busybox/sleep task

Currently the program expects an existing task to already be running within the cluster. However, if a user-specified task is not already running, clusters are often running the busybox/sleep task by default.

If you run ecs-deploy against a cluster in this state, it will simply attempt to do a rolling deployment of the sleep task itself, rather than using the desired task specified.

It would be a nice feature if it could check for said task, and bypass/create a new task based on the specified definition.

Support for ECS scheduled tasks

It would be awesome to add support for updating the targets of scheduled rules whenever a new task definition revision is created.
http://docs.aws.amazon.com/cli/latest/reference/events/put-targets.html

Wait for service steady state

Have you all looked into extending the deploy command to wait for the service to reach "steady state"? Currently, after ecs-deploy exits successfully I can't tell if the updated service was able to place tasks successfully.

As a stopgap, we have been watching the event logs that come out of the aws ecs describe-services command, e.g.

watch "aws ecs describe-services --region us-east-1 --cluster staging --services staging-sd-neodarwin --output text --query 'services[0].events[0:5]'"

And waiting for a set of messages that look like this, indicating the service is done updating:

2017-03-31T14:17:26.154000-04:00        e4b61534-0b90-4b1c-b46e-2ae3625b8018    (service staging-sd-neodarwin) has reached a steady state.
2017-03-31T14:17:15.851000-04:00        95f28e76-2373-4dee-aae2-852d7155ce59    (service staging-sd-neodarwin) has stopped 1 running tasks: (task a1bf1b06-edea-4492-a431-a28b49a362fe).
2017-03-31T14:17:01.962000-04:00        6a51b1d0-1c52-481f-a9d6-eb5e43b8aa04    (service staging-sd-neodarwin) has begun draining connections on 1 tasks.
...

Would it make sense to add something like this to ecs-deploy? I can make a PR to try this idea out.
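
A rough sketch of the polling idea described above, assuming the steady-state message format shown in the sample events. The function name `wait_for_steady_state` and its arguments are hypothetical and are not part of ecs-deploy. The AWS CLI also ships a built-in waiter, `aws ecs wait services-stable`, which may be a simpler starting point.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of ecs-deploy): run a command repeatedly until
# its output contains the steady-state message, or give up after ATTEMPTS tries.
# Usage: wait_for_steady_state ATTEMPTS INTERVAL_SECONDS COMMAND [ARGS...]
wait_for_steady_state() {
  local attempts="$1" interval="$2"
  shift 2
  local i output
  for ((i = 1; i <= attempts; i++)); do
    # Capture the output; a failing command simply counts as "not steady yet".
    output="$("$@" 2>/dev/null || true)"
    case "$output" in
      *'has reached a steady state'*) return 0 ;;
    esac
    sleep "$interval"
  done
  return 1
}

# Example call (assumes a configured AWS CLI; names taken from the issue text):
# wait_for_steady_state 30 10 aws ecs describe-services --region us-east-1 \
#   --cluster staging --services staging-sd-neodarwin --output text \
#   --query 'services[0].events[0].message'
```

One caveat with this approach: the most recent event can be a steady-state message left over from before the deployment started, so a real implementation would probably also compare event timestamps or deployment status.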

Thanks for creating such a great script!

Task Role not duplicated

I have a specific IAM role for my Task Definitions. The role is not carried over to the new revision after running ecs-deploy, which means the service does not run with the required permissions. Is there a flag we can pass in to assign a Task Definition IAM role? Or perhaps certain rights are required to assign this role?

Using awscli: 1.11.5 and the latest ecs-deploy from develop

Edit: I see this has been discussed in #48, but it seems the solution never made it into ecs-deploy.

Edit 2: I see that PRs #64 and #67, which aim to address this issue, have been merged. However, this is not working for me. Any pointers?

unary operator expected

I'm running ecs-deploy -t 120 --max-definitions 5 --cluster ${ECS_CLUSTER} --service-name ${SERVICE} --image ${ECR_REPO}:latest and I receive this error (though the script continues):

Using image name: 1234567890.dkr.ecr.eu-west-1.amazonaws.com/frontend-v2:latest
Current task definition: arn:aws:ecs:eu-west-1:1234567890:task-definition/frontend-v2_blue:280
New task definition: arn:aws:ecs:eu-west-1:1234567890:task-definition/frontend-v2_blue:281
/usr/bin/ecs-deploy: line 325: [: !=: unary operator expected
Service updated successfully, new task definition running.
[...]
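
This message typically appears when a `[` test contains an unquoted variable that expanded to nothing. Whether that is what happens on line 325 is an assumption, since the line itself is not shown here. A minimal sketch of the pattern, with hypothetical names:

```shell
#!/usr/bin/env bash
# Hypothetical helper; the names are illustrative, not taken from ecs-deploy.
# Succeeds when the running revision differs from the expected one.
revision_differs() {
  local running="$1" expected="$2"
  # Quoting both operands keeps the test well-formed even when one is empty.
  # Unquoted, an empty $running would collapse the test to: [ != expected ]
  # which bash reports as "unary operator expected".
  [ "$running" != "$expected" ]
}
```

With the quotes in place, an empty value is compared as an empty string instead of disappearing from the test.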

Should handle new `memoryReservation` settings

If I have a container that is only using the new "soft" memory limits, this script fails with the following:

An error occurred (ClientException) when calling the RegisterTaskDefinition operation: Invalid setting for container 'my-container'. At least one of 'memory' or 'memoryReservation' must be specified.

Will look into a fix.

Python rewrite

@Steevel - I liked your comment about starting a python rewrite. When we started this we did it in Bash because our initial use case was so basic it just seemed right. However as the popularity of this script has grown and we've received so many PRs, I've really wished it was better suited for unit testing and some automation for that.

Would you be interested in helping us with the rewrite? I also saw you were doing it in Python 3, but Python 2 is still the default for most distros, right? Could the rewrite be done in Python 2 with __future__ imports for forward compatibility? I'm not much of a Python developer myself, so I'd love some help with it if you're open to it.

I'm also interested in feedback from more people using this script to know whether they'd appreciate it being in Python as well or perhaps if they are using it because it is bash. If you're using ecs-deploy please add comments to let me know what you think.

[Feature Request] Support for bounce/restart deployment

Hey,
Is it possible using ecs-deploy to just redeploy the most recent task definition on the service?

This is useful if you just want to reboot the fleet. In my case, I use the same task definition for two services, and running ecs-deploy creates a separate (but identical) task definition for each. I would like to run ecs-deploy on the first service, which creates a new task definition, and then run ecs-deploy with a bounce-type parameter that updates the second service to the most recent revision.

ecs-deploy removes per-task IAM roles in task definitions

AWS added support for per-task IAM roles on 2016-07-13. This role is exposed in the Task Definition JSON via the field taskRoleArn. ecs-deploy currently omits this field when updating a task definition, effectively removing it.

The reason is line 304:

| jq '.taskDefinition|{ family: .family, volumes: .volumes, containerDefinitions: .containerDefinitions}' )

I've fixed this by simply adding the new taskRoleArn field:

| jq '.taskDefinition|{ family: .family, volumes: .volumes, containerDefinitions: .containerDefinitions, taskRoleArn: .taskRoleArn}' )

However, I'm not sure if this will cause problems for users on older versions of the AWS CLI.

A client error occurred when calling the DescribeTasks operation: taskId longer than 36.

A client error (InvalidParameterException) occurred when calling the DescribeTasks operation: taskId longer than 36.
xargs: /c/Amazon/AWSCLI/aws: exited with status 255; aborting

I'm getting this one on windows.

Script Does not stop the tasks with previous revision

I am trying to use the script to deploy a new version in ECS, but every time I get:

ERROR: New task definition not running within 90 seconds

Also, when I look at the ECS console, I can only see the task with the previous revision.

Support for multi-container task definitions

If a task definition defines multiple Docker containers, the ecs-deploy CLI updates all of them to use the same image.

My best guess: https://github.com/silinternational/ecs-deploy/blob/master/ecs-deploy#L282

"placementConstraints" is being removed from old task definition

For example, I have a service with a task definition that has:

  "placementConstraints": [
    {
      "expression": "attribute:instance-us == true",
      "type": "memberOf"
    }
  ],

When I update my service using ecs-deploy, the new task definition does not have the "placementConstraints" object.

Use aws-cli config and credential files

The aws-cli creates two files in a user's home directory under ~/.aws:

credentials (contains aws_access_key_id and aws_secret_access_key)
config (contains the default region)

It would be nice if ecs-deploy used the values in these files as a fallback to the env vars or cli arguments.

unbound variable

I was calling ecs-deploy incorrectly, as such:

./ecs-deploy -r "$AWS_DEFAULT_REGION" -k "$AWS_ACCESS_KEY_ID" -s "$AWS_SECRET_ACCESS_KEY" -c $CLUSTER -n $NAME -i $IMAGE -e $TAG

Note in particular, -e $TAG as opposed to -e TAG.

And it resulted in the following error:

./ecs-deploy: line 262: !TAGVAR: unbound variable

A bad variable name passed to -e should probably generate a more friendly error message.
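
One way such a check could look, assuming `-e` takes the name of an environment variable and the script resolves it through bash indirect expansion (which the `!TAGVAR` in the error suggests). The function `resolve_tag` is hypothetical and is not code from ecs-deploy:

```shell
#!/usr/bin/env bash
# Hypothetical helper, not code from ecs-deploy: resolve the variable whose
# *name* was passed to -e, and fail with a readable message otherwise.
resolve_tag() {
  local tagvar="$1"
  # Reject anything that is not a valid shell variable name (e.g. "v1.2.3").
  if [[ ! "$tagvar" =~ ^[A-Za-z_][A-Za-z0-9_]*$ ]]; then
    echo "ERROR: -e expects a variable name, got '$tagvar'" >&2
    return 1
  fi
  # ${!tagvar-} is indirect expansion with an empty default, so it does not
  # trip "unbound variable" under set -u.
  local value="${!tagvar-}"
  if [ -z "$value" ]; then
    echo "ERROR: environment variable '$tagvar' is empty or not set" >&2
    return 1
  fi
  printf '%s\n' "$value"
}
```

Called as `resolve_tag TAG`, this prints the value of `$TAG`. Called with the value itself, it reports the problem instead of failing inside the expansion.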

Blue green deployment doesn't seem to be working for me

Hi, I just tried this tool on an ECS cluster with the following specs:
Registered Container Instances : 4
Pending tasks : 0
Running tasks : 2

So I do have enough capacity in my cluster for blue/green deployments. Yet when I run this, the new task is never started inside the ECS cluster, and I'm not sure what I may be doing wrong.
./ecs-deploy -v -c mup-social-ECSCluster-1G965XTDFBLO1 -n mup-social-Service-U2QPXOXIJ9FL -i 414367886899.dkr.ecr.us-east-1.amazonaws.com/mup/mup-social:icorbett-snapshot -t 1200 -m 50 -M 100 -D 2

Here is what I get in verbose mode:

ecs-deploy.log.txt

Error ProfileNotFound with a command without profile

Hi,

I'm trying to use ecs-deploy with a simple config but I get this error:

AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY, AWS_DEFAULT_REGION are defined and working (I'm using awscli for other stuff).

Do you have an idea?

ECS deploy script

Hi All,

Does anyone know how to pull the latest Docker image from a private repo into AWS ECS when the image tag stays constant, without changing the tag?

Every time, the ECS agent uses the locally cached image, so I can't get the latest image with the same tag from the private repository.

I have multiple applications configured in an ECS cluster, so whenever a developer updates an image/snapshot in the private repo, I need the cluster to pick up the latest image rather than keep using the old one.

I created the task definition in the cluster, and we only ever get the cached image from the ECS agent.

docker example

I'm not able to follow your docker example. It should have a docker run rather than just the script name, right? I've tried versions of the following, but they all either give me the usage message or complain about not having the credentials and needing "aws configure". I shouldn't need to run aws configure inside a Docker container if I pass in the key and secret key, right? Any help would be appreciated.

docker run -it --rm silintl/ecs-deploy --aws-access-key KEYHERE --aws-secret-key SECRETHERE --cluster uMatch-Test --region us-east-1 --service-name uMatch-nci-treatment-arm-api-TEST -i matchbox/nci-treatment-arm-api:latest

autoscale instance count from 1 to 2 only for purposes of service update

Do you know if it's possible to have the instance count scale, say, from 1 to 2 when we perform a service update and then scale back down to 1 once the update is complete?

IAM role credentials support?

This is actually a feature request. Any plans to support gathering credentials through IAM roles?

BTW, nicely done! Congrats, the script is incredibly useful.

Cheers,

Bash-expandable characters in a task definition are expanded

If a bash-expandable character like * is used in a task definition, it is expanded by bash. Expected behavior is that * stays untouched.

Example:

"environment": [
  {
     "name": "KEY",
    "value": "value * "
  }
]

is turned into

"environment": [
  {
     "name": "KEY",
    "value": "value bin dev ecs-deploy etc home lib media mnt proc root run sbin srv sys tmp usr var bin dev ecs-deploy etc home lib media mnt proc root run sbin srv sys tmp usr var "
  }
]
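
The behavior above matches what bash does when a variable is expanded without quotes: the text is split into words and `*` is matched against files in the current directory. Whether this is the exact cause inside ecs-deploy is an assumption. A small illustration with a hypothetical variable:

```shell
#!/usr/bin/env bash
# Hypothetical stand-in for a task definition fragment held in a variable.
definition='{"name": "KEY", "value": "value * "}'

# Unquoted: the shell splits the text into words and expands * against the
# current directory, which is how file names can end up inside the JSON.
unquoted() { echo $definition; }

# Quoted: the text is passed through unchanged.
quoted() { printf '%s\n' "$definition"; }
```

Running `unquoted` in a non-empty directory replaces `*` with the directory listing, while `quoted` prints the original text.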

Invalid choice: JSON?

usage: aws [-h] [-v] {as,ec2,elb} ...
aws: error: invalid choice: 'json' (choose from 'as', 'ec2', 'elb')

Any clue as to why this is happening? Using HEAD and awscli-1.11.2.

Regression bug replacing image name of all containers in service

@gregwebs your change to use jq instead of sed for updating image names appears to have broken services that have multiple containers using different names. I should have tested for it but forgot and now we have a regression bug. Can you fix this so that it only updates containers that match the image name given without the tag like it did before?

See commit cc75ef1
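
A sketch of the matching rule requested above: compare image names with the tag removed, so only containers running the same image are updated. The helper names are hypothetical and this is not the code from the commit. Image digests (`@sha256:...`) are not handled.

```shell
#!/usr/bin/env bash
# Hypothetical helpers, not code from ecs-deploy.
# Drop a trailing :tag while leaving a registry port such as :8000 untouched.
image_without_tag() {
  local image="$1"
  local last="${image##*/}"        # final path component, e.g. image:tag
  local prefix="${image%"$last"}"  # everything before it, e.g. host:8000/repo/
  printf '%s%s\n' "$prefix" "${last%%:*}"
}

# Succeeds only when both references name the same image, ignoring tags.
same_image() {
  [ "$(image_without_tag "$1")" = "$(image_without_tag "$2")" ]
}
```

With a check like `same_image`, a deployment of `httpd:latest` would leave a `memcached` container in the same task definition untouched.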

Multiple containers on task overwrites both images.

When deploying a task definition that has two containers, the images on both containers are overwritten by the image supplied through the CLI. This results in two duplicate containers being deployed, which breaks setups where two services require each other, e.g. httpd and memcached.