fabfuel / ecs-deploy
Powerful CLI tool to simplify Amazon ECS deployments, rollbacks & scaling
License: Other
I installed ecs deploy via pip and from source, but it seems I still don't have the deregister option. Is there any timeline when this option will be available?
We use ecs deploy to set environment variables (and secrets) on new task definitions via the command line. If an older task definition has environment variables that we want to drop from future deployments, removing them from the command line is not enough: they still show up in the new task definition. It seems existing variables are not deleted when they are omitted from the command line.
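One way to get the behaviour asked for here would be an "exclusive" mode in which only the variables given on the command line survive. The following is a minimal sketch, assuming container definitions carry `environment` as a list of name/value pairs, as the ECS API returns them. The function `apply_environment` and its `exclusive` parameter are hypothetical names for illustration, not ecs-deploy's code.

```python
def apply_environment(container, cli_env, exclusive=False):
    """Return a copy of `container` with its environment updated.

    container: one entry of containerDefinitions (a dict).
    cli_env:   dict of variables given on the command line (hypothetical input).
    exclusive: if True, drop every existing variable not present in cli_env.
    """
    if exclusive:
        current = {}
    else:
        # Default behaviour: start from the variables of the old definition.
        current = {e["name"]: e["value"] for e in container.get("environment", [])}
    current.update(cli_env)

    updated = dict(container)  # shallow copy, the input is left untouched
    updated["environment"] = [
        {"name": name, "value": value} for name, value in sorted(current.items())
    ]
    return updated
```

With `exclusive=False` the old variables are merged, which matches the behaviour described above; with `exclusive=True` anything not passed on the command line disappears from the new revision.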
Please make it possible to add the executionRoleArn to a task.
It is already possible for taskRoleArn using --role or -r.
Here you can find how to do it in boto3.
I got the error when using secrets for the first time:
botocore.errorfactory.ClientException: An error occurred (ClientException) when calling the RegisterTaskDefinition operation: When you are specifying container secrets, you must also specify a value for 'executionRoleArn'.
Hi,
tl;dr - would you be willing to take a pull request for implementing a retry on run-task when throttled?
We've been using your library and are occasionally running into errors with regard to rate limit/throttling when running a task. Fargate likely exacerbates this issue for us -- I think it has more strict limits.
I was looking at the code and it looks pretty straight-forward to add support for retry on rate exceeded errors, up to a configurable limit. I was thinking default to something like 5 retries, 2 seconds wait for the first retry, and doubling that wait each consecutive retry (once the retry limit is reached, the rate error just gets re-raised). I have most of the concept ready to test, I'm just trying to get all the tox and virtualenv stuff working in Docker (I'm on a Windows machine).
Thanks
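A minimal sketch of the retry scheme proposed above (5 retries, 2 seconds before the first retry, doubling each time, then re-raising). `ThrottledError` and `run_with_retry` are hypothetical names; a real implementation would have to recognise the actual rate-exceeded error raised by the AWS client, which is not shown here.

```python
import time


class ThrottledError(Exception):
    """Hypothetical stand-in for a 'rate exceeded' error from the ECS API."""


def run_with_retry(call, max_retries=5, initial_wait=2.0, sleep=time.sleep):
    """Call `call()`; on ThrottledError wait and retry, doubling the wait each time."""
    wait = initial_wait
    for attempt in range(max_retries + 1):
        try:
            return call()
        except ThrottledError:
            if attempt == max_retries:
                # Retry limit reached: re-raise the rate error unchanged.
                raise
            sleep(wait)
            wait *= 2
```

The `sleep` parameter is only there so the wait can be replaced in tests; in normal use it defaults to `time.sleep`.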
I get the following when I try to run ecs run ... with a Fargate cluster.
An error occurred (InvalidParameterException) when calling the RunTask operation: Network Configuration must be provided when networkMode 'awsvpc' is specified.
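For context, the error means the RunTask request carried no `networkConfiguration`. A sketch of the request arguments that boto3's `run_task` accepts for an `awsvpc` task is shown below; the helper name `build_run_task_kwargs` and the way subnets and security groups are supplied are assumptions for illustration, not something ecs-deploy exposes here.

```python
def build_run_task_kwargs(cluster, task_definition, subnets, security_groups,
                          assign_public_ip=False):
    """Build keyword arguments for an ECS RunTask call on Fargate (awsvpc mode)."""
    return {
        "cluster": cluster,
        "taskDefinition": task_definition,
        "launchType": "FARGATE",
        # Required whenever the task definition uses networkMode 'awsvpc'.
        "networkConfiguration": {
            "awsvpcConfiguration": {
                "subnets": list(subnets),
                "securityGroups": list(security_groups),
                "assignPublicIp": "ENABLED" if assign_public_ip else "DISABLED",
            }
        },
    }
```

The resulting dict could be passed as `client.run_task(**kwargs)` on a boto3 ECS client.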
Is there a command to change allocated cpu or memory in a task?
Hi there.
This project looks quite promising and I'm keen to try using it for our ECS deployments. We use Datadog for our deployments though, and rather than slapping some bash scripts on top of this library I'd be keen to have some native support for Datadog events.
What would be involved in this? I could create a PR, just need to discuss the design for this I guess.
Cheers!
Hi, I'm using ecs deploy with gitlab-ci. When I deploy to a service with 2 running tasks, the old tasks are not stopped and the deployment times out.
The command I'm using:
ecs deploy --region ${ECS_REGION} ${CLUSTER_NAME} ${SERVICE_NAME} --deregister --task ${TASK_FAMILY} --user gitlab-ci
The new revision is deployed and runs 2 tasks, but the old tasks don't stop.
After the timeout below, I have 1 task with revision 24 and 2 tasks with revision 25.
$ ecs deploy --region ${ECS_REGION} ${CLUSTER_NAME} ${SERVICE_NAME} --deregister --task ${TASK_FAMILY} --user gitlab-ci
Deploying based on task definition: backend-engine
Creating new task definition revision
Successfully created revision: 25
Updating service
Successfully changed task definition to: backend-engine:25
Deploying new task definition........................................................................................................................................................................................................................................................
Deployment failed due to timeout. Please see: https://github.com/fabfuel/ecs-deploy#timeout
ERROR: Job failed: exit code 1
Currently this script deletes the old task definition during deploy, so it's not possible to roll back to it. It would be great to have an option that allows me to keep old task definitions.
P.S.: we intend to use this script widely across our infrastructure, so we can help with PRs if needed :)
Tasks that support Fargate are not properly copied.
Hello,
Getting the error below when trying to deploy.
ecs deploy --profile my_profile --region us-east-1 my_cluster my_service --tag rel-0.1
Creating new task definition revision
Parameter validation failed:
Unknown parameter in input: "taskRoleArn", must be one of: family, containerDefinitions, volumes
ecs deploy CL-WebApp-Dev-azA-B SV-WEBAPP-DEV
Traceback (most recent call last):
File "/usr/bin/ecs", line 11, in <module>
load_entry_point('ecs-deploy==1.4.3', 'console_scripts', 'ecs')()
File "/usr/lib/python2.7/site-packages/click/core.py", line 764, in __call__
return self.main(*args, **kwargs)
File "/usr/lib/python2.7/site-packages/click/core.py", line 717, in main
rv = self.invoke(ctx)
File "/usr/lib/python2.7/site-packages/click/core.py", line 1137, in invoke
return _process_result(sub_ctx.command.invoke(sub_ctx))
File "/usr/lib/python2.7/site-packages/click/core.py", line 956, in invoke
return ctx.invoke(self.callback, **ctx.params)
File "/usr/lib/python2.7/site-packages/click/core.py", line 555, in invoke
return callback(*args, **kwargs)
File "/usr/lib/python2.7/site-packages/ecs_deploy/cli.py", line 62, in deploy
client = get_client(access_key_id, secret_access_key, region, profile)
File "/usr/lib/python2.7/site-packages/ecs_deploy/cli.py", line 23, in get_client
return EcsClient(access_key_id, secret_access_key, region, profile)
File "/usr/lib/python2.7/site-packages/ecs_deploy/ecs.py", line 15, in __init__
self.boto = session.client(u'ecs')
File "/usr/lib/python2.7/site-packages/boto3/session.py", line 263, in client
aws_session_token=aws_session_token, config=config)
File "/usr/lib/python2.7/site-packages/botocore/session.py", line 889, in create_client
client_config=config, api_version=api_version)
File "/usr/lib/python2.7/site-packages/botocore/client.py", line 76, in create_client
verify, credentials, scoped_config, client_config, endpoint_bridge)
File "/usr/lib/python2.7/site-packages/botocore/client.py", line 291, in _get_client_args
verify, credentials, scoped_config, client_config, endpoint_bridge)
File "/usr/lib/python2.7/site-packages/botocore/args.py", line 45, in get_client_args
endpoint_url, is_secure, scoped_config)
File "/usr/lib/python2.7/site-packages/botocore/args.py", line 112, in compute_client_args
service_name, region_name, endpoint_url, is_secure)
File "/usr/lib/python2.7/site-packages/botocore/client.py", line 364, in resolve
service_name, region_name)
File "/usr/lib/python2.7/site-packages/botocore/regions.py", line 122, in construct_endpoint
partition, service_name, region_name)
File "/usr/lib/python2.7/site-packages/botocore/regions.py", line 135, in _endpoint_for_partition
raise NoRegionError()
botocore.exceptions.NoRegionError: You must specify a region.
Currently, when a deployment fails, ECS keeps trying to deploy the broken version. To prevent this, the script could switch the service back to the last working task definition before it exits, which would stop the repeated deployment attempts.
Creating new task definition revision
Successfully created revision: 22
Updating service
Successfully changed task definition to: TD-WEBAPP-DEV:22
Deploying new task definition..................................................................................................................................................................................................................................
Deployment failed due to timeout. Please see: https://github.com/fabfuel/ecs-deploy#timeout
I have 2 services running in my cluster, and during deployments I need to update the container images of both services. Is it possible to do that with one deployment command?
Hi there.
I want to wait for a task to stop after running a one-shot task, using aws ecs wait tasks-stopped.
To do so, I have to pass the task ARN to awscli, but ecs-deploy doesn't output the task ARN.
I think it would be useful if ecs-deploy printed the ARN of the task it runs.
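For reference, the RunTask response returned by boto3 already contains the ARNs, so printing them is mostly a matter of reading them out. A small sketch; `task_arns` is a hypothetical helper, not part of ecs-deploy:

```python
def task_arns(run_task_response):
    """Return the ARNs of all tasks started by one RunTask call."""
    return [task["taskArn"] for task in run_task_response.get("tasks", [])]
```

Each ARN could then be handed to `aws ecs wait tasks-stopped --cluster <cluster> --tasks <arn>`.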
I use ecs deploy in combination with gitlab-ci.yml.
In some cases the actual deployment line in the file can be very long and confusing. This is due to an extended external configuration of the docker containers by using environment variables.
I did not find an option to set those variables up in a file and let ecs deploy just use this file instead of a long list of single environment variables.
The option could be called something like --env_file.
What do you think?
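Until such an option exists, a small wrapper could expand a file into the per-variable arguments. This is a sketch only, assuming a simple `KEY=VALUE` file format with `#` comments and assuming the `-e <container> <name> <value>` form of the existing option; `parse_env_file` and `to_env_args` are hypothetical helpers.

```python
def parse_env_file(text):
    """Parse KEY=VALUE lines; blank lines and lines starting with '#' are skipped."""
    env = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#"):
            continue
        if "=" not in line:
            raise ValueError("expected KEY=VALUE, got: %r" % raw_line)
        # Split on the first '=' only, so values may contain '=' themselves.
        key, value = line.split("=", 1)
        env[key.strip()] = value.strip()
    return env


def to_env_args(container, env):
    """Expand a dict into repeated '-e <container> <name> <value>' arguments."""
    args = []
    for key, value in sorted(env.items()):
        args.extend(["-e", container, key, value])
    return args
```

The returned list can be appended to the `ecs deploy` argument list in a CI script, which keeps the deployment line itself short.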
Hi, I tried to deploy but got this error:
Creating new task definition revision
Traceback (most recent call last):
File "/usr/local/bin/ecs", line 11, in <module>
sys.exit(ecs())
File "/usr/local/lib/python2.7/site-packages/click/core.py", line 722, in __call__
return self.main(*args, **kwargs)
File "/usr/local/lib/python2.7/site-packages/click/core.py", line 697, in main
rv = self.invoke(ctx)
File "/usr/local/lib/python2.7/site-packages/click/core.py", line 1066, in invoke
return _process_result(sub_ctx.command.invoke(sub_ctx))
File "/usr/local/lib/python2.7/site-packages/click/core.py", line 895, in invoke
return ctx.invoke(self.callback, **ctx.params)
File "/usr/local/lib/python2.7/site-packages/click/core.py", line 535, in invoke
return callback(*args, **kwargs)
File "/usr/local/lib/python2.7/site-packages/ecs_deploy/cli.py", line 81, in deploy
new_td = create_task_definition(deployment, td)
File "/usr/local/lib/python2.7/site-packages/ecs_deploy/cli.py", line 291, in create_task_definition
new_td = action.update_task_definition(task_definition)
File "/usr/local/lib/python2.7/site-packages/ecs_deploy/ecs.py", line 545, in update_task_definition
additional_properties=task_definition.additional_properties
File "/usr/local/lib/python2.7/site-packages/ecs_deploy/ecs.py", line 65, in register_task_definition
**additional_properties
File "/usr/local/lib/python2.7/site-packages/botocore/client.py", line 312, in _api_call
return self._make_api_call(operation_name, kwargs)
File "/usr/local/lib/python2.7/site-packages/botocore/client.py", line 575, in _make_api_call
api_params, operation_model, context=request_context)
File "/usr/local/lib/python2.7/site-packages/botocore/client.py", line 630, in _convert_to_request_dict
api_params, operation_model)
File "/usr/local/lib/python2.7/site-packages/botocore/validate.py", line 291, in serialize_to_request
raise ParamValidationError(report=report.generate_report())
botocore.exceptions.ParamValidationError: Parameter validation failed:
Unknown parameter in input: "executionRoleArn", must be one of: family, taskRoleArn, networkMode, containerDefinitions, volumes, placementConstraints
Any idea?
Hi!
I'm having an issue when using the ecs deploy on CircleCI.
After the whole process (create new task def, update service), it freezes on the "Deploying task definition".
ecs deploy $AWS_CLUSTER $AWS_SERVICE -t $AWS_TAG_DEV
Updating task definition
Changed image of container 'xxxxx-frontend' to: "1234.dkr.ecr.us-west-2.amazonaws.com/xxxxx/frontend:dev" (was: "1234.dkr.ecr.us-west-2.amazonaws.com/xxxxx/frontend:dev")
Creating new task definition revision
Successfully created revision: 10
Successfully deregistered revision: 9
Updating service
Successfully changed task definition to: xxxxx-frontend:10
Deploying task definition...................................................
Checking the AWS Console, I can see that the new task is running, with the new task definition - but the old one is not stopped and the desired status is still 'running'.
If I manually stop the old task, the "Deploying task definition" unfreezes and the process finishes with success. The desired status is 1 task for this service, so checking your code I can see that the stop task is never called for the old one.
Am I missing something? Is this an error of configuration on ECS?
Thanks for the project!
Have you given any consideration to creating a python package that (roughly) replicates the functionality of the CLI?
I've got a (stripped-down) fork that I've been using with AWS Lambda, and was wondering if there's any interest in incorporating those changes upstream.
Every time I do a deploy I get the following response:
__init__() takes at least 8 arguments (7 given)
The command I ran is: ecs deploy test-cluster test-service. I've tried multiple combinations of option flags, but it always returns the error above.
Any insight would be great, Thanks.
Is there any way I can update the service without creating new task definition (I couldn't find anything in the documentation)?
Since I am updating only the docker image and always using the latest tag, there is no need for new task definition.
Via the AWS Console this can be done by checking the force new deployment option when updating the service.
Thanks,
Nikola.
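The console checkbox mentioned above corresponds to the `forceNewDeployment` parameter of the UpdateService API. A minimal sketch using a boto3 ECS client passed in by the caller; `force_new_deployment` is a hypothetical helper, not an ecs-deploy command:

```python
def force_new_deployment(ecs_client, cluster, service):
    """Restart a service's tasks on its current task definition.

    ecs_client is expected to be a boto3 ECS client, e.g. boto3.client("ecs").
    No new task definition revision is registered.
    """
    return ecs_client.update_service(
        cluster=cluster,
        service=service,
        forceNewDeployment=True,
    )
```

Because the task definition stays the same, this only picks up a new image if the tag (for example `latest`) is resolved again when the new tasks start.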
Is there a way for the ecs-deploy run command to run what is in an ECS scheduled task's command override?
Hello, I set the service to "DAEMON" mode, meaning the number of desired tasks is automatic (top right corner of the screenshot).
However, I'm getting an error when deploying a new tag.
Command
ecs deploy --region ${AWS_REGION} ${CLUSTER_NAME} ${ECS_SERVICE_NAME} --tag ${CI_COMMIT_SHA} --timeout 1200
Error:
Creating new task definition revision
Successfully created revision: 6
Updating service
An error occurred (InvalidParameterException) when calling the UpdateService operation: The daemon scheduling strategy does not support a desired count for services. Remove the desired count value and try again
I must be missing something here.
Can you please advise?
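The error suggests that `desiredCount` is sent with every UpdateService call. A sketch of how the request could leave it out for daemon services, assuming the service description comes from DescribeServices (which reports `schedulingStrategy`); `build_update_service_kwargs` is a hypothetical helper:

```python
def build_update_service_kwargs(cluster, service, task_definition_arn):
    """Build UpdateService arguments, omitting desiredCount for DAEMON services.

    service: one entry of the 'services' list returned by DescribeServices.
    """
    kwargs = {
        "cluster": cluster,
        "service": service["serviceName"],
        "taskDefinition": task_definition_arn,
    }
    # Daemon services place one task per instance and reject a desired count.
    if service.get("schedulingStrategy", "REPLICA") != "DAEMON":
        kwargs["desiredCount"] = service["desiredCount"]
    return kwargs
```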
I tried --region_name us-east-1 but it didn't work
Hello!
New task definition created successfully, but deploy fails too often.
Environment: Gitlab CI shared runner
Right now when you deploy a new task definition on an existing service, if you don't have enough capacity on the cluster to place the new task, the script will just exit with this message
"ERROR: (service foo) was unable to place a task because no container instance met all of its requirements. The closest matching (container-instance instance-id-bar) has insufficient CPU units available. For more information, see the Troubleshooting section of the Amazon ECS Developer Guide"
This happens even if you set the "Minimum healthy percent" to 0, so the container will eventually be deployed, but you don't know when.
In this case I think it would be a good idea to have a new flag to kill the existing tasks right after the service has been updated, that way the new one will be placed without any issues
NOTE: this happens very often on dev/qa environments, where you don't want to assign too many resources to the clusters.
I use profiles to login to aws:
$(aws ecr get-login --profile dev)
Would it be possible to add a --profile option to ecs-deploy that is passed through to AWS?
For example:
aws ecs describe-task-definition --task-definition web-service --profile dev
works, while
aws ecs describe-task-definition --task-definition web-service
does not (An error occurred (ClientException) when calling the DescribeTaskDefinition operation: Unable to describe task definition.).
(First off - thanks for the helpful tool that saved me a load of time).
We use scheduled tasks on ECS and think it would be useful if we could update scheduled tasks with this tool. When we push a new docker image we'd like to also use your tool to update these.
I believe using the current aws cli you would do:
aws ecs register-task-definition # task def with new image
aws events put-targets # point existing rule at new task def
It kind of breaks your current CLI, so we'd have to think about that.
Is this something you would consider a PR for?
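To make the second step concrete: the targets of a rule can be read with the CloudWatch Events `list_targets_by_rule` call, rewritten, and written back with `put_targets`. A sketch of the rewriting part only; `retarget` is a hypothetical helper and the surrounding client calls are left to the caller:

```python
import copy


def retarget(targets, new_task_definition_arn):
    """Return copies of rule targets with ECS targets pointed at a new task definition.

    targets: the 'Targets' list returned by list_targets_by_rule.
    Targets without 'EcsParameters' (non-ECS targets) are returned unchanged.
    """
    updated = []
    for target in targets:
        target = copy.deepcopy(target)
        if "EcsParameters" in target:
            target["EcsParameters"]["TaskDefinitionArn"] = new_task_definition_arn
        updated.append(target)
    return updated
```

With a boto3 `events` client this could be used as `events.put_targets(Rule=rule, Targets=retarget(events.list_targets_by_rule(Rule=rule)["Targets"], new_arn))`.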
The network mode and task placement constraint parameters of task definitions are relatively recent additions. It looks like this tool isn't set up to handle them presently. When I run an ecs deploy the new task definition loses the networkMode: host I originally had on the prior task definition.
A quick fix would probably be to just update register_task_definition and update_task_definition in ecs.py to add the new attributes. A longer-term one might be to rearrange things so that the new definition is a complete copy of the old (i.e. no matter what new attributes get added in the future), with targeted mutations of only what's necessary to effect the deployment.
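The longer-term idea could look roughly like this: copy everything DescribeTaskDefinition returns, drop only the fields that RegisterTaskDefinition does not accept, and change the image. This is a sketch under that assumption; `next_revision_payload` and `READ_ONLY_KEYS` are hypothetical names, and the key list may need extending as the API evolves.

```python
import copy

# Fields that appear in a described task definition but are set by ECS itself.
READ_ONLY_KEYS = (
    "taskDefinitionArn", "revision", "status", "requiresAttributes",
    "compatibilities", "registeredAt", "registeredBy", "deregisteredAt",
)


def next_revision_payload(described, container_name, image):
    """Copy a described task definition and change one container's image.

    Everything else (networkMode, placementConstraints, roles, ...) is kept,
    including attributes this code has never heard of.
    """
    payload = copy.deepcopy(described)
    for key in READ_ONLY_KEYS:
        payload.pop(key, None)
    for container in payload["containerDefinitions"]:
        if container["name"] == container_name:
            container["image"] = image
    return payload
```

The design choice is a deny-list of known read-only fields instead of an allow-list of known writable ones, so new writable attributes survive without a code change.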
Hello, I was thinking it would be nice to add a rollback feature: deregister the previous task definition only after the deploy is marked as healthy, and otherwise roll back to the previous one after the TIMEOUT.
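The control flow of such a rollback is small. A sketch with the AWS interaction abstracted away; `deploy_with_rollback`, `update_service` and `wait_for_stable` are hypothetical names, where `wait_for_stable` is assumed to return False on timeout:

```python
def deploy_with_rollback(update_service, wait_for_stable, new_td, previous_td):
    """Deploy new_td; if it does not become stable in time, switch back.

    update_service:  callable taking a task definition identifier.
    wait_for_stable: callable returning True if the deployment finished,
                     False if it timed out.
    Returns the task definition the service ends up on.
    """
    update_service(new_td)
    if wait_for_stable():
        # Only now would it be safe to deregister previous_td.
        return new_td
    # Timed out: point the service back at the last working revision.
    update_service(previous_td)
    return previous_td
```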
Hello! Thanks again for the great tool
I use version 1.7.0 and have the following problem.
ecs deploy $cluster_name $APP_NAME -t $version --no-deregister --timeout -1
When this command runs, the secrets are not copied from the previous container definition to the new one.
Wish you a good day!
Hi there,
Thanks for the tool. I quickly used the incredibly handy ecs deploy my-cluster my-service --tag 1.2.3 to update 2 containers running behind an ELB. Apart from the fact that it takes around 6 minutes (this seems to be down to AWS, though?), all worked fine.
My question is: can I use this CLI to update a single container (no load balancing) with downtime?
For staging I simply want to bring down the old task and bring up a new one. If I try this using ecs deploy my-cluster my-service --tag 1.2.3, ECS complains because of the lack of dynamic port mapping, i.e. both tasks use the same port, rather than the old task being brought down first and the new task brought up afterwards. I have even tried setting healthy to 0, but the same issue seems to occur.
Any thoughts, ideas appreciated. Thanks,
Traceback (most recent call last):
File "/usr/local/bin/ecs", line 7, in <module>
from ecs_deploy.cli import ecs
File "/usr/local/lib/python2.7/site-packages/ecs_deploy/cli.py", line 11, in <module>
from ecs_deploy.ecs import DeployAction, ScaleAction, RunAction, EcsClient, \
File "/usr/local/lib/python2.7/site-packages/ecs_deploy/ecs.py", line 5, in <module>
from dateutil.tz.tz import tzlocal
ImportError: No module named tz
https://github.com/fabfuel/ecs-deploy/blob/master/ecs_deploy/ecs.py#L5
- from dateutil.tz.tz import tzlocal
+ from dateutil.tz import tzlocal
I mistakenly pushed a --tag project:latest, instead of --tag latest. I then got stuck in a loop of the new image being project:latest:latest.
Suggest changing this line to a split instead of rsplit:
Line 219 in b8b1210
However, the fact that it is an rsplit() tells me it was done deliberately, maybe hoping for a port or something. Maybe using urllib.parse.urlparse is better? The ideal behaviour IMO would error out before attempting to deploy anyway.
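One possible reason for the `rsplit` is a registry host with a port, such as `registry.example.com:5000/project:1.0`, where only the last colon separates the tag. A sketch that handles that case and rejects bad tags before anything is deployed; `set_image_tag` is a hypothetical helper and not the code at the referenced line:

```python
def set_image_tag(image, tag):
    """Return `image` with its tag replaced by `tag`.

    Raises ValueError for tags containing ':' or '/', so a mistake like
    '--tag project:latest' fails early instead of producing 'project:latest:latest'.
    Digest references (name@sha256:...) are not handled in this sketch.
    """
    if not tag or ":" in tag or "/" in tag:
        raise ValueError("invalid tag: %r" % tag)
    # Only the part after the last '/' can carry a tag; a ':' before it
    # belongs to the registry host and port.
    repository, _, last_segment = image.rpartition("/")
    name = last_segment.split(":", 1)[0]
    prefix = repository + "/" if repository else ""
    return "%s%s:%s" % (prefix, name, tag)
```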
Hi,
First, thanks for this tool. It's been great thus far.
I'm seeing a "Deployment failed" message, but my tasks are updated appropriately. Is there a way to get more information as to what is actually failing?
Creating new task definition revision
Successfully created revision: 18
Updating service
Successfully changed task definition to: scout-prod-td:18
Deploying new task definition...........................
Deployment failed
For what it's worth, I have two identical clusters (beta + staging) that run successfully with the same configurations.
Thanks!
When trying to define a new task I come across the following message:
Deployment failed (timeout)
I'm running v1.2.0.
Hello! Thanks for the great tool!
We currently use it with the option:
Deploy a new version of your service:
$ ecs deploy my-cluster my-service --tag 1.2.3 --timeout -1
It works fine, but with this option the previous task definition is marked as INACTIVE.
Is it possible to have an option to leave the previous task definition in ACTIVE status for rollback use cases?
I am using your amazing tool with Jenkins for my CD, but I'd like to be able to provide variables from an environment file that I track in git. Could you add a way to support this in a future release?
Thanks again for your effort and for bringing us this amazing tool :)
Hi,
I want to thank you guys for this fantastic utility program to simplify deployment on ECS. Are there plans to support deployments without a timeout option?
For certain deployments (especially launch type Fargate) it appears the ECS cluster manager takes a while to shut down/deregister from the ASG and reach steady state. The default 300s is not really enough, but it is hard to know what value to raise it to.
In certain projects, it may be better to not have a timeout. Is this functionality being considered?
Hello,
Could you add a check that the service is stable?
Right now I can deploy a new task definition, but if something goes wrong and the new task definition is not accepted, I will not know about it.
http://boto3.readthedocs.io/en/latest/reference/services/ecs.html#ECS.Waiter.ServicesStable
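A minimal sketch of using the waiter linked above, with a boto3 ECS client passed in by the caller. `wait_until_stable` and its default values are assumptions for illustration; the waiter raises an exception if the service does not become stable within the configured attempts.

```python
def wait_until_stable(ecs_client, cluster, service, delay=15, max_attempts=40):
    """Block until the service reaches a steady state.

    ecs_client is expected to be a boto3 ECS client. The waiter polls
    DescribeServices every `delay` seconds, at most `max_attempts` times.
    """
    waiter = ecs_client.get_waiter("services_stable")
    waiter.wait(
        cluster=cluster,
        services=[service],
        WaiterConfig={"Delay": delay, "MaxAttempts": max_attempts},
    )
```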
I'm getting timeouts when trying to deploy to a cluster, I'm using
ecs deploy <cluster> <service> -i <service> repo/foo:bar
I'm running the following
NAME="Container Linux by CoreOS" ID=coreos VERSION=1298.7.0 VERSION_ID=1298.7.0 BUILD_ID=2017-03-31-0215 PRETTY_NAME="Container Linux by CoreOS 1298.7.0 (Ladybug)" ANSI_COLOR="38;5;75" HOME_URL="https://coreos.com/" BUG_REPORT_URL="https://github.com/coreos/bugs/issues"
I'm using the latest image from Docker hub for the ecs-agent, it basically deploys fine but doesn't remove the old task so the deploy never finishes. Everything in the ECS agent log looks fine to me, any pointers?
It would be nice to have a parameter to redefine and deploy a provided task revision (or the latest), or a provided JSON file
I use ecs deploy a lot with GitLab CI, and every now and then I need to simplify the stages, i.e. move similar bits into a base stage.
The problem with this is that I use environment variables to populate, among other things, the --command string.
Problem:
If an empty variable is at the very beginning of this --command string, the resulting CMD section becomes something like this:
"", --any-option 1, ...
This is not supported in AWS ECS.
It works without any issues if the empty variable is somewhere in between the options of the --command string.
Solution:
Remove empty variables from the command list.
I already created a pull request (because it worked so well last time).
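The proposed fix amounts to filtering empty entries out of the split command. A sketch of that idea, assuming the command arrives as one string; `split_command` is a hypothetical helper, and how ecs-deploy itself splits the string is not shown here:

```python
import shlex


def split_command(command):
    """Split a command string into a list, dropping empty arguments.

    An empty variable at the start (e.g. '"" --any-option 1') would otherwise
    leave an empty first element in the container's command list.
    """
    return [part for part in shlex.split(command) if part]
```

Using `shlex.split` keeps quoted arguments with spaces together, while the filter removes the empty strings produced by empty variables.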
Hi,
I have a big problem with deployments using ecs-deploy. It generally works very well, but I ran into the following situation: I want to deploy my image blue-green style, with 3 active instances in one cluster managed by one service with a desired count of 2 for the same task. When I start ecs-deploy, it deploys my updated task onto the empty instance and then shows a warning that there is not enough space and resources for the next deploy: "service was unable to place a task because no container instance met all of its requirements." AWS reports this twice, at two intervals of approximately 8 minutes. I assume that in the meantime it keeps trying to deploy, and it is true that it cannot find any free instance. But according to the blue-green methodology, after the first deploy it should drain the next instance and then do the next deploy. After two tries and warnings it finally does what I want, but my pipeline takes much longer, and in my opinion that is just a waste of time. Is there an option to force ECS to deploy without these messages and warnings (--ignore-warnings doesn't work in this particular situation)?
Okay, I want to use ecs-deploy in a rolling release setup. I run 3 container instances and this particular service is load balanced across 2 tasks.
When running ecs deploy, the command fails on my command line with 'Deployment failed' because of an error similar to this:
{
'2016-12-28T14:26:44.793000+01:00': 'ERROR: (service more-services4-MyService-130NUIEVKWEKO) was unable to place a task because no container instance met all of its requirements. The closest matching (container-instance b33c7bbd-365a-4789-ad8f-2c4316e28965) is already using a port required by your task. For more information, see the Troubleshooting section of the Amazon ECS Developer Guide.'
}
however, 5 minutes later, the new TaskDefinition is active, steady, healthy and serving because AWS is doing a rolling release (deploy new task to 1 machine, switch the ELB to that, switch off one of the old tasks, deploy the second new task, ...)
Bottom line is: I'd like to ignore this error, as in my setup it's only an intermediate error.
So, I patched my local version like this:
diff --git a/ecs_deploy/cli.py b/ecs_deploy/cli.py
index 23dbc54..8d99017 100644
--- a/ecs_deploy/cli.py
+++ b/ecs_deploy/cli.py
@@ -157,11 +157,20 @@ def wait_for_finish(action, timeout, title, success_message, failure_message):
waiting_timeout = datetime.now() + timedelta(seconds=timeout)
while waiting and datetime.now() < waiting_timeout:
sleep(1)
- click.secho('.', nl=False)
service = action.get_service()
- waiting = not action.is_deployed(service) and not service.errors
- if waiting or service.errors:
+ service_errors = dict(service.errors.items())
+
+ for error_key, error_message in dict(service_errors.items()).items():
+ if 'is already using a port required by your task' in error_message:
+ click.secho('o', nl=False)
+ del service_errors[error_key]
+
+ click.secho('.', nl=False)
+
+ waiting = not action.is_deployed(service) and not service_errors
+
+ if waiting or service_errors:
print_errors(service, waiting, failure_message)
exit(1)
Admittedly, this is not the most beautiful code because I wanted to ask for a heads up whether that would be something you'd want to integrate in ecs-deploy at all.
If so, should it be configurable?
Changed environment of container 'container_name' to: {whatever}
Users should be able to choose whether the previous and new variables are echoed to the current shell.
$ pip install boto3 ecs-deploy
Collecting boto3
Downloading boto3-1.4.7-py2.py3-none-any.whl (128kB)
Collecting ecs-deploy
Downloading ecs-deploy-1.4.0.tar.gz
Complete output from command python setup.py egg_info:
Traceback (most recent call last):
File "<string>", line 1, in <module>
File "/tmp/pip-build-vWl8tn/ecs-deploy/setup.py", line 6, in <module>
from ecs_deploy.ecs import VERSION
File "ecs_deploy/ecs.py", line 3, in <module>
from boto3.session import Session
ImportError: No module named boto3.session
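The traceback shows setup.py importing `ecs_deploy.ecs` just to get `VERSION`, which fails because boto3 is not installed yet at that point. A common workaround is to read the version from the source text instead of importing the module. A sketch, assuming the version is assigned as a plain string literal; `read_version` is a hypothetical helper:

```python
import re


def read_version(path):
    """Extract VERSION = '...' from a Python source file without importing it."""
    with open(path) as handle:
        source = handle.read()
    match = re.search(r"^VERSION\s*=\s*['\"]([^'\"]+)['\"]", source, re.MULTILINE)
    if match is None:
        raise RuntimeError("VERSION not found in %s" % path)
    return match.group(1)
```

In setup.py this could be called as `read_version("ecs_deploy/ecs.py")`, so installation no longer depends on the package's own requirements being importable.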
I'm trying to find a way to trigger a scheduled task manually in ECS. Does this tool have that capability?