bosh-aws-cpi-release's Issues

spot request being cancelled immediately

I'm attempting to use a spot bid in my bosh create-env. Although bosh create-env says it is waiting for the instance, AWS reports that the bid was cancelled immediately.

[Screenshot: 2017-11-01, 10:03:43 am]

To my bosh create-env -o bosh-deployment/aws/cpi.yml I've added the following operation:

- type: replace
  path: /resource_pools/name=vms/cloud_properties/spot_bid_price?
  value: 0.05

Can you think of reasons the aws cpi might be immediately cancelling the spot request?

The first time I attempted this, I waited and got this error:

Deploying:
  Creating instance 'bosh/0':
    Creating VM:
      Creating vm with stemcell cid 'ami-1084a875 light':
        CPI 'create_vm' method responded with error: CmdError{"type":"Bosh::Clouds::VMCreationFailed","message":"Spot instance creation failed: #\u003cBosh::Clouds::VMCreationFailed: Timed out waiting for spot request #\u003cstruct Aws::EC2::Types::RequestSpotInstancesResult spot_instance_requests=[#\u003cstruct Aws::EC2::Types::SpotInstanceRequest actual_block_hourly_price=nil, availability_zone_group=nil, block_duration_minutes=nil, create_time=2017-10-31 23:54:11 UTC, fault=nil, instance_id=nil, launch_group=nil, launch_specification=#\u003cstruct Aws::EC2::Types::LaunchSpecification user_data=nil, security_groups=[#\u003cstruct Aws::EC2::Types::GroupIdentifier group_name=\"default\", group_id=nil\u003e], addressing_type=nil, block_device_mappings=[#\u003cstruct Aws::EC2::Types::BlockDeviceMapping device_name=\"/dev/sdb\", virtual_name=nil, ebs=#\u003cstruct Aws::EC2::Types::EbsBlockDevice encrypted=nil, delete_on_termination=true, iops=nil, snapshot_id=nil, volume_size=25, volume_type=\"gp2\"\u003e, no_device=nil\u003e, #\u003cstruct Aws::EC2::Types::BlockDeviceMapping device_name=\"/dev/xvda\", virtual_name=nil, ebs=#\u003cstruct Aws::EC2::Types::EbsBlockDevice encrypted=nil, delete_on_termination=true, iops=nil, snapshot_id=nil, volume_size=nil, volume_type=\"gp2\"\u003e, no_device=nil\u003e], ebs_optimized=nil, iam_instance_profile=nil, image_id=\"ami-1084a875\", instance_type=\"m4.xlarge\", kernel_id=nil, key_name=\"admin-community-ohio\", network_interfaces=[#\u003cstruct Aws::EC2::Types::InstanceNetworkInterfaceSpecification associate_public_ip_address=nil, delete_on_termination=nil, description=nil, device_index=0, groups=[], ipv_6_address_count=nil, ipv_6_addresses=[], network_interface_id=nil, private_ip_address=\"10.10.1.4\", private_ip_addresses=[], secondary_private_ip_address_count=nil, subnet_id=\"subnet-d7aef4ac\"\u003e], placement=#\u003cstruct Aws::EC2::Types::SpotPlacement availability_zone=\"us-east-2b\", group_name=nil, tenancy=nil\u003e, 
ramdisk_id=nil, subnet_id=nil, monitoring=#\u003cstruct Aws::EC2::Types::RunInstancesMonitoringEnabled enabled=false\u003e\u003e, launched_availability_zone=nil, product_description=\"Linux/UNIX\", spot_instance_request_id=\"sir-y28rr38h\", spot_price=\"0.030000\", state=\"open\", status=#\u003cstruct Aws::EC2::Types::SpotInstanceStatus code=\"pending-evaluation\", message=\"Your Spot request has been submitted for review, and is pending evaluation.\", update_time=2017-10-31 23:54:11 UTC\u003e, tags=[], type=\"one-time\", valid_from=nil, valid_until=nil\u003e]\u003e to be fulfilled.\u003e","ok_to_retry":false}

I'm not yet sure how "Your Spot request has been submitted for review, and is pending evaluation" relates to the AWS console status of "canceled-before-fulfillment: Your Spot request is canceled before it was fulfilled".

I was able to manually create a spot instance:

[Screenshot: 2017-11-01, 10:13:48 am]

Working with AWS china Ningxia region

Hi Team, I'm working with the AWS China Ningxia region (cn-northwest-1). My workstation and bosh are deployed in eu-west-1, and I'm trying to deploy to China.
I got the following error while uploading a light stemcell:

Error 100: Unknown CPI error 'Unknown' with message 'undefined method `endpoint' for #<Aws::EC2::Client>' in 'create_stemcell' CPI method

I added the cn endpoint properties to bosh via the manifest:

properties:
  aws:
    ec2_endpoint: "<redacted>"
    elb_endpoint: "<redacted>"

I'm still facing the same error. Is there anything else that needs to be configured for bosh to work with the China region?

Error when compiling native extensions for Eventmachine on OSX 10.10.3

make
compiling binder.cpp
In file included from binder.cpp:20:
In file included from ./project.h:29:
In file included from /Library/Developer/CommandLineTools/usr/bin/../include/c++/v1/iostream:38:
In file included from /Library/Developer/CommandLineTools/usr/bin/../include/c++/v1/ios:216:
In file included from /Library/Developer/CommandLineTools/usr/bin/../include/c++/v1/__locale:15:
In file included from /Library/Developer/CommandLineTools/usr/bin/../include/c++/v1/string:439:
In file included from /Library/Developer/CommandLineTools/usr/bin/../include/c++/v1/algorithm:628:
In file included from /Library/Developer/CommandLineTools/usr/bin/../include/c++/v1/memory:604:
/Library/Developer/CommandLineTools/usr/bin/../include/c++/v1/iterator:341:10: fatal error: '__debug' file not found
#include <__debug>
         ^
1 error generated.
make: *** [binder.o] Error 1

Gem files will remain installed in /Users/rubenkoster/.bosh_init/installations/4228a1d2-d428-463e-76b8-72fc7aadc83b/packages/bosh_aws_cpi/gem_home/ruby/1.9.1/gems/eventmachine-1.0.7 for inspection.
Results logged to /Users/rubenkoster/.bosh_init/installations/4228a1d2-d428-463e-76b8-72fc7aadc83b/packages/bosh_aws_cpi/gem_home/ruby/1.9.1/gems/eventmachine-1.0.7/ext/gem_make.out
An error occurred while installing eventmachine (1.0.7), and Bundler cannot
continue.
Make sure that `gem install eventmachine -v '1.0.7'` succeeds before bundling.

[AWS]Error: Unknown CPI error 'Unknown' with message 'snapshotId can only be modified on EBS devices' in 'create_vm' CPI method

I deployed using bbl for cf-deploy.

bosh cli  version : 3.0.1-712bfd7-2018-03-13T23:26:43Z
director  version : 265.2.0 (00000000)
bbl version : bbl 6.7.1 (linux/amd64)
stemcell version: 3586.8

The deploy command is as follows:

bosh deploy -d cf cf-deployment/cf-deployment.yml \
 --vars-store cf-deployment-vars.yml \
 -o cf-deployment/operations/aws.yml \
 -o cf-deployment/operations/scale-to-one-az.yml \
 -v system_domain=cf-test4.com

Then I got this error:

Task 39 | 07:23:47 | Preparing deployment: Preparing deployment (00:00:02)
Task 39 | 07:23:52 | Preparing package compilation: Finding packages to compile (00:00:01)
Task 39 | 07:23:53 | Compiling packages: libtool/3e211ee9e3aab09a9e8a9ff55ab9ce9ba81590d945640e2d29c078597db33a94
Task 39 | 07:23:53 | Compiling packages: autoconf/4f8914a0ada02006da32066d2e374e2a065d3acafbceb60ffd45ea146df7af1f
Task 39 | 07:23:53 | Compiling packages: libseccomp/11f3a8e12b881d4d6f660d84df70d3fe99b12e1621f9e93ff66e120aec12ec96
Task 39 | 07:23:53 | Compiling packages: tar/f922f97b27619f8331332e4186c1f9c63fa5b4ddd213c4a02c7ed7cf68fced21
Task 39 | 07:23:53 | Compiling packages: apparmor/0133a2288b0f71d05d89bd117c957297ad19cd19c0253236ba0ce4dedeb27186
Task 39 | 07:23:53 | Compiling packages: busybox/02b86e9e891e78294e366cfc3361514b752308952f56b954f7a6ae31961f859c
Task 39 | 07:24:46 | Compiling packages: autoconf/4f8914a0ada02006da32066d2e374e2a065d3acafbceb60ffd45ea146df7af1f (00:00:53)
                   L Error: Unknown CPI error 'Unknown' with message 'snapshotId can only be modified on EBS devices' in 'create_vm' CPI method
Task 39 | 07:24:46 | Compiling packages: libseccomp/11f3a8e12b881d4d6f660d84df70d3fe99b12e1621f9e93ff66e120aec12ec96 (00:00:53)
                   L Error: Unknown CPI error 'Unknown' with message 'snapshotId can only be modified on EBS devices' in 'create_vm' CPI method
Task 39 | 07:24:46 | Compiling packages: busybox/02b86e9e891e78294e366cfc3361514b752308952f56b954f7a6ae31961f859c (00:00:53)
                   L Error: Unknown CPI error 'Unknown' with message 'snapshotId can only be modified on EBS devices' in 'create_vm' CPI method
Task 39 | 07:24:47 | Compiling packages: libtool/3e211ee9e3aab09a9e8a9ff55ab9ce9ba81590d945640e2d29c078597db33a94 (00:00:54)
                   L Error: Unknown CPI error 'Unknown' with message 'snapshotId can only be modified on EBS devices' in 'create_vm' CPI method
Task 39 | 07:24:47 | Compiling packages: apparmor/0133a2288b0f71d05d89bd117c957297ad19cd19c0253236ba0ce4dedeb27186 (00:00:54)
                   L Error: Unknown CPI error 'Unknown' with message 'snapshotId can only be modified on EBS devices' in 'create_vm' CPI method
Task 39 | 07:24:47 | Compiling packages: tar/f922f97b27619f8331332e4186c1f9c63fa5b4ddd213c4a02c7ed7cf68fced21 (00:00:54)
                   L Error: Unknown CPI error 'Unknown' with message 'snapshotId can only be modified on EBS devices' in 'create_vm' CPI method
Task 39 | 07:24:47 | Error: Unknown CPI error 'Unknown' with message 'snapshotId can only be modified on EBS devices' in 'create_vm' CPI method

Task 39 Started  Tue Jun  5 07:23:47 UTC 2018
Task 39 Finished Tue Jun  5 07:24:47 UTC 2018
Task 39 Duration 00:01:00
Task 39 error

Updating deployment:
  Expected task '39' to succeed but state is 'error'

Exit code 1

What does that mean?
Error: Unknown CPI error 'Unknown' with message 'snapshotId can only be modified on EBS devices' in 'create_vm' CPI method

Bosh - creating an aws instance - merge two calls into one

We are currently using bosh version 262.3. My question is about how bosh provisions
EC2 instances in AWS through bosh-aws-cpi.

To give a little context on how provisioning on AWS used to happen:

In March, AWS Blog announced two new features we’re implementing:
Ability to enforce required tags on RunInstances with IAM policy
Ability to include tags on RunInstances with API and CLI

This means that before this date, all provisioning was done in two steps:

ec2 run-instances --image-id --count 1 --instance-type t2.micro --subnet-id

ec2-create-tags --tag "Name=<name_value>" --tag
"App=<app_value>" --tag "AppOwner=<app_owner_value>" --tag
"Environment=<env_value>"

The new workflow allows instance creation and tag management to be done in a single
call.

ec2 run-instances --image-id --count 1 --instance-type t2.micro --subnet-id
--tag-specifications
'ResourceType=instance,Tags=[{Key=Name,Value=<name_value>},{Key=App,Value=<app_value>},{Key=AppOwner,Value=<app_owner_value>},{Key=Environment,Value=<env_value>}]'

Bosh (tested with version 262.3) still seems to provision instances in multiple
calls instead of a single one. Bosh issues one API call just to create the instance, and
then follows that up with a second API call to create-tags.
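
As a rough illustration of the single-call approach, the sketch below builds a parameter hash that carries the tags alongside the instance settings. The helper name `run_instances_params_with_tags` and the sample values are hypothetical and not taken from the CPI; the `tag_specifications` shape follows the `--tag-specifications` option shown above, and no API call is made here.

```ruby
# Hypothetical helper: merges tags into the parameters for one RunInstances
# call, so no follow-up create-tags call is needed.
def run_instances_params_with_tags(base_params, tags)
  tag_list = tags.map { |key, value| { key: key.to_s, value: value.to_s } }
  base_params.merge(
    tag_specifications: [
      { resource_type: 'instance', tags: tag_list }
    ]
  )
end

# Sample values are placeholders for illustration only.
params = run_instances_params_with_tags(
  { image_id: 'ami-00000000', instance_type: 't2.micro', min_count: 1, max_count: 1 },
  { 'Name' => 'example', 'Environment' => 'dev' }
)
# With the AWS SDK for Ruby loaded, this hash would be handed to
# Aws::EC2::Client#run_instances(params).
```

Whether the SDK version bundled with a given CPI release accepts `tag_specifications` would need to be checked before relying on this.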

Allow the use of STS generated credentials

We make use of linked accounts and pretty strictly disallow the creation of "local" IAM users, so we really need to make use of the sts assume-role capability which results in:

SecretAccessKey,
SessionToken,
AccessKeyId

It doesn't look like I can currently make use of this approach, since src/bosh_aws_cpi/lib/cloud/aws/cloud.rb does an Aws::Credentials.new(access_key, secret_access_key) call. The library should support using the session token as well, per:
http://docs.aws.amazon.com/sdkforruby/api/Aws/Credentials.html

Not sure what the operational implications are though, so maybe not the best approach.
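
A minimal sketch of the idea, assuming the CPI options gained an optional `session_token` key (that key name, and the helper `credential_args`, are hypothetical). It only assembles the positional arguments; `Aws::Credentials.new` in the AWS SDK for Ruby accepts the session token as an optional third argument.

```ruby
# Hypothetical helper: returns the positional arguments for a credentials
# object, appending the session token only when one is configured.
def credential_args(aws_options)
  args = [
    aws_options.fetch('access_key_id'),
    aws_options.fetch('secret_access_key')
  ]
  token = aws_options['session_token']
  args << token unless token.nil? || token.empty?
  args
end

# Placeholder values for illustration only.
static_args = credential_args({ 'access_key_id' => 'AKIAEXAMPLE', 'secret_access_key' => 'secret' })
sts_args = credential_args({ 'access_key_id' => 'ASIAEXAMPLE', 'secret_access_key' => 'secret',
                             'session_token' => 'token' })
# With the SDK loaded: Aws::Credentials.new(*sts_args)
```

One operational implication is that STS credentials are temporary, so something would have to refresh them before they expire.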

Missing snapshots cause repeated orphan disk cleanup retries

I'm not entirely sure if this is a director issue or a CPI issue, but... when deleting a snapshot fails, it bubbles up to the director as an unknown error...

{ "result": null,
  "error": {
    "type": "Unknown",
    "message":"The snapshot 'snap-a1b2c3d4' does not exist.",
    "ok_to_retry": false ...

Long-standing behavior, but with the introduction of orphan disks this becomes much more noticeable. The every-30-minute cron job continuously retries deleting the same snapshot indefinitely. This leads to excess orphan disks hanging around and can easily fill up the ~FIFO task log history if unnoticed. I think the only workaround is manually deleting the snapshot entry from the database.

In the case of deleting disks, director will rescue the Bosh::Clouds::DiskNotFound error that cpi raises and always destroy the entry from the database.

In the case of deleting snapshots, director will rescue the same (?) Bosh::Clouds::DiskNotFound error, but cpi doesn't raise for that case so the generic error is thrown and director doesn't know what to do.
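
The behaviour being asked for could look roughly like the sketch below. All class and method names are hypothetical stand-ins (neither the real CPI nor the director is used): the CPI side translates the API's "does not exist" failure into a typed error, and the cleanup side treats that typed error as "already gone" and still drops its record.

```ruby
# Hypothetical stand-ins, defined only so the sketch runs on its own.
class ApiSnapshotMissing < StandardError; end # what the cloud API call raises
class SnapshotNotFound < StandardError; end   # typed error the caller can rescue

# CPI side: translate the API failure into the typed error.
def delete_snapshot(snapshot_id)
  yield snapshot_id
rescue ApiSnapshotMissing => e
  raise SnapshotNotFound, "Snapshot '#{snapshot_id}' not found: #{e.message}"
end

# Director side: a missing snapshot still counts as cleaned up.
def cleanup_snapshot(snapshot_id, records, &api_delete)
  begin
    delete_snapshot(snapshot_id, &api_delete)
  rescue SnapshotNotFound
    # Already gone in the IaaS; fall through and drop the record anyway.
  end
  records.delete(snapshot_id)
end

records = ['snap-a1b2c3d4', 'snap-00000000']
cleanup_snapshot('snap-a1b2c3d4', records) do |id|
  raise ApiSnapshotMissing, "The snapshot '#{id}' does not exist."
end
```

Any other error still propagates and leaves the record in place, so real failures keep being retried.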

Often getting the error "Request limit exceeded." using AWS API. Is aws.max_retries the solution?

When deploying a CF installation on AWS, sometimes I hit the error AWS::EC2::Errors::RequestLimitExceeded Request limit exceeded..

In some portions of this CPI's code, calls are wrapped in a decorator that retries the requests, but several other calls reach the AWS API through the AWS SDK library without being wrapped, especially when querying existing state (e.g. listing AMI images).

In this PR the solution is to wrap all of the AWS client calls with some retry logic in the CPI code, but as mentioned, there are several implicit calls to the API that are not wrapped.

But I do think that the max_retries configuration can solve that. bosh-aws-cpi uses the AWS SDK v1 (the latest version is v2), and from reviewing the SDK code it seems the requests are retried by the library.

So, if a higher value of aws.max_retries solves this issue, I suggest increasing the default value and/or documenting the issue and the setting, as the issue is really common. I also wonder if we need to wrap the code with Bosh::Common.retryable at all.
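
For reference, the kind of retry being discussed can be sketched generically as below. This is not the SDK's retry logic nor the implementation of Bosh::Common.retryable; the error class, the helper `with_backoff`, and its parameters are hypothetical and only illustrate exponential backoff on a throttling error.

```ruby
# Hypothetical stand-in for the throttling error raised by the API client.
class RequestLimitExceeded < StandardError; end

# Retries the block on throttling, doubling the delay after each failure.
# `sleeper` is injectable so the delays can be observed without waiting.
def with_backoff(max_retries: 8, base_delay: 0.5, sleeper: ->(seconds) { sleep(seconds) })
  attempt = 0
  begin
    yield
  rescue RequestLimitExceeded
    raise if attempt >= max_retries
    sleeper.call(base_delay * (2**attempt))
    attempt += 1
    retry
  end
end

# Usage: the block fails twice with throttling, then succeeds.
calls = 0
delays = []
result = with_backoff(max_retries: 3, sleeper: ->(seconds) { delays << seconds }) do
  calls += 1
  raise RequestLimitExceeded, 'Request limit exceeded.' if calls < 3
  :ok
end
```

Wrapping every call site this way is exactly the repetition that a library-level setting would avoid.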

Error example:

D, [2016-01-12 09:53:13 #18310] [] DEBUG -- DirectorJobRunner: Renewing lock: lock:deployment:matt
D, [2016-01-12 09:53:13 #18310] [create_missing_vm(runner_z1, 0/1)] DEBUG -- DirectorJobRunner: External CPI got response: {"result":null,"error":{"type":"Unknown","message":"Request limit exceeded.","ok_to_retry":false},"log":"Rescued Unknown: Request limit exceeded.. backtrace: /var/vcap/packages/bosh_aws_cpi/vendor/bundle/ruby/2.1.0/gems/aws-sdk-v1-1.60.2/lib/aws/core/client.rb:375:in `return_or_raise'\n/var/vcap/packages/bosh_aws_cpi/vendor/bundle/ruby/2.1.0/gems/aws-sdk-v1-1.60.2/lib/aws/core/client.rb:476:in `client_request'\n(eval):3:in `describe_images'\n/var/vcap/packages/bosh_aws_cpi/vendor/bundle/ruby/2.1.0/gems/aws-sdk-v1-1.60.2/lib/aws/ec2/resource.rb:72:in `describe_call'\n/var/vcap/packages/bosh_aws_cpi/vendor/bundle/ruby/2.1.0/gems/aws-sdk-v1-1.60.2/lib/aws/ec2/resource.rb:56:in `get_resource'\n/var/vcap/packages/bosh_aws_cpi/vendor/bundle/ruby/2.1.0/gems/aws-sdk-v1-1.60.2/lib/aws/core/resource.rb:235:in `block (2 levels) in define_attribute_getter'\n/var/vcap/packages/bosh_aws_cpi/vendor/bundle/ruby/2.1.0/gems/aws-sdk-v1-1.60.2/lib/aws/core/cacheable.rb:63:in `retrieve_attribute'\n/var/vcap/packages/bosh_aws_cpi/vendor/bundle/ruby/2.1.0/gems/aws-sdk-v1-1.60.2/lib/aws/ec2/resource.rb:66:in `retrieve_attribute'\n/var/vcap/packages/bosh_aws_cpi/vendor/bundle/ruby/2.1.0/gems/aws-sdk-v1-1.60.2/lib/aws/core/resource.rb:235:in `block in define_attribute_getter'\n/var/vcap/packages/bosh_aws_cpi/lib/cloud/aws/instance_manager.rb:120:in `build_instance_params'\n/var/vcap/packages/bosh_aws_cpi/lib/cloud/aws/instance_manager.rb:81:in `create'\n/var/vcap/packages/bosh_aws_cpi/lib/cloud/aws/cloud.rb:127:in `block in create_vm'\n/var/vcap/packages/bosh_aws_cpi/vendor/bundle/ruby/2.1.0/gems/bosh_common-1.3071.0/lib/common/thread_formatter.rb:49:in `with_thread_name'\n/var/vcap/packages/bosh_aws_cpi/lib/cloud/aws/cloud.rb:122:in `create_vm'\n/var/vcap/packages/bosh_aws_cpi/vendor/bundle/ruby/2.1.0/gems/bosh_cpi-1.3071.0/lib/bosh/cpi/cli.rb:71:in 
`public_send'\n/var/vcap/packages/bosh_aws_cpi/vendor/bundle/ruby/2.1.0/gems/bosh_cpi-1.3071.0/lib/bosh/cpi/cli.rb:71:in `run'\n/var/vcap/packages/bosh_aws_cpi/bin/aws_cpi:28:in `<main>'"}, err: I, [2016-01-12T09:53:10.833273 #18490]  INFO -- : [AWS EC2 200 0.134925 0 retries] describe_regions()  

I, [2016-01-12T09:53:11.073708 #18490]  INFO -- : [AWS EC2 200 0.229609 0 retries] describe_images(:filters=>[{:name=>"image-id",:values=>["ami-8bf7c4fc"]}])  

I, [2016-01-12T09:53:13.168565 #18490]  INFO -- : [AWS EC2 503 2.078879 2 retries] describe_images(:image_ids=>["ami-8bf7c4fc"]) AWS::EC2::Errors::RequestLimitExceeded Request limit exceeded.

E, [2016-01-12T09:53:13.169478 #18490] ERROR -- : Failed to create instance: Request limit exceeded.
/var/vcap/packages/bosh_aws_cpi/vendor/bundle/ruby/2.1.0/gems/aws-sdk-v1-1.60.2/lib/aws/core/client.rb:375:in `return_or_raise'
/var/vcap/packages/bosh_aws_cpi/vendor/bundle/ruby/2.1.0/gems/aws-sdk-v1-1.60.2/lib/aws/core/client.rb:476:in `client_request'
(eval):3:in `describe_images'
/var/vcap/packages/bosh_aws_cpi/vendor/bundle/ruby/2.1.0/gems/aws-sdk-v1-1.60.2/lib/aws/ec2/resource.rb:72:in `describe_call'
/var/vcap/packages/bosh_aws_cpi/vendor/bundle/ruby/2.1.0/gems/aws-sdk-v1-1.60.2/lib/aws/ec2/resource.rb:56:in `get_resource'
/var/vcap/packages/bosh_aws_cpi/vendor/bundle/ruby/2.1.0/gems/aws-sdk-v1-1.60.2/lib/aws/core/resource.rb:235:in `block (2 levels) in define_attribute_getter'
/var/vcap/packages/bosh_aws_cpi/vendor/bundle/ruby/2.1.0/gems/aws-sdk-v1-1.60.2/lib/aws/core/cacheable.rb:63:in `retrieve_attribute'
/var/vcap/packages/bosh_aws_cpi/vendor/bundle/ruby/2.1.0/gems/aws-sdk-v1-1.60.2/lib/aws/ec2/resource.rb:66:in `retrieve_attribute'
/var/vcap/packages/bosh_aws_cpi/vendor/bundle/ruby/2.1.0/gems/aws-sdk-v1-1.60.2/lib/aws/core/resource.rb:235:in `block in define_attribute_getter'
/var/vcap/packages/bosh_aws_cpi/lib/cloud/aws/instance_manager.rb:120:in `build_instance_params'
/var/vcap/packages/bosh_aws_cpi/lib/cloud/aws/instance_manager.rb:81:in `create'
/var/vcap/packages/bosh_aws_cpi/lib/cloud/aws/cloud.rb:127:in `block in create_vm'
/var/vcap/packages/bosh_aws_cpi/vendor/bundle/ruby/2.1.0/gems/bosh_common-1.3071.0/lib/common/thread_formatter.rb:49:in `with_thread_name'
/var/vcap/packages/bosh_aws_cpi/lib/cloud/aws/cloud.rb:122:in `create_vm'
/var/vcap/packages/bosh_aws_cpi/vendor/bundle/ruby/2.1.0/gems/bosh_cpi-1.3071.0/lib/bosh/cpi/cli.rb:71:in `public_send'
/var/vcap/packages/bosh_aws_cpi/vendor/bundle/ruby/2.1.0/gems/bosh_cpi-1.3071.0/lib/bosh/cpi/cli.rb:71:in `run'
/var/vcap/packages/bosh_aws_cpi/bin/aws_cpi:28:in `<main>'
, exit_status: pid 18483 exit 0
E, [2016-01-12 09:53:13 #18310] [create_missing_vm(runner_z1, 0/1)] ERROR -- DirectorJobRunner: error creating vm: Unknown CPI error 'Unknown' with message 'Request limit exceeded.'

I, [2016-01-12T09:53:13.333183 #18476]  INFO -- : [AWS EC2 503 2.252245 2 retries] describe_images(:filters=>[{:name=>"image-id",:values=>["ami-8bf7c4fc"]}]) AWS::EC2::Errors::RequestLimitExceeded Request limit exceeded.

, exit_status: pid 18463 exit 0
I, [2016-01-12 09:53:13 #18310] [create_missing_vm(runner_z1, 0/1)]  INFO -- DirectorJobRunner: Cleaning up the created VM due to an error: Unknown CPI error 'Unknown' with message 'Request limit exceeded.'
D, [2016-01-12 09:53:13 #18310] [] DEBUG -- DirectorJobRunner: Worker thread raised exception: Unknown CPI error 'Unknown' with message 'Request limit exceeded.' - /var/vcap/packages/director/gem_home/ruby/2.1.0/gems/bosh_cpi-1.3153.0/lib/cloud/external_cpi.rb:108:in `handle_error'
/var/vcap/packages/director/gem_home/ruby/2.1.0/gems/bosh_cpi-1.3153.0/lib/cloud/external_cpi.rb:89:in `invoke_cpi_method'
/var/vcap/packages/director/gem_home/ruby/2.1.0/gems/bosh_cpi-1.3153.0/lib/cloud/external_cpi.rb:51:in `create_vm'
/var/vcap/packages/director/gem_home/ruby/2.1.0/gems/bosh-director-1.3153.0/lib/bosh/director/vm_creator.rb:41:in `create'
/var/vcap/packages/director/gem_home/ruby/2.1.0/gems/bosh-director-1.3153.0/lib/bosh/director/resource_pool_updater.rb:51:in `create_missing_vm'
/var/vcap/packages/director/gem_home/ruby/2.1.0/gems/bosh-director-1.3153.0/lib/bosh/director/resource_pool_updater.rb:34:in `block (4 levels) in create_missing_vms'
/var/vcap/packages/director/gem_home/ruby/2.1.0/gems/bosh_common-1.3153.0/lib/common/thread_formatter.rb:49:in `with_thread_name'
/var/vcap/packages/director/gem_home/ruby/2.1.0/gems/bosh-director-1.3153.0/lib/bosh/director/resource_pool_updater.rb:32:in `block (3 levels) in create_missing_vms'
/var/vcap/packages/director/gem_home/ruby/2.1.0/gems/bosh-director-1.3153.0/lib/bosh/director/event_log.rb:97:in `call'
/var/vcap/packages/director/gem_home/ruby/2.1.0/gems/bosh-director-1.3153.0/lib/bosh/director/event_log.rb:97:in `advance_and_track'
/var/vcap/packages/director/gem_home/ruby/2.1.0/gems/bosh-director-1.3153.0/lib/bosh/director/event_log.rb:50:in `track'
/var/vcap/packages/director/gem_home/ruby/2.1.0/gems/bosh-director-1.3153.0/lib/bosh/director/resource_pool_updater.rb:31:in `block (2 levels) in create_missing_vms'
/var/vcap/packages/director/gem_home/ruby/2.1.0/gems/bosh_common-1.3153.0/lib/common/thread_pool.rb:77:in `call'
/var/vcap/packages/director/gem_home/ruby/2.1.0/gems/bosh_common-1.3153.0/lib/common/thread_pool.rb:77:in `block (2 levels) in create_thread'
/var/vcap/packages/director/gem_home/ruby/2.1.0/gems/bosh_common-1.3153.0/lib/common/thread_pool.rb:63:in `loop'
/var/vcap/packages/director/gem_home/ruby/2.1.0/gems/bosh_common-1.3153.0/lib/common/thread_pool.rb:63:in `block in create_thread'
/var/vcap/packages/director/gem_home/ruby/2.1.0/gems/logging-1.8.2/lib/logging/diagnostic_context.rb:323:in `call'
/var/vcap/packages/director/gem_home/ruby/2.1.0/gems/logging-1.8.2/lib/logging/diagnostic_context.rb:323:in `block in create_with_logging_context'

Following some stack traces, I identified these points where this error can be raised:

https://github.com/cloudfoundry-incubator/bosh-aws-cpi-release/blob/master/src/bosh_aws_cpi/lib/cloud/aws/instance_manager.rb#L121

/var/vcap/packages/bosh_aws_cpi/vendor/bundle/ruby/2.1.0/gems/aws-sdk-v1-1.60.2/lib/aws/core/client.rb:375:in `return_or_raise'
/var/vcap/packages/bosh_aws_cpi/vendor/bundle/ruby/2.1.0/gems/aws-sdk-v1-1.60.2/lib/aws/core/client.rb:476:in `client_request'
(eval):3:in `describe_images'
/var/vcap/packages/bosh_aws_cpi/vendor/bundle/ruby/2.1.0/gems/aws-sdk-v1-1.60.2/lib/aws/ec2/resource.rb:72:in `describe_call'
/var/vcap/packages/bosh_aws_cpi/vendor/bundle/ruby/2.1.0/gems/aws-sdk-v1-1.60.2/lib/aws/ec2/resource.rb:56:in `get_resource'
/var/vcap/packages/bosh_aws_cpi/vendor/bundle/ruby/2.1.0/gems/aws-sdk-v1-1.60.2/lib/aws/core/resource.rb:235:in `block (2 levels) in define_attribute_getter'
/var/vcap/packages/bosh_aws_cpi/vendor/bundle/ruby/2.1.0/gems/aws-sdk-v1-1.60.2/lib/aws/core/cacheable.rb:63:in `retrieve_attribute'
/var/vcap/packages/bosh_aws_cpi/vendor/bundle/ruby/2.1.0/gems/aws-sdk-v1-1.60.2/lib/aws/ec2/resource.rb:66:in `retrieve_attribute'
/var/vcap/packages/bosh_aws_cpi/vendor/bundle/ruby/2.1.0/gems/aws-sdk-v1-1.60.2/lib/aws/core/resource.rb:235:in `block in define_attribute_getter'
/var/vcap/packages/bosh_aws_cpi/lib/cloud/aws/instance_manager.rb:120:in `build_instance_params'
/var/vcap/packages/bosh_aws_cpi/lib/cloud/aws/instance_manager.rb:81:in `create'
/var/vcap/packages/bosh_aws_cpi/lib/cloud/aws/cloud.rb:127:in `block in create_vm'
/var/vcap/packages/bosh_aws_cpi/vendor/bundle/ruby/2.1.0/gems/bosh_common-1.3071.0/lib/common/thread_formatter.rb:49:in `with_thread_name'
/var/vcap/packages/bosh_aws_cpi/lib/cloud/aws/cloud.rb:122:in `create_vm'
/var/vcap/packages/bosh_aws_cpi/vendor/bundle/ruby/2.1.0/gems/bosh_cpi-1.3071.0/lib/bosh/cpi/cli.rb:71:in `public_send'
/var/vcap/packages/bosh_aws_cpi/vendor/bundle/ruby/2.1.0/gems/bosh_cpi-1.3071.0/lib/bosh/cpi/cli.rb:71:in `run'
/var/vcap/packages/bosh_aws_cpi/bin/aws_cpi:28:in `<main>'

https://github.com/cloudfoundry-incubator/bosh-aws-cpi-release/blob/master/src/bosh_aws_cpi/lib/cloud/aws/stemcell.rb#L9

/var/vcap/packages/bosh_aws_cpi/vendor/bundle/ruby/2.1.0/gems/aws-sdk-v1-1.60.2/lib/aws/core/client.rb:375:in `return_or_raise'
/var/vcap/packages/bosh_aws_cpi/vendor/bundle/ruby/2.1.0/gems/aws-sdk-v1-1.60.2/lib/aws/core/client.rb:476:in `client_request'
(eval):3:in `describe_images'
/var/vcap/packages/bosh_aws_cpi/vendor/bundle/ruby/2.1.0/gems/aws-sdk-v1-1.60.2/lib/aws/ec2/image.rb:220:in `exists?'
/var/vcap/packages/bosh_aws_cpi/lib/cloud/aws/stemcell.rb:9:in `find'
/var/vcap/packages/bosh_aws_cpi/lib/cloud/aws/stemcell_finder.rb:8:in `find_by_region_and_id'
/var/vcap/packages/bosh_aws_cpi/lib/cloud/aws/cloud.rb:124:in `block in create_vm'
/var/vcap/packages/bosh_aws_cpi/vendor/bundle/ruby/2.1.0/gems/bosh_common-1.3071.0/lib/common/thread_formatter.rb:49:in `with_thread_name'
/var/vcap/packages/bosh_aws_cpi/lib/cloud/aws/cloud.rb:122:in `create_vm'
/var/vcap/packages/bosh_aws_cpi/vendor/bundle/ruby/2.1.0/gems/bosh_cpi-1.3071.0/lib/bosh/cpi/cli.rb:71:in `public_send'
/var/vcap/packages/bosh_aws_cpi/vendor/bundle/ruby/2.1.0/gems/bosh_cpi-1.3071.0/lib/bosh/cpi/cli.rb:71:in `run'
/var/vcap/packages/bosh_aws_cpi/bin/aws_cpi:28:in `<main>'

Heavy and Light Stemcell names differ

Heavy and light stemcell names in stemcell.MF differ for the AWS stemcells:

Light name: bosh-aws-xen-hvm-ubuntu-trusty-go_agent
Heavy name: bosh-aws-xen-ubuntu-trusty-go_agent

Our understanding is that they should represent the same stemcell inside bosh, which checks for name and version. In the Google stemcells, the name is the same.

Unable to use EIP on VPC instance

I see the following error when attempting to create-env an existing manifest with an elastic IP after upgrading to v63. Reverted to v62 to resurrect the VM.

You must specify an allocation id when mapping an address to a VPC instance

Sounds like it needs to be passing allocation_id: "eipalloc-a1b2c3d4" instead of public_ip: "203.0.113.182" since the instance is on a VPC? It seems like I'd have the same issue when running on a director, but I didn't try. I tried deploying twice and saw the error both times before reverting. The relevant debug messages look like...

D, [2017-04-18T06:36:58.546463 #17434] DEBUG -- : [Aws::EC2::Client 400 0.066058 0 retries]
  associate_address(instance_id:"i-a1b2c3d4e5f6a7b8c9",public_ip:"203.0.113.182")
  Aws::EC2::Errors::InvalidParameterCombination You must specify an allocation id when mapping an address to a VPC instance

E, [2017-04-18T06:36:58.546655 #17434] ERROR -- : Failed to create instance: You must specify an allocation id when mapping an address to a VPC instance
  ./vendor/bundle/ruby/2.2.0/gems/aws-sdk-core-2.9.6/lib/seahorse/client/plugins/raise_response_errors.rb:15:in `call'
  ./vendor/bundle/ruby/2.2.0/gems/aws-sdk-core-2.9.6/lib/aws-sdk-core/plugins/jsonvalue_converter.rb:20:in `call'
  ./vendor/bundle/ruby/2.2.0/gems/aws-sdk-core-2.9.6/lib/aws-sdk-core/plugins/idempotency_token.rb:18:in `call'
  ./vendor/bundle/ruby/2.2.0/gems/aws-sdk-core-2.9.6/lib/aws-sdk-core/plugins/param_converter.rb:20:in `call'
  ./vendor/bundle/ruby/2.2.0/gems/aws-sdk-core-2.9.6/lib/seahorse/client/plugins/response_target.rb:21:in `call'
  ./vendor/bundle/ruby/2.2.0/gems/aws-sdk-core-2.9.6/lib/seahorse/client/request.rb:70:in `send_request'
  ./vendor/bundle/ruby/2.2.0/gems/aws-sdk-core-2.9.6/lib/seahorse/client/base.rb:207:in `block (2 levels) in define_operation_methods'
  ./lib/cloud/aws/vip_network.rb:36:in `block in configure'
  ./vendor/bundle/ruby/2.2.0/gems/bosh_common-1.3262.24.0/lib/common/retryable.rb:28:in `call'
  ./vendor/bundle/ruby/2.2.0/gems/bosh_common-1.3262.24.0/lib/common/retryable.rb:28:in `block in retryer'
  ./vendor/bundle/ruby/2.2.0/gems/bosh_common-1.3262.24.0/lib/common/retryable.rb:26:in `loop'
  ./vendor/bundle/ruby/2.2.0/gems/bosh_common-1.3262.24.0/lib/common/retryable.rb:26:in `retryer'
  ./vendor/bundle/ruby/2.2.0/gems/bosh_common-1.3262.24.0/lib/common/common.rb:119:in `retryable'
  ./lib/cloud/aws/vip_network.rb:35:in `configure'
  ./lib/cloud/aws/network_configurator.rb:75:in `configure'
  ./lib/cloud/aws/cloud.rb:190:in `block in create_vm'
  ./vendor/bundle/ruby/2.2.0/gems/bosh_common-1.3262.24.0/lib/common/thread_formatter.rb:49:in `with_thread_name'
  ./lib/cloud/aws/cloud.rb:154:in `create_vm'
  ./vendor/bundle/ruby/2.2.0/gems/bosh_cpi-2.1.0/lib/bosh/cpi/cli.rb:81:in `public_send'
  ./vendor/bundle/ruby/2.2.0/gems/bosh_cpi-2.1.0/lib/bosh/cpi/cli.rb:81:in `run'
  ./bin/aws_cpi:29:in `<main>'

The create_vm-related call...

[vmManager] 2017/04/18 06:36:36 DEBUG - Creating VM with network interfaces:
  map[string]property.Map{
    "public": property.Map{
      "type": "manual",
      "gateway": "192.0.2.1",
      "dns": []string{"8.8.8.8", "8.8.4.4"},
      "netmask": "255.255.240.0",
      "cloud_properties": property.Map{"subnet": "subnet-a1b2c3d4"},
      "ip": "192.0.2.8",
      "default": []manifest.NetworkDefault{"dns", "gateway"}
    },
    "vip": property.Map{
      "cloud_properties": property.Map{},
      "ip": "203.0.113.182",
      "type": "vip"
    }
  }
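
A possible shape for the fix suggested above, as a sketch only: look up the Elastic IP's allocation id and pass that to the association call when one exists. The helper `associate_vip` and the stub client are hypothetical; the stub mimics only the two EC2 client calls used here (`describe_addresses` and `associate_address`) so the example runs without the SDK or an AWS account.

```ruby
# Hypothetical helper: prefer allocation_id (required for VPC instances),
# falling back to public_ip when the address has no allocation id.
def associate_vip(ec2_client, instance_id, public_ip)
  addresses = ec2_client.describe_addresses(public_ips: [public_ip]).addresses
  raise ArgumentError, "Elastic IP #{public_ip} not found" if addresses.empty?

  allocation_id = addresses.first.allocation_id
  if allocation_id
    ec2_client.associate_address(instance_id: instance_id, allocation_id: allocation_id)
  else
    ec2_client.associate_address(instance_id: instance_id, public_ip: public_ip)
  end
end

# Minimal stub standing in for an EC2 client; it records association calls.
Address = Struct.new(:public_ip, :allocation_id)
AddressesResult = Struct.new(:addresses)

class StubEc2
  attr_reader :calls

  def initialize(addresses)
    @addresses = addresses
    @calls = []
  end

  def describe_addresses(public_ips:)
    AddressesResult.new(@addresses.select { |a| public_ips.include?(a.public_ip) })
  end

  def associate_address(**params)
    @calls << params
  end
end

client = StubEc2.new([Address.new('203.0.113.182', 'eipalloc-a1b2c3d4')])
associate_vip(client, 'i-a1b2c3d4e5f6a7b8c9', '203.0.113.182')
```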

AWS Tagging limitation

I have some questions about tagging on AWS.

  1. Could bosh accept tagging parameters that are passed in to the instances it creates? That way I wouldn't have to go back and tag items after instance or volume creation.
  2. Instead of creating several tags that eat into the tag limit, could the bosh-specific tags be combined? E.g. Key Name="Bosh Tags", Values of Tags="bosh:job:index:something;else". This would use up 1 tag instead of 5, for example:
"Tags": [
        {
            "Value": "router-partition-50386dce3e87b8779d04/1",
            "Key": "Name"
        },
        {
            "Value": "cf-c563a8009d0819f5d361",
            "Key": "deployment"
        },
        {
            "Value": "p-bosh-1e0844c6e06194193362",
            "Key": "director"
        },
        {
            "Value": "1",
            "Key": "index"
        },
        {
            "Value": "router-partition-50386dce3e87b8779d04",
            "Key": "job"
        } 
       ]
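
To make the second suggestion concrete, here is a small sketch that folds the bosh-specific tags into one combined tag. The helper `pack_bosh_tags`, the packed key name `bosh`, and the `key=value;...` encoding are assumptions made for illustration, not an existing CPI feature.

```ruby
# Hypothetical helper: keeps the tags listed in `keep` as-is and folds the
# rest into a single tag whose value is "key=value" pairs joined by `separator`.
def pack_bosh_tags(tags, keep: ['Name'], packed_key: 'bosh', separator: ';')
  kept, packed = tags.partition { |key, _value| keep.include?(key) }
  value = packed.map { |key, val| "#{key}=#{val}" }.join(separator)
  kept.to_h.merge(packed_key => value)
end

tags = {
  'Name' => 'router-partition-50386dce3e87b8779d04/1',
  'deployment' => 'cf-c563a8009d0819f5d361',
  'director' => 'p-bosh-1e0844c6e06194193362',
  'index' => '1',
  'job' => 'router-partition-50386dce3e87b8779d04'
}
packed = pack_bosh_tags(tags)
# Five tags become two: 'Name' plus one combined 'bosh' tag.
```

The trade-off is that a packed value is harder to filter on by individual field, and it must still fit within the provider's tag value length limit.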

Packaged Bundler version doesn't match Gemfile.lock version

The packaged Bundler version is 1.11.2, however the CPI was bundled with Bundler version 1.12.5. This produces a warning on my system:

'Warning: the running version of Bundler is older than the version that created the lockfile. We suggest you upgrade to the latest version of Bundler by running `gem install bundler`.

I think this warning is breaking the CPI because it's attempting to parse the stdout as JSON:

Command 'deploy' failed:
  creating stemcell (bosh-aws-xen-hvm-ubuntu-trusty-go_agent 3262.12):
    Unmarshalling external CPI command output: STDOUT: 'Warning: the running version of Bundler is older than the version that created the lockfile. We suggest you upgrade to the latest version of Bundler by running `gem install bundler`.
{"result":"ami-983cecf8 light","error":null,"log":""}', STDERR: 'I, [2016-10-15T01:37:12.120032 #2308]  INFO -- : [AWS EC2 200 18.312085 0 retries] describe_regions()

I, [2016-10-15T01:37:12.948406 #2308]  INFO -- : [AWS EC2 200 0.828374 0 retries] describe_images(:filters=>[{:name=>"image-id",:values=>["ami-79b26618","ami-4d1fca23","ami-b671aad5","ami-fe47769d","ami-7b4ab914","ami-fcabd78f","ami-3c8b1b50","ami-36c0b421","ami-e78bc787","ami-983cecf8"]}])

':
      invalid character 'W' looking for beginning of value

This could be because I have BOSH_INIT_LOG_LEVEL set to DEBUG or it could be for other reasons TBD.

unworking example manifest

Tried to deploy bosh using a bosh deployed with bosh-micro on aws...

It looks like the example manifest has too many things configured to 127.0.0.1. Pretty much all the jobs except redis need a non-local IP that can be accessed from other vms. A bosh deployed with bosh-micro will appear to be working until you try to deploy something with it.

If you're using a dynamic network, then you have to configure all the jobs to know about the vip.

If you're using a manual network with a static ip then the jobs can use that, provided you include the aws dns in your dns list.

retry during VM creation led to "IP address in use"

We're using aws-cpi-release 62.

A deployment failed with

Error 100: Unknown CPI error 'Unknown' with message 'Address 10.0.100.34 is in use.' in 'create_vm' CPI method

Looking into AWS instances, I found the instance occupying the IP address was from the same deployment, created at the exact time the bosh deploy was running. However, bosh instances didn't know about a VM using this IP address. Therefore, all subsequent deployments failed with the same error message.

Looking into the debug logs, I found this suspicious little thing:

I, [2017-05-04T09:32:17.927986 #21457]  INFO -- : Creating new instance with: {:image_id=>"ami-f4b4679b", :instance_type=>"m3.large", :key_name=>"bosh-hcp-live-eu10", :user_data=><redacted>, :block_device_mappings=>[{:device_name=>"/dev/sdb", :ebs=>{:volume_size=>32, :volume_type=>"gp2", :delete_on_termination=>true}}, {:device_name=>"/dev/xvda", :ebs=>{:volume_type=>"gp2", :delete_on_termination=>true}}], :placement=>{:availability_zone=>"eu-central-1a"}, :network_interfaces=>[{:groups=>["sg-3c931154", "sg-8a9210e2", "sg-879210ef"], :subnet_id=>"subnet-b1c90ad9", :private_ip_address=>"10.0.100.34", :device_index=>0}]}
I, [2017-05-04T09:32:17.928175 #21457]  INFO -- : Launching on demand instance...
I, [2017-05-04T09:32:25.946468 #21457]  INFO -- : [AWS EC2 400 8.017247 1 retries] run_instances(:block_device_mappings=>[{:device_name=>"/dev/sdb",:ebs=>{:delete_on_termination=>true,:volume_size=>32,:volume_type=>"gp2"}},{:device_name=>"/dev/xvda",:ebs=>{:delete_on_termination=>true,:volume_type=>"gp2"}}],:image_id=>"ami-f4b4679b",:instance_type=>"m3.large",:key_name=>"bosh-hcp-live-eu10",:max_count=>1,:min_count=>1,:network_interfaces=>[{:device_index=>0,:groups=>["sg-3c931154","sg-8a9210e2","sg-879210ef"],:private_ip_address=>"10.0.100.34",:subnet_id=>"subnet-b1c90ad9"}],:placement=>{:availability_zone=>"eu-central-1a"},:user_data=>"<redacted>") AWS::EC2::Errors::InvalidIPAddress::InUse Address 10.0.100.34 is in use.

What's particularly interesting is the 1 retries part. Did the CPI or the AWS library it uses actually retry creating the VM? It seems the first call eventually succeeded and all following ones failed.

build fails on OS X 10.11 El Capitan with rbenv and Homebrew

Hello folks! I'm following this document and during step 2 I repeatedly get the following error:

$ bosh-init deploy ./bosh.yml
Deployment manifest: '/Users/rk/deployments/rk-stage-paas/bosh.yml'
Deployment state: '/Users/rk/deployments/rk-stage-paas/bosh-state.json'

Started validating
  Downloading release 'bosh'... Skipped [Found in local cache] (00:00:00)
  Validating release 'bosh'... Finished (00:00:00)
  Downloading release 'bosh-aws-cpi'... Skipped [Found in local cache] (00:00:00)
  Validating release 'bosh-aws-cpi'... Finished (00:00:00)
  Validating cpi release... Finished (00:00:00)
  Validating deployment manifest... Finished (00:00:00)
  Downloading stemcell... Skipped [Found in local cache] (00:00:00)
  Validating stemcell... Finished (00:00:00)
Finished validating (00:00:00)

Started installing CPI
  Compiling package 'ruby_aws_cpi/a5b66d011ce1b31642ff148ea2c9097af65ff78c'... Finished (00:00:00)
  Compiling package 'bosh_aws_cpi/d7ffe4e7cd4cc233372185d8fd9374b737c3320a'... Failed (00:00:03)
Failed installing CPI (00:00:03)

Command 'deploy' failed:
  Installing CPI:
    Compiling job package dependencies for installation:
      Compiling job package dependencies:
        Compiling package:
          Running command: 'bash -x packaging', stdout: 'Installing rake 10.3.2
Installing CFPropertyList 2.3.1
Installing addressable 2.3.8
Installing json 1.8.3
Installing mini_portile 0.6.2

Gem::Installer::ExtensionBuildError: ERROR: Failed to build gem native extension.

    /Users/rk/.bosh_init/installations/d93aa675-d3f6-4371-5a8b-e704073bbc8f/packages/ruby_aws_cpi/bin/ruby extconf.rb
checking if the C compiler accepts ... yes
checking if the C compiler accepts -Wno-error=unused-command-line-argument-hard-error-in-future... no
Building nokogiri using packaged libraries.
*** extconf.rb failed ***
Could not create Makefile due to some reason, probably lack of necessary
libraries and/or headers.  Check the mkmf.log file for more details.  You may
need configuration options.

Provided configuration options:
    --with-opt-dir
    --with-opt-include
    --without-opt-include=${opt-dir}/include
    --with-opt-lib
    --without-opt-lib=${opt-dir}/lib
    --with-make-prog
    --without-make-prog
    --srcdir=.
    --curdir
    --ruby=/Users/rk/.bosh_init/installations/d93aa675-d3f6-4371-5a8b-e704073bbc8f/packages/ruby_aws_cpi/bin/ruby
    --help
    --clean
    --use-system-libraries
/Users/rk/.bosh_init/installations/d93aa675-d3f6-4371-5a8b-e704073bbc8f/packages/ruby_aws_cpi/lib/ruby/site_ruby/2.1.0/rubygems/core_ext/kernel_require.rb:55:in `require': cannot load such file -- openssl (LoadError)
    from /Users/rk/.bosh_init/installations/d93aa675-d3f6-4371-5a8b-e704073bbc8f/packages/ruby_aws_cpi/lib/ruby/site_ruby/2.1.0/rubygems/core_ext/kernel_require.rb:55:in `require'
    from /Users/rk/.bosh_init/installations/d93aa675-d3f6-4371-5a8b-e704073bbc8f/packages/ruby_aws_cpi/lib/ruby/2.1.0/net/https.rb:22:in `<top (required)>'
    from /Users/rk/.bosh_init/installations/d93aa675-d3f6-4371-5a8b-e704073bbc8f/packages/ruby_aws_cpi/lib/ruby/site_ruby/2.1.0/rubygems/core_ext/kernel_require.rb:55:in `require'
    from /Users/rk/.bosh_init/installations/d93aa675-d3f6-4371-5a8b-e704073bbc8f/packages/ruby_aws_cpi/lib/ruby/site_ruby/2.1.0/rubygems/core_ext/kernel_require.rb:55:in `require'
    from /Users/rk/.bosh_init/installations/d93aa675-d3f6-4371-5a8b-e704073bbc8f/packages/bosh_aws_cpi/vendor/bundle/ruby/2.1.0/gems/mini_portile-0.6.2/lib/mini_portile.rb:3:in `<top (required)>'
    from /Users/rk/.bosh_init/installations/d93aa675-d3f6-4371-5a8b-e704073bbc8f/packages/ruby_aws_cpi/lib/ruby/site_ruby/2.1.0/rubygems/core_ext/kernel_require.rb:135:in `require'
    from /Users/rk/.bosh_init/installations/d93aa675-d3f6-4371-5a8b-e704073bbc8f/packages/ruby_aws_cpi/lib/ruby/site_ruby/2.1.0/rubygems/core_ext/kernel_require.rb:135:in `rescue in require'
    from /Users/rk/.bosh_init/installations/d93aa675-d3f6-4371-5a8b-e704073bbc8f/packages/ruby_aws_cpi/lib/ruby/site_ruby/2.1.0/rubygems/core_ext/kernel_require.rb:144:in `require'
    from extconf.rb:395:in `<main>'


Gem files will remain installed in /Users/rk/.bosh_init/installations/d93aa675-d3f6-4371-5a8b-e704073bbc8f/packages/bosh_aws_cpi/vendor/bundle/ruby/2.1.0/gems/nokogiri-1.6.6.2 for inspection.
Results logged to /Users/rk/.bosh_init/installations/d93aa675-d3f6-4371-5a8b-e704073bbc8f/packages/bosh_aws_cpi/vendor/bundle/ruby/2.1.0/gems/nokogiri-1.6.6.2/ext/nokogiri/gem_make.out
An error occurred while installing nokogiri (1.6.6.2), and Bundler cannot
continue.
Make sure that `gem install nokogiri -v '1.6.6.2'` succeeds before bundling.
', stderr: '+ set -e -x
+ BOSH_PACKAGES_DIR=/Users/rk/.bosh_init/installations/d93aa675-d3f6-4371-5a8b-e704073bbc8f/packages
+ cp -a bosh_aws_cpi/Gemfile bosh_aws_cpi/Gemfile.lock bosh_aws_cpi/bin bosh_aws_cpi/bosh_aws_cpi.gemspec bosh_aws_cpi/lib bosh_aws_cpi/scripts bosh_aws_cpi/spec bosh_aws_cpi/vendor /Users/rk/.bosh_init/installations/d93aa675-d3f6-4371-5a8b-e704073bbc8f/packages/bosh_aws_cpi
+ bundle_cmd=/Users/rk/.bosh_init/installations/d93aa675-d3f6-4371-5a8b-e704073bbc8f/packages/ruby_aws_cpi/bin/bundle
+ cd /Users/rk/.bosh_init/installations/d93aa675-d3f6-4371-5a8b-e704073bbc8f/packages/bosh_aws_cpi
+ /Users/rk/.bosh_init/installations/d93aa675-d3f6-4371-5a8b-e704073bbc8f/packages/ruby_aws_cpi/bin/bundle install --local --no-prune --deployment
':
            exit status 5

I provided the required OpenSSL version by installing Homebrew's openssl package and linking it into place with brew link --force openssl. Having done this, I can build Nokogiri 1.6.6.2 on this system:

$ gem list --local nokogiri

*** LOCAL GEMS ***

nokogiri (1.6.6.2)

However, it looks like the bosh-aws-cpi build script isn't correctly finding the necessary dependencies in my environment. What's the recommended procedure for using this package on OS X 10.11?

create-env is broken with v68

We are using bosh-deployment with latest versions. Since yesterday we are not able to execute create-env with AWS CPI anymore. We are using access_key_id and secret_access_key. Based on the release notes for the bumped AWS CPI v68, nothing should change for us.

The error message we get is:

Deploying:
  Creating instance 'bosh/0':
    Creating VM:
      Creating vm with stemcell cid 'ami-304dcf5f light':
        CPI 'create_vm' method responded with error: CmdError{"type":"Unknown","message":"You are not authorized to perform this operation.","ok_to_retry":false}

Exit code 1

Is this related to the new optional session-token? We did not specify it as it says its optional.

We are not modifying credentials_source so the default value static applies.
Does the documentation for session-token (Optional, used when aws.credentials_source is set to 'static') mean "it 'might' be used with aws.credentials_source=static" or does it rather mean "you 'have to' use it when aws.credentials_source=static"?

AWS CPI cpi.json endpoint registry.password should be URL encoded

The template cpi.json in https://github.com/cloudfoundry-incubator/bosh-aws-cpi-release/blob/master/jobs/aws_cpi/templates/cpi.json.erb#L17 creates an endpoint URL for the registry that includes the registry password.

But if the password includes special characters, the rendered URL will be invalid.

For instance, a password like:

    registry:
      password: a/invalid/password
      username: admin

Will result in a rendered URL like: http://admin:a/invalid/password@<host>:25777, which will result in the AWS CPI failing with the following error: Unknown CPI error 'Unknown' with message 'Invalid port number: "a"'

The solution should be to URL-encode the password in the template, so that the URL will be: http://admin:a%2Finvalid%2Fpassword@<host>:25777
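
As a rough sketch of that fix, the encoding can be done with `ERB::Util.url_encode` from Ruby's standard library. The host name below is a made-up placeholder, and how the real template builds the URL is not shown here, so treat the variable names as assumptions.

```ruby
require 'erb'
require 'uri'

username = 'admin'
password = 'a/invalid/password'
host     = 'registry.example.internal' # hypothetical host, for illustration only
port     = 25777

# Percent-encode the password so '/' cannot be mistaken for the start of the path.
encoded_password = ERB::Util.url_encode(password)
endpoint = "http://#{username}:#{encoded_password}@#{host}:#{port}"

parsed = URI.parse(endpoint)
puts endpoint
puts parsed.port

# The consumer of the URL has to decode the password again before using it.
puts URI.decode_www_form_component(parsed.password)
```

The username would need the same treatment if it can contain reserved characters.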

Support for EBS-optimisation setting on non optimised instances

When an EBS-optimised instance is configured in the manifest, the instance is launched without taking this property into account! I noticed afterwards that the EBS-optimised setting in the AWS EC2 instances panel is not activated.
The list of supported instances is here:
http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/EBSOptimized.html#ebs-optimization-support

Launching the stemcell AMI directly in AWS sets this automatically (although it is just visual - read below).

Apparently it has no effect if the instance launched is from the list above, as quoted from the link:
"Note that some instance types are EBS–optimized by default. For instances that are EBS–optimized by default, there is no need to enable EBS optimization and there is no effect if you disable EBS optimization using the CLI or API. You can enable EBS optimization for the other instance types that support EBS optimization when you launch the instances, or enable EBS optimization after the instances are running."
The optimisation is automatically enabled without any additional cost.

Now, in the case of non-optimised instances they could take advantage of this feature.

Found a good explanation of this setting:
https://dzone.com/articles/when-amazon-ebs-optimized

Tested with AWS CPI v63

Discussion on slack:
https://cloudfoundry.slack.com/archives/C02HPPYQ2/p1492531440364851

Selecting correct Security Group by name for accounts with multiple VPCs

It is possible to specify security groups by name or id. While the specification of ids works correctly for accounts containing multiple VPCs, the specification by name has a problem.

It somehow randomly(?) maps the name to a valid security group in the given account. But it does
not check whether the selected security group belongs to the VPC the given network belongs to.
As a result typically the VM creation fails if there are multiple VPCs using the same security groups, because the selected security group id is not valid for the network the VM is created for.

We are using IDs now, but the resulting manifests are quite unreadable. It would be much better
to be able to use security group name instead.
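
To illustrate the lookup I would expect, here is a small sketch that resolves names only among the groups of the network's VPC. It works on plain data rather than the AWS SDK; the struct, the function name and all IDs are hypothetical.

```ruby
# Hypothetical in-memory view of the security groups in one account.
SecurityGroup = Struct.new(:id, :name, :vpc_id)

# Resolve group names to IDs, considering only groups in the given VPC.
def resolve_security_group_ids(groups, names, vpc_id)
  names.map do |name|
    match = groups.find { |group| group.name == name && group.vpc_id == vpc_id }
    raise ArgumentError, "security group '#{name}' not found in VPC #{vpc_id}" if match.nil?
    match.id
  end
end

groups = [
  SecurityGroup.new('sg-11111111', 'bosh', 'vpc-aaaa'),
  SecurityGroup.new('sg-22222222', 'bosh', 'vpc-bbbb'),
  SecurityGroup.new('sg-33333333', 'cf',   'vpc-bbbb')
]

puts resolve_security_group_ids(groups, %w[bosh cf], 'vpc-bbbb').inspect
```

With the real API the same restriction could presumably be expressed as a VPC filter on the describe call, but I have not checked how the CPI performs the lookup today.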

Using publicly accessible AMI for testing

Hi Team,
Can we use more publicly accessible AMIs for all the integration tests? It would really help when testing changes locally. It is also sometimes difficult to figure out which variables need to change for AMIs and other test-related settings.

The doc on how to "Manually run lifecycle tests" doesn't list all the variables that can be modified.

Thanks,
Gaurab

It should be possible to define AWS Security Groups in resource pool, not only network

Problem description

AWS Security Groups are a property that can be specified per EC2 instance at creation. They are also used by other resources such as ELBs and RDS instances. Different instances and resources can use different Security Groups, regardless of the VPC and subnet they are located in.

But the BOSH AWS CPI only allows security groups to be assigned in the network definition (networks.<id>.cloud_properties.security_groups), so ALL the instances defined in that network get the same security groups.

To work around the issue when different instances in the same network need different security groups, one is forced to "artificially" split the network definition in two. If several security groups are required, several networks must be defined, which makes the manifest difficult to understand and manage. See below for an example.

Expected behaviour

As security groups in AWS are assigned per instance, it is desirable that they can be defined in the resource pool, as in resource_pool.<id>.cloud_properties.security_groups.

As the CPI currently allows the SG definition in the network, that feature should not be removed. Instead, the new feature could be added in one of these ways:

  • resource_pool.<id>.cloud_properties.security_groups completely overrides the ones defined in the networks.
  • resource_pool.<id>.cloud_properties.security_groups is merged with the ones defined in the network.
  • Two new properties, resource_pool.<id>.cloud_properties.security_groups to override and resource_pool.<id>.cloud_properties.additional_security_groups to merge.
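
The third option could behave roughly like the sketch below. The function name is hypothetical and the precedence (override first, then append the additional groups) is my assumption about the intended semantics, not existing CPI behaviour.

```ruby
# Hypothetical merge rule for the third option:
# - resource pool 'security_groups' replaces the network's groups when present
# - resource pool 'additional_security_groups' is appended in either case
def effective_security_groups(network_groups, pool_properties)
  override   = pool_properties['security_groups']
  additional = pool_properties.fetch('additional_security_groups', [])
  base = (override.nil? || override.empty?) ? network_groups : override
  (base + additional).uniq
end

network_groups = ['default']

puts effective_security_groups(network_groups, {}).inspect
puts effective_security_groups(network_groups, 'additional_security_groups' => ['cf_rds_clients']).inspect
puts effective_security_groups(network_groups, 'security_groups' => ['api_only']).inspect
```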

Example use case

We want to use RDS for Cloud Controller and UAA, and set up security groups so that only API/CloudController and UAA instances can connect to our RDS instance.

For that, we create a new AWS security group called cf_rds_clients, which needs to be assigned ONLY to API instances and UAA.

With the current implementation we must define a new network in the manifest only for the api and uaa jobs, which is counterintuitive and requires additional work.

If SGs can be defined in the resource pool, we simply need to define a new resource pool type for api and uaa.

Example of workaround

In a normal network definition like this:

- name: cf1
  subnets:
    - range: 10.0.16.0/24
      reserved:
        - 10.0.16.2 - 10.0.16.9
      static:
        - 10.0.16.10 - 10.0.16.40
      gateway: 10.0.16.1
      dns:
        - 10.0.0.2
      cloud_properties:
        subnet: (( grab terraform_outputs.cf1_subnet_id ))

If we want to set up a new SG, we need to split that network, allocating a range of IPs that is reserved in the old network but used in the new one.

- name: cf2
  subnets:
    - range: 10.0.17.0/24
      reserved:
        - 10.0.17.2 - 10.0.17.9
        - 10.0.17.41 - 10.0.17.50
      static:
        - 10.0.17.10 - 10.0.17.40
      dynamic:
        - 10.0.17.51 - 10.0.17.255
      gateway: 10.0.17.1
      dns:
        - 10.0.0.2
      cloud_properties:
        subnet: (( grab terraform_outputs.cf2_subnet_id ))

- name: cf2_rds
  subnets:
    - range: 10.0.17.0/24
      reserved:
        - 10.0.17.2 - 10.0.17.9
        - 10.0.17.10 - 10.0.17.40
        - 10.0.17.51 - 10.0.17.255
      static: []
      dynamic:
        - 10.0.17.41 - 10.0.17.50
      gateway: 10.0.17.1
      dns:
        - 10.0.0.2
      cloud_properties:
        subnet: (( grab terraform_outputs.cf2_subnet_id ))
        security_groups:
          - (( grab terraform_outputs.default_security_group ))
          - (( grab terraform_outputs.cf_rds_client_security_group ))

Get a release compilation error with new bosh create-env cli command.

I received a gcc compilation error when creating a new BOSH director using the new BOSH CLI. Workaround used: https://gist.github.com/application2000/73fd6f4bf1be6600a2cf9f56315a2d91
Please update the BOSH documentation: it was unclear that beginning a Director deployment requires OS-specific dependencies. See http://bosh.io/docs/install-bosh-init

bosh create-env ./bosh.yaml
Deployment manifest: '/Users/nsagoo/Workspace/bosh/bosh.yaml'
Deployment state: '/Users/nsagoo/Workspace/bosh/bosh-state.json'

Started validating
Downloading release 'bosh'... Skipped [Found in local cache] (00:00:00)
Validating release 'bosh'... Finished (00:00:00)
Downloading release 'bosh-aws-cpi'... Skipped [Found in local cache] (00:00:00)
Validating release 'bosh-aws-cpi'... Finished (00:00:00)
Validating cpi release... Finished (00:00:00)
Validating deployment manifest... Finished (00:00:00)
Downloading stemcell... Skipped [Found in local cache] (00:00:00)
Validating stemcell... Finished (00:00:00)
Finished validating (00:00:01)

Started installing CPI
Compiling package 'ruby_aws_cpi/5e8696452d4676dd97010e91475e86b23b7e2042'... Failed (00:00:01)
Failed installing CPI (00:00:01)

Installing CPI:
Compiling job package dependencies for installation:
Compiling job package dependencies:
Compiling package:
Running command: 'bash -x packaging', stdout: 'Installing yaml
checking for a BSD-compatible install... /usr/bin/install -c
checking whether build environment is sane... yes
checking for a thread-safe mkdir -p... config/install-sh -c -d
checking for gawk... no
checking for mawk... no
checking for nawk... no
checking for awk... awk
checking whether make sets $(MAKE)... no
checking for gcc... gcc
checking whether the C compiler works... no
', stderr: '+ set -e -x
++ PATH=/usr/local/opt/openssl/bin:/usr/local/bin:/usr/bin:/bin
++ openssl version
+ openssl_version='OpenSSL 1.0.2h 3 May 2016'
++ echo OpenSSL 1.0.2h 3 May 2016
++ cut -f2 '-d '
++ cut -f1 -d.
+ openssl_major_version=1
+ '[' 1 == 0 ']'
+ echo 'Installing yaml'
+ tar xzf ruby_aws_cpi/yaml-0.1.5.tar.gz
+ pushd yaml-0.1.5
+ CFLAGS=-fPIC
+ ./configure --prefix=/Users/nsagoo/.bosh/installations/30403ef4-3e2a-44e8-45fb-c101bbd5c86f/packages/ruby_aws_cpi --disable-shared
configure: error: in `/Users/nsagoo/.bosh/installations/30403ef4-3e2a-44e8-45fb-c101bbd5c86f/tmp/bosh-release-pkg315473132/yaml-0.1.5':
configure: error: C compiler cannot create executables
See `config.log' for more details
':
exit status 77

Exit code 1

The operator should receive a human-readable error if region is invalid

A user on the bosh slack specified "us-east" instead of "us-east-1" as the region in their manifest. Instead of a human-readable error (e.g. "Invalid region 'us-east'"), they received this:

Starting registry... Finished (00:00:00)
Uploading stemcell 'bosh-aws-xen-hvm-ubuntu-trusty-go_agent/3012'... Failed (00:00:05)
Stopping registry... Finished (00:00:00)
Cleaning up rendered CPI jobs... Finished (00:00:00)

Command 'deploy' failed:
  creating stemcell (bosh-aws-xen-hvm-ubuntu-trusty-go_agent 3012):
    CPI 'create_stemcell' method responded with error: CmdError{"type":"Unknown","message":"getaddrinfo: nodename nor servname provided, or not known","ok_to_retry":false}
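
A cheap pre-flight check along these lines would already have caught this case. It is only a format check under an assumed naming pattern; it does not ask AWS whether the region exists, and the function name is hypothetical.

```ruby
# Assumed shape of an AWS region name: two-letter prefix, one or more
# word segments, then a number (e.g. us-east-1, eu-central-1, us-gov-west-1).
REGION_FORMAT = /\A[a-z]{2}(-[a-z]+)+-\d+\z/

# Hypothetical validation helper; returns the region or raises a readable error.
def validate_region!(region)
  return region if region.is_a?(String) && region.match?(REGION_FORMAT)

  raise ArgumentError, "Invalid region '#{region}': expected something like 'us-east-1'"
end

puts validate_region!('us-east-1')

begin
  validate_region!('us-east')
rescue ArgumentError => e
  puts e.message
end
```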

CPI doesn't cope well with rate limit during `describe instance`

When a VM is created or deleted, the CPI polls regularly (IIRC every 5 seconds) for its state. If you happen to create or delete a lot of VMs in parallel, this happens:

Task 1504332 | 16:48:27 | Updating instance etcd: etcd/7cb83d59-a9d9-4806-8132-c62f646fccf9 (0) (canary) (00:02:11)
                     L Error: Unknown CPI error 'Unknown' with message 'Request limit exceeded.' in 'delete_vm' CPI method

which shows up in the CPI debug log as a problem with describe instance, not with the delete instance call:

D, [2017-12-12T16:49:05.743678 #32102] DEBUG -- [req_id cpi-292807]: Waiting for i-0d1edda4ea71b4ad2 to be terminated, retrying in 16 seconds (4/54)
D, [2017-12-12T16:50:38.489497 #32102] DEBUG -- [req_id cpi-292807]: [Aws::EC2::Client 503 76.74499 8 retries] describe_instances(instance_ids:["i-0d1edda4ea71b4ad2"]) Aws::EC2::Errors::RequestLimitExceeded Request limit exceeded.

As you can see, the SDK already retries with backoff:
[req_id cpi-292807]: [Aws::EC2::Client 503 76.74499 8 retries] describe_instances(instance_ids:["i-0d1edda4ea71b4ad2"]) , but still fails. The VM actually gets deleted fine, it is just the polling that fails.

I feel like hitting a rate limit during polling should not cause a deployment to fail, but rather cause the CPI to back off and retry at least one to three times. The action is executed correctly; the CPI is just not able to poll for the state change.

@cdutra @cppforlife @dpb587-pivotal opinions?
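
To make the proposal concrete, here is a sketch of a wrapper around a single poll. The error class is a local stand-in for the SDK's RequestLimitExceeded error, and the helper name, attempt count and delays are assumptions, not CPI code.

```ruby
# Local stand-in for the SDK's rate limit error, so the sketch runs without the AWS gem.
class RequestLimitExceeded < StandardError; end

# Run the block; on a rate limit error wait 1x, 2x, 4x ... base_delay and try again,
# giving up (and re-raising) after max_attempts attempts.
def poll_with_backoff(max_attempts: 3, base_delay: 1.0, sleeper: ->(seconds) { sleep(seconds) })
  attempt = 0
  begin
    attempt += 1
    yield
  rescue RequestLimitExceeded
    raise if attempt >= max_attempts
    sleeper.call(base_delay * (2**(attempt - 1)))
    retry
  end
end

# Usage: a poll that is rate limited twice before it succeeds.
calls = 0
state = poll_with_backoff(max_attempts: 4, sleeper: ->(_seconds) {}) do
  calls += 1
  raise RequestLimitExceeded, 'Request limit exceeded.' if calls < 3
  'terminated'
end
puts state
```

The sleeper is injectable only so the example does not actually wait.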

create_vm CPI method not supported

For some reason, when I try to run

bosh -d cf deploy

I see the error below with the latest cf-deployment package:

L Error: Unknown CPI error 'Unknown' with message 'The requested configuration is currently not supported. Please check the documentation for supported configurations.' in 'create_vm' CPI method

fail fast if another instance already using IP address

When bosh unknowingly selects an IP address for a create_vm that is already in use, the bosh director will call out to create_vm and just block. bosh cancel-task won't stop it. You just have to wait and stew.

Could we update create_vm to first quickly double check with AWS that the selected IP address is not already in use and fail fast? Then the operator can update their cloud-config's reserved list and start again.
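
Such a pre-check might look like the sketch below. It assumes the client responds to `describe_network_interfaces` with `subnet-id` and `addresses.private-ip-address` filters, as the EC2 API does; the helper name is hypothetical, and a fake client stands in for the SDK so the example runs on its own.

```ruby
# Hypothetical pre-check: is any network interface in the subnet already using this IP?
def ip_in_use?(ec2_client, subnet_id, ip_address)
  response = ec2_client.describe_network_interfaces(
    filters: [
      { name: 'subnet-id', values: [subnet_id] },
      { name: 'addresses.private-ip-address', values: [ip_address] }
    ]
  )
  !response.network_interfaces.empty?
end

# Fake client for illustration only; a real EC2 client would be passed in instead.
FakeResponse = Struct.new(:network_interfaces)

class FakeEc2
  def initialize(used_ips)
    @used_ips = used_ips
  end

  def describe_network_interfaces(filters:)
    ip = filters.find { |f| f[:name] == 'addresses.private-ip-address' }[:values].first
    FakeResponse.new(@used_ips.include?(ip) ? [{ private_ip_address: ip }] : [])
  end
end

client = FakeEc2.new(['10.0.100.34'])
puts ip_in_use?(client, 'subnet-b1c90ad9', '10.0.100.34')
puts ip_in_use?(client, 'subnet-b1c90ad9', '10.0.100.35')
```

A check like this is inherently racy, since another instance can claim the address between the check and the launch, so it would only shorten the common failure path rather than replace error handling in create_vm.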

bosh create-env failing on Amazon Linux AMI

Hello, we are trying to install a bosh director on AWS and getting an error when running bosh create-env from an Amazon Linux AMI instance. It did work from an EC2 Ubuntu instance.

This is the error we are seeing:

creating stemcell (bosh-aws-xen-hvm-ubuntu-trusty-go_agent 3421.4):
  CPI 'create_stemcell' method responded with error: CmdError{"type":"Unknown","message":"undefined method `set_api' for #\u003cClass:0x0055c51465b038\u003e","ok_to_retry":false}

Here is the stack trace:

[cpiCmdRunner] 2017/06/26 16:05:48 DEBUG - Rescued Unknown: undefined method `set_api' for #<Class:0x0055c51465b038>. backtrace: /home/ec2-user/.bosh/installations/1bc6d574-915a-4d31-52ed-f9b34e894c4b/packages/bosh_aws_cpi/vendor/bundle/ruby/2.2.0/gems/aws-sdk-core-2.9.6/lib/aws-sdk-core/client.rb:45:in `define'
/home/ec2-user/.bosh/installations/1bc6d574-915a-4d31-52ed-f9b34e894c4b/packages/bosh_aws_cpi/vendor/bundle/ruby/2.2.0/gems/aws-sdk-core-2.9.6/lib/aws-sdk-core.rb:508:in `block in <module:Aws>'
/home/ec2-user/.bosh/installations/1bc6d574-915a-4d31-52ed-f9b34e894c4b/packages/bosh_aws_cpi/vendor/bundle/ruby/2.2.0/gems/aws-sdk-core-2.9.6/lib/aws-sdk-core.rb:498:in `call'
/home/ec2-user/.bosh/installations/1bc6d574-915a-4d31-52ed-f9b34e894c4b/packages/bosh_aws_cpi/vendor/bundle/ruby/2.2.0/gems/aws-sdk-core-2.9.6/lib/aws-sdk-core.rb:498:in `block in add_service'
/home/ec2-user/.bosh/installations/1bc6d574-915a-4d31-52ed-f9b34e894c4b/packages/bosh_aws_cpi/vendor/bundle/ruby/2.2.0/gems/aws-sdk-core-2.9.6/lib/aws-sdk-core.rb:497:in `each'
/home/ec2-user/.bosh/installations/1bc6d574-915a-4d31-52ed-f9b34e894c4b/packages/bosh_aws_cpi/vendor/bundle/ruby/2.2.0/gems/aws-sdk-core-2.9.6/lib/aws-sdk-core.rb:497:in `add_service'
/home/ec2-user/.bosh/installations/1bc6d574-915a-4d31-52ed-f9b34e894c4b/packages/bosh_aws_cpi/vendor/bundle/ruby/2.2.0/gems/aws-sdk-core-2.9.6/lib/aws-sdk-core/acm.rb:1:in `<top (required)>'
/home/ec2-user/.bosh/installations/1bc6d574-915a-4d31-52ed-f9b34e894c4b/packages/bosh_aws_cpi/vendor/bundle/ruby/2.2.0/gems/aws-sdk-core-2.9.6/lib/aws-sdk-core.rb:457:in `const_get'
/home/ec2-user/.bosh/installations/1bc6d574-915a-4d31-52ed-f9b34e894c4b/packages/bosh_aws_cpi/vendor/bundle/ruby/2.2.0/gems/aws-sdk-core-2.9.6/lib/aws-sdk-core.rb:457:in `block in sub_modules'
/home/ec2-user/.bosh/installations/1bc6d574-915a-4d31-52ed-f9b34e894c4b/packages/bosh_aws_cpi/vendor/bundle/ruby/2.2.0/gems/aws-sdk-core-2.9.6/lib/aws-sdk-core.rb:456:in `each'
/home/ec2-user/.bosh/installations/1bc6d574-915a-4d31-52ed-f9b34e894c4b/packages/bosh_aws_cpi/vendor/bundle/ruby/2.2.0/gems/aws-sdk-core-2.9.6/lib/aws-sdk-core.rb:456:in `inject'
/home/ec2-user/.bosh/installations/1bc6d574-915a-4d31-52ed-f9b34e894c4b/packages/bosh_aws_cpi/vendor/bundle/ruby/2.2.0/gems/aws-sdk-core-2.9.6/lib/aws-sdk-core.rb:456:in `sub_modules'
/home/ec2-user/.bosh/installations/1bc6d574-915a-4d31-52ed-f9b34e894c4b/packages/bosh_aws_cpi/vendor/bundle/ruby/2.2.0/gems/aws-sdk-core-2.9.6/lib/aws-sdk-core.rb:444:in `eager_autoload!'
/home/ec2-user/.bosh/installations/1bc6d574-915a-4d31-52ed-f9b34e894c4b/packages/bosh_aws_cpi/lib/cloud/aws/cloud.rb:69:in `initialize'
/home/ec2-user/.bosh/installations/1bc6d574-915a-4d31-52ed-f9b34e894c4b/packages/bosh_aws_cpi/bin/aws_cpi:24:in `new'
/home/ec2-user/.bosh/installations/1bc6d574-915a-4d31-52ed-f9b34e894c4b/packages/bosh_aws_cpi/bin/aws_cpi:24:in `block in <main>'
/home/ec2-user/.bosh/installations/1bc6d574-915a-4d31-52ed-f9b34e894c4b/packages/bosh_aws_cpi/vendor/bundle/ruby/2.2.0/gems/bosh_cpi-2.1.1/lib/bosh/cpi/cli.rb:79:in `call'
/home/ec2-user/.bosh/installations/1bc6d574-915a-4d31-52ed-f9b34e894c4b/packages/bosh_aws_cpi/vendor/bundle/ruby/2.2.0/gems/bosh_cpi-2.1.1/lib/bosh/cpi/cli.rb:79:in `run'
/home/ec2-user/.bosh/installations/1bc6d574-915a-4d31-52ed-f9b34e894c4b/packages/bosh_aws_cpi/bin/aws_cpi:29:in `<main>'

Generic error message when volume limit is exceeded

Hi,

When trying to create a new instance while the volume limit is exceeded, BOSH failed with:

Error 100: CPI error 'Bosh::Clouds::VMCreationFailed' with message 'Instance 'i-id' terminated while starting' in 'create_vm' CPI method

In the debug logs there are also no indications that the volume limit is exceeded:

I, [2017-01-05T10:15:56.785420 #28477]  INFO -- : Launching on demand instance...
I, [2017-01-05T10:15:57.474458 #28477]  INFO -- : [AWS EC2 200 0.68869 0 retries] run_instances(...)

I, [2017-01-05T10:15:57.474602 #28477]  INFO -- : Waiting for instance to be ready...
I, [2017-01-05T10:15:57.624174 #28477]  INFO -- : [AWS EC2 200 0.148772 0 retries] describe_instances(:instance_ids=>["i-id"])

D, [2017-01-05T10:15:57.625211 #28477] DEBUG -- : Waiting for i-id to be running, retrying in 2 seconds (1/54)
I, [2017-01-05T10:15:59.743833 #28477]  INFO -- : [AWS EC2 200 0.117689 0 retries] describe_instances(:instance_ids=>["i-id"])

D, [2017-01-05T10:15:59.744643 #28477] DEBUG -- : Waiting for i-id to be running, retrying in 4 seconds (2/54)
I, [2017-01-05T10:16:03.833012 #28477]  INFO -- : [AWS EC2 200 0.087468 0 retries] describe_instances(:instance_ids=>["i-id"])

E, [2017-01-05T10:16:03.833757 #28477] ERROR -- : Instance 'i-id' terminated while starting
W, [2017-01-05T10:16:03.833887 #28477]  WARN -- : Failed to configure instance 'i-id': #<Bosh::Clouds::VMCreationFailed: Instance 'i-id' terminated while starting>
I, [2017-01-05T10:16:03.926241 #28477]  INFO -- : [AWS EC2 200 0.091794 0 retries] terminate_instances(:instance_ids=>["i-id"])

I, [2017-01-05T10:16:03.926308 #28477]  INFO -- : Deleting instance settings for 'i-id'
I, [2017-01-05T10:16:03.929614 #28477]  INFO -- : Deleting instance 'i-id'
I, [2017-01-05T10:16:03.972774 #28477]  INFO -- : [AWS EC2 200 0.042442 0 retries] describe_instances(:instance_ids=>["i-id"])

I, [2017-01-05T10:16:03.973536 #28477]  INFO -- : i-id is now terminated, took 0.043842155s
E, [2017-01-05T10:16:03.973619 #28477] ERROR -- : Failed to create instance: Instance 'i-id' terminated while starting

Based on the output it is hard to tell what the problem is.

When creating an instance manually aws returns:

State transition reason
Client.VolumeLimitExceeded: Volume limit exceeded

Bosh v260.0
aws cpi v61

Resource tagging can produce an endless loop

While working on pull request #49, I noticed that the tags method in the tag_manager can produce an endless loop. If the taggable resource is not found for whatever reason (e.g. an infrastructure issue), tagging is retried forever. Does it make sense to add a maximum number of retries here?

Ephemeral disk encryption not working?

I am not sure if I am doing something wrong but I can't get the ephemeral disk encryption to work.

Here is the section of the manifest:

resource_pools:
- cloud_properties:
    availability_zone: us-east-1b
    ephemeral_disk:
      cloud_properties:
        encrypted: true
        type: gp2
      size: 30000
    instance_type: t2.small
  name: small_z1
  network: cf1
  stemcell:
    name: bosh-aws-xen-hvm-ubuntu-trusty-go_agent
    version: 3232.8

The gp2 part and the size are picked up but not the encryption. We are using v53 for the CPI. Any ideas?

bundle: No such file or directory

I get the following error when create-env attempts to compile bosh-aws-cpi/67 on OS X 10.13.1.

packaging: line 15: $HOME/.bosh/installations/7d96b0b8-bada-45c0-5770-6523a20cc6e2/packages/ruby-2.4/bin/bundle: No such file or directory

Not sure why it's happening. It seems vaguely familiar, but I'm not sure why.

When downgrading to bosh-aws-cpi/66 it works.

Full output...

Deployment manifest: 'bosh.yml'
Deployment state: 'bosh-state.json'

Started validating
  Downloading release 'bosh-aws-cpi'... Skipped [Found in local cache] (00:00:00)
  Validating release 'bosh-aws-cpi'... Finished (00:00:00)
  ...
  Validating cpi release... Finished (00:00:00)
  Validating deployment manifest... Finished (00:00:00)
  Downloading stemcell... Skipped [Found in local cache] (00:00:00)
  Validating stemcell... Finished (00:00:00)
Finished validating (00:00:02)

Started installing CPI
  Compiling package 'ruby-2.4/cff0ddb99e62cd9dfa146248dcbfd6df2a62a54e'... Finished (00:00:00)
  Compiling package 'bosh_aws_cpi/8fdc24f9ef724e7ea5204464eb64803ca851a97a'... Failed (00:00:02)
Failed installing CPI (00:00:02)

Installing CPI:
  Compiling job package dependencies for installation:
    Compiling job package dependencies:
      Compiling package:
        Running command: 'bash -x packaging', stdout: '', stderr: '+ set -e
+ BOSH_PACKAGES_DIR=$HOME/.bosh/installations/7d96b0b8-bada-45c0-5770-6523a20cc6e2/packages
+ cp -a bosh_aws_cpi/Gemfile bosh_aws_cpi/Gemfile.lock bosh_aws_cpi/bin bosh_aws_cpi/lib bosh_aws_cpi/vendor $HOME/.bosh/installations/7d96b0b8-bada-45c0-5770-6523a20cc6e2/packages/bosh_aws_cpi
+ export BUNDLE_CACHE_PATH=vendor/package
+ BUNDLE_CACHE_PATH=vendor/package
+ export BUNDLE_WITHOUT=development:test
+ BUNDLE_WITHOUT=development:test
+ bundle_cmd=$HOME/.bosh/installations/7d96b0b8-bada-45c0-5770-6523a20cc6e2/packages/ruby-2.4/bin/bundle
+ cd $HOME/.bosh/installations/7d96b0b8-bada-45c0-5770-6523a20cc6e2/packages/bosh_aws_cpi
+ $HOME/.bosh/installations/7d96b0b8-bada-45c0-5770-6523a20cc6e2/packages/ruby-2.4/bin/bundle install --local --no-prune --deployment
packaging: line 15: $HOME/.bosh/installations/7d96b0b8-bada-45c0-5770-6523a20cc6e2/packages/ruby-2.4/bin/bundle: No such file or directory
':
          exit status 127

Exit code 1

Creating snapshot is timing out

When uploading a stemcell, creating a snapshot sometimes times out. The timeout is hard-coded through the number of retries passed in here.
Can we stop passing this number in and simply use DEFAULT_TRIES, as all the other wait methods do?

Unable to use c5 instance types

When using instance_type c5.2xlarge, I get this error:

Error: Unknown CPI error 'Unknown' with message 'Enhanced networking with the Elastic Network Adapter (ENA) is required for the 'c5.2xlarge' instance type. Ensure that you are using an AMI that is enabled for ENA.' in 'create_vm' CPI method

C5 instances require specific networking drivers, as described here.

Besides the error, the deployment seemed to be fine.
Correction: Deployment was not fine 😅

Stemcell used: ubuntu trusty 3468.21

Support for Application Load Balancers

Hi AWS CPI team,

Can you please enable support for AWS ALBs? As of now, we can attach instances to ELBs like this:

resource_pools:
- name: ha_proxy_z1
  cloud_properties:
    elbs:
    - my-aws-elb

In a similar way we would like to attach instances to an AWS ALB:

resource_pools:
- name: router_z1
  cloud_properties:
    elbs:
    - my-aws-alb

I think we can keep the property name "elbs", as the names of all load balancers are unique.

Thanks for your support and best regards,

Jochen.

feature request: retry registry update

In one of our systems we noticed that (automated) bosh deployments may fail with

Bosh::Clouds::CloudError: CPI error ''Bosh::Clouds::CloudError'' with message ''Cannot update settings for ''i-04f7cada39581ccc7'', got HTTP 500'' in ''create_vm'' CPI method

There's also a "Failed to create instance" message which may be a little confusing because creation itself worked (-> should be in a different exception block than the CPI interactions?).
https://github.com/cloudfoundry-incubator/bosh-aws-cpi-release/blob/733db1cc02c626fdc39dc73eda804d8e0402e5ef/src/bosh_aws_cpi/lib/cloud/aws/cloud.rb#L152-L156

The registry log shows that the PG (RDS instance) connection broke down.

E, [2018-02-28T09:06:23.487976 #5421] ERROR -- : PG::Error: server closed the connection unexpectedly
        This probably means the server terminated abnormally
        before or while processing the request.: SELECT * FROM "registry_instances" WHERE ("instance_id" = 'i-04f7cada39581ccc7') LIMIT 1

As this can happen and other CPI related methods have retry methods already, it would be great to have one for the registry update, too.
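To illustrate the request, here is a minimal sketch of a retry wrapper with exponential backoff. It is written in Python for brevity and is not the CPI's actual code; `RegistryError`, `with_retries`, and the `tries`/`base_delay` defaults are all hypothetical names and values chosen for the example.

```python
import time


class RegistryError(Exception):
    """Hypothetical error raised when the registry answers with a non-2xx status."""

    def __init__(self, status):
        super().__init__(f"registry returned HTTP {status}")
        self.status = status


def with_retries(operation, tries=5, base_delay=0.5, sleep=time.sleep):
    """Call operation(); retry on 5xx RegistryError with exponential backoff."""
    for attempt in range(1, tries + 1):
        try:
            return operation()
        except RegistryError as err:
            # Client errors (4xx) will not succeed on retry, and the last
            # attempt has nothing left to retry: re-raise in both cases.
            if err.status < 500 or attempt == tries:
                raise
            # Wait base_delay, then 2x, 4x, ... before the next attempt.
            sleep(base_delay * 2 ** (attempt - 1))
```

With something along these lines, a transient HTTP 500 caused by a dropped database connection would be retried a few times before `create_vm` fails.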

Package all the dependencies

AFAICT, the ruby package makes assumptions about the host machine. In the past, these were probably met by bosh stemcells. But with bosh-init this CPI and its ruby package will be installed on unprepared OS X & Linux machines. Perhaps they should be converted into packages for this (and the openstack) release?

For Ubuntu the packages required are:

build-essential zlibc zlib1g-dev \
  openssl libxslt-dev libxml2-dev libssl-dev \
  libreadline6 libreadline6-dev libyaml-dev libsqlite3-dev sqlite3

Improve cpi error handling

I hit an API rate limit at AWS and received this error message from the CPI:

"Unknown CPI error 'Unknown' with message 'Request limit exceeded.' in 'create_vm' CPI method (00:01:26) "

AWS says: "If an API request exceeds the API request rate for its category, the request returns the RequestLimitExceeded error code." (http://docs.aws.amazon.com/AWSEC2/latest/APIReference/query-api-troubleshooting.html)

So I think the CPI error is not unknown. Instead, I suggest that the CPI return the "RequestLimitExceeded" error code in addition to the text message.

WDYT?
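A rough sketch of what is being suggested, in Python rather than the CPI's own code: when the AWS response carries an error code, report it instead of 'Unknown'. The function name `format_cpi_error` and the exact message layout are assumptions made for illustration.

```python
def format_cpi_error(code, message, method):
    """Build an error string; fall back to 'Unknown' only when no code is available."""
    # `code` is the AWS error code (e.g. "RequestLimitExceeded") or None.
    name = code or "Unknown"
    return f"CPI error '{name}' with message '{message}' in '{method}' CPI method"


print(format_cpi_error("RequestLimitExceeded", "Request limit exceeded.", "create_vm"))
```

Callers could then tell a rate-limit failure apart from a genuinely unknown one without parsing the message text.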

In Cloud Formation; -- Request limit exceeded. (Service: AmazonEC2; Status Code: 503; Error Code: RequestLimitExceeded; Request ID: 7fd6cf20-2880-4845-a2f5-3fa30583767f)

Hi, I'm trying to create a stack in CloudFormation with 15 Beanstalk applications. I've successfully added 13 apps, tested them with CloudFormation, and got a successful result. But when I add the 14th app, I randomly get the error below.

19:32:04 UTC+0550 | CREATE_FAILED | AWS::ElasticBeanstalk::Environment | envsocureactionauditprod1515 | Request limit exceeded. (Service: AmazonEC2; Status Code: 503; Error Code: RequestLimitExceeded; Request ID: 7fd6cf20-2880-4845-a2f5-3fa30583767f)

I've checked and already increased the EC2, ASG and CLB limits, but I don't know why I'm getting this error. Can anyone help me with this issue?

[feature] Spot Requests for longer than 1 week

Spot instances in bosh would be more useful if we could have longer duration spot requests. The valid_until option is not configurable in the CPI and defaults to 7 days from the request date.
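As a sketch of the requested behaviour (Python, not the CPI's actual code), the expiry could be derived from a configurable number of days, with 7 kept as the default described above. The function `spot_valid_until` and its `days` parameter are hypothetical.

```python
from datetime import datetime, timedelta, timezone


def spot_valid_until(requested_at, days=7):
    """Return the expiry timestamp for a spot request made at `requested_at`."""
    # `days` stands in for a hypothetical cloud property; 7 mirrors the
    # current fixed default described in this issue.
    return requested_at + timedelta(days=days)


requested = datetime(2018, 1, 1, tzinfo=timezone.utc)
print(spot_valid_until(requested))           # default: 7 days later
print(spot_valid_until(requested, days=30))  # longer-lived request
```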
