Comments (6)
Could you please help collect more information? 1) The full trace in the volume info log around 18:24:34. 2) The availability zones of the 3 EC2 instances. 3) The output of firecamp-service-cli -op=list-members -region=us-east-1 -cluster=firecamp-stage -service-name=firecamp-stage-zookeeper.
from firecamp.
Thanks, @JuniusLuo, for taking a look at this...
- ASG AvailabilityZones: us-east-1a,us-east-1b,us-east-1c
More from: firecamp-dockervolume.ERROR
This just repeats from the start of the error log (after the init log message):
E0315 18:24:34.022478 6 volume.go:829] service has no idle member &{931a5f81f9ce40ae5bc0ccde07a8747c ACTIVE 1521136846996011291 3 firecamp-stage firecamp-stage-zookeeper {/dev/xvdg {gp2 10 100 false} { 0 0 false}} true firecamp-stage-firecamp.com /hostedzone/Z1826MR4G8CQU6 false 0xc4202b5800 {0 256 0 4096} } requuid 10.0.43.217-931a5f81f9ce40ae5bc0ccde07a8747c-1521138273
E0315 18:24:34.022496 6 volume.go:592] findIdleMember error InternalError requuid 10.0.43.217-931a5f81f9ce40ae5bc0ccde07a8747c-1521138273 service &{931a5f81f9ce40ae5bc0ccde07a8747c ACTIVE 1521136846996011291 3 firecamp-stage firecamp-stage-zookeeper {/dev/xvdg {gp2 10 100 false} { 0 0 false}} true firecamp-stage-firecamp.com /hostedzone/Z1826MR4G8CQU6 false 0xc4202b5800 {0 256 0 4096} }
E0315 18:24:34.022513 6 volume.go:546] Mount failed, get service member error InternalError, serviceUUID 931a5f81f9ce40ae5bc0ccde07a8747c, requuid 10.0.43.217-931a5f81f9ce40ae5bc0ccde07a8747c-1521138273
More from: firecamp-dockervolume.INFO
I0315 18:19:09.245270 6 dynamodb_servicemember.go:270] list serviceMembers succeeded, serviceUUID 931a5f81f9ce40ae5bc0ccde07a8747c limit 0 requuid 10.0.43.217-931a5f81f9ce40ae5bc0ccde07a8747c-1521137949 resp count 0xc420061978
I0315 18:19:09.245388 6 dynamodb_servicemember.go:297] list 3 serviceMembers, serviceUUID 931a5f81f9ce40ae5bc0ccde07a8747c LastEvaluatedKey map[] requuid 10.0.43.217-931a5f81f9ce40ae5bc0ccde07a8747c-1521137949
I0315 18:19:09.245400 6 volume.go:821] member &{931a5f81f9ce40ae5bc0ccde07a8747c 1 ACTIVE firecamp-stage-zookeeper-1 us-east-1b arn:aws:ecs:us-east-1:xxxxxxxxxxxx:task/7b438391-7188-4955-ae9d-1292cbe35ac0 arn:aws:ecs:us-east-1:xxxxxxxxxxxx:container-instance/79d0b066-abda-4944-b82d-597e9b137a16 i-0cc05e662e755c435 1521136853495041077 {vol-06fc1c5d2d0d6304b /dev/xvdg } 127.0.0.1 [0xc420124960 0xc420125050 0xc420125170 0xc4201252c0 0xc4201253e0]} in use, service &{931a5f81f9ce40ae5bc0ccde07a8747c ACTIVE 1521136846996011291 3 firecamp-stage firecamp-stage-zookeeper {/dev/xvdg {gp2 10 100 false} { 0 0 false}} true firecamp-stage-firecamp.com /hostedzone/Z1826MR4G8CQU6 false 0xc4202b4c00 {0 256 0 4096} } requuid 10.0.43.217-931a5f81f9ce40ae5bc0ccde07a8747c-1521137949
E0315 18:19:09.245423 6 volume.go:829] service has no idle member &{931a5f81f9ce40ae5bc0ccde07a8747c ACTIVE 1521136846996011291 3 firecamp-stage firecamp-stage-zookeeper {/dev/xvdg {gp2 10 100 false} { 0 0 false}} true firecamp-stage-firecamp.com /hostedzone/Z1826MR4G8CQU6 false 0xc4202b4c00 {0 256 0 4096} } requuid 10.0.43.217-931a5f81f9ce40ae5bc0ccde07a8747c-1521137949
E0315 18:19:09.245439 6 volume.go:592] findIdleMember error InternalError requuid 10.0.43.217-931a5f81f9ce40ae5bc0ccde07a8747c-1521137949 service &{931a5f81f9ce40ae5bc0ccde07a8747c ACTIVE 1521136846996011291 3 firecamp-stage firecamp-stage-zookeeper {/dev/xvdg {gp2 10 100 false} { 0 0 false}} true firecamp-stage-firecamp.com /hostedzone/Z1826MR4G8CQU6 false 0xc4202b4c00 {0 256 0 4096} }
E0315 18:19:09.245455 6 volume.go:546] Mount failed, get service member error InternalError, serviceUUID 931a5f81f9ce40ae5bc0ccde07a8747c, requuid 10.0.43.217-931a5f81f9ce40ae5bc0ccde07a8747c-1521137949
I0315 18:24:33.907839 6 volume.go:147] Get volume {931a5f81f9ce40ae5bc0ccde07a8747c map[]}
I0315 18:24:33.907862 6 volume.go:166] volume is not mounted for service 931a5f81f9ce40ae5bc0ccde07a8747c
I0315 18:24:33.951591 6 volume.go:224] handle Mount {931a5f81f9ce40ae5bc0ccde07a8747c 9c8018df65bf9e0e850e049c82837cb6e24f6907fba75958bab38a5079d96a14}
I0315 18:24:33.974019 6 dynamodb_serviceattr.go:310] get service attr &{931a5f81f9ce40ae5bc0ccde07a8747c ACTIVE 1521136846996011291 3 firecamp-stage firecamp-stage-zookeeper {/dev/xvdg {gp2 10 100 false} { 0 0 false}} true firecamp-stage-firecamp.com /hostedzone/Z1826MR4G8CQU6 false 0xc4202b5800 {0 256 0 4096} } requuid 10.0.43.217-931a5f81f9ce40ae5bc0ccde07a8747c-1521138273
I0315 18:24:33.974041 6 volume.go:540] get service attr &{931a5f81f9ce40ae5bc0ccde07a8747c ACTIVE 1521136846996011291 3 firecamp-stage firecamp-stage-zookeeper {/dev/xvdg {gp2 10 100 false} { 0 0 false}} true firecamp-stage-firecamp.com /hostedzone/Z1826MR4G8CQU6 false 0xc4202b5800 {0 256 0 4096} } requuid 10.0.43.217-931a5f81f9ce40ae5bc0ccde07a8747c-1521138273
I0315 18:24:34.018568 6 ecs.go:101] list service firecamp-stage-zookeeper cluster firecamp-stage resp {
TaskArns: ["arn:aws:ecs:us-east-1:xxxxxxxxxxxx:task/536c253d-9ae7-460c-9bb4-f7d24cf53807","arn:aws:ecs:us-east-1:xxxxxxxxxxxx:task/7b438391-7188-4955-ae9d-1292cbe35ac0","arn:aws:ecs:us-east-1:xxxxxxxxxxxx:task/a2ab3e4e-558c-452c-9a78-2363c2d949d7"]
}
I0315 18:24:34.018596 6 ecs.go:119] list task arn:aws:ecs:us-east-1:xxxxxxxxxxxx:task/536c253d-9ae7-460c-9bb4-f7d24cf53807
I0315 18:24:34.018602 6 ecs.go:119] list task arn:aws:ecs:us-east-1:xxxxxxxxxxxx:task/7b438391-7188-4955-ae9d-1292cbe35ac0
I0315 18:24:34.018606 6 ecs.go:119] list task arn:aws:ecs:us-east-1:xxxxxxxxxxxx:task/a2ab3e4e-558c-452c-9a78-2363c2d949d7
I0315 18:24:34.018628 6 ecs.go:122] list 3 tasks, service firecamp-stage-zookeeper cluster firecamp-stage
I0315 18:24:34.022317 6 dynamodb_servicemember.go:270] list serviceMembers succeeded, serviceUUID 931a5f81f9ce40ae5bc0ccde07a8747c limit 0 requuid 10.0.43.217-931a5f81f9ce40ae5bc0ccde07a8747c-1521138273 resp count 0xc420526778
I0315 18:24:34.022434 6 dynamodb_servicemember.go:297] list 3 serviceMembers, serviceUUID 931a5f81f9ce40ae5bc0ccde07a8747c LastEvaluatedKey map[] requuid 10.0.43.217-931a5f81f9ce40ae5bc0ccde07a8747c-1521138273
I0315 18:24:34.022448 6 volume.go:821] member &{931a5f81f9ce40ae5bc0ccde07a8747c 1 ACTIVE firecamp-stage-zookeeper-1 us-east-1b arn:aws:ecs:us-east-1:xxxxxxxxxxxx:task/7b438391-7188-4955-ae9d-1292cbe35ac0 arn:aws:ecs:us-east-1:xxxxxxxxxxxx:container-instance/79d0b066-abda-4944-b82d-597e9b137a16 i-0cc05e662e755c435 1521136853495041077 {vol-06fc1c5d2d0d6304b /dev/xvdg } 127.0.0.1 [0xc4201b6cf0 0xc4201b6d20 0xc4201b6d50 0xc4201b6d80 0xc4201b6de0]} in use, service &{931a5f81f9ce40ae5bc0ccde07a8747c ACTIVE 1521136846996011291 3 firecamp-stage firecamp-stage-zookeeper {/dev/xvdg {gp2 10 100 false} { 0 0 false}} true firecamp-stage-firecamp.com /hostedzone/Z1826MR4G8CQU6 false 0xc4202b5800 {0 256 0 4096} } requuid 10.0.43.217-931a5f81f9ce40ae5bc0ccde07a8747c-1521138273
E0315 18:24:34.022478 6 volume.go:829] service has no idle member &{931a5f81f9ce40ae5bc0ccde07a8747c ACTIVE 1521136846996011291 3 firecamp-stage firecamp-stage-zookeeper {/dev/xvdg {gp2 10 100 false} { 0 0 false}} true firecamp-stage-firecamp.com /hostedzone/Z1826MR4G8CQU6 false 0xc4202b5800 {0 256 0 4096} } requuid 10.0.43.217-931a5f81f9ce40ae5bc0ccde07a8747c-1521138273
E0315 18:24:34.022496 6 volume.go:592] findIdleMember error InternalError requuid 10.0.43.217-931a5f81f9ce40ae5bc0ccde07a8747c-1521138273 service &{931a5f81f9ce40ae5bc0ccde07a8747c ACTIVE 1521136846996011291 3 firecamp-stage firecamp-stage-zookeeper {/dev/xvdg {gp2 10 100 false} { 0 0 false}} true firecamp-stage-firecamp.com /hostedzone/Z1826MR4G8CQU6 false 0xc4202b5800 {0 256 0 4096} }
E0315 18:24:34.022513 6 volume.go:546] Mount failed, get service member error InternalError, serviceUUID 931a5f81f9ce40ae5bc0ccde07a8747c, requuid 10.0.43.217-931a5f81f9ce40ae5bc0ccde07a8747c-1521138273
I0315 18:29:55.427880 6 volume.go:147] Get volume {931a5f81f9ce40ae5bc0ccde07a8747c map[]}
I0315 18:29:55.427934 6 volume.go:166] volume is not mounted for service 931a5f81f9ce40ae5bc0ccde07a8747c
I0315 18:29:55.472599 6 volume.go:224] handle Mount {931a5f81f9ce40ae5bc0ccde07a8747c 5ab387ef0c7537b725e18ec68507f95b556d8f27a3f668766fa2fca958a52de5}
I0315 18:29:55.503645 6 dynamodb_serviceattr.go:310] get service attr &{931a5f81f9ce40ae5bc0ccde07a8747c ACTIVE 1521136846996011291 3 firecamp-stage firecamp-stage-zookeeper {/dev/xvdg {gp2 10 100 false} { 0 0 false}} true
from firecamp.
This looks weird. It looks like 2 EC2 nodes are in us-east-1b. Could you please check the AZ of all 3 EC2 nodes?
from firecamp.
Confirmed. The cluster spun up the replacement node in the same AZ as one of the others. I can't verify whether the original one was in there too at that time. I am going to kill the stack and try again to see if that changes anything.
Out of curiosity, does it matter to the FireCamp services that each node is in its own availability zone? I mean, if I need to scale up to more nodes, it's going to double up at some point.
from firecamp.
It is weird. Could you please share the detailed configuration of the ASG? The ASG should try to distribute the nodes evenly across the 3 AZs.
Yes, each node should be in its own AZ. This is a limitation of EBS volumes. FireCamp creates an EBS volume for every service member (ZooKeeper in this case). An EBS volume belongs to one AZ and cannot be attached to an instance in another AZ. When you scale out to more nodes, it is best to add 3 nodes at a time, so that the service members are distributed across all AZs and can tolerate the failure of one AZ.
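To make the constraint concrete, here is a minimal Python sketch of the placement rule described above. It is only a simplified model, not FireCamp's actual logic: it assumes one service member per node, and the member names and zones other than firecamp-stage-zookeeper-1 in us-east-1b (which appears in the log) are hypothetical.

```python
from collections import Counter


def unplaceable_members(member_zones, instance_zones):
    """Return the members whose EBS volume sits in an AZ with no free node.

    member_zones: dict of member name -> AZ that owns the member's volume.
    instance_zones: list of AZs, one entry per EC2 node in the cluster.
    Assumption (not from FireCamp's code): each node hosts one member.
    """
    free = Counter(instance_zones)
    stuck = []
    for member, zone in sorted(member_zones.items()):
        if free[zone] > 0:
            free[zone] -= 1  # a node in the volume's AZ takes this member
        else:
            stuck.append(member)  # the volume cannot follow the task to another AZ
    return stuck


# Hypothetical layout: only member 1 / us-east-1b is taken from the log.
members = {
    "firecamp-stage-zookeeper-0": "us-east-1a",
    "firecamp-stage-zookeeper-1": "us-east-1b",
    "firecamp-stage-zookeeper-2": "us-east-1c",
}
balanced = ["us-east-1a", "us-east-1b", "us-east-1c"]
skewed = ["us-east-1a", "us-east-1b", "us-east-1b"]  # two nodes in zone b

print(unplaceable_members(members, balanced))  # []
print(unplaceable_members(members, skewed))    # ['firecamp-stage-zookeeper-2']
```

With two nodes in us-east-1b, the member whose volume lives in us-east-1c has nowhere to run, and the extra node in zone b cannot help it.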
from firecamp.
I spun up a fresh build and the ASG placed two nodes in zone b, so it was having that issue earlier today too. I dug into the ASG and found the issue: it won't deploy to us-east-1c because the EC2 instance type (m3.large in my case) isn't supported there. So I'm going back and will place it in a different availability zone and try again. Let's consider this closed: based on what you explained, having services spread evenly across the AZs is a requirement, and the issue was on my end.
Thanks again, I really appreciate the time you've been taking to help me out.
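For anyone hitting the same thing, one way to check up front which AZs offer an instance type is the EC2 DescribeInstanceTypeOfferings API. The Python sketch below is a hedged example, not part of FireCamp: fetch_offerings assumes boto3 is installed and AWS credentials are configured, and the sample response is made up to mirror what was reported in this thread, not real AWS output.

```python
def zones_offering(offerings_response, instance_type):
    """Return the sorted AZ names that offer instance_type.

    offerings_response is a dict shaped like the DescribeInstanceTypeOfferings
    response (a list of offerings under "InstanceTypeOfferings").
    """
    return sorted(
        o["Location"]
        for o in offerings_response.get("InstanceTypeOfferings", [])
        if o.get("InstanceType") == instance_type
    )


def fetch_offerings(region, instance_type):
    """Query EC2 for per-AZ offerings (needs boto3 and AWS credentials)."""
    import boto3  # imported lazily so the rest of the sketch runs offline

    ec2 = boto3.client("ec2", region_name=region)
    # Pagination is ignored here; a single instance type returns few rows.
    return ec2.describe_instance_type_offerings(
        LocationType="availability-zone",
        Filters=[{"Name": "instance-type", "Values": [instance_type]}],
    )


# Illustrative data only, mirroring the report above (no m3.large in us-east-1c).
sample = {
    "InstanceTypeOfferings": [
        {"InstanceType": "m3.large", "LocationType": "availability-zone", "Location": "us-east-1b"},
        {"InstanceType": "m3.large", "LocationType": "availability-zone", "Location": "us-east-1a"},
    ]
}
asg_zones = ["us-east-1a", "us-east-1b", "us-east-1c"]
offered = zones_offering(sample, "m3.large")
missing = [z for z in asg_zones if z not in offered]
print(missing)  # ['us-east-1c']
```

Against a real account, replace sample with fetch_offerings("us-east-1", "m3.large"); any zone left in missing is one the ASG cannot launch that instance type in.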
from firecamp.