Mount Failed (firecamp issue, CLOSED)

cloudstax commented on May 21, 2024
Mount Failed

from firecamp.

Comments (6)

JuniusLuo commented on May 21, 2024

Could you please help collect more information? 1) The full trace in the volume info log around 18:24:34. 2) The availability zones of the 3 EC2 instances. 3) The output of firecamp-service-cli -op=list-members -region=us-east-1 -cluster=firecamp-stage -service-name=firecamp-stage-zookeeper.

dev-head commented on May 21, 2024

Thanks, @JuniusLuo, for taking a look at this...

  • ASG AvailabilityZones: us-east-1a,us-east-1b,us-east-1c

More from: firecamp-dockervolume.ERROR

this just repeats from the start of the error log (after the init log message)

E0315 18:24:34.022478       6 volume.go:829] service has no idle member &{931a5f81f9ce40ae5bc0ccde07a8747c ACTIVE 1521136846996011291 3 firecamp-stage firecamp-stage-zookeeper {/dev/xvdg {gp2 10 100 false}  { 0 0 false}} true firecamp-stage-firecamp.com /hostedzone/Z1826MR4G8CQU6 false 0xc4202b5800 {0 256 0 4096} } requuid 10.0.43.217-931a5f81f9ce40ae5bc0ccde07a8747c-1521138273
E0315 18:24:34.022496       6 volume.go:592] findIdleMember error InternalError requuid 10.0.43.217-931a5f81f9ce40ae5bc0ccde07a8747c-1521138273 service &{931a5f81f9ce40ae5bc0ccde07a8747c ACTIVE 1521136846996011291 3 firecamp-stage firecamp-stage-zookeeper {/dev/xvdg {gp2 10 100 false}  { 0 0 false}} true firecamp-stage-firecamp.com /hostedzone/Z1826MR4G8CQU6 false 0xc4202b5800 {0 256 0 4096} }
E0315 18:24:34.022513       6 volume.go:546] Mount failed, get service member error InternalError, serviceUUID 931a5f81f9ce40ae5bc0ccde07a8747c, requuid 10.0.43.217-931a5f81f9ce40ae5bc0ccde07a8747c-1521138273

More from: firecamp-dockervolume.INFO

I0315 18:19:09.245270       6 dynamodb_servicemember.go:270] list serviceMembers succeeded, serviceUUID 931a5f81f9ce40ae5bc0ccde07a8747c limit 0 requuid 10.0.43.217-931a5f81f9ce40ae5bc0ccde07a8747c-1521137949 resp count 0xc420061978
I0315 18:19:09.245388       6 dynamodb_servicemember.go:297] list 3 serviceMembers, serviceUUID 931a5f81f9ce40ae5bc0ccde07a8747c LastEvaluatedKey map[] requuid 10.0.43.217-931a5f81f9ce40ae5bc0ccde07a8747c-1521137949
I0315 18:19:09.245400       6 volume.go:821] member &{931a5f81f9ce40ae5bc0ccde07a8747c 1 ACTIVE firecamp-stage-zookeeper-1 us-east-1b arn:aws:ecs:us-east-1:xxxxxxxxxxxx:task/7b438391-7188-4955-ae9d-1292cbe35ac0 arn:aws:ecs:us-east-1:xxxxxxxxxxxx:container-instance/79d0b066-abda-4944-b82d-597e9b137a16 i-0cc05e662e755c435 1521136853495041077 {vol-06fc1c5d2d0d6304b /dev/xvdg  } 127.0.0.1 [0xc420124960 0xc420125050 0xc420125170 0xc4201252c0 0xc4201253e0]} in use, service &{931a5f81f9ce40ae5bc0ccde07a8747c ACTIVE 1521136846996011291 3 firecamp-stage firecamp-stage-zookeeper {/dev/xvdg {gp2 10 100 false}  { 0 0 false}} true firecamp-stage-firecamp.com /hostedzone/Z1826MR4G8CQU6 false 0xc4202b4c00 {0 256 0 4096} } requuid 10.0.43.217-931a5f81f9ce40ae5bc0ccde07a8747c-1521137949
E0315 18:19:09.245423       6 volume.go:829] service has no idle member &{931a5f81f9ce40ae5bc0ccde07a8747c ACTIVE 1521136846996011291 3 firecamp-stage firecamp-stage-zookeeper {/dev/xvdg {gp2 10 100 false}  { 0 0 false}} true firecamp-stage-firecamp.com /hostedzone/Z1826MR4G8CQU6 false 0xc4202b4c00 {0 256 0 4096} } requuid 10.0.43.217-931a5f81f9ce40ae5bc0ccde07a8747c-1521137949
E0315 18:19:09.245439       6 volume.go:592] findIdleMember error InternalError requuid 10.0.43.217-931a5f81f9ce40ae5bc0ccde07a8747c-1521137949 service &{931a5f81f9ce40ae5bc0ccde07a8747c ACTIVE 1521136846996011291 3 firecamp-stage firecamp-stage-zookeeper {/dev/xvdg {gp2 10 100 false}  { 0 0 false}} true firecamp-stage-firecamp.com /hostedzone/Z1826MR4G8CQU6 false 0xc4202b4c00 {0 256 0 4096} }
E0315 18:19:09.245455       6 volume.go:546] Mount failed, get service member error InternalError, serviceUUID 931a5f81f9ce40ae5bc0ccde07a8747c, requuid 10.0.43.217-931a5f81f9ce40ae5bc0ccde07a8747c-1521137949
I0315 18:24:33.907839       6 volume.go:147] Get volume {931a5f81f9ce40ae5bc0ccde07a8747c map[]}
I0315 18:24:33.907862       6 volume.go:166] volume is not mounted for service 931a5f81f9ce40ae5bc0ccde07a8747c
I0315 18:24:33.951591       6 volume.go:224] handle Mount  {931a5f81f9ce40ae5bc0ccde07a8747c 9c8018df65bf9e0e850e049c82837cb6e24f6907fba75958bab38a5079d96a14}
I0315 18:24:33.974019       6 dynamodb_serviceattr.go:310] get service attr &{931a5f81f9ce40ae5bc0ccde07a8747c ACTIVE 1521136846996011291 3 firecamp-stage firecamp-stage-zookeeper {/dev/xvdg {gp2 10 100 false}  { 0 0 false}} true firecamp-stage-firecamp.com /hostedzone/Z1826MR4G8CQU6 false 0xc4202b5800 {0 256 0 4096} } requuid 10.0.43.217-931a5f81f9ce40ae5bc0ccde07a8747c-1521138273
I0315 18:24:33.974041       6 volume.go:540] get service attr &{931a5f81f9ce40ae5bc0ccde07a8747c ACTIVE 1521136846996011291 3 firecamp-stage firecamp-stage-zookeeper {/dev/xvdg {gp2 10 100 false}  { 0 0 false}} true firecamp-stage-firecamp.com /hostedzone/Z1826MR4G8CQU6 false 0xc4202b5800 {0 256 0 4096} } requuid 10.0.43.217-931a5f81f9ce40ae5bc0ccde07a8747c-1521138273
I0315 18:24:34.018568       6 ecs.go:101] list service firecamp-stage-zookeeper cluster firecamp-stage resp {
  TaskArns: ["arn:aws:ecs:us-east-1:xxxxxxxxxxxx:task/536c253d-9ae7-460c-9bb4-f7d24cf53807","arn:aws:ecs:us-east-1:xxxxxxxxxxxx:task/7b438391-7188-4955-ae9d-1292cbe35ac0","arn:aws:ecs:us-east-1:xxxxxxxxxxxx:task/a2ab3e4e-558c-452c-9a78-2363c2d949d7"]
}
I0315 18:24:34.018596       6 ecs.go:119] list task arn:aws:ecs:us-east-1:xxxxxxxxxxxx:task/536c253d-9ae7-460c-9bb4-f7d24cf53807
I0315 18:24:34.018602       6 ecs.go:119] list task arn:aws:ecs:us-east-1:xxxxxxxxxxxx:task/7b438391-7188-4955-ae9d-1292cbe35ac0
I0315 18:24:34.018606       6 ecs.go:119] list task arn:aws:ecs:us-east-1:xxxxxxxxxxxx:task/a2ab3e4e-558c-452c-9a78-2363c2d949d7
I0315 18:24:34.018628       6 ecs.go:122] list 3 tasks, service firecamp-stage-zookeeper cluster firecamp-stage
I0315 18:24:34.022317       6 dynamodb_servicemember.go:270] list serviceMembers succeeded, serviceUUID 931a5f81f9ce40ae5bc0ccde07a8747c limit 0 requuid 10.0.43.217-931a5f81f9ce40ae5bc0ccde07a8747c-1521138273 resp count 0xc420526778
I0315 18:24:34.022434       6 dynamodb_servicemember.go:297] list 3 serviceMembers, serviceUUID 931a5f81f9ce40ae5bc0ccde07a8747c LastEvaluatedKey map[] requuid 10.0.43.217-931a5f81f9ce40ae5bc0ccde07a8747c-1521138273
I0315 18:24:34.022448       6 volume.go:821] member &{931a5f81f9ce40ae5bc0ccde07a8747c 1 ACTIVE firecamp-stage-zookeeper-1 us-east-1b arn:aws:ecs:us-east-1:xxxxxxxxxxxx:task/7b438391-7188-4955-ae9d-1292cbe35ac0 arn:aws:ecs:us-east-1:xxxxxxxxxxxx:container-instance/79d0b066-abda-4944-b82d-597e9b137a16 i-0cc05e662e755c435 1521136853495041077 {vol-06fc1c5d2d0d6304b /dev/xvdg  } 127.0.0.1 [0xc4201b6cf0 0xc4201b6d20 0xc4201b6d50 0xc4201b6d80 0xc4201b6de0]} in use, service &{931a5f81f9ce40ae5bc0ccde07a8747c ACTIVE 1521136846996011291 3 firecamp-stage firecamp-stage-zookeeper {/dev/xvdg {gp2 10 100 false}  { 0 0 false}} true firecamp-stage-firecamp.com /hostedzone/Z1826MR4G8CQU6 false 0xc4202b5800 {0 256 0 4096} } requuid 10.0.43.217-931a5f81f9ce40ae5bc0ccde07a8747c-1521138273
E0315 18:24:34.022478       6 volume.go:829] service has no idle member &{931a5f81f9ce40ae5bc0ccde07a8747c ACTIVE 1521136846996011291 3 firecamp-stage firecamp-stage-zookeeper {/dev/xvdg {gp2 10 100 false}  { 0 0 false}} true firecamp-stage-firecamp.com /hostedzone/Z1826MR4G8CQU6 false 0xc4202b5800 {0 256 0 4096} } requuid 10.0.43.217-931a5f81f9ce40ae5bc0ccde07a8747c-1521138273
E0315 18:24:34.022496       6 volume.go:592] findIdleMember error InternalError requuid 10.0.43.217-931a5f81f9ce40ae5bc0ccde07a8747c-1521138273 service &{931a5f81f9ce40ae5bc0ccde07a8747c ACTIVE 1521136846996011291 3 firecamp-stage firecamp-stage-zookeeper {/dev/xvdg {gp2 10 100 false}  { 0 0 false}} true firecamp-stage-firecamp.com /hostedzone/Z1826MR4G8CQU6 false 0xc4202b5800 {0 256 0 4096} }
E0315 18:24:34.022513       6 volume.go:546] Mount failed, get service member error InternalError, serviceUUID 931a5f81f9ce40ae5bc0ccde07a8747c, requuid 10.0.43.217-931a5f81f9ce40ae5bc0ccde07a8747c-1521138273
I0315 18:29:55.427880       6 volume.go:147] Get volume {931a5f81f9ce40ae5bc0ccde07a8747c map[]}
I0315 18:29:55.427934       6 volume.go:166] volume is not mounted for service 931a5f81f9ce40ae5bc0ccde07a8747c
I0315 18:29:55.472599       6 volume.go:224] handle Mount  {931a5f81f9ce40ae5bc0ccde07a8747c 5ab387ef0c7537b725e18ec68507f95b556d8f27a3f668766fa2fca958a52de5}
I0315 18:29:55.503645       6 dynamodb_serviceattr.go:310] get service attr &{931a5f81f9ce40ae5bc0ccde07a8747c ACTIVE 1521136846996011291 3 firecamp-stage firecamp-stage-zookeeper {/dev/xvdg {gp2 10 100 false}  { 0 0 false}} true

JuniusLuo commented on May 21, 2024

This looks weird. It looks like 2 EC2 nodes are in us-east-1b. Could you please check the AZ of all 3 EC2 nodes?

dev-head commented on May 21, 2024

Confirmed. The cluster spun up the replacement node in the same AZ as one of the others. I can't verify whether the original one was in there too at that time. I am going to kill the stack and try again to see if that changes anything.

Out of curiosity, does it matter to the firecamp services that each node is in its own availability zone? I mean, if I need to scale up to more nodes, it's going to double up at some point.

JuniusLuo commented on May 21, 2024

It is weird. Could you please share the detailed configuration of the ASG? The ASG should try to distribute the nodes equally across the 3 AZs.

Yes, each node should be in its own AZ. This is a limitation of EBS volumes. FireCamp creates an EBS volume for every service member (ZooKeeper in this case). An EBS volume belongs to one AZ and cannot be attached to an instance in another AZ. When you scale out to more nodes, it is best to add 3 nodes at a time, so that the service members can be distributed across all AZs and tolerate the failure of one AZ.
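
To make the constraint concrete, here is a minimal Go sketch of an AZ-aware idle-member lookup. It is not FireCamp's actual code: the Member type, the findIdleMemberInAZ function, and the members other than firecamp-stage-zookeeper-1 (and their zones) are assumptions for illustration. It only shows why a second node in us-east-1b can end up with "service has no idle member" while a member in another zone is still free.

```go
package main

import (
	"errors"
	"fmt"
)

// Member is a hypothetical, simplified view of a service member. Each member
// owns one EBS volume, and that volume lives in exactly one availability zone.
type Member struct {
	Name  string
	AZ    string
	InUse bool // true when a running task already holds this member
}

// ErrNoIdleMember mirrors the wording of the log message in this thread.
var ErrNoIdleMember = errors.New("service has no idle member")

// findIdleMemberInAZ returns the first member that is not in use and whose
// volume is in the same AZ as the node asking to mount it.
func findIdleMemberInAZ(members []Member, nodeAZ string) (Member, error) {
	for _, m := range members {
		if !m.InUse && m.AZ == nodeAZ {
			return m, nil
		}
	}
	return Member{}, ErrNoIdleMember
}

func main() {
	// Assumed layout: only zookeeper-1 in us-east-1b appears in the logs.
	members := []Member{
		{Name: "firecamp-stage-zookeeper-0", AZ: "us-east-1a", InUse: true},
		{Name: "firecamp-stage-zookeeper-1", AZ: "us-east-1b", InUse: true},
		{Name: "firecamp-stage-zookeeper-2", AZ: "us-east-1c", InUse: false},
	}

	// A second node in us-east-1b cannot use the free member in us-east-1c.
	_, err := findIdleMemberInAZ(members, "us-east-1b")
	fmt.Println(err) // service has no idle member

	// A node in us-east-1c would get the free member.
	m, _ := findIdleMemberInAZ(members, "us-east-1c")
	fmt.Println(m.Name) // firecamp-stage-zookeeper-2
}
```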

dev-head commented on May 21, 2024

I spun up a fresh build and the ASG placed two nodes in zone b, so it was having that issue earlier today too. I dug into the ASG and found the issue: it won't deploy to us-east-1c because the EC2 instance type (m3.large in my case) isn't supported there. So I'm going back to place it in a different availability zone and try again. Let's consider this closed: based on what you explained, having services spread evenly across the AZs is a requirement, and the issue was on my end.

thanks again, I really appreciate the time you've been taking to help me out.
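
For anyone hitting the same placement problem, a small Go sketch of the check that would have flagged it early. The azSpread function is hypothetical, and the node placement in main is an assumption based on this thread (two nodes in zone b, none in zone c). It compares the zones the ASG is configured for with the zones the nodes actually landed in.

```go
package main

import (
	"fmt"
	"sort"
)

// azSpread compares the zones an ASG is configured for with the zones its
// nodes actually landed in. It returns the configured zones that have no
// node and the zones that hold more than one node, both sorted.
func azSpread(configured, nodeAZs []string) (empty, doubled []string) {
	count := make(map[string]int)
	for _, az := range nodeAZs {
		count[az]++
	}
	for _, az := range configured {
		if count[az] == 0 {
			empty = append(empty, az)
		}
	}
	for az, n := range count {
		if n > 1 {
			doubled = append(doubled, az)
		}
	}
	sort.Strings(empty)
	sort.Strings(doubled)
	return empty, doubled
}

func main() {
	configured := []string{"us-east-1a", "us-east-1b", "us-east-1c"}
	// Assumed placement: two nodes in zone b, none in zone c.
	nodes := []string{"us-east-1a", "us-east-1b", "us-east-1b"}

	empty, doubled := azSpread(configured, nodes)
	fmt.Println("zones with no node:", empty)              // [us-east-1c]
	fmt.Println("zones with more than one node:", doubled) // [us-east-1b]
}
```

If either list is non-empty for a 3-member service, at least one member's volume may have no node in its zone to mount on.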
