This issue is to track the remaining work for manager rolling updates.


Force manager leadership change during manager rolling update (deploykit, CLOSED)

kaufers commented on June 18, 2024

Force manager leadership change during manager rolling update

from deploykit.

Comments (3)

chungers commented on June 18, 2024

This is the protocol for handling the last node (the current leader):

Let's say we have

ID                            HOSTNAME            STATUS              AVAILABILITY        MANAGER STATUS
y3mzqmkhjumbm80c9k628wcjt *   ip-172-31-20-100    Ready               Active              Leader
0anmdnze0p3fdky9egdsmnut5     ip-172-31-20-101    Ready               Active              Reachable
hxv6q6cej095lw66cdp2r48oh     ip-172-31-20-102    Ready               Active              Reachable

Infrakit is running on ip-172-31-20-100.
From this node we have already updated the other nodes, ip-172-31-20-101 and ip-172-31-20-102.
Now we handle the last node, which is "self" and the current leader:

root@ip-172-31-20-100:~# docker node demote y3mzqmkhjumbm80c9k628wcjt
Manager y3mzqmkhjumbm80c9k628wcjt demoted in the swarm.
root@ip-172-31-20-100:~# docker swarm leave
Node left the swarm.

At this point, on another node:

root@ip-172-31-20-102:~# docker node ls
ID                            HOSTNAME            STATUS              AVAILABILITY        MANAGER STATUS
y3mzqmkhjumbm80c9k628wcjt     ip-172-31-20-100    Down                Active
0anmdnze0p3fdky9egdsmnut5     ip-172-31-20-101    Ready               Active              Leader
hxv6q6cej095lw66cdp2r48oh *   ip-172-31-20-102    Ready               Active              Reachable

We have a working two-node quorum, and the listing shows ip-172-31-20-101 as the leader. At this point the cluster is still running fine, but with only two managers it cannot tolerate losing another one, so we need to bring up a new manager to join the quorum soon.
Infrakit should now be running on the new leader.
Infrakit provisions a new node (ip-172-31-20-103). On this new node:

root@ip-172-31-20-103:~# docker swarm join --token SWMTKN-1-4btcqpypxxd24t194hihgaf7sy8py76gktzfx1dpasq9umjo08-1g1im49494f8nc3mr8mf6yybh 172.31.20.102:2377
This node joined a swarm as a manager.
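As a rough sketch of why the replacement manager should join promptly: Swarm managers use Raft, so a majority of managers must stay available. The helper names `quorum_size` and `fault_tolerance` below are made up for this illustration.

```shell
#!/bin/sh
# Hypothetical helpers illustrating Raft majority arithmetic for N managers.

# quorum_size N: number of managers that must be available to keep a majority.
quorum_size() {
  echo $(( $1 / 2 + 1 ))
}

# fault_tolerance N: number of managers that can be lost while keeping a majority.
fault_tolerance() {
  echo $(( ($1 - 1) / 2 ))
}
```

With three managers, `fault_tolerance 3` prints 1. After the old leader is demoted and leaves, `fault_tolerance 2` prints 0, so losing one more manager before the new one joins would cost the quorum.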

Back on ip-172-31-20-102:

root@ip-172-31-20-102:~# docker node ls
ID                            HOSTNAME            STATUS              AVAILABILITY        MANAGER STATUS
y3mzqmkhjumbm80c9k628wcjt     ip-172-31-20-100    Down                Active
0anmdnze0p3fdky9egdsmnut5     ip-172-31-20-101    Ready               Active              Leader
hxv6q6cej095lw66cdp2r48oh *   ip-172-31-20-102    Ready               Active              Reachable
n88dc4rlelwz5o53utz619ip8     ip-172-31-20-103    Ready               Active              Reachable

At this point, we have restored the three-node quorum. However, we still need to clean up the entry for node y3mzqmkhjumbm80c9k628wcjt (the previous leader, where we originally ran the demote and swarm leave). So, from a manager node:

root@ip-172-31-20-102:~# docker node rm y3mzqmkhjumbm80c9k628wcjt
y3mzqmkhjumbm80c9k628wcjt
root@ip-172-31-20-102:~# docker node ls
ID                            HOSTNAME            STATUS              AVAILABILITY        MANAGER STATUS
0anmdnze0p3fdky9egdsmnut5     ip-172-31-20-101    Ready               Active              Leader
hxv6q6cej095lw66cdp2r48oh *   ip-172-31-20-102    Ready               Active              Reachable
n88dc4rlelwz5o53utz619ip8     ip-172-31-20-103    Ready               Active              Reachable

Now we have a clean cluster with all managers successfully updated.

Note that you cannot run docker node rm from a non-manager node. However, any node (worker or manager) can run docker swarm leave, assuming a docker node demote has first been applied to a manager.

So to summarize, this is the behavior we want:

For Managers

1. Bootscript:
   a. docker swarm init for the first manager node, which becomes the leader.
   b. docker swarm join --token <manager_token> <ip> for the followers.
2. Flavor.Drain needs to do the following:
   a. docker node demote <id>, where <id> is the Swarm node ID of the current leader node (self).
   b. docker swarm leave
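A minimal sketch of what the manager drain step could look like as a shell function, assuming it runs on the manager being drained. The `DOCKER` variable and the function names are assumptions of this sketch (they only exist so the docker binary can be swapped out); this is not the Flavor.Drain implementation itself.

```shell
#!/bin/sh
# Sketch of a manager drain step: demote self, then leave the swarm.
# DOCKER is an assumption of this sketch so the binary can be stubbed.
DOCKER="${DOCKER:-docker}"

# self_node_id: print the Swarm node ID of the local node.
self_node_id() {
  "$DOCKER" info --format '{{.Swarm.NodeID}}'
}

# drain_manager: demote the local manager, then leave the swarm.
drain_manager() {
  node_id="$(self_node_id)" || return 1
  [ -n "$node_id" ] || { echo "node is not part of a swarm" >&2; return 1; }
  # Demote first so that the subsequent leave is a plain worker leave.
  "$DOCKER" node demote "$node_id" || return 1
  "$DOCKER" swarm leave
}
```

Calling `drain_manager` on the leader corresponds to the demote and leave commands shown in the transcript earlier.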

For Workers

1. Bootscript: docker swarm join --token <worker_token> <ip> for all workers.
2. Flavor.Drain only needs to do docker swarm leave (no demotion is necessary).
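The boot script behavior for both managers and workers could be sketched roughly as below. The role names (`init`, `manager`, `worker`), the `boot_node` function, and the `DOCKER` variable are assumptions made for this example; the only real difference between a manager join and a worker join is which token is presented.

```shell
#!/bin/sh
# Sketch of a boot script step that either initializes or joins a swarm.
# Role names and the function name are assumptions made for this example.
DOCKER="${DOCKER:-docker}"

# boot_node ROLE [TOKEN ADDR]
#   ROLE=init    -> first manager; creates the swarm and becomes the leader
#   ROLE=manager -> joins using the manager token
#   ROLE=worker  -> joins using the worker token
boot_node() {
  role="${1:-}"; token="${2:-}"; addr="${3:-}"
  case "$role" in
    init)
      "$DOCKER" swarm init
      ;;
    manager|worker)
      # The role a node joins as is determined by the token it presents.
      [ -n "$token" ] && [ -n "$addr" ] || {
        echo "token and address are required to join" >&2
        return 1
      }
      "$DOCKER" swarm join --token "$token" "$addr"
      ;;
    *)
      echo "unknown role: $role" >&2
      return 1
      ;;
  esac
}
```

For example, `boot_node worker "$WORKER_TOKEN" 172.31.20.102:2377` would join a worker through the manager address used in the transcript, where `WORKER_TOKEN` is a placeholder for the real token.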

Garbage Collection

Note that this doesn't account for the cleanup with docker node rm <old_leader_id>, which removes the old leader node that is now in the Down state. I think this cleanup could be implemented separately as another continuous process that runs docker node rm on any node that is in the Down status and has no actual VM instance with the link tag. We could even generalize this into a reaper that garbage collects both running VM instances that have no corresponding docker node ls entries (they failed to join the cluster) and Down entries in the Swarm that have no corresponding VM instances (the case with the old leader). To keep the scope of this issue manageable, I'd leave garbage collection as a separate issue or PR.
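One pass of such a reaper might look roughly like this. It is only a sketch under stated assumptions: it must run on a manager, it matches swarm entries to instances by hostname rather than by link tag, and the file of live hostnames is a hypothetical input whose source (for example, the instance plugin) is not shown.

```shell
#!/bin/sh
# Sketch of a reaper pass: remove Down swarm entries that have no backing VM.
# DOCKER and the live-hosts file are assumptions of this sketch.
DOCKER="${DOCKER:-docker}"

# down_nodes: print "ID HOSTNAME" for every node whose status is Down.
down_nodes() {
  "$DOCKER" node ls --format '{{.ID}} {{.Hostname}} {{.Status}}' |
    awk '$3 == "Down" { print $1, $2 }'
}

# reap_down_nodes LIVE_HOSTS_FILE
# LIVE_HOSTS_FILE lists one hostname per line for instances that still exist.
reap_down_nodes() {
  live_hosts="${1:-}"
  # Refuse to run without the list; otherwise every Down node would be removed.
  [ -r "$live_hosts" ] || { echo "live hosts file not readable" >&2; return 1; }
  down_nodes | while read -r id host; do
    if grep -qxF "$host" "$live_hosts"; then
      echo "skip $id: instance $host still exists" >&2
    else
      "$DOCKER" node rm "$id" </dev/null
    fi
  done
}
```

Keeping the instance list as an explicit input means the pass never removes an entry just because it is Down; a node whose VM still exists is left alone.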


chungers commented on June 18, 2024

PR #839 is related to this.

Since there is no API to 'demote' a swarm leader, the easiest way to accomplish this is to have the leader node destroy itself as the very last node of the rolling update. This will force another node to assume leadership, and that new leader will then provision a new manager node, thereby completing the rolling update.
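The "leader goes last" ordering can be sketched as a small helper. The function name and its inputs are assumptions of this example, not something Infrakit provides.

```shell
#!/bin/sh
# Sketch: order manager node IDs for a rolling update so that the leader
# (self) is handled last. Name and inputs are assumptions of this example.

# update_order SELF_ID ID...
# Prints the other IDs in the order given, then SELF_ID on the last line.
update_order() {
  self="$1"; shift
  for id in "$@"; do
    [ "$id" = "$self" ] || echo "$id"
  done
  echo "$self"
}
```

Given the three managers from the listing, passing the leader's ID as the first argument yields the two followers first and the leader at the end.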

Can we use this issue to track the testing of this behavior?


chungers commented on June 18, 2024

When a resource (e.g. a VM instance) is destroyed, the following steps occur:

Flavor's Drain on the resource is called.
Instance plugin's Destroy is called to delete the resource.

It turns out Drain in the swarm manager flavor isn't implemented. So that's a place where we could do a docker swarm leave on the node. The Destroy on the resource (the "self" node) is then a no-op.

This way another node could take over as leader and continue with a real Destroy.
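The sequence described here could be sketched as follows. `drain_hook` and `destroy_hook` are hypothetical placeholders standing in for the flavor's Drain and the instance plugin's Destroy; they are not Infrakit APIs, and the echo bodies only make the ordering visible.

```shell
#!/bin/sh
# Sketch of the destroy sequence: Drain first, then Destroy, with Destroy
# skipped when the target is the node running this code ("self").

# Placeholder hooks (hypothetical); real ones would call the flavor and
# instance plugins.
drain_hook() { echo "drain $1"; }
destroy_hook() { echo "destroy $1"; }

# destroy_resource TARGET_ID SELF_ID
destroy_resource() {
  target="$1"; self="$2"
  drain_hook "$target" || return 1
  if [ "$target" = "$self" ]; then
    # Self has left the swarm; the real Destroy is left to the new leader.
    echo "destroy skipped for self: $target"
    return 0
  fi
  destroy_hook "$target"
}
```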

I need to verify the behavior of the swarm managers via this sequence to make sure a new manager can join even if it has all the /var/lib/docker state from a node that technically has 'left'. If the new node can join without problems then this may be an easy implementation.

The comments for #840 on using tombstones (with symlinks) are still relevant in helping the terraform plugin recover from a crash during resource deletion.

