Comments (3)
So, I'm specifically working on this. It seems like there are two potential paths forward:
- write to the shared information service (SIS) with the information about the manual failover
- send signals to the API on each node about the failover
SIS approach:
Advantages:
- client only needs to access the SIS
- easy to integrate with SIS-based monitoring
- consistency with other signals we may want to pass via the SIS, e.g. a "nogovernor" flag
- allows signalling other replicas not to try to grab the master flag
- requires only a very simple client script
Disadvantages:
- very asynchronous; hard to know when failover is complete except by polling
- zero troubleshooting info if failover doesn't happen
- requires admins writing to the SIS, which is a new pattern
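To make the "simple client script" and the polling drawback concrete, here is a minimal sketch of what such a client might look like. It uses a plain dict as a stand-in for the SIS; the key names `manual-failover` and `leader`, and the `request_failover` helper itself, are hypothetical and not part of any existing API.

```python
import time
from typing import Callable, MutableMapping

REQUEST_KEY = "manual-failover"  # hypothetical key name, for illustration only


def request_failover(
    sis: MutableMapping[str, str],
    target: str,
    is_done: Callable[[MutableMapping[str, str]], bool],
    timeout: float = 30.0,
    interval: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Write the failover request to the SIS, then poll until it completes.

    Returns True if is_done(sis) became true before the timeout, else False.
    The client never talks to any database node directly.
    """
    sis[REQUEST_KEY] = target
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if is_done(sis):
            return True
        sleep(interval)
    # One last check; on False the client has no detail about what went wrong.
    return is_done(sis)


# Usage with a fake cluster that reacts to the request between polls.
sis = {"leader": "node1"}


def fake_cluster_step(_interval: float) -> None:
    # Pretend the cluster picked up the request and switched leaders.
    sis["leader"] = sis.pop(REQUEST_KEY, sis["leader"])


succeeded = request_failover(
    sis,
    "node2",
    is_done=lambda s: s.get("leader") == "node2",
    timeout=5.0,
    interval=0.0,
    sleep=fake_cluster_step,
)
timed_out = request_failover({}, "node2", is_done=lambda s: False, timeout=0.0)
```

Note that on a timeout the client only learns that the failover has not happened yet, which is exactly the troubleshooting gap listed above.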
Node API approach:
Advantages:
- synchronous: can find out how failover is going immediately
- doesn't require touching the main failover loop
- makes targeting specific nodes fairly obvious
Disadvantages:
- requires securing the node API, because it could now be used to change the state of the databases
- client would need to be heavy; it may have to connect to every database node to make the failover go as planned
- possible synchronization issues between data in SIS and what individual nodes are doing
Overall, it seems to me that doing this via the SIS makes more sense. Discussion?
from patroni.
The SIS approach seems like it needs only one extra piece of information: a "new-master" key for the cluster repo. The way it would work is:
- manual client writes the "new-master" key.
- the current master, at the beginning of its governor loop, checks for a new-master key. If the key names a node other than itself, it shuts down the PostgreSQL server and releases the master lock key.
- at the beginning of each replica's loop is a check for a new-master key. If present, this key preempts all other failover logic, and the server it names starts trying to acquire the master lock and become the master.
- Once the new replica has become the master, it removes the new-master key.
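The steps above can be sketched as a small in-memory simulation. This is only an illustration under assumptions: a dict stands in for the SIS, `master-lock` is a made-up name for the master lock key, and the `Node` class and `governor_loop_once` method are hypothetical, not existing code.

```python
from dataclasses import dataclass

NEW_MASTER_KEY = "new-master"    # key name from the proposal
MASTER_LOCK_KEY = "master-lock"  # hypothetical name for the master lock key


@dataclass
class Node:
    name: str
    is_master: bool = False
    postgres_running: bool = True

    def governor_loop_once(self, sis: dict) -> None:
        """Run the new-master check that precedes the normal failover logic."""
        target = sis.get(NEW_MASTER_KEY)
        if target is None:
            return  # no manual failover requested; normal logic would run here

        if self.is_master:
            if target != self.name:
                # Current master is not the target: stop PostgreSQL and
                # release the master lock.
                self.postgres_running = False
                self.is_master = False
                if sis.get(MASTER_LOCK_KEY) == self.name:
                    del sis[MASTER_LOCK_KEY]
            # The proposal does not say what happens when the key names the
            # current master; this sketch simply does nothing in that case.
            return

        # Replica: the key preempts all other failover logic.
        if target == self.name and MASTER_LOCK_KEY not in sis:
            sis[MASTER_LOCK_KEY] = self.name  # acquire the master lock
            self.is_master = True             # promote
            del sis[NEW_MASTER_KEY]           # failover done, remove the key


# Usage: node1 is master, an admin asks for node2 to take over.
sis = {MASTER_LOCK_KEY: "node1"}
node1 = Node("node1", is_master=True)
node2 = Node("node2")
node3 = Node("node3")

sis[NEW_MASTER_KEY] = "node2"  # the manual client writes the key
for _ in range(2):             # two rounds of every node's loop
    for node in (node2, node3, node1):
        node.governor_loop_once(sis)
```

The sketch ignores lock TTLs, races between replicas, and the failure cases, so it only covers the happy path.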
Issues/additions:
a. the old master needs to check the new-master's metadata in SIS to make sure that it's able to fail over before shutting down.
b. we need some way to indicate to the user why manual failover did not occur if it fails.
c. if the new-master is unable to promote after the old master is shut down, what should the system do? And how?
This feature is covered by: #56, #67 and #82