shmel1k / qumomf
Mom's friend vshard quorum
License: Apache License 2.0
Avoid flapping (cascading failures causing continuous outage and elimination of resources) by introducing a block period, where on any given replica set, qumomf will not kick in automated recovery on an interval smaller than said period.
This will be really useful in the future development process.
It will be easy to deploy and distribute.
Default level is debug.
Using vshard.storage.info or vshard.router.info, we could auto-discover the cluster topology. The user would only provide credentials and an initial endpoint to connect to.
Each cluster should have a unique name to identify it. Names and other meta information can be stored in a database backend such as SQLite.
What should be implemented:
We should provide options to exclude replicas that lag too far behind from the master election:
It will allow users to control the master election behaviour and fail the promotion if the chosen replica is too far behind the master.
The idea is similar to the logic implemented in MySQL Orchestrator.
Current implementation: qumomf decides to promote a replica to master as soon as it loses the connection to the old master. This might lead to a false failover.
I suggest analyzing replica alerts: qumomf should start a failover only if the replicas also report that replication is broken.
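The suggested check could be sketched as follows. ReplicaReport and ShouldFailover are hypothetical names, and requiring every replica to confirm the breakage is an assumption; a quorum would be another option:

```go
package main

// ReplicaReport is a hypothetical summary of what a replica says about
// its replication from the master.
type ReplicaReport struct {
	URI               string
	ReplicationBroken bool // the replica raised an alert about its upstream
}

// ShouldFailover returns true only when qumomf cannot reach the master AND
// every replica confirms that replication from the master is broken.
// With no reports at all there is no confirmation, so no failover starts.
func ShouldFailover(masterReachable bool, reports []ReplicaReport) bool {
	if masterReachable || len(reports) == 0 {
		return false
	}
	for _, r := range reports {
		if !r.ReplicationBroken {
			// At least one replica still replicates from the master, so the
			// lost connection is likely a problem on the qumomf side.
			return false
		}
	}
	return true
}
```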
Automate build, test, lint tasks
Add instance URI to metrics. UUID is not handy when looking for a problem.
For testing purposes it would be useful to have a readonly option. If that option is enabled, qumomf will not change the cluster topology in any way.
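A sketch of how such an option might guard topology changes. TopologyChanger, its promote field and ErrReadOnly are hypothetical names used only for illustration:

```go
package main

import "errors"

// ErrReadOnly is returned instead of performing a topology change.
var ErrReadOnly = errors.New("read-only mode: topology change skipped")

// TopologyChanger is a hypothetical wrapper around the actions that
// modify the cluster.
type TopologyChanger struct {
	ReadOnly bool
	promote  func(uri string) error // the real promotion logic (assumed)
}

// Promote runs the promotion unless the read-only option is enabled.
func (c *TopologyChanger) Promote(uri string) error {
	if c.ReadOnly {
		return ErrReadOnly
	}
	return c.promote(uri)
}
```

Checking the flag in one place, in front of every mutating action, keeps discovery and monitoring working unchanged while the option is enabled.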
Replace log + fmt with something more convenient, e.g. zerolog.
It will allow saving and retrieving all important information in case of failures.
To monitor the qumomf status and get its info such as version, build time, etc.
Subj.
{"level":"warn","cluster":"order_control","replica_set":"5065fb5f-5f40-498e-af79-43887ba3d1ec","master_uri":"some_master_uri", ,"message":"Master is reachable but some of its replicas are not replicating. No actions will be applied."}
Can you please add some additional information about replicas that are failing to replicate?
Let users specify a custom promotion priority for each replica.
A negative priority should mean that the replica must not participate in the master election.
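The priority rule could be sketched like this. Candidate and RankCandidates are hypothetical names, and preferring higher values is an assumption:

```go
package main

import "sort"

// Candidate is a hypothetical replica with a user-defined priority.
type Candidate struct {
	URI      string
	Priority int // negative: never promote this replica
}

// RankCandidates drops replicas with a negative priority and orders the
// rest from the highest priority to the lowest. Replicas with equal
// priority keep their original relative order.
func RankCandidates(candidates []Candidate) []Candidate {
	eligible := make([]Candidate, 0, len(candidates))
	for _, c := range candidates {
		if c.Priority >= 0 {
			eligible = append(eligible, c)
		}
	}
	sort.SliceStable(eligible, func(i, j int) bool {
		return eligible[i].Priority > eligible[j].Priority
	})
	return eligible
}
```

The first element of the result, if any, would be the preferred replica to promote; an empty result means no replica is allowed to become master.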
Subj. No timestamp is currently present in logs. :(