Comments (5)

mjpt777 commented on June 3, 2024

This is deliberate corruption of the state and not supported. If this happens you would need to recover from backup or use our warm standby premium feature.

pcdv commented on June 3, 2024

Not sure I would call that corruption since the state fully exists in one node: it just needs to be replicated to the other nodes that don't have any state at all. It feels like not much is missing to make it work (but I don't know enough about the cluster internals to know for sure).

> you would need to recover from backup

Actually, this problem occurred while recovering from backup, but it boils down to the test above.

The initial scenario was:

  • start 3 nodes
  • start one ClusterBackup node, replicating the state
  • stop everything
  • forget about the 3 initial nodes
  • set up a new 3-node cluster where one node reuses the directories generated by ClusterBackup and 2 other nodes start empty

This scenario works only if the ingress log is never truncated.

An alternative would be to start 3 ClusterBackup nodes, but it seems overkill to replicate the state 3 times.

Is ClusterBackup actually a viable solution, or do only the premium features provide a reliable backup?

mikeb01 commented on June 3, 2024

> Not sure I would call that corruption since the state fully exists in one node

This is where the example breaks down. Raft and other similar consensus algorithms that handle fail-stop type faults can only handle 1 failure in a 3 node cluster. You have a scenario where the 3 node cluster has 2 failed nodes. This is beyond what the algorithm has the ability to correctly and automatically recover from. Therefore you would need to fall back to manually fixing the system.
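The failure budget mentioned above follows from majority-quorum arithmetic. A minimal sketch, assuming the usual Raft rule that a majority of members must be available (the function names are illustrative and not part of any Aeron API):

```python
def quorum_size(cluster_size: int) -> int:
    """Smallest number of members that forms a majority."""
    return cluster_size // 2 + 1


def max_tolerated_failures(cluster_size: int) -> int:
    """Members that can fail while a majority is still available."""
    return cluster_size - quorum_size(cluster_size)


# A 3-node cluster needs 2 members for a quorum, so it tolerates 1 failure.
print(quorum_size(3), max_tolerated_failures(3))
```

With two of three nodes empty, fewer than a quorum of members hold the log, which is why the scenario falls outside what the algorithm can recover from automatically.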

> An alternative would be to start 3 ClusterBackup nodes, but it seems overkill to replicate the state 3 times.

The premium Cluster Standby would create 3 replicated copies of the data for scenarios where the user wants to have another cluster that they can fail over to. It has some functionality to support daisy-chain-style replication to reduce load on the primary cluster and potential WAN bandwidth consumption. However, we would not see having 3 replicated copies of the state as overkill.

pcdv commented on June 3, 2024

Thank you for your detailed answer.

> This is beyond what the algorithm has the ability to correctly and automatically recover from.

In my initial tests with ClusterBackup, this scenario worked perfectly (until I truncated the log). My mistake was then probably to think it was a supported use case. There is not much literature around ClusterBackup.

> You have a scenario where the 3 node cluster has 2 failed nodes.

Actually, if I modify the test to have only one failed node (i.e. replace one true by false), it fails just the same.

> Therefore you would need to fall back to manually fixing the system.

Yes, I could detect the absence of archive + cluster dirs in one node, wait a bit for other nodes to be ready to start, and automate the download of the state from another node before starting the cluster. Not trivial, but doable. Probably easier to update the failover procedure so that data is copied manually into the extra nodes :)
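The detect-and-copy idea could look roughly like the sketch below. It assumes each node keeps its state under sub-directories named `archive` and `cluster` (hypothetical names, adjust to your configuration) and that the source node's directory is reachable as a local path; a real setup would fetch it over the network. It uses only the Python standard library, not any Aeron tooling, and must run while both nodes are stopped.

```python
import shutil
from pathlib import Path

# Hypothetical sub-directory names; use whatever your node configuration defines.
STATE_DIRS = ("archive", "cluster")


def needs_seeding(node_dir: Path) -> bool:
    """Return True if any state directory is missing or empty."""
    for name in STATE_DIRS:
        state_dir = node_dir / name
        if not state_dir.is_dir() or not any(state_dir.iterdir()):
            return True
    return False


def seed_from(source_dir: Path, target_dir: Path) -> None:
    """Copy the state directories from a populated node into an empty one."""
    if needs_seeding(source_dir):
        raise ValueError(f"source node {source_dir} has no complete state")
    for name in STATE_DIRS:
        destination = target_dir / name
        if destination.exists():
            shutil.rmtree(destination)  # drop any partial or empty directory
        shutil.copytree(source_dir / name, destination)
```

A start-up script would call `needs_seeding` on the local node directory and, if it returns `True`, call `seed_from` before launching the node.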

> we would not see having 3 replicated copies of the state as overkill.

Agree. I only meant that transmitting the same data 3 times over the network would not be optimal.

The premium Cluster Standby sure looks interesting!

mjpt777 commented on June 3, 2024

As @mikeb01 has pointed out, this goes beyond the Raft algorithm. The reason the purge causes issues is that the leader must be log complete under the spec. Also consider that, without the purge, the other nodes have to recover the whole log. For a long-running system this would not be practical as an alternative, even if it works in a simple test.

Snapshots are an optimisation, but again there is very little explanation of how they are implemented in the Raft paper or PhD thesis. To purge an old log, the purge needs to be coordinated and you need to know the other nodes are up to date, so you cannot introduce more than one failure in a 3-node system, as Mike points out.
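To illustrate the coordination requirement, here is a minimal sketch of a purge check. It is an illustration of the idea only and not how Aeron Cluster implements it; the function name, the member-id-to-position mapping, and the use of `None` for a member whose position is unknown are all assumptions.

```python
from typing import Dict, Optional


def safe_purge_position(
    snapshot_position: int,
    member_positions: Dict[int, Optional[int]],
) -> int:
    """Highest log position that may be purged, or 0 if purging is not safe.

    member_positions maps a member id to the log position that member is
    known to have reached, or None if the member's position is unknown.
    """
    if not member_positions:
        return 0
    if any(position is None for position in member_positions.values()):
        return 0  # a member is unaccounted for, so nothing may be purged
    # Never purge past the snapshot or past the slowest member.
    return min(snapshot_position, min(member_positions.values()))
```

For example, with a snapshot at position 100 and members at 150, 120 and 100, the log can be purged up to 100; if any member's position is unknown, nothing is purged.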

You can use a combination of Cluster Backup and some scripting to make this work. We have to make a living, so we provide commercial support and a premium offering that makes this much easier. Many open core offerings do not even provide basic replication or fault tolerance in the open offering. We think we have gone pretty far with what we offer openly, given the years of engineering effort that has gone into Aeron.
