Comments (10)

rhcarvalho commented on August 11, 2024

@wattsteve @bparees @deanpeterson I'm bringing the conversation about how to best implement replication for MongoDB in containers to this spot.


[...] obviously this means the mongo replica example needs to set up replicated shards to ensure that data is not lost if a single pod fails.

@bparees, to some extent, isn't a redeploy equivalent to all pods failing in a short time frame? So it is not enough to tolerate losing a pod (with its data replicated elsewhere); from what I can imagine, we need persistent storage for at least one replica of each shard, from which we could bring all shards back up after a new deployment is rolled out.

It's looking like the best way to do this for now is to not use RCs, and to have curated pods that use NodeSelectors, so they are scheduled on hosts that have the right storage available.

@wattsteve, scaling up is one of the things an RC can give us, but we don't really need to use that. The other aspect of having an RC is that it can work as a "supervisor" that makes sure the pods that should be running are running.
I wrote a brief note about that in the Trello card for this feature. IMHO, instead of deploying pods by themselves, we should use OpenShift DeploymentConfigs with replicas=1 -- that's what we do in other places, like the examples we have for MySQL and PostgreSQL, in which we have a single master in a DC with replicas=1.
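
For illustration, a minimal sketch of that pattern, assuming one DeploymentConfig per replica set member; the name, labels, and image below are invented for this example:

    apiVersion: v1
    kind: DeploymentConfig
    metadata:
      name: mongodb-member-1           # one DC per replica set member
    spec:
      replicas: 1                      # exactly one pod; the DC supervises and replaces it on failure
      selector:
        name: mongodb-member-1
      template:
        metadata:
          labels:
            name: mongodb-member-1
        spec:
          containers:
          - name: mongodb
            image: openshift/mongodb-24-centos7   # illustrative image
            ports:
            - containerPort: 27017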

The problem with that approach is that there's no obvious way to "synchronize" the multiple DCs to run the logic we currently have to set up the MongoDB Replica Set. The current example uses a DC post-deploy hook that fires a run-once pod that wires together the replica set members.
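
A sketch of what that hook looks like in a DC's deployment strategy; the command is a stand-in for whatever script actually wires the members together:

    strategy:
      type: Recreate
      recreateParams:
        post:                              # fires after the new deployment succeeds
          failurePolicy: Retry
          execNewPod:                      # runs a one-off pod based on the DC's pod template
            containerName: mongodb
            command: ["initiate-replset"]  # hypothetical wiring script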

I heard an idea from @mfojtik to use e.g. etcd (in yet another pod/DC) to hold the replica set membership info and use it to keep every running MongoDB replica set member configured.

Things I'd like to see in the replication example:

  • Resilience to single pod failure with automatic rescheduling
  • Passwords (admin & app user) can be changed via OpenShift/DeploymentConfig
  • Redeployment at any time without data loss
  • Usage of faster local storage wherever possible
  • The example should be fully automated (via OpenShift templates), and not require users to do more than oc new-app mongo-clustered.json (given that some requirements are met, like the existence of suitable PVs or any other form of required storage)

Note that we don't have sharding involved in the example.

bparees commented on August 11, 2024

@bparees, to some extent, isn't a redeploy equivalent to all pods failing in a short time frame? So it is not enough to tolerate losing a pod (with its data replicated elsewhere); from what I can imagine, we need persistent storage for at least one replica of each shard, from which we could bring all shards back up after a new deployment is rolled out.

That's a deployment strategy problem. We have a rolling deployment, which should do 1:1 replacements of pods so that it does not take the entire cluster down simultaneously.
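
A sketch of a Rolling strategy tuned for strict 1:1 replacement (the parameter values are illustrative):

    strategy:
      type: Rolling
      rollingParams:
        maxUnavailable: 1     # replace at most one pod at a time
        maxSurge: 0           # never run extra members beyond the desired count
        timeoutSeconds: 600   # give each replacement time to start and report ready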

rhcarvalho commented on August 11, 2024

If we're going to rely on rolling deployments, then we need a readiness probe or the like, so we can wait until the replacement has joined the replica set before shutting down the next pod.
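
A sketch of such a probe, assuming the mongo shell is available inside the container; the pod only reports ready once the local mongod sees itself as PRIMARY (1) or SECONDARY (2):

    readinessProbe:
      exec:
        command:
        - /bin/sh
        - -c
        # rs.status().myState is 1 for PRIMARY, 2 for SECONDARY
        - mongo --quiet --eval 'print(rs.status().myState)' | grep -qE '^(1|2)$'
      initialDelaySeconds: 15
      periodSeconds: 10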

bparees commented on August 11, 2024

If we're going to rely on rolling deployments, then we need a readiness probe or the like, so we can wait until the replacement has joined the replica set before shutting down the next pod.

Sure, but that's just using the platform the way it's designed to be used. (Strictly speaking, you also need to wait for the new member to have finished replicating/catching up. Not sure if Mongo makes that distinction clear or not.)
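
For reference, MongoDB does expose that distinction: a member stays in state STARTUP2 while the initial sync runs and only becomes SECONDARY once the sync completes; any remaining lag can be inspected from the shell:

    # per-member state: 5 = STARTUP2 (still syncing), 2 = SECONDARY (synced)
    mongo --quiet --eval 'rs.status().members.forEach(function (m) { print(m.name + " state=" + m.state); })'
    # per-secondary lag behind the primary, in seconds
    mongo --quiet --eval 'rs.printSlaveReplicationInfo()'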

rhcarvalho commented on August 11, 2024

AFAIK, depending on oplog size and the number of writes/sec, we can get to a point where it is impossible to bring up a new replica.
I'm not sure I recall the details correctly, but a replica starting from scratch needs to 1) do a full sync of the existing data, plus 2) apply all writes from when step 1 started up until "now" (those have to come from the oplog). If the oldest entry in the oplog is newer than the moment the full sync started -- i.e. the oplog has rolled over in the meantime -- replication fails.
In this kind of scenario, online replication is impossible, or at least it used to be. I heard this from MongoDB engineers several years ago at one of their events.
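
That failure mode can at least be checked for up front. A sketch, assuming shell access to a running member (db.getReplicationInfo() reads the local oplog):

    # timeDiff is the oplog window: seconds between the oldest and newest entries.
    # If a full initial sync takes longer than this window, the new member
    # cannot catch up and the sync fails.
    mongo --quiet --eval 'var info = db.getReplicationInfo(); print("oplog window (s): " + info.timeDiff)'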

bparees commented on August 11, 2024

At the end of the day, the solution to these problems is "disable automatic deployments and use a controlled process to roll out a new deployment, possibly including quiescing traffic".
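
In DC terms that amounts to removing the automatic triggers and rolling out by hand; a sketch (the DC name is illustrative):

    triggers: []   # no ConfigChange/ImageChange triggers: nothing deploys automatically

    # then, at a moment the operator chooses (traffic quiesced if needed):
    oc rollout latest dc/mongodb   # `oc deploy --latest` on older releases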

wattsteve commented on August 11, 2024

@rhcarvalho I acknowledge and agree with your point that RCs can provide a singleton pattern when there is only one replica, whereas standalone pods do not offer that.

To your other point about needing "persistent storage for at least one replica of each shard": I think that depends on the size of the Mongo cluster and how many replicas there are. If you have 10 ephemeral (EmptyDir) replicas for each shard, with each pod/shard replica on an independent server (i.e. a large cluster), then perhaps you have enough replicas to be safe; whereas if you have a small cluster and can only manage 1 or 2 replicas per shard, then one of those should probably be persistent. Eventually, it would be good for us to provide the option to tune things like this.

If we are going to use an RC for this, we need a way to ensure that there is never more than a single MongoDB pod on each server; otherwise we would introduce a new failure domain, with two pods on the same server where one provides a shard replica for the other.
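
At the time of this thread that meant hand-curated NodeSelectors; in current Kubernetes the same constraint can be stated directly with pod anti-affinity (the label is illustrative):

    affinity:
      podAntiAffinity:
        requiredDuringSchedulingIgnoredDuringExecution:
        - labelSelector:
            matchLabels:
              app: mongodb
          topologyKey: kubernetes.io/hostname   # at most one mongodb pod per node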

goern commented on August 11, 2024

Please take into consideration that MongoDB 3.2 changes the default storage engine to WiredTiger: https://docs.mongodb.org/manual/core/wiredtiger/
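
One way to keep behavior predictable across that default change is to pin the engine explicitly in mongod.conf (the path is illustrative; note the default only applies to a fresh dbPath, existing data keeps its engine):

    # mongod.conf (YAML format, MongoDB >= 3.0)
    storage:
      dbPath: /var/lib/mongodb/data
      engine: wiredTiger   # explicit, so an image upgrade can't silently change the default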

rhcarvalho commented on August 11, 2024

This issue is blocked on having PetSets in Kubernetes (kubernetes/kubernetes#18016) in order to have a single DeploymentConfig and independent storage for replicated pods.

A known workaround for having persistent storage with replication today is to use multiple DCs and never scale up.
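
For context, PetSets eventually shipped as StatefulSets, which are the missing piece described above: a single object with stable per-pod identity and an independent PVC per member via volumeClaimTemplates. A sketch (names and sizes are illustrative):

    apiVersion: apps/v1
    kind: StatefulSet
    metadata:
      name: mongodb
    spec:
      serviceName: mongodb       # headless service providing stable DNS names
      replicas: 3
      selector:
        matchLabels:
          app: mongodb
      template:
        metadata:
          labels:
            app: mongodb
        spec:
          containers:
          - name: mongodb
            image: mongo         # illustrative image
            volumeMounts:
            - name: data
              mountPath: /data/db
      volumeClaimTemplates:      # each member gets its own PersistentVolumeClaim
      - metadata:
          name: data
        spec:
          accessModes: ["ReadWriteOnce"]
          resources:
            requests:
              storage: 1Gi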

omron93 commented on August 11, 2024

#206, which implements this feature, was merged, so I'm closing this issue.
