Comments (10)

rhcarvalho commented on August 11, 2024

@wattsteve @bparees @deanpeterson I'm bringing the conversation about how to best implement replication for MongoDB in containers to this spot.


[...] obviously this means the mongo replica example needs to set up replicated shards to ensure that data is not lost if a single pod fails.

@bparees, to some extent, isn't a redeploy equivalent to all pods failing in a short time frame? So it is not enough to tolerate losing a pod (with its data replicated elsewhere); from what I can imagine, we need persistent storage for at least one replica of each shard, from which we could bring all shards back up after a new deployment is rolled out.

It's looking like the best way to do this for now is to not use RCs, and to have curated pods that use NodeSelectors, so they are scheduled on hosts that have the right storage available.

@wattsteve, scaling up is one of the things an RC can give us, but we don't really need to use that. The other aspect of having an RC is that it can work as a "supervisor" that makes sure the pods that should be running are running.
I wrote a brief note about that in the Trello card for this feature. IMHO, instead of deploying pods by themselves, we should use OpenShift DeploymentConfigs with replicas=1 -- that's what we do in other places, like the examples we have for MySQL and PostgreSQL, in which we have a single master in a DC with replicas=1.
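
For illustration, a minimal sketch of that pattern, assuming one DeploymentConfig per replica set member; the name, labels, and image below are invented for this example:

    apiVersion: v1
    kind: DeploymentConfig
    metadata:
      name: mongodb-member-1           # one DC per replica set member
    spec:
      replicas: 1                      # exactly one pod; the DC supervises and replaces it on failure
      selector:
        name: mongodb-member-1
      template:
        metadata:
          labels:
            name: mongodb-member-1
        spec:
          containers:
          - name: mongodb
            image: openshift/mongodb-24-centos7   # illustrative image
            ports:
            - containerPort: 27017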

The problem with that approach is that there's no obvious way to "synchronize" the multiple DCs to run the logic we currently have to set up the MongoDB Replica Set. The current example uses a DC post-deploy hook that fires a run-once pod that wires together the replica set members.
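
A sketch of what that hook looks like in a DC's deployment strategy; the command is a stand-in for whatever script actually wires the members together:

    strategy:
      type: Recreate
      recreateParams:
        post:                              # fires after the new deployment succeeds
          failurePolicy: Retry
          execNewPod:                      # runs a one-off pod based on the DC's pod template
            containerName: mongodb
            command: ["initiate-replset"]  # hypothetical wiring script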

I heard an idea from @mfojtik to use e.g. etcd (in yet another pod/DC) to hold the replica set membership info and use it to keep every running MongoDB replica set member configured.

Things I'd like to see in the replication example:

  • Resilience to single pod failure with automatic rescheduling
  • Passwords (admin & app user) can be changed via OpenShift/DeploymentConfig
  • Redeployment at any time without data loss
  • Usage of faster local storage wherever possible
  • The example should be fully automated (via OpenShift templates), and not require users to do more than oc new-app mongo-clustered.json (given that some requirements are met, like the existence of suitable PVs or any other form of required storage)

Note that we don't have sharding involved in the example.

bparees commented on August 11, 2024

@bparees, to some extent, isn't a redeploy equivalent to all pods failing in a short time frame? So it is not enough to tolerate losing a pod (with its data replicated elsewhere); from what I can imagine, we need persistent storage for at least one replica of each shard, from which we could bring all shards back up after a new deployment is rolled out.

That's a deployment strategy problem. We have a rolling deployment, which should do 1:1 replacements of pods so that it does not take the entire cluster down simultaneously.
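
A sketch of a Rolling strategy tuned for strict 1:1 replacement (the parameter values are illustrative):

    strategy:
      type: Rolling
      rollingParams:
        maxUnavailable: 1     # replace at most one pod at a time
        maxSurge: 0           # never run extra members beyond the desired count
        timeoutSeconds: 600   # give each replacement time to start and report ready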

rhcarvalho commented on August 11, 2024

If we're going to rely on rolling deployments, then we need a readiness probe or the like, so we can wait until the replacement has joined the replica set before shutting down the next pod.
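
A sketch of such a probe, assuming the mongo shell is available inside the container; the pod only reports ready once the local mongod sees itself as PRIMARY (1) or SECONDARY (2):

    readinessProbe:
      exec:
        command:
        - /bin/sh
        - -c
        # rs.status().myState is 1 for PRIMARY, 2 for SECONDARY
        - mongo --quiet --eval 'print(rs.status().myState)' | grep -qE '^(1|2)$'
      initialDelaySeconds: 15
      periodSeconds: 10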

bparees commented on August 11, 2024

If we're going to rely on rolling deployments, then we need a readiness probe or the like, so we can wait until the replacement has joined the replica set before shutting down the next pod.

Sure, but that's just using the platform the way it's designed to be used. (Strictly speaking, you also need to wait for the new member to have finished replicating/catching up. Not sure if Mongo makes that distinction clear or not.)
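
For reference, MongoDB does expose that distinction: a member stays in state STARTUP2 while the initial sync runs and only becomes SECONDARY once the sync completes; any remaining lag can be inspected from the shell:

    # per-member state: 5 = STARTUP2 (still syncing), 2 = SECONDARY (synced)
    mongo --quiet --eval 'rs.status().members.forEach(function (m) { print(m.name + " state=" + m.state); })'
    # per-secondary lag behind the primary, in seconds
    mongo --quiet --eval 'rs.printSlaveReplicationInfo()'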

rhcarvalho commented on August 11, 2024

AFAIK, depending on oplog size and the number of writes/sec, we can get to a point where it is impossible to bring up a new replica.
I'm not sure I recall the details correctly, but a replica starting from scratch needs to 1) do a full sync of the existing data, plus 2) apply all writes from when step 1 started up until "now" (those have to come from the oplog). If the oldest entry in the oplog is newer than the moment the full sync started -- i.e. the oplog has rolled over in the meantime -- replication fails.
In this kind of scenario, online replication is impossible, or at least it used to be. I heard this from MongoDB engineers several years ago at one of their events.
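
That failure mode can at least be checked for up front. A sketch, assuming shell access to a running member (db.getReplicationInfo() reads the local oplog):

    # timeDiff is the oplog window: seconds between the oldest and newest entries.
    # If a full initial sync takes longer than this window, the new member
    # cannot catch up and the sync fails.
    mongo --quiet --eval 'var info = db.getReplicationInfo(); print("oplog window (s): " + info.timeDiff)'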

bparees commented on August 11, 2024

At the end of the day, the solution to these problems is "disable automatic deployments and use a controlled process to roll out a new deployment, possibly including quiescing traffic".
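
In DC terms that amounts to removing the automatic triggers and rolling out by hand; a sketch (the DC name is illustrative):

    triggers: []   # no ConfigChange/ImageChange triggers: nothing deploys automatically

    # then, at a moment the operator chooses (traffic quiesced if needed):
    oc rollout latest dc/mongodb   # `oc deploy --latest` on older releases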

wattsteve commented on August 11, 2024

@rhcarvalho I acknowledge and agree with your point that RCs can provide a singleton pattern when there is only one replica, whereas standalone pods do not offer that.

To your other point about needing "persistent storage for at least one replica of each shard": I think that depends on the size of the Mongo cluster and how many replicas there are. If you have 10 ephemeral (EmptyDir) replicas for each shard, with each pod/shard replica on an independent server (i.e. a large cluster), then perhaps you have enough replicas to be safe; whereas if you have a small cluster and can only manage 1 or 2 replicas per shard, then one of those should probably be persistent. Eventually, it would be good for us to provide the option to tune things like this.

If we are going to use an RC for this, we need a way to ensure that there is never more than a single MongoDB pod on each server; otherwise we would introduce a new failure domain, with two pods on the same server where one provides a shard replica for the other.
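
At the time of this thread that meant hand-curated NodeSelectors; in current Kubernetes the same constraint can be stated directly with pod anti-affinity (the label is illustrative):

    affinity:
      podAntiAffinity:
        requiredDuringSchedulingIgnoredDuringExecution:
        - labelSelector:
            matchLabels:
              app: mongodb
          topologyKey: kubernetes.io/hostname   # at most one mongodb pod per node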

goern commented on August 11, 2024

Please take into consideration that MongoDB 3.2 changes the default storage engine to WiredTiger: https://docs.mongodb.org/manual/core/wiredtiger/
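
One way to keep behavior predictable across that default change is to pin the engine explicitly in mongod.conf (the path is illustrative; note the default only applies to a fresh dbPath, existing data keeps its engine):

    # mongod.conf (YAML format, MongoDB >= 3.0)
    storage:
      dbPath: /var/lib/mongodb/data
      engine: wiredTiger   # explicit, so an image upgrade can't silently change the default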

rhcarvalho commented on August 11, 2024

This issue is blocked on having PetSets in Kubernetes (kubernetes/kubernetes#18016) in order to have a single DeploymentConfig and independent storage for replicated pods.

A known workaround for having persistent storage with replication today is to use multiple DCs and never scale up.
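
For context, PetSets eventually shipped as StatefulSets, which are the missing piece described above: a single object with stable per-pod identity and an independent PVC per member via volumeClaimTemplates. A sketch (names and sizes are illustrative):

    apiVersion: apps/v1
    kind: StatefulSet
    metadata:
      name: mongodb
    spec:
      serviceName: mongodb       # headless service providing stable DNS names
      replicas: 3
      selector:
        matchLabels:
          app: mongodb
      template:
        metadata:
          labels:
            app: mongodb
        spec:
          containers:
          - name: mongodb
            image: mongo         # illustrative image
            volumeMounts:
            - name: data
              mountPath: /data/db
      volumeClaimTemplates:      # each member gets its own PersistentVolumeClaim
      - metadata:
          name: data
        spec:
          accessModes: ["ReadWriteOnce"]
          resources:
            requests:
              storage: 1Gi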

omron93 commented on August 11, 2024

#206, which implements this feature, was merged, so I'm closing this issue.
