Comments (10)
@wattsteve @bparees @deanpeterson I'm bringing the conversation about how to best implement replication for MongoDB in containers to this spot.
[...] obviously this means the mongo replica example needs to set up replicated shards to ensure that data is not lost if a single pod fails.
@bparees, to some extent, isn't a redeploy equivalent to all pods failing in a short time frame? So it is not enough to tolerate losing a pod (with its data replicated elsewhere); as far as I can see, we need persistent storage for at least one replica of each shard, from which we can bring all shards back up after a new deployment is rolled out.
It's looking like the best way to do this for now is to not use RCs, and instead have curated pods that use NodeSelectors, so they are scheduled on hosts that have the right storage available.
@wattsteve, scaling up is one of the things an RC gives us, but we don't really need to use that. The other aspect of having an RC is that it works as a "supervisor", making sure the pods that should be running are in fact running.
I wrote a brief note about that in the Trello card for this feature. IMHO, instead of deploying pods by themselves, we should use OpenShift DeploymentConfigs with replicas=1 -- that's what we do in other places, like the examples we have for MySQL and PostgreSQL, in which we have a single master in a DC with replicas=1.
The problem with that approach is that there's no obvious way to "synchronize" the multiple DCs to run the logic we currently have to set up the MongoDB Replica Set. The current example uses a DC post-deploy hook that fires a run-once pod that wires together the replica set members.
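In essence, the run-once wiring pod executes mongo shell commands like the following against the first member. This is a hedged sketch; the hostnames are illustrative assumptions, not the names the example actually uses:

```shell
# Sketch of the "wiring" step: initiate the replica set on the first
# member and register the others (hostnames are hypothetical).
mongo --host mongodb-1 --eval '
  rs.initiate();
  rs.add("mongodb-2:27017");
  rs.add("mongodb-3:27017");
'
```

The difficulty described above is deciding which pod runs this, and when, once the members live in separate DCs.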
I heard an idea from @mfojtik to use e.g. etcd (in yet another pod/DC) to keep the replica set info and use that to keep any and every running MongoDB replica set member configured.
Things I'd like to see in the replication example:
- Resilience to single pod failure with automatic rescheduling
- Passwords (admin & app user) can be changed via OpenShift/DeploymentConfig
- Redeployment at any time without data loss
- Usage of faster local storage wherever possible
- The example should be fully automated (via OpenShift templates), and not require users to do more than `oc new-app mongo-clustered.json` (given that some requirements are met, like the existence of suitable PVs or any other form of required storage)
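As a sketch of that fully automated flow (the parameter names here are assumptions for illustration, not the template's actual contract):

```shell
# Instantiate the clustered MongoDB template in one step.
# Parameter names are illustrative assumptions.
oc new-app mongo-clustered.json \
  -p MONGODB_ADMIN_PASSWORD=changeme \
  -p MONGODB_USER=app \
  -p MONGODB_PASSWORD=changeme

# Prerequisite: suitable PersistentVolumes must already exist.
oc get pv
```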
Note that we don't have sharding involved in the example.
from mongodb-container.
> @bparees, to some extent, isn't a redeploy equivalent to all pods failing in a short time frame? So it is not enough to tolerate losing a pod (with its data replicated elsewhere); as far as I can see, we need persistent storage for at least one replica of each shard, from which we can bring all shards back up after a new deployment is rolled out.
That's a deployment strategy problem. We have a rolling deployment which should do 1:1 replacements of pods, so that it does not take the entire cluster down simultaneously.
If we rely on rolling deployments, then we need a readiness probe or the like, so that we can wait until the replacement has joined the replica set before shutting down the next pod.
> If we rely on rolling deployments, then we need a readiness probe or the like, so that we can wait until the replacement has joined the replica set before shutting down the next pod.
Sure, but that's just using the platform the way it's designed to be used. (Strictly speaking, you also need to wait for the new member to have completed replicating/catching up. Not sure if mongo makes that distinction clear or not.)
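A readiness probe along these lines would address both concerns, since a member only reports state SECONDARY once its initial sync has completed and it is applying the oplog. This is a hedged sketch, not the probe the example ships:

```shell
#!/bin/sh
# Readiness probe sketch for a replica set member: ready only when
# this member is PRIMARY (myState 1) or SECONDARY (myState 2),
# per the replSetGetStatus member states.
state=$(mongo --quiet --eval 'rs.status().myState' 2>/dev/null)
case "$state" in
  1|2) exit 0 ;;   # joined and serving
  *)   exit 1 ;;   # still in STARTUP/RECOVERING/etc., or unreachable
esac
```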
AFAIK, depending on oplog size and the write rate, we can get to a point where it is impossible to bring up a new replica.
Not sure if I recall the details correctly, but a replica starting from scratch needs to 1) do a full sync of the existing data; and 2) sync all data written between the start of step 1 and "now" (that part has to come from the oplog). If the oldest entry in the oplog is more recent than when the full sync started -- i.e. the oplog no longer covers the sync window -- replication fails.
In this kind of scenario, online replication is impossible, or at least it used to be in the past. I heard this from MongoDB engineers several years ago at one of their events.
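That oplog window can be inspected from the mongo shell on a running member; as a sketch:

```shell
# Print the replication window of a running member (sketch).
# The reported "log length start to end" must comfortably exceed the
# time an initial sync takes, or the failure described above occurs.
mongo --quiet --eval 'rs.printReplicationInfo()'
```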
At the end of the day, the solution to these problems is "disable automatic deployments and use a controlled process to roll out a new deployment, possibly including quiescing traffic".
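With a reasonably recent oc client, that controlled process could look roughly like this (the DC name "mongodb" is an assumption):

```shell
# Disable automatic image-change deployments for the DC.
oc set triggers dc/mongodb --manual

# ... quiesce traffic and/or take a backup here ...

oc rollout latest dc/mongodb   # start the new deployment explicitly
oc rollout status dc/mongodb   # wait for it to complete
```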
@rhcarvalho I acknowledge and agree with your point that RCs provide a singleton pattern when there is only one replica, whereas standalone pods do not offer that.
To your other point about needing "persistent storage for at least one replica of each shard": I think that depends on the size of the Mongo cluster and how many replicas there are. If you have 10 ephemeral (EmptyDir) replicas for each shard, with each pod/shard replica on an independent server (i.e. a large cluster), then perhaps you have enough replicas to be safe. If you have a small cluster and can only manage 1 or 2 replicas per shard, then one of those should probably be persistent. Eventually, it would be good for us to provide the option to tune things like this.
If we are going to use an RC for this, we need a way to ensure that there is never more than a single MongoDB pod on each server. Otherwise we introduce a new failure domain: two pods on the same server, where one provides a shard replica for the other.
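On Kubernetes versions that support pod anti-affinity, that one-pod-per-node constraint could be sketched like this (the DC name and labels are illustrative assumptions, and `kubectl`/`oc patch` accept the patch in YAML or JSON form):

```shell
# Require that no two pods labeled app=mongodb land on the same node.
oc patch dc/mongodb --type=merge -p '
spec:
  template:
    spec:
      affinity:
        podAntiAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
          - labelSelector:
              matchLabels:
                app: mongodb
            topologyKey: kubernetes.io/hostname'
```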
Please take into consideration that MongoDB 3.2 changes the default storage engine to WiredTiger: https://docs.mongodb.org/manual/core/wiredtiger/
This issue is blocked on having PetSets in Kubernetes (kubernetes/kubernetes#18016), so that we can have a single DeploymentConfig and independent storage for the replicated pods.
A known workaround for having persistent storage with replication today is to use multiple DCs and never scale up.
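A minimal sketch of that workaround, assuming a per-member template `mongodb-replica-member.yaml` with a `MEMBER_ID` parameter (both names are hypothetical):

```shell
# One DeploymentConfig + one PVC per replica set member,
# each with replicas=1, never scaled up.
for i in 1 2 3; do
  oc process -f mongodb-replica-member.yaml -p MEMBER_ID="$i" | oc create -f -
done
```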
#206, implementing this feature, was merged, so I'm closing this issue.