Comments (9)
Both Priam and cassandra_snap leverage Cassandra snapshots. See Priam Backup.
We may be able to run Priam as a sidecar container next to the service container. The sidecar container could access the same volume as the service container. AWS ECS supports multiple containers in one TaskDefinition, and Kubernetes supports Pods, which can also run multiple containers, so a sidecar container would work on both AWS ECS and Kubernetes. Docker Swarm, however, currently does not have this ability. A prototype will be required to see whether Priam works in a sidecar container. Of course, developing our own sidecar container is another option.
We could also consider leveraging EBS snapshots. FireCamp Cassandra enables remote JMX. We could run one or more job containers that run nodetool to connect to one or more Cassandra containers and flush the memtables to disk. After the flush finishes on all replicas, one or more further job containers could be run to take snapshots of the EBS volumes. Like taking a Cassandra snapshot on every replica, this is only eventually consistent, and it relies on Cassandra's built-in consistency mechanisms to bring the restored data back to a consistent state.
AWS EBS Snapshots are incremental backups, which means that only the blocks on the device that have changed after your most recent snapshot are saved.
The restore takes only 3 steps: 1) stop all containers; 2) restore the EBS volumes from the snapshots; 3) start the containers.
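The flush-then-snapshot ordering described above could be sketched roughly as follows. This is only a sketch that builds the command lines without running them. The `Replica` type, the host names and the volume IDs are hypothetical placeholders, and JMX authentication options are left out.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Replica:
    """One Cassandra replica. All values used here are hypothetical placeholders."""
    jmx_host: str
    jmx_port: int
    volume_id: str


def plan_backup(replicas: List[Replica], description: str) -> List[List[str]]:
    """Return the commands in order: flush every replica, then snapshot every volume."""
    flushes = [
        ["nodetool", "-h", r.jmx_host, "-p", str(r.jmx_port), "flush"]
        for r in replicas
    ]
    snapshots = [
        ["aws", "ec2", "create-snapshot",
         "--volume-id", r.volume_id,
         "--description", description]
        for r in replicas
    ]
    # All flushes come first, so no EBS snapshot is requested
    # before every replica has written its memtables to disk.
    return flushes + snapshots


if __name__ == "__main__":
    demo = [
        Replica("cass-0.example.internal", 7199, "vol-00000000000000000"),
        Replica("cass-1.example.internal", 7199, "vol-11111111111111111"),
    ]
    for argv in plan_backup(demo, "cassandra backup"):
        print(" ".join(argv))
```

A job container could pass each returned list to `subprocess.run(argv, check=True)` and stop at the first failure, so that no EBS snapshot is taken after a failed flush.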
from firecamp.
Thanks for sharing your thoughts!
I might be wrong, but according to Netflix/Priam#649, Priam can't be used just as a backup solution.
What do you think of having a cronjob on each C* node that runs nodetool snapshot followed by aws ec2 create-snapshot and aws ec2 delete-snapshot (for old snapshots) for the volumes of that node? This job could be created/altered/deleted by a command to firecamp-manageserver. Besides the backup time, we might set tags on the volume snapshots (with, for example, the node information), a retention time, an email address or SNS topic for alerts in case of issues, etc.
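To make the tagging and retention idea a bit more concrete, here is a rough sketch. It assumes the snapshot records are dicts shaped like the `Snapshots` entries returned by `aws ec2 describe-snapshots`, with `StartTime` already parsed into a timezone-aware `datetime`. The tag keys and function names are made up for illustration.

```python
from datetime import datetime, timedelta, timezone
from typing import Dict, List


def tag_specification(tags: Dict[str, str]) -> str:
    """Format tags in the AWS CLI shorthand used by --tag-specifications.

    Values containing commas or spaces would need quoting, which is not handled here.
    """
    pairs = ",".join(f"{{Key={k},Value={v}}}" for k, v in tags.items())
    return f"ResourceType=snapshot,Tags=[{pairs}]"


def expired_snapshot_ids(snapshots: List[dict],
                         retention: timedelta,
                         now: datetime) -> List[str]:
    """Return the IDs of snapshots that started before now - retention."""
    cutoff = now - retention
    return [s["SnapshotId"] for s in snapshots if s["StartTime"] < cutoff]


if __name__ == "__main__":
    now = datetime(2020, 1, 10, tzinfo=timezone.utc)
    snaps = [
        {"SnapshotId": "snap-old", "StartTime": datetime(2020, 1, 1, tzinfo=timezone.utc)},
        {"SnapshotId": "snap-new", "StartTime": datetime(2020, 1, 9, tzinfo=timezone.utc)},
    ]
    print(tag_specification({"node": "cass-0"}))
    print(expired_snapshot_ids(snaps, timedelta(days=7), now))
```

The string from `tag_specification` would be passed to `aws ec2 create-snapshot --tag-specifications`, and each ID from `expired_snapshot_ids` to `aws ec2 delete-snapshot --snapshot-id`.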
It would be great to automate the restore with a command to the manage server as well, given the time of an available backup. A list command could display the available recovery points (based on snapshot tags and datetime).
Having this implemented would also simplify launching a new C* from an existing backup.
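A list command along those lines might group snapshots by tag, roughly like this. It is only a sketch: the `backup-time` and `node` tag keys are hypothetical, and a recovery point is reported only when every expected node has a snapshot for it.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Set


def tags_of(snapshot: dict) -> Dict[str, str]:
    """Flatten the [{'Key': ..., 'Value': ...}] tag list of a snapshot record."""
    return {t["Key"]: t["Value"] for t in snapshot.get("Tags", [])}


def recovery_points(snapshots: Iterable[dict], expected_nodes: Set[str]) -> List[str]:
    """Return the backup times for which every expected node has a snapshot."""
    nodes_by_backup: Dict[str, Set[str]] = defaultdict(set)
    for snap in snapshots:
        tags = tags_of(snap)
        # Snapshots without both (hypothetical) tags are ignored.
        if "backup-time" in tags and "node" in tags:
            nodes_by_backup[tags["backup-time"]].add(tags["node"])
    # Plain string sort: chronological only if all times use one UTC ISO 8601 format.
    return sorted(t for t, nodes in nodes_by_backup.items() if nodes >= expected_nodes)


if __name__ == "__main__":
    def snap(time: str, node: str) -> dict:
        return {"Tags": [{"Key": "backup-time", "Value": time},
                         {"Key": "node", "Value": node}]}

    snaps = [
        snap("2020-01-01T00:00:00Z", "cass-0"),
        snap("2020-01-01T00:00:00Z", "cass-1"),
        snap("2020-01-02T00:00:00Z", "cass-0"),  # cass-1 is missing for this backup
    ]
    print(recovery_points(snaps, {"cass-0", "cass-1"}))
```

Skipping incomplete backups means a restore by time never starts from a set of volumes that does not cover all replicas.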
from firecamp.
Yes, Priam is more than backup/recovery. I didn't check the detailed design/implementation. As you posted, it might not be possible to use only the backup function.
The cronjob may not be the best option. The nodes in one cluster may run multiple services, and different services will have different backup requirements, so the cronjob would end up handling all of them. It would be better to use a job container, which could be triggered on demand. The job container could run nodetool flush and then call the AWS API to create the EBS snapshot. Every service could have its own job container.
We should use nodetool flush instead of nodetool snapshot. nodetool snapshot creates hard links to the SSTables. If you don't delete the hard links yourself, the SSTables will never get deleted, and the disk will fill up.
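The hard link behavior can be seen with a small standalone sketch. It does not touch Cassandra; the file names are stand-ins for an SSTable and its snapshot link, and it assumes a filesystem that supports hard links.

```python
import os
import tempfile


def hard_link_keeps_data(payload: bytes = b"sstable bytes") -> dict:
    """Show that deleting a file does not free its data while a hard link remains."""
    with tempfile.TemporaryDirectory() as d:
        original = os.path.join(d, "data.db")        # stands in for a live SSTable
        link = os.path.join(d, "snapshot-data.db")   # stands in for the snapshot hard link
        with open(original, "wb") as f:
            f.write(payload)
        os.link(original, link)                      # second name for the same inode
        links_before = os.stat(link).st_nlink
        os.remove(original)                          # like removing an obsolete SSTable
        with open(link, "rb") as f:
            survived = f.read() == payload
        return {
            "links_before": links_before,
            "links_after": os.stat(link).st_nlink,
            "data_survived": survived,
        }


if __name__ == "__main__":
    print(hard_link_keeps_data())
```

The data blocks are released only when the last link is removed, which is why leftover snapshot links keep disk space in use until they are cleared, for example with nodetool clearsnapshot.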
Yes, we could automate the restore. A list command will be part of the general data management framework.
We will evaluate other services as well for the general data management framework design.
from firecamp.
By C* node I meant a container where the C* daemon is running, not an EC2 instance. Sorry for the confusion.
And, yes, of course flush, not snapshot. My bad.
It looks like a separate container (within the same task) is indeed better than a cronjob when it comes to managing backups for different services: each service might have a backup job container with its own logic.
from firecamp.
Created two scripts to back up and restore:
https://gist.github.com/jazzl0ver/c6859e1615a0f97b8704052db0745e25
https://gist.github.com/jazzl0ver/c87c5ebfd76c07b56ffe8448f40e737b
from firecamp.
I don't use mongodb, so no luck here. Regarding Kafka - why do you need to back it up?
from firecamp.
I'm not part of the CloudStax team, so I can't spend time on services we don't use. Sorry about that.
from firecamp.