remram44 / ceph-k8s-backup

A backup solution for Ceph and Kubernetes, using Restic

License: MIT License

Dockerfile 6.61% Python 84.71% Smarty 6.76% Shell 1.92%
backup ceph kubernetes restic restic-backup restic-backups

ceph-k8s-backup's Introduction

This is an application used to make regular backups of Ceph data in a Kubernetes cluster.

It runs a container that periodically enumerates PersistentVolumeClaims on the cluster and backs them up using Restic.

What's backed up

It supports RBD filesystems, RBD block devices, and CephFS volumes:

  • An RBD filesystem is the fastest storage Ceph can provide. It consists of an RBD image on Ceph that is formatted as ext4 and mounted in a pod. Because it uses an emulated device, it can only be mounted read-write on one machine at a time (ReadWriteOnce or ReadOnlyMany).
    • This tool backs up RBD filesystems by creating a snapshot, creating a new image from the snapshot (we need to write to it to fix the filesystem if it was in use), mounting the image, and running Restic on the filesystem contents.
  • An RBD block device is a raw RADOS image that is exposed to a container as a block device. It is useful for specific situations like running virtualization software. We don't know what's on the image (there can be multiple partitions, any filesystem, etc.) and we want exact recovery of the whole disk.
    • This tool backs up RBD block devices by creating a snapshot, reading the image layout from Ceph, and streaming the image from Ceph into Restic in QCOW2 format. The image layout tells us which blocks are empty, so we can skip them and produce a sparse QCOW2 file, rather than reading the full image from Ceph, which would include unallocated blocks. Streaming to Restic means the process consumes very little space.
  • CephFS volumes are distributed file shares that are accessed using a file-based API. Their advantage is that they can be mounted on multiple machines at the same time, and Ceph can apply access control to directories.
    • TODO: Figure out plan
    • Just do regular snapshots of the CephFS?

Where it's backed up

The data backed up with Restic ends up in a single Restic repository. Each PersistentVolumeClaim appears as a different hostname:

  • rbd-fs-<kubernetes-namespace>-nspvc-<pvc name> for RBD volumes in Filesystem mode
  • rbd-block-<kubernetes-namespace>-nspvc-<pvc name> for RBD volumes in Block mode (backed up as a single qcow2 file)
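The naming scheme above can be sketched as a small helper. This is only an illustration, not the tool's actual code: the function name `restic_hostname` is hypothetical, and it assumes the mode is given as the Kubernetes `volumeMode` value (`Filesystem` or `Block`).

```python
def restic_hostname(namespace: str, pvc_name: str, volume_mode: str) -> str:
    """Build the Restic hostname for a PVC (hypothetical helper)."""
    # Assumption: volume_mode is the PVC's Kubernetes volumeMode value.
    prefixes = {"Filesystem": "rbd-fs", "Block": "rbd-block"}
    if volume_mode not in prefixes:
        raise ValueError(f"unsupported volume mode: {volume_mode!r}")
    return f"{prefixes[volume_mode]}-{namespace}-nspvc-{pvc_name}"


if __name__ == "__main__":
    print(restic_hostname("default", "data", "Filesystem"))
    print(restic_hostname("vms", "disk0", "Block"))
```

Because each PVC gets its own hostname, snapshots of a single volume can be selected in the shared repository by filtering on that hostname.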

Configuration

Global configuration:

  • How often to run
  • The Restic repository

Annotations on Kubernetes namespaces:

  • cephbackup.hpc.nyu.edu/backup (true/false) indicates whether PVCs in this namespace should be backed up

Annotations on Kubernetes PersistentVolumeClaims (can be set by users):

  • cephbackup.hpc.nyu.edu/backup (true/false) indicates whether this PVC should be backed up

Annotations on Kubernetes PersistentVolumes:

  • cephbackup.hpc.nyu.edu/backup (true/false) indicates whether this PV should be backed up

In addition, this system sets an annotation cephbackup.hpc.nyu.edu/last-backup on the PVC to show the date of the last backup.

A volume is backed up if:

  • backup is true on the PV
  • or backup is not set on the PV and
    • backup is true on the namespace
      • and backup is true or not set on the PVC
    • or backup is not set on the namespace
      • and backup is true or not set on the PVC

This means that an administrator, who can set annotations on namespaces and PVs, can override the decisions of a user, who can only set annotations on PVCs.
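The precedence rules can be written out as a short function. This is a minimal sketch of the rules as listed, not the tool's actual implementation: `read_flag` and `should_backup` are hypothetical names, and rejecting values other than "true" or "false" is an assumption.

```python
from typing import Mapping, Optional

BACKUP_ANNOTATION = "cephbackup.hpc.nyu.edu/backup"


def read_flag(annotations: Mapping[str, str]) -> Optional[bool]:
    """Return True/False for the backup annotation, or None if it is not set."""
    value = annotations.get(BACKUP_ANNOTATION)
    if value is None:
        return None
    # Assumption: only the literal strings "true" and "false" are accepted.
    if value == "true":
        return True
    if value == "false":
        return False
    raise ValueError(f"invalid value for {BACKUP_ANNOTATION}: {value!r}")


def should_backup(pv: Optional[bool],
                  namespace: Optional[bool],
                  pvc: Optional[bool]) -> bool:
    """Apply the rules: the PV decides first, then the namespace, then the PVC."""
    if pv is not None:
        return pv            # an explicit PV setting always wins
    if namespace is False:
        return False         # the namespace opts out all of its PVCs
    return pvc is not False  # backed up unless the PVC opts out
```

For example, `should_backup(True, False, False)` is true because the PV annotation takes precedence, while `should_backup(None, None, False)` is false because only the PVC has opted out.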

ceph-k8s-backup's Issues

Maybe don't back up immediately

Original issue by @remram44, 2023-05-12

The system treats "no last backup date" as "needs to be backed up". This means that if you create a PVC, use it for a few hours, and remove it, it will have been backed up before the first hour was up (depending on when the backup job runs).

Problems with that:

  • You probably don't have any data in there the first time it gets picked up by ceph-backup
  • A test PVC will get backed up, wasting backup space

Doing the first backup once the PVC has existed for a few hours might be better (for example 6 hours). This can be achieved by looking at the creationTimestamp.
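A minimal sketch of that check, assuming the creationTimestamp arrives as the UTC string Kubernetes serializes (for example 2023-05-12T10:00:00Z). The function names and the 6-hour default are illustrative only and are not part of the tool.

```python
from datetime import datetime, timedelta, timezone

# Assumption: 6 hours, the example value suggested in the issue.
MIN_AGE = timedelta(hours=6)


def parse_k8s_timestamp(value: str) -> datetime:
    """Parse a Kubernetes timestamp such as '2023-05-12T10:00:00Z' as UTC."""
    parsed = datetime.strptime(value, "%Y-%m-%dT%H:%M:%SZ")
    return parsed.replace(tzinfo=timezone.utc)


def old_enough_for_first_backup(creation_timestamp: str,
                                now: datetime,
                                min_age: timedelta = MIN_AGE) -> bool:
    """Return True once the PVC has existed for at least min_age."""
    return now - parse_k8s_timestamp(creation_timestamp) >= min_age


if __name__ == "__main__":
    now = datetime.now(timezone.utc)
    print(old_enough_for_first_backup("2023-05-12T10:00:00Z", now))
```

A check like this would only apply to PVCs that have no last-backup annotation yet. Volumes that were already backed up would keep following the regular schedule.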
